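A Case Study of MongoDB Picking the Wrong Index
A business team reported that a simple MongoDB query was far too slow: it ran for about 70 s, which seriously hurt the user experience.
The query is essentially the following: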
db.duoduologmodel.find({"Tags.SN": "QZ435698245"})
  .projection({})
  .sort({OPTime: -1})
  .limit(20)
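The collection has an index idx_OPTime on the field OPTime and an index idx_TSN on the embedded field "SN" inside the "Tags" array. They are two separate single-field indexes. The collection stores execution logs, so it is relatively large.
The slow query log entry for this query is: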
2019-08-13T15:07:16.767+0800 I COMMAND [conn536359] command shqq_zp.duoduologmodel command: find { find: "duoduologmodel", filter: { Tags.SN: "QZ435698245" }, sort: { OPTime: -1 }, projection: {}, limit: 20, $db: "shqq_zp", $clusterTime: { clusterTime: Timestamp(1565679737, 71), signature: { hash: BinData(0, E7B0A887E83BAD0AA0A72016A39C677B53ABDBE2), keyId: 6658116868433248258 } }, lsid: { id: UUID("0f22409b-f122-41e9-a094-46ccf04e44c7") } } planSummary: IXSCAN { OPTime: 1 } cursorid:61603998663 keysExamined:12145431 docsExamined:12145431 fromMultiPlanner:1 replanned:1 numYields:97720 nreturned:14 reslen:16772198 locks:{ Global: { acquireCount: { r: 97721 } }, Database: { acquireCount: { r: 97721 } }, Collection: { acquireCount: { r: 97721 } } } protocol:op_msg 692537ms
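The counters at the end of this log entry already tell most of the story. As a small illustration, the JavaScript sketch below pulls them out of the log text and computes how many index keys were scanned per returned document. The helper name readCounter is made up for this example; only the numbers come from the log entry above.

```javascript
// Summary fields copied from the slow query log entry above.
const logTail =
  "planSummary: IXSCAN { OPTime: 1 } cursorid:61603998663 " +
  "keysExamined:12145431 docsExamined:12145431 fromMultiPlanner:1 " +
  "replanned:1 numYields:97720 nreturned:14 reslen:16772198";

// Hypothetical helper: read a "name:number" counter from a log line.
function readCounter(line, name) {
  const match = line.match(new RegExp(name + ":(\\d+)"));
  return match ? Number(match[1]) : null;
}

const keysExamined = readCounter(logTail, "keysExamined");
const docsExamined = readCounter(logTail, "docsExamined");
const nreturned = readCounter(logTail, "nreturned");

// Index keys scanned for each document that was actually returned.
const keysPerReturnedDoc = Math.round(keysExamined / nreturned);

console.log("keysExamined:", keysExamined);
console.log("docsExamined:", docsExamined);
console.log("nreturned:", nreturned);
console.log("keys scanned per returned document:", keysPerReturnedDoc);
```

A ratio this high suggests that the OPTime index supplies the sort order but does no filtering: each index entry leads to a document fetch, and the Tags.SN condition is only checked afterwards.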
To inspect the execution plan for this query, run:
db.duoduologmodel.find({"Tags.SN": "QZ435698245"}) .projection({}) .sort({OPTime: -1}) .explain()
The relevant part of the output:
"queryPlanner" : { "plannerVersion" : 1, "namespace" : "shqq_zp.duoduologmodel", "indexFilterSet" : false, "parsedQuery" : { "Tags.SN" : { "$eq" : "QZ435698245" } }, "winningPlan" : { "stage" : "SORT", "sortPattern" : { "OPTime" : -1 }, "inputStage" : { "stage" : "SORT_KEY_GENERATOR", "inputStage" : { "stage" : "FETCH", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "Tags.SN" : 1 }, "indexName" : "idx_TSN", "isMultiKey" : true, "multiKeyPaths" : { "Tags.SN" : [ "Tags" ] }, "isUnique" : false, "isSparse" : true, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "Tags.SN" : [ "[\"QZ435698245\", \"QZ435698245\"]" ] } } } } }, "rejectedPlans" : [ { "stage" : "FETCH", "filter" : { "Tags.SN" : { "$eq" : "QZ435698245" } }, "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "OPTime" : 1 }, "indexName" : "idx_OPTime", "isMultiKey" : false, "multiKeyPaths" : { "OPTime" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "backward", "indexBounds" : { "OPTime" : [ "[MaxKey, MinKey]" ] } } } ] }
Without sorting, that is, with .sort({OPTime: -1}) removed, the execution plan becomes:
"queryPlanner" : { "plannerVersion" : 1, "namespace" : "shqq_zp.duoduologmodel", "indexFilterSet" : false, "parsedQuery" : { "Tags.SN" : { "$eq" : "QZ435698245" } }, "winningPlan" : { "stage" : "FETCH", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "Tags.SN" : 1 }, "indexName" : "idx_TSN", "isMultiKey" : true, "multiKeyPaths" : { "Tags.SN" : [ "Tags" ] }, "isUnique" : false, "isSparse" : true, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "Tags.SN" : [ "[\"QZ435698245\", \"QZ435698245\"]" ] } } }, "rejectedPlans" : [ ] }
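In this form the query is indeed much faster and finishes within 2 s.
Execution plan after dropping the index on OPTime (with the sort back in the query):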
"queryPlanner" : { "plannerVersion" : 1, "namespace" : "shqq_zp.duoduologmodel", "indexFilterSet" : false, "parsedQuery" : { "Tags.SN" : { "$eq" : "QZ435698245" } }, "winningPlan" : { "stage" : "SORT", "sortPattern" : { "OPTime" : -1 }, "inputStage" : { "stage" : "SORT_KEY_GENERATOR", "inputStage" : { "stage" : "FETCH", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "Tags.SN" : 1 }, "indexName" : "idx_TSN", "isMultiKey" : true, "multiKeyPaths" : { "Tags.SN" : [ "Tags" ] }, "isUnique" : false, "isSparse" : true, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "Tags.SN" : [ "[\"QZ435698245\", \"QZ435698245\"]" ] } } } } }, "rejectedPlans" : [ ] }
After dropping this index, running the query fails with the following error:
{ "message" : "Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.", "OPTime" : "Timestamp(1565681389, 1)", "ok" : 0, "code" : 96, "codeName" : "OperationFailed", "$clusterTime" : { "clusterTime" : "Timestamp(1565681389, 1)", "signature" : { "hash" : "vODWe0BCzyihrRVM08wPSFIMvo0=", "keyId" : "6658116868433248258" } }, "name" : "MongoError" }
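The cause is clear: "Sort operation used more than the maximum 33554432 bytes of RAM." 33554432 bytes is exactly 32 MB (32 × 1024 × 1024). When no index can supply the sort order, MongoDB loads the matching documents into memory and sorts them there, and to protect memory it caps a sort operation at 32 MB by default. Once the data to be sorted grows beyond 32 MB, the query fails with this error.
Because this view feature is rarely used and has low concurrency, we raised the in-memory sort limit from the default 32 MB to about 64 MB with the command below.
(Be careful with this change. It is generally not recommended and has to be weighed against how the business uses the system, for example concurrency and data volume. Prefer optimizing through index or collection design, or even front-end design.)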
db.adminCommand({setParameter:1, internalQueryExecMaxBlockingSortBytes:64554432})
(Note: a value set this way is lost when the MongoDB service restarts, and the limit reverts to the default 32 MB.)
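As a quick check of the numbers involved, the sketch below converts the byte values to MiB. It is plain JavaScript arithmetic; the helper name toMiB is an illustration, not a MongoDB API. It shows that the value passed to setParameter, 64554432, is a little under 64 MiB, whose exact byte count is 67108864.

```javascript
const BYTES_PER_MIB = 1024 * 1024;

// Hypothetical helper: convert a byte count to MiB.
function toMiB(bytes) {
  return bytes / BYTES_PER_MIB;
}

const defaultLimit = 33554432;     // limit quoted in the error message
const configuredLimit = 64554432;  // value passed to setParameter
const exact64MiB = 64 * BYTES_PER_MIB;

console.log("default limit (MiB):", toMiB(defaultLimit));
console.log("configured limit (MiB):", toMiB(configuredLimit).toFixed(2));
console.log("exact 64 MiB in bytes:", exact64MiB);

// In the mongo shell, the value currently in effect can be read back with:
// db.adminCommand({ getParameter: 1, internalQueryExecMaxBlockingSortBytes: 1 })
```

Either value clears the error in this case; the exact power-of-two figure only matters if a precise 64 MiB limit is intended.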
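Running the query again no longer raises the error, and results come back quickly (2 s).
Because other features query by OPTime, we re-created that index.
With the index back in place, the query still returns quickly (2 s).
From the analysis above we can infer:
(1) The plan shown by explain() can differ from the plan that actually runs. Here explain() reported the idx_TSN plan as the winner, while the slow query log shows IXSCAN { OPTime: 1 } together with replanned:1. Note also that the explain() call above did not include .limit(20).
(2) The sort affects index selection. When internalQueryExecMaxBlockingSortBytes is too small to filter first (by Tags.SN) and then sort in memory (by OPTime), the system automatically switches to an index that is already in sort order (on OPTime) and scans it.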
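Following the earlier advice to prefer index changes over raising the memory limit, one option that this investigation did not try is a compound index covering both the filter and the sort. The sketch below is an assumption, not something that was run against this collection: the index name idx_TSN_OPTime and both helper functions are hypothetical, and whether the planner would use such an index to avoid the in-memory sort should be confirmed with explain() on the real data. createIndex itself is the standard collection method.

```javascript
// Hypothetical index definition: equality field first, sort field second.
function compoundIndexSpec() {
  return {
    keys: { "Tags.SN": 1, OPTime: -1 },
    options: { name: "idx_TSN_OPTime" } // assumed name, not from the original setup
  };
}

// Works with any object exposing createIndex(keys, options),
// such as a collection in the mongo shell.
function createCompoundIndex(collection) {
  const spec = compoundIndexSpec();
  return collection.createIndex(spec.keys, spec.options);
}

// In the mongo shell this would be called as:
// createCompoundIndex(db.duoduologmodel);
```

Building an index on a large log collection has a cost of its own, so this is best tried on a copy of the data first.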
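Background notes:
queryPlanner.namespace: the namespace (database.collection) that the query runs against;
queryPlanner.indexFilterSet: whether an index filter is set for this query;
queryPlanner.winningPlan: details of the plan the query optimizer selected as the best one for this query;
queryPlanner.rejectedPlans: details of the other candidate plans, which the optimizer considered and rejected as not optimal.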
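To make these fields concrete, the sketch below walks a plan by following its inputStage links and lists the stages from top to bottom. The plan object is a trimmed copy of the first explain() output shown earlier. The function name stageChain is made up for this example, and it only handles plans where each stage has a single inputStage.

```javascript
// Trimmed copy of the first explain() output (only the fields used here).
const queryPlanner = {
  namespace: "shqq_zp.duoduologmodel",
  winningPlan: {
    stage: "SORT",
    inputStage: {
      stage: "SORT_KEY_GENERATOR",
      inputStage: {
        stage: "FETCH",
        inputStage: { stage: "IXSCAN", indexName: "idx_TSN" }
      }
    }
  },
  rejectedPlans: [
    { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "idx_OPTime" } }
  ]
};

// Hypothetical helper: list stages from the top of the plan down to the scan.
function stageChain(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    stages.push(node.indexName ? node.stage + "(" + node.indexName + ")" : node.stage);
  }
  return stages.join(" <- ");
}

console.log("winning:  " + stageChain(queryPlanner.winningPlan));
queryPlanner.rejectedPlans.forEach(function (plan) {
  console.log("rejected: " + stageChain(plan));
});
```

A SORT stage in the chain means the sort is done in memory, which is where the 32 MB limit applies; a plan without a SORT stage takes its order from the index scan.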
Copyright belongs to the author. Do not reproduce this article without the author's permission. Thank you for your cooperation!