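Uneven data distribution with range sharding in MongoDB
【Background】
The cluster under test is a MongoDB sharded cluster with three shards (each a three-member replica set), three config servers, and three mongos instances.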
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("66a30ccca62de41d6b0241a4")
  }
  shards:
        { "_id" : "mongo1", "host" : "mongo1/mongo1:2700,mongo2:2700,mongo3:2700", "state" : 1 }
        { "_id" : "mongo2", "host" : "mongo2/mongo1:2701,mongo2:2701,mongo3:2701", "state" : 1 }
        { "_id" : "mongo3", "host" : "mongo3/mongo1:2702,mongo2:2702,mongo3:2702", "state" : 1 }
  active mongoses:
        "4.2.8" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration Results for the last 24 hours:
                690 : Success
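【Range sharding test on a collection】
sh.enableSharding("test");
sh.shardCollection("test.messages", { createTime : 1} );
Note: the time field is used as the range shard key here for testing purposes only.
Insert the test data with a JS script: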
cat shard_test_messages.js
// The script is run against the admin database, so switch to "test" explicitly.
testdb = db.getSiblingDB('test');
var messages = ["Hello there", "Good Morning", "valar morghulis"];
var createTime = new Date();
for (var j = 0; j < 50000; j ++) {
    // Pick a random timestamp in 2024.
    createTime.setFullYear(2024);
    createTime.setMonth(Math.floor(Math.random() * 12));
    createTime.setDate(Math.floor(Math.random() * 31) + 1);
    createTime.setHours(Math.floor(Math.random() * 24));
    createTime.setMinutes(Math.floor(Math.random() * 60));
    createTime.setSeconds(Math.floor(Math.random() * 60));
    testdb.messages.insertOne({
        userid: Math.floor(Math.random()*50000),
        message: messages[Math.floor(Math.random()*messages.length)],
        createTime: createTime
    })
}
// Use testdb here: plain db would point at admin.messages, not test.messages.
// createIndex replaces the deprecated ensureIndex.
testdb.messages.createIndex({createTime: 1});
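A possible variant of the data generator, as a minimal sketch: it builds the documents in memory with a fresh Date per document and picks the day from the real length of the month, so no date rolls over into the following month. The names makeMessages, daysInMonth and randomInt are hypothetical helpers and are not part of the original script.

```javascript
// Hypothetical helpers (not in the original script).
var MESSAGES = ["Hello there", "Good Morning", "valar morghulis"];

// Random integer in [0, n).
function randomInt(n) {
  return Math.floor(Math.random() * n);
}

// Number of days in a 0-based month: day 0 of the next month
// is the last day of the requested month.
function daysInMonth(year, month) {
  return new Date(year, month + 1, 0).getDate();
}

// Build `count` test documents with a random local timestamp in 2024.
function makeMessages(count) {
  var docs = [];
  for (var i = 0; i < count; i++) {
    var month = randomInt(12);
    var day = 1 + randomInt(daysInMonth(2024, month));
    docs.push({
      userid: randomInt(50000),
      message: MESSAGES[randomInt(MESSAGES.length)],
      createTime: new Date(2024, month, day,
                           randomInt(24), randomInt(60), randomInt(60))
    });
  }
  return docs;
}
```

In the mongo shell the result could be passed to insertMany in batches, for example testdb.messages.insertMany(makeMessages(1000)), which avoids one round trip per document.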
mongo "localhost:27017/admin" /tmp/shard_test_messages.js -u admin -p 123456
【Check how the collection is distributed】
mongos> db.messages.getShardDistribution()

Shard mongo1 at mongo1/mongo1:2700,mongo2:2700,mongo3:2700
 data : 2.01MiB docs : 25000 chunks : 1
 estimated data per chunk : 2.01MiB
 estimated docs per chunk : 25000

Totals
 data : 2.01MiB docs : 25000 chunks : 1
 Shard mongo1 contains 100% data, 100% docs in cluster, avg obj size on shard : 84B

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("66a30ccca62de41d6b0241a4")
  }
  shards:
        { "_id" : "mongo1", "host" : "mongo1/mongo1:2700,mongo2:2700,mongo3:2700", "state" : 1 }
        { "_id" : "mongo2", "host" : "mongo2/mongo1:2701,mongo2:2701,mongo3:2701", "state" : 1 }
        { "_id" : "mongo3", "host" : "mongo3/mongo1:2702,mongo2:2702,mongo3:2702", "state" : 1 }
  active mongoses:
        "4.2.8" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        { "_id" : "config", "primary" : "config", "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongo1 1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : mongo1 Timestamp(1, 0)
        { "_id" : "test", "primary" : "mongo1", "partitioned" : true, "version" : { "uuid" : UUID("35a9a3e5-3ba5-4315-977c-9c7176d891ae"), "lastMod" : 1 } }
                test.messages
                        shard key: { "createTime" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongo1 1
                        { "createTime" : { "$minKey" : 1 } } -->> { "createTime" : { "$maxKey" : 1 } } on : mongo1 Timestamp(1, 0)
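MongoDB's sharding autosplit is enabled by default and is driven by the chunk size setting (chunksize), which defaults to 64 MB. During this installation the value was changed to 1 MB to make testing easier. Older versions set it in the mongos configuration file.
The current version and newer versions store it in the settings collection of the config database, and it is changed with a command. The command differs between versions, so take care: https://www.mongodb.com/zh-cn/docs/manual/tutorial/modify-chunk-size-in-sharded-cluster/#std-label-tutorial-modifying-range-size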
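As a rough sanity check on these sizes, here is a back-of-the-envelope sketch. It takes the 84 B average document size reported by getShardDistribution() and estimates how many such documents fit into one chunk. The function name docsPerChunk is hypothetical, and the real autosplit heuristics are more involved than a single division, so treat the result as an order of magnitude only.

```javascript
// Hypothetical helper: approximate number of documents that fit in one chunk.
// chunkSizeMB is the configured chunk size, avgDocBytes the average document size.
function docsPerChunk(chunkSizeMB, avgDocBytes) {
  return Math.floor(chunkSizeMB * 1024 * 1024 / avgDocBytes);
}

var withTestSetting = docsPerChunk(1, 84);   // chunk size used in this test
var withDefault = docsPerChunk(64, 84);      // default chunk size
```

With a 1 MB chunk size, a single chunk holding 25000 documents (2.01 MiB) is already about twice the configured size, which is why a split would normally be expected here.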
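For background on MongoDB sharding, see the Taobao database monthly report: https://www.bookstack.cn/read/aliyun-rds-core/5717773a4eef2615.md
The range-sharded data currently sits only on shard mongo1, which is the primary shard of the test database. Given the conditions that trigger chunk migration, the balancer was started manually as a test, and it returned the following error: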
mongos> sh.startBalancer()
2024-07-30T10:39:49.588+0800 E QUERY [js] uncaught exception: Error: command failed: {
        "ok" : 0,
        "errmsg" : "Failed to refresh the chunk sizes settings :: caused by :: Expected field \"value\" to have numeric type, but found string",
        "code" : 14,
        "codeName" : "TypeMismatch",
        "operationTime" : Timestamp(1722307189, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1722307189, 1),
                "signature" : {
                        "hash" : BinData(0,"XinbmSBnIvaxzrWfcNLn8IsFI78="),
                        "keyId" : NumberLong("7395769083385348113")
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:583:17
assert.commandWorked@src/mongo/shell/assert.js:673:16
sh.startBalancer@src/mongo/shell/utils_sh.js:184:12
@(shell):1:1
The error points to a wrong value type. Checking the settings confirms that the value is indeed stored as a string:
mongos> configdb.settings.find()
{ "_id" : "chunksize", "value" : "1" }
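The same check can be expressed as a small function, shown here as a minimal sketch. checkChunkSizeSetting is a hypothetical helper that looks only at the plain JavaScript type of the value; shell wrapper types such as NumberLong are not handled and would need extra cases.

```javascript
// Hypothetical helper: reports whether a config.settings chunksize
// document carries a numeric value, as the balancer expects.
function checkChunkSizeSetting(doc) {
  if (!doc || doc._id !== "chunksize") {
    return "not a chunksize document";
  }
  if (typeof doc.value !== "number") {
    return "value has type " + typeof doc.value + ", expected number";
  }
  return "ok";
}

// The document found in this cluster versus the intended one.
var broken = checkChunkSizeSetting({ _id: "chunksize", value: "1" });
var fixed = checkChunkSizeSetting({ _id: "chunksize", value: 1 });
```

Inside the shell, a query such as configdb.settings.find({ _id: "chunksize", value: { $type: "string" } }) should surface the same problem without any helper code.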
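The sharded cluster was installed with an Ansible script, and the value was passed in through a variable, which is what broke the setting.
The variable itself is defined as a number: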
# The chunksize for shards in MB
mongos_chunk_size: 1
But where the variable is written into the settings, it is wrapped in quotes (''), which turns it into a string:
configdb = db.getSiblingDB('config');
configdb.settings.save( { _id:"chunksize", value: '{{ mongos_chunk_size }}' } )
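One defensive option, sketched under assumptions: coerce and validate the rendered value before it is written, so that a quoted template value cannot end up stored as a string. toChunkSizeMB is a hypothetical helper that is not part of the original install script; the 1 to 1024 range follows the chunk size limits documented by MongoDB.

```javascript
// Hypothetical helper: turn a templated value into a validated number of MB.
function toChunkSizeMB(raw) {
  var n = Number(raw);
  // Reject non-numeric input, fractions, and sizes outside 1..1024 MB.
  if (!Number.isInteger(n) || n < 1 || n > 1024) {
    throw new Error("invalid chunk size: " + JSON.stringify(raw));
  }
  return n;
}

// Even if the template renders a quoted '1', the stored value is numeric.
var settingsDoc = { _id: "chunksize", value: toChunkSizeMB("1") };
```

The simpler fix is to drop the quotes around {{ mongos_chunk_size }} in the template. The helper only adds a guard for the case where the variable is later given a bad value.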
This matches the error. After changing the value to a number, range sharding distributes the data as expected:
mongos> configdb.settings.save( { _id:"chunksize", value: 1 } )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> configdb.settings.find()
{ "_id" : "chunksize", "value" : 1 }
mongos>
mongos> sh.startBalancer()
{
        "ok" : 1,
        "operationTime" : Timestamp(1722307355, 23),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1722307355, 23),
                "signature" : {
                        "hash" : BinData(0,"SfLbKefZpUbx4FCKVT1JeRvsrOY="),
                        "keyId" : NumberLong("7395769083385348113")
                }
        }
}
mongos>
mongos> sh.getBalancerState()
true
mongos> db.messages.getShardDistribution()

Shard mongo1 at mongo1/mongo1:2700,mongo2:2700,mongo3:2700
 data : 3.37MiB docs : 41863 chunks : 5
 estimated data per chunk : 692KiB
 estimated docs per chunk : 8372

Shard mongo2 at mongo2/mongo1:2701,mongo2:2701,mongo3:2701
 data : 3.34MiB docs : 41482 chunks : 5
 estimated data per chunk : 685KiB
 estimated docs per chunk : 8296

Shard mongo3 at mongo3/mongo1:2702,mongo2:2702,mongo3:2702
 data : 3.36MiB docs : 41655 chunks : 5
 estimated data per chunk : 688KiB
 estimated docs per chunk : 8331

Totals
 data : 10.09MiB docs : 125000 chunks : 15
 Shard mongo1 contains 33.49% data, 33.49% docs in cluster, avg obj size on shard : 84B
 Shard mongo2 contains 33.18% data, 33.18% docs in cluster, avg obj size on shard : 84B
 Shard mongo3 contains 33.32% data, 33.32% docs in cluster, avg obj size on shard : 84B
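The per-shard percentages can be recomputed from the document counts, shown here as a minimal sketch. shareByShard is a hypothetical helper. It truncates to two decimals rather than rounding, which is an assumption chosen because it reproduces the figures printed above.

```javascript
// Document counts copied from the getShardDistribution() output above.
var docsByShard = { mongo1: 41863, mongo2: 41482, mongo3: 41655 };

// Hypothetical helper: percentage of all documents held by each shard,
// truncated (not rounded) to two decimal places.
function shareByShard(counts) {
  var total = 0;
  var name;
  for (name in counts) {
    total += counts[name];
  }
  var shares = {};
  for (name in counts) {
    shares[name] = Math.floor(counts[name] / total * 10000) / 100;
  }
  return shares;
}

var shares = shareByShard(docsByShard);
```

Each shard ends up with roughly a third of the 125000 documents, which is the even distribution the balancer is expected to reach with three shards.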