返回总目录页

mongodb之shard分片

 

总的

 

 

1:在3台独立服务器上,分别运行 27017,27018,27019实例, 互为副本集,形成3套repl set
2: 在3台服务器上,各配置config server, 运行27020端口上

3: 配置mongos
./bin/mongos --port 30000 \
 --dbconfig 192.168.1.201:27020,192.168.1.202:27020,192.168.1.203:27020

4:连接路由器
./bin/mongo --port 30000

5: 添加repl set为片
>sh.addShard(‘192.168.1.201:27017’);
>sh.addShard(‘192.168.1.203:27017’);
>sh.addShard(‘192.168.1.203:27017’);


6: 添加待分片的库
>sh.enableSharding(databaseName);

7: 添加待分片的表
>sh.shardCollection(‘dbName.collectionName’,{field:1});

Field是collection的一个字段,系统将会利用filed的值,来计算应该分到哪一个片上.
这个filed叫”片键”, shard key


mongodb不是从单篇文档的级别,绝对平均的散落在各个片上, 

而是N篇文档,形成一个块"chunk",
优先放在某个片上, 
当这片上的chunk,比另一个片的chunk,区别比较大时, (>=3) ,会把本片上的chunk,移到另一个片上, 以chunk为单位,
维护片之间的数据均衡

问: 为什么插入了10万条数据,才2个chunk?
答: 说明chunk比较大(默认是64M)
在config数据库中,修改chunksize的值.

问: 既然优先往某个片上插入,当chunk失衡时,再移动chunk,
自然,随着数据的增多,shard的实例之间,有chunk来回移动的现象,这将带来什么问题?
答: 服务器之间IO的增加, 

接上问: 能否我定义一个规则, 某N条数据形成1个块,预告分配M个chunk,
M个chunk预告分配在不同片上. 
以后的数据直接入各自预分配好的chunk,不再来回移动?

答: 能, 手动预先分片!

以shop.user表为例
1: sh.shardCollection(‘shop.user’,{userid:1}); //user表用userid做shard key

2: for(var i=1;i<=40;i++) { sh.splitAt('shop.user',{userid:i*1000}) } // 预先在1K 2K...40K这样的界限切好chunk(虽然chunk是空的), 这些chunk将会均匀移动到各片上.

3: 通过mongos添加user数据. 数据会添加到预先分配好的chunk上, chunk就不会来回移动了.

分片

部署使用分片的mongodb集群

var rsconf = {
    _id:'rs2',
    members:
      [
        {_id:0,
        host:'10.0.0.11:27017'
        },
        {_id:1,
        host:'10.0.0.11:27018'
        },
        {_id:2,
        host:'10.0.0.11:27020'
        }
      ]    
}



mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
需要有一个配置数据库服务,存储元数据用,使用参数--configsvr

mongos --logpath /mongodb/mlog/m30.log --port 30000  --configdb 10.0.0.11:27020  --fork
mongos需要指定配置数据库,

[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
about to fork child process, waiting until server is ready for connections.
forked process: 18608
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
about to fork child process, waiting until server is ready for connections.
forked process: 18627
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
about to fork child process, waiting until server is ready for connections.
forked process: 18646
child process started successfully, parent exiting
[mongod@mcw01 ~]$ mongos --logpath /mongodb/mlog/m30.log --port 30000  --configdb 10.0.0.11:27020  --fork
2022-03-05T00:26:41.452+0800 W SHARDING [main] Running a sharded cluster with fewer than 3 config servers should only be done for testing purposes and is not recommended for production.
about to fork child process, waiting until server is ready for connections.
forked process: 18667
child process started successfully, parent exiting
[mongod@mcw01 ~]$ ps -ef|grep -v grep |grep mongo
root      16595  16566  0 Mar04 pts/0    00:00:00 su - mongod
mongod    16596  16595  0 Mar04 pts/0    00:00:03 -bash
root      17669  17593  0 Mar04 pts/1    00:00:00 su - mongod
mongod    17670  17669  0 Mar04 pts/1    00:00:00 -bash
root      17735  17715  0 Mar04 pts/2    00:00:00 su - mongod
mongod    17736  17735  0 Mar04 pts/2    00:00:00 -bash
mongod    18608      1  0 00:26 ?        00:00:03 mongod --dbpath /mongodb/m17 --logpath /mongodb/mlog/m17.log --fork --port 27017 --smallfiles
mongod    18627      1  0 00:26 ?        00:00:03 mongod --dbpath /mongodb/m18 --logpath /mongodb/mlog/m18.log --fork --port 27018 --smallfiles
mongod    18646      1  0 00:26 ?        00:00:04 mongod --dbpath /mongodb/m20 --logpath /mongodb/mlog/m20.log --fork --port 27020 --configsvr
mongod    18667      1  0 00:26 ?        00:00:01 mongos --logpath /mongodb/mlog/m30.log --port 30000 --configdb 10.0.0.11:27020 --fork
mongod    18698  16596  0 00:36 pts/0    00:00:00 ps -ef
[mongod@mcw01 ~]$ 


现在configsvr和mongos绑在一块了,但是和后面的两个mongodb分片还没有关系。
下面需要连接mongos,给它增加两个shard(片节点)。

[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> show dbs;
config  0.000GB
mongos> use config;
switched to db config
mongos> show tables; #查看mongos中有的表
chunks
lockpings
locks
mongos
settings
shards
tags
version
mongos> 
mongos> 
bye
[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> sh.addShard('10.0.0.11:27017');  #添加两个shard
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard('10.0.0.11:27018');
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.status()  #查看shard状况
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }  #可以看到有两个shard,
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" } #这两个片已经加到configsvr里了
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:

mongos> 
mongos> use test
switched to db test
mongos> db.stu.insert({name:'poly'});  #现在在mongos上创建四条数据。可以查询到
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'lily'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'hmm'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.insert({name:'lucy'});
WriteResult({ "nInserted" : 1 })
mongos> db.stu.find();
{ "_id" : ObjectId("6222427bc425e356ae71d452"), "name" : "poly" }
{ "_id" : ObjectId("62224282c425e356ae71d453"), "name" : "lily" }
{ "_id" : ObjectId("62224287c425e356ae71d454"), "name" : "hmm" }
{ "_id" : ObjectId("6222428dc425e356ae71d455"), "name" : "lucy" }
mongos> 



此时我在27017上能看到
[mongod@mcw01 ~]$ mongo --port 27017
.......
> show dbs;
local  0.000GB
test   0.000GB
> use test;
switched to db test
> db.stu.find();
{ "_id" : ObjectId("6222427bc425e356ae71d452"), "name" : "poly" }
{ "_id" : ObjectId("62224282c425e356ae71d453"), "name" : "lily" }
{ "_id" : ObjectId("62224287c425e356ae71d454"), "name" : "hmm" }
{ "_id" : ObjectId("6222428dc425e356ae71d455"), "name" : "lucy" }
> 


但是在27018上查不到数据
[mongod@mcw01 ~]$ mongo --port 27018
.......
> show dbs;
local  0.000GB
> 

没有设定数据的分片规则。下面我们进入mongos查看分片情况
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }

mongos>  #如上可以看到,test库分区(partitioned)是false,没有分片,就默认首选放到主上的分片shard0000

给库开启分片

如下:shop是不存在的库。给shop开启分片。可看到是true了,且优先放到shard0001上,但是这还不完善
mongos> show dbs;
config  0.000GB
test    0.000GB
mongos> sh.enable
sh.enableBalancing(  sh.enableSharding(
mongos> sh.enableSharding('shop');
{ "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }

mongos> 

指定db下那个表(集合)做分片,指定分片依据那个字段

mongos> sh.shardCollection('shop.goods',{goods_id:1});
{ "collectionsharded" : "shop.goods", "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods   #shop库下的goods需要分片。
            shard key: { "goods_id" : 1 } #分片键是这个字段
            unique: false
            balancing: true
            chunks:  
                shard0001    1   #chunk优先放到shard0001分片上
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0) 

mongos> 

如下插入多条数据,可以发现基本都分配到shard1上了,此时是默认chunk,是很大的

for(var i=1;i<=10000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf'}) };

[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> use shop;
switched to db shop
mongos> for(var i=1;i<=10000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf'}) };
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    1
                shard0001    2
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : 2 } on : shard0000 Timestamp(2, 0) 
            { "goods_id" : 2 } -->> { "goods_id" : 12 } on : shard0001 Timestamp(2, 1) 
            { "goods_id" : 12 } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 3) 

mongos> 
mongos> db.goods.find().count();
10000
mongos> 


在27017上可以看到一条数据
[mongod@mcw01 ~]$ mongo --port 27017
> use shop
switched to db shop
> show tables;
goods
> db.goods.find().count();
1
> db.goods.find();
{ "_id" : ObjectId("622252d8b541d8768347746e"), "goods_id" : 1, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
> 


在27018分片2上有很多,分片分的不均。符合上面显示的id是2以上的,都在shard1上
[mongod@mcw01 ~]$ mongo --port 27018
> use shop;
switched to db shop
> db.goods.find().count();
9999
> db.goods.find().skip(9996);
{ "_id" : ObjectId("622252e3b541d87683479b7b"), "goods_id" : 9998, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
{ "_id" : ObjectId("622252e3b541d87683479b7c"), "goods_id" : 9999, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
{ "_id" : ObjectId("622252e3b541d87683479b7d"), "goods_id" : 10000, "goods_name" : "mcw mfsfowfofsfewfwifonwainfsfffsf" }
> 

修改chunk大小的配置

mongos> use config; #在mongos上,需要切到config库
switched to db config
mongos> show tables;
changelog
chunks
collections
databases
lockpings
locks
mongos
settings
shards
tags
version
mongos> db.settings.find();  #chunk大小的设置在设置里面,默认大小是64M
{ "_id" : "chunksize", "value" : NumberLong(64) }
mongos> db.settings.find();
{ "_id" : "chunksize", "value" : NumberLong(64) }
mongos>  db.settings.save({_id:'chunksize'},{$set:{value: 1}});  #这里不能用update的方式修改
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.find();
{ "_id" : "chunksize" }
mongos> db.settings.save({ "_id" : "chunksize", "value" : NumberLong(64) });
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.save({ "_id" : "chunksize", "value" : NumberLong(1) });  #修改chunk大小的配置为1M
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> db.settings.find();  #查看修改成功
{ "_id" : "chunksize", "value" : NumberLong(1) }
mongos> 

下面我们插入15万行数据,查看分片规则下的分片情况

for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) };



之前的表被删了,分片规则肯定也被删除了,重新建立分片规则吧
mongos> use shop;
switched to db shop
mongos> db.goods.drop();
false
mongos> for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) };
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }

mongos> sh.shardCollection('shop.goods',{goods_id:1});
{
    "proposedKey" : {
        "goods_id" : 1
    },
    "curIndexes" : [
        {
            "v" : 1,
            "key" : {
                "_id" : 1
            },
            "name" : "_id_",
            "ns" : "shop.goods"
        }
    ],
    "ok" : 0,
    "errmsg" : "please create an index that starts with the shard key before sharding."
}
mongos> 



重新建立分片规则,然后添加数据
mongos> db.goods.drop();
true
mongos> sh.shardCollection('shop.goods',{goods_id:1});
{ "collectionsharded" : "shop.goods", "ok" : 1 }
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        1 : Success
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0001    1
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0) 

重新插入数据,发现一个分片分了7个chunk,一个分片分了20个chunk,有点不均匀。手动预分配的方式更好
mongos> for(var i=1;i<=150000;i++){ db.goods.insert({goods_id:i,goods_name:'mcw mfsfowfofsfewfwifonwainfsfffsf fsff'}) }; 
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        8 : Success
        13 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    7
                shard0001    20
            too many chunks to print, use verbose if you want to force print

mongos> 


根据提示,试了几次,正确显示出详细分片情况如下
mongos> sh.status({verbose:1});
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    {  "_id" : "mcw01:30000",  "ping" : ISODate("2022-03-04T18:32:36.132Z"),  "up" : NumberLong(7555),  "waiting" : true,  "mongoVersion" : "3.2.8" }
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        8 : Success
        29 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    7
                shard0001    20
            { "goods_id" : { "$minKey" : 1 } } -->> { "goods_id" : 2 } on : shard0000 Timestamp(8, 1) 
            { "goods_id" : 2 } -->> { "goods_id" : 12 } on : shard0001 Timestamp(7, 1) 
            { "goods_id" : 12 } -->> { "goods_id" : 5473 } on : shard0001 Timestamp(2, 2) 
            { "goods_id" : 5473 } -->> { "goods_id" : 12733 } on : shard0001 Timestamp(2, 3) 
            { "goods_id" : 12733 } -->> { "goods_id" : 18194 } on : shard0000 Timestamp(3, 2) 
            { "goods_id" : 18194 } -->> { "goods_id" : 23785 } on : shard0000 Timestamp(3, 3) 
            { "goods_id" : 23785 } -->> { "goods_id" : 29246 } on : shard0001 Timestamp(4, 2) 
            { "goods_id" : 29246 } -->> { "goods_id" : 34731 } on : shard0001 Timestamp(4, 3) 
            { "goods_id" : 34731 } -->> { "goods_id" : 40192 } on : shard0000 Timestamp(5, 2) 
            { "goods_id" : 40192 } -->> { "goods_id" : 45913 } on : shard0000 Timestamp(5, 3) 
            { "goods_id" : 45913 } -->> { "goods_id" : 51374 } on : shard0001 Timestamp(6, 2) 
            { "goods_id" : 51374 } -->> { "goods_id" : 57694 } on : shard0001 Timestamp(6, 3) 
            { "goods_id" : 57694 } -->> { "goods_id" : 63155 } on : shard0000 Timestamp(7, 2) 
            { "goods_id" : 63155 } -->> { "goods_id" : 69367 } on : shard0000 Timestamp(7, 3) 
            { "goods_id" : 69367 } -->> { "goods_id" : 74828 } on : shard0001 Timestamp(8, 2) 
            { "goods_id" : 74828 } -->> { "goods_id" : 81170 } on : shard0001 Timestamp(8, 3) 
            { "goods_id" : 81170 } -->> { "goods_id" : 86631 } on : shard0001 Timestamp(8, 5) 
            { "goods_id" : 86631 } -->> { "goods_id" : 93462 } on : shard0001 Timestamp(8, 6) 
            { "goods_id" : 93462 } -->> { "goods_id" : 98923 } on : shard0001 Timestamp(8, 8) 
            { "goods_id" : 98923 } -->> { "goods_id" : 106012 } on : shard0001 Timestamp(8, 9) 
            { "goods_id" : 106012 } -->> { "goods_id" : 111473 } on : shard0001 Timestamp(8, 11) 
            { "goods_id" : 111473 } -->> { "goods_id" : 118412 } on : shard0001 Timestamp(8, 12) 
            { "goods_id" : 118412 } -->> { "goods_id" : 123873 } on : shard0001 Timestamp(8, 14) 
            { "goods_id" : 123873 } -->> { "goods_id" : 130255 } on : shard0001 Timestamp(8, 15) 
            { "goods_id" : 130255 } -->> { "goods_id" : 135716 } on : shard0001 Timestamp(8, 17) 
            { "goods_id" : 135716 } -->> { "goods_id" : 142058 } on : shard0001 Timestamp(8, 18) 
            { "goods_id" : 142058 } -->> { "goods_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(8, 19) 

mongos> 


过一天后再看,发现两个分片上的chunk分片的比较均衡了13,14,说明它没有后台自动在做均衡,而且不是很快即均衡的,需要时间。
[mongod@mcw01 ~]$ mongo --port 30000
MongoDB shell version: 3.2.8
connecting to: 127.0.0.1:30000/test
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        14 : Success
        65 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    13
                shard0001    14
            too many chunks to print, use verbose if you want to force print

mongos> 

手动预先分片

分片的命令

replication
英 [ˌreplɪ'keɪʃ(ə)n]   美 [ˌreplɪ'keɪʃ(ə)n]  
n.
(绘画等的)复制;拷贝;重复(实验);(尤指对答辩的)回答

for(var i=1;i<=40;i++){sh.splitAt('shop.user',{userid:i*1000})}
给shop这个库下的user表切割分片,只要userid字段是1000的倍数就切割一次,形成一个新的chunk。

mongos> sh.help();
    sh.addShard( host )                       server:port OR setname/server:port
    sh.enableSharding(dbname)                 enables sharding on the database dbname
    sh.shardCollection(fullName,key,unique)   shards the collection
    sh.splitFind(fullName,find)               splits the chunk that find is in at the median
    sh.splitAt(fullName,middle)               splits the chunk that middle is in at middle
    sh.moveChunk(fullName,find,to)            move the chunk where 'find' is to 'to' (name of shard)
    sh.setBalancerState( <bool on or not> )   turns the balancer on or off true=on, false=off
    sh.getBalancerState()                     return true if enabled
    sh.isBalancerRunning()                    return true if the balancer has work in progress on any mongos
    sh.disableBalancing(coll)                 disable balancing on one collection
    sh.enableBalancing(coll)                  re-enable balancing on one collection
    sh.addShardTag(shard,tag)                 adds the tag to the shard
    sh.removeShardTag(shard,tag)              removes the tag from the shard
    sh.addTagRange(fullName,min,max,tag)      tags the specified range of the given collection
    sh.removeTagRange(fullName,min,max,tag)   removes the tagged range of the given collection
    sh.status()                               prints a general overview of the cluster
mongos> 

预先分片

给user表做分片,以userid作为片键进行分片。
假设预计一年内增长4千万用户,这两个sharding上每个上分2千万,2千万又分为20个片,每个片上是一百万个数据

我们模拟一下一共30-40个片,每个片上1千条数据 。这样预分片得使用切割的方法
mongos> use shop
switched to db shop
mongos> sh.shardCollection('shop.user',{userid:1})
{ "collectionsharded" : "shop.user", "ok" : 1 }
mongos> #给shop这个库下的user表切割分片,只要userid字段是1000的倍数就切割一次,形成一个新的chunk。
mongos> for(var i=1;i<=40;i++){sh.splitAt('shop.user',{userid:i*1000})}
{ "ok" : 1 }
mongos> 

执行预先分片后查看

查看user表,当稳定后,20,21就不会因为数据不平衡来回转移分片中的chunk,导致机器的io很高
mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("62223dc1dd5791b451d9b441")
}
  shards:
    {  "_id" : "shard0000",  "host" : "10.0.0.11:27017" }
    {  "_id" : "shard0001",  "host" : "10.0.0.11:27018" }
  active mongoses:
    "3.2.8" : 1
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        34 : Success
        65 : Failed with error 'aborted', from shard0001 to shard0000
  databases:
    {  "_id" : "test",  "primary" : "shard0000",  "partitioned" : false }
    {  "_id" : "shop",  "primary" : "shard0001",  "partitioned" : true }
        shop.goods
            shard key: { "goods_id" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    13
                shard0001    14
            too many chunks to print, use verbose if you want to force print
        shop.user
            shard key: { "userid" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    20
                shard0001    21
            too many chunks to print, use verbose if you want to force print

mongos> 

查看分片的详情,可以看到是这样分片的,现在还没有插入数据,但是已经提前知道大概多少数据,根据usserid已经划分好userid所在的分片了。这样当插入数据的时候,它应会保存到符合条件的范围chunk上的。
mongos> sh.status({verbose:1});
--- Sharding Status --- 
............. 
        shop.user
            shard key: { "userid" : 1 }
            unique: false
            balancing: true
            chunks:
                shard0000    20
                shard0001    21
            { "userid" : { "$minKey" : 1 } } -->> { "userid" : 1000 } on : shard0000 Timestamp(2, 0) 
            { "userid" : 1000 } -->> { "userid" : 2000 } on : shard0000 Timestamp(3, 0) 
            { "userid" : 2000 } -->> { "userid" : 3000 } on : shard0000 Timestamp(4, 0) 
            { "userid" : 3000 } -->> { "userid" : 4000 } on : shard0000 Timestamp(5, 0) 
            { "userid" : 4000 } -->> { "userid" : 5000 } on : shard0000 Timestamp(6, 0) 
            { "userid" : 5000 } -->> { "userid" : 6000 } on : shard0000 Timestamp(7, 0) 
            { "userid" : 6000 } -->> { "userid" : 7000 } on : shard0000 Timestamp(8, 0) 
            { "userid" : 7000 } -->> { "userid" : 8000 } on : shard0000 Timestamp(9, 0) 
            { "userid" : 8000 } -->> { "userid" : 9000 } on : shard0000 Timestamp(10, 0) 
            { "userid" : 9000 } -->> { "userid" : 10000 } on : shard0000 Timestamp(11, 0) 
            { "userid" : 10000 } -->> { "userid" : 11000 } on : shard0000 Timestamp(12, 0) 
            { "userid" : 11000 } -->> { "userid" : 12000 } on : shard0000 Timestamp(13, 0) 
            { "userid" : 12000 } -->> { "userid" : 13000 } on : shard0000 Timestamp(14, 0) 
            { "userid" : 13000 } -->> { "userid" : 14000 } on : shard0000 Timestamp(15, 0) 
            { "userid" : 14000 } -->> { "userid" : 15000 } on : shard0000 Timestamp(16, 0) 
            { "userid" : 15000 } -->> { "userid" : 16000 } on : shard0000 Timestamp(17, 0) 
            { "userid" : 16000 } -->> { "userid" : 17000 } on : shard0000 Timestamp(18, 0) 
            { "userid" : 17000 } -->> { "userid" : 18000 } on : shard0000 Timestamp(19, 0) 
            { "userid" : 18000 } -->> { "userid" : 19000 } on : shard0000 Timestamp(20, 0) 
            { "userid" : 19000 } -->> { "userid" : 20000 } on : shard0000 Timestamp(21, 0) 
            { "userid" : 20000 } -->> { "userid" : 21000 } on : shard0001 Timestamp(21, 1) 
            { "userid" : 21000 } -->> { "userid" : 22000 } on : shard0001 Timestamp(1, 43) 
            { "userid" : 22000 } -->> { "userid" : 23000 } on : shard0001 Timestamp(1, 45) 
            { "userid" : 23000 } -->> { "userid" : 24000 } on : shard0001 Timestamp(1, 47) 
            { "userid" : 24000 } -->> { "userid" : 25000 } on : shard0001 Timestamp(1, 49) 
            { "userid" : 25000 } -->> { "userid" : 26000 } on : shard0001 Timestamp(1, 51) 
            { "userid" : 26000 } -->> { "userid" : 27000 } on : shard0001 Timestamp(1, 53) 
            { "userid" : 27000 } -->> { "userid" : 28000 } on : shard0001 Timestamp(1, 55) 
            { "userid" : 28000 } -->> { "userid" : 29000 } on : shard0001 Timestamp(1, 57) 
            { "userid" : 29000 } -->> { "userid" : 30000 } on : shard0001 Timestamp(1, 59) 
            { "userid" : 30000 } -->> { "userid" : 31000 } on : shard0001 Timestamp(1, 61) 
            { "userid" : 31000 } -->> { "userid" : 32000 } on : shard0001 Timestamp(1, 63) 
            { "userid" : 32000 } -->> { "userid" : 33000 } on : shard0001 Timestamp(1, 65) 
            { "userid" : 33000 } -->> { "userid" : 34000 } on : shard0001 Timestamp(1, 67) 
            { "userid" : 34000 } -->> { "userid" : 35000 } on : shard0001 Timestamp(1, 69) 
            { "userid" : 35000 } -->> { "userid" : 36000 } on : shard0001 Timestamp(1, 71) 
            { "userid" : 36000 } -->> { "userid" : 37000 } on : shard0001 Timestamp(1, 73) 
            { "userid" : 37000 } -->> { "userid" : 38000 } on : shard0001 Timestamp(1, 75) 
            { "userid" : 38000 } -->> { "userid" : 39000 } on : shard0001 Timestamp(1, 77) 
            { "userid" : 39000 } -->> { "userid" : 40000 } on : shard0001 Timestamp(1, 79) 
            { "userid" : 40000 } -->> { "userid" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 80) 

mongos> 

插入数据,查看手动预分片的效果,防止数据在节点间来回复制

当chunk快满的时候,一定要提前解决,不然新增新的分片,导致数据大量的移动,io太高而发生服务器挂掉的情况

在mongos上插入数据
mongos> for(var i=1;i<=40000;i++){db.user.insert({userid:i,name:'xiao ma guo he'})};
WriteResult({ "nInserted" : 1 })
mongos> 

在27017上可以看到1-19999共19999条数据。根据上面的分片详情,可以知道shard0000上就是分配了1-20000,而这里是取前不取后。20000在shard0001上,所以0上有19999条数据,而1上有20000-40000的数据,20000-39999是2万条数据在1上,加上40000到最大在1上,所以就是20001条数据。预分配分片相对来说比较稳定,这也不会因为当随着数据插入分配不均衡时,数据在两个节点之间来回复制带来的性能问题。
[mongod@mcw01 ~]$ mongo --port 27017
> use shop
switched to db shop
> db.user.find().count();
19999
> db.user.find().skip(19997);
{ "_id" : ObjectId("6222d91f69eed283bf054e96"), "userid" : 19998, "name" : "xiao ma guo he" }
{ "_id" : ObjectId("6222d91f69eed283bf054e97"), "userid" : 19999, "name" : "xiao ma guo he" }
> 


[mongod@mcw01 ~]$ mongo --port 27018
> use shop
switched to db shop
> db.user.find().count();
20001
> db.user.find().skip(19999);
{ "_id" : ObjectId("6222d93669eed283bf059cb7"), "userid" : 39999, "name" : "xiao ma guo he" }
{ "_id" : ObjectId("6222d93669eed283bf059cb8"), "userid" : 40000, "name" : "xiao ma guo he" }
> db.user.find().limit(1);
{ "_id" : ObjectId("6222d91f69eed283bf054e98"), "userid" : 20000, "name" : "xiao ma guo he" }
> 

 

posted @ 2022-03-05 16:14  马昌伟  阅读(1097)  评论(0编辑  收藏  举报
博主链接地址:https://www.cnblogs.com/machangwei-8/