MongoDB Replica Sets (Part 2)

Operations

9.1 Setting up the replica set:

[mongo_3 /mongodb]# numactl --interleave=all mongod  --replSet sh1 --port 10000 --dbpath=/mongodb/sh1/data  \

--logpath=/mongodb/sh1/logs/sh1.log --logappend --fork

[mongo_3 /mongodb]# mongo --port 10000

> rs.status()     -- checks the replica set status

{

"startupStatus" : 3,

"info" : "run rs.initiate(...) if not yet done for the set",

"ok" : 0,

"errmsg" : "can't get local.system.replset config from self or any seed (EMPTYCONFIG)"

}

The error shows that the set has not been initialized yet; the server log confirms this:

Tue Jul 16 20:49:42.533 [initandlisten] waiting for connections on port 10000

Tue Jul 16 20:49:42.539 [rsStart] replSet info you may need to run replSetInitiate -- rs.initiate() in the shell -- if that is not already done

Tue Jul 16 20:49:50.623 [initandlisten] connection accepted from 127.0.0.1:48208 #1 (1 connection now open)

Tue Jul 16 20:49:52.540 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Tue Jul 16 20:50:52.542 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

9.2 Initializing the replica set:

> use admin

switched to db admin

>  db.runCommand({"replSetInitiate" :

...                  {

...                   "_id" : "sh1",

...                   "members" : [

...                      {"_id" : 1,"host" : "192.168.69.43:10000",priority:4}

...                               ]

...                  }

...                }

...               )

sh1:PRIMARY>     -- the node becomes PRIMARY immediately

The log is shown below. Note that the first replSetInitiate attempt failed because the configured host (192.168.73.192) did not match this node's address; the retry with 192.168.69.43 succeeded:

Tue Jul 16 21:04:46.397 [conn4] replSet replSetInitiate exception: can't find self in the replset config my port: 10000

Tue Jul 16 21:04:46.397 [conn4] command admin.$cmd command: { replSetInitiate: { _id: "sh1", members: [ { _id: 1.0, host: "192.168.73.192:10000", priority: 4.0 } ] } } ntoreturn:1 keyUpdates:0 locks(micros) W:68 reslen:122 63001ms

Tue Jul 16 21:04:52.580 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Tue Jul 16 21:04:57.684 [initandlisten] connection accepted from 127.0.0.1:48213 #5 (2 connections now open)

Tue Jul 16 21:05:02.581 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Tue Jul 16 21:05:07.687 [initandlisten] connection accepted from 127.0.0.1:48214 #6 (3 connections now open)

Tue Jul 16 21:05:12.582 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Tue Jul 16 21:05:22.582 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Tue Jul 16 21:05:25.921 [conn6] replSet replSetInitiate admin command received from client

Tue Jul 16 21:05:25.921 [conn6] replSet replSetInitiate config object parses ok, 1 members specified

Tue Jul 16 21:05:25.922 [conn6] replSet replSetInitiate all members seem up

Tue Jul 16 21:05:25.922 [conn6] ******

Tue Jul 16 21:05:25.922 [conn6] creating replication oplog of size: 23441MB... -- about 5% of disk size

Tue Jul 16 21:05:26.107 [FileAllocator] done allocating datafile /mongodb/sh1/data/local.11, size: 2047MB,  took 0.011 secs

Tue Jul 16 21:05:26.109 [FileAllocator] allocating new datafile /mongodb/sh1/data/local.12, filling with zeroes...

Tue Jul 16 21:05:26.118 [FileAllocator] done allocating datafile /mongodb/sh1/data/local.12, size: 2047MB,  took 0.008 secs

Tue Jul 16 21:05:26.118 [conn6] ******

Tue Jul 16 21:05:26.119 [conn6] replSet info saving a newer config version to local.system.replset

Tue Jul 16 21:05:26.119 [conn6] replSet saveConfigLocally done

Tue Jul 16 21:05:26.119 [conn6] replSet replSetInitiate config now saved locally.  Should come online in about a minute.

Tue Jul 16 21:05:26.119 [conn6] command admin.$cmd command: { replSetInitiate: { _id: "sh1", members: [ { _id: 1.0, host: "192.168.69.43:10000", priority: 4.0 } ] } } ntoreturn:1 keyUpdates:0 locks(micros) W:197451 reslen:112 198ms

Tue Jul 16 21:05:32.583 [rsStart] replSet I am 192.168.69.43:10000

Tue Jul 16 21:05:32.583 [rsStart] replSet STARTUP2

Tue Jul 16 21:05:33.584 [rsSync] replSet SECONDARY

Tue Jul 16 21:05:33.585 [rsMgr] replSet info electSelf 1

Tue Jul 16 21:05:34.584 [rsMgr] replSet PRIMARY

As the log shows, the oplog is created right after initialization, occupying 23441MB, roughly 5% of the disk. During the election the node first votes for itself and then becomes PRIMARY.
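As a rough illustration of where the 23441MB figure comes from (assuming the default in this MongoDB version of sizing the oplog at about 5% of free disk space on 64-bit Linux; the function and parameter names here are illustrative, not part of MongoDB):

```javascript
// Sketch: how a default oplog size like the one in the log above
// could arise. Assumption: ~5% of free disk space, rounded down.
function defaultOplogSizeMB(freeDiskMB) {
  return Math.floor(freeDiskMB * 0.05);
}

// Roughly 458 GB of free disk yields the 23441 MB seen in the log.
console.log(defaultOplogSizeMB(468820)); // 23441
```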

9.3 Starting the second member

Start it with the default priority and add it to the replica set:

[root@localhost mongodb]# numactl --interleave=all /mongodb/bin/mongod  --replSet sh1  \

--port 10000 --dbpath=/mongodb/sh1/data  --logpath=/mongodb/sh1/logs/sh1.log --logappend --fork


sh1:PRIMARY> rs.add("192.168.69.44:10000")

{ "ok" : 1 }

sh1:PRIMARY> rs.conf()

{

"_id" : "sh1",

"version" : 2,

"members" : [

{

"_id" : 1,

"host" : "192.168.69.43:10000",

"priority" : 4

},

{

"_id" : 2,

"host" : "192.168.69.44:10000"

}

]

}

The log shows the reconfiguration and the new member's state transitions:

Wed Jul 17 09:48:00.134 [conn8] replSet replSetReconfig config object parses ok, 2 members specified

Wed Jul 17 09:48:00.139 [conn8] replSet replSetReconfig [2]

Wed Jul 17 09:48:00.139 [conn8] replSet info saving a newer config version to local.system.replset

Wed Jul 17 09:48:00.139 [conn8] replSet saveConfigLocally done

Wed Jul 17 09:48:00.139 [conn8] replSet info : additive change to configuration

Wed Jul 17 09:48:00.139 [conn8] replSet replSetReconfig new config saved locally

Wed Jul 17 09:48:00.140 [rsHealthPoll] replSet member 192.168.69.44:10000 is up

Wed Jul 17 09:48:00.140 [rsMgr] replSet total number of votes is even - add arbiter or give one member an extra vote  -- an odd number of voting members is recommended

Wed Jul 17 09:48:04.803 [initandlisten] connection accepted from 192.168.69.44:49397 #9 (2 connections now open)

Wed Jul 17 09:48:04.804 [conn9] end connection 192.168.69.44:49397 (1 connection now open)

Wed Jul 17 09:48:04.805 [initandlisten] connection accepted from 192.168.69.44:49398 #10 (2 connections now open)

Wed Jul 17 09:48:06.143 [rsHealthPoll] replset info 192.168.69.44:10000 thinks that we are down

Wed Jul 17 09:48:06.143 [rsHealthPoll] replSet member 192.168.69.44:10000 is now in state STARTUP2

Wed Jul 17 09:48:21.093 [initandlisten] connection accepted from 192.168.69.44:49399 #11 (3 connections now open)

Wed Jul 17 09:48:21.297 [conn11] end connection 192.168.69.44:49399 (2 connections now open)

Wed Jul 17 09:48:21.816 [initandlisten] connection accepted from 192.168.69.44:49400 #12 (3 connections now open)

Wed Jul 17 09:48:22.151 [rsHealthPoll] replSet member 192.168.69.44:10000 is now in state RECOVERING

Wed Jul 17 09:48:22.300 [initandlisten] connection accepted from 192.168.69.44:49401 #13 (4 connections now open)

Wed Jul 17 09:48:23.305 [slaveTracking] build index local.slaves { _id: 1 }

Wed Jul 17 09:48:23.308 [slaveTracking] build index done.  scanned 0 total records. 0.001 secs

Wed Jul 17 09:48:24.152 [rsHealthPoll] replSet member 192.168.69.44:10000 is now in state SECONDARY

Wed Jul 17 09:48:34.822 [conn10] end connection 192.168.69.44:49398 (3 connections now open)

Wed Jul 17 09:48:34.823 [initandlisten] connection accepted from 192.168.69.44:49402 #14 (4 connections now open)

Wed Jul 17 09:49:04.838 [conn14] end connection 192.168.69.44:49402 (3 connections now open)

9.4 Killing one member

Because the surviving node can no longer communicate with a majority of the set, no election can take place and the primary steps down:

Wed Jul 17 09:54:12.672 [conn12] end connection 192.168.69.44:49400 (1 connection now open)

Wed Jul 17 09:54:14.341 [rsHealthPoll] DBClientCursor::init call() failed

Wed Jul 17 09:54:14.341 [rsHealthPoll] replset info 192.168.69.44:10000 heartbeat failed, retrying

Wed Jul 17 09:54:14.342 [rsHealthPoll] replSet info 192.168.69.44:10000 is down (or slow to respond):

Wed Jul 17 09:54:14.342 [rsHealthPoll] replSet member 192.168.69.44:10000 is now in state DOWN

Wed Jul 17 09:54:14.342 [rsMgr] can't see a majority of the set, relinquishing primary

Wed Jul 17 09:54:14.342 [rsMgr] replSet relinquishing primary state

Wed Jul 17 09:54:14.342 [rsMgr] replSet SECONDARY

Wed Jul 17 09:54:14.342 [rsMgr] replSet closing client sockets after relinquishing primary

Wed Jul 17 09:54:15.343 [conn8] end connection 127.0.0.1:48216 (0 connections now open)

Wed Jul 17 09:54:15.346 [initandlisten] connection accepted from 127.0.0.1:48232 #26 (1 connection now open)

Wed Jul 17 09:54:16.343 [rsHealthPoll] replset info 192.168.69.44:10000 heartbeat failed, retrying

Wed Jul 17 09:54:18.345 [rsHealthPoll] replset info 192.168.69.44:10000 heartbeat failed, retrying

Wed Jul 17 09:54:20.346 [rsHealthPoll] replset info 192.168.69.44:10000 heartbeat failed, retrying

Wed Jul 17 09:54:20.346 [rsMgr] replSet can't see a majority, will not try to elect self

Wed Jul 17 09:54:22.347 [rsHealthPoll] replset info 192.168.69.44:10000 heartbeat failed, retrying


sh1:SECONDARY> rs.status()

{

"set" : "sh1",

"date" : ISODate("2013-07-17T01:55:41Z"),

"myState" : 2,

"members" : [

{

"_id" : 1,

"name" : "192.168.69.43:10000",

"health" : 1,

"state" : 2,   -- state 1 = PRIMARY, 2 = SECONDARY, 3 = RECOVERING, 8 = not reachable/healthy, 10 = REMOVED

"stateStr" : "SECONDARY",

"uptime" : 47159,

"optime" : {

"t" : 1374025680,

"i" : 1

},

"optimeDate" : ISODate("2013-07-17T01:48:00Z"),

"self" : true

},

{

"_id" : 2,

"name" : "192.168.69.44:10000",

"health" : 0,

"state" : 8,

"stateStr" : "(not reachable/healthy)",

"uptime" : 0,

"optime" : {

"t" : 1374025680,

"i" : 1

},

"optimeDate" : ISODate("2013-07-17T01:48:00Z"),

"lastHeartbeat" : ISODate("2013-07-17T01:55:40Z"),

"lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),

"pingMs" : 0,

"syncingTo" : "192.168.69.43:10000"

}

],

"ok" : 1

}
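The step-down above follows from the majority rule: a member may be (or remain) primary only while it can see a strict majority of the set's members, itself included. A minimal sketch (function names are illustrative):

```javascript
// Majority of an n-member set: more than half of all members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// A node may hold primary only while the members it can see
// (including itself) form a majority of the whole set.
function canStayPrimary(visibleMembers, totalMembers) {
  return visibleMembers >= majority(totalMembers);
}

console.log(canStayPrimary(2, 2)); // true  -- both members up
console.log(canStayPrimary(1, 2)); // false -- peer killed: step down
```

This is why a two-member set loses its primary as soon as either member dies: one visible member out of two is not a majority.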

9.5 Starting the third node and adding it to the set

Its priority is set to 5, so a new election starts and the third node takes over as PRIMARY.

sh1:PRIMARY> rs.add({_id: 3,host:"192.168.69.45:10000",priority:5})

{ "ok" : 1 }
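A simplified model of why the new node takes over: among healthy members, the one with the highest priority is preferred as primary (priority defaults to 1 when unset). Real elections also weigh oplog freshness and votes; this sketch only captures the priority rule, and the helper name is illustrative:

```javascript
// Simplified election preference: the healthy member with the
// highest priority wins; an unset priority counts as 1.
function preferredPrimary(members) {
  return members
    .filter(m => m.healthy)
    .reduce((best, m) =>
      (m.priority || 1) > (best.priority || 1) ? m : best);
}

const members = [
  { host: "192.168.69.43:10000", priority: 4, healthy: true },
  { host: "192.168.69.44:10000", healthy: true },             // default 1
  { host: "192.168.69.45:10000", priority: 5, healthy: true },
];
console.log(preferredPrimary(members).host); // 192.168.69.45:10000
```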


Changing priorities:

Now change the priority of 192.168.69.44 to make it the primary:


sh1:PRIMARY> rs.conf()

{

"_id" : "sh1",

"version" : 13,

"members" : [

{

"_id" : 2,

"host" : "192.168.69.44:10000"

},

{

"_id" : 3,

"host" : "192.168.69.45:10000",

"priority" : 5

},

{

"_id" : 1,

"host" : "192.168.69.43:10000",

"priority" : 2

},

{

"_id" : 4,

"host" : "192.168.69.46:10000",

"priority" : 7

}

]

}

sh1:PRIMARY> cfg=rs.conf()

sh1:PRIMARY> cfg.members[0].priority = 10

sh1:PRIMARY> rs.reconfig(cfg)

Note how the index into members[] is chosen: it is the zero-based position in the array (0, 1, 2, 3 in the example above), not the member's _id.

Reconfiguring this way usually makes the primary step down and triggers a new election, which typically takes 10-20 seconds, during which clients are disconnected.

If a failed node cannot be recovered quickly, it is best to remove it from the replica set manually with rs.remove(); otherwise the remaining servers will keep trying to reconnect to it.
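Since the members[] index is positional rather than tied to _id, a safer pattern is to look the member up by host before changing its priority. A sketch on a plain config object (setPriorityByHost is an illustrative helper, not a shell built-in):

```javascript
// Find a member by host and set its priority, instead of relying
// on its position in cfg.members. Pass the result to rs.reconfig().
function setPriorityByHost(cfg, host, priority) {
  const member = cfg.members.find(m => m.host === host);
  if (!member) throw new Error("no member with host " + host);
  member.priority = priority;
  return cfg;
}

// In the shell this would be used as:
//   cfg = rs.conf()
//   rs.reconfig(setPriorityByHost(cfg, "192.168.69.44:10000", 10))
const cfg = { _id: "sh1", members: [
  { _id: 2, host: "192.168.69.44:10000" },
  { _id: 3, host: "192.168.69.45:10000", priority: 5 },
]};
console.log(setPriorityByHost(cfg, "192.168.69.44:10000", 10)
  .members[0].priority); // 10
```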

9.6 Adding an arbiter

numactl --interleave=all /mongodb/bin/mongod  --replSet sh1 --port 10001 \

--dbpath=/mongodb/arb/data  --logpath=/mongodb/arb/logs/arb.log --logappend --fork

sh1:PRIMARY> rs.addArb("192.168.69.45:10001")

{ "ok" : 1 }


sh1:PRIMARY> rs.conf()

{

"_id" : "sh1",

"version" : 24,

"members" : [

{

"_id" : 2,

"host" : "192.168.69.44:10000",

"priority" : 20

},

{

"_id" : 3,

"host" : "192.168.69.45:10000",

"priority" : 5

},

{

"_id" : 4,

"host" : "192.168.69.46:10000",

"priority" : 8

},

{

"_id" : 5,

"host" : "192.168.69.45:10001",

"arbiterOnly" : true

}

]

}

posted @ 2021-05-02 18:49 by hexel