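MongoDB Replica Sets in Detail
I. Overview
MongoDB offers two ways to run a primary/secondary setup. The first is the master-slave mode, in which the master server must be designated explicitly in the configuration (if the master goes down at runtime, a slave is not promoted automatically). This mode also cannot replicate from one slave to another, because slaves do not keep an oplog. The second is the replica set, whose main advantage is that there is no fixed primary: if the current primary goes down, the replica set automatically elects one of the secondaries to become the new primary.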
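II. Basic Setup
The rest of this article walks through a MongoDB replica set in detail. The environment used throughout consists of three servers, 172.20.10.79, 172.20.10.80 and 172.20.10.81, each running MongoDB 3.4.9 on port 27017.
Architecture: the primary handles the processing requests that come from applications or service interfaces.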
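III. Installing MongoDB
MongoDB can be installed in several ways, for example from source, via yum, or from an rpm package. Here we download the official prebuilt tarball directly: http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.4.9.tgz?_ga=2.225511514.1125230407.1527905895-199343347.1523847839 Installing from the tarball is very simple: download it, extract it, and run it.
Here all settings go into a single configuration file. The steps are as follows: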
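# Basic steps
# The URL is quoted because it contains "?" and "&"-style query text; -O fixes the file name so the tar step below finds it
wget -O mongodb-linux-x86_64-rhel62-3.4.9.tgz "http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.4.9.tgz?_ga=2.225511514.1125230407.1527905895-199343347.1523847839"
mkdir -p /app/mongodb/data
mkdir -p /app/mongodb/logs
tar xf mongodb-linux-x86_64-rhel62-3.4.9.tgz
mv mongodb-linux-x86_64-rhel62-3.4.9 /usr/local/mongodb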
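The basic installation is now complete. Next comes the configuration file:
vim /usr/local/mongodb/bin/config.conf

dbpath=/app/mongodb/data
logpath=/app/mongodb/logs/mongodb.log
port=27017
fork=true
nohttpinterface=true
# The options above are routine and are not covered here; replSet enables the replica set and sets its name
replSet=raytest
Start MongoDB: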
/usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf
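IV. Configuring the MongoDB Replica Set
After the steps above, MongoDB should be running. (Note: perform these steps on all three servers, 172.20.10.79, 172.20.10.80 and 172.20.10.81, and copy the configuration file to each of them.) Next, configure the replica set as follows.
At this point, logging in to any of the servers and running show dbs fails. The error message states the cause clearly: the node is not a master, because the replica set has not been initialized and no primary exists yet.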
/usr/local/mongodb/bin/mongo
> show dbs
2018-06-02T18:35:20.269+0800 E QUERY    [thread1] Error: listDatabases failed:{
    "ok" : 0,
    "errmsg" : "not master and slaveOk=false",
    "code" : 13435,
    "codeName" : "NotMasterNoSlaveOk"
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs@src/mongo/shell/mongo.js:62:1
shellHelper.show@src/mongo/shell/utils.js:769:19
shellHelper@src/mongo/shell/utils.js:659:15
@(shellhelp2):1:1
Next, initialize the replica set. Run the commands below on only one of the MongoDB servers; here they are run on 172.20.10.79.
// Initialization command
config = { _id:"raytest", members:[
  {_id:0,host:"172.20.10.79:27017"},
  {_id:1,host:"172.20.10.80:27017"},
  {_id:2,host:"172.20.10.81:27017"}]
}
// Initialize the replica set with the configuration held in config
rs.initiate(config)
The result is shown below:
> config = { _id:"raytest", members:[
... {_id:0,host:"172.20.10.79:27017"},
... {_id:1,host:"172.20.10.80:27017"},
... {_id:2,host:"172.20.10.81:27017"}]
... }
{
    "_id" : "raytest",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.20.10.79:27017"
        },
        {
            "_id" : 1,
            "host" : "172.20.10.80:27017"
        },
        {
            "_id" : 2,
            "host" : "172.20.10.81:27017"
        }
    ]
}
> rs.initiate(config)
{ "ok" : 1 }
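When the member list is longer or changes often, the configuration object passed to rs.initiate() can be generated instead of typed by hand. The following is only a sketch in plain JavaScript: buildReplSetConfig is a hypothetical helper name, not part of MongoDB, and it assumes members should simply be numbered in list order, as in the hand-written config above.

```javascript
// Hypothetical helper (not a MongoDB API): builds a replica set config object
// of the same shape as the hand-written one, numbering members in list order.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    }),
  };
}

var generatedConfig = buildReplSetConfig("raytest", [
  "172.20.10.79:27017",
  "172.20.10.80:27017",
  "172.20.10.81:27017",
]);

// Print the object for inspection; in the mongo shell it would then be
// passed to rs.initiate(generatedConfig).
console.log(JSON.stringify(generatedConfig, null, 2));
```

The snippet uses console.log so that it runs under Node.js; inside the mongo shell, printjson(generatedConfig) would be the usual way to display the object.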
Now check the replica set status. Running rs.status() on any server in the replica set shows the same overall view.
raytest:OTHER> rs.status() { "set" : "raytest", "date" : ISODate("2018-06-02T10:45:22.416Z"), "myState" : 1, "term" : NumberLong(1), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) } }, "members" : [ { "_id" : 0, "name" : "172.20.10.79:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 652, "optime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T10:45:15Z"), "electionTime" : Timestamp(1527936124, 1), "electionDate" : ISODate("2018-06-02T10:42:04Z"), "configVersion" : 1, "self" : true }, { "_id" : 1, "name" : "172.20.10.80:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 209, "optime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T10:45:15Z"), "optimeDurableDate" : ISODate("2018-06-02T10:45:15Z"), "lastHeartbeat" : ISODate("2018-06-02T10:45:20.511Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T10:45:21.300Z"), "pingMs" : NumberLong(2), "syncingTo" : "172.20.10.79:27017", "configVersion" : 1 }, { "_id" : 2, "name" : "172.20.10.81:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 209, "optime" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1527936315, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T10:45:15Z"), "optimeDurableDate" : ISODate("2018-06-02T10:45:15Z"), "lastHeartbeat" : ISODate("2018-06-02T10:45:20.511Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T10:45:21.285Z"), "pingMs" : NumberLong(2), "syncingTo" : "172.20.10.79:27017", "configVersion" : 1 } ], "ok" : 1 }
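The full rs.status() document is long, and usually only each member's name and state matter. As a rough sketch, the hypothetical helper summarizeMembers below reduces a status document to one line per member. It relies only on the name, stateStr and self fields visible in the output above, and the sample object is a trimmed copy of that output rather than a live result.

```javascript
// Hypothetical helper: one summary line per member of an rs.status() document.
function summarizeMembers(status) {
  return status.members.map(function (member) {
    return member.name + " " + member.stateStr + (member.self ? " (self)" : "");
  });
}

// Trimmed copy of the rs.status() output shown above (most fields omitted).
var sampleStatus = {
  set: "raytest",
  members: [
    { name: "172.20.10.79:27017", health: 1, stateStr: "PRIMARY", self: true },
    { name: "172.20.10.80:27017", health: 1, stateStr: "SECONDARY" },
    { name: "172.20.10.81:27017", health: 1, stateStr: "SECONDARY" },
  ],
};

summarizeMembers(sampleStatus).forEach(function (line) {
  console.log(line);
});
```

In the mongo shell the same function could be called as summarizeMembers(rs.status()), with print in place of console.log.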
V. Testing
Insert a document on the primary and on a secondary, and see whether each insert succeeds.
Inserting on the primary:
raytest:PRIMARY> use ray
switched to db ray
raytest:PRIMARY> db.raytables.insert({"user":"ray"})
WriteResult({ "nInserted" : 1 })
raytest:PRIMARY> show tables
raytables
raytest:PRIMARY> db.raytables.find()
{ "_id" : ObjectId("5b127649e8be407ef3f1d487"), "user" : "ray" }
raytest:PRIMARY>
The insert on the primary succeeds. Now try the same insert on a secondary:
raytest:SECONDARY> use ray
switched to db ray
raytest:SECONDARY> db.raytables.insert({"name":"jack"})
WriteResult({ "writeError" : { "code" : 10107, "errmsg" : "not master" } })
raytest:SECONDARY>
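The insert fails, and the error is clear: not master. This node is not the primary, so it cannot accept writes. Next, check whether the document just inserted on the primary has been replicated to the secondary (reads on a secondary require rs.slaveOk() first, as described in the notes at the end):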
raytest:SECONDARY> use ray
switched to db ray
raytest:SECONDARY> show tables
raytables
raytest:SECONDARY> db.raytables.find()
{ "_id" : ObjectId("5b127649e8be407ef3f1d487"), "user" : "ray" }
raytest:SECONDARY>
VI. Simulating Failures
First simulate a secondary failure by killing the MongoDB process on 172.20.10.81.
# kill MongoDB
ps axu | grep mongo
root     29240  0.3  1.1 1548624 43920 ?       Sl   18:34   0:04 /usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf
root     29320  0.0  0.0 103252   828 pts/0    R+   18:57   0:00 grep mongo
kill -9 29240
ps axu | grep mongo
root     29322  0.0  0.0 103252   824 pts/0    R+   18:57   0:00 grep mongo
Check the replica set status:
raytest:PRIMARY> rs.status() { "set" : "raytest", "date" : ISODate("2018-06-02T10:58:07.311Z"), "myState" : 1, "term" : NumberLong(1), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1527937085, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1527937085, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1527937085, 1), "t" : NumberLong(1) } }, "members" : [ { "_id" : 0, "name" : "172.20.10.79:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 1417, "optime" : { "ts" : Timestamp(1527937085, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T10:58:05Z"), "electionTime" : Timestamp(1527936124, 1), "electionDate" : ISODate("2018-06-02T10:42:04Z"), "configVersion" : 1, "self" : true }, { "_id" : 1, "name" : "172.20.10.80:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 974, "optime" : { "ts" : Timestamp(1527937075, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1527937075, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T10:57:55Z"), "optimeDurableDate" : ISODate("2018-06-02T10:57:55Z"), "lastHeartbeat" : ISODate("2018-06-02T10:58:05.943Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T10:58:06.504Z"), "pingMs" : NumberLong(2), "syncingTo" : "172.20.10.79:27017", "configVersion" : 1 }, { "_id" : 2, "name" : "172.20.10.81:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", "uptime" : 0, "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2018-06-02T10:58:05.943Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T10:57:56.456Z"), "pingMs" : NumberLong(1), "lastHeartbeatMessage" : "Connection refused", "configVersion" : -1 } ], "ok" : 1 }
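The state of 172.20.10.81 has changed to (not reachable/healthy). Now look at the logs on the other MongoDB nodes (note: these messages keep being written for as long as the node is down, so watch the disk space):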
# The following lines are printed repeatedly
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to 172.20.10.81:27017
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.20.10.81:27017 - HostUnreachable: Connection refused
2018-06-02T18:58:15.955+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to 172.20.10.81:27017 due to failed operation on a connection
2018-06-02T18:58:15.955+0800 I REPL     [ReplicationExecutor] Error in heartbeat request to 172.20.10.81:27017; HostUnreachable: Connection refused
Once the server is restored, the replica set returns to its normal state.
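Rather than reading the whole status document or tailing logs, the failed member can be picked out programmatically. The sketch below assumes, as in the output above, that an unreachable member is reported with health 0 and a lastHeartbeatMessage. unreachableMembers is a hypothetical helper name, and the sample object is a trimmed copy of the status shown earlier.

```javascript
// Hypothetical helper: lists members that rs.status() reports with health 0,
// together with the last heartbeat error message when one is present.
function unreachableMembers(status) {
  return status.members
    .filter(function (member) {
      return member.health === 0;
    })
    .map(function (member) {
      return {
        name: member.name,
        reason: member.lastHeartbeatMessage || "unknown",
      };
    });
}

// Trimmed copy of the status seen while 172.20.10.81 was down.
var degradedStatus = {
  set: "raytest",
  members: [
    { name: "172.20.10.79:27017", health: 1, stateStr: "PRIMARY" },
    { name: "172.20.10.80:27017", health: 1, stateStr: "SECONDARY" },
    {
      name: "172.20.10.81:27017",
      health: 0,
      stateStr: "(not reachable/healthy)",
      lastHeartbeatMessage: "Connection refused",
    },
  ],
};

console.log(JSON.stringify(unreachableMembers(degradedStatus)));
```

A check like this could feed a simple alert, which is gentler on disk space than relying on the repeated log lines.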
raytest:PRIMARY> rs.status() { "set" : "raytest", "date" : ISODate("2018-06-02T11:03:17.039Z"), "myState" : 1, "term" : NumberLong(1), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) } }, "members" : [ { "_id" : 0, "name" : "172.20.10.79:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 1727, "optime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T11:03:15Z"), "electionTime" : Timestamp(1527936124, 1), "electionDate" : ISODate("2018-06-02T10:42:04Z"), "configVersion" : 1, "self" : true }, { "_id" : 1, "name" : "172.20.10.80:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 1284, "optime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T11:03:15Z"), "optimeDurableDate" : ISODate("2018-06-02T11:03:15Z"), "lastHeartbeat" : ISODate("2018-06-02T11:03:16.478Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:03:17.023Z"), "pingMs" : NumberLong(1), "syncingTo" : "172.20.10.79:27017", "configVersion" : 1 }, { "_id" : 2, "name" : "172.20.10.81:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 6, "optime" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1527937395, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2018-06-02T11:03:15Z"), "optimeDurableDate" : ISODate("2018-06-02T11:03:15Z"), "lastHeartbeat" : ISODate("2018-06-02T11:03:16.299Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:03:14.297Z"), "pingMs" : NumberLong(2), "syncingTo" : "172.20.10.79:27017", "configVersion" : 1 } ], "ok" : 1 }
Next, simulate a failure of the primary (IP: 172.20.10.79):
# kill the MongoDB primary
[root@mongodb-001 src]# ps aux | grep mong
root     29393  0.3  1.1 1637000 44000 ?       Sl   18:34   0:06 /usr/local/mongodb/bin/mongod --config /usr/local/mongodb/bin/config.conf
root     29527  0.0  0.0 103252   832 pts/0    S+   19:04   0:00 grep mong
[root@mongodb-001 src]# kill -9 29393
[root@mongodb-001 src]# ps aux | grep mong
root     29529  0.0  0.0 103252   828 pts/0    R+   19:05   0:00 grep mong
Check the replica set status:
raytest:PRIMARY> rs.status() { "set" : "raytest", "date" : ISODate("2018-06-02T11:06:01.866Z"), "myState" : 1, "term" : NumberLong(2), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) }, "appliedOpTime" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) }, "durableOpTime" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) } }, "members" : [ { "_id" : 0, "name" : "172.20.10.79:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", "uptime" : 0, "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) }, "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2018-06-02T11:06:01.620Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:04:44.624Z"), "pingMs" : NumberLong(1), "lastHeartbeatMessage" : "Connection refused", "configVersion" : -1 }, { "_id" : 1, "name" : "172.20.10.80:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 1872, "optime" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2018-06-02T11:05:54Z"), "infoMessage" : "could not find member to sync from", "electionTime" : Timestamp(1527937493, 1), "electionDate" : ISODate("2018-06-02T11:04:53Z"), "configVersion" : 1, "self" : true }, { "_id" : 2, "name" : "172.20.10.81:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 170, "optime" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) }, "optimeDurable" : { "ts" : Timestamp(1527937554, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2018-06-02T11:05:54Z"), "optimeDurableDate" : ISODate("2018-06-02T11:05:54Z"), "lastHeartbeat" : ISODate("2018-06-02T11:06:01.620Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:06:00.776Z"), "pingMs" : NumberLong(1), "syncingTo" : "172.20.10.80:27017", "configVersion" : 1 } ], "ok" : 1 }
The primary has changed from 172.20.10.79 to 172.20.10.80. Now look at the logs on the other nodes:
2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to 172.20.10.79:27017
2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to 172.20.10.79:27017 - HostUnreachable: Connection refused
2018-06-02T19:07:09.700+0800 I ASIO     [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to 172.20.10.79:27017 due to failed operation on a connection
2018-06-02T19:07:09.700+0800 I REPL     [ReplicationExecutor] Error in heartbeat request to 172.20.10.79:27017; HostUnreachable: Connection refused
Finally, bring the old primary back and look at the replica set status:
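The failover above succeeds because a new primary can only be elected while a majority of the voting members can still reach each other, and two of three is a majority. The arithmetic can be sketched as follows. votesNeeded and canElectPrimary are hypothetical names, and the sketch deliberately ignores priorities, arbiters and non-voting members.

```javascript
// Hypothetical helpers illustrating the majority rule for elections.
// Simplification: every member is assumed to be a plain voting member.
function votesNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= votesNeeded(votingMembers);
}

console.log(canElectPrimary(3, 2)); // true: one of three members down, as in this test
console.log(canElectPrimary(3, 1)); // false: two of three members down
```

This is also why a three-member set tolerates the loss of one node but not two: a single surviving member cannot form a majority and would not be elected primary.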
raytest:PRIMARY> rs.status() { "set" : "raytest", "date" : ISODate("2018-06-02T11:08:06.269Z"), "myState" : 1, "term" : NumberLong(2), "heartbeatIntervalMillis" : NumberLong(2000), "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "appliedOpTime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "durableOpTime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) } }, "members" : [ { "_id" : 0, "name" : "172.20.10.79:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 6, "optime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "optimeDurable" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2018-06-02T11:08:04Z"), "optimeDurableDate" : ISODate("2018-06-02T11:08:04Z"), "lastHeartbeat" : ISODate("2018-06-02T11:08:05.792Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:08:04.376Z"), "pingMs" : NumberLong(1), "syncingTo" : "172.20.10.80:27017", "configVersion" : 1 }, { "_id" : 1, "name" : "172.20.10.80:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 1997, "optime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2018-06-02T11:08:04Z"), "electionTime" : Timestamp(1527937493, 1), "electionDate" : ISODate("2018-06-02T11:04:53Z"), "configVersion" : 1, "self" : true }, { "_id" : 2, "name" : "172.20.10.81:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 295, "optime" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "optimeDurable" : { "ts" : Timestamp(1527937684, 1), "t" : NumberLong(2) }, "optimeDate" : ISODate("2018-06-02T11:08:04Z"), "optimeDurableDate" : ISODate("2018-06-02T11:08:04Z"), "lastHeartbeat" : ISODate("2018-06-02T11:08:05.788Z"), "lastHeartbeatRecv" : ISODate("2018-06-02T11:08:04.972Z"), "pingMs" : NumberLong(1), "syncingTo" : "172.20.10.80:27017", "configVersion" : 1 } ], "ok" : 1 }
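Once the former primary recovers, it automatically rejoins the replica set, but as a secondary rather than as the primary. This shows that a primary that drops out and later comes back does not take the primary role back (all members here use the default, equal priority).
This completes the MongoDB replica set configuration. The notes below cover a few problems encountered in real-world operation.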
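VII. Notes
1. Build the replica set on empty databases. If the MongoDB instances being added already contain data, initialization reports an error (in this version, only the node on which rs.initiate() is run may hold data). Back up the data first, then empty MongoDB, and only then build the replica set.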
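2. After the replica set has been built, commands such as show dbs fail when run on a secondary. The fix is to run rs.slaveOk() on that secondary first.
For MongoDB master-slave operation, see: https://blog.csdn.net/canot/article/details/50739359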