098. How Swarm Implements Failover (Swarm05)
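Failures are inevitable: containers crash and Docker hosts go down. Fortunately, Swarm has a built-in failover policy.
When we created the service, we never told Swarm what to do when something fails. We only declared the desired state (for example, 3 replicas). Swarm does its best to reach and maintain that desired state, no matter what happens.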
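The desired-state idea can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Swarm's actual scheduler: the function name `reconcile`, the flat list of node names, and the "least-loaded node first" placement rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def reconcile(desired, tasks, healthy_nodes):
    """Toy reconciliation step (hypothetical, not Swarm's real code).

    desired:       number of replicas the service should have
    tasks:         list of node names, one entry per running task
    healthy_nodes: set of node names that can currently run tasks
    Returns the new list of node names, one entry per running task.
    """
    # Tasks on failed nodes are lost; keep only those on healthy nodes.
    surviving = [node for node in tasks if node in healthy_nodes]

    # Track how many tasks each healthy node already runs.
    load = Counter({node: 0 for node in healthy_nodes})
    load.update(surviving)

    # Start replacements on the least-loaded healthy node
    # (assumed placement rule; ties are broken alphabetically).
    while len(surviving) < desired and healthy_nodes:
        target = min(sorted(healthy_nodes), key=lambda n: load[n])
        surviving.append(target)
        load[target] += 1

    # If there are more tasks than desired, drop the extras.
    return surviving[:desired]


# Two tasks on host01 and one on host02, then host01 fails.
before = ["host01", "host01", "host02"]
after = reconcile(3, before, {"host02"})
print(after)  # ['host02', 'host02', 'host02']
```

The caller never says how to recover. It only passes the desired count, and the function works out which tasks to replace and where to put them.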
root@host03:~# docker service ps web_server
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
tznw0lrn8wh8 web_server.1 httpd:latest host01 Running Running 28 minutes ago
jykct1jmfrte \_ web_server.1 httpd:latest host03 Shutdown Shutdown 28 minutes ago
4x32c9x1hizg web_server.2 httpd:latest host01 Running Running 28 minutes ago
n4afxlx16tny \_ web_server.2 httpd:latest host03 Shutdown Shutdown 28 minutes ago
mlsb1ey4n65r web_server.4 httpd:latest host02 Running Running about an hour ago
The 3 replicas currently run on host01 (2 replicas) and host02 (1 replica). To test Swarm's failover behavior, we now shut down host01.
root@host03:~# docker service ps web_server # state before host01 is shut down
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
tznw0lrn8wh8 web_server.1 httpd:latest host01 Running Running 12 seconds ago
jykct1jmfrte \_ web_server.1 httpd:latest host03 Shutdown Shutdown 44 minutes ago
4x32c9x1hizg web_server.2 httpd:latest host01 Running Running 12 seconds ago
n4afxlx16tny \_ web_server.2 httpd:latest host03 Shutdown Shutdown 44 minutes ago
mlsb1ey4n65r web_server.4 httpd:latest host02 Running Running 2 hours ago
root@host03:~# docker service ps web_server # Swarm detects that host01 is down and starts new containers on host02
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
wtipc3fngioo web_server.1 httpd:latest host02 Ready Ready less than a second ago
tznw0lrn8wh8 \_ web_server.1 httpd:latest host01 Shutdown Running 13 seconds ago
jykct1jmfrte \_ web_server.1 httpd:latest host03 Shutdown Shutdown 44 minutes ago
vhp70jgq49j8 web_server.2 httpd:latest host02 Ready Ready less than a second ago
4x32c9x1hizg \_ web_server.2 httpd:latest host01 Shutdown Running 13 seconds ago
n4afxlx16tny \_ web_server.2 httpd:latest host03 Shutdown Shutdown 44 minutes ago
mlsb1ey4n65r web_server.4 httpd:latest host02 Running Running 2 hours ago
root@host03:~# docker service ps web_server # failover complete: all 3 replicas now run on host02
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
wtipc3fngioo web_server.1 httpd:latest host02 Running Running 5 seconds ago
tznw0lrn8wh8 \_ web_server.1 httpd:latest host01 Shutdown Running 24 seconds ago
jykct1jmfrte \_ web_server.1 httpd:latest host03 Shutdown Shutdown 44 minutes ago
vhp70jgq49j8 web_server.2 httpd:latest host02 Running Running 5 seconds ago
4x32c9x1hizg \_ web_server.2 httpd:latest host01 Shutdown Running 24 seconds ago
n4afxlx16tny \_ web_server.2 httpd:latest host03 Shutdown Shutdown 44 minutes ago
mlsb1ey4n65r web_server.4 httpd:latest host02 Running Running 2 hours ago
root@host03:~# docker node ls # host01 is now marked Down
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
hvt2ez9e7zvqm2hz8nix1eke7 host01 Down Active 18.09.6
asn5ufnogzkyqigk4mizatoer host02 Ready Active 18.09.6
h6rzavsz2vjxstwj3pytiebjb * host03 Ready Drain Leader 18.09.6
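The node list also explains why all three replicas ended up on host02: host01 is Down, and host03 is set to Drain, so host02 is the only node left that can accept tasks. The sketch below models that filter with the values from the table above. The `Node` class and the `schedulable` function are hypothetical names used only for illustration. The rule they encode (a node receives new tasks only when its status is Ready and its availability is Active) is a simplified reading of Swarm's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    status: str        # STATUS column: "Ready" or "Down"
    availability: str  # AVAILABILITY column: "Active", "Pause" or "Drain"


def schedulable(nodes):
    """Return hostnames that can receive new tasks (simplified rule)."""
    return [
        n.hostname
        for n in nodes
        if n.status == "Ready" and n.availability == "Active"
    ]


# Values copied from the `docker node ls` output above.
nodes = [
    Node("host01", "Down", "Active"),
    Node("host02", "Ready", "Active"),
    Node("host03", "Ready", "Drain"),
]
print(schedulable(nodes))  # ['host02']
```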