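107. How to Configure Health Check? (Swarm14)
If a container's status is Up, does that mean the application is healthy? Not necessarily.
Docker can only judge a container's state from the exit code of the process it started; it knows almost nothing about how the application inside the container is actually running.
When docker run is executed, a process is usually started according to the CMD or ENTRYPOINT in the Dockerfile, and the state of that process is what the STATUS column of docker ps shows for the container.
The STATUS column distinguishes three cases:
1. Some containers are running, with status Up.
2. Some containers have stopped normally, with status Exited (0).
3. Some containers stopped because of a failure, with a non-zero exit code such as Exited (137) or Exited (1).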
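The three cases above can be told apart from the exit code alone. The following is a minimal sketch, assuming the common convention that a code above 128 means the process was terminated by signal (code - 128); describe_exit is a hypothetical helper name, not a Docker command.

```shell
# Hypothetical helper: explain the N in an "Exited (N)" status.
# Assumption: codes above 128 follow the 128 + signal-number convention.
describe_exit() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "stopped normally"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "failed with exit code $code"
  fi
}

describe_exit 0
describe_exit 1
describe_exit 137
```

Under that convention, Exited (137) corresponds to signal 9 (SIGKILL), which is what a forcibly killed container typically reports.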
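Even when a container's status is Up, there is no guarantee that the application is working. A web server may not have crashed, but if it always returns HTTP 500 Internal Server Error, that is a serious failure from the application's point of view.
How can container state be checked at the application level? That is what Health Check is for.
A Docker Health Check can be any single command. Docker runs the command inside the container: an exit status of 0 marks the container as healthy, and an exit status of 1 marks it as unhealthy.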
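The exit-status contract can be illustrated outside Docker. The sketch below wraps an arbitrary probe command and maps its result to the two states; probe is a hypothetical name used only for illustration.

```shell
# Hypothetical wrapper: run any probe command and report the result
# using the exit statuses described above (0 = healthy, 1 = unhealthy).
probe() {
  if "$@" >/dev/null 2>&1; then
    echo healthy
    return 0
  fi
  echo unhealthy
  return 1
}

probe true          # the probe command succeeds
probe false || :    # the probe command fails; "|| :" keeps a set -e shell alive
```

Any command that follows this contract can serve as a Health Check, which is why the check can be tailored to the application.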
For applications that expose an HTTP interface, a common Health Check uses curl to check the HTTP status code, for example:
curl --fail http://localhost:8080 || exit 1
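With --fail, curl exits with a non-zero status when the server returns an HTTP error code (400 or above); || exit 1 turns that into exit status 1, so the Health Check fails.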
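A slightly more defensive variant of the same check might look like the sketch below. The function name check_http and the 5-second limit are assumptions chosen for illustration; the curl options themselves are standard.

```shell
# Hypothetical health check function; the 5-second limit is an assumption.
check_http() {
  url=$1
  # --fail       exit non-zero on HTTP status 400 or above
  # --silent     suppress the progress meter
  # --output     discard the response body
  # --max-time   give up after 5 seconds instead of hanging
  curl --fail --silent --output /dev/null --max-time 5 "$url" || return 1
}
```

A check script would then call it as check_http http://localhost:8080 || exit 1, which keeps the exit status at exactly 0 or 1 regardless of which error curl hit.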
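The following creates a service whose Health Check is set with --health-cmd. Note that this command omits --fail, so it only detects connection-level failures; add --fail to also catch HTTP error codes.
root@host03:~# docker service create --name web_server --health-cmd "curl http://localhost:8091/pools || exit 1" couchbase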
81p63ao2rnuimt0f9i5vahmz8
overall progress: 1 out of 1 tasks
1/1: running
verify: Service converged
root@host03:~# docker service ps web_server
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
0ckikfsanfb2 web_server.1 couchbase:latest host01 Running Running 52 seconds ago
root@host01:~# docker ps -a # the STATUS column shows the container's health status
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f82561439414 couchbase:latest "/entrypoint.sh couc…" 57 seconds ago Up 56 seconds (healthy) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp web_server.1.0ckikfsanfb254nmkyjrrtxzp
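The health state appears in parentheses at the end of the STATUS text. As a minimal sketch, the hypothetical helper below extracts it from a STATUS string such as the ones in this transcript; it works on plain text and does not call Docker.

```shell
# Hypothetical helper: derive the health state from a docker ps STATUS string.
health_from_status() {
  case "$1" in
    *"(healthy)"*)          echo healthy ;;
    *"(unhealthy)"*)        echo unhealthy ;;
    *"(health: starting)"*) echo starting ;;
    *)                      echo none ;;   # no health information in the string
  esac
}

health_from_status "Up 56 seconds (healthy)"
health_from_status "Up About a minute (health: starting)"
health_from_status "Exited (0) 2 minutes ago"
```

On a host with Docker, the input strings could come from docker ps --format '{{.Status}}'.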
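--health-timeout: timeout for a single run of the command, default 30s
--health-interval: interval between runs of the command, default 30s
--health-retries: number of consecutive failures tolerated, default 3. If all 3 attempts fail, the container is marked unhealthy. Swarm destroys and recreates unhealthy replicas.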
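The three options can be combined on one docker service create command. The sketch below uses illustrative values (10s, 5s, 3) that are assumptions rather than recommendations, and adds a rough estimate of how long consecutive failures take to mark a container unhealthy, assuming each check starts one interval after the previous one finishes and ignoring any start period. The function names and the DOCKER override variable are hypothetical.

```shell
# Sketch: create a service with explicit health settings.
# Setting DOCKER=echo prints the command instead of running it.
create_web_server() {
  "${DOCKER:-docker}" service create --name web_server \
    --health-cmd "curl --fail http://localhost:8091/pools || exit 1" \
    --health-interval 10s \
    --health-timeout 5s \
    --health-retries 3 \
    couchbase
}

# Rough lower and upper bounds, in seconds, before a container that always
# fails is marked unhealthy: a failing check takes between about 0 seconds
# and the full timeout, and the next check starts one interval later.
unhealthy_after() {
  interval=$1 timeout=$2 retries=$3
  echo "$((retries * interval)) $((retries * (interval + timeout)))"
}

unhealthy_after 30 30 3   # bounds for the default settings
```

With the defaults, this estimate gives between 90 and 180 seconds, so shortening the interval and timeout is the usual way to get faster detection.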
The following simulates an unhealthy scenario by pointing curl at a URL that does not exist.
root@host03:~# docker service create --name ng_server --health-cmd "curl --fail http://localhost:8091/non-exist || exit 1" couchbase
y4qappjx5ke1ffp8ltsivt892
overall progress: 0 out of 1 tasks
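1/1: starting [============================================> ] # service creation stays stuck here
Listing the service's tasks after a while shows three of them in the Shutdown state. This matches the behavior described above: when the health check fails, Swarm destroys the replica (container) and creates a new one. On the node, these containers have in fact only been stopped, not removed.
root@host03:~# docker service ps ng_server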
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
l0o7fyf6v8to ng_server.1 couchbase:latest host02 Ready Ready 3 seconds ago
j3r6u80e9pe8 \_ ng_server.1 couchbase:latest host02 Shutdown Complete 3 seconds ago
177fik81ja20 \_ ng_server.1 couchbase:latest host02 Shutdown Complete about a minute ago
4ko4dmanao2y \_ ng_server.1 couchbase:latest host02 Shutdown Complete 3 minutes ago
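Below, container state is checked on the node at several points in time. Each new replica shows health: starting until it is replaced, and the final docker ps -a lists four containers (replicas), of which the failed ones have all been stopped.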
root@host02:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0316bb763a73 couchbase:latest "/entrypoint.sh couc…" About a minute ago Up About a minute (health: starting) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp ng_server.1.4ko4dmanao2yntzkati7rqf2v
root@host02:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c08f7f88d929 couchbase:latest "/entrypoint.sh couc…" About a minute ago Up About a minute (health: starting) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp ng_server.1.177fik81ja20jwr36dz56sfh3
root@host02:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
eabec2359233 couchbase:latest "/entrypoint.sh couc…" 24 seconds ago Up 18 seconds (health: starting) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp ng_server.1.j3r6u80e9pe8101xvt6mixvxc
root@host02:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d7ad2c993564 couchbase:latest "/entrypoint.sh couc…" 10 seconds ago Up 4 seconds (health: starting) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp ng_server.1.l0o7fyf6v8tocy60kdsb7djcy
root@host02:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f13e80a53400 couchbase:latest "/entrypoint.sh couc…" 1 minutes ago Up About 1 minutes (health: starting) 8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp ng_server.1.koawuqt4rjyo65i3kilj18hsi
fe5c0436aaed couchbase:latest "/entrypoint.sh couc…" 4 minutes ago Exited (0) 2 minutes ago ng_server.1.nsy7l5anl0megyjfms3nbooyy
7d99a8eb3268 couchbase:latest "/entrypoint.sh couc…" 5 minutes ago Exited (0) 4 minutes ago ng_server.1.m93f6tpx0hqd2z9sl0jadi8ib
d7ad2c993564 couchbase:latest "/entrypoint.sh couc…" 7 minutes ago Exited (0) 5 minutes ago ng_server.1.l0o7fyf6v8tocy60kdsb7djcy
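To see why a replica was judged unhealthy, the health state and recent probe results recorded by Docker can be read with docker inspect. The sketch below wraps the two format strings in hypothetical helper functions; the DOCKER override variable is an assumption added so the commands can be printed without being run.

```shell
# Sketch: read a container's health state and its recorded probe results.
# "$1" is a container name or ID; DOCKER=echo prints the command instead.
health_status() {
  "${DOCKER:-docker}" inspect --format '{{.State.Health.Status}}' "$1"
}

health_log() {
  # Prints the health record as JSON, including recent probe exit codes
  # and captured output.
  "${DOCKER:-docker}" inspect --format '{{json .State.Health}}' "$1"
}
```

For example, health_log ng_server.1.l0o7fyf6v8tocy60kdsb7djcy run on host02 would show what the curl probe returned inside that replica.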
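By default, Docker can only judge a container's state from the exit code of its process. Health Check lets it judge from the application's point of view whether the application has failed and needs to be restarted.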