107. How to Configure a Health Check? (Swarm14)

 
If a container's status is Up, does that mean the application is healthy? Not necessarily.
 
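Docker can only infer a container's state from the exit code of the process it starts in the container; it knows essentially nothing about how the application inside is actually running.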
 
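When you run docker run, Docker normally starts a single process as specified by CMD or ENTRYPOINT in the Dockerfile. The state of that process is what the STATUS column of docker ps reports as the container's status.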
 
 
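The STATUS column of docker ps -a shows, for example:
    1. Some containers are running, with status Up.
    2. Some containers have stopped normally, with status Exited (0).
    3. Some containers have stopped because of a failure, with a non-zero exit code such as Exited (137) or Exited (1).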
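As a small illustration of reading these exit codes: by common shell convention, a code above 128 usually means the process was terminated by a signal whose number is the code minus 128, so 137 corresponds to signal 9 (SIGKILL). The helper name describe_exit below is hypothetical and not part of Docker; it is only a sketch of that arithmetic.

```shell
# Hypothetical helper: describe a container exit code.
# Assumes the common convention that codes above 128 mean
# "terminated by signal (code - 128)".
describe_exit() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited normally"
  elif [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "failed with code $code"
  fi
}

describe_exit 0
describe_exit 137
describe_exit 1
```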
 
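Even when a container's status is Up, there is no guarantee that the application is working. A web server may not have crashed, yet if it keeps returning HTTP 500 (Internal Server Error), that is a serious failure from the application's point of view.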
 
How can a container's state be checked at the application level? That is what Health Check is for.
 
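A Docker Health Check can be any single command. Docker runs the command inside the container: if it exits with 0, the container is considered healthy; if it exits with 1, the container is considered unhealthy. Exit code 2 is reserved and should not be used.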
 
For applications that expose an HTTP interface, a common Health Check uses curl to check the HTTP status code, for example:
 
curl --fail http://localhost:8080 || exit 1
 
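With --fail, curl exits with a non-zero code when the server responds with an HTTP error status (400 or above). The || exit 1 then makes the whole command return 1, so the Health Check fails.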
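The effect of appending || exit 1 can be sketched without a running server. The wrapper name probe below is hypothetical; it collapses any failure of the wrapped command to 1. The failing stand-in exits with 22, the code curl --fail uses for an HTTP error.

```shell
# Hypothetical wrapper: run a probe command and collapse its result to 0 or 1,
# the same effect as appending "|| exit 1" to a health check command.
probe() {
  if "$@" >/dev/null 2>&1; then
    return 0   # probe succeeded: healthy
  fi
  return 1     # any failure (e.g. exit code 22) becomes 1: unhealthy
}

# Stand-ins for a succeeding and a failing probe.
probe true && echo "healthy" || echo "unhealthy"
probe sh -c 'exit 22' && echo "healthy" || echo "unhealthy"
```

In a real check the wrapped command would be something like curl --fail against the application's own URL.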
 
root@host03:~# docker service create --name web_server --health-cmd "curl --fail http://localhost:8091/pools || exit 1"  couchbase
81p63ao2rnuimt0f9i5vahmz8
overall progress: 1 out of 1 tasks
1/1: running   
verify: Service converged
 
root@host03:~# docker service ps web_server
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
0ckikfsanfb2        web_server.1        couchbase:latest    host01              Running             Running 52 seconds ago  
 
root@host01:~# docker ps -a    #    the STATUS column shows the container's health status
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                    PORTS                                                        NAMES
f82561439414        couchbase:latest    "/entrypoint.sh couc…"   57 seconds ago      Up 56 seconds (healthy)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp   web_server.1.0ckikfsanfb254nmkyjrrtxzp
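The health state appears in parentheses at the end of the STATUS text. A minimal sketch of pulling it out of such a string follows; the function name health_from_status is hypothetical, and the sample strings are taken from docker ps output of the kind shown in this post.

```shell
# Hypothetical helper: extract the health state from a `docker ps` STATUS string.
health_from_status() {
  case "$1" in
    *"(healthy)"*)          echo "healthy" ;;
    *"(unhealthy)"*)        echo "unhealthy" ;;
    *"(health: starting)"*) echo "starting" ;;
    *)                      echo "none" ;;   # no health check info in the string
  esac
}

health_from_status "Up 56 seconds (healthy)"
health_from_status "Up About a minute (health: starting)"
health_from_status "Exited (0) 2 minutes ago"
```

On a node, the same value can also be read directly with docker inspect --format '{{.State.Health.Status}}' followed by the container ID, which avoids parsing text.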
 
 
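--health-timeout     timeout for a single run of the command, default 30s
--health-interval    interval between runs of the command, default 30s
--health-retries     number of consecutive failures needed to report unhealthy, default 3. After 3 consecutive failures the container is marked unhealthy, and swarm destroys and recreates the unhealthy replica.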
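These three values together determine roughly how long a broken container keeps running before it is replaced. The sketch below is only an estimate under stated assumptions: no start period, probes spaced by the interval, and each probe taking between zero seconds and the full timeout. The function name unhealthy_window is hypothetical.

```shell
# Hypothetical helper: rough bounds (in seconds) on how long a container that
# fails every probe stays up before it is marked unhealthy.
# Assumes no start period and that probes run one after another.
unhealthy_window() {
  interval=$1
  timeout=$2
  retries=$3
  fastest=$((retries * interval))              # every probe fails immediately
  slowest=$((retries * (interval + timeout)))  # every probe runs until it times out
  echo "$fastest $slowest"
}

unhealthy_window 30 30 3   # the default values
```

With the defaults this gives a window of about 90 to 180 seconds. Shorter values such as --health-interval 5s shrink the window at the cost of running the probe more often.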
 
The following simulates an unhealthy scenario by having curl request a URL that does not exist.
 
root@host03:~# docker service create --name ng_server --health-cmd "curl --fail  http://localhost:8091/non-exist || exit 1" couchbase
y4qappjx5ke1ffp8ltsivt892
overall progress: 0 out of 1 tasks
1/1: starting  [============================================>      ]    #    service creation stays stuck here
 
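After a while, check the tasks of this Service: three of them are in the Shutdown state. This matches what was said above: when the Health Check fails, swarm destroys the replica (container) and creates a new one. On the node, these containers are in fact only stopped, not removed.

root@host03:~# docker service ps ng_server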
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE                 ERROR               PORTS
l0o7fyf6v8to        ng_server.1         couchbase:latest    host02              Ready               Ready 3 seconds ago                               
j3r6u80e9pe8         \_ ng_server.1     couchbase:latest    host02              Shutdown            Complete 3 seconds ago                            
177fik81ja20         \_ ng_server.1     couchbase:latest    host02              Shutdown            Complete about a minute ago                       
4ko4dmanao2y         \_ ng_server.1     couchbase:latest    host02              Shutdown            Complete 3 minutes ago                            
 
Below is the container state as seen on the node. Four containers (replicas) were created in total, and the faulty replicas were all shut down.
 
root@host02:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS                                 PORTS                                                        NAMES
0316bb763a73        couchbase:latest    "/entrypoint.sh couc…"   About a minute ago   Up About a minute (health: starting)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp   ng_server.1.4ko4dmanao2yntzkati7rqf2v
 
root@host02:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS                                 PORTS                                                        NAMES
c08f7f88d929        couchbase:latest    "/entrypoint.sh couc…"   About a minute ago   Up About a minute (health: starting)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp   ng_server.1.177fik81ja20jwr36dz56sfh3
 
root@host02:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                             PORTS                                                        NAMES
eabec2359233        couchbase:latest    "/entrypoint.sh couc…"   24 seconds ago      Up 18 seconds (health: starting)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp   ng_server.1.j3r6u80e9pe8101xvt6mixvxc
 
root@host02:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                            PORTS                                                        NAMES
d7ad2c993564        couchbase:latest    "/entrypoint.sh couc…"   10 seconds ago      Up 4 seconds (health: starting)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp   ng_server.1.l0o7fyf6v8tocy60kdsb7djcy
 
root@host02:~# docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS                                 PORTS                                                        NAMES
f13e80a53400        couchbase:latest    "/entrypoint.sh couc…"   1 minutes ago        Up About 1 minutes (health: starting)   8091-8096/tcp, 11207/tcp, 11210-11211/tcp, 18091-18096/tcp  ng_server.1.koawuqt4rjyo65i3kilj18hsi
fe5c0436aaed        couchbase:latest    "/entrypoint.sh couc…"   4 minutes ago        Exited (0) 2 minutes ago                                                                            ng_server.1.nsy7l5anl0megyjfms3nbooyy
7d99a8eb3268        couchbase:latest    "/entrypoint.sh couc…"   5 minutes ago        Exited (0) 4 minutes ago                                                                            ng_server.1.m93f6tpx0hqd2z9sl0jadi8ib
d7ad2c993564        couchbase:latest    "/entrypoint.sh couc…"   7 minutes ago        Exited (0) 5 minutes ago                                                                            ng_server.1.l0o7fyf6v8tocy60kdsb7djcy
 
 
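By default, Docker can only judge a container's state from the exit code of its main process. A Health Check makes it possible to judge, from the application's point of view, whether the application has failed and whether it needs to be restarted.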
 
 
posted @ 2019-05-17 13:34  三角形