Container Technology: Docker Swarm

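  In the previous post I covered the basic usage and working principles of docker machine; for a recap, see https://www.cnblogs.com/qiuhom-1874/p/13160915.html. Today let's talk about docker swarm, Docker's official cluster management tool. It lets you create and manage a Docker cluster across multiple hosts. Its main job is to pool the Docker environments of several nodes into one large Docker resource pool, and docker swarm manages containers on top of that pool. So far we have only created and managed containers on a single host, but in production the containers that fit on one physical machine are often not enough to meet business needs, so docker swarm provides a clustering solution that makes it easy to create and manage containers across multiple nodes. Let's walk through setting up a docker swarm cluster.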

  Swarm mode is built into Docker, so it is already available once Docker is installed; we can check it with docker info:

[root@node1 ~]# docker info
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.11
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-693.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.686GiB
 Name: docker-node01
 ID: 4HXP:YJ5W:4SM5:NAPM:NXPZ:QFIU:ARVJ:BYDG:KVWU:5AAJ:77GC:X7GQ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
  provider=generic
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

[root@node1 ~]# 

  Tip: the output above shows Swarm: inactive. That is because we have not initialized a cluster yet, so swarm mode is not active on this node.
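To check the swarm state from a script instead of reading the full docker info output, one option is the --format flag with the Swarm.LocalNodeState field. The following is a minimal sketch; the helper names swarm_state and describe_state are made up for illustration and are not part of Docker.

```shell
# Print the local swarm state as reported by the daemon,
# e.g. "inactive" before the cluster is initialized and "active" afterwards.
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# Turn a state string into a short hint (hypothetical helper for illustration).
describe_state() {
    case "$1" in
        active)   echo "swarm mode is active on this node" ;;
        inactive) echo "swarm mode is inactive: run docker swarm init or docker swarm join" ;;
        *)        echo "swarm state is $1: check the docker daemon" ;;
    esac
}

# Usage on a Docker host:
#   describe_state "$(swarm_state)"
```

On the node shown above, swarm_state would be expected to print inactive until docker swarm init has been run.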

  Initializing the cluster

[root@docker-node01 ~]# docker swarm init --advertise-addr 192.168.0.41
Swarm initialized: current node (ynz304mbltxx10v3i15ldkmj1) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

[root@docker-node01 ~]# 

  Tip: the output shows that the cluster was initialized successfully and that the current node is a manager. To add another node to the cluster as a worker, run docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377 on that node. To join a node as a manager instead, first run docker swarm join-token manager in the current terminal:

[root@docker-node01 ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-dqjeh8hp6cp99bksjc03b8yu3 192.168.0.41:2377

[root@docker-node01 ~]# 

  Tip: docker swarm join-token manager returns a command and tells us that, to add a manager node, we run docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-dqjeh8hp6cp99bksjc03b8yu3 192.168.0.41:2377 on the node that should join.
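When the join command needs to be generated in a script, docker swarm join-token accepts -q to print only the token. The sketch below assembles the join command from that token; join_token and build_join_cmd are hypothetical helper names, and 192.168.0.41:2377 is the manager address used in this walkthrough.

```shell
# Print only the join token for the given role ("worker" or "manager").
# Must be run on a manager node.
join_token() {
    docker swarm join-token -q "$1"
}

# Assemble the join command from a token and the manager address
# (hypothetical helper; it only builds the string, it does not run it).
build_join_cmd() {
    token=$1
    manager_addr=$2
    echo "docker swarm join --token $token $manager_addr"
}

# Usage on the manager from this walkthrough:
#   build_join_cmd "$(join_token worker)" 192.168.0.41:2377
#   build_join_cmd "$(join_token manager)" 192.168.0.41:2377
```

The only difference between joining as a worker and joining as a manager is which token is passed, so the same helper covers both cases.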

  That completes the initialization of the docker swarm cluster. Next we add the other nodes to it.

  Joining docker-node02 to the cluster as a worker node

[root@node2 ~]# docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377
This node joined a swarm as a worker.
[root@node2 ~]# 

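  Tip: no error means the node joined the cluster successfully. We can use docker info to view the details of the current Docker environment.

  Tip: the docker info output on docker-node02 shows that swarm is now active, together with the address of the manager node. Besides this, we can also confirm that docker-node02 has joined the cluster by running docker node ls on the manager node.

  Viewing the cluster nodes

  Tip: running docker node ls on a manager node lists the nodes that have joined the cluster.

  Joining docker-node03 to the cluster as a manager node

  Tip: docker-node03 is now a manager node of the cluster, so docker node ls can be run on it as well. The docker swarm cluster is now set up; next, let's go through the common management tasks.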

  Node management commands

  docker node ls: list all nodes in the current cluster

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Ready               Active              Reachable           19.03.11
[root@docker-node01 ~]# 

  Tip: this command can only be run on a manager node.

  docker node inspect: show detailed information about the specified node

[root@docker-node01 ~]# docker node inspect docker-node01
[
    {
        "ID": "ynz304mbltxx10v3i15ldkmj1",
        "Version": {
            "Index": 9
        },
        "CreatedAt": "2020-06-20T05:57:17.57684293Z",
        "UpdatedAt": "2020-06-20T05:57:18.18575648Z",
        "Spec": {
            "Labels": {},
            "Role": "manager",
            "Availability": "active"
        },
        "Description": {
            "Hostname": "docker-node01",
            "Platform": {
                "Architecture": "x86_64",
                "OS": "linux"
            },
            "Resources": {
                "NanoCPUs": 4000000000,
                "MemoryBytes": 3958075392
            },
            "Engine": {
                "EngineVersion": "19.03.11",
                "Labels": {
                    "provider": "generic"
                },
                "Plugins": [
                    {
                        "Type": "Log",
                        "Name": "awslogs"
                    },
                    {
                        "Type": "Log",
                        "Name": "fluentd"
                    },
                    {
                        "Type": "Log",
                        "Name": "gcplogs"
                    },
                    {
                        "Type": "Log",
                        "Name": "gelf"
                    },
                    {
                        "Type": "Log",
                        "Name": "journald"
                    },
                    {
                        "Type": "Log",
                        "Name": "json-file"
                    },
                    {
                        "Type": "Log",
                        "Name": "local"
                    },
                    {
                        "Type": "Log",
                        "Name": "logentries"
                    },
                    {
                        "Type": "Log",
                        "Name": "splunk"
                    },
                    {
                        "Type": "Log",
                        "Name": "syslog"
                    },
                    {
                        "Type": "Network",
                        "Name": "bridge"
                    },
                    {
                        "Type": "Network",
                        "Name": "host"
                    },
                    {
                        "Type": "Network",
                        "Name": "ipvlan"
                    },
                    {
                        "Type": "Network",
                        "Name": "macvlan"
                    },
                    {
                        "Type": "Network",
                        "Name": "null"
                    },
                    {
                        "Type": "Network",
                        "Name": "overlay"
                    },
                    {
                        "Type": "Volume",
                        "Name": "local"
                    }
                ]
            },
            "TLSInfo": {
                "TrustRoot": "-----BEGIN CERTIFICATE-----\nMIIBaTCCARCgAwIBAgIUeBd/eSZ7WaiyLby9o1yWpjps3gwwCgYIKoZIzj0EAwIw\nEzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMjAwNjIwMDU1MjAwWhcNNDAwNjE1MDU1\nMjAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH\nA0IABMsYxnGoPbM4gqb23E1TvOeQcLcY56XysLuF8tYKm56GuKpeD/SqXrUCYqKZ\nHV+WSqcM0fD1g+mgZwlUwFzNxhajQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB\nAf8EBTADAQH/MB0GA1UdDgQWBBTV64kbvS83eRHyI6hdJeEIv3GmrTAKBggqhkjO\nPQQDAgNHADBEAiBBB4hLn0ijybJWH5j5rtMdAoj8l/6M3PXERnRSlhbcawIgLoby\newMHCnm8IIrUGe7s4CZ07iHG477punuPMKDgqJ0=\n-----END CERTIFICATE-----\n",
                "CertIssuerSubject": "MBMxETAPBgNVBAMTCHN3YXJtLWNh",
                "CertIssuerPublicKey": "MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEyxjGcag9sziCpvbcTVO855BwtxjnpfKwu4Xy1gqbnoa4ql4P9KpetQJiopkdX5ZKpwzR8PWD6aBnCVTAXM3GFg=="
            }
        },
        "Status": {
            "State": "ready",
            "Addr": "192.168.0.41"
        },
        "ManagerStatus": {
            "Leader": true,
            "Reachability": "reachable",
            "Addr": "192.168.0.41:2377"
        }
    }
]
[root@docker-node01 ~]#

  docker node ps: list the tasks running on the specified node

[root@docker-node01 ~]# docker node ps 
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE       ERROR               PORTS
[root@docker-node01 ~]# docker node ps docker-node01
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE       ERROR               PORTS
[root@docker-node01 ~]# 

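  Tip: this is similar to docker ps. No service tasks are running yet, so nothing is listed. If no node name is given, the command lists the tasks on the current node.

  docker node rm: remove the specified node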

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Ready               Active              Reachable           19.03.11
[root@docker-node01 ~]# docker node rm docker-node03
Error response from daemon: rpc error: code = FailedPrecondition desc = node aeo8j7zit9qkoeeft3j0q1h0z is a cluster manager and is a member of the raft cluster. It must be demoted to worker before removal
[root@docker-node01 ~]# docker node rm docker-node02
Error response from daemon: rpc error: code = FailedPrecondition desc = node tzkm0ymzjdmc1r8d54snievf1 is not down and can't be removed
[root@docker-node01 ~]# 

  Tip: two conditions must hold before a node can be removed: it must not be a manager node, and it must be in the down state.
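The two error messages above suggest an order of operations for removing a node cleanly: demote it if it is a manager, drain it, let it leave the swarm, and only then remove it. The sketch below prints that sequence; removal_plan is a hypothetical helper that only echoes the commands rather than running them.

```shell
# Print the steps for removing a node cleanly, in order.
# $1 = node name, $2 = current role ("manager" or "worker").
# Hypothetical helper: it prints the commands instead of running them.
removal_plan() {
    node=$1
    role=$2
    if [ "$role" = "manager" ]; then
        echo "docker node demote $node"
    fi
    echo "docker node update --availability drain $node"
    echo "docker swarm leave    # run this on $node itself"
    echo "docker node rm $node"
}

removal_plan docker-node03 manager
```

Demoting first keeps the manager count consistent in the Raft cluster, and draining gives the scheduler a chance to move tasks off the node before it leaves.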

  docker swarm leave: leave the current cluster

[root@docker-node03 ~]# docker ps 
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
e7958ffa16cd        nginx               "/docker-entrypoint.…"   28 seconds ago      Up 26 seconds       80/tcp              n1
[root@docker-node03 ~]# docker swarm leave 
Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing this node leaves 1 managers out of 2. Without a Raft quorum your swarm will be inaccessible. The only way to restore a swarm that has lost consensus is to reinitialize it with `--force-new-cluster`. Use `--force` to suppress this message.
[root@docker-node03 ~]# docker swarm leave -f
Node left the swarm.
[root@docker-node03 ~]# 

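  Tip: by default a manager node is not allowed to leave the cluster. Forcing it with the -f option here leaves one manager out of two, so the Raft quorum is lost and the remaining manager can no longer manage the cluster: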

[root@docker-node01 ~]# docker node ls
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
[root@docker-node01 ~]#

  Tip: docker node ls no longer works on docker-node01. The fix is to reinitialize the cluster:

[root@docker-node01 ~]# docker node ls 
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
[root@docker-node01 ~]# docker swarm init --advertise-addr 192.168.0.41
Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.
[root@docker-node01 ~]# docker swarm init --force-new-cluster 
Swarm initialized: current node (ynz304mbltxx10v3i15ldkmj1) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Unknown             Active                                  19.03.11
aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Down                Active                                  19.03.11
rm3j7cjvmoa35yy8ckuzoay46     docker-node03       Unknown             Active                                  19.03.11
[root@docker-node01 ~]# 

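  Tip: the cluster cannot be reinitialized with docker swarm init --advertise-addr 192.168.0.41; you must use docker swarm init --force-new-cluster, which forces the creation of a new cluster from the current state. After that, docker node rm can be used to remove the nodes in the down state.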
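The loss of the leader seen above follows from Raft majority arithmetic: a swarm with N managers needs floor(N/2) + 1 of them online. The sketch below works through a few manager counts; the function names are illustrative only.

```shell
# Majority (quorum) needed among N managers: floor(N/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of managers that can be lost while keeping quorum: floor((N-1)/2).
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With two managers the quorum is two, so losing either one makes the swarm unmanageable, which is what happened when docker-node03 left with -f. An odd number of managers, such as three, tolerates the loss of one.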

  Removing nodes in the down state

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Down                Active                                  19.03.11
rm3j7cjvmoa35yy8ckuzoay46     docker-node03       Down                Active                                  19.03.11
[root@docker-node01 ~]# docker node rm aeo8j7zit9qkoeeft3j0q1h0z rm3j7cjvmoa35yy8ckuzoay46
aeo8j7zit9qkoeeft3j0q1h0z
rm3j7cjvmoa35yy8ckuzoay46
[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
[root@docker-node01 ~]# 

  docker node promote: promote the specified node to a manager node

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
[root@docker-node01 ~]# docker node promote docker-node02
Node docker-node02 promoted to a manager in the swarm.
[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active              Reachable           19.03.11
[root@docker-node01 ~]# 

  docker node demote: demote the specified node to a worker node

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active              Reachable           19.03.11
[root@docker-node01 ~]# docker node demote docker-node02
Manager docker-node02 demoted in the swarm.
[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
[root@docker-node01 ~]# 

  docker node update: update the specified node

[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
[root@docker-node01 ~]# docker node update docker-node01 --availability drain 
docker-node01
[root@docker-node01 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Drain               Leader              19.03.11
tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
[root@docker-node01 ~]# 

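  Tip: the command above changes the availability of docker-node01 to drain. After this change the scheduler no longer places tasks on docker-node01, and tasks already running there are moved to other nodes.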
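The --availability option accepts three values: active, pause, and drain. The sketch below validates the value before calling docker node update; is_valid_availability and set_availability are hypothetical wrapper names.

```shell
# Check that an availability value is one docker node update accepts.
is_valid_availability() {
    case "$1" in
        active|pause|drain) return 0 ;;
        *) return 1 ;;
    esac
}

# Set a node's availability after validating the value
# (hypothetical wrapper; must be run on a manager node).
set_availability() {
    node=$1
    value=$2
    if ! is_valid_availability "$value"; then
        echo "unsupported availability: $value" >&2
        return 1
    fi
    docker node update --availability "$value" "$node"
}

# Usage: put docker-node01 back into scheduling after maintenance.
#   set_availability docker-node01 active
```

Setting the node back to active makes it eligible for new tasks again; pause stops new tasks from being scheduled on the node without moving the ones already running there.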

  Adding a graphical interface to the docker swarm cluster

[root@docker-node01 docker]# docker run --name v1 -d -p 8888:8080 -e HOST=192.168.0.41 -e PORT=8080 -v /var/run/docker.sock:/var/run/docker.sock docker-registry.io/test/visualizer
Unable to find image 'docker-registry.io/test/visualizer:latest' locally
latest: Pulling from test/visualizer
cd784148e348: Pull complete 
f6268ae5d1d7: Pull complete 
97eb9028b14b: Pull complete 
9975a7a2a3d1: Pull complete 
ba903e5e6801: Pull complete 
7f034edb1086: Pull complete 
cd5dbf77b483: Pull complete 
5e7311667ddb: Pull complete 
687c1072bfcb: Pull complete 
aa18e5d3472c: Pull complete 
a3da1957bd6b: Pull complete 
e42dbf1c67c4: Pull complete 
5a18b01011d2: Pull complete 
Digest: sha256:54d65cbcbff52ee7d789cd285fbe68f07a46e3419c8fcded437af4c616915c85
Status: Downloaded newer image for docker-registry.io/test/visualizer:latest
3c15b186ff51848130393944e09a427bd40d2504c54614f93e28477a4961f8b6
[root@docker-node01 docker]# docker ps 
CONTAINER ID        IMAGE                                COMMAND             CREATED             STATUS                            PORTS                    NAMES
3c15b186ff51        docker-registry.io/test/visualizer   "npm start"         6 seconds ago       Up 5 seconds (health: starting)   0.0.0.0:8888->8080/tcp   v1
[root@docker-node01 docker]# 

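  Tip: the command above pulls the image from a private registry because downloading from the internet was too slow, so I downloaded it in advance and pushed it to the private registry. For setting up and using a private registry, see https://www.cnblogs.com/qiuhom-1874/p/13061984.html or https://www.cnblogs.com/qiuhom-1874/p/13058338.html. Once the visualizer container is running on the manager node, we can open port 8888 on that node's address to see the current state of the containers.

  Tip: the visualizer shows that the cluster currently has one manager node and two worker nodes, and that no containers are running in the cluster yet.

  Running a service on docker swarm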

[root@docker-node01 ~]# docker service create --name myweb docker-registry.io/test/nginx:latest
i0j6wvvtfe1360ibj04jxulmd
overall progress: 1 out of 1 tasks 
1/1: running   [==================================================>] 
verify: Service converged 
[root@docker-node01 ~]# docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
[root@docker-node01 ~]# docker service ps myweb
ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
99y8towew77e        myweb.1             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 1 minutes ago                       
[root@docker-node01 ~]#

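  Tip: docker service create creates a service in the current swarm cluster. The command above creates a service named myweb from the image docker-registry.io/test/nginx:latest; by default only one replica is started.

  Tip: one myweb container is now running in the cluster, on the docker-node03 host.

  Creating a service with multiple replicas on the swarm cluster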

[root@docker-node01 ~]# docker service create --replicas 3 --name web docker-registry.io/test/nginx:latest
mbiap412jyugfpi4a38mb5i1k
overall progress: 3 out of 3 tasks 
1/3: running   [==================================================>] 
2/3: running   [==================================================>] 
3/3: running   [==================================================>] 
verify: Service converged 
[root@docker-node01 ~]# docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
mbiap412jyug        web                 replicated          3/3                 docker-registry.io/test/nginx:latest   
[root@docker-node01 ~]# docker service ps web
ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 28 seconds ago                       
31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 28 seconds ago                       
l9jtbswl2x22        web.3               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 32 seconds ago                       
[root@docker-node01 ~]# 

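  Tip: the --replicas option specifies the desired number of replicas. Swarm creates that many replicas in the cluster and keeps that many containers running even if a node in the cluster goes down.

  Test: shut down docker-node03 and see whether the services running on it are moved to node 2.

  Before docker-node03 is shut down

  After docker-node03 is shut down

  Tip: the screenshots above show that after node 3 goes down, all containers that were running on it are moved to node 2; this is the effect of creating the service with --replicas. In short, when a service runs in replicated mode and a node hosting its tasks fails, the tasks from that node are recreated on other nodes. Note that as long as the number of running replicas matches the requested --replicas value, the tasks are not moved back even after the failed node recovers.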

[root@docker-node01 ~]# docker service ps web
ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE             ERROR               PORTS
1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 15 minutes ago                        
31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 15 minutes ago                        
t3gjvsgtpuql        web.3               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 6 minutes ago                         
l9jtbswl2x22         \_ web.3           docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 23 seconds ago                       
[root@docker-node01 ~]# 

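  Tip: listing the service's tasks on the manager node shows how the migration works: the replica on the failed node is shut down and a new replica is created on another node.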
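To see how unevenly the tasks are spread after such a failover, the node column of docker service ps can be tallied. The sketch below is one way to do that; count_per_node and tasks_per_node are hypothetical helper names, and the sample input is taken from the web tasks listed above.

```shell
# Count how many entries each node has in a list of node names (one per line).
count_per_node() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# List the nodes of the running tasks of a service and tally them
# (hypothetical wrapper; must be run on a manager node).
tasks_per_node() {
    docker service ps --filter desired-state=running --format '{{.Node}}' "$1" | count_per_node
}

# The three running web tasks after docker-node03 went down, as listed above:
printf 'docker-node02\ndocker-node02\ndocker-node02\n' | count_per_node

# To spread tasks again once docker-node03 is back, an update can be forced:
#   docker service update --force web
```

Since swarm does not move tasks back on its own, docker service update --force is one way to trigger a redistribution; it restarts the service's tasks, so it is best used when a brief rolling restart is acceptable.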

  Scaling services

[root@docker-node01 ~]# docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
mbiap412jyug        web                 replicated          3/3                 docker-registry.io/test/nginx:latest   
[root@docker-node01 ~]# docker service scale myweb=3 web=5
myweb scaled to 3
web scaled to 5
overall progress: 3 out of 3 tasks 
1/3: running   [==================================================>] 
2/3: running   [==================================================>] 
3/3: running   [==================================================>] 
verify: Service converged 
overall progress: 5 out of 5 tasks 
1/5: running   [==================================================>] 
2/5: running   [==================================================>] 
3/5: running   [==================================================>] 
4/5: running   [==================================================>] 
5/5: running   [==================================================>] 
verify: Service converged 
[root@docker-node01 ~]# docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
i0j6wvvtfe13        myweb               replicated          3/3                 docker-registry.io/test/nginx:latest   
mbiap412jyug        web                 replicated          5/5                 docker-registry.io/test/nginx:latest   
[root@docker-node01 ~]# docker service ps myweb web
ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
j7w490h2lons        myweb.1             docker-registry.io/test/nginx:latest   docker-node02       Running             Running 12 minutes ago                       
1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 21 minutes ago                       
99y8towew77e        myweb.1             docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 5 minutes ago                       
en5rk0jf09wu        myweb.2             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 21 minutes ago                       
h1hze7h819ca        myweb.3             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 30 seconds ago                       
t3gjvsgtpuql        web.3               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 12 minutes ago                       
l9jtbswl2x22         \_ web.3           docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 5 minutes ago                       
od3ti2ixpsgc        web.4               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
n1vur8wbmkgz        web.5               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
[root@docker-node01 ~]# 

  Tip: docker service scale sets the number of replicas of one or more services, which is how services are scaled up and down dynamically.
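Each NAME=REPLICAS pair given to docker service scale corresponds to a docker service update --replicas call on that service. The sketch below makes the mapping explicit; scale_as_update is a hypothetical helper that prints the command instead of running it.

```shell
# Translate a NAME=REPLICAS pair, as accepted by docker service scale,
# into the equivalent docker service update command.
# Hypothetical helper: it prints the command instead of running it.
scale_as_update() {
    name=${1%%=*}       # text before the first "="
    replicas=${1#*=}    # text after the first "="
    echo "docker service update --replicas $replicas $name"
}

scale_as_update myweb=3
scale_as_update web=5
```

The scale form is handy for changing several services at once, as in the transcript above, while the update form fits when other service settings are changed in the same call.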

  Publishing services

[root@docker-node01 ~]# docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
i0j6wvvtfe13        myweb               replicated          3/3                 docker-registry.io/test/nginx:latest   
mbiap412jyug        web                 replicated          5/5                 docker-registry.io/test/nginx:latest   
[root@docker-node01 ~]# docker service update  --publish-add 80:80 myweb 
myweb
overall progress: 3 out of 3 tasks 
1/3: running   [==================================================>] 
2/3: running   [==================================================>] 
3/3: running   [==================================================>] 
verify: Service converged 
[root@docker-node01 ~]#

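  Tip: publishing a service in a docker swarm cluster works on the same principle as publishing a port in plain Docker: both are implemented with iptables rules or LVS (IPVS) rules.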
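The 80:80 form used with --publish-add also has a long syntax that names each field, and the ports a service publishes can be read back with docker service inspect. The sketch below shows both; publish_arg and published_ports are hypothetical helper names.

```shell
# Build a --publish argument in the long syntax.
# $1 = published port on the nodes, $2 = target port in the container.
# Hypothetical helper for illustration.
publish_arg() {
    echo "published=$1,target=$2,protocol=tcp"
}

# Show the ports a service currently publishes
# (must be run on a manager node).
published_ports() {
    docker service inspect --format '{{json .Endpoint.Ports}}' "$1"
}

# Usage: publish port 80 of myweb on port 80 of every node, then verify.
#   docker service update --publish-add "$(publish_arg 80 80)" myweb
#   published_ports myweb
```

Reading the ports back from the service definition is a quick check before looking at the iptables rules on the individual nodes.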

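  Tip: on the manager node, port 80 is now in the listening state, and the iptables rules contain a new entry that DNATs traffic for local port 80 to 172.18.0.2:80. The same change happens on the worker nodes, not just on the manager node.

  Tip: according to these rules, traffic to port 80 of a node address is DNATed to 172.18.0.2:80.

  Tip: the output shows that the internal address of the myweb container running on docker-node02 is 10.0.0.7. So why does accessing 172.18.0.2 reach the service inside the container?

  Test: on docker-node02, follow the nginx container's access log and see which client IP address the container records.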

[root@docker-node02 ~]# docker ps
CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
2134e1b2c689        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   24 minutes ago      Up 24 minutes       80/tcp              nginx.1.ych7y3ugxp6o592pbz5k2i412
[root@docker-node02 ~]# docker logs -f nginx.1.ych7y3ugxp6o592pbz5k2i412 
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
10.0.0.3 - - [21/Jun/2020:02:37:11 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
172.18.0.1 - - [21/Jun/2020:02:38:35 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
10.0.0.2 - - [21/Jun/2020:02:53:32 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
10.0.0.2 - - [21/Jun/2020:02:53:58 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
^C
[root@docker-node02 ~]# 

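  Tip: when we access 172.18.0.2 on the manager node, the log on node2 shows the request arriving from 10.0.0.2. Why? Every node has an ingress-sbox container, and on this node its address is 10.0.0.2. The ingress-sbox address differs from node to node, so the address seen by nginx depends on which node's address we access.

  Tip: accessing different node addresses results in different client IPs in the nginx log.

  Tip: the screenshot above shows that the ingress-sbox containers on the nodes have different addresses but all use 10.0.0.1 as their gateway. This means containers on different nodes can communicate through this gateway, so container-to-container traffic in the swarm cluster can go over the ingress network. One question remains: how does the 172.18.0.0/16 network communicate with the 10.0.0.0/24 network?

  Tip: the screenshot above shows two network namespaces on the manager node. One of them has id 0 and contains the interfaces veth0 and vxlan0, both of which are attached to the bridge br0; br0 has the address 10.0.0.1/24, and the VXLAN ID (VNI) of vxlan0 is 4096. Combined with the nginx log above, it follows that when we access port 80 on the manager node, the iptables rules forward the traffic to the docker-gwbridge network. We do not yet know which namespace the docker-gwbridge network belongs to, but we do know that the container has two interfaces, eth0 and eth1, and that eth1 is attached to the docker-gwbridge network, which means the docker-gwbridge endpoint and the container's eth1 are in the same network namespace.

  Tip: the screenshot above shows that the network namespace named 1-u5mwgfq7rb has three interfaces, eth0, eth1 and vxlan0, all attached to the bridge br0. The manager node has a namespace with the same name, 1-u5mwgfq7rb, and vxlan0 has VXLAN ID 4096 in both, which means vxlan0 on the manager node can talk directly to vxlan0 on node2 (VXLAN endpoints of the same overlay network with the same VXLAN ID can communicate directly). Since vxlan0 is attached directly to br0, the nginx log shows the address of the ingress-sbox container as the client; the reason is that the gateway of ingress-sbox is br0. node3 follows the same logic: containers on different nodes talk to each other over vxlan0, while traffic to the outside leaves through eth1, is SNATed on docker-gwbridge, and then goes out through the physical NIC.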

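  Tip: a container has two networks: eth0 on the ingress network and eth1 on the docker-gwbridge network, both inside the container's own network namespace. When we access 172.18.0.2, the ingress-sbox container rewrites the source address to its own address on the ingress network, which is why the nginx log shows 10.0.0.2. In effect, the ingress-sbox container performs SNAT.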
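The namespaces described above can be inspected directly on a node. The sketch below assumes a default Linux installation where Docker keeps its network namespace files under /var/run/docker/netns, and that nsenter is available and run as root; netns_path and in_netns are hypothetical helper names.

```shell
# Path of a Docker network namespace file on a Linux host
# (assumes the default location /var/run/docker/netns).
netns_path() {
    echo "/var/run/docker/netns/$1"
}

# Run a command inside a Docker network namespace (requires root and nsenter).
in_netns() {
    ns=$1
    shift
    nsenter --net="$(netns_path "$ns")" "$@"
}

# Usage on a swarm node:
#   ls /var/run/docker/netns                    # list namespaces, e.g. ingress_sbox
#   in_netns ingress_sbox ip addr show          # interfaces and addresses of ingress-sbox
#   in_netns ingress_sbox iptables -t nat -nL   # NAT rules inside ingress-sbox
#   docker network inspect ingress              # endpoints and peers of the ingress network
```

Comparing the addresses reported inside ingress_sbox with the client addresses in the nginx access log is one way to check the SNAT behaviour described above on each node.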

  Test: access port 80 on the manager node and see whether the page served by nginx can be reached.

[root@docker-node02 ~]# docker ps
CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
b829991d6966        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   About an hour ago   Up About an hour    80/tcp              myweb.1.ilhkslrlnreyo6xx5j2h9isjb
8c2965fbdc27        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.2.pthe8da2n45i06oee4n7h4krd
b019d663e48e        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.3.w26gqpoyysgplm7qwhjbgisiv
a7c1afd76f1f        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.1.ho0d7u3wensl0kah0ioz1lpk5
[root@docker-node02 ~]# docker exec -it myweb.1.ilhkslrlnreyo6xx5j2h9isjb  bash
root@b829991d6966:/# cd /usr/share/nginx/html/
root@b829991d6966:/usr/share/nginx/html# ls
50x.html  index.html
root@b829991d6966:/usr/share/nginx/html# echo "this is docker-node02 index page" >index.html
root@b829991d6966:/usr/share/nginx/html# cat index.html
this is docker-node02 index page
root@b829991d6966:/usr/share/nginx/html# 

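  Tip: above we modified the index page of the nginx container running on docker-node02. Next we access port 80 on the manager node to see whether the containers on the worker nodes can be reached and how requests are distributed: round-robin, or always the same container?

  Tip: accessing port 80 on the manager node reaches the containers on the worker nodes in round-robin fashion. A browser may serve cached pages, so testing with curl is more reliable: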

[root@docker-node03 ~]# docker ps
CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
f43fdb9ec7fc        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              myweb.3.pgdjutofb5thlk02aj7387oj0
4470785f3d00        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              myweb.2.uwxbe182qzq00qgfc7odcmx87
7493dcac95ba        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.4.rix50fhlmg6m9txw9urk66gvw
118880d300f4        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.5.vo7c7vjgpf92b0ryelb7eque0
[root@docker-node03 ~]# docker exec -it myweb.2.uwxbe182qzq00qgfc7odcmx87 bash
root@4470785f3d00:/# cd /usr/share/nginx/html/
root@4470785f3d00:/usr/share/nginx/html# echo "this is myweb.2 index page" > index.html 
root@4470785f3d00:/usr/share/nginx/html# cat index.html
this is myweb.2 index page
root@4470785f3d00:/usr/share/nginx/html# exit
exit
[root@docker-node03 ~]# docker exec -it myweb.3.pgdjutofb5thlk02aj7387oj0 bash
root@f43fdb9ec7fc:/# cd /usr/share/nginx/html/
root@f43fdb9ec7fc:/usr/share/nginx/html# echo "this is myweb.3 index page" >index.html 
root@f43fdb9ec7fc:/usr/share/nginx/html# cat index.html
this is myweb.3 index page
root@f43fdb9ec7fc:/usr/share/nginx/html# exit
exit
[root@docker-node03 ~]# 

  Tip: to make the effect easy to see, we changed the index pages of myweb.2 and myweb.3 as well.

[root@docker-node01 ~]# for i in {1..10} ; do curl 192.168.0.41; done
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
[root@docker-node01 ~]# 

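  Tip: the test above shows that publishing a service with --publish-add in effect creates a load balancer on the manager node: requests to the published port are distributed across the service's replicas in turn.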
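To make the distribution easier to read than a raw list of responses, the responses can be tallied. The sketch below uses the ten responses captured in the curl loop above as sample input; tally_responses is a hypothetical helper name.

```shell
# Tally identical response bodies read from standard input, one per line,
# with the most frequent response first.
tally_responses() {
    sort | uniq -c | sort -rn
}

# The ten responses captured in the curl loop above:
tally_responses <<'EOF'
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
this is docker-node02 index page
this is myweb.2 index page
this is myweb.3 index page
EOF

# Against a live cluster the same tally would be:
#   for i in 1 2 3 4 5 6 7 8 9 10; do curl -s 192.168.0.41; done | tally_responses
```

For the captured sample this gives four responses from myweb.3 and three each from the other two replicas, which is what an even rotation over three replicas looks like after ten requests.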

posted @ 2020-06-20 23:56  Linux-1874