docker-09-docker swarm
What is Docker Swarm
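Swarm is a platform from Docker, Inc. for managing clusters of Docker hosts; it is written almost entirely in Go.
Source code: https://github.com/docker/swarm
Swarm uses the standard Docker API as its front-end entry point.
In other words, any kind of Docker client (Compose, docker-py, and so on) can talk to Swarm directly.
Even Docker itself integrates with Swarm easily.
This makes it much easier to move a system that was built for a single node onto Swarm.
Swarm also has built-in support for Docker network plugins,
so users can easily deploy container services that span multiple hosts.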
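Like Docker Compose, Docker Swarm is an official Docker container orchestration project.
The difference is that Docker Compose creates multiple containers on a single server or host,
whereas Docker Swarm creates container cluster services across multiple servers or hosts.
For deploying microservices, Docker Swarm is clearly the better fit.
Since Docker 1.12.0, Docker Swarm has been built into the Docker Engine (the docker swarm command).
It also ships with built-in service discovery, so there is no longer any need to set up Etcd or Consul for that purpose.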
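The Swarm daemon is only a scheduler plus a router.
Swarm does not run containers itself.
It only accepts requests from Docker clients and schedules a suitable node to run each container.
This means that even if Swarm goes down for some reason, the nodes in the cluster keep running as usual.
Once Swarm is running again, it collects the cluster information and rebuilds its state.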
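Deploying a Swarm cluster
1 Prepare the environment: provision the servers
Provisioning 4 servers on a cloud platform is recommended, each with 1 CPU core and 2 GB of RAM (1C2G)
Operating system: CentOS 7.9
Network and security group (the nodes must be able to reach each other on 2377/tcp, 7946/tcp+udp, and 4789/udp)
Custom password
2 Install Docker on all 4 servers
Docker is now installed
3 Deploy the Swarm cluster (two manager nodes, a layout that is prone to problems)
1 On docker001 (manager node)
# Show help for the swarm init command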
[root@docker001 ~]# docker swarm init --help
Usage: docker swarm init [OPTIONS]
Initialize a swarm
Options:
--advertise-addr string Advertised address (format: <ip|interface>[:port])
# Check the IP address of docker001
[root@docker001 ~]# ip addr
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP>
inet 172.18.213.61/20 brd 172.18.223.255
# Create the Swarm cluster; this host becomes a manager node
[root@docker001 ~]# docker swarm init --advertise-addr 172.18.213.61
# The output confirms that this node is now a manager
Swarm initialized: current node (qbia02mk2as1k1j27qevxe623) is now a manager.
# To join this cluster as a worker node, run the following command
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-9bdz0yt7jiaqdxx9exmve7ky8 172.18.213.61:2377
# To join this cluster as a manager node, run the command below to generate a token
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
# Print the command for joining this cluster as a manager node
[root@docker001 ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-b9ghpo4knegpp3b5hp77q8y2k 172.18.213.61:2377
2 On docker002 (manager node)
[root@docker002 ~]# docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-b9ghpo4knegpp3b5hp77q8y2k 172.18.213.61:2377
3 On docker003 (worker node)
[root@docker003 ~]# docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-9bdz0yt7jiaqdxx9exmve7ky8 172.18.213.61:2377
4 On docker004 (worker node)
[root@docker004 ~]# docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-9bdz0yt7jiaqdxx9exmve7ky8 172.18.213.61:2377
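5 Verify the cluster
# Run this on a manager node; on a worker node the command fails
# Two managers plus two workers: this layout is problematic and not recommended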
[root@docker001 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 * docker001 Ready Active Leader 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Reachable 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Ready Active 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
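6 Command summary
# Create a Swarm cluster and make this host a manager node
docker swarm init --advertise-addr IP_ADDRESS
# The following commands only work on a manager node
# Show the command for joining as a manager node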
[root@docker001 ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-b9ghpo4knegpp3b5hp77q8y2k 172.18.213.61:2377
# Show the command for joining as a worker node
[root@docker001 ~]# docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-9bdz0yt7jiaqdxx9exmve7ky8 172.18.213.61:2377
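For scripting, docker swarm join-token can print just the token when given the -q flag. The following is a minimal sketch under the assumption that it runs on a manager node; print_join_command is a hypothetical helper name, and 2377 is the management port that appears in the join commands above.

```shell
# Hypothetical helper: print the join command for a role ("worker" or "manager").
# Must be run on a manager node; 2377 is the port shown in the output above.
print_join_command() {
    role=$1
    manager_addr=$2
    # -q prints only the token instead of the full instructions
    token=$(docker swarm join-token -q "$role") || return 1
    echo "docker swarm join --token $token $manager_addr:2377"
}
```

For example, print_join_command worker 172.18.213.61 prints a command that can be pasted on the node that should join.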
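The Raft protocol
Raft: the cluster can be managed only while a majority of the manager nodes is alive. Surviving the loss of one manager therefore takes at least 3 managers, of which at least 2 must stay up.
So in the two-manager, two-worker layout above, if one manager goes down, the other manager can no longer manage the cluster either.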
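The majority rule can be checked with a little integer arithmetic. This is only an illustrative sketch; quorum and tolerance are hypothetical helper names, not Docker commands.

```shell
# Raft majority arithmetic for N manager nodes (integer division).
# quorum N    -> managers that must stay online: floor(N/2) + 1
# tolerance N -> managers that may fail:         floor((N-1)/2)
quorum() {
    echo $(( $1 / 2 + 1 ))
}

tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# Print the numbers for small clusters
for n in 1 2 3 4 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

With 2 managers the quorum is 2 and no failure is tolerated, which is why the two-manager layout gains nothing over a single manager. With 3 managers the quorum is still 2, so one manager may fail.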
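Experiment: two manager nodes
1 Stop the Docker service on docker001 to simulate a manager failure; the other manager node becomes unusable as well
# Stop the Docker service
# Note: both commands are needed, otherwise the Docker service is not fully stopped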
[root@docker001 ~]# systemctl stop docker.service
[root@docker001 ~]# systemctl stop docker.socket
# Listing the nodes fails
[root@docker001 ~]# docker node ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
2 Listing the nodes on docker002 fails as well
[root@docker002 ~]# docker node ls
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
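Conclusion: use at least 3 manager nodes. Two managers add no fault tolerance, because when one goes down the other loses quorum and cannot manage the cluster.
Experiment: three manager nodes
1 Start the Docker service on docker001
# Start the Docker service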
[root@docker001 ~]# systemctl start docker
[root@docker001 ~]# systemctl start docker.socket
# List the nodes
# This node is now Reachable rather than Leader, but it is still a manager node
[root@docker001 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 * docker001 Ready Active Reachable 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Leader 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Ready Active 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
2 Leave the cluster on docker003
[root@docker003 ~]# docker swarm leave
Node left the swarm.
3 Check from docker001
# docker003 now shows Down, which means it has left the cluster
[root@docker001 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 * docker001 Ready Active Reachable 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Leader 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Down Active 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
4 Rejoin the cluster from docker003 as a manager
[root@docker003 ~]# docker swarm join --token SWMTKN-1-0dakw28f2r0amjeb5wr32i69i2fdfpv4eers78lwkyhibxc547-b9ghpo4knegpp3b5hp77q8y2k 172.18.213.61:2377
# There are now three manager nodes
[root@docker003 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 docker001 Ready Active Reachable 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Leader 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Down Active 20.10.8
ssatmcxgxh0rv16mb7lvadc56 * docker003 Ready Active Reachable 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
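The old docker003 entry stays in the list with status Down after the node rejoins under a new ID. One way to find such entries is to filter a formatted docker node ls listing. This is a sketch only; down_node_ids is a hypothetical helper, and it assumes the three-column format given in the usage comment.

```shell
# Hypothetical helper: read "ID HOSTNAME STATUS" lines on stdin and
# print the IDs of nodes whose STATUS column is "Down".
down_node_ids() {
    awk '$3 == "Down" { print $1 }'
}

# Usage on a manager node (review the list before removing anything):
#   docker node ls --format '{{.ID}} {{.Hostname}} {{.Status}}' | down_node_ids
#   docker node rm <ID>
```

A node that is only temporarily stopped also shows Down, so the printed IDs should be reviewed before any of them is passed to docker node rm.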
5 Stop the Docker service on docker001 to simulate a manager failure
[root@docker001 ~]# systemctl stop docker
[root@docker001 ~]# systemctl stop docker.socket
# The docker001 manager node is unavailable
[root@docker001 ~]# docker node ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
6 The other manager nodes remain usable
[root@docker002 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 docker001 Down Active Unreachable 20.10.8
x25eceyb65w590c25yguqsy7o * docker002 Ready Active Leader 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Down Active 20.10.8
ssatmcxgxh0rv16mb7lvadc56 docker003 Ready Active Reachable 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
[root@docker003 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 docker001 Down Active Unreachable 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Leader 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Down Active 20.10.8
ssatmcxgxh0rv16mb7lvadc56 * docker003 Ready Active Reachable 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
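7 If one more manager node is stopped, the last remaining manager node becomes unusable too
Conclusion: for high availability use at least 3 manager nodes. The cluster stays manageable only while a majority of the managers is alive (at least 2 of 3), so a two-manager setup adds nothing.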
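Publishing services on the Swarm cluster
Benefits of a Swarm cluster
Elastic scaling up and down
docker run: starts a single Docker container
docker-compose up: starts a single project (multiple containers)
docker service (swarm): starts a service across the cluster
1 Create a service
# docker run starts a container, which cannot be scaled
# docker service starts a service, which can be scaled up and down
# Current environment: 3 manager nodes and 1 worker node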
[root@docker001 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
qbia02mk2as1k1j27qevxe623 * docker001 Ready Active Reachable 20.10.8
x25eceyb65w590c25yguqsy7o docker002 Ready Active Reachable 20.10.8
fpctrtqs0zlodpfl5ze2yr65r docker003 Down Active 20.10.8
ssatmcxgxh0rv16mb7lvadc56 docker003 Ready Active Leader 20.10.8
09ih2yeax5pwwkmv5kwke981y docker004 Ready Active 20.10.8
# Show help for the service command
[root@docker001 ~]# docker service --help
Usage: docker service COMMAND
Manage services
Commands:
create Create a new service
inspect Display detailed information on one or more services
logs Fetch the logs of a service or task
ls List services
ps List the tasks of one or more services
rm Remove one or more services
rollback Revert changes to a service's configuration
scale Scale one or multiple replicated services
update Update a service
Run 'docker service COMMAND --help' for more information on a command.
# Create an nginx service
[root@docker001 ~]# docker service create -p 8888:80 --name my-nginx nginx
# List the services
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 1/1 nginx:latest *:8888->80/tcp
[root@docker001 ~]# docker service ps my-nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
l1gb7cpk9ya5 my-nginx.1 nginx:latest docker001 Running Running 2 hours ago
Access test
docker001 server
# The container lands on a node chosen by the scheduler (it looks random)
[root@docker001 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9879932a5b64 nginx:latest "/docker-entrypoint.…" 2 hours ago Up 2 hours 80/tcp my-nginx.1.l1gb7cpk9ya5llop42kradb93
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 1/1 nginx:latest *:8888->80/tcp
docker002 server
[root@docker002 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[root@docker002 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 1/1 nginx:latest *:8888->80/tcp
Conclusion: the container runs only on docker001, yet the nginx service is reachable through the public IP of any server in the cluster
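This behaviour comes from Swarm's ingress routing mesh: every node listens on the published port and forwards requests to a node that runs a task of the service. A quick way to see it is to request the port on each node in turn. The sketch below assumes curl is installed; probe_published_port is a hypothetical helper name.

```shell
# Hypothetical helper: request the published port on every given host and
# print the HTTP status code each one returns ("000" means no response).
probe_published_port() {
    port=$1
    shift
    for host in "$@"; do
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "http://$host:$port/")
        echo "$host:$port -> $code"
    done
}
```

For instance, probe_published_port 8888 followed by the addresses of the four nodes should report the same status code for every node, including those that run no nginx container.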
2 Dynamic scaling
# Scale up
# Method 1
# Counting the replica created earlier, there are 3 replicas in total
[root@docker001 ~]# docker service update --replicas 3 my-nginx
my-nginx
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 3/3 nginx:latest *:8888->80/tcp
[root@docker001 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9879932a5b64 nginx:latest "/docker-entrypoint.…" 2 hours ago Up 2 hours 80/tcp my-nginx.1.l1gb7cpk9ya5llop42kradb93
# Scale up
# Method 2
[root@docker001 ~]# docker service scale my-nginx=5
my-nginx scaled to 5
overall progress: 5 out of 5 tasks
1/5: running [==================================================>]
2/5: running [==================================================>]
3/5: running [==================================================>]
4/5: running [==================================================>]
5/5: running [==================================================>]
verify: Service converged
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 5/5 nginx:latest *:8888->80/tcp
# Scale down
# Method 1
[root@docker001 ~]# docker service update --replicas 2 my-nginx
my-nginx
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 2/2 nginx:latest *:8888->80/tcp
# Method 2
[root@docker001 ~]# docker service scale my-nginx=1
my-nginx scaled to 1
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
5582lwjnor9g my-nginx replicated 1/1 nginx:latest *:8888->80/tcp
For Docker Swarm, it is enough to master building a cluster, starting services, and managing containers dynamically
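The scale and update commands above wait for convergence by default. In a script that passes --detach, the REPLICAS column of docker service ls can be polled instead. This is a rough sketch; both helper names are hypothetical, and the name filter matches by prefix, so it assumes no other service name starts with the same text.

```shell
# Hypothetical helper: succeed when a REPLICAS value such as "3/3"
# reports as many running tasks as desired.
replicas_converged() {
    running=${1%%/*}
    desired=${1#*/}
    desired=${desired%% *}   # drop suffixes such as " (max 1 per node)"
    [ "$running" = "$desired" ]
}

# Hypothetical helper: poll once per second until the service converges
# or the attempts (default 30) run out.
wait_for_service() {
    service=$1
    attempts=${2:-30}
    while [ "$attempts" -gt 0 ]; do
        replicas=$(docker service ls --filter "name=$service" --format '{{.Replicas}}' | head -n 1)
        if [ -n "$replicas" ] && replicas_converged "$replicas"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

A typical use would be docker service scale -d my-nginx=5 followed by wait_for_service my-nginx.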
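Concept summary
Swarm
Cluster management and orchestration; Docker can initialize a swarm cluster that other nodes then join (as managers or workers)
Node
A Docker host; multiple nodes form a networked cluster (managers and workers)
Service
The definition of what to run; it can run on manager or worker nodes, and it is the core object that users access
Task
A single container together with the command it runs; the smallest unit that Swarm schedules
command -> manager -> API -> scheduler -> worker node (creates the task's container and maintains it)
Replicated and global services
Choose the mode in which a service runs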
# Remove the my-nginx service
[root@docker001 ~]# docker service rm my-nginx
my-nginx
[root@docker001 ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
# Show command help
[root@docker001 ~]# docker service create --help
--mode string Service mode (replicated, global, replicated-job, or global-job) (default "replicated")
# 1 Replicated mode: the scheduler places each task on any node, manager or worker
[root@docker001 ~]# docker service create --mode replicated --name test-nginx -p 8888:80 nginx
[root@docker001 ~]# docker service scale test-nginx=5
# 2 Global mode: one task is created on every node, managers and workers alike
# 4 tasks were created here, one per node
[root@docker001 ~]# docker service create --mode global --name g-nginx nginx
q8llksq08rglze7lmwd43djkv
overall progress: 4 out of 4 tasks
qbia02mk2as1: running [==================================================>]
09ih2yeax5pw: running [==================================================>]
x25eceyb65w5: running [==================================================>]
ssatmcxgxh0r: running [==================================================>]
verify: Service converged
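# 3 replicated-job mode: runs the tasks to completion as a one-off job; it does not restrict tasks to worker nodes
# nginx never exits on its own, so this particular job never completes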
[root@docker001 ~]# docker service create --mode replicated-job --name job-nginx nginx
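Placement is controlled separately from the service mode. To keep tasks off the manager nodes, a placement constraint on the node role can be used. This is a minimal sketch; create_on_workers is a hypothetical helper name and its arguments are illustrative.

```shell
# Hypothetical helper: create a replicated service whose tasks may only be
# scheduled on worker nodes.
create_on_workers() {
    name=$1
    image=$2
    replicas=${3:-1}
    docker service create \
        --name "$name" \
        --replicas "$replicas" \
        --constraint 'node.role==worker' \
        "$image"
}
```

In the cluster above, create_on_workers worker-nginx nginx 2 would place both tasks on docker004, the only worker node.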
Network modes
# List all networks
[root@docker001 ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
6ea4eb855f69 bridge bridge local
ac9c1f580cc5 docker_gwbridge bridge local
6bd28b52cb44 host host local
tsxu9o5jas6w ingress overlay swarm
3f4a7bbd59f3 none null local
# Inspect the ingress network
[root@docker001 ~]# docker network inspect ingress
[
{
"Name": "ingress",
## These are the internal addresses of each server in the cluster
"Peers": [
{
"Name": "9878bafdb90b",
"IP": "172.18.213.62"
},
{
"Name": "6d36e762e052",
"IP": "172.18.213.59"
},
{
"Name": "338c241250eb",
"IP": "172.18.213.60"
},
{
"Name": "dbf2be54f556",
"IP": "172.18.213.61"
}
]
}
]
# Publish mode: "PublishMode":"ingress"
# Swarm:
# Overlay:
# ingress: a special overlay network that provides load balancing (IPVS, VIP)
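The ingress network is created by Swarm itself. Services can also be attached to a user-defined overlay network, which lets them reach each other by service name across nodes. The sketch below shows one way to do that; the helper name and its arguments are hypothetical.

```shell
# Hypothetical helper: create an overlay network and start a service on it.
create_service_on_overlay() {
    network=$1
    name=$2
    image=$3
    # Overlay networks can only be created on a manager node
    docker network create --driver overlay "$network" || return 1
    docker service create --name "$name" --network "$network" "$image"
}
```

For example, create_service_on_overlay my-overlay web nginx creates the network my-overlay and a service named web attached to it.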
Other commands
docker stack
# docker-compose deploys a project on a single host
# docker stack deploys to the cluster
# Single host
docker-compose -f wordpress.yaml up -d
# Cluster
docker stack deploy STACK_NAME --compose-file=docker-compose.yml
[root@docker001 wordpress]# pwd
/root/wordpress
[root@docker001 wordpress]# cat docker-compose.yml
version: "3.9"
services:
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: somewordpress
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    volumes:
      - wordpress_data:/var/www/html
    ports:
      - "8000:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress
      WORDPRESS_DB_NAME: wordpress
    deploy:
      replicas: 2 # create 2 replicas of this service
volumes:
  db_data: {}
  wordpress_data: {}
# Create and start the services
[root@docker001 wordpress]# docker stack deploy my_wordpress --compose-file=docker-compose.yml
# The db service has one replica, while the wordpress service has 2
[root@docker001 wordpress]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
stsyruc88v6n my_wordpress_db replicated 1/1 mysql:5.7
rihqk9g304m8 my_wordpress_wordpress replicated 2/2 wordpress:latest *:8000->80/tcp
The site is reachable through the public IP of any server in the cluster
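Once a stack is deployed, the docker stack subcommands can inspect and remove it as a unit. This is a small sketch; stack_status and stack_remove are hypothetical wrapper names around real subcommands.

```shell
# Hypothetical helper: show the services and tasks that belong to one stack.
stack_status() {
    stack=$1
    docker stack services "$stack"
    docker stack ps "$stack"
}

# Hypothetical helper: remove the stack's services and networks.
# Named volumes such as db_data are left in place.
stack_remove() {
    docker stack rm "$1"
}
```

For the stack deployed above, stack_status my_wordpress lists both services and the node each task runs on.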
docker secret
# Security related
# Passwords
# Certificates
[root@docker001 wordpress]# docker secret --help
Usage: docker secret COMMAND
Manage Docker secrets
Commands:
create Create a secret from a file or STDIN as content
inspect Display detailed information on one or more secrets
ls List secrets
rm Remove one or more secrets
Run 'docker secret COMMAND --help' for more information on a command.
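As an illustration of how a secret might be used, the sketch below stores a password as a secret and hands it to a MySQL service through a file instead of a plain environment variable. The helper names, the service name db, and the secret name in the usage note are hypothetical. It relies on the mysql image reading MYSQL_ROOT_PASSWORD_FILE.

```shell
# Hypothetical helper: store a value read from stdin as a swarm secret.
create_secret_from_stdin() {
    docker secret create "$1" -
}

# Hypothetical helper: start MySQL with the root password read from the secret.
# Secrets are mounted inside the container under /run/secrets/<name>.
create_db_with_secret() {
    secret=$1
    docker service create \
        --name db \
        --secret "$secret" \
        -e "MYSQL_ROOT_PASSWORD_FILE=/run/secrets/$secret" \
        mysql:5.7
}
```

A possible sequence is printf '%s' 'somewordpress' | create_secret_from_stdin db_root_password, followed by create_db_with_secret db_root_password.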
docker config
# Configuration
[root@docker001 wordpress]# docker config --help
Usage: docker config COMMAND
Manage Docker configs
Commands:
create Create a config from a file or STDIN
inspect Display detailed information on one or more configs
ls List configs
rm Remove one or more configs
Run 'docker config COMMAND --help' for more information on a command.
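Configs work much like secrets but are meant for non-sensitive files. The sketch below shows one possible use: it stores a local nginx.conf as a config and mounts it into an nginx service. The helper name, the service name nginx-with-config, and the published port 8889 are hypothetical choices for illustration.

```shell
# Hypothetical helper: store a local file as a swarm config and mount it
# into a new nginx service at /etc/nginx/nginx.conf.
create_nginx_with_config() {
    config_name=$1
    config_file=$2
    docker config create "$config_name" "$config_file" || return 1
    docker service create \
        --name nginx-with-config \
        --config "source=$config_name,target=/etc/nginx/nginx.conf" \
        -p 8889:80 \
        nginx
}
```

For example, create_nginx_with_config nginx_conf ./nginx.conf, assuming a valid nginx.conf exists in the current directory.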