Deploying ETCD Distributed Storage
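I. ETCD Overview
ETCD is a distributed, consistent key-value store that can be used for service registration and discovery and for shared configuration. It has the following advantages:
- Simple: compared with the notoriously hard-to-understand Paxos algorithm, etcd achieves consistency with the relatively simple and easy-to-implement Raft algorithm, and exposes its API over gRPC
- Secure: supports TLS communication, and read/write access to keys can be controlled per user
- High performance: benchmarked at 10,000 writes per second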
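II. The overlay Network Mode
When containers on two different hosts communicate, they use the overlay network mode. Host mode can also provide cross-host communication, because containers then use the host's physical IP address directly. An overlay creates a virtual network in which a container gets an address such as 10.0.9.3. Inside the overlay network there is an address that acts like a gateway; packets are forwarded through it to the physical server's address and then routed and switched until they reach the IP address of the other server.
How is overlay implemented for Docker containers?
A service discovery component is required, for example consul (this article uses etcd). It defines an IP address pool such as 10.0.9.0/24. Containers obtain their IP addresses from this pool and then communicate through eth1, which is how cross-host communication is achieved.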
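As a rough illustration of the address pool idea, the sketch below only prints the `docker network create` command that would reserve 10.0.9.0/24 for an overlay network; it does not run it. The helper name `overlay_create_cmd` and the network name `demo` are assumptions made for this example.

```shell
# Print (without running) the command that would create an overlay network
# whose containers draw their addresses from the given pool.
overlay_create_cmd() {
    subnet="$1"   # address pool, e.g. 10.0.9.0/24
    name="$2"     # network name (hypothetical)
    printf 'docker network create -d overlay --subnet %s %s\n' "$subnet" "$name"
}

overlay_create_cmd 10.0.9.0/24 demo
```

Printing the command first makes it easy to review the pool before any network is actually created.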
III. Deploying the ETCD Cluster
Deploying node1
[root@node1 ~]# wget https://github.com/coreos/etcd/releases/download/v3.0.12/etcd-v3.0.12-linux-amd64.tar.gz
[root@node1 ~]# tar zxf etcd-v3.0.12-linux-amd64.tar.gz
[root@node1 ~]# cd etcd-v3.0.12-linux-amd64/
[root@node1 etcd-v3.0.12-linux-amd64]# nohup ./etcd --name docker-node1 --initial-advertise-peer-urls http://10.211.55.12:2380 \
> --listen-peer-urls http://10.211.55.12:2380 \
> --listen-client-urls http://10.211.55.12:2379,http://127.0.0.1:2379 \
> --advertise-client-urls http://10.211.55.12:2379 \
> --initial-cluster-token etcd-cluster \
> --initial-cluster docker-node1=http://10.211.55.12:2380,docker-node2=http://10.211.55.13:2380 \
> --initial-cluster-state new&
[1] 32505
Deploying node2
[root@node2 ~]# wget https://github.com/coreos/etcd/releases/download/v3.0.12/etcd-v3.0.12-linux-amd64.tar.gz
[root@node2 ~]# tar zxf etcd-v3.0.12-linux-amd64.tar.gz
[root@node2 ~]# cd etcd-v3.0.12-linux-amd64/
[root@node2 etcd-v3.0.12-linux-amd64]# nohup ./etcd --name docker-node2 --initial-advertise-peer-urls http://10.211.55.13:2380 \
> --listen-peer-urls http://10.211.55.13:2380 \
> --listen-client-urls http://10.211.55.13:2379,http://127.0.0.1:2379 \
> --advertise-client-urls http://10.211.55.13:2379 \
> --initial-cluster-token etcd-cluster \
> --initial-cluster docker-node1=http://10.211.55.12:2380,docker-node2=http://10.211.55.13:2380 \
> --initial-cluster-state new&
[1] 19240
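The two start commands differ only in the node name and IP address, so they can be generated from one node table. The following is a minimal sketch under that assumption: the `NODES` variable and the helper names `initial_cluster` and `etcd_cmd` are hypothetical, and the script only prints the command line (a dry run) rather than starting etcd.

```shell
# Hypothetical node table: "name=ip" pairs, one per cluster member.
NODES="docker-node1=10.211.55.12 docker-node2=10.211.55.13"

# Join all members into the value expected by --initial-cluster.
initial_cluster() {
    out=""
    for pair in $NODES; do
        member=${pair%%=*}
        member_ip=${pair#*=}
        out="${out:+$out,}${member}=http://${member_ip}:2380"
    done
    printf '%s\n' "$out"
}

# Print the etcd command line for one member; nothing is started.
etcd_cmd() {
    self_name="$1"
    self_ip="$2"
    printf './etcd --name %s' "$self_name"
    printf ' --initial-advertise-peer-urls http://%s:2380' "$self_ip"
    printf ' --listen-peer-urls http://%s:2380' "$self_ip"
    printf ' --listen-client-urls http://%s:2379,http://127.0.0.1:2379' "$self_ip"
    printf ' --advertise-client-urls http://%s:2379' "$self_ip"
    printf ' --initial-cluster-token etcd-cluster'
    printf ' --initial-cluster %s' "$(initial_cluster)"
    printf ' --initial-cluster-state new\n'
}

etcd_cmd docker-node1 10.211.55.12
etcd_cmd docker-node2 10.211.55.13
```

Keeping the member list in one place avoids the most common bootstrap mistake, which is an `--initial-cluster` value that differs between nodes.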
Checking cluster status
[root@node2 etcd-v3.0.12-linux-amd64]# ./etcdctl cluster-health
member 98da03f1eca9d9d is healthy: got healthy result from http://10.211.55.12:2379
member 63a987e985acb514 is healthy: got healthy result from http://10.211.55.13:2379
cluster is healthy
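Because etcd is started in the background, the health check may be run before the members have found each other. A small polling helper can cover that gap. This is a sketch, not part of etcd: the function names `cluster_is_healthy` and `wait_healthy` are hypothetical, and the check relies only on the final `cluster is healthy` line shown in the output above.

```shell
# Succeed only if stdin contains the exact line "cluster is healthy".
cluster_is_healthy() {
    grep -qx 'cluster is healthy'
}

# Poll a command until it reports a healthy cluster or the attempts run out.
# $1: number of attempts; remaining arguments: the command to run.
wait_healthy() {
    tries="$1"
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" 2>/dev/null | cluster_is_healthy; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage on a cluster node (not executed here):
#   wait_healthy 30 ./etcdctl cluster-health && echo "cluster ready"
```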
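Parameter reference:
● --data-dir: the member's data directory. If unspecified, it defaults to ${name}.etcd under the current directory. The data includes the member ID, cluster ID, initial cluster configuration, and snapshot files; if --wal-dir is not specified, the WAL files are stored here as well
● --wal-dir: the directory for the member's WAL files. If specified, the WAL files are stored separately from the other data files
● --name: the member's name
● --initial-advertise-peer-urls: the URLs this member advertises to the other members of the cluster; TCP port 2380 is used for communication within the cluster
● --listen-peer-urls: the URLs to listen on for traffic from other members
● --advertise-client-urls: the URLs advertised to clients, that is, the service URLs; TCP port 2379 is used for client requests
● --initial-cluster-token: a token that identifies the cluster during bootstrap
● --initial-cluster: all members of the cluster, as name=peer-URL pairs
● --initial-cluster-state: the cluster state; new creates a new cluster, existing joins an existing one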
IV. Managing the etcd Cluster
1. Check the version
# etcdctl --version
# etcdctl --help
2. Check cluster health
# etcdctl cluster-health
3. List cluster members
# etcdctl member list
Run this on any node to see the cluster's members and which one is the leader.
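To pick the leader out of the listing in a script, the output can be filtered with awk. The sketch below assumes the v2-style `member list` layout, in which each started member's line carries `name=` and `isLeader=` fields; the sample text is illustrative (which member is marked leader here is made up), and `leader_name` is a hypothetical helper.

```shell
# Print the name of the member whose line carries isLeader=true.
# Reads `etcdctl member list` output on stdin.
leader_name() {
    awk '/isLeader=true/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^name=/) { sub(/^name=/, "", $i); print $i }
    }'
}

# Illustrative input; the layout is an assumption, the leader flag is invented.
sample='98da03f1eca9d9d: name=docker-node1 peerURLs=http://10.211.55.12:2380 clientURLs=http://10.211.55.12:2379 isLeader=false
63a987e985acb514: name=docker-node2 peerURLs=http://10.211.55.13:2380 clientURLs=http://10.211.55.13:2379 isLeader=true'

printf '%s\n' "$sample" | leader_name

# Usage on a cluster node (not executed here):
#   ./etcdctl member list | leader_name
```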
4. Update a member
To update a member's IP (peerURLs), you first need to know that member's ID.
# etcdctl member list
# etcdctl member update memberID http://ip:2380
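Looking up the member ID by hand is error-prone, so the lookup can be scripted. This is a sketch that assumes the v2-style listing in which each line starts with `ID:` followed by a `name=` field; `member_id` is a hypothetical helper and the sample lines are illustrative.

```shell
# Print the ID of the member with the given name.
# $1: member name; stdin: `etcdctl member list` output.
member_id() {
    awk -v want="name=$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == want) { sub(/:$/, "", $1); print $1 }
    }'
}

# Illustrative input in the assumed layout.
sample='98da03f1eca9d9d: name=docker-node1 peerURLs=http://10.211.55.12:2380 clientURLs=http://10.211.55.12:2379 isLeader=true
63a987e985acb514: name=docker-node2 peerURLs=http://10.211.55.13:2380 clientURLs=http://10.211.55.13:2379 isLeader=false'

printf '%s\n' "$sample" | member_id docker-node2

# Usage on a cluster node (not executed here; NEW_IP is a placeholder):
#   id=$(etcdctl member list | member_id docker-node2)
#   etcdctl member update "$id" "http://NEW_IP:2380"
```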
5. Remove a member (scaling the etcd cluster in)
# etcdctl member list
# etcdctl member remove memberID
# etcdctl member list
# ps -ef|grep etcd    # then kill the etcd process on the removed node
6. Add a new member (scaling the etcd cluster out)
Note: the order of the steps matters; otherwise a cluster ID mismatch error is reported.
# etcdctl member add --help
a. Add the target node to the cluster
# etcdctl member add etcd3 http://10.1.2.174:2380
Added member named etcd3 with ID 28e0d98e7ec15cd4 to cluster
ETCD_NAME="etcd3"
ETCD_INITIAL_CLUSTER="etcd0=http://10.1.2.61:2380,etcd1=http://10.1.2.172:2380,etcd2=http://10.1.2.173:2380,etcd3=http://10.1.2.174:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
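The `ETCD_*` lines printed by `member add` are environment variables that etcd reads at startup, so they can be saved and reused on the new node. The following sketch filters them out of the command output; `member_add_env` is a hypothetical helper, and the sample text is copied from the output above.

```shell
# Keep only the ETCD_* assignments from `etcdctl member add` output (stdin)
# and turn them into export statements that a start script can source.
member_add_env() {
    grep '^ETCD_' | sed 's/^/export /'
}

# Sample taken from the member add output shown above.
sample='Added member named etcd3 with ID 28e0d98e7ec15cd4 to cluster
ETCD_NAME="etcd3"
ETCD_INITIAL_CLUSTER="etcd0=http://10.1.2.61:2380,etcd1=http://10.1.2.172:2380,etcd2=http://10.1.2.173:2380,etcd3=http://10.1.2.174:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"'

printf '%s\n' "$sample" | member_add_env

# Usage (not executed here; the file name is an assumption):
#   etcdctl member add etcd3 http://10.1.2.174:2380 | member_add_env > etcd3.env
```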
b. List the members; the newly added etcd3 is shown as unstarted
# etcdctl member list
d4f257d2b5f99b64[unstarted]: peerURLs=http://10.1.2.174:2380
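c. Clear the data-dir on the target node etcd3
After a member is removed, the cluster's membership information is updated, and the new node joins the cluster as a brand-new member. If its data-dir still contains data, etcd reads the existing data at startup and keeps using the old member ID, which prevents it from joining the cluster. The new node's data-dir must therefore be cleared.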
# rm -rf /path/to/etcd/data
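A more cautious variant of the `rm -rf` above is to move the old directory aside, so the data can still be recovered if the wrong node was cleared. This is a sketch, not the original procedure: `set_aside_data_dir` is a hypothetical helper, and it replaces deletion with a rename.

```shell
# Move an old data directory aside instead of deleting it outright.
# Refuses empty paths and "/" so that a mistyped variable cannot wipe the host.
set_aside_data_dir() {
    dir="$1"
    case "$dir" in
        ""|"/") echo "refusing to touch '$dir'" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || return 0            # nothing to clear
    backup="${dir%/}.bak.$(date +%s)"    # timestamped backup name
    mv "$dir" "$backup" && echo "moved $dir to $backup"
}

# Usage on the target node (not executed here):
#   set_aside_data_dir /path/to/etcd/data
```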
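d. Start the new member on the target node
The --initial-cluster-state flag must be set to existing here. If it is set to new, a new member ID is generated automatically, which does not match the ID generated when the member was added, so the log reports a member ID mismatch.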
# vim etcd3.sh
Edit etcd3.sh: change the --advertise-client-urls and --initial-advertise-peer-urls parameters to etcd3's addresses, and set --initial-cluster-state to existing
# nohup ./etcd3.sh &
# etcdctl member list
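The contents of etcd3.sh are not shown in this article. A minimal sketch of what such a script might look like is given below, assuming the same flags as the bootstrap commands in section III, the addresses from the `member add` output, and the placeholder data path used in step c. The `ETCD_BIN` switch is an addition for this example so the command line can be printed instead of executed.

```shell
#!/bin/sh
# etcd3.sh (sketch): start etcd3 as a member joining an existing cluster.
ETCD_BIN="${ETCD_BIN:-./etcd}"    # set to "echo" for a dry run
SELF_IP="10.1.2.174"              # etcd3's address, from the member add step
DATA_DIR="/path/to/etcd/data"     # placeholder path; must be empty on first start
INITIAL_CLUSTER="etcd0=http://10.1.2.61:2380,etcd1=http://10.1.2.172:2380,etcd2=http://10.1.2.173:2380,etcd3=http://10.1.2.174:2380"

run_etcd3() {
    "$ETCD_BIN" --name etcd3 \
        --data-dir "$DATA_DIR" \
        --initial-advertise-peer-urls "http://$SELF_IP:2380" \
        --listen-peer-urls "http://$SELF_IP:2380" \
        --listen-client-urls "http://$SELF_IP:2379,http://127.0.0.1:2379" \
        --advertise-client-urls "http://$SELF_IP:2379" \
        --initial-cluster "$INITIAL_CLUSTER" \
        --initial-cluster-state existing
}

# Dry run: print the command line instead of starting etcd.
(ETCD_BIN=echo; run_etcd3)

# To start the member for real, call run_etcd3 with ETCD_BIN left at ./etcd.
```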
V. Using ETCD Distributed Storage with Docker
node1
[root@node1 ~]# service docker stop
[root@node1 ~]# /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=etcd://10.211.55.12:2379 --cluster-advertise=10.211.55.12:2375&
node2
[root@node2 ~]# service docker stop
[root@node2 ~]# /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=etcd://10.211.55.13:2379 --cluster-advertise=10.211.55.13:2375&
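Starting dockerd by hand does not survive a reboot. On Docker releases that still support `cluster-store` and `cluster-advertise`, the same settings can be kept in the daemon configuration file instead. The sketch below writes an equivalent fragment to a scratch file for review; the `OUT` and `NODE_IP` variables are assumptions for this example, and copying the result to /etc/docker/daemon.json is left as a manual step.

```shell
# Write a daemon.json fragment equivalent to the dockerd flags used on node1.
# OUT defaults to a temporary file so nothing under /etc is touched.
OUT="${OUT:-$(mktemp)}"
NODE_IP="${NODE_IP:-10.211.55.12}"

cat > "$OUT" <<EOF
{
  "hosts": ["tcp://0.0.0.0:2375", "unix:///var/run/docker.sock"],
  "cluster-store": "etcd://${NODE_IP}:2379",
  "cluster-advertise": "${NODE_IP}:2375"
}
EOF

cat "$OUT"
```

On node2 the same fragment would be generated with NODE_IP=10.211.55.13.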
VI. Creating the overlay Network
Create an overlay network named demo on node1
[root@node1 ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
550d5b450fe3 bridge bridge local
cca92be73cc6 host host local
d21360748bfc none null local
[root@node1 ~]# docker network create -d overlay demo
97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846
[root@node1 ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
550d5b450fe3 bridge bridge local
97e959031044 demo overlay global
cca92be73cc6 host host local
d21360748bfc none null local
[root@node1 ~]# docker network inspect demo
[
{
"Name": "demo",
"Id": "97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846",
"Created": "2018-08-01T22:22:01.958142468+08:00",
"Scope": "global",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "10.0.0.0/24",
"Gateway": "10.0.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
On node2, we can see that the demo overlay network has been created there as well
[root@node2 etcd-v3.0.12-linux-amd64]# cd
[root@node2 ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
c6af37ef6765 bridge bridge local
97e959031044 demo overlay global
de15cdab46b0 host host local
cc3ec612fd29 none null local
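# This shows that the etcd distributed store is working and the two nodes' data is in sync
Looking at the key-value pairs in etcd, we can see that the demo network was synchronized from node1 to node2 through etcd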
[root@node2 etcd-v3.0.12-linux-amd64]# ./etcdctl ls /docker
/docker/nodes
/docker/network
[root@node2 etcd-v3.0.12-linux-amd64]# ./etcdctl ls /docker/nodes
/docker/nodes/10.211.55.12:2375
/docker/nodes/10.211.55.13:2375
[root@node2 etcd-v3.0.12-linux-amd64]# ./etcdctl ls /docker/network/v1.0/network
/docker/network/v1.0/network/97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846
[root@node2 etcd-v3.0.12-linux-amd64]# ./etcdctl get /docker/network/v1.0/network/97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846
{"addrSpace":"GlobalDefault","attachable":false,"configFrom":"","configOnly":false,"created":"2018-08-01T22:22:01.958142468+08:00","enableIPv6":false,"generic":{"com.docker.network.enable_ipv6":false,"com.docker.network.generic":{}},"id":"97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846","inDelete":false,"ingress":false,"internal":false,"ipamOptions":{},"ipamType":"default","ipamV4Config":"[{\"PreferredPool\":\"\",\"SubPool\":\"\",\"Gateway\":\"\",\"AuxAddresses\":null}]","ipamV4Info":"[{\"IPAMData\":\"{\\\"AddressSpace\\\":\\\"GlobalDefault\\\",\\\"Gateway\\\":\\\"10.0.0.1/24\\\",\\\"Pool\\\":\\\"10.0.0.0/24\\\"}\",\"PoolID\":\"GlobalDefault/10.0.0.0/24\"}]","labels":{},"loadBalancerIP":"","name":"demo","networkType":"overlay","persist":true,"postIPv6":false,"scope":"global"}
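The stored value is one long JSON line, so it helps to pull out just the field of interest. The sketch below extracts the network name with sed; `network_name` is a hypothetical helper, the sample is an abridged copy of the record above, and a real JSON parser would be the more robust choice.

```shell
# Print the "name" field of a network record read from stdin.
network_name() {
    sed -n 's/.*"name":"\([^"]*\)".*/\1/p'
}

# Abridged copy of the record above (only a few fields kept).
sample='{"addrSpace":"GlobalDefault","id":"97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846","name":"demo","networkType":"overlay","scope":"global"}'

printf '%s\n' "$sample" | network_name

# Usage on a cluster node (not executed here):
#   ./etcdctl get /docker/network/v1.0/network/97e959031044ec634d61d2e721cb0348d7ff852af3f575d75d2988c07e0f9846 | network_name
```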
VII. Creating Containers Attached to the demo Network
node1
[root@node1 ~]# docker run -d --name test1 --net demo busybox sh -c "while true; do sleep 3600; done"
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
8c5a7da1afbc: Pull complete
Digest: sha256:cb63aa0641a885f54de20f61d152187419e8f6b159ed11a251a09d115fdff9bd
Status: Downloaded newer image for busybox:latest
4bc3ab1cb7d838e8ef314618e6d3d878e744ef7842196a00b3999e6b6fe8402f
ERRO[2018-08-01T22:26:33.105124642+08:00] Failed to deserialize netlink ndmsg: invalid argument
INFO[0379] shim docker-containerd-shim started address="/containerd-shim/moby/4bc3ab1cb7d838e8ef314618e6d3d878e744ef7842196a00b3999e6b6fe8402f/shim.sock" debug=false pid=6630
[root@node1 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4bc3ab1cb7d8 busybox "sh -c 'while true; …" 26 seconds ago Up 24 seconds test1
[root@node1 ~]# docker exec test1 ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:0A:00:00:02
inet addr:10.0.0.2 Bcast:10.0.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth1 Link encap:Ethernet HWaddr 02:42:AC:12:00:02
inet addr:172.18.0.2 Bcast:172.18.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:31 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4045 (3.9 KiB) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
node2
[root@node2 ~]# docker run -d --name test1 --net demo busybox sh -c "while true; do sleep 3600; done"
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
56bec22e3559: Pull complete
Digest: sha256:29f5d56d12684887bdfa50dcd29fc31eea4aaf4ad3bec43daf19026a7ce69912
Status: Downloaded newer image for busybox:latest
fad6dc6538a85d3dcc958e8ed7b1ec3810feee3e454c1d3f4e53ba25429b290b
docker: Error response from daemon: service endpoint with name test1 already exists. # a container named test1 already exists on this network (created on node1)
[root@node2 ~]# docker run -d --name test2 --net demo busybox sh -c "while true; do sleep 3600; done"
9d494a2f66a69e6b861961d0c6af2446265bec9b1d273d7e70d0e46eb2e98d20
Verifying connectivity
[root@node2 etcd-v3.0.12-linux-amd64]# docker exec -it test2 ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:0A:00:00:03
inet addr:10.0.0.3 Bcast:10.0.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth1 Link encap:Ethernet HWaddr 02:42:AC:12:00:02
inet addr:172.18.0.2 Bcast:172.18.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:32 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4110 (4.0 KiB) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
[root@node1 ~]# docker exec test1 sh -c "ping 10.0.0.3"
PING 10.0.0.3 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: seq=0 ttl=64 time=0.698 ms
64 bytes from 10.0.0.3: seq=1 ttl=64 time=1.034 ms
64 bytes from 10.0.0.3: seq=2 ttl=64 time=1.177 ms
64 bytes from 10.0.0.3: seq=3 ttl=64 time=0.708 ms
64 bytes from 10.0.0.3: seq=4 ttl=64 time=0.651 ms
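The ping target above was read off the ifconfig output by hand. The lookup can be scripted by extracting the eth0 address, which is the interface on the overlay subnet in the listings above. This is a sketch: `overlay_ip` is a hypothetical helper, and the sample is a shortened copy of the test2 output.

```shell
# Print the IPv4 address of eth0 from busybox-style ifconfig output (stdin).
overlay_ip() {
    awk '/^eth0/ { found = 1 }
         found && /inet addr:/ { sub(/.*inet addr:/, ""); print $1; exit }'
}

# Shortened copy of the ifconfig output of test2 shown above.
sample='eth0 Link encap:Ethernet HWaddr 02:42:0A:00:00:03
inet addr:10.0.0.3 Bcast:10.0.0.255 Mask:255.255.255.0
eth1 Link encap:Ethernet HWaddr 02:42:AC:12:00:02
inet addr:172.18.0.2 Bcast:172.18.255.255 Mask:255.255.0.0'

printf '%s\n' "$sample" | overlay_ip

# Usage (not executed here): run the first line on node2, the second on node1.
#   ip=$(docker exec test2 ifconfig | overlay_ip)
#   docker exec test1 ping -c 3 "$ip"
```

Using `-c 3` makes ping stop after three packets, which suits a scripted check better than the endless ping shown above.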