etcd Cluster Installation and Maintenance (V2)
yum install -y etcd
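If etcd was installed previously, stop the etcd service and clear its persistent data directory. The default directory is /var/lib/etcd/default.etcd.
To stop the service: systemctl stop etcd
To clear the data: mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak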
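The cleanup step can be made repeatable by adding a timestamp to the archived directory, so that a second cleanup does not collide with an existing .bak directory. This is a minimal sketch; archive_data_dir is a hypothetical helper name, not part of etcd.

```shell
#!/usr/bin/env bash
# Sketch only: archive an old etcd data directory instead of deleting it.
# archive_data_dir is a hypothetical helper, not part of etcd.

archive_data_dir() {
    local dir="$1" target
    # Nothing to do if the directory does not exist
    [ -d "$dir" ] || return 0
    target="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$target" || return 1
    # Print the new location so the caller can log it
    echo "$target"
}
```

Possible usage, after the service has been stopped: systemctl stop etcd && archive_data_dir /var/lib/etcd/default.etcd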
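1. Edit the configuration file: vi /etc/etcd/etcd.conf
The configuration is shown below. Replace the IP addresses with your own; this example uses 172.28.73.54, 172.28.73.40, and 172.28.16.153.
Configuration for 172.28.73.54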
ETCD_NAME="node1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.28.73.54:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.28.73.54:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.28.73.54:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.28.73.54:2379"
ETCD_INITIAL_CLUSTER="node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380,node3=http://172.28.16.153:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Configuration for 172.28.73.40
ETCD_NAME="node2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.28.73.40:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.28.73.40:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.28.73.40:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.28.73.40:2379"
ETCD_INITIAL_CLUSTER="node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380,node3=http://172.28.16.153:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Configuration for 172.28.16.153
ETCD_NAME="node3"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.28.16.153:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.28.16.153:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.28.16.153:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.28.16.153:2379"
ETCD_INITIAL_CLUSTER="node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380,node3=http://172.28.16.153:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
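Because the three files differ only in the node name and IP, they can be generated rather than edited by hand, which avoids copy-and-paste mistakes in the addresses. The following is a minimal sketch, not a definitive tool; build_initial_cluster and render_etcd_conf are hypothetical helper names, and the ports 2379/2380 are taken from the example above.

```shell
#!/usr/bin/env bash
# Sketch only: render /etc/etcd/etcd.conf content for one node.
# build_initial_cluster and render_etcd_conf are hypothetical helpers.

# Turn "name=ip" pairs into the ETCD_INITIAL_CLUSTER value.
build_initial_cluster() {
    local out="" pair
    for pair in "$@"; do
        # ${pair%%=*} is the node name, ${pair#*=} is the IP
        out="${out:+$out,}${pair%%=*}=http://${pair#*=}:2380"
    done
    echo "$out"
}

# Print the config for node NAME at IP; the remaining arguments are
# the "name=ip" pairs of every node in the cluster.
render_etcd_conf() {
    local name="$1" ip="$2"
    shift 2
    cat <<EOF
ETCD_NAME="$name"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="http://$ip:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://$ip:2379"
ETCD_INITIAL_CLUSTER="$(build_initial_cluster "$@")"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}
```

Possible usage on the first server: render_etcd_conf node1 172.28.73.54 node1=172.28.73.54 node2=172.28.73.40 node3=172.28.16.153 > /etc/etcd/etcd.conf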
2. Edit the etcd systemd unit file on every server
vi /usr/lib/systemd/system/etcd.service
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd \
  --name=\"${ETCD_NAME}\" \
  --data-dir=\"${ETCD_DATA_DIR}\" \
  --listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" \
  --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\" \
  --initial-advertise-peer-urls=\"${ETCD_INITIAL_ADVERTISE_PEER_URLS}\" \
  --advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" \
  --initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" \
  --initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" \
  --initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\""
Restart=on-failure
LimitNOFILE=65536
3. Start
Start the etcd service on each node in turn:
systemctl start etcd
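4. Verify
List the members:
etcdctl member list
Check the cluster health on each node:
etcdctl cluster-health
On any node, run: etcdctl set /test "hello etcd"
On another node, run: etcdctl get /test. If the value is returned, the cluster is working.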
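The write-then-read check above can be scripted so that it covers every node in one run. This is a minimal sketch that assumes the v2 etcdctl accepts the global --endpoints flag; check_replication is a hypothetical helper name, and the ETCDCTL variable exists only so the command can be swapped for a stub.

```shell
#!/usr/bin/env bash
# Sketch only: write a key through one node and read it back through the others.
# check_replication is a hypothetical helper; ETCDCTL can be overridden.
ETCDCTL="${ETCDCTL:-etcdctl}"

# Usage: check_replication KEY VALUE WRITER_ENDPOINT READER_ENDPOINT...
check_replication() {
    local key="$1" value="$2" writer="$3" ep got
    shift 3
    # Write through the first endpoint
    "$ETCDCTL" --endpoints "$writer" set "$key" "$value" >/dev/null || return 1
    # Read the key back through every other endpoint
    for ep in "$@"; do
        got=$("$ETCDCTL" --endpoints "$ep" get "$key") || return 1
        if [ "$got" != "$value" ]; then
            echo "mismatch on $ep: got '$got'" >&2
            return 1
        fi
    done
    echo "replication OK"
}
```

Possible usage: check_replication /test "hello etcd" http://172.28.73.54:2379 http://172.28.73.40:2379 http://172.28.16.153:2379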
5. Add a new node
For example, to add a node at 172.28.16.166.
Configuration file:
ETCD_NAME="node4"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.28.16.166:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.28.16.166:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.28.16.166:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.28.16.166:2379"
ETCD_INITIAL_CLUSTER="node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380,node3=http://172.28.16.153:2380,node4=http://172.28.16.166:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="existing"
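Edit the unit file in the same way as in step 2.
On any existing node, register the new member first:
etcdctl member add node4 http://172.28.16.166:2380
Then start etcd on node4: systemctl start etcd
Run etcdctl member list to check the member information and status.
If the node fails to join (for example, because it was started before it was registered), clear node4's data directory and restart the etcd service on node4.
On the other three nodes, add node4 to ETCD_INITIAL_CLUSTER and restart the etcd service.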
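Updating ETCD_INITIAL_CLUSTER on several nodes by hand is error-prone, so the edit can be scripted. The following is a minimal sketch assuming GNU sed; set_conf_value is a hypothetical helper name, and it assumes the value contains no '|', '&', or backslash characters.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite one KEY="value" line in an etcd.conf-style file.
# set_conf_value is a hypothetical helper. Assumes GNU sed (-i) and a value
# without '|', '&', or backslash characters.

set_conf_value() {
    local file="$1" key="$2" value="$3"
    # Anchoring on "KEY=" keeps ETCD_INITIAL_CLUSTER from also matching
    # ETCD_INITIAL_CLUSTER_TOKEN or ETCD_INITIAL_CLUSTER_STATE.
    sed -i "s|^${key}=.*|${key}=\"${value}\"|" "$file"
}
```

Possible usage on each existing node, followed by a restart of the etcd service: set_conf_value /etc/etcd/etcd.conf ETCD_INITIAL_CLUSTER "node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380,node3=http://172.28.16.153:2380,node4=http://172.28.16.166:2380"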
6. Remove a node
First find the node ID with etcdctl member list, then remove it with etcdctl member remove <node ID>.
On the other three nodes, remove node4 from ETCD_INITIAL_CLUSTER and restart the etcd service.
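The lookup and the removal can be combined so that the ID does not have to be copied by hand. This is a minimal sketch that assumes the v2 member list output has the form "ID: name=... peerURLs=... clientURLs=..."; member_id_by_name and remove_member_by_name are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: look up a member ID by name, then remove that member.
# Assumes `etcdctl member list` (v2) prints lines such as
#   8211f1d0f64f3269: name=node4 peerURLs=http://... clientURLs=http://...
# member_id_by_name and remove_member_by_name are hypothetical helpers.
ETCDCTL="${ETCDCTL:-etcdctl}"

# Read `member list` output on stdin and print the ID of the member NAME.
member_id_by_name() {
    awk -v want="name=$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == want) { sub(/:$/, "", $1); print $1 }
    }'
}

remove_member_by_name() {
    local id
    id=$("$ETCDCTL" member list | member_id_by_name "$1")
    if [ -z "$id" ]; then
        echo "no member named $1" >&2
        return 1
    fi
    "$ETCDCTL" member remove "$id"
}
```

Possible usage: remove_member_by_name node4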
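7. Backing up and restoring persistent data
etcd v2 and v3 data cannot be stored together; the default is v2. The following is the v2 backup and restore procedure. For v3, see "etcd cluster backup and data recovery".
7.1 Backup
// Seed test data (optional)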
etcdctl set /test test
etcdctl backup --data-dir /var/lib/etcd/default.etcd/ --backup-dir /tmp/etcd_backup
tar -czvf backup.etcd.tar.gz /tmp/etcd_backup
backup.etcd.tar.gz is the backup archive.
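For scheduled backups, the two commands above can be wrapped so that each run produces a separate, timestamped archive. This is a minimal sketch under those assumptions; backup_etcd_v2 is a hypothetical helper name, and the ETCDCTL variable exists only so the command can be swapped for a stub.

```shell
#!/usr/bin/env bash
# Sketch only: v2 backup into a timestamped, gzip-compressed archive.
# backup_etcd_v2 is a hypothetical helper; ETCDCTL can be overridden.
ETCDCTL="${ETCDCTL:-etcdctl}"

# Usage: backup_etcd_v2 DATA_DIR OUTPUT_DIR
backup_etcd_v2() {
    local data_dir="$1" out_dir="$2"
    local stamp work archive
    stamp=$(date +%Y%m%d%H%M%S)
    work=$(mktemp -d) || return 1
    archive="$out_dir/etcd_backup_$stamp.tar.gz"
    "$ETCDCTL" backup --data-dir "$data_dir" --backup-dir "$work/etcd_backup" || return 1
    # -C stores only the relative path etcd_backup/ inside the archive
    tar -czf "$archive" -C "$work" etcd_backup || return 1
    rm -rf "$work"
    echo "$archive"
}
```

Possible usage: backup_etcd_v2 /var/lib/etcd/default.etcd /root. Note that an archive produced this way extracts to etcd_backup/ rather than tmp/etcd_backup/, so the mv command in the restore step would need to be adjusted accordingly.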
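7.2 Restore
Copy backup.etcd.tar.gz to any one server of the cluster being restored. The cluster configuration is assumed to be the one from step 1. Set ETCD_INITIAL_CLUSTER_STATE to new on the first server restored and to existing on the other two.
tar -xvf backup.etcd.tar.gz
rm -rf /var/lib/etcd/*
mv tmp/etcd_backup/ /var/lib/etcd/default.etcd/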
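7.3 Verify the data and update the member information
etcd --data-dir="/var/lib/etcd/default.etcd" --name="node1" --force-new-cluster
Open a new terminal window and run: etcdctl get /test
Look up the member ID:
etcdctl member list
Update the peer URL (replace 71bf939a1c01 with the ID reported by member list):
curl http://localhost:2379/v2/members/71bf939a1c01 -XPUT -H "Content-Type:application/json" -d '{"peerURLs":["http://172.28.73.54:2380"]}'
etcdctl member list
In the first window, press Ctrl+C to stop the node.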
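The curl call above can be parameterized so that the member ID and peer URL are passed as arguments instead of being edited into the command. This is a minimal sketch against the same v2 members endpoint; peer_urls_payload and update_peer_url are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: update a member's peer URL through the v2 members API.
# peer_urls_payload and update_peer_url are hypothetical helpers.

# Build the JSON body sent with PUT /v2/members/<id>.
peer_urls_payload() {
    printf '{"peerURLs":["%s"]}\n' "$1"
}

# Usage: update_peer_url MEMBER_ID PEER_URL
update_peer_url() {
    local member_id="$1" peer_url="$2"
    # -f makes curl return a non-zero status on an HTTP error response
    curl -fsS -X PUT "http://localhost:2379/v2/members/$member_id" \
        -H "Content-Type: application/json" \
        -d "$(peer_urls_payload "$peer_url")"
}
```

Possible usage, with the ID taken from etcdctl member list: update_peer_url 71bf939a1c01 http://172.28.73.54:2380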
7.4 Start the current node
chown -R etcd:etcd /var/lib/etcd
systemctl daemon-reload
systemctl start etcd
7.5 Add a member (the IP is that of the second server)
etcdctl member add node2 http://172.28.73.40:2380
etcdctl member list
The new member is listed as unstarted at this point; this can be ignored.
7.6 Start the second node
Update the configuration in /etc/etcd/etcd.conf:
ETCD_INITIAL_CLUSTER="node1=http://172.28.73.54:2380,node2=http://172.28.73.40:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Start:
rm -rf /var/lib/etcd/*
systemctl daemon-reload
systemctl start etcd
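7.7 Start the third node
etcdctl member add node3 http://172.28.16.153:2380
Then follow the same steps as in 7.6 on the third server, with node3 included in ETCD_INITIAL_CLUSTER.
Verify the data. If everything is correct, update ETCD_INITIAL_CLUSTER on the second node to list all three nodes, then restart it.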
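8. Common errors
etcd fails to start:
Method 1: Clear the data-dir and restart.
Method 2: Check that the ownership of the relevant directories matches the user set in /usr/lib/systemd/system/etcd.service, and fix it if it does not. For example, the default user is etcd, but after a migration the directories may end up owned by root, which prevents etcd from starting.
chown -R etcd:etcd /var/lib/etcd
systemctl daemon-reload
systemctl start etcd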
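Before running chown, the ownership problem described in Method 2 can be confirmed by listing anything under the data directory that does not belong to the service user. This is a minimal sketch; find_wrong_owner and check_owner are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: report files under a directory not owned by the expected user.
# find_wrong_owner and check_owner are hypothetical helpers.

# Print every path under DIR whose owner is not USER.
find_wrong_owner() {
    find "$1" ! -user "$2" -print
}

# Return 0 if everything under DIR is owned by USER, 1 otherwise.
check_owner() {
    local bad
    bad=$(find_wrong_owner "$1" "$2") || return 2
    if [ -n "$bad" ]; then
        echo "not owned by $2:" >&2
        echo "$bad" >&2
        return 1
    fi
}
```

Possible usage: check_owner /var/lib/etcd etcd || chown -R etcd:etcd /var/lib/etcd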