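Manually adding an etcd server in an OpenShift environment
This walkthrough simulates backup and restore: starting from an existing cluster with a single master (which also runs etcd), an infra node, and a compute node, it adds another machine as an etcd server.
It is based on OpenShift 3.11; for details, see
To cut down on steps, first clone the master to create etcd1, change the clone's IP address and hostname, and then remove the services running on it: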
# mkdir -p /etc/origin/node/pods-stopped
# mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
Then the actual steps:
- Update /etc/hosts on both machines to add the nodes
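As a minimal sketch of this step, the helper below appends a host entry only if the hostname is not already listed, so it can be run repeatedly on both machines. The function name add_host_entry is made up for illustration; the addresses and hostnames are the ones that appear elsewhere in this walkthrough.

```shell
# Hypothetical helper: add "IP HOSTNAME" to a hosts file unless HOSTNAME
# already appears as a name on a non-comment line.
add_host_entry() {
    local ip="$1" host="$2" file="${3:-/etc/hosts}"
    if awk -v h="$host" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$host" >> "$file"
}

# Usage (as root, on both machines):
#   add_host_entry 192.168.56.103 master.example.com
#   add_host_entry 192.168.56.109 etcd1.example.com
```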
- Generate the certificates the new node needs
Run the following on the master node.
export NEW_ETCD_HOSTNAME="etcd1.example.com"
export NEW_ETCD_IP="192.168.56.109"
export CN=$NEW_ETCD_HOSTNAME
export SAN="IP:${NEW_ETCD_IP}, DNS:${NEW_ETCD_HOSTNAME}"
export PREFIX="/etc/etcd/generated_certs/etcd-$CN/"
export OPENSSLCFG="/etc/etcd/ca/openssl.cnf"

# mkdir -p ${PREFIX}
# openssl req -new -config ${OPENSSLCFG} \
    -keyout ${PREFIX}server.key \
    -out ${PREFIX}server.csr \
    -reqexts etcd_v3_req -batch -nodes \
    -subj /CN=$CN
# openssl ca -name etcd_ca -config ${OPENSSLCFG} \
    -out ${PREFIX}server.crt \
    -in ${PREFIX}server.csr \
    -extensions etcd_v3_ca_server -batch
# openssl req -new -config ${OPENSSLCFG} \
    -keyout ${PREFIX}peer.key \
    -out ${PREFIX}peer.csr \
    -reqexts etcd_v3_req -batch -nodes \
    -subj /CN=$CN
# openssl ca -name etcd_ca -config ${OPENSSLCFG} \
    -out ${PREFIX}peer.crt \
    -in ${PREFIX}peer.csr \
    -extensions etcd_v3_ca_peer -batch
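It can be worth confirming that the new certificates chain to the etcd CA and carry the expected SAN entries before going further. The sketch below relies only on the standard openssl verify and openssl x509 subcommands; both function names are made up for illustration, and /etc/etcd/ca.crt is the CA certificate path used elsewhere in this walkthrough.

```shell
# Hypothetical helpers for sanity-checking a freshly signed certificate.

# Succeeds if CERT chains to the CA certificate CA_CERT.
cert_signed_by() {
    local ca_cert="$1" cert="$2"
    openssl verify -CAfile "$ca_cert" "$cert" >/dev/null 2>&1
}

# Prints the Subject Alternative Name entries of CERT (nothing if it has none).
cert_sans() {
    openssl x509 -in "$1" -noout -text |
        grep -A1 'Subject Alternative Name' | tail -n 1 | sed 's/^ *//'
}

# Usage (on the master, with PREFIX exported as above):
#   cert_signed_by /etc/etcd/ca.crt "${PREFIX}server.crt" && echo "server.crt verifies"
#   cert_signed_by /etc/etcd/ca.crt "${PREFIX}peer.crt"   && echo "peer.crt verifies"
#   cert_sans "${PREFIX}peer.crt"
```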
Copy etcd.conf and ca.crt into the directory prepared on the master for the new etcd node:
# cp /etc/etcd/etcd.conf ${PREFIX}
# cp /etc/etcd/ca.crt ${PREFIX}
- Add the member (run on the master)
First run member list and make sure no member is registered as localhost:
etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://192.168.56.103:2379" \
    member list
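The localhost check can be scripted. This is a minimal sketch, assuming the etcdctl v2 member list format in which each member line carries peerURLs= and clientURLs= fields; the function name is made up for illustration.

```shell
# Hypothetical helper: reads `etcdctl member list` output on stdin and fails
# if any member advertises a localhost or 127.0.0.1 URL.
no_localhost_members() {
    ! grep -Eq '(peerURLs|clientURLs)=[^ ]*(localhost|127\.0\.0\.1)'
}

# Usage: pipe the member list command above into it, e.g.
#   etcdctl --cert-file=/etc/etcd/peer.crt --key-file=/etc/etcd/peer.key \
#       --ca-file=/etc/etcd/ca.crt --peers="https://192.168.56.103:2379" \
#       member list | no_localhost_members || echo "a member still uses localhost"
```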
etcdctl -C https://192.168.56.103:2379 \
    --ca-file=/etc/etcd/ca.crt \
    --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    member add ${NEW_ETCD_HOSTNAME} https://${NEW_ETCD_IP}:2380
Member 2bc199c384f701e3 added to cluster e99c0083931d3d79

ETCD_NAME="etcd1.example.com"
ETCD_INITIAL_CLUSTER="etcd1.example.com=https://192.168.56.109:2380,master.example.com=https://192.168.56.103:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.56.109:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
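The ETCD_* lines printed by member add are the values needed when editing etcd.conf, so it can help to save them rather than copy them by hand. A minimal sketch; the function name and the member-add.env file name are made up for illustration.

```shell
# Hypothetical helper: keep only the ETCD_* assignments from
# `etcdctl member add` output read on stdin.
extract_etcd_vars() {
    grep -E '^ETCD_[A-Z_]+='
}

# Usage: pipe the member add command into it and keep the result next to
# the generated certificates, e.g.
#   etcdctl -C https://192.168.56.103:2379 \
#       --ca-file=/etc/etcd/ca.crt \
#       --cert-file=/etc/etcd/peer.crt \
#       --key-file=/etc/etcd/peer.key \
#       member add ${NEW_ETCD_HOSTNAME} https://${NEW_ETCD_IP}:2380 \
#       | extract_etcd_vars > ${PREFIX}member-add.env
```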
- Edit the configuration
Edit ${PREFIX}etcd.conf so that its values match the output above. The main fields are:
ETCD_NAME
ETCD_INITIAL_CLUSTER
ETCD_INITIAL_CLUSTER_STATE
ETCD_LISTEN_PEER_URLS
ETCD_LISTEN_CLIENT_URLS
ETCD_INITIAL_ADVERTISE_PEER_URLS
ETCD_ADVERTISE_CLIENT_URLS
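A minimal sketch of applying these values with a small helper. set_conf is a made-up name. The name, initial cluster, cluster state, and advertised peer URL come from the member add output; the listen and client URLs are an assumption that the new member listens on its own IP on the same ports the master uses (2380 for peers, 2379 for clients).

```shell
# Hypothetical helper: set KEY=VALUE in an etcd.conf-style file, replacing an
# existing uncommented KEY= line or appending a new one.
# Values must not contain "|" or "&", which are special in the sed expression.
set_conf() {
    local key="$1" value="$2" file="$3"
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage, with the variables exported earlier on the master:
#   conf="${PREFIX}etcd.conf"
#   set_conf ETCD_NAME "${NEW_ETCD_HOSTNAME}" "$conf"
#   set_conf ETCD_INITIAL_CLUSTER "etcd1.example.com=https://192.168.56.109:2380,master.example.com=https://192.168.56.103:2380" "$conf"
#   set_conf ETCD_INITIAL_CLUSTER_STATE existing "$conf"
#   set_conf ETCD_INITIAL_ADVERTISE_PEER_URLS "https://${NEW_ETCD_IP}:2380" "$conf"
#   set_conf ETCD_LISTEN_PEER_URLS "https://${NEW_ETCD_IP}:2380" "$conf"       # assumed value
#   set_conf ETCD_LISTEN_CLIENT_URLS "https://${NEW_ETCD_IP}:2379" "$conf"     # assumed value
#   set_conf ETCD_ADVERTISE_CLIENT_URLS "https://${NEW_ETCD_IP}:2379" "$conf"  # assumed value
```

Rewriting through a temporary file keeps the helper portable across sed implementations, at the cost of resetting the file's ownership; the chown on the new machine takes care of that later.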
Package the directory and copy it to the new etcd machine:
# tar -czvf /etc/etcd/generated_certs/${CN}.tgz -C ${PREFIX} .
# scp /etc/etcd/generated_certs/${CN}.tgz ${CN}:/tmp/
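A quick listing can confirm that the archive holds everything the new member needs. A minimal sketch; the function name is made up, and the expected file list is simply what the earlier steps placed in ${PREFIX}.

```shell
# Hypothetical helper: fail if the tarball lacks any file the new member needs.
check_etcd_bundle() {
    local tarball="$1" f listing
    listing=$(tar -tzf "$tarball") || return 1
    for f in etcd.conf ca.crt server.crt server.key peer.crt peer.key; do
        # Entries look like "./server.crt" because the archive was built with "-C dir ."
        printf '%s\n' "$listing" | grep -qxF "./$f" || {
            echo "missing from $tarball: $f" >&2
            return 1
        }
    done
}

# Usage (on the master):
#   check_etcd_bundle /etc/etcd/generated_certs/${CN}.tgz && echo "bundle complete"
```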
- On the new etcd machine
Stop the running processes by moving the static pod definitions out of the way:
# mkdir -p /etc/origin/node/pods-stopped
# mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
- Delete the existing data
# rm -Rf /etc/etcd/*
# rm -Rf /var/lib/etcd/*
Then unpack the archive copied from the master and fix the ownership:
# tar xzvf /tmp/etcd1.example.com.tgz -C /etc/etcd/
# chown -R etcd:etcd /etc/etcd/*
# chown -R etcd:etcd /var/lib/etcd/
Check the timestamps on these files.
- Start the new etcd
# cp /etc/origin/node/pods-stopped/etcd.yaml /etc/origin/node/pods/
Watch the logs with master-logs:
/usr/local/bin/master-logs etcd etcd -f
A fresh copy of the data is synced into /var/lib/etcd.
Once the logs show no errors, check the cluster state.
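One way to run this check is the etcdctl v2 cluster-health command, with the same TLS flags as the member list call earlier. A minimal sketch; the names etcdctl_master and check_cluster are made up, and the endpoint is the master address used throughout.

```shell
# Hypothetical wrapper: etcdctl (v2 API) with the TLS flags used in this walkthrough.
etcdctl_master() {
    etcdctl --cert-file=/etc/etcd/peer.crt \
            --key-file=/etc/etcd/peer.key \
            --ca-file=/etc/etcd/ca.crt \
            --peers="https://192.168.56.103:2379" "$@"
}

# Run on the master: list the members, then ask each one for its health.
check_cluster() {
    etcdctl_master member list &&
    etcdctl_master cluster-health
}

# Usage:
#   check_cluster
```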
Follow the same steps to add another server.
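etcd data recovery
If the cluster originally had three etcd servers, first restore the first one from snapshot.db, then register the next one with member add and start it.
There is no need to repeat the certificate configuration and related steps.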
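A minimal sketch of the first restore step, assuming etcdctl with the v3 API is available on that host and that snapshot.db has been copied to /tmp (a hypothetical location). The function name is made up, and the member name and peer URL in the usage lines reuse the master's values from this walkthrough. Restoring as a one-member cluster is what lets the remaining servers join through member add afterwards.

```shell
# Hypothetical helper: restore the first etcd member from a v3 snapshot
# into a fresh data directory, as a one-member cluster.
restore_first_member() {
    local snapshot="$1" name="$2" peer_url="$3" data_dir="${4:-/var/lib/etcd}"
    ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" \
        --name "$name" \
        --initial-cluster "${name}=${peer_url}" \
        --initial-advertise-peer-urls "$peer_url" \
        --data-dir "$data_dir"
}

# Usage (etcd stopped and the old data directory moved away beforehand):
#   restore_first_member /tmp/snapshot.db master.example.com https://192.168.56.103:2380
#   chown -R etcd:etcd /var/lib/etcd/
```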