Recovering a failed etcd node in a K8s cluster
1 Environment
k8s version: v1.20
Failed etcd node: 192.168.0.12
Error details:
4月 24 22:47:13 k8s-node2 etcd[9543]: {"level":"warn","ts":"2023-04-24T22:47:13.571+0800","caller":"etcdserver/server.go:2065","msg":"failed to publish local member to cluster through raft","local-member-id":"b8fffb7f5b2f26e","local-member-attributes":"{Name:etcd-3 ClientURLs:[https://192.168.0.12:2379]}","request-path":"/0/members/b8fffb7f5b2f26e/attributes","publish-timeout":"7s","error":"etcdserver: request timed out"}
2 List the etcd cluster members
/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.0.5:2379,https://192.168.0.11:2379,https://192.168.0.12:2379" member list
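To avoid copying the member ID by hand, it can be extracted from the member list output by member name. The following is a minimal sketch, assuming the default (simple) output format, in which each line is comma-separated and starts with ID, status, name. member_id_by_name is a hypothetical helper introduced here, not part of etcdctl.

```shell
# Print the ID of the member whose name equals $1.
# Reads `etcdctl member list` output (default simple format) from stdin,
# where the fields are: ID, status, name, peer URLs, client URLs, ...
member_id_by_name() {
  awk -F', *' -v name="$1" '$3 == name { print $1 }'
}

# Example (same flags as the command above):
#   /opt/etcd/bin/etcdctl <TLS flags> --endpoints=<endpoints> member list \
#     | member_id_by_name etcd-3
```

If the member never started, its name column is empty and the helper prints nothing; in that case read the ID from the list directly.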
3 Remove the failed member
/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.0.5:2379,https://192.168.0.11:2379,https://192.168.0.12:2379" member remove b8fffb7f5b2f26e
4 Delete the failed node's data
Run this on the failed node (192.168.0.12), with the etcd service stopped:
rm -rf /var/lib/etcd/default.etcd/member/
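Because rm -rf cannot be undone, a more cautious variant is to move the member directory aside and delete it only once the node has rejoined. This is a sketch, not the original author's procedure; set_aside_member_dir is a hypothetical helper and the backup naming is an arbitrary choice.

```shell
# Move <data-dir>/member out of the data directory instead of deleting it.
# Run only on the failed node, after etcd has been stopped there.
# Prints the backup location on success.
set_aside_member_dir() {
  data_dir="$1"
  if [ ! -d "$data_dir/member" ]; then
    echo "no member directory under $data_dir" >&2
    return 1
  fi
  # The backup is placed next to the data directory, not inside it.
  backup="${data_dir}.member.bak.$(date +%Y%m%d%H%M%S)"
  mv "$data_dir/member" "$backup" && echo "$backup"
}

# Example: set_aside_member_dir /var/lib/etcd/default.etcd
```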
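5 Edit the etcd configuration file on the failed node
Change ETCD_INITIAL_CLUSTER_STATE from "new" to "existing", so that the node joins the existing cluster instead of bootstrapping a new one: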
#[Member]
ETCD_NAME="etcd-3"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.0.12:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.0.12:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.0.12:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.0.12:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.0.5:2380,etcd-2=https://192.168.0.11:2380,etcd-3=https://192.168.0.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="existing"
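The one-line change can also be scripted. Below is a small sketch using sed -i.bak, which keeps a backup copy of the original file. The function name is made up for illustration, and the path in the usage comment is an assumption, since the article does not say where the configuration file lives.

```shell
# Set ETCD_INITIAL_CLUSTER_STATE to "existing" in the given etcd config file.
# The original file is preserved as <file>.bak.
set_cluster_state_existing() {
  sed -i.bak \
    's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE="existing"/' \
    "$1"
}

# Example (hypothetical path):
#   set_cluster_state_existing /opt/etcd/cfg/etcd.conf
```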
6 Re-add the node to the cluster
/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.0.5:2379,https://192.168.0.11:2379,https://192.168.0.12:2379" member add etcd-3 --peer-urls=https://192.168.0.12:2380
7 Restart etcd on the failed node
systemctl restart etcd
Check the etcd service status:
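The article does not show the commands used for this check. One possible way, as a sketch: inspect the systemd unit on the failed node, then query every endpoint with the same certificates and endpoints used in the earlier steps. ETCDCTL, ENDPOINTS, etcdctl_tls and check_etcd are names introduced here for illustration.

```shell
ETCDCTL="${ETCDCTL:-/opt/etcd/bin/etcdctl}"
ENDPOINTS="https://192.168.0.5:2379,https://192.168.0.11:2379,https://192.168.0.12:2379"

# Run etcdctl with the TLS flags and endpoints used throughout this article.
etcdctl_tls() {
  "$ETCDCTL" --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem \
    --key=/opt/etcd/ssl/server-key.pem --endpoints="$ENDPOINTS" "$@"
}

# Ask each endpoint for its health, then print a per-endpoint status table.
# The status table is skipped if the health command fails.
check_etcd() {
  etcdctl_tls endpoint health &&
    etcdctl_tls endpoint status --write-out=table
}

# On the failed node:
#   systemctl status etcd
#   journalctl -u etcd -n 50 --no-pager
#   check_etcd
```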
8 Check the k8s cluster health
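The commands for this step are not shown in the article. A typical check with kubectl might look like the sketch below, assuming a working kubeconfig on the machine where it runs. KUBECTL and check_k8s are illustrative names. The verbose readyz output of the API server lists its individual checks, including the one for etcd.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# List node status, then print the API server's readiness checks.
# The readiness query is skipped if listing the nodes fails.
check_k8s() {
  "$KUBECTL" get nodes -o wide &&
    "$KUBECTL" get --raw='/readyz?verbose'
}

# Example: check_k8s
```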