Kafka cluster stopped: abnormal zk metadata
Symptoms:
Error log:
[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at /brokers/ids/3, node already exists and owner '0' does not match current session '146785369381863503' (kafka.zk.KafkaZkClient$CheckedEphemeral)
[2021-04-13 15:39:39,341] ERROR [KafkaServer id=3] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
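To make the first error easier to read, here is a minimal Python sketch that pulls the znode path, the recorded owner and the broker's current session id out of that log line. The regular expression and the function name parse_conflict are illustrative assumptions, not part of Kafka. In ZooKeeper an ephemeralOwner of 0 marks a znode that is not ephemeral, so one plausible reading of owner '0' is that /brokers/ids/3 existed as a persistent node which no session expiry would ever clean up.

```python
import re

# Pattern for the CheckedEphemeral message quoted above (illustrative, not a Kafka API).
EPHEMERAL_CONFLICT = re.compile(
    r"Error while creating ephemeral at (?P<path>[^,\s]+), node already exists "
    r"and owner '(?P<owner>\d+)' does not match current session '(?P<session>\d+)'"
)


def parse_conflict(log_line):
    """Return (path, owner, session) from the log line, or None if it does not match."""
    match = EPHEMERAL_CONFLICT.search(log_line)
    if match is None:
        return None
    return match.group("path"), int(match.group("owner")), int(match.group("session"))


line = (
    "[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at /brokers/ids/3, "
    "node already exists and owner '0' does not match current session "
    "'146785369381863503' (kafka.zk.KafkaZkClient$CheckedEphemeral)"
)

path, owner, session = parse_conflict(line)
# owner == 0 means the existing znode is not tied to any live session.
print(path, owner, hex(session))
```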
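Restarting Kafka again did not help, so the only option left was to delete the broker znodes in zk.
(Summary of the handling: the kafka describe command reported the cluster as healthy, but the Kafka process on every node had in fact already stopped. After logging in to zk, the /kafka/nextsfpaycore/controller znode could not be seen either, so the Kafka broker entries were removed from zk and the brokers were re-registered.)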
[appdeploy@cnsz22VLK10176:/home/appdeploy]$/app/zookeeper-3.4.6/bin/zkCli.sh -server 10.xx.16.70:2181
[zk: 10.xx.16.70:2181(CONNECTED) 0] ls /kafka/nextsfpaycore/controller
[]
[zk: 10.xx.16.70:2181(CONNECTED) 1] get /kafka/nextsfpaycore/controller
{"version":1,"brokerid":1,"timestamp":"1618302213857"}
cZxid = 0x500001648
ctime = Tue Apr 13 16:23:33 CST 2021
mZxid = 0x500001648
mtime = Tue Apr 13 16:23:33 CST 2021
pZxid = 0x500001648
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x1097c84abbb0056
dataLength = 54
numChildren = 0
[zk: 10.xx.16.70:2181(CONNECTED) 2] ls /kafka/nextsfpaycore/brokers/ids
[1, 2, 3, 4, 5]
[zk: 10.xx.16.70:2181(CONNECTED) 3] get /kafka/nextsfpaycore/brokers/ids
[1,2,3,4]
cZxid = 0x3017f09be
ctime = Mon Apr 12 19:37:35 CST 2021
mZxid = 0x3017f09be
mtime = Mon Apr 12 19:37:35 CST 2021
pZxid = 0x50000169e
cversion = 19
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 5
[zk: 10.xxx.16.70:2181(CONNECTED) 4] get /kafka/nextsfpaycore/brokers/ids/1
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT","SASL_PLAINTEXT":"SASL_PLAINTEXT"},"endpoints":["PLAINTEXT://10.xx.16.66:9092","SASL_PLAINTEXT://10.xx.16.66:9093"],"jmx_port":7007,"host":"10.xx.16.66","timestamp":"1618299910700","port":9092,"version":4}
cZxid = 0x500000deb
ctime = Tue Apr 13 15:45:10 CST 2021
mZxid = 0x500000deb
mtime = Tue Apr 13 15:45:10 CST 2021
pZxid = 0x500000deb
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x1097c84abbb0056
dataLength = 267
numChildren = 0
[zk: 10.xx.16.70:2181(CONNECTED) 5] delete /kafka/nextsfpaycore/brokers/ids/2 (remove this Kafka broker's znode, then restart the corresponding broker)
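The stat fields printed by the get commands above can be checked mechanically. The sketch below, with the hypothetical helpers parse_stat and is_ephemeral, reads the "key = value" lines and tells an ephemeral znode (non-zero ephemeralOwner) from a persistent one (0x0). In the output shown, the controller znode and /brokers/ids/1 carry the same ephemeralOwner, 0x1097c84abbb0056, which is consistent with broker 1 being the controller. The parent /brokers/ids is persistent, as expected.

```python
def parse_stat(text):
    """Parse 'key = value' lines from zkCli `get` output into a dict of strings."""
    stat = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(" = ")
        if sep:  # skip lines that are not stat fields
            stat[key.strip()] = value.strip()
    return stat


def is_ephemeral(stat):
    """A znode is ephemeral when ephemeralOwner is a non-zero session id."""
    return int(stat["ephemeralOwner"], 16) != 0


# Stat excerpts copied from the session above.
PARENT_STAT = """
cversion = 19
ephemeralOwner = 0x0
numChildren = 5
"""

BROKER1_STAT = """
cversion = 0
ephemeralOwner = 0x1097c84abbb0056
numChildren = 0
"""

for name, text in (("/brokers/ids", PARENT_STAT), ("/brokers/ids/1", BROKER1_STAT)):
    kind = "ephemeral" if is_ephemeral(parse_stat(text)) else "persistent"
    print(name, kind)
```

A registration under /brokers/ids/<id> that reports ephemeralOwner = 0x0 would be a candidate for the manual delete shown above, since no session expiry will remove it.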
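As above, remove the broker znodes one by one and restart the Kafka service on each node. Finally, check the cluster state with describe; Kafka is back to normal.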
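The one-by-one procedure can be written down as a dry-run plan before touching the cluster. This sketch only builds and prints the commands; it executes nothing. The zkCli.sh path, server address and chroot are taken from the session above. Passing delete as a one-shot argument to zkCli.sh, rather than typing it at the interactive prompt as the original session did, is an assumption, and the restart step is left as a comment because the source does not show the restart command.

```python
ZKCLI = "/app/zookeeper-3.4.6/bin/zkCli.sh"  # path from the session above
ZK_SERVER = "10.xx.16.70:2181"               # masked address, as in the source
CHROOT = "/kafka/nextsfpaycore"              # Kafka chroot in zk


def recovery_plan(broker_ids):
    """Return ordered (broker_id, delete_command) pairs; nothing is executed."""
    plan = []
    for broker_id in broker_ids:
        znode = f"{CHROOT}/brokers/ids/{broker_id}"
        # Assumption: zkCli.sh accepts the command as trailing arguments.
        plan.append((broker_id, [ZKCLI, "-server", ZK_SERVER, "delete", znode]))
    return plan


for broker_id, command in recovery_plan([1, 2, 3, 4, 5]):
    print(" ".join(command))
    print(f"# then restart broker {broker_id} and confirm it re-registers before moving on")
```

Handling one broker at a time keeps the delete and the restart paired, so a broker's fresh ephemeral registration is never removed by a later step.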
[mwopr@CNSZ22PL407 scripts]$ /app/kafka_2.12-2.4.0/bin/kafka-topics.sh --zookeeper 10.xx.16.70:2181/kafka/nextsfpaycore --describe
With that, the fault handling is complete.