Kafka cluster stopped: abnormal ZK metadata

Symptoms:

Error log:

[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at /brokers/ids/3, node already exists and owner '0' does not match current session '146785369381863503' (kafka.zk.KafkaZkClient$CheckedEphemeral)

[2021-04-13 15:39:39,341] ERROR [KafkaServer id=3] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
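When several brokers fail the same way, it can help to pull the znode path, the recorded owner, and the broker's current session out of each error line. The sketch below is only an illustration: the function name `parse_ephemeral_conflict` is made up here, and the pattern assumes the message wording matches the log line quoted above. An owner of 0 is what ZooKeeper reports in `ephemeralOwner` for a znode that is not ephemeral, which would be consistent with the stale entry not going away by itself; that reading is an inference, not something recorded in the original notes.

```shell
# Hypothetical helper: reads server.log lines on stdin and prints
# "<znode path> <owner> <current session>" for each conflict line.
parse_ephemeral_conflict() {
  sed -nE "s|.*Error while creating ephemeral at ([^,]+), node already exists and owner '([0-9]+)' does not match current session '([0-9]+)'.*|\1 \2 \3|p"
}

# Sample input: the error line quoted above.
line="[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at /brokers/ids/3, node already exists and owner '0' does not match current session '146785369381863503' (kafka.zk.KafkaZkClient\$CheckedEphemeral)"

printf '%s\n' "$line" | parse_ephemeral_conflict
# prints: /brokers/ids/3 0 146785369381863503
```

Note that the path in the log is relative to the chroot, so `/brokers/ids/3` corresponds to `/kafka/nextsfpaycore/brokers/ids/3` in the zkCli session.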

Restarting Kafka again did not help, so the only option left was to delete the ZK data nodes.

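(Summary of the handling process: the kafka describe command reported the cluster as healthy, but the broker process on every node had in fact stopped. Logging in to ZK, the /kafka/nextsfpaycore/controller node could not be seen either, so the Kafka broker entries were removed from ZK and the registration was rebuilt.)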

[appdeploy@cnsz22VLK10176:/home/appdeploy]$/app/zookeeper-3.4.6/bin/zkCli.sh -server 10.xx.16.70:2181

[zk: 10.xx.16.70:2181(CONNECTED) 0] ls /kafka/nextsfpaycore/controller

[]

[zk: 10.xx.16.70:2181(CONNECTED) 1] get /kafka/nextsfpaycore/controller

{"version":1,"brokerid":1,"timestamp":"1618302213857"}
cZxid = 0x500001648
ctime = Tue Apr 13 16:23:33 CST 2021
mZxid = 0x500001648
mtime = Tue Apr 13 16:23:33 CST 2021
pZxid = 0x500001648
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x1097c84abbb0056
dataLength = 54
numChildren = 0

[zk: 10.xx.16.70:2181(CONNECTED) 2] ls /kafka/nextsfpaycore/brokers/ids

[1, 2, 3, 4, 5]

[zk: 10.xx.16.70:2181(CONNECTED) 3] get /kafka/nextsfpaycore/brokers/ids

[1,2,3,4]
cZxid = 0x3017f09be
ctime = Mon Apr 12 19:37:35 CST 2021
mZxid = 0x3017f09be
mtime = Mon Apr 12 19:37:35 CST 2021
pZxid = 0x50000169e
cversion = 19
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 5

[zk: 10.xx.16.70:2181(CONNECTED) 4] get /kafka/nextsfpaycore/brokers/ids/1

{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT","SASL_PLAINTEXT":"SASL_PLAINTEXT"},"endpoints":["PLAINTEXT://10.xx.16.66:9092","SASL_PLAINTEXT://10.xx.16.66:9093"],"jmx_port":7007,"host":"10.xx.16.66","timestamp":"1618299910700","port":9092,"version":4}
cZxid = 0x500000deb
ctime = Tue Apr 13 15:45:10 CST 2021
mZxid = 0x500000deb
mtime = Tue Apr 13 15:45:10 CST 2021
pZxid = 0x500000deb
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x1097c84abbb0056
dataLength = 267
numChildren = 0
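zkCli prints `ephemeralOwner` in hexadecimal, while the session in the Kafka error log is decimal, so the two cannot be compared by eye. A minimal sketch for converting between them follows; the function names are invented for this note, and it relies on the shell's `printf` accepting `0x`-prefixed integers (bash does).

```shell
# Hypothetical helpers for comparing zkCli output with the Kafka log.

# ephemeralOwner as printed by zkCli (hex) -> decimal session id
owner_to_dec() {
  printf '%d\n' "$1"
}

# decimal session id from the Kafka log -> hex, zkCli style
session_to_hex() {
  printf '0x%x\n' "$1"
}

# Succeeds only if the znode owner (hex) is the given session (decimal).
same_session() {
  [ "$(owner_to_dec "$1")" = "$2" ]
}

if same_session 0x1097c84abbb0056 146785369381863503; then
  echo "znode is owned by the session from the log"
else
  echo "znode is owned by a different session"
fi
```

With the values shown in this post the two differ, which is expected here: the stat output was taken after broker 1 had already been restarted with a new session.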

[zk: 10.xx.16.70:2181(CONNECTED) 5] delete /kafka/nextsfpaycore/brokers/ids/2  (remove this Kafka node's entry, then restart the corresponding broker)

 

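As above, remove the broker entries one at a time and restart the Kafka service on each node, then check the cluster state with describe. Kafka is back to normal.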
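The per-broker step above can be prepared as a list of commands before touching ZK. The sketch below only prints the `delete` lines for review; it does not connect to ZooKeeper or restart anything. The function name `print_broker_deletes` is hypothetical, and the chroot path and broker ids are the ones seen in the transcript.

```shell
# Hypothetical dry-run helper: prints one zkCli "delete" command per broker id.
# Usage: print_broker_deletes <chroot path> <broker id>...
print_broker_deletes() {
  local chroot_path="$1"
  shift
  local id
  for id in "$@"; do
    printf 'delete %s/brokers/ids/%s\n' "$chroot_path" "$id"
  done
}

print_broker_deletes /kafka/nextsfpaycore 1 2 3 4 5
```

Each printed line can be pasted into the zkCli session one at a time, restarting the matching broker before moving on to the next id. Printing rather than executing keeps the destructive step manual.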

[mwopr@CNSZ22PL407 scripts]$ /app/kafka_2.12-2.4.0/bin/kafka-topics.sh --zookeeper 10.xx.16.70:2181/kafka/nextsfpaycore --describe

With that, the fault is resolved.

posted @ wang_wei123