Kafka cluster stopped: abnormal ZooKeeper metadata

Symptoms:

Error log contents:

[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at /brokers/ids/3, node already exists and owner '0' does not match current session '146785369381863503' (kafka.zk.KafkaZkClient$CheckedEphemeral)

[2021-04-13 15:39:39,341] ERROR [KafkaServer id=3] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
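The first error line carries the three facts that matter for diagnosis: the znode path, the owner recorded on the existing znode, and the broker's current session id. The sketch below pulls them out of a log line. The regular expression and the function name `parse_ephemeral_conflict` are illustrative assumptions modeled only on the message quoted above; they are not part of Kafka, and other Kafka versions may word the message differently.

```python
import re

# Pattern modeled on the single log line quoted above (assumption:
# the message wording is the same in the logs being scanned).
CONFLICT_RE = re.compile(
    r"Error while creating ephemeral at (?P<path>\S+), node already exists "
    r"and owner '(?P<owner>\d+)' does not match current session '(?P<session>\d+)'"
)

def parse_ephemeral_conflict(line):
    """Return (path, owner, session) from a CheckedEphemeral error line, or None."""
    m = CONFLICT_RE.search(line)
    if m is None:
        return None
    return m.group("path"), int(m.group("owner")), int(m.group("session"))

line = ("[2021-04-13 15:39:39,332] ERROR Error while creating ephemeral at "
        "/brokers/ids/3, node already exists and owner '0' does not match "
        "current session '146785369381863503' (kafka.zk.KafkaZkClient$CheckedEphemeral)")
print(parse_ephemeral_conflict(line))
```

In ZooKeeper's stat structure an ephemeralOwner of zero denotes a non-ephemeral znode, so an owner of '0' here hints that the existing /brokers/ids/3 entry was not held by any live session. That reading is an inference from the log line, not something the original write-up confirms.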

Restarting Kafka again did not help, so the only option left was to delete the broker znodes in ZooKeeper.

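(Summary of the fix: the Kafka describe command reported the cluster as healthy, but the Kafka process on every node had in fact stopped. After logging in to ZooKeeper, ls on /kafka/nextsfpaycore/controller also showed nothing. The Kafka broker znodes were therefore removed from ZooKeeper so that the brokers could register again.)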

[appdeploy@cnsz22VLK10176:/home/appdeploy]$/app/zookeeper-3.4.6/bin/zkCli.sh -server 10.xx.16.70:2181

[zk: 10.xx.16.70:2181(CONNECTED) 0] ls /kafka/nextsfpaycore/controller

[]

[zk: 10.xx.16.70:2181(CONNECTED) 1] get /kafka/nextsfpaycore/controller

{"version":1,"brokerid":1,"timestamp":"1618302213857"}

cZxid = 0x500001648

ctime = Tue Apr 13 16:23:33 CST 2021

mZxid = 0x500001648

mtime = Tue Apr 13 16:23:33 CST 2021

pZxid = 0x500001648

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x1097c84abbb0056

dataLength = 54

numChildren = 0

[zk: 10.xx.16.70:2181(CONNECTED) 2] ls /kafka/nextsfpaycore/brokers/ids

[1, 2, 3, 4, 5]

[zk: 10.xx.16.70:2181(CONNECTED) 3] get /kafka/nextsfpaycore/brokers/ids

[1,2,3,4]

cZxid = 0x3017f09be

ctime = Mon Apr 12 19:37:35 CST 2021

mZxid = 0x3017f09be

mtime = Mon Apr 12 19:37:35 CST 2021

pZxid = 0x50000169e

cversion = 19

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 9

numChildren = 5

[zk: 10.xx.16.70:2181(CONNECTED) 4] get /kafka/nextsfpaycore/brokers/ids/1

{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT","SASL_PLAINTEXT":"SASL_PLAINTEXT"},"endpoints":["PLAINTEXT://10.xx.16.66:9092","SASL_PLAINTEXT://10.xx.16.66:9093"],"jmx_port":7007,"host":"10.xx.16.66","timestamp":"1618299910700","port":9092,"version":4}

cZxid = 0x500000deb

ctime = Tue Apr 13 15:45:10 CST 2021

mZxid = 0x500000deb

mtime = Tue Apr 13 15:45:10 CST 2021

pZxid = 0x500000deb

cversion = 0

dataVersion = 1

aclVersion = 0

ephemeralOwner = 0x1097c84abbb0056

dataLength = 267

numChildren = 0
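The stat output above is easier to compare across znodes once it is parsed. The sketch below turns the `key = value` lines printed by zkCli `get` into a dictionary and reads the ephemeralOwner field. The helper names `parse_stat` and `ephemeral_owner` are hypothetical, and the sample is an abridged copy of the output for /kafka/nextsfpaycore/brokers/ids/1.

```python
def parse_stat(text):
    """Parse 'key = value' lines from zkCli `get` output into a dict of strings."""
    stat = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(" = ")
        if sep:  # skip lines without ' = ', such as the znode data itself
            stat[key.strip()] = value.strip()
    return stat

def ephemeral_owner(stat):
    """Session id that owns the znode; 0 means the znode is not ephemeral."""
    return int(stat["ephemeralOwner"], 16)

# Abridged from the transcript above.
sample = """\
cZxid = 0x500000deb
ctime = Tue Apr 13 15:45:10 CST 2021
ephemeralOwner = 0x1097c84abbb0056
dataLength = 267
numChildren = 0
"""
stat = parse_stat(sample)
print(hex(ephemeral_owner(stat)))
```

In the transcripts above, the controller znode and /brokers/ids/1 both report ephemeralOwner = 0x1097c84abbb0056, which is consistent with the controller data naming brokerid 1.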

[zk: 10.xx.16.70:2181(CONNECTED) 5] delete /kafka/nextsfpaycore/brokers/ids/2  (remove this Kafka broker's znode, then restart the corresponding broker)

 

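As above, remove each broker znode in turn and restart the Kafka service on that node. Finally, check the cluster state with describe; Kafka was back to normal.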
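The per-broker step can be written down once and repeated. The sketch below only builds the zkCli `delete` commands for a list of broker ids; it does not connect to ZooKeeper or restart anything. The function name `delete_commands` is hypothetical, and the chroot and ids are taken from the transcript above.

```python
def delete_commands(chroot, broker_ids):
    """Build one zkCli `delete` command per broker registration znode."""
    base = chroot.rstrip("/") + "/brokers/ids"
    return [f"delete {base}/{bid}" for bid in broker_ids]

# Chroot and broker ids as listed by `ls /kafka/nextsfpaycore/brokers/ids`.
for cmd in delete_commands("/kafka/nextsfpaycore", [1, 2, 3, 4, 5]):
    print(cmd)
```

Each printed command would be run in the zkCli session, followed by a restart of the matching broker before moving on to the next id, as the write-up describes.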

[mwopr@CNSZ22PL407 scripts]$ /app/kafka_2.12-2.4.0/bin/kafka-topics.sh --zookeeper 10.xx.16.70:2181/kafka/nextsfpaycore --describe

With that, the incident was resolved.

posted @ 2021-04-13 16:57  wang_wei123