problems_kafka
1 Kafka console consumer startup error
errorlog:
kafka-console-consumer.sh --from-beginning --zookeeper node01:8121,node02:8121,node03:8121 --topic log_monitor
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
[2019-08-04 10:03:45,906] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
action: Changed the ZooKeeper port in the command to 9092, which produced a different error:
[2019-08-04 10:04:17,647] WARN Client session timed out, have not heard from server in 10006ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2019-08-04 10:04:28,461] WARN Client session timed out, have not heard from server in 10004ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2019-08-04 10:04:39,418] WARN Client session timed out, have not heard from server in 10001ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
No brokers found in ZK.
RCA: Inspection showed that the ZooKeeper port in the command was wrong: it was given as 8121 (and later 9092) instead of the ZooKeeper client port.
solution: Change the port to 2181, the default ZooKeeper client port.
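For reference, the corrected command might look like either of the following, assuming ZooKeeper listens on the default port 2181 and the brokers on the default 9092 (the second form uses the new consumer recommended by the deprecation warning):
kafka-console-consumer.sh --from-beginning --zookeeper node01:2181,node02:2181,node03:2181 --topic log_monitor
kafka-console-consumer.sh --from-beginning --bootstrap-server node01:9092,node02:9092,node03:9092 --topic log_monitor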
2 Kafka broker startup error
The error occurred when starting the Kafka server with: nohup bin/kafka-server-start.sh config/server.properties 2>&1 &
errorlog:
[2019-07-28 12:58:44,760] ERROR [ReplicaManager broker=1] Error while making broker the follower for partition Topic: __consumer_offsets; Partition: 41; Leader: None; AssignedReplicas: ; InSyncReplicas: in dir None (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while reading checkpoint file /develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint
Caused by: java.io.IOException: Malformed line in checkpoint file (/develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint): '
at kafka.server.checkpoints.CheckpointFile.malformedLineException$1(CheckpointFile.scala:84)
at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:117)
at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:147)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:142)
at kafka.utils.Pool.getAndMaybePut(Pool.scala:65)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:141)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1227)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1204)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
at kafka.server.ReplicaManager.makeFollowers(ReplicaManager.scala:1204)
at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:1065)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:173)
at kafka.server.KafkaApis.handle(KafkaApis.scala:103)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:748)
[2019-07-28 12:58:44,764] ERROR Error while reading checkpoint file /develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.io.IOException: Malformed line in checkpoint file (/develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint): '
at kafka.server.checkpoints.CheckpointFile.malformedLineException$1(CheckpointFile.scala:84)
at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:117)
at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:147)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:142)
at kafka.utils.Pool.getAndMaybePut(Pool.scala:65)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:141)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1227)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1204)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
at kafka.server.ReplicaManager.makeFollowers(ReplicaManager.scala:1204)
at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:1065)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:173)
at kafka.server.KafkaApis.handle(KafkaApis.scala:103)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:748)
[2019-07-28 12:58:44,765] ERROR [ReplicaManager broker=1] Error while making broker the follower for partition Topic: __consumer_offsets; Partition: 32; Leader: None; AssignedReplicas: ; InSyncReplicas: in dir None (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while reading checkpoint file /develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint
Caused by: java.io.IOException: Malformed line in checkpoint file (/develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint): '
at kafka.server.checkpoints.CheckpointFile.malformedLineException$1(CheckpointFile.scala:84)
at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:117)
at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:147)
at kafka.cluster.Partition$$anonfun$getOrCreateReplica$1.apply(Partition.scala:142)
at kafka.utils.Pool.getAndMaybePut(Pool.scala:65)
at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:141)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1227)
at kafka.server.ReplicaManager$$anonfun$makeFollowers$3.apply(ReplicaManager.scala:1204)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
at kafka.server.ReplicaManager.makeFollowers(ReplicaManager.scala:1204)
at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:1065)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:173)
at kafka.server.KafkaApis.handle(KafkaApis.scala:103)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:748)
[2019-07-28 12:58:44,766] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
[2019-07-28 12:58:44,774] INFO [ReplicaFetcherManager on broker 1] Added fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
[2019-07-28 12:58:44,785] INFO [ReplicaManager broker=1] Partitions are offline due to failure on log directory /develop/kafka_2.11-1.0.0/logs (kafka.server.ReplicaManager)
[2019-07-28 12:58:44,804] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
[2019-07-28 12:58:44,812] INFO [ReplicaManager broker=1] Broker 1 stopped fetcher for partitions because they are in the failed log dir /develop/kafka_2.11-1.0.0/logs (kafka.server.ReplicaManager)
[2019-07-28 12:58:44,820] INFO Stopping serving logs in dir /develop/kafka_2.11-1.0.0/logs (kafka.log.LogManager)
[2019-07-28 12:58:44,824] FATAL Shutdown broker because all log dirs in /develop/kafka_2.11-1.0.0/logs have failed (kafka.log.LogManager)
RCA: The Kafka process had previously been force-killed with kill -9, so some shutdown steps never ran and files such as replication-offset-checkpoint were left in a malformed state.
solution: Back up the file /develop/kafka_2.11-1.0.0/logs/replication-offset-checkpoint, delete it, and restart Kafka; the broker recreates it on startup.
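A minimal sketch of the fix (the paths follow the error log above; adjust them to your installation):
cd /develop/kafka_2.11-1.0.0
cp logs/replication-offset-checkpoint logs/replication-offset-checkpoint.bak   # keep a backup
rm logs/replication-offset-checkpoint                                          # the broker rebuilds this file on startup
nohup bin/kafka-server-start.sh config/server.properties 2>&1 &
To avoid this in the future, stop the broker gracefully with bin/kafka-server-stop.sh instead of kill -9.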
3 Errors when producing and consuming test data on the Kafka cluster
# Start Kafka on every node of the cluster
cd /usr/develop/kafka_2.11-0.10.0.0
nohup bin/kafka-server-start.sh config/server.properties 2>&1 &
# Create a Kafka topic named test
bin/kafka-topics.sh --create --zookeeper cdh01:2181,cdh02:2181,cdh03:2181,cdh04:2181,cdh05:2181,cdh06:2181 --replication-factor 2 --partitions 3 --topic test
# List all topics
bin/kafka-topics.sh --list --zookeeper cdh01:2181,cdh02:2181,cdh03:2181,cdh04:2181,cdh05:2181,cdh06:2181
# Produce test data
bin/kafka-console-producer.sh --broker-list cdh01:9092,cdh02:9092,cdh03:9092,cdh04:9092,cdh05:9092,cdh06:9092 --topic test
# Consume test data
bin/kafka-console-consumer.sh --from-beginning --topic test --zookeeper cdh01:2181,cdh02:2181,cdh03:2181,cdh04:2181,cdh05:2181,cdh06:2181
Error reported on the producer side:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) expired due to timeout while requesting metadata from brokers for test-0
Error reported on the consumer side:
Failed to send producer request with correlation id 8 to broker 2 with data for partitions [test,0]
java.nio.channels.ClosedChannelException
RCA: A network problem; it turned out SELinux had not actually been disabled.
When I originally configured SELinux to be disabled, I wrote SELINUX=disable instead of the correct SELINUX=disabled.
solution: Fix the configuration so that SELinux is really disabled (see the sketch below).
note: Network problems come in many forms: the firewall may not be turned off, SELinux may still be enabled, or the hosts may simply be unreachable.
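A minimal sketch of the fix, assuming a CentOS/RHEL-style host where SELinux is configured in /etc/selinux/config:
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # fix the typo (disable -> disabled)
setenforce 0                                                   # switch to permissive immediately, no reboot needed
getenforce                                                     # should print Permissive (or Disabled after a reboot)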
4 Error when creating a Kafka topic
The topic was created with the following command:
[root@node01 kafka_2.11-1.0.0]# bin/kafka-topics.sh --create --topic streaming-test --replication-factor 1 --partitions 3 --zookeeper node01:2181,node02:2181,node03:2181
errorlog:
Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
[2019-10-15 20:23:25,461] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.
RCA: The zookeeper.connect setting in the broker configuration file server.properties specifies a ZooKeeper chroot for Kafka (zookeeper.connect=node01:2181,node02:2181,node03:2181/kafka), so the brokers register their metadata under /kafka rather than at the ZooKeeper root, and the --zookeeper argument above does not point there.
solution: The value of the --zookeeper command-line argument must include the same chroot path, as follows:
bin/kafka-topics.sh --create --topic streaming-test --replication-factor 1 --partitions 3 --zookeeper node01:2181,node02:2181,node03:2181/kafka
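A quick way to verify, assuming the same /kafka chroot: the broker ids and the new topic should be visible under it.
bin/zookeeper-shell.sh node01:2181 ls /kafka/brokers/ids
bin/kafka-topics.sh --list --zookeeper node01:2181,node02:2181,node03:2181/kafka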