Kafka Cluster Installation and Usage
Kafka
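(1) Kafka is a distributed message buffering system.
(2) Every server in a Kafka cluster is called a broker.
(3) Kafka has two kinds of clients: producers (which publish messages) and consumers (which read messages). Clients connect to the brokers over TCP.
(4) Messages are organized by topic, and each topic is split into partitions so that the load is spread across the brokers.
(5) Each partition can have multiple replicas to protect against data loss.
(6) Every write to a partition must go through the leader among that partition's replicas; the other replicas copy the data from the leader.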
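(7) Consumers can be organized into groups. Within one group, each partition is assigned to exactly one consumer, so the members of a group never consume the same message twice. For example, suppose the topic order_info has two partitions holding 100 messages with ids 0-99, and group A has two consumers: one might consume ids 0-49 and the other ids 50-99 (the ids in a partition are not necessarily contiguous). A second group B subscribed to the same topic receives all 100 messages independently of A.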
(8) When consuming messages from a topic, a consumer can specify the offset to start from.
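To make the link between partitions and the consumers of one group concrete, here is a minimal sketch of a range-style split: the partitions of one topic are divided as evenly as possible, and the first consumers take any remainder. The function name assign_partitions is made up for illustration, and this is a simplification of what Kafka's assignors do, not the actual implementation.

```shell
# assign_partitions <partition_count> <consumer_count>
# Prints one line per consumer: "consumer-<i>: <partition ids>".
# Hypothetical helper; a simplified range-style split for a single topic.
assign_partitions() (
  partitions=$1
  consumers=$2
  base=$(( partitions / consumers ))    # partitions every consumer gets
  extra=$(( partitions % consumers ))   # the first $extra consumers get one more
  next=0
  c=0
  while [ "$c" -lt "$consumers" ]; do
    count=$base
    if [ "$c" -lt "$extra" ]; then count=$(( base + 1 )); fi
    line="consumer-$c:"
    p=0
    while [ "$p" -lt "$count" ]; do
      line="$line $next"
      next=$(( next + 1 ))
      p=$(( p + 1 ))
    done
    echo "$line"
    c=$(( c + 1 ))
  done
)

assign_partitions 5 2   # five partitions shared by two consumers
assign_partitions 1 2   # one partition: the second consumer stays idle
```

The second call shows why a topic with a single partition cannot be consumed in parallel by one group: a partition is never shared between two members of the same group.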
Cluster installation
Official documentation
http://kafka.apache.org/22/documentation.html#introduction
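1. Extract the archive
2. Edit server.properties
broker.id=1
zookeeper.connect=hadoop01:2182,hadoop02:2182,hadoop03:2182
3. Start the ZooKeeper cluster
4. Start the broker on every node
bin/kafka-server-start.sh config/server.properties
#5. Kafka also ships with its own ZooKeeper. It is meant for single-node setups and is normally not used in a cluster
#bin/zookeeper-server-start.sh config/zookeeper.properties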
Installation walkthrough
1. Download and extract
[linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/kafka/2.2.0/kafka_2.11-2.2.0.tgz
[linyouyi@hadoop01 software]$ tar -zxvf kafka_2.11-2.2.0.tgz -C /hadoop/module/
2. Edit the configuration file
[linyouyi@hadoop01 software]$ cd /hadoop/module/
[linyouyi@hadoop01 module]$ ll
total 24
drwxrwxr-x 18 linyouyi linyouyi 4096 Aug 12 21:24 apache-storm-2.0.0
drwxr-xr-x 12 linyouyi linyouyi 4096 Aug 9 22:51 hadoop-2.7.7
drwxrwxr-x 7 linyouyi linyouyi 4096 Aug 11 12:10 hbase-2.0.5
drwxr-xr-x 7 linyouyi linyouyi 4096 Jul 22 2017 jdk1.8.0_144
drwxr-xr-x 6 linyouyi linyouyi 4096 Mar 10 03:47 kafka_2.11-2.2.0
drwxr-xr-x 15 linyouyi linyouyi 4096 Aug 8 11:03 zookeeper-3.4.14
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ vim config/server.properties
broker.id=1
log.dirs=/hadoop/kafka_2.11-2.2.0/log/kafka-logs
zookeeper.connect=hadoop01:2182,hadoop02:2182,hadoop03:2182
3. Copy the installation to the other nodes
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ cd ../
[linyouyi@hadoop01 module]$ scp -r kafka_2.11-2.2.0/ linyouyi@hadoop02:/hadoop/module/
[linyouyi@hadoop01 module]$ scp -r kafka_2.11-2.2.0/ linyouyi@hadoop03:/hadoop/module/
4. Change broker.id on the other two nodes (every broker needs a unique id)
[linyouyi@hadoop02 kafka_2.11-2.2.0]$ vim config/server.properties
broker.id=2
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ vim config/server.properties
broker.id=3
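Editing broker.id by hand on every node is easy to get wrong once the cluster grows. As a rough sketch, the change can be scripted with sed; the helper name set_broker_id is an assumption for illustration, and the demo below works on a scratch file instead of a real config/server.properties.

```shell
# set_broker_id <properties_file> <id>
# Rewrites the broker.id line of the given file and leaves other lines untouched.
# Hypothetical helper; writes through a temp file to avoid non-portable `sed -i`.
set_broker_id() (
  file=$1
  id=$2
  tmp="$file.tmp"
  sed "s/^broker\.id=.*/broker.id=$id/" "$file" > "$tmp" && mv "$tmp" "$file"
)

# Demo on a scratch copy with two of the settings used above.
demo=$(mktemp)
printf 'broker.id=1\nzookeeper.connect=hadoop01:2182,hadoop02:2182,hadoop03:2182\n' > "$demo"
set_broker_id "$demo" 2
cat "$demo"   # broker.id is now 2, zookeeper.connect is unchanged
rm -f "$demo"
```

On hadoop02 the call would be set_broker_id config/server.properties 2, run from the Kafka directory.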
5. Start Kafka (the option is spelled -daemon) and check with jps that a Kafka process is running on each node
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ bin/kafka-server-start.sh -daemon config/server.properties
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ jps
109904 Jps
44057 QuorumPeerMain
109503 Kafka
[linyouyi@hadoop02 kafka_2.11-2.2.0]$ bin/kafka-server-start.sh -daemon config/server.properties
[linyouyi@hadoop02 kafka_2.11-2.2.0]$ jps
21308 Jps
20925 Kafka
34879 QuorumPeerMain
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-server-start.sh -daemon config/server.properties
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ jps
119587 QuorumPeerMain
37651 Jps
37269 Kafka
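Reading the jps output by eye works for three nodes. A small check like the following sketch can do the same in a script; kafka_running is a hypothetical helper name, and it only looks for a process called Kafka in text shaped like the jps listings above.

```shell
# kafka_running: reads `jps` output on stdin and succeeds (exit status 0)
# only if one of the listed processes is named "Kafka".
kafka_running() {
  awk '$2 == "Kafka" { found = 1 } END { exit !found }'
}

# Demo with the listing captured on hadoop01 above.
printf '109904 Jps\n44057 QuorumPeerMain\n109503 Kafka\n' | kafka_running \
  && echo "broker is up"
```

On a live node this would be used as jps | kafka_running. Running it for a remote node (for example through ssh) assumes jps is on the PATH of the remote shell.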
Usage
1. Create a topic
//Any of hadoop01:9092, hadoop02:9092, hadoop03:9092 works as the bootstrap server
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --create --bootstrap-server hadoop01:9092 --replication-factor 3 --partitions 1 --topic linyouyi
[2019-08-17 15:30:34,616] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions Set(linyouyi-0) (kafka.server.ReplicaFetcherManager)
[2019-08-17 15:30:34,649] INFO [Log partition=linyouyi-0, dir=/tmp/kafka-logs] Loading producer state till offset 0 with message format version 2 (kafka.log.Log)
[2019-08-17 15:30:34,653] INFO [Log partition=linyouyi-0, dir=/tmp/kafka-logs] Completed load of log with 1 segments, log start offset 0 and log end offset 0 in 25 ms (kafka.log.Log)
[2019-08-17 15:30:34,655] INFO Created log for partition linyouyi-0 in /tmp/kafka-logs with properties {compression.type -> producer, message.format.version -> 2.2-IV1, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, message.downconversion.enable -> true, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> false, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2019-08-17 15:30:34,656] INFO [Partition linyouyi-0 broker=3] No checkpointed highwatermark is found for partition linyouyi-0 (kafka.cluster.Partition)
[2019-08-17 15:30:34,658] INFO Replica loaded for partition linyouyi-0 with initial high watermark 0 (kafka.cluster.Replica)
[2019-08-17 15:30:34,658] INFO Replica loaded for partition linyouyi-0 with initial high watermark 0 (kafka.cluster.Replica)
[2019-08-17 15:30:34,658] INFO Replica loaded for partition linyouyi-0 with initial high watermark 0 (kafka.cluster.Replica)
[2019-08-17 15:30:34,660] INFO [Partition linyouyi-0 broker=3] linyouyi-0 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
linyouyi
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --list --bootstrap-server hadoop01:9092
linyouyi
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --list --zookeeper hadoop01:2181
linyouyi
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --create --bootstrap-server hadoop01:9092 --replication-factor 3 --partitions 1 --topic youyi
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --list --zookeeper hadoop01:2181
linyouyi
youyi
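One constraint worth keeping in mind when creating a topic: the replication factor cannot be larger than the number of brokers, so 3 is the maximum for this three-node cluster. The sketch below builds the create command shown above and refuses obviously invalid input. It only prints the command (a dry run); create_topic_cmd and its argument order are assumptions made for illustration.

```shell
# create_topic_cmd <topic> <partitions> <replication_factor> <broker_count> [bootstrap]
# Prints the kafka-topics.sh command line instead of running it.
# Hypothetical helper; fails when the replication factor exceeds the broker count.
create_topic_cmd() (
  topic=$1
  partitions=$2
  rf=$3
  brokers=$4
  bootstrap=${5:-hadoop01:9092}
  if [ "$rf" -gt "$brokers" ]; then
    echo "replication factor $rf exceeds broker count $brokers" >&2
    exit 1   # leaves only the function's subshell, not the calling shell
  fi
  echo "bin/kafka-topics.sh --create --bootstrap-server $bootstrap --replication-factor $rf --partitions $partitions --topic $topic"
)

create_topic_cmd linyouyi 1 3 3
```

Printing the command first makes it easy to review before running it from the Kafka directory.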
2. A producer writes messages to the topic
[linyouyi@hadoop03 kafka_2.11-2.2.0]$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic linyouyi
>This is a message
>This is another message
3. A consumer reads the messages right away
[linyouyi@hadoop02 kafka_2.11-2.2.0]$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic linyouyi --from-beginning
This is a message
This is another message
4. Keep typing messages in the producer; the consumer prints each one as soon as it arrives
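The console tools can also be driven without typing, which is handy for a quick smoke test. The console producer reads messages from standard input, one per line, and the console consumer can stop by itself with --max-messages and --timeout-ms. The wrappers below are a sketch: the names send_lines and read_messages and the KAFKA_HOME and BROKER variables are assumptions, and the 10000 ms timeout is an arbitrary choice.

```shell
# Install location and broker address; override them in the environment if needed.
KAFKA_HOME=${KAFKA_HOME:-/hadoop/module/kafka_2.11-2.2.0}
BROKER=${BROKER:-localhost:9092}

# send_lines <topic>: publish every line read from stdin as one message.
send_lines() {
  "$KAFKA_HOME/bin/kafka-console-producer.sh" --broker-list "$BROKER" --topic "$1"
}

# read_messages <topic> <count>: print messages from the start of the topic and
# exit after <count> messages, or after 10 s without any new message.
read_messages() {
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" --bootstrap-server "$BROKER" \
    --topic "$1" --from-beginning --max-messages "$2" --timeout-ms 10000
}

# Usage (needs a running cluster):
#   printf 'This is a message\nThis is another message\n' | send_lines linyouyi
#   read_messages linyouyi 2
```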
5. Inspect the overall state of the topics linyouyi and youyi
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic linyouyi
Topic:linyouyi PartitionCount:1 ReplicationFactor:3 Configs:segment.bytes=1073741824
    Topic: linyouyi Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
[linyouyi@hadoop01 kafka_2.11-2.2.0]$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic youyi
Topic:youyi PartitionCount:1 ReplicationFactor:3 Configs:segment.bytes=1073741824
    Topic: youyi Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
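In the describe output, Replicas lists the brokers that hold a copy of the partition and Isr lists the copies that are currently in sync with the leader. When Isr is shorter than Replicas, the partition is under-replicated. The following sketch flags such partitions from the text output; under_replicated is a hypothetical helper, and the sample line with a missing broker is made up for the demo. kafka-topics.sh also offers --under-replicated-partitions together with --describe, which is likely the better choice on a live cluster.

```shell
# under_replicated: reads `kafka-topics.sh --describe` output on stdin, prints every
# partition whose ISR is smaller than its replica list, and exits 1 if any was found.
under_replicated() {
  awk '
    /Replicas:/ && /Isr:/ {
      r = 0; s = 0
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  r = split($(i + 1), a, ",")
        if ($i == "Isr:")       s = split($(i + 1), b, ",")
      }
      if (s < r) { print topic "-" part ": " s "/" r " replicas in sync"; bad = 1 }
    }
    END { exit bad }
  '
}

# Demo with made-up input in which broker 2 has dropped out of the ISR.
printf '    Topic: linyouyi Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1\n' \
  | under_replicated || echo "some partitions are under-replicated"
```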
6. Delete a topic
bin/kafka-topics.sh --delete --topic topic_name --bootstrap-server localhost:9092
7. Inspect the consumption status of a consumer group in Kafka:
bash-5.1# ./bin/kafka-consumer-groups.sh --bootstrap-server :9092 --group pecsGroup --describe
GROUP      TOPIC                       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID                                                HOST             CLIENT-ID
pecsGroup  BDSS_BDGATEWAYG_TOPIC-PROD  0          4               4               0    consumer-pecsGroup-1-f079e738-9b1c-4f53-83b3-7096e8b84b2f  /192.168.85.122  consumer-pecsGroup-1
pecsGroup  PECS_SERVER_TOPIC-PROD      0          6               6               0    consumer-pecsGroup-1-f079e738-9b1c-4f53-83b3-7096e8b84b2f  /192.168.85.122  consumer-pecsGroup-1
pecsGroup  PECS_GATEWAY_TOPIC-PROD     0          4               4               0    consumer-pecsGroup-1-88c565b8-fbf7-4401-8ecf-7cdeb7d6e336  /192.168.85.122  consumer-pecsGroup-1
pecsGroup  BDSS_SEND_TOPIC-PROD        0          6               6               0    consumer-pecsGroup-1-d84b1bc6-3a7b-4b2c-88d3-f92e6b41eee4  /192.168.85.122  consumer-pecsGroup-1
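The output shows, for each topic and partition the group consumes, the offset the group has committed (CURRENT-OFFSET), the latest offset in the partition (LOG-END-OFFSET), and the LAG, which is the number of messages not yet consumed (LOG-END-OFFSET minus CURRENT-OFFSET). If the group's offset has not been committed for a long time and the LAG keeps growing, the service consuming from Kafka is probably stalled or failing.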
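For monitoring, the LAG column can be summed per topic with a few lines of awk. This is a sketch under some assumptions: lag_by_topic is a hypothetical helper, it locates the TOPIC and LAG columns from the header row because the column layout differs between Kafka versions, and it skips rows whose LAG is "-" (no committed offset). The numbers in the demo are made up.

```shell
# lag_by_topic: reads `kafka-consumer-groups.sh --describe` output on stdin and
# prints "<topic> <total lag>" for every topic, sorted by topic name.
lag_by_topic() {
  awk '
    !lag_col {                       # still looking for the header row
      for (i = 1; i <= NF; i++) {
        if ($i == "TOPIC") topic_col = i
        if ($i == "LAG")   lag_col = i
      }
      next
    }
    $lag_col ~ /^[0-9]+$/ { lag[$topic_col] += $lag_col }
    END { for (t in lag) print t, lag[t] }
  ' | sort
}

# Demo with made-up rows: two partitions of one topic that are 60 and 15 messages behind.
printf '%s\n' \
  'GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID' \
  'demoGroup demo-topic 0 40 100 60 consumer-1 /10.0.0.1 client-1' \
  'demoGroup demo-topic 1 85 100 15 consumer-2 /10.0.0.2 client-2' \
  | lag_by_topic
```

A real check would pipe the output of the describe command into lag_by_topic at regular intervals and alert when a total keeps rising between runs.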