Using the Kafka Java Client and Integrating Kafka with Spring Boot

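1. Basic producer implementation

1.1 Add the dependency

  <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <!-- Choose the version to match your Kafka installation package, e.g. kafka_2.12-2.0.0 -->
      <version>2.0.0</version>
  </dependency>

1.2 Implementation

Sending messages synchronously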

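import java.util.Date;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

import com.alibaba.fastjson.JSON; // assumed: JSON.toJSONString matches fastjson's API

// The class name is illustrative; the original listing omitted the class declaration.
// User is the author's own POJO and is not shown in this article.
public class MySimpleProducer {

    private final static String TOPIC_NAME = "my-replicated-topic";

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        // 1. Configure the client
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.21.107:9092,192.168.21.108:9092,192.168.21.109:9092");
        // Serialize the record key from a String to a byte array
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Serialize the record value from a String to a byte array
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // The client that sends the messages
        Producer<String, String> producer = new KafkaProducer<>(props);

        User user = new User();
        user.setErpId(500000L);
        user.setErpName("张三三");
        user.setRealName("张三三");
        user.setCreateTime(new Date());
        user.setUpdateTime(new Date());

        // No partition is specified here: the key determines which partition the record
        // is sent to, and the value is the message payload.
        ProducerRecord<String, String> producerRecord = new ProducerRecord<>(TOPIC_NAME,
                user.getErpId().toString(), JSON.toJSONString(user));

        // Send the record, block until the broker responds, and print the returned metadata
        RecordMetadata metadata = producer.send(producerRecord).get();
        System.out.println("Synchronous send result: topic-" +
                metadata.topic() + "|partition-" + metadata.partition() + "|offset-" + metadata.offset());

        // Flush pending records and release the client's resources
        producer.close();
    }
}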
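
The comment in the listing says the key decides which partition a record lands on. As a rough illustration, the sketch below maps a key to a partition by hashing its bytes and taking the result modulo the partition count. The Kafka client's default partitioner uses a murmur2 hash for keyed records; `Arrays.hashCode` is only a dependency-free stand-in here, so the partition numbers it prints will not match a real cluster. The names `KeyPartitionSketch` and `partitionFor` and the partition count of 5 are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: the real client hashes the serialized key with murmur2;
// Arrays.hashCode is a stand-in so the sketch needs no Kafka dependency.
final class KeyPartitionSketch {

    // Map a record key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);        // stand-in for murmur2
        return (hash & 0x7fffffff) % numPartitions;  // clear the sign bit, then take the modulus
    }

    public static void main(String[] args) {
        int partitions = 5; // assumed partition count for the demo
        for (String key : new String[] {"500000", "500001", "500000"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

The point to take away is that equal keys always map to the same partition, which is what preserves per-key ordering.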

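In synchronous mode, send(...).get() blocks the calling thread until the broker acknowledges the record. If no ack arrives within the request timeout (request.timeout.ms, 30 s by default), the producer retries the send, up to the number of times set by retries (3 in the configuration shown in section 1.3). Once the retries are exhausted, get() throws an exception.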
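
Because `producer.send(...)` returns a `java.util.concurrent.Future`, the blocking wait can also be bounded on the caller's side with `get(timeout, unit)`. The sketch below shows that pattern with a plain `CompletableFuture` standing in for the future returned by the producer, so it runs without a broker. The class `BoundedWaitSketch`, the method `await`, and the status strings are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWaitSketch {

    // Wait for a pending send for at most timeoutMs and describe the outcome.
    static String await(Future<String> pending, long timeoutMs) {
        try {
            return "acked: " + pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out";
        } catch (ExecutionException e) {
            // The send itself failed; the cause carries the underlying error
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // CompletableFuture stands in for the Future returned by producer.send(...)
        System.out.println(await(CompletableFuture.completedFuture("offset-42"), 100));

        CompletableFuture<String> neverAcked = new CompletableFuture<>();
        System.out.println(await(neverAcked, 100));

        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("broker unavailable"));
        System.out.println(await(failed, 100));
    }
}
```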

Sending messages asynchronously

// Send the message asynchronously
producer.send(producerRecord, new Callback() {
  @Override
  public void onCompletion(RecordMetadata metadata, Exception e) {
    if (e != null) {
      System.out.println("Failed to send message: " + e.getMessage());
    } else if (metadata != null) {
      System.out.println("Asynchronous send result: topic-" +
                         metadata.topic() + "|partition-"
                         + metadata.partition() + "|offset-" + metadata.offset());
    }
  }
});


// The main thread may exit before onCompletion is invoked, so block it for a while to see the result
Thread.sleep(100000L);

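1.3 Producer acks configuration

With synchronous sending, the producer blocks until it receives the ack from the cluster. The acks setting has three possible values: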

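  • acks = 0: the producer does not wait for any broker to confirm receipt and treats the send as complete immediately. This is the fastest setting and also the one most likely to lose messages.
  • acks = 1 (the default in the 2.0.0 client used here; newer clients default to all): the ack is returned once the partition leader has received the message and written it to its local log. This setting offers the best balance of performance and safety.
  • acks = -1 / all: the ack is returned only after the in-sync replicas have the message. The broker-side setting min.insync.replicas (default 1; 2 or more is recommended) sets the minimum number of replicas that must have it. With min.insync.replicas=2, the leader and one follower must both have the message before the ack is returned, so two brokers hold the data. This is the safest setting and the slowest.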
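
To make the interaction between acks and min.insync.replicas concrete, here is a deliberately simplified model of when a write counts as acknowledged. It ignores timing, leader election, and error codes; the class `AcksSketch` and the method `writeAccepted` are hypothetical and are not part of the Kafka API.

```java
final class AcksSketch {

    // Would a write be acknowledged as successful, given the acks setting, the number
    // of in-sync replicas (leader included) and the min.insync.replicas setting?
    static boolean writeAccepted(String acks, int inSyncReplicas, int minInsyncReplicas) {
        switch (acks) {
            case "0":   // the producer does not wait for any response
                return true;
            case "1":   // only the leader has to persist the record
                return inSyncReplicas >= 1;
            case "all":
            case "-1":  // min.insync.replicas is enforced only in this mode
                return inSyncReplicas >= minInsyncReplicas;
            default:
                throw new IllegalArgumentException("unknown acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        // One follower has fallen out of sync, leaving only the leader in the ISR
        System.out.println(writeAccepted("1", 1, 2));   // true
        System.out.println(writeAccepted("all", 1, 2)); // false
        System.out.println(writeAccepted("all", 2, 2)); // true
    }
}
```

The middle case is the trade-off in the third bullet: with acks=all and min.insync.replicas=2, the producer would rather fail a write than store it on a single broker.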

The acks and retry settings (a send is retried when no ack is received) are configured as follows:

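// acks and retries
props.put(ProducerConfig.ACKS_CONFIG,"1");

/**
 * A failed send is retried; the default retry interval is 100 ms. Retries make delivery
 * more reliable but can also produce duplicate messages (for example after a network
 * glitch), so the receiving side must handle messages idempotently.
 */
props.put(ProducerConfig.RETRIES_CONFIG,3);
// Interval between retries
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG,300);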
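
Since retries can deliver the same message twice, the receiving side needs some form of de-duplication. One common approach, sketched below under the assumption that every message carries a unique business id, is to remember the ids already handled and skip repeats. The class `IdempotentHandlerSketch` is hypothetical, and the in-memory set is only for illustration; a real service would keep the ids in durable storage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IdempotentHandlerSketch {
    // In-memory only; a real service would persist the handled ids
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    // Apply the message once; return false when the id was already handled.
    boolean handle(String messageId, String payload) {
        if (!seenIds.add(messageId)) {
            return false; // duplicate delivery, for example caused by a producer retry
        }
        applied.add(payload);
        return true;
    }

    List<String> applied() {
        return applied;
    }

    public static void main(String[] args) {
        IdempotentHandlerSketch handler = new IdempotentHandlerSketch();
        handler.handle("500000", "create user");
        handler.handle("500000", "create user"); // redelivered copy is ignored
        System.out.println(handler.applied().size()); // 1
    }
}
```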

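1.4 Producer buffer configuration

  • The Kafka producer creates a buffer that holds the messages waiting to be sent; its default size is 32 MB
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
  • A local sender thread takes up to 16 KB of data from the buffer per batch and sends it to the broker
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
  • If a batch has not reached 16 KB, the thread sends whatever it has collected once 10 ms have passed
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);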
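
The three settings above combine into a simple rule: a batch goes out when it is full, or when it is non-empty and has waited for linger.ms. The sketch below models just that rule with the values from this section, plus the arithmetic for how many full batches fit in the buffer. `BatchSendSketch` and its methods are hypothetical and greatly simplify what the producer's sender thread does.

```java
final class BatchSendSketch {
    static final long BUFFER_MEMORY_BYTES = 33554432L; // buffer.memory from the text (32 MB)
    static final int BATCH_SIZE_BYTES = 16384;         // batch.size from the text (16 KB)
    static final long LINGER_MS = 10;                  // linger.ms from the text

    // A batch is sent when it is full, or when it is non-empty and has waited linger.ms.
    static boolean readyToSend(int batchBytes, long waitedMs) {
        boolean full = batchBytes >= BATCH_SIZE_BYTES;
        boolean lingerExpired = batchBytes > 0 && waitedMs >= LINGER_MS;
        return full || lingerExpired;
    }

    // How many full batches fit in the buffer at once.
    static long maxFullBatches() {
        return BUFFER_MEMORY_BYTES / BATCH_SIZE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(readyToSend(16384, 0)); // true: the batch is full
        System.out.println(readyToSend(4096, 3));  // false: still lingering
        System.out.println(readyToSend(4096, 10)); // true: linger.ms has elapsed
        System.out.println(maxFullBatches());      // 2048
    }
}
```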

2. Consumer implementation

2.1 Basic consumer implementation

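import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// The class name is illustrative; the original listing omitted the class declaration.
public class MySimpleConsumer {

    private static final String TOPIC_NAME = "my-replicated-topic";
    private static final String CONSUMER_GROUP_NAME = "testGroup";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.21.107:9092,192.168.21.108:9092,192.168.21.109:9092");
        // Consumer group name
        props.put(ConsumerConfig.GROUP_ID_CONFIG, CONSUMER_GROUP_NAME);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // 1. Create the consumer client
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // 2. Subscribe to the list of topics
        consumer.subscribe(Arrays.asList(TOPIC_NAME));

        while (true) {
            // 3. poll() is a long poll that fetches messages
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            for (ConsumerRecord<String, String> record : records) {
                // 4. Print the message
                System.out.printf("Received message: partition = %d,offset = %d, key = %s, value = %s%n", record.partition(),
                        record.offset(), record.key(), record.value());
            }
        }
    }
}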

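2.2 Automatic and manual offset commits

Whether offsets are committed automatically or manually, the consumer submits a record made up of its consumer group, the topic, the partition, and the consumed offset to the cluster's internal __consumer_offsets topic.

Automatic commit

With auto-commit enabled, the consumer commits the offsets of the records it has polled on its own, at the configured interval, without waiting for the application to finish processing them.

// Whether to commit offsets automatically; the default is true
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,"true");
// Interval between automatic offset commits
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,"1000");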

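Automatic commit can lose messages: the offset may be committed before the application has finished processing the polled records, and if the consumer crashes at that point, those records are never delivered again.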
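
The failure mode is easier to see in a small simulation. The sketch below models one consumer that crashes while handling a record and then restarts from the committed offset, once with the offset committed before processing and once with it committed afterwards. It is a toy model with no Kafka involved; `CommitOrderSketch` and `run` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitOrderSketch {

    // Simulate a consumer that crashes just before processing the record at crashAt,
    // then a restart that resumes from the committed offset.
    // Returns the offsets that were fully processed across both runs.
    static List<Integer> run(int recordCount, int crashAt, boolean commitBeforeProcessing) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0; // next offset to read after a restart

        // First run: stops at the crash
        for (int offset = 0; offset < recordCount; offset++) {
            if (commitBeforeProcessing) {
                committed = offset + 1;
            }
            if (offset == crashAt) {
                break; // crash before this record is processed
            }
            processed.add(offset);
            if (!commitBeforeProcessing) {
                committed = offset + 1;
            }
        }

        // Second run: resume from the committed offset, no crash this time
        for (int offset = committed; offset < recordCount; offset++) {
            processed.add(offset);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(5, 2, true));  // [0, 1, 3, 4]: offset 2 is lost
        System.out.println(run(5, 2, false)); // [0, 1, 2, 3, 4]: nothing is lost
    }
}
```

Committing only after processing avoids the loss; the price is that a crash between processing and committing leads to redelivery, which is why idempotent handling matters on the consumer side.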

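Manual commit

Auto-commit has to be switched off first:

// Disable automatic offset commits (the default is true)
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,"false");

Manual synchronous commit: call the synchronous commit method after the messages have been processed. The call blocks until the cluster returns an ack; once the ack arrives the commit is complete and the code that follows runs.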

 while (true){
   // 3. poll() is a long poll that fetches messages
   ConsumerRecords<String,String> records = consumer.poll(Duration.ofMillis(1000));
   for (ConsumerRecord<String,String> record:records){
     // 4. Print the message
     System.out.printf("Received message: partition = %d,offset = %d, key = %s, value = %s%n", record.partition(),
                       record.offset(), record.key(), record.value());
   }
   // All records in this batch have been processed
   if(records.count()>0){
     // Commit the offsets synchronously; the current thread blocks until the commit succeeds
     consumer.commitSync();
   }

 }

Manual asynchronous commit: commit after the messages have been processed without waiting for the cluster's ack, and carry on with the following logic straight away. A callback can be supplied, which is invoked when the commit completes.

  while (true){
            // 3. poll() is a long poll that fetches messages
            ConsumerRecords<String,String> records = consumer.poll(Duration.ofMillis(1000));
            for (ConsumerRecord<String,String> record:records){
                // 4. Print the message
                System.out.printf("Received message: partition = %d,offset = %d, key = %s, value = %s%n", record.partition(),
                record.offset(), record.key(), record.value());
            }
            // All records in this batch have been processed
            if(records.count()>0){
                // Commit the offsets asynchronously; the current thread does not block and continues with the logic below
                consumer.commitAsync(new OffsetCommitCallback() {
                    @Override
                    public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
                        if (exception != null) {
                            System.err.println("Commit failed for " + offsets);
                            // Print the full stack trace; concatenating getStackTrace() would only print an array reference
                            exception.printStackTrace();
                        }
                    }
                });
            }

        }

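2.3 Long polling with poll

  • By default, a single poll returns at most 500 records

     // Maximum number of records returned by one poll; tune it to how fast the consumer processes records
     props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG,500);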
    

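2.4 Consumer health checks

With the settings below, the consumer sends a heartbeat to the Kafka cluster every 1 s. If the cluster receives no heartbeat from a consumer for more than 10 s, it removes that consumer from the group and triggers a rebalance, which hands the consumer's partitions to the other consumers in the group.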

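  // Interval at which the consumer sends heartbeats to the broker
  props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG,1000);
  // If Kafka receives no heartbeat from the consumer for more than 10 s, it removes the consumer
  // from the group and rebalances, assigning its partitions to other consumers.
  props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG,10*1000);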
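
The liveness rule itself is a plain timestamp comparison. The sketch below expresses it with the two values used in this section; it is a simplified model of the check made on the cluster side, not Kafka's actual code, and `SessionTimeoutSketch` and its methods are hypothetical.

```java
final class SessionTimeoutSketch {
    static final long HEARTBEAT_INTERVAL_MS = 1_000; // heartbeat.interval.ms from the text
    static final long SESSION_TIMEOUT_MS = 10_000;   // session.timeout.ms from the text

    // True when no heartbeat has arrived within the session timeout,
    // which is the condition for removing the consumer and rebalancing.
    static boolean isExpired(long lastHeartbeatMs, long nowMs) {
        return nowMs - lastHeartbeatMs > SESSION_TIMEOUT_MS;
    }

    // Number of heartbeat intervals that fit into one session timeout.
    static long heartbeatsPerSession() {
        return SESSION_TIMEOUT_MS / HEARTBEAT_INTERVAL_MS;
    }

    public static void main(String[] args) {
        System.out.println(isExpired(0, 9_000));    // false: still within the session timeout
        System.out.println(isExpired(0, 10_001));   // true: the consumer would be removed
        System.out.println(heartbeatsPerSession()); // 10
    }
}
```

Keeping the heartbeat interval well below the session timeout, as here, means a few delayed heartbeats do not trigger a rebalance.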

2.5 Consuming from a specific partition, offset, or time

  • Consume from a specific partition
// Consume from partition 0 only
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
  • Replay messages from the beginning
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
consumer.seekToBeginning(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
Sample output from the author's run (it shows partition 3 rather than the partition 0 used in the snippet):
Received message: partition = 3,offset = 0, key = 500000, value = {"createTime":1636702968492,"erpId":500000,"erpName":"张三三","realName":"张三三","updateTime":1636702968492}
Received message: partition = 3,offset = 1, key = 500005, value = {"createTime":1636957199355,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636957199355}
Received message: partition = 3,offset = 2, key = 500005, value = {"createTime":1636957211271,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636957211271}
Received message: partition = 3,offset = 3, key = 500005, value = {"createTime":1636958064780,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636958064780}
  • Consume from a specific offset
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
consumer.seek(new TopicPartition(TOPIC_NAME,0),10);
Sample output from the author's run (the partition and starting offset differ from the snippet):
Received message: partition = 3,offset = 2, key = 500005, value = {"createTime":1636957211271,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636957211271}
Received message: partition = 3,offset = 3, key = 500005, value = {"createTime":1636958064780,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636958064780}
  • Consume from a specific time

For every partition of the topic, look up the offset that corresponds to the given time, then start consuming each partition from that offset onwards.

  List<PartitionInfo> topicPartitions = consumer.partitionsFor(TOPIC_NAME);
  // Start consuming from one hour ago
  long fetchDataTime = new Date().getTime() - 1000 * 60 * 60;
  Map<TopicPartition, Long> map = new HashMap<>();
  for (PartitionInfo par : topicPartitions) {
    map.put(new TopicPartition(TOPIC_NAME, par.partition()), fetchDataTime);
  }
  // Assign every partition once. Calling assign() inside the loop below would replace
  // the previous assignment each time and leave only the last partition assigned.
  consumer.assign(map.keySet());
  Map<TopicPartition, OffsetAndTimestamp> parMap = consumer.offsetsForTimes(map);
  for (Map.Entry<TopicPartition, OffsetAndTimestamp> entry : parMap.entrySet()) {
    TopicPartition key = entry.getKey();
    OffsetAndTimestamp value = entry.getValue();
    // value is null when the partition has no message at or after the timestamp
    if (key == null || value == null) continue;
    long offset = value.offset();
    System.out.println("partition-" + key.partition() +
                       "|offset-" + offset);
    System.out.println();
    // Seek to the offset that corresponds to the timestamp
    consumer.seek(key, offset);
  }
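
For each partition, `offsetsForTimes` returns the earliest offset whose timestamp is greater than or equal to the requested one, or null when no such message exists. The sketch below reproduces that lookup over an in-memory array, assuming timestamps never decrease with the offset (real partitions do not strictly guarantee this). `OffsetForTimeSketch` is hypothetical, and -1 stands in for the null result.

```java
final class OffsetForTimeSketch {

    // timestamps[i] is the timestamp of the record at offset i (assumed non-decreasing).
    // Returns the first offset whose timestamp is >= targetTs, or -1 when there is none.
    static long offsetForTime(long[] timestamps, long targetTs) {
        int lo = 0;
        int hi = timestamps.length;
        while (lo < hi) {                   // binary search for the lower bound
            int mid = (lo + hi) >>> 1;
            if (timestamps[mid] < targetTs) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo < timestamps.length ? lo : -1;
    }

    public static void main(String[] args) {
        long[] timestamps = {100, 200, 300, 400};           // made-up timestamps for the demo
        System.out.println(offsetForTime(timestamps, 250)); // 2
        System.out.println(offsetForTime(timestamps, 100)); // 0
        System.out.println(offsetForTime(timestamps, 500)); // -1
    }
}
```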

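2.6 Starting offset for a new consumer group

By default, a consumer in a new consumer group starts from the offset after the last message in the partition, so it only sees new messages. The setting below makes a new consumer start from the beginning the first time; after that it continues from the offset following the last position it consumed.

  • latest: the default; consume only new messages

  • earliest: start from the beginning the first time, then continue from the offset following the last consumed position

    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");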
    
Received message: partition = 4,offset = 0, key = 500004, value = {"createTime":1636948165301,"erpId":500004,"erpName":"张三三","realName":"张三三","updateTime":1636948165301}
Received message: partition = 1,offset = 0, key = 500001, value = {"createTime":1636703013717,"erpId":500001,"erpName":"李丝丝","realName":"李丝丝","updateTime":1636703013717}
Received message: partition = 3,offset = 0, key = 500000, value = {"createTime":1636702968492,"erpId":500000,"erpName":"张三三","realName":"张三三","updateTime":1636702968492}
Received message: partition = 3,offset = 1, key = 500005, value = {"createTime":1636957199355,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636957199355}
Received message: partition = 3,offset = 2, key = 500005, value = {"createTime":1636957211271,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636957211271}
Received message: partition = 3,offset = 3, key = 500005, value = {"createTime":1636958064780,"erpId":500005,"erpName":"张三三","realName":"张三三","updateTime":1636958064780}
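
The rule in this section comes down to one decision: use the committed offset if the group has one, otherwise fall back to the reset policy. The sketch below models that decision in isolation. `OffsetResetSketch` and `startingOffset` are hypothetical, and the third policy value, none, is treated here as an error, which mirrors but greatly simplifies the client's behavior.

```java
final class OffsetResetSketch {

    // committedOffset is null when the group has no committed offset for the partition.
    // logStartOffset is the first available offset; logEndOffset is the last offset + 1.
    static long startingOffset(Long committedOffset, String autoOffsetReset,
                               long logStartOffset, long logEndOffset) {
        if (committedOffset != null) {
            return committedOffset; // the reset policy only applies when nothing is committed
        }
        switch (autoOffsetReset) {
            case "earliest":
                return logStartOffset;
            case "latest":
                return logEndOffset;
            default:
                throw new IllegalStateException("no committed offset and policy is " + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        // A partition holding offsets 0..3, as in the sample output for partition 3
        System.out.println(startingOffset(null, "latest", 0, 4));   // 4: new messages only
        System.out.println(startingOffset(null, "earliest", 0, 4)); // 0: from the beginning
        System.out.println(startingOffset(2L, "earliest", 0, 4));   // 2: the committed offset wins
    }
}
```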

3. Integrating Kafka with Spring Boot

3.1 Add the spring-kafka dependency

    <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
    </dependency>

3.2 Configuration in application.yml

spring:
  kafka:
    bootstrap-servers: 192.168.21.107:9092,192.168.21.108:9092,192.168.21.109:9092
    # Producer
    producer:
      # When set to a value greater than 0, the client resends records that failed to send
      retries: 3
      batch-size: 16384
      buffer-memory: 33554432
      acks: 1
      # Serializers for the message key and the message body
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: default-group
      enable-auto-commit: false
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      max-poll-records: 500

    listener:
      # Commit immediately after Acknowledgment.acknowledge() is called manually; this mode is the usual choice
      ack-mode: manual_immediate

3.3 Writing the message producer

private final static String TOPIC_NAME= "my-replicated-topic";

@Autowired
private KafkaTemplate<String,String> kafkaTemplate;

@RequestMapping(value ="/send")
public String sendMessage(){
      kafkaTemplate.send(TOPIC_NAME,"key","this is a test kafka message");
      return "send message success";
}

3.4 Writing the message consumer

    @KafkaListener(topics = "my-replicated-topic")
    public void listenGroup(ConsumerRecord<String, String> record,
                            Acknowledgment ack) {
        String value = record.value();
        System.out.println(value);
        System.out.println(record);
        // Commit the offset manually
        ack.acknowledge();
    }

3.5 Configuring topics, partitions, and offsets in the consumer

    // concurrency is the number of consumers in the group, i.e. the number of concurrent
    // consumers; it should be less than or equal to the total number of partitions
    @KafkaListener(groupId = "testGroup", topicPartitions = {
            @TopicPartition(topic = "topic1", partitions = {"0", "1"}),
            @TopicPartition(topic = "topic2", partitions = "0",
                    partitionOffsets = @PartitionOffset(partition = "1",
                            initialOffset = "100"))
    }, concurrency = "3")
    public void listenGroupPro(ConsumerRecord<String, String> record,
                               Acknowledgment ack) {
        String value = record.value();
        System.out.println(value);
        System.out.println(record);
        // Commit the offset manually
        ack.acknowledge();
    }
posted @ 2021-11-15 16:15  shine-rainbow