Kafka in code: producers, consumers, interceptors, and Eagle monitoring

 

Producer: sending data

The producer sends data to Kafka asynchronously (if the topic does not exist yet, it is created automatically on first send, since the broker's auto.create.topics.enable defaults to true)
<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>0.11.0.0</version>
    </dependency>
</dependencies>
Maven dependency
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MyProducer {
    public static void main(String[] args) {
        //Producer configuration
        Properties properties = new Properties();
        //Kafka cluster (broker list)
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("acks", "all");
        //Number of retries
        properties.put("retries", 1);
//        //The settings below are the defaults and can be omitted
//        //Batch size: 16 KB
//        properties.put("batch.size", 16384);
//        //Linger time
//        properties.put("linger.ms", 1);
//        //RecordAccumulator buffer size: 32 MB
//        properties.put("buffer.memory", 33554432);
        //Key/value serializers
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //Without a callback:
            //producer.send(new ProducerRecord<String, String>("first", Integer.toString(i)));
            //With a callback:
            producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i)),
                    new Callback() {
                        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                            if (e == null) {
                                System.out.println(recordMetadata.partition() + "----" + recordMetadata.offset());
                            }
                        }
                    });

        }
        //close() must be called (or the thread must sleep briefly) so buffered records are flushed to Kafka
        producer.close();
    }
}
Producer test class

Synchronous send to Kafka (ordered delivery)

send() returns a Future; calling get() on it blocks the current thread until the send completes, so each record is fully delivered before the next one is sent.

 producer.send(new ProducerRecord<String, String>("first",Integer.toString(i),Integer.toString(i))).get();
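
For reference, a minimal runnable sketch of the synchronous pattern, reusing the broker and topic names from the asynchronous example above; note that get() throws checked exceptions that must be handled or declared:

import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MySyncProducer {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //get() blocks until the broker acknowledges this record,
            //so records are written strictly in loop order
            RecordMetadata metadata = producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i), Integer.toString(i))).get();
            System.out.println(metadata.partition() + "----" + metadata.offset());
        }
        producer.close();
    }
}
Synchronous-send test class (sketch)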

 

Consumer: consuming data

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyConsumer {
    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Auto-commit consumed offsets (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        //Auto-commit interval
        properties.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
        }
    }
}
Consumer test class

To re-consume a topic from the beginning, switch to a new consumer group first, then enable the offset-reset setting; auto.offset.reset only takes effect when the group has no committed offset.
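
A minimal sketch of the two settings this involves, changing only these lines in the MyConsumer example above ("bigdata-fresh" is a hypothetical group with no committed offsets):

//a group that has never committed offsets
properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata-fresh");
//start from the beginning of each partition when no committed offset exists
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");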

 

Consumer offset commits

Although auto-committing offsets is convenient, the commits are driven by a timer, so developers cannot control exactly when an offset is committed. Kafka therefore also provides an API for committing offsets manually. (Manual commits can prevent data loss, but not duplicate consumption.)

There are two ways to commit offsets manually:

  1. commitSync (synchronous commit)
  2. commitAsync (asynchronous commit)

Both commit the highest offset of the batch returned by the current poll.

The difference: commitSync blocks the current thread until the commit succeeds and retries automatically on failure (though failures caused by uncontrollable factors can still occur), while commitAsync has no retry mechanism, so a commit may fail.

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyConsumer1 {

    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
            //Synchronous commit: blocks the current thread until the offsets are committed
            consumer.commitSync();
        }
    }
}
Synchronous commit
import java.util.Arrays;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.consumer.OffsetCommitCallback;
import org.apache.kafka.common.TopicPartition;

public class MyConsumer2 {
    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
            //Asynchronous commit with a completion callback
            consumer.commitAsync(new OffsetCommitCallback() {
                @Override
                public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
                    if (exception != null) {
                        System.err.println("Commit failed for " + offsets);
                    }
                }
            });
        }
    }

}
Asynchronous commit

 

Custom offset storage (the processed results and the offsets can be written to MySQL in a single transaction, guaranteeing data is neither lost nor duplicated)

Before Kafka 0.9, offsets were stored in ZooKeeper; from 0.9 onward they are stored by default in an internal Kafka topic (__consumer_offsets). Beyond that, Kafka also allows offsets to be stored in a system of your own choosing.

Maintaining offsets yourself is fairly involved, because consumer rebalances must be taken into account.

Whenever a new consumer joins the group, an existing consumer leaves the group, or the partitions of a subscribed topic change, the partitions are reassigned among the consumers; this reassignment is called a rebalance.

After a rebalance, each consumer may own a different set of partitions, so it must first find out which partitions it has been assigned and then seek each one to the most recently committed offset before continuing to consume.

Implementing custom offset storage requires a ConsumerRebalanceListener. In the example below, the methods that commit and fetch offsets are stubs to be implemented against whichever offset store you choose (for example, a MySQL table).

 

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class MyConsumer3 {
    private static Map<TopicPartition, Long> currentOffset = new HashMap<TopicPartition, Long>();

    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        final KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second, registering a rebalance listener
        consumer.subscribe(Arrays.asList("first", "second"),
                new ConsumerRebalanceListener() {
                    // Called before a rebalance takes effect
                    @Override
                    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                        commitOffset(currentOffset);
                    }

                    // Called after a rebalance completes
                    @Override
                    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {

                        currentOffset.clear();
                        for (TopicPartition partition : partitions) {
                            consumer.seek(partition, getOffset(partition));// seek to the last committed position and resume consuming
                        }
                    }
                });

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);// fetch records
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
                // Store the next offset to read (last consumed + 1), which is the position to seek back to
                currentOffset.put(new TopicPartition(record.topic(), record.partition()), record.offset() + 1);
            }
            commitOffset(currentOffset);// persist offsets to the external store
        }
    }

    // Fetch the stored offset for a partition (implement against your offset store)
    private static long getOffset(TopicPartition partition) {
        return 0;
    }

    // Persist the offsets of every partition this consumer owns (implement against your offset store)
    private static void commitOffset(Map<TopicPartition, Long> currentOffset) {
    }

}
Custom offset storage
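
As one concrete possibility for the two stubs above, here is a minimal JDBC sketch backed by MySQL. Everything in it is an assumption for illustration: the URL, credentials, and a table created as offsets(group_id VARCHAR(255), topic VARCHAR(255), part INT, off BIGINT, PRIMARY KEY (group_id, topic, part)).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;

public class MysqlOffsetStore {
    // Hypothetical connection settings -- adjust to your environment
    private static final String URL = "jdbc:mysql://hostname1:3306/kafka_offsets";
    private static final String USER = "root";
    private static final String PASSWORD = "123456";
    private static final String GROUP = "bigdata";

    // Read the stored offset for one partition; 0 if nothing was ever committed
    public static long getOffset(TopicPartition partition) {
        String sql = "SELECT off FROM offsets WHERE group_id = ? AND topic = ? AND part = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, GROUP);
            ps.setString(2, partition.topic());
            ps.setInt(3, partition.partition());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : 0L;
            }
        } catch (SQLException e) {
            throw new RuntimeException("failed to read offset", e);
        }
    }

    // Upsert the offsets of all partitions in one transaction; in a real application the
    // processed results would be written inside the same transaction to get exactly-once
    public static void commitOffset(Map<TopicPartition, Long> offsets) {
        String sql = "INSERT INTO offsets (group_id, topic, part, off) VALUES (?, ?, ?, ?) "
                + "ON DUPLICATE KEY UPDATE off = VALUES(off)";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Map.Entry<TopicPartition, Long> entry : offsets.entrySet()) {
                    ps.setString(1, GROUP);
                    ps.setString(2, entry.getKey().topic());
                    ps.setInt(3, entry.getKey().partition());
                    ps.setLong(4, entry.getValue());
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            }
        } catch (SQLException e) {
            throw new RuntimeException("failed to commit offsets", e);
        }
    }
}
MySQL-backed offset store (sketch)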

 

 

 

Producer interceptors

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TimeInterceptor implements ProducerInterceptor<String, String> {
    //Configuration hook
    @Override
    public void configure(Map<String, ?> map) {

    }
    //Called before the record is sent to Kafka
    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> producerRecord) {
        //The record cannot be modified in place,
        //so build and return a new record with the modified value
        return new ProducerRecord<String, String>(producerRecord.topic(), producerRecord.partition(), producerRecord.key(),
                System.currentTimeMillis() + "---" + producerRecord.value());
    }
    //Called after the broker acknowledges the record (or the send fails)
    @Override
    public void onAcknowledgement(RecordMetadata recordMetadata, Exception e) {

    }
    //Called when the producer is closed
    @Override
    public void close() {

    }
}
Prepends a timestamp to each record's value
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class CountInterceptor implements ProducerInterceptor<String, String> {
    int successCount;
    int errorCount;

    @Override
    public void configure(Map<String, ?> map) {

    }

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> producerRecord) {
        return producerRecord;
    }

    @Override
    public void onAcknowledgement(RecordMetadata recordMetadata, Exception e) {
        //A null exception means the send succeeded
        //(the metadata alone is not a reliable success indicator)
        if (e == null) {
            successCount++;
        } else {
            errorCount++;
        }
    }

    @Override
    public void close() {
        System.out.println("success: " + successCount);
        System.out.println("error: " + errorCount);
    }
}
Counts how many sends succeeded and how many failed
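
Note that onAcknowledgement runs in the producer's background I/O thread, so implementations should stay lightweight; slow work here delays sending.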
import java.util.ArrayList;
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MyProducer1 {
    public static void main(String[] args) {
        //Producer configuration
        Properties properties = new Properties();
        //Kafka cluster (broker list)
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("acks", "all");
        //Number of retries
        properties.put("retries", 1);
//        //The settings below are the defaults and can be omitted
//        //Batch size: 16 KB
//        properties.put("batch.size", 16384);
//        //Linger time
//        properties.put("linger.ms", 1);
//        //RecordAccumulator buffer size: 32 MB
//        properties.put("buffer.memory", 33554432);
        //Key/value serializers
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        //Register the interceptor chain
        ArrayList<String> interceptors = new ArrayList<String>();
        interceptors.add("com.demo.TimeInterceptor");
        interceptors.add("com.demo.CountInterceptor");
        properties.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //Without a callback:
            //producer.send(new ProducerRecord<String, String>("first", Integer.toString(i)));
            //With a callback:
            producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i)),
                    new Callback() {
                        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                            if (e == null) {
                                System.out.println(recordMetadata.partition() + "----" + recordMetadata.offset());
                            }
                        }
                    });

        }
        //close() must be called (or the thread must sleep briefly) so buffered records are flushed to Kafka
        producer.close();
    }
}
Producer with the interceptors registered
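
Interceptors are invoked in the order they appear in the list, and each onSend receives the record returned by the previous one: here TimeInterceptor rewrites the value first, and CountInterceptor then sees the rewritten record.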

 

Eagle monitoring (uses kafka-eagle-bin-1.3.7.tar.gz and MySQL 5.5; newer MySQL versions fail to connect)

cd /opt/kafka_2.11-0.11.0.0/bin

In kafka-server-start.sh, replace

export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

with the following (JMX_PORT=9999 exposes the JMX metrics that Eagle collects):

export KAFKA_HEAP_OPTS="-server -Xms2G -Xmx2G -XX:PermSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=5 -XX:InitiatingHeapOccupancyPercent=70"
export JMX_PORT="9999"
#export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

Distribute the modified script to the other brokers (xsync is a cluster sync script)

xsync kafka-server-start.sh

Extract the monitoring package

tar -zxvf kafka-eagle-bin-1.3.7.tar.gz

cd /kafka-eagle-bin-1.3.7

tar -zxvf kafka-eagle-web-1.3.7-bin.tar.gz

Set the environment variables Eagle requires

sudo vi /etc/profile

Add the following (KE_HOME points at the Eagle root directory)

export KE_HOME=/opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7
export PATH=$PATH:$KE_HOME/bin

Reload the environment variables

source /etc/profile

Make the monitoring launcher script executable

cd /opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7/bin

chmod 777 ke.sh

Edit the configuration

cd /opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7/conf

vi system-config.properties

Set the ZooKeeper cluster addresses

#kafka.eagle.zk.cluster.alias=cluster1,cluster2
kafka.eagle.zk.cluster.alias=cluster1
cluster1.zk.list=hostname1:2181,hostname2:2181
#cluster2.zk.list=xdn10:2181,xdn11:2181,xdn12:2181

Store offsets in Kafka rather than ZooKeeper (set this to zookeeper only for old Kafka versions that still keep offsets there)

cluster1.kafka.eagle.offset.storage=kafka
#cluster2.kafka.eagle.offset.storage=zookeeper

Enable the metrics charts

kafka.eagle.metrics.charts=true
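
Since this section calls for MySQL 5.5, the database settings in system-config.properties presumably also need to be filled in so Eagle can persist its data. A sketch of the usual JDBC block, with placeholder host and credentials to adjust for your environment:

kafka.eagle.driver=com.mysql.jdbc.Driver
kafka.eagle.url=jdbc:mysql://127.0.0.1:3306/ke?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull
kafka.eagle.username=root
kafka.eagle.password=123456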

Start monitoring

ke.sh start

Stop

ke.sh stop

Status

ke.sh status

 

Errors are recorded in error.log under the log directory

 

Open the monitoring console (username admin, password 123456)

http://192.168.199.123:8048/ke

 
