Kafka in code: producers, consumers, interceptors, and Eagle monitoring

 

Producer: sending data

The producer sends data to Kafka asynchronously (if the topic does not exist yet, it is created automatically on first send, since the broker's auto.create.topics.enable defaults to true)
<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>0.11.0.0</version>
    </dependency>
</dependencies>
Maven dependency
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MyProducer {
    public static void main(String[] args) {
        //Producer configuration
        Properties properties = new Properties();
        //Kafka cluster (broker list)
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("acks", "all");
        //Number of retries
        properties.put("retries", 1);
//        //The settings below are the defaults and can be omitted
//        //Batch size: 16 KB
//        properties.put("batch.size", 16384);
//        //Linger time
//        properties.put("linger.ms", 1);
//        //RecordAccumulator buffer size: 32 MB
//        properties.put("buffer.memory", 33554432);
        //Key/value serializers
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //Without a callback:
            //producer.send(new ProducerRecord<String, String>("first", Integer.toString(i)));
            //With a callback:
            producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i)),
                    new Callback() {
                        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                            if (e == null) {
                                System.out.println(recordMetadata.partition() + "----" + recordMetadata.offset());
                            }
                        }
                    });

        }
        //close() must be called (or the thread must sleep briefly) so buffered records are flushed to Kafka
        producer.close();
    }
}
Producer test class

Synchronous send to Kafka (ordered delivery)

send() returns a Future; calling get() on it blocks the current thread until the send completes, so each record is fully delivered before the next one is sent.

 producer.send(new ProducerRecord<String, String>("first",Integer.toString(i),Integer.toString(i))).get();
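
For reference, a minimal runnable sketch of the synchronous pattern, reusing the broker and topic names from the asynchronous example above; note that get() throws checked exceptions that must be handled or declared:

import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MySyncProducer {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //get() blocks until the broker acknowledges this record,
            //so records are written strictly in loop order
            RecordMetadata metadata = producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i), Integer.toString(i))).get();
            System.out.println(metadata.partition() + "----" + metadata.offset());
        }
        producer.close();
    }
}
Synchronous-send test class (sketch)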

 

Consumer: consuming data

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyConsumer {
    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Auto-commit consumed offsets (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        //Auto-commit interval
        properties.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
        }
    }
}
Consumer test class

To re-consume a topic from the beginning, switch to a new consumer group first, then enable the offset-reset setting; auto.offset.reset only takes effect when the group has no committed offset.
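
A minimal sketch of the two settings this involves, changing only these lines in the MyConsumer example above ("bigdata-fresh" is a hypothetical group with no committed offsets):

//a group that has never committed offsets
properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata-fresh");
//start from the beginning of each partition when no committed offset exists
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");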

 

Consumer offset commits

Although auto-committing offsets is convenient, the commits are driven by a timer, so developers cannot control exactly when an offset is committed. Kafka therefore also provides an API for committing offsets manually. (Manual commits can prevent data loss, but not duplicate consumption.)

There are two ways to commit offsets manually:

  1. commitSync (synchronous commit)
  2. commitAsync (asynchronous commit)

Both commit the highest offset of the batch returned by the current poll.

The difference: commitSync blocks the current thread until the commit succeeds and retries automatically on failure (though failures caused by uncontrollable factors can still occur), while commitAsync has no retry mechanism, so a commit may fail.

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyConsumer1 {

    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
            //Synchronous commit: blocks the current thread until the offsets are committed
            consumer.commitSync();
        }
    }
}
Synchronous commit
import java.util.Arrays;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.consumer.OffsetCommitCallback;
import org.apache.kafka.common.TopicPartition;

public class MyConsumer2 {
    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second
        consumer.subscribe(Arrays.asList("first", "second"));
        //Poll for data
        while (true) {
            ConsumerRecords<String, String> consumerRecords = consumer.poll(100);
            for (ConsumerRecord<String, String> consumerRecord : consumerRecords) {
                System.out.println(consumerRecord.key() + "====" + consumerRecord.value());
            }
            //Asynchronous commit with a completion callback
            consumer.commitAsync(new OffsetCommitCallback() {
                @Override
                public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
                    if (exception != null) {
                        System.err.println("Commit failed for " + offsets);
                    }
                }
            });
        }
    }

}
Asynchronous commit

 

Custom offset storage (the processed results and the offsets can be written to MySQL in a single transaction, guaranteeing data is neither lost nor duplicated)

Before Kafka 0.9, offsets were stored in ZooKeeper; from 0.9 onward they are stored by default in an internal Kafka topic (__consumer_offsets). Beyond that, Kafka also allows offsets to be stored in a system of your own choosing.

Maintaining offsets yourself is fairly involved, because consumer rebalances must be taken into account.

Whenever a new consumer joins the group, an existing consumer leaves the group, or the partitions of a subscribed topic change, the partitions are reassigned among the consumers; this reassignment is called a rebalance.

After a rebalance, each consumer may own a different set of partitions, so it must first find out which partitions it has been assigned and then seek each one to the most recently committed offset before continuing to consume.

Implementing custom offset storage requires a ConsumerRebalanceListener. In the example below, the methods that commit and fetch offsets are stubs to be implemented against whichever offset store you choose (for example, a MySQL table).

 

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class MyConsumer3 {
    private static Map<TopicPartition, Long> currentOffset = new HashMap<TopicPartition, Long>();

    public static void main(String[] args) {
        //Consumer configuration
        Properties properties = new Properties();
        //Connect to the cluster
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname1:9092");
        //Disable auto-commit (default is true)
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        //Key/value String deserializers
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        //Consumer group
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "bigdata");
        //Where to start when the group has no committed offset
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        final KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        //Subscribe to topics first and second, registering a rebalance listener
        consumer.subscribe(Arrays.asList("first", "second"),
                new ConsumerRebalanceListener() {
                    // Called before a rebalance takes effect
                    @Override
                    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                        commitOffset(currentOffset);
                    }

                    // Called after a rebalance completes
                    @Override
                    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {

                        currentOffset.clear();
                        for (TopicPartition partition : partitions) {
                            consumer.seek(partition, getOffset(partition));// seek to the last committed position and resume consuming
                        }
                    }
                });

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);// fetch records
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
                // Store the next offset to read (last consumed + 1), which is the position to seek back to
                currentOffset.put(new TopicPartition(record.topic(), record.partition()), record.offset() + 1);
            }
            commitOffset(currentOffset);// persist offsets to the external store
        }
    }

    // Fetch the stored offset for a partition (implement against your offset store)
    private static long getOffset(TopicPartition partition) {
        return 0;
    }

    // Persist the offsets of every partition this consumer owns (implement against your offset store)
    private static void commitOffset(Map<TopicPartition, Long> currentOffset) {
    }

}
Custom offset storage
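
As one concrete possibility for the two stubs above, here is a minimal JDBC sketch backed by MySQL. Everything in it is an assumption for illustration: the URL, credentials, and a table created as offsets(group_id VARCHAR(255), topic VARCHAR(255), part INT, off BIGINT, PRIMARY KEY (group_id, topic, part)).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;

public class MysqlOffsetStore {
    // Hypothetical connection settings -- adjust to your environment
    private static final String URL = "jdbc:mysql://hostname1:3306/kafka_offsets";
    private static final String USER = "root";
    private static final String PASSWORD = "123456";
    private static final String GROUP = "bigdata";

    // Read the stored offset for one partition; 0 if nothing was ever committed
    public static long getOffset(TopicPartition partition) {
        String sql = "SELECT off FROM offsets WHERE group_id = ? AND topic = ? AND part = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, GROUP);
            ps.setString(2, partition.topic());
            ps.setInt(3, partition.partition());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : 0L;
            }
        } catch (SQLException e) {
            throw new RuntimeException("failed to read offset", e);
        }
    }

    // Upsert the offsets of all partitions in one transaction; in a real application the
    // processed results would be written inside the same transaction to get exactly-once
    public static void commitOffset(Map<TopicPartition, Long> offsets) {
        String sql = "INSERT INTO offsets (group_id, topic, part, off) VALUES (?, ?, ?, ?) "
                + "ON DUPLICATE KEY UPDATE off = VALUES(off)";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Map.Entry<TopicPartition, Long> entry : offsets.entrySet()) {
                    ps.setString(1, GROUP);
                    ps.setString(2, entry.getKey().topic());
                    ps.setInt(3, entry.getKey().partition());
                    ps.setLong(4, entry.getValue());
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            }
        } catch (SQLException e) {
            throw new RuntimeException("failed to commit offsets", e);
        }
    }
}
MySQL-backed offset store (sketch)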

 

 

 

Producer interceptors

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TimeInterceptor implements ProducerInterceptor<String, String> {
    //Configuration hook
    @Override
    public void configure(Map<String, ?> map) {

    }
    //Called before the record is sent to Kafka
    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> producerRecord) {
        //The record cannot be modified in place,
        //so build and return a new record with the modified value
        return new ProducerRecord<String, String>(producerRecord.topic(), producerRecord.partition(), producerRecord.key(),
                System.currentTimeMillis() + "---" + producerRecord.value());
    }
    //Called after the broker acknowledges the record (or the send fails)
    @Override
    public void onAcknowledgement(RecordMetadata recordMetadata, Exception e) {

    }
    //Called when the producer is closed
    @Override
    public void close() {

    }
}
Prepends a timestamp to each record's value
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class CountInterceptor implements ProducerInterceptor<String, String> {
    int successCount;
    int errorCount;

    @Override
    public void configure(Map<String, ?> map) {

    }

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> producerRecord) {
        return producerRecord;
    }

    @Override
    public void onAcknowledgement(RecordMetadata recordMetadata, Exception e) {
        //A null exception means the send succeeded
        //(the metadata alone is not a reliable success indicator)
        if (e == null) {
            successCount++;
        } else {
            errorCount++;
        }
    }

    @Override
    public void close() {
        System.out.println("success: " + successCount);
        System.out.println("error: " + errorCount);
    }
}
Counts how many sends succeeded and how many failed
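
Note that onAcknowledgement runs in the producer's background I/O thread, so implementations should stay lightweight; slow work here delays sending.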
import java.util.ArrayList;
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class MyProducer1 {
    public static void main(String[] args) {
        //Producer configuration
        Properties properties = new Properties();
        //Kafka cluster (broker list)
        properties.put("bootstrap.servers", "hostname1:9092");
        properties.put("acks", "all");
        //Number of retries
        properties.put("retries", 1);
//        //The settings below are the defaults and can be omitted
//        //Batch size: 16 KB
//        properties.put("batch.size", 16384);
//        //Linger time
//        properties.put("linger.ms", 1);
//        //RecordAccumulator buffer size: 32 MB
//        properties.put("buffer.memory", 33554432);
        //Key/value serializers
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        //Register the interceptor chain
        ArrayList<String> interceptors = new ArrayList<String>();
        interceptors.add("com.demo.TimeInterceptor");
        interceptors.add("com.demo.CountInterceptor");
        properties.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);
        for (int i = 0; i < 100; i++) {
            //Without a callback:
            //producer.send(new ProducerRecord<String, String>("first", Integer.toString(i)));
            //With a callback:
            producer.send(
                    new ProducerRecord<String, String>("first", Integer.toString(i)),
                    new Callback() {
                        public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                            if (e == null) {
                                System.out.println(recordMetadata.partition() + "----" + recordMetadata.offset());
                            }
                        }
                    });

        }
        //close() must be called (or the thread must sleep briefly) so buffered records are flushed to Kafka
        producer.close();
    }
}
Producer with the interceptors registered
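
Interceptors are invoked in the order they appear in the list, and each onSend receives the record returned by the previous one: here TimeInterceptor rewrites the value first, and CountInterceptor then sees the rewritten record.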

 

Eagle monitoring (uses kafka-eagle-bin-1.3.7.tar.gz and MySQL 5.5; newer MySQL versions fail to connect)

cd /opt/kafka_2.11-0.11.0.0/bin

In kafka-server-start.sh, replace

export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

with the following (JMX_PORT=9999 exposes the JMX metrics that Eagle collects):

export KAFKA_HEAP_OPTS="-server -Xms2G -Xmx2G -XX:PermSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=5 -XX:InitiatingHeapOccupancyPercent=70"
export JMX_PORT="9999"
#export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

Distribute the modified script to the other brokers (xsync is a cluster sync script)

xsync kafka-server-start.sh

Extract the monitoring package

tar -zxvf kafka-eagle-bin-1.3.7.tar.gz

cd /kafka-eagle-bin-1.3.7

tar -zxvf kafka-eagle-web-1.3.7-bin.tar.gz

Set the environment variables Eagle requires

sudo vi /etc/profile

Add the following (KE_HOME points at the Eagle root directory)

export KE_HOME=/opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7
export PATH=$PATH:$KE_HOME/bin

Reload the environment variables

source /etc/profile

Make the monitoring launcher script executable

cd /opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7/bin

chmod 777 ke.sh

Edit the configuration

cd /opt/kafka-eagle-bin-1.3.7/kafka-eagle-web-1.3.7/conf

vi system-config.properties

Set the ZooKeeper cluster addresses

#kafka.eagle.zk.cluster.alias=cluster1,cluster2
kafka.eagle.zk.cluster.alias=cluster1
cluster1.zk.list=hostname1:2181,hostname2:2181
#cluster2.zk.list=xdn10:2181,xdn11:2181,xdn12:2181

Store offsets in Kafka rather than ZooKeeper (set this to zookeeper only for old Kafka versions that still keep offsets there)

cluster1.kafka.eagle.offset.storage=kafka
#cluster2.kafka.eagle.offset.storage=zookeeper

Enable the metrics charts

kafka.eagle.metrics.charts=true
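
Since this section calls for MySQL 5.5, the database settings in system-config.properties presumably also need to be filled in so Eagle can persist its data. A sketch of the usual JDBC block, with placeholder host and credentials to adjust for your environment:

kafka.eagle.driver=com.mysql.jdbc.Driver
kafka.eagle.url=jdbc:mysql://127.0.0.1:3306/ke?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull
kafka.eagle.username=root
kafka.eagle.password=123456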

Start monitoring

ke.sh start

Stop

ke.sh stop

Status

ke.sh status

 

Errors are recorded in error.log under the log directory

 

Open the monitoring console (username admin, password 123456)

http://192.168.199.123:8048/ke

 
