Kafka备忘

 

 

 

官网 http://kafka.apache.org/

多生产者多消费者

多topic和多分区

多消费者组。每组中消息不能重复消费,组间不影响

 

 

启动

RunKafka(){
    cd $kafka_home
    nohup ./bin/zookeeper-server-start.sh ./config/zookeeper.properties >> zk$dayStr.log  & 
    echo 'Starting zookeeper...'
    sleep 5s #wait a monment until zookeeper is ready

    nohup ./bin/kafka-server-start.sh ./config/server.properties  >> kafka$dayStr.log & 
    echo 'Starting kafka-server...'
}

流程:启动zookeeper -> 启动kafka-server -> 创建topic -> 创建生产者 -> 创建消费者

 

基本命令

# kafka basic common
# ./app/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 4 --topic huashi
# ./app/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181
# ./app/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic huashi
# ./app/kafka/bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic huashi

# ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic huashi
# ./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic huashi --from-beginning
# ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic huashi --partitions 4

# ==================delete consumer group======
# ~/kafka/bin/zookeeper-shell.sh localhost:2181
# ls /consumers
# rmr /consumers/bd/offsets
# rmr /consumers/test/offsets/huashi20151108


# ls /tmp/kafka-logs/

 

 

 

Python包

https://github.com/mumrah/kafka-python

用法:http://kafka-python.readthedocs.org/en/latest/usage.html 

 

 

 

FakeProducer.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

__author__ = 'manhua'

from kafka import SimpleProducer, KafkaClient
import time
# To send messages synchronously
kafka = KafkaClient('localhost:9092')
producer = SimpleProducer(kafka)
n=0
while True:
    producer.send_messages(b'huashi', str(n))
    n+=1
    time.sleep(1)
    print n

 

 

ConsumerTest.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

__author__ = 'manhua'

from kafka import KafkaConsumer
import sys


class UnzipConsumer:
    def __init__(self, topic, partition_id, gid='bd', bs_server='localhost:9092'):
        self.partition_id = partition_id
        self.consumer = KafkaConsumer((topic, int(partition_id)),  # must specify an id, or it will quite slow
                                      group_id=gid,
                                      bootstrap_servers=[bs_server],
                                      auto_offset_reset='smallest'  # ,
                                      # consumer_timeout_ms=1000*60*30
                                      )

    def run(self):
        for message in self.consumer:
            print("%s:%d:%d: key=%s value=%s" % (message.topic, message.partition,
                                             message.offset, message.key,
                                             message.value))
            self.consumer.task_done(message)
            self.consumer.commit()


if __name__ == '__main__':

    if len(sys.argv) == 3:
        obj = UnzipConsumer(sys.argv[1], sys.argv[2])
        obj.run()
    else:
        print 'Parameters: [topic] [id]'


# python unzipConsumer.py  huashi #0

 

 

监控工具

https://github.com/quantifind/KafkaOffsetMonitor

实时监控kafka的consumer以及他们在partition中的offset

因为KafkaOffsetMonitor中有些资源文件(css,js)是访问google资源,所以有人做了修改版 http://pan.baidu.com/s/1qWH05q8 

java -cp KafkaOffsetMonitor-assembly-0.2.0.jar com.quantifind.kafka.offsetapp.OffsetGetterWeb --zk localhost --refresh 15.minutes --retain 5.day --port 5354

  

 

 

Ref:

http://blog.csdn.net/lizhitao/article/details/27199863

posted @ 2015-11-11 15:45  Man_华  阅读(397)  评论(0编辑  收藏  举报