Flume + Kafka
Prerequisites:
1. Download Flume: http://flume.apache.org/download.html
2. Download and configure Kafka: http://www.cnblogs.com/eggplantpro/articles/8428932.html
3. Three servers minimum; I'm using five here (zk = ZooKeeper):
s1: 10.211.55.16  zk & kafka
s2: 10.211.55.17  zk
s3: 10.211.55.18  zk
s4: 10.211.55.19  kafka & flume
s5: 10.211.55.20  kafka
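The configs below address the brokers and ZooKeeper by hostname (s1:9092, s1:2181), so s1 through s5 must resolve on every machine. A minimal /etc/hosts sketch (adjust to your own addresses):

10.211.55.16 s1
10.211.55.17 s2
10.211.55.18 s3
10.211.55.19 s4
10.211.55.20 s5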
Installation:
1. Unpack
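A typical unpack sequence, assuming the 1.8.0 binary tarball (substitute whichever version you downloaded):

tar -zxvf apache-flume-1.8.0-bin.tar.gz
cd apache-flume-1.8.0-bin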
The directory layout is conventional; the configuration files, naturally, live under conf.
2. Configure
Flume is configured differently from most software: an agent is defined entirely by a single configuration file, so each kind of service is really just a configuration. conf ships with templates, but for the specific sources, channels, and sinks you should consult the official user guide:
http://flume.apache.org/FlumeUserGuide.html
vim flume-kafka.properties
This is the demo from the official docs; we can simply adapt it:
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
The adapted configuration:
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
#a1.sources.r1.type = netcat
#a1.sources.r1.bind = localhost
#a1.sources.r1.port = 44444
# The source type is a command
a1.sources.r1.type = exec
# tail a log file; whenever a line is written, it flows on to the sink
a1.sources.r1.command = tail -F /home/test.log

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Name of the Kafka topic
a1.sinks.k1.kafka.topic = mxb
# Kafka broker host and port
a1.sinks.k1.kafka.bootstrap.servers = s1:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
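The sink assumes the mxb topic already exists, or that the brokers auto-create topics (the default). If neither holds, create it first; a sketch using the ZooKeeper-based tooling of Kafka releases from this era, with the replication and partition counts as my own minimal assumptions:

cd /kafka/bin
./kafka-topics.sh --create --zookeeper s1:2181 --replication-factor 1 --partitions 1 --topic mxb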
3. Start
From the Flume home directory:

bin/flume-ng agent --conf conf --conf-file conf/flume-kafka.properties --name a1 -Dflume.root.logger=INFO,console
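For an unattended run you would normally background the agent instead of logging to the console; a sketch (the nohup wrapper and log path are my additions, not from the original):

nohup bin/flume-ng agent --conf conf --conf-file conf/flume-kafka.properties --name a1 > flume.log 2>&1 &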
4. Verify
On one of the Kafka servers, start a Kafka console consumer:
cd /kafka/bin
./kafka-console-consumer.sh --zookeeper s1:2181 --from-beginning --topic mxb
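Note that --zookeeper is the old consumer's flag; on Kafka 0.10+ the console consumer can (and on 2.0+ must) connect to the brokers directly:

./kafka-console-consumer.sh --bootstrap-server s1:9092 --from-beginning --topic mxb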
Since the Flume source tails a log file, all we have to do is append lines to /home/test.log:

for ((i = 0; i < 5000; i++)); do
    echo "test$i" >> /home/test.log
done
Once the loop has run, the server where kafka-console-consumer was started will print the consumed messages (test0 through test4999).
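To double-check that all 5000 events actually landed in the topic, you can query its end offsets; a sketch using the GetOffsetShell tool shipped in Kafka's bin directory (flags as in older releases):

./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list s1:9092 --topic mxb --time -1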