Flume + Kafka

Prerequisites:

1. Download Flume: http://flume.apache.org/download.html

2. Download and configure Kafka (see: http://www.cnblogs.com/eggplantpro/articles/8428932.html)

3. At least three servers; I am using five here:

s1: 10.211.55.16  zk & kafka    (zk = ZooKeeper)

s2: 10.211.55.17  zk

s3: 10.211.55.18  zk

s4: 10.211.55.19  kafka & flume

s5: 10.211.55.20  kafka
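
The configuration later in this post refers to machines by these short hostnames (s1:9092, s1:2181), so every node must be able to resolve them. A minimal sketch, assuming name resolution is handled via /etc/hosts rather than DNS:

# Append to /etc/hosts on every node (IPs as listed above)
10.211.55.16 s1
10.211.55.17 s2
10.211.55.18 s3
10.211.55.19 s4
10.211.55.20 s5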

Installation:

1. Unpack

After unpacking, the directory layout is the usual Apache one; the configuration files, naturally, live under conf.
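
For reference, unpacking is a single tar command. The archive name below assumes Flume 1.8.0, which is an assumption; substitute whatever version you downloaded:

# Unpack the downloaded archive (version 1.8.0 assumed)
tar -zxvf apache-flume-1.8.0-bin.tar.gz
cd apache-flume-1.8.0-bin
ls conf    # the configuration templates live here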

2. Configure

Flume is configured differently from most software: an agent is defined entirely by its configuration file, so one configuration amounts to one kind of service. conf/ ships with templates, but for the specific source, channel, and sink settings it is best to consult the official user guide:

http://flume.apache.org/FlumeUserGuide.html

vim flume-kafka.properties

This is the demo given in the official guide; we can simply follow it and modify it:

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
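
As an aside, the stock demo can be smoke-tested on its own: the netcat source listens on port 44444, so after starting the agent with this file, sending it a line from a second terminal should show an event in the agent's console log:

# Send a test line to the netcat source; the logger sink prints it
echo "hello flume" | nc localhost 44444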

The modified configuration. One caveat: keep every # comment on its own line, because in a properties file an inline # is treated as part of the value and will break the setting.

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
#a1.sources.r1.type = netcat
#a1.sources.r1.bind = localhost
#a1.sources.r1.port = 44444

# Source type is exec: run a shell command
a1.sources.r1.type = exec
# tail a log file; every line written to it becomes an event and flows to the sink
a1.sources.r1.command = tail -F /home/test.log


# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# The Kafka topic name
a1.sinks.k1.kafka.topic = mxb
# The Kafka broker address (host:port)
a1.sinks.k1.kafka.bootstrap.servers = s1:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
# The channel type is memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
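
One thing worth noting before starting the agent: whether the mxb topic must already exist depends on the broker's auto.create.topics.enable setting. A minimal sketch for creating it explicitly, using the same ZooKeeper address as the consumer below (the replication factor and partition count are assumptions; tune them for your cluster):

cd /kafka/bin
./kafka-topics.sh --create --zookeeper s1:2181 --replication-factor 2 --partitions 3 --topic mxb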

3. Start

Run from the Flume home directory:

bin/flume-ng agent --conf conf --conf-file conf/flume-kafka.properties --name a1 -Dflume.root.logger=INFO,console
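
The -Dflume.root.logger=INFO,console flag keeps the agent in the foreground with logs on the console, which is handy for this test. For anything longer-lived, a common pattern (an assumption, not part of the original setup) is to background it:

# Keep the agent running after logout; logs go to flume.log
nohup bin/flume-ng agent --conf conf --conf-file conf/flume-kafka.properties --name a1 > flume.log 2>&1 &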

4. Verify

On one of the Kafka servers, start a Kafka console consumer:

cd /kafka/bin
./kafka-console-consumer.sh --zookeeper s1:2181 --from-beginning --topic mxb
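
The --zookeeper flag matches the older Kafka release this post was written against; on newer Kafka versions the console consumer talks to the brokers directly:

# Equivalent command on newer Kafka releases
./kafka-console-consumer.sh --bootstrap-server s1:9092 --from-beginning --topic mxb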

 

Since the Flume source tails a log file, all we have to do is write content to /home/test.log:

for ((i = 0; i < 5000; i++))
do
    echo "test$i" >> /home/test.log
done

Once the loop has run, the server running kafka-console-consumer will print the consumed messages (test0 through test4999).
