Spark Streaming——Flume实例
Flume 官网http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html
此文章共有三个实例:
crtl+c后停止flume
实例一直接监控端口
配置文件
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = dblab-VirtualBox
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
启动agent:
flume-ng agent \
-n a1 \
-c $FLUME_HOME/conf \
-f $FLUME_HOME/conf/example.conf \
-Dflume.root.logger=INFO,console
实例二监控文件
配置文件
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/data/data.log
a1.sources.r1.shell = /bin/sh -c
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
启动agent:
flume-ng agent \
-n a1 \
-c $FLUME_HOME/conf \
-f $FLUME_HOME/conf/exec-memory-logger.conf \
-Dflume.root.logger=INFO,console
目前是sink到控制台a1.sinks.k1.type = logger 如果是离线处理的话可以sink到HDFS
书写格式
实例三Flume实战
机器配置图
技术选型:
A: exec source +memory channel +avro sink
B: avro source +memory channel +logger sink
配置文件
exec-memory-avro.conf
exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel
exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F /home/hadoop/data/data.log
exec-memory-avro.sources.exec-source.shell = /bin/sh -c
exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname = dblab-VirtualBox
exec-memory-avro.sinks.avro-sink.port = 44444
exec-memory-avro.channels.memory-channel.type = memory
exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel
avro-memory-logger.conf
avro-memory-logger.sources = avro-source
avro-memory-logger.sinks = logger-sink
avro-memory-logger.channels = memory-channel
avro-memory-logger.sources.avro-source.type = avro
avro-memory-logger.sources.avro-source.bind= dblab-VirtualBox
avro-memory-logger.sources.avro-source.port=44444
avro-memory-logger.sinks.logger-sink.type = logger
avro-memory-logger.channels.memory-channel.type = memory
avro-memory-logger.sources.avro-source.channels = memory-channel
avro-memory-logger.sinks.logger-sink.channel = memory-channel
启动agent:
先启动avro-memory-logger.conf wang-one启动
flume-ng agent --name avro-memory-logger \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/avro-memory-logger.conf -Dflume.root.logger=INFO,console
再启动exec-memory-avro.conf wang-two启动
flume-ng agent --name exec-memory-avro \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
-Dflume.root.logger=INFO,console
exec-memory-avro.conf
avro-memory-logger.conf