Installing Flume on a CDH 5.3.6 Cluster
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.
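The web server produces logs. The Source picks up the log files from a specific directory and puts the log data into the Channel, and the Sink writes it to HDFS.
Source → Channel: data cleansing can be applied at this step.
Channel → Sink: data cleansing can be applied at this step.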
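The flow above can be sketched as a single agent definition. This is only a minimal sketch and not the configuration used later in this post: the agent name a2, the spool directory /var/log/webserver/spool and the HDFS path hdfs://master:9000/flume/weblogs are assumptions for illustration. The regex_filter interceptor shows one way to clean data between the Source and the Channel, here by dropping lines that contain DEBUG.

```shell
# Write a hypothetical agent definition: log directory -> memory channel -> HDFS.
# Usage: write_weblog_conf conf/weblog.conf
write_weblog_conf() {
  cat > "$1" <<'EOF'
# Agent name "a2" and all paths below are assumptions for illustration.
a2.sources = r1
a2.channels = c1
a2.sinks = k1

# Source: pick up completed log files dropped into a directory
a2.sources.r1.type = spooldir
a2.sources.r1.spoolDir = /var/log/webserver/spool
a2.sources.r1.channels = c1

# Cleansing between Source and Channel: drop events matching the regex
a2.sources.r1.interceptors = i1
a2.sources.r1.interceptors.i1.type = regex_filter
a2.sources.r1.interceptors.i1.regex = .*DEBUG.*
a2.sources.r1.interceptors.i1.excludeEvents = true

# Channel: buffer events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Sink: write events to HDFS as plain text
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://master:9000/flume/weblogs
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.channel = c1
EOF
}
```

Calling write_weblog_conf conf/weblog.conf only writes the file; the agent would still have to be started with -n a2 and -f conf/weblog.conf.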
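The Event is the basic unit of data transfer in Flume.
Flume moves data from its origin to its final destination in the form of events.
An Event consists of optional headers and a byte array that carries the data.
The data it carries is opaque to Flume.
The headers are an unordered collection of key-value string pairs; each key is unique within the collection.
Headers can be extended and used for contextual routing.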
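To make header-based routing concrete, here is a hedged sketch of a configuration fragment. It assumes that each log line starts with a level such as INFO, WARN or ERROR, and that two channels c1 and c2 are defined elsewhere in the same file; the header name level and the file name are hypothetical. A regex_extractor interceptor copies the level into a header, and a multiplexing channel selector routes on that header without looking at the body.

```shell
# Write a hypothetical routing fragment for agent a1.
# Usage: write_routing_conf conf/routing-fragment.conf
write_routing_conf() {
  cat > "$1" <<'EOF'
# Assumptions: agent "a1", channels c1 and c2 defined elsewhere, header name "level".
a1.sources.r1.channels = c1 c2

# Copy the leading log level of each line into an event header called "level"
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
a1.sources.r1.interceptors.i1.regex = ^(INFO|WARN|ERROR)
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.name = level

# Route on the header value; events without a mapping go to the default channel
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = level
a1.sources.r1.selector.mapping.ERROR = c2
a1.sources.r1.selector.default = c1
EOF
}
```

With this fragment, ERROR events would land in c2 and everything else in c1, so a separate sink could be attached to each channel.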
Prerequisites:
* Runs on the machine where the logs are produced
* Runs on Linux
* Requires a JVM
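A quick way to check these prerequisites on a host is sketched below. It is only a convenience helper and not part of Flume; LOG_DIR is a hypothetical variable naming the directory whose logs you want to collect.

```shell
# Report whether the prerequisites listed above appear to be met on this host.
check_flume_prereqs() {
  # Operating system name, expected to be Linux
  echo "OS: $(uname -s)"

  # A JVM must be reachable; here we only look for java on the PATH
  if command -v java >/dev/null 2>&1; then
    echo "java: found at $(command -v java)"
  else
    echo "java: not found on PATH"
  fi

  # LOG_DIR is a hypothetical variable pointing at the logs to collect
  if [ -n "${LOG_DIR:-}" ] && [ -d "$LOG_DIR" ]; then
    echo "log dir: $LOG_DIR exists"
  else
    echo "log dir: not set or missing"
  fi
}
```

For example, LOG_DIR=/var/log check_flume_prereqs prints one line per prerequisite.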
Extract the archive:
tar -zxvf flume-ng-1.5.0-cdh5.3.6.tar.gz
mv apache-flume-1.5.0-cdh5.3.6-bin/ flume-1.5.0-cdh5.3.6
cd /home/hadoop/CDH5.3.6/flume-1.5.0-cdh5.3.6/conf
cp flume-env.sh.template flume-env.sh
vi flume-env.sh
export JAVA_HOME=/usr/local/jdk1.8
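The manual steps above can be collected into one function. This is a sketch under the assumption that the archive unpacks to apache-flume-1.5.0-cdh5.3.6-bin, as the mv command in this post indicates; the function name and its arguments are illustrative. Instead of opening vi, it appends the JAVA_HOME line to flume-env.sh.

```shell
# install_flume TARBALL PARENT_DIR JAVA_HOME_PATH
# Unpacks the tarball under PARENT_DIR, renames the directory, creates
# conf/flume-env.sh from the template and appends the JAVA_HOME export.
# Prints the final installation directory on success.
install_flume() {
  local tarball="$1" parent="$2" java_home="$3"
  local target="$parent/flume-1.5.0-cdh5.3.6"

  tar -xzf "$tarball" -C "$parent" || return 1
  mv "$parent/apache-flume-1.5.0-cdh5.3.6-bin" "$target" || return 1
  cp "$target/conf/flume-env.sh.template" "$target/conf/flume-env.sh" || return 1
  printf 'export JAVA_HOME=%s\n' "$java_home" >> "$target/conf/flume-env.sh" || return 1

  echo "$target"
}

# Example call matching the paths used in this post (not executed here):
#   install_flume flume-ng-1.5.0-cdh5.3.6.tar.gz /home/hadoop/CDH5.3.6 /usr/local/jdk1.8
```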
[hadoop@master flume-1.5.0-cdh5.3.6]$ bin/flume-ng
Usage: bin/flume-ng <command> [options]...

commands:
  agent                   run a Flume agent
  avro-client             run an avro Flume client
  version                 show Flume version info

global options:
  --conf,-c <conf>        use configs in <conf> directory
  -Dproperty=value        sets a Java system property value

agent options:
  --name,-n <name>        the name of this agent (required)
  --conf-file,-f <file>   specify a config file (required if -z missing)
Run command (long and short option forms are equivalent):
bin/flume-ng agent --conf conf --name agent-test --conf-file test.conf
bin/flume-ng agent -c conf -n agent-test -f test.conf
Create a configuration file, a.conf:
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'a1'

### define agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

### define sources
a1.sources.r1.type = netcat
a1.sources.r1.bind = master
a1.sources.r1.port = 44444

### define channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

### define sink
a1.sinks.k1.type = logger
a1.sinks.k1.maxBytesToLog = 2014

### bind the source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
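One detail that is easy to get wrong: the name passed with -n has to match the prefix used in the properties file (a1 here). As far as I can tell, a mismatch does not stop the agent from starting; it simply comes up without any source, channel or sink. The helper below is a small sketch, not a Flume tool, that lists the agent names a file defines so the name can be checked before launching.

```shell
# Print the agent names defined in a Flume properties file, one per line.
# An agent is recognised by its "<name>.sources = ..." line.
list_agents() {
  sed -n 's/^[[:space:]]*\([A-Za-z0-9_-]\{1,\}\)\.sources[[:space:]]*=.*/\1/p' "$1" | sort -u
}

# agent_defined NAME FILE
# Succeeds only if NAME is one of the agents defined in FILE.
agent_defined() {
  list_agents "$2" | grep -qx -- "$1"
}
```

For example, agent_defined a1 conf/a.conf && bin/flume-ng agent -c conf -n a1 -f conf/a.conf starts the agent only when the name matches.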
Install xinetd and telnet:
[root@master telnet]# ll
total 224
----rwxr-x. 1 hadoop hadoop  59120 Jun 22 23:49 telnet-0.17-47.el6_3.1.x86_64.rpm
----rwxr-x. 1 hadoop hadoop  37748 Jun 22 23:49 telnet-server-0.17-47.el6_3.1.x86_64.rpm
----rwxr-x. 1 hadoop hadoop 124280 Jun 22 23:49 xinetd-2.3.14-38.el6.x86_64.rpm
[root@master telnet]# rpm -ivh *.rpm
warning: telnet-0.17-47.el6_3.1.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID c105b9de: NOKEY
Preparing...                ########################################### [100%]
   1:xinetd                 ########################################### [ 33%]
   2:telnet-server          ########################################### [ 67%]
   3:telnet                 ########################################### [100%]
[root@master telnet]#
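Start the service:
[root@master telnet]# /etc/rc.d/init.d/xinetd restart
Stopping xinetd:                                           [FAILED]
Starting xinetd:                                           [ OK ]
[root@master telnet]#
The FAILED on the stop step is most likely just because xinetd was not running yet; what matters is that the start step reports OK.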
Run command (note that there is no space after -D, and the appender name is console):
bin/flume-ng agent -c conf -n a1 -f conf/a.conf -Dflume.root.logger=DEBUG,console
Check the port (the netcat source should be listening on 44444):
[root@master flume-1.5.0-cdh5.3.6]# netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address  State   PID/Program name
tcp        0      0 192.168.1.30:19888          0.0.0.0:*        LISTEN  3735/java
tcp        0      0 0.0.0.0:10033               0.0.0.0:*        LISTEN  3735/java
tcp        0      0 0.0.0.0:50070               0.0.0.0:*        LISTEN  2715/java
tcp        0      0 0.0.0.0:22                  0.0.0.0:*        LISTEN  1924/sshd
tcp        0      0 127.0.0.1:631               0.0.0.0:*        LISTEN  1685/cupsd
tcp        0      0 127.0.0.1:25                0.0.0.0:*        LISTEN  2299/master
tcp        0      0 0.0.0.0:50010               0.0.0.0:*        LISTEN  2815/java
tcp        0      0 0.0.0.0:50075               0.0.0.0:*        LISTEN  2815/java
tcp        0      0 192.168.1.30:10020          0.0.0.0:*        LISTEN  3735/java
tcp        0      0 0.0.0.0:50020               0.0.0.0:*        LISTEN  2815/java
tcp        0      0 192.168.1.30:9000           0.0.0.0:*        LISTEN  2715/java
tcp        0      0 192.168.1.30:50090          0.0.0.0:*        LISTEN  2990/java
tcp        0      0 :::22                       :::*             LISTEN  1924/sshd
tcp        0      0 ::1:631                     :::*             LISTEN  1685/cupsd
tcp        0      0 ::1:25                      :::*             LISTEN  2299/master
tcp        0      0 ::ffff:192.168.1.30:44444   :::*             LISTEN  17488/java
tcp        0      0 :::3306                     :::*             LISTEN  2152/mysqld
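Rather than scanning the table by eye, the check can be scripted. The function below is a sketch that reads netstat -tnl style output on standard input and succeeds if the given port is in the LISTEN state; the function name is my own, and it assumes the column layout shown in this post (local address in column 4, state in column 6).

```shell
# is_listening PORT
# Reads `netstat -tnl` (or -tnlp) output on stdin and exits 0 if PORT is listening.
is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" {
      # The port is the last colon-separated field of the local address,
      # which covers both 0.0.0.0:22 and ::ffff:192.168.1.30:44444
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}
```

Usage: netstat -tnl | is_listening 44444 && echo "netcat source is up".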
If the agent is not already running in the foreground, start it:
bin/flume-ng agent -c conf -n a1 -f conf/a.conf -Dflume.root.logger=DEBUG,console
In another console:
[hadoop@master ~]$ telnet master 44444
Trying 192.168.1.30...
Connected to master.
Escape character is '^]'.
hello flume
OK
hello world
OK
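If installing telnet is not an option, the same test can be scripted. The sketch below relies on bash's /dev/tcp redirection, so it assumes bash rather than a plain POSIX shell; the function name is hypothetical. It sends one line to the netcat source and prints the reply line, which in the session above is OK.

```shell
# send_event HOST PORT MESSAGE
# Sends one line to a Flume netcat source and prints the reply line.
# Requires bash: /dev/tcp/HOST/PORT is a bash feature, not a real file.
# Returns non-zero if the connection fails or no reply arrives within 5 seconds.
send_event() {
  local host="$1" port="$2" msg="$3" reply
  {
    printf '%s\n' "$msg" >&3
    IFS= read -r -t 5 reply <&3 && printf '%s\n' "$reply"
  } 3<>"/dev/tcp/$host/$port"
}
```

Usage: send_event master 44444 "hello flume".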
The console where the agent is running will then show: