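Cloudera notes
1: Cloudera platform setup
Services started after launch
Services started on the master host after all three machines are running
After startup, install Kafka first, then test HDFS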
hdfs dfs -mkdir /test
hdfs dfs -put words /test
hadoop jar /opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar wordcount /test/words /test/output
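To cross-check the job's result, the same counts can be computed locally on the input file. This is a minimal sketch, assuming `words` is a plain text file of whitespace-separated words; `local_wordcount` is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: count words in a local file and print one
# "word<TAB>count" line per distinct word, sorted by word in byte order.
local_wordcount() {
  awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
       END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$1" |
    LC_ALL=C sort
}

# Usage on the same input that was uploaded to /test (skipped if absent).
if [ -f words ]; then
  local_wordcount words | head
fi
```

The result can be compared with the job output, for example via `hdfs dfs -cat /test/output/part-r-*`. With more than one reducer the output is split across several files, so the overall ordering may differ.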
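The job runs, but the memory overhead is very high
After the services are stopped, processes are still running
Sometimes the web UI cannot be logged into
Tried tcpdump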
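One quick way to see which processes hold the memory, and whether anything survived a service shutdown, is to rank processes by resident memory. A minimal sketch; `rank_by_rss` is a hypothetical helper, and it assumes `ps` supports the `rss` and `comm` columns (as on Linux).

```shell
# Hypothetical helper: read "PID RSS_KB COMMAND" lines on stdin and print the
# N largest by resident memory (default 5), converted to MB.
# Only the first word of the command is kept.
rank_by_rss() {
  sort -k2,2nr | head -n "${1:-5}" |
    awk '{ printf "%s\t%.1f MB\t%s\n", $1, $2 / 1024, $3 }'
}

# Header-less ps output, largest processes first.
if command -v ps >/dev/null 2>&1; then
  ps -e -o pid= -o rss= -o comm= | rank_by_rss 5
fi
```

Running this before and after stopping the services shows which processes, if any, are left behind.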
2: Testing
Trying Kafka
kafka-topics --create --zookeeper node03:2181/kafka --replication-factor 1 --partitions 1 --topic wordcount
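node03 is used because the ZooKeeper leader is on node03. The command fails:
Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
19/11/26 00:50:34 ERROR admin.TopicCommand$: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.
The cause turned out to be a mistake in the command: the /kafka suffix on the ZooKeeper address. Without it, the command works: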
kafka-topics --create --zookeeper node03:2181 --replication-factor 1 --partitions 1 --topic wordcount
# Reports success
INFO admin.AdminUtils$: Topic creation {"version":1,"partitions":{"0":[68]}}
Created topic "wordcount".
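The failing and the working command differ only in the /kafka chroot, so it helps to check where the brokers are actually registered. Kafka brokers register under `<chroot>/brokers/ids` in ZooKeeper. The sketch below assumes the CDH `zookeeper-client` wrapper is on the PATH; `broker_ids_cmd` is a hypothetical helper that only builds the command line.

```shell
# Hypothetical helper: print the zookeeper-client command that lists broker
# ids under a given chroot.
#   $1 = ZooKeeper host:port
#   $2 = chroot ("" for the root, or e.g. "/kafka")
broker_ids_cmd() {
  printf 'zookeeper-client -server %s ls %s/brokers/ids\n' "$1" "$2"
}

for chroot in "" "/kafka"; do
  cmd=$(broker_ids_cmd node03:2181 "$chroot")
  echo "$cmd"
  # Only run the command where the wrapper is installed (an assumption
  # about the host; on other machines the command is just printed).
  if command -v zookeeper-client >/dev/null 2>&1; then
    $cmd
  fi
done
```

Whichever path lists broker ids is the one to pass to --zookeeper. In the run above the root path worked, and the log shows the partition placed on broker 68.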
List all topics
kafka-topics --zookeeper node03:2181 --list
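After powering off and restarting every machine at this point, the topic still exists, so topic metadata survives a full power cycle, a good sign for cluster stability
Simple Kafka test: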
kafka-console-producer --broker-list node02:9092,node03:9092 --topic wordcount
kafka-console-consumer --zookeeper node01:2181 --topic wordcount --from-beginning
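The producer and consumer commands above can be combined into a non-interactive round trip. A minimal sketch, assuming the same hosts and topic as above; `gen_messages` is a hypothetical helper, and the message text is arbitrary.

```shell
# Hypothetical helper: emit N numbered test messages, one per line.
gen_messages() {
  i=1
  while [ "$i" -le "$1" ]; do
    echo "test-message-$i"
    i=$((i + 1))
  done
}

# Only talk to the cluster where the Kafka CLI tools are installed.
if command -v kafka-console-producer >/dev/null 2>&1; then
  gen_messages 3 |
    kafka-console-producer --broker-list node02:9092,node03:9092 --topic wordcount
  # --max-messages makes the consumer exit instead of waiting forever.
  kafka-console-consumer --zookeeper node01:2181 --topic wordcount \
    --from-beginning --max-messages 3
fi
```

Because of --from-beginning, the consumer prints the first three messages in the topic, which may be older ones if the topic was already in use.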
3: Installing Kafka Manager
https://www.jianshu.com/p/f65e76efe895
https://github.com/yahoo/kafka-manager
Test:
kafka-topics --zookeeper node01:2181 --list (also works on host node03)
Installation succeeded
4: Installing Redis and Spark 2:
https://blog.csdn.net/silentwolfyh/article/details/83818525
Launch: spark2-shell, pyspark2
5: Flume
https://blog.51cto.com/douya/1860390