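Hadoop Pseudo-Distributed Deployment
These notes record a pseudo-distributed Hadoop deployment, set up for convenient testing:
Package download: http://archive.cloudera.com/cdh5/cdh/5/
I used CDH 5.4.5.
/etc/profile configuration: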
export JAVA_HOME=/home/jdk1.7.0_79
export JRE_HOME=/home/jdk1.7.0_79/jre
export HADOOP_HOME=/home/hadoop-2.6.0-cdh5.4.5
export HBASE_HOME=/home/hbase-1.0.0-cdh5.4.5
#export HADOOP_CONF_DIR=/home/hadoop-2.6.0-cdh5.4.5/etc/hadoop
export CLASSPATH=./:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib
export SCALA_HOME=/home/scala-2.11.6
#export SPARK_HOME=/home/hadoop/CDH5/spark-1.0.2-bin-hadoop2
export SPARK_HOME=/home/spark-1.6.0-bin-hadoop2.6
export SBT_HOME=/home/sbt
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$SCALA_HOME/bin:$SBT_HOME/bin:$HBASE_HOME/bin
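A typo in one of these paths usually only shows up later as a confusing startup error, so a quick existence check after reloading the profile can save time. The following is a minimal sketch; the function name check_homes is hypothetical and not part of any of the tools installed here.

```shell
#!/bin/sh
# check_homes (hypothetical helper): print every given directory that
# does not exist. Returns 0 when all exist, 1 otherwise.
check_homes() {
    missing=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Example, after `source /etc/profile`:
# check_homes "$JAVA_HOME" "$HADOOP_HOME" "$HBASE_HOME" \
#             "$SCALA_HOME" "$SPARK_HOME" "$SBT_HOME"
```

An unset variable expands to an empty string, which is also reported as missing.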
Hadoop configuration:
1. core-site.xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/home/hadoop/tmp</value>
  <description>Abase for other temporary directories.</description>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://node11:9000</value>
</property>
2. hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/hadoop/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/hadoop/tmp/dfs/data</value>
</property>
3. mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
4. yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
5. hadoop-env.sh, line 25
export JAVA_HOME=/home/jdk1.7.0_79
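The XML snippets for the four site files are only the property entries. In the actual files they have to sit inside a single <configuration> root element. As a rough sketch, a complete core-site.xml could be generated like this; the function name write_core_site and its parameters are hypothetical, and the values are the ones from these notes.

```shell
#!/bin/sh
# write_core_site DIR HOST PORT (hypothetical helper):
# write a complete core-site.xml into DIR, overwriting any existing file.
write_core_site() {
    conf_dir=$1
    host=$2
    port=$3
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$host:$port</value>
  </property>
</configuration>
EOF
}

# Example (back up the existing file first):
# write_core_site "$HADOOP_HOME/etc/hadoop" node11 9000
```

The same wrapper applies to hdfs-site.xml, mapred-site.xml, and yarn-site.xml.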
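Format the NameNode: hadoop namenode -format (Hadoop 2.x marks this form as deprecated in favor of hdfs namenode -format, but it still works)
If five Hadoop processes are then running (for example, as listed by jps), the installation succeeded.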
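The notes do not name the five processes. For a single-node Hadoop 2.x setup with YARN they are usually NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager; that list is an assumption here, as is the idea that the daemons were started with sbin/start-dfs.sh and sbin/start-yarn.sh. A small sketch for checking jps output against such a list (missing_daemons is a hypothetical helper name):

```shell
#!/bin/sh
# missing_daemons NAME... (hypothetical helper): read `jps` output on
# stdin and print each expected daemon name that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # "NameNode" is not satisfied by "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Example:
# jps | missing_daemons NameNode DataNode SecondaryNameNode \
#                       ResourceManager NodeManager
```

Empty output means every expected daemon was found.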
Part 2: HBase configuration
1. hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://node11:9000/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>node11</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
2. In the regionservers file, set this machine's hostname:
node11
3. hbase-env.sh
Set the JDK at line 29:
export JAVA_HOME=/home/jdk1.7.0_79
4. Start HBase with start-hbase.sh
If the following three processes appear, the installation succeeded:
28275 HQuorumPeer
28351 HMaster
28473 HRegionServer
Test:
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
hbase(main):006:0> scan 'test'
ROW     COLUMN+CELL
 row1   column=cf:a, timestamp=1421762485768, value=value1
 row2   column=cf:b, timestamp=1421762491785, value=value2
 row3   column=cf:c, timestamp=1421762496210, value=value3
3 row(s) in 0.0230 seconds
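The session starts at prompt 003, so the step that created the table is not shown. A put only works once the table and its column family exist, so presumably a create was run first. The sketch below prints a script that creates the table and then repeats the puts and the scan; hbase_test_script is a hypothetical helper name, and the create line is an assumption about the missing step.

```shell
#!/bin/sh
# hbase_test_script TABLE FAMILY (hypothetical helper): print HBase shell
# commands that create the table, insert three rows, and scan it.
hbase_test_script() {
    table=$1
    family=$2
    cat <<EOF
create '$table', '$family'
put '$table', 'row1', '$family:a', 'value1'
put '$table', 'row2', '$family:b', 'value2'
put '$table', 'row3', '$family:c', 'value3'
scan '$table'
EOF
}

# Example: feed the commands to the HBase shell on stdin.
# hbase_test_script test cf | hbase shell
```

If the table already exists, the create command fails while the remaining commands still run.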
Spark pseudo-distributed setup
Configuration file: spark-env.sh
export SCALA_HOME=/home/scala-2.11.6
export JAVA_HOME=/home/jdk1.7.0_79
export SPARK_MASTER_IP=192.168.220.136
export SPARK_WORKER_MEMORY=1025m
export master=spark://192.168.220.136:7070
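One detail worth double-checking is the port in the master URL. Spark's standalone master listens on 7077 unless SPARK_MASTER_PORT says otherwise, and this spark-env.sh does not set that variable, so a URL ending in 7070 may not match the running master. A small sketch for building the URL consistently (spark_master_url is a hypothetical helper name):

```shell
#!/bin/sh
# spark_master_url HOST [PORT] (hypothetical helper): build a standalone
# master URL; PORT falls back to 7077, Spark's default master port.
spark_master_url() {
    host=$1
    port=${2:-7077}
    echo "spark://$host:$port"
}

# Example: reuse the variables from spark-env.sh so both stay in sync.
# spark_master_url "$SPARK_MASTER_IP" "$SPARK_MASTER_PORT"
```

The URL that the master actually uses is also printed at the top of its web UI and in its startup log.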
In the slaves file, add this machine's hostname: node
Start:
In the sbin directory: ./start-all.sh
Resulting processes:
33340 Worker
33388 Jps
3662 ResourceManager
3243 NameNode
3512 SecondaryNameNode
3324 DataNode
33279 Master
Test:
From the bin directory, run: ./run-example org.apache.spark.examples.SparkPi