Setting up a Hadoop standalone-mode environment
I. Install the JDK
Download the appropriate JDK version and install it; my install directory is /usr/lib/jdk1.8.0_40.
After the download finishes, set the environment variables in /etc/profile by appending the following to the end of the file:
export JAVA_HOME=/usr/lib/jdk1.8.0_40
export JRE_HOME=/usr/lib/jdk1.8.0_40/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
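To check that the new variables are picked up in a fresh shell, you can re-export and inspect them (a minimal sketch; the JDK path is the install location assumed above, so adjust it to yours):

```shell
# JDK path matches the install location used in this guide; adjust as needed.
export JAVA_HOME=/usr/lib/jdk1.8.0_40
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
# JAVA_HOME should resolve and its bin directory should be on the PATH:
echo "JAVA_HOME=$JAVA_HOME"
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *)                    echo "PATH was not updated" ;;
esac
```

With the JDK actually installed at that path, `java -version` should then report version 1.8.0_40.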
II. Install SSH
sudo apt-get install ssh
SSH is mainly used to manage the remote daemons. Since this is standalone mode, it is not strictly required.
III. Download Hadoop
http://hadoop.apache.org/releases.html
A stable release is recommended; I downloaded hadoop-2.6.4 and placed it under /usr/local/.
Hadoop is written in Java and needs a Java runtime to work, so before running it we must tell Hadoop where the JDK lives and put the Hadoop commands on the PATH via environment variables.
1. Edit ~/.bashrc to set environment variables for the hadoop login user:
export JAVA_HOME=/usr/lib/jdk1.8.0_40
export HADOOP_INSTALL=/usr/local/hadoop-2.6.4
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
After editing, run
source ~/.bashrc
to make the new environment variables take effect.
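Before moving on, it can help to confirm that all the Hadoop variables point at the same install tree (a minimal sketch using the paths assumed in this guide):

```shell
# Paths match the install locations used in this guide; adjust as needed.
export JAVA_HOME=/usr/lib/jdk1.8.0_40
export HADOOP_INSTALL=/usr/local/hadoop-2.6.4
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin:$JAVA_HOME/bin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
# Each *_HOME variable should equal HADOOP_INSTALL:
for v in HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME; do
  eval val=\$$v
  if [ "$val" = "$HADOOP_INSTALL" ]; then
    echo "ok: $v"
  else
    echo "mismatch: $v=$val"
  fi
done
```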
2. Configure Hadoop
Open hadoop-env.sh under /usr/local/hadoop-2.6.4/etc/hadoop/ and add:
export JAVA_HOME=/usr/lib/jdk1.8.0_40
export JRE_HOME=/usr/lib/jdk1.8.0_40/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
At this point Hadoop standalone mode is configured.
Run
./bin/hadoop version
and you should see output like the following:
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /usr/local/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar
This confirms that Hadoop is configured correctly.
Next, let's verify the setup by running one of the MapReduce example jobs that ship with Hadoop.
1. Create an input folder under the Hadoop directory and copy the configuration files from etc/hadoop into it as test data:
mkdir input
cp etc/hadoop/* input/
2. Run the job.
From the Hadoop directory, run:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep input output '[a-z.]+'
This uses the bundled examples jar to run the grep job, which counts how many times each string matching the regular expression '[a-z.]+' occurs in the input files.
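What the grep job computes can be approximated locally with standard GNU tools (a sketch on a throwaway sample file, not the real MapReduce job):

```shell
# Build a tiny sample file standing in for the copied config files.
demo=$(mktemp -d)
printf 'dfs.replication\ndfs.replication\nfs.defaultFS\n' > "$demo/sample.txt"
# Count occurrences of each match of the regex, most frequent first --
# the same shape of result the grep example writes to output/.
grep -oE '[a-z.]+' "$demo/sample.txt" | sort | uniq -c | sort -rn
# prints:  2 dfs.replication
#          1 fs.default
```

Note that 'fs.defaultFS' only matches up to 'fs.default', because the character class [a-z.] does not include uppercase letters.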
You should see run information like the following:
File System Counters
        FILE: Number of bytes read=632564
        FILE: Number of bytes written=1415622
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
Map-Reduce Framework
        Map input records=1151
        Map output records=1151
        Map output bytes=22396
        Map output materialized bytes=24704
        Input split bytes=126
        Combine input records=0
        Combine output records=0
        Reduce input groups=70
        Reduce shuffle bytes=24704
        Reduce input records=1151
        Reduce output records=1151
        Spilled Records=2302
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=667942912
Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
File Input Format Counters
        Bytes Read=32250
File Output Format Counters
        Bytes Written=15798
This means the job ran successfully.
View the results with:
cat output/*
To run the job again, you must first delete the output folder with rm -r output/; Hadoop refuses to write to an output directory that already exists.
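The delete-then-rerun pattern can be sketched with a scratch directory standing in for output/ (the real job command stays exactly as above):

```shell
# Stand-in for the job's output directory from the previous run; a real
# MapReduce run leaves part files and a _SUCCESS marker behind.
outdir=$(mktemp -d)/output
mkdir -p "$outdir"
touch "$outdir/part-r-00000" "$outdir/_SUCCESS"
# Equivalent of `rm -r output/` before re-running the job:
rm -r "$outdir"
[ ! -d "$outdir" ] && echo "output removed; safe to re-run"
```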