Hadoop single-node deployment

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html

https://www.cnblogs.com/ee900222/p/hadoop_1.html

 

1. Install the JDK and set environment variables

mkdir /usr/java

tar xf jdk1.8.0_221.tar.gz -C /usr/java

 

vi /etc/profile

...

export JAVA_HOME=/usr/java/jdk1.8.0_221

export PATH=$JAVA_HOME/bin:$PATH:$HOME/bin
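
A quick check that the variables took effect can save debugging later. The sketch below assumes a POSIX shell; check_java_home is a hypothetical helper name, not something shipped with the JDK or Hadoop.

```shell
# check_java_home DIR: succeed only if DIR looks like a JDK root,
# i.e. DIR is non-empty and DIR/bin/java exists and is executable.
# (Hypothetical helper, for illustration only.)
check_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example use after reloading the profile (for instance with `. /etc/profile`):
if check_java_home "${JAVA_HOME:-}"; then
  echo "JAVA_HOME looks valid: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or does not contain bin/java" >&2
fi
```

If the check fails, the tar extraction path and the JAVA_HOME value most likely disagree.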

 

Download Hadoop

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz

 

tar xf hadoop-2.7.0.tar.gz -C /home/centos

ln -s /home/centos/hadoop-2.7.0 /home/centos/hadoop

 

Set the Hadoop environment variables

vi .bash_profile

export HADOOP_HOME=/home/centos/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

 

source .bash_profile
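
When the setup is scripted instead of edited by hand in vi, the export lines can be appended idempotently so that repeated runs do not duplicate them. This is a minimal sketch; append_once is a hypothetical helper name.

```shell
# append_once LINE FILE: append LINE to FILE unless an identical line
# is already present. (Hypothetical helper, for illustration only.)
append_once() {
  touch "$2"
  # -x matches whole lines, -F treats LINE as a fixed string, not a regex.
  grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example use (single quotes keep $PATH and $HADOOP_HOME unexpanded in the file):
#   append_once 'export HADOOP_HOME=/home/centos/hadoop' ~/.bash_profile
#   append_once 'export PATH=$PATH:$HADOOP_HOME/bin' ~/.bash_profile
```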

 

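Verify the installation. The job below uses the examples jar bundled with Hadoop to find and count the strings in input that match the regular expression dfs[a-z.]+.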

cd hadoop

mkdir input

cp etc/hadoop/*.xml input

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'

cat output/*
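
To sanity-check the job's result, roughly the same computation can be done locally with standard tools. The sketch below is not how Hadoop implements the example; it only assumes grep with the -o option (GNU or BSD), and count_matches is a hypothetical helper name.

```shell
# count_matches DIR [PATTERN]: print every distinct string in DIR/*.xml that
# matches PATTERN (default: dfs[a-z.]+) with its count, most frequent first.
# (Hypothetical helper, for illustration only.)
count_matches() {
  grep -ohE -- "${2:-dfs[a-z.]+}" "$1"/*.xml \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{ print $1 "\t" $2 }'
}

# Example use from the hadoop directory:
#   count_matches input
```

The counts should agree with the contents of output/*, since both scan the same XML files with the same pattern.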

 

2. Configure pseudo-distributed mode

Edit etc/hadoop/core-site.xml:

<configuration>

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://safe01:9000</value>

    </property>

</configuration>

 

Edit etc/hadoop/hdfs-site.xml:

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

</configuration>
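
After editing, it helps to read the values back and confirm there are no typos. With Hadoop on the PATH, hdfs getconf -confKey fs.defaultFS prints the effective value. The sketch below is a rough stand-alone alternative that only handles the simple one-tag-per-line layout used above; get_hadoop_property is a hypothetical helper name and is not a real XML parser.

```shell
# get_hadoop_property FILE NAME: print the <value> that follows
# <name>NAME</name> in a Hadoop *-site.xml file.
# Assumes each <name>/<value> element sits on a single line.
# (Hypothetical helper, for illustration only.)
get_hadoop_property() {
  awk -v name="$2" '
    /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
    /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                if (n == name) print v }
  ' "$1"
}

# Example use from the hadoop directory:
#   get_hadoop_property etc/hadoop/core-site.xml fs.defaultFS
#   get_hadoop_property etc/hadoop/hdfs-site.xml dfs.replication
```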

 

Configure passwordless SSH

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 0600 ~/.ssh/authorized_keys

 

Alternatively, configure passwordless SSH with the following commands

ssh-keygen -t rsa

ssh-copy-id safe01
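
Whichever variant is used, the result can be checked before starting any daemons. The sketch below verifies that the public key is present in authorized_keys and that the file mode is 600; check_authorized_key is a hypothetical helper name.

```shell
# check_authorized_key PUBKEY AUTHFILE: succeed if every line of PUBKEY
# appears verbatim in AUTHFILE and AUTHFILE has mode 600 (-rw-------).
# (Hypothetical helper, for illustration only.)
check_authorized_key() {
  pub="$1"
  auth="$2"
  [ -f "$pub" ] && [ -f "$auth" ] || return 1
  # -x whole-line match, -F fixed strings, -f read patterns from PUBKEY.
  grep -qxFf "$pub" "$auth" || return 1
  # The first 10 characters of ls -ld output are the type and permission bits.
  [ "$(ls -ld "$auth" | cut -c1-10)" = "-rw-------" ]
}

# Example use:
#   check_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# An end-to-end check is to log in without a password prompt:
#   ssh -o BatchMode=yes safe01 true
```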

 

Run a MapReduce job (format the HDFS filesystem first)

hdfs namenode -format

 

Start the NameNode and DataNode

sbin/start-dfs.sh

 

Check with the jps command; the output should list a NameNode and a DataNode (start-dfs.sh also starts a SecondaryNameNode)
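
The jps check can also be scripted. The sketch below reads jps-style output (one "PID ClassName" pair per line) from standard input and prints the expected daemons that are missing; missing_daemons is a hypothetical helper name.

```shell
# missing_daemons NAME...: read "PID ClassName" lines on stdin and print
# each NAME that does not appear as a class name. No output means all are up.
# (Hypothetical helper, for illustration only.)
missing_daemons() {
  listing=$(cat)
  for name in "$@"; do
    printf '%s\n' "$listing" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || printf '%s\n' "$name"
  done
}

# Example use after sbin/start-dfs.sh:
#   jps | missing_daemons NameNode DataNode
```

If a name is printed, the corresponding log file under the logs directory of the Hadoop installation is the place to look next.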

 

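The NameNode web UI is at http://ip:50070

Create the HDFS directories required to run MapReduce jobs and copy the input files into HDFS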

cd hadoop

hdfs dfs -mkdir /user

hdfs dfs -mkdir /user/centos

hdfs dfs -put etc/hadoop /user/centos/input

hadoop fs -ls /user/centos/input

Run the Hadoop job

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep /user/centos/input output 'dfs[a-z.]+'

View the results

hdfs dfs -cat output/*
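
The output argument here is a relative path. HDFS resolves relative paths against the current user's home directory, /user/<username>, which is why the job needs /user/centos to exist. A minimal sketch of that rule, with resolve_hdfs_path as a hypothetical helper name and the user name passed in explicitly:

```shell
# resolve_hdfs_path PATH USER: print the absolute HDFS path that a relative
# PATH refers to for USER; absolute paths are returned unchanged.
# (Hypothetical helper that mirrors the resolution rule, for illustration only.)
resolve_hdfs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '/user/%s/%s\n' "$2" "$1" ;;
  esac
}

# Example use:
#   resolve_hdfs_path output centos             # /user/centos/output
#   resolve_hdfs_path /user/centos/input centos # unchanged
```

So hdfs dfs -cat output/* run as the centos user reads /user/centos/output/* in HDFS, not the local output directory created in the stand-alone test.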

 

Configure YARN

Edit etc/hadoop/mapred-site.xml (in Hadoop 2.7.0, first copy it from etc/hadoop/mapred-site.xml.template):

<configuration>

    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

    <property>

        <name>mapreduce.application.classpath</name>

        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>

    </property>

</configuration>

 

Edit etc/hadoop/yarn-site.xml:

<configuration>

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

    <property>

        <name>yarn.nodemanager.env-whitelist</name>

        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>

    </property>

</configuration>

 

Start the ResourceManager and NodeManager

sbin/start-yarn.sh

 

The ResourceManager web UI is at http://ip:8088/

posted @ 2022-08-03 14:19  leiuk