Hadoop Pseudo-Distributed Deployment

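I consulted many articles, but differences in environment meant the problems I ran into were not quite the same as theirs, so I am recording my own deployment here.
Environment: Tencent Cloud, CentOS 7
Software version: Hadoop 2.7.7 (HBase 2.1.5 + Phoenix 5.0.0-HBase-2.0 planned for later)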

The deployment procedure is well covered online, so I only list the problems I hit and my configuration.

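I started with the most basic configuration. After all daemons were up, a MapReduce job got stuck at "Running job", and from there I began changing various settings.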
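When a job hangs at "Running job", it can help to check which daemons are alive and whether YARN has a registered NodeManager before touching any configuration. The sketch below is an assumption about how one might inspect this, not what was actually run here. `diagnose_stuck_job` is a hypothetical helper name, while `jps`, `yarn node -list` and `yarn application -list` are the stock JDK and Hadoop 2.x commands.

```shell
# Hypothetical helper: summarize daemon and YARN state for a job stuck at "Running job".
diagnose_stuck_job() {
    # Bail out early if the JDK or Hadoop tools are not on PATH.
    for tool in jps yarn; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            return 1
        fi
    done
    echo "== Java processes (expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) =="
    jps
    echo "== NodeManagers known to the ResourceManager =="
    yarn node -list -all
    echo "== Applications still waiting or running =="
    yarn application -list -appStates ACCEPTED,RUNNING
}
```

Run `diagnose_stuck_job` as the hadoop user. An empty node list, or an application that stays in ACCEPTED, usually points at the ResourceManager/NodeManager side rather than at HDFS.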

- core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.21.x.x:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
</configuration>  

  

  PS: note that fs.defaultFS must be set to the private (intranet) address.
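One way to sanity-check this is to confirm that the host part of fs.defaultFS falls in a private IPv4 range. The following is a minimal sketch, not part of the original setup: `host_from_uri` and `is_private_ipv4` are hypothetical helper names, and the check only looks at the first two octets against the RFC 1918 ranges (10/8, 172.16/12, 192.168/16).

```shell
# Hypothetical helper: print the host part of a URI such as hdfs://host:9000/path.
host_from_uri() {
    rest=${1#*://}      # strip the scheme
    rest=${rest%%/*}    # strip any path
    printf '%s\n' "${rest%%:*}"   # strip the port
}

# Hypothetical helper: succeed if $1 looks like an RFC 1918 private IPv4 address.
# The body runs in a subshell so the IFS change does not leak to the caller.
is_private_ipv4() (
    IFS=.
    set -- $1           # split the address into octets
    [ "$#" -eq 4 ] || exit 1
    case "$1" in
        10)  exit 0 ;;
        172) [ "$2" -ge 16 ] 2>/dev/null && [ "$2" -le 31 ] ;;
        192) [ "$2" -eq 168 ] 2>/dev/null ;;
        *)   exit 1 ;;
    esac
)
```

With Hadoop on the PATH, something like `is_private_ipv4 "$(host_from_uri "$(hdfs getconf -confKey fs.defaultFS)")" && echo ok` would tie the two together.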

 

- hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/dfs/name</value>
        <final>true</final>
    </property>
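    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/dfs/data</value>
        <final>true</final>
    </property>
    <property>
        <!-- Only one DataNode in pseudo-distributed mode, so keep a single replica -->
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <!-- Deprecated alias of dfs.permissions.enabled in Hadoop 2.x; false turns off HDFS permission checking -->
        <name>dfs.permissions</name>
        <value>false</value>
    </property>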
</configuration>
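The local paths named in core-site.xml and hdfs-site.xml need to be writable by the account that runs the daemons. As a hedged sketch (the original post does not show this step), the hypothetical helper `prepare_hadoop_dirs` below pre-creates them under a base directory, assumed to be /home/hadoop as in the configs above.

```shell
# Hypothetical helper: create the directories referenced by hadoop.tmp.dir,
# dfs.namenode.name.dir and dfs.datanode.data.dir under a base directory.
prepare_hadoop_dirs() {
    base=${1:-/home/hadoop}
    for sub in tmp dfs/name dfs/data; do
        mkdir -p "$base/$sub" || return 1
    done
    # Keep HDFS storage private to the owning account.
    chmod 700 "$base/dfs/name" "$base/dfs/data"
}
```

Running it as the hadoop user before the first `hdfs namenode -format` should avoid ending up with directories owned by another account.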

 

- mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
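    <property>
        <!-- MRv1 name; Hadoop 2.x treats it as a deprecated alias of mapreduce.jobtracker.system.dir -->
        <name>mapred.system.dir</name>
        <value>file:/home/hadoop/mapred/system</value>
        <final>true</final>
    </property>
    <property>
        <!-- MRv1 name; Hadoop 2.x treats it as a deprecated alias of mapreduce.cluster.local.dir -->
        <name>mapred.local.dir</name>
        <value>file:/home/hadoop/mapred/local</value>
        <final>true</final>
    </property>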
</configuration>

- yarn-site.xml

<?xml version="1.0"?>

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>mytccloud</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>mytccloud:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>mytccloud:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>mytccloud:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>mytccloud:8033</value>
    </property>
    <property>
        <!-- The key must embed the aux-service name declared above (mapreduce_shuffle) -->
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

  

  PS: note that I set yarn.resourcemanager.webapp.address to hostname + port. The hostname then has to be mapped in /etc/hosts as well, otherwise the ResourceManager may fail to start.

 - hostname

  Change the hostname on CentOS 7: hostnamectl set-hostname mytccloud

 - hosts:  172.21.x.x mytccloud (again mapped to the private address; comment out all the localhost entries)
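To confirm that the hostname really maps to the private address and not to loopback, a hosts file can be inspected with a small script. This is only a sketch under assumptions: `hosts_lookup` and `check_hostname_mapping` are hypothetical helper names, and the file defaults to /etc/hosts.

```shell
# Hypothetical helper: print every address that a hosts-format file maps to NAME.
# Usage: hosts_lookup NAME [FILE]
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }   # drop comments, including commented-out entries
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "${2:-/etc/hosts}"
}

# Hypothetical helper: fail if NAME has no entry or still points at loopback.
check_hostname_mapping() {
    addrs=$(hosts_lookup "$1" "${2:-/etc/hosts}")
    if [ -z "$addrs" ]; then
        echo "no entry for $1"
        return 1
    fi
    for addr in $addrs; do
        case "$addr" in
            127.*|::1)
                echo "$1 resolves to loopback ($addr)"
                return 1 ;;
        esac
    done
    echo "$1 -> $addrs"
}
```

Calling `check_hostname_mapping mytccloud` after editing /etc/hosts should print the 172.21.x.x address. A loopback result suggests a localhost line still carries the hostname.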

 

 - Directory layout

  [hadoop@mytccloud hdfs]$ pwd
    /home/hadoop/hdfs



 

posted @ 2020-04-30 14:22  喵了个汪的2371