Configuring Hadoop on Ubuntu (English Version) [Big Data Processing Technology]
The virtual machine image, Hadoop installation package, and Java installation package used in this article are available at:
Link: https://pan.baidu.com/s/1wDCUkjsk3OEJlcV_FQ3RBA?pwd=6q89
Extraction code: 6q89
Experimental environment
Component | Version |
---|---|
OS | Ubuntu 20.04.4 LTS |
JDK | 1.8.0_144 |
Hadoop | 2.7.2 |
1. Hadoop Installation and Configuration
1.1 Create user hadoop
Open the terminal and enter the following command to create a new user hadoop:
sudo useradd -m hadoop -s /bin/bash   # create user hadoop with /bin/bash as its login shell
sudo passwd hadoop                    # set the password for user hadoop
sudo adduser hadoop sudo              # grant user hadoop administrator (sudo) privileges
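To verify the new account (an optional check; id is a standard Linux command):
id hadoop    # should list the uid, gid, and groups of user hadoop, including sudo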
1.2 Switch the login user to hadoop
Log out and log back in as user hadoop, or switch to it directly in the terminal:
su - hadoop    # switch the current session to user hadoop
whoami         # should print hadoop
1.3 Update apt
sudo apt-get update # Update apt
sudo apt-get install vim # install vim
1.4 Install SSH and configure SSH to log in without password
① Install ssh server
SSH login is required for both cluster and single-node modes. Ubuntu installs the SSH client by default; the SSH server needs to be installed separately:
sudo apt-get install openssh-server # install ssh server
② Use ssh to log in
After installing ssh server, log in to this computer with the following command:
ssh localhost
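Logging in this way still asks for a password each time. To enable the passwordless login this section's title refers to, a common approach is to generate an SSH key pair and authorize it on the local machine; the following is the standard OpenSSH procedure:
exit                                   # log out of the ssh localhost session first
cd ~/.ssh/                             # this directory is created by the first ssh login
ssh-keygen -t rsa                      # press Enter at every prompt to accept the defaults
cat ./id_rsa.pub >> ./authorized_keys  # authorize the newly generated public key
ssh localhost                          # should now log in without prompting for a password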
1.5 Install Java environment
① Download the JDK archive to the local machine
② Extract the JDK archive
Execute the following shell commands in the terminal:
cd /usr/lib
sudo mkdir jvm        # create a directory to hold the JDK
cd
cd 下载                # enter the Downloads directory (named 下载 on a Chinese-locale system)
sudo tar -zxvf ./jdk-8u144-linux-x64.tar.gz -C /usr/lib/jvm   # extract the JDK into /usr/lib/jvm
After the JDK archive is extracted, run the following commands to check the contents of the /usr/lib/jvm directory; you should see a directory named jdk1.8.0_144, which must match the JAVA_HOME set in the next step:
cd /usr/lib/jvm
ls
③ Set environment variables
Next, set the environment variables. The commands below open the hadoop user's environment-variable configuration file in the vim editor:
cd
vim ~/.bashrc
Add the following lines at the beginning of the file (in vim, press i to enter insert mode; when finished, press ESC and type :wq to save and quit):
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_144
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Save the .bashrc file and exit the vim editor. Then, continue to execute the following command to make the configuration of .bashrc file take effect immediately:
source ~/.bashrc
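As a quick optional check that the variables are visible in the current shell:
echo $JAVA_HOME    # should print /usr/lib/jvm/jdk1.8.0_144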
Use the following command to check whether the installation is successful:
java -version
1.6 Install Hadoop 2.7.2
Install Hadoop into /usr/local/:
sudo tar -zxf ~/下载/hadoop-2.7.2.tar.gz -C /usr/local   # extract the archive into /usr/local
cd /usr/local/
sudo mv ./hadoop-2.7.2/ ./hadoop    # rename the directory to hadoop
sudo chown -R hadoop ./hadoop       # give user hadoop ownership of the files
Hadoop is ready to use once extracted. Enter the following commands to check whether Hadoop is available; if it is, the Hadoop version information will be displayed:
cd /usr/local/hadoop
./bin/hadoop version
2. Hadoop local mode
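In local (standalone) mode Hadoop needs no further configuration: it runs as a single Java process and uses the local filesystem instead of HDFS. A quick way to try it is the grep example that ships with the distribution, sketched below following the standard Hadoop quick-start (note that the output directory must not exist before the job runs):
cd /usr/local/hadoop
mkdir ./input
cp ./etc/hadoop/*.xml ./input     # use the configuration files as sample input
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep ./input ./output 'dfs[a-z.]+'
cat ./output/*                    # print every line that matched the regular expression
rm -r ./output                    # remove the output directory so the job can be rerun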
3. Hadoop pseudo-distributed configuration
Hadoop can run in a pseudo-distributed way on a single node: each Hadoop daemon runs as a separate Java process, and the node acts as both NameNode and DataNode while reading files from HDFS.
3.1 Modify the configuration files core-site.xml and hdfs-site.xml
The Hadoop configuration files are located in /usr/local/hadoop/etc/hadoop/:
cd /usr/local/hadoop/etc/hadoop/
vim core-site.xml
File core-site.xml needs to be modified as follows (hadoop.tmp.dir sets the base directory for Hadoop's temporary files, and fs.defaultFS tells clients where to reach HDFS):
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
vim hdfs-site.xml
And file hdfs-site.xml needs to be modified as follows (dfs.replication is set to 1 because a pseudo-distributed cluster has only one DataNode):
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
3.2 Format the NameNode
After the configuration is complete, format the NameNode:
cd /usr/local/hadoop
./bin/hdfs namenode -format
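If formatting succeeds, the log should report success, typically with a line similar to "Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted." (exact wording varies by version). Avoid rerunning the format once HDFS holds data, since it erases the NameNode's metadata.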
3.3 Start the NameNode and DataNode daemons
Then start the NameNode and DataNode daemons:
cd /usr/local/hadoop
./sbin/start-dfs.sh
After startup completes, you can check whether it succeeded with the jps command. If successful, the following processes will be listed:
"NameNode", "DataNode", and "SecondaryNameNode".
jps
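Once the three processes are listed, a quick smoke test of HDFS is to create the hadoop user's home directory and list the filesystem root (standard hdfs dfs commands); in Hadoop 2.x the NameNode web interface is also reachable at http://localhost:50070:
cd /usr/local/hadoop
./bin/hdfs dfs -mkdir -p /user/hadoop   # create the user's home directory in HDFS
./bin/hdfs dfs -ls /                    # list the HDFS root; /user should now appear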
3.4 Stop the NameNode and DataNode daemons
To stop the daemons conveniently from any directory, first add Hadoop's bin and sbin directories to the PATH by editing ~/.bashrc (sudo is not needed for your own .bashrc):
vim ~/.bashrc
Add the following lines:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Then reload the configuration and stop the daemons:
source ~/.bashrc
stop-all.sh    # stops all Hadoop daemons (HDFS and YARN)
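With the PATH configured above, the HDFS daemons can later be restarted and stopped from any directory using the standard Hadoop 2.x scripts:
start-dfs.sh    # start the NameNode, DataNode, and SecondaryNameNode again
stop-dfs.sh     # stop only the HDFS daemons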