
Configuring Hadoop on Ubuntu (English version) [Big Data Processing Technology]

The virtual machine image, Hadoop installation package, and Java installation package used in this article are available at:

Link: https://pan.baidu.com/s/1wDCUkjsk3OEJlcV_FQ3RBA?pwd=6q89
Extraction code: 6q89

Experimental environment

Software   Version
OS         Ubuntu 20.04.4 LTS
JDK        1.8.0_144
Hadoop     2.7.2

Hadoop Installation and Configuration

1.1 Create user hadoop

Open the terminal and enter the following command to create a new user hadoop:

sudo useradd -m hadoop -s /bin/bash   # create user hadoop with /bin/bash as its login shell
sudo passwd hadoop                    # set the password of user hadoop
sudo adduser hadoop sudo              # add user hadoop to the sudo group (administrator privileges)

1.2 Switch the login user to hadoop

Log out and log back in as user hadoop, or switch directly in the current terminal (note that who only lists logged-in users; it does not switch the session):

su - hadoop

1.3 Update apt

sudo apt-get update # Update apt
sudo apt-get install vim # install vim


1.4 Install SSH and configure SSH to log in without password

① Install ssh server

SSH login is required for both cluster and single-node modes. Ubuntu installs the SSH client by default; the SSH server still needs to be installed:

sudo apt-get install openssh-server # install ssh server


② Log in with SSH

After installing the SSH server, log in to this machine with the following command:

ssh localhost
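The first ssh localhost asks for a password on every login. For the passwordless login promised in the section title, the usual key-based setup (assuming the default RSA key paths) is:

```shell
# generate an RSA key pair without a passphrase (skipped if one already exists),
# then authorize it for logins to this machine
mkdir -p ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

After this, ssh localhost should connect without prompting for a password.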

1.5 Install Java environment

① Download the JDK archive (jdk-8u144-linux-x64.tar.gz) to the local machine


② Unzip the JDK compressed file

Execute the following shell commands in the terminal:

cd /usr/lib
sudo mkdir jvm        # create a directory to hold the JDK
cd
cd 下载               # "下载" is the Downloads directory on a Chinese-locale system
sudo tar -zxvf ./jdk-8u144-linux-x64.tar.gz -C /usr/lib/jvm

After the JDK archive is extracted, execute the following commands to check the /usr/lib/jvm directory:

cd /usr/lib/jvm
ls


③ Setting environment variables

Execute the following commands to set the environment variables. They open the hadoop user's environment-variable configuration file in the vim editor:

cd
vim ~/.bashrc

Add the following lines at the beginning of the file (press i to enter insert mode; when finished, press ESC and type :wq to save and quit):

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_144
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

Save the .bashrc file and exit the vim editor. Then, continue to execute the following command to make the configuration of .bashrc file take effect immediately:

source ~/.bashrc
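As a standalone sanity check of the PATH change above, one can re-create the same exports in the current shell and confirm the JDK's bin directory is on the search path:

```shell
# re-create the .bashrc exports and confirm the JDK bin directory is on PATH;
# the jdk1.8.0_144 path matches the install location used earlier
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_144
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
echo "$PATH" | grep -q "${JAVA_HOME}/bin" && echo "PATH updated"
```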

Use the following command to check whether the installation is successful:

java -version


1.6 Install Hadoop 2.7.2

Install Hadoop into /usr/local/ :

sudo tar -zxf ~/下载/hadoop-2.7.2.tar.gz -C /usr/local    # "下载" is the Downloads directory
cd /usr/local/
sudo mv ./hadoop-2.7.2/ ./hadoop          # rename to the shorter path /usr/local/hadoop
sudo chown -R hadoop ./hadoop             # give user hadoop ownership of the files

Hadoop is ready to use once extracted. Enter the following commands to check whether Hadoop is available; if successful, the Hadoop version information will be displayed:

cd /usr/local/hadoop
./bin/hadoop version


2. Hadoop local mode

3. Hadoop pseudo-distributed configuration

Hadoop can run in pseudo-distributed mode on a single node: each Hadoop daemon runs as a separate Java process, and the node acts as both NameNode and DataNode while reading files from HDFS.

3.1 Modify configuration file core-site.xml and hdfs-site.xml

The Hadoop configuration files are located in /usr/local/hadoop/etc/hadoop/:

cd /usr/local/hadoop/etc/hadoop/
vim core-site.xml

File core-site.xml needs to be modified as

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Then open hdfs-site.xml:

vim hdfs-site.xml

File hdfs-site.xml needs to be modified as

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
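After editing, it is worth confirming that the intended value survived the edit. A small standalone sketch (it writes a copy of the core-site.xml fragment above to /tmp rather than touching the real file) extracts fs.defaultFS with sed:

```shell
# write a sample core-site.xml fragment, then pull out the fs.defaultFS value
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
sed -n 's|.*<value>\(hdfs://[^<]*\)</value>.*|\1|p' /tmp/core-site-sample.xml
```

The same sed one-liner run against the real /usr/local/hadoop/etc/hadoop/core-site.xml should print hdfs://localhost:9000 if the edit was applied correctly.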

3.2 Format the NameNode

After configuration, format the NameNode to initialize the HDFS metadata:

cd /usr/local/hadoop
./bin/hdfs namenode -format


3.3 Start the namenode and datanode daemons

Then start the namenode and datanode daemons.

cd /usr/local/hadoop
./sbin/start-dfs.sh

After startup completes, you can check whether it succeeded with the jps command. If successful, the following processes will be listed:
"NameNode", "DataNode" and "SecondaryNameNode".

jps


3.4 Stop the NameNode and DataNode daemons

To call the stop script from any directory, add Hadoop's bin and sbin directories to PATH in the hadoop user's ~/.bashrc (no sudo is needed to edit your own file):

vim ~/.bashrc

Append the following lines:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

Then reload the configuration and stop the daemons:

source ~/.bashrc
stop-all.sh    # stops all Hadoop daemons; stop-dfs.sh stops only the HDFS ones
posted @ 2023-06-23 12:02  LateSpring