This guide uses a network layout of one master and two slaves; the setup steps are as follows.
1. Prepare the installation packages
1. Download the packages: http://pan.baidu.com/s/1jIoZulw
2. Package list:
   scala-2.12.4.tar
   hadoop-2.7.4.tar
   zookeeper-3.4.10.tar
   jdk-8u151-linux-x64.tar
   spark-2.2.0-bin-hadoop2.7.tar
   hbase-1.3.1-bin.tar.gz
2. Basic installation preparation
1. Install VirtualBox (download and install it yourself)
2. Install Ubuntu (download and install it yourself)
3. Set a password for root and switch to the root user
   sudo passwd root   (press Enter, then type the password you want to set for root)
   su root   (press Enter, then type that password)
4. On Ubuntu, install the ssh server, vim, and net-tools
   apt install vim
   apt install net-tools
   apt install openssh-server
5. Edit the ssh config file and restart the ssh service
   vim /etc/ssh/sshd_config   (change the PermitRootLogin line, around line 32, to PermitRootLogin yes)
   /etc/init.d/ssh restart   (restart the ssh service)
3. Naming the cluster nodes and verifying passwordless ssh login
1. Clone the virtual server twice, giving three servers in total
2. Confirm each server's IP address with ifconfig
3. Log in to every server as root over ssh
   ssh root@192.168.0.127   (press Enter and type the password)
   ssh root@192.168.0.128   (press Enter and type the password)
   ssh root@192.168.0.129   (press Enter and type the password)
4. Change the hostname on each of the three servers
   vim /etc/hostname   (set the three servers to master, slave1, and slave2 respectively)
5. Add the following to /etc/hosts (identical on every server)
   192.168.0.127 master
   192.168.0.128 slave1
   192.168.0.129 slave2
6. Reboot all 3 servers (the new hostnames should now be in effect)
7. Run ssh-keygen on each server (press Enter at every prompt until you are back at the command line)
8. On master:
   cd ~/.ssh/
   cat id_rsa.pub > authorized_keys
9. On slave1 and slave2, run
   cat ~/.ssh/id_rsa.pub
10. Paste the output from slave1 and slave2 into master's authorized_keys file and save it
11. Copy master's authorized_keys into the ~/.ssh/ directory on slave1 and on slave2
12. Test from master (if you can log in without being asked for a password, it worked), as in the check sketched after this list
   ssh slave1
   ssh slave2
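Once the keys are in place, a small loop like the following can verify passwordless login to every node in one shot (a minimal sketch; BatchMode makes ssh fail instead of prompting, so a broken key setup surfaces as an error rather than a hanging password prompt). On systems that provide it, running ssh-copy-id from each node to the others reaches the same end state as steps 8 to 11.

    for host in master slave1 slave2; do
        # BatchMode=yes: never prompt for a password, fail instead
        if ssh -o BatchMode=yes "$host" hostname >/dev/null 2>&1; then
            echo "$host: passwordless ssh OK"
        else
            echo "$host: passwordless ssh FAILED"
        fi
    done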
4. Install the JDK
1. Copy the JDK package to the /usr/local/src/ directory on the master host
   scp -rp jdk-8u151-linux-x64.tar root@192.168.0.127:/usr/local/src/
   (this is the transfer method from a Mac; on Windows use an FTP tool)
2. Log in to master and unpack the package
   cd /usr/local/src
   tar -zxvf jdk-8u151-linux-x64.tar
3. Set the JDK environment variables
   vim /etc/profile   (append at the end)
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin
4. Apply the configuration and test it
   source /etc/profile
   If java or java -version runs without an error, the installation succeeded.
5. Repeat the same steps on slave1 and slave2
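After repeating the steps on all three machines, a loop like this checks that every node picks up the new JDK (a minimal sketch; /etc/profile is sourced explicitly because non-interactive ssh sessions do not read it):

    for host in master slave1 slave2; do
        echo "--- $host ---"
        # java -version writes to stderr, so merge it into stdout
        ssh "$host" 'source /etc/profile && java -version' 2>&1
    done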
5. Install Scala
1. Upload the Scala package to /usr/local/src/ on master (same as above)
2. Unpack the Scala package
   tar -zxvf scala-2.12.4.tar
3. Set the environment variables
   vim /etc/profile
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export SCALA_HOME=/usr/local/src/scala-2.12.4
   export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin:$SCALA_HOME/bin
4. Apply the configuration
   source /etc/profile
5. Repeat the same steps on slave1 and slave2
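A quick check confirms the install without entering the REPL. (Note that the prebuilt spark-2.2.0-bin-hadoop2.7 package ships with its own Scala 2.11 libraries, so this system-wide Scala mainly serves local development and version checks like the one below.)

    # print the installed version; expect "Scala code runner version 2.12.4 ..."
    scala -version
    # evaluate one expression non-interactively and exit
    scala -e 'println("Scala OK, " + util.Properties.versionString)'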
6. Install Hadoop
1. Upload the Hadoop package to /usr/local/src/ on master (same as above)
2. Unpack the Hadoop package
   tar -zxvf hadoop-2.7.4.tar
3. Set the environment variables
   vim /etc/profile
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export SCALA_HOME=/usr/local/src/scala-2.12.4
   export HADOOP_HOME=/usr/local/src/hadoop-2.7.4
   export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
4. Apply the configuration
   source /etc/profile
5. Configure the core Hadoop files
   cd /usr/local/src/hadoop-2.7.4/etc/hadoop
   vim core-site.xml
   <configuration>
     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://master:9000</value>
     </property>
     <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/src/hadoop-2.7.4/tmp</value>
     </property>
   </configuration>
   vim hdfs-site.xml
   <configuration>
     <property>
       <name>dfs.namenode.secondary.http-address</name>
       <value>master:50090</value>
     </property>
     <property>
       <name>dfs.replication</name>
       <value>2</value>
     </property>
     <property>
       <name>dfs.namenode.name.dir</name>
       <value>/usr/local/src/hadoop-2.7.4/tmp/dfs/name</value>
     </property>
     <property>
       <name>dfs.datanode.data.dir</name>
       <value>/usr/local/src/hadoop-2.7.4/tmp/dfs/data</value>
     </property>
   </configuration>
   cp mapred-site.xml.template mapred-site.xml   (Hadoop 2.x ships only the template)
   vim mapred-site.xml
   <configuration>
     <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
     </property>
     <property>
       <name>mapreduce.jobhistory.address</name>
       <value>master:10020</value>
     </property>
     <property>
       <name>mapreduce.jobhistory.webapp.address</name>
       <value>master:19888</value>
     </property>
   </configuration>
   vim yarn-site.xml
   <configuration>
     <property>
       <name>yarn.resourcemanager.hostname</name>
       <value>master</value>
     </property>
     <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
     </property>
   </configuration>
   vim hadoop-env.sh and append at the end:
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   vim slaves so that it contains
   slave1
   slave2
6. Copy the configured Hadoop directory from master to slave1 and slave2 (this is the Mac way; on Windows use an FTP tool)
   cd /usr/local/src/
   scp -rp hadoop-2.7.4 slave1:/usr/local/src/
   scp -rp hadoop-2.7.4 slave2:/usr/local/src/
7. Repeat steps 3 and 4 of this section on slave1 and slave2
8. On master, format the NameNode and start the cluster
   hadoop namenode -format
   cd /usr/local/src/hadoop-2.7.4/sbin/
   ./start-all.sh
9. jps on master should show
   2022 SecondaryNameNode
   1769 NameNode
   5401 Jps
   2175 ResourceManager
10. jps on slave1 and slave2 should show
   4020 Jps
   1606 DataNode
   1775 NodeManager
11. On master, upload /etc/passwd to HDFS
   hadoop fs -put /etc/passwd /
12. If the listing shows /passwd, the upload succeeded (see the smoke test sketched after this list)
   hadoop fs -ls /
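With HDFS and YARN running, two quick smoke tests are worth doing from master: confirm that both DataNodes registered, then push a bundled MapReduce example job through YARN against the /passwd file uploaded above (a minimal sketch; the examples jar path matches the stock 2.7.4 layout, and /wordcount-out is an arbitrary output directory that must not exist yet):

    # expect "Live datanodes (2):" in the report
    hdfs dfsadmin -report | grep "Live datanodes"

    # run the bundled wordcount example on the uploaded file
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar \
        wordcount /passwd /wordcount-out
    hadoop fs -cat /wordcount-out/part-r-00000   # word counts of /etc/passwd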
7. Install Spark
1. Upload the Spark package to /usr/local/src/ on master (same as above)
2. Unpack the Spark package
   tar -zxvf spark-2.2.0-bin-hadoop2.7.tar
3. Set the environment variables
   vim /etc/profile
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export HADOOP_HOME=/usr/local/src/hadoop-2.7.4
   export SCALA_HOME=/usr/local/src/scala-2.12.4
   export SPARK_HOME=/usr/local/src/spark-2.2.0-bin-hadoop2.7
   export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin
4. Apply the configuration
   source /etc/profile
5. Configure the Spark files
   cd /usr/local/src/spark-2.2.0-bin-hadoop2.7/conf
   cp spark-env.sh.template spark-env.sh
   vim spark-env.sh and append at the end
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export SCALA_HOME=/usr/local/src/scala-2.12.4
   export HADOOP_HOME=/usr/local/src/hadoop-2.7.4
   export HADOOP_CONF_DIR=/usr/local/src/hadoop-2.7.4/etc/hadoop
   SPARK_MASTER_IP=master
   SPARK_LOCAL_DIRS=/usr/local/src/spark-2.2.0-bin-hadoop2.7
   SPARK_DRIVER_MEMORY=1G
   Save and quit, then
   vim slaves so that it contains
   slave1
   slave2
6. Copy the configured Spark directory to slave1 and slave2
   cd /usr/local/src
   scp -rp spark-2.2.0-bin-hadoop2.7 slave1:/usr/local/src
   scp -rp spark-2.2.0-bin-hadoop2.7 slave2:/usr/local/src
7. Repeat steps 3 and 4 of this section on slave1 and slave2
8. Start Spark from master (start-all.sh reaches slave1 and slave2 over ssh and starts their Workers)
   cd /usr/local/src/spark-2.2.0-bin-hadoop2.7/sbin
   ./start-all.sh   (the ./ matters: a bare start-all.sh would run the Hadoop script of the same name found in $PATH)
9. jps on master should show
   4320 Master
   2022 SecondaryNameNode
   1769 NameNode
   5499 Jps
   2175 ResourceManager
10. jps on slave1 and slave2 should show
   4020 Jps
   1606 DataNode
   3323 Worker
   1775 NodeManager
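To confirm the standalone master and HDFS are both reachable, a non-interactive spark-shell run can count the lines of the /passwd file uploaded in section 6 (a minimal sketch; spark://master:7077 is the default standalone master port and hdfs://master:9000 matches core-site.xml):

    spark-shell --master spark://master:7077 <<'EOF'
    // read the file stored in HDFS during the Hadoop section and count its lines
    val lines = sc.textFile("hdfs://master:9000/passwd").count()
    println(s"/passwd has $lines lines")
    EOF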
8. Install Zookeeper
1. Upload the Zookeeper package to /usr/local/src/ on master (same as above)
2. Unpack the Zookeeper package
   tar -zxvf zookeeper-3.4.10.tar
3. Configure the Zookeeper config file
   cd /usr/local/src/zookeeper-3.4.10/conf
   cp zoo_sample.cfg zoo.cfg
   vim zoo.cfg and set these three parameters
   clientPort=2181
   dataDir=/usr/local/src/zookeeper-3.4.10/zoo_data
   tickTime=2000
   then append at the end (use your own actual hostnames or IP addresses)
   server.1=master:2888:3888
   server.2=slave1:2888:3888
   server.3=slave2:2888:3888
   Save and quit.
4. Create the zoo_data directory inside zookeeper-3.4.10 and write the node id
   mkdir /usr/local/src/zookeeper-3.4.10/zoo_data
   cd /usr/local/src/zookeeper-3.4.10/zoo_data
   vim myid   (this machine is master, so write 1, matching the digit after "server." in the previous step)
5. Copy the zookeeper-3.4.10 directory to slave1 and slave2 (this is the Mac way; on Windows use an FTP tool)
   scp -rp zookeeper-3.4.10 slave1:/usr/local/src/
   scp -rp zookeeper-3.4.10 slave2:/usr/local/src/
6. Change the number in /usr/local/src/zookeeper-3.4.10/zoo_data/myid to 2 on slave1 and to 3 on slave2
7. Start the Zookeeper service on master, slave1, and slave2
   cd /usr/local/src/zookeeper-3.4.10/
   bin/zkServer.sh start
8. jps on master, slave1, and slave2 should each now show an extra QuorumPeerMain process
9. Processes on master:
   3729 Jps
   2723 NameNode
   3413 Master
   3126 ResourceManager
   2972 SecondaryNameNode
   3694 QuorumPeerMain
10. Processes on slave1 and slave2:
   2560 DataNode
   3251 Jps
   3124 QuorumPeerMain
   2966 Worker
   2749 NodeManager
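Beyond jps, zkServer.sh status reports each node's role in the quorum; run from master, a loop like this should show one leader and two followers (a minimal sketch; /etc/profile is sourced because non-interactive ssh sessions do not read it):

    for host in master slave1 slave2; do
        echo "--- $host ---"
        # "Mode: leader" on exactly one node, "Mode: follower" on the other two
        ssh "$host" 'source /etc/profile; /usr/local/src/zookeeper-3.4.10/bin/zkServer.sh status' 2>&1 | grep Mode
    done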
9. Install HBase
1. Upload the HBase package to /usr/local/src/ on master (same as above)
2. Unpack the HBase package
   tar -zxvf hbase-1.3.1-bin.tar.gz
3. Set the environment variables and apply them
   vim /etc/profile
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export HADOOP_HOME=/usr/local/src/hadoop-2.7.4
   export SCALA_HOME=/usr/local/src/scala-2.12.4
   export SPARK_HOME=/usr/local/src/spark-2.2.0-bin-hadoop2.7
   export HBASE_HOME=/usr/local/src/hbase-1.3.1
   export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$HBASE_HOME/bin
   source /etc/profile
4. Configure the HBase files
   cd /usr/local/src/hbase-1.3.1/conf
   vim hbase-env.sh and append at the end
   export JAVA_HOME=/usr/local/src/jdk1.8.0_151
   export HBASE_MANAGES_ZK=false   (use the external Zookeeper from section 8, not the bundled one)
   vim hbase-site.xml and add
   <configuration>
     <property>
       <name>hbase.rootdir</name>
       <value>hdfs://master:9000/hbase</value>
     </property>
     <property>
       <name>hbase.cluster.distributed</name>
       <value>true</value>
     </property>
     <property>
       <name>hbase.zookeeper.quorum</name>
       <value>master,slave1,slave2</value>
     </property>
     <property>
       <name>dfs.replication</name>
       <value>1</value>
     </property>
   </configuration>
   vim regionservers so that it contains
   slave1
   slave2
5. Copy /usr/local/src/hbase-1.3.1 to the /usr/local/src directory on slave1 and on slave2
6. Repeat steps 3 and 4 of this section on slave1 and slave2
7. Start HBase from master (start-hbase.sh reaches slave1 and slave2 over ssh and starts their region servers)
   cd /usr/local/src/hbase-1.3.1/bin
   ./start-hbase.sh
8. jps on master now also shows HMaster
   3313 Jps
   1955 SecondaryNameNode
   1702 NameNode
   2507 QuorumPeerMain
   3198 HMaster
   2399 Master
   2111 ResourceManager
9. jps on slave1 and slave2 now also shows HRegionServer
   1971 Worker
   1752 NodeManager
   2507 Jps
   1579 DataNode
   2044 QuorumPeerMain
   2396 HRegionServer
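As a final end-to-end check, a short hbase shell session can create a table, write a cell, and read it back (a minimal sketch; 'test' and 'cf' are arbitrary example names):

    hbase shell <<'EOF'
    # create table 'test' with a single column family 'cf'
    create 'test', 'cf'
    # write one cell, then read the table back
    put 'test', 'row1', 'cf:greeting', 'hello'
    scan 'test'
    # cluster summary: expect 2 live region servers
    status
    EOF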
Technology makes life better!