Big Data: Setting Up CDH

Reference: https://zhuanlan.zhihu.com/p/444565129

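Preface

1. CDH Overview

  Cloudera's distribution (Cloudera's Distribution Including Apache Hadoop, "CDH" for short) provides a web-based user interface and supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, and Sqoop. It makes a big data platform considerably easier to install and use.

  Because its component set is complete and it is easy to install and maintain, many companies in China have deployed CDH as their big data platform. This guide uses CDH 6.3.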

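2. Preparation Before Installing CDH

Recommended hardware per host: 4 CPU cores, 8 GB of memory, and a 500 GB disk (if you build the cluster on virtual machines, place the worker nodes on a drive with plenty of free space).

Host configuration: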

192.168.8.137  node1
192.168.8.138  node2
192.168.8.139  node3
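
The host list above maps naturally onto /etc/hosts entries, which each node needs so that the names node1, node2, and node3 resolve. The following is a minimal sketch, not part of the original steps: add_host_entry is a hypothetical helper, and it takes the hosts file as a parameter so it can be tried on a scratch file first.

```shell
# Hypothetical helper: append "ip  name" to a hosts file unless the name is
# already present, so running it twice does not create duplicates.
add_host_entry() {
  local hosts_file="$1" ip="$2" name="$3"
  # -w matches the name as a whole word, so node1 does not match node10
  if ! grep -qw -- "$name" "$hosts_file" 2>/dev/null; then
    printf '%s  %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

# Example (as root, on every node):
#   add_host_entry /etc/hosts 192.168.8.137 node1
#   add_host_entry /etc/hosts 192.168.8.138 node2
#   add_host_entry /etc/hosts 192.168.8.139 node3
```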

Software versions:

  • Operating system: CentOS release 7.8 (Final), 64-bit
  • JDK: 1.8
  • Database: MySQL 5.6.49
  • JDBC: MySQL Connector Java 5.1.38
  • Cloudera Manager: 6.3.1
  • CDH: 6.3.1

3. Configuration

  • Configure the hostname
# On CentOS 7 the hostname is stored in /etc/hostname; hostnamectl updates it
# node1
hostnamectl set-hostname node1
# node2
hostnamectl set-hostname node2
# node3
hostnamectl set-hostname node3
  • Configuration

Disable the firewall on all nodes

systemctl disable firewalld;systemctl stop firewalld
  • Configuration

Configure SELinux on all nodes

vim /etc/selinux/config
# Change this setting
SELINUX=permissive
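
The same change can be made without opening an editor. This is a sketch under assumptions: set_selinux_mode is a hypothetical helper name, and it relies on GNU sed (the default on CentOS). The file change only takes effect after a reboot, so the usage comments also show setenforce for the running system.

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is untouched.
set_selinux_mode() {
  local config_file="$1" mode="$2"
  sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "$config_file"
}

# Example (as root):
#   set_selinux_mode /etc/selinux/config permissive
#   setenforce 0    # switch the running system to permissive right away
#   getenforce      # shows the current mode
```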

Sync the configuration: xsync /etc/selinux/config
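
xsync is not a standard CentOS command, and the article does not show its source. One plausible shape for such a helper is a small wrapper around rsync. The sketch below is an assumption, not the author's script: the XSYNC_HOSTS list, the XSYNC_DRY_RUN switch, and the reliance on passwordless SSH and an installed rsync are all illustrative.

```shell
# Hypothetical xsync: copy each given path to the same location on other nodes.
# Assumed: rsync is installed everywhere and root has passwordless SSH.
XSYNC_HOSTS="${XSYNC_HOSTS:-node2 node3}"

xsync() {
  local path parent abs host
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "xsync: $path does not exist" >&2
      continue
    fi
    # Resolve the parent directory to an absolute path
    parent="$(cd "$(dirname "$path")" && pwd -P)"
    abs="$parent/$(basename "$path")"
    for host in $XSYNC_HOSTS; do
      if [ -n "${XSYNC_DRY_RUN:-}" ]; then
        # Dry run: only print the command that would be executed
        echo "rsync -a $abs $host:$parent/"
      else
        rsync -a "$abs" "$host:$parent/"
      fi
    done
  done
}
```

Setting XSYNC_DRY_RUN=1 before calling xsync prints the rsync commands instead of running them, which is a cheap way to check the target paths.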

  • Configuration

Configure the NTP time service on all nodes

yum install -y ntp
systemctl start ntpd
systemctl enable ntpd
  • Configuration

Install Python on all nodes

CDH requires Python 2.7. It ships with this operating system, so this step is skipped.

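  • Configuration

Change the Linux swappiness parameter on all nodes

To keep the server from swapping, which hurts performance, vm.swappiness is usually set to 0 (Cloudera recommends a value of 10 or less).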

cd /usr/lib/tuned/ && grep "vm.swappiness" * -R
# Change vm.swappiness to 0 in all three of the following files
vim /usr/lib/tuned/latency-performance/tuned.conf
vim /usr/lib/tuned/throughput-performance/tuned.conf
vim /usr/lib/tuned/virtual-guest/tuned.conf

Sync the configuration: xsync /usr/lib/tuned/latency-performance/tuned.conf /usr/lib/tuned/throughput-performance/tuned.conf /usr/lib/tuned/virtual-guest/tuned.conf
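
Editing three files by hand on every node is easy to get wrong, so the change can also be scripted. This is a sketch under assumptions: set_tuned_swappiness is a hypothetical helper, it relies on GNU sed, and it only rewrites a vm.swappiness line that already exists in the file.

```shell
# Hypothetical helper: set every existing vm.swappiness assignment in a tuned
# profile to the given value, keeping the original spacing around "=".
set_tuned_swappiness() {
  local conf_file="$1" value="$2"
  sed -i -E "s/^([[:space:]]*vm\.swappiness[[:space:]]*=[[:space:]]*).*/\1${value}/" "$conf_file"
}

# Example (as root, on every node):
#   set_tuned_swappiness /usr/lib/tuned/latency-performance/tuned.conf 0
#   set_tuned_swappiness /usr/lib/tuned/throughput-performance/tuned.conf 0
#   set_tuned_swappiness /usr/lib/tuned/virtual-guest/tuned.conf 0
#   sysctl -w vm.swappiness=0      # apply to the running kernel as well
#   cat /proc/sys/vm/swappiness    # check the current value
```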

  • Configuration

Disable transparent huge pages on all nodes

vim /etc/rc.local
# Append the following lines
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

This setting cannot be synced; apply it on each node individually.
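
After a reboot it is worth confirming that the setting actually took effect. A minimal sketch, assuming the usual sysfs format in which the kernel marks the active choice with brackets (for example "always madvise [never]"); thp_is_disabled is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when the given transparent_hugepage sysfs file
# has "never" selected, i.e. the text "[never]" appears in it.
thp_is_disabled() {
  grep -q '\[never\]' "$1"
}

# Example (as root):
#   chmod +x /etc/rc.d/rc.local   # on CentOS 7, rc.local only runs at boot if it is executable
#   thp_is_disabled /sys/kernel/mm/transparent_hugepage/enabled && echo "enabled: off"
#   thp_is_disabled /sys/kernel/mm/transparent_hugepage/defrag  && echo "defrag: off"
```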

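  • Configuration

Install the JDK on all nodes. If it is already installed and its environment variables are configured, there is no need to reinstall it.

If cloudera-scm-server keeps failing to start, see: https://blog.csdn.net/a544258023/article/details/107856387

Replace /opt/jdk/java8 below with your own JDK 1.8 installation directory.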

mkdir -p /usr/java

ln -s /opt/jdk/java8 /usr/java/default
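
A mistyped JDK path produces a dangling symlink that only shows up later, when cloudera-scm-server fails to start. The sketch below adds a sanity check before linking. link_jdk is a hypothetical helper, and it uses ln -sfn so that an existing link is replaced rather than followed.

```shell
# Hypothetical helper: link link_path -> jdk_home, but only if jdk_home
# really contains an executable bin/java.
link_jdk() {
  local jdk_home="$1" link_path="$2"
  if [ ! -x "$jdk_home/bin/java" ]; then
    echo "link_jdk: no executable $jdk_home/bin/java" >&2
    return 1
  fi
  mkdir -p "$(dirname "$link_path")"
  # -f replaces an existing link, -n avoids descending into it
  ln -sfn "$jdk_home" "$link_path"
}

# Example (as root, on every node):
#   link_jdk /opt/jdk/java8 /usr/java/default
#   /usr/java/default/bin/java -version
```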

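  • Configuration

Master node: install the MySQL database

MySQL 5.6 is used here; the installation steps are omitted.

  • Configuration

Install MySQL on the master node (this example runs the mysql:5.7.41 image in Docker)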

docker run -d -p 3306:3306 --name mysql -e MYSQL_ROOT_PASSWORD=root mysql:5.7.41
docker exec -it mysql bash
# Create the Cloudera Manager (cmf) database, the amon service database, and their users
mysql -uroot -proot
create database cmf DEFAULT CHARACTER SET utf8;
create database amon DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'www.research.com';
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'www.research.com';
flush privileges;

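4. Download the Installation Packages

The software is stored on a network drive; download it from there if you need it. If the link has expired, it is also in my Aliyun Drive.
Link: https://pan.baidu.com/s/1UH50Uweyi7yg6bV7dl02mQ
Extraction code: nx7p

Install the MySQL JDBC driver on the master node

mkdir -p /opt/cdh/soft  # upload directory for the CDH resources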
mkdir -p /usr/share/java && mv /opt/cdh/soft/mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar && cd /usr/share/java && ll

Deploy CDH

mkdir -p /opt/cdh/cloudera-manager && cd /opt/cdh/cloudera-manager
tar -zvxf /opt/cdh/soft/cm6.3.1-redhat7.tar.gz -C /opt/cdh/cloudera-manager

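Sync the resources: xsync /opt/cdh

Packages to install on all nodes

# The order matters. Installing in the wrong order causes file permission errors at startup, mainly because one of the packages creates the cloudera-scm user and the cloudera-scm group when it is installed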
rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force

rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm --nodeps --force

Package to install only on the master node, node1

rpm -ivh /opt/cdh/cloudera-manager/cm6.3.1/RPMS/x86_64/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm --nodeps --force

On all nodes, edit the agent configuration so that it points to the server node, node1

sed -i "s/server_host=localhost/server_host=<master-node-IP>/g" /etc/cloudera-scm-agent/config.ini

For example: sed -i "s/server_host=localhost/server_host=192.168.8.137/g" /etc/cloudera-scm-agent/config.ini
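
The sed command above only matches while the value is still localhost, so running it a second time with a different address changes nothing. A sketch of a variant that replaces whatever value is currently set; set_agent_server_host is a hypothetical helper name, and GNU sed is assumed.

```shell
# Hypothetical helper: set server_host in an agent config.ini regardless of
# its current value. Other keys such as server_port are left alone.
set_agent_server_host() {
  local config_file="$1" server_host="$2"
  sed -i "s/^server_host=.*/server_host=${server_host}/" "$config_file"
}

# Example (as root, on every node):
#   set_agent_server_host /etc/cloudera-scm-agent/config.ini 192.168.8.137
#   grep '^server_host=' /etc/cloudera-scm-agent/config.ini
```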

On the master node, node1, edit the server configuration

chmod 777 /etc/cloudera-scm-server/db.properties && vim /etc/cloudera-scm-server/db.properties

# File contents
# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
# This file describes the database connection.

# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
# com.cloudera.cmf.db.type=mysql

# The database host
# If a non standard port is needed, use 'hostname:port'
#com.cloudera.cmf.db.host=localhost

# The database name
#com.cloudera.cmf.db.name=cmf

# The database user
#com.cloudera.cmf.db.user=cmf

# The database user's password
#com.cloudera.cmf.db.password=

# The db setup type
# After fresh install it is set to INIT
# and will be changed post config.
# If scm-server uses Embedded DB then it is set to EMBEDDED
# If scm-server uses External DB then it is set to EXTERNAL
# com.cloudera.cmf.db.setupType=INIT

com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=node1
com.cloudera.cmf.db.name=cmf
com.cloudera.cmf.db.user=cmf
com.cloudera.cmf.db.password=www.research.com
com.cloudera.cmf.db.setupType=EXTERNAL
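
Because the file mixes commented defaults with the real settings, it helps to list only the lines that are actually in effect before starting the server. A minimal sketch; active_properties is a hypothetical helper name. Note that its output includes the database password.

```shell
# Hypothetical helper: print only the active lines of a properties file,
# skipping comments and blank lines.
active_properties() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example (as root, on node1):
#   active_properties /etc/cloudera-scm-server/db.properties
```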

Deploy the offline parcel repository on the master node

yum install -y httpd
mkdir -p /var/www/html/cdh6_parcel
cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel /var/www/html/cdh6_parcel/ \
  && cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1 /var/www/html/cdh6_parcel/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha \
  && cp /opt/cdh/soft/manifest.json /var/www/html/cdh6_parcel/ \
  && ll /var/www/html/cdh6_parcel/ \
  && systemctl start httpd \
  && systemctl enable httpd

Open in a browser: http://node1/cdh6_parcel/

Local repository setup

cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel /opt/cloudera/parcel-repo/
cp /opt/cdh/soft/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha
cp /opt/cdh/soft/manifest.json /opt/cloudera/parcel-repo/
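
A corrupted or mismatched parcel is one possible reason Cloudera Manager reports that it cannot find a usable parcel, so checking the hash after copying is cheap insurance. This is a sketch under assumptions: verify_parcel_sha is a hypothetical helper, and it assumes the .sha file holds the SHA-1 hex digest as its first field.

```shell
# Hypothetical helper: compare a parcel's SHA-1 with the digest stored in its
# .sha file. Returns success only when the two match.
verify_parcel_sha() {
  local parcel="$1" sha_file="$2" expected actual
  expected="$(awk '{print $1; exit}' "$sha_file")"
  actual="$(sha1sum "$parcel" | awk '{print $1}')"
  [ "$expected" = "$actual" ]
}

# Example (as root, on node1):
#   p=/opt/cloudera/parcel-repo/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
#   verify_parcel_sha "$p" "$p.sha" && echo "parcel OK"
#   # Assumption: the server reads the repo as the cloudera-scm user
#   chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
```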

http://node1/cdh6_parcel/cdh6/6.3.1/parcels/

Start the master node

# Start
systemctl start cloudera-scm-server

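# Other management commands, for reference
systemctl stop cloudera-scm-server
systemctl restart cloudera-scm-server
systemctl status cloudera-scm-server
# List the log directory (if you have no permission on the log files, the installation order was wrong and CDH's built-in Linux user and group were not created)
ll /var/log/cloudera-scm-server/
# Follow the startup log
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
journalctl -f -u cloudera-scm-server.service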
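
The server can take several minutes to initialize the database before the web UI responds. Instead of watching the log by eye, a script can wait for the startup message. This sketch assumes the server log contains the text "Started Jetty server" once the UI is up, which is what Cloudera Manager 6 logs typically show; cm_server_ready is a hypothetical helper name.

```shell
# Hypothetical helper: succeed once the given server log contains the
# (assumed) Jetty startup message.
cm_server_ready() {
  grep -q 'Started Jetty server' "$1"
}

# Example (as root, on node1):
#   until cm_server_ready /var/log/cloudera-scm-server/cloudera-scm-server.log; do
#     sleep 10
#   done
#   echo "Cloudera Manager web UI should now answer on port 7180"
```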

Start the agent on all nodes

# Start
systemctl start cloudera-scm-agent

# Other management commands, for reference
systemctl restart cloudera-scm-agent
systemctl stop cloudera-scm-agent
systemctl status cloudera-scm-agent

Web UI steps

Log in on port 7180 of the master node: http://node1:7180/

Username: admin  Password: admin

TODO ....

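Troubleshooting

0. CDH starts with the Java path from its default configuration. To use your own JDK, symlink your Java 8 installation to CDH's default location: https://blog.csdn.net/a544258023/article/details/107856387

1. If the log files cannot be opened, the CDH packages were installed in the wrong order (the cloudera-scm user and the cloudera-scm group were never created; reinstalling fixes it).

2. If you lack permission to edit /etc/cloudera-scm-server/db.properties, grant it 777 permissions.

3. "No parcels found in the configured repositories": https://blog.csdn.net/kimi_Christmas/article/details/123100739

4. If a node goes down midway through the installation and the host reports bad health after it restarts, delete /var/lib/cloudera-scm-agent/cm_guid and restart the agent service: https://blog.csdn.net/HoldBelief/article/details/80287471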
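
The fix in item 4 can be sketched as a small helper. Note one deliberate difference from the text: instead of deleting cm_guid, this version renames it to cm_guid.bak so the old value can still be inspected. clear_agent_guid is a hypothetical name, and the default path is the one quoted in item 4.

```shell
# Hypothetical helper: move the agent's cm_guid file out of the way.
# A variation on item 4: the file is renamed to .bak rather than deleted.
clear_agent_guid() {
  local guid_file="${1:-/var/lib/cloudera-scm-agent/cm_guid}"
  if [ -f "$guid_file" ]; then
    mv "$guid_file" "$guid_file.bak"
  fi
}

# Example (as root, on the affected node):
#   clear_agent_guid
#   systemctl restart cloudera-scm-agent
```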

posted @ 2023-03-28 01:54  黄河大道东