Cloudera’s Distribution Including Apache Hadoop (CDH) Installation Procedure
I. Downloading CDH:
Repository file: http://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo (copy cloudera-manager.repo into /etc/yum.repos.d/ on every node)
cloudera-manager-installer.bin: http://archive.cloudera.com/cm5/installer/latest/ (needed only for non-production installs)
RPMs: http://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5.12/RPMS/x86_64/ (save them to /usr/CDH/rpm)
Parcels: http://archive.cloudera.com/cdh5/parcels/latest/ (copy the three files to /opt/cloudera/parcel-repo on the Cloudera Manager server host; create the directory if it does not exist. Note: do not do this on any other host)
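Before starting the install it can be worth checking that the parcel copied into parcel-repo is intact. The sketch below compares the parcel's SHA-1 digest with the digest stored next to it. The helper name verify_parcel is hypothetical, and the assumption that the digest lives in a file named <parcel>.sha is mine, not something stated in these notes; adjust the suffix to match the file you actually downloaded.

```shell
# Sketch only: verify_parcel is a hypothetical helper, not a Cloudera tool.
# Assumption: the expected SHA-1 digest is the first field of "<parcel>.sha".
verify_parcel() {
    parcel="$1"
    sha_file="${parcel}.sha"
    # Return 2 when either file is missing, so "missing" differs from "mismatch".
    [ -f "$parcel" ] && [ -f "$sha_file" ] || return 2
    expected=$(awk '{print $1; exit}' "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    # Exit status 0 on match, 1 on mismatch.
    [ "$expected" = "$actual" ]
}
```

Example call: verify_parcel /opt/cloudera/parcel-repo/CDH-5.12.0-1.cdh5.12.0.p0.29-el7.parcel && echo "parcel OK"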
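II. Installing Cloudera Manager:
1. Disable SELinux and the firewall:
1.1 Disable SELinux: vi /etc/selinux/config, set SELINUX=disabled, then reboot for the change to take effect. Check the status with /usr/sbin/sestatus -v
1.2 Disable iptables:
1.2.1 Stop iptables and disable it at boot: service iptables stop && chkconfig iptables off
1.2.2 Flush the iptables rules: iptables -F
1.3 Disable firewalld: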
systemctl disable firewalld
systemctl stop firewalld
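The manual vi edit in step 1.1 can be scripted when there are many nodes. This is a minimal sketch; disable_selinux_in is a hypothetical helper name, and it assumes GNU sed (for -i) and a config file that already contains a SELINUX= line.

```shell
# Sketch only: disable_selinux_in is a hypothetical helper.
# Rewrites the SELINUX= line in place; SELINUXTYPE= is left untouched
# because the pattern requires "=" directly after "SELINUX".
disable_selinux_in() {
    cfg="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}
```

Example call: disable_selinux_in /etc/selinux/config (a reboot is still needed, as noted in step 1.1).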
2. Set up passwordless SSH between the nodes
2.1 ssh-keygen
2.2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
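Running step 2.2 more than once appends the same key repeatedly. A possible idempotent variant is sketched below; add_authorized_key is a hypothetical helper name, and the permission bits (700 on the directory, 600 on the file) are the values sshd commonly expects rather than anything stated in these notes.

```shell
# Sketch only: add_authorized_key is a hypothetical helper.
# Appends the public key to authorized_keys only if that exact line is absent.
add_authorized_key() {
    pub="$1"     # e.g. ~/.ssh/id_rsa.pub
    auth="$2"    # e.g. ~/.ssh/authorized_keys
    mkdir -p "$(dirname "$auth")"
    chmod 700 "$(dirname "$auth")"
    touch "$auth"
    key=$(cat "$pub")
    # -x matches the whole line, -F treats the key as a fixed string.
    grep -qxF -e "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
    chmod 600 "$auth"
}
```

Example call: add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. For the other nodes, ssh-copy-id user@host is the usual way to install the same key remotely.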
3. Install the RPMs: cd /usr/CDH/rpm
3.1 If an earlier installation exists, remove it first:
3.1.1 Remove the manager: yum -y remove cloudera-manager-daemons cloudera-manager-server
3.1.2 Remove the agent: yum -y remove cloudera-manager-daemons cloudera-manager-agent
3.2 cp cloudera-manager.repo /etc/yum.repos.d/
3.3 Manager server: yum -y install oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm \
    cloudera-manager-daemons-5.12.0-1.cm5120.p0.120.el7.x86_64.rpm \
    cloudera-manager-server-5.12.0-1.cm5120.p0.120.el7.x86_64.rpm
3.4 Manager agent: yum -y install oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm \
    cloudera-manager-daemons-5.12.0-1.cm5120.p0.120.el7.x86_64.rpm \
    cloudera-manager-agent-5.12.0-1.cm5120.p0.120.el7.x86_64.rpm
3.5 cp /usr/CDH/mysql-connector-java-5.1.38.jar /usr/share/cmf/common_jars
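4. Create the MySQL driver symlink: cd /usr/share/cmf/lib && ln -s ../common_jars/mysql-connector-java-5.1.15.jar mysql-connector-java-5.1.15.jar
(The link targets the 5.1.15 file name; step 7 renames the 5.1.38 jar from step 3.5 to that name.)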
5.Install the JDBC driver on the Cloudera Manager Server host, as well as hosts to which you assign the Activity Monitor, Reports Manager,
Hive Metastore Server, Hue Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server roles:
mkdir -p /usr/share/java/ && ln -s /usr/share/cmf/common_jars/mysql-connector-java-5.1.15.jar /usr/share/java/mysql-connector-java.jar
6. Runtime directories and files:
6.1 Java installation directory: /usr/java/jdk1.7.0_67-cloudera
6.2 CM server database configuration: /etc/cloudera-scm-server/db.properties
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=mysql
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.setupType=EXTERNAL
com.cloudera.cmf.db.password=scm
6.3 CM agent configuration directory: /etc/cloudera-scm-agent. In config.ini, set server_host to the CM server's host name or IP address
6.4 CM jar directory: /usr/share/cmf/lib
6.5 CM server log directory: /var/log/cloudera-scm-server
6.6 CM agent log directory: /var/log/cloudera-scm-agent
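The server_host change from step 6.3 has to be repeated on every agent node, so a scripted edit may help. The sketch below assumes GNU sed and a config.ini that already has a server_host= line; set_server_host is a hypothetical helper name and cm-server.example.com is a placeholder host.

```shell
# Sketch only: set_server_host is a hypothetical helper.
# Rewrites the server_host= line of the agent's config.ini in place.
set_server_host() {
    host="$1"
    ini="${2:-/etc/cloudera-scm-agent/config.ini}"
    sed -i "s/^server_host=.*/server_host=${host}/" "$ini"
}
```

Example call: set_server_host cm-server.example.com, followed by a restart of the agent so the new value is read.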
7. Upgrade the MySQL driver:
7.1 Upload mysql-connector-java-5.1.38.jar to /usr/share/cmf/common_jars on every host
7.2 Rename the old driver: mv /usr/share/cmf/common_jars/mysql-connector-java-5.1.15.jar /usr/share/cmf/common_jars/mysql-connector-java-5.1.15-old.jar
7.3 Rename the new driver to the old file name: mv /usr/share/cmf/common_jars/mysql-connector-java-5.1.38.jar /usr/share/cmf/common_jars/mysql-connector-java-5.1.15.jar
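Steps 7.2 and 7.3 can be wrapped in one function so the old jar is only moved aside when the new one is really present. This is a sketch under that assumption; swap_driver is a hypothetical helper name, and the -old.jar suffix simply mirrors the naming used in step 7.2.

```shell
# Sketch only: swap_driver is a hypothetical helper.
# Keeps the old jar as "<name>-old.jar" and gives the new jar the old file name,
# so existing symlinks keep resolving.
swap_driver() {
    dir="$1"; old="$2"; new="$3"
    # Refuse to touch anything if the new driver was not uploaded.
    [ -f "$dir/$new" ] || { echo "missing $dir/$new" >&2; return 1; }
    if [ -f "$dir/$old" ]; then
        mv "$dir/$old" "$dir/${old%.jar}-old.jar"
    fi
    mv "$dir/$new" "$dir/$old"
}
```

Example call: swap_driver /usr/share/cmf/common_jars mysql-connector-java-5.1.15.jar mysql-connector-java-5.1.38.jar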
III. Initializing the Cloudera databases:
1. Management service database:
1.1 At the mysql prompt: grant all on *.* to 'scm'@'%' identified by 'scm' with grant option;
1.2 In bash: /usr/share/cmf/schema/scm_prepare_database.sh mysql -h mysql -uroot -p123 --scm-host manager scm scm scm
1.3 At the mysql prompt: drop user 'scm'@'%';
Run the following at the mysql prompt:
2.Activity Monitor:
create database if not exists `cdh_amon` default character set utf8 collate utf8_general_ci;
create user amon@'%' identified by 'amon';
grant all privileges on cdh_amon.* to amon@'%' identified by 'amon';
3.Reports Manager:
create database if not exists `cdh_rman` default character set utf8 collate utf8_general_ci;
create user rman@'%' identified by 'rman';
grant all privileges on cdh_rman.* to rman@'%' identified by 'rman';
4.Hive Metastore Server:
create database if not exists `cdh_hive` default character set utf8 collate utf8_general_ci;
create user hive@'%' identified by 'hive';
grant all privileges on cdh_hive.* to hive@'%' identified by 'hive';
5.Sentry Server:
create database if not exists `cdh_sentry` default character set utf8 collate utf8_general_ci;
create user sentry@'%' identified by 'sentry';
grant all privileges on cdh_sentry.* to sentry@'%' identified by 'sentry';
6.Cloudera Navigator Audit Server:
create database if not exists `cdh_nav` default character set utf8 collate utf8_general_ci;
create user nav@'%' identified by 'nav';
grant all privileges on cdh_nav.* to nav@'%' identified by 'nav';
7.Cloudera Navigator Metadata Server:
create database if not exists `cdh_navms` default character set utf8 collate utf8_general_ci;
create user navms@'%' identified by 'navms';
grant all privileges on cdh_navms.* to navms@'%' identified by 'navms';
8.Hue:
create database if not exists `cdh_hue` default character set utf8 collate utf8_general_ci;
create user hue@'%' identified by 'hue';
grant all privileges on cdh_hue.* to hue@'%' identified by 'hue';
9.Oozie:
create database if not exists `cdh_oozie` default character set utf8 collate utf8_general_ci;
create user oozie@'%' identified by 'oozie';
grant all privileges on cdh_oozie.* to oozie@'%' identified by 'oozie';
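The eight blocks above differ only in the service name, so they can be generated instead of typed. The sketch below prints the same three statements per service; gen_service_sql is a hypothetical helper name. As in the statements above, the password equals the user name, which is only reasonable for a test cluster.

```shell
# Sketch only: gen_service_sql is a hypothetical helper.
# Prints the create database / create user / grant statements for each
# service name given as an argument. Nothing is sent to MySQL here.
gen_service_sql() {
    for svc in "$@"; do
        cat <<EOF
create database if not exists \`cdh_${svc}\` default character set utf8 collate utf8_general_ci;
create user ${svc}@'%' identified by '${svc}';
grant all privileges on cdh_${svc}.* to ${svc}@'%' identified by '${svc}';
EOF
    done
}
```

Example call: gen_service_sql amon rman hive sentry nav navms hue oozie > cdh_databases.sql, then review the file and feed it to the mysql client.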
IV. Start the master node: service cloudera-scm-server start. The process ID is saved in /run/cloudera-scm-server.pid
V. Start the agent nodes: service cloudera-scm-agent start, then open http://<CM Server IP>:7180 in a browser
VI. Pitfalls:
1. The agent node fails to start:
Error message: Error, CM server guid updated, expected df16790a-2e44-44ec-9db2-8731cc635c61, received b6fecabc-8e32-46be-8a43-5f261064b2c7
Fix: delete the cm_guid file under /var/lib/cloudera-scm-agent
2. Oozie is missing ext-2.2:
Fix: copy ext-2.2 to the /var/lib/oozie directory
3. Spark fails at run time: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster!
Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'
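Fix: in the YARN (MR2 Included) service, change the following settings, save, and restart YARN:
Minimum container memory (greater than 1 GB): yarn.scheduler.minimum-allocation-mb
Maximum container memory (greater than 1 GB): yarn.scheduler.maximum-allocation-mb
Container memory (greater than 1 GB): yarn.nodemanager.resource.memory-mb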
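The "1024+384" in the error is the executor memory plus a per-executor overhead, and the sum has to fit under the maximum container size. The sketch below reproduces that arithmetic. The 384 MB floor comes from the error message itself; the 10% factor for larger executors is my assumption about the Spark default, so check it against the Spark version in use. Both function names are hypothetical.

```shell
# Sketch only: both helpers are hypothetical.
# required_container_mb EXECUTOR_MB
#   prints executor memory plus overhead, where overhead is assumed to be
#   10% of the executor memory with a floor of 384 MB.
required_container_mb() {
    mem="$1"
    overhead=$(( mem / 10 ))
    if [ "$overhead" -lt 384 ]; then
        overhead=384
    fi
    echo $(( mem + overhead ))
}

# fits_in_yarn EXECUTOR_MB MAX_ALLOCATION_MB
#   succeeds when the request fits under yarn.scheduler.maximum-allocation-mb.
fits_in_yarn() {
    [ "$(required_container_mb "$1")" -le "$2" ]
}
```

With the values from the error, required_container_mb 1024 prints 1408, so a maximum of 1024 MB is too small and any value of 1408 MB or more lets this executor start.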
4. Running hdfs or Spark commands from a Linux shell fails with: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
Fix: add export HADOOP_USER_NAME=hdfs to ~/.bash_profile or /etc/profile, then run source ~/.bash_profile or source /etc/profile
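If this fix is applied from a provisioning script, the export line should be added only once. A minimal sketch, with ensure_hadoop_user as a hypothetical helper name:

```shell
# Sketch only: ensure_hadoop_user is a hypothetical helper.
# Appends the export line to the given profile unless it is already there.
ensure_hadoop_user() {
    profile="${1:-$HOME/.bash_profile}"
    line='export HADOOP_USER_NAME=hdfs'
    touch "$profile"
    grep -qxF -e "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}
```

Example call: ensure_hadoop_user ~/.bash_profile && source ~/.bash_profile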
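5. Hive's default execution engine is MapReduce. To switch it, change hive.execution.engine from mr to spark
6. Agent nodes cannot download the installation files from the manager over HTTP. Removing the previously installed JDK 8 and installing the JDK package supplied by Cloudera fixes it; the likely cause is a security restriction in JDK 8
7. Agent nodes cannot download CDH-5.12.0-1.cdh5.12.0.p0.29-el7.parcel.torrent from the manager over HTTP, so the installation page stalls
Cause: everything under /opt/cloudera/parcel-repo/ on the manager node was copied from another environment, and the torrent file is owned by root, so CM cannot access it
Fix: delete CDH-5.12.0-1.cdh5.12.0.p0.29-el7.parcel.torrent and let CM download the file from the Cloudera site on its own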
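A quick way to spot this kind of ownership problem is to list the files in parcel-repo that do not belong to the expected account. The sketch below assumes GNU stat (for -c %U); list_foreign_files is a hypothetical helper name, and cloudera-scm as the account CM runs under is my assumption, so substitute the real service user.

```shell
# Sketch only: list_foreign_files is a hypothetical helper.
# Prints every regular file directly under DIR whose owner is not OWNER.
list_foreign_files() {
    dir="$1"; owner="$2"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        [ "$(stat -c %U "$f")" = "$owner" ] || echo "$f"
    done
}
```

Example call: list_foreign_files /opt/cloudera/parcel-repo cloudera-scm. An empty result means every file already has the expected owner.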
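8. After the agent starts, no host ID is generated, so CM does not recognize the host
Cause: the uuid file under /var/lib/cloudera-scm-agent on that node is 0 KB, so no host ID was written
Fix: delete the uuid file and restart the agent with service cloudera-scm-agent restart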
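The check for an empty uuid file can be made explicit so that a valid host ID is never deleted by accident. A minimal sketch, with clear_empty_uuid as a hypothetical helper name:

```shell
# Sketch only: clear_empty_uuid is a hypothetical helper.
# Removes the uuid file only when it exists and has size zero,
# and reports what it did.
clear_empty_uuid() {
    f="${1:-/var/lib/cloudera-scm-agent/uuid}"
    if [ -f "$f" ] && [ ! -s "$f" ]; then
        rm -f "$f"
        echo "removed"
    else
        echo "kept"
    fi
}
```

Example call: [ "$(clear_empty_uuid)" = "removed" ] && service cloudera-scm-agent restart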