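1. Installing CDH 5.12.x
Pre-installation preparation
Installation steps
Installation walkthrough
Edit /etc/hosts
Set up SSH trust
Adjust Linux system settings
Install JDK 1.8
Install Python 2.7
Install the MySQL/PostgreSQL database
Install ntp
Set up a local yum repository
Download the CDH parcels
Install CM
Install CM with yum
Install the agent
Log in to CDH
Add nodes
Add nodes through CM
Install the agent manually
Install nodes from a remote yum repository
Services
Problems encountered during installation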
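Installation methods
CDH can be installed in three ways:
- parcels: binary packages that bundle the CDH components together with their dependency and version metadata, which makes it easy to switch between CDH versions. Cloudera Manager (CM) downloads, distributes, and activates the parcels itself, which is very convenient. Cloudera recommends this method.
- packages: RPM-based. The drawback is that only one version of an RPM can be installed at a time, so every upgrade or rollback means reinstalling RPMs and carries a risk of package incompatibility. CM calls yum to install the RPMs. Not recommended.
- tarball: download the tar archives and install them by hand. yum cannot be used, which is tedious, and CM cannot upgrade CDH online, so every upgrade is fully manual. Not recommended.
In essence, parcels and tarballs are both compressed archives of the CDH components, but Cloudera provides no support for tarballs.
This guide installs with parcels. It does not use the automatic installation; the parcels are downloaded by hand.
- Automatic installation: configure the CDH yum repository, then install the JDK, CM, and the agent directly with yum.
- Manual installation: install the JDK by hand, download the CM and agent packages, and configure a local yum repository to install CM and the agent.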
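Pre-installation preparation
- OS version: CentOS 6.8
- JDK: 1.8
- Python: 2.7
- Installation account: root (CM depends on the root account)
- Installation path: the CDH default, /opt/cloudera/parcels, is fine
- Database: MySQL; do not use the embedded PostgreSQL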
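Installation steps
- Install CentOS 6.8 on VirtualBox (omitted)
- Set up SSH trust: at minimum, the CM node must be able to SSH into the other nodes
- Adjust Linux system settings
  - Raise the ulimit values
  - Disable iptables
  - Disable SELinux
  - Disable swap (omitted)
  - Disable IPv6 (omitted)
  - Set a static IP
  - Edit /etc/hosts and add the IP of every CDH node
- Install JDK 1.8
- Install Python 2.7
- Install the MySQL database and create the databases and users
- Install ntp
- Install CM
- Install the agent
- Install the CDH components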
Installation walkthrough
Edit /etc/hosts
echo "127.0.0.1 localhost" > /etc/hosts
echo "192.168.0.20 CM" >> /etc/hosts
echo "192.168.0.21 cdh1" >> /etc/hosts
echo "192.168.0.22 cdh2" >> /etc/hosts
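The echo commands overwrite /etc/hosts on the first line and would duplicate entries if run twice. A minimal sketch of an idempotent alternative follows. The add_host_entry helper and the HOSTS_FILE variable are hypothetical names introduced here, cdh2 is assumed to be 192.168.0.22 (the address used in the SSH step), and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless NAME already appears in it as a whole word.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -w matches NAME as a whole word, so "cdh1" does not match "cdh10"
    if grep -qw -- "$name" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo against a scratch file; set HOSTS_FILE=/etc/hosts to apply for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"
add_host_entry "$HOSTS_FILE" 127.0.0.1 localhost
add_host_entry "$HOSTS_FILE" 192.168.0.20 CM
add_host_entry "$HOSTS_FILE" 192.168.0.21 cdh1
add_host_entry "$HOSTS_FILE" 192.168.0.22 cdh2
cat "$HOSTS_FILE"
```

Running the same calls again leaves the file unchanged, so the script can be reused on every node.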
Set up SSH trust
Run on the CM node:
[root@CM ~]# ssh-keygen -t rsa -P ''
Then copy the key to the other nodes:
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.0.21
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.0.22
The CM node can now SSH into both 192.168.0.21 and 192.168.0.22.
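With more than a couple of nodes, the key copy is easier as a loop. The sketch below is one way to do it, not the author's procedure: push_key, NODES, and DRY_RUN are hypothetical names, the root login is an assumption based on the root-only setup described earlier, and by default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Nodes the CM host must be able to reach over SSH (addresses from the text).
NODES="192.168.0.21 192.168.0.22"
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

push_key() {
    local node="$1"
    local key="$HOME/.ssh/id_rsa.pub"
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh-copy-id -i $key root@$node"
    else
        # Copy the key, then confirm that a password-less login works.
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ssh-copy-id -i "$key" "root@$node" &&
            ssh -o BatchMode=yes "root@$node" hostname
    fi
}

for node in $NODES; do
    push_key "$node"
done
```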
Adjust Linux system settings
1. Raise the ulimit values (omitted)
2. Disable iptables
[root@cdh2 ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@cdh2 ~]# chkconfig iptables off
3. Disable swap
Skipped: the virtual machine does not have enough memory.
4. Disable IPv6
Skipped: too much hassle.
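For reference, here is a hedged sketch of the steps skipped above (SELinux, swap, IPv6). The helper names sysctl_fragment and set_selinux_disabled are hypothetical, and the vm.swappiness value of 10 is an assumption to tune rather than a value taken from this walkthrough. The demo works on a scratch file so that nothing on the system is changed.

```shell
#!/usr/bin/env bash
# Print sysctl settings that discourage swapping and disable IPv6.
# The swappiness value is an assumption; adjust it for your cluster.
sysctl_fragment() {
    cat <<'EOF'
vm.swappiness = 10
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
}

# Rewrite the SELINUX= line of a config file (normally /etc/selinux/config).
set_selinux_disabled() {
    local file="$1" tmp
    tmp="$(mktemp)"
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demo on a scratch copy.
demo="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo"
set_selinux_disabled "$demo"
cat "$demo"
sysctl_fragment

# To apply for real, as root:
#   sysctl_fragment >> /etc/sysctl.conf && sysctl -p
#   set_selinux_disabled /etc/selinux/config && setenforce 0
#   swapoff -a    # turns swap off until the next reboot
```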
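Install JDK 1.8
Download the matching JDK 1.8 build and extract it to /opt/jdk1.8.
Set the Java environment variables in /etc/profile.
Note: the transcript below was recorded with JDK 1.7.0_80, which is what was installed first; the JDK was later switched to 1.8 (see the problems section at the end). For 1.8, substitute the corresponding archive and directory names.
[root@CM opt]# tar -zxvf jdk-7u80-linux-x64.tar.gz
[root@CM opt]# rm jdk-7u80-linux-x64.tar.gz
[root@CM jdk1.7.0_80]# vi /etc/profile # add the following
JAVA_HOME=/opt/jdk1.7.0_80
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME
export PATH
Then reload the profile (source /etc/profile) and run java -version to check that the installation works.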
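Because the JDK version turned out to matter later in this walkthrough, a quick version check is worth scripting. The sketch below parses the first line of java -version output; java_feature_version is a hypothetical helper, and the sample string is illustrative input in the format Oracle JDK 7/8 prints, not output captured from this cluster.

```shell
#!/usr/bin/env bash
# Read `java -version` output on stdin and print the leading "major.minor" part.
java_feature_version() {
    sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Illustrative input; replace with real output as shown at the bottom.
sample='java version "1.7.0_80"'
found="$(printf '%s\n' "$sample" | java_feature_version)"
if [ "$found" = "1.8" ]; then
    echo "JDK $found: OK"
else
    echo "JDK $found found, but 1.8 is expected"
fi

# Against the real JVM (java -version writes to stderr):
#   java -version 2>&1 | java_feature_version
```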
Install Python 2.7
For convenience, install Anaconda and then point the system python at it.
The Python 2.6.6 that ships with CentOS 6 also works.
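Install the MySQL/PostgreSQL database
PostgreSQL is used here because the MySQL installation package is just too damn big.
[root@CM ~]# yum install postgresql-server # provides the postgresql service; the data directory is /var/lib/pgsql
[root@CM ~]# chkconfig --list # find the PostgreSQL service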
…
postgresql 0:off 1:off 2:off 3:off 4:off 5:off 6:off
..
[root@CM ~]# service postgresql initdb # initialize the PostgreSQL database
Initializing database: [ OK ]
[root@CM ~]# chkconfig postgresql on # start at boot
[root@CM ~]# service postgresql start # start PostgreSQL
Starting postgresql service: [ OK ]
[root@CM ~]# sudo -u postgres psql
could not change directory to "/root"
psql (8.4.20)
Type "help" for help.
postgres=# create user cdh1;
CREATE ROLE
postgres=# select * from pg_users;
ERROR: relation "pg_users" does not exist
LINE 1: select * from pg_users;
^
postgres=# select * from pg_user;
usename | usesysid | usecreatedb | usesuper | usecatupd | passwd | valuntil | useconfig
----------+----------+-------------+----------+-----------+----------+----------+-----------
postgres | 10 | t | t | t | ******** | |
cdh1 | 16384 | f | f | f | ******** | |
(2 rows)
postgres=# alter user cdh1 with password 'cdh1';
ALTER ROLE
postgres=# create database cdh1 owner cdh1 ENCODING 'UTF-8';
CREATE DATABASE
[root@CM ~]# vi /var/lib/pgsql/data/pg_hba.conf # change it to the following
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 md5
host all all 192.168.0.0/24 md5
# IPv6 local connections:
host all all ::1/128 ident
[root@CM data]# vi postgresql.conf # change to
listen_addresses = '*'
[root@CM ~]# service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
[root@CM ~]# psql -d cdh1 -U cdh1
Password for user cdh1:
psql (8.4.20)
Type "help" for help.
cdh1=>
Install ntp
[root@CM ~]# yum install ntp
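Set up a local yum repository
When installing CM with yum you can either use Cloudera's remote yum repository or download the CM packages and serve them from a local yum repository.
- Download the CM packages
From https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html pick the CM packages that match your CDH version and download them under /root/.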
[root@CM ~]# mkdir -p cloudera_software/RPMS/x86_64
[root@CM ~]# mkdir -p cloudera_software/repodata
[root@CM ~]# cd cloudera_software/RPMS/x86_64
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-agent-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-daemons-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-server-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-server-db-2-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/enterprise-debuginfo-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# #wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/jdk-6u31-linux-amd64.rpm
[root@CM x86_64]# #wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[root@CM x86_64]# cd ../../repodata
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/filelists.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/filelists.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/other.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/other.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/primary.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/primary.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/repomd.xml
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/repomd.xml.asc
[root@CM repodata]# cd ..
[root@CM cloudera_software]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
- Configure the local yum repository
[root@CM cloudera_software]# cd /etc/yum.repos.d/ # add a .repo file here with the following content
[cloudera-manager-local]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=cloudera-manager-local
baseurl=file:///root/cloudera_software/
gpgkey =file:///root/cloudera_software/RPM-GPG-KEY-cloudera
gpgcheck = 1
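The repository definition can also be generated by a script, which helps when the same file has to be created on several nodes. This is a sketch under assumptions: write_local_repo is a hypothetical helper, the suggested file name cloudera-manager-local.repo is not from the original notes, and the demo writes to a scratch file instead of /etc/yum.repos.d/.

```shell
#!/usr/bin/env bash
# write_local_repo REPO_FILE REPO_DIR
# Write a yum repository definition that points at a local directory.
write_local_repo() {
    local repo_file="$1" repo_dir="$2"
    cat > "$repo_file" <<EOF
[cloudera-manager-local]
name=cloudera-manager-local
baseurl=file://${repo_dir}/
gpgkey=file://${repo_dir}/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}

# Demo: on the CM host the target would be something like
# /etc/yum.repos.d/cloudera-manager-local.repo (hypothetical name).
repo_file="$(mktemp)"
write_local_repo "$repo_file" /root/cloudera_software
cat "$repo_file"

# Then, as root:
#   yum clean all
#   yum repolist    # cloudera-manager-local should be listed
```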
Download the CDH parcels
Go to https://archive.cloudera.com/cdh5/parcels/ and download the parcels for your version.
Here the newest CDH under https://archive.cloudera.com/cdh5/parcels/latest/ is used:
https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel
https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha1
https://archive.cloudera.com/cdh5/parcels/latest/manifest.json
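If any one of the three files is missing, CM will not find the parcel. If files are added later, restart the CM process.
Put them in /opt/cloudera/parcel-repo. Why there? Because that is the default location for CDH parcels, as the earlier screenshot also shows.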
[root@CM ~]# mkdir -p /opt/cloudera/parcel-repo
[root@CM ~]# mkdir -p /opt/cloudera/parcels
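A corrupted download is one way to end up with a parcel that CM will not accept, so it can help to compare the parcel against its .sha1 file before copying it into parcel-repo. The sketch below assumes the .sha1 file holds the hex digest as its first field and that sha1sum is available; verify_parcel is a hypothetical helper, and the demo uses a stand-in file rather than a real parcel. As far as I know, CM looks for the hash in a file ending in .sha rather than .sha1, so a rename may be needed; treat that as an assumption to verify.

```shell
#!/usr/bin/env bash
# verify_parcel PARCEL SHA_FILE
# Succeed only if the SHA-1 of PARCEL equals the first field of SHA_FILE.
verify_parcel() {
    local parcel="$1" sha_file="$2" expected actual
    expected="$(awk '{print $1; exit}' "$sha_file")"
    actual="$(sha1sum "$parcel" | awk '{print $1}')"
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Demo with a stand-in file; a real run would pass the .parcel and .sha1 paths.
work="$(mktemp -d)"
printf 'not a real parcel\n' > "$work/demo.parcel"
sha1sum "$work/demo.parcel" | awk '{print $1}' > "$work/demo.parcel.sha1"

if verify_parcel "$work/demo.parcel" "$work/demo.parcel.sha1"; then
    echo "checksum OK"
else
    echo "checksum MISMATCH"
fi
```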
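Install CM
Install CM with yum
[root@CM cloudera_software]# yum install cloudera-manager-agent cloudera-manager-daemons cloudera-manager-server
At first I downloaded the RedHat 5 CM packages by mistake, so I deleted them and downloaded again. Remember to run yum clean all afterwards.
After this step the following files exist:
[root@CM x86_64]# ls /etc/cloudera-scm-server
db.properties log4j.properties
Edit db.properties: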
# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
#
# This file describes the database connection.
#
# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
com.cloudera.cmf.db.type=postgresql
# The database host
# If a non standard port is needed, use 'hostname:port'
com.cloudera.cmf.db.host=localhost
# The database name
com.cloudera.cmf.db.name=cmf
# The database user
com.cloudera.cmf.db.user=cdh1
# The database user's password
com.cloudera.cmf.db.password=cdh1
# The db setup type
# By default, it is set to INIT
# If scm-server uses Embedded DB then it is set to EMBEDDED
# If scm-server uses External DB then it is set to EXTERNAL
com.cloudera.cmf.db.setupType=EXTERNAL
Start CM
[root@CM ~]# service cloudera-scm-server start
Starting cloudera-scm-server: [FAILED]
[root@CM ~]# more /var/log/cloudera-scm-server/cloudera-scm-server.out
+======================================================================+
| Error: JAVA_HOME is not set and Java could not be found |
+----------------------------------------------------------------------+
| Please download the latest Oracle JDK from the Oracle Java web site |
| > http://www.oracle.com/technetwork/java/javase/index.html < |
| |
| Cloudera Manager requires Java 1.6 or later. |
| NOTE: This script will find Oracle Java whether you install using |
| the binary or the RPM based installer. |
+======================================================================+
[root@CM ~]# echo $JAVA_HOME
/opt/jdk1.7.0_80
That is simply not true. A search turned up http://community.cloudera.com/t5/Cloudera-Manager-Installation/Error-JAVA-HOME-is-not-set-and-Java-could-not-be-found/td-p/18974/page/3
Set JAVA_HOME as that thread suggests:
[root@CM ~]# vi /etc/default/cloudera-scm-server
export JAVA_HOME=/opt/jdk1.7.0_80
Then run:
[root@CM ~]# service cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
I have a few choice words that I will keep to myself!
It started successfully, yet 192.168.0.20:7180 is unreachable.
The log shows:
2017-08-20 18:19:25,182 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@a384493 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception:
org.postgresql.util.PSQLException: FATAL: database "cmf" does not exist
at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:469)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:112)
Damn, I assumed CM would create the database automatically.
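The database named in db.properties (cmf) has to exist before CM starts. A small sketch of generating the statement, mirroring the CREATE DATABASE syntax used earlier for cdh1; create_db_sql is a hypothetical helper, and the role cdh1 is assumed to exist already.

```shell
#!/usr/bin/env bash
# create_db_sql DB OWNER
# Print the SQL that creates one database owned by an existing role.
create_db_sql() {
    local db="$1" owner="$2"
    printf "CREATE DATABASE %s OWNER %s ENCODING 'UTF-8';\n" "$db" "$owner"
}

create_db_sql cmf cdh1

# To execute it on the CM host (psql reads SQL from stdin):
#   create_db_sql cmf cdh1 | sudo -u postgres psql
```

The same helper covers any further databases the management services need.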
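Once the cmf database existed, the next start reported a pile of missing tables, then CM created the tables automatically, but after a while it died again: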
2017-08-20 18:38:48,481 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Error while attempting to retrieve repository info for repo https://archive.cloudera.com/sqoop-connectors/parcels/latest/
java.io.IOException: Closed
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.doConnect(NettyAsyncHttpProvider.java:873)
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:858)
at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:512)
CM keeps probing the remote repositories!
After one more restart it actually came up. What the fuck!
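Install the agent
The agent was already installed in the previous step.
After installation, edit the agent's configuration file so that it can find CM:
vi /etc/cloudera-scm-agent/config.ini
server_host=192.168.0.20
Start the agent:
service cloudera-scm-agent start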
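Editing config.ini by hand on every node gets tedious, so the change can be scripted. In this sketch set_server_host is a hypothetical helper, and the scratch file only imitates the shape of the agent's config.ini (the [General] section and the server_port line are assumptions about its layout).

```shell
#!/usr/bin/env bash
# set_server_host FILE HOST
# Point the server_host= line of an agent config file at HOST.
set_server_host() {
    local file="$1" host="$2" tmp
    tmp="$(mktemp)"
    sed "s/^server_host=.*/server_host=${host}/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demo on a scratch file shaped like the agent's config.ini.
conf="$(mktemp)"
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$conf"
set_server_host "$conf" 192.168.0.20
cat "$conf"

# On a real node, as root:
#   set_server_host /etc/cloudera-scm-agent/config.ini 192.168.0.20
#   service cloudera-scm-agent start
```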
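Log in to CDH
Log in with the default credentials admin/admin.
Do not add new hosts at the start, because the other CM services, such as Host Monitor, are not installed yet.
Since the JDK is already installed, do not let CM install it again.
Do not enable single-user mode.
Choose key-based authentication.
Select the CM node's pub key.
With the correct configuration the CDH 5.12.0 parcel appears; click Next and continue until parcel distribution finishes.
When it finishes, CM reports that there is no Host Monitor.
Create the cmservice database:
create database cmservice owner cdh1 ENCODING 'UTF-8';
This step simply would not connect to the database. I read a lot of material and tried many approaches, and none of them worked.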
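Add nodes
Adding a node boils down to this:
install the agent on the node and let the agent connect to CM; the node has then joined the cluster. CM then distributes the parcels and installs the services.
Add nodes through CM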
1.
2.
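3. Choosing the second option lets you supply your own yum repository.
4. When hosts are added automatically, CM configures the yum repository on each node.
Let's look at what https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo contains: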
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey =https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
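This is simply Cloudera's online yum repository, and what it holds is just the installation packages for the agent and CM.
Refreshing the yum metadata was far too slow, so I gave up on this route.
Alternatively, point the step above at your own yum repository instead of Cloudera's, which is much faster.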
Install the agent manually
This way none of the wizard choices above are needed; just install the agent directly on the node.
Copy the agent packages from the CM node, configure a local yum repository, and install:
[root@CM cloudera_software]# scp -r /root/cloudera_software root@cdh2:/root/
[root@cdh2 ~]# vi /etc/yum.repos.d/CentOS-Media.repo # add
[cloudera-manager-local]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=cloudera-manager-local
baseurl=file:///root/cloudera_software/
gpgkey =file:///root/cloudera_software/RPM-GPG-KEY-cloudera
gpgcheck = 1
[root@cdh2 ~]# yum install cloudera-manager-agent
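Surprisingly, this still had to go online to download a pile of dependencies, though it was quick. How do I create a LAN repository?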
[root@cdh1 ~]# vi /etc/cloudera-scm-agent/config.ini
server_host=CM
[root@cdh1 ~]# service cloudera-scm-agent start
Once the agent has started, the node is visible in CM.
At this point, however, it is not yet connected to Host Monitor.
Restarting Host Monitor from CM fixes that.
Install nodes from a remote yum repository
[root@CM cloudera_software]# yum install httpd
………..
[root@CM ~]# vi /etc/httpd/conf/httpd.conf
DocumentRoot "/root/cloudera_software"
……
[root@CM ~]# chown -R apache.apache /root/cloudera_software
# remove the default welcome page
# [root@CM ~]# mv /etc/httpd/conf.d/welcome.conf /etc/httpd/conf.d/welcome.conf.bk
[root@CM ~]# service httpd start
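I'll come back to this once I actually know how to use httpd!
Once the node has joined, choose "Add New Hosts to Cluster", then "Currently Managed Hosts", select the host that was just added, and continue to distribute the parcels.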
Services
Problems encountered during installation
1.
WARN [770705234@scm-web-1838:tsquery.TimeSeriesQueryService@503] com.cloudera.server.cmf.tsquery.TimeSeriesQueryService@1c378752 failed to locate nozzleHOST_MONITORING
com.cloudera.cmon.MgmtServiceLocatorException: Could not find a HOST_MONITORING nozzle from SCM.
at com.cloudera.cmon.MgmtServiceLocator.getNozzleIPC(MgmtServiceLocator.java:147)
at com.cloudera.server.cmf.tsquery.NozzleRequest.<init>(NozzleRequest.java:50)
This appeared when hosts were added during installation; if only the CM node is selected, it does not occur.
2.
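Switch the JDK to 1.8 (it was 1.7 before).
Download the PostgreSQL JDBC driver to /usr/share/java, edit the jar path in /etc/default/cloudera-scm-server to include the PostgreSQL JDBC driver, and export JAVA_HOME in the same file.
Change the log level: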
export CMF_ROOT_LOGGER="INFO,LOGFILE"
3.
INFO [JvmPauseMonitor:debug.JvmPauseMonitor@236] Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1039ms: no GCs detected.
+ cat
+======================================================================+
| Error: JAVA_HOME is not set and Java could not be found |
+----------------------------------------------------------------------+
| Please download the latest Oracle JDK from the Oracle Java web site |
| > http://www.oracle.com/technetwork/java/javase/index.html < |
| |
| Cloudera Manager requires Java 1.6 or later. |
| NOTE: This script will find Oracle Java whether you install using |
| the binary or the RPM based installer. |
+======================================================================+
+ exit 1
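Switched to JDK 1.8, which made no difference whatsoever.
Set the JDK on the Hosts page.
All of the problems above went away after reinstalling the CM agent with Java 1.8!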