CDH Hadoop / Hive / Impala Single-Node Setup (Fully Offline Install)

Environment and package download links

CentOS 6.5

Version: CDH5 5.14.0 (source tarball cdh5.14.0-src.tar.gz)

Hadoop package download:

http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.14.0.tar.gz

Hive package download:

http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.0.tar.gz

Impala package downloads:

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-debuginfo-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm  

http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-udf-devel-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

Reference documentation

https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_qs_cdh5_pseudo.html#topic_3

https://docs.cloudera.com/documentation/index.html

1. Install the JDK

Upload the RPM package to the server:

rpm -ivh jdk-8u221-linux-x64.rpm

 

By default the JDK installs under /usr/java, at /usr/java/jdk1.8.0_221-amd64.

 

Configure environment variables:

vim /etc/profile

# Add the following

JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

export JAVA_HOME JRE_HOME PATH CLASSPATH

source /etc/profile  # apply the changes

 

[root@sunny conf]# java -version

java version "1.8.0_221"

Java(TM) SE Runtime Environment (build 1.8.0_221-b11)

Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)

2. Install Hadoop

2.1 Choose /data/CDH as the Hadoop install directory and extract the tarball

cd /data/CDH

tar -zvxf /data/install_pkg_CDH/hadoop-2.6.0-cdh5.14.0.tar.gz

 

The core Hadoop configuration files are located at:

/data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/

 

2.2 Configure Hadoop environment variables

vi /etc/profile

 

# Append at the end of the file:

JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

export JAVA_HOME JRE_HOME PATH CLASSPATH

export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

 

source /etc/profile  # apply the changes
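Before moving on, a quick sanity check that the variables took effect (assuming the tarball unpacked cleanly):

hadoop version      # should report Hadoop 2.6.0-cdh5.14.0
echo $HADOOP_HOME   # should print /data/CDH/hadoop-2.6.0-cdh5.14.0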

 

2.3 Create a data directory under the Hadoop install directory (it is referenced by core-site.xml below)

cd /data/CDH/hadoop-2.6.0-cdh5.14.0

mkdir data

 

2.4 Configure core-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml

 

# Add the following inside <configuration>:

<property>

    <name>fs.defaultFS</name>

    <!-- Use your own server IP here; 9000 is the default port -->

    <value>hdfs://172.20.71.66:9000</value>

</property>

<property>

    <name>hadoop.tmp.dir</name>

    <!-- Your custom Hadoop working directory (created in 2.3) -->

    <value>/data/CDH/hadoop-2.6.0-cdh5.14.0/data</value>

</property>

 

<property>

    <name>hadoop.native.lib</name>

    <value>false</value>

    <description>Should native hadoop libraries, if present, be used.

    </description>

</property>
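To double-check that the value is being picked up from core-site.xml, hdfs getconf reads the client configuration and does not need a running cluster:

hdfs getconf -confKey fs.defaultFS   # should print hdfs://172.20.71.66:9000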

 

2.5 Configure hdfs-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml

 

# Add the following inside <configuration>:

<property>

         <name>dfs.replication</name>

         <value>1</value>

</property>

<property>

    <name>dfs.namenode.secondary.http-address</name>

    <value>172.20.71.66:50090</value>

</property>

<property>

    <name>dfs.permissions.enabled</name>

    <value>false</value>

</property>

 

2.6 Configure mapred-site.xml

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop

cp mapred-site.xml.template ./mapred-site.xml  # copy the template to mapred-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/mapred-site.xml

 

# Add the following inside <configuration>:

<property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

</property>

 

2.7 Configure yarn-site.xml

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/yarn-site.xml

 

# Add the following inside <configuration>:

<property>

    <name>yarn.resourcemanager.hostname</name>

    <value>172.20.71.66</value>

</property>

<property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

</property>

 

2.8 Configure hadoop-env.sh

vi /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hadoop-env.sh

Change

export JAVA_HOME=${JAVA_HOME}

to:

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

 

2.9 Start Hadoop

With configuration done, switch to the sbin directory:

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin/

hadoop namenode -format  # format the HDFS filesystem

 

Possible error: the hostname does not match what is in /etc/hosts.

Run the hostname command to see the machine's hostname, e.g. testuser.

Run vi /etc/hosts and change

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

to

127.0.0.1 localhost localhost.localdomain testuser localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

 

Format again: hadoop namenode -format

Once that succeeds, start everything:

./start-all.sh

You will be prompted for the server password several times; enter it as prompted.

Then run jps to check which services came up.

If the five Hadoop processes appear (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager), the startup succeeded!
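For reference, a successful listing looks roughly like this (PIDs are illustrative and will differ on your machine):

[root@sunny sbin]# jps
4754 NameNode
4886 DataNode
5063 SecondaryNameNode
5216 ResourceManager
5327 NodeManager
5610 Jps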

 

2.10 Access the Hadoop web UI

http://172.20.71.66:50070   # replace 172.20.71.66 with your server IP
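Before moving on to Hive, a quick HDFS smoke test; any small local file will do:

hdfs dfs -mkdir -p /tmp/smoke
hdfs dfs -put /etc/hosts /tmp/smoke/
hdfs dfs -cat /tmp/smoke/hosts   # should echo the file contents back
hdfs dfs -rm -r /tmp/smoke       # clean up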

 

3. Install Hive

3.1 Choose /data/CDH as the Hive install directory and extract the tarball

cd /data/CDH

tar -zxvf /data/install_pkg_CDH/hive-1.1.0-cdh5.14.0.tar.gz

 

3.2 Configure environment variables

vim /etc/profile

 

# Add the following

export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0/

export PATH=$HIVE_HOME/bin:$PATH

export HIVE_CONF_DIR=$HIVE_HOME/conf

export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib

 

source /etc/profile  # apply the changes

 

3.3 Configure hive-env.sh

cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

cp hive-env.sh.template hive-env.sh

vi hive-env.sh

 

# Append the following at the end of the file

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0

export HIVE_CONF_DIR=/data/CDH/hive-1.1.0-cdh5.14.0/conf

export HIVE_AUX_JARS_PATH=/data/CDH/hive-1.1.0-cdh5.14.0/lib

 

3.4 Configure hive-site.xml

cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

vi hive-site.xml

# Add the following content

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>APP</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mine</value>
  </property>
</configuration>

3.5 Create HDFS directories (this step appears to be optional; you can skip it)

cd /data/CDH/hive-1.1.0-cdh5.14.0

mkdir hive_hdfs

cd hive_hdfs

hdfs dfs -mkdir -p warehouse  # requires Hadoop to be installed and running

hdfs dfs -mkdir -p tmp

hdfs dfs -mkdir -p log

hdfs dfs -chmod -R 777 warehouse

hdfs dfs -chmod -R 777 tmp

hdfs dfs -chmod -R 777 log
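Note that HDFS paths without a leading / are created under the current user's HDFS home directory (/user/root when running as root), so the commands above do not create directories at the HDFS root. If you would rather pre-create the conventional absolute locations (matching Hive's default hive.metastore.warehouse.dir), a sketch:

hdfs dfs -mkdir -p /user/hive/warehouse /tmp/hive
hdfs dfs -chmod -R 777 /user/hive/warehouse /tmp/hive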

 

3.6 Initialize Hive (best run from the bin directory of the install)

cd /data/CDH/hive-1.1.0-cdh5.14.0/bin

schematool -dbType derby -initSchema

# To re-run schematool -dbType derby -initSchema, delete or rename the metastore_db directory first.
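To confirm the schema was created, schematool can also report the version; run it from the same directory, since the embedded Derby database lives in ./metastore_db:

./schematool -dbType derby -info   # prints the Hive distribution and metastore schema versions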

 

 

 

3.7 Start Hive

hive --service metastore &      # can be run from any directory

hive --service hiveserver2 &    # can be run from any directory
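Once hiveserver2 has had a few seconds to start, you can verify that it accepts connections using the bundled beeline client; a minimal check, assuming the default HiveServer2 port 10000 and the Derby credentials from hive-site.xml:

beeline -u jdbc:hive2://localhost:10000 -n APP -e "show databases;"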

3.8 Use Hive

hive  # can be run from any directory

3.9 Troubleshooting the Hive installation

3.9.1 java.lang.ClassNotFoundException: org.htrace.Trace

Download the htrace-core jar from the Maven repository and copy it into $HIVE_HOME/lib:

<!-- https://mvnrepository.com/artifact/org.htrace/htrace-core -->

<dependency>

    <groupId>org.htrace</groupId>

    <artifactId>htrace-core</artifactId>

    <version>3.0</version>

</dependency>

 

cp /data/install_pkg_apache/htrace-core-3.0.jar /data/CDH/hive-1.1.0-cdh5.14.0/lib

cd /data/CDH/hive-1.1.0-cdh5.14.0/bin

 

3.9.2 [ERROR] Terminal initialization failed; falling back to unsupported java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected

Fix: download jline-2.12.jar from the Maven repository (and upload it to the server).

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

mv jline-2.11.jar jline-2.11.jar_bak

 

cd /data/CDH/hive-1.1.0-cdh5.14.0/lib

cp jline-2.12.jar /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

 

Restart Hadoop:

cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin

sh stop-all.sh

sh start-all.sh

 

Run again: schematool -dbType derby -initSchema

The error is still there!

 

Fix, take two:

vi /etc/profile

export HADOOP_USER_CLASSPATH_FIRST=true

source /etc/profile

Run again: schematool -dbType derby -initSchema  (OK this time)

 

Then start Hive:

./hive --service metastore &  

./hive --service hiveserver2 &

 

3.9.3 java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Reference: https://www.58jb.com/html/89.html

Before Hive 2.0.0, two services must be running on the Hive master node (metastore and hiveserver2) before you can log in to the Hive command line. (Configure hive-site.xml, then start them.)

 

hive --service metastore &

hive --service hiveserver2 &

 

3.10 Basic Hive operations test

hive  # enter the Hive CLI

# run some HQL statements

show databases;

create database test;

use test;

create table user_test(name string) row format delimited fields terminated by '\001';

 

insert into table user_test values('zs');

select * from user_test;
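The inserted row ultimately lands in HDFS, so you can cross-check it from the shell; the path below assumes the default warehouse location:

hdfs dfs -ls /user/hive/warehouse/test.db/user_test
hdfs dfs -cat /user/hive/warehouse/test.db/user_test/*   # should print: zs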

 

4. Install Impala

4.1 Expand the disk (skip this if your server has enough space)

A small detour: this is a home-built CentOS VM and the server ran out of space, so the disk needs to grow first.

Reference: http://blog.itpub.net/117319/viewspace-2125079/

On Windows, open a cmd prompt. The disk image lives at:

C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi

VBoxManage.exe modifymedium C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi resize --20480  # this fails: the path contains spaces (and the flag is mangled; it should be --resize 20480)

 

C:

cd Users\kingdee\VirtualBox VMs\centos6

F:\VirtualBox\VBoxManage.exe modifymedium centos6.vdi --resize 20480  # this works

 

On CentOS:

fdisk -l

fdisk /dev/sda

n

p

3

Enter (accept the default)

+10G (size to allocate to the new partition)

w (write the partition table and save)

 

 

 

Check the partition table, then reboot so the kernel picks up the new partition:

cat /proc/partitions

reboot  # restart

mkfs -t ext3 /dev/sda3

After formatting, mount it to the target directory:

mount /dev/sda3 /home  # to unmount: umount /home

Impala will be installed under /home:

cd /home

mkdir -p data/CDH/impala_rpm

mkdir -p data/CDH/impala_rpm_install_pkg  # the Impala RPMs go here

cd /home/data/CDH/impala_rpm

 

4.2 Install Impala (note: put all the RPM packages in /home/data/CDH/impala_rpm_install_pkg)

cd /home/data/CDH/impala_rpm

rpm -ivh ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

This fails with:

warning: ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID e8f86acd: NOKEY

error: Failed dependencies:

        bigtop-utils >= 0.7 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-hdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-yarn is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-mapreduce is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hbase is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hive >= 0.12.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        zookeeper is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        hadoop-libhdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        avro-libs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        parquet is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        sentry >= 1.3.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        sentry is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        /lib/lsb/init-functions is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

        libhdfs.so.0.0.0()(64bit) is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

 

Hmm... bigtop-utils and sentry need to be installed first.

4.3 Install bigtop-utils and sentry

bigtop-utils and sentry download page:

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/

Or download directly from these two links:

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm

 

Run the following:

sudo rpm -ivh  ../impala_rpm_install_pkg/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm --nodeps  # --nodeps skips the dependency check

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm --nodeps

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

 

The impala-shell package requires python-setuptools, so install that first.

 

4.4 Install python-setuptools

python-setuptools download:

http://mirror.centos.org/altarch/7/os/aarch64/Packages/python-setuptools-0.9.8-7.el7.noarch.rpm

Run:

sudo rpm -ivh  ../impala_rpm_install_pkg/python-setuptools-0.9.8-7.el7.noarch.rpm --nodeps

 

Then retry the impala-shell install:

sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

Done!

4.5 Configure bigtop-utils

vi /etc/default/bigtop-utils

export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

4.6 Populate /etc/impala/conf

cd /etc/impala/conf  # copy the Hadoop and Hive config files into Impala's conf directory

cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml ./

cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml ./

cp /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml ./

 

4.7 Configure hdfs-site.xml (the copy in /etc/impala/conf)

vi hdfs-site.xml

# Add the following:

<property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.file-block-storage-locations.timeout.millis</name>
    <value>10000</value>
</property>

 

4.8 Fix up the jar files

4.8.1 First, a quick primer on Linux symlinks

Create a symlink:

ln -s [source file or directory] [link name]

For example, create a link named test in the current directory pointing at the /var/www/test folder:

ln -s /var/www/test test

 

Delete a symlink:

Deleting a symlink works just like deleting an ordinary file, with rm:

rm -rf link_name  # do NOT append a trailing "/": with a trailing slash, rm follows the link and can delete the target directory's contents

For example, delete test:

rm -rf test

 

Change a symlink:

ln -snf [new source file or directory] [link name]

This repoints an existing link at the new target.

For example, create a symlink:

ln -s /var/www/test /var/test

then repoint it to a new path:

ln -snf /var/www/test1 /var/test
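To inspect where an existing link points, two handy commands (using the example link above):

ls -l /var/test          # shows something like: /var/test -> /var/www/test1
readlink -f /var/test    # prints the fully resolved target path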

4.8.2 Create the Impala jar symlinks

(You can copy the commands below, but remember to replace the version numbers; mine are 2.6.0-cdh5.14.0, so substitute whatever version you installed.)

Hadoop jars:

sudo rm -rf /usr/lib/impala/lib/hadoop-*.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-annotations-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-annotations.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-auth.jar

sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-aws-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-aws.jar

sudo ln -s $HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-hdfs.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-core.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-jobclient.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-shuffle.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-api-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-api.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-client-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-client.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-applicationhistoryservice.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-nodemanager.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-resourcemanager.jar

sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-web-proxy.jar

sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-core.jar

sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

 

#Re-create these symlinks (the two yarn jar filenames linked above were wrong), and link avro and zookeeper:

sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/avro-1.7.6-cdh5.14.0.jar /usr/lib/impala/lib/avro.jar

sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/zookeeper-3.4.5-cdh5.14.0.jar /usr/lib/impala/lib/zookeeper.jar

Hive jars:

sudo rm -rf /usr/lib/impala/lib/hive-*.jar

sudo ln -s $HIVE_HOME/lib/hive-ant-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-ant.jar

sudo ln -s $HIVE_HOME/lib/hive-beeline-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-beeline.jar

sudo ln -s $HIVE_HOME/lib/hive-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-common.jar

sudo ln -s $HIVE_HOME/lib/hive-exec-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-exec.jar

sudo ln -s $HIVE_HOME/lib/hive-hbase-handler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-hbase-handler.jar

sudo ln -s $HIVE_HOME/lib/hive-metastore-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-metastore.jar

sudo ln -s $HIVE_HOME/lib/hive-serde-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-serde.jar

sudo ln -s $HIVE_HOME/lib/hive-service-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-service.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-common.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims.jar

sudo ln -s $HIVE_HOME/lib/hive-shims-scheduler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-scheduler.jar

sudo ln -snf $HIVE_HOME/lib/parquet-hadoop-bundle-1.5.0-cdh5.14.0.jar /usr/lib/impala/lib/parquet-hadoop-bundle.jar

Other jars:

sudo ln -snf $HIVE_HOME/lib/hbase-annotations-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-annotations.jar

sudo ln -snf $HIVE_HOME/lib/hbase-common-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-common.jar

sudo ln -snf $HIVE_HOME/lib/hbase-protocol-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-protocol.jar

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hbase-1.2.0-cdh5.14.0/lib/hbase-client-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-client.jar

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so /usr/lib/impala/lib/libhadoop.so

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so.1.0.0 /usr/lib/impala/lib/libhadoop.so.1.0.0

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so /usr/lib/impala/lib/libhdfs.so

sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so.0.0.0 /usr/lib/impala/lib/libhdfs.so.0.0.0

 

Additional notes

If anything was missed and some jar symlinks still do not resolve, search the whole server for the jar; if it is not there, download one from Maven, put it in some directory, and symlink to it.

e.g. the hadoop-core.jar symlink turned out to be broken as well.

find / -name "hadoop-core*"  # locate the jar, then ln -s <source> <link>
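To list every dangling symlink under the Impala lib directory in one pass, a sketch using find -L (which follows links, so anything still reported as a link must have a missing target):

find -L /usr/lib/impala/lib -type l   # prints only the broken symlinks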

4.9 Start Impala

service impala-state-store start

service impala-catalog start

service impala-server start

 

Running service impala-state-store start fails:

[root@sunny conf]# service impala-state-store start

/etc/init.d/impala-state-store: line 34: /lib/lsb/init-functions: No such file or directory

/etc/init.d/impala-state-store: line 104: log_success_msg: command not found

 

You need to install redhat-lsb.

Download page: http://rpmfind.net/linux/rpm2html/search.php?query=redhat-lsb-core(x86-64)

Choose redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm.

 

 

 

After downloading, run:

rpm -ivh redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm

 

Start Impala again:

service impala-state-store start

service impala-catalog start

service impala-server start  # start impalad on every DataNode

 

Once all three services are up, Impala is ready to use.
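Before opening the shell, a quick check that all three daemons actually stayed up:

ps -ef | grep -E 'impalad|catalogd|statestored' | grep -v grep   # expect one line per daemon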

4.10 Use Impala

Enter the Impala shell:

impala-shell  # run from any directory; basic operations are the same as in Hive

impalad web UI:

http://172.20.71.66:25000/

 

statestored web UI:

http://172.20.71.66:25010/

 

catalogd web UI:

http://172.20.71.66:25020

Impala log files: /var/log/impala

Impala config files: /etc/default/impala

/etc/impala/conf  (core-site.xml, hdfs-site.xml, hive-site.xml)
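If a daemon fails to start or queries misbehave, the logs are the first place to look; Impala's glog output typically lands in files named like these:

tail -n 50 /var/log/impala/impalad.INFO   # likewise statestored.INFO and catalogd.INFO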

4.11 Troubleshooting the Impala installation

4.11.1 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Fix:

Reference: http://blog.csdn.net/l1028386804/article/details/51566262

Go to the conf directory of your Hive install and open hive-site.xml (or hive-default.xml.template if you never created hive-site.xml). Change

<property>
  <name>hive.metastore.schema.verification</name>
  <value>true</value>
  <description>
  Enforce metastore schema version consistency.
  True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
        schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
        proper metastore schema migration. (Default)
  False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
  </description>
</property>

to:

<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>(same description as above)</description>
</property>

 

Then restart Hive:

hive --service metastore &  

hive --service hiveserver2 &

 

4.11.2 Invalid short-circuit reads configuration:- Impala cannot read or execute the parent directory of dfs.domain.socket.path

Fix: create the directory that dfs.domain.socket.path in /etc/impala/conf/hdfs-site.xml points to:

 

 

 

mkdir -p /var/run/hdfs-sockets/dn
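For reference, a minimal short-circuit-read block for /etc/impala/conf/hdfs-site.xml that matches the directory created above; the property names are the standard HDFS ones, and the socket path is an assumption based on the directory we just made:

<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>
<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hdfs-sockets/dn</value>
</property>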

4.11.3 Impala INSERT fails: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

[localhost.localdomain:21000] > insert into user_test values('zs');

Query: insert into user_test values('zs')

Query submitted at: 2020-12-17 02:39:24 (Coordinator: http://sunny:25000)

ERROR: AnalysisException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

CAUSED BY: TableLoadingException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

 

Fix:

Impala is not referencing hadoop-azure-datalake.jar; add the missing symlink:

sudo ln -s /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

 

core-site.xml configuration reference: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

 

That wraps up the Hadoop, Hive, and Impala installation. All done!

 

-- ===================== Update 2020-12-28 =====================

ERROR: AnalysisException: Unable to INSERT into target table (Test.user_test) because Impala does not have WRITE access to at least one HDFS path: hdfs://172.20.71.66:9000/user/hive/warehouse/test.db/user_test
Fix:
vi /etc/selinux/config
SELINUX=disabled   # takes effect after a reboot; setenforce 0 applies immediately

vi /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml
Add:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <!-- changed here -->
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>

hadoop fs -chmod 777 /user/hive/warehouse/test.db/user_test

hive --service metastore &
hive --service hiveserver2 &

service impala-state-store restart
service impala-catalog restart
service impala-server restart

impala-shell
invalidate metadata;


impala-shell --auth_creds_ok_in_clear -l -i 172.20.71.66 -u APP   # reconnect with LDAP auth (plaintext credentials allowed) as user APP

-- ============================================================================
