Installing Hive

Part 1: Preparation

Environment: Workstation 11 + CentOS 7 + hadoop-2.7.7 + mysql 5.6.40 + hive 2.3.3

First install MySQL (steps covered in a separate post). Then install Hive.
Part 2: Installing Hive

1. Start the Hadoop cluster and take HDFS out of safe mode
[root@hadoop ~]# start-all.sh
...
[root@hadoop ~]# hdfs dfsadmin -safemode leave
Safe mode is OFF
2. Upload the Hive tarball, extract it, rename it, and set environment variables

[root@hadoop ~]# cd /usr/local/
[root@hadoop local]# tar xzvf apache-hive-2.3.3-bin.tar.gz    # extract
[root@hadoop local]# mv apache-hive-2.3.3-bin hive            # rename
[root@hadoop local]# vi /etc/profile                          # configure environment variables
# add the line:            export HIVE_HOME=/usr/local/hive
# append to the PATH line: :$HIVE_HOME/bin
[root@hadoop local]# source /etc/profile                      # apply the changes
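The two profile additions can be sketched and sanity-checked as below. This is a minimal, self-contained sketch that re-exports the variables directly rather than sourcing /etc/profile; the /usr/local/hive path is the one assumed in the steps above.

```shell
# Sketch of the two lines appended to /etc/profile (or ~/.bash_profile).
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin

# Sanity check: confirm Hive's bin directory actually landed on PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "PATH OK" ;;
  *)                    echo "PATH is missing $HIVE_HOME/bin" ;;
esac
```

If the check fails after editing the real profile, the usual cause is forgetting to run source /etc/profile in the current session.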
Note:
If Hadoop runs under a dedicated hadoop user, switch to that user and edit its own environment file: vi ~/.bash_profile
If Hadoop runs as root, edit the system-wide file: vi /etc/profile

Once configured, check the Hive version:
[root@hadoop local]# hive --version
Hive 2.3.3
Git git://daijymacpro-2.local/Users/daijy/commit/hive -r 8a511e3f79b43d4be41cd231cf5c99e43b248383
Compiled by daijy on Wed Mar 28 16:58:33 PDT 2018
From source with checksum 8873bba6c55a058614e74c0e628ab022
3. In /usr/local/hive/conf, set up Hive's four configuration files

First: hive-env.sh
[root@hadoop ~]# cd /usr/local/hive/conf
[root@hadoop conf]# cp hive-env.sh.template hive-env.sh
[root@hadoop conf]# vi hive-env.sh    # add these variables
export JAVA_HOME=/usr/java
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=/usr/local/hive/conf
Second: hive-site.xml

First, update the MySQL connection settings. Since this is an XML file, any & in the JDBC URL must be written as &amp; or the file will not parse:

[root@hadoop conf]# cp hive-default.xml.template hive-site.xml    # note the name change
[root@hadoop conf]# vi hive-site.xml    # edit the existing properties in place -- do not paste new copies!
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
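A raw & in the ConnectionURL silently breaks the XML and Hive falls back to defaults. A quick self-contained check, run here against a stub snippet (point the same pipeline at /usr/local/hive/conf/hive-site.xml on a real install):

```shell
# Stub standing in for hive-site.xml; the real check targets that file.
cat > site-snippet.xml <<'EOF'
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
EOF

# Strip the predefined XML entities, then count any '&' left over;
# a non-zero count means an unescaped ampersand that breaks parsing.
n=$(sed -e 's/&amp;//g' -e 's/&lt;//g' -e 's/&gt;//g' -e 's/&quot;//g' -e 's/&apos;//g' \
      site-snippet.xml | tr -cd '&' | wc -c)
echo "unescaped ampersands: $n"
```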
Then create the scratch and log directories and update the corresponding properties:

[root@hadoop conf]# mkdir -p /tmp/hive/local /tmp/hive/resources /tmp/hive/querylog /tmp/hive/operation_logs    # create the directories
[root@hadoop conf]# vi hive-site.xml    # again, edit the existing properties -- do not add new copies!
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive/local</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive/resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/tmp/hive/querylog</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/tmp/hive/operation_logs</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
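Instead of editing each value by hand, a common alternative is to substitute the template's ${system:java.io.tmpdir} and ${system:user.name} placeholders with sed (these unexpanded placeholders are what cause the startup error in Part 3). A minimal sketch against a stub line; on a real install, back up hive-site.xml first and target it instead. GNU sed's -i flag is assumed.

```shell
# Stub line standing in for one of the placeholder-bearing defaults.
cat > placeholder-demo.xml <<'EOF'
<value>${system:java.io.tmpdir}/${system:user.name}</value>
EOF

# Replace both placeholders with concrete paths ('#' as the sed
# delimiter avoids clashing with the '/' inside the paths).
sed -i 's#${system:java.io.tmpdir}#/tmp/hive#g; s#${system:user.name}#root#g' placeholder-demo.xml
cat placeholder-demo.xml   # → <value>/tmp/hive/root</value>
```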
Third: hive-log4j2.properties
[root@hadoop conf]# cp hive-log4j2.properties.template hive-log4j2.properties
Fourth: hive-exec-log4j2.properties
[root@hadoop conf]# cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
4. Initialize the Hive metastore database

First copy the MySQL JDBC driver into hive/lib (here, mysql-connector-java-5.1.46-bin.jar was uploaded to /usr/local/hive/lib).
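If the driver jar is missing from hive/lib, schematool fails because it cannot load com.mysql.jdbc.Driver, so it is worth checking before running it. A self-contained sketch against a stub directory (on the real machine, point LIBDIR at /usr/local/hive/lib):

```shell
# Stub lib directory standing in for /usr/local/hive/lib.
LIBDIR=demo-lib
mkdir -p "$LIBDIR"
touch "$LIBDIR/mysql-connector-java-5.1.46-bin.jar"

# The glob matches only if a connector jar is actually present.
if ls "$LIBDIR"/mysql-connector-java-*.jar >/dev/null 2>&1; then
  echo "JDBC driver present"
else
  echo "JDBC driver missing -- schematool will fail to load com.mysql.jdbc.Driver"
fi
```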
[root@hadoop conf]# mysql -uroot -proot    # log in to MySQL
mysql> create user 'hive' identified by 'hive';    # optional: valid MySQL, but redundant -- the GRANT below also creates the user if it does not exist
mysql> grant all privileges on *.* to hive@"%" identified by "hive" with grant option;    # create the hive user and allow remote connections (hive-site.xml connects as this user)
Query OK, 0 rows affected (0.04 sec)
mysql> flush privileges;    # reload the grant tables
Query OK, 0 rows affected (0.09 sec)
mysql> exit
Bye
[root@hadoop conf]# cd ..
[root@hadoop hive]# bin/schematool -initSchema -dbType mysql    # initialize the metastore schema
Metastore connection URL:        jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&characterEncoding=UTF-8&useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       hive
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
5. Start Hive -- success!

[root@hadoop hive]# hive
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> exit;
[root@hadoop hive]#
6. Create the HDFS directories Hive uses

(Note: Hive's default hive.metastore.warehouse.dir is /user/hive/warehouse; if you keep the /usr/hive/warehouse path used below, point that property at it in hive-site.xml.)

[root@hadoop hive]# hdfs dfs -mkdir /usr
[root@hadoop hive]# hdfs dfs -mkdir /usr/hive/
[root@hadoop hive]# hdfs dfs -mkdir -p /usr/hive/warehouse
[root@hadoop hive]# hdfs dfs -chmod g+w /usr/hive/warehouse
[root@hadoop hive]# hdfs dfs -mkdir /tmp
[root@hadoop hive]# hdfs dfs -chmod g+w /tmp
[root@hadoop hive]# hdfs dfs -ls -R /    # verify
drwxrwxr-x   - root supergroup          0 2018-07-27 14:49 /tmp
drwxr-xr-x   - root supergroup          0 2018-07-27 14:44 /usr
drwxr-xr-x   - root supergroup          0 2018-07-27 14:45 /usr/hive
drwxrwxr-x   - root supergroup          0 2018-07-27 14:45 /usr/hive/warehouse
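hdfs dfs -chmod accepts the same symbolic modes as POSIX chmod, so the g+w above grants group write exactly as it would on a local directory. A local analogue of the same permission change, as a self-contained sketch:

```shell
# Local analogue of: hdfs dfs -mkdir -p ... && hdfs dfs -chmod g+w ...
mkdir -p warehouse-demo
chmod g+w warehouse-demo
ls -ld warehouse-demo   # with umask 022, the mode column reads drwxrwxr-x
```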
Part 3: Errors encountered

1. Metastore initialization error #1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Cause: conflicting SLF4J binding JARs on the classpath.
Fix: remove (or rename) one of the two JARs:
[root@hadoop hive]# mv /usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar /usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar_bak
2. Metastore initialization error #2
Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:       APP
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Error: Syntax error: Encountered "<EOF>" at line 1, column 64. (state=42X01,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
Cause: first confirm the file is actually named hive-site.xml; under any other name it is ignored. Then verify the MySQL connection properties were edited correctly. My mistake: I pasted the four properties copied from the web at the top of the file, but the file already contains those same four properties with default values, so the later defaults overrode my pasted values -- note the Derby connection URL in the output above, which shows my MySQL settings never took effect.
Fix: locate each of the four existing properties in the file and edit its value in place; do not paste new copies. The file is large, so it can help to download it, edit it locally with Notepad++, and upload it back to overwrite the original.
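The pasted-duplicate mistake described above is easy to detect: each property name should appear exactly once in hive-site.xml. A self-contained demonstration on a stub file that reproduces the duplicate (the hypothetical dup-demo.xml stands in for the real hive-site.xml):

```shell
# Stub reproducing the mistake: a pasted copy at the top plus the file's
# original default, whose Derby value is what schematool ended up using.
cat > dup-demo.xml <<'EOF'
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>
EOF

# Count occurrences of the property name; anything above 1 is a duplicate.
grep -c '<name>javax.jdo.option.ConnectionURL</name>' dup-demo.xml   # → 2
```

Run the same grep -c against the real file for each of the four connection properties; a count of 2 confirms this exact problem.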
3. Error when starting Hive

[root@hadoop local]# hive
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
	at org.apache.hadoop.fs.Path.initialize(Path.java:205)
	at org.apache.hadoop.fs.Path.<init>(Path.java:171)
	at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:659)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:582)
	at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:549)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
	at java.net.URI.checkPath(URI.java:1823)
	at java.net.URI.<init>(URI.java:745)
	at org.apache.hadoop.fs.Path.initialize(Path.java:202)
	... 12 more
Fix: Hive cannot expand the ${system:java.io.tmpdir}/${system:user.name} placeholders left in the template's defaults. Create the directories and edit hive-site.xml exactly as in installation step 3 above:

[root@hadoop conf]# mkdir -p /tmp/hive/local /tmp/hive/resources /tmp/hive/querylog /tmp/hive/operation_logs
[root@hadoop conf]# vi hive-site.xml    # edit the existing properties -- do not add new copies!

That is, set hive.exec.scratchdir to /tmp/hive, hive.exec.local.scratchdir to /tmp/hive/local, hive.downloaded.resources.dir to /tmp/hive/resources, hive.querylog.location to /tmp/hive/querylog, and hive.server2.logging.operation.log.location to /tmp/hive/operation_logs.