problems_hive

1 Hive hangs while executing a statement

(screenshot of the hanging statement omitted)

Cause: YARN was not running. Hive submits MapReduce jobs to YARN to compute results, so the query hangs when YARN is unavailable.

RCA: when YARN was started earlier, jps was not run to confirm the daemons had actually come up. The startup had in fact failed; the exact reason is unknown:

[root@node01 ~]# start-yarn.sh
starting yarn daemons
resourcemanager running as process 3220. Stop it first.

solution: restart DFS, YARN, and the JobHistory server.
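A minimal restart sequence, assuming the standard Apache Hadoop 2.x helper scripts are on the PATH (script names may differ on other distributions):

# Stop the half-started YARN daemons, then bring everything back up.
stop-yarn.sh
stop-dfs.sh
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
# Verify ResourceManager and NodeManager are actually running before retrying.
jps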

2 Running a jar fails: no main manifest attribute, in udfFull.jar

In a Windows cmd window, run: java -jar udfFull.jar {"movie":"1287","rate":"5","timeStamp":"978302039","uid":"1"}
Error 1: no main manifest attribute, in udfFull.jar
Cause: the MANIFEST.MF file under the jar's META-INF folder does not declare an entry class, i.e., no main class is specified. MANIFEST.MF is simply a manifest file, roughly the equivalent of a Windows .ini configuration file; it holds metadata about the program.
Fix: declare an entry class in MANIFEST.MF by adding this line: Main-Class: TransferJsonUDF (TransferJsonUDF is the class containing the main method).

Running again raises another error: Could not find or load main class TransferJsonUDF
Cause: the entry class is not configured correctly.
Fix: use the fully qualified name: Main-Class: cn.itcast.hive.udf.TransferJsonUDF (i.e., prefix the class name with its package, cn.itcast.hive.udf.).
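For reference, a minimal MANIFEST.MF with the corrected entry might look like the sketch below (note that the jar tool may ignore the last line if it lacks a trailing newline):

Manifest-Version: 1.0
Main-Class: cn.itcast.hive.udf.TransferJsonUDF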

Running it again succeeds:
D:> java -jar udfFull.jar {"movie":"1287","rate":"5","timeStamp":"978302039","uid":"1"}
1287 5 978302039 1

Also, when running this command on Linux with a JSON argument, wrap the argument in single quotes ''; otherwise execution may fail. For example:
java -jar udfFull.jar '{"movie":"914","rate":"3","timeStamp":"978301968","uid":"1"}'

3 beeline fails to connect to the Hive database

beeline> !connect jdbc:hive2://cdh03:10000 root root org.apache.hive.jdbc.HiveDriver 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/develop/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://cdh03:10000
19/12/16 20:53:02 [main]: WARN jdbc.HiveConnection: Failed to connect to cdh03:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://cdh03:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate root (state=08S01,code=0)

RCA: Hadoop's proxy-user (superuser impersonation) settings were not configured.
Hadoop has an impersonation mechanism: upper-layer systems may not pass the actual end user straight through to Hadoop. Instead, the end user is handed off to a configured super proxy user, which performs the operation on Hadoop on the user's behalf; this keeps arbitrary clients from operating on Hadoop at will.
solution:
Add the following properties to Hadoop's configuration file core-site.xml:

<property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
</property>
<property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
</property>

Set the xxx in hadoop.proxyuser.xxx.hosts and hadoop.proxyuser.xxx.groups above to root, i.e., to whatever user appears after User: in your error log.
The value * means any user, group, and host may operate on Hadoop through the proxy user xxx. Restart HDFS afterwards.
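If a full restart is inconvenient, the proxy-user settings can usually be reloaded in place; a sketch, assuming Hadoop 2.x admin tooling:

# Push the updated proxyuser configuration to the NameNode and ResourceManager
# without restarting the daemons.
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration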
Reference: https://blog.csdn.net/qq_16633405/article/details/82190440
Note: the syntax for connecting to Hive with beeline is:
!connect jdbc:hive2://cdh01:10000 username password org.apache.hive.jdbc.HiveDriver
Using beeline has two prerequisites (a full start-and-connect sequence is sketched after this list):

  1. The Hive metastore database has already been created in MySQL.
  2. The hiveserver2 service has been started beforehand:
    • CDH start command: service hive-server2 start
    • Apache start command: nohup hive --service hiveserver2 2>&1 &
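A minimal end-to-end sketch for the Apache build, assuming hiveserver2 listens on its default port 10000 and the root/root credentials used above:

# Start hiveserver2 in the background and give it time to come up.
nohup hive --service hiveserver2 > hiveserver2.log 2>&1 &
sleep 30
# Connect directly from the command line instead of the beeline prompt.
beeline -u jdbc:hive2://cdh01:10000 -n root -p root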

4 Running the hive command fails

[root@cdh01 conf]# hive
Cannot find hadoop installation: $HADOOP_HOME or $HADOOP_PREFIX must be set or hadoop must be in the path
Cause: the HADOOP_HOME environment variable is not set.
Fix:
1. vim /etc/hive/conf/hive-env.sh
Add this line to the file: export HADOOP_HOME=/usr/lib/hadoop
Note: the CDH distribution's Hadoop home directory is /usr/lib/hadoop.
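A quick check that the fix took effect (a sketch; /usr/lib/hadoop is the CDH default assumed above):

# Export for the current shell session and verify hive can start.
export HADOOP_HOME=/usr/lib/hadoop
echo $HADOOP_HOME
hive -e "show databases;"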

5 Creating a Hive table integrated with HBase fails:

Statement executed:
create table course.hbase_score(id int, cname string, score int)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties("hbase.columns.mapping" = "cf:name,cf:score")
tblproperties("hbase.table.name" = "hbase_score");
Error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.client.HBaseAdmin.<init>(Lorg/apache/hadoop/conf/Configuration;)V
Troubleshooting steps:
1. Run hive -hiveconf hive.root.logger=DEBUG,console to get verbose logging, then rerun the create statement. It fails with this detailed error:

ERROR exec.DDLTask: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.<init>(Lorg/apache/hadoop/conf/Configuration;)V
	at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:115)
	at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:195)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:727)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:720)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
	at com.sun.proxy.$Proxy18.createTable(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2126)
	at com.sun.proxy.$Proxy18.createTable(Unknown Source)
	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:784)
	at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4202)
	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:311)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2054)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1750)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1503)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1287)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1277)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:226)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:175)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:389)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)  

The suspected cause was a Hive/HBase version incompatibility: Hive was the CDH build while HBase was the Apache open-source build. Installing the Apache open-source build of Hive on cdh03 resolved the problem.
Summary: cdh01 runs the CDH build of Hive; cdh03 runs the Apache build.
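To spot this kind of mismatch up front, compare the component versions against the compatibility matrix for your distribution (a sketch; both commands are available in standard builds):

hive --version
hbase version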

6 A Hive SQL statement fails

desc: a SQL statement joining two subqueries fails.
errorlog:

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Caused by: java.lang.IllegalStateException: Invalid input path hdfs://node1:8020/user/hive/warehouse/weblog.db/ods_click_stream_visit/datestr=20181101/visit

action&solution:
The log complains about an invalid input path, yet the path actually exists.
Hive's local mode turned out to be enabled, so it was disabled: set hive.exec.mode.local.auto=false;
On retry, the statement succeeded.
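To disable local mode permanently rather than per session, the same property can be set in hive-site.xml (a sketch; hive.exec.mode.local.auto is the standard property name):

<property>
    <name>hive.exec.mode.local.auto</name>
    <value>false</value>
</property>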

7 Creating a Hive user-defined function fails

desc:
The following statements fail:
add jar /develop/weblog/0210udf-1.0.jar;
create temporary function ua_parse as 'com.udf.UAParseUDF';
errorlog:

FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask. Invalid signature file digest for Manifest main attributes

solution:
Option 1. Delete the META-INF/*.RSA, META-INF/*.DSA and META-INF/*.SF signature files from the jar:

zip -d 0210udf-1.0.jar META-INF/*.RSA META-INF/*.DSA META-INF/*.SF
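To confirm the signature files are gone (a sketch using standard zip tooling):

# List any remaining signature entries; no output means the jar is clean.
unzip -l 0210udf-1.0.jar | grep -E 'META-INF/.*\.(RSA|DSA|SF)'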

Option 2. Exclude those files at packaging time with the maven-shade-plugin:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <minimizeJar>true</minimizeJar>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
      </configuration>
    </execution>
  </executions>
</plugin>
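After either fix, rebuild and re-register the function; a sketch, assuming the same jar path and class name as above:

# Rebuild with the shade-plugin exclusions, then retry the registration in hive.
mvn clean package
hive -e "add jar /develop/weblog/0210udf-1.0.jar; create temporary function ua_parse as 'com.udf.UAParseUDF';"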

8 beeline fails to connect to Hive

errorlog:

java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
User: root is not allowed to impersonate anonymous

solution: edit the Hadoop configuration file .../etc/hadoop/core-site.xml and add the following properties (the same proxy-user fix as in problem 3; the refresh commands shown there avoid a full restart):

<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
