Integrating Hive with HBase and Solving Common Problems
Original: http://www.cnblogs.com/kivi/p/3224880.html
Versions
Hadoop 1.0.1
HBase 0.94.9
Hive 0.8.1
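I. Hive over HBase
1. Copy hbase-0.94.9.jar, zookeeper-3.4.5.jar, and protobuf-java-2.4.0a.jar into hive/lib.
Note: if hive/lib already contains other versions of these files (for example zookeeper-3.3.1.jar), delete them and use the versions shipped with HBase.
2. Edit hive-site.xml under hive/conf (the original post highlighted the changed part in red). The setting that matters for HBase is hive.aux.jars.path, which lists hive-hbase-handler-0.8.1.jar and the three jars from step 1.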
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.111:3306/hive?characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/home/hive-0.8.1/logs</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///home/hive-0.8.1/lib/hive-hbase-handler-0.8.1.jar,file:///home/hive-0.8.1/lib/hbase-0.94.9.jar,file:///home/hive-0.8.1/lib/zookeeper-3.4.5.jar,file:///home/hive-0.8.1/lib/protobuf-java-2.4.0a.jar</value>
  </property>
</configuration>
3. Copy hbase-0.94.9.jar into hadoop/lib on every Hadoop node (including the master).
4. Copy hbase-site.xml from hbase/conf into hadoop/conf on every Hadoop node (including the master).
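The copy steps (1, 3 and 4) can be scripted. The following is only a sketch under stated assumptions: the function names install_hive_jars and distribute_to_nodes, the hosts file, the use of scp, and every path passed in are illustrative and not from the original post. Setting DRY_RUN=1 prints the scp commands instead of running them.

```shell
#!/bin/sh
# Sketch of steps 1, 3 and 4. All directories and host names are supplied by
# the caller; none of them are taken from the original post.

# Step 1: copy the given jars into hive/lib, first removing other versions of
# the same library (for example zookeeper-3.3.1.jar).
# Usage: install_hive_jars <hive_lib_dir> <jar>...
install_hive_jars() {
    hive_lib=$1
    shift
    for jar in "$@"; do
        name=$(basename "$jar")      # e.g. zookeeper-3.4.5.jar
        prefix=${name%-*}            # e.g. zookeeper (version suffix stripped)
        for old in "$hive_lib/$prefix"-[0-9]*.jar; do
            [ -e "$old" ] || continue                      # glob matched nothing
            [ "$(basename "$old")" = "$name" ] || rm -f "$old"
        done
        cp "$jar" "$hive_lib/"
    done
}

# Steps 3 and 4: push the HBase jar into hadoop/lib and hbase-site.xml into
# hadoop/conf on every host listed (one per line) in <hosts_file>.
# Usage: distribute_to_nodes <hosts_file> <hbase_jar> <hbase_site_xml> <remote_hadoop_home>
distribute_to_nodes() {
    hosts_file=$1
    hbase_jar=$2
    hbase_site=$3
    remote_home=$4
    while read -r host; do
        [ -n "$host" ] || continue                         # skip blank lines
        for pair in "$hbase_jar:lib" "$hbase_site:conf"; do
            src=${pair%:*}
            dest="$host:$remote_home/${pair##*:}/"
            if [ "${DRY_RUN:-0}" = 1 ]; then
                echo "scp $src $dest"
            else
                scp "$src" "$dest" < /dev/null             # keep scp off the loop's stdin
            fi
        done
    done < "$hosts_file"
}
```

A run might look like `install_hive_jars /home/hive-0.8.1/lib "$HBASE_HOME"/hbase-0.94.9.jar "$HBASE_HOME"/lib/zookeeper-3.4.5.jar "$HBASE_HOME"/lib/protobuf-java-2.4.0a.jar`, where HBASE_HOME is whatever your installation uses. Remember to include the master in the hosts file.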
Ok... Integration is done!
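A quick way to check the setup is to create and drop a small HBase-backed table from Hive. The sketch below only writes the HiveQL to a file. The table name hbase_smoke_test, the column family cf1, and the helper name write_smoke_test are placeholders invented for this example. It exercises the storage handler and the client-side classpath only, not a MapReduce job.

```shell
#!/bin/sh
# Write a small HiveQL smoke test to the given file. Table and column family
# names are placeholders for this sketch, not from the original post.
write_smoke_test() {
    out_file=$1
    cat > "$out_file" <<'EOF'
CREATE TABLE hbase_smoke_test(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hbase_smoke_test");
DESCRIBE hbase_smoke_test;
DROP TABLE hbase_smoke_test;
EOF
}

# Example (needs a running Hive and HBase):
#   write_smoke_test /tmp/hbase_smoke_test.q && hive -f /tmp/hbase_smoke_test.q
```

If the CREATE TABLE statement fails with a class-not-found error, the jars listed in hive.aux.jars.path are the first thing to recheck.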
II. Common Problems and Solutions
Error message:
Exception in thread "Thread-54" java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://slave3:50060/tasklog?taskid=attempt_201212192008_0014_m_000000_3&start=-8193
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at java.net.URL.openStream(URL.java:1010)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
    ... 3 more
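Resolution steps
1. Inspect the underlying exception. Take the URL from your own stack trace, for example:
http://master.hadoop:50060/tasklog?taskid=attempt_201307301142_0001_m_000000_0&start=-8193
Change the taskid parameter to attemptid and open the resulting URL:
http://master.hadoop:50060/tasklog?attemptid=attempt_201307301142_0001_m_000000_0&start=-8193
2. The page shows the actual exception. In my case it was java.lang.NoClassDefFoundError: com/google/protobuf/Message, which clearly means a jar is missing. The class is in protobuf-java-2.4.0a.jar, which can be found among the jars shipped with HBase.
3. Put the jar into the lib directories of both Hive and Hadoop, then add it to hive.aux.jars.path in conf/hive-site.xml (if the property already lists other jars, append this one to the comma-separated value):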
<property>
<name>hive.aux.jars.path</name>
<value>file:///usr/java/hive-0.8.1/lib/protobuf-java-2.4.0a.jar</value>
</property>
4. Restart MapReduce
stop-mapred.sh
start-mapred.sh
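The URL rewrite from step 1 can be done on the command line. This is a minimal sketch: the helper name fix_tasklog_url is made up for illustration, and using curl to fetch the page is an assumption (any HTTP client or a browser works).

```shell
#!/bin/sh
# Rewrite a task-log URL from the stack trace so that it uses attemptid=
# instead of taskid=, as described in step 1.
fix_tasklog_url() {
    printf '%s\n' "$1" | sed 's/\([?&]\)taskid=/\1attemptid=/'
}

# Example (fetching requires network access to the TaskTracker):
#   curl -s "$(fix_tasklog_url 'http://slave3:50060/tasklog?taskid=attempt_201212192008_0014_m_000000_3&start=-8193')"
```

Quote the URL in single quotes when passing it in, otherwise the shell treats the & as a background operator.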