IV. Integrating Hive with HBase
Environment:
HBase version: hbase-1.4.0-bin.tar.gz
Hive version: apache-hive-1.2.1-bin.tar
Note: use a reasonably recent HBase release. Otherwise, even if the Hive-HBase association is set up successfully, queries fail at execution time with the error "The connection has to be unmanaged".
Integrating Hive with HBase essentially means using Hive to run HQL statements against data stored in HBase.
1. Copy the HBase jars into Hive's lib directory, which gives Hive access to the HBase API.
The jars to copy from HBase into Hive's lib directory are:
hbase-protocol-1.4.0.jar
hbase-server-1.4.0.jar
hbase-client-1.4.0.jar
hbase-common-1.4.0.jar
hbase-common-1.4.0-tests.jar
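The copy step above can be scripted. A minimal sketch, assuming HBASE_HOME and HIVE_HOME point at the two installations (the default paths below are placeholders, not verified against your layout):

```shell
# Step 1 sketch: copy the five HBase jars into Hive's lib directory.
# The HBASE_HOME / HIVE_HOME defaults are assumptions; adjust to your setup.
HBASE_HOME=${HBASE_HOME:-/home/hadoop/apps/hbase-1.4.0}
HIVE_HOME=${HIVE_HOME:-/home/hadoop/apps/hive}

for jar in hbase-protocol-1.4.0.jar hbase-server-1.4.0.jar \
           hbase-client-1.4.0.jar hbase-common-1.4.0.jar \
           hbase-common-1.4.0-tests.jar; do
  if [ -f "$HBASE_HOME/lib/$jar" ]; then
    cp "$HBASE_HOME/lib/$jar" "$HIVE_HOME/lib/"
  else
    echo "not found (check HBASE_HOME): $jar"
  fi
done
```

Keeping the versions in one list like this makes it easy to adjust when upgrading HBase.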
2. Edit the hive-site.xml file
<configuration>
  <!-- JDBC URL of the MySQL database backing the Hive metastore -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <!-- MySQL JDBC driver -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- MySQL username -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <!-- MySQL password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
  </property>
  <!-- jars Hive needs in order to talk to HBase -->
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///home/hadoop/apps/hive/lib/hive-hbase-handler-1.2.1.jar,file:///home/hadoop/apps/hive/lib/hbase-protocol-1.4.0.jar,file:///home/hadoop/apps/hive/lib/hbase-server-1.4.0.jar,file:///home/hadoop/apps/hive/lib/hbase-client-1.4.0.jar,file:///home/hadoop/apps/hive/lib/hbase-common-1.4.0.jar,file:///home/hadoop/apps/hive/lib/hbase-common-1.4.0-tests.jar,file:///home/hadoop/apps/hive/lib/zookeeper-3.4.6.jar,file:///home/hadoop/apps/hive/lib/guava-14.0.1.jar</value>
  </property>
  <!-- ZooKeeper quorum holding HBase's metadata -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop2,hadoop3,hadoop4</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
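A missing jar in hive.aux.jars.path is a common cause of "class not found" failures when Hive touches HBase, so it is worth sanity-checking that every referenced jar exists on disk. A sketch, using the same (assumed) /home/hadoop/apps/hive/lib path as the config above:

```shell
# Verify that each jar referenced by hive.aux.jars.path is present.
# The lib path is an assumption taken from the hive-site.xml example.
checked=0
for f in hive-hbase-handler-1.2.1.jar hbase-protocol-1.4.0.jar \
         hbase-server-1.4.0.jar hbase-client-1.4.0.jar \
         hbase-common-1.4.0.jar hbase-common-1.4.0-tests.jar \
         zookeeper-3.4.6.jar guava-14.0.1.jar; do
  checked=$((checked+1))
  [ -f "/home/hadoop/apps/hive/lib/$f" ] || echo "missing: $f"
done
echo "checked $checked jars"
```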
3. Associate Hive with a table that already exists in HBase
CREATE EXTERNAL TABLE IF NOT EXISTS notebook (
    row string,
    nl  string,
    ct  string,
    nbn string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,nbi:nl,nbi:ct,nbi:nbn')
TBLPROPERTIES ('hbase.table.name' = 'nb');
Notes:
hbase.columns.mapping — each entry corresponds to an HBase column in family:qualifier form; :key is a fixed token that maps to the HBase rowkey.
hbase.table.name — the name of the table in HBase.
org.apache.hadoop.hive.hbase.HBaseStorageHandler — the storage handler used.
With this, Hive is associated with the table that already exists in HBase.
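The DDL above assumes the HBase table nb with column family nbi already exists. If it does not, it can be created from the HBase shell first. A guarded sketch (a no-op on machines where the hbase command is not on PATH):

```shell
# Create the HBase table 'nb' with column family 'nbi' that the Hive
# external table maps onto. Guarded so the snippet degrades gracefully
# where HBase isn't installed.
HBASE_BIN=$(command -v hbase || true)
if [ -n "$HBASE_BIN" ]; then
  echo "create 'nb', 'nbi'" | "$HBASE_BIN" shell
else
  echo "hbase not on PATH; create the table from an HBase node instead"
fi
```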
4. Two ways to start Hive:
4.1 hive --hiveconf hbase.zookeeper.quorum=hadoop2,hadoop3,hadoop4
4.2
4.2.1 Start hiveserver2
4.2.2 beeline --hiveconf hbase.zookeeper.quorum=hadoop2,hadoop3,hadoop4
4.2.3 !connect jdbc:hive2://localhost:10000
That completes the Hive-HBase association.
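To confirm the association end to end, query the mapped table through hiveserver2. A sketch assuming hiveserver2 is listening on localhost:10000, guarded so it is a no-op where beeline is not installed:

```shell
# Smoke test: select a few rows from the Hive table backed by HBase.
# Assumes hiveserver2 is already running on localhost:10000.
BEELINE=$(command -v beeline || true)
if [ -n "$BEELINE" ]; then
  "$BEELINE" -u jdbc:hive2://localhost:10000 -e "SELECT * FROM notebook LIMIT 5;"
else
  echo "beeline not on PATH; run this from the Hive node"
fi
```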
5. Using Java to connect to Hive and query HBase-backed tables
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>cn.itcast.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.4</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>1.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>1.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-metastore</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>1.2.1</version>
    </dependency>
  </dependencies>
</project>
Hive_Hbase.java
package cn.itcast.bigdata.hbase;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Hive_Hbase {
    public static void main(String[] args) {
        try {
            // Register the Hive JDBC driver
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Connect to hiveserver2: database shizhan02, user hadoop, empty password.
            // try-with-resources closes the connection, statement, and result set.
            try (Connection connection = DriverManager.getConnection(
                         "jdbc:hive2://hadoop1:10000/shizhan02", "hadoop", "");
                 Statement statement = connection.createStatement();
                 ResultSet res = statement.executeQuery("SELECT * FROM hive_hbase_table_kv")) {
                while (res.next()) {
                    System.out.println(res.getString(2));
                }
            }
        } catch (ClassNotFoundException | SQLException e) {
            e.printStackTrace();
        }
    }
}