Several Ways to Map HBase Tables in Hive
1. Hive internal (managed) table. The statement is as follows:
CREATE TABLE ods.s01_buyer_calllogs_info_ts(
  key string comment "hbase rowkey",
  buyer_mobile string comment "mobile number",
  contact_mobile string comment "contact's mobile number",
  call_date string comment "time of occurrence",
  call_type string comment "call type",
  init_type string comment "0 = called party, 1 = caller",
  other_cell_phone string comment "contact's mobile number",
  place string comment "call location",
  start_time string comment "time of occurrence",
  subtotal string comment "call charge",
  use_time string comment "call duration (seconds)"
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,record:buyer_mobile,record:contact_mobile,record:call_date,record:call_type,record:init_type,record:other_cell_phone,record:place,record:start_time,record:subtotal,record:use_time")
TBLPROPERTIES("hbase.table.name" = "s01_buyer_calllogs_info_ts");
After the table is created, running list in the hbase shell shows the table s01_buyer_calllogs_info_ts. When this table is dropped in Hive, the HBase table is dropped as well.
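The behavior described above can be checked from the Hive side with a sketch like the following. It uses only the table and column names from the statement above; the LIMIT value is arbitrary. Because DROP TABLE here also removes the HBase table, treat the last statement as illustrative and do not run it against data you need.

```sql
-- Show table metadata; the table type should be reported as MANAGED_TABLE.
DESCRIBE FORMATTED ods.s01_buyer_calllogs_info_ts;

-- Read a few rows through the Hive mapping.
SELECT key, buyer_mobile, contact_mobile, use_time
FROM ods.s01_buyer_calllogs_info_ts
LIMIT 10;

-- Dropping the Hive table also drops the HBase table s01_buyer_calllogs_info_ts.
DROP TABLE ods.s01_buyer_calllogs_info_ts;
```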
2. Hive external table. The statements are as follows:
First, in the hbase shell:
create 'buyer_calllogs_info_ts', 'record', {SPLITS_FILE => 'hbase_calllogs_splits.txt'}

Then, in Hive:
CREATE EXTERNAL TABLE ods.s10_buyer_calllogs_info_ts(
  key string comment "hbase rowkey",
  buyer_mobile string comment "mobile number",
  contact_mobile string comment "contact's mobile number",
  call_date string comment "time of occurrence",
  call_type string comment "call type",
  init_type string comment "0 = called party, 1 = caller",
  other_cell_phone string comment "contact's mobile number",
  place string comment "call location",
  start_time string comment "time of occurrence",
  subtotal string comment "call charge",
  use_time string comment "call duration (seconds)"
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,record:buyer_mobile,record:contact_mobile,record:call_date,record:call_type,record:init_type,record:other_cell_phone,record:place,record:start_time,record:subtotal,record:use_time")
TBLPROPERTIES("hbase.table.name" = "buyer_calllogs_info_ts");
With this approach, the table must be created in HBase first and then in Hive. Dropping the table in Hive leaves the HBase table unchanged.
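Once the external mapping exists, the HBase data can be queried like any other Hive table. The sketch below is one possible query, not part of the original setup: it assumes use_time holds numeric text, and the aliases call_count and total_seconds are made up for illustration.

```sql
-- Total call duration per buyer; use_time is mapped as a string, so cast it first.
SELECT buyer_mobile,
       COUNT(*) AS call_count,
       SUM(CAST(use_time AS BIGINT)) AS total_seconds
FROM ods.s10_buyer_calllogs_info_ts
GROUP BY buyer_mobile;

-- Dropping the external table removes only the Hive metadata;
-- the HBase table buyer_calllogs_info_ts is left in place.
DROP TABLE ods.s10_buyer_calllogs_info_ts;
```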
3. Mapping an HBase column family in Hive
CREATE TABLE hbase_table_1(value map<string,int>, row_key int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = "cf:,:key"
);

INSERT OVERWRITE TABLE hbase_table_1
SELECT map(bar, foo), foo FROM pokes WHERE foo=98 OR foo=100;
View the result in HBase:
hbase(main):012:0> scan "hbase_table_1"
ROW                          COLUMN+CELL
 100                         column=cf:val_100, timestamp=1267739509194, value=100
 98                          column=cf:val_98, timestamp=1267739509194, value=98
2 row(s) in 0.0080 seconds
View the result in Hive:
hive> select * from hbase_table_1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
...
OK
{"val_100":100}	100
{"val_98":98}	98
Time taken: 3.808 seconds
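Because the whole column family arrives in Hive as a single map column, individual qualifiers can be read with ordinary map syntax. The following is a minimal sketch against hbase_table_1; the aliases val_98, cells, qualifier and cell_value are illustrative names, not part of the original example.

```sql
-- Look up a single qualifier in the column family by its map key.
SELECT row_key, value['val_98'] AS val_98
FROM hbase_table_1;

-- Turn each (qualifier, value) pair of the column family into its own row.
SELECT row_key, qualifier, cell_value
FROM hbase_table_1
LATERAL VIEW explode(value) cells AS qualifier, cell_value;
```

Rows that lack the requested qualifier return NULL for the map lookup, which is why the second form is often more convenient when the qualifiers are not known in advance.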
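Which of the two approaches (internal or external table) to use depends on your requirements; see the official documentation for details.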