Hive Over HBase

1. Create a test table in HBase

hbase(main):003:0> create 'test_hive_over_hbase','f'
0 row(s) in 2.5810 seconds

hbase(main):004:0> put 'test_hive_over_hbase','1001','f:DATA','2012|shaochen'
0 row(s) in 0.2010 seconds

hbase(main):005:0> put 'test_hive_over_hbase','1002','f:DATA','2010|dachao'
0 row(s) in 0.0100 seconds

hbase(main):006:0> put 'test_hive_over_hbase','2001','f:DATA','2013|qiuxin'
0 row(s) in 0.0090 seconds
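
Each of the three cells written above holds two fields joined by a `|`. The following is a minimal Python sketch of how such a value breaks apart. The names `split_value`, `prefix` and `suffix` are illustrative only, since the post does not name the fields.

```python
def split_value(value):
    """Split a 'prefix|suffix' cell value at the first '|' (illustrative helper)."""
    prefix, suffix = value.split("|", 1)
    return prefix, suffix

# One of the values stored with `put` in the HBase shell session above.
parsed = split_value("2013|qiuxin")
```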

2. Create the table in Hive

CREATE EXTERNAL TABLE hbase_test_hive_over_hbase(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:DATA")
TBLPROPERTIES ("hbase.table.name" = "test_hive_over_hbase");

Note: because the table test_hive_over_hbase already exists in HBase, the Hive table must be declared EXTERNAL.
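
The entries in "hbase.columns.mapping" line up with the Hive columns by position: `:key` stands for the HBase row key, and `f:DATA` is column family `f` with qualifier `DATA`. The Python sketch below only models that pairing for this table. It is not how Hive implements it, and `pair_columns` is a hypothetical helper name.

```python
# Hive columns as declared in the CREATE EXTERNAL TABLE statement.
hive_columns = ["key", "value"]
# The value of "hbase.columns.mapping" from the same statement.
columns_mapping = ":key,f:DATA"

def pair_columns(hive_columns, columns_mapping):
    """Pair each Hive column with its HBase source by position (illustrative only)."""
    entries = columns_mapping.split(",")
    if len(entries) != len(hive_columns):
        raise ValueError("the mapping needs exactly one entry per Hive column")
    pairs = {}
    for column, entry in zip(hive_columns, entries):
        if entry == ":key":
            # ':key' refers to the HBase row key, not to a column family.
            pairs[column] = ("row key", None)
        else:
            family, qualifier = entry.split(":", 1)
            pairs[column] = (family, qualifier)
    return pairs

mapping = pair_columns(hive_columns, columns_mapping)
```

Here `mapping["key"]` points at the row key and `mapping["value"]` at `("f", "DATA")`, which is why the Hive column `value` returns the strings stored under `f:DATA`.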

3. Run a statistical query on the HBase test table from Hive

select count(*) from hbase_test_hive_over_hbase where substring(value,0,4)='2013';

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201312080251_0001, Tracking URL = http://jfp4-2:50030/jobdetails.jsp?jobid=job_201312080251_0001
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -Dmapred.job.tracker=jfp4-2:54311 -kill job_201312080251_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-12-10 08:24:27,257 Stage-1 map = 0%,  reduce = 0%
2013-12-10 08:24:31,305 Stage-1 map = 100%,  reduce = 0%
2013-12-10 08:24:40,391 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201312080251_0001
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   HDFS Read: 24046 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
1
Time taken: 22.588 seconds
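
As a sanity check on the result `1`, the same filter can be replayed locally over the three rows inserted in step 1. This is a rough Python sketch, not something Hive runs. `count_by_prefix` is a hypothetical name, and `value[:4]` stands in for `substring(value,0,4)`, which in this run evidently returned the first four characters.

```python
# Row key -> f:DATA value, copied from the `put` commands in step 1.
rows = {
    "1001": "2012|shaochen",
    "1002": "2010|dachao",
    "2001": "2013|qiuxin",
}

def count_by_prefix(rows, prefix):
    """Count the values whose first four characters equal `prefix`."""
    return sum(1 for value in rows.values() if value[:4] == prefix)

count_2013 = count_by_prefix(rows, "2013")
print(count_2013)
```

Only row 2001 starts with 2013, which matches the single row counted by the Hive job.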

posted on 2013-12-10 15:37 by littlesuccess