Hive Over HBase
1. Create a test table in HBase
hbase(main):003:0> create 'test_hive_over_hbase','f'
0 row(s) in 2.5810 seconds
hbase(main):004:0> put 'test_hive_over_hbase','1001','f:DATA','2012|shaochen'
0 row(s) in 0.2010 seconds
hbase(main):005:0> put 'test_hive_over_hbase','1002','f:DATA','2010|dachao'
0 row(s) in 0.0100 seconds
hbase(main):006:0> put 'test_hive_over_hbase','2001','f:DATA','2013|qiuxin'
0 row(s) in 0.0090 seconds
2. Create a table in Hive
CREATE EXTERNAL TABLE hbase_test_hive_over_hbase(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:DATA")
TBLPROPERTIES ("hbase.table.name" = "test_hive_over_hbase");
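Note: because the table test_hive_over_hbase already exists in HBase, the Hive table must be declared EXTERNAL. In the column mapping, :key binds the HBase row key to the Hive column key, and f:DATA binds the DATA qualifier in column family f to the Hive column value.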
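The column mapping can be pictured with a small Python model. This is only an illustrative sketch, not how HBaseStorageHandler is implemented: the dictionary `hbase_rows` and the helper `to_hive_rows` are hypothetical names, and the data is the three rows inserted in step 1. Each entry in the mapping string is paired by position with a Hive column.

```python
# Hypothetical in-memory stand-in for the HBase table:
# row key -> {"family:qualifier": cell value}
hbase_rows = {
    "1001": {"f:DATA": "2012|shaochen"},
    "1002": {"f:DATA": "2010|dachao"},
    "2001": {"f:DATA": "2013|qiuxin"},
}

# The mapping string from SERDEPROPERTIES and the Hive column list,
# matched up by position: ":key" -> key int, "f:DATA" -> value string.
columns_mapping = ":key,f:DATA"
hive_columns = [("key", int), ("value", str)]


def to_hive_rows(rows, mapping, columns):
    """Project each modelled HBase row onto the Hive columns, in mapping order."""
    sources = mapping.split(",")
    result = []
    for row_key, cells in sorted(rows.items()):
        record = {}
        for source, (name, cast) in zip(sources, columns):
            # ":key" means the row key itself; anything else is a cell lookup.
            raw = row_key if source == ":key" else cells.get(source)
            record[name] = cast(raw) if raw is not None else None
        result.append(record)
    return result


hive_rows = to_hive_rows(hbase_rows, columns_mapping, hive_columns)
for row in hive_rows:
    print(row["key"], row["value"])
```

In this model the row key '1001' surfaces as the integer 1001 in the key column, and the whole cell string, such as '2012|shaochen', surfaces unchanged in the value column.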
3. Run an aggregate query on the HBase test table from Hive
select count(*) from hbase_test_hive_over_hbase where substring(value,0,4)='2013';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201312080251_0001, Tracking URL = http://jfp4-2:50030/jobdetails.jsp?jobid=job_201312080251_0001
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job -Dmapred.job.tracker=jfp4-2:54311 -kill job_201312080251_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-12-10 08:24:27,257 Stage-1 map = 0%, reduce = 0%
2013-12-10 08:24:31,305 Stage-1 map = 100%, reduce = 0%
2013-12-10 08:24:40,391 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201312080251_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 24046 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
1
Time taken: 22.588 seconds
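To make the filter in the query concrete, here is a rough Python sketch of what it computes over the three test rows. The helper `hive_substr` is a hypothetical stand-in, based on my understanding that Hive's substring is 1-based and treats a start position of 0 the same as 1; it only handles non-negative start positions.

```python
# Cell values of f:DATA for the three rows inserted in step 1.
values = ["2012|shaochen", "2010|dachao", "2013|qiuxin"]


def hive_substr(text, start, length):
    """Approximate Hive's substring(text, start, length) for start >= 0.

    Assumption: positions are 1-based and a start of 0 behaves like 1.
    """
    begin = max(start, 1) - 1
    return text[begin:begin + length]


# Equivalent of: select count(*) ... where substring(value,0,4)='2013'
count_2013 = sum(1 for v in values if hive_substr(v, 0, 4) == "2013")
print(count_2013)
```

Only '2013|qiuxin' starts with '2013', so the sketch counts one row, which agrees with the 1 printed after OK in the job output.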
posted on 2013-12-10 15:37 littlesuccess