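Using Partitioning and Bucketing Together

Bucketing can be nested inside a partitioned table.

How is the statement written? What is the practical value? What effect does it have?

"partitioned by" must come before "clustered by".

Prepare three data files,
one for each partition to be loaded.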

First, test what happens when "clustered by" is written before "partitioned by". The statement fails to parse:

create table t_bucket_partition(id int)
clustered by (id) into 3 buckets
partitioned by (type string)
row format delimited fields terminated by '\t';
FAILED: ParseException line 3:0 missing EOF at 'partitioned' near 'buckets'

create table t_partition_bucket(id int)
partitioned by (type string)
clustered by (id) into 3 buckets
row format delimited fields terminated by '\t';

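Note: the partition clause, e.g. "partition (type='test')", has to come first.

Split the data into three files and load them into the Hive database to try it out: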
partition_bucket1
1
2
3
4
partition_bucket2
5
6
7
8
partition_bucket3
9
10
11
12
Then load each file into its own partition of the table:
load data local inpath '/home/briup/bd1902_hive/partition_bucket1' into table t_partition_bucket partition(type='1');
load data local inpath '/home/briup/bd1902_hive/partition_bucket2' into table t_partition_bucket partition(type='2');
load data local inpath '/home/briup/bd1902_hive/partition_bucket3' into table t_partition_bucket partition(type='3');
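
A caveat, hedged because it depends on the Hive version: in older releases "load data" only copies the file into the partition directory and does not hash rows into separate bucket files, and some releases reject "load data" on a bucketed table when strict bucketing checks are on. A common workaround is to load into a plain staging table and then write into the bucketed table with "insert ... select". The sketch below assumes a hypothetical staging table named t_partition_bucket_stage, which is not part of the original post.

```sql
-- Hypothetical staging table (name is an assumption): no partitions, no buckets.
create table t_partition_bucket_stage(id int)
row format delimited fields terminated by '\t';

-- Load the raw file for the first partition into the staging table.
load data local inpath '/home/briup/bd1902_hive/partition_bucket1' into table t_partition_bucket_stage;

-- On Hive 1.x, uncomment the next line; from Hive 2.0 on, bucketing is always enforced on insert.
-- set hive.enforce.bucketing=true;

-- Writing through a query lets Hive hash id into the 3 bucket files of the partition.
insert overwrite table t_partition_bucket partition(type='1')
select id from t_partition_bucket_stage;
```

The same pattern would be repeated for partition_bucket2 and partition_bucket3 with type='2' and type='3', emptying or overwriting the staging table between loads.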

After loading, check the data:
select * from t_partition_bucket;
List the partitions:
show partitions t_partition_bucket;
Then query the data by bucket:
select * from t_partition_bucket tablesample(bucket 2 out of 3 on id);
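
To see both features work together, the bucket sample can be restricted to a single partition. This is a minimal sketch: the "where" filter on the partition column limits the scan to one partition, and "tablesample" then picks one bucket within it. Which ids come back depends on how the Hive version in use hashes the id column, so no expected output is shown.

```sql
-- Sample bucket 2 of 3, but only within the type='1' partition.
select * from t_partition_bucket tablesample(bucket 2 out of 3 on id)
where type = '1';
```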

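Practical value: this gives you partitioning and bucketing at the same time. Partitioning speeds up queries, because a filter on the partition column only reads the matching partitions; bucketing makes it easy to pull a sample of the data for validating data and code.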

posted @ 2019-08-16 15:44 君莫笑我十年游


曾鸿发
