创建视图

1.创建视图,视图是个虚表
create view v1 as select a.id aid,a.name,b.id bid,b.orderno from customers a left outer join default.ttt b on a.id=b.cid;//创建视图字段不要重复,如果有重复需要b.id bid进行重复命名
2.查看视图
show tables;
3.子查询,将查询出来的结果再进行一次过滤
select aid from v1 ;
4.笛卡尔积:两个表之间不加任何成分做连接,

map端连接:
$hive>set hive.auto.convert.join=true; //设置自动转换连接,默认机试true
//使用mapjoin连接按时实现map连接
$hive> select /+mapjoin(customers)/ a.,b. from customers aleft outer join orders b on a.id = b.cid;
调优

1.使用explain查看查询计划
OK

STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1

STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: customers
Statistics: Num rows: 1 Data size: 38 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
Statistics: Num rows: 1 Data size: 38 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: count()
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: bigint)
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink

2、设置limit优化策略,避免全部查询
hive>set hive.limit.optimize.enable=true;

设置本地模式运行
$hive>set mapred.job.tracker=local;
//设置为自动本地模式
$hive>set hive.exec.mode.local.auto=true;
//并行执行
$hive>set hive.exec.parallel=true;    //设置并行查询,只是执行不存在依赖关系的阶段
//严格模式:至少指定一个静态分区,不能都是动态的
$hive>set hive.mapred.mode =strict    //1.严格模式一定要指定分区表,否则不让查询
                                                            //2.order by时必须使用limit子句
                                                            //不允许笛卡尔积查询
//设置MR的数量
posted @ 2018-07-09 15:26  shizeqi  阅读(494)  评论(0编辑  收藏  举报