大叔经验分享(33)hive select count为0
hive建表后直接将数据文件拷贝到table目录下,select * 可以查到数据,但是select count(1) 一直返回0,这个是因为hive中有个配置
hive.stats.autogather=true
Enables automated gathering of table-level statistics for newly created tables and table partitions, such as tables created with the INSERT OVERWRITE statement. The parameter does not produce column-level statistics, such as those generated by CBO. If disabled, administrators must manually generate the table-level statistics for newly generated tables and table partitions with the ANALYZE TABLE statement.
可以通过describe来查看table的统计信息
DESCRIBE EXTENDED $table_name;
有个配置控制是否使用talbe的统计信息
hive.compute.query.using.stats=true
Instructs Hive to use statistics when generating query plans
很多人建议的处理方法是
set hive.compute.query.using.stats=false;
正解应该是
ANALYZE TABLE $table_name COMPUTE STATISTICS;
ANALYZE TABLE $table_name partition(p=$1) COMPUTE STATISTICS;
即重新计算统计信息
参考:https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html
---------------------------------------------------------------- 结束啦,我是大魔王先生的分割线 :) ----------------------------------------------------------------
- 由于大魔王先生能力有限,文中可能存在错误,欢迎指正、补充!
- 感谢您的阅读,如果文章对您有用,那么请为大魔王先生轻轻点个赞,ありがとう