Advanced Hive
HiveServer2
- Overview:
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Overview2
- Clients:
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
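For orientation: HiveServer2 clients such as Beeline connect through a JDBC URL of the form `jdbc:hive2://host:port/database`. The sketch below only assembles that URL and a matching Beeline command line; it does not open a connection. The helper names, host, and user are hypothetical, and port 10000 is assumed because it is the usual HiveServer2 default.

```python
import shlex


def hs2_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build a HiveServer2 JDBC URL (hypothetical helper; 10000 is the assumed default port)."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url: str, user: str, query: str) -> str:
    """Return a shell-safe Beeline invocation: -u is the JDBC URL, -n the user, -e the query."""
    args = ["beeline", "-u", url, "-n", user, "-e", query]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    url = hs2_jdbc_url("localhost")  # placeholder host
    print(beeline_command(url, "hive", "SHOW DATABASES;"))
```

Quoting each argument with `shlex.quote` keeps a query containing spaces or semicolons intact when the command is pasted into a shell.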
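Hive Data Compression
- Compression formats: bzip2, gzip, lzo, snappy, etc.
- Compression ratio: bzip2 > gzip > lzo; bzip2 saves the most storage space
- Decompression speed: lzo > gzip > bzip2; lzo decompresses the fastest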
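The ratio and speed trade-off above can be checked on your own data. The following is a minimal sketch that uses only the Python standard library, so it covers bzip2 and gzip; lzo and snappy need third-party packages and are left out. The function name and the synthetic sample rows are assumptions for illustration, and the numbers depend heavily on the data being compressed.

```python
import bz2
import gzip
import time


def compare_codecs(data: bytes) -> dict:
    """Compress and decompress `data` with each codec; report ratio and decompress time."""
    codecs = {
        "bzip2": (bz2.compress, bz2.decompress),
        "gzip": (gzip.compress, gzip.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        start = time.perf_counter()
        restored = decompress(packed)
        elapsed = time.perf_counter() - start
        assert restored == data  # the round trip must be lossless
        results[name] = {
            "ratio": len(data) / len(packed),  # higher means smaller output
            "decompress_seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    # Synthetic, log-like rows; real table data will behave differently.
    sample = "".join(f"{i},user_{i % 100},click\n" for i in range(50_000)).encode()
    for name, stats in compare_codecs(sample).items():
        print(name, round(stats["ratio"], 1), f"{stats['decompress_seconds']:.4f}s")
```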
In real-world projects, Hive table data typically uses:
* Storage format
orcfile / parquet
* Data compression
snappy
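As a rough illustration of that combination, the sketch below renders a HiveQL `CREATE TABLE` statement for a Snappy-compressed ORC or Parquet table. It assumes the table properties `orc.compress` and `parquet.compression` select the codec; the function name, table name, and columns are hypothetical, and the generated statement should be checked against your Hive version before use.

```python
# Table property assumed to select the codec for each columnar format.
COMPRESSION_PROPERTY = {
    "ORC": "orc.compress",
    "PARQUET": "parquet.compression",
}


def create_table_ddl(table: str, columns: dict, fmt: str = "ORC", codec: str = "SNAPPY") -> str:
    """Render a HiveQL CREATE TABLE statement for a compressed columnar table."""
    fmt = fmt.upper()
    if fmt not in COMPRESSION_PROPERTY:
        raise ValueError(f"unsupported format: {fmt}")
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"STORED AS {fmt}\n"
        f"TBLPROPERTIES ('{COMPRESSION_PROPERTY[fmt]}'='{codec}');"
    )


if __name__ == "__main__":
    # Hypothetical table and columns, for illustration only.
    print(create_table_ddl("page_views", {"user_id": "BIGINT", "url": "STRING"}))
```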
Hive Data Storage
Hive supports several file formats:
Text File
SequenceFile
RCFile
Avro Files
ORC Files
Parquet
Custom INPUTFORMAT and OUTPUTFORMAT
- https://cwiki.apache.org/confluence/display/Hive/FileFormats
- https://cwiki.apache.org/confluence/display/Hive/SerDe
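Each built-in format in the list above is selected with a `STORED AS` keyword in the table DDL, while a custom format names its Java classes explicitly. The sketch below maps one to the other; the function name is hypothetical, and the keyword spellings reflect recent Hive releases as far as I know (older versions may not accept all of them, for example `AVRO`).

```python
# STORED AS keyword for each built-in format named in the list above.
STORED_AS = {
    "Text File": "TEXTFILE",
    "SequenceFile": "SEQUENCEFILE",
    "RCFile": "RCFILE",
    "Avro Files": "AVRO",
    "ORC Files": "ORC",
    "Parquet": "PARQUET",
}


def storage_clause(fmt: str, input_format: str = "", output_format: str = "") -> str:
    """Return a STORED AS clause for a built-in format or a custom class pair."""
    if fmt in STORED_AS:
        return f"STORED AS {STORED_AS[fmt]}"
    if input_format and output_format:
        # Custom formats spell out the InputFormat and OutputFormat classes.
        return f"STORED AS INPUTFORMAT '{input_format}' OUTPUTFORMAT '{output_format}'"
    raise ValueError(f"unknown format {fmt!r} and no custom classes given")


if __name__ == "__main__":
    print(storage_clause("ORC Files"))
    # The classes Hive uses for plain text tables, written out in the custom form.
    print(storage_clause(
        "custom",
        "org.apache.hadoop.mapred.TextInputFormat",
        "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
    ))
```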
Hive Optimization
- EXPLAIN syntax
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
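`EXPLAIN` is placed in front of a query to print its execution plan instead of running it, optionally with a modifier such as `EXTENDED`. The helper below is a small sketch that wraps a query this way; the function name is hypothetical and only a subset of the modifiers is listed, so consult the page above for the full set supported by your Hive version.

```python
# A subset of the optional EXPLAIN modifiers; "" means plain EXPLAIN.
EXPLAIN_MODES = {"", "EXTENDED", "DEPENDENCY", "AUTHORIZATION"}


def explain(query: str, mode: str = "") -> str:
    """Prefix a HiveQL query with EXPLAIN and an optional modifier."""
    mode = mode.upper()
    if mode not in EXPLAIN_MODES:
        raise ValueError(f"unsupported EXPLAIN mode: {mode}")
    body = query.strip().rstrip(";")  # normalize the trailing semicolon
    prefix = f"EXPLAIN {mode}" if mode else "EXPLAIN"
    return f"{prefix} {body};"


if __name__ == "__main__":
    # page_views is a hypothetical table name.
    print(explain("SELECT url, count(*) FROM page_views GROUP BY url"))
    print(explain("SELECT url, count(*) FROM page_views GROUP BY url", "extended"))
```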