Hive：建表、插入数据、插入复合类型

新建hive表：

CREATE TABLE `test`(
  `a` timestamp, 
  `b` struct<t:timestamp>)
 --下面可选
  [row format delimited fields terminated by '\t']
  [STORED AS Parquet]

查看建好的表的结构：

hive> show create table test;
OK
CREATE TABLE `test`(
  `a` timestamp, 
  `b` struct<t:timestamp>)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://ambari.master.com:8020/apps/hive/warehouse/test'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
  'numFiles'='2', 
  'numRows'='2', 
  'rawDataSize'='78', 
  'totalSize'='80', 
  'transient_lastDdlTime'='1532416983')
Time taken: 0.189 seconds, Fetched: 18 row(s)

插入数据：
hive 不支持直接用insert插入复合类型（如test表中struct类型列），可以用以下方式间接插入

insert into test 
  select unix_timestamp('1970-01-01 08:00:00'), 
    named_struct('t', cast(unix_timestamp('1970-01-01 08:00:00') as timestamp))
    -- named_struct是生成struct类型的函数
from tmp_table limit 1;

补充：
Hive表数据文件如果是parquet类型，struct复合类型里的类型为timestamp的列在存取时候，时区换算会出错，所以可以用varchar类型代替timestamp。

posted @ 2019-01-03 16:12 xuejianbest 阅读(2148) 评论(0) 编辑收藏举报

刷新页面返回顶部

xuejianbest

Hive：建表、插入数据、插入复合类型

公告