hive学习(二) hive操作
1.创建库
create database test;
2.删除库
drop database test;
3.建表
完整ddl建表语法规则
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later) [(col_name data_type [COMMENT col_comment], ... [constraint_specification])] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)] [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] [SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)] ON ((col_value, col_value, ...), (col_value, col_value, ...), ...) [STORED AS DIRECTORIES] [ [ROW FORMAT row_format] [STORED AS file_format] | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later) ] [LOCATION hdfs_path] [TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later) [AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
4.hive的数据类型
: primitive_type | array_type | map_type | struct_type :primitive_type |TINYINT | SMALLINT | INT | BIGINT | BOOLEAN | FLOAT | DOUBLE | STRING
5.建表例子:
如下 文件data
对应字段: id,姓名,爱好,住址
1,小明1,lol-book-movie,beijing:changping-shanghai:pudong 2,小明2,lol-book-movie,beijing:changping-shanghai:pudong 3,小明3,lol-book-movie,beijing:changping-shanghai:pudong 4,小明4,lol-book-movie,beijing:changping-shanghai:pudong 5,小明5,lol-movie,beijing:changping-shanghai:pudong 6,小明6,lol-book-movie,beijing:changping-shanghai:pudong 7,小明7,lol-book,beijing:changping-shanghai:pudong 8,小明8,lol-book,beijing:changping-shanghai:pudong 9,小明9,lol-book-movie,beijing:changping-shanghai:pudong id int , 姓名 string,爱好 数组类型,住址 map类型
5.1创建语句分析
create table psn ( id int, name string, hobby array<string>, address map<string,string> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':' LINES TERMINATED BY '\n';
通过hive建表时不但定义了字段的类型,还定义了字段与字段之间的分隔符,数据类型之间的分隔符,每行数据之间的分隔符。
5.2执行创建
hive> create table psn ( > id int, > name string, > hobby array<string>, > address map<string,string> > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > COLLECTION ITEMS TERMINATED BY '-' > MAP KEYS TERMINATED BY ':' > LINES TERMINATED BY '\n'; OK Time taken: 0.758 seconds
5.3查看结构
hive> desc psn; OK id int name string hobby array<string> address map<string,string> Time taken: 0.267 seconds, Fetched: 4 row(s) #详细查看 hive> desc formatted psn;
5.4插入数据
说插入有点不合适,应该是加载文件(data.txt)中的数据
5.5语法:
LOAD DATA [LOCAL] INPATH 'path' INTO TABLE psn;
5.6执行加载
hive> load data local inpath '/root/data' into table psn; Loading data to table default.psn OK Time taken: 3.591 seconds
发现使用这种方法插入数据简直快的飞起啊。
5.7执行查询
hive> select * from psn; OK 1 小明1 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 2 小明2 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 3 小明3 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 4 小明4 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 5 小明5 ["lol","movie"] {"beijing":"changping","shanghai":"pudong"} 6 小明6 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 7 小明7 ["lol","book"] {"beijing":"changping","shanghai":"pudong"} 8 小明8 ["lol","book"] {"beijing":"changping","shanghai":"pudong"} 9 小明9 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} Time taken: 0.235 seconds, Fetched: 9 row(s)
5.8注意,
分隔符的格式如果不匹配它不会报错,而是插入null的空数据。
6.hive内部表和外部表
6.1hive内部表语法
CREATE TABLE [IF NOT EXISTS] table_name
删除表时,元数据与表结构都会被删除
6.2hive外部表语法
CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name LOCATION hdfs_path
删除表时,只删除表结构,不删除元数据。
6.3创建外部表
create external table psn ( id int, name string, hobby array<string>, address map<string,string> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':' LINES TERMINATED BY '\n' location '/usr/';
创建前 hdfs下的目录 没有 usr目录
6.4执行创建外部表
hive> create external table psn ( > id int, > name string, > hobby array<string>, > address map<string,string> > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > COLLECTION ITEMS TERMINATED BY '-' > MAP KEYS TERMINATED BY ':' > LINES TERMINATED BY '\n' > location '/usr/'; OK Time taken: 0.188 seconds
再次查看网页hdfs下的目录
外部表执行删除只会删除表结构,不删除元数据。
刷新网页usr目录还在
6.5内部表和外部表的区别:
1.创建时需要制定目录
2.外部表执行删除只会删除表结构,不删除元数据。内部表回删除表结构和删除元数据。