Hive beeline update
Hive CLI vs Beeline
The primary difference between the two involves how the clients connect to Hive.
The Hive CLI connects directly to the Hive Driver and requires that Hive be installed on the same machine as the client.
Beeline connects to HiveServer2 through the Hive JDBC driver and does not require Hive libraries to be installed on the client machine. It is a thin client: queries are executed by HiveServer2, which allows multiple concurrent client connections and supports authentication.
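1. load data local inpath is not supported in Beeline for files on the client machine. The statement is executed by HiveServer2, so a local path is resolved on the HiveServer2 host, not on the client.
Solution: a. Use 'hdfs dfs -put' to copy the file to HDFS, then 'load data inpath' instead of this command.
b. Create the staging tables as external tables. Then we only need to add the new partition and put the file into its directory; no load is needed.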
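Solution a can be sketched as a small shell helper. This is only an illustration: the function names load_sql and put_and_load are made up for this note, and the file, staging directory, table name and JDBC URL are placeholders supplied by the caller.

```shell
# Sketch only: paths, table name and JDBC URL are placeholders passed in by the caller.

# load_sql HDFS_PATH TABLE
# Prints the HiveQL statement. There is no LOCAL keyword because the file is already on HDFS.
load_sql() {
  printf "LOAD DATA INPATH '%s' INTO TABLE %s;\n" "$1" "$2"
}

# put_and_load LOCAL_FILE HDFS_DIR TABLE JDBC_URL
# Copies the local file to HDFS, then asks HiveServer2 to move it into the table.
put_and_load() {
  hdfs dfs -mkdir -p "$2" &&
  hdfs dfs -put -f "$1" "${2%/}/" &&
  beeline -u "$4" -e "$(load_sql "${2%/}/$(basename "$1")" "$3")"
}
```

For example: put_and_load data.csv /tmp/stage stage_table "jdbc:hive2://hiveserver:10000/dbname". Note that load data inpath moves the file out of the staging directory into the table's directory.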
beeline -n username -u "jdbc:hive2://hiveserver:10000/dbname;principal=hive/hiveserve@domain"
Run kinit before running beeline:
echo 'password' | kinit
drop table evan_test2;
CREATE EXTERNAL TABLE evan_test2 (
userId BIGINT,
type INT,
filename String
)
COMMENT 'User Information'
PARTITIONED BY (country String)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs:///evan_test/';
alter table evan_test2 add partition(country='tet');
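With the partition registered, solution b only needs the file to be placed in the partition's directory. A minimal sketch, assuming the partition uses the default layout under the table location (table location followed by country=value); the function names and the file name in the usage line are hypothetical.

```shell
# Sketch only: assumes the partition directory is <table location>/<partition spec>.

# partition_dir TABLE_LOCATION PARTITION_SPEC
# Prints the HDFS directory of the partition, e.g. /evan_test/country=tet
partition_dir() {
  printf '%s/%s\n' "${1%/}" "$2"
}

# upload_to_partition LOCAL_FILE TABLE_LOCATION PARTITION_SPEC
# Creates the partition directory if needed and copies the file into it.
upload_to_partition() {
  hdfs dfs -mkdir -p "$(partition_dir "$2" "$3")" &&
  hdfs dfs -put "$1" "$(partition_dir "$2" "$3")/"
}
```

For example: upload_to_partition users_tet.csv /evan_test country=tet. After that the rows are visible to queries on evan_test2 without any load statement.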
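2. insert overwrite local directory is not supported in Beeline.
INSERT OVERWRITE LOCAL DIRECTORY 'temp'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
select * from evan_test2;
Run through Beeline, this statement does not produce the files on the client machine, because the local directory is written on the HiveServer2 host.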
Solution: redirect Beeline's query output to a local file, using its output format options to get the same result.
beeline --showHeader=false --outputformat=dsv --delimiterForDSV=$'\001' -e 'select * from evan_test3' >test.csv
Show the data in test.csv (cat -A prints the \001 delimiter as ^A):
cat -A test.csv
Replace the delimiter with a tab for easier reading:
cat test.csv | tr $'\001' '\t'
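The delimiter conversion can be tried without a cluster. The sketch below writes one made-up row in the same shape as the Beeline output (fields separated by \001) and converts it; the file name sample.dsv, the row values and the helper name to_tsv are assumptions for illustration.

```shell
# Write one sample row with \001 (Ctrl-A) between the fields, as dsv output with that delimiter would have.
printf '%s\001%s\001%s\n' 1001 2 photo.jpg > sample.dsv

# to_tsv: read \001-delimited text on stdin and write tab-delimited text on stdout.
to_tsv() {
  tr '\001' '\t'
}

to_tsv < sample.dsv
```

This prints the three fields separated by tabs. The same to_tsv helper could be applied to test.csv.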