Hive

How to fix Hive MapReduce jobs that get stuck

The solution found via a Google search is to add the following parameter-setting statements in front of the Hive statement being executed:

set mapreduce.job.reduces=512;                 -- explicitly set the number of reduce tasks
set hive.groupby.skewindata=true;              -- handle skewed GROUP BY data with an extra, randomly-distributed aggregation stage
set hive.optimize.skewjoin=true;               -- enable runtime skew join optimization
set hive.skewjoin.key=5000;                    -- rows per join key above which the key is treated as skewed
set hive.groupby.mapaggr.checkinterval=5000;   -- number of rows after which map-side group-by aggregation is checked
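
As a usage sketch, these settings only affect the current session and go immediately in front of the query that hangs. The table and column names below are made up for illustration and do not come from this article:

set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;

-- hypothetical skewed join + group-by that previously got stuck in the reduce phase
select u.city, count(*) as cnt
from orders o
join users u on o.user_id = u.id
group by u.city;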

 

Link: https://pan.baidu.com/s/10h4wyq5aKbnPgXaS0KhBoA
Extraction code: gxcw

Step 1: Install Hadoop

Step 2: Install the JDBC driver for MySQL (JDBC Driver for MySQL): https://www.mysql.com/products/connector/

Hive download address: http://mirrors.hust.edu.cn/apache/

Choose a suitable Hive version to download; inside the stable-2 folder, the stable 2.x release is 2.3.4.
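
A quick sketch of the download and unpack step (the exact path on the mirror is an assumption; adjust it to the version you picked):

wget http://mirrors.hust.edu.cn/apache/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz
tar -zxvf apache-hive-2.3.4-bin.tar.gz -C /usr/local/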

cd apache-hive-2.3.4-bin/conf/
touch hive-site.xml
vi hive-site.xml

  

<configuration>
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://hadoop1:3306/hivedb?createDatabaseIfNotExist=true</value>
                <description>JDBC connect string for a JDBC metastore</description>
                <!-- If MySQL and Hive run on the same server node, change hadoop1 to localhost -->
        </property>
        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
                <description>Driver class name for a JDBC metastore</description>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
                <description>username to use against metastore database</description>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>root</value>
        <description>password to use against metastore database</description>
        </property>
</configuration>

The following configuration is optional; it specifies the directory on HDFS where the Hive data warehouse stores its data:

        <property>
                <name>hive.metastore.warehouse.dir</name>
                <value>/hive/warehouse</value>
                <description>hive default warehouse, if necessary, change it</description>
        </property> 
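
If you do set it, the directory has to exist on HDFS and be group-writable. A minimal sketch (the /tmp scratch directory is Hive's usual default and is listed here as an assumption):

hdfs dfs -mkdir -p /hive/warehouse
hdfs dfs -mkdir -p /tmp
hdfs dfs -chmod g+w /hive/warehouse
hdfs dfs -chmod g+w /tmp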

  

Download mysql-connector-java-8.0.16-1.el7.noarch.rpm:

yum -y install mysql-connector-java-8.0.16-1.el7.noarch.rpm

[root@localhost ~]# rpm -ql mysql-connector-java-8.0.16-1.el7.noarch
/usr/share/doc/mysql-connector-java-8.0.16
/usr/share/doc/mysql-connector-java-8.0.16/CHANGES
/usr/share/doc/mysql-connector-java-8.0.16/INFO_BIN
/usr/share/doc/mysql-connector-java-8.0.16/INFO_SRC
/usr/share/doc/mysql-connector-java-8.0.16/LICENSE
/usr/share/doc/mysql-connector-java-8.0.16/README
/usr/share/java/mysql-connector-java.jar  # copy this jar into the lib directory under the Hive installation directory
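
That is, with the install path used in this article:

cp /usr/share/java/mysql-connector-java.jar /usr/local/apache-hive-2.3.4-bin/lib/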

vim ~/.bashrc

export HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
export HADOOP_HOME=/usr/local/hadoop-3.1.2
export PATH=$PATH:$HIVE_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre

source ~/.bashrc

Verify the Hive installation:

[root@localhost ~]# hive --help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cleardanglingscratchdir cli hbaseimport hbaseschematool help hiveburninclient hiveserver2 hplsql jar lineage llapdump llap llapstatus metastore metatool orcfiledump rcfilecat schemaTool version 
Parameters parsed:
  --auxpath : Auxiliary jars 
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:  ./hive --debug --help

Initialize the metastore database:

[root@localhost ~]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 root
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
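
Optionally, confirm that the metastore schema landed in MySQL by listing the tables of the hivedb database (credentials as configured in hive-site.xml above). Note the deprecation warning in the log: with Connector/J 8 you may also change javax.jdo.option.ConnectionDriverName to com.mysql.cj.jdbc.Driver to silence it.

mysql -uroot -proot -e "use hivedb; show tables;"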

  

Start the Hive client

hive --service cli is equivalent to simply running hive

[root@localhost ~]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>

  

Basic usage. The sample data below should be saved as /home/hadoop/student.txt (one record per line, comma-separated):

95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
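
Assuming the data was saved to /home/hadoop/student.txt, a quick sanity check on the row count:

wc -l /home/hadoop/student.txt   # expect 21 lines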

  

hive> create database myhive;   -- create the database
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
Time taken: 5.433 seconds

hive> show databases;   -- list databases
OK
default
myhive
Time taken: 0.182 seconds, Fetched: 2 row(s)
hive> use myhive;   -- switch to the database
OK
Time taken: 0.082 seconds
hive> select current_database();
OK
myhive
Time taken: 0.163 seconds, Fetched: 1 row(s)
hive> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
hive> load data local inpath "/home/hadoop/student.txt" into table student;
hive> select * from student;
OK
95002    刘晨    女    19    IS
95017    王风娟    女    18    IS
......
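
As a further sketch, a small aggregate over the loaded table; this is also the kind of GROUP BY query where the skew-related settings from the top of this post would be applied if it got stuck:

hive> select department, count(*) as cnt from student group by department;
-- expected result from the sample data: CS 7, IS 6, MA 8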

  


Common Hive commands

desc student;                    -- describe the table
desc extended student;           -- detailed table information (managed or external table, compressed or not)
desc formatted student;          -- table information in formatted output
show create table student;      -- show the CREATE TABLE statement
show functions;                  -- list Hive's built-in functions
desc function upper;             -- describe a function
desc function extended upper;    -- detailed usage of a function
show tables;                     -- list all tables
show databases;                  -- list all databases
set hive.cli.print.header=true;  -- set a parameter; takes effect only for the current session
minimal: whether simple queries run MapReduce is configurable (presumably hive.fetch.task.conversion; see the sketch below)
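
A minimal sketch of that last item, assuming it refers to hive.fetch.task.conversion (valid values: none, minimal, more):

set hive.fetch.task.conversion=minimal;   -- SELECT *, partition-column filters and LIMIT are served by a fetch task, no MapReduce job
select * from student limit 3;
set hive.fetch.task.conversion=none;      -- force even simple queries through MapReduce
select * from student limit 3;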

  

 

 

 

posted @ 2019-04-27 22:56  linuxws