hive
Fixing Hive MapReduce jobs that stall
A solution found through a Google search is to add the following parameter settings before the Hive statement being executed:
set mapreduce.job.reduces=512;
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;
set hive.groupby.mapaggr.checkinterval=5000;
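To avoid retyping these settings, one option is to keep them in a file and pass it to the Hive CLI with `-i`, which runs an initialization file before the query. The following is a minimal sketch: the file names `skew-settings.hql` and `query.hql` are hypothetical, and the values are simply the ones listed above, which may need tuning for a given cluster.

```shell
set -eu

# Hypothetical file names, chosen for illustration only.
SETTINGS_FILE="skew-settings.hql"
QUERY_FILE="query.hql"

# Write the session settings listed above into an init file.
cat > "$SETTINGS_FILE" <<'EOF'
set mapreduce.job.reduces=512;
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;
set hive.groupby.mapaggr.checkinterval=5000;
EOF

# Run the query only when the hive CLI and the query file are both present.
if command -v hive >/dev/null 2>&1 && [ -f "$QUERY_FILE" ]; then
    hive -i "$SETTINGS_FILE" -f "$QUERY_FILE"
else
    echo "hive or $QUERY_FILE not found; wrote $SETTINGS_FILE only"
fi
```

Keeping the settings in a separate file leaves the query itself unchanged, so the same query can be run with and without the skew options for comparison.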
Link: https://pan.baidu.com/s/10h4wyq5aKbnPgXaS0KhBoA
Extraction code: gxcw
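Step 1: Install Hadoop
Step 2: Install the MySQL JDBC driver (JDBC Driver for MySQL): https://www.mysql.com/products/connector/
Hive download mirror: http://mirrors.hust.edu.cn/apache/
Choose a suitable Hive version to download. In the stable-2 folder, the stable 2.x release is 2.3.4.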
cd apache-hive-2.3.4-bin/conf/
touch hive-site.xml
vi hive-site.xml
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop1:3306/hivedb?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
        <!-- If MySQL and Hive run on the same server node, change hadoop1 to localhost -->
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>root</value>
        <description>password to use against metastore database</description>
    </property>
</configuration>
The following setting is optional. It specifies the HDFS directory where the Hive data warehouse stores its data:
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
    <description>hive default warehouse, if necessary, change it</description>
</property>
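When the warehouse location is overridden like this, it is common to create the directory in HDFS ahead of time and make it group-writable. A small sketch, assuming the `/hive/warehouse` value above and that the `hdfs` command is on the PATH (the script only prints a message otherwise):

```shell
# Directory taken from the hive.metastore.warehouse.dir value above.
WAREHOUSE_DIR="/hive/warehouse"

if command -v hdfs >/dev/null 2>&1; then
    # Create the directory (and any parents), then allow group writes.
    hdfs dfs -mkdir -p "$WAREHOUSE_DIR"
    hdfs dfs -chmod g+w "$WAREHOUSE_DIR"
    # Show the directory entry itself to confirm the permissions.
    hdfs dfs -ls -d "$WAREHOUSE_DIR"
else
    echo "hdfs command not found; skipping creation of $WAREHOUSE_DIR"
fi
```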
Download mysql-connector-java-8.0.16-1.el7.noarch.rpm
yum -y install mysql-connector-java-8.0.16-1.el7.noarch.rpm
[root@localhost ~]# rpm -ql mysql-connector-java-8.0.16-1.el7.noarch
/usr/share/doc/mysql-connector-java-8.0.16
/usr/share/doc/mysql-connector-java-8.0.16/CHANGES
/usr/share/doc/mysql-connector-java-8.0.16/INFO_BIN
/usr/share/doc/mysql-connector-java-8.0.16/INFO_SRC
/usr/share/doc/mysql-connector-java-8.0.16/LICENSE
/usr/share/doc/mysql-connector-java-8.0.16/README
/usr/share/java/mysql-connector-java.jar   # Copy this jar into the lib directory under the Hive root directory.
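The copy step mentioned in the listing can be sketched as a small helper that fails early when either path is missing. The function name `install_jdbc_jar` is hypothetical; the jar path comes from the rpm listing, and the Hive location is assumed to be `/usr/local/apache-hive-2.3.4-bin` unless `HIVE_HOME` is already set.

```shell
set -eu

# Copy a JDBC driver jar into a Hive lib directory.
# Returns non-zero if the jar or the target directory does not exist.
install_jdbc_jar() {
    jar_path="$1"
    hive_lib="$2"
    [ -f "$jar_path" ] || { echo "jar not found: $jar_path" >&2; return 1; }
    [ -d "$hive_lib" ] || { echo "lib dir not found: $hive_lib" >&2; return 1; }
    cp "$jar_path" "$hive_lib/"
    echo "installed $(basename "$jar_path") into $hive_lib"
}

# Paths as they appear in this post; adjust for other layouts.
JAR="/usr/share/java/mysql-connector-java.jar"
HIVE_LIB="${HIVE_HOME:-/usr/local/apache-hive-2.3.4-bin}/lib"

# Only attempt the copy when both paths exist on this machine.
if [ -f "$JAR" ] && [ -d "$HIVE_LIB" ]; then
    install_jdbc_jar "$JAR" "$HIVE_LIB"
fi
```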
vim ~/.bashrc
export HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
export HADOOP_HOME=/usr/local/hadoop-3.1.2
export PATH=$PATH:$HIVE_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre
source ~/.bashrc
Verify the Hive installation:
[root@localhost ~]# hive --help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cleardanglingscratchdir cli hbaseimport hbaseschematool help hiveburninclient hiveserver2 hplsql jar lineage llapdump llap llapstatus metastore metatool orcfiledump rcfilecat schemaTool version
Parameters parsed:
  --auxpath : Auxiliary jars
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:  ./hive --debug --help
Initialize the metastore database:
[root@localhost ~]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
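As a follow-up check, `schematool` also has an `-info` option that reports the schema version recorded in the metastore. A minimal sketch, assuming `schematool` is on the PATH via `$HIVE_HOME/bin` as configured earlier (it only prints a message otherwise):

```shell
# Database type matches the -dbType used for initialization above.
DB_TYPE="mysql"

if command -v schematool >/dev/null 2>&1; then
    # Print the metastore connection details and the recorded schema version.
    schematool -dbType "$DB_TYPE" -info
else
    echo "schematool not found on PATH; skipping schema check"
fi
```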
Start the Hive client
hive --service cli has the same effect as hive
[root@localhost ~]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
Basic usage:
95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
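Before loading a file like this into Hive, it can help to confirm that every row has the five comma-separated fields (id, name, sex, age, department) used by the table in this post. A minimal sketch using the first three rows of the sample data; the file name `student_sample.txt` is hypothetical.

```shell
set -eu

# Hypothetical sample file holding the first three rows of the data above.
SAMPLE_FILE="student_sample.txt"
cat > "$SAMPLE_FILE" <<'EOF'
95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
EOF

# Count rows whose number of comma-separated fields is not 5.
bad_rows=$(awk -F',' 'NF != 5 { n++ } END { print n + 0 }' "$SAMPLE_FILE")
# Count all rows in the file.
total_rows=$(awk 'END { print NR }' "$SAMPLE_FILE")

echo "rows: $total_rows, malformed: $bad_rows"
```

Rows with a missing or extra field would load into Hive with NULL or shifted columns rather than raising an error, so a check like this is cheaper than debugging the table afterwards.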
hive> create database myhive;   -- create a database
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
Time taken: 5.433 seconds

hive> show databases;   -- list the existing databases
OK
default
myhive
Time taken: 0.182 seconds, Fetched: 2 row(s)

hive> use myhive;   -- switch to the database
OK
Time taken: 0.082 seconds

hive> select current_database();
OK
myhive
Time taken: 0.163 seconds, Fetched: 1 row(s)
hive> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";

hive> load data local inpath "/home/hadoop/student.txt" into table student;
hive> select * from student;
OK
95002   刘晨   女   19   IS
95017   王风娟   女   18   IS
......
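Once the table is loaded, a simple aggregate makes a quick sanity check. Counting the 21 sample rows by their last column gives 7 for CS, 6 for IS and 8 for MA, so a per-department count should report those numbers if the whole file was loaded. A sketch that runs the query non-interactively with `hive -S -e`, assuming the `myhive.student` table from the session above:

```shell
# Count students per department in the table created above.
QUERY='select department, count(*) as students from myhive.student group by department;'

if command -v hive >/dev/null 2>&1; then
    # -S suppresses informational output; -e runs the quoted statement.
    hive -S -e "$QUERY"
else
    echo "hive not found on PATH; query was: $QUERY"
fi
```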
Using Hive
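desc student;                     -- describe the table
desc extended student;            -- show detailed table information (table type: managed or external; whether it is compressed)
desc formatted student;           -- show table information in formatted output
show create table student;        -- show the CREATE TABLE statement
show functions;                   -- list the functions available in Hive
desc function upper;              -- describe a function
desc function extended upper;     -- describe a function, including how to use it
show tables;                      -- list all tables
show databases;                   -- list all databases
set hive.cli.print.header=true;   -- set a parameter; takes effect for the current session only
minimal: whether a query runs MapReduce is configurable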
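Because a `set` statement only lasts for the current session, a property can instead be passed on each invocation with the CLI's `--hiveconf` option. A minimal sketch, assuming the `myhive.student` table from earlier in this post:

```shell
# Statement to run; limit keeps the output short.
STATEMENT='use myhive; select * from student limit 3;'

if command -v hive >/dev/null 2>&1; then
    # Enable column headers for this invocation only, then run the statement.
    hive --hiveconf hive.cli.print.header=true -e "$STATEMENT"
else
    echo "hive not found on PATH; statement was: $STATEMENT"
fi
```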