Using Spark SQL with Hive

Spark ships with an embedded Hive, but it is generally only suitable for testing; in production you connect Spark to an external Hive installation.

1. Copy Hive's configuration file hive-site.xml into Spark's conf directory

cp /usr/hive/apache-hive-3.1.3-bin/conf/hive-site.xml /usr/spark/spark-3.5.0-bin-hadoop3/conf
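For reference, the two settings in hive-site.xml that matter most for this setup are the metastore URI and the warehouse directory. The following is a minimal sketch only; the hostname, port, and path are placeholders, not values taken from this guide:

```xml
<configuration>
  <!-- Thrift URI of the Hive metastore service (hostname/port are placeholders) -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
  <!-- HDFS location of the Hive warehouse (path is a placeholder) -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```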

2. Place the MySQL JDBC driver in Spark's jars directory

1. Download the MySQL JDBC driver

Option 1 (MySQL official archive): https://downloads.mysql.com/archives/c-j/

Option 2 (Maven Central): https://mvnrepository.com/artifact/com.mysql/mysql-connector-j

2. Copy the driver

cp /home/mysql-connector-j-8.0.33.jar /usr/spark/spark-3.5.0-bin-hadoop3/jars

3. Copy Hadoop's core-site.xml and hdfs-site.xml into Spark's conf directory

cp /usr/hadoop/hadoop-3.3.6/etc/hadoop/{hdfs-site.xml,core-site.xml} /usr/spark/spark-3.5.0-bin-hadoop3/conf

4. Restart spark-shell

/usr/spark/spark-3.5.0-bin-hadoop3/bin/spark-shell

5. Verify

spark.sql("show tables").show
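Beyond show tables, you can exercise the connection end to end from the same spark-shell session. This is a sketch that assumes the metastore and HDFS are reachable; the database and table names (test_db, user_info) are made up for illustration:

```scala
// Inside spark-shell, `spark` is a SparkSession with Hive support
// already enabled (it picked up hive-site.xml from conf/).
spark.sql("create database if not exists test_db")
spark.sql("use test_db")
spark.sql("create table if not exists user_info(id int, name string)")
spark.sql("insert into user_info values (1, 'tom')")
spark.sql("select * from user_info").show()
```

If the JDBC driver or hive-site.xml is missing, the first statement typically fails with a metastore connection error, which makes this a quick sanity check for steps 1 and 2.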
Posted 2024-01-15 12:06 by SpringCore