Accessing MySQL and Hive from Spark
Accessing MySQL from Spark:
Import the dependencies
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.38</version>
</dependency>
Write Spark SQL with SparkSession
import java.util.Properties

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("job").getOrCreate()

// JDBC connection properties for MySQL
val prop = new Properties()
prop.setProperty("driver", "com.mysql.jdbc.Driver")
prop.setProperty("user", "root")
prop.setProperty("password", "1234")

// Read a table
val jdbcDF = spark.read.jdbc("jdbc:mysql://192.168.56.192/word", "keyword", prop)
jdbcDF.show(false)

// Write to a table
import spark.implicits._
val df = spark.createDataFrame(spark.sparkContext.parallelize(Seq((13, "衣服", 300.0), (14, "裤子", 299.00))))
  .toDF("id", "name", "type")
df.write.mode("append")
  .jdbc("jdbc:mysql://192.168.56.192/word", "keyword", prop)
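As a variant sketch, the same read can be expressed through the option-based JDBC API, which also lets you push a query down to MySQL. The subquery, the filter id > 10, and the alias t below are illustrative assumptions, not part of the original example:

// Sketch: option-based JDBC read with a query pushed down to MySQL.
// The subquery and its alias "t" are illustrative assumptions.
val filteredDF = spark.read.format("jdbc")
  .option("url", "jdbc:mysql://192.168.56.192/word")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("user", "root")
  .option("password", "1234")
  .option("dbtable", "(select * from keyword where id > 10) as t")
  .load()
filteredDF.show(false)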
Accessing Hive from Spark:
Import the dependency:
<!-- hive-spark -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
Configure the resource files:
From the Hadoop cluster's five configuration files, copy core-site.xml and hdfs-site.xml, and from the Hive configuration copy hive-site.xml, into the project's resources directory.
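For reference, a minimal hive-site.xml sketch is shown below; the metastore address thrift://192.168.56.192:9083 is an assumed value and must be replaced with your cluster's actual one:

<?xml version="1.0"?>
<configuration>
    <!-- Assumed metastore address; replace with your cluster's actual value -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://192.168.56.192:9083</value>
    </property>
</configuration>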
Write the Spark SQL
import org.apache.spark.sql.SparkSession

object HiveJob {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() connects the session to the Hive metastore
    val spark = SparkSession.builder().master("local[2]").appName("job").enableHiveSupport().getOrCreate()
    val df = spark.sql("select * from study.shop")
    df.show(false)
  }
}
Note: just add enableHiveSupport() to the normal SparkSession builder chain.
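For completeness, a minimal sketch of writing a DataFrame back into Hive under the same session; the target table name study.shop_copy is hypothetical:

// Sketch: persist a DataFrame as a Hive managed table.
// The table name study.shop_copy is a hypothetical example.
df.write.mode("overwrite").saveAsTable("study.shop_copy")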