Action operators on DataFrames, part 2

## Modify hive-site.xml
```xml
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://node1:9000/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
```
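For `saveAsTable` to land under this warehouse path, the `SparkSession` normally needs Hive support enabled (or at least `spark.sql.warehouse.dir` pointed at it). A minimal sketch, assuming the hive-site.xml above is on Spark's classpath; the app name and master here are placeholders, not from the original example:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: build a session that uses the Hive metastore and the warehouse
// directory configured above. "warehouse-demo" is a hypothetical app name.
val spark = SparkSession.builder()
  .appName("warehouse-demo")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "hdfs://node1:9000/user/hive/warehouse")
  .enableHiveSupport() // requires hive-site.xml (or equivalent) on the classpath
  .getOrCreate()
```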
## Saving to a table
```scala
import java.util.Properties

import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, Dataset, SaveMode, SparkSession}

val sparkConf = new SparkConf().setAppName("demo03").setMaster("local[*]")
val ss = SparkSession.builder().config(sparkConf).getOrCreate()
import ss.implicits._

val seq: Seq[(String, Int)] = Seq(
  ("zs", 20),
  ("ls", 21),
  ("ww", 22),
  ("ml", 23),
  ("zb", 24),
  ("wb", 20)
)
val dataFrame: DataFrame = seq.toDF("name", "age")
dataFrame.createOrReplaceTempView("student")
val dataFrame1 = ss.sql("select * from student where age>22")
dataFrame1.show()

/**
 * Write the data out as a table under the spark.sql.warehouse.dir path.
 */
dataFrame1.write.mode(SaveMode.Overwrite).saveAsTable("stud")
val frame = ss.sql("select * from stud")
frame.show()
val frame1 = ss.sql("show tables")
frame1.show()

/*
 * Write the data out as structured files; each call takes an output directory.
 */
dataFrame1.write.mode(SaveMode.Append).csv("hdfs://node1:9000/sparksql/a.csv")
dataFrame1.write.mode(SaveMode.Append).json("hdfs://node1:9000/sparksql/json")
dataFrame1.write.mode(SaveMode.Append).parquet("hdfs://node1:9000/sparksql/parquet")
dataFrame1.write.mode(SaveMode.Append).orc("hdfs://node1:9000/sparksql/orc")

// Write out to a MySQL table over JDBC.
val properties = new Properties()
properties.setProperty("user", "root")
properties.setProperty("password", "Jsq123456...")
dataFrame1.write.mode(SaveMode.Append)
  .jdbc("jdbc:mysql://node1:3306/project?serverTimezone=UTC", "student_info", properties)

/*
 * The text format only supports a single string column, so each row is
 * concatenated into one string first; otherwise the write fails.
 */
val dataset: Dataset[String] = dataFrame1.map(line => line.get(0) + " " + line.get(1))
dataset.write.mode(SaveMode.Append).text("hdfs://node1:9000/sparksql/text")
```
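To sanity-check the writes, the outputs can be read back with the matching `spark.read` methods. A minimal sketch, reusing the session, paths, and JDBC properties from the example above (all assumed to exist):

```scala
// Read back what was written above; paths and names follow the example.
val csvBack     = ss.read.csv("hdfs://node1:9000/sparksql/a.csv")
val jsonBack    = ss.read.json("hdfs://node1:9000/sparksql/json")
val parquetBack = ss.read.parquet("hdfs://node1:9000/sparksql/parquet")
val orcBack     = ss.read.orc("hdfs://node1:9000/sparksql/orc")
val jdbcBack    = ss.read.jdbc("jdbc:mysql://node1:3306/project?serverTimezone=UTC",
                               "student_info", properties)
jdbcBack.show()
```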

## SaveMode overview

| SaveMode | Meaning |
| --- | --- |
| SaveMode.ErrorIfExists (default) | Throws an error if the output already exists |
| SaveMode.Append | Appends the data to the existing output |
| SaveMode.Overwrite | Overwrites the existing output |
| SaveMode.Ignore | Silently skips the write if data already exists |
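The mode can also be passed as a case-insensitive string, which is convenient when it comes from configuration. A small sketch, reusing `dataFrame1` and a path from the example above; the two `overwrite` calls are equivalent:

```scala
// Equivalent ways to set the save mode: enum constant vs. string.
dataFrame1.write.mode(SaveMode.Overwrite).parquet("hdfs://node1:9000/sparksql/parquet")
dataFrame1.write.mode("overwrite").parquet("hdfs://node1:9000/sparksql/parquet")

// "error" / "errorifexists" is the default and fails if the target already exists;
// "ignore" leaves existing data untouched and skips the write.
dataFrame1.write.mode("ignore").parquet("hdfs://node1:9000/sparksql/parquet")
```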