Converting a Java List or Scala List to a DataFrame or Dataset in Spark: A Summary
I. Converting a Java List to a DataFrame or Dataset
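import scala.beans.BeanProperty

// createDataFrame(list, beanClass) infers columns from JavaBean getters,
// so each field is annotated with @BeanProperty to generate them.
case class CaseJava(
  @BeanProperty var num: String,
  @BeanProperty var id: String,
  @BeanProperty var start_time: String,
  @BeanProperty var istop_time: String)

// Assumes an existing SparkSession named spark.
val listData: java.util.List[CaseJava] = new java.util.ArrayList[CaseJava]
listData.add(new CaseJava("11", "22", "33", "44"))
val dataFrame = spark.createDataFrame(listData, classOf[CaseJava])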
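The snippet above yields an untyped DataFrame. For a typed Dataset, one option is the `createDataset` overload that accepts a `java.util.List` together with an implicit encoder. The following is a minimal sketch, not the author's method: the `Event` case class, the `JavaListToDataset` object, and the `toDataset` helper are hypothetical names introduced only for illustration.

```scala
import java.util.{ArrayList => JArrayList, List => JList}

import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type for this sketch. It is declared at the top level
// so that Spark can derive an encoder for it.
case class Event(num: String, id: String)

object JavaListToDataset {

  // Converts a java.util.List of case-class instances into a typed Dataset.
  def toDataset(spark: SparkSession, rows: JList[Event]): Dataset[Event] = {
    import spark.implicits._ // supplies the implicit Encoder[Event]
    spark.createDataset(rows)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JavaListToDataset")
      .master("local[2]")
      .getOrCreate()

    val listData: JList[Event] = new JArrayList[Event]()
    listData.add(Event("11", "22"))
    listData.add(Event("33", "44"))

    val ds = toDataset(spark, listData)
    ds.show()

    // A Dataset can be turned into a DataFrame at any point.
    ds.toDF().printSchema()

    spark.stop()
  }
}
```

Because the encoder comes from the case class itself, this route does not depend on JavaBean getters.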
II. Converting a Scala MutableList to a DataFrame or Dataset
1. Method 1:
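import org.apache.spark.sql.SparkSession
import scala.collection.mutable

case class TestPerson(name: String, age: Long, salary: Double)

val spark = SparkSession.builder().appName("Spark-SQL").master("local[2]").getOrCreate()
import spark.implicits._

val tom = TestPerson("Tom Hanks", 37, 35.5)
val sam = TestPerson("Sam Smith", 40, 40.5)
val PersonList = mutable.MutableList[TestPerson]()
// Add data to the list
PersonList += tom
PersonList += sam
// Call toDS() on the list itself. Seq(PersonList).toDS() would wrap the
// whole list as a single element instead of one row per person.
val personDS = PersonList.toDS()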
2. Method 2:
import org.apache.spark.sql.SparkSession

case class TestPerson(name: String, age: Long, salary: Double)

val spark = SparkSession.builder().appName("List to Dataset").master("local[*]").getOrCreate()
val tom = TestPerson("Tom Hanks", 37, 35.5)
val sam = TestPerson("Sam Smith", 40, 40.5)
// mutable.MutableList[TestPerson]() is not required; an immutable List is cleaner
val PersonList = List(tom, sam)
import spark.implicits._
PersonList.toDS().show()
3. Method 3:
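import org.apache.spark.sql.SparkSession
import scala.collection.mutable

case class TestPerson(name: String, age: Long, salary: Double)

// The session and its implicits are required for toDS() and toDF()
val spark = SparkSession.builder().appName("List to Dataset").master("local[*]").getOrCreate()
import spark.implicits._

val tom = TestPerson("Tom Hanks", 37, 35.5)
val sam = TestPerson("Sam Smith", 40, 40.5)
val PersonList = mutable.MutableList[TestPerson]()
PersonList += tom
PersonList += sam

// Typed Dataset
val personDS = PersonList.toDS()
println(personDS.getClass)
personDS.show()

// Untyped DataFrame
val personDF = PersonList.toDF()
println(personDF.getClass)
personDF.show()
personDF.select("name", "age").show()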
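The conversions above rely on the implicit `toDS()` and `toDF()` methods brought in by `import spark.implicits._`. As a rough sketch of the same idea without implicits, the session's `createDataFrame` and `createDataset` methods can be called directly, passing the encoder by hand. The `ExplicitConversion` object and its helper names are assumptions made for this example; a `MutableList` would first be converted with `.toList`.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoders, SparkSession}

case class TestPerson(name: String, age: Long, salary: Double)

// Hypothetical helper object for this sketch.
object ExplicitConversion {

  // Scala list of case-class instances -> untyped DataFrame.
  def toDataFrame(spark: SparkSession, people: List[TestPerson]): DataFrame =
    spark.createDataFrame(people)

  // Scala list -> typed Dataset, with the encoder passed explicitly.
  def toDataset(spark: SparkSession, people: List[TestPerson]): Dataset[TestPerson] =
    spark.createDataset(people)(Encoders.product[TestPerson])

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ExplicitConversion")
      .master("local[2]")
      .getOrCreate()

    val people = List(
      TestPerson("Tom Hanks", 37, 35.5),
      TestPerson("Sam Smith", 40, 40.5))

    val df = toDataFrame(spark, people)
    df.printSchema()

    val ds = toDataset(spark, people)
    ds.show()

    // A DataFrame can be turned back into a typed Dataset with as[T].
    val typed = df.as[TestPerson](Encoders.product[TestPerson])
    typed.show()

    spark.stop()
  }
}
```

Passing the encoder explicitly keeps the conversion usable in code where importing `spark.implicits._` is inconvenient, for example inside a helper that receives the session as a parameter.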
For more details, see: https://stackoverflow.com/questions/39397652/convert-scala-list-to-dataframe-or-dataset
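This is an original article and the author retains copyright. Reposting is welcome provided the source is credited; uncredited reposts of the author's original work will be pursued under the law. Please respect the author's work.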