Spark: the DataFrame pivot function

Using the pivot function: groupBy("year") groups the rows by year, pivot("month") turns each distinct month value into its own column, and sum("num") aggregates num for every (year, month) cell:

// Assumes a SparkSession in scope as `spark` (e.g. in spark-shell).
val list = List(
  (2017, 1, 100),
  (2017, 1, 50),
  (2017, 2, 100),
  (2017, 3, 50),
  (2018, 2, 200),
  (2018, 2, 100))

import spark.implicits._
val ds = spark.createDataset(list)
val df = ds.toDF("year", "month", "num")

// One row per year, one column per distinct month, summed num in each cell.
val res: org.apache.spark.sql.DataFrame =
  df.groupBy("year")
    .pivot("month")
    .sum("num")

df.show
+----+-----+---+
|year|month|num|
+----+-----+---+
|2017|    1|100|
|2017|    1| 50|
|2017|    2|100|
|2017|    3| 50|
|2018|    2|200|
|2018|    2|100|
+----+-----+---+

res.show
+----+----+---+----+
|year|   1|  2|   3|
+----+----+---+----+
|2018|null|300|null|
|2017| 150|100|  50|
+----+----+---+----+
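
If the pivot values are known in advance, they can be passed explicitly so Spark skips the extra job that collects the distinct values of the pivot column, and the nulls for missing (year, month) combinations can be replaced with na.fill. A minimal sketch building on the df above (the fill value 0 is an illustrative choice, not part of the original example):

// Listing the months explicitly fixes the column order and avoids
// an extra pass over the data to discover the distinct pivot values.
val res2 = df.groupBy("year")
  .pivot("month", Seq(1, 2, 3))
  .sum("num")
  .na.fill(0)   // replace nulls for missing (year, month) pairs with 0

res2.show

With the explicit value list, the month columns appear in the given order, and cells such as (2018, 1) show 0 instead of null.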