Broadcast variables

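  1. Requirement: a broadcast variable is read-only. Tasks running on a partition can read its value but cannot change it.
  2. Advantage: the value is sent to each executor once instead of being serialized with every task, which reduces I/O (mainly network transfer and memory). The larger the data, the more noticeable the benefit.
  3. Usage: read the value through the broadcast variable's .value method.
  4. Example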
package videovar

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object VideoVar {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("sum")
    val sc = new SparkContext(sparkConf)
    val rdd: RDD[Int] = sc.makeRDD(1 to 100)

    // Broadcast the divisor once; tasks read it through broadcast.value.
    val divisor = 3
    val broadcast = sc.broadcast(divisor)
    // Keep the multiples of 3.
    val filter1: RDD[Int] = rdd.filter((n: Int) => n % broadcast.value == 0)
    // Keep the multiples of 9 (3 * 3).
    val filter2: RDD[Int] = rdd.filter((n: Int) => n % (broadcast.value * broadcast.value) == 0)
    val str1: String = filter1.collect().mkString(",")
    val str2: String = filter2.collect().mkString(",")
    println(str1)
    println(str2)
    sc.stop()
  }
}
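
The benefit described in point 2 is easier to see with a value larger than a single Int. The following is a minimal sketch, not part of the original example: it broadcasts a small lookup table and reads it inside a map. The object name BroadcastLookup, the label helper, and the table contents are all hypothetical and chosen only for illustration.

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example: the object name, lookup table, and codes are made up.
object BroadcastLookup {
  def label(sc: SparkContext, codes: Seq[Int]): Array[String] = {
    // A small lookup table that every task needs to read.
    val names: Map[Int, String] = Map(1 -> "one", 2 -> "two", 3 -> "three")
    // Ship the table to each executor once instead of with every task closure.
    val namesBc: Broadcast[Map[Int, String]] = sc.broadcast(names)

    val rdd: RDD[Int] = sc.makeRDD(codes, 2)
    // Tasks only read namesBc.value; they never modify it.
    val labelled: Array[String] =
      rdd.map(code => namesBc.value.getOrElse(code, "unknown")).collect()

    // Release the executor-side copies once the job has finished.
    namesBc.unpersist()
    labelled
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("broadcast-lookup")
    val sc = new SparkContext(sparkConf)
    try println(label(sc, Seq(1, 2, 3, 4)).mkString(","))
    finally sc.stop()
  }
}
```

Calling unpersist() after the action is optional here; it is included only to show that the cached copies on the executors can be released once they are no longer needed.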
posted @ 2022-08-25 10:55  jsqup