Post category - spark
Spark study notes
Abstract: Accumulate package com.shujia.spark.core import java.lang import org.apache.spark.{SparkConf, SparkContext} import org.apache.spark.rdd.RDD import org
Read more
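The excerpt above is cut off, but the post is about Spark accumulators (note the `java.lang` import, typical for `LongAccumulator`). As a minimal sketch of the same idea on plain Scala collections (no Spark runtime assumed; `LongAcc` and `countEvens` are hypothetical names):

```scala
object AccumulatorSketch {
  // Spark's LongAccumulator is added to inside tasks and read back on the
  // driver; this local analogue is a mutable counter bumped inside a foreach.
  class LongAcc {
    private var v = 0L
    def add(n: Long): Unit = v += n
    def value: Long = v
  }

  // Count even numbers as a side effect of a traversal, the way an
  // accumulator would count records while an RDD action runs.
  def countEvens(nums: Seq[Int]): Long = {
    val acc = new LongAcc
    nums.foreach(n => if (n % 2 == 0) acc.add(1))
    acc.value
  }
}
```

In real Spark the counter would be created with `sc.longAccumulator` so that updates from executors are merged back to the driver.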
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo19PageRank { def main(args:
Read more
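The classic Spark PageRank example iterates: each page splits its current rank evenly among its outgoing links, then every rank is recomputed as `0.15 + 0.85 * (sum of received contributions)`. A minimal sketch of that loop on a plain `Map` (no Spark; `pageRank` is a hypothetical helper, not the post's code):

```scala
object PageRankSketch {
  // links: page -> pages it links to. Every page starts with rank 1.0.
  def pageRank(links: Map[String, Seq[String]], iters: Int): Map[String, Double] = {
    var ranks = links.map { case (page, _) => page -> 1.0 }
    for (_ <- 1 to iters) {
      // Each page contributes rank / outDegree to every page it links to.
      val contribs = links.toSeq.flatMap { case (page, outs) =>
        outs.map(dest => dest -> ranks(page) / outs.size)
      }
      // Damped update: 15% teleport, 85% from incoming contributions.
      ranks = ranks.map { case (page, _) =>
        val received = contribs.collect { case (d, c) if d == page => c }.sum
        page -> (0.15 + 0.85 * received)
      }
    }
    ranks
  }
}
```

In the RDD version the contributions step is a `flatMap` over a `join` of links and ranks, followed by `reduceByKey(_ + _)` and `mapValues`.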
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} import scala.util.Random object Demo18
Read more
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo17Student { def main(args:
Read more
Abstract: package com.shujia.spark.core import org.apache.spark.{SparkConf, SparkContext} import org.apache.spark.rdd.RDD object Demo16CheckPoint { def main(arg
Read more
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.storage.StorageLevel import org.apache.spark.{SparkConf, SparkCo
Read more
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{Partitioner, SparkConf, SparkContext} object Demo13Patition { d
Read more
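This post imports `Partitioner`, so it is about custom partitioning. Spark's built-in `HashPartitioner` routes a key to partition `nonNegativeMod(key.hashCode, numPartitions)`. A local sketch of that routing rule (the `partitionOf` helper is hypothetical):

```scala
object PartitionerSketch {
  // HashPartitioner semantics: hash the key, take it modulo the partition
  // count, and shift negative remainders into the valid range.
  def partitionOf(key: Any, numPartitions: Int): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw
  }
}
```

A custom `Partitioner` subclass in Spark just overrides `numPartitions` and `getPartition(key)` with logic like this.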
Abstract: package com.shujia.spark.core import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.{FileSystem, Path} import org.apache.spark.rdd.R
Read more
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo10Sort { def main(args: Arr
Read more
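The sort demo's excerpt is truncated, but `RDD.sortBy` takes a key-extractor and an `ascending` flag. The same semantics on a local `Seq`, with descending order expressed via a reversed `Ordering` (the `sortByDesc` helper is a hypothetical name):

```scala
object SortSketch {
  // Like rdd.sortBy(key, ascending = false): extract a sort key per element
  // and order by it with the reversed Ordering.
  def sortByDesc[A, B: Ordering](xs: Seq[A])(key: A => B): Seq[A] =
    xs.sortBy(key)(Ordering[B].reverse)
}
```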
Abstract: Union package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo8Union { def main(arg
Read more
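`RDD.union` simply concatenates two RDDs without deduplicating; duplicates must be removed explicitly with `distinct`. A one-method sketch of that behavior on local sequences (no Spark assumed):

```scala
object UnionSketch {
  // Like rdd1.union(rdd2): concatenation, duplicates preserved.
  // Call .distinct on the result to mimic union(...).distinct() in Spark.
  def union[A](a: Seq[A], b: Seq[A]): Seq[A] = a ++ b
}
```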
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo6GroupByKey { def main(args
Read more
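The usual point of a `groupByKey` demo is the contrast with `reduceByKey`: `groupByKey` ships every value per key across the network, while `reduceByKey` pre-aggregates on each partition first. A local sketch of both semantics (helper names are hypothetical; requires Scala 2.13+ for `groupMap`/`groupMapReduce`):

```scala
object GroupByKeySketch {
  // groupByKey: collect all values per key.
  def groupByKey[K, V](pairs: Seq[(K, V)]): Map[K, Seq[V]] =
    pairs.groupMap(_._1)(_._2)

  // reduceByKey: combine values per key with f; in Spark this version
  // aggregates map-side before shuffling, so it is usually preferred.
  def reduceByKey[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.groupMapReduce(_._1)(_._2)(f)
}
```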
Abstract: package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo5Sample { def main(args: Ar
Read more
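`RDD.sample(withReplacement = false, fraction)` keeps each element independently with probability `fraction`, so the result size is approximate, not exact. A local sketch of the without-replacement case (the `sample` helper is hypothetical):

```scala
import scala.util.Random

object SampleSketch {
  // Keep each element with probability `fraction`; a fixed seed makes the
  // draw reproducible, like passing a seed to RDD.sample.
  def sample[A](xs: Seq[A], fraction: Double, seed: Long): Seq[A] = {
    val rng = new Random(seed)
    xs.filter(_ => rng.nextDouble() < fraction)
  }
}
```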
Abstract: map package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo2Map { def main(args: A
Read more
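`RDD.map` is a one-to-one transformation (each input element yields exactly one output), whereas `flatMap` may yield zero or more. A minimal local sketch of the distinction (helper names are hypothetical):

```scala
object MapSketch {
  // map: one output per input element.
  def lineLengths(lines: Seq[String]): Seq[Int] = lines.map(_.length)

  // flatMap: each line expands to zero or more words.
  def words(lines: Seq[String]): Seq[String] = lines.flatMap(_.split("\\s+"))
}
```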
Abstract: Implementing WordCount in Spark package com.shujia.spark.core import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} object Demo1WordCount
Read more
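The canonical Spark WordCount chains `flatMap` (split lines into words), `map` (pair each word with 1), and `reduceByKey` (sum the 1s). The same pipeline expressed on a plain Scala collection, so it runs without a Spark cluster (the `wordCount` helper is hypothetical; requires Scala 2.13+):

```scala
object WordCountSketch {
  // flatMap -> map -> reduceByKey, on a local Seq instead of an RDD.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))           // split each line into words
      .map(word => (word, 1))             // pair every word with a count of 1
      .groupMapReduce(_._1)(_._2)(_ + _)  // sum counts per word (reduceByKey)
}
```

In the RDD version the input comes from `sc.textFile(...)` and the result is materialized with an action such as `collect` or `saveAsTextFile`.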
Abstract: (excerpted from xiaohu_bigdata) Spark was originally developed in 2009 by the AMP Lab at the University of California, Berkeley. It is a parallel big-data computing framework based on in-memory computation, and can be used to build large-scale, low-latency data-analysis applications. Spark's characteristics: Spark has the following main features: Fast execution: Spark uses an advanced DAG (Directed Acyc
Read more