Summary: The official documentation puts it this way: Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local di...
posted @ 2015-05-18 17:35 HarkLee
Summary: Spark SQL supports schema merging. Here is the official code, as-is: val sqlContext = new org.apache.spark.sql.SQLContext(sc) // sqlContext from the previous example is used in this exa...
posted @ 2015-05-18 15:32 HarkLee
Summary: Descriptions of the save modes, copied from the official site. Scala/Java: SaveMode.ErrorIfExists (default); Python: "error" (default); Meaning: When saving a DataFrame to a data source, if data already exis...
posted @ 2015-05-18 14:35 HarkLee
Summary: val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json", "json"); df.select("name", "age").save("/opt/test/namesAndAges...
posted @ 2015-05-18 14:09 HarkLee