Post archive for May 18, 2015 - HarkLee

May 18, 2015

Summary: The official documentation puts it this way: Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local di...

posted @ 2015-05-18 17:35 HarkLee Views (1838) Comments (0) Recommended (0)

Merging schemas in Spark SQL

Summary: Spark SQL supports schema merging. Here is the official code, as is:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// sqlContext from the previous example is used in this exa...

posted @ 2015-05-18 15:32 HarkLee Views (988) Comments (0) Recommended (0)

Several ways to save data in Spark SQL

Summary: Descriptions of the save modes, copied from the official site:
Scala/Java | Python | Meaning
SaveMode.ErrorIfExists (default) | "error" (default) | When saving a DataFrame to a data source, if data already exis...

posted @ 2015-05-18 14:35 HarkLee Views (8994) Comments (1) Recommended (1)

Saving data in Parquet and JSON format in Spark SQL

Summary:
val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json", "json")
df.select("name", "age").save("/opt/test/namesAndAges...

posted @ 2015-05-18 14:09 HarkLee Views (1524) Comments (0) Recommended (0)
