Hadoop vs Spark Performance Comparison
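Summary: Based on Spark-0.4 and Hadoop-0.20.2.
1. Kmeans
Data: self-generated three-dimensional points, clustered around the 8 vertices of a cube: {0, 0, 0}, {0, 10, 0}, {0, 0, 10}, {0, 10, 10}, {10, 0, 0}, {10, 0, 10}, {10, 10, 0}, {10, 10, 10}
Point number: 189,918,082 (about 190 million 3D points)
Capacity: 10GB
HDFS Location: /user/LijieXu/Kmeans/Square-10GB.txt
Program logic: read the blocks on HDFS into memory and convert each block into an RDD containing vectors. Then…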
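As a rough illustration of the data set described in the summary, here is a minimal Python sketch that generates 3D points around the 8 cube vertices and runs a single K-means update over them. It does not use Spark or HDFS and is not the post's actual program: the Gaussian noise, the `spread` parameter, and the function names `generate_points`, `closest` and `kmeans_step` are all assumptions made for this example.

```python
import random

# The 8 cube vertices listed in the summary (coordinates 0 or 10 on each axis).
CENTERS = [(x, y, z) for x in (0, 10) for y in (0, 10) for z in (0, 10)]


def generate_points(n, spread=1.0, seed=0):
    """Generate n 3D points around randomly chosen cube vertices.

    Assumption: Gaussian noise with standard deviation `spread`;
    the original post does not state the distribution it used.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        cx, cy, cz = rng.choice(CENTERS)
        points.append((rng.gauss(cx, spread),
                       rng.gauss(cy, spread),
                       rng.gauss(cz, spread)))
    return points


def closest(point, centroids):
    """Return the index of the centroid nearest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans_step(points, centroids):
    """One K-means iteration: assign each point, then recompute the means."""
    sums = [[0.0, 0.0, 0.0] for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        i = closest(p, centroids)
        counts[i] += 1
        for d in range(3):
            sums[i][d] += p[d]
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / counts[i] for s in sums[i]) if counts[i] else centroids[i]
        for i in range(len(centroids))
    ]
```

In a distributed version, the assignment loop would correspond to a map over the cached points and the mean computation to a per-cluster reduce; the sketch above keeps both in one process for clarity.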
posted @ 2012-08-13 11:50