1. Average CPU utilization
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
wiki-Vote 0.7115 0.103689 25.71 28.18
soc-Slashdot0902 8.2168 0.948464 30.03 34.4
web-Google 87.5713 5.105039 31.62 30.6
cit-Patents 377.4768 16.518948 28.59 28.42
twitter-Small 1131.6811 85.331845 22.12 31.6
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
kronecker19 41.6962 3.206497 31.07 31.27
kronecker20 83.3566 7.054294 30.68 28.49
kronecker21 166.5554 15.519448 29.71 26.79
kronecker22 333.0326 34.142787 26.67 26.99
kronecker23 665.4956 75.114133 22.99 27.13
kronecker24 1330.5449 165.251092 21.5 28.43
Goodness of fit
<Hadoop>
SSR: 36.2727
SE: 2.007560708920156
R2: 0.743082937526223
<Hadoop>
<spark>
SSR: 55.8185
SE: 2.4903926508796874
R2: 0.014813494346762601
<spark>
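The SE and R2 values above appear consistent with the usual definitions SE = sqrt(SSR / (n - 2)) and R2 = 1 - SSR / SST, taken over the n = 11 training graphs (5 real-world, 6 synthetic). A minimal Python sketch that recomputes the Hadoop figures from the reported SSR and the Hadoop CPU column; the helper name `fit_stats` and the n - 2 degrees of freedom are assumptions inferred from the numbers, not something the source states.

```python
import math

# Hadoop average CPU utilization for the 11 training graphs
# (5 real-world rows followed by 6 kronecker rows), copied from the tables above.
hadoop_cpu = [25.71, 30.03, 31.62, 28.59, 22.12,
              31.07, 30.68, 29.71, 26.67, 22.99, 21.5]


def fit_stats(ssr, observed, n_params=2):
    """Return (SE, R2) for a fit with the given sum of squared residuals.

    n_params=2 (slope plus intercept) is an assumption inferred from the
    reported numbers, not a documented property of the original fit.
    """
    n = len(observed)
    mean = sum(observed) / n
    sst = sum((y - mean) ** 2 for y in observed)  # total sum of squares
    se = math.sqrt(ssr / (n - n_params))          # residual standard error
    r2 = 1.0 - ssr / sst                          # coefficient of determination
    return se, r2


se, r2 = fit_stats(36.2727, hadoop_cpu)
print(f"SE = {se:.4f}, R2 = {r2:.4f}")
```

Under these assumptions the result lands close to the reported Hadoop values (SE about 2.008, R2 about 0.743). The same call with the Spark SSR and the Spark column can be used to check the Spark block.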
Predicted values
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 28.09 28.87 25.44 29.44
soc-LiveJournal 24.16 26.64 27.33 29.24
v23 23.43 24.59 26.88 29.05
v24 21.9 19.16 28.41 28.57
Hadoop Error: 0.0763438904190619
Spark Error: 0.07843159156040536
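The two error figures above appear to be mean relative errors, that is, the average of |real - predicted| / real over the four test graphs. A short Python sketch under that assumption, using the rounded values from the table (so the last digits differ slightly from the reported ones); `mean_relative_error` is a hypothetical helper name.

```python
def mean_relative_error(real, predicted):
    """Average of |real - predicted| / real over all test graphs."""
    return sum(abs(r - p) / r for r, p in zip(real, predicted)) / len(real)


# Row order: soc-Pokec, soc-LiveJournal, v23, v24 (values from the table above).
hadoop_real = [28.09, 24.16, 23.43, 21.9]
hadoop_pred = [28.87, 26.64, 24.59, 19.16]
spark_real = [25.44, 27.33, 26.88, 28.41]
spark_pred = [29.44, 29.24, 29.05, 28.57]

hadoop_err = mean_relative_error(hadoop_real, hadoop_pred)
spark_err = mean_relative_error(spark_real, spark_pred)
print(f"Hadoop: {hadoop_err:.4f}, Spark: {spark_err:.4f}")
```

Both results agree with the reported errors to about three decimal places (roughly 0.076 and 0.078).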
2. Total memory usage
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-System Spark-System Hadoop-App Spark-App
wiki-Vote 0.7115 0.103689 9183 11068 8234 9656
soc-Slashdot0902 8.2168 0.948464 10152 16509 8467 15031
web-Google 87.5713 5.105039 9368 24208 9123 22375
cit-Patents 377.4768 16.518948 14455 36279 9346 32978
twitter-Small 1131.6811 85.331845 40304 71378 11365 65193
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-System Spark-System Hadoop-App Spark-App
kronecker19 41.6962 3.206497 9732 22601 9099 21073
kronecker20 83.3566 7.054294 9897 24609 9242 22923
kronecker21 166.5554 15.519448 10885 28707 9475 26173
kronecker22 333.0326 34.142787 17632 45336 9772 41727
kronecker23 665.4956 75.114133 32271 71553 10713 64563
kronecker24 1330.5449 165.251092 58096 72909 12354 67454
Goodness of fit
System
<Hadoop>
SSR: 1.15289E8
SE: 3579.090511413324
R2: 0.9567841910604893
<Hadoop>
<spark>
SSR: 6.95683E8
SE: 8791.934435100793
R2: 0.8709518191368232
<spark>
Application
<Hadoop>
SSR: 955751.0
SE: 325.87506126666943
R2: 0.9386345719300593
<Hadoop>
<spark>
SSR: 5.52254E8
SE: 7833.36170207629
R2: 0.8774084647738593
<spark>
Predicted values
System
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 13835 12598.3 36557 28368.4
soc-LiveJournal 27541 23629.71 69629 43329.99
v23 34102 33708.53 71411 56999.6
v24 62219 60529.58 69866 93376.25
Application
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 9681 9170.8 33556 25942.8
soc-LiveJournal 10348 10005.65 64670 39670.31
v23 10640 10768.41 64843 52212.4
v24 12167 12798.22 66758 85588.57
Hadoop Error: 0.0374332992587976
Spark Error: 0.27257832617312483
3. Average disk I/O bandwidth
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
wiki-Vote 0.7115 0.103689 0.23 0.09
soc-Slashdot0902 8.2168 0.948464 0.22 0.07
web-Google 87.5713 5.105039 1.09 0.09
cit-Patents 377.4768 16.518948 3.35 1.35
twitter-Small 1131.6811 85.331845 5.95 1.61
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
kronecker19 41.6962 3.206497 0.59 0.09
kronecker20 83.3566 7.054294 1.19 0.32
kronecker21 166.5554 15.519448 2.72 1.15
kronecker22 333.0326 34.142787 3.49 1.55
kronecker23 665.4956 75.114133 5.35 1.66
kronecker24 1330.5449 165.251092 7.42 1.68
Goodness of fit (Spark: log model)
<Hadoop>
SSR: 5.04663
SE: 0.7488235217103337
R2: 0.9188094088946611
<Hadoop>
<spark>
SSR: 1.67605
SE: 0.4315411657973985
R2: 0.6876516291236345
<spark>
Predicted values (Spark: linear model)
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 3.37 1.74 1.54 0.6
soc-LiveJournal 4.88 3.39 1.69 1
v23 5.57 4.9 1.65 1.37
v24 7.48 8.91 1.72 2.35
Hadoop Error: 0.27515726710575933
Spark Error: 0.3863193359842736
Predicted values (Spark: log model)
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 3.37 1.74 1.54 0.98
soc-LiveJournal 4.88 3.39 1.69 1.27
v23 5.57 4.9 1.65 1.4
v24 7.48 8.91 1.72 1.58
Hadoop Error: 0.27515726710575933
Spark Error: 0.21058748746112452
4. Average network I/O bandwidth
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
wiki-Vote 0.7115 0.103689 0.48 0.14
soc-Slashdot0902 8.2168 0.948464 0.62 0.55
web-Google 87.5713 5.105039 1.36 1.71
cit-Patents 377.4768 16.518948 2.48 2.2
twitter-Small 1131.6811 85.331845 3.31 2.31
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop Spark
kronecker19 41.6962 3.206497 0.91 1.39
kronecker20 83.3566 7.054294 1.41 1.79
kronecker21 166.5554 15.519448 2.07 2.26
kronecker22 333.0326 34.142787 2.83 2.45
kronecker23 665.4956 75.114133 3.21 2.46
kronecker24 1330.5449 165.251092 3.62 2.4
Goodness of fit (log model)
<Hadoop>
SSR: 2.13593
SE: 0.48716070814009166
R2: 0.834182608364257
<Hadoop>
<spark>
SSR: 0.46733
SE: 0.22787179631440913
R2: 0.9263167475554205
<spark>
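The Spark figures for the log model appear to be reproducible by an ordinary least-squares fit of bandwidth against the natural log of the Nodes column, y = a + b * ln(nodes). The choice of Nodes as the predictor is inferred from the numbers, not stated in the source, and `fit_log_model` is a hypothetical helper name. A minimal Python sketch:

```python
import math

# Nodes column and Spark average network bandwidth for the 11 training graphs
# (5 real-world rows, then 6 kronecker rows), copied from the tables above.
nodes = [0.7115, 8.2168, 87.5713, 377.4768, 1131.6811,
         41.6962, 83.3566, 166.5554, 333.0326, 665.4956, 1330.5449]
spark_bw = [0.14, 0.55, 1.71, 2.2, 2.31,
            1.39, 1.79, 2.26, 2.45, 2.46, 2.4]


def fit_log_model(x, y):
    """Ordinary least squares for y = a + b * ln(x); returns (a, b, ssr, r2)."""
    lx = [math.log(v) for v in x]
    n = len(lx)
    mx, my = sum(lx) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, y))
    b = sxy / sxx                  # slope on the log scale
    a = my - b * mx                # intercept
    ssr = sum((v - (a + b * u)) ** 2 for u, v in zip(lx, y))
    sst = sum((v - my) ** 2 for v in y)
    return a, b, ssr, 1.0 - ssr / sst


a, b, ssr, r2 = fit_log_model(nodes, spark_bw)
print(f"a = {a:.3f}, b = {b:.3f}, SSR = {ssr:.4f}, R2 = {r2:.4f}")
```

With the table values, SSR and R2 come out close to the reported 0.46733 and 0.9263 for Spark. Swapping `math.log(v)` for `v` gives the corresponding linear model for comparison.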
Predicted values (linear model)
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 2.59 1.55 2.45 1.56
soc-LiveJournal 3.23 2.24 2.57 1.89
v23 3.37 2.87 2.37 2.2
v24 3.63 4.54 2.33 3.02
Hadoop Error: 0.9991710102541724
Spark Error: 0.9991223108974356
Predicted values (log model)
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 2.59 2.2 2.45 1.91
soc-LiveJournal 3.23 2.7 2.57 2.28
v23 3.37 2.91 2.37 2.44
v24 3.63 3.23 2.33 2.68
Hadoop Error: 0.14065272208292787
Spark Error: 0.12764804947857733
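Recomputing the mean relative error from the rounded entries of the two network prediction tables gives roughly 0.277 (Hadoop) and 0.249 (Spark) for the linear model, against 0.140 and 0.128 for the log model. The log figures agree with the reported ones. The linear figures do not match the reported 0.999 values, which may have been produced by a different formula or run. The Python sketch below assumes the error is the average of |real - predicted| / real; the dictionary layout is illustrative only.

```python
def mean_relative_error(real, predicted):
    """Average of |real - predicted| / real over all test graphs."""
    return sum(abs(r - p) / r for r, p in zip(real, predicted)) / len(real)


# Row order: soc-Pokec, soc-LiveJournal, v23, v24 (values from the tables above).
real = {"Hadoop": [2.59, 3.23, 3.37, 3.63],
        "Spark": [2.45, 2.57, 2.37, 2.33]}
predicted = {
    "linear": {"Hadoop": [1.55, 2.24, 2.87, 4.54],
               "Spark": [1.56, 1.89, 2.2, 3.02]},
    "log": {"Hadoop": [2.2, 2.7, 2.91, 3.23],
            "Spark": [1.91, 2.28, 2.44, 2.68]},
}

# errors[(model, system)] -> mean relative error on the four test graphs
errors = {
    (model, system): mean_relative_error(real[system], preds[system])
    for model, preds in predicted.items()
    for system in real
}

for (model, system), err in sorted(errors.items()):
    print(f"{model:6s} {system:6s} {err:.4f}")
```

On these four test graphs the log model has the lower error for both systems.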
5. Total disk I/O
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-Total-Read Spark-Total-Read Hadoop-Total-Write Spark-Total-Write
wiki-Vote 0.7115 0.103689 189.67 276.25 438.43 16.49
soc-Slashdot0902 8.2168 0.948464 188.42 291.59 453.37 13.05
web-Google 87.5713 5.105039 242.48 351.67 2649.85 37.64
cit-Patents 377.4768 16.518948 417.19 540.86 13215.73 1693.7
twitter-Small 1131.6811 85.331845 1310.42 1456.49 75797.69 5423.56
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-Total-Read Spark-Total-Read Hadoop-Total-Write Spark-Total-Write
kronecker19 41.6962 3.206497 221.86 318.01 1327.23 26.95
kronecker20 83.3566 7.054294 252.33 377.16 3074.33 157.6
kronecker21 166.5554 15.519448 337.94 490.82 8974.67 1024.94
kronecker22 333.0326 34.142787 577.17 733.23 18729.05 2818.79
kronecker23 665.4956 75.114133 1161.75 1299.17 56731.16 6549.93
kronecker24 1330.5449 165.251092 2292.61 3584.97 163565.06 14622.67
Predicted values
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 14385.9 8743.9 2345 979.45
soc-LiveJournal 43627.56 41651.87 5747.28 3838.22
v23 60092.9 71718.13 7519.37 6450.13
v24 171156.89 151728.43 17588.33 13400.76
6. Total network I/O
Real-world graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-Total-Read Spark-Total-Read Hadoop-Total-Write Spark-Total-Write
wiki-Vote 0.7115 0.103689 1047.55 28.28 956.75 27.46
soc-Slashdot0902 8.2168 0.948464 1373.96 113.08 1307.49 110.13
web-Google 87.5713 5.105039 3475.79 735.34 3364.81 717.74
cit-Patents 377.4768 16.518948 10281.61 2877.33 10002.93 2814.32
twitter-Small 1131.6811 85.331845 43900.91 8117.75 42938.61 7947.36
Synthetic graphs
dataset Nodes(10^4) Edges(10^6) Hadoop-Total-Read Spark-Total-Read Hadoop-Total-Write Spark-Total-Write
kronecker19 41.6962 3.206497 2186.18 458.14 2101.54 445.7
kronecker20 83.3566 7.054294 3843.44 930 3730.54 911.38
kronecker21 166.5554 15.519448 7154.24 2109.45 6969.85 2057.67
kronecker22 333.0326 34.142787 15866.44 4655.56 15493.62 4551.67
kronecker23 665.4956 75.114133 35550.19 10134.52 34705.65 9944.77
kronecker24 1330.5449 165.251092 82821.48 21815.64 81106.79 21374.18
Predicted values
dataset Hadoop-Real Hadoop-Predicted Spark-Real Spark-Predicted
soc-Pokec 11580.02 7153.32 3875.54 1857.14
soc-LiveJournal 30217 24193.13 9154.24 6030.16
v23 37951.99 39761.5 11317.24 9842.84
v24 86509.42 81190.99 24783.83 19988.86