[Spark][Python][Application] An example of running a Spark application non-interactively


The script below is run non-interactively with spark-submit rather than typed into the interactive shell.

$ cat Count.py

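import sys
from pyspark import SparkContext

if __name__ == "__main__":
    # Create the SparkContext; the master is supplied by spark-submit.
    sc = SparkContext()
    # Path of the log file(s) to scan, taken from the first command-line argument.
    logfile = sys.argv[1]

    # Count the lines that contain the substring '.jpg'.
    count = sc.textFile(logfile).filter(lambda line: '.jpg' in line).count()
    print("Number of JPG requests: %d" % count)

    sc.stop()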

 

$

$ spark-submit --master yarn-client Count.py /test/weblogs/*

Number of JPG requests: 10258
$
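The filter logic can be sanity-checked locally before submitting the job to the cluster. The following is a minimal sketch in plain Python, not part of the original script: the helper names `is_jpg_request` and `count_jpg_requests` and the sample log lines are hypothetical, made up purely for illustration. It applies the same `'.jpg' in line` predicate to an in-memory list instead of an RDD.

```python
def is_jpg_request(line):
    """Same predicate as the lambda passed to filter() in Count.py."""
    return '.jpg' in line


def count_jpg_requests(lines):
    """Local stand-in for sc.textFile(...).filter(...).count()."""
    return sum(1 for line in lines if is_jpg_request(line))


# Hypothetical sample log lines, invented for this example only.
sample_lines = [
    '10.0.0.1 - - "GET /images/logo.jpg HTTP/1.0" 200 1024',
    '10.0.0.2 - - "GET /index.html HTTP/1.0" 200 2048',
    '10.0.0.3 - - "GET /photos/cat.jpg HTTP/1.0" 200 4096',
]

print("Number of JPG requests: %d" % count_jpg_requests(sample_lines))
```

Note that the predicate is a plain substring match, so it also counts any line where '.jpg' appears somewhere other than the requested file name.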

posted @ 2017-10-29 10:02  健哥的数据花园