Spark Study Notes: Using PySpark
1. Start pyspark
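The interactive shell is started from the command line. A minimal sketch, assuming Spark's bin/ directory is on the PATH; on a CDH cluster with the SPARK2 parcel (as in the submit command later in this note), the command is pyspark2 instead:

```shell
# Launch the interactive PySpark shell with a local master
# (on CDH with the SPARK2 parcel, use `pyspark2` instead)
pyspark --master local[*]
```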
2. Read a file
>>> from pyspark.sql import SparkSession
>>>
>>> spark = SparkSession.builder.appName("myjob").getOrCreate()
>>> df = spark.read.text("hdfs:///user/lintong/logs/test")
>>> df.show()
+-----+
|value|
+-----+
|    1|
|    2|
|    3|
|    4|
+-----+
3. To exit pyspark, use exit()
4. Submit the PySpark job pi.py with spark-submit
spark2-submit --master local[*] /opt/cloudera/parcels/SPARK2/lib/spark2/examples/src/main/python/pi.py
This article is published only on 博客园 (cnblogs) and tonglin0325's blog. Author: tonglin0325. When reposting, please cite the original link: https://www.cnblogs.com/tonglin0325/p/5463816.html