Spark notes
When running jobs, some intermediate data is written to local directories on each cluster worker; by default this goes to /tmp.
Add to spark-env.sh:
export SPARK_WORKER_DIR=/data/spark/work
export SPARK_LOCAL_DIRS=/data/spark/data
Create the directories first and grant 777 permissions so the worker process can write to them:
mkdir -p /data/spark/work
mkdir -p /data/spark/data
chmod 777 /data/spark/data
chmod 777 /data/spark/work
For reference, the spark-env.sh template describes these two variables as:
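The mkdir/chmod steps above can be wrapped in a small reusable helper so they stay idempotent when re-run on each worker; `setup_spark_dirs` is a hypothetical name for this sketch, and `stat -c` assumes GNU coreutils (Linux):

```shell
# setup_spark_dirs: create each given directory, open permissions to 777
# (matching the chmod above), and print the octal mode for verification.
setup_spark_dirs() {
    for d in "$@"; do
        mkdir -p "$d"
        chmod 777 "$d"
        stat -c '%a %n' "$d"   # expect: 777 <dir>
    done
}

# On a real worker (paths from the note above):
# setup_spark_dirs /data/spark/work /data/spark/data
```

Run it once per worker (e.g. via pdsh or your deployment tooling) before restarting the Spark daemons, so the directories exist when SPARK_WORKER_DIR and SPARK_LOCAL_DIRS take effect.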
# - SPARK_WORKER_DIR, to set the working directory of worker processes
# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data