spark-env.sh configuration example
#spark-env.sh
JAVA_HOME=/home/hadoop/app/jdk1.7.0_60
SCALA_HOME=/home/hadoop/app/scala-2.10.3
SPARK_HOME=/home/hadoop/app/spark-1.4.0
SPARK_PID_DIR=$SPARK_HOME/tmp
HADOOP_CONF_DIR=/home/hadoop/app/hadoop/etc/hadoop
SPARK_CLASSPATH=$SPARK_HOME/conf/:$SPARK_HOME/lib/*:/home/hadoop/app/hadoop/share/hadoop/common/lib/hadoop-lzo-0.4.19.jar:/home/hadoop/app/hbase/conf:/home/hadoop/app/hadoop/lib/native:$SPARK_CLASSPATH
SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.akka.askTimeout=300 -Dspark.ui.retainedStages=1000 -Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://sparkcluster/user/spark_history_logs -Dspark.shuffle.spill=false -Dspark.shuffle.manager=hash -Dspark.yarn.max.executor.failures=99999 -Dspark.worker.timeout=300"
SPARK_LOCAL_DIRS=/data1/hadoop/spark_local_dir,/data2/hadoop/spark_local_dir,/data3/hadoop/spark_local_dir,/data4/hadoop/spark_local_dir,/data5/hadoop/spark_local_dir,/data6/hadoop/spark_local_dir,/data7/hadoop/spark_local_dir,/data8/hadoop/spark_local_dir,/data9/hadoop/spark_local_dir,/data10/hadoop/spark_local_dir
SPARK_MASTER_PORT=4050
SPARK_WORKER_CORES=30
SPARK_WORKER_MEMORY=60g
SPARK_WORKER_INSTANCES=6
SPARK_DRIVER_MEMORY=12g
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=spark1:2181,spark2:2181,spark3:2181 $SPARK_DAEMON_JAVA_OPTS"
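To put a file like this into effect, it is normally placed at $SPARK_HOME/conf/spark-env.sh on every node and the standalone daemons are restarted. A minimal sketch, assuming the standalone deploy mode and the SPARK_HOME path from the example above:

# Assumes a standalone cluster and the SPARK_HOME used in the example above.
cd /home/hadoop/app/spark-1.4.0

# spark-env.sh is sourced by the launch scripts on each node;
# start from the shipped template and add the settings shown above.
cp conf/spark-env.sh.template conf/spark-env.sh

# Restart the master and workers so the new settings take effect.
sbin/stop-all.sh
sbin/start-all.sh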
How should SPARK_WORKER_CORES and SPARK_WORKER_MEMORY be set sensibly for a given machine?
First check the machine's CPU information (see the sketch after the commands for turning this into worker settings):
# Total physical cores = number of physical CPUs x cores per physical CPU
# Total logical CPUs  = number of physical CPUs x cores per physical CPU x hyper-threads per core

# Count physical CPUs
cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l

# Count logical CPUs
cat /proc/cpuinfo | grep "processor" | wc -l
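Each node advertises SPARK_WORKER_INSTANCES x SPARK_WORKER_CORES cores and SPARK_WORKER_INSTANCES x SPARK_WORKER_MEMORY memory to the master; in the example above that is 6 x 30 = 180 cores and 6 x 60g = 360 GB per node, which only makes sense on machines with that many logical CPUs and that much RAM (or when deliberately oversubscribing CPU). The sketch below derives the two settings from the machine while leaving headroom for the OS and other daemons; the 90% core and 80% memory ratios and the fixed instance count are illustrative assumptions, not values recommended by Spark itself:

#!/bin/bash
# Sketch: compute worker settings from the machine's logical CPUs and RAM,
# reserving headroom for the OS, HDFS, and other services (assumed ratios).

logical_cpus=$(grep -c "processor" /proc/cpuinfo)
total_mem_gb=$(free -g | awk '/^Mem:/ {print $2}')
instances=6   # matches SPARK_WORKER_INSTANCES in the example above

# Use ~90% of logical CPUs and ~80% of RAM, split across worker instances.
worker_cores=$(( logical_cpus * 90 / 100 / instances ))
worker_mem_gb=$(( total_mem_gb * 80 / 100 / instances ))

echo "SPARK_WORKER_CORES=${worker_cores}"
echo "SPARK_WORKER_MEMORY=${worker_mem_gb}g"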