Spark Standalone Mode Configuration

Spark is a currently popular distributed computing framework. This post gives a brief introduction and the steps to configure Spark standalone mode on several machines.

It is easy to configure from scratch. The instructions below use spark-2.0.2-bin-hadoop2.7 as the example, on Debian Linux machines, for Scala programming.

Assume you have two machines with the IP addresses 192.168.0.51 and 192.168.0.52.

1. Install Java, Scala, and sbt first

For Scala, see: https://www.scala-lang.org/download/install.html

For sbt, see: http://www.scala-sbt.org/0.13/docs/Installing-sbt-on-Linux.html
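
Before moving on, it can help to confirm that the three tools are actually on PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name, not part of any of these tools.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools java scala sbt)"
if [ -n "$missing" ]; then
  echo "not found on PATH:" $missing
else
  echo "java, scala and sbt are all on PATH"
fi
```

If any name is reported, install that tool from the links above before continuing.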

2. Download a prebuilt Spark package with Hadoop support, or compile Spark yourself

Downloads are available at: https://spark.apache.org/downloads.html

3. Unpack the archive and create a symbolic link for easy access later

e.g. execute: ln -s /usr/local/spark-2.0.2-bin-hadoop2.7 /usr/local/spark
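
The unpack and link step can be sketched as one small function. This is only an illustration: install_spark is a hypothetical helper, and it assumes the downloaded file is a .tgz archive whose top-level directory has the same name as the archive.

```shell
# Hypothetical helper: unpack a Spark .tgz into a prefix directory and
# point a stable "spark" symlink at the unpacked directory.
install_spark() {
  archive="$1"                        # e.g. spark-2.0.2-bin-hadoop2.7.tgz
  prefix="$2"                         # e.g. /usr/local
  name="$(basename "$archive" .tgz)"  # directory name inside the archive (assumed)
  tar -xzf "$archive" -C "$prefix" || return 1
  # -n replaces an existing symlink instead of descending into it
  ln -sfn "$prefix/$name" "$prefix/spark"
}

# Usage (writing under /usr/local usually requires root):
# install_spark spark-2.0.2-bin-hadoop2.7.tgz /usr/local
# export SPARK_HOME=/usr/local/spark
```

The later steps refer to $SPARK_HOME, so exporting it to the link path, as in the commented usage, keeps those commands short.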

4. Configure the Spark environment:

(1) Configure the slaves file, one worker host per line: /usr/local/spark-2.0.2-bin-hadoop2.7/conf/slaves

# A Spark Worker will be started on each of the machines listed below.

192.168.0.51

192.168.0.52

(2) Configure conf/spark-env.sh, e.g.

#spark-env.sh

export SCALA_HOME=/usr/local/scala

export JAVA_HOME=/home/local/jdk

#export SPARK_LOCAL_IP=localhost

export SPARK_EXECUTOR_MEMORY=6g

export SPARK_EXECUTOR_CORES=6

export SPARK_MASTER_IP=192.168.0.51

export SPARK_MASTER_PORT=8070

export SPARK_MASTER_WEBUI_PORT=8080

#export SPARK_WORKER_INSTANCES=1

export SPARK_WORKER_PORT=8092

#export SPARK_WORKER_MEMORY=4g

#export SPARK_WORKER_CORES=4
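
The start scripts log in to each host listed in conf/slaves over SSH and launch a worker from the same installation path, so the other machine needs Spark unpacked at the same location with the same configuration. One way to copy the two files edited above is sketched below; sync_conf is a hypothetical helper, and it assumes Spark (and the /usr/local/spark link) already exists on the target host.

```shell
# Hypothetical helper: copy slaves and spark-env.sh to the same conf
# directory on each listed host. Stops at the first failed copy.
sync_conf() {
  conf_dir="$1"; shift
  for host in "$@"; do
    scp "$conf_dir/slaves" "$conf_dir/spark-env.sh" "$host:$conf_dir/" || return 1
  done
}

# Usage, run on 192.168.0.51 once passwordless SSH (step 5) works:
# sync_conf /usr/local/spark/conf 192.168.0.52
```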

5. Set up passwordless SSH access

(1) Generate an SSH key with an empty passphrase

$ ssh-keygen -t rsa -P ""

(2) Append id_rsa.pub to authorized_keys

$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

(3) If you only run Spark standalone on a single local machine, test the login with ssh localhost

$ ssh localhost
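
Running the cat command in (2) more than once appends the same key repeatedly, and sshd in its default configuration rejects key files with loose permissions. A minimal sketch that handles both follows; authorize_key is a hypothetical helper name.

```shell
# Hypothetical helper: append a public key to authorized_keys only if it
# is not already there, and set the permissions sshd expects by default.
authorize_key() {
  pub="$1"        # path to a public key, e.g. $HOME/.ssh/id_rsa.pub
  ssh_dir="$2"    # target directory, e.g. $HOME/.ssh
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$ssh_dir/authorized_keys"
  # -x matches whole lines only, -F treats the key as a fixed string
  grep -qxF -- "$(cat "$pub")" "$ssh_dir/authorized_keys" \
    || cat "$pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Usage on the local machine:
# authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh"
# For the second machine, the standard ssh-copy-id tool does the same job
# remotely ("user" is a placeholder account name):
# ssh-copy-id user@192.168.0.52
```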

6. Start Spark (SPARK_HOME is the installation directory, e.g. /usr/local/spark)

$SPARK_HOME/sbin/start-all.sh

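execute jps to check that the Master and Worker processes are up; the master web UI at http://192.168.0.51:8080 (the SPARK_MASTER_WEBUI_PORT set above) also lists the registered workers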
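
The jps check can also be scripted. jps prints one "<pid> <name>" line per JVM, and the standalone daemons show up as Master and Worker. The sketch below filters that output; has_daemon is a hypothetical helper name.

```shell
# Reads jps-style lines ("<pid> <name>") on stdin and succeeds only if a
# process with the given name is listed.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage on a cluster node:
# jps | has_daemon Master && echo "master is up"
# jps | has_daemon Worker && echo "worker is up"
```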

7. Write and run your application

execute: sbt package

execute: $SPARK_HOME/bin/spark-submit \
    --class "main.scala.MainAppTest" \
    --master local[4] \
    xxxxxxxx.jar
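
Note that --master local[4] runs the application inside a single local JVM with four threads, so it does not use the standalone cluster started in step 6. To submit to that cluster, the master URL takes the form spark://host:port, built from the values in spark-env.sh. A minimal sketch follows; APP_JAR is a hypothetical variable standing in for the jar produced by sbt package, and the submit only runs when it is set and a Spark installation is found.

```shell
# Values copied from the spark-env.sh shown in step 4.
SPARK_MASTER_IP=192.168.0.51
SPARK_MASTER_PORT=8070
MASTER_URL="spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT}"
echo "$MASTER_URL"

# APP_JAR is a hypothetical placeholder; set it to the jar built by sbt package.
APP_JAR="${APP_JAR:-}"
if [ -n "$APP_JAR" ] && [ -x "${SPARK_HOME:-}/bin/spark-submit" ]; then
  "$SPARK_HOME/bin/spark-submit" \
    --class "main.scala.MainAppTest" \
    --master "$MASTER_URL" \
    "$APP_JAR"
fi
```

While the job runs, it should appear under Running Applications in the master web UI.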

posted @ 2017-07-11 13:23 by 安新
