Spark on YARN Cluster Deployment
Overview
Hadoop 2.7.1
Spark 1.5.1
192.168.31.62 resourcemanager, namenode, master
192.168.31.63 nodemanager, datanode, worker
192.168.31.64 nodemanager, datanode, worker
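The slaves files later in this post refer to the workers by hostname (slave1, slave2), so every node must be able to resolve those names. A minimal sketch of the /etc/hosts entries follows. The pairing of names to addresses, and the name master, are assumptions that the post does not state; adjust them to your network.

```shell
# Print /etc/hosts entries for the three nodes.
# ASSUMPTION: master, slave1 and slave2 map to .62, .63 and .64 respectively;
# the original post lists the IPs and the hostnames but not the pairing.
print_hosts_entries() {
    cat <<'EOF'
192.168.31.62 master
192.168.31.63 slave1
192.168.31.64 slave2
EOF
}

# To apply on each node (requires root):
# print_hosts_entries | sudo tee -a /etc/hosts
```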
Hadoop Configuration
In hadoop-env.sh, mapred-env.sh, and yarn-env.sh, set at least JAVA_HOME.
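One way to set JAVA_HOME in all three scripts at once is sketched below. The helper name set_java_home is hypothetical, and the paths in the example call are taken from the spark-env.sh settings in this post; treat both as assumptions about your layout.

```shell
# Append a JAVA_HOME export to each Hadoop env script found in conf_dir.
# A later export overrides any earlier one in the same script.
set_java_home() {
    conf_dir="$1"
    java_home="$2"
    for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
        if [ -f "$conf_dir/$f" ]; then
            printf 'export JAVA_HOME=%s\n' "$java_home" >> "$conf_dir/$f"
        fi
    done
}

# Example call (assumed paths):
# set_java_home /opt/local/hadoop/etc/hadoop /opt/local/java/jdk
```

Scripts that do not exist in the directory are skipped rather than created.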
core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://192.168.31.62:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/opt/local/hadoop/tmp</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>hadoop.native.lib</name>
  <value>true</value>
</property>
hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/opt/local/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/opt/local/hadoop/dfs/data</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>192.168.31.62:50090</value>
</property>
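The core-site.xml and hdfs-site.xml settings above point at three local directories under /opt/local/hadoop. Hadoop can usually create them itself, but pre-creating them makes ownership and permissions explicit. A minimal sketch, with make_hadoop_dirs as a hypothetical helper name:

```shell
# Create the directories referenced by hadoop.tmp.dir,
# dfs.namenode.name.dir and dfs.datanode.data.dir under a base path.
make_hadoop_dirs() {
    base="$1"
    mkdir -p "$base/tmp" "$base/dfs/name" "$base/dfs/data"
}

# Example call matching the paths used in this post:
# make_hadoop_dirs /opt/local/hadoop
```

The name directory only matters on the NameNode and the data directory on the DataNodes; creating all three on every node is harmless.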
mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!--
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>192.168.31.62</value>
</property>
-->
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>192.168.31.62:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.31.62:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>192.168.31.62:8030</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
slaves
slave1
slave2
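With the configuration copied to every node, HDFS and YARN can be started from 192.168.31.62. The sketch below wraps the standard Hadoop scripts in a hypothetical start_hadoop function. The DRY_RUN and FORMAT_NAMENODE switches are illustrative additions, not Hadoop options, and HADOOP_HOME defaults to the value exported in spark-env.sh in this post. The start scripts reach the hosts listed in slaves over SSH, so passwordless SSH from the master is assumed.

```shell
HADOOP_HOME="${HADOOP_HOME:-/opt/local/hadoop}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_hadoop() {
    # Format the NameNode only before the very first start:
    # formatting wipes existing HDFS metadata.
    if [ "${FORMAT_NAMENODE:-0}" = "1" ]; then
        run "$HADOOP_HOME/bin/hdfs" namenode -format
    fi
    run "$HADOOP_HOME/sbin/start-dfs.sh"    # NameNode, SecondaryNameNode, DataNodes
    run "$HADOOP_HOME/sbin/start-yarn.sh"   # ResourceManager, NodeManagers
}

# Preview the commands without running them:
# DRY_RUN=1 start_hadoop
```

After a real start, running jps on each node should list the daemons that match its role in the overview.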
Spark Configuration
spark-env.sh
export JAVA_HOME=/opt/local/java/jdk
export SCALA_HOME=/opt/local/scala
export SPARK_WORKER_MEMORY=1g
export SPARK_MASTER_IP=192.168.31.62
export SPARK_DRIVER_MEMORY=1G
export SPARK_LOCAL_DIRS=/opt/local/spark
export HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop
export HADOOP_HOME=/opt/local/hadoop
slaves
slave1
slave2
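Once HADOOP_CONF_DIR points at the Hadoop configuration, a quick way to check the setup is to submit the bundled SparkPi example to YARN. The following is a sketch under assumptions: SPARK_HOME=/opt/local/spark is a guess (the post does not state the install path), submit_spark_pi is a hypothetical helper, and the memory values simply mirror the 1g settings above. In Spark 1.x binary distributions the examples jar is typically found under lib/, but verify the exact file name in your installation.

```shell
# ASSUMPTION: Spark is installed under /opt/local/spark.
SPARK_HOME="${SPARK_HOME:-/opt/local/spark}"
SPARK_SUBMIT="${SPARK_SUBMIT:-$SPARK_HOME/bin/spark-submit}"

# Submit the SparkPi example in yarn-cluster mode.
# $1 is the path to the Spark examples jar.
submit_spark_pi() {
    examples_jar="$1"
    "$SPARK_SUBMIT" \
        --master yarn-cluster \
        --class org.apache.spark.examples.SparkPi \
        --driver-memory 1g \
        --executor-memory 1g \
        "$examples_jar" 10
}

# Example call (jar path is an assumption; check your lib/ directory):
# submit_spark_pi "$SPARK_HOME"/lib/spark-examples*.jar
```

In yarn-cluster mode the driver runs inside the cluster, so the computed value of Pi appears in the application logs rather than on the submitting console.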
http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/ClusterSetup.html