Spark configuration properties for YARN
Spark 1.2.0
Property Name | Default | Meaning |
--- | --- | --- |
spark.yarn.applicationMaster.waitTries | 10 | Number of attempts the ApplicationMaster makes to wait for the Spark master, and also the number of tries it waits for the SparkContext to be initialized. |
spark.yarn.submit.file.replication | 3 | HDFS replication level for files uploaded to HDFS by the application, such as the Spark jar and the app jar. |
spark.yarn.preserve.staging.files | false | Set to true to preserve the staged files (Spark jar, app jar, etc.) on HDFS at the end of the job instead of deleting them. |
spark.yarn.scheduler.heartbeat.interval-ms | 5000 | Interval in milliseconds at which the Spark application master heartbeats into the YARN ResourceManager. |
spark.yarn.max.executor.failures | numExecutors * 2, with minimum of 3 | Maximum number of executor failures before the application is marked as failed. |
spark.yarn.historyServer.address | (none) | Address of the Spark history server, in the form host.com:port (without http://). Optional; unset by default. |
spark.yarn.dist.archives | (none) | Comma-separated list of archives to be extracted into the working directory of each executor. |
spark.yarn.dist.files | (none) | Comma-separated list of files to be placed in the working directory of each executor. |
spark.yarn.executor.memoryOverhead | executorMemory * 0.07, with minimum of 384 | The amount of off-heap memory (in megabytes) to be allocated per executor. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc. This tends to grow with the executor size (typically 6-10%). |
spark.yarn.driver.memoryOverhead | driverMemory * 0.07, with minimum of 384 | The amount of off-heap memory (in megabytes) to be allocated per driver. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc. This tends to grow with the container size (typically 6-10%). |
spark.yarn.queue | default | Name of the YARN queue to which the application is submitted. |
spark.yarn.jar | (none) | Location of the Spark jar. By default the locally installed Spark jar is used, but it can also be placed on HDFS so it does not have to be distributed each time an application runs. |
spark.yarn.access.namenodes | (none) | Addresses of the secure (Kerberized) HDFS namenodes the application will access, e.g. spark.yarn.access.namenodes=hdfs://nn1.com:8032,hdfs://nn2.com:8032 |
spark.yarn.appMasterEnv.[EnvironmentVariableName] | (none) | Adds the environment variable specified by EnvironmentVariableName to the ApplicationMaster process launched on YARN. |
spark.yarn.containerLauncherMaxThreads | 25 | Maximum number of threads the application master uses to launch executor containers. |
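These properties can be supplied on the spark-submit command line with `--conf`, in `conf/spark-defaults.conf`, or programmatically through `SparkConf`. Below is a minimal Scala sketch assuming yarn-client mode; the queue name "analytics" and the overhead value 1024 are placeholder values for illustration, not recommendations.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object YarnConfExample {
  def main(args: Array[String]): Unit = {
    // Override a few of the spark.yarn.* properties listed above.
    // Values here are made-up examples, not tuned recommendations.
    val conf = new SparkConf()
      .setAppName("yarn-conf-example")
      // Submit to a specific YARN queue instead of "default" (hypothetical queue name).
      .set("spark.yarn.queue", "analytics")
      // Reserve more off-heap overhead per executor, in megabytes.
      .set("spark.yarn.executor.memoryOverhead", "1024")
      // Keep the staged Spark jar / app jar on HDFS after the job finishes,
      // which can help when debugging submission problems.
      .set("spark.yarn.preserve.staging.files", "true")

    val sc = new SparkContext(conf)
    // Trivial job, just to show the context starts with these settings applied.
    println(sc.parallelize(1 to 10).sum())
    sc.stop()
  }
}
```

Note that in yarn-cluster mode the driver itself runs inside the ApplicationMaster, so properties that affect how the ApplicationMaster and containers are requested should be passed at submit time (via `--conf` or `spark-defaults.conf`) rather than set in application code.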