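How YARN (MapReduce 2) Runs a MapReduce Job: A Source Code Analysis
This is my own analysis, written with the help of books and online material. If anything is wrong, corrections are welcome. The classes shown below are not complete; only the important methods are listed.
If you repost this article, please credit the author and the source.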
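I. Source Reading Environment
You need JDK 1.7.0 or later, plus Eclipse for reading the Hadoop source.
For an Eclipse installation tutorial, see my blog.
Download the Hadoop source from the official website. I downloaded version 2.7.3. The source package is the source code project, which you must compile before you can run it; the binary package contains precompiled executables.
If you want to set up a Hadoop cluster, download the binary package. If you want to read the source code, download the source package.
Here we want to analyze the source code, so download the source package. After extraction the directory is named hadoop-2.7.3-src.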
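Importing Hadoop into Eclipse:
1. Open Eclipse and click File->New->Java Project. The dialog shown in Figure 1 appears:
Figure 1
The Hadoop source is now imported into Eclipse. Eclipse will report many errors; for how to resolve them, see my blog post "Eclipse中导入Hadoop源代码工程" (Importing the Hadoop Source Project into Eclipse). The errors do not stop us from reading the source, though.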
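2. If you want to look at a class but do not know where it is, start from hadoop-2.7.3-src.
Take Job.java as an example.
On Windows: type the class name into the search box of File Explorer, as shown in Figure 2:
Figure 2
Then right-click the file and choose "Open file location". That location corresponds to the directory structure of the hadoop-2.7.3-src project in Eclipse.
For example, the Job.java we found is in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce.
On Linux, use the find command in a terminal:
find . -name "Job.java"
The first argument is the path to search, where . means the current directory and / means the root directory. See Figure 3:
Figure 3
We can then locate the file in the hadoop-2.7.3-src project in Eclipse, as shown in Figure 4:
Figure 4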
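If you prefer to stay inside Java, the same lookup can be sketched with the standard java.nio.file API. This is only an illustration: SourceFinder and findByName are names I made up, they are not part of Hadoop, and the code needs Java 8 or later because it uses streams.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Hadoop): a rough Java counterpart of `find <root> -name <fileName>`. */
final class SourceFinder {

    /** Returns every regular file under root whose file name equals fileName, in sorted order. */
    static List<Path> findByName(Path root, String fileName) throws IOException {
        // Files.walk keeps directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(fileName))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the extracted source tree sits in the current directory
        // unless another root is passed as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "hadoop-2.7.3-src");
        for (Path hit : findByName(root, "Job.java")) {
            System.out.println(hit);
        }
    }
}
```

Unlike find, this sketch only reports regular files and matches the name exactly, which is enough for locating a class file.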
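II. Before We Start
To start an installed cluster, we usually go to the hadoop-2.7.3/sbin/ directory on the master node and run the ./start-all.sh script from the command line, which starts everything at once. The script contains the following two lines:
"${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
This line runs the start-dfs.sh script.
"${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
This line runs the start-yarn.sh script.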
#Add other possible options
nameStartOpt="$nameStartOpt $@"

#---------------------------------------------------------
# namenodes

NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)

echo "Starting namenodes on [$NAMENODES]"

"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
  --config "$HADOOP_CONF_DIR" \
  --hostnames "$NAMENODES" \
  --script "$bin/hdfs" start namenode $nameStartOpt

#---------------------------------------------------------
# datanodes (using default slaves file)

if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi

#---------------------------------------------------------
# secondary namenodes (if any)

SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)

if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"

  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$SECONDARY_NAMENODES" \
    --script "$bin/hdfs" start secondarynamenode
fi

#---------------------------------------------------------
# quorumjournal nodes (if any)

SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)

case "$SHARED_EDITS_DIR" in
qjournal://*)
  JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
  echo "Starting journal nodes [$JOURNAL_NODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$JOURNAL_NODES" \
    --script "$bin/hdfs" start journalnode ;;
esac

#---------------------------------------------------------
# ZK Failover controllers, if auto-HA is enabled
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
  echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start zkfc
fi

# eof
As we can see, start-dfs.sh starts the namenode, the datanodes, the secondarynamenode, and so on. The start-yarn.sh script looks like this:
# Start all yarn daemons.  Run this on master node.

echo "starting yarn daemons"

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/yarn-config.sh

# start resourceManager
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start resourcemanager
# start nodeManager
"$bin"/yarn-daemons.sh --config $YARN_CONF_DIR  start nodemanager
# start proxyserver
#"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start proxyserver
As we can see, this script starts the resourcemanager, the nodemanagers, and so on.
These daemons correspond to NameNode.java, DataNode.java, SecondaryNameNode.java, ResourceManager.java, and NodeManager.java. Each of these classes has a public static void main(String argv[]) {...} method. In other words, once started, they all stay running.
NameNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
DataNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
SecondaryNameNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
ResourceManager.java is at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
NodeManager.java is at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
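III. YARN (MapReduce 2)
For large clusters with more than 4,000 nodes, the MapReduce 1 system starts to hit scalability bottlenecks. So in 2010 a team at Yahoo! began designing the next generation of MapReduce. The result was YARN (Yet Another Resource Negotiator).
YARN splits the two roles of the MapReduce 1 jobtracker into two independent daemons: a resource manager, which manages the use of resources across the cluster, and an application master, which manages the lifecycle of the tasks running on the cluster. The basic idea is that the application master negotiates with the resource manager for cluster compute resources in the form of containers (each container has a specific memory limit), and then runs application-specific processes in those containers. The containers are monitored by node managers running on the cluster nodes, which make sure that an application uses no more resources than it was allocated.
The elegance of YARN's design is that different YARN applications can coexist on the same cluster. Users can even run several different versions of MapReduce on the same YARN cluster, which makes MapReduce upgrades easier to manage.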
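MapReduce on YARN involves more entities than classic MapReduce 1:
- The client, which submits the MapReduce job.
- The YARN resource manager, which coordinates the allocation of compute resources on the cluster.
- The YARN node managers, which launch and monitor the compute containers on the machines in the cluster.
- The MapReduce application master, which coordinates the tasks that run the MapReduce job. The application master and the MapReduce tasks run in containers that are allocated by the resource manager and managed by the node managers.
- The distributed file system (normally HDFS), which is used to share job files between the other entities.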
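Figure 5 shows how a job runs; we analyze it step by step below.
Figure 5. How Hadoop runs a MapReduce job using YARN
After writing an MR program, we call job.waitForCompletion(boolean) to submit the job to the cluster and wait for it to finish. Internally, this method first checks the job state and calls submit(); submit() returns immediately once the job has been handed to the cluster.
Hadoop ships with some example programs, and WordCount is one of the most common. We use WordCount as our example. The bundled WordCount.java is at hadoop-2.7.3-src/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java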
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// WordCount, an example program bundled with Hadoop
public class WordCount {

  // Extends the generic class Mapper
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    // An instance of the Hadoop data type IntWritable, set to 1
    private final static IntWritable one = new IntWritable(1);
    // An instance of the Hadoop data type Text
    private Text word = new Text();

    // Implements the map function
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      // Java's string tokenizer; the default delimiters are space, tab ('\t'),
      // newline ('\n'), carriage return ('\r'), and form feed ('\f')
      StringTokenizer itr = new StringTokenizer(value.toString());
      // Loop while there are more tokens
      while (itr.hasMoreTokens()) {
        /*
          nextToken(): returns the next token, i.e. the text up to the next delimiter
          word.set(): converts the Java String into the Hadoop Text type
        */
        word.set(itr.nextToken());
        // Emit the (word, 1) pair through the Context's write method
        context.write(word, one);
      }
    }
  }

  // Extends the generic class Reducer
  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    // Holds the summed count
    private IntWritable result = new IntWritable();

    // Implements the reduce function
    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      // Iterate over the values and add up the count for this word
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      // Convert the Java int sum into the Hadoop type held by result
      result.set(sum);
      // Emit the result; for the reducer this ends up in the job output on HDFS
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    // Create the Configuration
    Configuration conf = new Configuration();
    /*
      GenericOptionsParser is the basic class in the Hadoop framework for parsing command-line arguments.
      getRemainingArgs() returns an array (here, a set of paths).
    */
    /*
      Its implementation:
      public String[] getRemainingArgs() {
        return (commandLine == null) ? new String[]{} : commandLine.getArgs();
      }
    */
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    // With fewer than two paths, print the usage: both an input path and an output path are required
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    // Create the job
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    /*
      Specify the combiner class.
      Many people find the combiner confusing.
    */
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    // The reduce output key type is Text
    job.setOutputKeyClass(Text.class);
    // The reduce output value type
    job.setOutputValueClass(IntWritable.class);
    // Add the input paths
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    // Set the output path
    FileOutputFormat.setOutputPath(job,
      new Path(otherArgs[otherArgs.length - 1]));
    // Submit the job; this calls the waitForCompletion(...) method in Job.java
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
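The last statement, job.waitForCompletion(true), calls the waitForCompletion(...) method in Job.java.
1. Job submission
(1) Job submission in MapReduce 2 uses the same user API as MapReduce 1. The submit() method on Job creates an internal JobSubmitter instance and calls its submitJobInternal() method. After submitting the job, waitForCompletion() polls the job's progress once per second and reports it to the console whenever it has changed since the last report. When the job completes successfully, the job counters are displayed; if it fails, the error that caused the failure is logged to the console.
submit() and waitForCompletion() are both in the Job.java class, while submitJobInternal() is in the JobSubmitter.java class. Both classes are in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce.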
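The "report only when progress has changed" rule is easy to lose in the real source, so here is a minimal sketch of just that rule. It is a toy model, not Hadoop code: ProgressReporter, report, and the "map N%" line format are my own inventions, and the readings are passed in instead of being polled from a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

/** Toy model (not Hadoop code) of a monitor that reports progress only when it changed. */
final class ProgressReporter {

    /**
     * Consumes successive progress readings (0.0 to 1.0) and returns the lines a
     * console monitor would print: one line per reading that differs from the
     * previously reported value. Stops after the first reading of 1.0.
     */
    static List<String> report(Iterator<Float> readings) {
        List<String> lines = new ArrayList<>();
        float lastReported = -1f; // sentinel: nothing reported yet
        while (readings.hasNext()) {
            float progress = readings.next();
            if (progress != lastReported) {
                lines.add(String.format(Locale.ROOT, "map %.0f%%", progress * 100));
                lastReported = progress;
            }
            if (progress >= 1.0f) {
                break; // job complete; a real client would print the counters now
            }
            // A real client would sleep for the poll interval here.
        }
        return lines;
    }

    public static void main(String[] args) {
        // Five polls, but only three distinct values, so three lines are printed.
        List<Float> polled = Arrays.asList(0f, 0f, 0.5f, 0.5f, 1f);
        for (String line : report(polled.iterator())) {
            System.out.println(line);
        }
    }
}
```

Passing the readings in as an iterator keeps the sketch deterministic; in the real client the readings come from the cluster and a sleep separates the polls.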
/**
 * The job submitter's view of the Job.
 *
 * <p>It allows the user to configure the
 * job, submit it, control its execution, and query the state. The set methods
 * only work until the job is submitted, afterwards they will throw an
 * IllegalStateException. </p>
 *
 * <p>
 * Normally the user creates the application, describes various facets of the
 * job via {@link Job} and then submits the job and monitor its progress.</p>
 *
 * <p>Here is an example on how to submit a job:</p>
 * <p><blockquote><pre>
 *     // Create a new Job
 *     Job job = Job.getInstance();
 *     job.setJarByClass(MyJob.class);
 *
 *     // Specify various job-specific parameters
 *     job.setJobName("myjob");
 *
 *     job.setInputPath(new Path("in"));
 *     job.setOutputPath(new Path("out"));
 *
 *     job.setMapperClass(MyJob.MyMapper.class);
 *     job.setReducerClass(MyJob.MyReducer.class);
 *
 *     // Submit the job, then poll for progress until the job is complete
 *     job.waitForCompletion(true);
 * </pre></blockquote>
 *
 *
 */

/*
 * Job submission flow:
 * 1. After writing an MR program,
 *    a. we call job.waitForCompletion(boolean) to submit the job to the cluster and wait for it to finish.
 *    b. Inside that method, the job state is checked first and submit() is called; submit() returns
 *       immediately once the job has been handed to the cluster.
 *    c. After submission, the boolean argument of waitForCompletion() is checked. If it is true, the job is
 *       monitored while it runs and its status is printed, via monitorAndPrintJob(). Otherwise the method
 *       reads the thread sleep interval (5000 ms by default) and loops on isComplete(); while the job is
 *       not complete, the thread sleeps for the configured interval, and so on until the job succeeds or fails.
 * 2. Inside submit():
 *    a. check the job state;
 *    b. call setUseNewAPI() to configure the API;
 *    c. call connect() to connect to the RM (ResourceManager);
 *    d. get a JobSubmitter object: getJobSubmitter(fs, client);
 *    e. have the submitter submit the job: submitJobInternal(Job.this, cluster).
 * 3. Inside connect(), the method that connects to the RM:
 *    a. a Cluster instance is created; the important part of the Cluster constructor is the initialization;
 *    b. the initialization method uses java.util.ServiceLoader to create the client proxy. There are currently
 *       two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs);
 *       the matching client is created according to the mapreduce.framework.name setting.
 *       LocalClientProtocolProvider creates a LocalJobRunner object, which is not covered in detail here.
 *       YarnClientProtocolProvider creates a YARNRunner object, which makes the current JobClient run on YARN.
 *    c. While YARNRunner is instantiated, the client proxies are created in this order:
 *       Cluster->ClientProtocol(YARNRunner)->ResourceMgrDelegate->client(YarnClientImpl)->rmClient(ApplicationClientProtocol)
 *       The RPC proxy is created in the serviceStart phase of YarnClientImpl; note the protocol used there.
 *       Cluster: provides a way to access the map/reduce cluster;
 *       YARNRunner: makes the current JobClient run on YARN and creates the ResourceMgrDelegate during instantiation;
 *       ResourceMgrDelegate: mainly responsible for exchanging information with the RM;
 *       YarnClientImpl: mainly responsible for instantiating rmClient;
 *       rmClient: the protocol that clients use to talk to the RM; it submits or kills jobs and fetches
 *       information (applications, cluster metrics, nodes, queues, and ACLs).
 * 4. Next comes the core part, JobSubmitter.submitJobInternal(Job, Cluster), which submits the job to the system:
 *    a. checkSpecs(Job) validates the job's input and output, mainly that the output directory is set and
 *       does not yet exist on the FS;
 *    b. JobSubmissionFiles provides the directory that holds the job's runtime resource files. The property key
 *       is yarn.app.mapreduce.am.staging-dir and the default directory is /tmp/hadoop-yarn/staging/hadoop/.staging/.
 *       JobSubmissionFiles.getStagingDir(): if the directory exists, its ownership and permissions are checked;
 *       if it does not exist, it is created with permission 700;
 *    c. set up the accounting information for the job's cache, set the host name and address, get the jobId,
 *       and get the submit directory, e.g. /tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002;
 *    d. copy the job's jar and configuration to the map-reduce system directory on the distributed file system
 *       by calling copyAndConfigureFiles(job, submitJobDir), which mainly copies what the -libjars, -files,
 *       and -archives options refer to into submitJobDir;
 *    e. compute the job's input splits; writeSplits() writes job.split and job.splitmetainfo;
 *    f. call writeConf(conf, submitJobFile) to write job.xml to the submit directory on the job file system (jtFs in the code);
 *    g. submit the job and monitor its status by calling submitJob(jobId, submitJobDir, job.getCredentials())
 *       on the YARNRunner object.
 * 5. The actual submission happens inside the submitJob(...) method of the YARNRunner object.
 *
 * Question 1: when writing an MR program against Hadoop 2.x, if hadoop-*-mr1-*.jar is on the classpath, running
 *             the program with plain java uses the local runner; only running it with hadoop jar submits it to YARN.
 */
// Much of the class is omitted here
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class Job extends JobContextImpl implements JobContext {
  private static final Log LOG = LogFactory.getLog(Job.class);

  private JobState state = JobState.DEFINE;
  private JobStatus status;
  private long statustime;
  private Cluster cluster;
  private ReservationId reservationId;

  /**
   * @deprecated Use {@link #getInstance()}
   */
  @Deprecated
  public Job() throws IOException {
    this(new JobConf(new Configuration()));
  }

  /**
   * @deprecated Use {@link #getInstance(Configuration)}
   */
  @Deprecated
  public Job(Configuration conf) throws IOException {
    this(new JobConf(conf));
  }

  /**
   * @deprecated Use {@link #getInstance(Configuration, String)}
   */
  @Deprecated
  public Job(Configuration conf, String jobName) throws IOException {
    this(new JobConf(conf));
    setJobName(jobName);
  }

  Job(JobConf conf) throws IOException {
    super(conf, null);
    // propagate existing user credentials to job
    this.credentials.mergeAll(this.ugi.getCredentials());
    this.cluster = null;
  }

  Job(JobStatus status, JobConf conf) throws IOException {
    this(conf);
    setJobID(status.getJobID());
    this.status = status;
    state = JobState.RUNNING;
  }

  /**
   * Creates a new {@link Job} with no particular {@link Cluster} and a given jobName.
   * A Cluster will be created from the conf parameter only when it's needed.
   *
   * The <code>Job</code> makes a copy of the <code>Configuration</code> so
   * that any necessary internal modifications do not reflect on the incoming
   * parameter.
   *
   * @param conf the configuration
   * @return the {@link Job} , with no connection to a cluster yet.
   * @throws IOException
   */
  public static Job getInstance(Configuration conf, String jobName)
           throws IOException {
    // create with a null Cluster
    Job result = getInstance(conf);
    result.setJobName(jobName);
    return result;
  }

  private synchronized void connect()
          throws IOException, InterruptedException, ClassNotFoundException {
    if (cluster == null) {
      // Creates the Cluster instance; the important part of the Cluster constructor is the initialization
      cluster =
        ugi.doAs(new PrivilegedExceptionAction<Cluster>() {
                   public Cluster run()
                          throws IOException, InterruptedException,
                                 ClassNotFoundException {
                     return new Cluster(getConfiguration());
                   }
                 });
    }
  }

  /**
   * Submit the job to the cluster and return immediately.
   * @throws IOException
   */
  public void submit()
         throws IOException, InterruptedException, ClassNotFoundException {
    ensureState(JobState.DEFINE);   // check the job state
    setUseNewAPI();                 // call setUseNewAPI() to configure the API
    connect();                      // call connect() to connect to the RM (ResourceManager)
    // Get a JobSubmitter object: getJobSubmitter(fs, client)
    final JobSubmitter submitter =
        getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
    status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
      public JobStatus run() throws IOException, InterruptedException,
      ClassNotFoundException {
        // The submitter submits the job: submitJobInternal(Job.this, cluster)
        return submitter.submitJobInternal(Job.this, cluster);
      }
    });
    state = JobState.RUNNING;
    LOG.info("The url to track the job: " + getTrackingURL());
   }

  /**
   * Submit the job to the cluster and wait for it to finish.
   * @param verbose print the progress to the user
   * @return true if the job succeeded
   * @throws IOException thrown if the communication with the
   *         <code>JobTracker</code> is lost
   */
  // Submits the job to the cluster and waits for it to finish.
  public boolean waitForCompletion(boolean verbose
                                   ) throws IOException, InterruptedException,
                                            ClassNotFoundException {
    // Check the job state and call submit(); submit() returns immediately once the job is handed to the cluster
    if (state == JobState.DEFINE) {
      submit();
    }
    /* After submission, the boolean argument of waitForCompletion() is checked. If it is true, the job is
     * monitored while it runs and its status is printed, via monitorAndPrintJob(). Otherwise the method reads
     * the thread sleep interval (5000 ms by default) and loops on isComplete(); while the job is not complete,
     * the thread sleeps for the configured interval, and so on until the job succeeds or fails.
     */
    if (verbose) {
      monitorAndPrintJob();
    } else {
      // get the completion poll interval from the client.
      int completionPollIntervalMillis =
        Job.getCompletionPollInterval(cluster.getConf());
      while (!isComplete()) {
        try {
          Thread.sleep(completionPollIntervalMillis);
        } catch (InterruptedException ie) {
        }
      }
    }
    return isSuccessful();
  }

}
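The Javadoc above says that the set methods only work until the job is submitted. That rule, together with the submit-then-poll shape of waitForCompletion(), can be sketched in a few lines. This is a toy model with made-up names (MiniJob, pollsUntilDone); it does not talk to a cluster and it skips the sleep between polls.

```java
/** Toy model (not Hadoop code) of the DEFINE -> RUNNING state guard and the poll loop in Job. */
final class MiniJob {

    enum State { DEFINE, RUNNING }

    private State state = State.DEFINE;
    private String name = "";
    // Hypothetical stand-in for real cluster progress:
    // how many polls report "not complete" before the job finishes.
    private int pollsUntilDone;

    MiniJob(int pollsUntilDone) {
        this.pollsUntilDone = pollsUntilDone;
    }

    private void ensureState(State expected) {
        if (state != expected) {
            throw new IllegalStateException("Job in state " + state + " instead of " + expected);
        }
    }

    /** Setters are only legal before submission. */
    void setName(String name) {
        ensureState(State.DEFINE);
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** Hands the job over and returns immediately. */
    void submit() {
        ensureState(State.DEFINE);
        state = State.RUNNING;
    }

    boolean isComplete() {
        ensureState(State.RUNNING);
        return pollsUntilDone-- <= 0;
    }

    /** Submits if needed, then polls until completion. Returns how many polls were made. */
    int waitForCompletion() {
        if (state == State.DEFINE) {
            submit();
        }
        int polls = 0;
        boolean done;
        do {
            polls++;
            done = isComplete(); // a real client sleeps for the poll interval between polls
        } while (!done);
        return polls;
    }

    public static void main(String[] args) {
        MiniJob job = new MiniJob(2);
        job.setName("word count");
        System.out.println("polls: " + job.waitForCompletion());
        try {
            job.setName("too late");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the guard is that configuration becomes read-only the moment the job is handed to the cluster, so a late setter fails loudly instead of being silently ignored.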
/**
 * Provides a way to access information about the map/reduce cluster.
 */
// Mainly provides a way to access the map/reduce cluster
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class Cluster {

  @InterfaceStability.Evolving
  public static enum JobTrackerStatus {INITIALIZING, RUNNING};

  private ClientProtocolProvider clientProtocolProvider;
  private ClientProtocol client;
  private UserGroupInformation ugi;
  private Configuration conf;
  private FileSystem fs = null;
  private Path sysDir = null;
  private Path stagingAreaDir = null;
  private Path jobHistoryDir = null;
  private static final Log LOG = LogFactory.getLog(Cluster.class);

  private static ServiceLoader<ClientProtocolProvider> frameworkLoader =
      ServiceLoader.load(ClientProtocolProvider.class);

  static {
    ConfigUtil.loadResources();
  }

  public Cluster(Configuration conf) throws IOException {
    this(null, conf);  // delegates to the two-argument constructor
  }

  public Cluster(InetSocketAddress jobTrackAddr, Configuration conf)
      throws IOException {
    this.conf = conf;
    this.ugi = UserGroupInformation.getCurrentUser();
    initialize(jobTrackAddr, conf);
  }

  /*
   * The initialization method uses java.util.ServiceLoader to create the client proxy. There are currently
   * two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs);
   * the matching client is created according to the mapreduce.framework.name setting.
   * LocalClientProtocolProvider creates a LocalJobRunner object, which is not covered in detail here.
   * YarnClientProtocolProvider creates a YARNRunner object, which makes the current JobClient run on YARN.
   */
  private void initialize(InetSocketAddress jobTrackAddr, Configuration conf)
      throws IOException {

    synchronized (frameworkLoader) {
      for (ClientProtocolProvider provider : frameworkLoader) {
        LOG.debug("Trying ClientProtocolProvider : "
            + provider.getClass().getName());
        ClientProtocol clientProtocol = null;
        try {
          if (jobTrackAddr == null) {
            clientProtocol = provider.create(conf);
          } else {
            clientProtocol = provider.create(jobTrackAddr, conf);
          }

          if (clientProtocol != null) {
            clientProtocolProvider = provider;
            client = clientProtocol;
            LOG.debug("Picked " + provider.getClass().getName()
                + " as the ClientProtocolProvider");
            break;
          }
          else {
            LOG.debug("Cannot pick " + provider.getClass().getName()
                + " as the ClientProtocolProvider - returned null protocol");
          }
        }
        catch (Exception e) {
          LOG.info("Failed to use " + provider.getClass().getName()
              + " due to error: ", e);
        }
      }
    }

    if (null == clientProtocolProvider || null == client) {
      throw new IOException(
          "Cannot initialize Cluster. Please check your configuration for "
              + MRConfig.FRAMEWORK_NAME
              + " and the correspond server addresses.");
    }
  }

  ClientProtocol getClient() {
    return client;
  }

  Configuration getConf() {
    return conf;
  }

}
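The selection loop in initialize() boils down to "ask each provider in turn, and the first one that returns a non-null client wins". Here is a minimal sketch of that idea. It is a toy model: the Provider interface, the string results, and the plain Map standing in for Configuration are all my own simplifications, the providers are listed by hand instead of being discovered through ServiceLoader, and treating a missing mapreduce.framework.name as local is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Hadoop code) of how Cluster picks a client: the first non-null result wins. */
final class ProviderSelection {

    /** Hypothetical stand-in for ClientProtocolProvider; returns null when it does not apply. */
    interface Provider {
        String create(Map<String, String> conf);
    }

    static final String FRAMEWORK_KEY = "mapreduce.framework.name";

    // Assumption of this sketch: a missing key is treated as "local".
    static final Provider LOCAL = conf ->
        "local".equals(conf.getOrDefault(FRAMEWORK_KEY, "local")) ? "LocalJobRunner" : null;

    static final Provider YARN = conf ->
        "yarn".equals(conf.get(FRAMEWORK_KEY)) ? "YARNRunner" : null;

    /** Returns the name of the first client a provider is willing to create. */
    static String pickClient(List<Provider> providers, Map<String, String> conf) {
        for (Provider provider : providers) {
            String client = provider.create(conf);
            if (client != null) {
                return client;
            }
        }
        // The real Cluster throws an IOException here; an unchecked exception keeps the sketch short.
        throw new IllegalStateException(
            "Cannot initialize Cluster. Please check your configuration for " + FRAMEWORK_KEY);
    }

    public static void main(String[] args) {
        List<Provider> providers = Arrays.asList(LOCAL, YARN);
        Map<String, String> conf = new HashMap<>();
        System.out.println(pickClient(providers, conf)); // no framework configured
        conf.put(FRAMEWORK_KEY, "yarn");
        System.out.println(pickClient(providers, conf)); // framework set to yarn
    }
}
```

Because each provider decides for itself whether it applies, adding a new execution framework only means registering another provider; the selection loop does not change.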
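Step 1 ("run job") in Figure 5 is the submit() method handing the job to the cluster. Inside this method there is a call to submitter.submitJobInternal(Job.this, cluster), that is, to the submitJobInternal(...) method in JobSubmitter.java.
submitJobInternal(...) submits the job to the system. Internally it calls submitClient.submitJob(...), which is the submitJob(...) method of ClientProtocol.java.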
@InterfaceAudience.Private
@InterfaceStability.Unstable
class JobSubmitter {
  protected static final Log LOG = LogFactory.getLog(JobSubmitter.class);
  private static final String SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1";
  private static final int SHUFFLE_KEY_LENGTH = 64;
  private FileSystem jtFs;
  private ClientProtocol submitClient;
  private String submitHostName;
  private String submitHostAddress;

  JobSubmitter(FileSystem submitFs, ClientProtocol submitClient)
  throws IOException {
    this.submitClient = submitClient;
    this.jtFs = submitFs;
  }

  /**
   * configure the jobconf of the user with the command line options of
   * -libjars, -files, -archives.
   * @param job
   * @throws IOException
   */
  private void copyAndConfigureFiles(Job job, Path jobSubmitDir)
  throws IOException {
    JobResourceUploader rUploader = new JobResourceUploader(jtFs);
    rUploader.uploadFiles(job, jobSubmitDir);

    // Get the working directory. If not set, sets it to filesystem working dir
    // This code has been added so that working directory reset before running
    // the job. This is necessary for backward compatibility as other systems
    // might use the public API JobConf#setWorkingDirectory to reset the working
    // directory.
    job.getWorkingDirectory();
  }

  /**
   * Internal method for submitting jobs to the system.
   *
   * <p>The job submission process involves:
   * <ol>
   *   <li>
   *   Checking the input and output specifications of the job.
   *   </li>
   *   <li>
   *   Computing the {@link InputSplit}s for the job.
   *   </li>
   *   <li>
   *   Setup the requisite accounting information for the
   *   {@link DistributedCache} of the job, if necessary.
   *   </li>
   *   <li>
   *   Copying the job's jar and configuration to the map-reduce system
   *   directory on the distributed file-system.
   *   </li>
   *   <li>
   *   Submitting the job to the <code>JobTracker</code> and optionally
   *   monitoring it's status.
   *   </li>
   * </ol></p>
   * @param job the configuration to submit
   * @param cluster the handle to the Cluster
   * @throws ClassNotFoundException
   * @throws InterruptedException
   * @throws IOException
   */
  // Mainly responsible for submitting the job to the system
  JobStatus submitJobInternal(Job job, Cluster cluster)
  throws ClassNotFoundException, InterruptedException, IOException {

    //validate the jobs output specs
    // checkSpecs(Job) validates the job's input and output, mainly that the output
    // directory is set and does not yet exist on the FS
    checkSpecs(job);

    Configuration conf = job.getConfiguration();
    addMRFrameworkToDistributedCache(conf);

    /*
     * JobSubmissionFiles provides the directory that holds the job's runtime resource files. The property
     * key is yarn.app.mapreduce.am.staging-dir and the default directory is /tmp/hadoop-yarn/staging/hadoop/.staging/.
     * JobSubmissionFiles.getStagingDir(): if the directory exists, its ownership and permissions are checked;
     * if it does not exist, it is created with permission 700.
     */
    Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
    //configure the command line options correctly on the submitting dfs
    // Set up the accounting information for the job's cache, set the host name and address, get the jobId,
    // and get the submit directory, e.g. /tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002
    InetAddress ip = InetAddress.getLocalHost();
    if (ip != null) {
      submitHostAddress = ip.getHostAddress();
      submitHostName = ip.getHostName();
      conf.set(MRJobConfig.JOB_SUBMITHOST,submitHostName);
      conf.set(MRJobConfig.JOB_SUBMITHOSTADDR,submitHostAddress);
    }
    JobID jobId = submitClient.getNewJobID();
    job.setJobID(jobId);
    Path submitJobDir = new Path(jobStagingArea, jobId.toString());
    JobStatus status = null;
    try {
      conf.set(MRJobConfig.USER_NAME,
          UserGroupInformation.getCurrentUser().getShortUserName());
      conf.set("hadoop.http.filter.initializers",
          "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
      conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
      LOG.debug("Configuring job " + jobId + " with " + submitJobDir
          + " as the submit dir");
      // get delegation token for the dir
      TokenCache.obtainTokensForNamenodes(job.getCredentials(),
          new Path[] { submitJobDir }, conf);

      populateTokenCache(conf, job.getCredentials());

      // generate a secret to authenticate shuffle transfers
      if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
        KeyGenerator keyGen;
        try {
          keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
          keyGen.init(SHUFFLE_KEY_LENGTH);
        } catch (NoSuchAlgorithmException e) {
          throw new IOException("Error generating shuffle secret key", e);
        }
        SecretKey shuffleKey = keyGen.generateKey();
        TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
            job.getCredentials());
      }
      if (CryptoUtils.isEncryptedSpillEnabled(conf)) {
        conf.setInt(MRJobConfig.MR_AM_MAX_ATTEMPTS, 1);
        LOG.warn("Max job attempts set to 1 since encrypted intermediate" +
                "data spill is enabled");
      }

      // Copy the job's jar and configuration to the map-reduce system directory on the distributed
      // file system by calling copyAndConfigureFiles(job, submitJobDir), which mainly copies what the
      // -libjars, -files, and -archives options refer to into submitJobDir
      copyAndConfigureFiles(job, submitJobDir);

      Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

      // Create the splits for the job
      // Compute the job's input splits; writeSplits() writes job.split and job.splitmetainfo
      LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
      int maps = writeSplits(job, submitJobDir);
      conf.setInt(MRJobConfig.NUM_MAPS, maps);
      LOG.info("number of splits:" + maps);

      // write "queue admins of the queue to which job is being submitted"
      // to job file.
      String queue = conf.get(MRJobConfig.QUEUE_NAME,
          JobConf.DEFAULT_QUEUE_NAME);
      AccessControlList acl = submitClient.getQueueAdmins(queue);
      conf.set(toFullPropertyName(queue,
          QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

      // removing jobtoken referrals before copying the jobconf to HDFS
      // as the tasks don't need this setting, actually they may break
      // because of it if present as the referral will point to a
      // different job.
      TokenCache.cleanUpTokenReferral(conf);

      if (conf.getBoolean(
          MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
          MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
        // Add HDFS tracking ids
        ArrayList<String> trackingIds = new ArrayList<String>();
        for (Token<? extends TokenIdentifier> t :
            job.getCredentials().getAllTokens()) {
          trackingIds.add(t.decodeIdentifier().getTrackingId());
        }
        conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
            trackingIds.toArray(new String[trackingIds.size()]));
      }

      // Set reservation info if it exists
      ReservationId reservationId = job.getReservationId();
      if (reservationId != null) {
        conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
      }

      // Write job file to submit dir
      // writeConf(conf, submitJobFile) writes job.xml to the submit directory on the job file system (jtFs)
      writeConf(conf, submitJobFile);

      //
      // Now, actually submit the job (using the submit name)
      //
      // Submit the job and monitor its status by calling submitJob(jobId, submitJobDir, job.getCredentials())
      // on the YARNRunner object; the actual submission happens inside YARNRunner.submitJob(...)
      printTokens(jobId, job.getCredentials());
      status = submitClient.submitJob(
          jobId, submitJobDir.toString(), job.getCredentials());
      if (status != null) {
        return status;
      } else {
        throw new IOException("Could not launch job");
      }
    } finally {
      if (status == null) {
        LOG.info("Cleaning up the staging area " + submitJobDir);
        if (jtFs != null && submitJobDir != null)
          jtFs.delete(submitJobDir, true);

      }
    }
  }

  @SuppressWarnings("unchecked")
  private <T extends InputSplit>
  int writeNewSplits(JobContext job, Path jobSubmitDir) throws IOException,
      InterruptedException, ClassNotFoundException {
    Configuration conf = job.getConfiguration();
    InputFormat<?, ?> input =
      ReflectionUtils.newInstance(job.getInputFormatClass(), conf);

    List<InputSplit> splits = input.getSplits(job);
    T[] array = (T[]) splits.toArray(new InputSplit[splits.size()]);

    // sort the splits into order based on size, so that the biggest
    // go first
    Arrays.sort(array, new SplitComparator());
    JobSplitWriter.createSplitFiles(jobSubmitDir, conf,
        jobSubmitDir.getFileSystem(conf), array);
    return array.length;
  }

  private int writeSplits(org.apache.hadoop.mapreduce.JobContext job,
      Path jobSubmitDir) throws IOException,
      InterruptedException, ClassNotFoundException {
    JobConf jConf = (JobConf)job.getConfiguration();
    int maps;
    if (jConf.getUseNewMapper()) {
      maps = writeNewSplits(job, jobSubmitDir);
    } else {
      maps = writeOldSplits(jConf, jobSubmitDir);
    }
    return maps;
  }

  //method to write splits for old api mapper.
  private int writeOldSplits(JobConf job, Path jobSubmitDir)
  throws IOException {
    org.apache.hadoop.mapred.InputSplit[] splits =
    job.getInputFormat().getSplits(job, job.getNumMapTasks());
    // sort the splits into order based on size, so that the biggest
    // go first
    Arrays.sort(splits, new Comparator<org.apache.hadoop.mapred.InputSplit>() {
      public int compare(org.apache.hadoop.mapred.InputSplit a,
                         org.apache.hadoop.mapred.InputSplit b) {
        try {
          long left = a.getLength();
          long right = b.getLength();
          if (left == right) {
            return 0;
          } else if (left < right) {
            return 1;
          } else {
            return -1;
          }
        } catch (IOException ie) {
          throw new RuntimeException("Problem getting input split size", ie);
        }
      }
    });
    JobSplitWriter.createSplitFiles(jobSubmitDir, job,
        jobSubmitDir.getFileSystem(job), splits);
    return splits.length;
  }

}
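Both writeNewSplits() and writeOldSplits() sort the splits so that the biggest go first. The comparator in writeOldSplits() returns 1 when the left split is smaller, which is simply a descending order on length. Here is a minimal sketch of the same ordering on plain numbers; SplitOrdering and biggestFirst are made-up names, and the lengths are bare long values instead of InputSplit objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Toy model (not Hadoop code) of the "biggest split first" ordering used when writing splits. */
final class SplitOrdering {

    /** Descending order on length: the larger value sorts earlier. */
    static final Comparator<Long> BIGGEST_FIRST = (a, b) -> Long.compare(b, a);

    /** Returns a sorted copy of the given split lengths; the input list is left untouched. */
    static List<Long> biggestFirst(List<Long> splitLengths) {
        List<Long> sorted = new ArrayList<>(splitLengths);
        Collections.sort(sorted, BIGGEST_FIRST);
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical split lengths in bytes.
        List<Long> lengths = Arrays.asList(64L, 128L, 32L, 128L);
        System.out.println(biggestFirst(lengths));
    }
}
```

Swapping the arguments of Long.compare gives the same result as the hand-written if/else chain in the source, without the risk of getting a sign wrong.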
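Difficulty 1. (2) MapReduce 2 has its own implementation of ClientProtocol, which is activated when mapreduce.framework.name is set to yarn. The submission process is very similar to the classic one. The new job ID is obtained from the resource manager (rather than from the jobtracker); in YARN terminology it is an application ID.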
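As far as I can tell, the job ID and the application ID carry the same cluster timestamp and sequence number and differ only in their prefix, so job_1395778831382_0002 corresponds to application_1395778831382_0002. The sketch below shows that mapping on plain strings. Treat it as an assumption to check against the source: IdNames and its two methods are made-up names, and the real conversion works on ID objects (I believe in TypeConverter) rather than on strings.

```java
/** Toy model (not Hadoop code): maps between job ID and application ID strings by swapping the prefix. */
final class IdNames {

    private static final String JOB_PREFIX = "job_";
    private static final String APP_PREFIX = "application_";

    /** Assumption: the part after the prefix (cluster timestamp and sequence number) is shared. */
    static String toApplicationId(String jobId) {
        if (!jobId.startsWith(JOB_PREFIX)) {
            throw new IllegalArgumentException("Not a job ID: " + jobId);
        }
        return APP_PREFIX + jobId.substring(JOB_PREFIX.length());
    }

    static String toJobId(String applicationId) {
        if (!applicationId.startsWith(APP_PREFIX)) {
            throw new IllegalArgumentException("Not an application ID: " + applicationId);
        }
        return JOB_PREFIX + applicationId.substring(APP_PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(toApplicationId("job_1395778831382_0002"));
    }
}
```

This is handy when you move between the MapReduce client output, which shows job IDs, and the YARN tools, which show application IDs.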
public interface ClientProtocol extends VersionedProtocol {...}是个接口类。它有两个实现类。分别是:(我们程序跟踪到这一步丢了:当类调用接口的方法时,会跟丢;这时就要看该接口的实现类)
public class LocalJobRunner implements ClientProtocol {...} //Implements MapReduce locally, in-process, for debugging.
public class YARNRunner implements ClientProtocol {...} //This class enables the current JobClient (0.22 hadoop) to run on YARN.
We are running on YARN, so the class to follow is YARNRunner.java. LocalJobRunner will be discussed later.
ClientProtocol.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce\protocol
YARNRunner.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred
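How the client ends up with one implementation or the other can be sketched in plain Java. This is only a minimal sketch, not Hadoop's code: SubmitProtocol, LocalRunnerSketch and YarnRunnerSketch are hypothetical stand-ins for ClientProtocol, LocalJobRunner and YARNRunner, and a plain Map stands in for Configuration. Only the property name mapreduce.framework.name and its values yarn and local come from Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

class FrameworkDispatchSketch {
    // Picks an implementation from the configured framework name.
    // "local" is used when the property is not set.
    static SubmitProtocol create(Map<String, String> conf) {
        String framework = conf.getOrDefault("mapreduce.framework.name", "local");
        switch (framework) {
            case "yarn":
                return new YarnRunnerSketch();
            case "local":
                return new LocalRunnerSketch();
            default:
                throw new IllegalArgumentException("Unknown framework: " + framework);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.framework.name", "yarn");
        // The caller only sees the interface; the concrete class decides what runs.
        SubmitProtocol client = create(conf);
        System.out.println(client.submitJob("job_1"));
    }
}

// Hypothetical stand-in for ClientProtocol; only one method is modelled.
interface SubmitProtocol {
    String submitJob(String jobId);
}

// Hypothetical stand-in for LocalJobRunner.
class LocalRunnerSketch implements SubmitProtocol {
    public String submitJob(String jobId) {
        return "local:" + jobId;
    }
}

// Hypothetical stand-in for YARNRunner.
class YarnRunnerSketch implements SubmitProtocol {
    public String submitJob(String jobId) {
        return "yarn:" + jobId;
    }
}
```

This is why stepping through the code stops at the interface: the call site is written against SubmitProtocol, and the class that actually runs is chosen at run time.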
/**
 * This class enables the current JobClient (0.22 hadoop) to run on YARN.
 */
@SuppressWarnings("unchecked")
public class YARNRunner implements ClientProtocol {

  private static final Log LOG = LogFactory.getLog(YARNRunner.class);

  private final RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null);
  private ResourceMgrDelegate resMgrDelegate;
  private ClientCache clientCache;
  private Configuration conf;
  private final FileContext defaultFileContext;

  /**
   * Yarn runner incapsulates the client interface of
   * yarn
   * @param conf the configuration object for the client
   */

  /*
   * While YARNRunner is being instantiated, the client proxy is created along this chain:
   * Cluster->ClientProtocol(YarnRunner)->ResourceMgrDelegate->client(YarnClientImpl)->rmClient(ApplicationClientProtocol)
   * To make the current JobClient run on YARN, a ResourceMgrDelegate is created during instantiation.
   */
  public YARNRunner(Configuration conf) {
    this(conf, new ResourceMgrDelegate(new YarnConfiguration(conf)));
  }

  /**
   * Similar to {@link #YARNRunner(Configuration)} but allowing injecting
   * {@link ResourceMgrDelegate}. Enables mocking and testing.
   * @param conf the configuration object for the client
   * @param resMgrDelegate the resourcemanager client handle.
   */
  public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate) {
    this(conf, resMgrDelegate, new ClientCache(conf, resMgrDelegate));
  }

  /**
   * Similar to {@link YARNRunner#YARNRunner(Configuration, ResourceMgrDelegate)}
   * but allowing injecting {@link ClientCache}. Enable mocking and testing.
   * @param conf the configuration object
   * @param resMgrDelegate the resource manager delegate
   * @param clientCache the client cache object.
   */
  public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate,
      ClientCache clientCache) {
    this.conf = conf;
    try {
      this.resMgrDelegate = resMgrDelegate;
      this.clientCache = clientCache;
      this.defaultFileContext = FileContext.getFileContext(this.conf);
    } catch (UnsupportedFileSystemException ufe) {
      throw new RuntimeException("Error in instantiating YarnClient", ufe);
    }
  }

  @Override
  public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
  throws IOException, InterruptedException {

    addHistoryToken(ts);

    // Construct necessary information to start the MR AM
    // This appContext is important: it carries the assembled environment variables
    // and the command that launches the AM. The object is passed from class to
    // class until the AM is started.
    ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

    // Submit to ResourceManager
    // The appContext is submitted through ResourceMgrDelegate, the class used to
    // communicate with the ResourceManager.
    try {
      ApplicationId applicationId =
          resMgrDelegate.submitApplication(appContext);

      // Despite its name, appMaster is not the ApplicationMaster itself but an
      // ApplicationReport describing the application; the naming confused me at first.
      ApplicationReport appMaster = resMgrDelegate
          .getApplicationReport(applicationId);
      String diagnostics =
          (appMaster == null ?
              "application report is null" : appMaster.getDiagnostics());
      if (appMaster == null
          || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
          || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
        throw new IOException("Failed to run job : " +
            diagnostics);
      }
      return clientCache.getClient(jobId).getJobStatus(jobId);
    } catch (YarnException e) {
      throw new IOException(e);
    }
  }

  // The thing to remember about ApplicationSubmissionContext is that it contains
  // the launch command of the AM container, which is used later.
  public ApplicationSubmissionContext createApplicationSubmissionContext(
      Configuration jobConf,
      String jobSubmitDir, Credentials ts) throws IOException {
    ApplicationId applicationId = resMgrDelegate.getApplicationId();

    // Setup resource requirements
    Resource capability = recordFactory.newRecordInstance(Resource.class);
    capability.setMemory(
        conf.getInt(
            MRJobConfig.MR_AM_VMEM_MB, MRJobConfig.DEFAULT_MR_AM_VMEM_MB
            )
        );
    capability.setVirtualCores(
        conf.getInt(
            MRJobConfig.MR_AM_CPU_VCORES, MRJobConfig.DEFAULT_MR_AM_CPU_VCORES
            )
        );
    LOG.debug("AppMaster capability = " + capability);

    // Setup LocalResources
    Map<String, LocalResource> localResources =
        new HashMap<String, LocalResource>();

    Path jobConfPath = new Path(jobSubmitDir, MRJobConfig.JOB_CONF_FILE);

    URL yarnUrlForJobSubmitDir = ConverterUtils
        .getYarnUrlFromPath(defaultFileContext.getDefaultFileSystem()
            .resolvePath(
                defaultFileContext.makeQualified(new Path(jobSubmitDir))));
    LOG.debug("Creating setup context, jobSubmitDir url is "
        + yarnUrlForJobSubmitDir);

    localResources.put(MRJobConfig.JOB_CONF_FILE,
        createApplicationResource(defaultFileContext,
            jobConfPath, LocalResourceType.FILE));
    if (jobConf.get(MRJobConfig.JAR) != null) {
      Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR));
      LocalResource rc = createApplicationResource(
          FileContext.getFileContext(jobJarPath.toUri(), jobConf),
          jobJarPath,
          LocalResourceType.PATTERN);
      String pattern = conf.getPattern(JobContext.JAR_UNPACK_PATTERN,
          JobConf.UNPACK_JAR_PATTERN_DEFAULT).pattern();
      rc.setPattern(pattern);
      localResources.put(MRJobConfig.JOB_JAR, rc);
    } else {
      // Job jar may be null. For e.g, for pipes, the job jar is the hadoop
      // mapreduce jar itself which is already on the classpath.
      LOG.info("Job jar is not present. "
          + "Not adding any jar to the list of resources.");
    }

    // TODO gross hack
    for (String s : new String[] {
        MRJobConfig.JOB_SPLIT,
        MRJobConfig.JOB_SPLIT_METAINFO }) {
      localResources.put(
          MRJobConfig.JOB_SUBMIT_DIR + "/" + s,
          createApplicationResource(defaultFileContext,
              new Path(jobSubmitDir, s), LocalResourceType.FILE));
    }

    // Setup security tokens
    DataOutputBuffer dob = new DataOutputBuffer();
    ts.writeTokenStorageToStream(dob);
    ByteBuffer securityTokens = ByteBuffer.wrap(dob.getData(), 0, dob.getLength());

    // Setup the command to run the AM
    // This is where the ApplicationMaster class is set:
    // MRJobConfig.APPLICATION_MASTER_CLASS = org.apache.hadoop.mapreduce.v2.app.MRAppMaster
    // So the command that is eventually run on the NodeManager executes the
    // main method of MRAppMaster.
    List<String> vargs = new ArrayList<String>(8);
    vargs.add(MRApps.crossPlatformifyMREnv(jobConf, Environment.JAVA_HOME)
        + "/bin/java");

    Path amTmpDir =
        new Path(MRApps.crossPlatformifyMREnv(conf, Environment.PWD),
            YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR);
    vargs.add("-Djava.io.tmpdir=" + amTmpDir);
    MRApps.addLog4jSystemProperties(null, vargs, conf);

    // Check for Java Lib Path usage in MAP and REDUCE configs
    warnForJavaLibPath(conf.get(MRJobConfig.MAP_JAVA_OPTS,""), "map",
        MRJobConfig.MAP_JAVA_OPTS, MRJobConfig.MAP_ENV);
    warnForJavaLibPath(conf.get(MRJobConfig.MAPRED_MAP_ADMIN_JAVA_OPTS,""), "map",
        MRJobConfig.MAPRED_MAP_ADMIN_JAVA_OPTS, MRJobConfig.MAPRED_ADMIN_USER_ENV);
    warnForJavaLibPath(conf.get(MRJobConfig.REDUCE_JAVA_OPTS,""), "reduce",
        MRJobConfig.REDUCE_JAVA_OPTS, MRJobConfig.REDUCE_ENV);
    warnForJavaLibPath(conf.get(MRJobConfig.MAPRED_REDUCE_ADMIN_JAVA_OPTS,""), "reduce",
        MRJobConfig.MAPRED_REDUCE_ADMIN_JAVA_OPTS, MRJobConfig.MAPRED_ADMIN_USER_ENV);

    // Add AM admin command opts before user command opts
    // so that it can be overridden by user
    String mrAppMasterAdminOptions = conf.get(MRJobConfig.MR_AM_ADMIN_COMMAND_OPTS,
        MRJobConfig.DEFAULT_MR_AM_ADMIN_COMMAND_OPTS);
    warnForJavaLibPath(mrAppMasterAdminOptions, "app master",
        MRJobConfig.MR_AM_ADMIN_COMMAND_OPTS, MRJobConfig.MR_AM_ADMIN_USER_ENV);
    vargs.add(mrAppMasterAdminOptions);

    // Add AM user command opts
    String mrAppMasterUserOptions = conf.get(MRJobConfig.MR_AM_COMMAND_OPTS,
        MRJobConfig.DEFAULT_MR_AM_COMMAND_OPTS);
    warnForJavaLibPath(mrAppMasterUserOptions, "app master",
        MRJobConfig.MR_AM_COMMAND_OPTS, MRJobConfig.MR_AM_ENV);
    vargs.add(mrAppMasterUserOptions);

    if (jobConf.getBoolean(MRJobConfig.MR_AM_PROFILE,
        MRJobConfig.DEFAULT_MR_AM_PROFILE)) {
      final String profileParams = jobConf.get(MRJobConfig.MR_AM_PROFILE_PARAMS,
          MRJobConfig.DEFAULT_TASK_PROFILE_PARAMS);
      if (profileParams != null) {
        vargs.add(String.format(profileParams,
            ApplicationConstants.LOG_DIR_EXPANSION_VAR + Path.SEPARATOR
                + TaskLog.LogName.PROFILE));
      }
    }

    vargs.add(MRJobConfig.APPLICATION_MASTER_CLASS);
    vargs.add("1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR +
        Path.SEPARATOR + ApplicationConstants.STDOUT);
    vargs.add("2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR +
        Path.SEPARATOR + ApplicationConstants.STDERR);


    Vector<String> vargsFinal = new Vector<String>(8);
    // Final command
    StringBuilder mergedCommand = new StringBuilder();
    for (CharSequence str : vargs) {
      mergedCommand.append(str).append(" ");
    }
    vargsFinal.add(mergedCommand.toString());

    LOG.debug("Command to launch container for ApplicationMaster is : "
        + mergedCommand);

    // Setup the CLASSPATH in environment
    // i.e. add { Hadoop jars, job jar, CWD } to classpath.
    Map<String, String> environment = new HashMap<String, String>();
    MRApps.setClasspath(environment, conf);

    // Shell
    environment.put(Environment.SHELL.name(),
        conf.get(MRJobConfig.MAPRED_ADMIN_USER_SHELL,
            MRJobConfig.DEFAULT_SHELL));

    // Add the container working directory in front of LD_LIBRARY_PATH
    MRApps.addToEnvironment(environment, Environment.LD_LIBRARY_PATH.name(),
        MRApps.crossPlatformifyMREnv(conf, Environment.PWD), conf);

    // Setup the environment variables for Admin first
    MRApps.setEnvFromInputString(environment,
        conf.get(MRJobConfig.MR_AM_ADMIN_USER_ENV,
            MRJobConfig.DEFAULT_MR_AM_ADMIN_USER_ENV), conf);
    // Setup the environment variables (LD_LIBRARY_PATH, etc)
    MRApps.setEnvFromInputString(environment,
        conf.get(MRJobConfig.MR_AM_ENV), conf);

    // Parse distributed cache
    MRApps.setupDistributedCache(jobConf, localResources);

    Map<ApplicationAccessType, String> acls
        = new HashMap<ApplicationAccessType, String>(2);
    acls.put(ApplicationAccessType.VIEW_APP, jobConf.get(
        MRJobConfig.JOB_ACL_VIEW_JOB, MRJobConfig.DEFAULT_JOB_ACL_VIEW_JOB));
    acls.put(ApplicationAccessType.MODIFY_APP, jobConf.get(
        MRJobConfig.JOB_ACL_MODIFY_JOB,
        MRJobConfig.DEFAULT_JOB_ACL_MODIFY_JOB));

    // Setup ContainerLaunchContext for AM container
    // Build the AM container's launch context from the command assembled above;
    // this object is later used to launch the container and thereby start MRAppMaster.
    ContainerLaunchContext amContainer =
        ContainerLaunchContext.newInstance(localResources, environment,
          vargsFinal, null, securityTokens, acls);

    Collection<String> tagsFromConf =
        jobConf.getTrimmedStringCollection(MRJobConfig.JOB_TAGS);

    // Set up the ApplicationSubmissionContext
    ApplicationSubmissionContext appContext =
        recordFactory.newRecordInstance(ApplicationSubmissionContext.class);
    appContext.setApplicationId(applicationId);                // ApplicationId
    appContext.setQueue(                                       // Queue name
        jobConf.get(JobContext.QUEUE_NAME,
        YarnConfiguration.DEFAULT_QUEUE_NAME));
    // add reservationID if present
    ReservationId reservationID = null;
    try {
      reservationID =
          ReservationId.parseReservationId(jobConf
              .get(JobContext.RESERVATION_ID));
    } catch (NumberFormatException e) {
      // throw exception as reservationid as is invalid
      String errMsg =
          "Invalid reservationId: " + jobConf.get(JobContext.RESERVATION_ID)
              + " specified for the app: " + applicationId;
      LOG.warn(errMsg);
      throw new IOException(errMsg);
    }
    if (reservationID != null) {
      appContext.setReservationID(reservationID);
      LOG.info("SUBMITTING ApplicationSubmissionContext app:" + applicationId
          + " to queue:" + appContext.getQueue() + " with reservationId:"
          + appContext.getReservationID());
    }
    appContext.setApplicationName(                             // Job name
        jobConf.get(JobContext.JOB_NAME,
        YarnConfiguration.DEFAULT_APPLICATION_NAME));
    appContext.setCancelTokensWhenComplete(
        conf.getBoolean(MRJobConfig.JOB_CANCEL_DELEGATION_TOKEN, true));
    // Set the AM container
    appContext.setAMContainerSpec(amContainer);         // AM Container
    appContext.setMaxAppAttempts(
        conf.getInt(MRJobConfig.MR_AM_MAX_ATTEMPTS,
            MRJobConfig.DEFAULT_MR_AM_MAX_ATTEMPTS));
    appContext.setResource(capability);
    appContext.setApplicationType(MRJobConfig.MR_APPLICATION_TYPE);
    if (tagsFromConf != null && !tagsFromConf.isEmpty()) {
      appContext.setApplicationTags(new HashSet<String>(tagsFromConf));
    }

    return appContext;
  }

}
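The part of createApplicationSubmissionContext(...) that assembles the launch command can be tried out in isolation. The following is a rough sketch in plain Java with no Hadoop classes: AmCommandSketch and its methods are hypothetical names, and the literal values ($JAVA_HOME, $PWD/tmp, <LOG_DIR>, -Xmx1024m) are illustrative assumptions that stand in for what the Hadoop constants expand to. Only the joining loop mirrors the listing above.

```java
import java.util.ArrayList;
import java.util.List;

class AmCommandSketch {
    // Joins the arguments the way the loop in the listing does:
    // every element is followed by a single space, including the last one.
    static String merge(List<String> vargs) {
        StringBuilder merged = new StringBuilder();
        for (CharSequence arg : vargs) {
            merged.append(arg).append(" ");
        }
        return merged.toString();
    }

    // Builds an illustrative argument list; the literal values are assumptions.
    static List<String> buildArgs(String amClass, String userOpts) {
        List<String> vargs = new ArrayList<>();
        vargs.add("$JAVA_HOME/bin/java");
        vargs.add("-Djava.io.tmpdir=$PWD/tmp");
        vargs.add(userOpts);               // user options for the AM JVM
        vargs.add(amClass);                // the class whose main method is run
        vargs.add("1><LOG_DIR>/stdout");   // redirect stdout into the container log dir
        vargs.add("2><LOG_DIR>/stderr");   // redirect stderr into the container log dir
        return vargs;
    }

    public static void main(String[] args) {
        List<String> vargs =
            buildArgs("org.apache.hadoop.mapreduce.v2.app.MRAppMaster", "-Xmx1024m");
        System.out.println(merge(vargs));
    }
}
```

The point to take away is that the whole command is a single string stored in the container launch context, and the class name placed in it decides which main method the NodeManager ends up running.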
In the JobSubmitter class, submitJobInternal(...) calls JobID jobId = submitClient.getNewJobID(), which completes step 2 in Figure 6: get new application.
getNewJobID() is a method declared in ClientProtocol.java and implemented in YARNRunner.java. In YARNRunner.java, getNewJobID() in turn calls getNewJobID() of ResourceMgrDelegate.java.
The YARNRunner constructor creates the ResourceMgrDelegate, and submitJob(...) contains resMgrDelegate.submitApplication(appContext), i.e. it calls submitApplication(...) of ResourceMgrDelegate.java.
ResourceMgrDelegate.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred
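To make the remark that the job ID is really an application ID more concrete, here is a small sketch of how the two string forms relate. It assumes the usual textual layout (application_<cluster timestamp>_<4-digit sequence> and job_<cluster timestamp>_<4-digit sequence>) and works on strings only; JobIdSketch and its methods are hypothetical helpers, whereas the real conversion is done on objects by TypeConverter.fromYarn(...).

```java
import java.util.Locale;

class JobIdSketch {
    private static final String APP_PREFIX = "application_";
    private static final String JOB_PREFIX = "job_";

    // Formats an application ID from the RM start timestamp and a sequence number.
    static String applicationId(long clusterTimestamp, int sequence) {
        return String.format(Locale.ROOT, "%s%d_%04d", APP_PREFIX, clusterTimestamp, sequence);
    }

    // Derives the job ID string: same timestamp and sequence, different prefix.
    static String jobIdFromApplicationId(String appId) {
        if (!appId.startsWith(APP_PREFIX)) {
            throw new IllegalArgumentException("Not an application ID: " + appId);
        }
        return JOB_PREFIX + appId.substring(APP_PREFIX.length());
    }

    public static void main(String[] args) {
        String appId = applicationId(1410450250506L, 3);
        System.out.println(appId);
        System.out.println(jobIdFromApplicationId(appId));
    }
}
```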
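(3) The job client checks the output specification of the job, computes the input splits (although the option yarn.app.mapreduce.am.compute-splits-in-cluster lets the splits be generated on the cluster instead, which can benefit jobs with many splits), and copies the job resources (including the job JAR, the configuration, and the split information) to HDFS.
This completes step 3 in Figure 6: copy job resources.
In the JobSubmitter class, submitJobInternal(...) calls copyAndConfigureFiles(job, submitJobDir), which copies the job resources to HDFS.
Splits: the details of how the input is split are determined by the InputFormat. For file-based input, the rules are those of the getSplits() method in FileInputFormat.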
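As a rough illustration of the FileInputFormat rules, the sketch below reproduces the split-size arithmetic as I understand it for Hadoop 2.x: the split size is max(minSize, min(maxSize, blockSize)), and a file is cut into full-size splits while the remainder is more than 1.1 times the split size. SplitSizeSketch and its method names are hypothetical, and the sketch ignores block locations and non-splittable files.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSizeSketch {
    // The last split may be up to 10% larger than the others.
    static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns the length of every split of a single splittable file.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        while (((double) remaining) / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining != 0) {
            lengths.add(remaining);   // the tail, possibly slightly larger than splitSize
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        System.out.println(splitLengths(300 * mb, splitSize).size());
        System.out.println(splitLengths(260 * mb, splitSize).size());
    }
}
```

With a 128 MB split size, a 300 MB file gives three splits (128, 128 and 44 MB), while a 260 MB file gives only two (128 and 132 MB) because the 132 MB remainder is within the 10% slop.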
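submitJobInternal(...) calls maps = writeSplits(job, submitJobDir) to compute the splits. writeSplits(...) in turn calls either maps = writeNewSplits(job, jobSubmitDir) or writeOldSplits(jConf, jobSubmitDir), depending on whether the job uses the new-API mapper. writeNewSplits(...) calls List<InputSplit> splits = input.getSplits(job), where getSplits(...) is a method of the abstract class public abstract class InputFormat<K, V> {...}; writeOldSplits(...) calls job.getInputFormat().getSplits(job, job.getNumMapTasks()) on the old-API org.apache.hadoop.mapred.InputFormat. Both methods sort the splits by size, largest first, and then write them out with JobSplitWriter.createSplitFiles(...).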
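The sorting step is easy to check on its own. The sketch below uses plain Long values in place of InputSplit objects; SplitSortSketch is a hypothetical name, and the comparator follows the same convention as the anonymous comparator in writeOldSplits(...) (return 1 when the left length is smaller, so that bigger splits come first).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SplitSortSketch {
    // Orders split lengths from largest to smallest.
    static final Comparator<Long> LARGEST_FIRST = new Comparator<Long>() {
        @Override
        public int compare(Long left, Long right) {
            if (left.equals(right)) {
                return 0;
            }
            return left < right ? 1 : -1;
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<Long> sortLargestFirst(List<Long> lengths) {
        List<Long> sorted = new ArrayList<>(lengths);
        sorted.sort(LARGEST_FIRST);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortLargestFirst(Arrays.asList(44L, 128L, 132L)));
    }
}
```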
(4) Finally, the job is submitted by calling submitApplication() on the resource manager.
This completes step 4 in Figure 6: submit application.
In the JobSubmitter class, submitJobInternal(...) calls status = submitClient.submitJob(...), which is the real submission ----> submitJob(...) is declared in the interface ClientProtocol.java and implemented in YARNRunner.java ----> in YARNRunner.java, submitJob(...) calls submitApplication(...) of ResourceMgrDelegate.java ----> submitApplication(...) of ResourceMgrDelegate.java forwards to client.submitApplication(...), where client is declared with the abstract type YarnClient.java ----> because client was created by YarnClient.createYarnClient() and is really a YarnClientImpl, the call is dispatched to submitApplication(...) of YarnClientImpl.java.
In detail:
public abstract class YarnClient extends AbstractService {...} is an abstract class. Two classes extend it:
public class ResourceMgrDelegate extends YarnClient {...}
public class YarnClientImpl extends YarnClient {...}
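The relationship between these three classes is a combination of inheritance and delegation, which can be sketched without any Hadoop dependency. In this sketch ClientSketch, ClientImplSketch and DelegateSketch are hypothetical stand-ins for YarnClient, YarnClientImpl and ResourceMgrDelegate, and the String argument stands in for ApplicationSubmissionContext.

```java
class DelegationSketch {
    public static void main(String[] args) {
        ClientSketch delegate = new DelegateSketch();
        // The call goes to DelegateSketch, which forwards it to ClientImplSketch.
        System.out.println(delegate.submitApplication("appContext"));
    }
}

// Stand-in for the abstract class YarnClient.
abstract class ClientSketch {
    // Factory method: callers get the abstract type, the object is the concrete one.
    static ClientSketch create() {
        return new ClientImplSketch();
    }

    abstract String submitApplication(String appContext);
}

// Stand-in for YarnClientImpl, which does the real work.
class ClientImplSketch extends ClientSketch {
    @Override
    String submitApplication(String appContext) {
        return "submitted:" + appContext;
    }
}

// Stand-in for ResourceMgrDelegate: same base class, but it only forwards.
class DelegateSketch extends ClientSketch {
    private final ClientSketch client = ClientSketch.create();

    @Override
    String submitApplication(String appContext) {
        return client.submitApplication(appContext);
    }
}
```

Both subclasses share the abstract parent, but only one of them contains the logic; the other holds an instance of it and passes every call through.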
public class ResourceMgrDelegate extends YarnClient {
  private static final Log LOG = LogFactory.getLog(ResourceMgrDelegate.class);

  private YarnConfiguration conf;
  private ApplicationSubmissionContext application;
  private ApplicationId applicationId;
  @Private
  @VisibleForTesting
  protected YarnClient client;
  private Text rmDTService;

  /**
   * Delegate responsible for communicating with the Resource Manager's
   * {@link ApplicationClientProtocol}.
   * @param conf the configuration object.
   */
  // Mainly responsible for exchanging information with the RM
  public ResourceMgrDelegate(YarnConfiguration conf) {
    super(ResourceMgrDelegate.class.getName());
    this.conf = conf;
    this.client = YarnClient.createYarnClient();
    init(conf);
    start();
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    client.init(conf);
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    client.start();
    super.serviceStart();
  }

  @Override
  protected void serviceStop() throws Exception {
    client.stop();
    super.serviceStop();
  }

  public String getFilesystemName() throws IOException, InterruptedException {
    return FileSystem.get(conf).getUri().toString();
  }

  public JobID getNewJobID() throws IOException, InterruptedException {
    try {
      this.application = client.createApplication().getApplicationSubmissionContext();
      this.applicationId = this.application.getApplicationId();
      return TypeConverter.fromYarn(applicationId);
    } catch (YarnException e) {
      throw new IOException(e);
    }
  }

  public ApplicationId getApplicationId() {
    return applicationId;
  }

  @Override
  public YarnClientApplication createApplication() throws
      YarnException, IOException {
    return client.createApplication();
  }

  @Override
  public ApplicationId
      submitApplication(ApplicationSubmissionContext appContext)
          throws YarnException, IOException {
    return client.submitApplication(appContext);
  }

}
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class YarnClient extends AbstractService {

  /**
   * Create a new instance of YarnClient.
   */
  @Public
  public static YarnClient createYarnClient() {
    YarnClient client = new YarnClientImpl(); // create a new YarnClientImpl object
    return client;
  }

  @Private
  protected YarnClient(String name) {
    super(name);
  }

  /**
   * <p>
   * Submit a new application to <code>YARN.</code> It is a blocking call - it
   * will not return {@link ApplicationId} until the submitted application is
   * submitted successfully and accepted by the ResourceManager.
   * </p>
   *
   * <p>
   * Users should provide an {@link ApplicationId} as part of the parameter
   * {@link ApplicationSubmissionContext} when submitting a new application,
   * otherwise it will throw the {@link ApplicationIdNotProvidedException}.
   * </p>
   *
   * <p>This internally calls {@link ApplicationClientProtocol#submitApplication
   * (SubmitApplicationRequest)}, and after that, it internally invokes
   * {@link ApplicationClientProtocol#getApplicationReport
   * (GetApplicationReportRequest)} and waits till it can make sure that the
   * application gets properly submitted. If RM fails over or RM restart
   * happens before ResourceManager saves the application's state,
   * {@link ApplicationClientProtocol
   * #getApplicationReport(GetApplicationReportRequest)} will throw
   * the {@link ApplicationNotFoundException}. This API automatically resubmits
   * the application with the same {@link ApplicationSubmissionContext} when it
   * catches the {@link ApplicationNotFoundException}</p>
   *
   * @param appContext
   *          {@link ApplicationSubmissionContext} containing all the details
   *          needed to submit a new application
   * @return {@link ApplicationId} of the accepted application
   * @throws YarnException
   * @throws IOException
   * @see #createApplication()
   */
  public abstract ApplicationId submitApplication(
      ApplicationSubmissionContext appContext) throws YarnException,
      IOException;

}
@Private
@Unstable
// Mainly responsible for instantiating rmClient
public class YarnClientImpl extends YarnClient {

  private static final Log LOG = LogFactory.getLog(YarnClientImpl.class);

  protected ApplicationClientProtocol rmClient;
  protected long submitPollIntervalMillis;
  private long asyncApiPollIntervalMillis;
  private long asyncApiPollTimeoutMillis;
  protected AHSClient historyClient;
  private boolean historyServiceEnabled;
  protected TimelineClient timelineClient;
  @VisibleForTesting
  Text timelineService;
  @VisibleForTesting
  String timelineDTRenewer;
  protected boolean timelineServiceEnabled;
  protected boolean timelineServiceBestEffort;

  private static final String ROOT = "root";

  public YarnClientImpl() {
    super(YarnClientImpl.class.getName());
  }

  @SuppressWarnings("deprecation")
  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    asyncApiPollIntervalMillis =
        conf.getLong(YarnConfiguration.YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS,
          YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS);
    asyncApiPollTimeoutMillis =
        conf.getLong(YarnConfiguration.YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_TIMEOUT_MS,
            YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_TIMEOUT_MS);
    submitPollIntervalMillis = asyncApiPollIntervalMillis;
    if (conf.get(YarnConfiguration.YARN_CLIENT_APP_SUBMISSION_POLL_INTERVAL_MS)
        != null) {
      submitPollIntervalMillis = conf.getLong(
        YarnConfiguration.YARN_CLIENT_APP_SUBMISSION_POLL_INTERVAL_MS,
        YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS);
    }

    if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
      YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
      historyServiceEnabled = true;
      historyClient = AHSClient.createAHSClient();
      historyClient.init(conf);
    }

    if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
      timelineServiceEnabled = true;
      timelineClient = createTimelineClient();
      timelineClient.init(conf);
      timelineDTRenewer = getTimelineDelegationTokenRenewer(conf);
      timelineService = TimelineUtils.buildTimelineTokenService(conf);
    }

    timelineServiceBestEffort = conf.getBoolean(
        YarnConfiguration.TIMELINE_SERVICE_CLIENT_BEST_EFFORT,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_CLIENT_BEST_EFFORT);
    super.serviceInit(conf);
  }

  TimelineClient createTimelineClient() throws IOException, YarnException {
    return TimelineClient.createTimelineClient();
  }

  @Override
  protected void serviceStart() throws Exception {
    // rmClient speaks the protocol that clients use to talk to the RM: it submits
    // or kills applications and retrieves information (applications, cluster
    // metrics, nodes, queues and ACLs)
    try {
      rmClient = ClientRMProxy.createRMProxy(getConfig(),
          ApplicationClientProtocol.class);
      if (historyServiceEnabled) {
        historyClient.start();
      }
      if (timelineServiceEnabled) {
        timelineClient.start();
      }
    } catch (IOException e) {
      throw new YarnRuntimeException(e);
    }
    super.serviceStart();
  }

  @Override
  protected void serviceStop() throws Exception {
    if (this.rmClient != null) {
      RPC.stopProxy(this.rmClient);
    }
    if (historyServiceEnabled) {
      historyClient.stop();
    }
    if (timelineServiceEnabled) {
      timelineClient.stop();
    }
    super.serviceStop();
  }

  private GetNewApplicationResponse getNewApplication()
      throws YarnException, IOException {
    GetNewApplicationRequest request =
        Records.newRecord(GetNewApplicationRequest.class);
    return rmClient.getNewApplication(request);
  }

  @Override
  public YarnClientApplication createApplication()
      throws YarnException, IOException {
    ApplicationSubmissionContext context = Records.newRecord
        (ApplicationSubmissionContext.class);
    GetNewApplicationResponse newApp = getNewApplication();
    ApplicationId appId = newApp.getApplicationId();
    context.setApplicationId(appId);
    return new YarnClientApplication(newApp, context);
  }

  @Override
  public ApplicationId
      submitApplication(ApplicationSubmissionContext appContext)
          throws YarnException, IOException {
    ApplicationId applicationId = appContext.getApplicationId();
    if (applicationId == null) {
      throw new ApplicationIdNotProvidedException(
          "ApplicationId is not provided in ApplicationSubmissionContext");
    }
    // Wrap the appContext in a request
    SubmitApplicationRequest request =
        Records.newRecord(SubmitApplicationRequest.class);
    request.setApplicationSubmissionContext(appContext);

    // Automatically add the timeline DT into the CLC
    // Only when the security and the timeline service are both enabled
    if (isSecurityEnabled() && timelineServiceEnabled) {
      addTimelineDelegationToken(appContext.getAMContainerSpec());
    }

    //TODO: YARN-1763:Handle RM failovers during the submitApplication call.
    // Submit the request through rmClient. rmClient is the RPC proxy created in
    // serviceStart(); on the RM side the call is served by ClientRMService, so
    // this is how the client talks to the RM.
    rmClient.submitApplication(request); // the application is submitted here

    int pollCount = 0;
    long startTime = System.currentTimeMillis();
    EnumSet<YarnApplicationState> waitingStates =
                                 EnumSet.of(YarnApplicationState.NEW,
                                 YarnApplicationState.NEW_SAVING,
                                 YarnApplicationState.SUBMITTED);
    EnumSet<YarnApplicationState> failToSubmitStates =
                                  EnumSet.of(YarnApplicationState.FAILED,
                                  YarnApplicationState.KILLED);
    // Keep polling until the state is no longer NEW, NEW_SAVING or SUBMITTED;
    // if that takes too long and the timeout is enforced, give up with a timeout.
    while (true) {
      try {
        ApplicationReport appReport = getApplicationReport(applicationId);
        YarnApplicationState state = appReport.getYarnApplicationState();
        if (!waitingStates.contains(state)) {
          if(failToSubmitStates.contains(state)) {
            throw new YarnException("Failed to submit " + applicationId +
                " to YARN : " + appReport.getDiagnostics());
          }
          LOG.info("Submitted application " + applicationId);
          break;
        }

        long elapsedMillis = System.currentTimeMillis() - startTime;
        if (enforceAsyncAPITimeout() &&
            elapsedMillis >= asyncApiPollTimeoutMillis) {
          throw new YarnException("Timed out while waiting for application " +
              applicationId + " to be submitted successfully");
        }

        // Notify the client through the log every 10 poll, in case the client
        // is blocked here too long.
        if (++pollCount % 10 == 0) {
          LOG.info("Application submission is not finished, " +
              "submitted application " + applicationId +
              " is still in " + state);
        }
        try {
          Thread.sleep(submitPollIntervalMillis);
        } catch (InterruptedException ie) {
          LOG.error("Interrupted while waiting for application "
              + applicationId
              + " to be successfully submitted.");
        }
      } catch (ApplicationNotFoundException ex) {
        // FailOver or RM restart happens before RMStateStore saves
        // ApplicationState
        LOG.info("Re-submit application " + applicationId + "with the " +
            "same ApplicationSubmissionContext");
        rmClient.submitApplication(request);
      }
    }

    return applicationId;
  }

}
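The waiting logic in submitApplication(...) is worth isolating, because it is what makes the call blocking. The following is a simplified sketch under stated assumptions: SubmitPollSketch is a hypothetical class, a Supplier stands in for getApplicationReport(...), the State enum copies the names of YarnApplicationState, and a negative timeout means "do not enforce a timeout". The re-submission on ApplicationNotFoundException is left out.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.function.Supplier;

class SubmitPollSketch {
    enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    static final EnumSet<State> WAITING =
        EnumSet.of(State.NEW, State.NEW_SAVING, State.SUBMITTED);
    static final EnumSet<State> FAILED_TO_SUBMIT =
        EnumSet.of(State.FAILED, State.KILLED);

    // Polls until the reported state leaves the waiting set, then returns it.
    static State waitUntilSubmitted(Supplier<State> reportSource,
                                    long pollIntervalMillis,
                                    long timeoutMillis) {
        long start = System.currentTimeMillis();
        while (true) {
            State state = reportSource.get();
            if (!WAITING.contains(state)) {
                if (FAILED_TO_SUBMIT.contains(state)) {
                    throw new IllegalStateException("Failed to submit, state=" + state);
                }
                return state;
            }
            long elapsed = System.currentTimeMillis() - start;
            if (timeoutMillis >= 0 && elapsed >= timeoutMillis) {
                throw new IllegalStateException("Timed out in state " + state);
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException ie) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", ie);
            }
        }
    }

    public static void main(String[] args) {
        Iterator<State> reports = Arrays.asList(
            State.NEW, State.NEW_SAVING, State.SUBMITTED, State.ACCEPTED).iterator();
        System.out.println(waitUntilSubmitted(reports::next, 1, -1));
    }
}
```

Note that the loop does not wait for the application to run or finish. It returns as soon as the ResourceManager has accepted the application, or throws if the application ended up FAILED or KILLED.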
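The YARNRunner constructor ----> calls the ResourceMgrDelegate constructor ----> calls createYarnClient() of the abstract class YarnClient ----> calls the YarnClientImpl constructor; the resulting object is returned to ResourceMgrDelegate and stored in its client field (so the methods that ResourceMgrDelegate invokes on client end up in YarnClientImpl).
The YARNRunner constructor ----> calls the ResourceMgrDelegate constructor ----> calls init() and start() of the abstract class AbstractService, which is the parent of the abstract class YarnClient ----> these two methods call serviceInit() and serviceStart() respectively; ResourceMgrDelegate overrides both to call client.init(conf) and client.start(), so what finally runs is serviceInit() and serviceStart() of YarnClientImpl.
The submitJob() method of YARNRunner ----> calls submitApplication() of ResourceMgrDelegate ----> calls submitApplication() on the YarnClient-typed client field (what actually runs is submitApplication() of YarnClientImpl).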
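The init()/start() chain is an instance of the template-method pattern, and a small sketch may make the order of calls easier to follow. ServiceSketch, InnerService and WrapperService below are hypothetical stand-ins for AbstractService, YarnClientImpl and ResourceMgrDelegate; the real AbstractService also tracks service state and takes a Configuration, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
    public static void main(String[] args) {
        WrapperService wrapper = new WrapperService();
        // Shows which hooks ran, in order, as a result of the constructor.
        System.out.println(wrapper.events());
    }
}

// Stand-in for AbstractService: public entry points call overridable hooks.
abstract class ServiceSketch {
    final void init() { serviceInit(); }
    final void start() { serviceStart(); }
    protected void serviceInit() { }
    protected void serviceStart() { }
}

// Stand-in for YarnClientImpl: records when its hooks run.
class InnerService extends ServiceSketch {
    final List<String> events = new ArrayList<>();

    @Override
    protected void serviceInit() { events.add("inner.serviceInit"); }

    @Override
    protected void serviceStart() { events.add("inner.serviceStart"); }
}

// Stand-in for ResourceMgrDelegate: initialises and starts itself in the
// constructor, and its hooks forward to the wrapped client.
class WrapperService extends ServiceSketch {
    private final InnerService client;

    WrapperService() {
        this.client = new InnerService();
        init();
        start();
    }

    @Override
    protected void serviceInit() {
        client.init();
        super.serviceInit();
    }

    @Override
    protected void serviceStart() {
        client.start();
        super.serviceStart();
    }

    List<String> events() { return client.events; }
}
```

Constructing the wrapper is enough to drive the inner service through both phases, which matches the fact that new ResourceMgrDelegate(...) leaves the YarnClientImpl initialised and started.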
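Up to this point everything has happened on the client machine that submits the job; nothing has reached the ResourceManager yet. Next comes the ResourceManager side:
Difficulty 2 Continuing from above: inside submitApplication(...) of YarnClientImpl.java, rmClient.submitApplication(request) is a call through the interface ApplicationClientProtocol.java ----> rmClient is an RPC proxy, and on the ResourceManager side the call is handled by submitApplication(...) of the implementing class ClientRMService.java. The constructor of ClientRMService is worth reading as well.
public interface ApplicationClientProtocol extends ApplicationBaseProtocol {...} is an interface. It has four implementing classes; the two that matter for submitApplication() are listed below. (This is another point where it is easy to lose the trail while tracing the code: when a class calls a method through an interface, you have to look at the classes that implement the interface.)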
public class ApplicationClientProtocolPBClientImpl implements ApplicationClientProtocol, Closeable {...}
public class ClientRMService extends AbstractService implements ApplicationClientProtocol {...} // this is the class to follow
ClientRMService.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
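Why there is one implementation on the client side and another on the server side can be illustrated with a dynamic proxy. This is only an in-process sketch: AppProtocolSketch and RmSideService are hypothetical stand-ins for ApplicationClientProtocol and ClientRMService, and the proxy forwards the call directly to the server object, whereas Hadoop's proxy serialises the request and sends it over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class RpcProxySketch {
    // Creates a client-side object that implements the protocol interface and
    // forwards each call to the server-side implementation, logging the method name.
    static AppProtocolSketch createProxy(AppProtocolSketch server, List<String> wireLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            wireLog.add(method.getName());      // stands in for "sent over the wire"
            return method.invoke(server, args); // stands in for the remote dispatch
        };
        return (AppProtocolSketch) Proxy.newProxyInstance(
            AppProtocolSketch.class.getClassLoader(),
            new Class<?>[] { AppProtocolSketch.class },
            handler);
    }

    public static void main(String[] args) {
        List<String> wireLog = new ArrayList<>();
        AppProtocolSketch rmClient = createProxy(new RmSideService(), wireLog);
        System.out.println(rmClient.submitApplication("app_1"));
        System.out.println(wireLog);
    }
}

// Stand-in for the protocol interface shared by client and server.
interface AppProtocolSketch {
    String submitApplication(String request);
}

// Stand-in for the server-side implementation living inside the ResourceManager.
class RmSideService implements AppProtocolSketch {
    @Override
    public String submitApplication(String request) {
        return "accepted:" + request;
    }
}
```

The object held by the client implements the same interface as the server-side class but is not an instance of it, which is why following the call in the IDE requires jumping from the interface to the server-side implementation by hand.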
/**
 * The client interface to the Resource Manager. This module handles all the rpc
 * interfaces to the resource manager from the client.
 */
public class ClientRMService extends AbstractService implements
    ApplicationClientProtocol {
  private static final ArrayList<ApplicationReport> EMPTY_APPS_REPORT = new ArrayList<ApplicationReport>();

  private static final Log LOG = LogFactory.getLog(ClientRMService.class);

  final private AtomicInteger applicationCounter = new AtomicInteger(0);
  final private YarnScheduler scheduler;
  final private RMContext rmContext;
  private final RMAppManager rmAppManager;

  private Server server;
  protected RMDelegationTokenSecretManager rmDTSecretManager;

  private final RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null);
  InetSocketAddress clientBindAddress;

  private final ApplicationACLsManager applicationsACLsManager;
  private final QueueACLsManager queueACLsManager;

  // For Reservation APIs
  private Clock clock;
  private ReservationSystem reservationSystem;
  private ReservationInputValidator rValidator;

  public ClientRMService(RMContext rmContext, YarnScheduler scheduler,
      RMAppManager rmAppManager, ApplicationACLsManager applicationACLsManager,
      QueueACLsManager queueACLsManager,
      RMDelegationTokenSecretManager rmDTSecretManager) {
    this(rmContext, scheduler, rmAppManager, applicationACLsManager,
        queueACLsManager, rmDTSecretManager, new UTCClock());
  }

  public ClientRMService(RMContext rmContext, YarnScheduler scheduler,
      RMAppManager rmAppManager, ApplicationACLsManager applicationACLsManager,
      QueueACLsManager queueACLsManager,
      RMDelegationTokenSecretManager rmDTSecretManager, Clock clock) {
    super(ClientRMService.class.getName());
    this.scheduler = scheduler;
    this.rmContext = rmContext;
    this.rmAppManager = rmAppManager;
    this.applicationsACLsManager = applicationACLsManager;
    this.queueACLsManager = queueACLsManager;
    this.rmDTSecretManager = rmDTSecretManager;
    this.reservationSystem = rmContext.getReservationSystem();
    this.clock = clock;
    this.rValidator = new ReservationInputValidator(clock);
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    clientBindAddress = getBindAddress(conf);
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    Configuration conf = getConfig();
    YarnRPC rpc = YarnRPC.create(conf);
    this.server =
      rpc.getServer(ApplicationClientProtocol.class, this,
            clientBindAddress,
            conf, this.rmDTSecretManager,
            conf.getInt(YarnConfiguration.RM_CLIENT_THREAD_COUNT,
                YarnConfiguration.DEFAULT_RM_CLIENT_THREAD_COUNT));

    // Enable service authorization?
    if (conf.getBoolean(
        CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
        false)) {
      InputStream inputStream =
          this.rmContext.getConfigurationProvider()
              .getConfigurationInputStream(conf,
                  YarnConfiguration.HADOOP_POLICY_CONFIGURATION_FILE);
      if (inputStream != null) {
        conf.addResource(inputStream);
      }
      refreshServiceAcls(conf, RMPolicyProvider.getInstance());
    }

    this.server.start();
    clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST,
                                               YarnConfiguration.RM_ADDRESS,
                                               YarnConfiguration.DEFAULT_RM_ADDRESS,
                                               server.getListenerAddress());
    super.serviceStart();
  }

  @Override
  protected void serviceStop() throws Exception {
    if (this.server != null) {
      this.server.stop();
    }
    super.serviceStop();
  }

  ApplicationId getNewApplicationId() {
    ApplicationId applicationId = org.apache.hadoop.yarn.server.utils.BuilderUtils
        .newApplicationId(recordFactory, ResourceManager.getClusterTimeStamp(),
            applicationCounter.incrementAndGet());
    LOG.info("Allocated new applicationId: " + applicationId.getId());
    return applicationId;
  }

  @Override
  public GetNewApplicationResponse getNewApplication(
      GetNewApplicationRequest request) throws YarnException {
    GetNewApplicationResponse response = recordFactory
        .newRecordInstance(GetNewApplicationResponse.class);
    response.setApplicationId(getNewApplicationId());
    // Pick up min/max resource from scheduler...
    response.setMaximumResourceCapability(scheduler
        .getMaximumResourceCapability());

    return response;
  }

  @Override
  public SubmitApplicationResponse submitApplication(
      SubmitApplicationRequest request) throws YarnException {
    ApplicationSubmissionContext submissionContext = request
        .getApplicationSubmissionContext();
    ApplicationId applicationId = submissionContext.getApplicationId();

    // ApplicationSubmissionContext needs to be validated for safety - only
    // those fields that are independent of the RM's configuration will be
    // checked here, those that are dependent on RM configuration are validated
    // in RMAppManager.

    // Start the validation checks; a request that fails any of them is rejected
    String user = null;
    try {
      // Safety
      user = UserGroupInformation.getCurrentUser().getShortUserName();
    } catch (IOException ie) {
      LOG.warn("Unable to get the current user.", ie);
      RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
          ie.getMessage(), "ClientRMService",
          "Exception in submitting application", applicationId);
      throw RPCUtil.getRemoteException(ie);
    }

    // More validation
    // Check whether app has already been put into rmContext,
    // If it is, simply return the response
    if (rmContext.getRMApps().get(applicationId) != null) {
      LOG.info("This is an earlier submitted application: " + applicationId);
      return SubmitApplicationResponse.newInstance();
    }

    // Continue validating (fill in defaults for missing fields)
    if (submissionContext.getQueue() == null) {
      submissionContext.setQueue(YarnConfiguration.DEFAULT_QUEUE_NAME);
    }
    if (submissionContext.getApplicationName() == null) {
      submissionContext.setApplicationName(
          YarnConfiguration.DEFAULT_APPLICATION_NAME);
    }
    if (submissionContext.getApplicationType() == null) {
      submissionContext
        .setApplicationType(YarnConfiguration.DEFAULT_APPLICATION_TYPE);
    } else {
      if (submissionContext.getApplicationType().length() > YarnConfiguration.APPLICATION_TYPE_LENGTH) {
        submissionContext.setApplicationType(submissionContext
            .getApplicationType().substring(0,
                YarnConfiguration.APPLICATION_TYPE_LENGTH));
      }
    }

    try {
      // call RMAppManager to submit application directly
      // Do the actual work: submit through rmAppManager
      rmAppManager.submitApplication(submissionContext,
          System.currentTimeMillis(), user);

      LOG.info("Application with id " + applicationId.getId() +
          " submitted by user " + user);
      RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
          "ClientRMService", applicationId);
    } catch (YarnException e) {
      LOG.info("Exception in submitting application with id " +
          applicationId.getId(), e);
      RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
          e.getMessage(), "ClientRMService",
          "Exception in submitting application", applicationId);
      throw e;
    }

    SubmitApplicationResponse response = recordFactory
        .newRecordInstance(SubmitApplicationResponse.class);
    return response;
  }

}
The submitApplication(...) method in ClientRMService.java ----> calls the submitApplication(...) method of RMAppManager.java
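Before handing the request to RMAppManager, submitApplication(...) fills in missing fields and truncates an over-long application type. The following is only a minimal sketch of that defaulting step in plain Java. The class SubmissionDefaultsSketch, the Context holder and the constant values are my own stand-ins and assumptions, not Hadoop's ApplicationSubmissionContext or YarnConfiguration.

```java
// Simplified stand-in for the defaulting step in ClientRMService.submitApplication().
// Every name and constant value here is an illustrative assumption, not Hadoop's API.
class SubmissionDefaultsSketch {
    static final String DEFAULT_QUEUE = "default"; // assumed value
    static final String DEFAULT_NAME = "N/A";      // assumed value
    static final String DEFAULT_TYPE = "YARN";     // assumed value
    static final int MAX_TYPE_LENGTH = 20;         // assumed value

    /** Minimal mutable holder playing the role of the submission context. */
    static final class Context {
        String queue;
        String name;
        String type;
    }

    /** Fills in defaults for null fields and truncates an over-long type. */
    static void applyDefaults(Context ctx) {
        if (ctx.queue == null) {
            ctx.queue = DEFAULT_QUEUE;
        }
        if (ctx.name == null) {
            ctx.name = DEFAULT_NAME;
        }
        if (ctx.type == null) {
            ctx.type = DEFAULT_TYPE;
        } else if (ctx.type.length() > MAX_TYPE_LENGTH) {
            ctx.type = ctx.type.substring(0, MAX_TYPE_LENGTH);
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        ctx.type = "A_VERY_LONG_APPLICATION_TYPE_NAME";
        applyDefaults(ctx);
        System.out.println(ctx.queue + " / " + ctx.name + " / " + ctx.type);
    }
}
```

Note that nothing is rejected at this step: missing values are silently replaced, and the checks that depend on the RM configuration are left to RMAppManager, as the comment in the listing says.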
/**
 * This class manages the list of applications for the resource manager.
 */
public class RMAppManager implements EventHandler<RMAppManagerEvent>,
    Recoverable {

  private static final Log LOG = LogFactory.getLog(RMAppManager.class);

  private int maxCompletedAppsInMemory;
  private int maxCompletedAppsInStateStore;
  protected int completedAppsInStateStore = 0;
  private LinkedList<ApplicationId> completedApps = new LinkedList<ApplicationId>();

  private final RMContext rmContext;
  private final ApplicationMasterService masterService;
  private final YarnScheduler scheduler;
  private final ApplicationACLsManager applicationACLsManager;
  private Configuration conf;

  public RMAppManager(RMContext context,
      YarnScheduler scheduler, ApplicationMasterService masterService,
      ApplicationACLsManager applicationACLsManager, Configuration conf) {
    this.rmContext = context;
    this.scheduler = scheduler;
    this.masterService = masterService;
    this.applicationACLsManager = applicationACLsManager;
    this.conf = conf;
    this.maxCompletedAppsInMemory = conf.getInt(
        YarnConfiguration.RM_MAX_COMPLETED_APPLICATIONS,
        YarnConfiguration.DEFAULT_RM_MAX_COMPLETED_APPLICATIONS);
    this.maxCompletedAppsInStateStore =
        conf.getInt(
          YarnConfiguration.RM_STATE_STORE_MAX_COMPLETED_APPLICATIONS,
          YarnConfiguration.DEFAULT_RM_STATE_STORE_MAX_COMPLETED_APPLICATIONS);
    if (this.maxCompletedAppsInStateStore > this.maxCompletedAppsInMemory) {
      this.maxCompletedAppsInStateStore = this.maxCompletedAppsInMemory;
    }
  }

  @SuppressWarnings("unchecked")
  protected void submitApplication(
      ApplicationSubmissionContext submissionContext, long submitTime,
      String user) throws YarnException {
    ApplicationId applicationId = submissionContext.getApplicationId();

    // Create an RMAppImpl object, which in effect sets up the RMApp state
    // machine that RMAppEvents will later drive
    RMAppImpl application =
        createAndPopulateNewRMApp(submissionContext, submitTime, user, false);
    ApplicationId appId = submissionContext.getApplicationId();

    // This branch is taken when security (e.g. Kerberos) is enabled. To keep
    // things simple and easy to follow, I skip it and go straight to the else
    if (UserGroupInformation.isSecurityEnabled()) {
      try {
        this.rmContext.getDelegationTokenRenewer().addApplicationAsync(appId,
            parseCredentials(submissionContext),
            submissionContext.getCancelTokensWhenComplete(),
            application.getUser());
      } catch (Exception e) {
        LOG.warn("Unable to parse credentials.", e);
        // Sending APP_REJECTED is fine, since we assume that the
        // RMApp is in NEW state and thus we haven't yet informed the
        // scheduler about the existence of the application
        assert application.getState() == RMAppState.NEW;
        this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, e.getMessage()));
        throw RPCUtil.getRemoteException(e);
      }
    } else {
      // Dispatcher is not yet started at this time, so these START events
      // enqueued should be guaranteed to be first processed when dispatcher
      // gets started.
      // Start the RMApp state machine. rmContext here is the ResourceManager's
      // own context (an RMContextImpl); this step asks the RM-side dispatcher
      // to process the RMAppEventType.START event
      this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.START));
    }
  }

  private RMAppImpl createAndPopulateNewRMApp(
      ApplicationSubmissionContext submissionContext, long submitTime,
      String user, boolean isRecovery) throws YarnException {
    ApplicationId applicationId = submissionContext.getApplicationId();
    ResourceRequest amReq =
        validateAndCreateResourceRequest(submissionContext, isRecovery);

    // Create RMApp
    RMAppImpl application =
        new RMAppImpl(applicationId, rmContext, this.conf,
            submissionContext.getApplicationName(), user,
            submissionContext.getQueue(),
            submissionContext, this.scheduler, this.masterService,
            submitTime, submissionContext.getApplicationType(),
            submissionContext.getApplicationTags(), amReq);

    // Concurrent app submissions with same applicationId will fail here
    // Concurrent app submissions with different applicationIds will not
    // influence each other
    if (rmContext.getRMApps().putIfAbsent(applicationId, application) !=
        null) {
      String message = "Application with id " + applicationId
          + " is already present! Cannot add a duplicate!";
      LOG.warn(message);
      throw new YarnException(message);
    }
    // Inform the ACLs Manager
    this.applicationACLsManager.addApplication(applicationId,
        submissionContext.getAMContainerSpec().getApplicationACLs());
    String appViewACLs = submissionContext.getAMContainerSpec()
        .getApplicationACLs().get(ApplicationAccessType.VIEW_APP);
    rmContext.getSystemMetricsPublisher().appACLsUpdated(
        application, appViewACLs, System.currentTimeMillis());
    return application;
  }

}
RMAppManager.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
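The duplicate check in createAndPopulateNewRMApp() relies on putIfAbsent, which only stores the new value when the key is not yet in the map. Below is a minimal sketch of that idea using the JDK's ConcurrentHashMap. The class DuplicateSubmissionSketch and its methods are hypothetical names of mine, and I use String ids and an IllegalStateException in place of ApplicationId and YarnException.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only: mimics the putIfAbsent-based duplicate check.
// The class and method names are hypothetical, not part of Hadoop.
class DuplicateSubmissionSketch {
    private final ConcurrentMap<String, Object> apps = new ConcurrentHashMap<>();

    /** Registers the app under appId; fails if that id is already taken. */
    void register(String appId, Object app) {
        // putIfAbsent returns the previous value, or null if the key was free.
        if (apps.putIfAbsent(appId, app) != null) {
            throw new IllegalStateException("Application with id " + appId
                + " is already present! Cannot add a duplicate!");
        }
    }

    int size() {
        return apps.size();
    }

    public static void main(String[] args) {
        DuplicateSubmissionSketch registry = new DuplicateSubmissionSketch();
        registry.register("application_1", new Object());
        try {
            registry.register("application_1", new Object());
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because the check and the insert happen in one atomic call, two threads submitting the same id cannot both succeed, while submissions with different ids do not block each other.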
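Difficulty 3: after RMAppManager, tracing the call chain hit a bottleneck. After reading a lot of material, this is what I found:
In fact, the submitApplication() method of the RMAppManager class internally calls the class's createAndPopulateNewRMApp() method. That method builds an app (actually an RMAppImpl), puts it into rmContext.getRMApps(), and registers its ACLs with the applicationACLsManager. At the end, submitApplication() also calls this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)), which triggers the app start event: it adds an RMAppEvent whose enum type is RMAppEventType.START to the asynchronous dispatcher. Inside the RM, a handler is registered for each event type, and that registration decides which handler processes the event. Here this.rmContext = RMContextImpl, this.rmContext.getDispatcher() = AsyncDispatcher, and this.rmContext.getDispatcher().getEventHandler() = AsyncDispatcher$GenericEventHandler.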
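To make the "enqueue now, dispatch by registered type later" idea concrete, here is a minimal sketch of a dispatcher in plain Java. It is not Hadoop's AsyncDispatcher: the names DispatcherSketch, Event, AppEventType and processPending() are my own assumptions, and the queue is drained on the caller's thread so the behaviour stays easy to follow, whereas the real dispatcher drains its queue on a separate thread.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Illustrative sketch only: a tiny dispatcher that routes events by the enum
// class of their type. All names here are hypothetical, not Hadoop's API.
class DispatcherSketch {
    enum AppEventType { START, KILL }

    /** Event carrying an enum type, loosely modelled on RMAppEvent. */
    static final class Event {
        final Enum<?> type;
        final String appId;

        Event(Enum<?> type, String appId) {
            this.type = type;
            this.appId = appId;
        }
    }

    private final Map<Class<?>, Consumer<Event>> handlers = new HashMap<>();
    private final Queue<Event> queue = new ArrayDeque<>();

    /** Registers the handler responsible for one enum class of event types. */
    void register(Class<?> eventTypeClass, Consumer<Event> handler) {
        handlers.put(eventTypeClass, handler);
    }

    /** Plays the role of getEventHandler().handle(event): it only enqueues. */
    void handle(Event event) {
        queue.add(event);
    }

    /** Drains the queue in arrival order and routes each event to its handler. */
    int processPending() {
        int processed = 0;
        Event event;
        while ((event = queue.poll()) != null) {
            Consumer<Event> handler = handlers.get(event.type.getDeclaringClass());
            if (handler == null) {
                throw new IllegalStateException("No handler registered for " + event.type);
            }
            handler.accept(event);
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        DispatcherSketch dispatcher = new DispatcherSketch();
        dispatcher.register(AppEventType.class,
            e -> System.out.println("Handling " + e.type + " for " + e.appId));
        dispatcher.handle(new Event(AppEventType.START, "application_1"));
        dispatcher.processPending();
    }
}
```

The point of the design is that the caller of handle(...) returns immediately. The START event sits in the queue until the dispatcher gets to it, which matches the comment in RMAppManager that events enqueued before the dispatcher starts are processed once it does.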
Let's first look at the createAndPopulateNewRMApp() method in the RMAppManager class. It calls the RMAppImpl constructor to create the RMApp, which takes us into RMAppImpl.java, included here in full.
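Most of RMAppImpl is one large transition table built with StateMachineFactory.addTransition(...). As a reading aid, here is a minimal sketch of the same table-driven idea in plain Java. It is not Hadoop's StateMachineFactory: the class AppStateMachineSketch is a hypothetical name of mine, it only covers the four "happy path" transitions that appear in the listing, and it carries no transition hooks.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch only: a table-driven state machine covering four of the
// RMAppImpl transitions. Names are hypothetical; this is not Hadoop's API.
class AppStateMachineSketch {
    enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING }
    enum EventType { START, APP_NEW_SAVED, APP_ACCEPTED, ATTEMPT_REGISTERED }

    // Transition table: current state -> (event type -> next state).
    private static final Map<State, Map<EventType, State>> TABLE =
        new EnumMap<State, Map<EventType, State>>(State.class);

    private static void addTransition(State from, State to, EventType on) {
        TABLE.computeIfAbsent(from,
            s -> new EnumMap<EventType, State>(EventType.class)).put(on, to);
    }

    static {
        addTransition(State.NEW, State.NEW_SAVING, EventType.START);
        addTransition(State.NEW_SAVING, State.SUBMITTED, EventType.APP_NEW_SAVED);
        addTransition(State.SUBMITTED, State.ACCEPTED, EventType.APP_ACCEPTED);
        addTransition(State.ACCEPTED, State.RUNNING, EventType.ATTEMPT_REGISTERED);
    }

    private State current = State.NEW;

    State getState() {
        return current;
    }

    /** Looks up (current state, event) in the table and moves to the next state. */
    void doTransition(EventType event) {
        Map<EventType, State> row = TABLE.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            // RMAppImpl.handle() catches the equivalent exception and only logs it.
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        current = next;
    }

    public static void main(String[] args) {
        AppStateMachineSketch app = new AppStateMachineSketch();
        app.doTransition(EventType.START);
        System.out.println("State after START: " + app.getState());
    }
}
```

With this picture in mind, each .addTransition(...) call in the real class reads as one row of the table: the state before, the state (or set of possible states) after, the event type that triggers it, and optionally the transition object to run.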
@SuppressWarnings({ "rawtypes", "unchecked" })
public class RMAppImpl implements RMApp, Recoverable {

  private static final Log LOG = LogFactory.getLog(RMAppImpl.class);
  private static final String UNAVAILABLE = "N/A";

  // Immutable fields
  private final ApplicationId applicationId;
  private final RMContext rmContext;
  private final Configuration conf;
  private final String user;
  private final String name;
  private final ApplicationSubmissionContext submissionContext;
  private final Dispatcher dispatcher;
  private final YarnScheduler scheduler;
  private final ApplicationMasterService masterService;
  private final StringBuilder diagnostics = new StringBuilder();
  private final int maxAppAttempts;
  private final ReadLock readLock;
  private final WriteLock writeLock;
  private final Map<ApplicationAttemptId, RMAppAttempt> attempts
      = new LinkedHashMap<ApplicationAttemptId, RMAppAttempt>();
  private final long submitTime;
  private final Set<RMNode> updatedNodes = new HashSet<RMNode>();
  private final String applicationType;
  private final Set<String> applicationTags;

  private final long attemptFailuresValidityInterval;

  private Clock systemClock;

  private boolean isNumAttemptsBeyondThreshold = false;

  // Mutable fields
  private long startTime;
  private long finishTime = 0;
  private long storedFinishTime = 0;
  // This field isn't protected by readlock now.
  private volatile RMAppAttempt currentAttempt;
  private String queue;
  private EventHandler handler;
  private static final AppFinishedTransition FINISHED_TRANSITION =
      new AppFinishedTransition();
  private Set<NodeId> ranNodes = new ConcurrentSkipListSet<NodeId>();

  // These states stored are only valid when app is at killing or final_saving.
  private RMAppState stateBeforeKilling;
  private RMAppState stateBeforeFinalSaving;
  private RMAppEvent eventCausingFinalSaving;
  private RMAppState targetedFinalState;
  private RMAppState recoveredFinalState;
  private ResourceRequest amReq;

  Object transitionTodo;

  private static final StateMachineFactory<RMAppImpl,
                                           RMAppState,
                                           RMAppEventType,
                                           RMAppEvent> stateMachineFactory
      = new StateMachineFactory<RMAppImpl,
                                RMAppState,
                                RMAppEventType,
                                RMAppEvent>(RMAppState.NEW)


    // Transitions from NEW state
    .addTransition(RMAppState.NEW, RMAppState.NEW,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
        RMAppEventType.START, new RMAppNewlySavingTransition())
    .addTransition(RMAppState.NEW, EnumSet.of(RMAppState.SUBMITTED,
            RMAppState.ACCEPTED, RMAppState.FINISHED, RMAppState.FAILED,
            RMAppState.KILLED, RMAppState.FINAL_SAVING),
        RMAppEventType.RECOVER, new RMAppRecoveredTransition())
    .addTransition(RMAppState.NEW, RMAppState.KILLED, RMAppEventType.KILL,
        new AppKilledTransition())
    .addTransition(RMAppState.NEW, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_REJECTED,
        new FinalSavingTransition(new AppRejectedTransition(),
          RMAppState.FAILED))

    // Transitions from NEW_SAVING state
    .addTransition(RMAppState.NEW_SAVING, RMAppState.NEW_SAVING,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED,
        RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition())
    .addTransition(RMAppState.NEW_SAVING, RMAppState.FINAL_SAVING,
        RMAppEventType.KILL,
        new FinalSavingTransition(
          new AppKilledTransition(), RMAppState.KILLED))
    .addTransition(RMAppState.NEW_SAVING, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_REJECTED,
        new FinalSavingTransition(new AppRejectedTransition(),
          RMAppState.FAILED))
    .addTransition(RMAppState.NEW_SAVING, RMAppState.NEW_SAVING,
        RMAppEventType.MOVE, new RMAppMoveTransition())

    // Transitions from SUBMITTED state
    .addTransition(RMAppState.SUBMITTED, RMAppState.SUBMITTED,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.SUBMITTED, RMAppState.SUBMITTED,
        RMAppEventType.MOVE, new RMAppMoveTransition())
    .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_REJECTED,
        new FinalSavingTransition(
          new AppRejectedTransition(), RMAppState.FAILED))
    .addTransition(RMAppState.SUBMITTED, RMAppState.ACCEPTED,
        RMAppEventType.APP_ACCEPTED, new StartAppAttemptTransition())
    .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
        RMAppEventType.KILL,
        new FinalSavingTransition(
          new AppKilledTransition(), RMAppState.KILLED))

    // Transitions from ACCEPTED state
    .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED,
        RMAppEventType.MOVE, new RMAppMoveTransition())
    .addTransition(RMAppState.ACCEPTED, RMAppState.RUNNING,
        RMAppEventType.ATTEMPT_REGISTERED)
    .addTransition(RMAppState.ACCEPTED,
        EnumSet.of(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING),
        // ACCEPTED state is possible to receive ATTEMPT_FAILED/ATTEMPT_FINISHED
        // event because RMAppRecoveredTransition is returning ACCEPTED state
        // directly and waiting for the previous AM to exit.
        RMAppEventType.ATTEMPT_FAILED,
        new AttemptFailedTransition(RMAppState.ACCEPTED))
    .addTransition(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_FINISHED,
        new FinalSavingTransition(FINISHED_TRANSITION, RMAppState.FINISHED))
    .addTransition(RMAppState.ACCEPTED, RMAppState.KILLING,
        RMAppEventType.KILL, new KillAttemptTransition())
    .addTransition(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_KILLED,
        new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED))
    .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())

    // Transitions from RUNNING state
    .addTransition(RMAppState.RUNNING, RMAppState.RUNNING,
        RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
    .addTransition(RMAppState.RUNNING, RMAppState.RUNNING,
        RMAppEventType.MOVE, new RMAppMoveTransition())
    .addTransition(RMAppState.RUNNING, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_UNREGISTERED,
        new FinalSavingTransition(
          new AttemptUnregisteredTransition(),
          RMAppState.FINISHING, RMAppState.FINISHED))
    .addTransition(RMAppState.RUNNING, RMAppState.FINISHED,
        // UnManagedAM directly jumps to finished
        RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
    .addTransition(RMAppState.RUNNING, RMAppState.RUNNING,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    .addTransition(RMAppState.RUNNING,
        EnumSet.of(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING),
        RMAppEventType.ATTEMPT_FAILED,
        new AttemptFailedTransition(RMAppState.ACCEPTED))
    .addTransition(RMAppState.RUNNING, RMAppState.KILLING,
        RMAppEventType.KILL, new KillAttemptTransition())

    // Transitions from FINAL_SAVING state
    .addTransition(RMAppState.FINAL_SAVING,
        EnumSet.of(RMAppState.FINISHING, RMAppState.FAILED,
          RMAppState.KILLED, RMAppState.FINISHED), RMAppEventType.APP_UPDATE_SAVED,
        new FinalStateSavedTransition())
    .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_FINISHED,
        new AttemptFinishedAtFinalSavingTransition())
    .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    // ignorable transitions
    .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING,
        EnumSet.of(RMAppEventType.NODE_UPDATE, RMAppEventType.KILL,
          RMAppEventType.APP_NEW_SAVED, RMAppEventType.MOVE))

    // Transitions from FINISHING state
    .addTransition(RMAppState.FINISHING, RMAppState.FINISHED,
        RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
    .addTransition(RMAppState.FINISHING, RMAppState.FINISHING,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    // ignorable transitions
    .addTransition(RMAppState.FINISHING, RMAppState.FINISHING,
        EnumSet.of(RMAppEventType.NODE_UPDATE,
          // ignore Kill/Move as we have already saved the final Finished state
          // in state store.
          RMAppEventType.KILL, RMAppEventType.MOVE))

    // Transitions from KILLING state
    .addTransition(RMAppState.KILLING, RMAppState.KILLING,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    .addTransition(RMAppState.KILLING, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_KILLED,
        new FinalSavingTransition(
          new AppKilledTransition(), RMAppState.KILLED))
    .addTransition(RMAppState.KILLING, RMAppState.FINAL_SAVING,
        RMAppEventType.ATTEMPT_UNREGISTERED,
        new FinalSavingTransition(
          new AttemptUnregisteredTransition(),
          RMAppState.FINISHING, RMAppState.FINISHED))
    .addTransition(RMAppState.KILLING, RMAppState.FINISHED,
        // UnManagedAM directly jumps to finished
        RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
    .addTransition(RMAppState.KILLING,
        EnumSet.of(RMAppState.FINAL_SAVING),
        RMAppEventType.ATTEMPT_FAILED,
        new AttemptFailedTransition(RMAppState.KILLING))

    .addTransition(RMAppState.KILLING, RMAppState.KILLING,
        EnumSet.of(
          RMAppEventType.NODE_UPDATE,
          RMAppEventType.ATTEMPT_REGISTERED,
          RMAppEventType.APP_UPDATE_SAVED,
          RMAppEventType.KILL, RMAppEventType.MOVE))

    // Transitions from FINISHED state
    // ignorable transitions
    .addTransition(RMAppState.FINISHED, RMAppState.FINISHED,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    .addTransition(RMAppState.FINISHED, RMAppState.FINISHED,
        EnumSet.of(
          RMAppEventType.NODE_UPDATE,
          RMAppEventType.ATTEMPT_UNREGISTERED,
          RMAppEventType.ATTEMPT_FINISHED,
          RMAppEventType.KILL, RMAppEventType.MOVE))

    // Transitions from FAILED state
    // ignorable transitions
    .addTransition(RMAppState.FAILED, RMAppState.FAILED,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    .addTransition(RMAppState.FAILED, RMAppState.FAILED,
        EnumSet.of(RMAppEventType.KILL, RMAppEventType.NODE_UPDATE,
          RMAppEventType.MOVE))

    // Transitions from KILLED state
    // ignorable transitions
    .addTransition(RMAppState.KILLED, RMAppState.KILLED,
        RMAppEventType.APP_RUNNING_ON_NODE,
        new AppRunningOnNodeTransition())
    .addTransition(
        RMAppState.KILLED,
        RMAppState.KILLED,
        EnumSet.of(RMAppEventType.APP_ACCEPTED,
          RMAppEventType.APP_REJECTED, RMAppEventType.KILL,
          RMAppEventType.ATTEMPT_FINISHED, RMAppEventType.ATTEMPT_FAILED,
          RMAppEventType.NODE_UPDATE, RMAppEventType.MOVE))

    .installTopology();

  private final StateMachine<RMAppState, RMAppEventType, RMAppEvent>
      stateMachine;

  private static final int DUMMY_APPLICATION_ATTEMPT_NUMBER = -1;

  public RMAppImpl(ApplicationId applicationId, RMContext rmContext,
      Configuration config, String name, String user, String queue,
      ApplicationSubmissionContext submissionContext, YarnScheduler scheduler,
      ApplicationMasterService masterService, long submitTime,
      String applicationType, Set<String> applicationTags,
      ResourceRequest amReq) {

    this.systemClock = new SystemClock();

    this.applicationId = applicationId;
    this.name = name;
    this.rmContext = rmContext;
    this.dispatcher = rmContext.getDispatcher();
    this.handler = dispatcher.getEventHandler();
    this.conf = config;
    this.user = user;
    this.queue = queue;
    this.submissionContext = submissionContext;
    this.scheduler = scheduler;
    this.masterService = masterService;
    this.submitTime = submitTime;
    this.startTime = this.systemClock.getTime();
    this.applicationType = applicationType;
    this.applicationTags = applicationTags;
    this.amReq = amReq;

    int globalMaxAppAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
        YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
    int individualMaxAppAttempts = submissionContext.getMaxAppAttempts();
    if (individualMaxAppAttempts <= 0 ||
        individualMaxAppAttempts > globalMaxAppAttempts) {
      this.maxAppAttempts = globalMaxAppAttempts;
      LOG.warn("The specific max attempts: " + individualMaxAppAttempts
          + " for application: " + applicationId.getId()
          + " is invalid, because it is out of the range [1, "
          + globalMaxAppAttempts + "]. Use the global max attempts instead.");
    } else {
      this.maxAppAttempts = individualMaxAppAttempts;
    }

    this.attemptFailuresValidityInterval =
        submissionContext.getAttemptFailuresValidityInterval();
    if (this.attemptFailuresValidityInterval > 0) {
      LOG.info("The attemptFailuresValidityInterval for the application: "
          + this.applicationId + " is " + this.attemptFailuresValidityInterval
          + ".");
    }

    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    this.readLock = lock.readLock();
    this.writeLock = lock.writeLock();

    this.stateMachine = stateMachineFactory.make(this);

    rmContext.getRMApplicationHistoryWriter().applicationStarted(this);
    rmContext.getSystemMetricsPublisher().appCreated(this, startTime);
  }

  @Override
  public ApplicationId getApplicationId() {
    return this.applicationId;
  }

  @Override
  public ApplicationSubmissionContext getApplicationSubmissionContext() {
    return this.submissionContext;
  }

  @Override
  public FinalApplicationStatus getFinalApplicationStatus() {
    // finish state is obtained based on the state machine's current state
    // as a fall-back in case the application has not been unregistered
    // ( or if the app never unregistered itself )
    // when the report is requested
    if (currentAttempt != null
        && currentAttempt.getFinalApplicationStatus() != null) {
      return currentAttempt.getFinalApplicationStatus();
    }
    return createFinalApplicationStatus(this.stateMachine.getCurrentState());
  }

  @Override
  public RMAppState getState() {
    this.readLock.lock();
    try {
      return this.stateMachine.getCurrentState();
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getUser() {
    return this.user;
  }

  @Override
  public float getProgress() {
    RMAppAttempt attempt = this.currentAttempt;
    if (attempt != null) {
      return attempt.getProgress();
    }
    return 0;
  }

  @Override
  public RMAppAttempt getRMAppAttempt(ApplicationAttemptId appAttemptId) {
    this.readLock.lock();

    try {
      return this.attempts.get(appAttemptId);
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getQueue() {
    return this.queue;
  }

  @Override
  public void setQueue(String queue) {
    this.queue = queue;
  }

  @Override
  public String getName() {
    return this.name;
  }

  @Override
  public RMAppAttempt getCurrentAppAttempt() {
    return this.currentAttempt;
  }

  @Override
  public Map<ApplicationAttemptId, RMAppAttempt> getAppAttempts() {
    this.readLock.lock();

    try {
      return Collections.unmodifiableMap(this.attempts);
    } finally {
      this.readLock.unlock();
    }
  }

  private FinalApplicationStatus createFinalApplicationStatus(RMAppState state) {
    switch(state) {
    case NEW:
    case NEW_SAVING:
    case SUBMITTED:
    case ACCEPTED:
    case RUNNING:
    case FINAL_SAVING:
    case KILLING:
      return FinalApplicationStatus.UNDEFINED;
    // finished without a proper final state is the same as failed
    case FINISHING:
    case FINISHED:
    case FAILED:
      return FinalApplicationStatus.FAILED;
    case KILLED:
      return FinalApplicationStatus.KILLED;
    }
    throw new YarnRuntimeException("Unknown state passed!");
  }

  @Override
  public int pullRMNodeUpdates(Collection<RMNode> updatedNodes) {
    this.writeLock.lock();
    try {
      int updatedNodeCount = this.updatedNodes.size();
      updatedNodes.addAll(this.updatedNodes);
      this.updatedNodes.clear();
      return updatedNodeCount;
    } finally {
      this.writeLock.unlock();
    }
  }

  @Override
  public ApplicationReport createAndGetApplicationReport(String clientUserName,
      boolean allowAccess) {
    this.readLock.lock();

    try {
      ApplicationAttemptId currentApplicationAttemptId = null;
      org.apache.hadoop.yarn.api.records.Token clientToAMToken = null;
      String trackingUrl = UNAVAILABLE;
      String host = UNAVAILABLE;
      String origTrackingUrl = UNAVAILABLE;
      int rpcPort = -1;
      ApplicationResourceUsageReport appUsageReport =
          RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
      FinalApplicationStatus finishState = getFinalApplicationStatus();
      String diags = UNAVAILABLE;
      float progress = 0.0f;
      org.apache.hadoop.yarn.api.records.Token amrmToken = null;
      if (allowAccess) {
        trackingUrl = getDefaultProxyTrackingUrl();
        if (this.currentAttempt != null) {
          currentApplicationAttemptId = this.currentAttempt.getAppAttemptId();
          trackingUrl = this.currentAttempt.getTrackingUrl();
          origTrackingUrl = this.currentAttempt.getOriginalTrackingUrl();
          if (UserGroupInformation.isSecurityEnabled()) {
            // get a token so the client can communicate with the app attempt
            // NOTE: token may be unavailable if the attempt is not running
            Token<ClientToAMTokenIdentifier> attemptClientToAMToken =
                this.currentAttempt.createClientToken(clientUserName);
            if (attemptClientToAMToken != null) {
              clientToAMToken = BuilderUtils.newClientToAMToken(
                  attemptClientToAMToken.getIdentifier(),
                  attemptClientToAMToken.getKind().toString(),
                  attemptClientToAMToken.getPassword(),
                  attemptClientToAMToken.getService().toString());
            }
          }
          host = this.currentAttempt.getHost();
          rpcPort = this.currentAttempt.getRpcPort();
          appUsageReport = currentAttempt.getApplicationResourceUsageReport();
          progress = currentAttempt.getProgress();
        }
        diags = this.diagnostics.toString();

        if (currentAttempt != null &&
            currentAttempt.getAppAttemptState() == RMAppAttemptState.LAUNCHED) {
          if (getApplicationSubmissionContext().getUnmanagedAM() &&
              clientUserName != null && getUser().equals(clientUserName)) {
            Token<AMRMTokenIdentifier> token = currentAttempt.getAMRMToken();
            if (token != null) {
              amrmToken = BuilderUtils.newAMRMToken(token.getIdentifier(),
                  token.getKind().toString(), token.getPassword(),
                  token.getService().toString());
            }
          }
        }

        RMAppMetrics rmAppMetrics = getRMAppMetrics();
        appUsageReport.setMemorySeconds(rmAppMetrics.getMemorySeconds());
        appUsageReport.setVcoreSeconds(rmAppMetrics.getVcoreSeconds());
      }

      if (currentApplicationAttemptId == null) {
        currentApplicationAttemptId =
            BuilderUtils.newApplicationAttemptId(this.applicationId,
                DUMMY_APPLICATION_ATTEMPT_NUMBER);
      }

      return BuilderUtils.newApplicationReport(this.applicationId,
          currentApplicationAttemptId, this.user, this.queue,
          this.name, host, rpcPort, clientToAMToken,
          createApplicationState(), diags,
          trackingUrl, this.startTime, this.finishTime, finishState,
          appUsageReport, origTrackingUrl, progress, this.applicationType,
          amrmToken, applicationTags);
    } finally {
      this.readLock.unlock();
    }
  }

  private String getDefaultProxyTrackingUrl() {
    try {
      final String scheme = WebAppUtils.getHttpSchemePrefix(conf);
      String proxy = WebAppUtils.getProxyHostAndPort(conf);
      URI proxyUri = ProxyUriUtils.getUriFromAMUrl(scheme, proxy);
      URI result = ProxyUriUtils.getProxyUri(null, proxyUri, applicationId);
      return result.toASCIIString();
    } catch (URISyntaxException e) {
      LOG.warn("Could not generate default proxy tracking URL for "
          + applicationId);
      return UNAVAILABLE;
    }
  }

  @Override
  public long getFinishTime() {
    this.readLock.lock();

    try {
      return this.finishTime;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public long getStartTime() {
    this.readLock.lock();

    try {
      return this.startTime;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public long getSubmitTime() {
    return this.submitTime;
  }

  @Override
  public String getTrackingUrl() {
    RMAppAttempt attempt = this.currentAttempt;
    if (attempt != null) {
      return attempt.getTrackingUrl();
    }
    return null;
  }

  @Override
  public String getOriginalTrackingUrl() {
    RMAppAttempt attempt = this.currentAttempt;
    if (attempt != null) {
      return attempt.getOriginalTrackingUrl();
    }
    return null;
  }

  @Override
  public StringBuilder getDiagnostics() {
    this.readLock.lock();

    try {
      return this.diagnostics;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public int getMaxAppAttempts() {
    return this.maxAppAttempts;
  }

  @Override
  public void handle(RMAppEvent event) {

    this.writeLock.lock();

    try {
      ApplicationId appID = event.getApplicationId();
      LOG.debug("Processing event for " + appID + " of type "
          + event.getType());
      final RMAppState oldState = getState();
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitonException e) {
        LOG.error("Can't handle this event at current state", e);
        /* TODO fail the application on the failed transition */
      }

      if (oldState != getState()) {
        LOG.info(appID + " State change from " + oldState + " to "
            + getState());
      }
    } finally {
      this.writeLock.unlock();
    }
  }

  @Override
  public void recover(RMState state) {
    ApplicationStateData appState =
        state.getApplicationState().get(getApplicationId());
    this.recoveredFinalState = appState.getState();
    LOG.info("Recovering app: " + getApplicationId() + " with " +
        + appState.getAttemptCount() + " attempts and final state = "
        + this.recoveredFinalState );
    this.diagnostics.append(appState.getDiagnostics());
    this.storedFinishTime = appState.getFinishTime();
    this.startTime = appState.getStartTime();

    for(int i=0; i<appState.getAttemptCount(); ++i) {
      // create attempt
      createNewAttempt();
      ((RMAppAttemptImpl)this.currentAttempt).recover(state);
    }
  }

  private void createNewAttempt() {
    ApplicationAttemptId appAttemptId =
        ApplicationAttemptId.newInstance(applicationId, attempts.size() + 1);
    RMAppAttempt attempt =
        new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
          submissionContext, conf,
          // The newly created attempt maybe last attempt if (number of
          // previously failed attempts(which should not include Preempted,
          // hardware error and NM resync) + 1) equal to the max-attempt
          // limit.
          maxAppAttempts == (getNumFailedAppAttempts() + 1), amReq);
    attempts.put(appAttemptId, attempt);
    currentAttempt = attempt;
  }

  private void
      createAndStartNewAttempt(boolean transferStateFromPreviousAttempt) {
    createNewAttempt();
    handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(),
      transferStateFromPreviousAttempt));
  }

  private void processNodeUpdate(RMAppNodeUpdateType type, RMNode node) {
    NodeState nodeState = node.getState();
    updatedNodes.add(node);
    LOG.debug("Received node update event:" + type + " for node:" + node
        + " with state:" + nodeState);
  }

  private static class RMAppTransition implements
      SingleArcTransition<RMAppImpl, RMAppEvent> {
    public void transition(RMAppImpl app, RMAppEvent event) {
    };

  }

  private static final class RMAppNodeUpdateTransition extends RMAppTransition {
    public void transition(RMAppImpl app, RMAppEvent event) {
      RMAppNodeUpdateEvent nodeUpdateEvent = (RMAppNodeUpdateEvent) event;
      app.processNodeUpdate(nodeUpdateEvent.getUpdateType(),
          nodeUpdateEvent.getNode());
    };
  }

  private static final class AppRunningOnNodeTransition extends RMAppTransition {
    public void transition(RMAppImpl app, RMAppEvent event) {
      RMAppRunningOnNodeEvent nodeAddedEvent = (RMAppRunningOnNodeEvent) event;

      // if final state already stored, notify RMNode
      if (isAppInFinalState(app)) {
        app.handler.handle(
            new RMNodeCleanAppEvent(nodeAddedEvent.getNodeId(), nodeAddedEvent
                .getApplicationId()));
        return;
      }

      // otherwise, add it to ranNodes for further process
      app.ranNodes.add(nodeAddedEvent.getNodeId());
    };
  }

  /**
   * Move an app to a new queue.
   * This transition must set the result on the Future in the RMAppMoveEvent,
   * either as an exception for failure or null for success, or the client will
   * be left waiting forever.
   */
  private static final class RMAppMoveTransition extends RMAppTransition {
    public void transition(RMAppImpl app, RMAppEvent event) {
      RMAppMoveEvent moveEvent = (RMAppMoveEvent) event;
      try {
        app.queue = app.scheduler.moveApplication(app.applicationId,
            moveEvent.getTargetQueue());
      } catch (YarnException ex) {
        moveEvent.getResult().setException(ex);
        return;
      }

      // TODO: Write out change to state store (YARN-1558)
      // Also take care of RM failover
      moveEvent.getResult().set(null);
    }
  }

  // synchronously recover attempt to ensure any incoming external events
  // to be processed after the attempt processes the recover event.
729 private void recoverAppAttempts() { 730 for (RMAppAttempt attempt : getAppAttempts().values()) { 731 attempt.handle(new RMAppAttemptEvent(attempt.getAppAttemptId(), 732 RMAppAttemptEventType.RECOVER)); 733 } 734 } 735 736 private static final class RMAppRecoveredTransition implements 737 MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> { 738 739 @Override 740 public RMAppState transition(RMAppImpl app, RMAppEvent event) { 741 742 RMAppRecoverEvent recoverEvent = (RMAppRecoverEvent) event; 743 app.recover(recoverEvent.getRMState()); 744 // The app has completed. 745 if (app.recoveredFinalState != null) { 746 app.recoverAppAttempts(); 747 new FinalTransition(app.recoveredFinalState).transition(app, event); 748 return app.recoveredFinalState; 749 } 750 751 if (UserGroupInformation.isSecurityEnabled()) { 752 // asynchronously renew delegation token on recovery. 753 try { 754 app.rmContext.getDelegationTokenRenewer() 755 .addApplicationAsyncDuringRecovery(app.getApplicationId(), 756 app.parseCredentials(), 757 app.submissionContext.getCancelTokensWhenComplete(), 758 app.getUser()); 759 } catch (Exception e) { 760 String msg = "Failed to fetch user credentials from application:" 761 + e.getMessage(); 762 app.diagnostics.append(msg); 763 LOG.error(msg, e); 764 } 765 } 766 767 // No existent attempts means the attempt associated with this app was not 768 // started or started but not yet saved. 769 if (app.attempts.isEmpty()) { 770 app.scheduler.handle(new AppAddedSchedulerEvent(app.applicationId, 771 app.submissionContext.getQueue(), app.user, 772 app.submissionContext.getReservationID())); 773 return RMAppState.SUBMITTED; 774 } 775 776 // Add application to scheduler synchronously to guarantee scheduler 777 // knows applications before AM or NM re-registers. 
778 app.scheduler.handle(new AppAddedSchedulerEvent(app.applicationId, 779 app.submissionContext.getQueue(), app.user, true, 780 app.submissionContext.getReservationID())); 781 782 // recover attempts 783 app.recoverAppAttempts(); 784 785 // Last attempt is in final state, return ACCEPTED waiting for last 786 // RMAppAttempt to send finished or failed event back. 787 if (app.currentAttempt != null 788 && (app.currentAttempt.getState() == RMAppAttemptState.KILLED 789 || app.currentAttempt.getState() == RMAppAttemptState.FINISHED 790 || (app.currentAttempt.getState() == RMAppAttemptState.FAILED 791 && app.getNumFailedAppAttempts() == app.maxAppAttempts))) { 792 return RMAppState.ACCEPTED; 793 } 794 795 // YARN-1507 is saving the application state after the application is 796 // accepted. So after YARN-1507, an app is saved meaning it is accepted. 797 // Thus we return ACCECPTED state on recovery. 798 return RMAppState.ACCEPTED; 799 } 800 } 801 802 private static final class AddApplicationToSchedulerTransition extends 803 RMAppTransition { 804 @Override 805 public void transition(RMAppImpl app, RMAppEvent event) { 806 app.handler.handle(new AppAddedSchedulerEvent(app.applicationId, 807 app.submissionContext.getQueue(), app.user, 808 app.submissionContext.getReservationID())); 809 } 810 } 811 812 private static final class StartAppAttemptTransition extends RMAppTransition { 813 @Override 814 public void transition(RMAppImpl app, RMAppEvent event) { 815 app.createAndStartNewAttempt(false); 816 }; 817 } 818 819 private static final class FinalStateSavedTransition implements 820 MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> { 821 822 @Override 823 public RMAppState transition(RMAppImpl app, RMAppEvent event) { 824 if (app.transitionTodo instanceof SingleArcTransition) { 825 ((SingleArcTransition) app.transitionTodo).transition(app, 826 app.eventCausingFinalSaving); 827 } else if (app.transitionTodo instanceof MultipleArcTransition) { 828 
((MultipleArcTransition) app.transitionTodo).transition(app, 829 app.eventCausingFinalSaving); 830 } 831 return app.targetedFinalState; 832 833 } 834 } 835 836 private static class AttemptFailedFinalStateSavedTransition extends 837 RMAppTransition { 838 @Override 839 public void transition(RMAppImpl app, RMAppEvent event) { 840 String msg = null; 841 if (event instanceof RMAppFailedAttemptEvent) { 842 msg = app.getAppAttemptFailedDiagnostics(event); 843 } 844 LOG.info(msg); 845 app.diagnostics.append(msg); 846 // Inform the node for app-finish 847 new FinalTransition(RMAppState.FAILED).transition(app, event); 848 } 849 } 850 851 private String getAppAttemptFailedDiagnostics(RMAppEvent event) { 852 String msg = null; 853 RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event; 854 if (this.submissionContext.getUnmanagedAM()) { 855 // RM does not manage the AM. Do not retry 856 msg = "Unmanaged application " + this.getApplicationId() 857 + " failed due to " + failedEvent.getDiagnosticMsg() 858 + ". Failing the application."; 859 } else if (this.isNumAttemptsBeyondThreshold) { 860 msg = "Application " + this.getApplicationId() + " failed " 861 + this.maxAppAttempts + " times due to " 862 + failedEvent.getDiagnosticMsg() + ". 
Failing the application."; 863 } 864 return msg; 865 } 866 867 private static final class RMAppNewlySavingTransition extends RMAppTransition { 868 @Override 869 public void transition(RMAppImpl app, RMAppEvent event) { 870 871 // If recovery is enabled then store the application information in a 872 // non-blocking call so make sure that RM has stored the information 873 // needed to restart the AM after RM restart without further client 874 // communication 875 LOG.info("Storing application with id " + app.applicationId); 876 app.rmContext.getStateStore().storeNewApplication(app); 877 } 878 } 879 880 private void rememberTargetTransitions(RMAppEvent event, 881 Object transitionToDo, RMAppState targetFinalState) { 882 transitionTodo = transitionToDo; 883 targetedFinalState = targetFinalState; 884 eventCausingFinalSaving = event; 885 } 886 887 private void rememberTargetTransitionsAndStoreState(RMAppEvent event, 888 Object transitionToDo, RMAppState targetFinalState, 889 RMAppState stateToBeStored) { 890 rememberTargetTransitions(event, transitionToDo, targetFinalState); 891 this.stateBeforeFinalSaving = getState(); 892 this.storedFinishTime = this.systemClock.getTime(); 893 894 LOG.info("Updating application " + this.applicationId 895 + " with final state: " + this.targetedFinalState); 896 // we lost attempt_finished diagnostics in app, because attempt_finished 897 // diagnostics is sent after app final state is saved. Later on, we will 898 // create GetApplicationAttemptReport specifically for getting per attempt 899 // info. 
900 String diags = null; 901 switch (event.getType()) { 902 case APP_REJECTED: 903 case ATTEMPT_FINISHED: 904 case ATTEMPT_KILLED: 905 diags = event.getDiagnosticMsg(); 906 break; 907 case ATTEMPT_FAILED: 908 RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event; 909 diags = getAppAttemptFailedDiagnostics(failedEvent); 910 break; 911 default: 912 break; 913 } 914 ApplicationStateData appState = 915 ApplicationStateData.newInstance(this.submitTime, this.startTime, 916 this.user, this.submissionContext, 917 stateToBeStored, diags, this.storedFinishTime); 918 this.rmContext.getStateStore().updateApplicationState(appState); 919 } 920 921 private static final class FinalSavingTransition extends RMAppTransition { 922 Object transitionToDo; 923 RMAppState targetedFinalState; 924 RMAppState stateToBeStored; 925 926 public FinalSavingTransition(Object transitionToDo, 927 RMAppState targetedFinalState) { 928 this(transitionToDo, targetedFinalState, targetedFinalState); 929 } 930 931 public FinalSavingTransition(Object transitionToDo, 932 RMAppState targetedFinalState, RMAppState stateToBeStored) { 933 this.transitionToDo = transitionToDo; 934 this.targetedFinalState = targetedFinalState; 935 this.stateToBeStored = stateToBeStored; 936 } 937 938 @Override 939 public void transition(RMAppImpl app, RMAppEvent event) { 940 app.rememberTargetTransitionsAndStoreState(event, transitionToDo, 941 targetedFinalState, stateToBeStored); 942 } 943 } 944 945 private static class AttemptUnregisteredTransition extends RMAppTransition { 946 @Override 947 public void transition(RMAppImpl app, RMAppEvent event) { 948 app.finishTime = app.storedFinishTime; 949 } 950 } 951 952 private static class AppFinishedTransition extends FinalTransition { 953 public AppFinishedTransition() { 954 super(RMAppState.FINISHED); 955 } 956 957 public void transition(RMAppImpl app, RMAppEvent event) { 958 app.diagnostics.append(event.getDiagnosticMsg()); 959 super.transition(app, event); 960 }; 961 
} 962 963 private static class AttemptFinishedAtFinalSavingTransition extends 964 RMAppTransition { 965 @Override 966 public void transition(RMAppImpl app, RMAppEvent event) { 967 if (app.targetedFinalState.equals(RMAppState.FAILED) 968 || app.targetedFinalState.equals(RMAppState.KILLED)) { 969 // Ignore Attempt_Finished event if we were supposed to reach FAILED 970 // FINISHED state 971 return; 972 } 973 974 // pass in the earlier attempt_unregistered event, as it is needed in 975 // AppFinishedFinalStateSavedTransition later on 976 app.rememberTargetTransitions(event, 977 new AppFinishedFinalStateSavedTransition(app.eventCausingFinalSaving), 978 RMAppState.FINISHED); 979 }; 980 } 981 982 private static class AppFinishedFinalStateSavedTransition extends 983 RMAppTransition { 984 RMAppEvent attemptUnregistered; 985 986 public AppFinishedFinalStateSavedTransition(RMAppEvent attemptUnregistered) { 987 this.attemptUnregistered = attemptUnregistered; 988 } 989 @Override 990 public void transition(RMAppImpl app, RMAppEvent event) { 991 new AttemptUnregisteredTransition().transition(app, attemptUnregistered); 992 FINISHED_TRANSITION.transition(app, event); 993 }; 994 } 995 996 997 private static class AppKilledTransition extends FinalTransition { 998 public AppKilledTransition() { 999 super(RMAppState.KILLED); 1000 } 1001 1002 @Override 1003 public void transition(RMAppImpl app, RMAppEvent event) { 1004 app.diagnostics.append(event.getDiagnosticMsg()); 1005 super.transition(app, event); 1006 }; 1007 } 1008 1009 private static class KillAttemptTransition extends RMAppTransition { 1010 @Override 1011 public void transition(RMAppImpl app, RMAppEvent event) { 1012 app.stateBeforeKilling = app.getState(); 1013 // Forward app kill diagnostics in the event to kill app attempt. 1014 // These diagnostics will be returned back in ATTEMPT_KILLED event sent by 1015 // RMAppAttemptImpl. 
1016 app.handler.handle( 1017 new RMAppAttemptEvent(app.currentAttempt.getAppAttemptId(), 1018 RMAppAttemptEventType.KILL, event.getDiagnosticMsg())); 1019 } 1020 } 1021 1022 private static final class AppRejectedTransition extends 1023 FinalTransition{ 1024 public AppRejectedTransition() { 1025 super(RMAppState.FAILED); 1026 } 1027 1028 public void transition(RMAppImpl app, RMAppEvent event) { 1029 app.diagnostics.append(event.getDiagnosticMsg()); 1030 super.transition(app, event); 1031 }; 1032 } 1033 1034 private static class FinalTransition extends RMAppTransition { 1035 1036 private final RMAppState finalState; 1037 1038 public FinalTransition(RMAppState finalState) { 1039 this.finalState = finalState; 1040 } 1041 1042 public void transition(RMAppImpl app, RMAppEvent event) { 1043 for (NodeId nodeId : app.getRanNodes()) { 1044 app.handler.handle( 1045 new RMNodeCleanAppEvent(nodeId, app.applicationId)); 1046 } 1047 app.finishTime = app.storedFinishTime; 1048 if (app.finishTime == 0 ) { 1049 app.finishTime = app.systemClock.getTime(); 1050 } 1051 // Recovered apps that are completed were not added to scheduler, so no 1052 // need to remove them from scheduler. 1053 if (app.recoveredFinalState == null) { 1054 app.handler.handle(new AppRemovedSchedulerEvent(app.applicationId, 1055 finalState)); 1056 } 1057 app.handler.handle( 1058 new RMAppManagerEvent(app.applicationId, 1059 RMAppManagerEventType.APP_COMPLETED)); 1060 1061 app.rmContext.getRMApplicationHistoryWriter() 1062 .applicationFinished(app, finalState); 1063 app.rmContext.getSystemMetricsPublisher() 1064 .appFinished(app, finalState, app.finishTime); 1065 }; 1066 } 1067 1068 private int getNumFailedAppAttempts() { 1069 int completedAttempts = 0; 1070 long endTime = this.systemClock.getTime(); 1071 // Do not count AM preemption, hardware failures or NM resync 1072 // as attempt failure. 
1073 for (RMAppAttempt attempt : attempts.values()) { 1074 if (attempt.shouldCountTowardsMaxAttemptRetry()) { 1075 if (this.attemptFailuresValidityInterval <= 0 1076 || (attempt.getFinishTime() > endTime 1077 - this.attemptFailuresValidityInterval)) { 1078 completedAttempts++; 1079 } 1080 } 1081 } 1082 return completedAttempts; 1083 } 1084 1085 private static final class AttemptFailedTransition implements 1086 MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> { 1087 1088 private final RMAppState initialState; 1089 1090 public AttemptFailedTransition(RMAppState initialState) { 1091 this.initialState = initialState; 1092 } 1093 1094 @Override 1095 public RMAppState transition(RMAppImpl app, RMAppEvent event) { 1096 int numberOfFailure = app.getNumFailedAppAttempts(); 1097 LOG.info("The number of failed attempts" 1098 + (app.attemptFailuresValidityInterval > 0 ? " in previous " 1099 + app.attemptFailuresValidityInterval + " milliseconds " : " ") 1100 + "is " + numberOfFailure + ". The max attempts is " 1101 + app.maxAppAttempts); 1102 if (!app.submissionContext.getUnmanagedAM() 1103 && numberOfFailure < app.maxAppAttempts) { 1104 if (initialState.equals(RMAppState.KILLING)) { 1105 // If this is not last attempt, app should be killed instead of 1106 // launching a new attempt 1107 app.rememberTargetTransitionsAndStoreState(event, 1108 new AppKilledTransition(), RMAppState.KILLED, RMAppState.KILLED); 1109 return RMAppState.FINAL_SAVING; 1110 } 1111 1112 boolean transferStateFromPreviousAttempt; 1113 RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event; 1114 transferStateFromPreviousAttempt = 1115 failedEvent.getTransferStateFromPreviousAttempt(); 1116 1117 RMAppAttempt oldAttempt = app.currentAttempt; 1118 app.createAndStartNewAttempt(transferStateFromPreviousAttempt); 1119 // Transfer the state from the previous attempt to the current attempt. 
1120 // Note that the previous failed attempt may still be collecting the 1121 // container events from the scheduler and update its data structures 1122 // before the new attempt is created. We always transferState for 1123 // finished containers so that they can be acked to NM, 1124 // but when pulling finished container we will check this flag again. 1125 ((RMAppAttemptImpl) app.currentAttempt) 1126 .transferStateFromPreviousAttempt(oldAttempt); 1127 return initialState; 1128 } else { 1129 if (numberOfFailure >= app.maxAppAttempts) { 1130 app.isNumAttemptsBeyondThreshold = true; 1131 } 1132 app.rememberTargetTransitionsAndStoreState(event, 1133 new AttemptFailedFinalStateSavedTransition(), RMAppState.FAILED, 1134 RMAppState.FAILED); 1135 return RMAppState.FINAL_SAVING; 1136 } 1137 } 1138 } 1139 1140 @Override 1141 public String getApplicationType() { 1142 return this.applicationType; 1143 } 1144 1145 @Override 1146 public Set<String> getApplicationTags() { 1147 return this.applicationTags; 1148 } 1149 1150 @Override 1151 public boolean isAppFinalStateStored() { 1152 RMAppState state = getState(); 1153 return state.equals(RMAppState.FINISHING) 1154 || state.equals(RMAppState.FINISHED) || state.equals(RMAppState.FAILED) 1155 || state.equals(RMAppState.KILLED); 1156 } 1157 1158 @Override 1159 public YarnApplicationState createApplicationState() { 1160 RMAppState rmAppState = getState(); 1161 // If App is in FINAL_SAVING state, return its previous state. 
1162 if (rmAppState.equals(RMAppState.FINAL_SAVING)) { 1163 rmAppState = stateBeforeFinalSaving; 1164 } 1165 if (rmAppState.equals(RMAppState.KILLING)) { 1166 rmAppState = stateBeforeKilling; 1167 } 1168 return RMServerUtils.createApplicationState(rmAppState); 1169 } 1170 1171 public static boolean isAppInFinalState(RMApp rmApp) { 1172 RMAppState appState = ((RMAppImpl) rmApp).getRecoveredFinalState(); 1173 if (appState == null) { 1174 appState = rmApp.getState(); 1175 } 1176 return appState == RMAppState.FAILED || appState == RMAppState.FINISHED 1177 || appState == RMAppState.KILLED; 1178 } 1179 1180 public RMAppState getRecoveredFinalState() { 1181 return this.recoveredFinalState; 1182 } 1183 1184 @Override 1185 public Set<NodeId> getRanNodes() { 1186 return ranNodes; 1187 } 1188 1189 @Override 1190 public RMAppMetrics getRMAppMetrics() { 1191 Resource resourcePreempted = Resource.newInstance(0, 0); 1192 int numAMContainerPreempted = 0; 1193 int numNonAMContainerPreempted = 0; 1194 long memorySeconds = 0; 1195 long vcoreSeconds = 0; 1196 for (RMAppAttempt attempt : attempts.values()) { 1197 if (null != attempt) { 1198 RMAppAttemptMetrics attemptMetrics = 1199 attempt.getRMAppAttemptMetrics(); 1200 Resources.addTo(resourcePreempted, 1201 attemptMetrics.getResourcePreempted()); 1202 numAMContainerPreempted += attemptMetrics.getIsPreempted() ? 1 : 0; 1203 numNonAMContainerPreempted += 1204 attemptMetrics.getNumNonAMContainersPreempted(); 1205 // getAggregateAppResourceUsage() will calculate resource usage stats 1206 // for both running and finished containers. 
1207 AggregateAppResourceUsage resUsage = 1208 attempt.getRMAppAttemptMetrics().getAggregateAppResourceUsage(); 1209 memorySeconds += resUsage.getMemorySeconds(); 1210 vcoreSeconds += resUsage.getVcoreSeconds(); 1211 } 1212 } 1213 1214 return new RMAppMetrics(resourcePreempted, 1215 numNonAMContainerPreempted, numAMContainerPreempted, 1216 memorySeconds, vcoreSeconds); 1217 } 1218 1219 @Private 1220 @VisibleForTesting 1221 public void setSystemClock(Clock clock) { 1222 this.systemClock = clock; 1223 } 1224 1225 @Override 1226 public ReservationId getReservationId() { 1227 return submissionContext.getReservationID(); 1228 } 1229 1230 @Override 1231 public ResourceRequest getAMResourceRequest() { 1232 return this.amReq; 1233 } 1234 1235 protected Credentials parseCredentials() throws IOException { 1236 Credentials credentials = new Credentials(); 1237 DataInputByteBuffer dibb = new DataInputByteBuffer(); 1238 ByteBuffer tokens = submissionContext.getAMContainerSpec().getTokens(); 1239 if (tokens != null) { 1240 dibb.reset(tokens); 1241 credentials.readTokenStorageStream(dibb); 1242 tokens.rewind(); 1243 } 1244 return credentials; 1245 } 1246 }
RMAppImpl.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
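Next, look at this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)). this.rmContext is an object of the RMContext interface, which represents the ResourceManager's context; the interface has a single implementation, RMContextImpl. Its getDispatcher() method returns a Dispatcher, and the implementation used here is AsyncDispatcher. AsyncDispatcher.getEventHandler() in turn returns an EventHandler; inside AsyncDispatcher that handler is already initialized to an instance of the inner class GenericEventHandler, so the call ends up in GenericEventHandler.handle(). The handler for RMAppEventType is registered with the central asynchronous dispatcher in ResourceManager.java. All that handle() does in the end is put the event on the queue: eventQueue.put(event). Note that this AsyncDispatcher is shared by all event types. Once the RMAppEventType.START event is on eventQueue, the dispatcher thread takes it off the queue and passes it to the handler registered for RMAppEventType, which delivers it to RMAppImpl.handle().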
/**
 * Dispatches {@link Event}s in a separate thread. Currently only single thread
 * does that. Potentially there could be multiple channels for each event type
 * class and a thread pool can be used to dispatch the events.
 */
@SuppressWarnings("rawtypes")
@Public
@Evolving
public class AsyncDispatcher extends AbstractService implements Dispatcher {

  private static final Log LOG = LogFactory.getLog(AsyncDispatcher.class);

  private final BlockingQueue<Event> eventQueue;
  private volatile int lastEventQueueSizeLogged = 0;
  private volatile boolean stopped = false;

  // Configuration flag for enabling/disabling draining dispatcher's events on
  // stop functionality.
  private volatile boolean drainEventsOnStop = false;

  // Indicates all the remaining dispatcher's events on stop have been drained
  // and processed.
  private volatile boolean drained = true;
  private Object waitForDrained = new Object();

  // For drainEventsOnStop enabled only, block newly coming events into the
  // queue while stopping.
  private volatile boolean blockNewEvents = false;
  private final EventHandler handlerInstance = new GenericEventHandler();

  private Thread eventHandlingThread;
  protected final Map<Class<? extends Enum>, EventHandler> eventDispatchers;
  private boolean exitOnDispatchException;

  public AsyncDispatcher() {
    this(new LinkedBlockingQueue<Event>());
  }

  public AsyncDispatcher(BlockingQueue<Event> eventQueue) {
    super("Dispatcher");
    this.eventQueue = eventQueue;
    this.eventDispatchers = new HashMap<Class<? extends Enum>, EventHandler>();
  }

  Runnable createThread() {
    return new Runnable() {
      @Override
      public void run() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
          drained = eventQueue.isEmpty();
          // blockNewEvents is only set when dispatcher is draining to stop,
          // adding this check is to avoid the overhead of acquiring the lock
          // and calling notify every time in the normal run of the loop.
          if (blockNewEvents) {
            synchronized (waitForDrained) {
              if (drained) {
                waitForDrained.notify();
              }
            }
          }
          Event event;
          try {
            event = eventQueue.take();
          } catch(InterruptedException ie) {
            if (!stopped) {
              LOG.warn("AsyncDispatcher thread interrupted", ie);
            }
            return;
          }
          if (event != null) {
            dispatch(event);
          }
        }
      }
    };
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    this.exitOnDispatchException =
        conf.getBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY,
          Dispatcher.DEFAULT_DISPATCHER_EXIT_ON_ERROR);
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    //start all the components
    super.serviceStart();
    eventHandlingThread = new Thread(createThread());
    eventHandlingThread.setName("AsyncDispatcher event handler");
    eventHandlingThread.start();
  }

  public void setDrainEventsOnStop() {
    drainEventsOnStop = true;
  }

  @Override
  protected void serviceStop() throws Exception {
    if (drainEventsOnStop) {
      blockNewEvents = true;
      LOG.info("AsyncDispatcher is draining to stop, igonring any new events.");
      long endTime = System.currentTimeMillis() + getConfig()
          .getLong(YarnConfiguration.DISPATCHER_DRAIN_EVENTS_TIMEOUT,
              YarnConfiguration.DEFAULT_DISPATCHER_DRAIN_EVENTS_TIMEOUT);

      synchronized (waitForDrained) {
        while (!drained && eventHandlingThread != null
            && eventHandlingThread.isAlive()
            && System.currentTimeMillis() < endTime) {
          waitForDrained.wait(1000);
          LOG.info("Waiting for AsyncDispatcher to drain. Thread state is :" +
              eventHandlingThread.getState());
        }
      }
    }
    stopped = true;
    if (eventHandlingThread != null) {
      eventHandlingThread.interrupt();
      try {
        eventHandlingThread.join();
      } catch (InterruptedException ie) {
        LOG.warn("Interrupted Exception while stopping", ie);
      }
    }

    // stop all the components
    super.serviceStop();
  }

  @SuppressWarnings("unchecked")
  protected void dispatch(Event event) {
    //all events go thru this loop
    if (LOG.isDebugEnabled()) {
      LOG.debug("Dispatching the event " + event.getClass().getName() + "."
          + event.toString());
    }

    Class<? extends Enum> type = event.getType().getDeclaringClass();

    try{
      EventHandler handler = eventDispatchers.get(type);
      if(handler != null) {
        handler.handle(event);
      } else {
        throw new Exception("No handler for registered for " + type);
      }
    } catch (Throwable t) {
      //TODO Maybe log the state of the queue
      LOG.fatal("Error in dispatcher thread", t);
      // If serviceStop is called, we should exit this thread gracefully.
      if (exitOnDispatchException
          && (ShutdownHookManager.get().isShutdownInProgress()) == false
          && stopped == false) {
        Thread shutDownThread = new Thread(createShutDownThread());
        shutDownThread.setName("AsyncDispatcher ShutDown handler");
        shutDownThread.start();
      }
    }
  }

  @SuppressWarnings("unchecked")
  @Override
  public void register(Class<? extends Enum> eventType,
      EventHandler handler) {
    /* check to see if we have a listener registered */
    EventHandler<Event> registeredHandler = (EventHandler<Event>)
    eventDispatchers.get(eventType);
    LOG.info("Registering " + eventType + " for " + handler.getClass());
    if (registeredHandler == null) {
      eventDispatchers.put(eventType, handler);
    } else if (!(registeredHandler instanceof MultiListenerHandler)){
      /* for multiple listeners of an event add the multiple listener handler */
      MultiListenerHandler multiHandler = new MultiListenerHandler();
      multiHandler.addHandler(registeredHandler);
      multiHandler.addHandler(handler);
      eventDispatchers.put(eventType, multiHandler);
    } else {
      /* already a multilistener, just add to it */
      MultiListenerHandler multiHandler
      = (MultiListenerHandler) registeredHandler;
      multiHandler.addHandler(handler);
    }
  }

  @Override
  public EventHandler getEventHandler() {
    return handlerInstance;
  }

  class GenericEventHandler implements EventHandler<Event> {
    public void handle(Event event) {
      if (blockNewEvents) {
        return;
      }
      drained = false;

      /* all this method does is enqueue all the events onto the queue */
      int qSize = eventQueue.size();
      if (qSize != 0 && qSize % 1000 == 0
          && lastEventQueueSizeLogged != qSize) {
        lastEventQueueSizeLogged = qSize;
        LOG.info("Size of event-queue is " + qSize);
      }
      int remCapacity = eventQueue.remainingCapacity();
      if (remCapacity < 1000) {
        LOG.warn("Very low remaining capacity in the event-queue: "
            + remCapacity);
      }
      try {
        eventQueue.put(event);
      } catch (InterruptedException e) {
        if (!stopped) {
          LOG.warn("AsyncDispatcher thread interrupted", e);
        }
        // Need to reset drained flag to true if event queue is empty,
        // otherwise dispatcher will hang on stop.
        drained = eventQueue.isEmpty();
        throw new YarnRuntimeException(e);
      }
    };
  }

  /**
   * Multiplexing an event. Sending it to different handlers that
   * are interested in the event.
   * @param <T> the type of event these multiple handlers are interested in.
   */
  static class MultiListenerHandler implements EventHandler<Event> {
    List<EventHandler<Event>> listofHandlers;

    public MultiListenerHandler() {
      listofHandlers = new ArrayList<EventHandler<Event>>();
    }

    @Override
    public void handle(Event event) {
      for (EventHandler<Event> handler: listofHandlers) {
        handler.handle(event);
      }
    }

    void addHandler(EventHandler<Event> handler) {
      listofHandlers.add(handler);
    }

  }

  Runnable createShutDownThread() {
    return new Runnable() {
      @Override
      public void run() {
        LOG.info("Exiting, bbye..");
        System.exit(-1);
      }
    };
  }

  @VisibleForTesting
  protected boolean isEventThreadWaiting() {
    return eventHandlingThread.getState() == Thread.State.WAITING;
  }

  @VisibleForTesting
  protected boolean isDrained() {
    return this.drained;
  }
}
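AsyncDispatcher.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
In GenericEventHandler, handle() adds the event to the queue via eventQueue.put(event); this is the producer side. So how is this GenericEventHandler obtained? The getEventHandler() method gives the answer: it returns handlerInstance, which is initialized to new GenericEventHandler().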
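The producer/consumer split described above can be illustrated with a much smaller class. This is only a sketch under stated assumptions: MiniDispatcher, its Handler interface and the use of plain strings as events are hypothetical names invented for illustration and are not part of Hadoop; only BlockingQueue and LinkedBlockingQueue are real JDK classes.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, stripped-down stand-in for AsyncDispatcher; not Hadoop code. */
public class MiniDispatcher {

    /** Simplified counterpart of EventHandler; events are plain strings here. */
    public interface Handler {
        void handle(String event);
    }

    private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
    private final Thread worker;

    // Created once up front, in the same spirit as handlerInstance in AsyncDispatcher.
    private final Handler handlerInstance = new Handler() {
        @Override
        public void handle(String event) {
            try {
                eventQueue.put(event); // producer side: enqueue and return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    };

    public MiniDispatcher(Handler consumer) {
        this.worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    consumer.handle(eventQueue.take()); // consumer side
                } catch (InterruptedException e) {
                    return; // stop() interrupted the blocking take()
                }
            }
        }, "mini-dispatcher");
        this.worker.setDaemon(true);
    }

    /** Callers only ever see the enqueueing handler, never the queue itself. */
    public Handler getEventHandler() {
        return handlerInstance;
    }

    public void start() {
        worker.start();
    }

    public void stop() throws InterruptedException {
        worker.interrupt();
        worker.join();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        MiniDispatcher dispatcher = new MiniDispatcher(e -> {
            System.out.println("consumed " + e);
            done.countDown();
        });
        dispatcher.start();
        dispatcher.getEventHandler().handle("START");
        done.await();
        dispatcher.stop();
    }
}
```

The point of the design is that the caller of handle() returns as soon as the event is queued; the actual processing happens later on the single worker thread.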
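For more on AsyncDispatcher, see
Summary:
For the ApplicationMaster startup flow, see
RMStateStore is the abstract base class for storing ResourceManager state; a concrete store must implement the store and load methods. For more on RMStateStore, see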
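Some articles say at this point: "The beginning of the article mentioned the event dispatcher. On the ResourceManager side an AsyncDispatcher dispatches all events, and here ApplicationEventDispatcher drives the transition methods of RMAppImpl; look at the events and transitions set up when the RMAppImpl class is initialized." That explanation is not very clear.
Other articles turn to ResourceManager instead, because this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)) triggers the app start event: it adds an RMAppEvent whose type is the enum value RMAppEventType.START to the asynchronous dispatcher, and the RM registers internally which handler processes events of that type.
Either way, both routes lead to the same place.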
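The registration step can be sketched as follows: handlers are stored under the enum class of the event type, and dispatch looks the handler up with getDeclaringClass(), which is what AsyncDispatcher.dispatch() does with event.getType(). This is a simplified illustration, not Hadoop code; EventRouting, its Handler interface and the three enum types are hypothetical names, and events are reduced to their bare enum constants.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of routing by event-type enum class; not Hadoop code. */
public class EventRouting {

    // Illustrative event types; the real ones live in the ResourceManager packages.
    public enum AppEventType { START, KILL }
    public enum NodeEventType { STARTED }
    public enum SchedulerEventType { APP_ADDED }

    /** Simplified handler that receives only the event type. */
    public interface Handler {
        void handle(Enum<?> type);
    }

    // One handler per enum class, keyed the same way as eventDispatchers.
    private final Map<Class<?>, Handler> eventDispatchers = new HashMap<>();

    public void register(Class<? extends Enum<?>> eventType, Handler handler) {
        eventDispatchers.put(eventType, handler);
    }

    public void dispatch(Enum<?> type) {
        // START and KILL both map to AppEventType.class, so one handler serves both.
        Handler handler = eventDispatchers.get(type.getDeclaringClass());
        if (handler == null) {
            throw new IllegalStateException(
                "No handler registered for " + type.getDeclaringClass());
        }
        handler.handle(type);
    }

    public static void main(String[] args) {
        EventRouting routing = new EventRouting();
        routing.register(AppEventType.class, t -> System.out.println("app handler got " + t));
        routing.register(NodeEventType.class, t -> System.out.println("node handler got " + t));
        routing.dispatch(AppEventType.START);
        routing.dispatch(NodeEventType.STARTED);
    }
}
```

Keying on the enum class rather than on each constant is what lets the RM register one dispatcher object per event family.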
public enum RMAppEventType {
  // Source: ClientRMService
  START,
  RECOVER,
  KILL,
  MOVE, // Move app to a new queue

  // Source: Scheduler and RMAppManager
  APP_REJECTED,

  // Source: Scheduler
  APP_ACCEPTED,

  // Source: RMAppAttempt
  ATTEMPT_REGISTERED,
  ATTEMPT_UNREGISTERED,
  ATTEMPT_FINISHED, // Will send the final state
  ATTEMPT_FAILED,
  ATTEMPT_KILLED,
  NODE_UPDATE,

  // Source: Container and ResourceTracker
  APP_RUNNING_ON_NODE,

  // Source: RMStateStore
  APP_NEW_SAVED,
  APP_UPDATE_SAVED,
}
RMAppEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java
public enum RMAppState {
  NEW,
  NEW_SAVING,
  SUBMITTED,
  ACCEPTED,
  RUNNING,
  FINAL_SAVING,
  FINISHING,
  FINISHED,
  FAILED,
  KILLING,
  KILLED
}
RMAppState.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppState.java
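How the event types and states fit together can be shown with a small table-driven state machine. This is only a sketch: AppStateSketch is a hypothetical class, it covers just the happy path from NEW to RUNNING, and the four arcs reflect my reading of the 2.7.x state machine, so check them against the StateMachineFactory definition in RMAppImpl before relying on them.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical happy-path sketch of the RMApp state machine; not Hadoop code. */
public class AppStateSketch {

    public enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING }
    public enum EventType { START, APP_NEW_SAVED, APP_ACCEPTED, ATTEMPT_REGISTERED }

    // Transition table: current state -> (event type -> next state).
    private static final Map<State, Map<EventType, State>> TABLE =
        new EnumMap<State, Map<EventType, State>>(State.class);

    private static void arc(State from, EventType on, State to) {
        Map<EventType, State> arcs = TABLE.get(from);
        if (arcs == null) {
            arcs = new EnumMap<EventType, State>(EventType.class);
            TABLE.put(from, arcs);
        }
        arcs.put(on, to);
    }

    static {
        // Assumed arcs; verify against RMAppImpl before relying on them.
        arc(State.NEW, EventType.START, State.NEW_SAVING);
        arc(State.NEW_SAVING, EventType.APP_NEW_SAVED, State.SUBMITTED);
        arc(State.SUBMITTED, EventType.APP_ACCEPTED, State.ACCEPTED);
        arc(State.ACCEPTED, EventType.ATTEMPT_REGISTERED, State.RUNNING);
    }

    private State current = State.NEW;

    public State getState() {
        return current;
    }

    /** Applies one event; an event with no arc leaves the state unchanged. */
    public void handle(EventType type) {
        Map<EventType, State> arcs = TABLE.get(current);
        State next = (arcs == null) ? null : arcs.get(type);
        if (next == null) {
            // RMAppImpl.handle() similarly logs an error and keeps the old state.
            System.err.println("Can't handle " + type + " at state " + current);
            return;
        }
        current = next;
    }

    public static void main(String[] args) {
        AppStateSketch app = new AppStateSketch();
        for (EventType type : EventType.values()) {
            app.handle(type);
            System.out.println(type + " -> " + app.getState());
        }
    }
}
```

In the real class each arc also carries a transition object (for example RMAppNewlySavingTransition) that performs side effects; the sketch keeps only the state change.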
Inside ResourceManager:
/**
 * The ResourceManager is the main class that is a set of components.
 * "I am the ResourceManager. All your resources belong to us..."
 *
 */
@SuppressWarnings("unchecked")
public class ResourceManager extends CompositeService implements Recoverable {

  /**
   * Priority of the ResourceManager shutdown hook.
   */
  public static final int SHUTDOWN_HOOK_PRIORITY = 30;

  private static final Log LOG = LogFactory.getLog(ResourceManager.class);
  private static long clusterTimeStamp = System.currentTimeMillis();

  /**
   * "Always On" services. Services that need to run always irrespective of
   * the HA state of the RM.
   */
  @VisibleForTesting
  protected RMContextImpl rmContext;
  private Dispatcher rmDispatcher;
  @VisibleForTesting
  protected AdminService adminService;

  /**
   * "Active" services. Services that need to run only on the Active RM.
   * These services are managed (initialized, started, stopped) by the
   * {@link CompositeService} RMActiveServices.
   *
   * RM is active when (1) HA is disabled, or (2) HA is enabled and the RM is
   * in Active state.
   */
  protected RMActiveServices activeServices;
  protected RMSecretManagerService rmSecretManagerService;

  protected ResourceScheduler scheduler;
  protected ReservationSystem reservationSystem;
  private ClientRMService clientRM;
  protected ApplicationMasterService masterService;
  protected NMLivelinessMonitor nmLivelinessMonitor;
  protected NodesListManager nodesListManager;
  protected RMAppManager rmAppManager;
  protected ApplicationACLsManager applicationACLsManager;
  protected QueueACLsManager queueACLsManager;
  private WebApp webApp;
  private AppReportFetcher fetcher = null;
  protected ResourceTrackerService resourceTracker;

  @VisibleForTesting
  protected String webAppAddress;
  private ConfigurationProvider configurationProvider = null;
  /** End of Active services */

  private Configuration conf;

  private UserGroupInformation rmLoginUGI;

  public ResourceManager() {
    super("ResourceManager");
  }

  public RMContext getRMContext() {
    return this.rmContext;
  }

  public static long getClusterTimeStamp() {
    return clusterTimeStamp;
  }

  @VisibleForTesting
  protected static void setClusterTimeStamp(long timestamp) {
    clusterTimeStamp = timestamp;
  }

  @VisibleForTesting
  Dispatcher getRmDispatcher() {
    return rmDispatcher;
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    this.conf = conf;
    this.rmContext = new RMContextImpl();

    this.configurationProvider =
        ConfigurationProviderFactory.getConfigurationProvider(conf);
    this.configurationProvider.init(this.conf);
    rmContext.setConfigurationProvider(configurationProvider);

    // load core-site.xml
    InputStream coreSiteXMLInputStream =
        this.configurationProvider.getConfigurationInputStream(this.conf,
            YarnConfiguration.CORE_SITE_CONFIGURATION_FILE);
    if (coreSiteXMLInputStream != null) {
      this.conf.addResource(coreSiteXMLInputStream);
    }

    // Do refreshUserToGroupsMappings with loaded core-site.xml
    Groups.getUserToGroupsMappingServiceWithLoadedConfiguration(this.conf)
        .refresh();

    // Do refreshSuperUserGroupsConfiguration with loaded core-site.xml
    // Or use RM specific configurations to overwrite the common ones first
    // if they exist
    RMServerUtils.processRMProxyUsersConf(conf);
    ProxyUsers.refreshSuperUserGroupsConfiguration(this.conf);

    // load yarn-site.xml
    InputStream yarnSiteXMLInputStream =
        this.configurationProvider.getConfigurationInputStream(this.conf,
            YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
    if (yarnSiteXMLInputStream != null) {
      this.conf.addResource(yarnSiteXMLInputStream);
    }

    // Validate the configuration: yarn.resourcemanager.am.max-attempts must be
    // positive, and expireIntvl >= heartbeatIntvl
    validateConfigs(this.conf);

    // Set HA configuration should be done before login
    this.rmContext.setHAEnabled(HAUtil.isHAEnabled(this.conf));
    if (this.rmContext.isHAEnabled()) {
      HAUtil.verifyAndSetConfiguration(this.conf);
    }

    // Set UGI and do login
    // If security is enabled, use login user
    // If security is not enabled, use current user
    this.rmLoginUGI = UserGroupInformation.getCurrentUser();
    try {
      doSecureLogin();
    } catch(IOException ie) {
      throw new YarnRuntimeException("Failed to login", ie);
    }

    // register the handlers for all AlwaysOn services using setupDispatcher().
    rmDispatcher = setupDispatcher();
    addIfService(rmDispatcher);
    rmContext.setDispatcher(rmDispatcher);

    adminService = createAdminService();
    addService(adminService);
    rmContext.setRMAdminService(adminService);

    rmContext.setYarnConfiguration(conf);

    // Create and initialize RMActiveServices, an inner class of ResourceManager
    createAndInitActiveServices();

    webAppAddress = WebAppUtils.getWebAppBindURL(this.conf,
                      YarnConfiguration.RM_BIND_HOST,
                      WebAppUtils.getRMWebAppURLWithoutScheme(this.conf));

    RMApplicationHistoryWriter rmApplicationHistoryWriter =
        createRMApplicationHistoryWriter();
    addService(rmApplicationHistoryWriter);
    rmContext.setRMApplicationHistoryWriter(rmApplicationHistoryWriter);

    SystemMetricsPublisher systemMetricsPublisher = createSystemMetricsPublisher();
    addService(systemMetricsPublisher);
    rmContext.setSystemMetricsPublisher(systemMetricsPublisher);

    super.serviceInit(this.conf);
  }

  protected QueueACLsManager createQueueACLsManager(ResourceScheduler scheduler,
      Configuration conf) {
    return new QueueACLsManager(scheduler, conf);
  }

  @VisibleForTesting
  protected void setRMStateStore(RMStateStore rmStore) {
    rmStore.setRMDispatcher(rmDispatcher);
    rmStore.setResourceManager(this);
    rmContext.setStateStore(rmStore);
  }

  protected EventHandler<SchedulerEvent> createSchedulerEventDispatcher() {
    return new SchedulerEventDispatcher(this.scheduler);
  }

  protected Dispatcher createDispatcher() {
    return new AsyncDispatcher();
  }

  protected ResourceScheduler createScheduler() {
    String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER,
        YarnConfiguration.DEFAULT_RM_SCHEDULER);
    LOG.info("Using Scheduler: " + schedulerClassName);
    try {
      Class<?> schedulerClazz = Class.forName(schedulerClassName);
      if (ResourceScheduler.class.isAssignableFrom(schedulerClazz)) {
        return
(ResourceScheduler) ReflectionUtils.newInstance(schedulerClazz, 195 this.conf); 196 } else { 197 throw new YarnRuntimeException("Class: " + schedulerClassName 198 + " not instance of " + ResourceScheduler.class.getCanonicalName()); 199 } 200 } catch (ClassNotFoundException e) { 201 throw new YarnRuntimeException("Could not instantiate Scheduler: " 202 + schedulerClassName, e); 203 } 204 } 205 206 protected ReservationSystem createReservationSystem() { 207 String reservationClassName = 208 conf.get(YarnConfiguration.RM_RESERVATION_SYSTEM_CLASS, 209 AbstractReservationSystem.getDefaultReservationSystem(scheduler)); 210 if (reservationClassName == null) { 211 return null; 212 } 213 LOG.info("Using ReservationSystem: " + reservationClassName); 214 try { 215 Class<?> reservationClazz = Class.forName(reservationClassName); 216 if (ReservationSystem.class.isAssignableFrom(reservationClazz)) { 217 return (ReservationSystem) ReflectionUtils.newInstance( 218 reservationClazz, this.conf); 219 } else { 220 throw new YarnRuntimeException("Class: " + reservationClassName 221 + " not instance of " + ReservationSystem.class.getCanonicalName()); 222 } 223 } catch (ClassNotFoundException e) { 224 throw new YarnRuntimeException( 225 "Could not instantiate ReservationSystem: " + reservationClassName, e); 226 } 227 } 228 229 protected ApplicationMasterLauncher createAMLauncher() { 230 return new ApplicationMasterLauncher(this.rmContext); 231 } 232 233 private NMLivelinessMonitor createNMLivelinessMonitor() { 234 return new NMLivelinessMonitor(this.rmContext 235 .getDispatcher()); 236 } 237 238 protected AMLivelinessMonitor createAMLivelinessMonitor() { 239 return new AMLivelinessMonitor(this.rmDispatcher); 240 } 241 242 protected RMNodeLabelsManager createNodeLabelManager() 243 throws InstantiationException, IllegalAccessException { 244 return new RMNodeLabelsManager(); 245 } 246 247 protected DelegationTokenRenewer createDelegationTokenRenewer() { 248 return new 
DelegationTokenRenewer(); 249 } 250 251 protected RMAppManager createRMAppManager() { 252 return new RMAppManager(this.rmContext, this.scheduler, this.masterService, 253 this.applicationACLsManager, this.conf); 254 } 255 256 protected RMApplicationHistoryWriter createRMApplicationHistoryWriter() { 257 return new RMApplicationHistoryWriter(); 258 } 259 260 protected SystemMetricsPublisher createSystemMetricsPublisher() { 261 return new SystemMetricsPublisher(); 262 } 263 264 // sanity check for configurations 265 protected static void validateConfigs(Configuration conf) { 266 // validate max-attempts 267 int globalMaxAppAttempts = 268 conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS, 269 YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS); 270 if (globalMaxAppAttempts <= 0) { 271 throw new YarnRuntimeException("Invalid global max attempts configuration" 272 + ", " + YarnConfiguration.RM_AM_MAX_ATTEMPTS 273 + "=" + globalMaxAppAttempts + ", it should be a positive integer."); 274 } 275 276 // validate expireIntvl >= heartbeatIntvl 277 long expireIntvl = conf.getLong(YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS, 278 YarnConfiguration.DEFAULT_RM_NM_EXPIRY_INTERVAL_MS); 279 long heartbeatIntvl = 280 conf.getLong(YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS, 281 YarnConfiguration.DEFAULT_RM_NM_HEARTBEAT_INTERVAL_MS); 282 if (expireIntvl < heartbeatIntvl) { 283 throw new YarnRuntimeException("Nodemanager expiry interval should be no" 284 + " less than heartbeat interval, " 285 + YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS + "=" + expireIntvl 286 + ", " + YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS + "=" 287 + heartbeatIntvl); 288 } 289 } 290 291 /** 292 * RMActiveServices handles all the Active services in the RM. 
293 */ 294 @Private 295 public class RMActiveServices extends CompositeService { 296 297 private DelegationTokenRenewer delegationTokenRenewer; 298 private EventHandler<SchedulerEvent> schedulerDispatcher; 299 private ApplicationMasterLauncher applicationMasterLauncher; 300 private ContainerAllocationExpirer containerAllocationExpirer; 301 private ResourceManager rm; 302 private boolean recoveryEnabled; 303 private RMActiveServiceContext activeServiceContext; 304 305 RMActiveServices(ResourceManager rm) { 306 super("RMActiveServices"); 307 this.rm = rm; 308 } 309 310 @Override 311 protected void serviceInit(Configuration configuration) throws Exception { 312 activeServiceContext = new RMActiveServiceContext(); 313 rmContext.setActiveServiceContext(activeServiceContext); 314 315 conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true); 316 rmSecretManagerService = createRMSecretManagerService(); 317 addService(rmSecretManagerService); 318 319 containerAllocationExpirer = new ContainerAllocationExpirer(rmDispatcher); 320 addService(containerAllocationExpirer); 321 rmContext.setContainerAllocationExpirer(containerAllocationExpirer); 322 323 AMLivelinessMonitor amLivelinessMonitor = createAMLivelinessMonitor(); 324 addService(amLivelinessMonitor); 325 rmContext.setAMLivelinessMonitor(amLivelinessMonitor); 326 327 AMLivelinessMonitor amFinishingMonitor = createAMLivelinessMonitor(); 328 addService(amFinishingMonitor); 329 rmContext.setAMFinishingMonitor(amFinishingMonitor); 330 331 RMNodeLabelsManager nlm = createNodeLabelManager(); 332 nlm.setRMContext(rmContext); 333 addService(nlm); 334 rmContext.setNodeLabelManager(nlm); 335 336 boolean isRecoveryEnabled = conf.getBoolean( 337 YarnConfiguration.RECOVERY_ENABLED, 338 YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED); 339 340 RMStateStore rmStore = null; 341 if (isRecoveryEnabled) { 342 recoveryEnabled = true; 343 rmStore = RMStateStoreFactory.getStore(conf); 344 boolean isWorkPreservingRecoveryEnabled = 345 
            conf.getBoolean(
              YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
              YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);
        rmContext
            .setWorkPreservingRecoveryEnabled(isWorkPreservingRecoveryEnabled);
      } else {
        recoveryEnabled = false;
        rmStore = new NullRMStateStore();
      }

      try {
        rmStore.init(conf);
        rmStore.setRMDispatcher(rmDispatcher);
        rmStore.setResourceManager(rm);
      } catch (Exception e) {
        // the Exception from stateStore.init() needs to be handled for
        // HA and we need to give up master status if we got fenced
        LOG.error("Failed to init state store", e);
        throw e;
      }
      rmContext.setStateStore(rmStore);

      if (UserGroupInformation.isSecurityEnabled()) {
        delegationTokenRenewer = createDelegationTokenRenewer();
        rmContext.setDelegationTokenRenewer(delegationTokenRenewer);
      }

      // Register event handler for NodesListManager
      nodesListManager = new NodesListManager(rmContext);
      rmDispatcher.register(NodesListManagerEventType.class, nodesListManager);
      addService(nodesListManager);
      rmContext.setNodesListManager(nodesListManager);

      // Initialize the scheduler
      scheduler = createScheduler();
      scheduler.setRMContext(rmContext);
      addIfService(scheduler);
      rmContext.setScheduler(scheduler);

      schedulerDispatcher = createSchedulerEventDispatcher();
      addIfService(schedulerDispatcher);
      rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

      // Register event handler for RmAppEvents
      // RMAppManager posts an RMAppEvent of type RMAppEventType.START to the
      // async dispatcher, so that event is handled by the
      // ApplicationEventDispatcher(rmContext) registered here
      rmDispatcher.register(RMAppEventType.class,
          new ApplicationEventDispatcher(rmContext));

      // Register event handler for RmAppAttemptEvents
      rmDispatcher.register(RMAppAttemptEventType.class,
          new ApplicationAttemptEventDispatcher(rmContext));

      // Register event handler
for RmNodes 399 rmDispatcher.register( 400 RMNodeEventType.class, new NodeEventDispatcher(rmContext)); 401 402 nmLivelinessMonitor = createNMLivelinessMonitor(); 403 addService(nmLivelinessMonitor); 404 405 resourceTracker = createResourceTrackerService(); 406 addService(resourceTracker); 407 rmContext.setResourceTrackerService(resourceTracker); 408 409 DefaultMetricsSystem.initialize("ResourceManager"); 410 JvmMetrics.initSingleton("ResourceManager", null); 411 412 // Initialize the Reservation system 413 if (conf.getBoolean(YarnConfiguration.RM_RESERVATION_SYSTEM_ENABLE, 414 YarnConfiguration.DEFAULT_RM_RESERVATION_SYSTEM_ENABLE)) { 415 reservationSystem = createReservationSystem(); 416 if (reservationSystem != null) { 417 reservationSystem.setRMContext(rmContext); 418 addIfService(reservationSystem); 419 rmContext.setReservationSystem(reservationSystem); 420 LOG.info("Initialized Reservation system"); 421 } 422 } 423 424 // creating monitors that handle preemption 425 createPolicyMonitors(); 426 427 masterService = createApplicationMasterService(); 428 addService(masterService) ; 429 rmContext.setApplicationMasterService(masterService); 430 431 applicationACLsManager = new ApplicationACLsManager(conf); 432 433 queueACLsManager = createQueueACLsManager(scheduler, conf); 434 435 rmAppManager = createRMAppManager(); 436 // Register event handler for RMAppManagerEvents 437 rmDispatcher.register(RMAppManagerEventType.class, rmAppManager); 438 439 clientRM = createClientRMService(); 440 addService(clientRM); 441 rmContext.setClientRMService(clientRM); 442 443 applicationMasterLauncher = createAMLauncher(); 444 rmDispatcher.register(AMLauncherEventType.class, 445 applicationMasterLauncher); 446 447 addService(applicationMasterLauncher); 448 if (UserGroupInformation.isSecurityEnabled()) { 449 addService(delegationTokenRenewer); 450 delegationTokenRenewer.setRMContext(rmContext); 451 } 452 453 new RMNMInfo(rmContext, scheduler); 454 455 super.serviceInit(conf); 456 } 457 
458 @Override 459 protected void serviceStart() throws Exception { 460 RMStateStore rmStore = rmContext.getStateStore(); 461 // The state store needs to start irrespective of recoveryEnabled as apps 462 // need events to move to further states. 463 rmStore.start(); 464 465 if(recoveryEnabled) { 466 try { 467 LOG.info("Recovery started"); 468 rmStore.checkVersion(); 469 if (rmContext.isWorkPreservingRecoveryEnabled()) { 470 rmContext.setEpoch(rmStore.getAndIncrementEpoch()); 471 } 472 RMState state = rmStore.loadState(); 473 recover(state); 474 LOG.info("Recovery ended"); 475 } catch (Exception e) { 476 // the Exception from loadState() needs to be handled for 477 // HA and we need to give up master status if we got fenced 478 LOG.error("Failed to load/recover state", e); 479 throw e; 480 } 481 } 482 483 super.serviceStart(); 484 } 485 486 @Override 487 protected void serviceStop() throws Exception { 488 489 super.serviceStop(); 490 DefaultMetricsSystem.shutdown(); 491 if (rmContext != null) { 492 RMStateStore store = rmContext.getStateStore(); 493 try { 494 store.close(); 495 } catch (Exception e) { 496 LOG.error("Error closing store.", e); 497 } 498 } 499 500 } 501 502 protected void createPolicyMonitors() { 503 if (scheduler instanceof PreemptableResourceScheduler 504 && conf.getBoolean(YarnConfiguration.RM_SCHEDULER_ENABLE_MONITORS, 505 YarnConfiguration.DEFAULT_RM_SCHEDULER_ENABLE_MONITORS)) { 506 LOG.info("Loading policy monitors"); 507 List<SchedulingEditPolicy> policies = conf.getInstances( 508 YarnConfiguration.RM_SCHEDULER_MONITOR_POLICIES, 509 SchedulingEditPolicy.class); 510 if (policies.size() > 0) { 511 for (SchedulingEditPolicy policy : policies) { 512 LOG.info("LOADING SchedulingEditPolicy:" + policy.getPolicyName()); 513 // periodically check whether we need to take action to guarantee 514 // constraints 515 SchedulingMonitor mon = new SchedulingMonitor(rmContext, policy); 516 addService(mon); 517 } 518 } else { 519 LOG.warn("Policy monitors 
configured (" + 520 YarnConfiguration.RM_SCHEDULER_ENABLE_MONITORS + 521 ") but none specified (" + 522 YarnConfiguration.RM_SCHEDULER_MONITOR_POLICIES + ")"); 523 } 524 } 525 } 526 } 527 528 @Private 529 public static class SchedulerEventDispatcher extends AbstractService 530 implements EventHandler<SchedulerEvent> { 531 532 private final ResourceScheduler scheduler; 533 private final BlockingQueue<SchedulerEvent> eventQueue = 534 new LinkedBlockingQueue<SchedulerEvent>(); 535 private volatile int lastEventQueueSizeLogged = 0; 536 private final Thread eventProcessor; 537 private volatile boolean stopped = false; 538 private boolean shouldExitOnError = false; 539 540 public SchedulerEventDispatcher(ResourceScheduler scheduler) { 541 super(SchedulerEventDispatcher.class.getName()); 542 this.scheduler = scheduler; 543 this.eventProcessor = new Thread(new EventProcessor()); 544 this.eventProcessor.setName("ResourceManager Event Processor"); 545 } 546 547 @Override 548 protected void serviceInit(Configuration conf) throws Exception { 549 this.shouldExitOnError = 550 conf.getBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, 551 Dispatcher.DEFAULT_DISPATCHER_EXIT_ON_ERROR); 552 super.serviceInit(conf); 553 } 554 555 @Override 556 protected void serviceStart() throws Exception { 557 this.eventProcessor.start(); 558 super.serviceStart(); 559 } 560 561 private final class EventProcessor implements Runnable { 562 @Override 563 public void run() { 564 565 SchedulerEvent event; 566 567 while (!stopped && !Thread.currentThread().isInterrupted()) { 568 try { 569 event = eventQueue.take(); 570 } catch (InterruptedException e) { 571 LOG.error("Returning, interrupted : " + e); 572 return; // TODO: Kill RM. 573 } 574 575 try { 576 scheduler.handle(event); 577 } catch (Throwable t) { 578 // An error occurred, but we are shutting down anyway. 579 // If it was an InterruptedException, the very act of 580 // shutdown could have caused it and is probably harmless. 
581 if (stopped) { 582 LOG.warn("Exception during shutdown: ", t); 583 break; 584 } 585 LOG.fatal("Error in handling event type " + event.getType() 586 + " to the scheduler", t); 587 if (shouldExitOnError 588 && !ShutdownHookManager.get().isShutdownInProgress()) { 589 LOG.info("Exiting, bbye.."); 590 System.exit(-1); 591 } 592 } 593 } 594 } 595 } 596 597 @Override 598 protected void serviceStop() throws Exception { 599 this.stopped = true; 600 this.eventProcessor.interrupt(); 601 try { 602 this.eventProcessor.join(); 603 } catch (InterruptedException e) { 604 throw new YarnRuntimeException(e); 605 } 606 super.serviceStop(); 607 } 608 609 @Override 610 public void handle(SchedulerEvent event) { 611 try { 612 int qSize = eventQueue.size(); 613 if (qSize != 0 && qSize % 1000 == 0 614 && lastEventQueueSizeLogged != qSize) { 615 lastEventQueueSizeLogged = qSize; 616 LOG.info("Size of scheduler event-queue is " + qSize); 617 } 618 int remCapacity = eventQueue.remainingCapacity(); 619 if (remCapacity < 1000) { 620 LOG.info("Very low remaining capacity on scheduler event queue: " 621 + remCapacity); 622 } 623 this.eventQueue.put(event); 624 } catch (InterruptedException e) { 625 LOG.info("Interrupted. Trying to exit gracefully."); 626 } 627 } 628 } 629 630 @Private 631 public static class RMFatalEventDispatcher 632 implements EventHandler<RMFatalEvent> { 633 634 @Override 635 public void handle(RMFatalEvent event) { 636 LOG.fatal("Received a " + RMFatalEvent.class.getName() + " of type " + 637 event.getType().name() + ". 
Cause:\n" + event.getCause()); 638 639 ExitUtil.terminate(1, event.getCause()); 640 } 641 } 642 643 public void handleTransitionToStandBy() { 644 if (rmContext.isHAEnabled()) { 645 try { 646 // Transition to standby and reinit active services 647 LOG.info("Transitioning RM to Standby mode"); 648 transitionToStandby(true); 649 adminService.resetLeaderElection(); 650 return; 651 } catch (Exception e) { 652 LOG.fatal("Failed to transition RM to Standby mode."); 653 ExitUtil.terminate(1, e); 654 } 655 } 656 } 657 658 @Private 659 public static final class ApplicationEventDispatcher implements 660 EventHandler<RMAppEvent> { 661 662 private final RMContext rmContext; 663 664 public ApplicationEventDispatcher(RMContext rmContext) { 665 this.rmContext = rmContext; 666 } 667 668 @Override 669 public void handle(RMAppEvent event) { 670 ApplicationId appID = event.getApplicationId(); 671 RMApp rmApp = this.rmContext.getRMApps().get(appID); 672 if (rmApp != null) { 673 try { 674 // 675 rmApp.handle(event); 676 } catch (Throwable t) { 677 LOG.error("Error in handling event type " + event.getType() 678 + " for application " + appID, t); 679 } 680 } 681 } 682 } 683 684 @Private 685 public static final class ApplicationAttemptEventDispatcher implements 686 EventHandler<RMAppAttemptEvent> { 687 688 private final RMContext rmContext; 689 690 public ApplicationAttemptEventDispatcher(RMContext rmContext) { 691 this.rmContext = rmContext; 692 } 693 694 @Override 695 public void handle(RMAppAttemptEvent event) { 696 ApplicationAttemptId appAttemptID = event.getApplicationAttemptId(); 697 ApplicationId appAttemptId = appAttemptID.getApplicationId(); 698 RMApp rmApp = this.rmContext.getRMApps().get(appAttemptId); 699 if (rmApp != null) { 700 RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptID); 701 if (rmAppAttempt != null) { 702 try { 703 rmAppAttempt.handle(event); 704 } catch (Throwable t) { 705 LOG.error("Error in handling event type " + event.getType() 706 + " for 
applicationAttempt " + appAttemptId, t); 707 } 708 } 709 } 710 } 711 } 712 713 @Private 714 public static final class NodeEventDispatcher implements 715 EventHandler<RMNodeEvent> { 716 717 private final RMContext rmContext; 718 719 public NodeEventDispatcher(RMContext rmContext) { 720 this.rmContext = rmContext; 721 } 722 723 @Override 724 public void handle(RMNodeEvent event) { 725 NodeId nodeId = event.getNodeId(); 726 RMNode node = this.rmContext.getRMNodes().get(nodeId); 727 if (node != null) { 728 try { 729 ((EventHandler<RMNodeEvent>) node).handle(event); 730 } catch (Throwable t) { 731 LOG.error("Error in handling event type " + event.getType() 732 + " for node " + nodeId, t); 733 } 734 } 735 } 736 } 737 738 protected void startWepApp() { 739 740 // Use the customized yarn filter instead of the standard kerberos filter to 741 // allow users to authenticate using delegation tokens 742 // 4 conditions need to be satisfied - 743 // 1. security is enabled 744 // 2. http auth type is set to kerberos 745 // 3. "yarn.resourcemanager.webapp.use-yarn-filter" override is set to true 746 // 4. 
hadoop.http.filter.initializers container AuthenticationFilterInitializer 747 748 Configuration conf = getConfig(); 749 boolean enableCorsFilter = 750 conf.getBoolean(YarnConfiguration.RM_WEBAPP_ENABLE_CORS_FILTER, 751 YarnConfiguration.DEFAULT_RM_WEBAPP_ENABLE_CORS_FILTER); 752 boolean useYarnAuthenticationFilter = 753 conf.getBoolean( 754 YarnConfiguration.RM_WEBAPP_DELEGATION_TOKEN_AUTH_FILTER, 755 YarnConfiguration.DEFAULT_RM_WEBAPP_DELEGATION_TOKEN_AUTH_FILTER); 756 String authPrefix = "hadoop.http.authentication."; 757 String authTypeKey = authPrefix + "type"; 758 String filterInitializerConfKey = "hadoop.http.filter.initializers"; 759 String actualInitializers = ""; 760 Class<?>[] initializersClasses = 761 conf.getClasses(filterInitializerConfKey); 762 763 // setup CORS 764 if (enableCorsFilter) { 765 conf.setBoolean(HttpCrossOriginFilterInitializer.PREFIX 766 + HttpCrossOriginFilterInitializer.ENABLED_SUFFIX, true); 767 } 768 769 boolean hasHadoopAuthFilterInitializer = false; 770 boolean hasRMAuthFilterInitializer = false; 771 if (initializersClasses != null) { 772 for (Class<?> initializer : initializersClasses) { 773 if (initializer.getName().equals( 774 AuthenticationFilterInitializer.class.getName())) { 775 hasHadoopAuthFilterInitializer = true; 776 } 777 if (initializer.getName().equals( 778 RMAuthenticationFilterInitializer.class.getName())) { 779 hasRMAuthFilterInitializer = true; 780 } 781 } 782 if (UserGroupInformation.isSecurityEnabled() 783 && useYarnAuthenticationFilter 784 && hasHadoopAuthFilterInitializer 785 && conf.get(authTypeKey, "").equals( 786 KerberosAuthenticationHandler.TYPE)) { 787 ArrayList<String> target = new ArrayList<String>(); 788 for (Class<?> filterInitializer : initializersClasses) { 789 if (filterInitializer.getName().equals( 790 AuthenticationFilterInitializer.class.getName())) { 791 if (hasRMAuthFilterInitializer == false) { 792 target.add(RMAuthenticationFilterInitializer.class.getName()); 793 } 794 continue; 795 } 796 
target.add(filterInitializer.getName()); 797 } 798 actualInitializers = StringUtils.join(",", target); 799 800 LOG.info("Using RM authentication filter(kerberos/delegation-token)" 801 + " for RM webapp authentication"); 802 RMAuthenticationFilter 803 .setDelegationTokenSecretManager(getClientRMService().rmDTSecretManager); 804 conf.set(filterInitializerConfKey, actualInitializers); 805 } 806 } 807 808 // if security is not enabled and the default filter initializer has not 809 // been set, set the initializer to include the 810 // RMAuthenticationFilterInitializer which in turn will set up the simple 811 // auth filter. 812 813 String initializers = conf.get(filterInitializerConfKey); 814 if (!UserGroupInformation.isSecurityEnabled()) { 815 if (initializersClasses == null || initializersClasses.length == 0) { 816 conf.set(filterInitializerConfKey, 817 RMAuthenticationFilterInitializer.class.getName()); 818 conf.set(authTypeKey, "simple"); 819 } else if (initializers.equals(StaticUserWebFilter.class.getName())) { 820 conf.set(filterInitializerConfKey, 821 RMAuthenticationFilterInitializer.class.getName() + "," 822 + initializers); 823 conf.set(authTypeKey, "simple"); 824 } 825 } 826 827 Builder<ApplicationMasterService> builder = 828 WebApps 829 .$for("cluster", ApplicationMasterService.class, masterService, 830 "ws") 831 .with(conf) 832 .withHttpSpnegoPrincipalKey( 833 YarnConfiguration.RM_WEBAPP_SPNEGO_USER_NAME_KEY) 834 .withHttpSpnegoKeytabKey( 835 YarnConfiguration.RM_WEBAPP_SPNEGO_KEYTAB_FILE_KEY) 836 .at(webAppAddress); 837 String proxyHostAndPort = WebAppUtils.getProxyHostAndPort(conf); 838 if(WebAppUtils.getResolvedRMWebAppURLWithoutScheme(conf). 
        equals(proxyHostAndPort)) {
      if (HAUtil.isHAEnabled(conf)) {
        fetcher = new AppReportFetcher(conf);
      } else {
        fetcher = new AppReportFetcher(conf, getClientRMService());
      }
      builder.withServlet(ProxyUriUtils.PROXY_SERVLET_NAME,
          ProxyUriUtils.PROXY_PATH_SPEC, WebAppProxyServlet.class);
      builder.withAttribute(WebAppProxy.FETCHER_ATTRIBUTE, fetcher);
      String[] proxyParts = proxyHostAndPort.split(":");
      builder.withAttribute(WebAppProxy.PROXY_HOST_ATTRIBUTE, proxyParts[0]);

    }
    webApp = builder.start(new RMWebApp(this));
  }

  /**
   * Helper method to create and init {@link #activeServices}. This creates an
   * instance of {@link RMActiveServices} and initializes it.
   * @throws Exception
   */
  protected void createAndInitActiveServices() throws Exception {
    activeServices = new RMActiveServices(this);
    // This ultimately calls the serviceInit function of the RMActiveServices class
    activeServices.init(conf);
  }

  /**
   * Helper method to start {@link #activeServices}.
   * @throws Exception
   */
  void startActiveServices() throws Exception {
    if (activeServices != null) {
      clusterTimeStamp = System.currentTimeMillis();
      activeServices.start();
    }
  }

  /**
   * Helper method to stop {@link #activeServices}.
879 * @throws Exception 880 */ 881 void stopActiveServices() throws Exception { 882 if (activeServices != null) { 883 activeServices.stop(); 884 activeServices = null; 885 } 886 } 887 888 void reinitialize(boolean initialize) throws Exception { 889 ClusterMetrics.destroy(); 890 QueueMetrics.clearQueueMetrics(); 891 if (initialize) { 892 resetDispatcher(); 893 createAndInitActiveServices(); 894 } 895 } 896 897 @VisibleForTesting 898 protected boolean areActiveServicesRunning() { 899 return activeServices != null && activeServices.isInState(STATE.STARTED); 900 } 901 902 synchronized void transitionToActive() throws Exception { 903 if (rmContext.getHAServiceState() == HAServiceProtocol.HAServiceState.ACTIVE) { 904 LOG.info("Already in active state"); 905 return; 906 } 907 908 LOG.info("Transitioning to active state"); 909 910 this.rmLoginUGI.doAs(new PrivilegedExceptionAction<Void>() { 911 @Override 912 public Void run() throws Exception { 913 try { 914 startActiveServices(); 915 return null; 916 } catch (Exception e) { 917 reinitialize(true); 918 throw e; 919 } 920 } 921 }); 922 923 rmContext.setHAServiceState(HAServiceProtocol.HAServiceState.ACTIVE); 924 LOG.info("Transitioned to active state"); 925 } 926 927 synchronized void transitionToStandby(boolean initialize) 928 throws Exception { 929 if (rmContext.getHAServiceState() == 930 HAServiceProtocol.HAServiceState.STANDBY) { 931 LOG.info("Already in standby state"); 932 return; 933 } 934 935 LOG.info("Transitioning to standby state"); 936 HAServiceState state = rmContext.getHAServiceState(); 937 rmContext.setHAServiceState(HAServiceProtocol.HAServiceState.STANDBY); 938 if (state == HAServiceProtocol.HAServiceState.ACTIVE) { 939 stopActiveServices(); 940 reinitialize(initialize); 941 } 942 LOG.info("Transitioned to standby state"); 943 } 944 945 @Override 946 protected void serviceStart() throws Exception { 947 if (this.rmContext.isHAEnabled()) { 948 transitionToStandby(true); 949 } else { 950 transitionToActive(); 
951 } 952 953 startWepApp(); 954 if (getConfig().getBoolean(YarnConfiguration.IS_MINI_YARN_CLUSTER, 955 false)) { 956 int port = webApp.port(); 957 WebAppUtils.setRMWebAppPort(conf, port); 958 } 959 super.serviceStart(); 960 } 961 962 protected void doSecureLogin() throws IOException { 963 InetSocketAddress socAddr = getBindAddress(conf); 964 SecurityUtil.login(this.conf, YarnConfiguration.RM_KEYTAB, 965 YarnConfiguration.RM_PRINCIPAL, socAddr.getHostName()); 966 967 // if security is enable, set rmLoginUGI as UGI of loginUser 968 if (UserGroupInformation.isSecurityEnabled()) { 969 this.rmLoginUGI = UserGroupInformation.getLoginUser(); 970 } 971 } 972 973 @Override 974 protected void serviceStop() throws Exception { 975 if (webApp != null) { 976 webApp.stop(); 977 } 978 if (fetcher != null) { 979 fetcher.stop(); 980 } 981 if (configurationProvider != null) { 982 configurationProvider.close(); 983 } 984 super.serviceStop(); 985 transitionToStandby(false); 986 rmContext.setHAServiceState(HAServiceState.STOPPING); 987 } 988 989 protected ResourceTrackerService createResourceTrackerService() { 990 return new ResourceTrackerService(this.rmContext, this.nodesListManager, 991 this.nmLivelinessMonitor, 992 this.rmContext.getContainerTokenSecretManager(), 993 this.rmContext.getNMTokenSecretManager()); 994 } 995 996 protected ClientRMService createClientRMService() { 997 return new ClientRMService(this.rmContext, scheduler, this.rmAppManager, 998 this.applicationACLsManager, this.queueACLsManager, 999 this.rmContext.getRMDelegationTokenSecretManager()); 1000 } 1001 1002 protected ApplicationMasterService createApplicationMasterService() { 1003 return new ApplicationMasterService(this.rmContext, scheduler); 1004 } 1005 1006 protected AdminService createAdminService() { 1007 return new AdminService(this, rmContext); 1008 } 1009 1010 protected RMSecretManagerService createRMSecretManagerService() { 1011 return new RMSecretManagerService(conf, rmContext); 1012 } 1013 1014 
  @Private
  public ClientRMService getClientRMService() {
    return this.clientRM;
  }

  /**
   * return the scheduler.
   * @return the scheduler for the Resource Manager.
   */
  @Private
  public ResourceScheduler getResourceScheduler() {
    return this.scheduler;
  }

  /**
   * return the resource tracking component.
   * @return the resource tracking component.
   */
  @Private
  public ResourceTrackerService getResourceTrackerService() {
    return this.resourceTracker;
  }

  @Private
  public ApplicationMasterService getApplicationMasterService() {
    return this.masterService;
  }

  @Private
  public ApplicationACLsManager getApplicationACLsManager() {
    return this.applicationACLsManager;
  }

  @Private
  public QueueACLsManager getQueueACLsManager() {
    return this.queueACLsManager;
  }

  @Private
  WebApp getWebapp() {
    return this.webApp;
  }

  @Override
  public void recover(RMState state) throws Exception {
    // recover RMdelegationTokenSecretManager
    rmContext.getRMDelegationTokenSecretManager().recover(state);

    // recover AMRMTokenSecretManager
    rmContext.getAMRMTokenSecretManager().recover(state);

    // recover applications
    rmAppManager.recover(state);

    setSchedulerRecoveryStartAndWaitTime(state, conf);
  }

  /* The main function mostly covers service initialization and service startup. RM is a composite
   * service with the inheritance chain CompositeService -> AbstractService, so initializing RM first
   * enters the init function of the parent class. AbstractService factors out the basic service
   * operations such as start, stop and close; a service only has to override serviceStart,
   * serviceStop, serviceInit and so on to control its own behavior,
   * which amounts to managing all services in a uniform way.
   */
  public static void main(String argv[]) {
    // Handler for uncaught exceptions
    Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler());
    StringUtils.startupShutdownMessage(ResourceManager.class, argv, LOG);
    try {
      // Load the configuration
      Configuration conf = new YarnConfiguration();
      // The RM object created below starts out empty: it contains no services and is not started
      GenericOptionsParser hParser = new GenericOptionsParser(conf, argv);
      argv = hParser.getRemainingArgs();
      // If -format-state-store, then delete RMStateStore; else startup normally
      if (argv.length == 1 && argv[0].equals("-format-state-store")) {
        deleteRMStateStore(conf);
      } else {
        ResourceManager resourceManager = new ResourceManager();
        // Add a shutdown hook
        ShutdownHookManager.get().addShutdownHook(
          new CompositeServiceShutdownHook(resourceManager),
          SHUTDOWN_HOOK_PRIORITY);
        // Initialize the services: this calls init() of the parent class AbstractService, which
        // calls serviceInit(); the method that actually runs is ResourceManager.serviceInit()
        resourceManager.init(conf);
        // Start the RM
        resourceManager.start();
      }
    } catch (Throwable t) {
      LOG.fatal("Error starting ResourceManager", t);
      System.exit(-1);
    }
  }

  /**
   * Register the handlers for alwaysOn services
   */
  private Dispatcher setupDispatcher() {
    Dispatcher dispatcher = createDispatcher();
    dispatcher.register(RMFatalEventType.class,
        new ResourceManager.RMFatalEventDispatcher());
    return dispatcher;
  }

  private void resetDispatcher() {
    Dispatcher dispatcher = setupDispatcher();
    ((Service)dispatcher).init(this.conf);
    ((Service)dispatcher).start();
    removeService((Service)rmDispatcher);
    // Need to stop previous rmDispatcher before assigning new dispatcher
    // otherwise causes "AsyncDispatcher event handler" thread leak
    ((Service) rmDispatcher).stop();
    rmDispatcher = dispatcher;
    addIfService(rmDispatcher);
    rmContext.setDispatcher(rmDispatcher);
  }

  private void setSchedulerRecoveryStartAndWaitTime(RMState state,
      Configuration conf) {
    if (!state.getApplicationState().isEmpty()) {
      long waitTime =
          conf.getLong(YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_SCHEDULING_WAIT_MS,
            YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_SCHEDULING_WAIT_MS);
      rmContext.setSchedulerRecoveryStartAndWaitTime(waitTime);
    }
  }

  /**
   * Retrieve RM bind address from configuration
   *
   * @param conf
   * @return InetSocketAddress
   */
  public static InetSocketAddress getBindAddress(Configuration conf) {
    return conf.getSocketAddr(YarnConfiguration.RM_ADDRESS,
      YarnConfiguration.DEFAULT_RM_ADDRESS, YarnConfiguration.DEFAULT_RM_PORT);
  }

  /**
   * Deletes the RMStateStore
   *
   * @param conf
   * @throws Exception
   */
  private static void deleteRMStateStore(Configuration conf) throws Exception {
    RMStateStore rmStore = RMStateStoreFactory.getStore(conf);
    rmStore.init(conf);
    rmStore.start();
    try {
      LOG.info("Deleting ResourceManager state store...");
      rmStore.deleteStore();
      LOG.info("State store deleted");
    } finally {
      rmStore.stop();
    }
  }
}
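The main() function above relies on the lifecycle pattern described in its comment: the public init()/start() methods live in the base class and call overridable serviceInit()/serviceStart() hooks. The following is only a minimal sketch of that pattern, not Hadoop's implementation. All class names (SketchService, SketchComposite, SketchLeaf) are made up for illustration, and details of the real AbstractService such as state checks, listeners and error handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class ServiceLifecycleSketch {

    /** Hypothetical stand-in for AbstractService: owns the public lifecycle methods. */
    static abstract class SketchService {
        private final String name;

        SketchService(String name) {
            this.name = name;
        }

        // Callers use init()/start(); they never call the hooks directly.
        final void init(List<String> log) {
            serviceInit(log);
            log.add(name + " inited");
        }

        final void start(List<String> log) {
            serviceStart(log);
            log.add(name + " started");
        }

        // Hooks that a concrete service overrides to plug in its own behavior.
        protected void serviceInit(List<String> log) { }

        protected void serviceStart(List<String> log) { }
    }

    /** Hypothetical stand-in for CompositeService: forwards each phase to its children. */
    static class SketchComposite extends SketchService {
        private final List<SketchService> children = new ArrayList<>();

        SketchComposite(String name) {
            super(name);
        }

        void addService(SketchService child) {
            children.add(child);
        }

        @Override
        protected void serviceInit(List<String> log) {
            for (SketchService child : children) {
                child.init(log);
            }
        }

        @Override
        protected void serviceStart(List<String> log) {
            for (SketchService child : children) {
                child.start(log);
            }
        }
    }

    /** A child service that keeps the default (empty) hooks. */
    static class SketchLeaf extends SketchService {
        SketchLeaf(String name) {
            super(name);
        }
    }

    /** Builds a tiny service tree and returns the order in which the phases ran. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        SketchComposite rm = new SketchComposite("rm");
        rm.addService(new SketchLeaf("dispatcher"));
        rm.addService(new SketchLeaf("activeServices"));
        rm.init(log);   // mirrors resourceManager.init(conf)
        rm.start(log);  // mirrors resourceManager.start()
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

In this sketch the children are initialized before the composite reports itself as inited, which is the nesting the article follows when it traces init() down into serviceInit().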
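We can start the analysis from the main() function. main() calls resourceManager.init(conf) to initialize the services. This call goes to the init function of the parent class AbstractService, which calls serviceInit; because ResourceManager overrides it, the function that actually runs is ResourceManager's serviceInit. ResourceManager's serviceInit in turn calls createAndInitActiveServices(), which creates and initializes RMActiveServices, an inner class of ResourceManager. That function calls activeServices.init(conf), so the call finally lands in the serviceInit function of RMActiveServices. This serviceInit calls rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(rmContext)), which registers the event handler for RMAppEvent events. This ties back to RMAppEventType.START in the RMAppManager class discussed earlier: RMAppManager puts an RMAppEvent whose type is the enum value RMAppEventType.START onto the asynchronous dispatcher, so the event is handled by ApplicationEventDispatcher(rmContext).
ApplicationEventDispatcher is an inner class of ResourceManager. Its handle method calls rmApp.handle(event). rmApp is declared with the RMApp interface type; the object here is an instance of its implementation class RMAppImpl, so the call goes to the handle method of RMAppImpl, which calls this.stateMachine.doTransition(event.getType(), event). The RMAppImpl constructor contains this.stateMachine = stateMachineFactory.make(this): stateMachine is created by the state machine factory stateMachineFactory. The core of the factory is addTransition, which binds many state transitions to the events that trigger them and to the handlers that process them.
Application submission corresponds to RMAppEventType.START. RMAppImpl contains .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING, RMAppEventType.START, new RMAppNewlySavingTransition()). This means that when an event of type RMAppEventType.START is received while the RMApp is in state NEW, the state of the RMApp changes from NEW to NEW_SAVING and the callback class RMAppNewlySavingTransition is invoked.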
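The registration step described above, where a handler is registered per event-type enum class and each event is later routed by the enum class of its type, can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the names SimpleDispatcher, AppEvent, AppEventType and NodeEventType are hypothetical, and the real YARN dispatcher queues events and handles them on a separate thread, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DispatcherSketch {

    enum AppEventType { START, KILL }   // hypothetical stand-in for RMAppEventType
    enum NodeEventType { STARTED }      // a second, unrelated event family

    interface Event {
        Enum<?> getType();
    }

    interface Handler {
        void handle(Event event);
    }

    /** Hypothetical stand-in for RMAppEvent. */
    static class AppEvent implements Event {
        final String appId;
        final AppEventType type;

        AppEvent(String appId, AppEventType type) {
            this.appId = appId;
            this.type = type;
        }

        @Override
        public AppEventType getType() {
            return type;
        }
    }

    /** Routes each event to the handler registered for its event-type enum class. */
    static class SimpleDispatcher {
        private final Map<Class<?>, Handler> handlers = new HashMap<>();

        void register(Class<?> eventTypeClass, Handler handler) {
            handlers.put(eventTypeClass, handler);
        }

        void dispatch(Event event) {
            // The enum class of the event type is the lookup key.
            Class<?> key = event.getType().getDeclaringClass();
            Handler handler = handlers.get(key);
            if (handler == null) {
                throw new IllegalStateException("No handler registered for " + key);
            }
            handler.handle(event);
        }
    }

    /** Registers two handlers, dispatches one START event, and returns what was handled. */
    static List<String> run() {
        List<String> handled = new ArrayList<>();
        SimpleDispatcher dispatcher = new SimpleDispatcher();
        dispatcher.register(AppEventType.class, e -> handled.add("app handler got " + e.getType()));
        dispatcher.register(NodeEventType.class, e -> handled.add("node handler got " + e.getType()));
        dispatcher.dispatch(new AppEvent("application_0001", AppEventType.START));
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the handler registered for AppEventType sees the START event, which is the role ApplicationEventDispatcher plays for RMAppEventType in the walkthrough.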
addTransition() is a method of the StateMachineFactory class. It passes its second argument, postState, to a newly constructed instance of the inner class SingleInternalArc.
/**
 * State machine topology.
 * This object is semantically immutable.  If you have a
 * StateMachineFactory there's no operation in the API that changes
 * its semantic properties.
 *
 * @param <OPERAND> The object type on which this state machine operates.
 * @param <STATE> The state of the entity.
 * @param <EVENTTYPE> The external eventType to be handled.
 * @param <EVENT> The event object.
 *
 */
@Public
@Evolving
final public class StateMachineFactory
             <OPERAND, STATE extends Enum<STATE>,
              EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {

  private final TransitionsListNode transitionsListNode;

  private Map<STATE, Map<EVENTTYPE,
    Transition<OPERAND, STATE, EVENTTYPE, EVENT>>> stateMachineTable;

  private STATE defaultInitialState;

  private final boolean optimized;

  /**
   * Constructor
   *
   * This is the only constructor in the API.
   *
   */
  public StateMachineFactory(STATE defaultInitialState) {
    this.transitionsListNode = null;
    this.defaultInitialState = defaultInitialState;
    this.optimized = false;
    this.stateMachineTable = null;
  }

  private StateMachineFactory
      (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
       ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> t) {
    this.defaultInitialState = that.defaultInitialState;
    this.transitionsListNode
        = new TransitionsListNode(t, that.transitionsListNode);
    this.optimized = false;
    this.stateMachineTable = null;
  }

  private StateMachineFactory
      (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
       boolean optimized) {
    this.defaultInitialState = that.defaultInitialState;
    this.transitionsListNode = that.transitionsListNode;
    this.optimized = optimized;
    if (optimized) {
      makeStateMachineTable();
    } else {
      stateMachineTable = null;
    }
  }

  private interface ApplicableTransition
             <OPERAND, STATE extends Enum<STATE>,
              EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
    void apply(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject);
  }

  private class TransitionsListNode {
    final ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
    final TransitionsListNode next;

    TransitionsListNode
        (ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition,
        TransitionsListNode next) {
      this.transition = transition;
      this.next = next;
    }
  }

  static private class ApplicableSingleOrMultipleTransition
             <OPERAND, STATE extends Enum<STATE>,
              EVENTTYPE extends Enum<EVENTTYPE>, EVENT>
          implements ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> {
    final STATE preState;
    final EVENTTYPE eventType;
    final Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition;

    ApplicableSingleOrMultipleTransition
        (STATE preState, EVENTTYPE eventType,
         Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition) {
      this.preState = preState;
      this.eventType = eventType;
      this.transition = transition;
    }

    @Override
    public void apply
             (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject) {
      Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
        = subject.stateMachineTable.get(preState);
      if (transitionMap == null) {
        // I use HashMap here because I would expect most EVENTTYPE's to not
        //  apply out of a particular state, so FSM sizes would be
        //  quadratic if I use EnumMap's here as I do at the top level.
        transitionMap = new HashMap<EVENTTYPE,
          Transition<OPERAND, STATE, EVENTTYPE, EVENT>>();
        subject.stateMachineTable.put(preState, transitionMap);
      }
      transitionMap.put(eventType, transition);
    }
  }

  /**
   * @return a NEW StateMachineFactory just like {@code this} with the current
   *          transition added as a new legal transition.  This overload
   *          has no hook object.
   *
   *         Note that the returned StateMachineFactory is a distinct
   *         object.
   *
   *         This method is part of the API.
   *
   * @param preState pre-transition state
   * @param postState post-transition state
   * @param eventType stimulus for the transition
   */
  public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, STATE postState, EVENTTYPE eventType) {
    return addTransition(preState, postState, eventType, null);
  }

  /**
   * @return a NEW StateMachineFactory just like {@code this} with the current
   *          transition added as a new legal transition.  This overload
   *          has no hook object.
   *
   *
   *         Note that the returned StateMachineFactory is a distinct
   *         object.
   *
   *         This method is part of the API.
   *
   * @param preState pre-transition state
   * @param postState post-transition state
   * @param eventTypes List of stimuli for the transitions
   */
  public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
      STATE preState, STATE postState, Set<EVENTTYPE> eventTypes) {
    return addTransition(preState, postState, eventTypes, null);
  }

  /**
   * @return a NEW StateMachineFactory just like {@code this} with the current
   *          transition added as a new legal transition
   *
   *         Note that the returned StateMachineFactory is a distinct
   *         object.
   *
   *         This method is part of the API.
   *
   * @param preState pre-transition state
   * @param postState post-transition state
   * @param eventTypes List of stimuli for the transitions
   * @param hook transition hook
   */
  public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
      STATE preState, STATE postState, Set<EVENTTYPE> eventTypes,
      SingleArcTransition<OPERAND, EVENT> hook) {
    StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> factory = null;
    for (EVENTTYPE event : eventTypes) {
      if (factory == null) {
        factory = addTransition(preState, postState, event, hook);
      } else {
        factory = factory.addTransition(preState, postState, event, hook);
      }
    }
    return factory;
  }

  /**
   * @return a NEW StateMachineFactory just like {@code this} with the current
   *          transition added as a new legal transition
   *
   *         Note that the returned StateMachineFactory is a distinct object.
   *
   *         This method is part of the API.
   *
   * @param preState pre-transition state
   * @param postState post-transition state
   * @param eventType stimulus for the transition
   * @param hook transition hook
   */
  //
  public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, STATE postState,
                        EVENTTYPE eventType,
                        SingleArcTransition<OPERAND, EVENT> hook){
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
        (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
           (preState, eventType, new SingleInternalArc(postState, hook)));
  }

  /**
   * @return a NEW StateMachineFactory just like {@code this} with the current
   *          transition added as a new legal transition
   *
   *         Note that the returned StateMachineFactory is a distinct object.
   *
   *         This method is part of the API.
   *
   * @param preState pre-transition state
   * @param postStates valid post-transition states
   * @param eventType stimulus for the transition
   * @param hook transition hook
   */
  public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, Set<STATE> postStates,
                        EVENTTYPE eventType,
                        MultipleArcTransition<OPERAND, EVENT, STATE> hook){
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
        (this,
         new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
           (preState, eventType, new MultipleInternalArc(postStates, hook)));
  }

  /**
   * @return a StateMachineFactory just like {@code this}, except that if
   *         you won't need any synchronization to build a state machine
   *
   *         Note that the returned StateMachineFactory is a distinct object.
   *
   *         This method is part of the API.
   *
   *         The only way you could distinguish the returned
   *         StateMachineFactory from {@code this} would be by
   *         measuring the performance of the derived
   *         {@code StateMachine} you can get from it.
   *
   * Calling this is optional.  It doesn't change the semantics of the factory,
   *   if you call it then when you use the factory there is no synchronization.
   */
  public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          installTopology() {
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(this, true);
  }

  /**
   * Effect a transition due to the effecting stimulus.
   * @param state current state
   * @param eventType trigger to initiate the transition
   * @param cause causal eventType context
   * @return transitioned state
   */
  private STATE doTransition
           (OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
      throws InvalidStateTransitonException {
    // We can assume that stateMachineTable is non-null because we call
    //  maybeMakeStateMachineTable() when we build an InnerStateMachine ,
    //  and this code only gets called from inside a working InnerStateMachine .
    Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
      = stateMachineTable.get(oldState);
    if (transitionMap != null) {
      Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition
          = transitionMap.get(eventType);
      if (transition != null) {
        return transition.doTransition(operand, oldState, event, eventType);
      }
    }
    throw new InvalidStateTransitonException(oldState, eventType);
  }

  private synchronized void maybeMakeStateMachineTable() {
    if (stateMachineTable == null) {
      makeStateMachineTable();
    }
  }

  private void makeStateMachineTable() {
    Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>> stack =
      new Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>>();

    Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>
      prototype = new HashMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>();

    prototype.put(defaultInitialState, null);

    // I use EnumMap here because it'll be faster and denser.  I would
    //  expect most of the states to have at least one transition.
    stateMachineTable
       = new EnumMap<STATE, Map<EVENTTYPE,
                           Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>(prototype);

    for (TransitionsListNode cursor = transitionsListNode;
         cursor != null;
         cursor = cursor.next) {
      stack.push(cursor.transition);
    }

    while (!stack.isEmpty()) {
      stack.pop().apply(this);
    }
  }

  private interface Transition<OPERAND, STATE extends Enum<STATE>,
          EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
    STATE doTransition(OPERAND operand, STATE oldState,
                       EVENT event, EVENTTYPE eventType);
  }

  private class SingleInternalArc
                    implements Transition<OPERAND, STATE, EVENTTYPE, EVENT> {

    private STATE postState;
    private SingleArcTransition<OPERAND, EVENT> hook; // transition hook

    SingleInternalArc(STATE postState,
        SingleArcTransition<OPERAND, EVENT> hook) {
      this.postState = postState;
      this.hook = hook;
    }

    @Override
    public STATE doTransition(OPERAND operand, STATE oldState,
                              EVENT event, EVENTTYPE eventType) {
      if (hook != null) {
        hook.transition(operand, event);
      }
      return postState;
    }
  }

  private class MultipleInternalArc
              implements Transition<OPERAND, STATE, EVENTTYPE, EVENT>{

    // Fields
    private Set<STATE> validPostStates;
    private MultipleArcTransition<OPERAND, EVENT, STATE> hook;  // transition hook

    MultipleInternalArc(Set<STATE> postStates,
                   MultipleArcTransition<OPERAND, EVENT, STATE> hook) {
      this.validPostStates = postStates;
      this.hook = hook;
    }

    @Override
    public STATE doTransition(OPERAND operand, STATE oldState,
                              EVENT event, EVENTTYPE eventType)
        throws InvalidStateTransitonException {
      STATE postState = hook.transition(operand, event);

      if (!validPostStates.contains(postState)) {
        throw new InvalidStateTransitonException(oldState, eventType);
      }
      return postState;
    }
  }

  /*
   * @return a {@link StateMachine} that starts in
   *         {@code initialState} and whose {@link Transition} s are
   *         applied to {@code operand} .
   *
   *         This is part of the API.
   *
   * @param operand the object upon which the returned
   *                {@link StateMachine} will operate.
   * @param initialState the state in which the returned
   *                {@link StateMachine} will start.
   *
   */
  public StateMachine<STATE, EVENTTYPE, EVENT>
        make(OPERAND operand, STATE initialState) {
    return new InternalStateMachine(operand, initialState);
  }

  /*
   * @return a {@link StateMachine} that starts in the default initial
   *          state and whose {@link Transition} s are applied to
   *          {@code operand} .
   *
   *         This is part of the API.
   *
   * @param operand the object upon which the returned
   *                {@link StateMachine} will operate.
   *
   */
  public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
    return new InternalStateMachine(operand, defaultInitialState);
  }

  private class InternalStateMachine
        implements StateMachine<STATE, EVENTTYPE, EVENT> {
    private final OPERAND operand;
    private STATE currentState;

    InternalStateMachine(OPERAND operand, STATE initialState) {
      this.operand = operand;
      this.currentState = initialState;
      if (!optimized) {
        maybeMakeStateMachineTable();
      }
    }

    @Override
    public synchronized STATE getCurrentState() {
      return currentState;
    }

    @Override
    public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event)
         throws InvalidStateTransitonException  {
      currentState = StateMachineFactory.this.doTransition
          (operand, currentState, eventType, event);
      return currentState;
    }
  }

  /**
   * Generate a graph represents the state graph of this StateMachine
   * @param name graph name
   * @return Graph object generated
   */
  @SuppressWarnings("rawtypes")
  public Graph generateStateGraph(String name) {
    maybeMakeStateMachineTable();
    Graph g = new Graph(name);
    for (STATE startState : stateMachineTable.keySet()) {
      Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitions
          = stateMachineTable.get(startState);
      for (Entry<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> entry :
         transitions.entrySet()) {
        Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition = entry.getValue();
        if (transition instanceof StateMachineFactory.SingleInternalArc) {
          StateMachineFactory.SingleInternalArc sa
              = (StateMachineFactory.SingleInternalArc) transition;
          Graph.Node fromNode = g.getNode(startState.toString());
          Graph.Node toNode = g.getNode(sa.postState.toString());
          fromNode.addEdge(toNode, entry.getKey().toString());
        } else if (transition instanceof StateMachineFactory.MultipleInternalArc) {
          StateMachineFactory.MultipleInternalArc ma
              = (StateMachineFactory.MultipleInternalArc) transition;
          Iterator iter = ma.validPostStates.iterator();
          while (iter.hasNext()) {
            Graph.Node fromNode = g.getNode(startState.toString());
            Graph.Node toNode = g.getNode(iter.next().toString());
            fromNode.addEdge(toNode, entry.getKey().toString());
          }
        }
      }
    }
    return g;
  }
}
StateMachineFactory.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/state/StateMachineFactory.java
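Two details of the StateMachineFactory listing are easy to miss: every addTransition call returns a new factory whose list node points at the previous list, and makeStateMachineTable pushes the nodes onto a stack so that the transitions are applied in the order they were registered. A minimal sketch of just that mechanism is shown below. The class and method names (Factory, Node, add, replayOrder) are hypothetical, and transitions are reduced to plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TransitionListSketch {

    /** Immutable singly linked list node; the newest registration is always the head. */
    static final class Node {
        final String transition;
        final Node next;

        Node(String transition, Node next) {
            this.transition = transition;
            this.next = next;
        }
    }

    /** Immutable factory: add() returns a new object and leaves this one untouched. */
    static final class Factory {
        private final Node head;

        Factory() {
            this(null);
        }

        private Factory(Node head) {
            this.head = head;
        }

        Factory add(String transition) {
            // Prepend: the new node links back to the list of the old factory.
            return new Factory(new Node(transition, head));
        }

        /** Replays the registrations oldest-first by reversing the list with a stack. */
        List<String> replayOrder() {
            Deque<String> stack = new ArrayDeque<>();
            for (Node cursor = head; cursor != null; cursor = cursor.next) {
                stack.push(cursor.transition);
            }
            List<String> order = new ArrayList<>();
            while (!stack.isEmpty()) {
                order.add(stack.pop());
            }
            return order;
        }
    }

    public static void main(String[] args) {
        Factory empty = new Factory();
        Factory three = empty.add("A").add("B").add("C");
        System.out.println(empty.replayOrder());
        System.out.println(three.replayOrder());
    }
}
```

Because each factory only ever prepends to a shared, immutable tail, a chain of addTransition calls can be built in a static initializer without copying earlier registrations.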
What gets called is the following addTransition() function of StateMachineFactory:
//StateMachineFactory.java
  public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, STATE postState,
                        EVENTTYPE eventType,
                        SingleArcTransition<OPERAND, EVENT> hook){
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
        (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
           (preState, eventType, new SingleInternalArc(postState, hook)));
  }
Internally, addTransition() evaluates new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT> (preState, eventType, new SingleInternalArc(postState, hook))). The SingleInternalArc instance created here stores the post-transition state postState, which in this case is RMAppState.NEW_SAVING. It also stores the callback hook, which here is an RMAppNewlySavingTransition instance.
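Next, go back to the handle function of RMAppImpl, which calls this.stateMachine.doTransition(event.getType(), event). This enters the doTransition method of InternalStateMachine, the inner class of StateMachineFactory that implements the StateMachine interface, which then calls the private doTransition method of StateMachineFactory itself.
Execution reaches return transition.doTransition(operand, oldState, event, eventType), where oldState=RMAppState.NEW and transition=org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc. This enters the doTransition function of the inner class SingleInternalArc. That method calls hook.transition(operand, event) and finally executes return postState;, which returns the RMAppState.NEW_SAVING value stored above. The caller saves it in the variable currentState, and control returns to the handle function of RMAppImpl. RMAppImpl reads this variable through the getter getState().
That is, if (oldState != getState()) { LOG.info(appID + " State change from " + oldState + " to " + getState()); } logs that the state changed from NEW to NEW_SAVING. For example: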
2017-02-20 22:59:07,702 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1487602693580_0001 State change from NEW to NEW_SAVING
At this point the state machine has completed one state transition.
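The transition traced above can be condensed into a small, self-contained sketch: a table keyed by current state and event type, an arc object that runs its hook and returns a fixed post-state, and a handle method that logs the change. This is an illustration under simplifying assumptions, not the Hadoop code. MiniApp, Arc, AppState and AppEventType are hypothetical names, the hook is a plain Runnable, and the table is filled directly rather than through an immutable factory.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

public class MiniStateMachineSketch {

    enum AppState { NEW, NEW_SAVING }
    enum AppEventType { START }

    /** Counterpart of SingleInternalArc: a fixed post-state plus an optional hook. */
    static final class Arc {
        final AppState postState;
        final Runnable hook; // may be null

        Arc(AppState postState, Runnable hook) {
            this.postState = postState;
            this.hook = hook;
        }

        AppState doTransition() {
            if (hook != null) {
                hook.run();
            }
            return postState;
        }
    }

    /** Hypothetical stand-in for RMAppImpl with a single registered transition. */
    static final class MiniApp {
        private final Map<AppState, Map<AppEventType, Arc>> table = new EnumMap<>(AppState.class);
        private AppState current = AppState.NEW;
        final StringBuilder log = new StringBuilder();

        MiniApp() {
            addTransition(AppState.NEW, AppState.NEW_SAVING, AppEventType.START,
                () -> log.append("hook: storing new application\n"));
        }

        private void addTransition(AppState pre, AppState post, AppEventType type, Runnable hook) {
            table.computeIfAbsent(pre, k -> new HashMap<>()).put(type, new Arc(post, hook));
        }

        AppState getState() {
            return current;
        }

        void handle(AppEventType type) {
            AppState oldState = current;
            Map<AppEventType, Arc> arcs = table.get(oldState);
            Arc arc = (arcs == null) ? null : arcs.get(type);
            if (arc == null) {
                throw new IllegalStateException("Invalid event " + type + " at " + oldState);
            }
            current = arc.doTransition(); // the hook runs first, then the state is updated
            if (oldState != getState()) {
                log.append("State change from " + oldState + " to " + getState() + "\n");
            }
        }
    }

    public static void main(String[] args) {
        MiniApp app = new MiniApp();
        app.handle(AppEventType.START);
        System.out.print(app.log);
    }
}
```

Note the ordering in this sketch: the hook runs inside doTransition, so its side effects happen before the "State change" line is written, which matches the order of calls in SingleInternalArc and RMAppImpl.handle described above.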
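Now back to the hook variable. hook was initialized to the callback class bound in the state machine above, RMAppNewlySavingTransition. The call therefore goes back into RMAppImpl, to the transition function of its inner class RMAppNewlySavingTransition, which calls app.rmContext.getStateStore().storeNewApplication(app). Here app.rmContext is an RMContextImpl and app.rmContext.getStateStore() returns an RMStateStore, so execution enters the storeNewApplication function of the RMStateStore class: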
@Private
@Unstable
/**
 * Base class to implement storage of ResourceManager state.
 * Takes care of asynchronous notifications and interfacing with YARN objects.
 * Real store implementations need to derive from it and implement blocking
 * store and load methods to actually store and load the state.
 */
public abstract class RMStateStore extends AbstractService {

  // constants for RM App state and RMDTSecretManagerState.
  protected static final String RM_APP_ROOT = "RMAppRoot";
  protected static final String RM_DT_SECRET_MANAGER_ROOT = "RMDTSecretManagerRoot";
  protected static final String DELEGATION_KEY_PREFIX = "DelegationKey_";
  protected static final String DELEGATION_TOKEN_PREFIX = "RMDelegationToken_";
  protected static final String DELEGATION_TOKEN_SEQUENCE_NUMBER_PREFIX =
      "RMDTSequenceNumber_";
  protected static final String AMRMTOKEN_SECRET_MANAGER_ROOT =
      "AMRMTokenSecretManagerRoot";
  protected static final String VERSION_NODE = "RMVersionNode";
  protected static final String EPOCH_NODE = "EpochNode";
  private ResourceManager resourceManager;
  private final ReadLock readLock;
  private final WriteLock writeLock;

  public static final Log LOG = LogFactory.getLog(RMStateStore.class);

  /**
   * The enum defines state of RMStateStore.
   */
  public enum RMStateStoreState {
    ACTIVE,
    FENCED
  };

  private static final StateMachineFactory<RMStateStore,
                                           RMStateStoreState,
                                           RMStateStoreEventType,
                                           RMStateStoreEvent>
      stateMachineFactory = new StateMachineFactory<RMStateStore,
                                                    RMStateStoreState,
                                                    RMStateStoreEventType,
                                                    RMStateStoreEvent>(
      RMStateStoreState.ACTIVE)
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.STORE_APP, new StoreAppTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.UPDATE_APP, new UpdateAppTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.REMOVE_APP, new RemoveAppTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.STORE_APP_ATTEMPT,
          new StoreAppAttemptTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.UPDATE_APP_ATTEMPT,
          new UpdateAppAttemptTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.STORE_MASTERKEY,
          new StoreRMDTMasterKeyTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.REMOVE_MASTERKEY,
          new RemoveRMDTMasterKeyTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.STORE_DELEGATION_TOKEN,
          new StoreRMDTTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.REMOVE_DELEGATION_TOKEN,
          new RemoveRMDTTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.UPDATE_DELEGATION_TOKEN,
          new UpdateRMDTTransition())
      .addTransition(RMStateStoreState.ACTIVE,
          EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
          RMStateStoreEventType.UPDATE_AMRM_TOKEN,
          new StoreOrUpdateAMRMTokenTransition())
      .addTransition(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED,
          RMStateStoreEventType.FENCED)
      .addTransition(RMStateStoreState.FENCED, RMStateStoreState.FENCED,
          EnumSet.of(
          RMStateStoreEventType.STORE_APP,
          RMStateStoreEventType.UPDATE_APP,
          RMStateStoreEventType.REMOVE_APP,
          RMStateStoreEventType.STORE_APP_ATTEMPT,
          RMStateStoreEventType.UPDATE_APP_ATTEMPT,
          RMStateStoreEventType.FENCED,
          RMStateStoreEventType.STORE_MASTERKEY,
          RMStateStoreEventType.REMOVE_MASTERKEY,
          RMStateStoreEventType.STORE_DELEGATION_TOKEN,
          RMStateStoreEventType.REMOVE_DELEGATION_TOKEN,
          RMStateStoreEventType.UPDATE_DELEGATION_TOKEN,
          RMStateStoreEventType.UPDATE_AMRM_TOKEN));

  private final StateMachine<RMStateStoreState,
                             RMStateStoreEventType,
                             RMStateStoreEvent> stateMachine;

  private static class StoreAppTransition
      implements MultipleArcTransition<RMStateStore, RMStateStoreEvent,
          RMStateStoreState> {
    @Override
    public RMStateStoreState transition(RMStateStore store,
        RMStateStoreEvent event) {
      if (!(event instanceof RMStateStoreAppEvent)) {
        // should never happen
        LOG.error("Illegal event type: " + event.getClass());
        return RMStateStoreState.ACTIVE;
      }
      boolean isFenced = false;
      ApplicationStateData appState =
          ((RMStateStoreAppEvent) event).getAppState();
      ApplicationId appId =
          appState.getApplicationSubmissionContext().getApplicationId();
      LOG.info("Storing info for app: " + appId);
      try {
        store.storeApplicationStateInternal(appId, appState);
        store.notifyApplication(new RMAppEvent(appId,
               RMAppEventType.APP_NEW_SAVED));
      } catch (Exception e) {
        LOG.error("Error storing app: " + appId, e);
        isFenced = store.notifyStoreOperationFailedInternal(e);
      }
      return finalState(isFenced);
    };
  }

  private static class UpdateAppTransition implements
      MultipleArcTransition<RMStateStore, RMStateStoreEvent,
          RMStateStoreState> {
    @Override
    public RMStateStoreState transition(RMStateStore store,
        RMStateStoreEvent event) {
      if (!(event instanceof RMStateUpdateAppEvent)) {
        // should never happen
        LOG.error("Illegal event type: " + event.getClass());
        return RMStateStoreState.ACTIVE;
      }
      boolean isFenced = false;
      ApplicationStateData appState =
          ((RMStateUpdateAppEvent) event).getAppState();
      ApplicationId appId =
          appState.getApplicationSubmissionContext().getApplicationId();
      LOG.info("Updating info for app: " + appId);
      try {
        store.updateApplicationStateInternal(appId, appState);
        store.notifyApplication(new RMAppEvent(appId,
               RMAppEventType.APP_UPDATE_SAVED));
      } catch (Exception e) {
        LOG.error("Error updating app: " + appId, e);
        isFenced = store.notifyStoreOperationFailedInternal(e);
      }
      return finalState(isFenced);
    };
  }

  private static class RemoveAppTransition implements
      MultipleArcTransition<RMStateStore, RMStateStoreEvent,
          RMStateStoreState> {
    @Override
    public RMStateStoreState transition(RMStateStore store,
        RMStateStoreEvent event) {
      if (!(event instanceof RMStateStoreRemoveAppEvent)) {
        // should never happen
        LOG.error("Illegal event type: " + event.getClass());
        return RMStateStoreState.ACTIVE;
      }
      boolean isFenced = false;
      ApplicationStateData appState =
          ((RMStateStoreRemoveAppEvent) event).getAppState();
      ApplicationId appId =
          appState.getApplicationSubmissionContext().getApplicationId();
      LOG.info("Removing info for app: " + appId);
      try {
        store.removeApplicationStateInternal(appState);
      } catch (Exception e) {
        LOG.error("Error removing app: " + appId, e);
        isFenced = store.notifyStoreOperationFailedInternal(e);
      }
      return finalState(isFenced);
    };
  }

  private static class StoreAppAttemptTransition implements
      MultipleArcTransition<RMStateStore, RMStateStoreEvent,
          RMStateStoreState> {
    @Override
    public RMStateStoreState transition(RMStateStore store,
        RMStateStoreEvent event) {
      if (!(event instanceof RMStateStoreAppAttemptEvent)) {
        // should never happen
        LOG.error("Illegal event type: " + event.getClass());
        return RMStateStoreState.ACTIVE;
      }
      boolean isFenced = false;
      ApplicationAttemptStateData attemptState =
          ((RMStateStoreAppAttemptEvent) event).getAppAttemptState();
      try {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Storing info for attempt: " + attemptState.getAttemptId());
        }
        store.storeApplicationAttemptStateInternal(attemptState.getAttemptId(),
            attemptState);
        store.notifyApplicationAttempt(new RMAppAttemptEvent
               (attemptState.getAttemptId(),
               RMAppAttemptEventType.ATTEMPT_NEW_SAVED));
      } catch (Exception e) {
        LOG.error("Error storing appAttempt: " + attemptState.getAttemptId(), e);
        isFenced = store.notifyStoreOperationFailedInternal(e);
      }
      return finalState(isFenced);
    };
  }

  private static class UpdateAppAttemptTransition implements
      MultipleArcTransition<RMStateStore, RMStateStoreEvent,
          RMStateStoreState> {
    @Override
    public RMStateStoreState transition(RMStateStore store,
        RMStateStoreEvent event) {
      if (!(event instanceof RMStateUpdateAppAttemptEvent)) {
        // should never happen
        LOG.error("Illegal event type: " + event.getClass());
        return RMStateStoreState.ACTIVE;
      }
      boolean isFenced = false;
ApplicationAttemptStateData attemptState = 236 ((RMStateUpdateAppAttemptEvent) event).getAppAttemptState(); 237 try { 238 if (LOG.isDebugEnabled()) { 239 LOG.debug("Updating info for attempt: " + attemptState.getAttemptId()); 240 } 241 store.updateApplicationAttemptStateInternal(attemptState.getAttemptId(), 242 attemptState); 243 store.notifyApplicationAttempt(new RMAppAttemptEvent 244 (attemptState.getAttemptId(), 245 RMAppAttemptEventType.ATTEMPT_UPDATE_SAVED)); 246 } catch (Exception e) { 247 LOG.error("Error updating appAttempt: " + attemptState.getAttemptId(), e); 248 isFenced = store.notifyStoreOperationFailedInternal(e); 249 } 250 return finalState(isFenced); 251 }; 252 } 253 254 private static class StoreRMDTTransition implements 255 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 256 RMStateStoreState> { 257 @Override 258 public RMStateStoreState transition(RMStateStore store, 259 RMStateStoreEvent event) { 260 if (!(event instanceof RMStateStoreRMDTEvent)) { 261 // should never happen 262 LOG.error("Illegal event type: " + event.getClass()); 263 return RMStateStoreState.ACTIVE; 264 } 265 boolean isFenced = false; 266 RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event; 267 try { 268 LOG.info("Storing RMDelegationToken and SequenceNumber"); 269 store.storeRMDelegationTokenState( 270 dtEvent.getRmDTIdentifier(), dtEvent.getRenewDate()); 271 } catch (Exception e) { 272 LOG.error("Error While Storing RMDelegationToken and SequenceNumber ", 273 e); 274 isFenced = store.notifyStoreOperationFailedInternal(e); 275 } 276 return finalState(isFenced); 277 } 278 } 279 280 private static class RemoveRMDTTransition implements 281 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 282 RMStateStoreState> { 283 @Override 284 public RMStateStoreState transition(RMStateStore store, 285 RMStateStoreEvent event) { 286 if (!(event instanceof RMStateStoreRMDTEvent)) { 287 // should never happen 288 LOG.error("Illegal event type: " + event.getClass()); 289 
return RMStateStoreState.ACTIVE; 290 } 291 boolean isFenced = false; 292 RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event; 293 try { 294 LOG.info("Removing RMDelegationToken and SequenceNumber"); 295 store.removeRMDelegationTokenState(dtEvent.getRmDTIdentifier()); 296 } catch (Exception e) { 297 LOG.error("Error While Removing RMDelegationToken and SequenceNumber ", 298 e); 299 isFenced = store.notifyStoreOperationFailedInternal(e); 300 } 301 return finalState(isFenced); 302 } 303 } 304 305 private static class UpdateRMDTTransition implements 306 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 307 RMStateStoreState> { 308 @Override 309 public RMStateStoreState transition(RMStateStore store, 310 RMStateStoreEvent event) { 311 if (!(event instanceof RMStateStoreRMDTEvent)) { 312 // should never happen 313 LOG.error("Illegal event type: " + event.getClass()); 314 return RMStateStoreState.ACTIVE; 315 } 316 boolean isFenced = false; 317 RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event; 318 try { 319 LOG.info("Updating RMDelegationToken and SequenceNumber"); 320 store.updateRMDelegationTokenState( 321 dtEvent.getRmDTIdentifier(), dtEvent.getRenewDate()); 322 } catch (Exception e) { 323 LOG.error("Error While Updating RMDelegationToken and SequenceNumber ", 324 e); 325 isFenced = store.notifyStoreOperationFailedInternal(e); 326 } 327 return finalState(isFenced); 328 } 329 } 330 331 private static class StoreRMDTMasterKeyTransition implements 332 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 333 RMStateStoreState> { 334 @Override 335 public RMStateStoreState transition(RMStateStore store, 336 RMStateStoreEvent event) { 337 if (!(event instanceof RMStateStoreRMDTMasterKeyEvent)) { 338 // should never happen 339 LOG.error("Illegal event type: " + event.getClass()); 340 return RMStateStoreState.ACTIVE; 341 } 342 boolean isFenced = false; 343 RMStateStoreRMDTMasterKeyEvent dtEvent = 344 (RMStateStoreRMDTMasterKeyEvent) event; 345 try { 
346 LOG.info("Storing RMDTMasterKey."); 347 store.storeRMDTMasterKeyState(dtEvent.getDelegationKey()); 348 } catch (Exception e) { 349 LOG.error("Error While Storing RMDTMasterKey.", e); 350 isFenced = store.notifyStoreOperationFailedInternal(e); 351 } 352 return finalState(isFenced); 353 } 354 } 355 356 private static class RemoveRMDTMasterKeyTransition implements 357 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 358 RMStateStoreState> { 359 @Override 360 public RMStateStoreState transition(RMStateStore store, 361 RMStateStoreEvent event) { 362 if (!(event instanceof RMStateStoreRMDTMasterKeyEvent)) { 363 // should never happen 364 LOG.error("Illegal event type: " + event.getClass()); 365 return RMStateStoreState.ACTIVE; 366 } 367 boolean isFenced = false; 368 RMStateStoreRMDTMasterKeyEvent dtEvent = 369 (RMStateStoreRMDTMasterKeyEvent) event; 370 try { 371 LOG.info("Removing RMDTMasterKey."); 372 store.removeRMDTMasterKeyState(dtEvent.getDelegationKey()); 373 } catch (Exception e) { 374 LOG.error("Error While Removing RMDTMasterKey.", e); 375 isFenced = store.notifyStoreOperationFailedInternal(e); 376 } 377 return finalState(isFenced); 378 } 379 } 380 381 private static class StoreOrUpdateAMRMTokenTransition implements 382 MultipleArcTransition<RMStateStore, RMStateStoreEvent, 383 RMStateStoreState> { 384 @Override 385 public RMStateStoreState transition(RMStateStore store, 386 RMStateStoreEvent event) { 387 if (!(event instanceof RMStateStoreAMRMTokenEvent)) { 388 // should never happen 389 LOG.error("Illegal event type: " + event.getClass()); 390 return RMStateStoreState.ACTIVE; 391 } 392 RMStateStoreAMRMTokenEvent amrmEvent = (RMStateStoreAMRMTokenEvent) event; 393 boolean isFenced = false; 394 try { 395 LOG.info("Updating AMRMToken"); 396 store.storeOrUpdateAMRMTokenSecretManagerState( 397 amrmEvent.getAmrmTokenSecretManagerState(), amrmEvent.isUpdate()); 398 } catch (Exception e) { 399 LOG.error("Error storing info for AMRMTokenSecretManager", e); 
400 isFenced = store.notifyStoreOperationFailedInternal(e); 401 } 402 return finalState(isFenced); 403 } 404 } 405 406 private static RMStateStoreState finalState(boolean isFenced) { 407 return isFenced ? RMStateStoreState.FENCED : RMStateStoreState.ACTIVE; 408 } 409 410 public RMStateStore() { 411 super(RMStateStore.class.getName()); 412 ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); 413 this.readLock = lock.readLock(); 414 this.writeLock = lock.writeLock(); 415 stateMachine = stateMachineFactory.make(this); 416 } 417 418 public static class RMDTSecretManagerState { 419 // DTIdentifier -> renewDate 420 Map<RMDelegationTokenIdentifier, Long> delegationTokenState = 421 new HashMap<RMDelegationTokenIdentifier, Long>(); 422 423 Set<DelegationKey> masterKeyState = 424 new HashSet<DelegationKey>(); 425 426 int dtSequenceNumber = 0; 427 428 public Map<RMDelegationTokenIdentifier, Long> getTokenState() { 429 return delegationTokenState; 430 } 431 432 public Set<DelegationKey> getMasterKeyState() { 433 return masterKeyState; 434 } 435 436 public int getDTSequenceNumber() { 437 return dtSequenceNumber; 438 } 439 } 440 441 /** 442 * State of the ResourceManager 443 */ 444 public static class RMState { 445 Map<ApplicationId, ApplicationStateData> appState = 446 new TreeMap<ApplicationId, ApplicationStateData>(); 447 448 RMDTSecretManagerState rmSecretManagerState = new RMDTSecretManagerState(); 449 450 AMRMTokenSecretManagerState amrmTokenSecretManagerState = null; 451 452 public Map<ApplicationId, ApplicationStateData> getApplicationState() { 453 return appState; 454 } 455 456 public RMDTSecretManagerState getRMDTSecretManagerState() { 457 return rmSecretManagerState; 458 } 459 460 public AMRMTokenSecretManagerState getAMRMTokenSecretManagerState() { 461 return amrmTokenSecretManagerState; 462 } 463 } 464 465 private Dispatcher rmDispatcher; 466 467 /** 468 * Dispatcher used to send state operation completion events to 469 * ResourceManager services 470 */ 471 
public void setRMDispatcher(Dispatcher dispatcher) { 472 this.rmDispatcher = dispatcher; 473 } 474 475 AsyncDispatcher dispatcher; 476 477 @Override 478 protected void serviceInit(Configuration conf) throws Exception{ 479 // create async handler 480 dispatcher = new AsyncDispatcher(); 481 dispatcher.init(conf); 482 dispatcher.register(RMStateStoreEventType.class, 483 new ForwardingEventHandler()); 484 dispatcher.setDrainEventsOnStop(); 485 initInternal(conf); 486 } 487 488 @Override 489 protected void serviceStart() throws Exception { 490 dispatcher.start(); 491 startInternal(); 492 } 493 494 /** 495 * Derived classes initialize themselves using this method. 496 */ 497 protected abstract void initInternal(Configuration conf) throws Exception; 498 499 /** 500 * Derived classes start themselves using this method. 501 * The base class is started and the event dispatcher is ready to use at 502 * this point 503 */ 504 protected abstract void startInternal() throws Exception; 505 506 @Override 507 protected void serviceStop() throws Exception { 508 dispatcher.stop(); 509 closeInternal(); 510 } 511 512 /** 513 * Derived classes close themselves using this method. 514 * The base class will be closed and the event dispatcher will be shutdown 515 * after this 516 */ 517 protected abstract void closeInternal() throws Exception; 518 519 /** 520 * 1) Versioning scheme: major.minor. For e.g. 1.0, 1.1, 1.2...1.25, 2.0 etc. 521 * 2) Any incompatible change of state-store is a major upgrade, and any 522 * compatible change of state-store is a minor upgrade. 523 * 3) If theres's no version, treat it as CURRENT_VERSION_INFO. 524 * 4) Within a minor upgrade, say 1.1 to 1.2: 525 * overwrite the version info and proceed as normal. 526 * 5) Within a major upgrade, say 1.2 to 2.0: 527 * throw exception and indicate user to use a separate upgrade tool to 528 * upgrade RM state. 
529 */ 530 public void checkVersion() throws Exception { 531 Version loadedVersion = loadVersion(); 532 LOG.info("Loaded RM state version info " + loadedVersion); 533 if (loadedVersion != null && loadedVersion.equals(getCurrentVersion())) { 534 return; 535 } 536 // if there is no version info, treat it as CURRENT_VERSION_INFO; 537 if (loadedVersion == null) { 538 loadedVersion = getCurrentVersion(); 539 } 540 if (loadedVersion.isCompatibleTo(getCurrentVersion())) { 541 LOG.info("Storing RM state version info " + getCurrentVersion()); 542 storeVersion(); 543 } else { 544 throw new RMStateVersionIncompatibleException( 545 "Expecting RM state version " + getCurrentVersion() 546 + ", but loading version " + loadedVersion); 547 } 548 } 549 550 /** 551 * Derived class use this method to load the version information from state 552 * store. 553 */ 554 protected abstract Version loadVersion() throws Exception; 555 556 /** 557 * Derived class use this method to store the version information. 558 */ 559 protected abstract void storeVersion() throws Exception; 560 561 /** 562 * Get the current version of the underlying state store. 563 */ 564 protected abstract Version getCurrentVersion(); 565 566 567 /** 568 * Get the current epoch of RM and increment the value. 
569 */ 570 public abstract long getAndIncrementEpoch() throws Exception; 571 572 /** 573 * Blocking API 574 * The derived class must recover state from the store and return a new 575 * RMState object populated with that state 576 * This must not be called on the dispatcher thread 577 */ 578 public abstract RMState loadState() throws Exception; 579 580 /** 581 * Non-Blocking API 582 * ResourceManager services use this to store the application's state 583 * This does not block the dispatcher threads 584 * RMAppStoredEvent will be sent on completion to notify the RMApp 585 */ 586 @SuppressWarnings("unchecked") 587 public void storeNewApplication(RMApp app) { 588 ApplicationSubmissionContext context = app 589 .getApplicationSubmissionContext(); 590 assert context instanceof ApplicationSubmissionContextPBImpl; 591 ApplicationStateData appState = 592 ApplicationStateData.newInstance( 593 app.getSubmitTime(), app.getStartTime(), context, app.getUser()); 594 dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState)); 595 } 596 597 @SuppressWarnings("unchecked") 598 public void updateApplicationState( 599 ApplicationStateData appState) { 600 dispatcher.getEventHandler().handle(new RMStateUpdateAppEvent(appState)); 601 } 602 603 public void updateFencedState() { 604 handleStoreEvent(new RMStateStoreEvent(RMStateStoreEventType.FENCED)); 605 } 606 607 /** 608 * Blocking API 609 * Derived classes must implement this method to store the state of an 610 * application. 
611 */ 612 protected abstract void storeApplicationStateInternal(ApplicationId appId, 613 ApplicationStateData appStateData) throws Exception; 614 615 protected abstract void updateApplicationStateInternal(ApplicationId appId, 616 ApplicationStateData appStateData) throws Exception; 617 618 @SuppressWarnings("unchecked") 619 /** 620 * Non-blocking API 621 * ResourceManager services call this to store state on an application attempt 622 * This does not block the dispatcher threads 623 * RMAppAttemptStoredEvent will be sent on completion to notify the RMAppAttempt 624 */ 625 public void storeNewApplicationAttempt(RMAppAttempt appAttempt) { 626 Credentials credentials = getCredentialsFromAppAttempt(appAttempt); 627 628 AggregateAppResourceUsage resUsage = 629 appAttempt.getRMAppAttemptMetrics().getAggregateAppResourceUsage(); 630 ApplicationAttemptStateData attemptState = 631 ApplicationAttemptStateData.newInstance( 632 appAttempt.getAppAttemptId(), 633 appAttempt.getMasterContainer(), 634 credentials, appAttempt.getStartTime(), 635 resUsage.getMemorySeconds(), 636 resUsage.getVcoreSeconds()); 637 638 dispatcher.getEventHandler().handle( 639 new RMStateStoreAppAttemptEvent(attemptState)); 640 } 641 642 @SuppressWarnings("unchecked") 643 public void updateApplicationAttemptState( 644 ApplicationAttemptStateData attemptState) { 645 dispatcher.getEventHandler().handle( 646 new RMStateUpdateAppAttemptEvent(attemptState)); 647 } 648 649 /** 650 * Blocking API 651 * Derived classes must implement this method to store the state of an 652 * application attempt 653 */ 654 protected abstract void storeApplicationAttemptStateInternal( 655 ApplicationAttemptId attemptId, 656 ApplicationAttemptStateData attemptStateData) throws Exception; 657 658 protected abstract void updateApplicationAttemptStateInternal( 659 ApplicationAttemptId attemptId, 660 ApplicationAttemptStateData attemptStateData) throws Exception; 661 662 /** 663 * RMDTSecretManager call this to store the state of a 
delegation token 664 * and sequence number 665 */ 666 public void storeRMDelegationToken( 667 RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) { 668 handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, renewDate, 669 RMStateStoreEventType.STORE_DELEGATION_TOKEN)); 670 } 671 672 /** 673 * Blocking API 674 * Derived classes must implement this method to store the state of 675 * RMDelegationToken and sequence number 676 */ 677 protected abstract void storeRMDelegationTokenState( 678 RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) 679 throws Exception; 680 681 /** 682 * RMDTSecretManager call this to remove the state of a delegation token 683 */ 684 public void removeRMDelegationToken( 685 RMDelegationTokenIdentifier rmDTIdentifier) { 686 handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, null, 687 RMStateStoreEventType.REMOVE_DELEGATION_TOKEN)); 688 } 689 690 /** 691 * Blocking API 692 * Derived classes must implement this method to remove the state of RMDelegationToken 693 */ 694 protected abstract void removeRMDelegationTokenState( 695 RMDelegationTokenIdentifier rmDTIdentifier) throws Exception; 696 697 /** 698 * RMDTSecretManager call this to update the state of a delegation token 699 * and sequence number 700 */ 701 public void updateRMDelegationToken( 702 RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) { 703 handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, renewDate, 704 RMStateStoreEventType.UPDATE_DELEGATION_TOKEN)); 705 } 706 707 /** 708 * Blocking API 709 * Derived classes must implement this method to update the state of 710 * RMDelegationToken and sequence number 711 */ 712 protected abstract void updateRMDelegationTokenState( 713 RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) 714 throws Exception; 715 716 /** 717 * RMDTSecretManager call this to store the state of a master key 718 */ 719 public void storeRMDTMasterKey(DelegationKey delegationKey) { 720 handleStoreEvent(new 
RMStateStoreRMDTMasterKeyEvent(delegationKey, 721 RMStateStoreEventType.STORE_MASTERKEY)); 722 } 723 724 /** 725 * Blocking API 726 * Derived classes must implement this method to store the state of 727 * DelegationToken Master Key 728 */ 729 protected abstract void storeRMDTMasterKeyState(DelegationKey delegationKey) 730 throws Exception; 731 732 /** 733 * RMDTSecretManager call this to remove the state of a master key 734 */ 735 public void removeRMDTMasterKey(DelegationKey delegationKey) { 736 handleStoreEvent(new RMStateStoreRMDTMasterKeyEvent(delegationKey, 737 RMStateStoreEventType.REMOVE_MASTERKEY)); 738 } 739 740 /** 741 * Blocking API 742 * Derived classes must implement this method to remove the state of 743 * DelegationToken Master Key 744 */ 745 protected abstract void removeRMDTMasterKeyState(DelegationKey delegationKey) 746 throws Exception; 747 748 /** 749 * Blocking API Derived classes must implement this method to store or update 750 * the state of AMRMToken Master Key 751 */ 752 protected abstract void storeOrUpdateAMRMTokenSecretManagerState( 753 AMRMTokenSecretManagerState amrmTokenSecretManagerState, boolean isUpdate) 754 throws Exception; 755 756 /** 757 * Store or Update state of AMRMToken Master Key 758 */ 759 public void storeOrUpdateAMRMTokenSecretManager( 760 AMRMTokenSecretManagerState amrmTokenSecretManagerState, boolean isUpdate) { 761 handleStoreEvent(new RMStateStoreAMRMTokenEvent( 762 amrmTokenSecretManagerState, isUpdate, 763 RMStateStoreEventType.UPDATE_AMRM_TOKEN)); 764 } 765 766 /** 767 * Non-blocking API 768 * ResourceManager services call this to remove an application from the state 769 * store 770 * This does not block the dispatcher threads 771 * There is no notification of completion for this operation. 
772 */ 773 @SuppressWarnings("unchecked") 774 public void removeApplication(RMApp app) { 775 ApplicationStateData appState = 776 ApplicationStateData.newInstance( 777 app.getSubmitTime(), app.getStartTime(), 778 app.getApplicationSubmissionContext(), app.getUser()); 779 for(RMAppAttempt appAttempt : app.getAppAttempts().values()) { 780 appState.attempts.put(appAttempt.getAppAttemptId(), null); 781 } 782 783 dispatcher.getEventHandler().handle(new RMStateStoreRemoveAppEvent(appState)); 784 } 785 786 /** 787 * Blocking API 788 * Derived classes must implement this method to remove the state of an 789 * application and its attempts 790 */ 791 protected abstract void removeApplicationStateInternal( 792 ApplicationStateData appState) throws Exception; 793 794 // TODO: This should eventually become cluster-Id + "AM_RM_TOKEN_SERVICE". See 795 // YARN-1779 796 public static final Text AM_RM_TOKEN_SERVICE = new Text( 797 "AM_RM_TOKEN_SERVICE"); 798 799 public static final Text AM_CLIENT_TOKEN_MASTER_KEY_NAME = 800 new Text("YARN_CLIENT_TOKEN_MASTER_KEY"); 801 802 public Credentials getCredentialsFromAppAttempt(RMAppAttempt appAttempt) { 803 Credentials credentials = new Credentials(); 804 805 SecretKey clientTokenMasterKey = 806 appAttempt.getClientTokenMasterKey(); 807 if(clientTokenMasterKey != null){ 808 credentials.addSecretKey(AM_CLIENT_TOKEN_MASTER_KEY_NAME, 809 clientTokenMasterKey.getEncoded()); 810 } 811 return credentials; 812 } 813 814 @VisibleForTesting 815 protected boolean isFencedState() { 816 return (RMStateStoreState.FENCED == getRMStateStoreState()); 817 } 818 819 // Dispatcher related code 820 protected void handleStoreEvent(RMStateStoreEvent event) { 821 this.writeLock.lock(); 822 try { 823 824 if (LOG.isDebugEnabled()) { 825 LOG.debug("Processing event of type " + event.getType()); 826 } 827 828 final RMStateStoreState oldState = getRMStateStoreState(); 829 830 this.stateMachine.doTransition(event.getType(), event); 831 832 if (oldState != 
getRMStateStoreState()) { 833 LOG.info("RMStateStore state change from " + oldState + " to " 834 + getRMStateStoreState()); 835 } 836 837 } catch (InvalidStateTransitonException e) { 838 LOG.error("Can't handle this event at current state", e); 839 } finally { 840 this.writeLock.unlock(); 841 } 842 } 843 844 /** 845 * This method is called to notify the ResourceManager that the store 846 * operation has failed. 847 * @param failureCause the exception due to which the operation failed 848 */ 849 protected void notifyStoreOperationFailed(Exception failureCause) { 850 if (isFencedState()) { 851 return; 852 } 853 if (notifyStoreOperationFailedInternal(failureCause)) { 854 updateFencedState(); 855 } 856 } 857 858 @SuppressWarnings("unchecked") 859 private boolean notifyStoreOperationFailedInternal( 860 Exception failureCause) { 861 boolean isFenced = false; 862 LOG.error("State store operation failed ", failureCause); 863 if (HAUtil.isHAEnabled(getConfig())) { 864 LOG.warn("State-store fenced ! 
Transitioning RM to standby"); 865 isFenced = true; 866 Thread standByTransitionThread = 867 new Thread(new StandByTransitionThread()); 868 standByTransitionThread.setName("StandByTransitionThread Handler"); 869 standByTransitionThread.start(); 870 } else if (YarnConfiguration.shouldRMFailFast(getConfig())) { 871 LOG.fatal("Fail RM now due to state-store error!"); 872 rmDispatcher.getEventHandler().handle( 873 new RMFatalEvent(RMFatalEventType.STATE_STORE_OP_FAILED, 874 failureCause)); 875 } else { 876 LOG.warn("Skip the state-store error."); 877 } 878 return isFenced; 879 } 880 881 @SuppressWarnings("unchecked") 882 /** 883 * This method is called to notify the application that 884 * new application is stored or updated in state store 885 * @param event App event containing the app id and event type 886 */ 887 private void notifyApplication(RMAppEvent event) { 888 rmDispatcher.getEventHandler().handle(event); 889 } 890 891 @SuppressWarnings("unchecked") 892 /** 893 * This method is called to notify the application attempt 894 * that new attempt is stored or updated in state store 895 * @param event App attempt event containing the app attempt 896 * id and event type 897 */ 898 private void notifyApplicationAttempt(RMAppAttemptEvent event) { 899 rmDispatcher.getEventHandler().handle(event); 900 } 901 902 /** 903 * EventHandler implementation which forward events to the FSRMStateStore 904 * This hides the EventHandle methods of the store from its public interface 905 */ 906 private final class ForwardingEventHandler 907 implements EventHandler<RMStateStoreEvent> { 908 909 @Override 910 public void handle(RMStateStoreEvent event) { 911 handleStoreEvent(event); 912 } 913 } 914 915 /** 916 * Derived classes must implement this method to delete the state store 917 * @throws Exception 918 */ 919 public abstract void deleteStore() throws Exception; 920 921 public void setResourceManager(ResourceManager rm) { 922 this.resourceManager = rm; 923 } 924 925 private class 
StandByTransitionThread implements Runnable { 926 @Override 927 public void run() { 928 LOG.info("RMStateStore has been fenced"); 929 resourceManager.handleTransitionToStandBy(); 930 } 931 } 932 933 public RMStateStoreState getRMStateStoreState() { 934 this.readLock.lock(); 935 try { 936 return this.stateMachine.getCurrentState(); 937 } finally { 938 this.readLock.unlock(); 939 } 940 } 941 }
RMStateStore.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
Inside storeNewApplication(), the call dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState)) passes a new RMStateStoreAppEvent(appState) to handle(). Its constructor creates an event of type RMStateStoreEventType.STORE_APP:
public class RMStateStoreAppEvent extends RMStateStoreEvent {

  private final ApplicationStateData appState;

  public RMStateStoreAppEvent(ApplicationStateData appState) {
    super(RMStateStoreEventType.STORE_APP);
    this.appState = appState;
  }

  public ApplicationStateData getAppState() {
    return appState;
  }
}
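The dispatcher here is RMStateStore's own member, and only one event type, RMStateStoreEventType, is registered on it at initialization. In RMStateStore's serviceInit(), the call dispatcher.register(RMStateStoreEventType.class, new ForwardingEventHandler()) binds that event type to ForwardingEventHandler, an inner class of RMStateStore. Its handle() calls handleStoreEvent(), which in turn calls this.stateMachine.doTransition(event.getType(), event). As before, this ends up in StateMachineFactory; because this transition is registered with a set of post states, it goes through the inner class MultipleInternalArc rather than SingleInternalArc. Note that this state machine factory is a member of RMStateStore, whereas the earlier one belongs to RMAppImpl, and the two register different events.
Near the top of RMStateStore we can see .addTransition(RMStateStoreState.ACTIVE, EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED), RMStateStoreEventType.STORE_APP, new StoreAppTransition()): on a RMStateStoreEventType.STORE_APP event the store leaves RMStateStoreState.ACTIVE and ends in either ACTIVE or FENCED, depending on what the transition returns. As far as I can tell (I am not entirely sure), the main job of this step is to persist RMAppImpl's current information so that it can be recovered when the RM restarts. See the ResourceManager restart process for details.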
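The registration and forwarding path described above can be sketched roughly as follows. This is a simplified illustration, not the Hadoop code: every Sketch* name is hypothetical, and dispatch here is synchronous, whereas AsyncDispatcher queues events and handles them on its own thread. The one detail taken from the real dispatcher is that the handler is looked up by the enum class of the event type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for Event and EventHandler (not the Hadoop types).
interface SketchEvent { Enum<?> getType(); }

interface SketchHandler { void handle(SketchEvent event); }

enum SketchStoreEventType { STORE_APP, UPDATE_APP }

final class SketchStoreEvent implements SketchEvent {
  private final SketchStoreEventType type;
  SketchStoreEvent(SketchStoreEventType type) { this.type = type; }
  @Override public Enum<?> getType() { return type; }
}

// Assumption: synchronous dispatch, to keep the sketch short.
final class SketchDispatcher {
  private final Map<Class<?>, SketchHandler> handlers = new HashMap<>();

  // One handler is bound to a whole event-type enum, not to a single constant.
  void register(Class<?> eventTypeClass, SketchHandler handler) {
    handlers.put(eventTypeClass, handler);
  }

  void dispatch(SketchEvent event) {
    SketchHandler handler = handlers.get(event.getType().getDeclaringClass());
    if (handler == null) {
      throw new IllegalStateException("No handler for " + event.getType());
    }
    handler.handle(event);
  }
}

final class SketchStore {
  final List<String> handled = new ArrayList<>();
  private final SketchDispatcher dispatcher = new SketchDispatcher();

  SketchStore() {
    // Plays the role of serviceInit(): bind the enum class to the forwarder.
    dispatcher.register(SketchStoreEventType.class, new ForwardingHandler());
  }

  void storeNewApplication() {
    dispatcher.dispatch(new SketchStoreEvent(SketchStoreEventType.STORE_APP));
  }

  // The real handleStoreEvent() drives the state machine; here we only record.
  private void handleStoreEvent(SketchEvent event) {
    handled.add(event.getType().name());
  }

  // Keeps handle() out of the store's public interface, like ForwardingEventHandler.
  private final class ForwardingHandler implements SketchHandler {
    @Override public void handle(SketchEvent event) { handleStoreEvent(event); }
  }
}
```

Calling new SketchStore().storeNewApplication() routes a STORE_APP event through the forwarder into handleStoreEvent(), which is the same shape as the chain traced above.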
The RMStateStoreEventType.STORE_APP event is bound to the class StoreAppTransition. Let us trace the addTransition() function, shown below:
//StateMachineFactory.java
public StateMachineFactory
    <OPERAND, STATE, EVENTTYPE, EVENT>
    addTransition(STATE preState, Set<STATE> postStates,
        EVENTTYPE eventType,
        MultipleArcTransition<OPERAND, EVENT, STATE> hook){
  return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
      (this,
       new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
         (preState, eventType, new MultipleInternalArc(postStates, hook)));
}
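In the end, MultipleInternalArc, an inner class of StateMachineFactory, calls hook.transition(operand, event), which is a method of the MultipleArcTransition interface. Now look at the bound class StoreAppTransition: it is an inner class of RMStateStore and implements MultipleArcTransition, so the method that finally runs is transition() of RMStateStore's inner class StoreAppTransition.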
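To make the difference from a single-arc transition concrete, here is a minimal sketch of a multiple-arc transition. It is an illustration under simplifying assumptions, not the StateMachineFactory code: the Sketch* names are hypothetical, and the hook here fences on any simulated failure, whereas the real store only fences when HA is enabled. The point is that the hook itself returns the post state, and the arc checks that state against the allowed set.

```java
import java.util.EnumSet;
import java.util.Set;

enum SketchStoreState { ACTIVE, FENCED }

// Hypothetical counterpart of MultipleArcTransition: the hook picks the post state.
interface SketchMultipleArcTransition<OPERAND, EVENT, STATE extends Enum<STATE>> {
  STATE transition(OPERAND operand, EVENT event);
}

// Hypothetical counterpart of MultipleInternalArc.
final class SketchMultipleArc<OPERAND, EVENT, STATE extends Enum<STATE>> {
  private final Set<STATE> validPostStates;
  private final SketchMultipleArcTransition<OPERAND, EVENT, STATE> hook;

  SketchMultipleArc(Set<STATE> validPostStates,
      SketchMultipleArcTransition<OPERAND, EVENT, STATE> hook) {
    this.validPostStates = validPostStates;
    this.hook = hook;
  }

  STATE doTransition(OPERAND operand, EVENT event) {
    STATE postState = hook.transition(operand, event);
    // The returned state must be one of the states declared in addTransition().
    if (!validPostStates.contains(postState)) {
      throw new IllegalStateException("Invalid post state: " + postState);
    }
    return postState;
  }
}

final class SketchStoreAppDemo {
  // writeFails simulates an exception while storing the application state.
  static SketchStoreState storeApp(final boolean writeFails) {
    SketchMultipleArc<StringBuilder, String, SketchStoreState> arc =
        new SketchMultipleArc<StringBuilder, String, SketchStoreState>(
            EnumSet.of(SketchStoreState.ACTIVE, SketchStoreState.FENCED),
            (log, appId) -> {
              if (writeFails) {
                log.append("error storing ").append(appId);
                return SketchStoreState.FENCED;
              }
              log.append("stored ").append(appId);
              return SketchStoreState.ACTIVE;
            });
    return arc.doTransition(new StringBuilder(), "app_1");
  }
}
```

With this shape, SketchStoreAppDemo.storeApp(false) stays in ACTIVE and storeApp(true) moves to FENCED, which mirrors the finalState(isFenced) pattern used by the transitions in RMStateStore.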
Inside that function, store.notifyApplication(new RMAppEvent(appId, RMAppEventType.APP_NEW_SAVED)) is called, and notifyApplication() in turn calls rmDispatcher.getEventHandler().handle(event), which sends an RMAppEvent of type RMAppEventType.APP_NEW_SAVED to the central dispatcher. In ResourceManager, events of type RMAppEventType are bound to ApplicationEventDispatcher, an inner class of ResourceManager. Its handle() calls rmApp.handle(event), which is ultimately RMAppImpl's handle(). So, just like the earlier RMAppEventType.START event, this event is processed by RMAppImpl.
Inside RMAppImpl there is .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED, RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition()). When RMAppImpl receives the RMAppEventType.APP_NEW_SAVED event, it changes its own state from NEW_SAVING to SUBMITTED. The callback class is AddApplicationToSchedulerTransition, an inner class of RMAppImpl, whose transition() calls app.handler.handle(new AppAddedSchedulerEvent(app.applicationId, app.submissionContext.getQueue(), app.user, app.submissionContext.getReservationID())), as shown below:
//RMAppImpl.java
private static final class AddApplicationToSchedulerTransition extends
    RMAppTransition {
  @Override
  public void transition(RMAppImpl app, RMAppEvent event) {
    app.handler.handle(new AppAddedSchedulerEvent(app.applicationId,
        app.submissionContext.getQueue(), app.user,
        app.submissionContext.getReservationID()));
  }
}
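For contrast with the multiple-arc case, the NEW_SAVING to SUBMITTED step is a single-arc transition: the post state is fixed, and the hook only performs a side effect. A rough sketch of that behaviour follows. It assumes a hand-written if statement in place of the state machine table, and SketchApp, its fields, and the string form of the scheduler event are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

enum SketchAppState { NEW_SAVING, SUBMITTED }

enum SketchAppEventType { APP_NEW_SAVED }

// Hypothetical stand-in for RMAppImpl; only one transition is modelled.
final class SketchApp {
  SketchAppState state = SketchAppState.NEW_SAVING;
  // Stands in for events handed to the scheduler through app.handler.
  final List<String> schedulerEvents = new ArrayList<>();
  private final String applicationId;
  private final String queue;
  private final String user;

  SketchApp(String applicationId, String queue, String user) {
    this.applicationId = applicationId;
    this.queue = queue;
    this.user = user;
  }

  void handle(SketchAppEventType eventType) {
    if (state == SketchAppState.NEW_SAVING
        && eventType == SketchAppEventType.APP_NEW_SAVED) {
      // Hook first (the side effect), then the fixed post state is applied.
      schedulerEvents.add(
          "APP_ADDED " + applicationId + " queue=" + queue + " user=" + user);
      state = SketchAppState.SUBMITTED;
    } else {
      throw new IllegalStateException(
          "Cannot handle " + eventType + " in state " + state);
    }
  }
}
```

Here the hook cannot influence the resulting state; it only emits the APP_ADDED notification, which matches what AddApplicationToSchedulerTransition does.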
The constructors of the AppAddedSchedulerEvent class are as follows:
//AppAddedSchedulerEvent.java
public AppAddedSchedulerEvent(ApplicationId applicationId, String queue,
    String user, ReservationId reservationID) {
  this(applicationId, queue, user, false, reservationID);
}

public AppAddedSchedulerEvent(ApplicationId applicationId, String queue,
    String user, boolean isAppRecovering, ReservationId reservationID) {
  super(SchedulerEventType.APP_ADDED);
  this.applicationId = applicationId;
  this.queue = queue;
  this.user = user;
  this.reservationID = reservationID;
  this.isAppRecovering = isAppRecovering;
}
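After the scheduler receives the SchedulerEventType.APP_ADDED event, it first performs a permission check, then saves the application's information in its internal data structures, and sends an APP_ACCEPTED event to RMAppImpl. The details are as follows: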
public enum SchedulerEventType {

  // Source: Node
  NODE_ADDED,
  NODE_REMOVED,
  NODE_UPDATE,
  NODE_RESOURCE_UPDATE,
  NODE_LABELS_UPDATE,

  // Source: RMApp
  APP_ADDED,
  APP_REMOVED,

  // Source: RMAppAttempt
  APP_ATTEMPT_ADDED,
  APP_ATTEMPT_REMOVED,

  // Source: ContainerAllocationExpirer
  CONTAINER_EXPIRED,

  // Source: RMContainer
  CONTAINER_RESCHEDULED,

  // Source: SchedulingEditPolicy
  DROP_RESERVATION,
  PREEMPT_CONTAINER,
  KILL_CONTAINER
}
SchedulerEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
The newly created event class AppAddedSchedulerEvent calls super(SchedulerEventType.APP_ADDED), so its event type is SchedulerEventType.APP_ADDED. In ResourceManager, events of type SchedulerEventType are bound to schedulerDispatcher, an object of type EventHandler<SchedulerEvent>. The binding happens in serviceInit() of RMActiveServices, an inner class of ResourceManager, as shown below:
//Binding of SchedulerEventType events in serviceInit() of RMActiveServices, an inner class of ResourceManager
schedulerDispatcher = createSchedulerEventDispatcher();
addIfService(schedulerDispatcher);
rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);
Let us step into createSchedulerEventDispatcher(), shown below:
//ResourceManager.java
protected EventHandler<SchedulerEvent> createSchedulerEventDispatcher() {
  return new SchedulerEventDispatcher(this.scheduler);
}
Here we see that the scheduler passed in is this.scheduler, which is initialized just before the SchedulerEventType events are bound, as shown below:
//serviceInit() of RMActiveServices, an inner class of ResourceManager
// Initialize the scheduler
scheduler = createScheduler();
scheduler.setRMContext(rmContext);
addIfService(scheduler);
rmContext.setScheduler(scheduler);

schedulerDispatcher = createSchedulerEventDispatcher();
addIfService(schedulerDispatcher);
rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

// Register event handler for RmAppEvents
// RMAppManager posts an RMAppEvent of type RMAppEventType.START to the
// async dispatcher, so it is handled by ApplicationEventDispatcher(rmContext)
rmDispatcher.register(RMAppEventType.class,
    new ApplicationEventDispatcher(rmContext));

// Register event handler for RmAppAttemptEvents
rmDispatcher.register(RMAppAttemptEventType.class,
    new ApplicationAttemptEventDispatcher(rmContext));
Next, step into createScheduler(), shown below:
//ResourceManager.java
protected ResourceScheduler createScheduler() {
  String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER,
      YarnConfiguration.DEFAULT_RM_SCHEDULER);
  LOG.info("Using Scheduler: " + schedulerClassName);
  try {
    Class<?> schedulerClazz = Class.forName(schedulerClassName);
    if (ResourceScheduler.class.isAssignableFrom(schedulerClazz)) {
      return (ResourceScheduler) ReflectionUtils.newInstance(schedulerClazz,
          this.conf);
    } else {
      throw new YarnRuntimeException("Class: " + schedulerClassName
          + " not instance of " + ResourceScheduler.class.getCanonicalName());
    }
  } catch (ClassNotFoundException e) {
    throw new YarnRuntimeException("Could not instantiate Scheduler: "
        + schedulerClassName, e);
  }
}
When no scheduler class is configured, the default YarnConfiguration.DEFAULT_RM_SCHEDULER is used. It is defined in YarnConfiguration.java with the value org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, as shown below:
//YarnConfiguration.java
/** The class to use as the resource scheduler.*/
public static final String RM_SCHEDULER =
    RM_PREFIX + "scheduler.class";

public static final String DEFAULT_RM_SCHEDULER =
    "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
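The pattern in createScheduler() (read a class name from configuration, fall back to a default, load it by reflection, and check that it implements the expected interface) can be sketched on its own. The sketch below assumes a plain Map in place of Hadoop's Configuration and plain reflection in place of ReflectionUtils.newInstance; the Sketch* classes and the sketch.scheduler.class key are hypothetical.

```java
import java.util.Map;

// Hypothetical stand-in for ResourceScheduler.
interface SketchScheduler { String name(); }

final class SketchCapacityScheduler implements SketchScheduler {
  public SketchCapacityScheduler() { }
  @Override public String name() { return "capacity"; }
}

final class SketchFifoScheduler implements SketchScheduler {
  public SketchFifoScheduler() { }
  @Override public String name() { return "fifo"; }
}

final class SketchSchedulerFactory {
  // Hypothetical key; the real one is YarnConfiguration.RM_SCHEDULER.
  static final String SCHEDULER_KEY = "sketch.scheduler.class";
  static final String DEFAULT_SCHEDULER = SketchCapacityScheduler.class.getName();

  static SketchScheduler create(Map<String, String> conf) {
    // Use the configured class name, or the default when the key is absent.
    String className = conf.getOrDefault(SCHEDULER_KEY, DEFAULT_SCHEDULER);
    try {
      Class<?> clazz = Class.forName(className);
      if (!SketchScheduler.class.isAssignableFrom(clazz)) {
        throw new IllegalArgumentException(
            "Class: " + className + " is not a " + SketchScheduler.class.getName());
      }
      return (SketchScheduler) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(
          "Could not instantiate scheduler: " + className, e);
    }
  }
}
```

With an empty map the factory returns the capacity sketch, and setting sketch.scheduler.class to the name of SketchFifoScheduler switches the implementation without any code change, which is the reason the real ResourceManager resolves its scheduler this way.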
Now step into the CapacityScheduler class, shown below:
1 @LimitedPrivate("yarn") 2 @Evolving 3 @SuppressWarnings("unchecked") 4 public class CapacityScheduler extends 5 AbstractYarnScheduler<FiCaSchedulerApp, FiCaSchedulerNode> implements 6 PreemptableResourceScheduler, CapacitySchedulerContext, Configurable { 7 8 private static final Log LOG = LogFactory.getLog(CapacityScheduler.class); 9 private YarnAuthorizationProvider authorizer; 10 11 private CSQueue root; 12 // timeout to join when we stop this service 13 protected final long THREAD_JOIN_TIMEOUT_MS = 1000; 14 15 static final Comparator<CSQueue> queueComparator = new Comparator<CSQueue>() { 16 @Override 17 public int compare(CSQueue q1, CSQueue q2) { 18 if (q1.getUsedCapacity() < q2.getUsedCapacity()) { 19 return -1; 20 } else if (q1.getUsedCapacity() > q2.getUsedCapacity()) { 21 return 1; 22 } 23 24 return q1.getQueuePath().compareTo(q2.getQueuePath()); 25 } 26 }; 27 28 static final Comparator<FiCaSchedulerApp> applicationComparator = 29 new Comparator<FiCaSchedulerApp>() { 30 @Override 31 public int compare(FiCaSchedulerApp a1, FiCaSchedulerApp a2) { 32 return a1.getApplicationId().compareTo(a2.getApplicationId()); 33 } 34 }; 35 36 @Override 37 public void setConf(Configuration conf) { 38 yarnConf = conf; 39 } 40 41 private void validateConf(Configuration conf) { 42 // validate scheduler memory allocation setting 43 int minMem = conf.getInt( 44 YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 45 YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB); 46 int maxMem = conf.getInt( 47 YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB, 48 YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB); 49 50 if (minMem <= 0 || minMem > maxMem) { 51 throw new YarnRuntimeException("Invalid resource scheduler memory" 52 + " allocation configuration" 53 + ", " + YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB 54 + "=" + minMem 55 + ", " + YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB 56 + "=" + maxMem + ", min and max should be greater than 0" 
        + ", max should be no smaller than min.");
    }

    // validate scheduler vcores allocation setting
    int minVcores = conf.getInt(
      YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
    int maxVcores = conf.getInt(
      YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES);

    if (minVcores <= 0 || minVcores > maxVcores) {
      throw new YarnRuntimeException("Invalid resource scheduler vcores"
        + " allocation configuration"
        + ", " + YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES
        + "=" + minVcores
        + ", " + YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES
        + "=" + maxVcores + ", min and max should be greater than 0"
        + ", max should be no smaller than min.");
    }
  }

  @Override
  public Configuration getConf() {
    return yarnConf;
  }

  private CapacitySchedulerConfiguration conf;
  private Configuration yarnConf;

  private Map<String, CSQueue> queues = new ConcurrentHashMap<String, CSQueue>();

  private AtomicInteger numNodeManagers = new AtomicInteger(0);

  private ResourceCalculator calculator;
  private boolean usePortForNodeName;

  private boolean scheduleAsynchronously;
  private AsyncScheduleThread asyncSchedulerThread;
  private RMNodeLabelsManager labelManager;

  /**
   * EXPERT
   */
  private long asyncScheduleInterval;
  private static final String ASYNC_SCHEDULER_INTERVAL =
      CapacitySchedulerConfiguration.SCHEDULE_ASYNCHRONOUSLY_PREFIX
          + ".scheduling-interval-ms";
  private static final long DEFAULT_ASYNC_SCHEDULER_INTERVAL = 5;

  private boolean overrideWithQueueMappings = false;
  private List<QueueMapping> mappings = null;
  private Groups groups;

  @VisibleForTesting
  public synchronized String getMappedQueueForTest(String user)
      throws IOException {
    return getMappedQueue(user);
  }

  public CapacityScheduler() {
    super(CapacityScheduler.class.getName());
  }

  @Override
  public QueueMetrics getRootQueueMetrics() {
    return root.getMetrics();
  }

  public CSQueue getRootQueue() {
    return root;
  }

  @Override
  public CapacitySchedulerConfiguration getConfiguration() {
    return conf;
  }

  @Override
  public synchronized RMContainerTokenSecretManager
  getContainerTokenSecretManager() {
    return this.rmContext.getContainerTokenSecretManager();
  }

  @Override
  public Comparator<FiCaSchedulerApp> getApplicationComparator() {
    return applicationComparator;
  }

  @Override
  public ResourceCalculator getResourceCalculator() {
    return calculator;
  }

  @Override
  public Comparator<CSQueue> getQueueComparator() {
    return queueComparator;
  }

  @Override
  public int getNumClusterNodes() {
    return numNodeManagers.get();
  }

  @Override
  public synchronized RMContext getRMContext() {
    return this.rmContext;
  }

  @Override
  public synchronized void setRMContext(RMContext rmContext) {
    this.rmContext = rmContext;
  }

  private synchronized void initScheduler(Configuration configuration) throws
      IOException {
    this.conf = loadCapacitySchedulerConfiguration(configuration);
    validateConf(this.conf);
    this.minimumAllocation = this.conf.getMinimumAllocation();
    initMaximumResourceCapability(this.conf.getMaximumAllocation());
    this.calculator = this.conf.getResourceCalculator();
    this.usePortForNodeName = this.conf.getUsePortForNodeName();
    this.applications =
        new ConcurrentHashMap<ApplicationId,
            SchedulerApplication<FiCaSchedulerApp>>();
    this.labelManager = rmContext.getNodeLabelManager();
    authorizer = YarnAuthorizationProvider.getInstance(yarnConf);
    initializeQueues(this.conf);

    scheduleAsynchronously = this.conf.getScheduleAynschronously();
    asyncScheduleInterval =
        this.conf.getLong(ASYNC_SCHEDULER_INTERVAL,
            DEFAULT_ASYNC_SCHEDULER_INTERVAL);
    if (scheduleAsynchronously) {
      asyncSchedulerThread = new AsyncScheduleThread(this);
    }

    LOG.info("Initialized CapacityScheduler with " +
        "calculator=" + getResourceCalculator().getClass() + ", " +
        "minimumAllocation=<" + getMinimumResourceCapability() + ">, " +
        "maximumAllocation=<" + getMaximumResourceCapability() + ">, " +
        "asynchronousScheduling=" + scheduleAsynchronously + ", " +
        "asyncScheduleInterval=" + asyncScheduleInterval + "ms");
  }

  private synchronized void startSchedulerThreads() {
    if (scheduleAsynchronously) {
      Preconditions.checkNotNull(asyncSchedulerThread,
          "asyncSchedulerThread is null");
      asyncSchedulerThread.start();
    }
  }

  @Override
  public void serviceInit(Configuration conf) throws Exception {
    Configuration configuration = new Configuration(conf);
    super.serviceInit(conf);
    initScheduler(configuration);
  }

  @Override
  public void serviceStart() throws Exception {
    startSchedulerThreads();
    super.serviceStart();
  }

  @Override
  public void serviceStop() throws Exception {
    synchronized (this) {
      if (scheduleAsynchronously && asyncSchedulerThread != null) {
        asyncSchedulerThread.interrupt();
        asyncSchedulerThread.join(THREAD_JOIN_TIMEOUT_MS);
      }
    }
    super.serviceStop();
  }

  @Override
  public synchronized void
  reinitialize(Configuration conf, RMContext rmContext) throws IOException {
    Configuration configuration = new Configuration(conf);
    CapacitySchedulerConfiguration oldConf = this.conf;
    this.conf = loadCapacitySchedulerConfiguration(configuration);
    validateConf(this.conf);
    try {
      LOG.info("Re-initializing queues...");
      refreshMaximumAllocation(this.conf.getMaximumAllocation());
      reinitializeQueues(this.conf);
    } catch (Throwable t) {
      this.conf = oldConf;
      refreshMaximumAllocation(this.conf.getMaximumAllocation());
      throw new IOException("Failed to re-init queues", t);
    }
  }

  long getAsyncScheduleInterval() {
    return asyncScheduleInterval;
  }

  private final static Random random = new Random(System.currentTimeMillis());

  /**
   * Schedule on all nodes by starting at a random point.
   * @param cs
   */
  static void schedule(CapacityScheduler cs) {
    // First randomize the start point
    int current = 0;
    Collection<FiCaSchedulerNode> nodes = cs.getAllNodes().values();
    int start = random.nextInt(nodes.size());
    for (FiCaSchedulerNode node : nodes) {
      if (current++ >= start) {
        cs.allocateContainersToNode(node);
      }
    }
    // Now, just get everyone to be safe
    for (FiCaSchedulerNode node : nodes) {
      cs.allocateContainersToNode(node);
    }
    try {
      Thread.sleep(cs.getAsyncScheduleInterval());
    } catch (InterruptedException e) {}
  }

  static class AsyncScheduleThread extends Thread {

    private final CapacityScheduler cs;
    private AtomicBoolean runSchedules = new AtomicBoolean(false);

    public AsyncScheduleThread(CapacityScheduler cs) {
      this.cs = cs;
      setDaemon(true);
    }

    @Override
    public void run() {
      while (true) {
        if (!runSchedules.get()) {
          try {
            Thread.sleep(100);
          } catch (InterruptedException ie) {}
        } else {
          schedule(cs);
        }
      }
    }

    public void beginSchedule() {
      runSchedules.set(true);
    }

    public void suspendSchedule() {
      runSchedules.set(false);
    }

  }

  @Private
  public static final String ROOT_QUEUE =
    CapacitySchedulerConfiguration.PREFIX + CapacitySchedulerConfiguration.ROOT;

  static class QueueHook {
    public CSQueue hook(CSQueue queue) {
      return queue;
    }
  }
  private static final QueueHook noop = new QueueHook();

  private void initializeQueueMappings() throws IOException {
    overrideWithQueueMappings = conf.getOverrideWithQueueMappings();
    LOG.info("Initialized queue mappings, override: "
        + overrideWithQueueMappings);
    // Get new user/group mappings
    List<QueueMapping> newMappings = conf.getQueueMappings();
    //check if mappings refer to valid queues
    for (QueueMapping mapping : newMappings) {
      if (!mapping.queue.equals(CURRENT_USER_MAPPING) &&
          !mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
        CSQueue queue = queues.get(mapping.queue);
        if (queue == null || !(queue instanceof LeafQueue)) {
          throw new IOException(
              "mapping contains invalid or non-leaf queue " + mapping.queue);
        }
      }
    }
    //apply the new mappings since they are valid
    mappings = newMappings;
    // initialize groups if mappings are present
    if (mappings.size() > 0) {
      groups = new Groups(conf);
    }
  }

  @Lock(CapacityScheduler.class)
  private void initializeQueues(CapacitySchedulerConfiguration conf)
    throws IOException {

    root =
        parseQueue(this, conf, null, CapacitySchedulerConfiguration.ROOT,
            queues, queues, noop);
    labelManager.reinitializeQueueLabels(getQueueToLabels());
    LOG.info("Initialized root queue " + root);
    initializeQueueMappings();
    setQueueAcls(authorizer, queues);
  }

  @Lock(CapacityScheduler.class)
  private void reinitializeQueues(CapacitySchedulerConfiguration conf)
  throws IOException {
    // Parse new queues
    Map<String, CSQueue> newQueues = new HashMap<String, CSQueue>();
    CSQueue newRoot =
        parseQueue(this, conf, null, CapacitySchedulerConfiguration.ROOT,
            newQueues, queues, noop);

    // Ensure all existing queues are still present
    validateExistingQueues(queues, newQueues);

    // Add new queues
    addNewQueues(queues, newQueues);

    // Re-configure queues
    root.reinitialize(newRoot, clusterResource);
    initializeQueueMappings();

    // Re-calculate headroom for active applications
    root.updateClusterResource(clusterResource, new ResourceLimits(
        clusterResource));

    labelManager.reinitializeQueueLabels(getQueueToLabels());
    setQueueAcls(authorizer, queues);
  }

  @VisibleForTesting
  public static void setQueueAcls(YarnAuthorizationProvider authorizer,
      Map<String, CSQueue> queues) throws IOException {
    for (CSQueue queue : queues.values()) {
      AbstractCSQueue csQueue = (AbstractCSQueue) queue;
      authorizer.setPermission(csQueue.getPrivilegedEntity(),
        csQueue.getACLs(), UserGroupInformation.getCurrentUser());
    }
  }

  private Map<String, Set<String>> getQueueToLabels() {
    Map<String, Set<String>> queueToLabels = new HashMap<String, Set<String>>();
    for (CSQueue queue : queues.values()) {
      queueToLabels.put(queue.getQueueName(), queue.getAccessibleNodeLabels());
    }
    return queueToLabels;
  }

  /**
   * Ensure all existing queues are present. Queues cannot be deleted
   * @param queues existing queues
   * @param newQueues new queues
   */
  @Lock(CapacityScheduler.class)
  private void validateExistingQueues(
      Map<String, CSQueue> queues, Map<String, CSQueue> newQueues)
  throws IOException {
    // check that all static queues are included in the newQueues list
    for (Map.Entry<String, CSQueue> e : queues.entrySet()) {
      if (!(e.getValue() instanceof ReservationQueue)) {
        String queueName = e.getKey();
        CSQueue oldQueue = e.getValue();
        CSQueue newQueue = newQueues.get(queueName);
        if (null == newQueue) {
          throw new IOException(queueName + " cannot be found during refresh!");
        } else if (!oldQueue.getQueuePath().equals(newQueue.getQueuePath())) {
          throw new IOException(queueName + " is moved from:"
              + oldQueue.getQueuePath() + " to:" + newQueue.getQueuePath()
              + " after refresh, which is not allowed.");
        }
      }
    }
  }

  /**
   * Add the new queues (only) to our list of queues...
   * ... be careful, do not overwrite existing queues.
   * @param queues
   * @param newQueues
   */
  @Lock(CapacityScheduler.class)
  private void addNewQueues(
      Map<String, CSQueue> queues, Map<String, CSQueue> newQueues)
  {
    for (Map.Entry<String, CSQueue> e : newQueues.entrySet()) {
      String queueName = e.getKey();
      CSQueue queue = e.getValue();
      if (!queues.containsKey(queueName)) {
        queues.put(queueName, queue);
      }
    }
  }

  @Lock(CapacityScheduler.class)
  static CSQueue parseQueue(
      CapacitySchedulerContext csContext,
      CapacitySchedulerConfiguration conf,
      CSQueue parent, String queueName, Map<String, CSQueue> queues,
      Map<String, CSQueue> oldQueues,
      QueueHook hook) throws IOException {
    CSQueue queue;
    String fullQueueName =
        (parent == null) ? queueName
            : (parent.getQueuePath() + "." + queueName);
    String[] childQueueNames =
      conf.getQueues(fullQueueName);
    boolean isReservableQueue = conf.isReservable(fullQueueName);
    if (childQueueNames == null || childQueueNames.length == 0) {
      if (null == parent) {
        throw new IllegalStateException(
            "Queue configuration missing child queue names for " + queueName);
      }
      // Check if the queue will be dynamically managed by the Reservation
      // system
      if (isReservableQueue) {
        queue =
            new PlanQueue(csContext, queueName, parent,
                oldQueues.get(queueName));
      } else {
        queue =
            new LeafQueue(csContext, queueName, parent,
                oldQueues.get(queueName));

        // Used only for unit tests
        queue = hook.hook(queue);
      }
    } else {
      if (isReservableQueue) {
        throw new IllegalStateException(
            "Only Leaf Queues can be reservable for " + queueName);
      }
      ParentQueue parentQueue =
        new ParentQueue(csContext, queueName, parent, oldQueues.get(queueName));

      // Used only for unit tests
      queue = hook.hook(parentQueue);

      List<CSQueue> childQueues = new ArrayList<CSQueue>();
      for (String childQueueName : childQueueNames) {
        CSQueue childQueue =
          parseQueue(csContext, conf, queue, childQueueName,
              queues, oldQueues, hook);
        childQueues.add(childQueue);
      }
      parentQueue.setChildQueues(childQueues);
    }

    if(queue instanceof LeafQueue == true && queues.containsKey(queueName)
      && queues.get(queueName) instanceof LeafQueue == true) {
      throw new IOException("Two leaf queues were named " + queueName
        + ". Leaf queue names must be distinct");
    }
    queues.put(queueName, queue);

    LOG.info("Initialized queue: " + queue);
    return queue;
  }

  public CSQueue getQueue(String queueName) {
    if (queueName == null) {
      return null;
    }
    return queues.get(queueName);
  }

  private static final String CURRENT_USER_MAPPING = "%user";

  private static final String PRIMARY_GROUP_MAPPING = "%primary_group";

  private String getMappedQueue(String user) throws IOException {
    for (QueueMapping mapping : mappings) {
      if (mapping.type == MappingType.USER) {
        if (mapping.source.equals(CURRENT_USER_MAPPING)) {
          if (mapping.queue.equals(CURRENT_USER_MAPPING)) {
            return user;
          }
          else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
            return groups.getGroups(user).get(0);
          }
          else {
            return mapping.queue;
          }
        }
        if (user.equals(mapping.source)) {
          return mapping.queue;
        }
      }
      if (mapping.type == MappingType.GROUP) {
        for (String userGroups : groups.getGroups(user)) {
          if (userGroups.equals(mapping.source)) {
            return mapping.queue;
          }
        }
      }
    }
    return null;
  }

  private String getQueueMappings(ApplicationId applicationId, String queueName,
      String user) {
    if (mappings != null && mappings.size() > 0) {
      try {
        String mappedQueue = getMappedQueue(user);
        if (mappedQueue != null) {
          // We have a mapping, should we use it?
          if (queueName.equals(YarnConfiguration.DEFAULT_QUEUE_NAME)
              || overrideWithQueueMappings) {
            LOG.info("Application " + applicationId + " user " + user
                + " mapping [" + queueName + "] to [" + mappedQueue
                + "] override " + overrideWithQueueMappings);
            queueName = mappedQueue;
            RMApp rmApp = rmContext.getRMApps().get(applicationId);
            rmApp.setQueue(queueName);
          }
        }
      } catch (IOException ioex) {
        String message = "Failed to submit application " + applicationId +
            " submitted by user " + user + " reason: " + ioex.getMessage();
        this.rmContext.getDispatcher().getEventHandler()
            .handle(new RMAppEvent(applicationId,
                RMAppEventType.APP_REJECTED, message));
        return null;
      }
    }
    return queueName;
  }

  private synchronized void addApplicationOnRecovery(
      ApplicationId applicationId, String queueName, String user) {
    queueName = getQueueMappings(applicationId, queueName, user);
    if (queueName == null) {
      // Exception encountered while getting queue mappings.
      return;
    }
    // sanity checks.
    CSQueue queue = getQueue(queueName);
    if (queue == null) {
      //During a restart, this indicates a queue was removed, which is
      //not presently supported
      if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
        this.rmContext.getDispatcher().getEventHandler().handle(
            new RMAppEvent(applicationId, RMAppEventType.KILL,
            "Application killed on recovery as it was submitted to queue " +
            queueName + " which no longer exists after restart."));
        return;
      } else {
        String queueErrorMsg = "Queue named " + queueName
           + " missing during application recovery."
           + " Queue removal during recovery is not presently supported by the"
           + " capacity scheduler, please restart with all queues configured"
           + " which were present before shutdown/restart.";
        LOG.fatal(queueErrorMsg);
        throw new QueueInvalidException(queueErrorMsg);
      }
    }
    if (!(queue instanceof LeafQueue)) {
      // During RM restart, this means leaf queue was converted to a parent
      // queue, which is not supported for running apps.
      if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
        this.rmContext.getDispatcher().getEventHandler().handle(
            new RMAppEvent(applicationId, RMAppEventType.KILL,
            "Application killed on recovery as it was submitted to queue " +
            queueName + " which is no longer a leaf queue after restart."));
        return;
      } else {
        String queueErrorMsg = "Queue named " + queueName
           + " is no longer a leaf queue during application recovery."
           + " Changing a leaf queue to a parent queue during recovery is"
           + " not presently supported by the capacity scheduler. Please"
           + " restart with leaf queues before shutdown/restart continuing"
           + " as leaf queues.";
        LOG.fatal(queueErrorMsg);
        throw new QueueInvalidException(queueErrorMsg);
      }
    }
    // Submit to the queue
    try {
      queue.submitApplication(applicationId, user, queueName);
    } catch (AccessControlException ace) {
      // Ignore the exception for recovered app as the app was previously
      // accepted.
    }
    queue.getMetrics().submitApp(user);
    SchedulerApplication<FiCaSchedulerApp> application =
        new SchedulerApplication<FiCaSchedulerApp>(queue, user);
    applications.put(applicationId, application);
    LOG.info("Accepted application " + applicationId + " from user: " + user
        + ", in queue: " + queueName);
    if (LOG.isDebugEnabled()) {
      LOG.debug(applicationId + " is recovering. Skip notifying APP_ACCEPTED");
    }
  }

  private synchronized void addApplication(ApplicationId applicationId,
      String queueName, String user) {
    queueName = getQueueMappings(applicationId, queueName, user);
    if (queueName == null) {
      // Exception encountered while getting queue mappings.
      return;
    }
    // sanity checks.
    CSQueue queue = getQueue(queueName);
    if (queue == null) {
      String message = "Application " + applicationId +
      " submitted by user " + user + " to unknown queue: " + queueName;
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, message));
      return;
    }
    if (!(queue instanceof LeafQueue)) {
      String message = "Application " + applicationId +
          " submitted by user " + user + " to non-leaf queue: " + queueName;
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, message));
      return;
    }
    // Submit to the queue
    try {
      queue.submitApplication(applicationId, user, queueName);
    } catch (AccessControlException ace) {
      LOG.info("Failed to submit application " + applicationId + " to queue "
          + queueName + " from user " + user, ace);
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, ace.toString()));
      return;
    }
    // update the metrics
    queue.getMetrics().submitApp(user);
    SchedulerApplication<FiCaSchedulerApp> application =
        new SchedulerApplication<FiCaSchedulerApp>(queue, user);
    applications.put(applicationId, application);
    LOG.info("Accepted application " + applicationId + " from user: " + user
        + ", in queue: " + queueName);
    rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED));
  }

  private synchronized void addApplicationAttempt(
      ApplicationAttemptId applicationAttemptId,
      boolean transferStateFromPreviousAttempt,
      boolean isAttemptRecovering) {
    SchedulerApplication<FiCaSchedulerApp> application =
        applications.get(applicationAttemptId.getApplicationId());
    if (application == null) {
      LOG.warn("Application " + applicationAttemptId.getApplicationId() +
          " cannot be found in scheduler.");
      return;
    }
    CSQueue queue = (CSQueue) application.getQueue();

    FiCaSchedulerApp attempt =
        new FiCaSchedulerApp(applicationAttemptId, application.getUser(),
          queue, queue.getActiveUsersManager(), rmContext);
    if (transferStateFromPreviousAttempt) {
      attempt.transferStateFromPreviousAttempt(application
        .getCurrentAppAttempt());
    }
    application.setCurrentAppAttempt(attempt);

    queue.submitApplicationAttempt(attempt, application.getUser());
    LOG.info("Added Application Attempt " + applicationAttemptId
        + " to scheduler from user " + application.getUser() + " in queue "
        + queue.getQueueName());
    if (isAttemptRecovering) {
      if (LOG.isDebugEnabled()) {
        LOG.debug(applicationAttemptId
            + " is recovering. Skipping notifying ATTEMPT_ADDED");
      }
    } else {
      rmContext.getDispatcher().getEventHandler().handle(
        new RMAppAttemptEvent(applicationAttemptId,
            RMAppAttemptEventType.ATTEMPT_ADDED));
    }
  }

  private synchronized void doneApplication(ApplicationId applicationId,
      RMAppState finalState) {
    SchedulerApplication<FiCaSchedulerApp> application =
        applications.get(applicationId);
    if (application == null){
      // The AppRemovedSchedulerEvent maybe sent on recovery for completed apps,
      // ignore it.
      LOG.warn("Couldn't find application " + applicationId);
      return;
    }
    CSQueue queue = (CSQueue) application.getQueue();
    if (!(queue instanceof LeafQueue)) {
      LOG.error("Cannot finish application " + "from non-leaf queue: "
          + queue.getQueueName());
    } else {
      queue.finishApplication(applicationId, application.getUser());
    }
    application.stop(finalState);
    applications.remove(applicationId);
  }

  private synchronized void doneApplicationAttempt(
      ApplicationAttemptId applicationAttemptId,
      RMAppAttemptState rmAppAttemptFinalState, boolean keepContainers) {
    LOG.info("Application Attempt " + applicationAttemptId + " is done." +
        " finalState=" + rmAppAttemptFinalState);

    FiCaSchedulerApp attempt = getApplicationAttempt(applicationAttemptId);
    SchedulerApplication<FiCaSchedulerApp> application =
        applications.get(applicationAttemptId.getApplicationId());

    if (application == null || attempt == null) {
      LOG.info("Unknown application " + applicationAttemptId + " has completed!");
      return;
    }

    // Release all the allocated, acquired, running containers
    for (RMContainer rmContainer : attempt.getLiveContainers()) {
      if (keepContainers
          && rmContainer.getState().equals(RMContainerState.RUNNING)) {
        // do not kill the running container in the case of work-preserving AM
        // restart.
        LOG.info("Skip killing " + rmContainer.getContainerId());
        continue;
      }
      completedContainer(
        rmContainer,
        SchedulerUtils.createAbnormalContainerStatus(
          rmContainer.getContainerId(), SchedulerUtils.COMPLETED_APPLICATION),
        RMContainerEventType.KILL);
    }

    // Release all reserved containers
    for (RMContainer rmContainer : attempt.getReservedContainers()) {
      completedContainer(
        rmContainer,
        SchedulerUtils.createAbnormalContainerStatus(
          rmContainer.getContainerId(), "Application Complete"),
        RMContainerEventType.KILL);
    }

    // Clean up pending requests, metrics etc.
    attempt.stop(rmAppAttemptFinalState);

    // Inform the queue
    String queueName = attempt.getQueue().getQueueName();
    CSQueue queue = queues.get(queueName);
    if (!(queue instanceof LeafQueue)) {
      LOG.error("Cannot finish application " + "from non-leaf queue: "
          + queueName);
    } else {
      queue.finishApplicationAttempt(attempt, queue.getQueueName());
    }
  }

  @Override
  @Lock(Lock.NoLock.class)
  public Allocation allocate(ApplicationAttemptId applicationAttemptId,
      List<ResourceRequest> ask, List<ContainerId> release,
      List<String> blacklistAdditions, List<String> blacklistRemovals) {

    FiCaSchedulerApp application = getApplicationAttempt(applicationAttemptId);
    if (application == null) {
      LOG.info("Calling allocate on removed " +
          "or non existant application " + applicationAttemptId);
      return EMPTY_ALLOCATION;
    }

    // Sanity check
    SchedulerUtils.normalizeRequests(
        ask, getResourceCalculator(), getClusterResource(),
        getMinimumResourceCapability(), getMaximumResourceCapability());

    // Release containers
    releaseContainers(release, application);

    synchronized (application) {

      // make sure we aren't stopping/removing the application
      // when the allocate comes in
      if (application.isStopped()) {
        LOG.info("Calling allocate on a stopped " +
            "application " + applicationAttemptId);
        return EMPTY_ALLOCATION;
      }

      if (!ask.isEmpty()) {

        if(LOG.isDebugEnabled()) {
          LOG.debug("allocate: pre-update" +
            " applicationAttemptId=" + applicationAttemptId +
            " application=" + application);
        }
        application.showRequests();

        // Update application requests
        application.updateResourceRequests(ask);

        LOG.debug("allocate: post-update");
        application.showRequests();
      }

      if(LOG.isDebugEnabled()) {
        LOG.debug("allocate:" +
          " applicationAttemptId=" + applicationAttemptId +
          " #ask=" + ask.size());
      }

      application.updateBlacklist(blacklistAdditions, blacklistRemovals);

      return application.getAllocation(getResourceCalculator(),
                   clusterResource, getMinimumResourceCapability());
    }
  }

  @Override
  @Lock(Lock.NoLock.class)
  public QueueInfo getQueueInfo(String queueName,
      boolean includeChildQueues, boolean recursive)
  throws IOException {
    CSQueue queue = null;
    queue = this.queues.get(queueName);
    if (queue == null) {
      throw new IOException("Unknown queue: " + queueName);
    }
    return queue.getQueueInfo(includeChildQueues, recursive);
  }

  @Override
  @Lock(Lock.NoLock.class)
  public List<QueueUserACLInfo> getQueueUserAclInfo() {
    UserGroupInformation user = null;
    try {
      user = UserGroupInformation.getCurrentUser();
    } catch (IOException ioe) {
      // should never happen
      return new ArrayList<QueueUserACLInfo>();
    }

    return root.getQueueUserAclInfo(user);
  }

  private synchronized void nodeUpdate(RMNode nm) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("nodeUpdate: " + nm + " clusterResources: " + clusterResource);
    }

    FiCaSchedulerNode node = getNode(nm.getNodeID());

    List<UpdatedContainerInfo> containerInfoList = nm.pullContainerUpdates();
    List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
    List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
    for(UpdatedContainerInfo containerInfo : containerInfoList) {
      newlyLaunchedContainers.addAll(containerInfo.getNewlyLaunchedContainers());
      completedContainers.addAll(containerInfo.getCompletedContainers());
    }

    // Processing the newly launched containers
    for (ContainerStatus launchedContainer : newlyLaunchedContainers) {
      containerLaunchedOnNode(launchedContainer.getContainerId(), node);
    }

    // Process completed containers
    for (ContainerStatus completedContainer : completedContainers) {
      ContainerId containerId = completedContainer.getContainerId();
      LOG.debug("Container FINISHED: " + containerId);
      completedContainer(getRMContainer(containerId),
          completedContainer, RMContainerEventType.FINISHED);
    }

    // Now node data structures are upto date and ready for scheduling.
    if(LOG.isDebugEnabled()) {
      LOG.debug("Node being looked for scheduling " + nm
        + " availableResource: " + node.getAvailableResource());
    }
  }

  /**
   * Process resource update on a node.
   */
  private synchronized void updateNodeAndQueueResource(RMNode nm,
      ResourceOption resourceOption) {
    updateNodeResource(nm, resourceOption);
    root.updateClusterResource(clusterResource, new ResourceLimits(
        clusterResource));
  }

  /**
   * Process node labels update on a node.
   *
   * TODO: Currently capacity scheduler will kill containers on a node when
   * labels on the node changed. It is a simply solution to ensure guaranteed
   * capacity on labels of queues. When YARN-2498 completed, we can let
   * preemption policy to decide if such containers need to be killed or just
   * keep them running.
   */
  private synchronized void updateLabelsOnNode(NodeId nodeId,
      Set<String> newLabels) {
    FiCaSchedulerNode node = nodes.get(nodeId);
    if (null == node) {
      return;
    }

    // labels is same, we don't need do update
    if (node.getLabels().size() == newLabels.size()
        && node.getLabels().containsAll(newLabels)) {
      return;
    }

    // Kill running containers since label is changed
    for (RMContainer rmContainer : node.getRunningContainers()) {
      ContainerId containerId = rmContainer.getContainerId();
      completedContainer(rmContainer,
          ContainerStatus.newInstance(containerId,
              ContainerState.COMPLETE,
              String.format(
                  "Container=%s killed since labels on the node=%s changed",
                  containerId.toString(), nodeId.toString()),
              ContainerExitStatus.KILLED_BY_RESOURCEMANAGER),
          RMContainerEventType.KILL);
    }

    // Unreserve container on this node
    RMContainer reservedContainer = node.getReservedContainer();
    if (null != reservedContainer) {
      dropContainerReservation(reservedContainer);
    }

    // Update node labels after we've done this
    node.updateLabels(newLabels);
  }

  private synchronized void allocateContainersToNode(FiCaSchedulerNode node) {
    if (rmContext.isWorkPreservingRecoveryEnabled()
        && !rmContext.isSchedulerReadyForAllocatingContainers()) {
      return;
    }

    // Assign new containers...
    // 1. Check for reserved applications
    // 2. Schedule if there are no reservations

    RMContainer reservedContainer = node.getReservedContainer();
    if (reservedContainer != null) {
      FiCaSchedulerApp reservedApplication =
          getCurrentAttemptForContainer(reservedContainer.getContainerId());

      // Try to fulfill the reservation
      LOG.info("Trying to fulfill reservation for application " +
          reservedApplication.getApplicationId() + " on node: " +
          node.getNodeID());

      LeafQueue queue = ((LeafQueue)reservedApplication.getQueue());
      CSAssignment assignment =
          queue.assignContainers(
              clusterResource,
              node,
              // TODO, now we only consider limits for parent for non-labeled
              // resources, should consider labeled resources as well.
              new ResourceLimits(labelManager.getResourceByLabel(
                  RMNodeLabelsManager.NO_LABEL, clusterResource)));

      RMContainer excessReservation = assignment.getExcessReservation();
      if (excessReservation != null) {
        Container container = excessReservation.getContainer();
        queue.completedContainer(
            clusterResource, assignment.getApplication(), node,
            excessReservation,
            SchedulerUtils.createAbnormalContainerStatus(
                container.getId(),
                SchedulerUtils.UNRESERVED_CONTAINER),
            RMContainerEventType.RELEASED, null, true);
      }

    }

    // Try to schedule more if there are no reservations to fulfill
    if (node.getReservedContainer() == null) {
      if (calculator.computeAvailableContainers(node.getAvailableResource(),
        minimumAllocation) > 0) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Trying to schedule on node: " + node.getNodeName() +
              ", available: " + node.getAvailableResource());
        }
        root.assignContainers(
            clusterResource,
            node,
            // TODO, now we only consider limits for parent for non-labeled
            // resources, should consider labeled resources as well.
1047 new ResourceLimits(labelManager.getResourceByLabel( 1048 RMNodeLabelsManager.NO_LABEL, clusterResource))); 1049 } 1050 } else { 1051 LOG.info("Skipping scheduling since node " + node.getNodeID() + 1052 " is reserved by application " + 1053 node.getReservedContainer().getContainerId().getApplicationAttemptId() 1054 ); 1055 } 1056 1057 } 1058 1059 @Override 1060 public void handle(SchedulerEvent event) { 1061 switch(event.getType()) { 1062 case NODE_ADDED: 1063 { 1064 NodeAddedSchedulerEvent nodeAddedEvent = (NodeAddedSchedulerEvent)event; 1065 addNode(nodeAddedEvent.getAddedRMNode()); 1066 recoverContainersOnNode(nodeAddedEvent.getContainerReports(), 1067 nodeAddedEvent.getAddedRMNode()); 1068 } 1069 break; 1070 case NODE_REMOVED: 1071 { 1072 NodeRemovedSchedulerEvent nodeRemovedEvent = (NodeRemovedSchedulerEvent)event; 1073 removeNode(nodeRemovedEvent.getRemovedRMNode()); 1074 } 1075 break; 1076 case NODE_RESOURCE_UPDATE: 1077 { 1078 NodeResourceUpdateSchedulerEvent nodeResourceUpdatedEvent = 1079 (NodeResourceUpdateSchedulerEvent)event; 1080 updateNodeAndQueueResource(nodeResourceUpdatedEvent.getRMNode(), 1081 nodeResourceUpdatedEvent.getResourceOption()); 1082 } 1083 break; 1084 case NODE_LABELS_UPDATE: 1085 { 1086 NodeLabelsUpdateSchedulerEvent labelUpdateEvent = 1087 (NodeLabelsUpdateSchedulerEvent) event; 1088 1089 for (Entry<NodeId, Set<String>> entry : labelUpdateEvent 1090 .getUpdatedNodeToLabels().entrySet()) { 1091 NodeId id = entry.getKey(); 1092 Set<String> labels = entry.getValue(); 1093 updateLabelsOnNode(id, labels); 1094 } 1095 } 1096 break; 1097 case NODE_UPDATE: 1098 { 1099 NodeUpdateSchedulerEvent nodeUpdatedEvent = (NodeUpdateSchedulerEvent)event; 1100 RMNode node = nodeUpdatedEvent.getRMNode(); 1101 nodeUpdate(node); 1102 if (!scheduleAsynchronously) { 1103 allocateContainersToNode(getNode(node.getNodeID())); 1104 } 1105 } 1106 break; 1107 case APP_ADDED: 1108 { 1109 AppAddedSchedulerEvent appAddedEvent = (AppAddedSchedulerEvent) event; 
1110 String queueName = 1111 resolveReservationQueueName(appAddedEvent.getQueue(), 1112 appAddedEvent.getApplicationId(), 1113 appAddedEvent.getReservationID()); 1114 if (queueName != null) { 1115 if (!appAddedEvent.getIsAppRecovering()) { 1116 addApplication(appAddedEvent.getApplicationId(), queueName, 1117 appAddedEvent.getUser()); 1118 } else { 1119 addApplicationOnRecovery(appAddedEvent.getApplicationId(), queueName, 1120 appAddedEvent.getUser()); 1121 } 1122 } 1123 } 1124 break; 1125 case APP_REMOVED: 1126 { 1127 AppRemovedSchedulerEvent appRemovedEvent = (AppRemovedSchedulerEvent)event; 1128 doneApplication(appRemovedEvent.getApplicationID(), 1129 appRemovedEvent.getFinalState()); 1130 } 1131 break; 1132 case APP_ATTEMPT_ADDED: 1133 { 1134 AppAttemptAddedSchedulerEvent appAttemptAddedEvent = 1135 (AppAttemptAddedSchedulerEvent) event; 1136 addApplicationAttempt(appAttemptAddedEvent.getApplicationAttemptId(), 1137 appAttemptAddedEvent.getTransferStateFromPreviousAttempt(), 1138 appAttemptAddedEvent.getIsAttemptRecovering()); 1139 } 1140 break; 1141 case APP_ATTEMPT_REMOVED: 1142 { 1143 AppAttemptRemovedSchedulerEvent appAttemptRemovedEvent = 1144 (AppAttemptRemovedSchedulerEvent) event; 1145 doneApplicationAttempt(appAttemptRemovedEvent.getApplicationAttemptID(), 1146 appAttemptRemovedEvent.getFinalAttemptState(), 1147 appAttemptRemovedEvent.getKeepContainersAcrossAppAttempts()); 1148 } 1149 break; 1150 case CONTAINER_EXPIRED: 1151 { 1152 ContainerExpiredSchedulerEvent containerExpiredEvent = 1153 (ContainerExpiredSchedulerEvent) event; 1154 ContainerId containerId = containerExpiredEvent.getContainerId(); 1155 completedContainer(getRMContainer(containerId), 1156 SchedulerUtils.createAbnormalContainerStatus( 1157 containerId, 1158 SchedulerUtils.EXPIRED_CONTAINER), 1159 RMContainerEventType.EXPIRE); 1160 } 1161 break; 1162 case DROP_RESERVATION: 1163 { 1164 ContainerPreemptEvent dropReservationEvent = (ContainerPreemptEvent)event; 1165 RMContainer container = 
dropReservationEvent.getContainer(); 1166 dropContainerReservation(container); 1167 } 1168 break; 1169 case PREEMPT_CONTAINER: 1170 { 1171 ContainerPreemptEvent preemptContainerEvent = 1172 (ContainerPreemptEvent)event; 1173 ApplicationAttemptId aid = preemptContainerEvent.getAppId(); 1174 RMContainer containerToBePreempted = preemptContainerEvent.getContainer(); 1175 preemptContainer(aid, containerToBePreempted); 1176 } 1177 break; 1178 case KILL_CONTAINER: 1179 { 1180 ContainerPreemptEvent killContainerEvent = (ContainerPreemptEvent)event; 1181 RMContainer containerToBeKilled = killContainerEvent.getContainer(); 1182 killContainer(containerToBeKilled); 1183 } 1184 break; 1185 case CONTAINER_RESCHEDULED: 1186 { 1187 ContainerRescheduledEvent containerRescheduledEvent = 1188 (ContainerRescheduledEvent) event; 1189 RMContainer container = containerRescheduledEvent.getContainer(); 1190 recoverResourceRequestForContainer(container); 1191 } 1192 break; 1193 default: 1194 LOG.error("Invalid eventtype " + event.getType() + ". 
Ignoring!"); 1195 } 1196 } 1197 1198 private synchronized void addNode(RMNode nodeManager) { 1199 FiCaSchedulerNode schedulerNode = new FiCaSchedulerNode(nodeManager, 1200 usePortForNodeName, nodeManager.getNodeLabels()); 1201 this.nodes.put(nodeManager.getNodeID(), schedulerNode); 1202 Resources.addTo(clusterResource, schedulerNode.getTotalResource()); 1203 1204 // update this node to node label manager 1205 if (labelManager != null) { 1206 labelManager.activateNode(nodeManager.getNodeID(), 1207 schedulerNode.getTotalResource()); 1208 } 1209 1210 root.updateClusterResource(clusterResource, new ResourceLimits( 1211 clusterResource)); 1212 int numNodes = numNodeManagers.incrementAndGet(); 1213 updateMaximumAllocation(schedulerNode, true); 1214 1215 LOG.info("Added node " + nodeManager.getNodeAddress() + 1216 " clusterResource: " + clusterResource); 1217 1218 if (scheduleAsynchronously && numNodes == 1) { 1219 asyncSchedulerThread.beginSchedule(); 1220 } 1221 } 1222 1223 private synchronized void removeNode(RMNode nodeInfo) { 1224 // update this node to node label manager 1225 if (labelManager != null) { 1226 labelManager.deactivateNode(nodeInfo.getNodeID()); 1227 } 1228 1229 FiCaSchedulerNode node = nodes.get(nodeInfo.getNodeID()); 1230 if (node == null) { 1231 return; 1232 } 1233 Resources.subtractFrom(clusterResource, node.getTotalResource()); 1234 root.updateClusterResource(clusterResource, new ResourceLimits( 1235 clusterResource)); 1236 int numNodes = numNodeManagers.decrementAndGet(); 1237 1238 if (scheduleAsynchronously && numNodes == 0) { 1239 asyncSchedulerThread.suspendSchedule(); 1240 } 1241 1242 // Remove running containers 1243 List<RMContainer> runningContainers = node.getRunningContainers(); 1244 for (RMContainer container : runningContainers) { 1245 completedContainer(container, 1246 SchedulerUtils.createAbnormalContainerStatus( 1247 container.getContainerId(), 1248 SchedulerUtils.LOST_CONTAINER), 1249 RMContainerEventType.KILL); 1250 } 1251 1252 // 
Remove reservations, if any 1253 RMContainer reservedContainer = node.getReservedContainer(); 1254 if (reservedContainer != null) { 1255 completedContainer(reservedContainer, 1256 SchedulerUtils.createAbnormalContainerStatus( 1257 reservedContainer.getContainerId(), 1258 SchedulerUtils.LOST_CONTAINER), 1259 RMContainerEventType.KILL); 1260 } 1261 1262 this.nodes.remove(nodeInfo.getNodeID()); 1263 updateMaximumAllocation(node, false); 1264 1265 LOG.info("Removed node " + nodeInfo.getNodeAddress() + 1266 " clusterResource: " + clusterResource); 1267 } 1268 1269 @Lock(CapacityScheduler.class) 1270 @Override 1271 protected synchronized void completedContainer(RMContainer rmContainer, 1272 ContainerStatus containerStatus, RMContainerEventType event) { 1273 if (rmContainer == null) { 1274 LOG.info("Null container completed..."); 1275 return; 1276 } 1277 1278 Container container = rmContainer.getContainer(); 1279 1280 // Get the application for the finished container 1281 FiCaSchedulerApp application = 1282 getCurrentAttemptForContainer(container.getId()); 1283 ApplicationId appId = 1284 container.getId().getApplicationAttemptId().getApplicationId(); 1285 if (application == null) { 1286 LOG.info("Container " + container + " of" + " unknown application " 1287 + appId + " completed with event " + event); 1288 return; 1289 } 1290 1291 // Get the node on which the container was allocated 1292 FiCaSchedulerNode node = getNode(container.getNodeId()); 1293 1294 // Inform the queue 1295 LeafQueue queue = (LeafQueue)application.getQueue(); 1296 queue.completedContainer(clusterResource, application, node, 1297 rmContainer, containerStatus, event, null, true); 1298 1299 LOG.info("Application attempt " + application.getApplicationAttemptId() 1300 + " released container " + container.getId() + " on node: " + node 1301 + " with event: " + event); 1302 } 1303 1304 @Lock(Lock.NoLock.class) 1305 @VisibleForTesting 1306 @Override 1307 public FiCaSchedulerApp getApplicationAttempt( 1308 
ApplicationAttemptId applicationAttemptId) { 1309 return super.getApplicationAttempt(applicationAttemptId); 1310 } 1311 1312 @Lock(Lock.NoLock.class) 1313 public FiCaSchedulerNode getNode(NodeId nodeId) { 1314 return nodes.get(nodeId); 1315 } 1316 1317 @Lock(Lock.NoLock.class) 1318 Map<NodeId, FiCaSchedulerNode> getAllNodes() { 1319 return nodes; 1320 } 1321 1322 @Override 1323 @Lock(Lock.NoLock.class) 1324 public void recover(RMState state) throws Exception { 1325 // NOT IMPLEMENTED 1326 } 1327 1328 @Override 1329 public void dropContainerReservation(RMContainer container) { 1330 if(LOG.isDebugEnabled()){ 1331 LOG.debug("DROP_RESERVATION:" + container.toString()); 1332 } 1333 completedContainer(container, 1334 SchedulerUtils.createAbnormalContainerStatus( 1335 container.getContainerId(), 1336 SchedulerUtils.UNRESERVED_CONTAINER), 1337 RMContainerEventType.KILL); 1338 } 1339 1340 @Override 1341 public void preemptContainer(ApplicationAttemptId aid, RMContainer cont) { 1342 if(LOG.isDebugEnabled()){ 1343 LOG.debug("PREEMPT_CONTAINER: application:" + aid.toString() + 1344 " container: " + cont.toString()); 1345 } 1346 FiCaSchedulerApp app = getApplicationAttempt(aid); 1347 if (app != null) { 1348 app.addPreemptContainer(cont.getContainerId()); 1349 } 1350 } 1351 1352 @Override 1353 public void killContainer(RMContainer cont) { 1354 if (LOG.isDebugEnabled()) { 1355 LOG.debug("KILL_CONTAINER: container" + cont.toString()); 1356 } 1357 completedContainer(cont, SchedulerUtils.createPreemptedContainerStatus( 1358 cont.getContainerId(), SchedulerUtils.PREEMPTED_CONTAINER), 1359 RMContainerEventType.KILL); 1360 } 1361 1362 @Override 1363 public synchronized boolean checkAccess(UserGroupInformation callerUGI, 1364 QueueACL acl, String queueName) { 1365 CSQueue queue = getQueue(queueName); 1366 if (queue == null) { 1367 if (LOG.isDebugEnabled()) { 1368 LOG.debug("ACL not found for queue access-type " + acl 1369 + " for queue " + queueName); 1370 } 1371 return false; 1372 } 
1373 return queue.hasAccess(acl, callerUGI); 1374 } 1375 1376 @Override 1377 public List<ApplicationAttemptId> getAppsInQueue(String queueName) { 1378 CSQueue queue = queues.get(queueName); 1379 if (queue == null) { 1380 return null; 1381 } 1382 List<ApplicationAttemptId> apps = new ArrayList<ApplicationAttemptId>(); 1383 queue.collectSchedulerApplications(apps); 1384 return apps; 1385 } 1386 1387 private CapacitySchedulerConfiguration loadCapacitySchedulerConfiguration( 1388 Configuration configuration) throws IOException { 1389 try { 1390 InputStream CSInputStream = 1391 this.rmContext.getConfigurationProvider() 1392 .getConfigurationInputStream(configuration, 1393 YarnConfiguration.CS_CONFIGURATION_FILE); 1394 if (CSInputStream != null) { 1395 configuration.addResource(CSInputStream); 1396 return new CapacitySchedulerConfiguration(configuration, false); 1397 } 1398 return new CapacitySchedulerConfiguration(configuration, true); 1399 } catch (Exception e) { 1400 throw new IOException(e); 1401 } 1402 } 1403 1404 private synchronized String resolveReservationQueueName(String queueName, 1405 ApplicationId applicationId, ReservationId reservationID) { 1406 CSQueue queue = getQueue(queueName); 1407 // Check if the queue is a plan queue 1408 if ((queue == null) || !(queue instanceof PlanQueue)) { 1409 return queueName; 1410 } 1411 if (reservationID != null) { 1412 String resQName = reservationID.toString(); 1413 queue = getQueue(resQName); 1414 if (queue == null) { 1415 String message = 1416 "Application " 1417 + applicationId 1418 + " submitted to a reservation which is not yet currently active: " 1419 + resQName; 1420 this.rmContext.getDispatcher().getEventHandler() 1421 .handle(new RMAppEvent(applicationId, 1422 RMAppEventType.APP_REJECTED, message)); 1423 return null; 1424 } 1425 if (!queue.getParent().getQueueName().equals(queueName)) { 1426 String message = 1427 "Application: " + applicationId + " submitted to a reservation " 1428 + resQName + " which does not 
belong to the specified queue: " 1429 + queueName; 1430 this.rmContext.getDispatcher().getEventHandler() 1431 .handle(new RMAppEvent(applicationId, 1432 RMAppEventType.APP_REJECTED, message)); 1433 return null; 1434 } 1435 // use the reservation queue to run the app 1436 queueName = resQName; 1437 } else { 1438 // use the default child queue of the plan for unreserved apps 1439 queueName = queueName + ReservationConstants.DEFAULT_QUEUE_SUFFIX; 1440 } 1441 return queueName; 1442 } 1443 1444 @Override 1445 public synchronized void removeQueue(String queueName) 1446 throws SchedulerDynamicEditException { 1447 LOG.info("Removing queue: " + queueName); 1448 CSQueue q = this.getQueue(queueName); 1449 if (!(q instanceof ReservationQueue)) { 1450 throw new SchedulerDynamicEditException("The queue that we are asked " 1451 + "to remove (" + queueName + ") is not a ReservationQueue"); 1452 } 1453 ReservationQueue disposableLeafQueue = (ReservationQueue) q; 1454 // at this point we should have no more apps 1455 if (disposableLeafQueue.getNumApplications() > 0) { 1456 throw new SchedulerDynamicEditException("The queue " + queueName 1457 + " is not empty " + disposableLeafQueue.getApplications().size() 1458 + " active apps " + disposableLeafQueue.pendingApplications.size() 1459 + " pending apps"); 1460 } 1461 1462 ((PlanQueue) disposableLeafQueue.getParent()).removeChildQueue(q); 1463 this.queues.remove(queueName); 1464 LOG.info("Removal of ReservationQueue " + queueName + " has succeeded"); 1465 } 1466 1467 @Override 1468 public synchronized void addQueue(Queue queue) 1469 throws SchedulerDynamicEditException { 1470 1471 if (!(queue instanceof ReservationQueue)) { 1472 throw new SchedulerDynamicEditException("Queue " + queue.getQueueName() 1473 + " is not a ReservationQueue"); 1474 } 1475 1476 ReservationQueue newQueue = (ReservationQueue) queue; 1477 1478 if (newQueue.getParent() == null 1479 || !(newQueue.getParent() instanceof PlanQueue)) { 1480 throw new 
SchedulerDynamicEditException("ParentQueue for " 1481 + newQueue.getQueueName() 1482 + " is not properly set (should be set and be a PlanQueue)"); 1483 } 1484 1485 PlanQueue parentPlan = (PlanQueue) newQueue.getParent(); 1486 String queuename = newQueue.getQueueName(); 1487 parentPlan.addChildQueue(newQueue); 1488 this.queues.put(queuename, newQueue); 1489 LOG.info("Creation of ReservationQueue " + newQueue + " succeeded"); 1490 } 1491 1492 @Override 1493 public synchronized void setEntitlement(String inQueue, 1494 QueueEntitlement entitlement) throws SchedulerDynamicEditException, 1495 YarnException { 1496 LeafQueue queue = getAndCheckLeafQueue(inQueue); 1497 ParentQueue parent = (ParentQueue) queue.getParent(); 1498 1499 if (!(queue instanceof ReservationQueue)) { 1500 throw new SchedulerDynamicEditException("Entitlement can not be" 1501 + " modified dynamically since queue " + inQueue 1502 + " is not a ReservationQueue"); 1503 } 1504 1505 if (!(parent instanceof PlanQueue)) { 1506 throw new SchedulerDynamicEditException("The parent of ReservationQueue " 1507 + inQueue + " must be an PlanQueue"); 1508 } 1509 1510 ReservationQueue newQueue = (ReservationQueue) queue; 1511 1512 float sumChilds = ((PlanQueue) parent).sumOfChildCapacities(); 1513 float newChildCap = sumChilds - queue.getCapacity() + entitlement.getCapacity(); 1514 1515 if (newChildCap >= 0 && newChildCap < 1.0f + CSQueueUtils.EPSILON) { 1516 // note: epsilon checks here are not ok, as the epsilons might accumulate 1517 // and become a problem in aggregate 1518 if (Math.abs(entitlement.getCapacity() - queue.getCapacity()) == 0 1519 && Math.abs(entitlement.getMaxCapacity() - queue.getMaximumCapacity()) == 0) { 1520 return; 1521 } 1522 newQueue.setEntitlement(entitlement); 1523 } else { 1524 throw new SchedulerDynamicEditException( 1525 "Sum of child queues would exceed 100% for PlanQueue: " 1526 + parent.getQueueName()); 1527 } 1528 LOG.info("Set entitlement for ReservationQueue " + inQueue + " to " 
1529 + queue.getCapacity() + " request was (" + entitlement.getCapacity() + ")"); 1530 } 1531 1532 @Override 1533 public synchronized String moveApplication(ApplicationId appId, 1534 String targetQueueName) throws YarnException { 1535 FiCaSchedulerApp app = 1536 getApplicationAttempt(ApplicationAttemptId.newInstance(appId, 0)); 1537 String sourceQueueName = app.getQueue().getQueueName(); 1538 LeafQueue source = getAndCheckLeafQueue(sourceQueueName); 1539 String destQueueName = handleMoveToPlanQueue(targetQueueName); 1540 LeafQueue dest = getAndCheckLeafQueue(destQueueName); 1541 // Validation check - ACLs, submission limits for user & queue 1542 String user = app.getUser(); 1543 try { 1544 dest.submitApplication(appId, user, destQueueName); 1545 } catch (AccessControlException e) { 1546 throw new YarnException(e); 1547 } 1548 // Move all live containers 1549 for (RMContainer rmContainer : app.getLiveContainers()) { 1550 source.detachContainer(clusterResource, app, rmContainer); 1551 // attach the Container to another queue 1552 dest.attachContainer(clusterResource, app, rmContainer); 1553 } 1554 // Detach the application.. 1555 source.finishApplicationAttempt(app, sourceQueueName); 1556 source.getParent().finishApplication(appId, app.getUser()); 1557 // Finish app & update metrics 1558 app.move(dest); 1559 // Submit to a new queue 1560 dest.submitApplicationAttempt(app, user); 1561 applications.get(appId).setQueue(dest); 1562 LOG.info("App: " + app.getApplicationId() + " successfully moved from " 1563 + sourceQueueName + " to: " + destQueueName); 1564 return targetQueueName; 1565 } 1566 1567 /** 1568 * Check that the String provided in input is the name of an existing, 1569 * LeafQueue, if successful returns the queue. 
1570 * 1571 * @param queue 1572 * @return the LeafQueue 1573 * @throws YarnException 1574 */ 1575 private LeafQueue getAndCheckLeafQueue(String queue) throws YarnException { 1576 CSQueue ret = this.getQueue(queue); 1577 if (ret == null) { 1578 throw new YarnException("The specified Queue: " + queue 1579 + " doesn't exist"); 1580 } 1581 if (!(ret instanceof LeafQueue)) { 1582 throw new YarnException("The specified Queue: " + queue 1583 + " is not a Leaf Queue. Move is supported only for Leaf Queues."); 1584 } 1585 return (LeafQueue) ret; 1586 } 1587 1588 /** {@inheritDoc} */ 1589 @Override 1590 public EnumSet<SchedulerResourceTypes> getSchedulingResourceTypes() { 1591 if (calculator.getClass().getName() 1592 .equals(DefaultResourceCalculator.class.getName())) { 1593 return EnumSet.of(SchedulerResourceTypes.MEMORY); 1594 } 1595 return EnumSet 1596 .of(SchedulerResourceTypes.MEMORY, SchedulerResourceTypes.CPU); 1597 } 1598 1599 @Override 1600 public Resource getMaximumResourceCapability(String queueName) { 1601 CSQueue queue = getQueue(queueName); 1602 if (queue == null) { 1603 LOG.error("Unknown queue: " + queueName); 1604 return getMaximumResourceCapability(); 1605 } 1606 if (!(queue instanceof LeafQueue)) { 1607 LOG.error("queue " + queueName + " is not an leaf queue"); 1608 return getMaximumResourceCapability(); 1609 } 1610 return ((LeafQueue)queue).getMaximumAllocation(); 1611 } 1612 1613 private String handleMoveToPlanQueue(String targetQueueName) { 1614 CSQueue dest = getQueue(targetQueueName); 1615 if (dest != null && dest instanceof PlanQueue) { 1616 // use the default child reservation queue of the plan 1617 targetQueueName = targetQueueName + ReservationConstants.DEFAULT_QUEUE_SUFFIX; 1618 } 1619 return targetQueueName; 1620 } 1621 1622 @Override 1623 public Set<String> getPlanQueues() { 1624 Set<String> ret = new HashSet<String>(); 1625 for (Map.Entry<String, CSQueue> l : queues.entrySet()) { 1626 if (l.getValue() instanceof PlanQueue) { 1627 
ret.add(l.getKey()); 1628 } 1629 } 1630 return ret; 1631 } 1632 }
CapacityScheduler.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
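One detail of CapacityScheduler.java that is easy to skim past is the capacity check in setEntitlement(): the new entitlement is accepted only if the sum of the sibling capacities, with the old capacity swapped for the requested one, stays within [0, 1 + EPSILON). The following is a minimal sketch of just that arithmetic. The class name EntitlementCheckSketch and the method fits() are made up for illustration, and the value 0.0001f for EPSILON is an assumption about CSQueueUtils.EPSILON rather than something taken from the listing.

```java
// Hypothetical stand-alone version of the capacity check in setEntitlement();
// not Hadoop code.
class EntitlementCheckSketch {
    // Assumed tolerance; the real constant is CSQueueUtils.EPSILON.
    static final float EPSILON = 0.0001f;

    /**
     * Returns true if replacing oldCapacity with newCapacity keeps the sum of
     * the plan queue's child capacities within [0, 1 + EPSILON).
     */
    static boolean fits(float sumOfChildren, float oldCapacity, float newCapacity) {
        float newChildCap = sumOfChildren - oldCapacity + newCapacity;
        return newChildCap >= 0 && newChildCap < 1.0f + EPSILON;
    }

    public static void main(String[] args) {
        // Children currently sum to 90%; this queue holds 30% of that.
        System.out.println(fits(0.9f, 0.3f, 0.4f)); // grows to about 100%
        System.out.println(fits(0.9f, 0.3f, 0.5f)); // would grow to about 110%
    }
}
```

In the real method a failed check raises SchedulerDynamicEditException instead of returning false.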
As we saw earlier, the createSchedulerEventDispatcher() function of the ResourceManager class instantiates SchedulerEventDispatcher, an inner class of ResourceManager. Its constructor is shown below:
// SchedulerEventDispatcher, an inner class of ResourceManager
public SchedulerEventDispatcher(ResourceScheduler scheduler) {
  super(SchedulerEventDispatcher.class.getName());
  this.scheduler = scheduler;
  this.eventProcessor = new Thread(new EventProcessor());
  this.eventProcessor.setName("ResourceManager Event Processor");
}
The constructor in turn wraps an EventProcessor, an inner class of SchedulerEventDispatcher, in a new thread. EventProcessor is shown below:
// EventProcessor, an inner class of SchedulerEventDispatcher (itself an inner class of ResourceManager)
private final class EventProcessor implements Runnable {
  @Override
  public void run() {

    SchedulerEvent event;

    while (!stopped && !Thread.currentThread().isInterrupted()) {
      try {
        event = eventQueue.take();
      } catch (InterruptedException e) {
        LOG.error("Returning, interrupted : " + e);
        return; // TODO: Kill RM.
      }

      try {
        scheduler.handle(event);
      } catch (Throwable t) {
        // An error occurred, but we are shutting down anyway.
        // If it was an InterruptedException, the very act of
        // shutdown could have caused it and is probably harmless.
        if (stopped) {
          LOG.warn("Exception during shutdown: ", t);
          break;
        }
        LOG.fatal("Error in handling event type " + event.getType()
            + " to the scheduler", t);
        if (shouldExitOnError
            && !ShutdownHookManager.get().isShutdownInProgress()) {
          LOG.info("Exiting, bbye..");
          System.exit(-1);
        }
      }
    }
  }
}
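Inside run() there is a call to scheduler.handle(event). As we know, scheduler here is an instance of the default scheduler, CapacityScheduler, so the call ends up in the handle() function of the CapacityScheduler class, shown below: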
//CapacityScheduler.java
public void handle(SchedulerEvent event) {
  switch(event.getType()) {
  ......
  case APP_ADDED:
  {
    AppAddedSchedulerEvent appAddedEvent = (AppAddedSchedulerEvent) event;
    String queueName =
        resolveReservationQueueName(appAddedEvent.getQueue(),
            appAddedEvent.getApplicationId(),
            appAddedEvent.getReservationID());
    if (queueName != null) {
      if (!appAddedEvent.getIsAppRecovering()) {
        addApplication(appAddedEvent.getApplicationId(), queueName,
            appAddedEvent.getUser());
      } else {
        addApplicationOnRecovery(appAddedEvent.getApplicationId(), queueName,
            appAddedEvent.getUser());
      }
    }
  }
  break;
  .......
}
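The hand-off described above, where a dedicated thread takes events off a blocking queue and passes each one to the scheduler's handle(), can be sketched in isolation. This is only a simplified illustration, not the Hadoop implementation: EventLoopSketch, its Scheduler interface, and the STOP sentinel are invented here (the real EventProcessor checks a stopped flag instead), and the code uses Java 8 lambdas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified stand-in for SchedulerEventDispatcher; not Hadoop code.
class EventLoopSketch {
    enum EventType { NODE_UPDATE, APP_ADDED, STOP }

    // Stand-in for the scheduler's handle(SchedulerEvent) entry point.
    interface Scheduler { void handle(EventType event); }

    /** Queues the events, drains them on a worker thread, returns the order handled. */
    static List<EventType> run(List<EventType> events) {
        final BlockingQueue<EventType> queue = new LinkedBlockingQueue<>();
        final List<EventType> handled = new ArrayList<>();
        final Scheduler scheduler = handled::add;

        Thread processor = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                EventType event;
                try {
                    event = queue.take();      // blocks until an event arrives
                } catch (InterruptedException e) {
                    return;
                }
                if (event == EventType.STOP) { // sentinel used only in this sketch
                    return;
                }
                scheduler.handle(event);
            }
        }, "Sketch Event Processor");
        processor.start();

        try {
            for (EventType e : events) {
                queue.put(e);
            }
            queue.put(EventType.STOP);
            processor.join();                  // wait until the worker has drained the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(EventType.APP_ADDED, EventType.NODE_UPDATE)));
    }
}
```

Because a single thread drains the queue, events reach the scheduler one at a time and in the order they were enqueued.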
Here we can see how APP_ADDED is handled: handle() calls addApplication(appAddedEvent.getApplicationId(), queueName, appAddedEvent.getUser()). Stepping into addApplication():
//CapacityScheduler.java
private synchronized void addApplication(ApplicationId applicationId,
    String queueName, String user) {
  queueName = getQueueMappings(applicationId, queueName, user);
  if (queueName == null) {
    // Exception encountered while getting queue mappings.
    return;
  }
  // sanity checks.
  CSQueue queue = getQueue(queueName);
  if (queue == null) {
    String message = "Application " + applicationId +
        " submitted by user " + user + " to unknown queue: " + queueName;
    this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId,
            RMAppEventType.APP_REJECTED, message));
    return;
  }
  if (!(queue instanceof LeafQueue)) {
    String message = "Application " + applicationId +
        " submitted by user " + user + " to non-leaf queue: " + queueName;
    this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId,
            RMAppEventType.APP_REJECTED, message));
    return;
  }
  // Submit to the queue
  try {
    queue.submitApplication(applicationId, user, queueName);
  } catch (AccessControlException ace) {
    LOG.info("Failed to submit application " + applicationId + " to queue "
        + queueName + " from user " + user, ace);
    this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId,
            RMAppEventType.APP_REJECTED, ace.toString()));
    return;
  }
  // update the metrics
  queue.getMetrics().submitApp(user);
  SchedulerApplication<FiCaSchedulerApp> application =
      new SchedulerApplication<FiCaSchedulerApp>(queue, user);
  applications.put(applicationId, application);
  LOG.info("Accepted application " + applicationId + " from user: " + user
      + ", in queue: " + queueName);
  rmContext.getDispatcher().getEventHandler()
      .handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED));
}
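The decision logic of addApplication() boils down to two sanity checks followed by acceptance. A rough sketch of that flow is given below. It is not Hadoop code: AddApplicationSketch, its toy queue table, and the Outcome enum are invented for illustration, and the ACL check performed by submitApplication() is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of the checks in addApplication(); not Hadoop code.
class AddApplicationSketch {
    // Mirrors the two RMAppEventType values the real method can emit.
    enum Outcome { APP_ACCEPTED, APP_REJECTED }

    // Toy queue table (assumed): queue name -> true if the queue is a leaf queue.
    static final Map<String, Boolean> QUEUES = new HashMap<>();
    static {
        QUEUES.put("root", false);
        QUEUES.put("default", true);
    }

    static Outcome addApplication(String queueName) {
        Boolean isLeaf = QUEUES.get(queueName);
        if (isLeaf == null) {
            return Outcome.APP_REJECTED;   // unknown queue
        }
        if (!isLeaf) {
            return Outcome.APP_REJECTED;   // applications may only go to leaf queues
        }
        return Outcome.APP_ACCEPTED;       // the real code also checks ACLs first
    }

    public static void main(String[] args) {
        System.out.println("default -> " + addApplication("default"));
        System.out.println("root    -> " + addApplication("root"));
        System.out.println("missing -> " + addApplication("missing"));
    }
}
```

In the real scheduler each outcome is not returned but sent back to the application as an RMAppEvent through the dispatcher.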
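On its failure paths, addApplication() calls this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_REJECTED, message)). We analyzed a similar call earlier: this.rmContext is an RMContextImpl, this.rmContext.getDispatcher() is an AsyncDispatcher, and this.rmContext.getDispatcher().getEventHandler() is an AsyncDispatcher$GenericEventHandler. RMAppEventType events were already registered in the serviceInit() function of RMActiveServices, an inner class of ResourceManager, so we know the event finally reaches the handle() function of the RMAppImpl class. There (the previous transition having taken the application from RMAppState.NEW_SAVING to RMAppState.SUBMITTED), the event is matched against the transitions registered through the .addTransition(...) function of the StateMachineFactory class.
Likewise, rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED)) ends up in the RMAppImpl class, where it is matched against a transition registered through StateMachineFactory's .addTransition(...) function.
As we know, the previous state is RMAppState.SUBMITTED. The relevant transitions are shown below: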
//RMAppImpl.java
private static final StateMachineFactory<RMAppImpl,
                                         RMAppState,
                                         RMAppEventType,
                                         RMAppEvent> stateMachineFactory
    = new StateMachineFactory<RMAppImpl,
                              RMAppState,
                              RMAppEventType,
                              RMAppEvent>(RMAppState.NEW)


    // Transitions from NEW state
    .......
    .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
        RMAppEventType.START, new RMAppNewlySavingTransition())
    ......

    // Transitions from NEW_SAVING state
    ......
    .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED,
        RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition())
    ......

    // Transitions from SUBMITTED state
    ......
    .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_REJECTED,
        new FinalSavingTransition(
            new AppRejectedTransition(), RMAppState.FAILED))
    .addTransition(RMAppState.SUBMITTED, RMAppState.ACCEPTED,
        RMAppEventType.APP_ACCEPTED, new StartAppAttemptTransition())
    ......
    .installTopology();
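From this we can see that when RMAppImpl receives the RMAppEventType.APP_ACCEPTED event, it changes its own state from RMAppState.SUBMITTED to RMAppState.ACCEPTED. The callback invoked for this transition is StartAppAttemptTransition, an inner class of RMAppImpl, shown below: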
// StartAppAttemptTransition, an inner class of RMAppImpl.java
private static final class StartAppAttemptTransition extends RMAppTransition {
  @Override
  public void transition(RMAppImpl app, RMAppEvent event) {
    app.createAndStartNewAttempt(false);
  };
}
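The way a (current state, event) pair selects the next state, as in the SUBMITTED to ACCEPTED step above, can be illustrated with a small lookup table. This is only a sketch of the idea and not how StateMachineFactory is implemented: the class AppStateMachineSketch and its reduced State and Event enums are made up, and the transition callbacks (such as StartAppAttemptTransition) are omitted.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical table-driven state machine; illustrates the idea behind
// addTransition(), not the StateMachineFactory implementation.
class AppStateMachineSketch {
    enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, FINAL_SAVING }
    enum Event { START, APP_NEW_SAVED, APP_ACCEPTED, APP_REJECTED }

    // current state -> (event -> next state)
    private static final Map<State, Map<Event, State>> TABLE =
        new EnumMap<State, Map<Event, State>>(State.class);

    private static void addTransition(State from, State to, Event on) {
        Map<Event, State> row = TABLE.get(from);
        if (row == null) {
            row = new EnumMap<Event, State>(Event.class);
            TABLE.put(from, row);
        }
        row.put(on, to);
    }

    static {
        // The same four arcs that appear in the RMAppImpl excerpt.
        addTransition(State.NEW, State.NEW_SAVING, Event.START);
        addTransition(State.NEW_SAVING, State.SUBMITTED, Event.APP_NEW_SAVED);
        addTransition(State.SUBMITTED, State.FINAL_SAVING, Event.APP_REJECTED);
        addTransition(State.SUBMITTED, State.ACCEPTED, Event.APP_ACCEPTED);
    }

    /** Looks up the next state; an unregistered pair is an error. */
    static State next(State current, Event event) {
        Map<Event, State> row = TABLE.get(current);
        State to = (row == null) ? null : row.get(event);
        if (to == null) {
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        return to;
    }

    public static void main(String[] args) {
        State s = State.NEW;
        for (Event e : new Event[] {Event.START, Event.APP_NEW_SAVED, Event.APP_ACCEPTED}) {
            s = next(s, e);
            System.out.println(e + " -> " + s);
        }
    }
}
```

The table is built once in a static initializer, which matches how the transitions in RMAppImpl are declared once on a static factory and then only looked up when events arrive.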
Stepping into createAndStartNewAttempt():
//RMAppImpl.java
private void createNewAttempt() {
  ApplicationAttemptId appAttemptId =
      ApplicationAttemptId.newInstance(applicationId, attempts.size() + 1);
  RMAppAttempt attempt =
      new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
        submissionContext, conf,
        // The newly created attempt maybe last attempt if (number of
        // previously failed attempts(which should not include Preempted,
        // hardware error and NM resync) + 1) equal to the max-attempt
        // limit.
        maxAppAttempts == (getNumFailedAppAttempts() + 1), amReq);
  attempts.put(appAttemptId, attempt);
  currentAttempt = attempt;
}

private void
    createAndStartNewAttempt(boolean transferStateFromPreviousAttempt) {
  createNewAttempt();
  handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(),
    transferStateFromPreviousAttempt));
}
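The createNewAttempt() function creates an RMAppAttemptImpl object, which represents one run attempt of the application. Then handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(), transferStateFromPreviousAttempt)) constructs and dispatches an RMAppStartAttemptEvent, whose class is shown below: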
//RMAppStartAttemptEvent.java
public class RMAppStartAttemptEvent extends RMAppAttemptEvent {

  private final boolean transferStateFromPreviousAttempt;

  public RMAppStartAttemptEvent(ApplicationAttemptId appAttemptId,
      boolean transferStateFromPreviousAttempt) {
    super(appAttemptId, RMAppAttemptEventType.START);
    this.transferStateFromPreviousAttempt = transferStateFromPreviousAttempt;
  }

  public boolean getTransferStateFromPreviousAttempt() {
    return transferStateFromPreviousAttempt;
  }
}
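Two small rules in createNewAttempt() are worth spelling out: the new attempt's number is attempts.size() + 1, and the attempt is flagged as possibly the last one when maxAppAttempts equals the number of failed attempts plus one. The sketch below reproduces only that bookkeeping. AttemptSketch and markCurrentAttemptFailed() are hypothetical names, and a plain counter stands in for getNumFailedAppAttempts().

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bookkeeping modelled on createNewAttempt(); not Hadoop code.
class AttemptSketch {
    private final int maxAppAttempts;
    private int failedAttempts = 0;   // stand-in for getNumFailedAppAttempts()
    // attempt number -> maybeLastAttempt flag
    private final Map<Integer, Boolean> attempts = new LinkedHashMap<>();

    AttemptSketch(int maxAppAttempts) {
        this.maxAppAttempts = maxAppAttempts;
    }

    /** Creates the next attempt and returns its number (1-based). */
    int createNewAttempt() {
        int attemptNumber = attempts.size() + 1;
        boolean maybeLastAttempt = maxAppAttempts == failedAttempts + 1;
        attempts.put(attemptNumber, maybeLastAttempt);
        return attemptNumber;
    }

    void markCurrentAttemptFailed() {
        failedAttempts++;
    }

    boolean maybeLastAttempt(int attemptNumber) {
        return attempts.get(attemptNumber);
    }

    public static void main(String[] args) {
        AttemptSketch app = new AttemptSketch(2);
        int first = app.createNewAttempt();
        System.out.println("attempt " + first + " maybeLast=" + app.maybeLastAttempt(first));
        app.markCurrentAttemptFailed();
        int second = app.createNewAttempt();
        System.out.println("attempt " + second + " maybeLast=" + app.maybeLastAttempt(second));
    }
}
```

As the comment in the original method notes, failures caused by preemption, hardware errors, or NM resync are not counted, which is why the flag only says "maybe".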
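The event type of RMAppStartAttemptEvent is RMAppAttemptEventType.START. In ResourceManager, events of type RMAppAttemptEventType are bound to the ApplicationAttemptEventDispatcher class. The enum and the registration are shown below: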
public enum RMAppAttemptEventType {
  // Source: RMApp
  START,
  KILL,

  // Source: AMLauncher
  LAUNCHED,
  LAUNCH_FAILED,

  // Source: AMLivelinessMonitor
  EXPIRE,

  // Source: ApplicationMasterService
  REGISTERED,
  STATUS_UPDATE,
  UNREGISTERED,

  // Source: Containers
  CONTAINER_ALLOCATED,
  CONTAINER_FINISHED,

  // Source: RMStateStore
  ATTEMPT_NEW_SAVED,
  ATTEMPT_UPDATE_SAVED,

  // Source: Scheduler
  ATTEMPT_ADDED,

  // Source: RMAttemptImpl.recover
  RECOVER

}
RMAppAttemptEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java
// serviceInit() function of RMActiveServices, an inner class of ResourceManager.java
// Register event handler for RmAppAttemptEvents
rmDispatcher.register(RMAppAttemptEventType.class,
    new ApplicationAttemptEventDispatcher(rmContext));
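The registration above binds a whole enum class, not individual constants, to one handler. The idea can be sketched as a map keyed by the enum's declaring class. This is a simplified illustration under our own naming (RegistrySketch, Handler, and the two small enums are hypothetical), not the AsyncDispatcher source, and it dispatches synchronously instead of through a queue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of "register an event-type enum, dispatch by the
// enum's declaring class"; names are illustrative, not Hadoop's.
class RegistrySketch {
    enum AppEventType { APP_ACCEPTED }
    enum AttemptEventType { START, KILL }

    interface Handler { String handle(Enum<?> eventType); }

    // event-type enum class -> handler for every constant of that enum
    private final Map<Class<?>, Handler> handlers = new HashMap<>();

    void register(Class<? extends Enum<?>> eventTypeClass, Handler handler) {
        handlers.put(eventTypeClass, handler);
    }

    String dispatch(Enum<?> eventType) {
        Handler handler = handlers.get(eventType.getDeclaringClass());
        if (handler == null) {
            throw new IllegalStateException("No handler registered for " + eventType);
        }
        return handler.handle(eventType);
    }

    public static void main(String[] args) {
        RegistrySketch dispatcher = new RegistrySketch();
        dispatcher.register(AppEventType.class, e -> "app handler got " + e);
        dispatcher.register(AttemptEventType.class, e -> "attempt handler got " + e);
        System.out.println(dispatcher.dispatch(AttemptEventType.START));
        System.out.println(dispatcher.dispatch(AppEventType.APP_ACCEPTED));
    }
}
```

With this layout, every RMAppAttemptEventType constant (START, KILL, and so on) is routed to the same handler, which is why a single register() call in serviceInit() is enough.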
Stepping into ApplicationAttemptEventDispatcher:
// ApplicationAttemptEventDispatcher, an inner class of ResourceManager.java
@Private
public static final class ApplicationAttemptEventDispatcher implements
    EventHandler<RMAppAttemptEvent> {

  private final RMContext rmContext;

  public ApplicationAttemptEventDispatcher(RMContext rmContext) {
    this.rmContext = rmContext;
  }

  @Override
  public void handle(RMAppAttemptEvent event) {
    ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
    ApplicationId appAttemptId = appAttemptID.getApplicationId();
    RMApp rmApp = this.rmContext.getRMApps().get(appAttemptId);
    if (rmApp != null) {
      RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptID);
      if (rmAppAttempt != null) {
        try {
          rmAppAttempt.handle(event);
        } catch (Throwable t) {
          LOG.error("Error in handling event type " + event.getType()
              + " for applicationAttempt " + appAttemptId, t);
        }
      }
    }
  }
}
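The handle() function of this class calls rmAppAttempt.handle(event), where rmAppAttempt is an object of the RMAppAttempt interface. That interface has only one implementation class, public class RMAppAttemptImpl implements RMAppAttempt, Recoverable {...}, so the call ultimately lands in the handle() function of the RMAppAttemptImpl class.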
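The routing done by ApplicationAttemptEventDispatcher is a two-level lookup: first the application by its id, then the attempt within that application, with the event dropped if either lookup fails. A minimal sketch of that pattern follows. AttemptRoutingSketch is a hypothetical class, string ids and event names replace the Hadoop types, and handle() returns a boolean here only so that the outcome can be observed (the real method returns void).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-level routing table; not Hadoop code.
class AttemptRoutingSketch {
    // application id -> (attempt number -> events delivered to that attempt)
    private final Map<String, Map<Integer, List<String>>> apps = new HashMap<>();

    void addAttempt(String appId, int attemptNumber) {
        Map<Integer, List<String>> attempts = apps.get(appId);
        if (attempts == null) {
            attempts = new HashMap<>();
            apps.put(appId, attempts);
        }
        attempts.put(attemptNumber, new ArrayList<String>());
    }

    /** Delivers the event if both the app and the attempt exist; otherwise drops it. */
    boolean handle(String appId, int attemptNumber, String event) {
        Map<Integer, List<String>> attempts = apps.get(appId);
        if (attempts != null) {
            List<String> attempt = attempts.get(attemptNumber);
            if (attempt != null) {
                attempt.add(event);
                return true;
            }
        }
        return false;
    }

    List<String> delivered(String appId, int attemptNumber) {
        return apps.get(appId).get(attemptNumber);
    }

    public static void main(String[] args) {
        AttemptRoutingSketch router = new AttemptRoutingSketch();
        router.addAttempt("application_1", 1);
        System.out.println(router.handle("application_1", 1, "START"));
        System.out.println(router.handle("application_1", 2, "START")); // no such attempt
        System.out.println(router.delivered("application_1", 1));
    }
}
```

Dropping events for unknown applications or attempts keeps a stale event from crashing the dispatcher thread, which is also what the null checks in the real handle() achieve.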
1 @SuppressWarnings({"unchecked", "rawtypes"}) 2 public class RMAppAttemptImpl implements RMAppAttempt, Recoverable { 3 4 private static final Log LOG = LogFactory.getLog(RMAppAttemptImpl.class); 5 6 private static final RecordFactory recordFactory = RecordFactoryProvider 7 .getRecordFactory(null); 8 9 public final static Priority AM_CONTAINER_PRIORITY = recordFactory 10 .newRecordInstance(Priority.class); 11 static { 12 AM_CONTAINER_PRIORITY.setPriority(0); 13 } 14 15 private final StateMachine<RMAppAttemptState, 16 RMAppAttemptEventType, 17 RMAppAttemptEvent> stateMachine; 18 19 private final RMContext rmContext; 20 private final EventHandler eventHandler; 21 private final YarnScheduler scheduler; 22 private final ApplicationMasterService masterService; 23 24 private final ReadLock readLock; 25 private final WriteLock writeLock; 26 27 private final ApplicationAttemptId applicationAttemptId; 28 private final ApplicationSubmissionContext submissionContext; 29 private Token<AMRMTokenIdentifier> amrmToken = null; 30 private volatile Integer amrmTokenKeyId = null; 31 private SecretKey clientTokenMasterKey = null; 32 33 private ConcurrentMap<NodeId, List<ContainerStatus>> 34 justFinishedContainers = 35 new ConcurrentHashMap<NodeId, List<ContainerStatus>>(); 36 // Tracks the previous finished containers that are waiting to be 37 // verified as received by the AM. If the AM sends the next allocate 38 // request it implicitly acks this list. 
  private ConcurrentMap<NodeId, List<ContainerStatus>>
      finishedContainersSentToAM =
      new ConcurrentHashMap<NodeId, List<ContainerStatus>>();
  private Container masterContainer;

  private float progress = 0;
  private String host = "N/A";
  private int rpcPort = -1;
  private String originalTrackingUrl = "N/A";
  private String proxiedTrackingUrl = "N/A";
  private long startTime = 0;
  private long finishTime = 0;
  private long launchAMStartTime = 0;
  private long launchAMEndTime = 0;

  // Set to null initially. Will eventually get set
  // if an RMAppAttemptUnregistrationEvent occurs
  private FinalApplicationStatus finalStatus = null;
  private final StringBuilder diagnostics = new StringBuilder();
  private int amContainerExitStatus = ContainerExitStatus.INVALID;

  private Configuration conf;
  // Since AM preemption, hardware error and NM resync are not counted towards
  // AM failure count, even if this flag is true, a new attempt can still be
  // re-created if this attempt is eventually failed because of preemption,
  // hardware error or NM resync. So this flag indicates that this may be
  // last attempt.
  private final boolean maybeLastAttempt;
  private static final ExpiredTransition EXPIRED_TRANSITION =
      new ExpiredTransition();

  private RMAppAttemptEvent eventCausingFinalSaving;
  private RMAppAttemptState targetedFinalState;
  private RMAppAttemptState recoveredFinalState;
  private RMAppAttemptState stateBeforeFinalSaving;
  private Object transitionTodo;

  private RMAppAttemptMetrics attemptMetrics = null;
  private ResourceRequest amReq = null;

  private static final StateMachineFactory<RMAppAttemptImpl,
                                           RMAppAttemptState,
                                           RMAppAttemptEventType,
                                           RMAppAttemptEvent>
      stateMachineFactory = new StateMachineFactory<RMAppAttemptImpl,
                                                    RMAppAttemptState,
                                                    RMAppAttemptEventType,
                                                    RMAppAttemptEvent>(RMAppAttemptState.NEW)

      // Transitions from NEW State
      .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.SUBMITTED,
          RMAppAttemptEventType.START, new AttemptStartedTransition())
      .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new BaseFinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
      .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.REGISTERED,
          new FinalSavingTransition(
              new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))
      .addTransition( RMAppAttemptState.NEW,
          EnumSet.of(RMAppAttemptState.FINISHED, RMAppAttemptState.KILLED,
              RMAppAttemptState.FAILED, RMAppAttemptState.LAUNCHED),
          RMAppAttemptEventType.RECOVER, new AttemptRecoveredTransition())

      // Transitions from SUBMITTED state
      .addTransition(RMAppAttemptState.SUBMITTED,
          EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
              RMAppAttemptState.SCHEDULED),
          RMAppAttemptEventType.ATTEMPT_ADDED,
          new ScheduleTransition())
      .addTransition(RMAppAttemptState.SUBMITTED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new BaseFinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
      .addTransition(RMAppAttemptState.SUBMITTED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.REGISTERED,
          new FinalSavingTransition(
              new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))

      // Transitions from SCHEDULED State
      .addTransition(RMAppAttemptState.SCHEDULED,
          EnumSet.of(RMAppAttemptState.ALLOCATED_SAVING,
              RMAppAttemptState.SCHEDULED),
          RMAppAttemptEventType.CONTAINER_ALLOCATED,
          new AMContainerAllocatedTransition())
      .addTransition(RMAppAttemptState.SCHEDULED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new BaseFinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
      .addTransition(RMAppAttemptState.SCHEDULED,
          RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new FinalSavingTransition(
              new AMContainerCrashedBeforeRunningTransition(),
              RMAppAttemptState.FAILED))

      // Transitions from ALLOCATED_SAVING State
      .addTransition(RMAppAttemptState.ALLOCATED_SAVING,
          RMAppAttemptState.ALLOCATED,
          RMAppAttemptEventType.ATTEMPT_NEW_SAVED, new AttemptStoredTransition())

      // App could be killed by the client. So need to handle this.
      .addTransition(RMAppAttemptState.ALLOCATED_SAVING,
          RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new BaseFinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
      .addTransition(RMAppAttemptState.ALLOCATED_SAVING,
          RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new FinalSavingTransition(
              new AMContainerCrashedBeforeRunningTransition(),
              RMAppAttemptState.FAILED))

      // Transitions from LAUNCHED_UNMANAGED_SAVING State
      .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
          RMAppAttemptState.LAUNCHED,
          RMAppAttemptEventType.ATTEMPT_NEW_SAVED,
          new UnmanagedAMAttemptSavedTransition())
      // attempt should not try to register in this state
      .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
          RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.REGISTERED,
          new FinalSavingTransition(
              new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))
      // App could be killed by the client. So need to handle this.
      .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
          RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new BaseFinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))

      // Transitions from ALLOCATED State
      .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.LAUNCHED,
          RMAppAttemptEventType.LAUNCHED, new AMLaunchedTransition())
      .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.LAUNCH_FAILED,
          new FinalSavingTransition(new LaunchFailedTransition(),
              RMAppAttemptState.FAILED))
      .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(
              new KillAllocatedAMTransition(), RMAppAttemptState.KILLED))

      .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new FinalSavingTransition(
              new AMContainerCrashedBeforeRunningTransition(), RMAppAttemptState.FAILED))

      // Transitions from LAUNCHED State
      .addTransition(RMAppAttemptState.LAUNCHED, RMAppAttemptState.RUNNING,
          RMAppAttemptEventType.REGISTERED, new AMRegisteredTransition())
      .addTransition(RMAppAttemptState.LAUNCHED,
          EnumSet.of(RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING),
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedTransition(
              new AMContainerCrashedBeforeRunningTransition(),
              RMAppAttemptState.LAUNCHED))
      .addTransition(
          RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.EXPIRE,
          new FinalSavingTransition(EXPIRED_TRANSITION,
              RMAppAttemptState.FAILED))
      .addTransition(RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new FinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))

      // Transitions from RUNNING State
      .addTransition(RMAppAttemptState.RUNNING,
          EnumSet.of(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINISHED),
          RMAppAttemptEventType.UNREGISTERED, new AMUnregisteredTransition())
      .addTransition(RMAppAttemptState.RUNNING, RMAppAttemptState.RUNNING,
          RMAppAttemptEventType.STATUS_UPDATE, new StatusUpdateTransition())
      .addTransition(RMAppAttemptState.RUNNING, RMAppAttemptState.RUNNING,
          RMAppAttemptEventType.CONTAINER_ALLOCATED)
      .addTransition(
          RMAppAttemptState.RUNNING,
          EnumSet.of(RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING),
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedTransition(
              new AMContainerCrashedAtRunningTransition(),
              RMAppAttemptState.RUNNING))
      .addTransition(
          RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.EXPIRE,
          new FinalSavingTransition(EXPIRED_TRANSITION,
              RMAppAttemptState.FAILED))
      .addTransition(
          RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.KILL,
          new FinalSavingTransition(new FinalTransition(
              RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))

      // Transitions from FINAL_SAVING State
      .addTransition(RMAppAttemptState.FINAL_SAVING,
          EnumSet.of(RMAppAttemptState.FINISHING, RMAppAttemptState.FAILED,
              RMAppAttemptState.KILLED, RMAppAttemptState.FINISHED),
          RMAppAttemptEventType.ATTEMPT_UPDATE_SAVED,
          new FinalStateSavedTransition())
      .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedAtFinalSavingTransition())
      .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
          RMAppAttemptEventType.EXPIRE,
          new AMExpiredAtFinalSavingTransition())
      .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
          EnumSet.of(
              RMAppAttemptEventType.UNREGISTERED,
              RMAppAttemptEventType.STATUS_UPDATE,
              RMAppAttemptEventType.LAUNCHED,
              RMAppAttemptEventType.LAUNCH_FAILED,
              // should be fixed to reject container allocate request at Final
              // Saving in scheduler
              RMAppAttemptEventType.CONTAINER_ALLOCATED,
              RMAppAttemptEventType.ATTEMPT_NEW_SAVED,
              RMAppAttemptEventType.KILL))

      // Transitions from FAILED State
      // For work-preserving AM restart, failed attempt are still capturing
      // CONTAINER_FINISHED event and record the finished containers for the
      // use by the next new attempt.
      .addTransition(RMAppAttemptState.FAILED, RMAppAttemptState.FAILED,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedAtFinalStateTransition())
      .addTransition(
          RMAppAttemptState.FAILED,
          RMAppAttemptState.FAILED,
          EnumSet.of(
              RMAppAttemptEventType.EXPIRE,
              RMAppAttemptEventType.KILL,
              RMAppAttemptEventType.UNREGISTERED,
              RMAppAttemptEventType.STATUS_UPDATE,
              RMAppAttemptEventType.CONTAINER_ALLOCATED))

      // Transitions from FINISHING State
      .addTransition(RMAppAttemptState.FINISHING,
          EnumSet.of(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED),
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new AMFinishingContainerFinishedTransition())
      .addTransition(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED,
          RMAppAttemptEventType.EXPIRE,
          new FinalTransition(RMAppAttemptState.FINISHED))
      .addTransition(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHING,
          EnumSet.of(
              RMAppAttemptEventType.UNREGISTERED,
              RMAppAttemptEventType.STATUS_UPDATE,
              RMAppAttemptEventType.CONTAINER_ALLOCATED,
              // ignore Kill as we have already saved the final Finished state in
              // state store.
              RMAppAttemptEventType.KILL))

      // Transitions from FINISHED State
      .addTransition(
          RMAppAttemptState.FINISHED,
          RMAppAttemptState.FINISHED,
          EnumSet.of(
              RMAppAttemptEventType.EXPIRE,
              RMAppAttemptEventType.UNREGISTERED,
              RMAppAttemptEventType.CONTAINER_ALLOCATED,
              RMAppAttemptEventType.KILL))
      .addTransition(RMAppAttemptState.FINISHED,
          RMAppAttemptState.FINISHED,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedAtFinalStateTransition())

      // Transitions from KILLED State
      .addTransition(
          RMAppAttemptState.KILLED,
          RMAppAttemptState.KILLED,
          EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
              RMAppAttemptEventType.LAUNCHED,
              RMAppAttemptEventType.LAUNCH_FAILED,
              RMAppAttemptEventType.EXPIRE,
              RMAppAttemptEventType.REGISTERED,
              RMAppAttemptEventType.CONTAINER_ALLOCATED,
              RMAppAttemptEventType.UNREGISTERED,
              RMAppAttemptEventType.KILL,
              RMAppAttemptEventType.STATUS_UPDATE))
      .addTransition(RMAppAttemptState.KILLED,
          RMAppAttemptState.KILLED,
          RMAppAttemptEventType.CONTAINER_FINISHED,
          new ContainerFinishedAtFinalStateTransition())
    .installTopology();

  public RMAppAttemptImpl(ApplicationAttemptId appAttemptId,
      RMContext rmContext, YarnScheduler scheduler,
      ApplicationMasterService masterService,
      ApplicationSubmissionContext submissionContext,
      Configuration conf, boolean maybeLastAttempt, ResourceRequest amReq) {
    this.conf = conf;
    this.applicationAttemptId = appAttemptId;
    this.rmContext = rmContext;
    this.eventHandler = rmContext.getDispatcher().getEventHandler();
    this.submissionContext = submissionContext;
    this.scheduler = scheduler;
    this.masterService = masterService;

    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    this.readLock = lock.readLock();
    this.writeLock = lock.writeLock();

    this.proxiedTrackingUrl = generateProxyUriWithScheme();
    this.maybeLastAttempt = maybeLastAttempt;
    this.stateMachine = stateMachineFactory.make(this);

    this.attemptMetrics =
        new RMAppAttemptMetrics(applicationAttemptId, rmContext);

    this.amReq = amReq;
  }

  @Override
  public ApplicationAttemptId getAppAttemptId() {
    return this.applicationAttemptId;
  }

  @Override
  public ApplicationSubmissionContext getSubmissionContext() {
    return this.submissionContext;
  }

  @Override
  public FinalApplicationStatus getFinalApplicationStatus() {
    this.readLock.lock();
    try {
      return this.finalStatus;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public RMAppAttemptState getAppAttemptState() {
    this.readLock.lock();
    try {
      return this.stateMachine.getCurrentState();
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getHost() {
    this.readLock.lock();

    try {
      return this.host;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public int getRpcPort() {
    this.readLock.lock();

    try {
      return this.rpcPort;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getTrackingUrl() {
    this.readLock.lock();
    try {
      return (getSubmissionContext().getUnmanagedAM()) ?
          this.originalTrackingUrl : this.proxiedTrackingUrl;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getOriginalTrackingUrl() {
    this.readLock.lock();
    try {
      return this.originalTrackingUrl;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getWebProxyBase() {
    this.readLock.lock();
    try {
      return ProxyUriUtils.getPath(applicationAttemptId.getApplicationId());
    } finally {
      this.readLock.unlock();
    }
  }

  private String generateProxyUriWithScheme() {
    this.readLock.lock();
    try {
      final String scheme = WebAppUtils.getHttpSchemePrefix(conf);
      String proxy = WebAppUtils.getProxyHostAndPort(conf);
      URI proxyUri = ProxyUriUtils.getUriFromAMUrl(scheme, proxy);
      URI result = ProxyUriUtils.getProxyUri(null, proxyUri,
          applicationAttemptId.getApplicationId());
      return result.toASCIIString();
    } catch (URISyntaxException e) {
      LOG.warn("Could not proxify the uri for "
          + applicationAttemptId.getApplicationId(), e);
      return null;
    } finally {
      this.readLock.unlock();
    }
  }

  private void setTrackingUrlToRMAppPage(RMAppAttemptState stateToBeStored) {
    originalTrackingUrl = pjoin(
        WebAppUtils.getResolvedRMWebAppURLWithScheme(conf),
        "cluster", "app", getAppAttemptId().getApplicationId());
    switch (stateToBeStored) {
    case KILLED:
    case FAILED:
      proxiedTrackingUrl = originalTrackingUrl;
      break;
    default:
      break;
    }
  }

  private void setTrackingUrlToAHSPage(RMAppAttemptState stateToBeStored) {
    originalTrackingUrl = pjoin(
        WebAppUtils.getHttpSchemePrefix(conf) +
        WebAppUtils.getAHSWebAppURLWithoutScheme(conf),
        "applicationhistory", "app", getAppAttemptId().getApplicationId());
    switch (stateToBeStored) {
    case KILLED:
    case FAILED:
      proxiedTrackingUrl = originalTrackingUrl;
      break;
    default:
      break;
    }
  }

  private void invalidateAMHostAndPort() {
    this.host = "N/A";
    this.rpcPort = -1;
  }

  // This is only used for RMStateStore. Normal operation must invoke the secret
  // manager to get the key and not use the local key directly.
  @Override
  public SecretKey getClientTokenMasterKey() {
    return this.clientTokenMasterKey;
  }

  @Override
  public Token<AMRMTokenIdentifier> getAMRMToken() {
    this.readLock.lock();
    try {
      return this.amrmToken;
    } finally {
      this.readLock.unlock();
    }
  }

  @Private
  public void setAMRMToken(Token<AMRMTokenIdentifier> lastToken) {
    this.writeLock.lock();
    try {
      this.amrmToken = lastToken;
      this.amrmTokenKeyId = null;
    } finally {
      this.writeLock.unlock();
    }
  }

  @Private
  public int getAMRMTokenKeyId() {
    Integer keyId = this.amrmTokenKeyId;
    if (keyId == null) {
      this.readLock.lock();
      try {
        if (this.amrmToken == null) {
          throw new YarnRuntimeException("Missing AMRM token for "
              + this.applicationAttemptId);
        }
        keyId = this.amrmToken.decodeIdentifier().getKeyId();
        this.amrmTokenKeyId = keyId;
      } catch (IOException e) {
        throw new YarnRuntimeException("AMRM token decode error for "
            + this.applicationAttemptId, e);
      } finally {
        this.readLock.unlock();
      }
    }
    return keyId;
  }

  @Override
  public Token<ClientToAMTokenIdentifier> createClientToken(String client) {
    this.readLock.lock();

    try {
      Token<ClientToAMTokenIdentifier> token = null;
      ClientToAMTokenSecretManagerInRM secretMgr =
          this.rmContext.getClientToAMTokenSecretManager();
      if (client != null &&
          secretMgr.getMasterKey(this.applicationAttemptId) != null) {
        token = new Token<ClientToAMTokenIdentifier>(
            new ClientToAMTokenIdentifier(this.applicationAttemptId, client),
            secretMgr);
      }
      return token;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public String getDiagnostics() {
    this.readLock.lock();

    try {
      return this.diagnostics.toString();
    } finally {
      this.readLock.unlock();
    }
  }

  public int getAMContainerExitStatus() {
    this.readLock.lock();
    try {
      return this.amContainerExitStatus;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public float getProgress() {
    this.readLock.lock();

    try {
      return this.progress;
    } finally {
      this.readLock.unlock();
    }
  }

  @VisibleForTesting
  @Override
  public List<ContainerStatus> getJustFinishedContainers() {
    this.readLock.lock();
    try {
      List<ContainerStatus> returnList = new ArrayList<ContainerStatus>();
      for (Collection<ContainerStatus> containerStatusList :
          justFinishedContainers.values()) {
        returnList.addAll(containerStatusList);
      }
      return returnList;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public ConcurrentMap<NodeId, List<ContainerStatus>>
  getJustFinishedContainersReference
      () {
    this.readLock.lock();
    try {
      return this.justFinishedContainers;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public ConcurrentMap<NodeId, List<ContainerStatus>>
  getFinishedContainersSentToAMReference() {
    this.readLock.lock();
    try {
      return this.finishedContainersSentToAM;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public List<ContainerStatus> pullJustFinishedContainers() {
    this.writeLock.lock();

    try {
      List<ContainerStatus> returnList = new ArrayList<ContainerStatus>();

      // A new allocate means the AM received the previously sent
      // finishedContainers. We can ack this to NM now
      sendFinishedContainersToNM();

      // Mark every containerStatus as being sent to AM though we may return
      // only the ones that belong to the current attempt
      boolean keepContainersAcressAttempts = this.submissionContext
          .getKeepContainersAcrossApplicationAttempts();
      for (NodeId nodeId:justFinishedContainers.keySet()) {

        // Clear and get current values
        List<ContainerStatus> finishedContainers = justFinishedContainers.put
            (nodeId, new ArrayList<ContainerStatus>());

        if (keepContainersAcressAttempts) {
          returnList.addAll(finishedContainers);
        } else {
          // Filter out containers from previous attempt
          for (ContainerStatus containerStatus: finishedContainers) {
            if (containerStatus.getContainerId().getApplicationAttemptId()
                .equals(this.getAppAttemptId())) {
              returnList.add(containerStatus);
            }
          }
        }

        finishedContainersSentToAM.putIfAbsent(nodeId, new ArrayList
            <ContainerStatus>());
        finishedContainersSentToAM.get(nodeId).addAll(finishedContainers);
      }

      return returnList;
    } finally {
      this.writeLock.unlock();
    }
  }

  @Override
  public Container getMasterContainer() {
    this.readLock.lock();

    try {
      return this.masterContainer;
    } finally {
      this.readLock.unlock();
    }
  }

  @InterfaceAudience.Private
  @VisibleForTesting
  public void setMasterContainer(Container container) {
    masterContainer = container;
  }

  @Override
  public void handle(RMAppAttemptEvent event) {

    this.writeLock.lock();

    try {
      ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
      LOG.debug("Processing event for " + appAttemptID + " of type "
          + event.getType());
      final RMAppAttemptState oldState = getAppAttemptState();
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitonException e) {
        LOG.error("Can't handle this event at current state", e);
        /* TODO fail the application on the failed transition */
      }

      if (oldState != getAppAttemptState()) {
        LOG.info(appAttemptID + " State change from " + oldState + " to "
            + getAppAttemptState());
      }
    } finally {
      this.writeLock.unlock();
    }
  }

  @Override
  public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
    this.readLock.lock();
    try {
      ApplicationResourceUsageReport report =
          scheduler.getAppResourceUsageReport(this.getAppAttemptId());
      if (report == null) {
        report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
      }
      AggregateAppResourceUsage resUsage =
          this.attemptMetrics.getAggregateAppResourceUsage();
      report.setMemorySeconds(resUsage.getMemorySeconds());
      report.setVcoreSeconds(resUsage.getVcoreSeconds());
      return report;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public void recover(RMState state) {
    ApplicationStateData appState =
        state.getApplicationState().get(getAppAttemptId().getApplicationId());
    ApplicationAttemptStateData attemptState =
        appState.getAttempt(getAppAttemptId());
    assert attemptState != null;
    LOG.info("Recovering attempt: " + getAppAttemptId() + " with final state: "
        + attemptState.getState());
    diagnostics.append("Attempt recovered after RM restart");
    diagnostics.append(attemptState.getDiagnostics());
    this.amContainerExitStatus = attemptState.getAMContainerExitStatus();
    if (amContainerExitStatus == ContainerExitStatus.PREEMPTED) {
      this.attemptMetrics.setIsPreempted();
    }

    Credentials credentials = attemptState.getAppAttemptTokens();
    setMasterContainer(attemptState.getMasterContainer());
    recoverAppAttemptCredentials(credentials, attemptState.getState());
    this.recoveredFinalState = attemptState.getState();
    this.originalTrackingUrl = attemptState.getFinalTrackingUrl();
    this.finalStatus = attemptState.getFinalApplicationStatus();
    this.startTime = attemptState.getStartTime();
    this.finishTime = attemptState.getFinishTime();
    this.attemptMetrics.updateAggregateAppResourceUsage(
        attemptState.getMemorySeconds(),attemptState.getVcoreSeconds());
  }

  public void transferStateFromPreviousAttempt(RMAppAttempt attempt) {
    this.justFinishedContainers = attempt.getJustFinishedContainersReference();
    this.finishedContainersSentToAM =
        attempt.getFinishedContainersSentToAMReference();
  }

  private void recoverAppAttemptCredentials(Credentials appAttemptTokens,
      RMAppAttemptState state) {
    if (appAttemptTokens == null || state == RMAppAttemptState.FAILED
        || state == RMAppAttemptState.FINISHED
        || state == RMAppAttemptState.KILLED) {
      return;
    }

    if (UserGroupInformation.isSecurityEnabled()) {
      byte[] clientTokenMasterKeyBytes = appAttemptTokens.getSecretKey(
          RMStateStore.AM_CLIENT_TOKEN_MASTER_KEY_NAME);
      if (clientTokenMasterKeyBytes != null) {
        clientTokenMasterKey = rmContext.getClientToAMTokenSecretManager()
            .registerMasterKey(applicationAttemptId, clientTokenMasterKeyBytes);
      }
    }

    setAMRMToken(rmContext.getAMRMTokenSecretManager().createAndGetAMRMToken(
        applicationAttemptId));
  }

  private static class BaseTransition implements
      SingleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent> {

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
    }

  }

  private static final class AttemptStartedTransition extends BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      boolean transferStateFromPreviousAttempt = false;
      if (event instanceof RMAppStartAttemptEvent) {
        transferStateFromPreviousAttempt =
            ((RMAppStartAttemptEvent) event)
                .getTransferStateFromPreviousAttempt();
      }
      appAttempt.startTime = System.currentTimeMillis();

      // Register with the ApplicationMasterService
      appAttempt.masterService
          .registerAppAttempt(appAttempt.applicationAttemptId);

      if (UserGroupInformation.isSecurityEnabled()) {
        appAttempt.clientTokenMasterKey =
            appAttempt.rmContext.getClientToAMTokenSecretManager()
                .createMasterKey(appAttempt.applicationAttemptId);
      }

      // Add the applicationAttempt to the scheduler and inform the scheduler
      // whether to transfer the state from previous attempt.
      appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(
          appAttempt.applicationAttemptId, transferStateFromPreviousAttempt));
    }
  }

  private static final List<ContainerId> EMPTY_CONTAINER_RELEASE_LIST =
      new ArrayList<ContainerId>();

  private static final List<ResourceRequest> EMPTY_CONTAINER_REQUEST_LIST =
      new ArrayList<ResourceRequest>();

  @VisibleForTesting
  public static final class ScheduleTransition
      implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
      if (!subCtx.getUnmanagedAM()) {
        // Need reset #containers before create new attempt, because this request
        // will be passed to scheduler, and scheduler will deduct the number after
        // AM container allocated

        // Currently, following fields are all hard code,
        // TODO: change these fields when we want to support
        // priority/resource-name/relax-locality specification for AM containers
        // allocation.
        appAttempt.amReq.setNumContainers(1);
        appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
        appAttempt.amReq.setResourceName(ResourceRequest.ANY);
        appAttempt.amReq.setRelaxLocality(true);

        // AM resource has been checked when submission
        Allocation amContainerAllocation =
            appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
                Collections.singletonList(appAttempt.amReq),
                EMPTY_CONTAINER_RELEASE_LIST, null, null);
        if (amContainerAllocation != null
            && amContainerAllocation.getContainers() != null) {
          assert (amContainerAllocation.getContainers().size() == 0);
        }
        return RMAppAttemptState.SCHEDULED;
      } else {
        // save state and then go to LAUNCHED state
        appAttempt.storeAttempt();
        return RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING;
      }
    }
  }

  private static final class AMContainerAllocatedTransition
      implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      // Acquire the AM container from the scheduler.
      Allocation amContainerAllocation =
          appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
              EMPTY_CONTAINER_REQUEST_LIST, EMPTY_CONTAINER_RELEASE_LIST, null,
              null);
      // There must be at least one container allocated, because a
      // CONTAINER_ALLOCATED is emitted after an RMContainer is constructed,
      // and is put in SchedulerApplication#newlyAllocatedContainers.

      // Note that YarnScheduler#allocate is not guaranteed to be able to
      // fetch it since container may not be fetchable for some reason like
      // DNS unavailable causing container token not generated. As such, we
      // return to the previous state and keep retry until am container is
      // fetched.
      if (amContainerAllocation.getContainers().size() == 0) {
        appAttempt.retryFetchingAMContainer(appAttempt);
        return RMAppAttemptState.SCHEDULED;
      }

      // Set the masterContainer
      appAttempt.setMasterContainer(amContainerAllocation.getContainers()
          .get(0));
      RMContainerImpl rmMasterContainer = (RMContainerImpl)appAttempt.scheduler
          .getRMContainer(appAttempt.getMasterContainer().getId());
      rmMasterContainer.setAMContainer(true);
      // The node set in NMTokenSecrentManager is used for marking whether the
      // NMToken has been issued for this node to the AM.
      // When AM container was allocated to RM itself, the node which allocates
      // this AM container was marked as the NMToken already sent. Thus,
      // clear this node set so that the following allocate requests from AM are
      // able to retrieve the corresponding NMToken.
      appAttempt.rmContext.getNMTokenSecretManager()
          .clearNodeSetForAttempt(appAttempt.applicationAttemptId);
      appAttempt.getSubmissionContext().setResource(
          appAttempt.getMasterContainer().getResource());
      appAttempt.storeAttempt();
      return RMAppAttemptState.ALLOCATED_SAVING;
    }
  }

  private void retryFetchingAMContainer(final RMAppAttemptImpl appAttempt) {
    // start a new thread so that we are not blocking main dispatcher thread.
    new Thread() {
      @Override
      public void run() {
        try {
          Thread.sleep(500);
        } catch (InterruptedException e) {
          LOG.warn("Interrupted while waiting to resend the"
              + " ContainerAllocated Event.");
        }
        appAttempt.eventHandler.handle(
            new RMAppAttemptEvent(appAttempt.applicationAttemptId,
                RMAppAttemptEventType.CONTAINER_ALLOCATED));
      }
    }.start();
  }

  private static final class AttemptStoredTransition extends BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      appAttempt.launchAttempt();
    }
  }

  private static class AttemptRecoveredTransition
      implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      RMApp rmApp = appAttempt.rmContext.getRMApps().get(
          appAttempt.getAppAttemptId().getApplicationId());

      /*
       * If last attempt recovered final state is null .. it means attempt was
       * started but AM container may or may not have started / finished.
       * Therefore we should wait for it to finish.
       */
      if (appAttempt.recoveredFinalState != null) {
        appAttempt.progress = 1.0f;
        // We will replay the final attempt only if last attempt is in final
        // state but application is not in final state.
        if (rmApp.getCurrentAppAttempt() == appAttempt
            && !RMAppImpl.isAppInFinalState(rmApp)) {
          // Add the previous finished attempt to scheduler synchronously so
          // that scheduler knows the previous attempt.
          appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
              appAttempt.getAppAttemptId(), false, true));
          (new BaseFinalTransition(appAttempt.recoveredFinalState)).transition(
              appAttempt, event);
        }
        return appAttempt.recoveredFinalState;
      } else if (RMAppImpl.isAppInFinalState(rmApp)) {
        // Somehow attempt final state was not saved but app final state was saved.
        // Skip adding the attempt into scheduler
        RMAppState appState = ((RMAppImpl) rmApp).getRecoveredFinalState();
        LOG.warn(rmApp.getApplicationId() + " final state (" + appState
            + ") was recorded, but " + appAttempt.applicationAttemptId
            + " final state (" + appAttempt.recoveredFinalState
            + ") was not recorded.");
        switch (appState) {
        case FINISHED:
          return RMAppAttemptState.FINISHED;
        case FAILED:
          return RMAppAttemptState.FAILED;
        case KILLED:
          return RMAppAttemptState.KILLED;
        }
        return RMAppAttemptState.FAILED;
      } else {
        // Add the current attempt to the scheduler.
        if (appAttempt.rmContext.isWorkPreservingRecoveryEnabled()) {
          // Need to register an app attempt before AM can register
          appAttempt.masterService
              .registerAppAttempt(appAttempt.applicationAttemptId);

          // Add attempt to scheduler synchronously to guarantee scheduler
          // knows attempts before AM or NM re-registers.
          appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
              appAttempt.getAppAttemptId(), false, true));
        }

        /*
         * Since the application attempt's final state is not saved that means
         * for AM container (previous attempt) state must be one of these.
         * 1) AM container may not have been launched (RM failed right before
         * this).
         * 2) AM container was successfully launched but may or may not have
         * registered / unregistered.
         * In whichever case we will wait (by moving attempt into LAUNCHED
         * state) and mark this attempt failed (assuming non work preserving
         * restart) only after
         * 1) Node manager during re-registration heart beats back saying
         * am container finished.
         * 2) OR AMLivelinessMonitor expires this attempt (when am doesn't
         * heart beat back).
         */
        (new AMLaunchedTransition()).transition(appAttempt, event);
        return RMAppAttemptState.LAUNCHED;
      }
    }
  }


  private void rememberTargetTransitions(RMAppAttemptEvent event,
      Object transitionToDo, RMAppAttemptState targetFinalState) {
    transitionTodo = transitionToDo;
    targetedFinalState = targetFinalState;
    eventCausingFinalSaving = event;
  }

  private void rememberTargetTransitionsAndStoreState(RMAppAttemptEvent event,
      Object transitionToDo, RMAppAttemptState targetFinalState,
      RMAppAttemptState stateToBeStored) {

    rememberTargetTransitions(event, transitionToDo, targetFinalState);
    stateBeforeFinalSaving = getState();

    // As of today, finalState, diagnostics, final-tracking-url and
    // finalAppStatus are the only things that we store into the StateStore
    // AFTER the initial saving on app-attempt-start
    // These fields can be visible from outside only after they are saved in
    // StateStore
    String diags = null;

    // don't leave the tracking URL pointing to a non-existent AM
    if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
        YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
      setTrackingUrlToAHSPage(stateToBeStored);
    } else {
      setTrackingUrlToRMAppPage(stateToBeStored);
    }
    String finalTrackingUrl = getOriginalTrackingUrl();
    FinalApplicationStatus finalStatus = null;
    int exitStatus = ContainerExitStatus.INVALID;
    switch (event.getType()) {
    case LAUNCH_FAILED:
      diags = event.getDiagnosticMsg();
      break;
    case REGISTERED:
      diags = getUnexpectedAMRegisteredDiagnostics();
      break;
    case UNREGISTERED:
      RMAppAttemptUnregistrationEvent unregisterEvent =
          (RMAppAttemptUnregistrationEvent) event;
      diags = unregisterEvent.getDiagnosticMsg();
      // reset finalTrackingUrl to url sent by am
      finalTrackingUrl = sanitizeTrackingUrl(unregisterEvent.getFinalTrackingUrl());
      finalStatus = unregisterEvent.getFinalApplicationStatus();
      break;
    case CONTAINER_FINISHED:
      RMAppAttemptContainerFinishedEvent finishEvent =
          (RMAppAttemptContainerFinishedEvent) event;
      diags = getAMContainerCrashedDiagnostics(finishEvent);
      exitStatus = finishEvent.getContainerStatus().getExitStatus();
      break;
    case KILL:
      break;
    case EXPIRE:
      diags = getAMExpiredDiagnostics(event);
      break;
    default:
      break;
    }
    AggregateAppResourceUsage resUsage =
        this.attemptMetrics.getAggregateAppResourceUsage();
    RMStateStore rmStore = rmContext.getStateStore();
    setFinishTime(System.currentTimeMillis());

    ApplicationAttemptStateData attemptState =
        ApplicationAttemptStateData.newInstance(
            applicationAttemptId, getMasterContainer(),
            rmStore.getCredentialsFromAppAttempt(this),
            startTime, stateToBeStored, finalTrackingUrl, diags,
            finalStatus, exitStatus,
            getFinishTime(), resUsage.getMemorySeconds(),
            resUsage.getVcoreSeconds());
    LOG.info("Updating application attempt " + applicationAttemptId
        + " with final state: " + targetedFinalState + ", and exit status: "
        + exitStatus);
    rmStore.updateApplicationAttemptState(attemptState);
  }

  private static class FinalSavingTransition extends BaseTransition {

    Object transitionToDo;
    RMAppAttemptState targetedFinalState;

    public FinalSavingTransition(Object transitionToDo,
        RMAppAttemptState targetedFinalState) {
      this.transitionToDo = transitionToDo;
      this.targetedFinalState = targetedFinalState;
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      // For cases Killed/Failed, targetedFinalState is the same as the state to
      // be stored
      appAttempt.rememberTargetTransitionsAndStoreState(event, transitionToDo,
          targetedFinalState, targetedFinalState);
    }
  }

  private static class FinalStateSavedTransition implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      RMAppAttemptEvent causeEvent = appAttempt.eventCausingFinalSaving;

      if (appAttempt.transitionTodo instanceof SingleArcTransition) {
        ((SingleArcTransition) appAttempt.transitionTodo).transition(
            appAttempt, causeEvent);
      } else if (appAttempt.transitionTodo instanceof MultipleArcTransition) {
        ((MultipleArcTransition) appAttempt.transitionTodo).transition(
            appAttempt, causeEvent);
      }
      return appAttempt.targetedFinalState;
    }
  }

  private static class BaseFinalTransition extends BaseTransition {

    private final RMAppAttemptState finalAttemptState;

    public BaseFinalTransition(RMAppAttemptState finalAttemptState) {
      this.finalAttemptState = finalAttemptState;
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      ApplicationAttemptId appAttemptId = appAttempt.getAppAttemptId();

      // Tell the AMS. Unregister from the ApplicationMasterService
      appAttempt.masterService.unregisterAttempt(appAttemptId);

      // Tell the application and the scheduler
      ApplicationId applicationId = appAttemptId.getApplicationId();
      RMAppEvent appEvent = null;
      boolean keepContainersAcrossAppAttempts = false;
      switch (finalAttemptState) {
        case FINISHED:
        {
          appEvent =
              new RMAppEvent(applicationId, RMAppEventType.ATTEMPT_FINISHED,
                  appAttempt.getDiagnostics());
        }
        break;
        case KILLED:
        {
          appAttempt.invalidateAMHostAndPort();
          // Forward diagnostics received in attempt kill event.
          appEvent =
              new RMAppFailedAttemptEvent(applicationId,
                  RMAppEventType.ATTEMPT_KILLED,
                  event.getDiagnosticMsg(), false);
        }
        break;
        case FAILED:
        {
          appAttempt.invalidateAMHostAndPort();

          if (appAttempt.submissionContext
              .getKeepContainersAcrossApplicationAttempts()
              && !appAttempt.submissionContext.getUnmanagedAM()) {
            // See if we should retain containers for non-unmanaged applications
            if (!appAttempt.shouldCountTowardsMaxAttemptRetry()) {
              // Premption, hardware failures, NM resync doesn't count towards
              // app-failures and so we should retain containers.
              keepContainersAcrossAppAttempts = true;
            } else if (!appAttempt.maybeLastAttempt) {
              // Not preemption, hardware failures or NM resync.
              // Not last-attempt too - keep containers.
              keepContainersAcrossAppAttempts = true;
            }
          }
          appEvent =
              new RMAppFailedAttemptEvent(applicationId,
                  RMAppEventType.ATTEMPT_FAILED, appAttempt.getDiagnostics(),
                  keepContainersAcrossAppAttempts);

        }
        break;
        default:
        {
          LOG.error("Cannot get this state!! Error!!");
        }
        break;
      }

      appAttempt.eventHandler.handle(appEvent);
      appAttempt.eventHandler.handle(new AppAttemptRemovedSchedulerEvent(
          appAttemptId, finalAttemptState, keepContainersAcrossAppAttempts));
      appAttempt.removeCredentials(appAttempt);

      appAttempt.rmContext.getRMApplicationHistoryWriter()
          .applicationAttemptFinished(appAttempt, finalAttemptState);
      appAttempt.rmContext.getSystemMetricsPublisher()
          .appAttemptFinished(appAttempt, finalAttemptState,
              appAttempt.rmContext.getRMApps().get(
                  appAttempt.applicationAttemptId.getApplicationId()),
              System.currentTimeMillis());
    }
  }

  private static class AMLaunchedTransition extends BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      if (event.getType() == RMAppAttemptEventType.LAUNCHED) {
        appAttempt.launchAMEndTime = System.currentTimeMillis();
        long delay = appAttempt.launchAMEndTime -
            appAttempt.launchAMStartTime;
        ClusterMetrics.getMetrics().addAMLaunchDelay(delay);
      }
      // Register with AMLivelinessMonitor
      appAttempt.attemptLaunched();

      // register the ClientTokenMasterKey after it is saved in the store,
      // otherwise client may hold an invalid ClientToken after RM restarts.
      if (UserGroupInformation.isSecurityEnabled()) {
        appAttempt.rmContext.getClientToAMTokenSecretManager()
            .registerApplication(appAttempt.getAppAttemptId(),
                appAttempt.getClientTokenMasterKey());
      }
    }
  }

  @Override
  public boolean shouldCountTowardsMaxAttemptRetry() {
    try {
      this.readLock.lock();
      int exitStatus = getAMContainerExitStatus();
      return !(exitStatus == ContainerExitStatus.PREEMPTED
          || exitStatus == ContainerExitStatus.ABORTED
          || exitStatus == ContainerExitStatus.DISKS_FAILED
          || exitStatus == ContainerExitStatus.KILLED_BY_RESOURCEMANAGER);
    } finally {
      this.readLock.unlock();
    }
  }

  private static final class UnmanagedAMAttemptSavedTransition
      extends AMLaunchedTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      // create AMRMToken
      appAttempt.amrmToken =
          appAttempt.rmContext.getAMRMTokenSecretManager().createAndGetAMRMToken(
              appAttempt.applicationAttemptId);

      super.transition(appAttempt, event);
    }
  }

  private static final class LaunchFailedTransition extends BaseFinalTransition {

    public LaunchFailedTransition() {
      super(RMAppAttemptState.FAILED);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      // Use diagnostic from launcher
      appAttempt.diagnostics.append(event.getDiagnosticMsg());

      // Tell the app, scheduler
      super.transition(appAttempt, event);

    }
  }

  private static final class KillAllocatedAMTransition extends
      BaseFinalTransition {
    public KillAllocatedAMTransition() {
      super(RMAppAttemptState.KILLED);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      // Tell the application and scheduler
      super.transition(appAttempt, event);

      // Tell the launcher to cleanup.
      appAttempt.eventHandler.handle(new AMLauncherEvent(
          AMLauncherEventType.CLEANUP, appAttempt));

    }
  }

  private static final class AMRegisteredTransition extends BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      long delay = System.currentTimeMillis() - appAttempt.launchAMEndTime;
      ClusterMetrics.getMetrics().addAMRegisterDelay(delay);
      RMAppAttemptRegistrationEvent registrationEvent
          = (RMAppAttemptRegistrationEvent) event;
      appAttempt.host = registrationEvent.getHost();
      appAttempt.rpcPort = registrationEvent.getRpcport();
      appAttempt.originalTrackingUrl =
          sanitizeTrackingUrl(registrationEvent.getTrackingurl());

      // Let the app know
      appAttempt.eventHandler.handle(new RMAppEvent(appAttempt
          .getAppAttemptId().getApplicationId(),
          RMAppEventType.ATTEMPT_REGISTERED));

      // TODO:FIXME: Note for future. Unfortunately we only do a state-store
      // write at AM launch time, so we don't save the AM's tracking URL anywhere
      // as that would mean an extra state-store write. For now, we hope that in
      // work-preserving restart, AMs are forced to reregister.

      appAttempt.rmContext.getRMApplicationHistoryWriter()
          .applicationAttemptStarted(appAttempt);
      appAttempt.rmContext.getSystemMetricsPublisher()
          .appAttemptRegistered(appAttempt, System.currentTimeMillis());
    }
  }

  private static final class AMContainerCrashedBeforeRunningTransition extends
      BaseFinalTransition {

    public AMContainerCrashedBeforeRunningTransition() {
      super(RMAppAttemptState.FAILED);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      RMAppAttemptContainerFinishedEvent finishEvent =
          ((RMAppAttemptContainerFinishedEvent)event);

      // UnRegister from AMLivelinessMonitor
      appAttempt.rmContext.getAMLivelinessMonitor().unregister(
          appAttempt.getAppAttemptId());

      // Setup diagnostic message and exit status
      appAttempt.setAMContainerCrashedDiagnosticsAndExitStatus(finishEvent);

      // Tell the app, scheduler
      super.transition(appAttempt, finishEvent);
    }
  }

  private void setAMContainerCrashedDiagnosticsAndExitStatus(
      RMAppAttemptContainerFinishedEvent finishEvent) {
    ContainerStatus status = finishEvent.getContainerStatus();
    String diagnostics = getAMContainerCrashedDiagnostics(finishEvent);
    this.diagnostics.append(diagnostics);
    this.amContainerExitStatus = status.getExitStatus();
  }

  private String getAMContainerCrashedDiagnostics(
      RMAppAttemptContainerFinishedEvent finishEvent) {
    ContainerStatus status = finishEvent.getContainerStatus();
    StringBuilder diagnosticsBuilder = new StringBuilder();
    diagnosticsBuilder.append("AM Container for ").append(
        finishEvent.getApplicationAttemptId()).append(
        " exited with ").append(" exitCode: ").append(status.getExitStatus()).
        append("\n");
    if (this.getTrackingUrl() != null) {
      diagnosticsBuilder.append("For more detailed output,").append(
          " check application tracking page:").append(
          this.getTrackingUrl()).append(
          "Then, click on links to logs of each attempt.\n");
    }
    diagnosticsBuilder.append("Diagnostics: ").append(status.getDiagnostics())
        .append("Failing this attempt");
    return diagnosticsBuilder.toString();
  }

  private static class FinalTransition extends BaseFinalTransition {

    public FinalTransition(RMAppAttemptState finalAttemptState) {
      super(finalAttemptState);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      appAttempt.progress = 1.0f;

      // Tell the app and the scheduler
      super.transition(appAttempt, event);

      // UnRegister from AMLivelinessMonitor. Perhaps for
      // FAILING/KILLED/UnManaged AMs
      appAttempt.rmContext.getAMLivelinessMonitor().unregister(
          appAttempt.getAppAttemptId());
      appAttempt.rmContext.getAMFinishingMonitor().unregister(
          appAttempt.getAppAttemptId());

      if(!appAttempt.submissionContext.getUnmanagedAM()) {
        // Tell the launcher to cleanup.
        appAttempt.eventHandler.handle(new AMLauncherEvent(
            AMLauncherEventType.CLEANUP, appAttempt));
      }
    }
  }

  private static class ExpiredTransition extends FinalTransition {

    public ExpiredTransition() {
      super(RMAppAttemptState.FAILED);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      appAttempt.diagnostics.append(getAMExpiredDiagnostics(event));
      super.transition(appAttempt, event);
    }
  }

  private static String getAMExpiredDiagnostics(RMAppAttemptEvent event) {
    String diag =
        "ApplicationMaster for attempt " + event.getApplicationAttemptId()
            + " timed out";
    return diag;
  }

  private static class UnexpectedAMRegisteredTransition extends
      BaseFinalTransition {

    public UnexpectedAMRegisteredTransition() {
      super(RMAppAttemptState.FAILED);
    }

    @Override
    public void transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      assert appAttempt.submissionContext.getUnmanagedAM();
      appAttempt.diagnostics.append(getUnexpectedAMRegisteredDiagnostics());
      super.transition(appAttempt, event);
    }

  }

  private static String getUnexpectedAMRegisteredDiagnostics() {
    return "Unmanaged AM must register after AM attempt reaches LAUNCHED state.";
  }

  private static final class StatusUpdateTransition extends
      BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      RMAppAttemptStatusupdateEvent statusUpdateEvent
          = (RMAppAttemptStatusupdateEvent) event;

      // Update progress
      appAttempt.progress = statusUpdateEvent.getProgress();

      // Ping to AMLivelinessMonitor
      appAttempt.rmContext.getAMLivelinessMonitor().receivedPing(
          statusUpdateEvent.getApplicationAttemptId());
    }
  }

  private static final class AMUnregisteredTransition implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {

    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {
      // Tell the app
      if (appAttempt.getSubmissionContext().getUnmanagedAM()) {
        // Unmanaged AMs have no container to wait for, so they skip
        // the FINISHING state and go straight to FINISHED.
        appAttempt.updateInfoOnAMUnregister(event);
        new FinalTransition(RMAppAttemptState.FINISHED).transition(
            appAttempt, event);
        return RMAppAttemptState.FINISHED;
      }
      // Saving the attempt final state
      appAttempt.rememberTargetTransitionsAndStoreState(event,
          new FinalStateSavedAfterAMUnregisterTransition(),
          RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED);
      ApplicationId applicationId =
          appAttempt.getAppAttemptId().getApplicationId();

      // Tell the app immediately that AM is unregistering so that app itself
      // can save its state as soon as possible. Whether we do it like this, or
      // we wait till AppAttempt is saved, it doesn't make any difference on the
      // app side w.r.t failure conditions. The only event going out of
      // AppAttempt to App after this point of time is AM/AppAttempt Finished.
      appAttempt.eventHandler.handle(new RMAppEvent(applicationId,
          RMAppEventType.ATTEMPT_UNREGISTERED));
      return RMAppAttemptState.FINAL_SAVING;
    }
  }

  private static class FinalStateSavedAfterAMUnregisterTransition extends
      BaseTransition {
    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      // Unregister from the AMlivenessMonitor and register with AMFinishingMonitor
      appAttempt.rmContext.getAMLivelinessMonitor().unregister(
          appAttempt.applicationAttemptId);
      appAttempt.rmContext.getAMFinishingMonitor().register(
          appAttempt.applicationAttemptId);

      // Do not make any more changes to this transition code. Make all changes
      // to the following method. Unless you are absolutely sure that you have
      // stuff to do that shouldn't be used by the callers of the following
      // method.
      appAttempt.updateInfoOnAMUnregister(event);
    }
  }

  private void updateInfoOnAMUnregister(RMAppAttemptEvent event) {
    progress = 1.0f;
    RMAppAttemptUnregistrationEvent unregisterEvent =
        (RMAppAttemptUnregistrationEvent) event;
    diagnostics.append(unregisterEvent.getDiagnosticMsg());
    originalTrackingUrl = sanitizeTrackingUrl(unregisterEvent.getFinalTrackingUrl());
    finalStatus = unregisterEvent.getFinalApplicationStatus();
  }

  private static final class ContainerFinishedTransition
      implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {

    // The transition To Do after attempt final state is saved.
    private BaseTransition transitionToDo;
    private RMAppAttemptState currentState;

    public ContainerFinishedTransition(BaseTransition transitionToDo,
        RMAppAttemptState currentState) {
      this.transitionToDo = transitionToDo;
      this.currentState = currentState;
    }

    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      RMAppAttemptContainerFinishedEvent containerFinishedEvent =
          (RMAppAttemptContainerFinishedEvent) event;
      ContainerStatus containerStatus =
          containerFinishedEvent.getContainerStatus();

      // Is this container the AmContainer? If the finished container is same as
      // the AMContainer, AppAttempt fails
      if (appAttempt.masterContainer != null
          && appAttempt.masterContainer.getId().equals(
              containerStatus.getContainerId())) {
        appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);

        // Remember the follow up transition and save the final attempt state.
        appAttempt.rememberTargetTransitionsAndStoreState(event,
            transitionToDo, RMAppAttemptState.FAILED, RMAppAttemptState.FAILED);
        return RMAppAttemptState.FINAL_SAVING;
      }

      // Add all finished containers so that they can be acked to NM
      addJustFinishedContainer(appAttempt, containerFinishedEvent);
      return this.currentState;
    }
  }


  // Ack NM to remove finished containers from context.
  private void sendFinishedContainersToNM() {
    for (NodeId nodeId : finishedContainersSentToAM.keySet()) {

      // Clear and get current values
      List<ContainerStatus> currentSentContainers =
          finishedContainersSentToAM.put(nodeId,
              new ArrayList<ContainerStatus>());
      List<ContainerId> containerIdList =
          new ArrayList<ContainerId>(currentSentContainers.size());
      for (ContainerStatus containerStatus : currentSentContainers) {
        containerIdList.add(containerStatus.getContainerId());
      }
      eventHandler.handle(new RMNodeFinishedContainersPulledByAMEvent(nodeId,
          containerIdList));
    }
  }

  // Add am container to the list so that am container instance will be
  // removed from NMContext.
  private void sendAMContainerToNM(RMAppAttemptImpl appAttempt,
      RMAppAttemptContainerFinishedEvent containerFinishedEvent) {
    NodeId nodeId = containerFinishedEvent.getNodeId();
    finishedContainersSentToAM.putIfAbsent(nodeId,
        new ArrayList<ContainerStatus>());
    appAttempt.finishedContainersSentToAM.get(nodeId).add(
        containerFinishedEvent.getContainerStatus());
    if (!appAttempt.getSubmissionContext()
        .getKeepContainersAcrossApplicationAttempts()) {
      appAttempt.sendFinishedContainersToNM();
    }
  }

  private static void addJustFinishedContainer(RMAppAttemptImpl appAttempt,
      RMAppAttemptContainerFinishedEvent containerFinishedEvent) {
    appAttempt.justFinishedContainers.putIfAbsent(containerFinishedEvent
        .getNodeId(), new ArrayList<ContainerStatus>());
    appAttempt.justFinishedContainers.get(containerFinishedEvent
        .getNodeId()).add(containerFinishedEvent.getContainerStatus());
  }

  private static final class ContainerFinishedAtFinalStateTransition
      extends BaseTransition {
    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      RMAppAttemptContainerFinishedEvent containerFinishedEvent =
          (RMAppAttemptContainerFinishedEvent) event;

      // Normal container. Add it in completed containers list
      addJustFinishedContainer(appAttempt, containerFinishedEvent);
    }
  }

  private static class AMContainerCrashedAtRunningTransition extends
      BaseTransition {
    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      RMAppAttemptContainerFinishedEvent finishEvent =
          (RMAppAttemptContainerFinishedEvent) event;
      // container associated with AM. must not be unmanaged
      assert appAttempt.submissionContext.getUnmanagedAM() == false;
      // Setup diagnostic message and exit status
      appAttempt.setAMContainerCrashedDiagnosticsAndExitStatus(finishEvent);
      new FinalTransition(RMAppAttemptState.FAILED).transition(appAttempt,
          event);
    }
  }

  private static final class AMFinishingContainerFinishedTransition
      implements
      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {

    @Override
    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      RMAppAttemptContainerFinishedEvent containerFinishedEvent
          = (RMAppAttemptContainerFinishedEvent) event;
      ContainerStatus containerStatus =
          containerFinishedEvent.getContainerStatus();

      // Is this container the ApplicationMaster container?
      if (appAttempt.masterContainer.getId().equals(
          containerStatus.getContainerId())) {
        new FinalTransition(RMAppAttemptState.FINISHED).transition(
            appAttempt, containerFinishedEvent);
        appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);
        return RMAppAttemptState.FINISHED;
      }
      // Add all finished containers so that they can be acked to NM.
      addJustFinishedContainer(appAttempt, containerFinishedEvent);

      return RMAppAttemptState.FINISHING;
    }
  }

  private static class ContainerFinishedAtFinalSavingTransition extends
      BaseTransition {
    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      RMAppAttemptContainerFinishedEvent containerFinishedEvent =
          (RMAppAttemptContainerFinishedEvent) event;
      ContainerStatus containerStatus =
          containerFinishedEvent.getContainerStatus();

      // If this is the AM container, it means the AM container is finished,
      // but we are not yet acknowledged that the final state has been saved.
      // Thus, we still return FINAL_SAVING state here.
      if (appAttempt.masterContainer.getId().equals(
          containerStatus.getContainerId())) {
        appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);

        if (appAttempt.targetedFinalState.equals(RMAppAttemptState.FAILED)
            || appAttempt.targetedFinalState.equals(RMAppAttemptState.KILLED)) {
          // ignore Container_Finished Event if we were supposed to reach
          // FAILED/KILLED state.
          return;
        }

        // pass in the earlier AMUnregistered Event also, as this is needed for
        // AMFinishedAfterFinalSavingTransition later on
        appAttempt.rememberTargetTransitions(event,
            new AMFinishedAfterFinalSavingTransition(
                appAttempt.eventCausingFinalSaving), RMAppAttemptState.FINISHED);
        return;
      }

      // Add all finished containers so that they can be acked to NM.
      addJustFinishedContainer(appAttempt, containerFinishedEvent);
    }
  }

  private static class AMFinishedAfterFinalSavingTransition extends
      BaseTransition {
    RMAppAttemptEvent amUnregisteredEvent;
    public AMFinishedAfterFinalSavingTransition(
        RMAppAttemptEvent amUnregisteredEvent) {
      this.amUnregisteredEvent = amUnregisteredEvent;
    }

    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      appAttempt.updateInfoOnAMUnregister(amUnregisteredEvent);
      new FinalTransition(RMAppAttemptState.FINISHED).transition(appAttempt,
          event);
    }
  }

  private static class AMExpiredAtFinalSavingTransition extends
      BaseTransition {
    @Override
    public void
        transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
      if (appAttempt.targetedFinalState.equals(RMAppAttemptState.FAILED)
          || appAttempt.targetedFinalState.equals(RMAppAttemptState.KILLED)) {
        // ignore Container_Finished Event if we were supposed to reach
        // FAILED/KILLED state.
        return;
      }

      // pass in the earlier AMUnregistered Event also, as this is needed for
      // AMFinishedAfterFinalSavingTransition later on
      appAttempt.rememberTargetTransitions(event,
          new AMFinishedAfterFinalSavingTransition(
              appAttempt.eventCausingFinalSaving), RMAppAttemptState.FINISHED);
    }
  }

  @Override
  public long getStartTime() {
    this.readLock.lock();
    try {
      return this.startTime;
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public RMAppAttemptState getState() {
    this.readLock.lock();

    try {
      return this.stateMachine.getCurrentState();
    } finally {
      this.readLock.unlock();
    }
  }

  @Override
  public YarnApplicationAttemptState createApplicationAttemptState() {
    RMAppAttemptState state = getState();
    // If AppAttempt is in FINAL_SAVING state, return its previous state.
    if (state.equals(RMAppAttemptState.FINAL_SAVING)) {
      state = stateBeforeFinalSaving;
    }
    return RMServerUtils.createApplicationAttemptState(state);
  }

  private void launchAttempt(){
    launchAMStartTime = System.currentTimeMillis();
    // Send event to launch the AM Container
    eventHandler.handle(new AMLauncherEvent(AMLauncherEventType.LAUNCH, this));
  }

  private void attemptLaunched() {
    // Register with AMLivelinessMonitor
    rmContext.getAMLivelinessMonitor().register(getAppAttemptId());
  }

  private void storeAttempt() {
    // store attempt data in a non-blocking manner to prevent dispatcher
    // thread starvation and wait for state to be saved
    LOG.info("Storing attempt: AppId: " +
        getAppAttemptId().getApplicationId()
        + " AttemptId: " +
        getAppAttemptId()
        + " MasterContainer: " + masterContainer);
    rmContext.getStateStore().storeNewApplicationAttempt(this);
  }

  private void removeCredentials(RMAppAttemptImpl appAttempt) {
    // Unregister from the ClientToAMTokenSecretManager
    if (UserGroupInformation.isSecurityEnabled()) {
      appAttempt.rmContext.getClientToAMTokenSecretManager()
          .unRegisterApplication(appAttempt.getAppAttemptId());
    }

    // Remove the AppAttempt from the AMRMTokenSecretManager
    appAttempt.rmContext.getAMRMTokenSecretManager()
        .applicationMasterFinished(appAttempt.getAppAttemptId());
  }

  private static String sanitizeTrackingUrl(String url) {
    return (url == null || url.trim().isEmpty()) ? "N/A" : url;
  }

  @Override
  public ApplicationAttemptReport createApplicationAttemptReport() {
    this.readLock.lock();
    ApplicationAttemptReport attemptReport = null;
    try {
      // AM container maybe not yet allocated. and also unmangedAM doesn't have
      // am container.
      ContainerId amId =
          masterContainer == null ? null : masterContainer.getId();
      attemptReport = ApplicationAttemptReport.newInstance(this
          .getAppAttemptId(), this.getHost(), this.getRpcPort(), this
          .getTrackingUrl(), this.getOriginalTrackingUrl(), this.getDiagnostics(),
          YarnApplicationAttemptState .valueOf(this.getState().toString()), amId);
    } finally {
      this.readLock.unlock();
    }
    return attemptReport;
  }

  // for testing
  public boolean mayBeLastAttempt() {
    return maybeLastAttempt;
  }

  @Override
  public RMAppAttemptMetrics getRMAppAttemptMetrics() {
    // didn't use read/write lock here because RMAppAttemptMetrics has its own
    // lock
    return attemptMetrics;
  }

  @Override
  public long getFinishTime() {
    try {
      this.readLock.lock();
      return this.finishTime;
    } finally {
      this.readLock.unlock();
    }
  }

  private void setFinishTime(long finishTime) {
    try {
      this.writeLock.lock();
      this.finishTime = finishTime;
    } finally {
      this.writeLock.unlock();
    }
  }
}
RMAppAttemptImpl.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
After RMAppAttemptImpl receives the RMAppAttemptEventType.START event, it performs a series of initialization steps. RMAppAttemptImpl also has its own state machine factory. As in the earlier analysis, we start from the handle() function of RMAppAttemptImpl, shown below:
// RMAppAttemptImpl.java
@Override
public void handle(RMAppAttemptEvent event) {

  this.writeLock.lock();

  try {
    ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
    LOG.debug("Processing event for " + appAttemptID + " of type "
        + event.getType());
    final RMAppAttemptState oldState = getAppAttemptState();
    try {
      /* keep the master in sync with the state machine */
      this.stateMachine.doTransition(event.getType(), event);
    } catch (InvalidStateTransitonException e) {
      LOG.error("Can't handle this event at current state", e);
      /* TODO fail the application on the failed transition */
    }

    if (oldState != getAppAttemptState()) {
      LOG.info(appAttemptID + " State change from " + oldState + " to "
          + getAppAttemptState());
    }
  } finally {
    this.writeLock.unlock();
  }
}
handle() calls this.stateMachine.doTransition(event.getType(), event). The field this.stateMachine is initialized in the constructor of RMAppAttemptImpl, shown below:
//RMAppAttemptImpl.java
public RMAppAttemptImpl(ApplicationAttemptId appAttemptId,
    RMContext rmContext, YarnScheduler scheduler,
    ApplicationMasterService masterService,
    ApplicationSubmissionContext submissionContext,
    Configuration conf, boolean maybeLastAttempt, ResourceRequest amReq) {
  this.conf = conf;
  this.applicationAttemptId = appAttemptId;
  this.rmContext = rmContext;
  this.eventHandler = rmContext.getDispatcher().getEventHandler();
  this.submissionContext = submissionContext;
  this.scheduler = scheduler;
  this.masterService = masterService;

  ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  this.readLock = lock.readLock();
  this.writeLock = lock.writeLock();

  this.proxiedTrackingUrl = generateProxyUriWithScheme();
  this.maybeLastAttempt = maybeLastAttempt;
  this.stateMachine = stateMachineFactory.make(this);

  this.attemptMetrics =
      new RMAppAttemptMetrics(applicationAttemptId, rmContext);

  this.amReq = amReq;
}
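this.stateMachine = stateMachineFactory.make(this) builds the state machine for this attempt from the static StateMachineFactory. The state transitions themselves are driven later by doTransition(), using the transition table that the factory declares, as shown below: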
//RMAppAttemptImpl.java
private static final StateMachineFactory<RMAppAttemptImpl,
                                         RMAppAttemptState,
                                         RMAppAttemptEventType,
                                         RMAppAttemptEvent>
    stateMachineFactory = new StateMachineFactory<RMAppAttemptImpl,
                                                  RMAppAttemptState,
                                                  RMAppAttemptEventType,
                                                  RMAppAttemptEvent>(RMAppAttemptState.NEW)

    // Transitions from NEW State
    .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.SUBMITTED,
        RMAppAttemptEventType.START, new AttemptStartedTransition())
    ......
    .installTopology();
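On receiving the RMAppAttemptEventType.START event, RMAppAttemptImpl moves from RMAppAttemptState.NEW to RMAppAttemptState.SUBMITTED and invokes AttemptStartedTransition. The state enum and the transition class are listed below: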
public enum RMAppAttemptState {
  NEW, SUBMITTED, SCHEDULED, ALLOCATED, LAUNCHED, FAILED, RUNNING, FINISHING,
  FINISHED, KILLED, ALLOCATED_SAVING, LAUNCHED_UNMANAGED_SAVING, FINAL_SAVING
}
RMAppAttemptState.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptState.java
// AttemptStartedTransition, an inner class of RMAppAttemptImpl.java
private static final class AttemptStartedTransition extends BaseTransition {
  @Override
  public void transition(RMAppAttemptImpl appAttempt,
      RMAppAttemptEvent event) {

    boolean transferStateFromPreviousAttempt = false;
    if (event instanceof RMAppStartAttemptEvent) {
      transferStateFromPreviousAttempt =
          ((RMAppStartAttemptEvent) event)
            .getTransferStateFromPreviousAttempt();
    }
    appAttempt.startTime = System.currentTimeMillis();

    // Register with the ApplicationMasterService
    appAttempt.masterService
      .registerAppAttempt(appAttempt.applicationAttemptId);

    if (UserGroupInformation.isSecurityEnabled()) {
      appAttempt.clientTokenMasterKey =
          appAttempt.rmContext.getClientToAMTokenSecretManager()
            .createMasterKey(appAttempt.applicationAttemptId);
    }

    // Add the applicationAttempt to the scheduler and inform the scheduler
    // whether to transfer the state from previous attempt.
    appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(
      appAttempt.applicationAttemptId, transferStateFromPreviousAttempt));
  }
}
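To make the mechanism concrete, here is a minimal sketch of a table-driven state machine in the same spirit: a transition is registered as (from state, to state, event type, hook), and doTransition() looks up the arc for the current state, runs the hook, and then switches state. Every name in it (SketchState, SketchEvent, AttemptStateMachineSketch) is made up for illustration. It is not Hadoop's StateMachineFactory, which is generic over the operand and event types and separates the factory from the per-object state machine.

```java
import java.util.EnumMap;
import java.util.Map;

// All names here are hypothetical; this is not Hadoop's StateMachineFactory.
enum SketchState { NEW, SUBMITTED }
enum SketchEvent { START }

class AttemptStateMachineSketch {
    /** One registered arc: the state to move to and the hook to run on the way. */
    private static final class Arc {
        final SketchState target;
        final Runnable hook;
        Arc(SketchState target, Runnable hook) {
            this.target = target;
            this.hook = hook;
        }
    }

    // Transition table: current state -> (event type -> arc).
    private final Map<SketchState, Map<SketchEvent, Arc>> table =
        new EnumMap<>(SketchState.class);
    private SketchState current;

    AttemptStateMachineSketch(SketchState initial) {
        this.current = initial;
    }

    AttemptStateMachineSketch addTransition(SketchState from, SketchState to,
                                            SketchEvent on, Runnable hook) {
        table.computeIfAbsent(from, k -> new EnumMap<>(SketchEvent.class))
             .put(on, new Arc(to, hook));
        return this;
    }

    /** Runs the hook for (current state, event), then moves to the target state. */
    SketchState doTransition(SketchEvent event) {
        Map<SketchEvent, Arc> arcs = table.get(current);
        Arc arc = (arcs == null) ? null : arcs.get(event);
        if (arc == null) {
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        arc.hook.run();
        current = arc.target;
        return current;
    }

    public static void main(String[] args) {
        AttemptStateMachineSketch sm = new AttemptStateMachineSketch(SketchState.NEW)
            .addTransition(SketchState.NEW, SketchState.SUBMITTED, SketchEvent.START,
                () -> System.out.println("hook: attempt started"));
        System.out.println(sm.doTransition(SketchEvent.START));
    }
}
```

An event with no registered arc for the current state is rejected with an exception, which corresponds to the catch block for InvalidStateTransitonException in handle().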
At the end, AttemptStartedTransition calls appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(appAttempt.applicationAttemptId, transferStateFromPreviousAttempt)), which constructs an AppAttemptAddedSchedulerEvent. Its constructors are shown below:
//AppAttemptAddedSchedulerEvent.java
public AppAttemptAddedSchedulerEvent(
    ApplicationAttemptId applicationAttemptId,
    boolean transferStateFromPreviousAttempt) {
  this(applicationAttemptId, transferStateFromPreviousAttempt, false);
}

public AppAttemptAddedSchedulerEvent(
    ApplicationAttemptId applicationAttemptId,
    boolean transferStateFromPreviousAttempt,
    boolean isAttemptRecovering) {
  super(SchedulerEventType.APP_ATTEMPT_ADDED);
  this.applicationAttemptId = applicationAttemptId;
  this.transferStateFromPreviousAttempt = transferStateFromPreviousAttempt;
  this.isAttemptRecovering = isAttemptRecovering;
}
The constructor tags the event as SchedulerEventType.APP_ATTEMPT_ADDED, so what gets sent is a SchedulerEventType.APP_ATTEMPT_ADDED event. Events of type SchedulerEventType are bound in the serviceInit() function of RMActiveServices, an inner class of ResourceManager, to schedulerDispatcher, an object of type EventHandler<SchedulerEvent>. As in the earlier analysis, the event ultimately arrives at the handle() function of the default scheduler, CapacityScheduler, shown below:
//CapacityScheduler.java
public void handle(SchedulerEvent event) {
  switch(event.getType()) {
  ......
  case APP_ATTEMPT_ADDED:
  {
    AppAttemptAddedSchedulerEvent appAttemptAddedEvent =
        (AppAttemptAddedSchedulerEvent) event;
    addApplicationAttempt(appAttemptAddedEvent.getApplicationAttemptId(),
      appAttemptAddedEvent.getTransferStateFromPreviousAttempt(),
      appAttemptAddedEvent.getIsAttemptRecovering());
  }
  break;
  ......
  }
}
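The routing described above, where a whole family of event types is bound to one handler, can be sketched as a registry keyed by the enum class of the event type. This is only an illustration under simplifying assumptions: DispatcherSketch, SketchEventBase, SketchHandler and the two enums are hypothetical names, and the sketch dispatches synchronously on the caller's thread, whereas the ResourceManager's dispatcher queues events and delivers them asynchronously.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: synchronous and single-threaded, unlike YARN's real dispatcher.
interface SketchEventBase { Enum<?> getType(); }
interface SketchHandler { void handle(SketchEventBase event); }

enum SchedulerEventKind { APP_ATTEMPT_ADDED }
enum AttemptEventKind { ATTEMPT_ADDED }

class DispatcherSketch {
    // Handlers are keyed by the enum class of the event type, not by individual constant.
    private final Map<Class<?>, SketchHandler> handlers = new HashMap<>();

    void register(Class<? extends Enum<?>> eventTypeClass, SketchHandler handler) {
        handlers.put(eventTypeClass, handler);
    }

    void dispatch(SketchEventBase event) {
        Class<?> key = event.getType().getDeclaringClass();
        SketchHandler handler = handlers.get(key);
        if (handler == null) {
            throw new IllegalStateException("No handler registered for " + key);
        }
        handler.handle(event);
    }

    public static void main(String[] args) {
        DispatcherSketch d = new DispatcherSketch();
        d.register(AttemptEventKind.class,
            e -> System.out.println("attempt got " + e.getType()));
        // The scheduler-side handler answers by sending an event back to the attempt.
        d.register(SchedulerEventKind.class, e -> {
            System.out.println("scheduler got " + e.getType());
            d.dispatch(() -> AttemptEventKind.ATTEMPT_ADDED);
        });
        d.dispatch(() -> SchedulerEventKind.APP_ATTEMPT_ADDED);
    }
}
```

The main method mirrors the round trip in this part of the walkthrough: a scheduler event goes out, and the scheduler-side handler replies with an attempt event.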
The APP_ATTEMPT_ADDED branch of CapacityScheduler.handle() calls addApplicationAttempt(). That function is shown below:
 1  //CapacityScheduler.java
 2  private synchronized void addApplicationAttempt(
 3      ApplicationAttemptId applicationAttemptId,
 4      boolean transferStateFromPreviousAttempt,
 5      boolean isAttemptRecovering) {
 6    SchedulerApplication<FiCaSchedulerApp> application =
 7        applications.get(applicationAttemptId.getApplicationId());
 8    if (application == null) {
 9      LOG.warn("Application " + applicationAttemptId.getApplicationId() +
10          " cannot be found in scheduler.");
11      return;
12    }
13    CSQueue queue = (CSQueue) application.getQueue();
14
15    FiCaSchedulerApp attempt =
16        new FiCaSchedulerApp(applicationAttemptId, application.getUser(),
17          queue, queue.getActiveUsersManager(), rmContext);
18    if (transferStateFromPreviousAttempt) {
19      attempt.transferStateFromPreviousAttempt(application
20        .getCurrentAppAttempt());
21    }
22    application.setCurrentAppAttempt(attempt);
23
24    queue.submitApplicationAttempt(attempt, application.getUser());
25    LOG.info("Added Application Attempt " + applicationAttemptId
26        + " to scheduler from user " + application.getUser() + " in queue "
27        + queue.getQueueName());
28    if (isAttemptRecovering) {
29      if (LOG.isDebugEnabled()) {
30        LOG.debug(applicationAttemptId
31            + " is recovering. Skipping notifying ATTEMPT_ADDED");
32      }
33    } else {
34      rmContext.getDispatcher().getEventHandler().handle(
35        new RMAppAttemptEvent(applicationAttemptId,
36            RMAppAttemptEventType.ATTEMPT_ADDED));
37    }
38  }
Lines 24 and 25 of addApplicationAttempt() add the attempt to the queue and log a message, for example: Added Application Attempt appattempt_1487944669971_0001_000001 to scheduler from user root in queue default
Lines 34 to 36 send the RMAppAttemptEventType.ATTEMPT_ADDED event to RMAppAttemptImpl. The path is the same as before: events of type RMAppAttemptEventType are bound in the serviceInit() function of RMActiveServices, an inner class of ResourceManager, to the inner class ApplicationAttemptEventDispatcher. That class calls rmAppAttempt.handle(event), i.e. the handle() function of RMAppAttemptImpl, which in turn calls this.stateMachine.doTransition(event.getType(), event) and triggers a transition declared in the StateMachineFactory. The state left by the previous step is RMAppAttemptState.SUBMITTED, so the relevant transition is the following:
//RMAppAttemptImpl.java
private static final StateMachineFactory<RMAppAttemptImpl,
                                         RMAppAttemptState,
                                         RMAppAttemptEventType,
                                         RMAppAttemptEvent>
    stateMachineFactory = new StateMachineFactory<RMAppAttemptImpl,
                                                  RMAppAttemptState,
                                                  RMAppAttemptEventType,
                                                  RMAppAttemptEvent>(RMAppAttemptState.NEW)

    ......
    // Transitions from SUBMITTED state
    .addTransition(RMAppAttemptState.SUBMITTED,
        EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
                   RMAppAttemptState.SCHEDULED),
        RMAppAttemptEventType.ATTEMPT_ADDED,
        new ScheduleTransition())

    ......
    .installTopology();
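The attempt moves from RMAppAttemptState.SUBMITTED to one of the states in EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING, RMAppAttemptState.SCHEDULED). ScheduleTransition is invoked, and its return value selects the target state, as shown below: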
 1  // ScheduleTransition, an inner class of RMAppAttemptImpl.java
 2  @VisibleForTesting
 3  public static final class ScheduleTransition
 4      implements
 5      MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
 6    @Override
 7    public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
 8        RMAppAttemptEvent event) {
 9      ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
10      if (!subCtx.getUnmanagedAM()) {
11        // Need reset #containers before create new attempt, because this request
12        // will be passed to scheduler, and scheduler will deduct the number after
13        // AM container allocated
14
15        // Currently, following fields are all hard code,
16        // TODO: change these fields when we want to support
17        // priority/resource-name/relax-locality specification for AM containers
18        // allocation.
19        appAttempt.amReq.setNumContainers(1);
20        appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
21        appAttempt.amReq.setResourceName(ResourceRequest.ANY);
22        appAttempt.amReq.setRelaxLocality(true);
23
24        // AM resource has been checked when submission
25        Allocation amContainerAllocation =
26            appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
27                Collections.singletonList(appAttempt.amReq),
28                EMPTY_CONTAINER_RELEASE_LIST, null, null);
29        if (amContainerAllocation != null
30            && amContainerAllocation.getContainers() != null) {
31          assert (amContainerAllocation.getContainers().size() == 0);
32        }
33        return RMAppAttemptState.SCHEDULED;
34      } else {
35        // save state and then go to LAUNCHED state
36        appAttempt.storeAttempt();
37        return RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING;
38      }
39    }
40  }
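ScheduleTransition requests a Container for the AM. The request is built on lines 19 to 22: one container with priority AM_CONTAINER_PRIORITY (value 0), which may be placed on any node (ResourceRequest.ANY) with relaxed locality.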
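ScheduleTransition differs from the earlier transition in that the target state is not fixed in advance: the hook returns it, and it must be one of the states declared for the arc. A minimal sketch of that idea follows. MultiArcSketch, ArcState and scheduleArc are hypothetical names, and the check against the declared set is my understanding of how a multiple-arc transition is validated, not code copied from Hadoop.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical names; a sketch of a transition whose hook picks the target state.
enum ArcState { SUBMITTED, SCHEDULED, LAUNCHED_UNMANAGED_SAVING, FAILED }

class MultiArcSketch {
    private final Set<ArcState> validPostStates;
    private final Supplier<ArcState> hook;

    MultiArcSketch(Set<ArcState> validPostStates, Supplier<ArcState> hook) {
        this.validPostStates = validPostStates;
        this.hook = hook;
    }

    /** Runs the hook and accepts its result only if it was declared for this arc. */
    ArcState fire() {
        ArcState next = hook.get();
        if (!validPostStates.contains(next)) {
            throw new IllegalStateException("Undeclared post state " + next);
        }
        return next;
    }

    /** Mirrors the branch on getUnmanagedAM() in the listing above. */
    static MultiArcSketch scheduleArc(boolean unmanagedAm) {
        return new MultiArcSketch(
            EnumSet.of(ArcState.LAUNCHED_UNMANAGED_SAVING, ArcState.SCHEDULED),
            () -> unmanagedAm ? ArcState.LAUNCHED_UNMANAGED_SAVING : ArcState.SCHEDULED);
    }

    public static void main(String[] args) {
        System.out.println(scheduleArc(false).fire()); // prints SCHEDULED
        System.out.println(scheduleArc(true).fire());  // prints LAUNCHED_UNMANAGED_SAVING
    }
}
```

For a normal MapReduce job the AM is managed by the ResourceManager, so the first case applies and the attempt ends up in SCHEDULED.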
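ScheduleTransition.transition() calls Allocation amContainerAllocation = appAttempt.scheduler.allocate(appAttempt.applicationAttemptId, Collections.singletonList(appAttempt.amReq), EMPTY_CONTAINER_RELEASE_LIST, null, null). This runs allocate() in the scheduler, which records the request and returns an Allocation object. The container itself is assigned later: when a NodeManager sends its heartbeat and the ResourceManager finds that this NM still has free resources, it assigns the container to that NM, and only then is the AM launched.
The allocate() called here is declared in the interface YarnScheduler.java. As analyzed earlier, the default scheduler class is CapacityScheduler, so the call ends up in the allocate() function of CapacityScheduler.java, the implementation of the YarnScheduler interface, shown below: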
//CapacityScheduler.java
@Override
@Lock(Lock.NoLock.class)
public Allocation allocate(ApplicationAttemptId applicationAttemptId,
    List<ResourceRequest> ask, List<ContainerId> release,
    List<String> blacklistAdditions, List<String> blacklistRemovals) {

  // Look up the scheduler-side object for this attempt
  FiCaSchedulerApp application = getApplicationAttempt(applicationAttemptId);
  if (application == null) {
    LOG.info("Calling allocate on removed " +
        "or non existant application " + applicationAttemptId);
    return EMPTY_ALLOCATION;
  }

  // Sanity check: normalize the requests
  SchedulerUtils.normalizeRequests(
      ask, getResourceCalculator(), getClusterResource(),
      getMinimumResourceCapability(), getMaximumResourceCapability());

  // Release containers
  releaseContainers(release, application);

  synchronized (application) {

    // make sure we aren't stopping/removing the application
    // when the allocate comes in
    if (application.isStopped()) {
      LOG.info("Calling allocate on a stopped " +
          "application " + applicationAttemptId);
      return EMPTY_ALLOCATION;
    }

    if (!ask.isEmpty()) {

      if(LOG.isDebugEnabled()) {
        LOG.debug("allocate: pre-update" +
          " applicationAttemptId=" + applicationAttemptId +
          " application=" + application);
      }
      application.showRequests();

      // Update application requests
      application.updateResourceRequests(ask);

      LOG.debug("allocate: post-update");
      application.showRequests();
    }

    if(LOG.isDebugEnabled()) {
      LOG.debug("allocate:" +
        " applicationAttemptId=" + applicationAttemptId +
        " #ask=" + ask.size());
    }

    application.updateBlacklist(blacklistAdditions, blacklistRemovals);

    return application.getAllocation(getResourceCalculator(),
                 clusterResource, getMinimumResourceCapability());
  }
}
allocate() ends by calling application.getAllocation(getResourceCalculator(), clusterResource, getMinimumResourceCapability()). The getAllocation() function is shown below:
//FiCaSchedulerApp.java
/**
 * This method produces an Allocation that includes the current view
 * of the resources that will be allocated to and preempted from this
 * application.
 *
 * @param rc
 * @param clusterResource
 * @param minimumAllocation
 * @return an allocation
 */
public synchronized Allocation getAllocation(ResourceCalculator rc,
    Resource clusterResource, Resource minimumAllocation) {

  Set<ContainerId> currentContPreemption = Collections.unmodifiableSet(
      new HashSet<ContainerId>(containersToPreempt));
  containersToPreempt.clear();
  Resource tot = Resource.newInstance(0, 0);
  for(ContainerId c : currentContPreemption){
    Resources.addTo(tot,
        liveContainers.get(c).getContainer().getResource());
  }
  int numCont = (int) Math.ceil(
      Resources.divide(rc, clusterResource, tot, minimumAllocation));
  ResourceRequest rr = ResourceRequest.newInstance(
      Priority.UNDEFINED, ResourceRequest.ANY,
      minimumAllocation, numCont);
  ContainersAndNMTokensAllocation allocation =
      pullNewlyAllocatedContainersAndNMTokens();
  Resource headroom = getHeadroom();
  setApplicationHeadroomForMetrics(headroom);
  return new Allocation(allocation.getContainerList(), headroom, null,
      currentContPreemption, Collections.singletonList(rr),
      allocation.getNMTokenList());
}
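The numCont computation in getAllocation() expresses the resources of the containers marked for preemption as a number of minimum-size containers, rounded up. The sketch below reproduces only that arithmetic, under the assumption that the resource calculator compares memory alone. PreemptionRequestSketch and containersToRequest are hypothetical names, and resources are reduced to plain megabyte counts.

```java
// Hypothetical sketch of the numCont arithmetic, using memory (MB) as the only dimension.
class PreemptionRequestSketch {
    /** Total memory of the containers to preempt, in units of the minimum allocation, rounded up. */
    static int containersToRequest(int[] preemptedMemoryMb, int minimumAllocationMb) {
        long totalMb = 0;
        for (int mb : preemptedMemoryMb) {
            totalMb += mb;
        }
        return (int) Math.ceil((double) totalMb / minimumAllocationMb);
    }

    public static void main(String[] args) {
        // 2048 MB + 1536 MB = 3584 MB; 3584 / 1024 = 3.5, rounded up to 4.
        System.out.println(containersToRequest(new int[] {2048, 1536}, 1024)); // prints 4
        // Nothing marked for preemption: the total is 0, so the count is 0.
        System.out.println(containersToRequest(new int[] {}, 1024));           // prints 0
    }
}
```

On the path followed here, where the AM container is being requested for a new attempt, containersToPreempt would normally be empty, which is the second case: the preemption-related request carries a count of 0 and the interesting part of getAllocation() is the pull of newly allocated containers.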
getAllocation() then calls ContainersAndNMTokensAllocation allocation = pullNewlyAllocatedContainersAndNMTokens(), shown below:
//SchedulerApplicationAttempt.java
// Create container token and NMToken altogether, if either of them fails for
// some reason like DNS unavailable, do not return this container and keep it
// in the newlyAllocatedContainers waiting to be refetched.
public synchronized ContainersAndNMTokensAllocation
    pullNewlyAllocatedContainersAndNMTokens() {
  List<Container> returnContainerList =
      new ArrayList<Container>(newlyAllocatedContainers.size());
  List<NMToken> nmTokens = new ArrayList<NMToken>();
  for (Iterator<RMContainer> i = newlyAllocatedContainers.iterator(); i
    .hasNext();) {
    RMContainer rmContainer = i.next();
    Container container = rmContainer.getContainer();
    try {
      // create container token and NMToken altogether.
      container.setContainerToken(rmContext.getContainerTokenSecretManager()
        .createContainerToken(container.getId(), container.getNodeId(),
          getUser(), container.getResource(), container.getPriority(),
          rmContainer.getCreationTime(), this.logAggregationContext));
      NMToken nmToken =
          rmContext.getNMTokenSecretManager().createAndGetNMToken(getUser(),
            getApplicationAttemptId(), container);
      if (nmToken != null) {
        nmTokens.add(nmToken);
      }
    } catch (IllegalArgumentException e) {
      // DNS might be down, skip returning this container.
      LOG.error("Error trying to assign container token and NM token to" +
          " an allocated container " + container.getId(), e);
      continue;
    }
    returnContainerList.add(container);
    i.remove();
    rmContainer.handle(new RMContainerEvent(rmContainer.getContainerId(),
      RMContainerEventType.ACQUIRED));
  }
  return new ContainersAndNMTokensAllocation(returnContainerList, nmTokens);
}
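The loop in pullNewlyAllocatedContainersAndNMTokens() follows a pattern worth isolating: walk the pending list with an explicit iterator, hand out and remove each element whose token can be created, and leave the element in place when token creation fails so that a later call can retry it. A minimal sketch of that pattern, with containers and tokens reduced to strings and with hypothetical names (PullContainersSketch, pull, tokenIssuer):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of "pull what is ready, keep what failed for a later call".
class PullContainersSketch {
    static List<String> pull(List<String> pending, Function<String, String> tokenIssuer) {
        List<String> returned = new ArrayList<>(pending.size());
        for (Iterator<String> it = pending.iterator(); it.hasNext();) {
            String container = it.next();
            String token;
            try {
                token = tokenIssuer.apply(container);
            } catch (IllegalArgumentException e) {
                // Token creation failed: skip it now, it stays in 'pending' for the next pull.
                continue;
            }
            returned.add(container + ":" + token);
            it.remove(); // handed out, so it must not be returned twice
        }
        return returned;
    }

    public static void main(String[] args) {
        List<String> pending = new ArrayList<>();
        pending.add("c1");
        pending.add("c2");
        pending.add("c3");
        List<String> got = pull(pending, c -> {
            if (c.equals("c2")) {
                throw new IllegalArgumentException("simulated token failure");
            }
            return "tok-" + c;
        });
        System.out.println(got);     // prints [c1:tok-c1, c3:tok-c3]
        System.out.println(pending); // prints [c2]
    }
}
```

Removing through the iterator rather than through the list is what makes it safe to delete elements while the loop is still running.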
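At this point the trace hits a dead end: following the calls further does not show where the containers come from. Drawing on Reference 1, Reference 2 and Reference 3, the analysis continues as follows:
We are using the Capacity scheduler, and CapacityScheduler.allocate() mainly does two things: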
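- It calls FiCaSchedulerApp.updateResourceRequests() to update the resource requests of the APP (meaning the APP as seen from the scheduler).
- Through FiCaSchedulerApp.pullNewlyAllocatedContainersAndNMTokens(), it takes the Containers out of the List FiCaSchedulerApp.newlyAllocatedContainers, wraps them, and returns them.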
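FiCaSchedulerApp.newlyAllocatedContainers holds exactly the Containers that have most recently been allocated. So where do the elements of this List come from? To answer that, we have to start from the NM heartbeat.
Suppose that at this moment some node (call it "AM-NODE") sends a heartbeat to the ResourceManager's ResourceTrackerService to report the resource usage of its own node.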
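The interplay just described can be sketched as two calls against shared state: allocate() only records the request and hands back whatever is already in the newly-allocated list, while a node heartbeat is what actually places pending requests on a node and fills that list. This is a deliberately simplified model with hypothetical names (HeartbeatAllocationSketch, nodeHeartbeat, free slots counted as integers). Queues, priorities, locality and tokens are all left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model of request-then-heartbeat allocation.
class HeartbeatAllocationSketch {
    private int pendingContainers = 0;                         // asks recorded by allocate()
    private final List<String> newlyAllocated = new ArrayList<>();
    private int nextId = 1;

    /** Application side: record new asks, return (and clear) what has been allocated so far. */
    synchronized List<String> allocate(int askedContainers) {
        pendingContainers += askedContainers;
        List<String> ready = new ArrayList<>(newlyAllocated);
        newlyAllocated.clear();
        return ready;
    }

    /** Node side: on heartbeat, place pending asks on this node while it has free slots. */
    synchronized void nodeHeartbeat(String node, int freeSlots) {
        while (pendingContainers > 0 && freeSlots > 0) {
            newlyAllocated.add("container_" + (nextId++) + "@" + node);
            pendingContainers--;
            freeSlots--;
        }
    }

    public static void main(String[] args) {
        HeartbeatAllocationSketch scheduler = new HeartbeatAllocationSketch();
        System.out.println(scheduler.allocate(1)); // prints []: nothing assigned yet
        scheduler.nodeHeartbeat("AM-NODE", 2);     // the node reports free capacity
        System.out.println(scheduler.allocate(0)); // prints [container_1@AM-NODE]
    }
}
```

The empty result of the first call matches the assert in ScheduleTransition, which expects the Allocation returned for the AM request to contain no containers yet.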
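The next step is therefore to analyze NodeManager.java. See Reference 1, Reference 2, Reference 3 and Reference 4.
The analysis that follows is paused here for now, because the assignment requires analyzing the scheduler source code.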