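How YARN (MapReduce 2) Runs a MapReduce Job: A Source Code Analysis

This is my own analysis, drawing on books and online material. If anything is wrong, corrections are welcome. Some of the classes below are abridged; only the important methods are listed.

If you repost this, please credit the author and the source.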

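I. Source-reading environment

You need JDK 1.7.0 or later, plus Eclipse for reading the Hadoop source.

For an Eclipse installation tutorial, see my blog.

Download the Hadoop source from the official website. I downloaded version 2.7.3. The source package is the source code project, which you must compile before it can run; the binary package contains the precompiled executables.

If you want to set up a Hadoop cluster, download the binary package. If you want to read the source code, download the source package.

Since we are analyzing the source code here, download the source package; after extraction the directory is named hadoop-2.7.3-src.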

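Importing Hadoop into Eclipse

1. Open Eclipse and click File->New->Java Project; the dialog shown in Figure 1 appears:

          Figure 1

The Hadoop source is now imported into Eclipse. Eclipse will report many errors; for how to resolve them, see my blog post "Importing the Hadoop source project into Eclipse". The errors do not prevent us from reading the source.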

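2. If you want to look at a class but do not know where it is, first go to hadoop-2.7.3-src.

Take Job.java as an example.

On Windows: type the class name into the search box of File Explorer, as shown in Figure 2:

          Figure 2

Then right-click the file and choose "Open file location". This location corresponds to the directory structure of the hadoop-2.7.3-src project in Eclipse.

For example, the Job.java we found is in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce.

On Linux, search from a terminal with the find command.

find . -name "Job.java"    The first argument is the path to search, where . means the current directory and / means the root directory. See Figure 3:

                     Figure 3

We can then locate the file in the hadoop-2.7.3-src project in Eclipse, as shown in Figure 4:

                            Figure 4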


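II. Before the analysis

To start an installed cluster, we usually run the ./start-all.sh script from the command line in the hadoop-2.7.3/sbin/ directory on the master node, which starts everything at once. Its key lines are:

start-all.sh

"${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR    # runs the start-dfs.sh script

"${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR    # runs the start-yarn.sh script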

#Add other possible options
nameStartOpt="$nameStartOpt $@"

#---------------------------------------------------------
# namenodes

NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)

echo "Starting namenodes on [$NAMENODES]"

"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
  --config "$HADOOP_CONF_DIR" \
  --hostnames "$NAMENODES" \
  --script "$bin/hdfs" start namenode $nameStartOpt

#---------------------------------------------------------
# datanodes (using default slaves file)

if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi

#---------------------------------------------------------
# secondary namenodes (if any)

SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)

if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"

  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
fi

#---------------------------------------------------------
# quorumjournal nodes (if any)

SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)

case "$SHARED_EDITS_DIR" in
qjournal://*)
  JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
  echo "Starting journal nodes [$JOURNAL_NODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$JOURNAL_NODES" \
      --script "$bin/hdfs" start journalnode ;;
esac

#---------------------------------------------------------
# ZK Failover controllers, if auto-HA is enabled
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
  echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start zkfc
fi

# eof
start-dfs.sh

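As we can see, this script starts the namenode, the datanodes, the secondarynamenode and, when they are configured, the journal nodes and the ZK failover controllers.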
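The journal-node step of start-dfs.sh uses three sed expressions to turn the dfs.namenode.shared.edits.dir value into a list of host names. As a rough illustration only, the same transformation could be written in Java as follows; the class name, method name and sample URI are made up for this sketch and are not part of Hadoop.

```java
class JournalNodeHosts {
    // Applies the same three substitutions as the sed call in start-dfs.sh.
    static String hosts(String sharedEditsDir) {
        return sharedEditsDir
            .replaceAll("qjournal://([^/]*)/.*", "$1") // keep only the host:port list
            .replaceAll(";", " ")                      // entries are separated by ';'
            .replaceAll(":[0-9]*", "");                // drop the port numbers
    }

    public static void main(String[] args) {
        // Hypothetical value of dfs.namenode.shared.edits.dir
        String dir = "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster";
        System.out.println(hosts(dir)); // prints "jn1 jn2 jn3"
    }
}
```

The resulting space-separated list is what the script passes to hadoop-daemons.sh through --hostnames.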

# Start all yarn daemons.  Run this on master node.

echo "starting yarn daemons"

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/yarn-config.sh

# start resourceManager
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start resourcemanager
# start nodeManager
"$bin"/yarn-daemons.sh --config $YARN_CONF_DIR  start nodemanager
# start proxyserver
#"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start proxyserver
start-yarn.sh

As we can see, this script starts the resourcemanager and the nodemanagers.

These daemons correspond to NameNode.java, DataNode.java, SecondaryNameNode.java, ResourceManager.java and NodeManager.java. Each of these classes has a public static void main(String argv[]) {...} method, so once started each of them keeps running as its own process.

NameNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java

DataNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java

SecondaryNameNode.java is at hadoop-2.7.3-src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java

ResourceManager.java is at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java

NodeManager.java is at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java

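III. YARN (MapReduce 2)

  For large clusters with more than 4,000 nodes, the MapReduce 1 system begins to hit scalability bottlenecks. So in 2010 a team at Yahoo! began designing the next generation of MapReduce. The result was YARN (Yet Another Resource Negotiator).

  YARN splits the two roles of the MapReduce 1 jobtracker into two separate daemons: a resource manager, which manages the use of resources across the cluster, and an application master, which manages the lifecycle of the application's tasks running on the cluster. The basic idea is that the application master negotiates with the resource manager for cluster compute resources, expressed as containers (each with a specific memory limit), and then runs application-specific processes in those containers. The containers are monitored by the node managers running on the cluster nodes, which ensure that an application does not use more resources than it has been allocated.

  The elegance of YARN's design is that different YARN applications can coexist on the same cluster. Users can even run several different versions of MapReduce on the same YARN cluster, which makes MapReduce upgrades easier to manage.

MapReduce on YARN involves more entities than classic MapReduce 1:

  • The client, which submits the MapReduce job
  • The YARN resource manager, which coordinates the allocation of compute resources on the cluster
  • The YARN node managers, which launch and monitor the compute containers on the machines in the cluster
  • The MapReduce application master, which coordinates the tasks running the MapReduce job. The application master and the MapReduce tasks run in containers that are allocated by the resource manager and managed by the node managers.
  • The distributed file system (normally HDFS), which is used to share job files between the other entities.

The job run is illustrated in Figure 5 and analyzed in detail below.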

 

            Figure 5 How Hadoop runs a MapReduce job using YARN

Job submission flow


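After writing a MapReduce program, we call job.waitForCompletion(boolean) to submit the job to the cluster and wait for it to finish. Inside that method, the Job state is checked first and submit() is called to perform the submission; submit() returns as soon as the job has been handed to the cluster.

Hadoop ships with some example programs, of which WordCount is one of the best known. We use WordCount as our example. The bundled WordCount.java is at hadoop-2.7.3-src/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java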

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// WordCount, an example program bundled with Hadoop
public class WordCount {
  
  // Extends the generic class Mapper
  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{
    
    // Hadoop IntWritable instance "one", initialized to 1
    private final static IntWritable one = new IntWritable(1);
    // Hadoop Text instance "word"
    private Text word = new Text();
      
    // Implements the map function
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      // Java's string tokenizer; the default delimiters are space, tab, newline, carriage return and form feed
      StringTokenizer itr = new StringTokenizer(value.toString());
      // Loop while there are more tokens.
      while (itr.hasMoreTokens()) {
        /*
          nextToken(): returns the next token, i.e. the text up to the next delimiter
          word.set(): converts the Java String into the Hadoop Text type
        */
        word.set(itr.nextToken());
        // write() on the task's Context emits the (word, one) pair
        context.write(word, one);
      }
    }
  }
  
  // Extends the generic class Reducer
  public static class IntSumReducer 
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    // IntWritable that holds the result
    private IntWritable result = new IntWritable();

    // Implements reduce
    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      // Iterate over values and add up the occurrences of the word
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      // Convert the Java int sum into the Hadoop type held in result
      result.set(sum);
      // Emit the result (written to the job output, typically on HDFS, when run as the reducer)
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    // Create the Configuration
    Configuration conf = new Configuration();
    /*
    GenericOptionsParser is the basic class the Hadoop framework uses to parse command-line arguments.
    getRemainingArgs() returns the remaining arguments as an array (here, a list of paths).
    */
    /*
    Implementation:
    public String[] getRemainingArgs() {
      return (commandLine == null) ? new String[]{} : commandLine.getArgs();
    }
    */
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    // At least one input path and one output path are required; otherwise print the usage and exit
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    // Create the job
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    /*
    Set the combiner class.
    Many people find the combiner confusing: it pre-aggregates map output on the map side.
    IntSumReducer can be reused here because summing is associative and commutative.
    */
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    // Key type of the reduce output: Text
    job.setOutputKeyClass(Text.class);
    // Value type of the reduce output
    job.setOutputValueClass(IntWritable.class);
    // Add the input paths
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    // Set the output path
    FileOutputFormat.setOutputPath(job,
      new Path(otherArgs[otherArgs.length - 1]));
    // Submit the job; this calls the waitForCompletion(...) method in Job.java
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
WordCount.java

The last statement, job.waitForCompletion(true), calls the waitForCompletion(...) method in Job.java.
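To make the data flow of WordCount easier to follow without a cluster, here is a small plain-Java sketch of the same map, combine and reduce steps. It is only an in-memory imitation: the class LocalWordCount and its methods are invented for this illustration and use none of the Hadoop APIs.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class LocalWordCount {
    // "Map" step: emit one (word, 1) pair per token, like TokenizerMapper.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<Map.Entry<String, Integer>>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(new AbstractMap.SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return out;
    }

    // "Reduce" step: sum the values of each key, like IntSumReducer.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> p : pairs) {
            Integer old = sums.get(p.getKey());
            sums.put(p.getKey(), (old == null ? 0 : old) + p.getValue());
        }
        return sums;
    }

    // Runs map on every line, uses reduce() as the combiner on each map
    // output, then reduces the combined pairs once more.
    static Map<String, Integer> run(String[] lines) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            combined.addAll(reduce(map(line)).entrySet());
        }
        return reduce(combined);
    }

    public static void main(String[] args) {
        String[] lines = {"hello hello yarn", "hello mapreduce"};
        System.out.println(run(lines)); // prints {hello=3, mapreduce=1, yarn=1}
    }
}
```

Because addition is associative and commutative, summing the partial sums gives the same totals as summing the raw 1s, which is why WordCount can register IntSumReducer as both combiner and reducer.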

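1. Job submission

  (1) Job submission in MapReduce 2 uses the same user API as MapReduce 1. That is, Job's submit() method creates an internal JobSubmitter instance and calls its submitJobInternal() method. After submitting the job, waitForCompletion() polls the job's progress once per second and reports it to the console whenever it has changed since the last report. When the job completes successfully, the job counters are displayed; if it fails, the error that caused the failure is logged to the console.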
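The polling behaviour can be sketched without Hadoop. The following is a simplified model of the non-verbose branch of waitForCompletion(): ask whether the job is complete, sleep for the poll interval, and repeat. JobLike and FakeJob are hypothetical stand-ins introduced only for this example, and unlike the real method the sketch gives up when the thread is interrupted.

```java
class PollingSketch {
    /** Hypothetical stand-in for the two Job methods the loop needs. */
    interface JobLike {
        boolean isComplete();
        boolean isSuccessful();
    }

    // Poll until the job reports completion, sleeping between polls.
    static boolean waitFor(JobLike job, long pollIntervalMillis) {
        while (!job.isComplete()) {
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException ie) {
                // Assumption of this sketch: stop waiting and report failure.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return job.isSuccessful();
    }

    /** Fake job that becomes complete after a fixed number of polls. */
    static class FakeJob implements JobLike {
        private int pollsUntilDone;
        int polls = 0;

        FakeJob(int pollsUntilDone) {
            this.pollsUntilDone = pollsUntilDone;
        }

        public boolean isComplete() {
            polls++;
            return pollsUntilDone-- <= 0;
        }

        public boolean isSuccessful() {
            return true;
        }
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob(3);
        boolean ok = waitFor(job, 10);
        System.out.println(ok + " after " + job.polls + " polls");
    }
}
```

In the real client the interval comes from the configuration, so a longer interval trades responsiveness for fewer status requests.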

  submit() and waitForCompletion() are both in the Job.java class, while submitJobInternal() is in JobSubmitter.java. Both classes are in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce.

/**
 * The job submitter's view of the Job.
 * 
 * <p>It allows the user to configure the
 * job, submit it, control its execution, and query the state. The set methods
 * only work until the job is submitted, afterwards they will throw an 
 * IllegalStateException. </p>
 * 
 * <p>
 * Normally the user creates the application, describes various facets of the
 * job via {@link Job} and then submits the job and monitor its progress.</p>
 * 
 * <p>Here is an example on how to submit a job:</p>
 * <p><blockquote><pre>
 *     // Create a new Job
 *     Job job = Job.getInstance();
 *     job.setJarByClass(MyJob.class);
 *     
 *     // Specify various job-specific parameters     
 *     job.setJobName("myjob");
 *     
 *     job.setInputPath(new Path("in"));
 *     job.setOutputPath(new Path("out"));
 *     
 *     job.setMapperClass(MyJob.MyMapper.class);
 *     job.setReducerClass(MyJob.MyReducer.class);
 *
 *     // Submit the job, then poll for progress until the job is complete
 *     job.waitForCompletion(true);
 * </pre></blockquote>
 * 
 * 
 */

/*
 * Job submission flow:
 * 1. After the MapReduce program has been written,
 *         a. job.waitForCompletion(boolean) is called to submit the job to the cluster and wait for it to finish.
 *         b. Inside that method, the Job state is checked first and submit() is called to perform the submission;
 *            submit() returns as soon as the job has been handed to the cluster.
 *         c. After submission, the boolean argument of waitForCompletion() is checked. If it is true, the job is
 *            monitored while it runs and its status is printed, via monitorAndPrintJob(). Otherwise, the poll
 *            interval is read first (5000 ms by default) and isComplete() is then called in a loop to get the job
 *            status; while the job is not complete, the thread sleeps for the configured interval, and so on
 *            until the job finishes or fails.
 * 2. Inside submit():
 *         a. check the job state;
 *         b. call setUseNewAPI() to select the API;
 *         c. call connect() to connect to the RM (ResourceManager);
 *         d. obtain a JobSubmitter object: getJobSubmitter(fs, client);
 *         e. submit the job through the submitter object: submitJobInternal(Job.this, cluster).
 * 3. Inside connect(), the method that connects to the RM:
 *         a. a Cluster instance is created; the important part of the Cluster constructor is the initialization.
 *         b. The initialization method uses java.util.ServiceLoader to create the client proxy. There are currently
 *            two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs);
 *            the matching client is created according to the mapreduce.framework.name setting.
 *            LocalClientProtocolProvider creates a LocalJobRunner object, which is not covered in detail here.
 *            YarnClientProtocolProvider creates a YARNRunner object, which makes the current JobClient run on YARN.
 *         c. While YARNRunner is being instantiated, the client proxy is created as follows:
 *            Cluster->ClientProtocol(YARNRunner)->ResourceMgrDelegate->client(YarnClientImpl)->rmClient(ApplicationClientProtocol)
 *            The RPC proxy is created in the serviceStart phase of YarnClientImpl; note the protocol involved.
 *            Cluster: provides a way to access the map/reduce cluster;
 *            YARNRunner: makes the current JobClient run on YARN, and creates the ResourceMgrDelegate during instantiation;
 *            ResourceMgrDelegate: responsible for exchanging information with the RM;
 *            YarnClientImpl: responsible for instantiating rmClient;
 *            rmClient: the protocol through which clients talk to the RM, mainly to submit or kill jobs and to
 *            fetch information (applications, cluster metrics, nodes, queues and ACLs).
 * 4. Next comes the core part, JobSubmitter.submitJobInternal(Job, Cluster), which submits the job to the system. It includes:
 *         a. validating the job's input and output specs with checkSpecs(Job), mainly to make sure the output
 *            directory is set and does not already exist on the FS;
 *         b. obtaining, through JobSubmissionFiles, the directory that holds the job's runtime resource files. The
 *            property key is yarn.app.mapreduce.am.staging-dir and the default directory is
 *            /tmp/hadoop-yarn/staging/hadoop/.staging/. In JobSubmissionFiles.getStagingDir(), if the directory
 *            exists its ownership and permissions are checked; if it does not exist it is created with permission 700;
 *         c. setting up the accounting information the job needs for the cache, setting the host name and address,
 *            getting the jobId and getting the submit directory, e.g.
 *            /tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002;
 *         d. copying the job's jar and configuration to the map-reduce system directory on the distributed file
 *            system by calling copyAndConfigureFiles(job, submitJobDir), which mainly copies what the -libjars,
 *            -files and -archives options refer to into submitJobDir;
 *         e. computing the job's input splits by calling writeSplits(), which writes job.split and job.splitmetainfo;
 *         f. calling writeConf(conf, submitJobFile) to write the job.xml file to the JobTracker's file system;
 *         g. submitting the job to the JobTracker and monitoring its status by calling
 *            submitJob(jobId, submitJobDir, job.getCredentials()) on the YARNRunner object.
 * 5. The actual submission happens inside the submitJob(...) method of the YARNRunner object.
 * 
 * Issue 1: when writing MR programs on Hadoop 2.x, if hadoop-*-mr1-*.jar is referenced, the job runs in local
 * mode when it is launched directly from Java; only when it is launched with hadoop jar is it submitted to YARN.
 */
// (much omitted here)
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class Job extends JobContextImpl implements JobContext {  
  private static final Log LOG = LogFactory.getLog(Job.class);

  private JobState state = JobState.DEFINE;
  private JobStatus status;
  private long statustime;
  private Cluster cluster;
  private ReservationId reservationId;

  /**
   * @deprecated Use {@link #getInstance()}
   */
  @Deprecated
  public Job() throws IOException {
    this(new JobConf(new Configuration()));
  }

  /**
   * @deprecated Use {@link #getInstance(Configuration)}
   */
  @Deprecated
  public Job(Configuration conf) throws IOException {
    this(new JobConf(conf));
  }

  /**
   * @deprecated Use {@link #getInstance(Configuration, String)}
   */
  @Deprecated
  public Job(Configuration conf, String jobName) throws IOException {
    this(new JobConf(conf));
    setJobName(jobName);
  }

  Job(JobConf conf) throws IOException {
    super(conf, null);
    // propagate existing user credentials to job
    this.credentials.mergeAll(this.ugi.getCredentials());
    this.cluster = null;
  }

  Job(JobStatus status, JobConf conf) throws IOException {
    this(conf);
    setJobID(status.getJobID());
    this.status = status;
    state = JobState.RUNNING;
  }  
      
  /**
   * Creates a new {@link Job} with no particular {@link Cluster} and a given jobName.
   * A Cluster will be created from the conf parameter only when it's needed.
   *
   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 
   * that any necessary internal modifications do not reflect on the incoming 
   * parameter.
   * 
   * @param conf the configuration
   * @return the {@link Job} , with no connection to a cluster yet.
   * @throws IOException
   */
  public static Job getInstance(Configuration conf, String jobName)
           throws IOException {
    // create with a null Cluster
    Job result = getInstance(conf);
    result.setJobName(jobName);
    return result;
  }
  
  private synchronized void connect()
          throws IOException, InterruptedException, ClassNotFoundException {
    if (cluster == null) {
      // Creates the Cluster instance; the important part of the Cluster constructor is the initialization
      cluster = 
        ugi.doAs(new PrivilegedExceptionAction<Cluster>() {
                   public Cluster run()
                          throws IOException, InterruptedException, 
                                 ClassNotFoundException {
                     return new Cluster(getConfiguration());
                   }
                 });
    }
  }

  /**
   * Submit the job to the cluster and return immediately.
   * @throws IOException
   */
  public void submit() 
         throws IOException, InterruptedException, ClassNotFoundException {
    ensureState(JobState.DEFINE);   // check the job state
    setUseNewAPI();                 // call setUseNewAPI() to select the API
    connect();                      // call connect() to connect to the RM (ResourceManager)
    // Obtain the JobSubmitter object: getJobSubmitter(fs, client)
    final JobSubmitter submitter = 
        getJobSubmitter(cluster.getFileSystem(), cluster.getClient());    
    status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
      public JobStatus run() throws IOException, InterruptedException, 
      ClassNotFoundException {
        // Submit the job through the submitter object: submitJobInternal(Job.this, cluster)
        return submitter.submitJobInternal(Job.this, cluster);    
      }
    });
    state = JobState.RUNNING;
    LOG.info("The url to track the job: " + getTrackingURL());
   }
  
  /**
   * Submit the job to the cluster and wait for it to finish.
   * @param verbose print the progress to the user
   * @return true if the job succeeded
   * @throws IOException thrown if the communication with the 
   *         <code>JobTracker</code> is lost
   */
  // Submits the job to the cluster and waits for it to finish.
  public boolean waitForCompletion(boolean verbose
                                   ) throws IOException, InterruptedException,
                                            ClassNotFoundException {
    // Check the Job state and call submit(), which returns as soon as the job has been handed to the cluster
    if (state == JobState.DEFINE) {
      submit();
    }
    /* After submission, the boolean argument of waitForCompletion() is checked. If it is true, the job is monitored
     * while it runs and its status is printed, via monitorAndPrintJob(). Otherwise, the poll interval is read first
     * (5000 ms by default) and isComplete() is then called in a loop to get the job status; while the job is not
     * complete, the thread sleeps for the configured interval, and so on until the job finishes or fails.
    */
    if (verbose) {
      monitorAndPrintJob();
    } else {
      // get the completion poll interval from the client.
      int completionPollIntervalMillis = 
        Job.getCompletionPollInterval(cluster.getConf());
      while (!isComplete()) {
        try {
          Thread.sleep(completionPollIntervalMillis);
        } catch (InterruptedException ie) {
        }
      }
    }
    return isSuccessful();
  }

}
Job.java
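The Javadoc of Job states that the set methods work only until the job is submitted and throw IllegalStateException afterwards; submit() enforces the same rule through ensureState(JobState.DEFINE). A minimal sketch of that state check, using a made-up class JobStateSketch that keeps just two states and one setter, might look like this:

```java
class JobStateSketch {
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;
    private String jobName = "";

    // Throws if the job is not in the expected state.
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    // Setters are only legal while the job is still being defined.
    void setJobName(String name) {
        ensureState(JobState.DEFINE);
        this.jobName = name;
    }

    // Simplified submit(): the real method also connects to the cluster
    // and hands the job to a JobSubmitter before changing the state.
    void submit() {
        ensureState(JobState.DEFINE);
        state = JobState.RUNNING;
    }

    JobState getState() {
        return state;
    }

    String getJobName() {
        return jobName;
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        job.setJobName("word count");
        job.submit();
        try {
            job.setJobName("too late");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This also explains why waitForCompletion() tests state == JobState.DEFINE before calling submit(): a job that is already running must not be submitted a second time.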
/**
 * Provides a way to access information about the map/reduce cluster.
 */
// Provides a way to access the map/reduce cluster
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class Cluster {
  
  @InterfaceStability.Evolving
  public static enum JobTrackerStatus {INITIALIZING, RUNNING};
  
  private ClientProtocolProvider clientProtocolProvider;
  private ClientProtocol client;
  private UserGroupInformation ugi;
  private Configuration conf;
  private FileSystem fs = null;
  private Path sysDir = null;
  private Path stagingAreaDir = null;
  private Path jobHistoryDir = null;
  private static final Log LOG = LogFactory.getLog(Cluster.class);

  private static ServiceLoader<ClientProtocolProvider> frameworkLoader =
      ServiceLoader.load(ClientProtocolProvider.class);
  
  static {
    ConfigUtil.loadResources();
  }
  
  public Cluster(Configuration conf) throws IOException {
    this(null, conf);    // delegates to the two-argument constructor
  }

  public Cluster(InetSocketAddress jobTrackAddr, Configuration conf) 
      throws IOException {
    this.conf = conf;        
    this.ugi = UserGroupInformation.getCurrentUser();
    initialize(jobTrackAddr, conf);
  }
  
  /*
   * The initialization method uses java.util.ServiceLoader to create the client proxy. There are currently
   * two providers, LocalClientProtocolProvider (local jobs) and YarnClientProtocolProvider (YARN jobs);
   * the matching client is created according to the mapreduce.framework.name setting.
   * LocalClientProtocolProvider creates a LocalJobRunner object, which is not covered in detail here.
   * YarnClientProtocolProvider creates a YARNRunner object, which makes the current JobClient run on YARN.
   */
  private void initialize(InetSocketAddress jobTrackAddr, Configuration conf)
      throws IOException {

    synchronized (frameworkLoader) {
      for (ClientProtocolProvider provider : frameworkLoader) {
        LOG.debug("Trying ClientProtocolProvider : "
            + provider.getClass().getName());
        ClientProtocol clientProtocol = null; 
        try {
          if (jobTrackAddr == null) {
            clientProtocol = provider.create(conf);
          } else {
            clientProtocol = provider.create(jobTrackAddr, conf);
          }

          if (clientProtocol != null) {
            clientProtocolProvider = provider;
            client = clientProtocol;
            LOG.debug("Picked " + provider.getClass().getName()
                + " as the ClientProtocolProvider");
            break;
          }
          else {
            LOG.debug("Cannot pick " + provider.getClass().getName()
                + " as the ClientProtocolProvider - returned null protocol");
          }
        } 
        catch (Exception e) {
          LOG.info("Failed to use " + provider.getClass().getName()
              + " due to error: ", e);
        }
      }
    }

    if (null == clientProtocolProvider || null == client) {
      throw new IOException(
          "Cannot initialize Cluster. Please check your configuration for "
              + MRConfig.FRAMEWORK_NAME
              + " and the correspond server addresses.");
    }
  }

  ClientProtocol getClient() {
    return client;
  }
  
  Configuration getConf() {
    return conf;
  }
  
}
Cluster.java
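The loop in Cluster.initialize() is a "first provider that returns a non-null client wins" selection. The sketch below imitates that pattern in plain Java. The Provider interface and the two provider classes are simplified stand-ins invented for this example: they return a client name as a String instead of a ClientProtocol, they are passed in as a list instead of being discovered through ServiceLoader, and the sketch throws an unchecked exception where the real code throws IOException.

```java
import java.util.Arrays;
import java.util.List;

class ProviderSelectionSketch {
    /** Hypothetical stand-in for ClientProtocolProvider. */
    interface Provider {
        /** Returns a client name, or null if this provider does not handle the framework. */
        String create(String frameworkName);
    }

    static class LocalProvider implements Provider {
        public String create(String frameworkName) {
            return "local".equals(frameworkName) ? "LocalJobRunner" : null;
        }
    }

    static class YarnProvider implements Provider {
        public String create(String frameworkName) {
            return "yarn".equals(frameworkName) ? "YARNRunner" : null;
        }
    }

    // Try each provider in turn and keep the first non-null client.
    static String pickClient(List<Provider> providers, String frameworkName) {
        for (Provider provider : providers) {
            String client = provider.create(frameworkName);
            if (client != null) {
                return client;
            }
        }
        throw new IllegalStateException(
            "Cannot initialize Cluster for framework " + frameworkName);
    }

    public static void main(String[] args) {
        List<Provider> providers =
            Arrays.<Provider>asList(new LocalProvider(), new YarnProvider());
        System.out.println(pickClient(providers, "yarn"));  // prints YARNRunner
        System.out.println(pickClient(providers, "local")); // prints LocalJobRunner
    }
}
```

With this design the value of mapreduce.framework.name alone decides whether a job runs locally or on YARN, and a value that no provider accepts ends in the "Cannot initialize Cluster" error.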

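Step 1 (run job) in Figure 5 is the submit() method, which submits the job to the cluster. Inside it is the call submitter.submitJobInternal(Job.this, cluster), that is, the submitJobInternal(...) method in JobSubmitter.java.

submitJobInternal(...) submits the job to the system. Internally it calls submitClient.submitJob(...), that is, the submitJob(...) method of ClientProtocol.java.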
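Two small steps of submitJobInternal(...) are easy to try in isolation: building the per-job submit directory from the staging area and the job ID, and generating the shuffle secret with a KeyGenerator. The sketch below reuses the algorithm name and key length of the JobSubmitter constants; everything else (the class name, the String-based path handling and the sample values) is an assumption made for illustration, since the real code works with Hadoop's Path and TokenCache classes.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

class SubmitStepsSketch {
    // Same values as the constants declared in JobSubmitter.
    static final String SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1";
    static final int SHUFFLE_KEY_LENGTH = 64; // in bits

    // submitJobDir = <staging area>/<job id>
    static String submitJobDir(String stagingArea, String jobId) {
        return stagingArea.endsWith("/")
            ? stagingArea + jobId
            : stagingArea + "/" + jobId;
    }

    // Generates a random secret used to authenticate shuffle transfers.
    static byte[] newShuffleSecret() {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
            keyGen.init(SHUFFLE_KEY_LENGTH);
            SecretKey shuffleKey = keyGen.generateKey();
            return shuffleKey.getEncoded();
        } catch (NoSuchAlgorithmException e) {
            // The real code wraps this in an IOException.
            throw new IllegalStateException("Error generating shuffle secret key", e);
        }
    }

    public static void main(String[] args) {
        // Sample values taken from the comments in this article.
        System.out.println(submitJobDir(
            "/tmp/hadoop-yarn/staging/root/.staging", "job_1395778831382_0002"));
        System.out.println("secret length in bytes: " + newShuffleSecret().length);
    }
}
```

In the real method the submit directory is deleted again in the finally block if the submission did not return a status.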

  1 @InterfaceAudience.Private
  2 @InterfaceStability.Unstable
  3 class JobSubmitter {
  4   protected static final Log LOG = LogFactory.getLog(JobSubmitter.class);
  5   private static final String SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1";
  6   private static final int SHUFFLE_KEY_LENGTH = 64;
  7   private FileSystem jtFs;
  8   private ClientProtocol submitClient;
  9   private String submitHostName;
 10   private String submitHostAddress;
 11   
 12   JobSubmitter(FileSystem submitFs, ClientProtocol submitClient) 
 13   throws IOException {
 14     this.submitClient = submitClient;
 15     this.jtFs = submitFs;
 16   }
 17   
 18   /**
 19    * configure the jobconf of the user with the command line options of 
 20    * -libjars, -files, -archives.
 21    * @param job
 22    * @throws IOException
 23    */
 24   private void copyAndConfigureFiles(Job job, Path jobSubmitDir) 
 25   throws IOException {
 26     JobResourceUploader rUploader = new JobResourceUploader(jtFs);
 27     rUploader.uploadFiles(job, jobSubmitDir);
 28 
 29     // Get the working directory. If not set, sets it to filesystem working dir
 30     // This code has been added so that working directory reset before running
 31     // the job. This is necessary for backward compatibility as other systems
 32     // might use the public API JobConf#setWorkingDirectory to reset the working
 33     // directory.
 34     job.getWorkingDirectory();
 35   }
 36 
 37   /**
 38    * Internal method for submitting jobs to the system.
 39    * 
 40    * <p>The job submission process involves:
 41    * <ol>
 42    *   <li>
 43    *   Checking the input and output specifications of the job.
 44    *   </li>
 45    *   <li>
 46    *   Computing the {@link InputSplit}s for the job.
 47    *   </li>
 48    *   <li>
 49    *   Setup the requisite accounting information for the 
 50    *   {@link DistributedCache} of the job, if necessary.
 51    *   </li>
 52    *   <li>
 53    *   Copying the job's jar and configuration to the map-reduce system
 54    *   directory on the distributed file-system. 
 55    *   </li>
 56    *   <li>
 57    *   Submitting the job to the <code>JobTracker</code> and optionally
 58    *   monitoring it's status.
 59    *   </li>
 60    * </ol></p>
 61    * @param job the configuration to submit
 62    * @param cluster the handle to the Cluster
 63    * @throws ClassNotFoundException
 64    * @throws InterruptedException
 65    * @throws IOException
 66    */
 67   //主要负责将作业提交到系统上运行,
 68   JobStatus submitJobInternal(Job job, Cluster cluster) 
 69   throws ClassNotFoundException, InterruptedException, IOException {
 70 
 71     //validate the jobs output specs 
 72     //校验作业的输入输出checkSpecs(Job),主要是确保输出目录是否设置且在FS上不存在;
 73     checkSpecs(job);
 74 
 75     Configuration conf = job.getConfiguration();
 76     addMRFrameworkToDistributedCache(conf);
 77     
 78     /*
 79      * 通过JobSubmissionFiles来获取Job运行时资源文件存放的目录,属性信息key为
 80      * yarn.app.mapreduce.am.staging-dir,默认的目录为/tmp/Hadoop-yarn/staging/hadoop/.staging/
 81      * JobSubmissionFiles.getStagingDir():在方法内部进行判断若目录存在则判断其所属关系及操作权限;
 82      * 不存在的话,创建并赋予权限700;
 83      */
 84     Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
 85     //configure the command line options correctly on the submitting dfs
 86     //为缓存中的Job组织必须的统计信息,设置主机名及地址信息,获取jobId,
 87     //获取提交目录,如/tmp/hadoop-yarn/staging/root/.staging/job_1395778831382_0002;
 88     InetAddress ip = InetAddress.getLocalHost();
 89     if (ip != null) {
 90       submitHostAddress = ip.getHostAddress();
 91       submitHostName = ip.getHostName();
 92       conf.set(MRJobConfig.JOB_SUBMITHOST,submitHostName);
 93       conf.set(MRJobConfig.JOB_SUBMITHOSTADDR,submitHostAddress);
 94     }
 95     JobID jobId = submitClient.getNewJobID();
 96     job.setJobID(jobId);
 97     Path submitJobDir = new Path(jobStagingArea, jobId.toString());
 98     JobStatus status = null;
 99     try {
100       conf.set(MRJobConfig.USER_NAME,
101           UserGroupInformation.getCurrentUser().getShortUserName());
102       conf.set("hadoop.http.filter.initializers", 
103           "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
104       conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
105       LOG.debug("Configuring job " + jobId + " with " + submitJobDir 
106           + " as the submit dir");
107       // get delegation token for the dir
108       TokenCache.obtainTokensForNamenodes(job.getCredentials(),
109           new Path[] { submitJobDir }, conf);
110       
111       populateTokenCache(conf, job.getCredentials());
112 
113       // generate a secret to authenticate shuffle transfers
114       if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
115         KeyGenerator keyGen;
116         try {
117           keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
118           keyGen.init(SHUFFLE_KEY_LENGTH);
119         } catch (NoSuchAlgorithmException e) {
120           throw new IOException("Error generating shuffle secret key", e);
121         }
122         SecretKey shuffleKey = keyGen.generateKey();
123         TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
124             job.getCredentials());
125       }
126       if (CryptoUtils.isEncryptedSpillEnabled(conf)) {
127         conf.setInt(MRJobConfig.MR_AM_MAX_ATTEMPTS, 1);
128         LOG.warn("Max job attempts set to 1 since encrypted intermediate" +
129                 "data spill is enabled");
130       }
131       
132       //拷贝作业的jar和配置信息到分布式文件系统上的map-reduce系统目录,调用copyAndConfigureFiles(job,submitJobDir),
133       //主要是拷贝-libjars,-files,-archives属性对应的信息至submitJobDir;
134       copyAndConfigureFiles(job, submitJobDir);
135 
136       Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);
137       
138       // Create the splits for the job
139       //计算作业的输入分片数,调用writeSplits()写job.split,job.splitmetainfo;
140       LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
141       int maps = writeSplits(job, submitJobDir);
142       conf.setInt(MRJobConfig.NUM_MAPS, maps);
143       LOG.info("number of splits:" + maps);
144 
145       // write "queue admins of the queue to which job is being submitted"
146       // to job file.
147       String queue = conf.get(MRJobConfig.QUEUE_NAME,
148           JobConf.DEFAULT_QUEUE_NAME);
149       AccessControlList acl = submitClient.getQueueAdmins(queue);
150       conf.set(toFullPropertyName(queue,
151           QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());
152 
153       // removing jobtoken referrals before copying the jobconf to HDFS
154       // as the tasks don't need this setting, actually they may break
155       // because of it if present as the referral will point to a
156       // different job.
157       TokenCache.cleanUpTokenReferral(conf);
158 
159       if (conf.getBoolean(
160           MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
161           MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
162         // Add HDFS tracking ids
163         ArrayList<String> trackingIds = new ArrayList<String>();
164         for (Token<? extends TokenIdentifier> t :
165             job.getCredentials().getAllTokens()) {
166           trackingIds.add(t.decodeIdentifier().getTrackingId());
167         }
168         conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
169             trackingIds.toArray(new String[trackingIds.size()]));
170       }
171 
172       // Set reservation info if it exists
173       ReservationId reservationId = job.getReservationId();
174       if (reservationId != null) {
175         conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
176       }
177 
178       // Write job file to submit dir
179       // writeConf(conf, submitJobFile) writes job.xml to the submit directory on the staging file system (jtFs).
180       writeConf(conf, submitJobFile);
181       
182       //
183       // Now, actually submit the job (using the submit name)
184       //
185       // Submit the job and obtain its status. On YARN, submitClient is a YARNRunner, so this calls
186       // YARNRunner.submitJob(jobId, submitJobDir, job.getCredentials()), where the actual submission happens.
187       printTokens(jobId, job.getCredentials());
188       status = submitClient.submitJob(
189           jobId, submitJobDir.toString(), job.getCredentials());
190       if (status != null) {
191         return status;
192       } else {
193         throw new IOException("Could not launch job");
194       }
195     } finally {
196       if (status == null) {
197         LOG.info("Cleaning up the staging area " + submitJobDir);
198         if (jtFs != null && submitJobDir != null)
199           jtFs.delete(submitJobDir, true);
200 
201       }
202     }
203   }
204   
205   @SuppressWarnings("unchecked")
206   private <T extends InputSplit>
207   int writeNewSplits(JobContext job, Path jobSubmitDir) throws IOException,
208       InterruptedException, ClassNotFoundException {
209     Configuration conf = job.getConfiguration();
210     InputFormat<?, ?> input =
211       ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
212 
213     List<InputSplit> splits = input.getSplits(job);
214     T[] array = (T[]) splits.toArray(new InputSplit[splits.size()]);
215 
216     // sort the splits into order based on size, so that the biggest
217     // go first
218     Arrays.sort(array, new SplitComparator());
219     JobSplitWriter.createSplitFiles(jobSubmitDir, conf, 
220         jobSubmitDir.getFileSystem(conf), array);
221     return array.length;
222   }
223   
224   private int writeSplits(org.apache.hadoop.mapreduce.JobContext job,
225       Path jobSubmitDir) throws IOException,
226       InterruptedException, ClassNotFoundException {
227     JobConf jConf = (JobConf)job.getConfiguration();
228     int maps;
229     if (jConf.getUseNewMapper()) {
230       maps = writeNewSplits(job, jobSubmitDir);
231     } else {
232       maps = writeOldSplits(jConf, jobSubmitDir);
233     }
234     return maps;
235   }
236   
237   //method to write splits for old api mapper.
238   private int writeOldSplits(JobConf job, Path jobSubmitDir) 
239   throws IOException {
240     org.apache.hadoop.mapred.InputSplit[] splits =
241     job.getInputFormat().getSplits(job, job.getNumMapTasks());
242     // sort the splits into order based on size, so that the biggest
243     // go first
244     Arrays.sort(splits, new Comparator<org.apache.hadoop.mapred.InputSplit>() {
245       public int compare(org.apache.hadoop.mapred.InputSplit a,
246                          org.apache.hadoop.mapred.InputSplit b) {
247         try {
248           long left = a.getLength();
249           long right = b.getLength();
250           if (left == right) {
251             return 0;
252           } else if (left < right) {
253             return 1;
254           } else {
255             return -1;
256           }
257         } catch (IOException ie) {
258           throw new RuntimeException("Problem getting input split size", ie);
259         }
260       }
261     });
262     JobSplitWriter.createSplitFiles(jobSubmitDir, job, 
263         jobSubmitDir.getFileSystem(job), splits);
264     return splits.length;
265   }
266   
267 }
JobSubmitter.java
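
The try/finally at the end of submitJobInternal(...) is easy to overlook: if no status comes back, the staging directory is deleted again. Below is a minimal sketch of that pattern in plain Java, without any Hadoop dependency. The class StagingCleanupSketch, its methods and the use of a local temporary directory are all my own illustrative assumptions, not Hadoop code. The real method deletes the directory recursively through jtFs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical sketch, not Hadoop code: models "clean up the staging area if submission fails".
class StagingCleanupSketch {

    /** Creates an empty temporary directory that stands in for submitJobDir. */
    static Path newStagingDir() {
        try {
            return Files.createTempDirectory("staging");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Runs the submit step. If it yields no status (null or an exception),
     * the staging directory is removed before control returns to the caller.
     */
    static String submit(Path stagingDir, Supplier<String> submitStep) {
        String status = null;
        try {
            status = submitStep.get();
            if (status != null) {
                return status;
            }
            throw new IllegalStateException("Could not launch job");
        } finally {
            if (status == null) {
                try {
                    // The sketch keeps the directory empty, so a plain delete is enough.
                    Files.deleteIfExists(stagingDir);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }

    public static void main(String[] args) {
        Path dir = newStagingDir();
        try {
            submit(dir, () -> null); // simulate a submission that returns no status
        } catch (IllegalStateException e) {
            System.out.println("submission failed, staging dir still exists: " + Files.exists(dir));
        }
    }
}
```

The status variable is declared before the try block so that the finally block can tell a successful return apart from a failure, which is the same trick the listing uses.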

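Difficulty 1  (2) MapReduce 2 has its own implementation of ClientProtocol, which is activated when mapreduce.framework.name is set to yarn. The submission process is very similar to the classic (MapReduce 1) one. The new job ID is obtained from the resource manager (rather than from the JobTracker); in YARN nomenclature it is an application ID.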

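public interface ClientProtocol extends VersionedProtocol {...} is an interface. (This is where we lose the trail when tracing the program: a call made through an interface method cannot be followed directly, so we have to look at the classes that implement the interface.) It has two implementations: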

public class LocalJobRunner implements ClientProtocol {...}     //Implements MapReduce locally, in-process, for debugging.

public class YARNRunner implements ClientProtocol {...}           //This class enables the current JobClient (0.22 hadoop) to run on YARN.

We are running on YARN, so YARNRunner.java is the class to follow. LocalJobRunner will be discussed later.
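
To make the "interface call, look at the implementation" idea concrete, here is a minimal sketch in plain Java. Protocol, LocalRunner, YarnRunner and create(...) are hypothetical stand-ins that I made up for illustration; they are not the Hadoop classes, and the real selection logic in Hadoop is more involved. The only thing taken from the text above is that the value of mapreduce.framework.name decides which implementation is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are NOT the Hadoop classes.
class RunnerSelectionSketch {

    /** Plays the role of ClientProtocol: the caller only sees this interface. */
    interface Protocol {
        String submitJob(String jobName);
    }

    /** Plays the role of LocalJobRunner (runs in-process). */
    static class LocalRunner implements Protocol {
        @Override
        public String submitJob(String jobName) {
            return "local:" + jobName;
        }
    }

    /** Plays the role of YARNRunner (submits to the cluster). */
    static class YarnRunner implements Protocol {
        @Override
        public String submitJob(String jobName) {
            return "yarn:" + jobName;
        }
    }

    /** Picks an implementation from the framework name; "local" is the assumed default here. */
    static Protocol create(Map<String, String> conf) {
        String framework = conf.getOrDefault("mapreduce.framework.name", "local");
        switch (framework) {
            case "yarn":
                return new YarnRunner();
            case "local":
                return new LocalRunner();
            default:
                throw new IllegalArgumentException("Unknown framework: " + framework);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.framework.name", "yarn");
        // The static type is the interface; the run-time type decides what executes.
        Protocol submitClient = create(conf);
        System.out.println(submitClient.submitJob("wordcount"));
    }
}
```

This is why a debugger or an IDE "go to declaration" stops at the interface: the variable submitClient is typed as the interface, and only the object behind it tells you which submitJob(...) runs.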

ClientProtocol.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-core\src\main\java\org\apache\hadoop\mapreduce\protocol.

YARNRunner.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred.

  1 /**
  2  * This class enables the current JobClient (0.22 hadoop) to run on YARN.
  3  */
  4 @SuppressWarnings("unchecked")
  5 public class YARNRunner implements ClientProtocol {
  6 
  7   private static final Log LOG = LogFactory.getLog(YARNRunner.class);
  8 
  9   private final RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null);
 10   private ResourceMgrDelegate resMgrDelegate;
 11   private ClientCache clientCache;
 12   private Configuration conf;
 13   private final FileContext defaultFileContext;
 14   
 15   /**
 16    * Yarn runner incapsulates the client interface of
 17    * yarn
 18    * @param conf the configuration object for the client
 19    */
 20   
 21   /*
 22    * When a YARNRunner is instantiated, the client proxies are created in this order:
 23    * Cluster->ClientProtocol(YARNRunner)->ResourceMgrDelegate->client(YarnClientImpl)->rmClient(ApplicationClientProtocol)
 24    * This lets the current JobClient run on YARN; the ResourceMgrDelegate is created during instantiation.
 25    */
 26   public YARNRunner(Configuration conf) {
 27    this(conf, new ResourceMgrDelegate(new YarnConfiguration(conf)));
 28   }
 29 
 30   /**
 31    * Similar to {@link #YARNRunner(Configuration)} but allowing injecting
 32    * {@link ResourceMgrDelegate}. Enables mocking and testing.
 33    * @param conf the configuration object for the client
 34    * @param resMgrDelegate the resourcemanager client handle.
 35    */
 36   public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate) {
 37    this(conf, resMgrDelegate, new ClientCache(conf, resMgrDelegate));
 38   }
 39 
 40   /**
 41    * Similar to {@link YARNRunner#YARNRunner(Configuration, ResourceMgrDelegate)}
 42    * but allowing injecting {@link ClientCache}. Enable mocking and testing.
 43    * @param conf the configuration object
 44    * @param resMgrDelegate the resource manager delegate
 45    * @param clientCache the client cache object.
 46    */
 47   public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate,
 48       ClientCache clientCache) {
 49     this.conf = conf;
 50     try {
 51       this.resMgrDelegate = resMgrDelegate;
 52       this.clientCache = clientCache;
 53       this.defaultFileContext = FileContext.getFileContext(this.conf);
 54     } catch (UnsupportedFileSystemException ufe) {
 55       throw new RuntimeException("Error in instantiating YarnClient", ufe);
 56     }
 57   }
 58   
 59   @Override
 60   public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
 61   throws IOException, InterruptedException {
 62     
 63     addHistoryToken(ts);
 64     
 65     // Construct necessary information to start the MR AM
 66     // appContext matters: it carries the environment variables and the command that launches the ApplicationMaster, and it is passed from class to class until the AM starts.
 67     ApplicationSubmissionContext appContext =
 68       createApplicationSubmissionContext(conf, jobSubmitDir, ts);
 69 
 70     // Submit to ResourceManager
 71     // Submit appContext through ResourceMgrDelegate, the class used to communicate with the ResourceManager.
 72     try {
 73       ApplicationId applicationId =
 74           resMgrDelegate.submitApplication(appContext);
 75       
 76       // Despite its name, appMaster is an ApplicationReport, not the ApplicationMaster itself; the naming confused me at first.
 77       ApplicationReport appMaster = resMgrDelegate
 78           .getApplicationReport(applicationId);
 79       String diagnostics =
 80           (appMaster == null ?
 81               "application report is null" : appMaster.getDiagnostics());
 82       if (appMaster == null
 83           || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
 84           || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
 85         throw new IOException("Failed to run job : " +
 86             diagnostics);
 87       }
 88       return clientCache.getClient(jobId).getJobStatus(jobId);
 89     } catch (YarnException e) {
 90       throw new IOException(e);
 91     }
 92   }
 93 
 94   // The key point about ApplicationSubmissionContext: it holds the launch command for amContainer, which is used later.
 95   public ApplicationSubmissionContext createApplicationSubmissionContext(
 96       Configuration jobConf,
 97       String jobSubmitDir, Credentials ts) throws IOException {
 98     ApplicationId applicationId = resMgrDelegate.getApplicationId();
 99 
100     // Setup resource requirements
101     Resource capability = recordFactory.newRecordInstance(Resource.class);
102     capability.setMemory(
103         conf.getInt(
104             MRJobConfig.MR_AM_VMEM_MB, MRJobConfig.DEFAULT_MR_AM_VMEM_MB
105             )
106         );
107     capability.setVirtualCores(
108         conf.getInt(
109             MRJobConfig.MR_AM_CPU_VCORES, MRJobConfig.DEFAULT_MR_AM_CPU_VCORES
110             )
111         );
112     LOG.debug("AppMaster capability = " + capability);
113 
114     // Setup LocalResources
115     Map<String, LocalResource> localResources =
116         new HashMap<String, LocalResource>();
117 
118     Path jobConfPath = new Path(jobSubmitDir, MRJobConfig.JOB_CONF_FILE);
119 
120     URL yarnUrlForJobSubmitDir = ConverterUtils
121         .getYarnUrlFromPath(defaultFileContext.getDefaultFileSystem()
122             .resolvePath(
123                 defaultFileContext.makeQualified(new Path(jobSubmitDir))));
124     LOG.debug("Creating setup context, jobSubmitDir url is "
125         + yarnUrlForJobSubmitDir);
126 
127     localResources.put(MRJobConfig.JOB_CONF_FILE,
128         createApplicationResource(defaultFileContext,
129             jobConfPath, LocalResourceType.FILE));
130     if (jobConf.get(MRJobConfig.JAR) != null) {
131       Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR));
132       LocalResource rc = createApplicationResource(
133           FileContext.getFileContext(jobJarPath.toUri(), jobConf),
134           jobJarPath,
135           LocalResourceType.PATTERN);
136       String pattern = conf.getPattern(JobContext.JAR_UNPACK_PATTERN, 
137           JobConf.UNPACK_JAR_PATTERN_DEFAULT).pattern();
138       rc.setPattern(pattern);
139       localResources.put(MRJobConfig.JOB_JAR, rc);
140     } else {
141       // Job jar may be null. For e.g, for pipes, the job jar is the hadoop
142       // mapreduce jar itself which is already on the classpath.
143       LOG.info("Job jar is not present. "
144           + "Not adding any jar to the list of resources.");
145     }
146 
147     // TODO gross hack
148     for (String s : new String[] {
149         MRJobConfig.JOB_SPLIT,
150         MRJobConfig.JOB_SPLIT_METAINFO }) {
151       localResources.put(
152           MRJobConfig.JOB_SUBMIT_DIR + "/" + s,
153           createApplicationResource(defaultFileContext,
154               new Path(jobSubmitDir, s), LocalResourceType.FILE));
155     }
156 
157     // Setup security tokens
158     DataOutputBuffer dob = new DataOutputBuffer();
159     ts.writeTokenStorageToStream(dob);
160     ByteBuffer securityTokens  = ByteBuffer.wrap(dob.getData(), 0, dob.getLength());
161 
162     // Setup the command to run the AM
163     // This is where the ApplicationMaster class is actually specified:
164     // MRJobConfig.APPLICATION_MASTER_CLASS = org.apache.hadoop.mapreduce.v2.app.MRAppMaster
165     // so the command that the NodeManager eventually runs is the main method of MRAppMaster.
166     List<String> vargs = new ArrayList<String>(8);
167     vargs.add(MRApps.crossPlatformifyMREnv(jobConf, Environment.JAVA_HOME)
168         + "/bin/java");
169 
170     Path amTmpDir =
171         new Path(MRApps.crossPlatformifyMREnv(conf, Environment.PWD),
172             YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR);
173     vargs.add("-Djava.io.tmpdir=" + amTmpDir);
174     MRApps.addLog4jSystemProperties(null, vargs, conf);
175 
176     // Check for Java Lib Path usage in MAP and REDUCE configs
177     warnForJavaLibPath(conf.get(MRJobConfig.MAP_JAVA_OPTS,""), "map", 
178         MRJobConfig.MAP_JAVA_OPTS, MRJobConfig.MAP_ENV);
179     warnForJavaLibPath(conf.get(MRJobConfig.MAPRED_MAP_ADMIN_JAVA_OPTS,""), "map", 
180         MRJobConfig.MAPRED_MAP_ADMIN_JAVA_OPTS, MRJobConfig.MAPRED_ADMIN_USER_ENV);
181     warnForJavaLibPath(conf.get(MRJobConfig.REDUCE_JAVA_OPTS,""), "reduce", 
182         MRJobConfig.REDUCE_JAVA_OPTS, MRJobConfig.REDUCE_ENV);
183     warnForJavaLibPath(conf.get(MRJobConfig.MAPRED_REDUCE_ADMIN_JAVA_OPTS,""), "reduce", 
184         MRJobConfig.MAPRED_REDUCE_ADMIN_JAVA_OPTS, MRJobConfig.MAPRED_ADMIN_USER_ENV);
185 
186     // Add AM admin command opts before user command opts
187     // so that it can be overridden by user
188     String mrAppMasterAdminOptions = conf.get(MRJobConfig.MR_AM_ADMIN_COMMAND_OPTS,
189         MRJobConfig.DEFAULT_MR_AM_ADMIN_COMMAND_OPTS);
190     warnForJavaLibPath(mrAppMasterAdminOptions, "app master", 
191         MRJobConfig.MR_AM_ADMIN_COMMAND_OPTS, MRJobConfig.MR_AM_ADMIN_USER_ENV);
192     vargs.add(mrAppMasterAdminOptions);
193     
194     // Add AM user command opts
195     String mrAppMasterUserOptions = conf.get(MRJobConfig.MR_AM_COMMAND_OPTS,
196         MRJobConfig.DEFAULT_MR_AM_COMMAND_OPTS);
197     warnForJavaLibPath(mrAppMasterUserOptions, "app master", 
198         MRJobConfig.MR_AM_COMMAND_OPTS, MRJobConfig.MR_AM_ENV);
199     vargs.add(mrAppMasterUserOptions);
200 
201     if (jobConf.getBoolean(MRJobConfig.MR_AM_PROFILE,
202         MRJobConfig.DEFAULT_MR_AM_PROFILE)) {
203       final String profileParams = jobConf.get(MRJobConfig.MR_AM_PROFILE_PARAMS,
204           MRJobConfig.DEFAULT_TASK_PROFILE_PARAMS);
205       if (profileParams != null) {
206         vargs.add(String.format(profileParams,
207             ApplicationConstants.LOG_DIR_EXPANSION_VAR + Path.SEPARATOR
208                 + TaskLog.LogName.PROFILE));
209       }
210     }
211 
212     vargs.add(MRJobConfig.APPLICATION_MASTER_CLASS);
213     vargs.add("1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR +
214         Path.SEPARATOR + ApplicationConstants.STDOUT);
215     vargs.add("2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR +
216         Path.SEPARATOR + ApplicationConstants.STDERR);
217 
218 
219     Vector<String> vargsFinal = new Vector<String>(8);
220     // Final command
221     StringBuilder mergedCommand = new StringBuilder();
222     for (CharSequence str : vargs) {
223       mergedCommand.append(str).append(" ");
224     }
225     vargsFinal.add(mergedCommand.toString());
226 
227     LOG.debug("Command to launch container for ApplicationMaster is : "
228         + mergedCommand);
229 
230     // Setup the CLASSPATH in environment
231     // i.e. add { Hadoop jars, job jar, CWD } to classpath.
232     Map<String, String> environment = new HashMap<String, String>();
233     MRApps.setClasspath(environment, conf);
234 
235     // Shell
236     environment.put(Environment.SHELL.name(),
237         conf.get(MRJobConfig.MAPRED_ADMIN_USER_SHELL,
238             MRJobConfig.DEFAULT_SHELL));
239 
240     // Add the container working directory in front of LD_LIBRARY_PATH
241     MRApps.addToEnvironment(environment, Environment.LD_LIBRARY_PATH.name(),
242         MRApps.crossPlatformifyMREnv(conf, Environment.PWD), conf);
243 
244     // Setup the environment variables for Admin first
245     MRApps.setEnvFromInputString(environment, 
246         conf.get(MRJobConfig.MR_AM_ADMIN_USER_ENV,
247             MRJobConfig.DEFAULT_MR_AM_ADMIN_USER_ENV), conf);
248     // Setup the environment variables (LD_LIBRARY_PATH, etc)
249     MRApps.setEnvFromInputString(environment, 
250         conf.get(MRJobConfig.MR_AM_ENV), conf);
251 
252     // Parse distributed cache
253     MRApps.setupDistributedCache(jobConf, localResources);
254 
255     Map<ApplicationAccessType, String> acls
256         = new HashMap<ApplicationAccessType, String>(2);
257     acls.put(ApplicationAccessType.VIEW_APP, jobConf.get(
258         MRJobConfig.JOB_ACL_VIEW_JOB, MRJobConfig.DEFAULT_JOB_ACL_VIEW_JOB));
259     acls.put(ApplicationAccessType.MODIFY_APP, jobConf.get(
260         MRJobConfig.JOB_ACL_MODIFY_JOB,
261         MRJobConfig.DEFAULT_JOB_ACL_MODIFY_JOB));
262 
263     // Setup ContainerLaunchContext for AM container
264     // Build the AM's ContainerLaunchContext from the command assembled above; it is used later to launch the container and thereby start MRAppMaster.
265     ContainerLaunchContext amContainer =
266         ContainerLaunchContext.newInstance(localResources, environment,
267           vargsFinal, null, securityTokens, acls);
268 
269     Collection<String> tagsFromConf =
270         jobConf.getTrimmedStringCollection(MRJobConfig.JOB_TAGS);
271 
272     // Set up the ApplicationSubmissionContext
273     ApplicationSubmissionContext appContext =
274         recordFactory.newRecordInstance(ApplicationSubmissionContext.class);
275     appContext.setApplicationId(applicationId);                // ApplicationId
276     appContext.setQueue(                                       // Queue name
277         jobConf.get(JobContext.QUEUE_NAME,
278         YarnConfiguration.DEFAULT_QUEUE_NAME));
279     // add reservationID if present
280     ReservationId reservationID = null;
281     try {
282       reservationID =
283           ReservationId.parseReservationId(jobConf
284               .get(JobContext.RESERVATION_ID));
285     } catch (NumberFormatException e) {
286       // throw exception as reservationid as is invalid
287       String errMsg =
288           "Invalid reservationId: " + jobConf.get(JobContext.RESERVATION_ID)
289               + " specified for the app: " + applicationId;
290       LOG.warn(errMsg);
291       throw new IOException(errMsg);
292     }
293     if (reservationID != null) {
294       appContext.setReservationID(reservationID);
295       LOG.info("SUBMITTING ApplicationSubmissionContext app:" + applicationId
296           + " to queue:" + appContext.getQueue() + " with reservationId:"
297           + appContext.getReservationID());
298     }
299     appContext.setApplicationName(                             // Job name
300         jobConf.get(JobContext.JOB_NAME,
301         YarnConfiguration.DEFAULT_APPLICATION_NAME));
302     appContext.setCancelTokensWhenComplete(
303         conf.getBoolean(MRJobConfig.JOB_CANCEL_DELEGATION_TOKEN, true));
304     // Set the AM container spec
305     appContext.setAMContainerSpec(amContainer);         // AM Container
306     appContext.setMaxAppAttempts(
307         conf.getInt(MRJobConfig.MR_AM_MAX_ATTEMPTS,
308             MRJobConfig.DEFAULT_MR_AM_MAX_ATTEMPTS));
309     appContext.setResource(capability);
310     appContext.setApplicationType(MRJobConfig.MR_APPLICATION_TYPE);
311     if (tagsFromConf != null && !tagsFromConf.isEmpty()) {
312       appContext.setApplicationTags(new HashSet<String>(tagsFromConf));
313     }
314 
315     return appContext;
316   }
317 
318 }
YARNRunner.java
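
The part of createApplicationSubmissionContext(...) that collects vargs and merges them into one string is easier to see in isolation. The following is a simplified sketch in plain Java. AmCommandSketch and buildCommand(...) are hypothetical names of mine, the argument values in main are only illustrative, and the real method adds more options (log4j properties, admin options, profiling parameters) than shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hadoop code: shows how an argument list becomes one launch command.
class AmCommandSketch {

    /** Collects the pieces of the command and joins them with single spaces. */
    static String buildCommand(String javaHome, String amOpts, String mainClass, String logDir) {
        List<String> vargs = new ArrayList<>();
        vargs.add(javaHome + "/bin/java");          // the JVM binary
        vargs.add("-Djava.io.tmpdir=$PWD/tmp");     // illustrative temp dir inside the container
        vargs.add(amOpts);                          // user-supplied JVM options
        vargs.add(mainClass);                       // the class whose main method is run
        vargs.add("1>" + logDir + "/stdout");       // redirect standard output
        vargs.add("2>" + logDir + "/stderr");       // redirect standard error

        // Same merge loop as in the listing: every argument is followed by a space.
        StringBuilder mergedCommand = new StringBuilder();
        for (String arg : vargs) {
            mergedCommand.append(arg).append(" ");
        }
        return mergedCommand.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildCommand(
                "$JAVA_HOME",
                "-Xmx1024m",
                "org.apache.hadoop.mapreduce.v2.app.MRAppMaster",
                "<LOG_DIR>"));
    }
}
```

The result is a single shell command line. In the listing this one string is the only element of vargsFinal, which is then handed to ContainerLaunchContext.newInstance(...).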

In JobSubmitter, submitJobInternal(...) calls JobID jobId = submitClient.getNewJobID(), which completes step 2 in Figure 6: get new application.

getNewJobID() is declared in ClientProtocol.java and implemented in YARNRunner.java. The YARNRunner implementation in turn calls getNewJobID() in ResourceMgrDelegate.java.
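
In the ResourceMgrDelegate listing further down, getNewJobID() ends with TypeConverter.fromYarn(applicationId), which turns the YARN application ID into a MapReduce job ID. As far as I understand, the two printed forms differ only in their prefix and share the cluster timestamp and the sequence number. The sketch below shows that relationship on plain strings. JobIdSketch and toJobId(...) are hypothetical names of mine, and the real conversion works on ID objects rather than on strings.

```java
// Hypothetical sketch, not Hadoop code: maps the textual application ID to the textual job ID.
class JobIdSketch {

    private static final String APP_PREFIX = "application_";
    private static final String JOB_PREFIX = "job_";

    /** Replaces the "application_" prefix with "job_" and keeps timestamp and sequence number. */
    static String toJobId(String applicationId) {
        if (!applicationId.startsWith(APP_PREFIX)) {
            throw new IllegalArgumentException("Not an application ID: " + applicationId);
        }
        return JOB_PREFIX + applicationId.substring(APP_PREFIX.length());
    }

    public static void main(String[] args) {
        // prints job_1480000000000_0001
        System.out.println(toJobId("application_1480000000000_0001"));
    }
}
```

This is handy when reading logs: a job ID in the MapReduce output and an application ID in the ResourceManager UI with the same numbers refer to the same submission.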

The YARNRunner constructors store a ResourceMgrDelegate, and submitJob(...) calls resMgrDelegate.submitApplication(appContext), i.e. the submitApplication(...) method of ResourceMgrDelegate.java.

ResourceMgrDelegate.java is located in hadoop-2.7.3-src\hadoop-mapreduce-project\hadoop-mapreduce-client\hadoop-mapreduce-client-jobclient\src\main\java\org\apache\hadoop\mapred.

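   (3) The job client checks the job's output specification, computes the input splits (although the option yarn.app.mapreduce.am.compute-splits-in-cluster allows the splits to be generated on the cluster instead, which can benefit jobs with many splits), and copies the job resources (including the job JAR, the configuration and the split information) to HDFS.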

This completes step 3 in Figure 6: copy job resources.

In JobSubmitter, submitJobInternal(...) calls copyAndConfigureFiles(job, submitJobDir) to copy the job resources to HDFS.

Splits: how the input is split is decided by the job's InputFormat. For file-based input, the rules are implemented in the getSplits() method of FileInputFormat.

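submitJobInternal(...) calls maps = writeSplits(job, submitJobDir) to create the splits. writeSplits(...) in turn calls either writeNewSplits(job, jobSubmitDir) or writeOldSplits(jConf, jobSubmitDir), depending on whether the job uses the new-API mapper. writeNewSplits(...) calls List<InputSplit> splits = input.getSplits(job), where getSplits(...) is a method of the abstract class public abstract class InputFormat<K, V> {...}. writeOldSplits(...) calls job.getInputFormat().getSplits(job, job.getNumMapTasks()) on the old-API org.apache.hadoop.mapred.InputFormat. Both methods sort the splits by size, largest first, and then write them out with JobSplitWriter.createSplitFiles(...); the number of splits they return becomes the number of map tasks.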
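
The "largest split first" ordering used by both writeNewSplits(...) and writeOldSplits(...) can be sketched in a few lines of plain Java. SplitOrderSketch and its Split class are hypothetical stand-ins of mine that only carry a name and a length; the real InputSplit classes carry much more (for example host locations).

```java
import java.util.Arrays;

// Hypothetical sketch, not Hadoop code: orders splits so that the biggest ones go first.
class SplitOrderSketch {

    /** Minimal stand-in for an input split: just a name and a length in bytes. */
    static final class Split {
        final String name;
        final long length;

        Split(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    /**
     * Sorts the array in place by descending length and returns the number of
     * splits, which in the listing becomes the number of map tasks.
     */
    static int sortLargestFirst(Split[] splits) {
        // Comparing b to a (instead of a to b) yields descending order.
        Arrays.sort(splits, (a, b) -> Long.compare(b.length, a.length));
        return splits.length;
    }

    public static void main(String[] args) {
        Split[] splits = {
            new Split("part-a", 10L),
            new Split("part-b", 128L),
            new Split("part-c", 64L)
        };
        int maps = sortLargestFirst(splits);
        System.out.println("number of splits: " + maps);
        for (Split s : splits) {
            System.out.println(s.name + " " + s.length);
        }
    }
}
```

Putting the biggest splits first means the longest-running map tasks tend to be scheduled early, which helps to shorten the tail of the job.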

   (4) Finally, the job is submitted by calling submitApplication() on the resource manager.

This completes step 4 in Figure 6: submit application.

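In JobSubmitter, submitJobInternal(...) calls status = submitClient.submitJob(...), which is where the job is really submitted.
----> submitJob(...) is declared in the interface ClientProtocol.java and implemented in YARNRunner.java.
----> In YARNRunner.java, submitJob(...) calls submitApplication(...) in ResourceMgrDelegate.java.
----> ResourceMgrDelegate.submitApplication(...) forwards the call to its client field. The field is declared as the abstract class YarnClient, so the call goes through YarnClient.submitApplication(...).
----> At run time, client is the YarnClientImpl created by YarnClient.createYarnClient(), so YarnClientImpl.submitApplication(...) is what actually executes.

The details are as follows: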

public abstract class YarnClient extends AbstractService {...} is an abstract class. It has two subclasses:

public class ResourceMgrDelegate extends YarnClient {...}

public class YarnClientImpl extends YarnClient {...}
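
Both classes extend YarnClient, but they relate to each other by delegation rather than by inheritance: ResourceMgrDelegate holds a YarnClient field and forwards calls to it. Here is a minimal sketch of that arrangement in plain Java. Client, ClientImpl and Delegate are hypothetical stand-ins of mine for YarnClient, YarnClientImpl and ResourceMgrDelegate, and the String parameter stands in for the ApplicationSubmissionContext.

```java
// Hypothetical sketch, not Hadoop code: an abstract type, its real implementation, and a delegate.
class DelegationSketch {

    /** Plays the role of the abstract class YarnClient. */
    static abstract class Client {

        /** Static factory, analogous to YarnClient.createYarnClient(). */
        public static Client create() {
            return new ClientImpl();
        }

        public abstract String submitApplication(String appContext);
    }

    /** Plays the role of YarnClientImpl: the class that does the real work. */
    static class ClientImpl extends Client {
        @Override
        public String submitApplication(String appContext) {
            return "submitted:" + appContext;
        }
    }

    /** Plays the role of ResourceMgrDelegate: same base type, but it only forwards. */
    static class Delegate extends Client {
        private final Client client = Client.create();

        @Override
        public String submitApplication(String appContext) {
            // Declared type is Client; the object behind it is a ClientImpl.
            return client.submitApplication(appContext);
        }
    }

    public static void main(String[] args) {
        Client c = new Delegate();
        System.out.println(c.submitApplication("app_0001"));
    }
}
```

Because the field is typed as the abstract class, an IDE jumps to the abstract method when you follow the call; you have to look at the factory method to find out which subclass really runs.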

 1 public class ResourceMgrDelegate extends YarnClient {
 2   private static final Log LOG = LogFactory.getLog(ResourceMgrDelegate.class);
 3       
 4   private YarnConfiguration conf;
 5   private ApplicationSubmissionContext application;
 6   private ApplicationId applicationId;
 7   @Private
 8   @VisibleForTesting
 9   protected YarnClient client;
10   private Text rmDTService;
11 
12   /**
13    * Delegate responsible for communicating with the Resource Manager's
14    * {@link ApplicationClientProtocol}.
15    * @param conf the configuration object.
16    */
17   // Mainly responsible for exchanging information with the RM
18   public ResourceMgrDelegate(YarnConfiguration conf) {
19     super(ResourceMgrDelegate.class.getName());
20     this.conf = conf;
21     this.client = YarnClient.createYarnClient();
22     init(conf);
23     start();
24   }
25 
26   @Override
27   protected void serviceInit(Configuration conf) throws Exception {
28     client.init(conf);
29     super.serviceInit(conf);
30   }
31 
32   @Override
33   protected void serviceStart() throws Exception {
34     client.start();
35     super.serviceStart();
36   }
37 
38   @Override
39   protected void serviceStop() throws Exception {
40     client.stop();
41     super.serviceStop();
42   }
43 
44   public String getFilesystemName() throws IOException, InterruptedException {
45     return FileSystem.get(conf).getUri().toString();
46   }
47 
48   public JobID getNewJobID() throws IOException, InterruptedException {
49     try {
50       this.application = client.createApplication().getApplicationSubmissionContext();
51       this.applicationId = this.application.getApplicationId();
52       return TypeConverter.fromYarn(applicationId);
53     } catch (YarnException e) {
54       throw new IOException(e);
55     }
56   }
57 
58   public ApplicationId getApplicationId() {
59     return applicationId;
60   }
61 
62   @Override
63   public YarnClientApplication createApplication() throws
64       YarnException, IOException {
65     return client.createApplication();
66   }
67   //
68   @Override
69   public ApplicationId
70       submitApplication(ApplicationSubmissionContext appContext)
71           throws YarnException, IOException {
72     return client.submitApplication(appContext);
73   }
74 
75 }
ResourceMgrDelegate.java
 1 @InterfaceAudience.Public
 2 @InterfaceStability.Stable
 3 public abstract class YarnClient extends AbstractService {
 4 
 5   /**
 6    * Create a new instance of YarnClient.
 7    */
 8   @Public
 9   public static YarnClient createYarnClient() {
10     YarnClient client = new YarnClientImpl();    // create a new YarnClientImpl object
11     return client;
12   }
13 
14   @Private
15   protected YarnClient(String name) {
16     super(name);
17   }
18 
19   /**
20    * <p>
21    * Submit a new application to <code>YARN.</code> It is a blocking call - it
22    * will not return {@link ApplicationId} until the submitted application is
23    * submitted successfully and accepted by the ResourceManager.
24    * </p>
25    * 
26    * <p>
27    * Users should provide an {@link ApplicationId} as part of the parameter
28    * {@link ApplicationSubmissionContext} when submitting a new application,
29    * otherwise it will throw the {@link ApplicationIdNotProvidedException}.
30    * </p>
31    *
32    * <p>This internally calls {@link ApplicationClientProtocol#submitApplication
33    * (SubmitApplicationRequest)}, and after that, it internally invokes
34    * {@link ApplicationClientProtocol#getApplicationReport
35    * (GetApplicationReportRequest)} and waits till it can make sure that the
36    * application gets properly submitted. If RM fails over or RM restart
37    * happens before ResourceManager saves the application's state,
38    * {@link ApplicationClientProtocol
39    * #getApplicationReport(GetApplicationReportRequest)} will throw
40    * the {@link ApplicationNotFoundException}. This API automatically resubmits
41    * the application with the same {@link ApplicationSubmissionContext} when it
42    * catches the {@link ApplicationNotFoundException}</p>
43    *
44    * @param appContext
45    *          {@link ApplicationSubmissionContext} containing all the details
46    *          needed to submit a new application
47    * @return {@link ApplicationId} of the accepted application
48    * @throws YarnException
49    * @throws IOException
50    * @see #createApplication()
51    */
52   public abstract ApplicationId submitApplication(
53       ApplicationSubmissionContext appContext) throws YarnException,
54       IOException;
55 
56 }
YarnClient.java
  1 @Private
  2 @Unstable
  3 // Mainly responsible for instantiating rmClient
  4 public class YarnClientImpl extends YarnClient {
  5 
  6   private static final Log LOG = LogFactory.getLog(YarnClientImpl.class);
  7 
  8   protected ApplicationClientProtocol rmClient;
  9   protected long submitPollIntervalMillis;
 10   private long asyncApiPollIntervalMillis;
 11   private long asyncApiPollTimeoutMillis;
 12   protected AHSClient historyClient;
 13   private boolean historyServiceEnabled;
 14   protected TimelineClient timelineClient;
 15   @VisibleForTesting
 16   Text timelineService;
 17   @VisibleForTesting
 18   String timelineDTRenewer;
 19   protected boolean timelineServiceEnabled;
 20   protected boolean timelineServiceBestEffort;
 21 
 22   private static final String ROOT = "root";
 23 
 24   public YarnClientImpl() {
 25     super(YarnClientImpl.class.getName());
 26   }
 27 
 28   @SuppressWarnings("deprecation")
 29   @Override
 30   protected void serviceInit(Configuration conf) throws Exception {
 31     asyncApiPollIntervalMillis =
 32         conf.getLong(YarnConfiguration.YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS,
 33           YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS);
 34     asyncApiPollTimeoutMillis =
 35         conf.getLong(YarnConfiguration.YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_TIMEOUT_MS,
 36             YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_TIMEOUT_MS);
 37     submitPollIntervalMillis = asyncApiPollIntervalMillis;
 38     if (conf.get(YarnConfiguration.YARN_CLIENT_APP_SUBMISSION_POLL_INTERVAL_MS)
 39         != null) {
 40       submitPollIntervalMillis = conf.getLong(
 41         YarnConfiguration.YARN_CLIENT_APP_SUBMISSION_POLL_INTERVAL_MS,
 42         YarnConfiguration.DEFAULT_YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_INTERVAL_MS);
 43     }
 44 
 45     if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
 46       YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
 47       historyServiceEnabled = true;
 48       historyClient = AHSClient.createAHSClient();
 49       historyClient.init(conf);
 50     }
 51 
 52     if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
 53         YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
 54       timelineServiceEnabled = true;
 55       timelineClient = createTimelineClient();
 56       timelineClient.init(conf);
 57       timelineDTRenewer = getTimelineDelegationTokenRenewer(conf);
 58       timelineService = TimelineUtils.buildTimelineTokenService(conf);
 59     }
 60 
 61     timelineServiceBestEffort = conf.getBoolean(
 62         YarnConfiguration.TIMELINE_SERVICE_CLIENT_BEST_EFFORT,
 63         YarnConfiguration.DEFAULT_TIMELINE_SERVICE_CLIENT_BEST_EFFORT);
 64     super.serviceInit(conf);
 65   }
 66 
 67   TimelineClient createTimelineClient() throws IOException, YarnException {
 68     return TimelineClient.createTimelineClient();
 69   }
 70 
 71   @Override
 72   protected void serviceStart() throws Exception {
 73     // rmClient is the protocol that clients use to talk to the RM: it submits or kills applications and fetches information (applications, cluster metrics, nodes, queues and ACLs).
 74     try {
 75       rmClient = ClientRMProxy.createRMProxy(getConfig(),
 76           ApplicationClientProtocol.class);
 77       if (historyServiceEnabled) {
 78         historyClient.start();
 79       }
 80       if (timelineServiceEnabled) {
 81         timelineClient.start();
 82       }
 83     } catch (IOException e) {
 84       throw new YarnRuntimeException(e);
 85     }
 86     super.serviceStart();
 87   }
 88 
 89   @Override
 90   protected void serviceStop() throws Exception {
 91     if (this.rmClient != null) {
 92       RPC.stopProxy(this.rmClient);
 93     }
 94     if (historyServiceEnabled) {
 95       historyClient.stop();
 96     }
 97     if (timelineServiceEnabled) {
 98       timelineClient.stop();
 99     }
100     super.serviceStop();
101   }
102 
103   private GetNewApplicationResponse getNewApplication()
104       throws YarnException, IOException {
105     GetNewApplicationRequest request =
106         Records.newRecord(GetNewApplicationRequest.class);
107     return rmClient.getNewApplication(request);
108   }
109 
110   @Override
111   public YarnClientApplication createApplication()
112       throws YarnException, IOException {
113     ApplicationSubmissionContext context = Records.newRecord
114         (ApplicationSubmissionContext.class);
115     GetNewApplicationResponse newApp = getNewApplication();
116     ApplicationId appId = newApp.getApplicationId();
117     context.setApplicationId(appId);
118     return new YarnClientApplication(newApp, context);
119   }
120 
121   @Override
122   public ApplicationId
123       submitApplication(ApplicationSubmissionContext appContext)
124           throws YarnException, IOException {
125     ApplicationId applicationId = appContext.getApplicationId();
126     if (applicationId == null) {
127       throw new ApplicationIdNotProvidedException(
128           "ApplicationId is not provided in ApplicationSubmissionContext");
129     }
130     // Wrap appContext in a request
131     SubmitApplicationRequest request =
132         Records.newRecord(SubmitApplicationRequest.class);
133     request.setApplicationSubmissionContext(appContext);
134 
135     // Automatically add the timeline DT into the CLC
136     // Only when the security and the timeline service are both enabled
137     if (isSecurityEnabled() && timelineServiceEnabled) {
138       addTimelineDelegationToken(appContext.getAMContainerSpec());
139     }
140 
141     //TODO: YARN-1763:Handle RM failovers during the submitApplication call.
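142     // Submit the request through rmClient. rmClient is the client-side RPC proxy for
143     // ApplicationClientProtocol; on the ResourceManager side the call is served by ClientRMService.
144     rmClient.submitApplication(request);    // the application is submitted here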
145 
146     int pollCount = 0;
147     long startTime = System.currentTimeMillis();
148     EnumSet<YarnApplicationState> waitingStates = 
149                                  EnumSet.of(YarnApplicationState.NEW,
150                                  YarnApplicationState.NEW_SAVING,
151                                  YarnApplicationState.SUBMITTED);
152     EnumSet<YarnApplicationState> failToSubmitStates = 
153                                   EnumSet.of(YarnApplicationState.FAILED,
154                                   YarnApplicationState.KILLED);        
155     // Keep polling until the state leaves NEW / NEW_SAVING / SUBMITTED; give up with a timeout if it stays there too long
156     while (true) {
157       try {
158         ApplicationReport appReport = getApplicationReport(applicationId);
159         YarnApplicationState state = appReport.getYarnApplicationState();
160         if (!waitingStates.contains(state)) {
161           if(failToSubmitStates.contains(state)) {
162             throw new YarnException("Failed to submit " + applicationId + 
163                 " to YARN : " + appReport.getDiagnostics());
164           }
165           LOG.info("Submitted application " + applicationId);
166           break;
167         }
168 
169         long elapsedMillis = System.currentTimeMillis() - startTime;
170         if (enforceAsyncAPITimeout() &&
171             elapsedMillis >= asyncApiPollTimeoutMillis) {
172           throw new YarnException("Timed out while waiting for application " +
173               applicationId + " to be submitted successfully");
174         }
175 
176         // Notify the client through the log every 10 poll, in case the client
177         // is blocked here too long.
178         if (++pollCount % 10 == 0) {
179           LOG.info("Application submission is not finished, " +
180               "submitted application " + applicationId +
181               " is still in " + state);
182         }
183         try {
184           Thread.sleep(submitPollIntervalMillis);
185         } catch (InterruptedException ie) {
186           LOG.error("Interrupted while waiting for application "
187               + applicationId
188               + " to be successfully submitted.");
189         }
190       } catch (ApplicationNotFoundException ex) {
191         // FailOver or RM restart happens before RMStateStore saves
192         // ApplicationState
193         LOG.info("Re-submit application " + applicationId + "with the " +
194             "same ApplicationSubmissionContext");
195         rmClient.submitApplication(request);
196       }
197     }
198 
199     return applicationId;
200   }
201 
202 }
YarnClientImpl.java
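
The polling loop at the end of submitApplication() above is easier to follow when isolated. The following is only a minimal sketch, not the Hadoop code: SubmitPollSketch, AppState and waitUntilSubmitted are hypothetical names, the application report is replaced by a plain Supplier, and treating a negative timeout as "wait forever" is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical stand-in for the polling part of YarnClientImpl.submitApplication().
final class SubmitPollSketch {
    // Only the states the loop cares about; the real enum is YarnApplicationState.
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, FAILED, KILLED }

    static final EnumSet<AppState> WAITING =
        EnumSet.of(AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED);
    static final EnumSet<AppState> FAILED_TO_SUBMIT =
        EnumSet.of(AppState.FAILED, AppState.KILLED);

    /**
     * Polls until the reported state is no longer a waiting state.
     * Returns that state, or throws if it means the submission failed
     * or if the timeout expires first (a negative timeout disables the check).
     */
    static AppState waitUntilSubmitted(Supplier<AppState> report,
                                       long pollIntervalMillis,
                                       long timeoutMillis) {
        long start = System.currentTimeMillis();
        while (true) {
            AppState state = report.get();
            if (!WAITING.contains(state)) {
                if (FAILED_TO_SUBMIT.contains(state)) {
                    throw new IllegalStateException("Failed to submit, state=" + state);
                }
                return state;
            }
            if (timeoutMillis >= 0
                    && System.currentTimeMillis() - start >= timeoutMillis) {
                throw new IllegalStateException("Timed out while in state " + state);
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                // Unlike the listing above, this sketch stops polling when interrupted.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
    }

    public static void main(String[] args) {
        Iterator<AppState> reports = Arrays.asList(
            AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED, AppState.ACCEPTED).iterator();
        System.out.println(waitUntilSubmitted(reports::next, 1, 1000));
    }
}
```

Note that the loop ends as soon as the state is anything other than NEW, NEW_SAVING or SUBMITTED, so a successful return only means the ResourceManager has taken the application over, not that it has finished.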

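YARNRunner constructor ----> calls the ResourceMgrDelegate constructor ----> calls createYarnClient() of the abstract class YarnClient ----> calls the YarnClientImpl constructor; the new object is handed back through YarnClient to ResourceMgrDelegate (so the methods invoked on the client held by ResourceMgrDelegate end up in YarnClientImpl).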

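YARNRunner constructor ----> calls the ResourceMgrDelegate constructor ----> calls init() and start() of the abstract class AbstractService, which is the parent of the abstract class YarnClient ----> these two methods in turn call serviceInit() and serviceStart() (what actually runs are the corresponding serviceInit() and serviceStart() overrides in YarnClientImpl).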

YARNRunner.submitJob() ----> calls ResourceMgrDelegate.submitApplication() ----> calls submitApplication() of the abstract class YarnClient (which again resolves to YarnClientImpl.submitApplication()).
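
The three call chains above all rely on the same pattern: a factory method on an abstract class returns the only concrete subclass, and a delegate forwards every call to it. Below is a minimal sketch of that pattern. All class names are hypothetical stand-ins, and the sketch is simplified (as far as I can tell, the real ResourceMgrDelegate also extends YarnClient, which is omitted here).

```java
// Hypothetical, heavily simplified stand-ins for the Hadoop classes named above.
final class DelegationSketch {
    /** Plays the role of the abstract class YarnClient. */
    abstract static class Client {
        // Like YarnClient.createYarnClient(): callers only ever see the abstract type.
        static Client create() {
            return new ClientImpl();
        }

        abstract String submitApplication(String appContext);
    }

    /** Plays the role of YarnClientImpl, the concrete subclass. */
    static final class ClientImpl extends Client {
        @Override
        String submitApplication(String appContext) {
            return "ClientImpl submitted " + appContext;
        }
    }

    /** Plays the role of ResourceMgrDelegate: it owns a Client and forwards to it. */
    static final class Delegate {
        private final Client client = Client.create();

        String submitApplication(String appContext) {
            return client.submitApplication(appContext);
        }
    }

    /** Plays the role of YARNRunner. */
    static final class Runner {
        private final Delegate delegate = new Delegate();

        String submitJob(String job) {
            return delegate.submitApplication(job);
        }
    }

    public static void main(String[] args) {
        // The call enters at Runner but the work is done in ClientImpl.
        System.out.println(new Runner().submitJob("job_1"));
    }
}
```

When reading the real source, this is why jumping to the declaration of submitApplication() lands in the abstract class: the concrete method has to be looked up in the subclass that the factory method instantiates.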

Up to this point everything has run on the client machine that submits the job; nothing has reached the ResourceManager yet. Next comes the ResourceManager side:

Difficulty 2    Continuing from above: YarnClientImpl.submitApplication(...) calls submitApplication(...) on the interface ApplicationClientProtocol, which ends up in submitApplication(...) of the implementing class ClientRMService.java; the constructor of that class is listed as well.

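public interface ApplicationClientProtocol extends ApplicationBaseProtocol {...} is an interface. It has four implementing classes, and only two of them have a submitApplication() method. (This is where tracing the program gets lost: when a class calls a method on an interface, the trail stops, and you have to look at the classes that implement the interface.) The two classes are: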

public class ApplicationClientProtocolPBClientImpl implements ApplicationClientProtocol, Closeable {...}

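public class ClientRMService extends AbstractService implements ApplicationClientProtocol {...}      // this is the class we follow

Why this one: ApplicationClientProtocolPBClientImpl is the client-side stub that only forwards the call over RPC, whereas ClientRMService registers itself as the server for ApplicationClientProtocol in its serviceStart() method (rpc.getServer(ApplicationClientProtocol.class, this, ...)), so the request is ultimately handled there. For more detail, see:

Reference 2     Reference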
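
The relationship between the two implementations can be sketched as follows. This is only an illustration under simplifying assumptions: the class names are hypothetical, requests are plain strings, and the "network" is a direct method call, whereas the real stub serializes the request with protobuf and sends it through Hadoop RPC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the "RPC" here is just an in-process method call.
final class RpcStubSketch {
    /** Plays the role of ApplicationClientProtocol. */
    interface Protocol {
        String submitApplication(String request);
    }

    /** Plays the role of ClientRMService: the server-side class that does the work. */
    static final class ServerSide implements Protocol {
        final List<String> accepted = new ArrayList<>();

        @Override
        public String submitApplication(String request) {
            accepted.add(request);
            return "accepted:" + request;
        }
    }

    /** Plays the role of ApplicationClientProtocolPBClientImpl: it only forwards. */
    static final class ClientStub implements Protocol {
        private final Protocol remote;   // stands in for the connection to the RM

        ClientStub(Protocol remote) {
            this.remote = remote;
        }

        @Override
        public String submitApplication(String request) {
            // A real stub would serialize the request and send it over the wire here.
            return remote.submitApplication(request);
        }
    }

    public static void main(String[] args) {
        ServerSide rm = new ServerSide();
        Protocol rmClient = new ClientStub(rm);   // what the client code holds
        System.out.println(rmClient.submitApplication("app_1"));
        System.out.println(rm.accepted);          // the server side recorded the request
    }
}
```

Both classes implement the same interface, which is why a static search for implementations finds both; only the server-side one contains the submission logic.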

ClientRMService.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java

  1 /**
  2  * The client interface to the Resource Manager. This module handles all the rpc
  3  * interfaces to the resource manager from the client.
  4  */
  5 public class ClientRMService extends AbstractService implements
  6     ApplicationClientProtocol {
  7   private static final ArrayList<ApplicationReport> EMPTY_APPS_REPORT = new ArrayList<ApplicationReport>();
  8 
  9   private static final Log LOG = LogFactory.getLog(ClientRMService.class);
 10 
 11   final private AtomicInteger applicationCounter = new AtomicInteger(0);
 12   final private YarnScheduler scheduler;
 13   final private RMContext rmContext;
 14   private final RMAppManager rmAppManager;
 15 
 16   private Server server;
 17   protected RMDelegationTokenSecretManager rmDTSecretManager;
 18 
 19   private final RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null);
 20   InetSocketAddress clientBindAddress;
 21 
 22   private final ApplicationACLsManager applicationsACLsManager;
 23   private final QueueACLsManager queueACLsManager;
 24 
 25   // For Reservation APIs
 26   private Clock clock;
 27   private ReservationSystem reservationSystem;
 28   private ReservationInputValidator rValidator;
 29 
 30   public ClientRMService(RMContext rmContext, YarnScheduler scheduler,
 31       RMAppManager rmAppManager, ApplicationACLsManager applicationACLsManager,
 32       QueueACLsManager queueACLsManager,
 33       RMDelegationTokenSecretManager rmDTSecretManager) {
 34     this(rmContext, scheduler, rmAppManager, applicationACLsManager,
 35         queueACLsManager, rmDTSecretManager, new UTCClock());
 36   }
 37 
 38   public ClientRMService(RMContext rmContext, YarnScheduler scheduler,
 39       RMAppManager rmAppManager, ApplicationACLsManager applicationACLsManager,
 40       QueueACLsManager queueACLsManager,
 41       RMDelegationTokenSecretManager rmDTSecretManager, Clock clock) {
 42     super(ClientRMService.class.getName());
 43     this.scheduler = scheduler;
 44     this.rmContext = rmContext;
 45     this.rmAppManager = rmAppManager;
 46     this.applicationsACLsManager = applicationACLsManager;
 47     this.queueACLsManager = queueACLsManager;
 48     this.rmDTSecretManager = rmDTSecretManager;
 49     this.reservationSystem = rmContext.getReservationSystem();
 50     this.clock = clock;
 51     this.rValidator = new ReservationInputValidator(clock);
 52   }
 53 
 54   @Override
 55   protected void serviceInit(Configuration conf) throws Exception {
 56     clientBindAddress = getBindAddress(conf);
 57     super.serviceInit(conf);
 58   }
 59 
 60   @Override
 61   protected void serviceStart() throws Exception {
 62     Configuration conf = getConfig();
 63     YarnRPC rpc = YarnRPC.create(conf);
 64     this.server =   
 65       rpc.getServer(ApplicationClientProtocol.class, this,
 66             clientBindAddress,
 67             conf, this.rmDTSecretManager,
 68             conf.getInt(YarnConfiguration.RM_CLIENT_THREAD_COUNT, 
 69                 YarnConfiguration.DEFAULT_RM_CLIENT_THREAD_COUNT));
 70     
 71     // Enable service authorization?
 72     if (conf.getBoolean(
 73         CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, 
 74         false)) {
 75       InputStream inputStream =
 76           this.rmContext.getConfigurationProvider()
 77               .getConfigurationInputStream(conf,
 78                   YarnConfiguration.HADOOP_POLICY_CONFIGURATION_FILE);
 79       if (inputStream != null) {
 80         conf.addResource(inputStream);
 81       }
 82       refreshServiceAcls(conf, RMPolicyProvider.getInstance());
 83     }
 84     
 85     this.server.start();
 86     clientBindAddress = conf.updateConnectAddr(YarnConfiguration.RM_BIND_HOST,
 87                                                YarnConfiguration.RM_ADDRESS,
 88                                                YarnConfiguration.DEFAULT_RM_ADDRESS,
 89                                                server.getListenerAddress());
 90     super.serviceStart();
 91   }
 92 
 93   @Override
 94   protected void serviceStop() throws Exception {
 95     if (this.server != null) {
 96         this.server.stop();
 97     }
 98     super.serviceStop();
 99   }
100 
101   
102   ApplicationId getNewApplicationId() {
103     ApplicationId applicationId = org.apache.hadoop.yarn.server.utils.BuilderUtils
104         .newApplicationId(recordFactory, ResourceManager.getClusterTimeStamp(),
105             applicationCounter.incrementAndGet());
106     LOG.info("Allocated new applicationId: " + applicationId.getId());
107     return applicationId;
108   }
109 
110   @Override
111   public GetNewApplicationResponse getNewApplication(
112       GetNewApplicationRequest request) throws YarnException {
113     GetNewApplicationResponse response = recordFactory
114         .newRecordInstance(GetNewApplicationResponse.class);
115     response.setApplicationId(getNewApplicationId());
116     // Pick up min/max resource from scheduler...
117     response.setMaximumResourceCapability(scheduler
118         .getMaximumResourceCapability());       
119     
120     return response;
121   }
122 
123   @Override
124   public SubmitApplicationResponse submitApplication(
125       SubmitApplicationRequest request) throws YarnException {
126     ApplicationSubmissionContext submissionContext = request
127         .getApplicationSubmissionContext();
128     ApplicationId applicationId = submissionContext.getApplicationId();
129 
130     // ApplicationSubmissionContext needs to be validated for safety - only
131     // those fields that are independent of the RM's configuration will be
132     // checked here, those that are dependent on RM configuration are validated
133     // in RMAppManager.
134     
135     // Validation starts here; the submission is rejected if a check fails
136     String user = null;
137     try {
138       // Safety
139       user = UserGroupInformation.getCurrentUser().getShortUserName();
140     } catch (IOException ie) {
141       LOG.warn("Unable to get the current user.", ie);
142       RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
143           ie.getMessage(), "ClientRMService",
144           "Exception in submitting application", applicationId);
145       throw RPCUtil.getRemoteException(ie);
146     }
147     
148     // Duplicate-submission check
149     // Check whether app has already been put into rmContext,
150     // If it is, simply return the response
151     if (rmContext.getRMApps().get(applicationId) != null) {
152       LOG.info("This is an earlier submitted application: " + applicationId);
153       return SubmitApplicationResponse.newInstance();
154     }
155     
156     // Fill in defaults for the queue, application name and application type
157     if (submissionContext.getQueue() == null) {
158       submissionContext.setQueue(YarnConfiguration.DEFAULT_QUEUE_NAME);
159     }
160     if (submissionContext.getApplicationName() == null) {
161       submissionContext.setApplicationName(
162           YarnConfiguration.DEFAULT_APPLICATION_NAME);
163     }
164     if (submissionContext.getApplicationType() == null) {
165       submissionContext
166         .setApplicationType(YarnConfiguration.DEFAULT_APPLICATION_TYPE);
167     } else {
168       if (submissionContext.getApplicationType().length() > YarnConfiguration.APPLICATION_TYPE_LENGTH) {
169         submissionContext.setApplicationType(submissionContext
170           .getApplicationType().substring(0,
171             YarnConfiguration.APPLICATION_TYPE_LENGTH));
172       }
173     }
174 
175     try {
176       // call RMAppManager to submit application directly
177       // The real work: submit through rmAppManager
178       rmAppManager.submitApplication(submissionContext,
179           System.currentTimeMillis(), user);
180 
181       LOG.info("Application with id " + applicationId.getId() + 
182           " submitted by user " + user);
183       RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
184           "ClientRMService", applicationId);
185     } catch (YarnException e) {
186       LOG.info("Exception in submitting application with id " +
187           applicationId.getId(), e);
188       RMAuditLogger.logFailure(user, AuditConstants.SUBMIT_APP_REQUEST,
189           e.getMessage(), "ClientRMService",
190           "Exception in submitting application", applicationId);
191       throw e;
192     }
193 
194     SubmitApplicationResponse response = recordFactory
195         .newRecordInstance(SubmitApplicationResponse.class);
196     return response;
197   }
198 
199 }
ClientRMService.java
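
Before handing over to RMAppManager, submitApplication() above fills in missing fields and truncates an overlong application type. The sketch below reproduces just that step. The class, the Context type and the constant names are hypothetical, and the literal values ("default", "N/A", "YARN", 20) are what I believe YarnConfiguration defines in 2.7.x, so treat them as assumptions and check the source.

```java
// Hypothetical sketch of the default-filling step in ClientRMService.submitApplication().
final class SubmissionDefaultsSketch {
    // Assumed values; the real ones live in YarnConfiguration.
    static final String DEFAULT_QUEUE = "default";
    static final String DEFAULT_NAME = "N/A";
    static final String DEFAULT_TYPE = "YARN";
    static final int TYPE_MAX_LENGTH = 20;

    /** Plays the role of ApplicationSubmissionContext, reduced to three fields. */
    static final class Context {
        String queue;
        String name;
        String type;
    }

    /** Fills in missing fields and truncates an overlong application type. */
    static void applyDefaults(Context c) {
        if (c.queue == null) {
            c.queue = DEFAULT_QUEUE;
        }
        if (c.name == null) {
            c.name = DEFAULT_NAME;
        }
        if (c.type == null) {
            c.type = DEFAULT_TYPE;
        } else if (c.type.length() > TYPE_MAX_LENGTH) {
            c.type = c.type.substring(0, TYPE_MAX_LENGTH);
        }
    }

    public static void main(String[] args) {
        Context c = new Context();
        c.type = "A_VERY_LONG_APPLICATION_TYPE_NAME";
        applyDefaults(c);
        System.out.println(c.queue + " / " + c.name + " / " + c.type);
    }
}
```

Nothing is rejected at this step: a missing queue, name or type is silently replaced, and an overlong type is cut rather than refused.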

ClientRMService.submitApplication(...) ----> calls RMAppManager.submitApplication(...)

  1 /**
  2  * This class manages the list of applications for the resource manager. 
  3  */
  4 public class RMAppManager implements EventHandler<RMAppManagerEvent>, 
  5                                         Recoverable {
  6 
  7   private static final Log LOG = LogFactory.getLog(RMAppManager.class);
  8 
  9   private int maxCompletedAppsInMemory;
 10   private int maxCompletedAppsInStateStore;
 11   protected int completedAppsInStateStore = 0;
 12   private LinkedList<ApplicationId> completedApps = new LinkedList<ApplicationId>();
 13 
 14   private final RMContext rmContext;
 15   private final ApplicationMasterService masterService;
 16   private final YarnScheduler scheduler;
 17   private final ApplicationACLsManager applicationACLsManager;
 18   private Configuration conf;
 19 
 20   public RMAppManager(RMContext context,
 21       YarnScheduler scheduler, ApplicationMasterService masterService,
 22       ApplicationACLsManager applicationACLsManager, Configuration conf) {
 23     this.rmContext = context;
 24     this.scheduler = scheduler;
 25     this.masterService = masterService;
 26     this.applicationACLsManager = applicationACLsManager;
 27     this.conf = conf;
 28     this.maxCompletedAppsInMemory = conf.getInt(
 29         YarnConfiguration.RM_MAX_COMPLETED_APPLICATIONS,
 30         YarnConfiguration.DEFAULT_RM_MAX_COMPLETED_APPLICATIONS);
 31     this.maxCompletedAppsInStateStore =
 32         conf.getInt(
 33           YarnConfiguration.RM_STATE_STORE_MAX_COMPLETED_APPLICATIONS,
 34           YarnConfiguration.DEFAULT_RM_STATE_STORE_MAX_COMPLETED_APPLICATIONS);
 35     if (this.maxCompletedAppsInStateStore > this.maxCompletedAppsInMemory) {
 36       this.maxCompletedAppsInStateStore = this.maxCompletedAppsInMemory;
 37     }
 38   }
 39 
 40   @SuppressWarnings("unchecked")
 41   protected void submitApplication(
 42       ApplicationSubmissionContext submissionContext, long submitTime,
 43       String user) throws YarnException {
 44     ApplicationId applicationId = submissionContext.getApplicationId();
 45 
 46     // Create the RMAppImpl object; this builds the RMApp state machine (initial state NEW), but no RMAppEvent has been handled yet
 47     RMAppImpl application =
 48         createAndPopulateNewRMApp(submissionContext, submitTime, user, false);
 49     ApplicationId appId = submissionContext.getApplicationId();
 50 
 51     // This branch is taken only when security (e.g. Kerberos) is enabled; to keep the walkthrough simple we follow the else branch
 52     if (UserGroupInformation.isSecurityEnabled()) {
 53       try {
 54         this.rmContext.getDelegationTokenRenewer().addApplicationAsync(appId,
 55             parseCredentials(submissionContext),
 56             submissionContext.getCancelTokensWhenComplete(),
 57             application.getUser());
 58       } catch (Exception e) {
 59         LOG.warn("Unable to parse credentials.", e);
 60         // Sending APP_REJECTED is fine, since we assume that the
 61         // RMApp is in NEW state and thus we haven't yet informed the
 62         // scheduler about the existence of the application
 63         assert application.getState() == RMAppState.NEW;
 64         this.rmContext.getDispatcher().getEventHandler()
 65           .handle(new RMAppEvent(applicationId,
 66               RMAppEventType.APP_REJECTED, e.getMessage()));
 67         throw RPCUtil.getRemoteException(e);
 68       }
 69     } else {
 70       // Dispatcher is not yet started at this time, so these START events
 71       // enqueued should be guaranteed to be first processed when dispatcher
 72       // gets started.
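 73       // Kick off the RMApp state machine. rmContext is the ResourceManager's own context object (RMContextImpl), not a client proxy;
 74       // this call hands an RMAppEventType.START event to the RM-side dispatcher for processing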
 75       this.rmContext.getDispatcher().getEventHandler()
 76         .handle(new RMAppEvent(applicationId, RMAppEventType.START));
 77     }
 78   }
 79 
 80   private RMAppImpl createAndPopulateNewRMApp(
 81       ApplicationSubmissionContext submissionContext, long submitTime,
 82       String user, boolean isRecovery) throws YarnException {
 83     ApplicationId applicationId = submissionContext.getApplicationId();
 84     ResourceRequest amReq =
 85         validateAndCreateResourceRequest(submissionContext, isRecovery);
 86 
 87     // Create RMApp
 88     // The RMApp is created here
 89     RMAppImpl application =
 90         new RMAppImpl(applicationId, rmContext, this.conf,
 91             submissionContext.getApplicationName(), user,
 92             submissionContext.getQueue(),
 93             submissionContext, this.scheduler, this.masterService,
 94             submitTime, submissionContext.getApplicationType(),
 95             submissionContext.getApplicationTags(), amReq);
 96 
 97     // Concurrent app submissions with same applicationId will fail here
 98     // Concurrent app submissions with different applicationIds will not
 99     // influence each other
100     if (rmContext.getRMApps().putIfAbsent(applicationId, application) !=
101         null) {
102       String message = "Application with id " + applicationId
103           + " is already present! Cannot add a duplicate!";
104       LOG.warn(message);
105       throw new YarnException(message);
106     }
107     // Inform the ACLs Manager
108     this.applicationACLsManager.addApplication(applicationId,
109         submissionContext.getAMContainerSpec().getApplicationACLs());
110     String appViewACLs = submissionContext.getAMContainerSpec()
111         .getApplicationACLs().get(ApplicationAccessType.VIEW_APP);
112     rmContext.getSystemMetricsPublisher().appACLsUpdated(
113         application, appViewACLs, System.currentTimeMillis());
114     return application;
115   }
116 
117 }
RMAppManager.java

RMAppManager.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java

Difficulty 3    After RMAppManager the trail went cold; I only got past this point after reading a lot of material:

Reference 1    Reference 2   

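  In fact, RMAppManager.submitApplication() first calls createAndPopulateNewRMApp() in the same class. That method builds an app (an RMApp), puts it into rmContext.getRMApps(), and registers its ACLs with the ApplicationACLsManager. At the end, submitApplication() calls this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)), which triggers the app start event: an RMAppEvent whose type is the enum value RMAppEventType.START is added to the asynchronous dispatcher, and the RM has registered internally which handler processes events of this type. Here this.rmContext = RMContextImpl, this.rmContext.getDispatcher() = AsyncDispatcher, and this.rmContext.getDispatcher().getEventHandler() = AsyncDispatcher$GenericEventHandler.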
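
The two ideas in this paragraph (handle() only enqueues an event, and the app reacts through a transition table) can be sketched together. This is not the Hadoop implementation: every name below is a hypothetical stand-in, the table holds only two transitions, and drain() delivers queued events on the calling thread, whereas the real AsyncDispatcher delivers them from its own thread.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of "enqueue an event, then let the state machine react".
final class DispatcherSketch {
    enum AppState { NEW, NEW_SAVING, SUBMITTED }
    enum AppEventType { START, APP_NEW_SAVED }

    /** Plays the role of RMAppImpl: a current state plus a transition table. */
    static final class App {
        private static final Map<AppState, Map<AppEventType, AppState>> TRANSITIONS =
            new EnumMap<AppState, Map<AppEventType, AppState>>(AppState.class);
        static {
            addTransition(AppState.NEW, AppEventType.START, AppState.NEW_SAVING);
            addTransition(AppState.NEW_SAVING, AppEventType.APP_NEW_SAVED, AppState.SUBMITTED);
        }

        private static void addTransition(AppState from, AppEventType on, AppState to) {
            Map<AppEventType, AppState> row = TRANSITIONS.get(from);
            if (row == null) {
                row = new EnumMap<AppEventType, AppState>(AppEventType.class);
                TRANSITIONS.put(from, row);
            }
            row.put(on, to);
        }

        private AppState state = AppState.NEW;

        synchronized void handle(AppEventType event) {
            Map<AppEventType, AppState> row = TRANSITIONS.get(state);
            AppState next = (row == null) ? null : row.get(event);
            if (next == null) {
                throw new IllegalStateException("Invalid event " + event + " in state " + state);
            }
            state = next;
        }

        synchronized AppState getState() {
            return state;
        }
    }

    /** Plays the role of AsyncDispatcher together with its GenericEventHandler. */
    static final class Dispatcher {
        private final BlockingQueue<AppEventType> queue = new LinkedBlockingQueue<AppEventType>();
        private final App target;

        Dispatcher(App target) {
            this.target = target;
        }

        /** Only enqueues; the caller returns immediately. */
        void handle(AppEventType event) {
            queue.add(event);
        }

        /** Delivers all queued events on the calling thread (a simplification). */
        void drain() {
            AppEventType event;
            while ((event = queue.poll()) != null) {
                target.handle(event);
            }
        }
    }

    public static void main(String[] args) {
        App app = new App();
        Dispatcher dispatcher = new Dispatcher(app);
        dispatcher.handle(AppEventType.START);
        System.out.println(app.getState());   // still NEW: the event is only queued
        dispatcher.drain();
        System.out.println(app.getState());   // NEW_SAVING after delivery
    }
}
```

This is also why the client-side polling loop can still see NEW for a short while after submitApplication() returns on the RM side: the START event has been queued but not necessarily processed.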

  Let us first look at createAndPopulateNewRMApp() in the RMAppManager class. It calls the RMAppImpl constructor to create the RMApp, so the next stop is RMAppImpl.java, listed here in full.

   1 @SuppressWarnings({ "rawtypes", "unchecked" })
   2 public class RMAppImpl implements RMApp, Recoverable {
   3 
   4   private static final Log LOG = LogFactory.getLog(RMAppImpl.class);
   5   private static final String UNAVAILABLE = "N/A";
   6 
   7   // Immutable fields
   8   private final ApplicationId applicationId;
   9   private final RMContext rmContext;
  10   private final Configuration conf;
  11   private final String user;
  12   private final String name;
  13   private final ApplicationSubmissionContext submissionContext;
  14   private final Dispatcher dispatcher;
  15   private final YarnScheduler scheduler;
  16   private final ApplicationMasterService masterService;
  17   private final StringBuilder diagnostics = new StringBuilder();
  18   private final int maxAppAttempts;
  19   private final ReadLock readLock;
  20   private final WriteLock writeLock;
  21   private final Map<ApplicationAttemptId, RMAppAttempt> attempts
  22       = new LinkedHashMap<ApplicationAttemptId, RMAppAttempt>();
  23   private final long submitTime;
  24   private final Set<RMNode> updatedNodes = new HashSet<RMNode>();
  25   private final String applicationType;
  26   private final Set<String> applicationTags;
  27 
  28   private final long attemptFailuresValidityInterval;
  29 
  30   private Clock systemClock;
  31 
  32   private boolean isNumAttemptsBeyondThreshold = false;
  33 
  34   // Mutable fields
  35   private long startTime;
  36   private long finishTime = 0;
  37   private long storedFinishTime = 0;
  38   // This field isn't protected by readlock now.
  39   private volatile RMAppAttempt currentAttempt;
  40   private String queue;
  41   private EventHandler handler;
  42   private static final AppFinishedTransition FINISHED_TRANSITION =
  43       new AppFinishedTransition();
  44   private Set<NodeId> ranNodes = new ConcurrentSkipListSet<NodeId>();
  45 
  46   // These states stored are only valid when app is at killing or final_saving.
  47   private RMAppState stateBeforeKilling;
  48   private RMAppState stateBeforeFinalSaving;
  49   private RMAppEvent eventCausingFinalSaving;
  50   private RMAppState targetedFinalState;
  51   private RMAppState recoveredFinalState;
  52   private ResourceRequest amReq;
  53 
  54   Object transitionTodo;
  55 
  56   private static final StateMachineFactory<RMAppImpl,
  57                                            RMAppState,
  58                                            RMAppEventType,
  59                                            RMAppEvent> stateMachineFactory
  60                                = new StateMachineFactory<RMAppImpl,
  61                                            RMAppState,
  62                                            RMAppEventType,
  63                                            RMAppEvent>(RMAppState.NEW)
  64 
  65 
  66      // Transitions from NEW state
  67     .addTransition(RMAppState.NEW, RMAppState.NEW,
  68         RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
  69     .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
  70         RMAppEventType.START, new RMAppNewlySavingTransition())
  71     .addTransition(RMAppState.NEW, EnumSet.of(RMAppState.SUBMITTED,
  72             RMAppState.ACCEPTED, RMAppState.FINISHED, RMAppState.FAILED,
  73             RMAppState.KILLED, RMAppState.FINAL_SAVING),
  74         RMAppEventType.RECOVER, new RMAppRecoveredTransition())
  75     .addTransition(RMAppState.NEW, RMAppState.KILLED, RMAppEventType.KILL,
  76         new AppKilledTransition())
  77     .addTransition(RMAppState.NEW, RMAppState.FINAL_SAVING,
  78         RMAppEventType.APP_REJECTED,
  79         new FinalSavingTransition(new AppRejectedTransition(),
  80           RMAppState.FAILED))
  81 
  82     // Transitions from NEW_SAVING state
  83     .addTransition(RMAppState.NEW_SAVING, RMAppState.NEW_SAVING,
  84         RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
  85     .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED,
  86         RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition())
  87     .addTransition(RMAppState.NEW_SAVING, RMAppState.FINAL_SAVING,
  88         RMAppEventType.KILL,
  89         new FinalSavingTransition(
  90           new AppKilledTransition(), RMAppState.KILLED))
  91     .addTransition(RMAppState.NEW_SAVING, RMAppState.FINAL_SAVING,
  92         RMAppEventType.APP_REJECTED,
  93           new FinalSavingTransition(new AppRejectedTransition(),
  94             RMAppState.FAILED))
  95     .addTransition(RMAppState.NEW_SAVING, RMAppState.NEW_SAVING,
  96         RMAppEventType.MOVE, new RMAppMoveTransition())
  97 
  98      // Transitions from SUBMITTED state
  99     .addTransition(RMAppState.SUBMITTED, RMAppState.SUBMITTED,
 100         RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
 101     .addTransition(RMAppState.SUBMITTED, RMAppState.SUBMITTED,
 102         RMAppEventType.MOVE, new RMAppMoveTransition())
 103     .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
 104         RMAppEventType.APP_REJECTED,
 105         new FinalSavingTransition(
 106           new AppRejectedTransition(), RMAppState.FAILED))
 107     .addTransition(RMAppState.SUBMITTED, RMAppState.ACCEPTED,
 108         RMAppEventType.APP_ACCEPTED, new StartAppAttemptTransition())
 109     .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
 110         RMAppEventType.KILL,
 111         new FinalSavingTransition(
 112           new AppKilledTransition(), RMAppState.KILLED))
 113 
 114      // Transitions from ACCEPTED state
 115     .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED,
 116         RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
 117     .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED,
 118         RMAppEventType.MOVE, new RMAppMoveTransition())
 119     .addTransition(RMAppState.ACCEPTED, RMAppState.RUNNING,
 120         RMAppEventType.ATTEMPT_REGISTERED)
 121     .addTransition(RMAppState.ACCEPTED,
 122         EnumSet.of(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING),
 123         // ACCEPTED state is possible to receive ATTEMPT_FAILED/ATTEMPT_FINISHED
 124         // event because RMAppRecoveredTransition is returning ACCEPTED state
 125         // directly and waiting for the previous AM to exit.
 126         RMAppEventType.ATTEMPT_FAILED,
 127         new AttemptFailedTransition(RMAppState.ACCEPTED))
 128     .addTransition(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING,
 129         RMAppEventType.ATTEMPT_FINISHED,
 130         new FinalSavingTransition(FINISHED_TRANSITION, RMAppState.FINISHED))
 131     .addTransition(RMAppState.ACCEPTED, RMAppState.KILLING,
 132         RMAppEventType.KILL, new KillAttemptTransition())
 133     .addTransition(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING,
 134         RMAppEventType.ATTEMPT_KILLED,
 135         new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED))
 136     .addTransition(RMAppState.ACCEPTED, RMAppState.ACCEPTED, 
 137         RMAppEventType.APP_RUNNING_ON_NODE,
 138         new AppRunningOnNodeTransition())
 139 
 140      // Transitions from RUNNING state
 141     .addTransition(RMAppState.RUNNING, RMAppState.RUNNING,
 142         RMAppEventType.NODE_UPDATE, new RMAppNodeUpdateTransition())
 143     .addTransition(RMAppState.RUNNING, RMAppState.RUNNING,
 144         RMAppEventType.MOVE, new RMAppMoveTransition())
 145     .addTransition(RMAppState.RUNNING, RMAppState.FINAL_SAVING,
 146         RMAppEventType.ATTEMPT_UNREGISTERED,
 147         new FinalSavingTransition(
 148           new AttemptUnregisteredTransition(),
 149           RMAppState.FINISHING, RMAppState.FINISHED))
 150     .addTransition(RMAppState.RUNNING, RMAppState.FINISHED,
 151       // UnManagedAM directly jumps to finished
 152         RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
 153     .addTransition(RMAppState.RUNNING, RMAppState.RUNNING, 
 154         RMAppEventType.APP_RUNNING_ON_NODE,
 155         new AppRunningOnNodeTransition())
 156     .addTransition(RMAppState.RUNNING,
 157         EnumSet.of(RMAppState.ACCEPTED, RMAppState.FINAL_SAVING),
 158         RMAppEventType.ATTEMPT_FAILED,
 159         new AttemptFailedTransition(RMAppState.ACCEPTED))
 160     .addTransition(RMAppState.RUNNING, RMAppState.KILLING,
 161         RMAppEventType.KILL, new KillAttemptTransition())
 162 
 163      // Transitions from FINAL_SAVING state
 164     .addTransition(RMAppState.FINAL_SAVING,
 165       EnumSet.of(RMAppState.FINISHING, RMAppState.FAILED,
 166         RMAppState.KILLED, RMAppState.FINISHED), RMAppEventType.APP_UPDATE_SAVED,
 167         new FinalStateSavedTransition())
 168     .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING,
 169         RMAppEventType.ATTEMPT_FINISHED,
 170         new AttemptFinishedAtFinalSavingTransition())
 171     .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING, 
 172         RMAppEventType.APP_RUNNING_ON_NODE,
 173         new AppRunningOnNodeTransition())
 174     // ignorable transitions
 175     .addTransition(RMAppState.FINAL_SAVING, RMAppState.FINAL_SAVING,
 176         EnumSet.of(RMAppEventType.NODE_UPDATE, RMAppEventType.KILL,
 177           RMAppEventType.APP_NEW_SAVED, RMAppEventType.MOVE))
 178 
 179      // Transitions from FINISHING state
 180     .addTransition(RMAppState.FINISHING, RMAppState.FINISHED,
 181         RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
 182     .addTransition(RMAppState.FINISHING, RMAppState.FINISHING, 
 183         RMAppEventType.APP_RUNNING_ON_NODE,
 184         new AppRunningOnNodeTransition())
 185     // ignorable transitions
 186     .addTransition(RMAppState.FINISHING, RMAppState.FINISHING,
 187       EnumSet.of(RMAppEventType.NODE_UPDATE,
 188         // ignore Kill/Move as we have already saved the final Finished state
 189         // in state store.
 190         RMAppEventType.KILL, RMAppEventType.MOVE))
 191 
 192      // Transitions from KILLING state
 193     .addTransition(RMAppState.KILLING, RMAppState.KILLING, 
 194         RMAppEventType.APP_RUNNING_ON_NODE,
 195         new AppRunningOnNodeTransition())
 196     .addTransition(RMAppState.KILLING, RMAppState.FINAL_SAVING,
 197         RMAppEventType.ATTEMPT_KILLED,
 198         new FinalSavingTransition(
 199           new AppKilledTransition(), RMAppState.KILLED))
 200     .addTransition(RMAppState.KILLING, RMAppState.FINAL_SAVING,
 201         RMAppEventType.ATTEMPT_UNREGISTERED,
 202         new FinalSavingTransition(
 203           new AttemptUnregisteredTransition(),
 204           RMAppState.FINISHING, RMAppState.FINISHED))
 205     .addTransition(RMAppState.KILLING, RMAppState.FINISHED,
 206       // UnManagedAM directly jumps to finished
 207         RMAppEventType.ATTEMPT_FINISHED, FINISHED_TRANSITION)
 208     .addTransition(RMAppState.KILLING,
 209         EnumSet.of(RMAppState.FINAL_SAVING),
 210         RMAppEventType.ATTEMPT_FAILED,
 211         new AttemptFailedTransition(RMAppState.KILLING))
 212 
 213     .addTransition(RMAppState.KILLING, RMAppState.KILLING,
 214         EnumSet.of(
 215             RMAppEventType.NODE_UPDATE,
 216             RMAppEventType.ATTEMPT_REGISTERED,
 217             RMAppEventType.APP_UPDATE_SAVED,
 218             RMAppEventType.KILL, RMAppEventType.MOVE))
 219 
 220      // Transitions from FINISHED state
 221      // ignorable transitions
 222     .addTransition(RMAppState.FINISHED, RMAppState.FINISHED, 
 223         RMAppEventType.APP_RUNNING_ON_NODE,
 224         new AppRunningOnNodeTransition())
 225     .addTransition(RMAppState.FINISHED, RMAppState.FINISHED,
 226         EnumSet.of(
 227             RMAppEventType.NODE_UPDATE,
 228             RMAppEventType.ATTEMPT_UNREGISTERED,
 229             RMAppEventType.ATTEMPT_FINISHED,
 230             RMAppEventType.KILL, RMAppEventType.MOVE))
 231 
 232      // Transitions from FAILED state
 233      // ignorable transitions
 234     .addTransition(RMAppState.FAILED, RMAppState.FAILED, 
 235         RMAppEventType.APP_RUNNING_ON_NODE,
 236         new AppRunningOnNodeTransition())
 237     .addTransition(RMAppState.FAILED, RMAppState.FAILED,
 238         EnumSet.of(RMAppEventType.KILL, RMAppEventType.NODE_UPDATE,
 239             RMAppEventType.MOVE))
 240 
 241      // Transitions from KILLED state
 242      // ignorable transitions
 243     .addTransition(RMAppState.KILLED, RMAppState.KILLED, 
 244         RMAppEventType.APP_RUNNING_ON_NODE,
 245         new AppRunningOnNodeTransition())
 246     .addTransition(
 247         RMAppState.KILLED,
 248         RMAppState.KILLED,
 249         EnumSet.of(RMAppEventType.APP_ACCEPTED,
 250             RMAppEventType.APP_REJECTED, RMAppEventType.KILL,
 251             RMAppEventType.ATTEMPT_FINISHED, RMAppEventType.ATTEMPT_FAILED,
 252             RMAppEventType.NODE_UPDATE, RMAppEventType.MOVE))
 253 
 254      .installTopology();
 255 
 256   private final StateMachine<RMAppState, RMAppEventType, RMAppEvent>
 257                                                                  stateMachine;
 258 
 259   private static final int DUMMY_APPLICATION_ATTEMPT_NUMBER = -1;
 260   
 261   public RMAppImpl(ApplicationId applicationId, RMContext rmContext,
 262       Configuration config, String name, String user, String queue,
 263       ApplicationSubmissionContext submissionContext, YarnScheduler scheduler,
 264       ApplicationMasterService masterService, long submitTime,
 265       String applicationType, Set<String> applicationTags, 
 266       ResourceRequest amReq) {
 267 
 268     this.systemClock = new SystemClock();
 269 
 270     this.applicationId = applicationId;
 271     this.name = name;
 272     this.rmContext = rmContext;
 273     this.dispatcher = rmContext.getDispatcher();
 274     this.handler = dispatcher.getEventHandler();
 275     this.conf = config;
 276     this.user = user;
 277     this.queue = queue;
 278     this.submissionContext = submissionContext;
 279     this.scheduler = scheduler;
 280     this.masterService = masterService;
 281     this.submitTime = submitTime;
 282     this.startTime = this.systemClock.getTime();
 283     this.applicationType = applicationType;
 284     this.applicationTags = applicationTags;
 285     this.amReq = amReq;
 286 
 287     int globalMaxAppAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
 288         YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
 289     int individualMaxAppAttempts = submissionContext.getMaxAppAttempts();
 290     if (individualMaxAppAttempts <= 0 ||
 291         individualMaxAppAttempts > globalMaxAppAttempts) {
 292       this.maxAppAttempts = globalMaxAppAttempts;
 293       LOG.warn("The specific max attempts: " + individualMaxAppAttempts
 294           + " for application: " + applicationId.getId()
 295           + " is invalid, because it is out of the range [1, "
 296           + globalMaxAppAttempts + "]. Use the global max attempts instead.");
 297     } else {
 298       this.maxAppAttempts = individualMaxAppAttempts;
 299     }
 300 
 301     this.attemptFailuresValidityInterval =
 302         submissionContext.getAttemptFailuresValidityInterval();
 303     if (this.attemptFailuresValidityInterval > 0) {
 304       LOG.info("The attemptFailuresValidityInterval for the application: "
 305           + this.applicationId + " is " + this.attemptFailuresValidityInterval
 306           + ".");
 307     }
 308 
 309     ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
 310     this.readLock = lock.readLock();
 311     this.writeLock = lock.writeLock();
 312 
 313     this.stateMachine = stateMachineFactory.make(this);
 314 
 315     rmContext.getRMApplicationHistoryWriter().applicationStarted(this);
 316     rmContext.getSystemMetricsPublisher().appCreated(this, startTime);
 317   }
 318 
 319   @Override
 320   public ApplicationId getApplicationId() {
 321     return this.applicationId;
 322   }
 323   
 324   @Override
 325   public ApplicationSubmissionContext getApplicationSubmissionContext() {
 326     return this.submissionContext;
 327   }
 328 
 329   @Override
 330   public FinalApplicationStatus getFinalApplicationStatus() {
 331     // finish state is obtained based on the state machine's current state
 332     // as a fall-back in case the application has not been unregistered
 333     // ( or if the app never unregistered itself )
 334     // when the report is requested
 335     if (currentAttempt != null
 336         && currentAttempt.getFinalApplicationStatus() != null) {
 337       return currentAttempt.getFinalApplicationStatus();
 338     }
 339     return createFinalApplicationStatus(this.stateMachine.getCurrentState());
 340   }
 341 
 342   @Override
 343   public RMAppState getState() {
 344     this.readLock.lock();
 345     try {
 346         return this.stateMachine.getCurrentState();
 347     } finally {
 348       this.readLock.unlock();
 349     }
 350   }
 351 
 352   @Override
 353   public String getUser() {
 354     return this.user;
 355   }
 356 
 357   @Override
 358   public float getProgress() {
 359     RMAppAttempt attempt = this.currentAttempt;
 360     if (attempt != null) {
 361       return attempt.getProgress();
 362     }
 363     return 0;
 364   }
 365 
 366   @Override
 367   public RMAppAttempt getRMAppAttempt(ApplicationAttemptId appAttemptId) {
 368     this.readLock.lock();
 369 
 370     try {
 371       return this.attempts.get(appAttemptId);
 372     } finally {
 373       this.readLock.unlock();
 374     }
 375   }
 376 
 377   @Override
 378   public String getQueue() {
 379     return this.queue;
 380   }
 381   
 382   @Override
 383   public void setQueue(String queue) {
 384     this.queue = queue;
 385   }
 386 
 387   @Override
 388   public String getName() {
 389     return this.name;
 390   }
 391 
 392   @Override
 393   public RMAppAttempt getCurrentAppAttempt() {
 394     return this.currentAttempt;
 395   }
 396 
 397   @Override
 398   public Map<ApplicationAttemptId, RMAppAttempt> getAppAttempts() {
 399     this.readLock.lock();
 400 
 401     try {
 402       return Collections.unmodifiableMap(this.attempts);
 403     } finally {
 404       this.readLock.unlock();
 405     }
 406   }
 407 
 408   private FinalApplicationStatus createFinalApplicationStatus(RMAppState state) {
 409     switch(state) {
 410     case NEW:
 411     case NEW_SAVING:
 412     case SUBMITTED:
 413     case ACCEPTED:
 414     case RUNNING:
 415     case FINAL_SAVING:
 416     case KILLING:
 417       return FinalApplicationStatus.UNDEFINED;    
 418     // finished without a proper final state is the same as failed  
 419     case FINISHING:
 420     case FINISHED:
 421     case FAILED:
 422       return FinalApplicationStatus.FAILED;
 423     case KILLED:
 424       return FinalApplicationStatus.KILLED;
 425     }
 426     throw new YarnRuntimeException("Unknown state passed!");
 427   }
 428 
 429   @Override
 430   public int pullRMNodeUpdates(Collection<RMNode> updatedNodes) {
 431     this.writeLock.lock();
 432     try {
 433       int updatedNodeCount = this.updatedNodes.size();
 434       updatedNodes.addAll(this.updatedNodes);
 435       this.updatedNodes.clear();
 436       return updatedNodeCount;
 437     } finally {
 438       this.writeLock.unlock();
 439     }
 440   }
 441   
 442   @Override
 443   public ApplicationReport createAndGetApplicationReport(String clientUserName,
 444       boolean allowAccess) {
 445     this.readLock.lock();
 446 
 447     try {
 448       ApplicationAttemptId currentApplicationAttemptId = null;
 449       org.apache.hadoop.yarn.api.records.Token clientToAMToken = null;
 450       String trackingUrl = UNAVAILABLE;
 451       String host = UNAVAILABLE;
 452       String origTrackingUrl = UNAVAILABLE;
 453       int rpcPort = -1;
 454       ApplicationResourceUsageReport appUsageReport =
 455           RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
 456       FinalApplicationStatus finishState = getFinalApplicationStatus();
 457       String diags = UNAVAILABLE;
 458       float progress = 0.0f;
 459       org.apache.hadoop.yarn.api.records.Token amrmToken = null;
 460       if (allowAccess) {
 461         trackingUrl = getDefaultProxyTrackingUrl();
 462         if (this.currentAttempt != null) {
 463           currentApplicationAttemptId = this.currentAttempt.getAppAttemptId();
 464           trackingUrl = this.currentAttempt.getTrackingUrl();
 465           origTrackingUrl = this.currentAttempt.getOriginalTrackingUrl();
 466           if (UserGroupInformation.isSecurityEnabled()) {
 467             // get a token so the client can communicate with the app attempt
 468             // NOTE: token may be unavailable if the attempt is not running
 469             Token<ClientToAMTokenIdentifier> attemptClientToAMToken =
 470                 this.currentAttempt.createClientToken(clientUserName);
 471             if (attemptClientToAMToken != null) {
 472               clientToAMToken = BuilderUtils.newClientToAMToken(
 473                   attemptClientToAMToken.getIdentifier(),
 474                   attemptClientToAMToken.getKind().toString(),
 475                   attemptClientToAMToken.getPassword(),
 476                   attemptClientToAMToken.getService().toString());
 477             }
 478           }
 479           host = this.currentAttempt.getHost();
 480           rpcPort = this.currentAttempt.getRpcPort();
 481           appUsageReport = currentAttempt.getApplicationResourceUsageReport();
 482           progress = currentAttempt.getProgress();
 483         }
 484         diags = this.diagnostics.toString();
 485 
 486         if (currentAttempt != null && 
 487             currentAttempt.getAppAttemptState() == RMAppAttemptState.LAUNCHED) {
 488           if (getApplicationSubmissionContext().getUnmanagedAM() &&
 489               clientUserName != null && getUser().equals(clientUserName)) {
 490             Token<AMRMTokenIdentifier> token = currentAttempt.getAMRMToken();
 491             if (token != null) {
 492               amrmToken = BuilderUtils.newAMRMToken(token.getIdentifier(),
 493                   token.getKind().toString(), token.getPassword(),
 494                   token.getService().toString());
 495             }
 496           }
 497         }
 498 
 499         RMAppMetrics rmAppMetrics = getRMAppMetrics();
 500         appUsageReport.setMemorySeconds(rmAppMetrics.getMemorySeconds());
 501         appUsageReport.setVcoreSeconds(rmAppMetrics.getVcoreSeconds());
 502       }
 503 
 504       if (currentApplicationAttemptId == null) {
 505         currentApplicationAttemptId = 
 506             BuilderUtils.newApplicationAttemptId(this.applicationId, 
 507                 DUMMY_APPLICATION_ATTEMPT_NUMBER);
 508       }
 509 
 510       return BuilderUtils.newApplicationReport(this.applicationId,
 511           currentApplicationAttemptId, this.user, this.queue,
 512           this.name, host, rpcPort, clientToAMToken,
 513           createApplicationState(), diags,
 514           trackingUrl, this.startTime, this.finishTime, finishState,
 515           appUsageReport, origTrackingUrl, progress, this.applicationType, 
 516           amrmToken, applicationTags);
 517     } finally {
 518       this.readLock.unlock();
 519     }
 520   }
 521 
 522   private String getDefaultProxyTrackingUrl() {
 523     try {
 524       final String scheme = WebAppUtils.getHttpSchemePrefix(conf);
 525       String proxy = WebAppUtils.getProxyHostAndPort(conf);
 526       URI proxyUri = ProxyUriUtils.getUriFromAMUrl(scheme, proxy);
 527       URI result = ProxyUriUtils.getProxyUri(null, proxyUri, applicationId);
 528       return result.toASCIIString();
 529     } catch (URISyntaxException e) {
 530       LOG.warn("Could not generate default proxy tracking URL for "
 531           + applicationId);
 532       return UNAVAILABLE;
 533     }
 534   }
 535 
 536   @Override
 537   public long getFinishTime() {
 538     this.readLock.lock();
 539 
 540     try {
 541       return this.finishTime;
 542     } finally {
 543       this.readLock.unlock();
 544     }
 545   }
 546 
 547   @Override
 548   public long getStartTime() {
 549     this.readLock.lock();
 550 
 551     try {
 552       return this.startTime;
 553     } finally {
 554       this.readLock.unlock();
 555     }
 556   }
 557 
 558   @Override
 559   public long getSubmitTime() {
 560     return this.submitTime;
 561   }
 562 
 563   @Override
 564   public String getTrackingUrl() {
 565     RMAppAttempt attempt = this.currentAttempt;
 566     if (attempt != null) {
 567       return attempt.getTrackingUrl();
 568     }
 569     return null;
 570   }
 571 
 572   @Override
 573   public String getOriginalTrackingUrl() {
 574     RMAppAttempt attempt = this.currentAttempt;
 575     if (attempt != null) {
 576       return attempt.getOriginalTrackingUrl();
 577     }
 578     return null;
 579   }
 580 
 581   @Override
 582   public StringBuilder getDiagnostics() {
 583     this.readLock.lock();
 584 
 585     try {
 586       return this.diagnostics;
 587     } finally {
 588       this.readLock.unlock();
 589     }
 590   }
 591 
 592   @Override
 593   public int getMaxAppAttempts() {
 594     return this.maxAppAttempts;
 595   }
 596 
 597   @Override
 598   public void handle(RMAppEvent event) {
 599 
 600     this.writeLock.lock();
 601 
 602     try {
 603       ApplicationId appID = event.getApplicationId();
 604       LOG.debug("Processing event for " + appID + " of type "
 605           + event.getType());
 606       final RMAppState oldState = getState();
 607       try {
 608         /* keep the master in sync with the state machine */
 609         this.stateMachine.doTransition(event.getType(), event);
 610       } catch (InvalidStateTransitonException e) {
 611         LOG.error("Can't handle this event at current state", e);
 612         /* TODO fail the application on the failed transition */
 613       }
 614 
 615       if (oldState != getState()) {
 616         LOG.info(appID + " State change from " + oldState + " to "
 617             + getState());
 618       }
 619     } finally {
 620       this.writeLock.unlock();
 621     }
 622   }
 623 
 624   @Override
 625   public void recover(RMState state) {
 626     ApplicationStateData appState =
 627         state.getApplicationState().get(getApplicationId());
 628     this.recoveredFinalState = appState.getState();
 629     LOG.info("Recovering app: " + getApplicationId() + " with "
 630         + appState.getAttemptCount() + " attempts and final state = "
 631         + this.recoveredFinalState);
 632     this.diagnostics.append(appState.getDiagnostics());
 633     this.storedFinishTime = appState.getFinishTime();
 634     this.startTime = appState.getStartTime();
 635 
 636     for(int i=0; i<appState.getAttemptCount(); ++i) {
 637       // create attempt
 638       createNewAttempt();
 639       ((RMAppAttemptImpl)this.currentAttempt).recover(state);
 640     }
 641   }
 642 
 643   private void createNewAttempt() {
 644     ApplicationAttemptId appAttemptId =
 645         ApplicationAttemptId.newInstance(applicationId, attempts.size() + 1);
 646     RMAppAttempt attempt =
 647         new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
 648           submissionContext, conf,
 649           // The newly created attempt may be the last attempt if (number of
 650           // previously failed attempts (which should not include preemption,
 651           // hardware errors and NM resync) + 1) equals the max-attempt
 652           // limit.
 653           maxAppAttempts == (getNumFailedAppAttempts() + 1), amReq);
 654     attempts.put(appAttemptId, attempt);
 655     currentAttempt = attempt;
 656   }
 657   
 658   private void
 659       createAndStartNewAttempt(boolean transferStateFromPreviousAttempt) {
 660     createNewAttempt();
 661     handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(),
 662       transferStateFromPreviousAttempt));
 663   }
 664 
 665   private void processNodeUpdate(RMAppNodeUpdateType type, RMNode node) {
 666     NodeState nodeState = node.getState();
 667     updatedNodes.add(node);
 668     LOG.debug("Received node update event:" + type + " for node:" + node
 669         + " with state:" + nodeState);
 670   }
 671 
 672   private static class RMAppTransition implements
 673       SingleArcTransition<RMAppImpl, RMAppEvent> {
 674     public void transition(RMAppImpl app, RMAppEvent event) {
 675     };
 676 
 677   }
 678 
 679   private static final class RMAppNodeUpdateTransition extends RMAppTransition {
 680     public void transition(RMAppImpl app, RMAppEvent event) {
 681       RMAppNodeUpdateEvent nodeUpdateEvent = (RMAppNodeUpdateEvent) event;
 682       app.processNodeUpdate(nodeUpdateEvent.getUpdateType(),
 683           nodeUpdateEvent.getNode());
 684     };
 685   }
 686   
 687   private static final class AppRunningOnNodeTransition extends RMAppTransition {
 688     public void transition(RMAppImpl app, RMAppEvent event) {
 689       RMAppRunningOnNodeEvent nodeAddedEvent = (RMAppRunningOnNodeEvent) event;
 690       
 691       // if final state already stored, notify RMNode
 692       if (isAppInFinalState(app)) {
 693         app.handler.handle(
 694             new RMNodeCleanAppEvent(nodeAddedEvent.getNodeId(), nodeAddedEvent
 695                 .getApplicationId()));
 696         return;
 697       }
 698       
 699       // otherwise, add it to ranNodes for further process
 700       app.ranNodes.add(nodeAddedEvent.getNodeId());
 701     };
 702   }
 703 
 704   /**
 705    * Move an app to a new queue.
 706    * This transition must set the result on the Future in the RMAppMoveEvent,
 707    * either as an exception for failure or null for success, or the client will
 708    * be left waiting forever.
 709    */
 710   private static final class RMAppMoveTransition extends RMAppTransition {
 711     public void transition(RMAppImpl app, RMAppEvent event) {
 712       RMAppMoveEvent moveEvent = (RMAppMoveEvent) event;
 713       try {
 714         app.queue = app.scheduler.moveApplication(app.applicationId,
 715             moveEvent.getTargetQueue());
 716       } catch (YarnException ex) {
 717         moveEvent.getResult().setException(ex);
 718         return;
 719       }
 720       
 721       // TODO: Write out change to state store (YARN-1558)
 722       // Also take care of RM failover
 723       moveEvent.getResult().set(null);
 724     }
 725   }
 726 
 727   // Recover attempts synchronously so that any incoming external events
 728   // are processed only after the attempt has processed the recover event.
 729   private void recoverAppAttempts() {
 730     for (RMAppAttempt attempt : getAppAttempts().values()) {
 731       attempt.handle(new RMAppAttemptEvent(attempt.getAppAttemptId(),
 732         RMAppAttemptEventType.RECOVER));
 733     }
 734   }
 735 
 736   private static final class RMAppRecoveredTransition implements
 737       MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> {
 738 
 739     @Override
 740     public RMAppState transition(RMAppImpl app, RMAppEvent event) {
 741 
 742       RMAppRecoverEvent recoverEvent = (RMAppRecoverEvent) event;
 743       app.recover(recoverEvent.getRMState());
 744       // The app has completed.
 745       if (app.recoveredFinalState != null) {
 746         app.recoverAppAttempts();
 747         new FinalTransition(app.recoveredFinalState).transition(app, event);
 748         return app.recoveredFinalState;
 749       }
 750 
 751       if (UserGroupInformation.isSecurityEnabled()) {
 752         // asynchronously renew delegation token on recovery.
 753         try {
 754           app.rmContext.getDelegationTokenRenewer()
 755               .addApplicationAsyncDuringRecovery(app.getApplicationId(),
 756                   app.parseCredentials(),
 757                   app.submissionContext.getCancelTokensWhenComplete(),
 758                   app.getUser());
 759         } catch (Exception e) {
 760           String msg = "Failed to fetch user credentials from application:"
 761               + e.getMessage();
 762           app.diagnostics.append(msg);
 763           LOG.error(msg, e);
 764         }
 765       }
 766 
 767       // No existent attempts means the attempt associated with this app was not
 768       // started or started but not yet saved.
 769       if (app.attempts.isEmpty()) {
 770         app.scheduler.handle(new AppAddedSchedulerEvent(app.applicationId,
 771           app.submissionContext.getQueue(), app.user,
 772           app.submissionContext.getReservationID()));
 773         return RMAppState.SUBMITTED;
 774       }
 775 
 776       // Add application to scheduler synchronously to guarantee scheduler
 777       // knows applications before AM or NM re-registers.
 778       app.scheduler.handle(new AppAddedSchedulerEvent(app.applicationId,
 779         app.submissionContext.getQueue(), app.user, true,
 780           app.submissionContext.getReservationID()));
 781 
 782       // recover attempts
 783       app.recoverAppAttempts();
 784 
 785       // Last attempt is in final state, return ACCEPTED waiting for last
 786       // RMAppAttempt to send finished or failed event back.
 787       if (app.currentAttempt != null
 788           && (app.currentAttempt.getState() == RMAppAttemptState.KILLED
 789               || app.currentAttempt.getState() == RMAppAttemptState.FINISHED
 790               || (app.currentAttempt.getState() == RMAppAttemptState.FAILED
 791                   && app.getNumFailedAppAttempts() == app.maxAppAttempts))) {
 792         return RMAppState.ACCEPTED;
 793       }
 794 
 795       // YARN-1507 is saving the application state after the application is
 796       // accepted. So after YARN-1507, an app is saved meaning it is accepted.
 797       // Thus we return ACCEPTED state on recovery.
 798       return RMAppState.ACCEPTED;
 799     }
 800   }
 801 
 802   private static final class AddApplicationToSchedulerTransition extends
 803       RMAppTransition {
 804     @Override
 805     public void transition(RMAppImpl app, RMAppEvent event) {
 806       app.handler.handle(new AppAddedSchedulerEvent(app.applicationId,
 807         app.submissionContext.getQueue(), app.user,
 808         app.submissionContext.getReservationID()));
 809     }
 810   }
 811 
 812   private static final class StartAppAttemptTransition extends RMAppTransition {
 813     @Override
 814     public void transition(RMAppImpl app, RMAppEvent event) {
 815       app.createAndStartNewAttempt(false);
 816     };
 817   }
 818 
 819   private static final class FinalStateSavedTransition implements
 820       MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> {
 821 
 822     @Override
 823     public RMAppState transition(RMAppImpl app, RMAppEvent event) {
 824       if (app.transitionTodo instanceof SingleArcTransition) {
 825         ((SingleArcTransition) app.transitionTodo).transition(app,
 826           app.eventCausingFinalSaving);
 827       } else if (app.transitionTodo instanceof MultipleArcTransition) {
 828         ((MultipleArcTransition) app.transitionTodo).transition(app,
 829           app.eventCausingFinalSaving);
 830       }
 831       return app.targetedFinalState;
 832 
 833     }
 834   }
 835 
 836   private static class AttemptFailedFinalStateSavedTransition extends
 837       RMAppTransition {
 838     @Override
 839     public void transition(RMAppImpl app, RMAppEvent event) {
 840       String msg = null;
 841       if (event instanceof RMAppFailedAttemptEvent) {
 842         msg = app.getAppAttemptFailedDiagnostics(event);
 843       }
 844       LOG.info(msg);
 845       app.diagnostics.append(msg);
 846       // Inform the node for app-finish
 847       new FinalTransition(RMAppState.FAILED).transition(app, event);
 848     }
 849   }
 850 
 851   private String getAppAttemptFailedDiagnostics(RMAppEvent event) {
 852     String msg = null;
 853     RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event;
 854     if (this.submissionContext.getUnmanagedAM()) {
 855       // RM does not manage the AM. Do not retry
 856       msg = "Unmanaged application " + this.getApplicationId()
 857               + " failed due to " + failedEvent.getDiagnosticMsg()
 858               + ". Failing the application.";
 859     } else if (this.isNumAttemptsBeyondThreshold) {
 860       msg = "Application " + this.getApplicationId() + " failed "
 861               + this.maxAppAttempts + " times due to "
 862               + failedEvent.getDiagnosticMsg() + ". Failing the application.";
 863     }
 864     return msg;
 865   }
 866 
 867   private static final class RMAppNewlySavingTransition extends RMAppTransition {
 868     @Override
 869     public void transition(RMAppImpl app, RMAppEvent event) {
 870 
 871       // If recovery is enabled then store the application information in a
 872       // non-blocking call so make sure that RM has stored the information
 873       // needed to restart the AM after RM restart without further client
 874       // communication
 875       LOG.info("Storing application with id " + app.applicationId);
 876       app.rmContext.getStateStore().storeNewApplication(app);
 877     }
 878   }
 879 
 880   private void rememberTargetTransitions(RMAppEvent event,
 881       Object transitionToDo, RMAppState targetFinalState) {
 882     transitionTodo = transitionToDo;
 883     targetedFinalState = targetFinalState;
 884     eventCausingFinalSaving = event;
 885   }
 886 
 887   private void rememberTargetTransitionsAndStoreState(RMAppEvent event,
 888       Object transitionToDo, RMAppState targetFinalState,
 889       RMAppState stateToBeStored) {
 890     rememberTargetTransitions(event, transitionToDo, targetFinalState);
 891     this.stateBeforeFinalSaving = getState();
 892     this.storedFinishTime = this.systemClock.getTime();
 893 
 894     LOG.info("Updating application " + this.applicationId
 895         + " with final state: " + this.targetedFinalState);
 896     // we lost attempt_finished diagnostics in app, because attempt_finished
 897     // diagnostics is sent after app final state is saved. Later on, we will
 898     // create GetApplicationAttemptReport specifically for getting per attempt
 899     // info.
 900     String diags = null;
 901     switch (event.getType()) {
 902     case APP_REJECTED:
 903     case ATTEMPT_FINISHED:
 904     case ATTEMPT_KILLED:
 905       diags = event.getDiagnosticMsg();
 906       break;
 907     case ATTEMPT_FAILED:
 908       RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event;
 909       diags = getAppAttemptFailedDiagnostics(failedEvent);
 910       break;
 911     default:
 912       break;
 913     }
 914     ApplicationStateData appState =
 915         ApplicationStateData.newInstance(this.submitTime, this.startTime,
 916             this.user, this.submissionContext,
 917             stateToBeStored, diags, this.storedFinishTime);
 918     this.rmContext.getStateStore().updateApplicationState(appState);
 919   }
 920 
 921   private static final class FinalSavingTransition extends RMAppTransition {
 922     Object transitionToDo;
 923     RMAppState targetedFinalState;
 924     RMAppState stateToBeStored;
 925 
 926     public FinalSavingTransition(Object transitionToDo,
 927         RMAppState targetedFinalState) {
 928       this(transitionToDo, targetedFinalState, targetedFinalState);
 929     }
 930 
 931     public FinalSavingTransition(Object transitionToDo,
 932         RMAppState targetedFinalState, RMAppState stateToBeStored) {
 933       this.transitionToDo = transitionToDo;
 934       this.targetedFinalState = targetedFinalState;
 935       this.stateToBeStored = stateToBeStored;
 936     }
 937 
 938     @Override
 939     public void transition(RMAppImpl app, RMAppEvent event) {
 940       app.rememberTargetTransitionsAndStoreState(event, transitionToDo,
 941         targetedFinalState, stateToBeStored);
 942     }
 943   }
 944 
 945   private static class AttemptUnregisteredTransition extends RMAppTransition {
 946     @Override
 947     public void transition(RMAppImpl app, RMAppEvent event) {
 948       app.finishTime = app.storedFinishTime;
 949     }
 950   }
 951 
 952   private static class AppFinishedTransition extends FinalTransition {
 953     public AppFinishedTransition() {
 954       super(RMAppState.FINISHED);
 955     }
 956 
 957     public void transition(RMAppImpl app, RMAppEvent event) {
 958       app.diagnostics.append(event.getDiagnosticMsg());
 959       super.transition(app, event);
 960     };
 961   }
 962 
 963   private static class AttemptFinishedAtFinalSavingTransition extends
 964       RMAppTransition {
 965     @Override
 966     public void transition(RMAppImpl app, RMAppEvent event) {
 967       if (app.targetedFinalState.equals(RMAppState.FAILED)
 968           || app.targetedFinalState.equals(RMAppState.KILLED)) {
 969         // Ignore the ATTEMPT_FINISHED event if we were supposed to reach the
 970         // FAILED or KILLED state
 971         return;
 972       }
 973 
 974       // pass in the earlier attempt_unregistered event, as it is needed in
 975       // AppFinishedFinalStateSavedTransition later on
 976       app.rememberTargetTransitions(event,
 977         new AppFinishedFinalStateSavedTransition(app.eventCausingFinalSaving),
 978         RMAppState.FINISHED);
 979     };
 980   }
 981 
 982   private static class AppFinishedFinalStateSavedTransition extends
 983       RMAppTransition {
 984     RMAppEvent attemptUnregistered;
 985 
 986     public AppFinishedFinalStateSavedTransition(RMAppEvent attemptUnregistered) {
 987       this.attemptUnregistered = attemptUnregistered;
 988     }
 989     @Override
 990     public void transition(RMAppImpl app, RMAppEvent event) {
 991       new AttemptUnregisteredTransition().transition(app, attemptUnregistered);
 992       FINISHED_TRANSITION.transition(app, event);
 993     };
 994   }
 995 
 996 
 997   private static class AppKilledTransition extends FinalTransition {
 998     public AppKilledTransition() {
 999       super(RMAppState.KILLED);
1000     }
1001 
1002     @Override
1003     public void transition(RMAppImpl app, RMAppEvent event) {
1004       app.diagnostics.append(event.getDiagnosticMsg());
1005       super.transition(app, event);
1006     };
1007   }
1008 
1009   private static class KillAttemptTransition extends RMAppTransition {
1010     @Override
1011     public void transition(RMAppImpl app, RMAppEvent event) {
1012       app.stateBeforeKilling = app.getState();
1013       // Forward app kill diagnostics in the event to kill app attempt.
1014       // These diagnostics will be returned back in ATTEMPT_KILLED event sent by
1015       // RMAppAttemptImpl.
1016       app.handler.handle(
1017           new RMAppAttemptEvent(app.currentAttempt.getAppAttemptId(),
1018               RMAppAttemptEventType.KILL, event.getDiagnosticMsg()));
1019     }
1020   }
1021 
1022   private static final class AppRejectedTransition extends
1023       FinalTransition{
1024     public AppRejectedTransition() {
1025       super(RMAppState.FAILED);
1026     }
1027 
1028     public void transition(RMAppImpl app, RMAppEvent event) {
1029       app.diagnostics.append(event.getDiagnosticMsg());
1030       super.transition(app, event);
1031     };
1032   }
1033 
1034   private static class FinalTransition extends RMAppTransition {
1035 
1036     private final RMAppState finalState;
1037 
1038     public FinalTransition(RMAppState finalState) {
1039       this.finalState = finalState;
1040     }
1041 
1042     public void transition(RMAppImpl app, RMAppEvent event) {
1043       for (NodeId nodeId : app.getRanNodes()) {
1044         app.handler.handle(
1045             new RMNodeCleanAppEvent(nodeId, app.applicationId));
1046       }
1047       app.finishTime = app.storedFinishTime;
1048       if (app.finishTime == 0 ) {
1049         app.finishTime = app.systemClock.getTime();
1050       }
1051       // Recovered apps that are completed were not added to scheduler, so no
1052       // need to remove them from scheduler.
1053       if (app.recoveredFinalState == null) {
1054         app.handler.handle(new AppRemovedSchedulerEvent(app.applicationId,
1055           finalState));
1056       }
1057       app.handler.handle(
1058           new RMAppManagerEvent(app.applicationId,
1059           RMAppManagerEventType.APP_COMPLETED));
1060 
1061       app.rmContext.getRMApplicationHistoryWriter()
1062           .applicationFinished(app, finalState);
1063       app.rmContext.getSystemMetricsPublisher()
1064           .appFinished(app, finalState, app.finishTime);
1065     };
1066   }
1067 
1068   private int getNumFailedAppAttempts() {
1069     int completedAttempts = 0;
1070     long endTime = this.systemClock.getTime();
1071     // Do not count AM preemption, hardware failures or NM resync
1072     // as attempt failure.
1073     for (RMAppAttempt attempt : attempts.values()) {
1074       if (attempt.shouldCountTowardsMaxAttemptRetry()) {
1075         if (this.attemptFailuresValidityInterval <= 0
1076             || (attempt.getFinishTime() > endTime
1077                 - this.attemptFailuresValidityInterval)) {
1078           completedAttempts++;
1079         }
1080       }
1081     }
1082     return completedAttempts;
1083   }
1084 
1085   private static final class AttemptFailedTransition implements
1086       MultipleArcTransition<RMAppImpl, RMAppEvent, RMAppState> {
1087 
1088     private final RMAppState initialState;
1089 
1090     public AttemptFailedTransition(RMAppState initialState) {
1091       this.initialState = initialState;
1092     }
1093 
1094     @Override
1095     public RMAppState transition(RMAppImpl app, RMAppEvent event) {
1096       int numberOfFailure = app.getNumFailedAppAttempts();
1097       LOG.info("The number of failed attempts"
1098           + (app.attemptFailuresValidityInterval > 0 ? " in previous "
1099               + app.attemptFailuresValidityInterval + " milliseconds " : " ")
1100           + "is " + numberOfFailure + ". The max attempts is "
1101           + app.maxAppAttempts);
1102       if (!app.submissionContext.getUnmanagedAM()
1103           && numberOfFailure < app.maxAppAttempts) {
1104         if (initialState.equals(RMAppState.KILLING)) {
1105           // If this is not last attempt, app should be killed instead of
1106           // launching a new attempt
1107           app.rememberTargetTransitionsAndStoreState(event,
1108             new AppKilledTransition(), RMAppState.KILLED, RMAppState.KILLED);
1109           return RMAppState.FINAL_SAVING;
1110         }
1111 
1112         boolean transferStateFromPreviousAttempt;
1113         RMAppFailedAttemptEvent failedEvent = (RMAppFailedAttemptEvent) event;
1114         transferStateFromPreviousAttempt =
1115             failedEvent.getTransferStateFromPreviousAttempt();
1116 
1117         RMAppAttempt oldAttempt = app.currentAttempt;
1118         app.createAndStartNewAttempt(transferStateFromPreviousAttempt);
1119         // Transfer the state from the previous attempt to the current attempt.
1120         // Note that the previous failed attempt may still be collecting the
1121         // container events from the scheduler and update its data structures
1122         // before the new attempt is created. We always transferState for
1123         // finished containers so that they can be acked to NM,
1124         // but when pulling finished container we will check this flag again.
1125         ((RMAppAttemptImpl) app.currentAttempt)
1126           .transferStateFromPreviousAttempt(oldAttempt);
1127         return initialState;
1128       } else {
1129         if (numberOfFailure >= app.maxAppAttempts) {
1130           app.isNumAttemptsBeyondThreshold = true;
1131         }
1132         app.rememberTargetTransitionsAndStoreState(event,
1133           new AttemptFailedFinalStateSavedTransition(), RMAppState.FAILED,
1134           RMAppState.FAILED);
1135         return RMAppState.FINAL_SAVING;
1136       }
1137     }
1138   }
1139 
1140   @Override
1141   public String getApplicationType() {
1142     return this.applicationType;
1143   }
1144 
1145   @Override
1146   public Set<String> getApplicationTags() {
1147     return this.applicationTags;
1148   }
1149 
1150   @Override
1151   public boolean isAppFinalStateStored() {
1152     RMAppState state = getState();
1153     return state.equals(RMAppState.FINISHING)
1154         || state.equals(RMAppState.FINISHED) || state.equals(RMAppState.FAILED)
1155         || state.equals(RMAppState.KILLED);
1156   }
1157 
1158   @Override
1159   public YarnApplicationState createApplicationState() {
1160     RMAppState rmAppState = getState();
1161     // If App is in FINAL_SAVING state, return its previous state.
1162     if (rmAppState.equals(RMAppState.FINAL_SAVING)) {
1163       rmAppState = stateBeforeFinalSaving;
1164     }
1165     if (rmAppState.equals(RMAppState.KILLING)) {
1166       rmAppState = stateBeforeKilling;
1167     }
1168     return RMServerUtils.createApplicationState(rmAppState);
1169   }
1170   
1171   public static boolean isAppInFinalState(RMApp rmApp) {
1172     RMAppState appState = ((RMAppImpl) rmApp).getRecoveredFinalState();
1173     if (appState == null) {
1174       appState = rmApp.getState();
1175     }
1176     return appState == RMAppState.FAILED || appState == RMAppState.FINISHED
1177         || appState == RMAppState.KILLED;
1178   }
1179   
1180   public RMAppState getRecoveredFinalState() {
1181     return this.recoveredFinalState;
1182   }
1183 
1184   @Override
1185   public Set<NodeId> getRanNodes() {
1186     return ranNodes;
1187   }
1188   
1189   @Override
1190   public RMAppMetrics getRMAppMetrics() {
1191     Resource resourcePreempted = Resource.newInstance(0, 0);
1192     int numAMContainerPreempted = 0;
1193     int numNonAMContainerPreempted = 0;
1194     long memorySeconds = 0;
1195     long vcoreSeconds = 0;
1196     for (RMAppAttempt attempt : attempts.values()) {
1197       if (null != attempt) {
1198         RMAppAttemptMetrics attemptMetrics =
1199             attempt.getRMAppAttemptMetrics();
1200         Resources.addTo(resourcePreempted,
1201             attemptMetrics.getResourcePreempted());
1202         numAMContainerPreempted += attemptMetrics.getIsPreempted() ? 1 : 0;
1203         numNonAMContainerPreempted +=
1204             attemptMetrics.getNumNonAMContainersPreempted();
1205         // getAggregateAppResourceUsage() will calculate resource usage stats
1206         // for both running and finished containers.
1207         AggregateAppResourceUsage resUsage =
1208             attempt.getRMAppAttemptMetrics().getAggregateAppResourceUsage();
1209         memorySeconds += resUsage.getMemorySeconds();
1210         vcoreSeconds += resUsage.getVcoreSeconds();
1211       }
1212     }
1213 
1214     return new RMAppMetrics(resourcePreempted,
1215         numNonAMContainerPreempted, numAMContainerPreempted,
1216         memorySeconds, vcoreSeconds);
1217   }
1218 
1219   @Private
1220   @VisibleForTesting
1221   public void setSystemClock(Clock clock) {
1222     this.systemClock = clock;
1223   }
1224 
1225   @Override
1226   public ReservationId getReservationId() {
1227     return submissionContext.getReservationID();
1228   }
1229   
1230   @Override
1231   public ResourceRequest getAMResourceRequest() {
1232     return this.amReq; 
1233   }
1234 
1235   protected Credentials parseCredentials() throws IOException {
1236     Credentials credentials = new Credentials();
1237     DataInputByteBuffer dibb = new DataInputByteBuffer();
1238     ByteBuffer tokens = submissionContext.getAMContainerSpec().getTokens();
1239     if (tokens != null) {
1240       dibb.reset(tokens);
1241       credentials.readTokenStorageStream(dibb);
1242       tokens.rewind();
1243     }
1244     return credentials;
1245   }
1246 }
RMAppImpl.java

RMAppImpl.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
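The counting rule in getNumFailedAppAttempts() in the listing above is easy to miss, so here is a minimal sketch of it in isolation. It is not the Hadoop code: the class AttemptFailureWindow, the Attempt type and its fields are hypothetical stand-ins for RMAppAttempt, and the sketch only reproduces the rule that a failure counts when the validity interval is disabled (<= 0) or the attempt finished inside the most recent interval.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not part of Hadoop.
final class AttemptFailureWindow {

  // Stand-in for RMAppAttempt: whether the failure counts at all
  // (preemption, hardware failure and NM resync do not) and when it finished.
  static final class Attempt {
    final boolean countsTowardsMax;
    final long finishTimeMs;

    Attempt(boolean countsTowardsMax, long finishTimeMs) {
      this.countsTowardsMax = countsTowardsMax;
      this.finishTimeMs = finishTimeMs;
    }
  }

  // Counts failures the same way the listing does: every countable attempt
  // when the interval is disabled, otherwise only those that finished
  // after (now - validityIntervalMs).
  static int countFailures(List<Attempt> attempts, long nowMs,
      long validityIntervalMs) {
    int failures = 0;
    for (Attempt attempt : attempts) {
      if (!attempt.countsTowardsMax) {
        continue;
      }
      if (validityIntervalMs <= 0
          || attempt.finishTimeMs > nowMs - validityIntervalMs) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    List<Attempt> attempts = Arrays.asList(
        new Attempt(true, 1000L),    // old failure
        new Attempt(true, 9500L),    // recent failure
        new Attempt(false, 9900L));  // e.g. a preempted AM, never counted
    System.out.println(countFailures(attempts, 10000L, 1000L));
    System.out.println(countFailures(attempts, 10000L, -1L));
  }
}
```

With a 1000 ms interval only the attempt that finished at 9500 is counted; with the interval disabled both countable attempts are. AttemptFailedTransition then compares this count with maxAppAttempts to decide between starting a new attempt and failing the application.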

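  Next, look at this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)). this.rmContext is an object of the RMContext interface, which represents the ResourceManager's context; the interface has a single implementation, RMContextImpl. Its getDispatcher() method returns a Dispatcher, and the implementation used here is AsyncDispatcher. AsyncDispatcher's getEventHandler() method returns an EventHandler; internally, AsyncDispatcher has already initialized this handler as an instance of its inner class GenericEventHandler, so the call ends up in that inner class's handle() method. The handler for RMAppEventType is registered with the central asynchronous dispatcher in ResourceManager.java. handle() ultimately puts the event on the queue with eventQueue.put(event). Note that this AsyncDispatcher is shared by all event types. Once the RMAppEventType.START event is on eventQueue, the dispatcher thread takes it off the queue and passes it to the handler registered for RMAppEventType, which forwards it to the handle() method of RMAppImpl.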

/**
 * Dispatches {@link Event}s in a separate thread. Currently only single thread
 * does that. Potentially there could be multiple channels for each event type
 * class and a thread pool can be used to dispatch the events.
 */
@SuppressWarnings("rawtypes")
@Public
@Evolving
public class AsyncDispatcher extends AbstractService implements Dispatcher {

  private static final Log LOG = LogFactory.getLog(AsyncDispatcher.class);

  private final BlockingQueue<Event> eventQueue;
  private volatile int lastEventQueueSizeLogged = 0;
  private volatile boolean stopped = false;

  // Configuration flag for enabling/disabling draining dispatcher's events on
  // stop functionality.
  private volatile boolean drainEventsOnStop = false;

  // Indicates all the remaining dispatcher's events on stop have been drained
  // and processed.
  private volatile boolean drained = true;
  private Object waitForDrained = new Object();

  // For drainEventsOnStop enabled only, block newly coming events into the
  // queue while stopping.
  private volatile boolean blockNewEvents = false;
  private final EventHandler handlerInstance = new GenericEventHandler();

  private Thread eventHandlingThread;
  protected final Map<Class<? extends Enum>, EventHandler> eventDispatchers;
  private boolean exitOnDispatchException;

  public AsyncDispatcher() {
    this(new LinkedBlockingQueue<Event>());
  }

  public AsyncDispatcher(BlockingQueue<Event> eventQueue) {
    super("Dispatcher");
    this.eventQueue = eventQueue;
    this.eventDispatchers = new HashMap<Class<? extends Enum>, EventHandler>();
  }

  Runnable createThread() {
    return new Runnable() {
      @Override
      public void run() {
        while (!stopped && !Thread.currentThread().isInterrupted()) {
          drained = eventQueue.isEmpty();
          // blockNewEvents is only set when dispatcher is draining to stop,
          // adding this check is to avoid the overhead of acquiring the lock
          // and calling notify every time in the normal run of the loop.
          if (blockNewEvents) {
            synchronized (waitForDrained) {
              if (drained) {
                waitForDrained.notify();
              }
            }
          }
          Event event;
          try {
            event = eventQueue.take();
          } catch(InterruptedException ie) {
            if (!stopped) {
              LOG.warn("AsyncDispatcher thread interrupted", ie);
            }
            return;
          }
          if (event != null) {
            dispatch(event);
          }
        }
      }
    };
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    this.exitOnDispatchException =
        conf.getBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY,
          Dispatcher.DEFAULT_DISPATCHER_EXIT_ON_ERROR);
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    //start all the components
    super.serviceStart();
    eventHandlingThread = new Thread(createThread());
    eventHandlingThread.setName("AsyncDispatcher event handler");
    eventHandlingThread.start();
  }

  public void setDrainEventsOnStop() {
    drainEventsOnStop = true;
  }

  @Override
  protected void serviceStop() throws Exception {
    if (drainEventsOnStop) {
      blockNewEvents = true;
      LOG.info("AsyncDispatcher is draining to stop, igonring any new events.");
      long endTime = System.currentTimeMillis() + getConfig()
          .getLong(YarnConfiguration.DISPATCHER_DRAIN_EVENTS_TIMEOUT,
              YarnConfiguration.DEFAULT_DISPATCHER_DRAIN_EVENTS_TIMEOUT);

      synchronized (waitForDrained) {
        while (!drained && eventHandlingThread != null
            && eventHandlingThread.isAlive()
            && System.currentTimeMillis() < endTime) {
          waitForDrained.wait(1000);
          LOG.info("Waiting for AsyncDispatcher to drain. Thread state is :" +
              eventHandlingThread.getState());
        }
      }
    }
    stopped = true;
    if (eventHandlingThread != null) {
      eventHandlingThread.interrupt();
      try {
        eventHandlingThread.join();
      } catch (InterruptedException ie) {
        LOG.warn("Interrupted Exception while stopping", ie);
      }
    }

    // stop all the components
    super.serviceStop();
  }

  @SuppressWarnings("unchecked")
  protected void dispatch(Event event) {
    //all events go thru this loop
    if (LOG.isDebugEnabled()) {
      LOG.debug("Dispatching the event " + event.getClass().getName() + "."
          + event.toString());
    }

    Class<? extends Enum> type = event.getType().getDeclaringClass();

    try{
      EventHandler handler = eventDispatchers.get(type);
      if(handler != null) {
        handler.handle(event);
      } else {
        throw new Exception("No handler for registered for " + type);
      }
    } catch (Throwable t) {
      //TODO Maybe log the state of the queue
      LOG.fatal("Error in dispatcher thread", t);
      // If serviceStop is called, we should exit this thread gracefully.
      if (exitOnDispatchException
          && (ShutdownHookManager.get().isShutdownInProgress()) == false
          && stopped == false) {
        Thread shutDownThread = new Thread(createShutDownThread());
        shutDownThread.setName("AsyncDispatcher ShutDown handler");
        shutDownThread.start();
      }
    }
  }

  @SuppressWarnings("unchecked")
  @Override
  public void register(Class<? extends Enum> eventType,
      EventHandler handler) {
    /* check to see if we have a listener registered */
    EventHandler<Event> registeredHandler = (EventHandler<Event>)
    eventDispatchers.get(eventType);
    LOG.info("Registering " + eventType + " for " + handler.getClass());
    if (registeredHandler == null) {
      eventDispatchers.put(eventType, handler);
    } else if (!(registeredHandler instanceof MultiListenerHandler)){
      /* for multiple listeners of an event add the multiple listener handler */
      MultiListenerHandler multiHandler = new MultiListenerHandler();
      multiHandler.addHandler(registeredHandler);
      multiHandler.addHandler(handler);
      eventDispatchers.put(eventType, multiHandler);
    } else {
      /* already a multilistener, just add to it */
      MultiListenerHandler multiHandler
      = (MultiListenerHandler) registeredHandler;
      multiHandler.addHandler(handler);
    }
  }

  @Override
  public EventHandler getEventHandler() {
    return handlerInstance;
  }

  class GenericEventHandler implements EventHandler<Event> {
    public void handle(Event event) {
      if (blockNewEvents) {
        return;
      }
      drained = false;

      /* all this method does is enqueue all the events onto the queue */
      int qSize = eventQueue.size();
      if (qSize != 0 && qSize % 1000 == 0
          && lastEventQueueSizeLogged != qSize) {
        lastEventQueueSizeLogged = qSize;
        LOG.info("Size of event-queue is " + qSize);
      }
      int remCapacity = eventQueue.remainingCapacity();
      if (remCapacity < 1000) {
        LOG.warn("Very low remaining capacity in the event-queue: "
            + remCapacity);
      }
      try {
        eventQueue.put(event);
      } catch (InterruptedException e) {
        if (!stopped) {
          LOG.warn("AsyncDispatcher thread interrupted", e);
        }
        // Need to reset drained flag to true if event queue is empty,
        // otherwise dispatcher will hang on stop.
        drained = eventQueue.isEmpty();
        throw new YarnRuntimeException(e);
      }
    };
  }

  /**
   * Multiplexing an event. Sending it to different handlers that
   * are interested in the event.
   * @param <T> the type of event these multiple handlers are interested in.
   */
  static class MultiListenerHandler implements EventHandler<Event> {
    List<EventHandler<Event>> listofHandlers;

    public MultiListenerHandler() {
      listofHandlers = new ArrayList<EventHandler<Event>>();
    }

    @Override
    public void handle(Event event) {
      for (EventHandler<Event> handler: listofHandlers) {
        handler.handle(event);
      }
    }

    void addHandler(EventHandler<Event> handler) {
      listofHandlers.add(handler);
    }

  }

  Runnable createShutDownThread() {
    return new Runnable() {
      @Override
      public void run() {
        LOG.info("Exiting, bbye..");
        System.exit(-1);
      }
    };
  }

  @VisibleForTesting
  protected boolean isEventThreadWaiting() {
    return eventHandlingThread.getState() == Thread.State.WAITING;
  }

  @VisibleForTesting
  protected boolean isDrained() {
    return this.drained;
  }
}
AsyncDispatcher.java

AsyncDispatcher.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java

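In GenericEventHandler, the handle() method adds the event to the queue with eventQueue.put(event); this is the producer side. The consumer side is the thread built by createThread(), which takes events off the queue and passes them to dispatch(). So how does a caller obtain this GenericEventHandler? The getEventHandler() method gives the answer: it returns handlerInstance, which is a GenericEventHandler.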
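The producer/consumer split described above can be sketched in a few dozen lines. This is a simplified illustration, not the Hadoop implementation: MiniDispatcher, AppEvent, AppEventType and Handler are hypothetical names, draining on stop and multi-listener support are left out, and the sketch assumes Java 8 or later for the lambdas (Hadoop 2.7.3 itself targets JDK 7).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustration only: every name below is hypothetical, none is a Hadoop class.
final class MiniDispatcherDemo {

  // Stand-in for an event type enum such as RMAppEventType.
  enum AppEventType { START, KILL }

  // Stand-in for an event: a type plus the id of the application it targets.
  static final class AppEvent {
    final AppEventType type;
    final String appId;

    AppEvent(AppEventType type, String appId) {
      this.type = type;
      this.appId = appId;
    }
  }

  interface Handler {
    void handle(AppEvent event);
  }

  static final class MiniDispatcher {
    private final BlockingQueue<AppEvent> queue = new LinkedBlockingQueue<>();
    private final Map<Class<?>, Handler> handlers = new ConcurrentHashMap<>();
    private volatile boolean stopped = false;
    private final Thread worker = new Thread(this::consume, "mini-dispatcher");

    // Map an event type enum class to the handler that processes it.
    void register(Class<?> eventTypeClass, Handler handler) {
      handlers.put(eventTypeClass, handler);
    }

    // Producer side: the returned handler only enqueues the event.
    Handler getEventHandler() {
      return event -> {
        try {
          queue.put(event);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      };
    }

    // Consumer side: one thread takes events and routes them by enum class.
    private void consume() {
      while (!stopped) {
        AppEvent event;
        try {
          event = queue.take();
        } catch (InterruptedException e) {
          return;
        }
        Handler handler = handlers.get(event.type.getDeclaringClass());
        if (handler != null) {
          handler.handle(event);
        }
      }
    }

    void start() {
      worker.start();
    }

    void stop() {
      stopped = true;
      worker.interrupt();
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  // Registers one handler, sends two events, and returns what was handled.
  static List<String> runDemo() {
    MiniDispatcher dispatcher = new MiniDispatcher();
    List<String> handled = Collections.synchronizedList(new ArrayList<String>());
    CountDownLatch done = new CountDownLatch(2);
    dispatcher.register(AppEventType.class, event -> {
      handled.add(event.type + ":" + event.appId);
      done.countDown();
    });
    dispatcher.start();
    Handler producer = dispatcher.getEventHandler();
    producer.handle(new AppEvent(AppEventType.START, "app_1"));
    producer.handle(new AppEvent(AppEventType.KILL, "app_1"));
    try {
      done.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    dispatcher.stop();
    return handled;
  }
}
```

Calling MiniDispatcherDemo.runDemo() returns the two events in the order they were enqueued, because a single consumer thread reads a FIFO queue. The caller of handle() never runs the handler itself; that decoupling is the same reason the START event submitted through rmContext.getDispatcher().getEventHandler() is processed later on the dispatcher thread.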

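For more on AsyncDispatcher, see   

Summary:

For the ApplicationMaster startup flow, see   

RMStateStore is the abstract base class for storing ResourceManager state; a concrete store must implement the store and load methods. For more on RMStateStore, see   

Some articles say at this point: as mentioned at the beginning under "event dispatcher", the ResourceManager uses an AsyncDispatcher to dispatch all events; here the event goes through ApplicationEventDispatcher to the transition methods of RMAppImpl, so look at the events and transitions that are set up when the RMAppImpl class is initialized. That explanation is not very clear.

Other articles turn to ResourceManager instead, because this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.START)) triggers the application start event: it adds an RMAppEvent with the enum value RMAppEventType.START to the asynchronous dispatcher, and the RM registers internally which handler processes events of that type.

Either way, both routes lead to the same place.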

public enum RMAppEventType {
  // Source: ClientRMService
  START,
  RECOVER,
  KILL,
  MOVE, // Move app to a new queue

  // Source: Scheduler and RMAppManager
  APP_REJECTED,

  // Source: Scheduler
  APP_ACCEPTED,

  // Source: RMAppAttempt
  ATTEMPT_REGISTERED,
  ATTEMPT_UNREGISTERED,
  ATTEMPT_FINISHED, // Will send the final state
  ATTEMPT_FAILED,
  ATTEMPT_KILLED,
  NODE_UPDATE,

  // Source: Container and ResourceTracker
  APP_RUNNING_ON_NODE,

  // Source: RMStateStore
  APP_NEW_SAVED,
  APP_UPDATE_SAVED,
}
RMAppEventType.java

RMAppEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java

public enum RMAppState {
  NEW,
  NEW_SAVING,
  SUBMITTED,
  ACCEPTED,
  RUNNING,
  FINAL_SAVING,
  FINISHING,
  FINISHED,
  FAILED,
  KILLING,
  KILLED
}
RMAppState.java

RMAppState.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppState.java
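To see how the two enums fit together, here is a minimal sketch of a state machine keyed on (current state, event type). It is an illustration under assumptions, not the RMAppImpl code: MiniAppStateMachine is a hypothetical class, only the happy path of application startup is modeled, and the four arcs shown reflect my reading of the 2.7.3 transition table; the real table is much larger and each arc also runs a transition object such as the ones listed in RMAppImpl.java.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustration only: a hypothetical, heavily reduced model of the RMApp states.
final class MiniAppStateMachine {

  // Subset of RMAppState covering the startup path.
  enum State { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING }

  // Subset of RMAppEventType that drives the startup path.
  enum EventType { START, APP_NEW_SAVED, APP_ACCEPTED, ATTEMPT_REGISTERED }

  // Transition table: current state -> (event type -> next state).
  private static final Map<State, Map<EventType, State>> TABLE =
      new EnumMap<State, Map<EventType, State>>(State.class);

  static {
    addArc(State.NEW, EventType.START, State.NEW_SAVING);
    addArc(State.NEW_SAVING, EventType.APP_NEW_SAVED, State.SUBMITTED);
    addArc(State.SUBMITTED, EventType.APP_ACCEPTED, State.ACCEPTED);
    addArc(State.ACCEPTED, EventType.ATTEMPT_REGISTERED, State.RUNNING);
  }

  private static void addArc(State from, EventType on, State to) {
    Map<EventType, State> arcs = TABLE.get(from);
    if (arcs == null) {
      arcs = new EnumMap<EventType, State>(EventType.class);
      TABLE.put(from, arcs);
    }
    arcs.put(on, to);
  }

  private State current = State.NEW;

  State getState() {
    return current;
  }

  // Applies one event; an event with no arc from the current state is rejected.
  // (Simplification: this sketch throws IllegalStateException.)
  State handle(EventType type) {
    Map<EventType, State> arcs = TABLE.get(current);
    State next = (arcs == null) ? null : arcs.get(type);
    if (next == null) {
      throw new IllegalStateException(
          "Invalid event " + type + " at state " + current);
    }
    current = next;
    return current;
  }

  public static void main(String[] args) {
    MiniAppStateMachine app = new MiniAppStateMachine();
    for (EventType type : EventType.values()) {
      System.out.println(type + " -> " + app.handle(type));
    }
  }
}
```

The point of the table-driven design is that the handler registered for RMAppEventType does not need a switch over event types: it hands the event to the application object, and the (state, event) pair selects what happens next.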

 

Inside ResourceManager:

   1 /**
   2  * The ResourceManager is the main class that is a set of components.
   3  * "I am the ResourceManager. All your resources belong to us..."
   4  *
   5  */
   6 @SuppressWarnings("unchecked")
   7 public class ResourceManager extends CompositeService implements Recoverable {
   8 
   9   /**
  10    * Priority of the ResourceManager shutdown hook.
  11    */
  12   public static final int SHUTDOWN_HOOK_PRIORITY = 30;
  13 
  14   private static final Log LOG = LogFactory.getLog(ResourceManager.class);
  15   private static long clusterTimeStamp = System.currentTimeMillis();
  16 
  17   /**
  18    * "Always On" services. Services that need to run always irrespective of
  19    * the HA state of the RM.
  20    */
  21   @VisibleForTesting
  22   protected RMContextImpl rmContext;
  23   private Dispatcher rmDispatcher;
  24   @VisibleForTesting
  25   protected AdminService adminService;
  26 
  27   /**
  28    * "Active" services. Services that need to run only on the Active RM.
  29    * These services are managed (initialized, started, stopped) by the
  30    * {@link CompositeService} RMActiveServices.
  31    *
  32    * RM is active when (1) HA is disabled, or (2) HA is enabled and the RM is
  33    * in Active state.
  34    */
  35   protected RMActiveServices activeServices;
  36   protected RMSecretManagerService rmSecretManagerService;
  37 
  38   protected ResourceScheduler scheduler;
  39   protected ReservationSystem reservationSystem;
  40   private ClientRMService clientRM;
  41   protected ApplicationMasterService masterService;
  42   protected NMLivelinessMonitor nmLivelinessMonitor;
  43   protected NodesListManager nodesListManager;
  44   protected RMAppManager rmAppManager;
  45   protected ApplicationACLsManager applicationACLsManager;
  46   protected QueueACLsManager queueACLsManager;
  47   private WebApp webApp;
  48   private AppReportFetcher fetcher = null;
  49   protected ResourceTrackerService resourceTracker;
  50 
  51   @VisibleForTesting
  52   protected String webAppAddress;
  53   private ConfigurationProvider configurationProvider = null;
  54   /** End of Active services */
  55 
  56   private Configuration conf;
  57 
  58   private UserGroupInformation rmLoginUGI;
  59   
  60   public ResourceManager() {
  61     super("ResourceManager");
  62   }
  63 
  64   public RMContext getRMContext() {
  65     return this.rmContext;
  66   }
  67 
  68   public static long getClusterTimeStamp() {
  69     return clusterTimeStamp;
  70   }
  71 
  72   @VisibleForTesting
  73   protected static void setClusterTimeStamp(long timestamp) {
  74     clusterTimeStamp = timestamp;
  75   }
  76 
  77   @VisibleForTesting
  78   Dispatcher getRmDispatcher() {
  79     return rmDispatcher;
  80   }
  81 
  82   @Override
  83   protected void serviceInit(Configuration conf) throws Exception {
  84     this.conf = conf;
  85     this.rmContext = new RMContextImpl();
  86     
  87     this.configurationProvider =
  88         ConfigurationProviderFactory.getConfigurationProvider(conf);
  89     this.configurationProvider.init(this.conf);
  90     rmContext.setConfigurationProvider(configurationProvider);
  91 
  92     // load core-site.xml
  93     InputStream coreSiteXMLInputStream =
  94         this.configurationProvider.getConfigurationInputStream(this.conf,
  95             YarnConfiguration.CORE_SITE_CONFIGURATION_FILE);
  96     if (coreSiteXMLInputStream != null) {
  97       this.conf.addResource(coreSiteXMLInputStream);
  98     }
  99 
 100     // Do refreshUserToGroupsMappings with loaded core-site.xml
 101     Groups.getUserToGroupsMappingServiceWithLoadedConfiguration(this.conf)
 102         .refresh();
 103 
 104     // Do refreshSuperUserGroupsConfiguration with loaded core-site.xml
 105     // Or use RM specific configurations to overwrite the common ones first
 106     // if they exist
 107     RMServerUtils.processRMProxyUsersConf(conf);
 108     ProxyUsers.refreshSuperUserGroupsConfiguration(this.conf);
 109 
 110     // load yarn-site.xml
 111     InputStream yarnSiteXMLInputStream =
 112         this.configurationProvider.getConfigurationInputStream(this.conf,
 113             YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
 114     if (yarnSiteXMLInputStream != null) {
 115       this.conf.addResource(yarnSiteXMLInputStream);
 116     }
 117     
 118     // Validate the configuration: yarn.resourcemanager.am.max-attempts must be positive, and expireIntvl >= heartbeatIntvl
 119     validateConfigs(this.conf);
 120     
 121     // Set HA configuration should be done before login
 122     this.rmContext.setHAEnabled(HAUtil.isHAEnabled(this.conf));
 123     if (this.rmContext.isHAEnabled()) {
 124       HAUtil.verifyAndSetConfiguration(this.conf);
 125     }
 126     
 127     // Set UGI and do login
 128     // If security is enabled, use login user
 129     // If security is not enabled, use current user
 130     this.rmLoginUGI = UserGroupInformation.getCurrentUser();
 131     try {
 132       doSecureLogin();
 133     } catch(IOException ie) {
 134       throw new YarnRuntimeException("Failed to login", ie);
 135     }
 136 
 137     // register the handlers for all AlwaysOn services using setupDispatcher().
 138     rmDispatcher = setupDispatcher();
 139     addIfService(rmDispatcher);
 140     rmContext.setDispatcher(rmDispatcher);
 141 
 142     adminService = createAdminService();
 143     addService(adminService);
 144     rmContext.setRMAdminService(adminService);
 145 
 146     rmContext.setYarnConfiguration(conf);
 147     
 148     // Create and initialize RMActiveServices, an inner class of ResourceManager
 149     createAndInitActiveServices();
 150 
 151     webAppAddress = WebAppUtils.getWebAppBindURL(this.conf,
 152                       YarnConfiguration.RM_BIND_HOST,
 153                       WebAppUtils.getRMWebAppURLWithoutScheme(this.conf));
 154 
 155     RMApplicationHistoryWriter rmApplicationHistoryWriter =
 156         createRMApplicationHistoryWriter();
 157     addService(rmApplicationHistoryWriter);
 158     rmContext.setRMApplicationHistoryWriter(rmApplicationHistoryWriter);
 159 
 160     SystemMetricsPublisher systemMetricsPublisher = createSystemMetricsPublisher();
 161     addService(systemMetricsPublisher);
 162     rmContext.setSystemMetricsPublisher(systemMetricsPublisher);
 163 
 164     super.serviceInit(this.conf);
 165   }
 166   
 167   protected QueueACLsManager createQueueACLsManager(ResourceScheduler scheduler,
 168       Configuration conf) {
 169     return new QueueACLsManager(scheduler, conf);
 170   }
 171 
 172   @VisibleForTesting
 173   protected void setRMStateStore(RMStateStore rmStore) {
 174     rmStore.setRMDispatcher(rmDispatcher);
 175     rmStore.setResourceManager(this);
 176     rmContext.setStateStore(rmStore);
 177   }
 178 
 179   protected EventHandler<SchedulerEvent> createSchedulerEventDispatcher() {
 180     return new SchedulerEventDispatcher(this.scheduler);
 181   }
 182 
 183   protected Dispatcher createDispatcher() {
 184     return new AsyncDispatcher();
 185   }
 186 
 187   protected ResourceScheduler createScheduler() {
 188     String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER,
 189         YarnConfiguration.DEFAULT_RM_SCHEDULER);
 190     LOG.info("Using Scheduler: " + schedulerClassName);
 191     try {
 192       Class<?> schedulerClazz = Class.forName(schedulerClassName);
 193       if (ResourceScheduler.class.isAssignableFrom(schedulerClazz)) {
 194         return (ResourceScheduler) ReflectionUtils.newInstance(schedulerClazz,
 195             this.conf);
 196       } else {
 197         throw new YarnRuntimeException("Class: " + schedulerClassName
 198             + " not instance of " + ResourceScheduler.class.getCanonicalName());
 199       }
 200     } catch (ClassNotFoundException e) {
 201       throw new YarnRuntimeException("Could not instantiate Scheduler: "
 202           + schedulerClassName, e);
 203     }
 204   }
 205 
 206   protected ReservationSystem createReservationSystem() {
 207     String reservationClassName =
 208         conf.get(YarnConfiguration.RM_RESERVATION_SYSTEM_CLASS,
 209             AbstractReservationSystem.getDefaultReservationSystem(scheduler));
 210     if (reservationClassName == null) {
 211       return null;
 212     }
 213     LOG.info("Using ReservationSystem: " + reservationClassName);
 214     try {
 215       Class<?> reservationClazz = Class.forName(reservationClassName);
 216       if (ReservationSystem.class.isAssignableFrom(reservationClazz)) {
 217         return (ReservationSystem) ReflectionUtils.newInstance(
 218             reservationClazz, this.conf);
 219       } else {
 220         throw new YarnRuntimeException("Class: " + reservationClassName
 221             + " not instance of " + ReservationSystem.class.getCanonicalName());
 222       }
 223     } catch (ClassNotFoundException e) {
 224       throw new YarnRuntimeException(
 225           "Could not instantiate ReservationSystem: " + reservationClassName, e);
 226     }
 227   }
 228 
 229   protected ApplicationMasterLauncher createAMLauncher() {
 230     return new ApplicationMasterLauncher(this.rmContext);
 231   }
 232 
 233   private NMLivelinessMonitor createNMLivelinessMonitor() {
 234     return new NMLivelinessMonitor(this.rmContext
 235         .getDispatcher());
 236   }
 237 
 238   protected AMLivelinessMonitor createAMLivelinessMonitor() {
 239     return new AMLivelinessMonitor(this.rmDispatcher);
 240   }
 241   
 242   protected RMNodeLabelsManager createNodeLabelManager()
 243       throws InstantiationException, IllegalAccessException {
 244     return new RMNodeLabelsManager();
 245   }
 246   
 247   protected DelegationTokenRenewer createDelegationTokenRenewer() {
 248     return new DelegationTokenRenewer();
 249   }
 250 
 251   protected RMAppManager createRMAppManager() {
 252     return new RMAppManager(this.rmContext, this.scheduler, this.masterService,
 253       this.applicationACLsManager, this.conf);
 254   }
 255 
 256   protected RMApplicationHistoryWriter createRMApplicationHistoryWriter() {
 257     return new RMApplicationHistoryWriter();
 258   }
 259 
 260   protected SystemMetricsPublisher createSystemMetricsPublisher() {
 261     return new SystemMetricsPublisher(); 
 262   }
 263 
 264   // sanity check for configurations
 265   protected static void validateConfigs(Configuration conf) {
 266     // validate max-attempts
 267     int globalMaxAppAttempts =
 268         conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
 269         YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
 270     if (globalMaxAppAttempts <= 0) {
 271       throw new YarnRuntimeException("Invalid global max attempts configuration"
 272           + ", " + YarnConfiguration.RM_AM_MAX_ATTEMPTS
 273           + "=" + globalMaxAppAttempts + ", it should be a positive integer.");
 274     }
 275 
 276     // validate expireIntvl >= heartbeatIntvl
 277     long expireIntvl = conf.getLong(YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS,
 278         YarnConfiguration.DEFAULT_RM_NM_EXPIRY_INTERVAL_MS);
 279     long heartbeatIntvl =
 280         conf.getLong(YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS,
 281             YarnConfiguration.DEFAULT_RM_NM_HEARTBEAT_INTERVAL_MS);
 282     if (expireIntvl < heartbeatIntvl) {
 283       throw new YarnRuntimeException("Nodemanager expiry interval should be no"
 284           + " less than heartbeat interval, "
 285           + YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS + "=" + expireIntvl
 286           + ", " + YarnConfiguration.RM_NM_HEARTBEAT_INTERVAL_MS + "="
 287           + heartbeatIntvl);
 288     }
 289   }
 290 
 291   /**
 292    * RMActiveServices handles all the Active services in the RM.
 293    */
 294   @Private
 295   public class RMActiveServices extends CompositeService {
 296 
 297     private DelegationTokenRenewer delegationTokenRenewer;
 298     private EventHandler<SchedulerEvent> schedulerDispatcher;
 299     private ApplicationMasterLauncher applicationMasterLauncher;
 300     private ContainerAllocationExpirer containerAllocationExpirer;
 301     private ResourceManager rm;
 302     private boolean recoveryEnabled;
 303     private RMActiveServiceContext activeServiceContext;
 304 
 305     RMActiveServices(ResourceManager rm) {
 306       super("RMActiveServices");
 307       this.rm = rm;
 308     }
 309 
 310     @Override
 311     protected void serviceInit(Configuration configuration) throws Exception {
 312       activeServiceContext = new RMActiveServiceContext();
 313       rmContext.setActiveServiceContext(activeServiceContext);
 314 
 315       conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
 316       rmSecretManagerService = createRMSecretManagerService();
 317       addService(rmSecretManagerService);
 318 
 319       containerAllocationExpirer = new ContainerAllocationExpirer(rmDispatcher);
 320       addService(containerAllocationExpirer);
 321       rmContext.setContainerAllocationExpirer(containerAllocationExpirer);
 322 
 323       AMLivelinessMonitor amLivelinessMonitor = createAMLivelinessMonitor();
 324       addService(amLivelinessMonitor);
 325       rmContext.setAMLivelinessMonitor(amLivelinessMonitor);
 326 
 327       AMLivelinessMonitor amFinishingMonitor = createAMLivelinessMonitor();
 328       addService(amFinishingMonitor);
 329       rmContext.setAMFinishingMonitor(amFinishingMonitor);
 330       
 331       RMNodeLabelsManager nlm = createNodeLabelManager();
 332       nlm.setRMContext(rmContext);
 333       addService(nlm);
 334       rmContext.setNodeLabelManager(nlm);
 335 
 336       boolean isRecoveryEnabled = conf.getBoolean(
 337           YarnConfiguration.RECOVERY_ENABLED,
 338           YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED);
 339 
 340       RMStateStore rmStore = null;
 341       if (isRecoveryEnabled) {
 342         recoveryEnabled = true;
 343         rmStore = RMStateStoreFactory.getStore(conf);
 344         boolean isWorkPreservingRecoveryEnabled =
 345             conf.getBoolean(
 346               YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
 347               YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);
 348         rmContext
 349           .setWorkPreservingRecoveryEnabled(isWorkPreservingRecoveryEnabled);
 350       } else {
 351         recoveryEnabled = false;
 352         rmStore = new NullRMStateStore();
 353       }
 354 
 355       try {
 356         rmStore.init(conf);
 357         rmStore.setRMDispatcher(rmDispatcher);
 358         rmStore.setResourceManager(rm);
 359       } catch (Exception e) {
 360         // the Exception from stateStore.init() needs to be handled for
 361         // HA and we need to give up master status if we got fenced
 362         LOG.error("Failed to init state store", e);
 363         throw e;
 364       }
 365       rmContext.setStateStore(rmStore);
 366 
 367       if (UserGroupInformation.isSecurityEnabled()) {
 368         delegationTokenRenewer = createDelegationTokenRenewer();
 369         rmContext.setDelegationTokenRenewer(delegationTokenRenewer);
 370       }
 371 
 372       // Register event handler for NodesListManager
 373       nodesListManager = new NodesListManager(rmContext);
 374       rmDispatcher.register(NodesListManagerEventType.class, nodesListManager);
 375       addService(nodesListManager);
 376       rmContext.setNodesListManager(nodesListManager);
 377 
 378       // Initialize the scheduler
 379       scheduler = createScheduler();
 380       scheduler.setRMContext(rmContext);
 381       addIfService(scheduler);
 382       rmContext.setScheduler(scheduler);
 383 
 384       schedulerDispatcher = createSchedulerEventDispatcher();
 385       addIfService(schedulerDispatcher);
 386       rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);
 387 
 388       // Register event handler for RmAppEvents
 389       // Register the event handler for RMAppEvent events.
 390       // RMAppManager adds an RMAppEvent of enum type RMAppEventType.START to the async dispatcher, so that event is handled by ApplicationEventDispatcher(rmContext).
 391       rmDispatcher.register(RMAppEventType.class,
 392           new ApplicationEventDispatcher(rmContext));
 393 
 394       // Register event handler for RmAppAttemptEvents
 395       rmDispatcher.register(RMAppAttemptEventType.class,
 396           new ApplicationAttemptEventDispatcher(rmContext));
 397 
 398       // Register event handler for RmNodes
 399       rmDispatcher.register(
 400           RMNodeEventType.class, new NodeEventDispatcher(rmContext));
 401 
 402       nmLivelinessMonitor = createNMLivelinessMonitor();
 403       addService(nmLivelinessMonitor);
 404 
 405       resourceTracker = createResourceTrackerService();
 406       addService(resourceTracker);
 407       rmContext.setResourceTrackerService(resourceTracker);
 408 
 409       DefaultMetricsSystem.initialize("ResourceManager");
 410       JvmMetrics.initSingleton("ResourceManager", null);
 411 
 412       // Initialize the Reservation system
 413       if (conf.getBoolean(YarnConfiguration.RM_RESERVATION_SYSTEM_ENABLE,
 414           YarnConfiguration.DEFAULT_RM_RESERVATION_SYSTEM_ENABLE)) {
 415         reservationSystem = createReservationSystem();
 416         if (reservationSystem != null) {
 417           reservationSystem.setRMContext(rmContext);
 418           addIfService(reservationSystem);
 419           rmContext.setReservationSystem(reservationSystem);
 420           LOG.info("Initialized Reservation system");
 421         }
 422       }
 423 
 424       // creating monitors that handle preemption
 425       createPolicyMonitors();
 426 
 427       masterService = createApplicationMasterService();
 428       addService(masterService) ;
 429       rmContext.setApplicationMasterService(masterService);
 430 
 431       applicationACLsManager = new ApplicationACLsManager(conf);
 432 
 433       queueACLsManager = createQueueACLsManager(scheduler, conf);
 434 
 435       rmAppManager = createRMAppManager();
 436       // Register event handler for RMAppManagerEvents
 437       rmDispatcher.register(RMAppManagerEventType.class, rmAppManager);
 438 
 439       clientRM = createClientRMService();
 440       addService(clientRM);
 441       rmContext.setClientRMService(clientRM);
 442 
 443       applicationMasterLauncher = createAMLauncher();
 444       rmDispatcher.register(AMLauncherEventType.class,
 445           applicationMasterLauncher);
 446 
 447       addService(applicationMasterLauncher);
 448       if (UserGroupInformation.isSecurityEnabled()) {
 449         addService(delegationTokenRenewer);
 450         delegationTokenRenewer.setRMContext(rmContext);
 451       }
 452 
 453       new RMNMInfo(rmContext, scheduler);
 454 
 455       super.serviceInit(conf);
 456     }
 457 
 458     @Override
 459     protected void serviceStart() throws Exception {
 460       RMStateStore rmStore = rmContext.getStateStore();
 461       // The state store needs to start irrespective of recoveryEnabled as apps
 462       // need events to move to further states.
 463       rmStore.start();
 464 
 465       if(recoveryEnabled) {
 466         try {
 467           LOG.info("Recovery started");
 468           rmStore.checkVersion();
 469           if (rmContext.isWorkPreservingRecoveryEnabled()) {
 470             rmContext.setEpoch(rmStore.getAndIncrementEpoch());
 471           }
 472           RMState state = rmStore.loadState();
 473           recover(state);
 474           LOG.info("Recovery ended");
 475         } catch (Exception e) {
 476           // the Exception from loadState() needs to be handled for
 477           // HA and we need to give up master status if we got fenced
 478           LOG.error("Failed to load/recover state", e);
 479           throw e;
 480         }
 481       }
 482 
 483       super.serviceStart();
 484     }
 485 
 486     @Override
 487     protected void serviceStop() throws Exception {
 488 
 489       super.serviceStop();
 490       DefaultMetricsSystem.shutdown();
 491       if (rmContext != null) {
 492         RMStateStore store = rmContext.getStateStore();
 493         try {
 494           store.close();
 495         } catch (Exception e) {
 496           LOG.error("Error closing store.", e);
 497         }
 498       }
 499 
 500     }
 501 
 502     protected void createPolicyMonitors() {
 503       if (scheduler instanceof PreemptableResourceScheduler
 504           && conf.getBoolean(YarnConfiguration.RM_SCHEDULER_ENABLE_MONITORS,
 505           YarnConfiguration.DEFAULT_RM_SCHEDULER_ENABLE_MONITORS)) {
 506         LOG.info("Loading policy monitors");
 507         List<SchedulingEditPolicy> policies = conf.getInstances(
 508             YarnConfiguration.RM_SCHEDULER_MONITOR_POLICIES,
 509             SchedulingEditPolicy.class);
 510         if (policies.size() > 0) {
 511           for (SchedulingEditPolicy policy : policies) {
 512             LOG.info("LOADING SchedulingEditPolicy:" + policy.getPolicyName());
 513             // periodically check whether we need to take action to guarantee
 514             // constraints
 515             SchedulingMonitor mon = new SchedulingMonitor(rmContext, policy);
 516             addService(mon);
 517           }
 518         } else {
 519           LOG.warn("Policy monitors configured (" +
 520               YarnConfiguration.RM_SCHEDULER_ENABLE_MONITORS +
 521               ") but none specified (" +
 522               YarnConfiguration.RM_SCHEDULER_MONITOR_POLICIES + ")");
 523         }
 524       }
 525     }
 526   }
 527 
 528   @Private
 529   public static class SchedulerEventDispatcher extends AbstractService
 530       implements EventHandler<SchedulerEvent> {
 531 
 532     private final ResourceScheduler scheduler;
 533     private final BlockingQueue<SchedulerEvent> eventQueue =
 534       new LinkedBlockingQueue<SchedulerEvent>();
 535     private volatile int lastEventQueueSizeLogged = 0;
 536     private final Thread eventProcessor;
 537     private volatile boolean stopped = false;
 538     private boolean shouldExitOnError = false;
 539 
 540     public SchedulerEventDispatcher(ResourceScheduler scheduler) {
 541       super(SchedulerEventDispatcher.class.getName());
 542       this.scheduler = scheduler;
 543       this.eventProcessor = new Thread(new EventProcessor());
 544       this.eventProcessor.setName("ResourceManager Event Processor");
 545     }
 546 
 547     @Override
 548     protected void serviceInit(Configuration conf) throws Exception {
 549       this.shouldExitOnError =
 550           conf.getBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY,
 551             Dispatcher.DEFAULT_DISPATCHER_EXIT_ON_ERROR);
 552       super.serviceInit(conf);
 553     }
 554 
 555     @Override
 556     protected void serviceStart() throws Exception {
 557       this.eventProcessor.start();
 558       super.serviceStart();
 559     }
 560 
 561     private final class EventProcessor implements Runnable {
 562       @Override
 563       public void run() {
 564 
 565         SchedulerEvent event;
 566 
 567         while (!stopped && !Thread.currentThread().isInterrupted()) {
 568           try {
 569             event = eventQueue.take();
 570           } catch (InterruptedException e) {
 571             LOG.error("Returning, interrupted : " + e);
 572             return; // TODO: Kill RM.
 573           }
 574 
 575           try {
 576             scheduler.handle(event);
 577           } catch (Throwable t) {
 578             // An error occurred, but we are shutting down anyway.
 579             // If it was an InterruptedException, the very act of 
 580             // shutdown could have caused it and is probably harmless.
 581             if (stopped) {
 582               LOG.warn("Exception during shutdown: ", t);
 583               break;
 584             }
 585             LOG.fatal("Error in handling event type " + event.getType()
 586                 + " to the scheduler", t);
 587             if (shouldExitOnError
 588                 && !ShutdownHookManager.get().isShutdownInProgress()) {
 589               LOG.info("Exiting, bbye..");
 590               System.exit(-1);
 591             }
 592           }
 593         }
 594       }
 595     }
 596 
 597     @Override
 598     protected void serviceStop() throws Exception {
 599       this.stopped = true;
 600       this.eventProcessor.interrupt();
 601       try {
 602         this.eventProcessor.join();
 603       } catch (InterruptedException e) {
 604         throw new YarnRuntimeException(e);
 605       }
 606       super.serviceStop();
 607     }
 608 
 609     @Override
 610     public void handle(SchedulerEvent event) {
 611       try {
 612         int qSize = eventQueue.size();
 613         if (qSize != 0 && qSize % 1000 == 0
 614             && lastEventQueueSizeLogged != qSize) {
 615           lastEventQueueSizeLogged = qSize;
 616           LOG.info("Size of scheduler event-queue is " + qSize);
 617         }
 618         int remCapacity = eventQueue.remainingCapacity();
 619         if (remCapacity < 1000) {
 620           LOG.info("Very low remaining capacity on scheduler event queue: "
 621               + remCapacity);
 622         }
 623         this.eventQueue.put(event);
 624       } catch (InterruptedException e) {
 625         LOG.info("Interrupted. Trying to exit gracefully.");
 626       }
 627     }
 628   }
 629 
 630   @Private
 631   public static class RMFatalEventDispatcher
 632       implements EventHandler<RMFatalEvent> {
 633 
 634     @Override
 635     public void handle(RMFatalEvent event) {
 636       LOG.fatal("Received a " + RMFatalEvent.class.getName() + " of type " +
 637           event.getType().name() + ". Cause:\n" + event.getCause());
 638 
 639       ExitUtil.terminate(1, event.getCause());
 640     }
 641   }
 642 
 643   public void handleTransitionToStandBy() {
 644     if (rmContext.isHAEnabled()) {
 645       try {
 646         // Transition to standby and reinit active services
 647         LOG.info("Transitioning RM to Standby mode");
 648         transitionToStandby(true);
 649         adminService.resetLeaderElection();
 650         return;
 651       } catch (Exception e) {
 652         LOG.fatal("Failed to transition RM to Standby mode.");
 653         ExitUtil.terminate(1, e);
 654       }
 655     }
 656   }
 657 
 658   @Private
 659   public static final class ApplicationEventDispatcher implements
 660       EventHandler<RMAppEvent> {
 661 
 662     private final RMContext rmContext;
 663 
 664     public ApplicationEventDispatcher(RMContext rmContext) {
 665       this.rmContext = rmContext;
 666     }
 667 
 668     @Override
 669     public void handle(RMAppEvent event) {
 670       ApplicationId appID = event.getApplicationId();
 671       RMApp rmApp = this.rmContext.getRMApps().get(appID);
 672       if (rmApp != null) {
 673         try {
 674           // Hand the event to the RMApp, which processes it through its own state machine
 675           rmApp.handle(event);
 676         } catch (Throwable t) {
 677           LOG.error("Error in handling event type " + event.getType()
 678               + " for application " + appID, t);
 679         }
 680       }
 681     }
 682   }
 683 
 684   @Private
 685   public static final class ApplicationAttemptEventDispatcher implements
 686       EventHandler<RMAppAttemptEvent> {
 687 
 688     private final RMContext rmContext;
 689 
 690     public ApplicationAttemptEventDispatcher(RMContext rmContext) {
 691       this.rmContext = rmContext;
 692     }
 693 
 694     @Override
 695     public void handle(RMAppAttemptEvent event) {
 696       ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
 697       ApplicationId appAttemptId = appAttemptID.getApplicationId();
 698       RMApp rmApp = this.rmContext.getRMApps().get(appAttemptId);
 699       if (rmApp != null) {
 700         RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptID);
 701         if (rmAppAttempt != null) {
 702           try {
 703             rmAppAttempt.handle(event);
 704           } catch (Throwable t) {
 705             LOG.error("Error in handling event type " + event.getType()
 706                 + " for applicationAttempt " + appAttemptId, t);
 707           }
 708         }
 709       }
 710     }
 711   }
 712 
 713   @Private
 714   public static final class NodeEventDispatcher implements
 715       EventHandler<RMNodeEvent> {
 716 
 717     private final RMContext rmContext;
 718 
 719     public NodeEventDispatcher(RMContext rmContext) {
 720       this.rmContext = rmContext;
 721     }
 722 
 723     @Override
 724     public void handle(RMNodeEvent event) {
 725       NodeId nodeId = event.getNodeId();
 726       RMNode node = this.rmContext.getRMNodes().get(nodeId);
 727       if (node != null) {
 728         try {
 729           ((EventHandler<RMNodeEvent>) node).handle(event);
 730         } catch (Throwable t) {
 731           LOG.error("Error in handling event type " + event.getType()
 732               + " for node " + nodeId, t);
 733         }
 734       }
 735     }
 736   }
 737   
 738   protected void startWepApp() {
 739 
 740     // Use the customized yarn filter instead of the standard kerberos filter to
 741     // allow users to authenticate using delegation tokens
 742     // 4 conditions need to be satisfied -
 743     // 1. security is enabled
 744     // 2. http auth type is set to kerberos
 745     // 3. "yarn.resourcemanager.webapp.use-yarn-filter" override is set to true
 746     // 4. hadoop.http.filter.initializers contains AuthenticationFilterInitializer
 747 
 748     Configuration conf = getConfig();
 749     boolean enableCorsFilter =
 750         conf.getBoolean(YarnConfiguration.RM_WEBAPP_ENABLE_CORS_FILTER,
 751             YarnConfiguration.DEFAULT_RM_WEBAPP_ENABLE_CORS_FILTER);
 752     boolean useYarnAuthenticationFilter =
 753         conf.getBoolean(
 754           YarnConfiguration.RM_WEBAPP_DELEGATION_TOKEN_AUTH_FILTER,
 755           YarnConfiguration.DEFAULT_RM_WEBAPP_DELEGATION_TOKEN_AUTH_FILTER);
 756     String authPrefix = "hadoop.http.authentication.";
 757     String authTypeKey = authPrefix + "type";
 758     String filterInitializerConfKey = "hadoop.http.filter.initializers";
 759     String actualInitializers = "";
 760     Class<?>[] initializersClasses =
 761         conf.getClasses(filterInitializerConfKey);
 762 
 763     // setup CORS
 764     if (enableCorsFilter) {
 765       conf.setBoolean(HttpCrossOriginFilterInitializer.PREFIX
 766           + HttpCrossOriginFilterInitializer.ENABLED_SUFFIX, true);
 767     }
 768 
 769     boolean hasHadoopAuthFilterInitializer = false;
 770     boolean hasRMAuthFilterInitializer = false;
 771     if (initializersClasses != null) {
 772       for (Class<?> initializer : initializersClasses) {
 773         if (initializer.getName().equals(
 774           AuthenticationFilterInitializer.class.getName())) {
 775           hasHadoopAuthFilterInitializer = true;
 776         }
 777         if (initializer.getName().equals(
 778           RMAuthenticationFilterInitializer.class.getName())) {
 779           hasRMAuthFilterInitializer = true;
 780         }
 781       }
 782       if (UserGroupInformation.isSecurityEnabled()
 783           && useYarnAuthenticationFilter
 784           && hasHadoopAuthFilterInitializer
 785           && conf.get(authTypeKey, "").equals(
 786             KerberosAuthenticationHandler.TYPE)) {
 787         ArrayList<String> target = new ArrayList<String>();
 788         for (Class<?> filterInitializer : initializersClasses) {
 789           if (filterInitializer.getName().equals(
 790             AuthenticationFilterInitializer.class.getName())) {
 791             if (hasRMAuthFilterInitializer == false) {
 792               target.add(RMAuthenticationFilterInitializer.class.getName());
 793             }
 794             continue;
 795           }
 796           target.add(filterInitializer.getName());
 797         }
 798         actualInitializers = StringUtils.join(",", target);
 799 
 800         LOG.info("Using RM authentication filter(kerberos/delegation-token)"
 801             + " for RM webapp authentication");
 802         RMAuthenticationFilter
 803           .setDelegationTokenSecretManager(getClientRMService().rmDTSecretManager);
 804         conf.set(filterInitializerConfKey, actualInitializers);
 805       }
 806     }
 807 
 808     // if security is not enabled and the default filter initializer has not 
 809     // been set, set the initializer to include the
 810     // RMAuthenticationFilterInitializer which in turn will set up the simple
 811     // auth filter.
 812 
 813     String initializers = conf.get(filterInitializerConfKey);
 814     if (!UserGroupInformation.isSecurityEnabled()) {
 815       if (initializersClasses == null || initializersClasses.length == 0) {
 816         conf.set(filterInitializerConfKey,
 817           RMAuthenticationFilterInitializer.class.getName());
 818         conf.set(authTypeKey, "simple");
 819       } else if (initializers.equals(StaticUserWebFilter.class.getName())) {
 820         conf.set(filterInitializerConfKey,
 821           RMAuthenticationFilterInitializer.class.getName() + ","
 822               + initializers);
 823         conf.set(authTypeKey, "simple");
 824       }
 825     }
 826 
 827     Builder<ApplicationMasterService> builder = 
 828         WebApps
 829             .$for("cluster", ApplicationMasterService.class, masterService,
 830                 "ws")
 831             .with(conf)
 832             .withHttpSpnegoPrincipalKey(
 833                 YarnConfiguration.RM_WEBAPP_SPNEGO_USER_NAME_KEY)
 834             .withHttpSpnegoKeytabKey(
 835                 YarnConfiguration.RM_WEBAPP_SPNEGO_KEYTAB_FILE_KEY)
 836             .at(webAppAddress);
 837     String proxyHostAndPort = WebAppUtils.getProxyHostAndPort(conf);
 838     if(WebAppUtils.getResolvedRMWebAppURLWithoutScheme(conf).
 839         equals(proxyHostAndPort)) {
 840       if (HAUtil.isHAEnabled(conf)) {
 841         fetcher = new AppReportFetcher(conf);
 842       } else {
 843         fetcher = new AppReportFetcher(conf, getClientRMService());
 844       }
 845       builder.withServlet(ProxyUriUtils.PROXY_SERVLET_NAME,
 846           ProxyUriUtils.PROXY_PATH_SPEC, WebAppProxyServlet.class);
 847       builder.withAttribute(WebAppProxy.FETCHER_ATTRIBUTE, fetcher);
 848       String[] proxyParts = proxyHostAndPort.split(":");
 849       builder.withAttribute(WebAppProxy.PROXY_HOST_ATTRIBUTE, proxyParts[0]);
 850 
 851     }
 852     webApp = builder.start(new RMWebApp(this));
 853   }
 854 
 855   /**
 856    * Helper method to create and init {@link #activeServices}. This creates an
 857    * instance of {@link RMActiveServices} and initializes it.
 858    * @throws Exception
 859    */
 860   protected void createAndInitActiveServices() throws Exception {
 861     activeServices = new RMActiveServices(this);
 862     // This ultimately calls the serviceInit function of the RMActiveServices class
 863     activeServices.init(conf);
 864   }
 865 
 866   /**
 867    * Helper method to start {@link #activeServices}.
 868    * @throws Exception
 869    */
 870   void startActiveServices() throws Exception {
 871     if (activeServices != null) {
 872       clusterTimeStamp = System.currentTimeMillis();
 873       activeServices.start();
 874     }
 875   }
 876 
 877   /**
 878    * Helper method to stop {@link #activeServices}.
 879    * @throws Exception
 880    */
 881   void stopActiveServices() throws Exception {
 882     if (activeServices != null) {
 883       activeServices.stop();
 884       activeServices = null;
 885     }
 886   }
 887 
 888   void reinitialize(boolean initialize) throws Exception {
 889     ClusterMetrics.destroy();
 890     QueueMetrics.clearQueueMetrics();
 891     if (initialize) {
 892       resetDispatcher();
 893       createAndInitActiveServices();
 894     }
 895   }
 896 
 897   @VisibleForTesting
 898   protected boolean areActiveServicesRunning() {
 899     return activeServices != null && activeServices.isInState(STATE.STARTED);
 900   }
 901 
 902   synchronized void transitionToActive() throws Exception {
 903     if (rmContext.getHAServiceState() == HAServiceProtocol.HAServiceState.ACTIVE) {
 904       LOG.info("Already in active state");
 905       return;
 906     }
 907 
 908     LOG.info("Transitioning to active state");
 909 
 910     this.rmLoginUGI.doAs(new PrivilegedExceptionAction<Void>() {
 911       @Override
 912       public Void run() throws Exception {
 913         try {
 914           startActiveServices();
 915           return null;
 916         } catch (Exception e) {
 917           reinitialize(true);
 918           throw e;
 919         }
 920       }
 921     });
 922 
 923     rmContext.setHAServiceState(HAServiceProtocol.HAServiceState.ACTIVE);
 924     LOG.info("Transitioned to active state");
 925   }
 926 
 927   synchronized void transitionToStandby(boolean initialize)
 928       throws Exception {
 929     if (rmContext.getHAServiceState() ==
 930         HAServiceProtocol.HAServiceState.STANDBY) {
 931       LOG.info("Already in standby state");
 932       return;
 933     }
 934 
 935     LOG.info("Transitioning to standby state");
 936     HAServiceState state = rmContext.getHAServiceState();
 937     rmContext.setHAServiceState(HAServiceProtocol.HAServiceState.STANDBY);
 938     if (state == HAServiceProtocol.HAServiceState.ACTIVE) {
 939       stopActiveServices();
 940       reinitialize(initialize);
 941     }
 942     LOG.info("Transitioned to standby state");
 943   }
 944 
 945   @Override
 946   protected void serviceStart() throws Exception {
 947     if (this.rmContext.isHAEnabled()) {
 948       transitionToStandby(true);
 949     } else {
 950       transitionToActive();
 951     }
 952 
 953     startWepApp();
 954     if (getConfig().getBoolean(YarnConfiguration.IS_MINI_YARN_CLUSTER,
 955         false)) {
 956       int port = webApp.port();
 957       WebAppUtils.setRMWebAppPort(conf, port);
 958     }
 959     super.serviceStart();
 960   }
 961   
 962   protected void doSecureLogin() throws IOException {
 963     InetSocketAddress socAddr = getBindAddress(conf);
 964     SecurityUtil.login(this.conf, YarnConfiguration.RM_KEYTAB,
 965         YarnConfiguration.RM_PRINCIPAL, socAddr.getHostName());
 966 
 967     // if security is enable, set rmLoginUGI as UGI of loginUser
 968     if (UserGroupInformation.isSecurityEnabled()) {
 969       this.rmLoginUGI = UserGroupInformation.getLoginUser();
 970     }
 971   }
 972 
 973   @Override
 974   protected void serviceStop() throws Exception {
 975     if (webApp != null) {
 976       webApp.stop();
 977     }
 978     if (fetcher != null) {
 979       fetcher.stop();
 980     }
 981     if (configurationProvider != null) {
 982       configurationProvider.close();
 983     }
 984     super.serviceStop();
 985     transitionToStandby(false);
 986     rmContext.setHAServiceState(HAServiceState.STOPPING);
 987   }
 988   
 989   protected ResourceTrackerService createResourceTrackerService() {
 990     return new ResourceTrackerService(this.rmContext, this.nodesListManager,
 991         this.nmLivelinessMonitor,
 992         this.rmContext.getContainerTokenSecretManager(),
 993         this.rmContext.getNMTokenSecretManager());
 994   }
 995 
 996   protected ClientRMService createClientRMService() {
 997     return new ClientRMService(this.rmContext, scheduler, this.rmAppManager,
 998         this.applicationACLsManager, this.queueACLsManager,
 999         this.rmContext.getRMDelegationTokenSecretManager());
1000   }
1001 
1002   protected ApplicationMasterService createApplicationMasterService() {
1003     return new ApplicationMasterService(this.rmContext, scheduler);
1004   }
1005 
1006   protected AdminService createAdminService() {
1007     return new AdminService(this, rmContext);
1008   }
1009 
1010   protected RMSecretManagerService createRMSecretManagerService() {
1011     return new RMSecretManagerService(conf, rmContext);
1012   }
1013 
1014   @Private
1015   public ClientRMService getClientRMService() {
1016     return this.clientRM;
1017   }
1018   
1019   /**
1020    * return the scheduler.
1021    * @return the scheduler for the Resource Manager.
1022    */
1023   @Private
1024   public ResourceScheduler getResourceScheduler() {
1025     return this.scheduler;
1026   }
1027 
1028   /**
1029    * return the resource tracking component.
1030    * @return the resource tracking component.
1031    */
1032   @Private
1033   public ResourceTrackerService getResourceTrackerService() {
1034     return this.resourceTracker;
1035   }
1036 
1037   @Private
1038   public ApplicationMasterService getApplicationMasterService() {
1039     return this.masterService;
1040   }
1041 
1042   @Private
1043   public ApplicationACLsManager getApplicationACLsManager() {
1044     return this.applicationACLsManager;
1045   }
1046 
1047   @Private
1048   public QueueACLsManager getQueueACLsManager() {
1049     return this.queueACLsManager;
1050   }
1051 
1052   @Private
1053   WebApp getWebapp() {
1054     return this.webApp;
1055   }
1056 
1057   @Override
1058   public void recover(RMState state) throws Exception {
1059     // recover RMdelegationTokenSecretManager
1060     rmContext.getRMDelegationTokenSecretManager().recover(state);
1061 
1062     // recover AMRMTokenSecretManager
1063     rmContext.getAMRMTokenSecretManager().recover(state);
1064 
1065     // recover applications
1066     rmAppManager.recover(state);
1067 
1068     setSchedulerRecoveryStartAndWaitTime(state, conf);
1069   }
1070   
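1071   /* In main, the focus is on service initialization and service startup. The RM is a composite service with the inheritance chain
1072    * ResourceManager -> CompositeService -> AbstractService, so initializing the RM first enters the parent class's init function.
1073    * AbstractService factors out the basic service operations such as start, stop, and close; a service only needs to override serviceInit, serviceStart, serviceStop, and so on to control its own behavior, which amounts to unified management of all services.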
1074    */
1075   public static void main(String argv[]) {
1076     // Handler class for uncaught exceptions
1077     Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler());
1078     StringUtils.startupShutdownMessage(ResourceManager.class, argv, LOG);
1079     try {
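1080       // Load the configuration
1081       Configuration conf = new YarnConfiguration();
1082       // Parse the generic command-line options (the RM object itself is created below: empty, containing no services, and not yet started)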
1083       GenericOptionsParser hParser = new GenericOptionsParser(conf, argv);
1084       argv = hParser.getRemainingArgs();
1085       // If -format-state-store, then delete RMStateStore; else startup normally
1086       if (argv.length == 1 && argv[0].equals("-format-state-store")) {
1087         deleteRMStateStore(conf);
1088       } else {
1089         ResourceManager resourceManager = new ResourceManager();
1090         // Add a shutdown hook
1091         ShutdownHookManager.get().addShutdownHook(
1092           new CompositeServiceShutdownHook(resourceManager),
1093           SHUTDOWN_HOOK_PRIORITY);
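1094         // Initialize the service: this calls the init function of the parent class AbstractService, which internally calls serviceInit; the implementation actually invoked is ResourceManager's serviceInit
1095         resourceManager.init(conf);
1096         // Start the RM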
1097         resourceManager.start();
1098       }
1099     } catch (Throwable t) {
1100       LOG.fatal("Error starting ResourceManager", t);
1101       System.exit(-1);
1102     }
1103   }
1104 
1105   /**
1106    * Register the handlers for alwaysOn services
1107    */
1108   private Dispatcher setupDispatcher() {
1109     Dispatcher dispatcher = createDispatcher();
1110     dispatcher.register(RMFatalEventType.class,
1111         new ResourceManager.RMFatalEventDispatcher());
1112     return dispatcher;
1113   }
1114 
1115   private void resetDispatcher() {
1116     Dispatcher dispatcher = setupDispatcher();
1117     ((Service)dispatcher).init(this.conf);
1118     ((Service)dispatcher).start();
1119     removeService((Service)rmDispatcher);
1120     // Need to stop previous rmDispatcher before assigning new dispatcher
1121     // otherwise causes "AsyncDispatcher event handler" thread leak
1122     ((Service) rmDispatcher).stop();
1123     rmDispatcher = dispatcher;
1124     addIfService(rmDispatcher);
1125     rmContext.setDispatcher(rmDispatcher);
1126   }
1127 
1128   private void setSchedulerRecoveryStartAndWaitTime(RMState state,
1129       Configuration conf) {
1130     if (!state.getApplicationState().isEmpty()) {
1131       long waitTime =
1132           conf.getLong(YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_SCHEDULING_WAIT_MS,
1133             YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_SCHEDULING_WAIT_MS);
1134       rmContext.setSchedulerRecoveryStartAndWaitTime(waitTime);
1135     }
1136   }
1137 
1138   /**
1139    * Retrieve RM bind address from configuration
1140    * 
1141    * @param conf
1142    * @return InetSocketAddress
1143    */
1144   public static InetSocketAddress getBindAddress(Configuration conf) {
1145     return conf.getSocketAddr(YarnConfiguration.RM_ADDRESS,
1146       YarnConfiguration.DEFAULT_RM_ADDRESS, YarnConfiguration.DEFAULT_RM_PORT);
1147   }
1148 
1149   /**
1150    * Deletes the RMStateStore
1151    *
1152    * @param conf
1153    * @throws Exception
1154    */
1155   private static void deleteRMStateStore(Configuration conf) throws Exception {
1156     RMStateStore rmStore = RMStateStoreFactory.getStore(conf);
1157     rmStore.init(conf);
1158     rmStore.start();
1159     try {
1160       LOG.info("Deleting ResourceManager state store...");
1161       rmStore.deleteStore();
1162       LOG.info("State store deleted");
1163     } finally {
1164       rmStore.stop();
1165     }
1166   }
1167 }
ResourceManager.java
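
The lifecycle described in the comment above main() (a public init()/start() in the base class that delegates to overridable serviceInit()/serviceStart() hooks) can be sketched in a few lines. This is only a minimal illustration and not Hadoop's code: SketchService, SketchActiveServices and SketchResourceManager are hypothetical stand-ins for AbstractService, RMActiveServices and ResourceManager, and state checks, configuration and the CompositeService child list are left out.

```java
import java.util.ArrayList;
import java.util.List;

class ServiceLifecycleSketch {
  // Records the order in which the hooks run, so the call chain is visible.
  static final List<String> CALLS = new ArrayList<>();

  /** Hypothetical, stripped-down stand-in for AbstractService. */
  static abstract class SketchService {
    // Public entry points: callers only ever use init() and start().
    final void init() { serviceInit(); }
    final void start() { serviceStart(); }

    // Hooks that concrete services override.
    protected void serviceInit() {}
    protected void serviceStart() {}
  }

  /** Hypothetical stand-in for the inner class RMActiveServices. */
  static class SketchActiveServices extends SketchService {
    @Override protected void serviceInit() { CALLS.add("ActiveServices.serviceInit"); }
    @Override protected void serviceStart() { CALLS.add("ActiveServices.serviceStart"); }
  }

  /** Hypothetical stand-in for ResourceManager. */
  static class SketchResourceManager extends SketchService {
    private SketchActiveServices activeServices;

    @Override protected void serviceInit() {
      CALLS.add("ResourceManager.serviceInit");
      // Mirrors the idea of createAndInitActiveServices(): create, then init.
      activeServices = new SketchActiveServices();
      activeServices.init();
    }

    @Override protected void serviceStart() {
      CALLS.add("ResourceManager.serviceStart");
      // Assumption: a simplification of how the active services get started.
      activeServices.start();
    }
  }

  static List<String> run() {
    CALLS.clear();
    SketchResourceManager rm = new SketchResourceManager();
    rm.init();   // same call order as in main(): init first ...
    rm.start();  // ... then start
    return new ArrayList<>(CALLS);
  }

  public static void main(String[] args) {
    for (String call : run()) {
      System.out.println(call);
    }
  }
}
```

The point of the pattern is that the caller never invokes serviceInit() directly; it calls init() on the base type, and dynamic dispatch selects the subclass hook.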

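  We can start the analysis from main(). main() calls resourceManager.init(conf) to initialize the services. This enters init() of the parent class AbstractService, which internally calls serviceInit(); the method that actually runs is ResourceManager's serviceInit(). ResourceManager.serviceInit() in turn calls createAndInitActiveServices(), which creates and initializes RMActiveServices, an inner class of ResourceManager. That function calls activeServices.init(conf), so the call finally ends up in serviceInit() of RMActiveServices.

  RMActiveServices.serviceInit() calls rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(rmContext)), which registers the event handler for RMAppEvent events. This matches the RMAppEventType.START seen earlier in RMAppManager: RMAppManager adds an RMAppEvent whose type enum value is RMAppEventType.START to the asynchronous dispatcher, so that event is handled by ApplicationEventDispatcher(rmContext).

  ApplicationEventDispatcher is an inner class of ResourceManager. Its handle() method calls rmApp.handle(event), where rmApp is declared as the RMApp interface and is here an instance of the implementation class RMAppImpl, so the call lands in RMAppImpl.handle(). That method calls this.stateMachine.doTransition(event.getType(), event). The constructor of RMAppImpl contains this.stateMachine = stateMachineFactory.make(this): stateMachine is the state machine created by the factory stateMachineFactory, whose core method is addTransition() and to which many event transitions are bound.

  Each state change has its own handler. The submit step should correspond to RMAppEventType.START: RMAppImpl contains .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING, RMAppEventType.START, new RMAppNewlySavingTransition()), which means that when an event of type RMAppEventType.START is received while the RMApp is in state NEW, the state changes from NEW to NEW_SAVING and the callback class RMAppNewlySavingTransition is invoked.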
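
The registration step above (a handler is registered under the enum class of the event type, and an incoming event is routed by the declaring class of its type) can be sketched as follows. This is a minimal, synchronous illustration under stated assumptions: SimpleDispatcher, AppEvent and AppEventType are hypothetical names, and unlike the asynchronous dispatcher used by the RM there is no event queue and no worker thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DispatcherSketch {
  enum AppEventType { START, KILL }

  interface Event { Enum<?> getType(); }
  interface EventHandler { void handle(Event event); }

  /** Hypothetical stand-in for RMAppEvent. */
  static class AppEvent implements Event {
    final String appId;
    final AppEventType type;

    AppEvent(String appId, AppEventType type) {
      this.appId = appId;
      this.type = type;
    }

    @Override public Enum<?> getType() { return type; }
  }

  /** Hypothetical synchronous dispatcher: one handler per event-type enum class. */
  static class SimpleDispatcher {
    private final Map<Class<?>, EventHandler> handlers = new HashMap<>();

    void register(Class<? extends Enum<?>> eventTypeClass, EventHandler handler) {
      handlers.put(eventTypeClass, handler);
    }

    void dispatch(Event event) {
      // The enum class of the event type is the routing key.
      Class<?> key = event.getType().getDeclaringClass();
      EventHandler handler = handlers.get(key);
      if (handler == null) {
        throw new IllegalStateException("No handler registered for " + key);
      }
      handler.handle(event);
    }
  }

  static List<String> run() {
    List<String> handled = new ArrayList<>();
    SimpleDispatcher dispatcher = new SimpleDispatcher();
    // Plays the role of register(RMAppEventType.class, new ApplicationEventDispatcher(...)).
    dispatcher.register(AppEventType.class,
        event -> handled.add(((AppEvent) event).appId + " handled " + event.getType()));
    // Plays the role of RMAppManager submitting a START event.
    dispatcher.dispatch(new AppEvent("application_0001", AppEventType.START));
    return handled;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Keying the table by the enum class rather than by individual enum values means a single handler receives every event of that family (START, KILL, ...) and decides internally what to do with each.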

  addTransition() is a method of the StateMachineFactory class. Inside addTransition(), the second parameter postState is passed to a newly constructed instance of the inner class SingleInternalArc.

  1 /**
  2  * State machine topology.
  3  * This object is semantically immutable.  If you have a
  4  * StateMachineFactory there's no operation in the API that changes
  5  * its semantic properties.
  6  *
  7  * @param <OPERAND> The object type on which this state machine operates.
  8  * @param <STATE> The state of the entity.
  9  * @param <EVENTTYPE> The external eventType to be handled.
 10  * @param <EVENT> The event object.
 11  *
 12  */
 13 @Public
 14 @Evolving
 15 final public class StateMachineFactory
 16              <OPERAND, STATE extends Enum<STATE>,
 17               EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
 18 
 19   private final TransitionsListNode transitionsListNode;
 20 
 21   private Map<STATE, Map<EVENTTYPE,
 22     Transition<OPERAND, STATE, EVENTTYPE, EVENT>>> stateMachineTable;
 23 
 24   private STATE defaultInitialState;
 25 
 26   private final boolean optimized;
 27 
 28   /**
 29    * Constructor
 30    *
 31    * This is the only constructor in the API.
 32    *
 33    */
 34   public StateMachineFactory(STATE defaultInitialState) {
 35     this.transitionsListNode = null;
 36     this.defaultInitialState = defaultInitialState;
 37     this.optimized = false;
 38     this.stateMachineTable = null;
 39   }
 40   
 41   private StateMachineFactory
 42       (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
 43        ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> t) {
 44     this.defaultInitialState = that.defaultInitialState;
 45     this.transitionsListNode 
 46         = new TransitionsListNode(t, that.transitionsListNode);
 47     this.optimized = false;
 48     this.stateMachineTable = null;
 49   }
 50 
 51   private StateMachineFactory
 52       (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
 53        boolean optimized) {
 54     this.defaultInitialState = that.defaultInitialState;
 55     this.transitionsListNode = that.transitionsListNode;
 56     this.optimized = optimized;
 57     if (optimized) {
 58       makeStateMachineTable();
 59     } else {
 60       stateMachineTable = null;
 61     }
 62   }
 63 
 64   private interface ApplicableTransition
 65              <OPERAND, STATE extends Enum<STATE>,
 66               EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
 67     void apply(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject);
 68   }
 69 
 70   private class TransitionsListNode {
 71     final ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
 72     final TransitionsListNode next;
 73 
 74     TransitionsListNode
 75         (ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition,
 76         TransitionsListNode next) {
 77       this.transition = transition;
 78       this.next = next;
 79     }
 80   }
 81 
 82   static private class ApplicableSingleOrMultipleTransition
 83              <OPERAND, STATE extends Enum<STATE>,
 84               EVENTTYPE extends Enum<EVENTTYPE>, EVENT>
 85           implements ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> {
 86     final STATE preState;
 87     final EVENTTYPE eventType;
 88     final Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition;
 89 
 90     ApplicableSingleOrMultipleTransition
 91         (STATE preState, EVENTTYPE eventType,
 92          Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition) {
 93       this.preState = preState;
 94       this.eventType = eventType;
 95       this.transition = transition;
 96     }
 97 
 98     @Override
 99     public void apply
100              (StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject) {
101       Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
102         = subject.stateMachineTable.get(preState);
103       if (transitionMap == null) {
104         // I use HashMap here because I would expect most EVENTTYPE's to not
105         //  apply out of a particular state, so FSM sizes would be 
106         //  quadratic if I use EnumMap's here as I do at the top level.
107         transitionMap = new HashMap<EVENTTYPE,
108           Transition<OPERAND, STATE, EVENTTYPE, EVENT>>();
109         subject.stateMachineTable.put(preState, transitionMap);
110       }
111       transitionMap.put(eventType, transition);
112     }
113   }
114 
115   /**
116    * @return a NEW StateMachineFactory just like {@code this} with the current
117    *          transition added as a new legal transition.  This overload
118    *          has no hook object.
119    *
120    *         Note that the returned StateMachineFactory is a distinct
121    *         object.
122    *
123    *         This method is part of the API.
124    *
125    * @param preState pre-transition state
126    * @param postState post-transition state
127    * @param eventType stimulus for the transition
128    */
129   public StateMachineFactory
130              <OPERAND, STATE, EVENTTYPE, EVENT>
131           addTransition(STATE preState, STATE postState, EVENTTYPE eventType) {
132     return addTransition(preState, postState, eventType, null);
133   }
134 
135   /**
136    * @return a NEW StateMachineFactory just like {@code this} with the current
137    *          transition added as a new legal transition.  This overload
138    *          has no hook object.
139    *
140    *
141    *         Note that the returned StateMachineFactory is a distinct
142    *         object.
143    *
144    *         This method is part of the API.
145    *
146    * @param preState pre-transition state
147    * @param postState post-transition state
148    * @param eventTypes List of stimuli for the transitions
149    */
150   public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
151       STATE preState, STATE postState, Set<EVENTTYPE> eventTypes) {
152     return addTransition(preState, postState, eventTypes, null);
153   }
154 
155   /**
156    * @return a NEW StateMachineFactory just like {@code this} with the current
157    *          transition added as a new legal transition
158    *
159    *         Note that the returned StateMachineFactory is a distinct
160    *         object.
161    *
162    *         This method is part of the API.
163    *
164    * @param preState pre-transition state
165    * @param postState post-transition state
166    * @param eventTypes List of stimuli for the transitions
167    * @param hook transition hook
168    */
169   public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
170       STATE preState, STATE postState, Set<EVENTTYPE> eventTypes,
171       SingleArcTransition<OPERAND, EVENT> hook) {
172     StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> factory = null;
173     for (EVENTTYPE event : eventTypes) {
174       if (factory == null) {
175         factory = addTransition(preState, postState, event, hook);
176       } else {
177         factory = factory.addTransition(preState, postState, event, hook);
178       }
179     }
180     return factory;
181   }
182 
183   /**
184    * @return a NEW StateMachineFactory just like {@code this} with the current
185    *          transition added as a new legal transition
186    *
187    *         Note that the returned StateMachineFactory is a distinct object.
188    *
189    *         This method is part of the API.
190    *
191    * @param preState pre-transition state
192    * @param postState post-transition state
193    * @param eventType stimulus for the transition
194    * @param hook transition hook
195    */
196   //
197   public StateMachineFactory
198              <OPERAND, STATE, EVENTTYPE, EVENT>
199           addTransition(STATE preState, STATE postState,
200                         EVENTTYPE eventType,
201                         SingleArcTransition<OPERAND, EVENT> hook){
202     return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
203         (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
204            (preState, eventType, new SingleInternalArc(postState, hook)));
205   }
206 
207   /**
208    * @return a NEW StateMachineFactory just like {@code this} with the current
209    *          transition added as a new legal transition
210    *
211    *         Note that the returned StateMachineFactory is a distinct object.
212    *
213    *         This method is part of the API.
214    *
215    * @param preState pre-transition state
216    * @param postStates valid post-transition states
217    * @param eventType stimulus for the transition
218    * @param hook transition hook
219    */
220   public StateMachineFactory
221              <OPERAND, STATE, EVENTTYPE, EVENT>
222           addTransition(STATE preState, Set<STATE> postStates,
223                         EVENTTYPE eventType,
224                         MultipleArcTransition<OPERAND, EVENT, STATE> hook){
225     return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
226         (this,
227          new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
228            (preState, eventType, new MultipleInternalArc(postStates, hook)));
229   }
230 
231   /**
232    * @return a StateMachineFactory just like {@code this}, except that if
233    *         you won't need any synchronization to build a state machine
234    *
235    *         Note that the returned StateMachineFactory is a distinct object.
236    *
237    *         This method is part of the API.
238    *
239    *         The only way you could distinguish the returned
240    *         StateMachineFactory from {@code this} would be by
241    *         measuring the performance of the derived 
242    *         {@code StateMachine} you can get from it.
243    *
244    * Calling this is optional.  It doesn't change the semantics of the factory,
245    *   if you call it then when you use the factory there is no synchronization.
246    */
247   public StateMachineFactory
248              <OPERAND, STATE, EVENTTYPE, EVENT>
249           installTopology() {
250     return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(this, true);
251   }
252 
253   /**
254    * Effect a transition due to the effecting stimulus.
255    * @param state current state
256    * @param eventType trigger to initiate the transition
257    * @param cause causal eventType context
258    * @return transitioned state
259    */
260   private STATE doTransition
261            (OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
262       throws InvalidStateTransitonException {
263     // We can assume that stateMachineTable is non-null because we call
264     //  maybeMakeStateMachineTable() when we build an InnerStateMachine ,
265     //  and this code only gets called from inside a working InnerStateMachine .
266     Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap
267       = stateMachineTable.get(oldState);
268     if (transitionMap != null) {
269       Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition
270           = transitionMap.get(eventType);
271       if (transition != null) {
272         return transition.doTransition(operand, oldState, event, eventType);
273       }
274     }
275     throw new InvalidStateTransitonException(oldState, eventType);
276   }
277 
278   private synchronized void maybeMakeStateMachineTable() {
279     if (stateMachineTable == null) {
280       makeStateMachineTable();
281     }
282   }
283 
284   private void makeStateMachineTable() {
285     Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>> stack =
286       new Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>>();
287 
288     Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>
289       prototype = new HashMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>();
290 
291     prototype.put(defaultInitialState, null);
292 
293     // I use EnumMap here because it'll be faster and denser.  I would
294     //  expect most of the states to have at least one transition.
295     stateMachineTable
296        = new EnumMap<STATE, Map<EVENTTYPE,
297                            Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>(prototype);
298 
299     for (TransitionsListNode cursor = transitionsListNode;
300          cursor != null;
301          cursor = cursor.next) {
302       stack.push(cursor.transition);
303     }
304 
305     while (!stack.isEmpty()) {
306       stack.pop().apply(this);
307     }
308   }
309 
310   private interface Transition<OPERAND, STATE extends Enum<STATE>,
311           EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
312     STATE doTransition(OPERAND operand, STATE oldState,
313                        EVENT event, EVENTTYPE eventType);
314   }
315 
316   private class SingleInternalArc
317                     implements Transition<OPERAND, STATE, EVENTTYPE, EVENT> {
318 
319     private STATE postState;
320     private SingleArcTransition<OPERAND, EVENT> hook; // transition hook
321 
322     SingleInternalArc(STATE postState,
323         SingleArcTransition<OPERAND, EVENT> hook) {
324       this.postState = postState;
325       this.hook = hook;
326     }
327 
328     @Override
329     public STATE doTransition(OPERAND operand, STATE oldState,
330                               EVENT event, EVENTTYPE eventType) {
331       if (hook != null) {
332         hook.transition(operand, event);
333       }
334       return postState;
335     }
336   }
337 
338   private class MultipleInternalArc
339               implements Transition<OPERAND, STATE, EVENTTYPE, EVENT>{
340 
341     // Fields
342     private Set<STATE> validPostStates;
343     private MultipleArcTransition<OPERAND, EVENT, STATE> hook;  // transition hook
344 
345     MultipleInternalArc(Set<STATE> postStates,
346                    MultipleArcTransition<OPERAND, EVENT, STATE> hook) {
347       this.validPostStates = postStates;
348       this.hook = hook;
349     }
350 
351     @Override
352     public STATE doTransition(OPERAND operand, STATE oldState,
353                               EVENT event, EVENTTYPE eventType)
354         throws InvalidStateTransitonException {
355       STATE postState = hook.transition(operand, event);
356 
357       if (!validPostStates.contains(postState)) {
358         throw new InvalidStateTransitonException(oldState, eventType);
359       }
360       return postState;
361     }
362   }
363 
364   /* 
365    * @return a {@link StateMachine} that starts in 
366    *         {@code initialState} and whose {@link Transition} s are
367    *         applied to {@code operand} .
368    *
369    *         This is part of the API.
370    *
371    * @param operand the object upon which the returned 
372    *                {@link StateMachine} will operate.
373    * @param initialState the state in which the returned 
374    *                {@link StateMachine} will start.
375    *                
376    */
377   public StateMachine<STATE, EVENTTYPE, EVENT>
378         make(OPERAND operand, STATE initialState) {
379     return new InternalStateMachine(operand, initialState);
380   }
381 
382   /* 
383    * @return a {@link StateMachine} that starts in the default initial
384    *          state and whose {@link Transition} s are applied to
385    *          {@code operand} . 
386    *
387    *         This is part of the API.
388    *
389    * @param operand the object upon which the returned 
390    *                {@link StateMachine} will operate.
391    *                
392    */
393   public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
394     return new InternalStateMachine(operand, defaultInitialState);
395   }
396 
397   private class InternalStateMachine
398         implements StateMachine<STATE, EVENTTYPE, EVENT> {
399     private final OPERAND operand;
400     private STATE currentState;
401 
402     InternalStateMachine(OPERAND operand, STATE initialState) {
403       this.operand = operand;
404       this.currentState = initialState;
405       if (!optimized) {
406         maybeMakeStateMachineTable();
407       }
408     }
409 
410     @Override
411     public synchronized STATE getCurrentState() {
412       return currentState;
413     }
414 
415     @Override
416     public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event)
417          throws InvalidStateTransitonException  {
418       currentState = StateMachineFactory.this.doTransition
419           (operand, currentState, eventType, event);
420       return currentState;
421     }
422   }
423 
424   /**
425    * Generate a graph represents the state graph of this StateMachine
426    * @param name graph name
427    * @return Graph object generated
428    */
429   @SuppressWarnings("rawtypes")
430   public Graph generateStateGraph(String name) {
431     maybeMakeStateMachineTable();
432     Graph g = new Graph(name);
433     for (STATE startState : stateMachineTable.keySet()) {
434       Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitions
435           = stateMachineTable.get(startState);
436       for (Entry<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> entry :
437          transitions.entrySet()) {
438         Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition = entry.getValue();
439         if (transition instanceof StateMachineFactory.SingleInternalArc) {
440           StateMachineFactory.SingleInternalArc sa
441               = (StateMachineFactory.SingleInternalArc) transition;
442           Graph.Node fromNode = g.getNode(startState.toString());
443           Graph.Node toNode = g.getNode(sa.postState.toString());
444           fromNode.addEdge(toNode, entry.getKey().toString());
445         } else if (transition instanceof StateMachineFactory.MultipleInternalArc) {
446           StateMachineFactory.MultipleInternalArc ma
447               = (StateMachineFactory.MultipleInternalArc) transition;
448           Iterator iter = ma.validPostStates.iterator();
449           while (iter.hasNext()) {
450             Graph.Node fromNode = g.getNode(startState.toString());
451             Graph.Node toNode = g.getNode(iter.next().toString());
452             fromNode.addEdge(toNode, entry.getKey().toString());
453           }
454         }
455       }
456     }
457     return g;
458   }
459 }
StateMachineFactory.java

StateMachineFactory.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/state/StateMachineFactory.java

The call resolves to the following addTransition() function of StateMachineFactory:

//StateMachineFactory.java
public StateMachineFactory
           <OPERAND, STATE, EVENTTYPE, EVENT>
        addTransition(STATE preState, STATE postState,
                      EVENTTYPE eventType,
                      SingleArcTransition<OPERAND, EVENT> hook){
  return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
      (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
         (preState, eventType, new SingleInternalArc(postState, hook)));
}

  addTransition() returns new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> (this, new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT> (preState, eventType, new SingleInternalArc(postState, hook))). The inner class SingleInternalArc constructed here stores the post-transition state postState, which in this case is RMAppState.NEW_SAVING. It also stores the callback hook, which here is an RMAppNewlySavingTransition instance.
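
The idea of an arc that remembers its post state and an optional hook, stored in a table indexed first by the current state and then by the event type, can be sketched as follows. This is a simplified illustration under stated assumptions: TransitionTable, Arc and Hook are hypothetical names, the state and event enums are fixed rather than generic, and, unlike StateMachineFactory, this addTransition() mutates the table in place instead of returning a new factory object.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

class StateMachineSketch {
  enum AppState { NEW, NEW_SAVING }
  enum AppEventType { START, KILL }

  /** Hypothetical counterpart of SingleArcTransition: the callback run on a transition. */
  interface Hook<O> { void transition(O operand, AppEventType eventType); }

  /** Hypothetical counterpart of SingleInternalArc: keeps the post state and the hook. */
  static class Arc<O> {
    private final AppState postState;
    private final Hook<O> hook;

    Arc(AppState postState, Hook<O> hook) {
      this.postState = postState;
      this.hook = hook;
    }

    AppState doTransition(O operand, AppEventType eventType) {
      if (hook != null) {
        hook.transition(operand, eventType); // run the callback first
      }
      return postState;                      // then report the new state
    }
  }

  /** Lookup table: current state -> event type -> arc. */
  static class TransitionTable<O> {
    private final Map<AppState, Map<AppEventType, Arc<O>>> table =
        new EnumMap<>(AppState.class);

    TransitionTable<O> addTransition(AppState pre, AppState post,
                                     AppEventType type, Hook<O> hook) {
      table.computeIfAbsent(pre, k -> new HashMap<>()).put(type, new Arc<>(post, hook));
      return this;
    }

    AppState doTransition(O operand, AppState current, AppEventType type) {
      Map<AppEventType, Arc<O>> arcs = table.get(current);
      Arc<O> arc = (arcs == null) ? null : arcs.get(type);
      if (arc == null) {
        throw new IllegalStateException("Invalid event " + type + " at " + current);
      }
      return arc.doTransition(operand, type);
    }
  }

  static String run() {
    StringBuilder operand = new StringBuilder();
    TransitionTable<StringBuilder> table = new TransitionTable<StringBuilder>()
        .addTransition(AppState.NEW, AppState.NEW_SAVING, AppEventType.START,
            (op, type) -> op.append("saving;"));
    AppState next = table.doTransition(operand, AppState.NEW, AppEventType.START);
    return next + ":" + operand;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Because the arc itself returns the post state, the caller that owns the current state only has to store whatever doTransition() returns.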

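  Control then returns to handle() in RMAppImpl, which calls this.stateMachine.doTransition(event.getType(), event). This enters doTransition() of InternalStateMachine, the inner class of StateMachineFactory that implements the StateMachine interface, which in turn calls the private doTransition() method of StateMachineFactory itself.

That method reaches return transition.doTransition(operand, oldState, event, eventType), where oldState=RMAppState.NEW and transition=org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc, so execution enters doTransition() of the inner class SingleInternalArc. That method first calls hook.transition(operand, event) and then executes return postState; the returned value is the RMAppState.NEW_SAVING stored earlier. One level up it is saved in the variable currentState, and control returns to handle() in RMAppImpl, where the value is read through the getter getState().

Back in handle(), the statement if (oldState != getState()) { LOG.info(appID + " State change from " + oldState + " to "   + getState()); } prints that the state changed from NEW to NEW_SAVING. For example: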

2017-02-20 22:59:07,702 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1487602693580_0001 State change from NEW to NEW_SAVING

This completes one state transition of the state machine.
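
The surrounding handle() pattern (remember the old state, attempt the transition, log only if the state actually changed) can be sketched as follows. This is a hedged illustration, not the RMAppImpl source: SketchApp is a hypothetical class, the state machine is reduced to a single hard-coded transition, and an in-memory list stands in for the logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class HandleSketch {
  enum AppState { NEW, NEW_SAVING }
  enum AppEventType { START, KILL }

  /** Hypothetical stand-in for RMAppImpl with a one-transition state machine. */
  static class SketchApp {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> log = new ArrayList<>();
    private final String appId;
    private AppState currentState = AppState.NEW;

    SketchApp(String appId) { this.appId = appId; }

    AppState getState() {
      lock.readLock().lock();
      try {
        return currentState;
      } finally {
        lock.readLock().unlock();
      }
    }

    List<String> getLog() { return log; }

    // Stand-in for stateMachine.doTransition(): only NEW + START is legal here.
    private AppState doTransition(AppEventType type) {
      if (currentState == AppState.NEW && type == AppEventType.START) {
        return AppState.NEW_SAVING;
      }
      throw new IllegalStateException("Invalid event: " + type + " at " + currentState);
    }

    void handle(AppEventType type) {
      lock.writeLock().lock();
      try {
        AppState oldState = getState();
        try {
          currentState = doTransition(type);
        } catch (IllegalStateException e) {
          log.add("Can't handle this event at current state: " + e.getMessage());
        }
        // Log only when the transition really moved the state.
        if (oldState != getState()) {
          log.add(appId + " State change from " + oldState + " to " + getState());
        }
      } finally {
        lock.writeLock().unlock();
      }
    }
  }

  static List<String> run() {
    SketchApp app = new SketchApp("application_1487602693580_0001");
    app.handle(AppEventType.START); // NEW -> NEW_SAVING, logged
    app.handle(AppEventType.START); // illegal at NEW_SAVING, state unchanged
    return app.getLog();
  }

  public static void main(String[] args) {
    for (String line : run()) {
      System.out.println(line);
    }
  }
}
```

Comparing the state before and after, instead of logging inside every transition, keeps the log line in one place regardless of which arc fired.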

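  Now back to the hook variable. hook was initialized with the callback class bound in the state machine above, RMAppNewlySavingTransition, so the call goes back into transition() of RMAppNewlySavingTransition, an inner class of RMAppImpl. That function calls app.rmContext.getStateStore().storeNewApplication(app), where app.rmContext=RMContextImpl and app.rmContext.getStateStore()=RMStateStore, so execution enters storeNewApplication() of the RMStateStore class: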

  1 @Private
  2 @Unstable
  3 /**
  4  * Base class to implement storage of ResourceManager state.
  5  * Takes care of asynchronous notifications and interfacing with YARN objects.
  6  * Real store implementations need to derive from it and implement blocking
  7  * store and load methods to actually store and load the state.
  8  */
  9 public abstract class RMStateStore extends AbstractService {
 10 
 11   // constants for RM App state and RMDTSecretManagerState.
 12   protected static final String RM_APP_ROOT = "RMAppRoot";
 13   protected static final String RM_DT_SECRET_MANAGER_ROOT = "RMDTSecretManagerRoot";
 14   protected static final String DELEGATION_KEY_PREFIX = "DelegationKey_";
 15   protected static final String DELEGATION_TOKEN_PREFIX = "RMDelegationToken_";
 16   protected static final String DELEGATION_TOKEN_SEQUENCE_NUMBER_PREFIX =
 17       "RMDTSequenceNumber_";
 18   protected static final String AMRMTOKEN_SECRET_MANAGER_ROOT =
 19       "AMRMTokenSecretManagerRoot";
 20   protected static final String VERSION_NODE = "RMVersionNode";
 21   protected static final String EPOCH_NODE = "EpochNode";
 22   private ResourceManager resourceManager;
 23   private final ReadLock readLock;
 24   private final WriteLock writeLock;
 25 
 26   public static final Log LOG = LogFactory.getLog(RMStateStore.class);
 27 
 28   /**
 29    * The enum defines state of RMStateStore.
 30    */
 31   public enum RMStateStoreState {
 32     ACTIVE,
 33     FENCED
 34   };
 35 
 36   private static final StateMachineFactory<RMStateStore,
 37                                            RMStateStoreState,
 38                                            RMStateStoreEventType, 
 39                                            RMStateStoreEvent>
 40       stateMachineFactory = new StateMachineFactory<RMStateStore,
 41                                                     RMStateStoreState,
 42                                                     RMStateStoreEventType,
 43                                                     RMStateStoreEvent>(
 44       RMStateStoreState.ACTIVE)
 45       .addTransition(RMStateStoreState.ACTIVE,
 46           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 47           RMStateStoreEventType.STORE_APP, new StoreAppTransition())
 48       .addTransition(RMStateStoreState.ACTIVE,
 49           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 50           RMStateStoreEventType.UPDATE_APP, new UpdateAppTransition())
 51       .addTransition(RMStateStoreState.ACTIVE,
 52           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 53           RMStateStoreEventType.REMOVE_APP, new RemoveAppTransition())
 54       .addTransition(RMStateStoreState.ACTIVE,
 55           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 56           RMStateStoreEventType.STORE_APP_ATTEMPT,
 57           new StoreAppAttemptTransition())
 58       .addTransition(RMStateStoreState.ACTIVE,
 59           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 60           RMStateStoreEventType.UPDATE_APP_ATTEMPT,
 61           new UpdateAppAttemptTransition())
 62       .addTransition(RMStateStoreState.ACTIVE,
 63           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 64           RMStateStoreEventType.STORE_MASTERKEY,
 65           new StoreRMDTMasterKeyTransition())
 66       .addTransition(RMStateStoreState.ACTIVE,
 67           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 68           RMStateStoreEventType.REMOVE_MASTERKEY,
 69           new RemoveRMDTMasterKeyTransition())
 70       .addTransition(RMStateStoreState.ACTIVE,
 71           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 72           RMStateStoreEventType.STORE_DELEGATION_TOKEN,
 73           new StoreRMDTTransition())
 74       .addTransition(RMStateStoreState.ACTIVE,
 75           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 76           RMStateStoreEventType.REMOVE_DELEGATION_TOKEN,
 77           new RemoveRMDTTransition())
 78       .addTransition(RMStateStoreState.ACTIVE,
 79           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 80           RMStateStoreEventType.UPDATE_DELEGATION_TOKEN,
 81           new UpdateRMDTTransition())
 82       .addTransition(RMStateStoreState.ACTIVE,
 83           EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED),
 84           RMStateStoreEventType.UPDATE_AMRM_TOKEN,
 85           new StoreOrUpdateAMRMTokenTransition())
 86       .addTransition(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED,
 87           RMStateStoreEventType.FENCED)
 88       .addTransition(RMStateStoreState.FENCED, RMStateStoreState.FENCED,
 89           EnumSet.of(
 90           RMStateStoreEventType.STORE_APP,
 91           RMStateStoreEventType.UPDATE_APP,
 92           RMStateStoreEventType.REMOVE_APP,
 93           RMStateStoreEventType.STORE_APP_ATTEMPT,
 94           RMStateStoreEventType.UPDATE_APP_ATTEMPT,
 95           RMStateStoreEventType.FENCED,
 96           RMStateStoreEventType.STORE_MASTERKEY,
 97           RMStateStoreEventType.REMOVE_MASTERKEY,
 98           RMStateStoreEventType.STORE_DELEGATION_TOKEN,
 99           RMStateStoreEventType.REMOVE_DELEGATION_TOKEN,
100           RMStateStoreEventType.UPDATE_DELEGATION_TOKEN,
101           RMStateStoreEventType.UPDATE_AMRM_TOKEN));
102 
103   private final StateMachine<RMStateStoreState,
104                              RMStateStoreEventType,
105                              RMStateStoreEvent> stateMachine;
106 
107   private static class StoreAppTransition
108       implements MultipleArcTransition<RMStateStore, RMStateStoreEvent,
109           RMStateStoreState> {
110     @Override
111     public RMStateStoreState transition(RMStateStore store,
112         RMStateStoreEvent event) {
113       if (!(event instanceof RMStateStoreAppEvent)) {
114         // should never happen
115         LOG.error("Illegal event type: " + event.getClass());
116         return RMStateStoreState.ACTIVE;
117       }
118       boolean isFenced = false;
119       ApplicationStateData appState =
120           ((RMStateStoreAppEvent) event).getAppState();
121       ApplicationId appId =
122           appState.getApplicationSubmissionContext().getApplicationId();
123       LOG.info("Storing info for app: " + appId);
124       try {
125         store.storeApplicationStateInternal(appId, appState);
126         store.notifyApplication(new RMAppEvent(appId,
127                RMAppEventType.APP_NEW_SAVED));
128       } catch (Exception e) {
129         LOG.error("Error storing app: " + appId, e);
130         isFenced = store.notifyStoreOperationFailedInternal(e);
131       }
132       return finalState(isFenced);
133     };
134   }
135 
136   private static class UpdateAppTransition implements
137       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
138           RMStateStoreState> {
139     @Override
140     public RMStateStoreState transition(RMStateStore store,
141         RMStateStoreEvent event) {
142       if (!(event instanceof RMStateUpdateAppEvent)) {
143         // should never happen
144         LOG.error("Illegal event type: " + event.getClass());
145         return RMStateStoreState.ACTIVE;
146       }
147       boolean isFenced = false;
148       ApplicationStateData appState =
149           ((RMStateUpdateAppEvent) event).getAppState();
150       ApplicationId appId =
151           appState.getApplicationSubmissionContext().getApplicationId();
152       LOG.info("Updating info for app: " + appId);
153       try {
154         store.updateApplicationStateInternal(appId, appState);
155         store.notifyApplication(new RMAppEvent(appId,
156             RMAppEventType.APP_UPDATE_SAVED));
157       } catch (Exception e) {
158         LOG.error("Error updating app: " + appId, e);
159         isFenced = store.notifyStoreOperationFailedInternal(e);
160       }
161       return finalState(isFenced);
162     };
163   }
164 
165   private static class RemoveAppTransition implements
166       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
167           RMStateStoreState> {
168     @Override
169     public RMStateStoreState transition(RMStateStore store,
170         RMStateStoreEvent event) {
171       if (!(event instanceof RMStateStoreRemoveAppEvent)) {
172         // should never happen
173         LOG.error("Illegal event type: " + event.getClass());
174         return RMStateStoreState.ACTIVE;
175       }
176       boolean isFenced = false;
177       ApplicationStateData appState =
178           ((RMStateStoreRemoveAppEvent) event).getAppState();
179       ApplicationId appId =
180           appState.getApplicationSubmissionContext().getApplicationId();
181       LOG.info("Removing info for app: " + appId);
182       try {
183         store.removeApplicationStateInternal(appState);
184       } catch (Exception e) {
185         LOG.error("Error removing app: " + appId, e);
186         isFenced = store.notifyStoreOperationFailedInternal(e);
187       }
188       return finalState(isFenced);
189     };
190   }
191 
192   private static class StoreAppAttemptTransition implements
193       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
194           RMStateStoreState> {
195     @Override
196     public RMStateStoreState transition(RMStateStore store,
197         RMStateStoreEvent event) {
198       if (!(event instanceof RMStateStoreAppAttemptEvent)) {
199         // should never happen
200         LOG.error("Illegal event type: " + event.getClass());
201         return RMStateStoreState.ACTIVE;
202       }
203       boolean isFenced = false;
204       ApplicationAttemptStateData attemptState =
205           ((RMStateStoreAppAttemptEvent) event).getAppAttemptState();
206       try {
207         if (LOG.isDebugEnabled()) {
208           LOG.debug("Storing info for attempt: " + attemptState.getAttemptId());
209         }
210         store.storeApplicationAttemptStateInternal(attemptState.getAttemptId(),
211             attemptState);
212         store.notifyApplicationAttempt(new RMAppAttemptEvent
213                (attemptState.getAttemptId(),
214                RMAppAttemptEventType.ATTEMPT_NEW_SAVED));
215       } catch (Exception e) {
216         LOG.error("Error storing appAttempt: " + attemptState.getAttemptId(), e);
217         isFenced = store.notifyStoreOperationFailedInternal(e);
218       }
219       return finalState(isFenced);
220     };
221   }
222 
223   private static class UpdateAppAttemptTransition implements
224       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
225           RMStateStoreState> {
226     @Override
227     public RMStateStoreState transition(RMStateStore store,
228         RMStateStoreEvent event) {
229       if (!(event instanceof RMStateUpdateAppAttemptEvent)) {
230         // should never happen
231         LOG.error("Illegal event type: " + event.getClass());
232         return RMStateStoreState.ACTIVE;
233       }
234       boolean isFenced = false;
235       ApplicationAttemptStateData attemptState =
236           ((RMStateUpdateAppAttemptEvent) event).getAppAttemptState();
237       try {
238         if (LOG.isDebugEnabled()) {
239           LOG.debug("Updating info for attempt: " + attemptState.getAttemptId());
240         }
241         store.updateApplicationAttemptStateInternal(attemptState.getAttemptId(),
242             attemptState);
243         store.notifyApplicationAttempt(new RMAppAttemptEvent
244                (attemptState.getAttemptId(),
245                RMAppAttemptEventType.ATTEMPT_UPDATE_SAVED));
246       } catch (Exception e) {
247         LOG.error("Error updating appAttempt: " + attemptState.getAttemptId(), e);
248         isFenced = store.notifyStoreOperationFailedInternal(e);
249       }
250       return finalState(isFenced);
251     };
252   }
253 
254   private static class StoreRMDTTransition implements
255       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
256           RMStateStoreState> {
257     @Override
258     public RMStateStoreState transition(RMStateStore store,
259         RMStateStoreEvent event) {
260       if (!(event instanceof RMStateStoreRMDTEvent)) {
261         // should never happen
262         LOG.error("Illegal event type: " + event.getClass());
263         return RMStateStoreState.ACTIVE;
264       }
265       boolean isFenced = false;
266       RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event;
267       try {
268         LOG.info("Storing RMDelegationToken and SequenceNumber");
269         store.storeRMDelegationTokenState(
270             dtEvent.getRmDTIdentifier(), dtEvent.getRenewDate());
271       } catch (Exception e) {
272         LOG.error("Error While Storing RMDelegationToken and SequenceNumber ",
273             e);
274         isFenced = store.notifyStoreOperationFailedInternal(e);
275       }
276       return finalState(isFenced);
277     }
278   }
279 
280   private static class RemoveRMDTTransition implements
281       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
282           RMStateStoreState> {
283     @Override
284     public RMStateStoreState transition(RMStateStore store,
285         RMStateStoreEvent event) {
286       if (!(event instanceof RMStateStoreRMDTEvent)) {
287         // should never happen
288         LOG.error("Illegal event type: " + event.getClass());
289         return RMStateStoreState.ACTIVE;
290       }
291       boolean isFenced = false;
292       RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event;
293       try {
294         LOG.info("Removing RMDelegationToken and SequenceNumber");
295         store.removeRMDelegationTokenState(dtEvent.getRmDTIdentifier());
296       } catch (Exception e) {
297         LOG.error("Error While Removing RMDelegationToken and SequenceNumber ",
298             e);
299         isFenced = store.notifyStoreOperationFailedInternal(e);
300       }
301       return finalState(isFenced);
302     }
303   }
304 
305   private static class UpdateRMDTTransition implements
306       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
307           RMStateStoreState> {
308     @Override
309     public RMStateStoreState transition(RMStateStore store,
310         RMStateStoreEvent event) {
311       if (!(event instanceof RMStateStoreRMDTEvent)) {
312         // should never happen
313         LOG.error("Illegal event type: " + event.getClass());
314         return RMStateStoreState.ACTIVE;
315       }
316       boolean isFenced = false;
317       RMStateStoreRMDTEvent dtEvent = (RMStateStoreRMDTEvent) event;
318       try {
319         LOG.info("Updating RMDelegationToken and SequenceNumber");
320         store.updateRMDelegationTokenState(
321             dtEvent.getRmDTIdentifier(), dtEvent.getRenewDate());
322       } catch (Exception e) {
323         LOG.error("Error While Updating RMDelegationToken and SequenceNumber ",
324             e);
325         isFenced = store.notifyStoreOperationFailedInternal(e);
326       }
327       return finalState(isFenced);
328     }
329   }
330 
331   private static class StoreRMDTMasterKeyTransition implements
332       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
333           RMStateStoreState> {
334     @Override
335     public RMStateStoreState transition(RMStateStore store,
336         RMStateStoreEvent event) {
337       if (!(event instanceof RMStateStoreRMDTMasterKeyEvent)) {
338         // should never happen
339         LOG.error("Illegal event type: " + event.getClass());
340         return RMStateStoreState.ACTIVE;
341       }
342       boolean isFenced = false;
343       RMStateStoreRMDTMasterKeyEvent dtEvent =
344           (RMStateStoreRMDTMasterKeyEvent) event;
345       try {
346         LOG.info("Storing RMDTMasterKey.");
347         store.storeRMDTMasterKeyState(dtEvent.getDelegationKey());
348       } catch (Exception e) {
349         LOG.error("Error While Storing RMDTMasterKey.", e);
350         isFenced = store.notifyStoreOperationFailedInternal(e);
351       }
352       return finalState(isFenced);
353     }
354   }
355 
356   private static class RemoveRMDTMasterKeyTransition implements
357       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
358           RMStateStoreState> {
359     @Override
360     public RMStateStoreState transition(RMStateStore store,
361         RMStateStoreEvent event) {
362       if (!(event instanceof RMStateStoreRMDTMasterKeyEvent)) {
363         // should never happen
364         LOG.error("Illegal event type: " + event.getClass());
365         return RMStateStoreState.ACTIVE;
366       }
367       boolean isFenced = false;
368       RMStateStoreRMDTMasterKeyEvent dtEvent =
369           (RMStateStoreRMDTMasterKeyEvent) event;
370       try {
371         LOG.info("Removing RMDTMasterKey.");
372         store.removeRMDTMasterKeyState(dtEvent.getDelegationKey());
373       } catch (Exception e) {
374         LOG.error("Error While Removing RMDTMasterKey.", e);
375         isFenced = store.notifyStoreOperationFailedInternal(e);
376       }
377       return finalState(isFenced);
378     }
379   }
380 
381   private static class StoreOrUpdateAMRMTokenTransition implements
382       MultipleArcTransition<RMStateStore, RMStateStoreEvent,
383           RMStateStoreState> {
384     @Override
385     public RMStateStoreState transition(RMStateStore store,
386         RMStateStoreEvent event) {
387       if (!(event instanceof RMStateStoreAMRMTokenEvent)) {
388         // should never happen
389         LOG.error("Illegal event type: " + event.getClass());
390         return RMStateStoreState.ACTIVE;
391       }
392       RMStateStoreAMRMTokenEvent amrmEvent = (RMStateStoreAMRMTokenEvent) event;
393       boolean isFenced = false;
394       try {
395         LOG.info("Updating AMRMToken");
396         store.storeOrUpdateAMRMTokenSecretManagerState(
397             amrmEvent.getAmrmTokenSecretManagerState(), amrmEvent.isUpdate());
398       } catch (Exception e) {
399         LOG.error("Error storing info for AMRMTokenSecretManager", e);
400         isFenced = store.notifyStoreOperationFailedInternal(e);
401       }
402       return finalState(isFenced);
403     }
404   }
405 
406   private static RMStateStoreState finalState(boolean isFenced) {
407     return isFenced ? RMStateStoreState.FENCED : RMStateStoreState.ACTIVE;
408   }
409 
410   public RMStateStore() {
411     super(RMStateStore.class.getName());
412     ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
413     this.readLock = lock.readLock();
414     this.writeLock = lock.writeLock();
415     stateMachine = stateMachineFactory.make(this);
416   }
417 
418   public static class RMDTSecretManagerState {
419     // DTIdentifier -> renewDate
420     Map<RMDelegationTokenIdentifier, Long> delegationTokenState =
421         new HashMap<RMDelegationTokenIdentifier, Long>();
422 
423     Set<DelegationKey> masterKeyState =
424         new HashSet<DelegationKey>();
425 
426     int dtSequenceNumber = 0;
427 
428     public Map<RMDelegationTokenIdentifier, Long> getTokenState() {
429       return delegationTokenState;
430     }
431 
432     public Set<DelegationKey> getMasterKeyState() {
433       return masterKeyState;
434     }
435 
436     public int getDTSequenceNumber() {
437       return dtSequenceNumber;
438     }
439   }
440 
441   /**
442    * State of the ResourceManager
443    */
444   public static class RMState {
445     Map<ApplicationId, ApplicationStateData> appState =
446         new TreeMap<ApplicationId, ApplicationStateData>();
447 
448     RMDTSecretManagerState rmSecretManagerState = new RMDTSecretManagerState();
449 
450     AMRMTokenSecretManagerState amrmTokenSecretManagerState = null;
451 
452     public Map<ApplicationId, ApplicationStateData> getApplicationState() {
453       return appState;
454     }
455 
456     public RMDTSecretManagerState getRMDTSecretManagerState() {
457       return rmSecretManagerState;
458     }
459 
460     public AMRMTokenSecretManagerState getAMRMTokenSecretManagerState() {
461       return amrmTokenSecretManagerState;
462     }
463   }
464     
465   private Dispatcher rmDispatcher;
466 
467   /**
468    * Dispatcher used to send state operation completion events to 
469    * ResourceManager services
470    */
471   public void setRMDispatcher(Dispatcher dispatcher) {
472     this.rmDispatcher = dispatcher;
473   }
474   
475   AsyncDispatcher dispatcher;
476 
477   @Override
478   protected void serviceInit(Configuration conf) throws Exception{
479     // create async handler
480     dispatcher = new AsyncDispatcher();
481     dispatcher.init(conf);
482     dispatcher.register(RMStateStoreEventType.class, 
483                         new ForwardingEventHandler());
484     dispatcher.setDrainEventsOnStop();
485     initInternal(conf);
486   }
487 
488   @Override
489   protected void serviceStart() throws Exception {
490     dispatcher.start();
491     startInternal();
492   }
493 
494   /**
495    * Derived classes initialize themselves using this method.
496    */
497   protected abstract void initInternal(Configuration conf) throws Exception;
498 
499   /**
500    * Derived classes start themselves using this method.
501    * The base class is started and the event dispatcher is ready to use at
502    * this point
503    */
504   protected abstract void startInternal() throws Exception;
505 
506   @Override
507   protected void serviceStop() throws Exception {
508     dispatcher.stop();
509     closeInternal();
510   }
511 
512   /**
513    * Derived classes close themselves using this method.
514    * The base class will be closed and the event dispatcher will be shutdown 
515    * after this
516    */
517   protected abstract void closeInternal() throws Exception;
518 
519   /**
520    * 1) Versioning scheme: major.minor. For e.g. 1.0, 1.1, 1.2...1.25, 2.0 etc.
521    * 2) Any incompatible change of state-store is a major upgrade, and any
522    *    compatible change of state-store is a minor upgrade.
523    * 3) If theres's no version, treat it as CURRENT_VERSION_INFO.
524    * 4) Within a minor upgrade, say 1.1 to 1.2:
525    *    overwrite the version info and proceed as normal.
526    * 5) Within a major upgrade, say 1.2 to 2.0:
527    *    throw exception and indicate user to use a separate upgrade tool to
528    *    upgrade RM state.
529    */
530   public void checkVersion() throws Exception {
531     Version loadedVersion = loadVersion();
532     LOG.info("Loaded RM state version info " + loadedVersion);
533     if (loadedVersion != null && loadedVersion.equals(getCurrentVersion())) {
534       return;
535     }
536     // if there is no version info, treat it as CURRENT_VERSION_INFO;
537     if (loadedVersion == null) {
538       loadedVersion = getCurrentVersion();
539     }
540     if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
541       LOG.info("Storing RM state version info " + getCurrentVersion());
542       storeVersion();
543     } else {
544       throw new RMStateVersionIncompatibleException(
545         "Expecting RM state version " + getCurrentVersion()
546             + ", but loading version " + loadedVersion);
547     }
548   }
549 
550   /**
551    * Derived class use this method to load the version information from state
552    * store.
553    */
554   protected abstract Version loadVersion() throws Exception;
555 
556   /**
557    * Derived class use this method to store the version information.
558    */
559   protected abstract void storeVersion() throws Exception;
560 
561   /**
562    * Get the current version of the underlying state store.
563    */
564   protected abstract Version getCurrentVersion();
565 
566 
567   /**
568    * Get the current epoch of RM and increment the value.
569    */
570   public abstract long getAndIncrementEpoch() throws Exception;
571   
572   /**
573    * Blocking API
574    * The derived class must recover state from the store and return a new 
575    * RMState object populated with that state
576    * This must not be called on the dispatcher thread
577    */
578   public abstract RMState loadState() throws Exception;
579   
580   /**
581    * Non-Blocking API
582    * ResourceManager services use this to store the application's state
583    * This does not block the dispatcher threads
584    * RMAppStoredEvent will be sent on completion to notify the RMApp
585    */
586   @SuppressWarnings("unchecked")
587   public void storeNewApplication(RMApp app) {
588     ApplicationSubmissionContext context = app
589                                             .getApplicationSubmissionContext();
590     assert context instanceof ApplicationSubmissionContextPBImpl;
591     ApplicationStateData appState =
592         ApplicationStateData.newInstance(
593             app.getSubmitTime(), app.getStartTime(), context, app.getUser());
594     dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState));
595   }
596 
597   @SuppressWarnings("unchecked")
598   public void updateApplicationState(
599       ApplicationStateData appState) {
600     dispatcher.getEventHandler().handle(new RMStateUpdateAppEvent(appState));
601   }
602 
603   public void updateFencedState() {
604     handleStoreEvent(new RMStateStoreEvent(RMStateStoreEventType.FENCED));
605   }
606 
607   /**
608    * Blocking API
609    * Derived classes must implement this method to store the state of an 
610    * application.
611    */
612   protected abstract void storeApplicationStateInternal(ApplicationId appId,
613       ApplicationStateData appStateData) throws Exception;
614 
615   protected abstract void updateApplicationStateInternal(ApplicationId appId,
616       ApplicationStateData appStateData) throws Exception;
617   
618   @SuppressWarnings("unchecked")
619   /**
620    * Non-blocking API
621    * ResourceManager services call this to store state on an application attempt
622    * This does not block the dispatcher threads
623    * RMAppAttemptStoredEvent will be sent on completion to notify the RMAppAttempt
624    */
625   public void storeNewApplicationAttempt(RMAppAttempt appAttempt) {
626     Credentials credentials = getCredentialsFromAppAttempt(appAttempt);
627 
628     AggregateAppResourceUsage resUsage =
629         appAttempt.getRMAppAttemptMetrics().getAggregateAppResourceUsage();
630     ApplicationAttemptStateData attemptState =
631         ApplicationAttemptStateData.newInstance(
632             appAttempt.getAppAttemptId(),
633             appAttempt.getMasterContainer(),
634             credentials, appAttempt.getStartTime(),
635             resUsage.getMemorySeconds(),
636             resUsage.getVcoreSeconds());
637 
638     dispatcher.getEventHandler().handle(
639       new RMStateStoreAppAttemptEvent(attemptState));
640   }
641 
642   @SuppressWarnings("unchecked")
643   public void updateApplicationAttemptState(
644       ApplicationAttemptStateData attemptState) {
645     dispatcher.getEventHandler().handle(
646       new RMStateUpdateAppAttemptEvent(attemptState));
647   }
648 
649   /**
650    * Blocking API
651    * Derived classes must implement this method to store the state of an 
652    * application attempt
653    */
654   protected abstract void storeApplicationAttemptStateInternal(
655       ApplicationAttemptId attemptId,
656       ApplicationAttemptStateData attemptStateData) throws Exception;
657 
658   protected abstract void updateApplicationAttemptStateInternal(
659       ApplicationAttemptId attemptId,
660       ApplicationAttemptStateData attemptStateData) throws Exception;
661 
662   /**
663    * RMDTSecretManager call this to store the state of a delegation token
664    * and sequence number
665    */
666   public void storeRMDelegationToken(
667       RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) {
668     handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, renewDate,
669         RMStateStoreEventType.STORE_DELEGATION_TOKEN));
670   }
671 
672   /**
673    * Blocking API
674    * Derived classes must implement this method to store the state of
675    * RMDelegationToken and sequence number
676    */
677   protected abstract void storeRMDelegationTokenState(
678       RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate)
679       throws Exception;
680 
681   /**
682    * RMDTSecretManager call this to remove the state of a delegation token
683    */
684   public void removeRMDelegationToken(
685       RMDelegationTokenIdentifier rmDTIdentifier) {
686     handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, null,
687         RMStateStoreEventType.REMOVE_DELEGATION_TOKEN));
688   }
689 
690   /**
691    * Blocking API
692    * Derived classes must implement this method to remove the state of RMDelegationToken
693    */
694   protected abstract void removeRMDelegationTokenState(
695       RMDelegationTokenIdentifier rmDTIdentifier) throws Exception;
696 
697   /**
698    * RMDTSecretManager call this to update the state of a delegation token
699    * and sequence number
700    */
701   public void updateRMDelegationToken(
702       RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate) {
703     handleStoreEvent(new RMStateStoreRMDTEvent(rmDTIdentifier, renewDate,
704         RMStateStoreEventType.UPDATE_DELEGATION_TOKEN));
705   }
706 
707   /**
708    * Blocking API
709    * Derived classes must implement this method to update the state of
710    * RMDelegationToken and sequence number
711    */
712   protected abstract void updateRMDelegationTokenState(
713       RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate)
714       throws Exception;
715 
716   /**
717    * RMDTSecretManager call this to store the state of a master key
718    */
719   public void storeRMDTMasterKey(DelegationKey delegationKey) {
720     handleStoreEvent(new RMStateStoreRMDTMasterKeyEvent(delegationKey,
721         RMStateStoreEventType.STORE_MASTERKEY));
722   }
723 
724   /**
725    * Blocking API
726    * Derived classes must implement this method to store the state of
727    * DelegationToken Master Key
728    */
729   protected abstract void storeRMDTMasterKeyState(DelegationKey delegationKey)
730       throws Exception;
731 
732   /**
733    * RMDTSecretManager call this to remove the state of a master key
734    */
735   public void removeRMDTMasterKey(DelegationKey delegationKey) {
736     handleStoreEvent(new RMStateStoreRMDTMasterKeyEvent(delegationKey,
737         RMStateStoreEventType.REMOVE_MASTERKEY));
738   }
739 
740   /**
741    * Blocking API
742    * Derived classes must implement this method to remove the state of
743    * DelegationToken Master Key
744    */
745   protected abstract void removeRMDTMasterKeyState(DelegationKey delegationKey)
746       throws Exception;
747 
748   /**
749    * Blocking API Derived classes must implement this method to store or update
750    * the state of AMRMToken Master Key
751    */
752   protected abstract void storeOrUpdateAMRMTokenSecretManagerState(
753       AMRMTokenSecretManagerState amrmTokenSecretManagerState, boolean isUpdate)
754       throws Exception;
755 
756   /**
757    * Store or Update state of AMRMToken Master Key
758    */
759   public void storeOrUpdateAMRMTokenSecretManager(
760       AMRMTokenSecretManagerState amrmTokenSecretManagerState, boolean isUpdate) {
761     handleStoreEvent(new RMStateStoreAMRMTokenEvent(
762         amrmTokenSecretManagerState, isUpdate,
763         RMStateStoreEventType.UPDATE_AMRM_TOKEN));
764   }
765 
766   /**
767    * Non-blocking API
768    * ResourceManager services call this to remove an application from the state
769    * store
770    * This does not block the dispatcher threads
771    * There is no notification of completion for this operation.
772    */
773   @SuppressWarnings("unchecked")
774   public void removeApplication(RMApp app) {
775     ApplicationStateData appState =
776         ApplicationStateData.newInstance(
777             app.getSubmitTime(), app.getStartTime(),
778             app.getApplicationSubmissionContext(), app.getUser());
779     for(RMAppAttempt appAttempt : app.getAppAttempts().values()) {
780       appState.attempts.put(appAttempt.getAppAttemptId(), null);
781     }
782     
783     dispatcher.getEventHandler().handle(new RMStateStoreRemoveAppEvent(appState));
784   }
785 
786   /**
787    * Blocking API
788    * Derived classes must implement this method to remove the state of an 
789    * application and its attempts
790    */
791   protected abstract void removeApplicationStateInternal(
792       ApplicationStateData appState) throws Exception;
793 
794   // TODO: This should eventually become cluster-Id + "AM_RM_TOKEN_SERVICE". See
795   // YARN-1779
796   public static final Text AM_RM_TOKEN_SERVICE = new Text(
797     "AM_RM_TOKEN_SERVICE");
798 
799   public static final Text AM_CLIENT_TOKEN_MASTER_KEY_NAME =
800       new Text("YARN_CLIENT_TOKEN_MASTER_KEY");
801   
802   public Credentials getCredentialsFromAppAttempt(RMAppAttempt appAttempt) {
803     Credentials credentials = new Credentials();
804 
805     SecretKey clientTokenMasterKey =
806         appAttempt.getClientTokenMasterKey();
807     if(clientTokenMasterKey != null){
808       credentials.addSecretKey(AM_CLIENT_TOKEN_MASTER_KEY_NAME,
809           clientTokenMasterKey.getEncoded());
810     }
811     return credentials;
812   }
813   
814   @VisibleForTesting
815   protected boolean isFencedState() {
816     return (RMStateStoreState.FENCED == getRMStateStoreState());
817   }
818 
819   // Dispatcher related code
820   protected void handleStoreEvent(RMStateStoreEvent event) {
821     this.writeLock.lock();
822     try {
823 
824       if (LOG.isDebugEnabled()) {
825         LOG.debug("Processing event of type " + event.getType());
826       }
827 
828       final RMStateStoreState oldState = getRMStateStoreState();
829 
830       this.stateMachine.doTransition(event.getType(), event);
831 
832       if (oldState != getRMStateStoreState()) {
833         LOG.info("RMStateStore state change from " + oldState + " to "
834             + getRMStateStoreState());
835       }
836 
837     } catch (InvalidStateTransitonException e) {
838       LOG.error("Can't handle this event at current state", e);
839     } finally {
840       this.writeLock.unlock();
841     }
842   }
843 
844   /**
845    * This method is called to notify the ResourceManager that the store
846    * operation has failed.
847    * @param failureCause the exception due to which the operation failed
848    */
849   protected void notifyStoreOperationFailed(Exception failureCause) {
850     if (isFencedState()) {
851       return;
852     }
853     if (notifyStoreOperationFailedInternal(failureCause)) {
854       updateFencedState();
855     }
856   }
857 
858   @SuppressWarnings("unchecked")
859   private boolean notifyStoreOperationFailedInternal(
860       Exception failureCause) {
861     boolean isFenced = false;
862     LOG.error("State store operation failed ", failureCause);
863     if (HAUtil.isHAEnabled(getConfig())) {
864       LOG.warn("State-store fenced ! Transitioning RM to standby");
865       isFenced = true;
866       Thread standByTransitionThread =
867           new Thread(new StandByTransitionThread());
868       standByTransitionThread.setName("StandByTransitionThread Handler");
869       standByTransitionThread.start();
870     } else if (YarnConfiguration.shouldRMFailFast(getConfig())) {
871       LOG.fatal("Fail RM now due to state-store error!");
872       rmDispatcher.getEventHandler().handle(
873           new RMFatalEvent(RMFatalEventType.STATE_STORE_OP_FAILED,
874               failureCause));
875     } else {
876       LOG.warn("Skip the state-store error.");
877     }
878     return isFenced;
879   }
880  
881   @SuppressWarnings("unchecked")
882   /**
883    * This method is called to notify the application that
884    * new application is stored or updated in state store
885    * @param event App event containing the app id and event type
886    */
887   private void notifyApplication(RMAppEvent event) {
888     rmDispatcher.getEventHandler().handle(event);
889   }
890   
891   @SuppressWarnings("unchecked")
892   /**
893    * This method is called to notify the application attempt
894    * that new attempt is stored or updated in state store
895    * @param event App attempt event containing the app attempt
896    * id and event type
897    */
898   private void notifyApplicationAttempt(RMAppAttemptEvent event) {
899     rmDispatcher.getEventHandler().handle(event);
900   }
901   
902   /**
903    * EventHandler implementation which forward events to the FSRMStateStore
904    * This hides the EventHandle methods of the store from its public interface 
905    */
906   private final class ForwardingEventHandler 
907                                   implements EventHandler<RMStateStoreEvent> {
908     
909     @Override
910     public void handle(RMStateStoreEvent event) {
911       handleStoreEvent(event);
912     }
913   }
914 
915   /**
916    * Derived classes must implement this method to delete the state store
917    * @throws Exception
918    */
919   public abstract void deleteStore() throws Exception;
920 
921   public void setResourceManager(ResourceManager rm) {
922     this.resourceManager = rm;
923   }
924 
925   private class StandByTransitionThread implements Runnable {
926     @Override
927     public void run() {
928       LOG.info("RMStateStore has been fenced");
929       resourceManager.handleTransitionToStandBy();
930     }
931   }
932 
933   public RMStateStoreState getRMStateStoreState() {
934     this.readLock.lock();
935     try {
936       return this.stateMachine.getCurrentState();
937     } finally {
938       this.readLock.unlock();
939     }
940   }
941 }
RMStateStore.java

RMStateStore.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java

Inside storeNewApplication(), the call dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState)) passes a newly constructed RMStateStoreAppEvent(appState) to handle(). Constructing that object creates an event of type RMStateStoreEventType.STORE_APP:

//RMStateStoreAppEvent.java
public class RMStateStoreAppEvent extends RMStateStoreEvent {

  private final ApplicationStateData appState;

  public RMStateStoreAppEvent(ApplicationStateData appState) {
    // The event type is fixed here: every RMStateStoreAppEvent is a STORE_APP event.
    super(RMStateStoreEventType.STORE_APP);
    this.appState = appState;
  }

  public ApplicationStateData getAppState() {
    return appState;
  }
}

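The dispatcher here is a field of RMStateStore itself, and the only event type it registers, during initialization, is RMStateStoreEventType. In serviceInit(), RMStateStore calls dispatcher.register(RMStateStoreEventType.class, new ForwardingEventHandler()). ForwardingEventHandler is an inner class of RMStateStore; its handle() method calls handleStoreEvent(), which in turn calls this.stateMachine.doTransition(event.getType(), event). As before, control enters StateMachineFactory, but this time through the inner class MultipleInternalArc rather than SingleInternalArc, because the transition is registered with a set of possible post-states. Note also that this state machine factory is a field of RMStateStore, whereas the earlier one belonged to RMAppImpl, and the two are bound to different events.

At the top of RMStateStore we can see .addTransition(RMStateStoreState.ACTIVE, EnumSet.of(RMStateStoreState.ACTIVE, RMStateStoreState.FENCED), RMStateStoreEventType.STORE_APP, new StoreAppTransition()). In other words, a STORE_APP event received in the ACTIVE state leaves the store in either ACTIVE or FENCED. (I am not certain about this part: its main purpose seems to be persisting the current information of RMAppImpl. The record exists so that the RM can recover after a restart; see the ResourceManager restart process for details.)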
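
The registration and forwarding path described above can be sketched in a few lines. This is a simplified, synchronous model and not the Hadoop code: MiniDispatcher, MiniStore, StoreEvent and StoreEventType are hypothetical stand-ins for AsyncDispatcher, RMStateStore, RMStateStoreEvent and RMStateStoreEventType, and the real dispatcher hands events to a separate thread through a queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StoreDispatcherSketch {
  // Hypothetical stand-in for RMStateStoreEventType.
  enum StoreEventType { STORE_APP, UPDATE_APP }

  // Hypothetical stand-in for RMStateStoreEvent: an event carries its type.
  static class StoreEvent {
    private final StoreEventType type;
    StoreEvent(StoreEventType type) { this.type = type; }
    StoreEventType getType() { return type; }
  }

  interface Handler { void handle(StoreEvent event); }

  // Synchronous stand-in for AsyncDispatcher: one handler per event-type enum class.
  static class MiniDispatcher {
    private final Map<Class<?>, Handler> handlers = new HashMap<>();

    void register(Class<?> eventTypeClass, Handler handler) {
      handlers.put(eventTypeClass, handler);
    }

    void dispatch(StoreEvent event) {
      // Look the handler up by the enum class of the event's type.
      Handler handler = handlers.get(event.getType().getDeclaringClass());
      if (handler == null) {
        throw new IllegalStateException("No handler for " + event.getType());
      }
      handler.handle(event);
    }
  }

  // Stand-in for RMStateStore: owns its dispatcher and hides handle() behind an inner class.
  static class MiniStore {
    final List<StoreEventType> handled = new ArrayList<>();
    private final MiniDispatcher dispatcher = new MiniDispatcher();

    MiniStore() {
      dispatcher.register(StoreEventType.class, new ForwardingHandler());
    }

    void storeNewApplication() {
      dispatcher.dispatch(new StoreEvent(StoreEventType.STORE_APP));
    }

    void handleStoreEvent(StoreEvent event) {
      handled.add(event.getType()); // the real store runs its state machine here
    }

    private final class ForwardingHandler implements Handler {
      @Override
      public void handle(StoreEvent event) {
        handleStoreEvent(event);
      }
    }
  }

  public static void main(String[] args) {
    MiniStore store = new MiniStore();
    store.storeNewApplication();
    System.out.println(store.handled); // prints [STORE_APP]
  }
}
```

The inner forwarding class is what keeps handle() off the store's public interface, which is the same reason the comment on ForwardingEventHandler gives in the listing above.
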
The RMStateStoreEventType.STORE_APP event is bound to the class StoreAppTransition. Let us trace the addTransition() method, shown below:

//StateMachineFactory.java
public StateMachineFactory
             <OPERAND, STATE, EVENTTYPE, EVENT>
          addTransition(STATE preState, Set<STATE> postStates,
                        EVENTTYPE eventType,
                        MultipleArcTransition<OPERAND, EVENT, STATE> hook){
    return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>
        (this,
         new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>
           (preState, eventType, new MultipleInternalArc(postStates, hook)));
  }

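  In the end, MultipleInternalArc, an inner class of StateMachineFactory, calls hook.transition(operand, event), which is a method of the MultipleArcTransition interface. The bound class StoreAppTransition is an inner class of RMStateStore and implements MultipleArcTransition, so the method finally invoked is transition() of RMStateStore's inner class StoreAppTransition.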
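To make the multiple-arc idea concrete, here is a minimal, self-contained sketch. It is not Hadoop's code: MultiArcHook, MultiArc, and StoreState are hypothetical stand-ins for MultipleArcTransition, MultipleInternalArc, and RMStateStoreState, and the rule "a failed store ends in FENCED" is a simplifying assumption. The point it illustrates is that the hook itself chooses the post state, and the arc only checks that the choice belongs to the declared set.

```java
import java.util.EnumSet;
import java.util.Set;

class MultiArcSketch {
    enum StoreState { ACTIVE, FENCED }

    /** Hypothetical stand-in for MultipleArcTransition: the hook picks the post state. */
    interface MultiArcHook<OPERAND, EVENT, STATE> {
        STATE transition(OPERAND operand, EVENT event);
    }

    /** Hypothetical stand-in for MultipleInternalArc: runs the hook, then validates its result. */
    static final class MultiArc<OPERAND, EVENT, STATE> {
        private final Set<STATE> validPostStates;
        private final MultiArcHook<OPERAND, EVENT, STATE> hook;

        MultiArc(Set<STATE> validPostStates, MultiArcHook<OPERAND, EVENT, STATE> hook) {
            this.validPostStates = validPostStates;
            this.hook = hook;
        }

        STATE doTransition(OPERAND operand, EVENT event) {
            STATE post = hook.transition(operand, event);
            if (!validPostStates.contains(post)) {
                throw new IllegalStateException("Invalid post state: " + post);
            }
            return post;
        }
    }

    /** Returns the state reached after a store attempt (assumption: failure means FENCED). */
    static StoreState storeApp(boolean storeSucceeds) {
        MultiArc<Object, String, StoreState> arc = new MultiArc<>(
            EnumSet.of(StoreState.ACTIVE, StoreState.FENCED),
            (operand, event) -> storeSucceeds ? StoreState.ACTIVE : StoreState.FENCED);
        return arc.doTransition(new Object(), "STORE_APP");
    }

    public static void main(String[] args) {
        System.out.println(storeApp(true));
        System.out.println(storeApp(false));
    }
}
```

A single-arc transition, by contrast, would fix the post state when the transition is registered and would not need the validation step.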

  That method then calls store.notifyApplication(new RMAppEvent(appId, RMAppEventType.APP_NEW_SAVED)), and notifyApplication() in turn calls rmDispatcher.getEventHandler().handle(event), which sends an RMAppEvent of type RMAppEventType.APP_NEW_SAVED to the central dispatcher. ResourceManager binds RMAppEventType events to its inner class ApplicationEventDispatcher, whose handle() calls rmApp.handle(event), that is, RMAppImpl.handle(). So, just like the earlier RMAppEventType.START event, this event is handled by RMAppImpl.
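The routing described above (handlers registered per event-type enum class, events delivered to whichever handler owns their type) can be sketched as follows. This is an illustration under stated assumptions, not the real dispatcher: Dispatcher, Handler, and SimpleEvent are hypothetical names, the enums list only a few of the real values, and delivery here is synchronous, while the ResourceManager's central dispatcher queues events and delivers them asynchronously.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DispatcherSketch {
    enum AppEventType { START, APP_NEW_SAVED }
    enum SchedulerEventType { APP_ADDED }

    interface Event { Enum<?> getType(); }
    interface Handler { void handle(Event event); }

    /** Hypothetical minimal event that carries only its type. */
    static final class SimpleEvent implements Event {
        private final Enum<?> type;
        SimpleEvent(Enum<?> type) { this.type = type; }
        @Override public Enum<?> getType() { return type; }
    }

    /** Hypothetical synchronous dispatcher keyed by the enum class of the event type. */
    static final class Dispatcher {
        private final Map<Class<?>, Handler> handlers = new HashMap<>();

        void register(Class<? extends Enum<?>> eventTypeClass, Handler handler) {
            handlers.put(eventTypeClass, handler);
        }

        void dispatch(Event event) {
            // The enum class that declares the constant decides which handler runs.
            Class<?> typeClass = event.getType().getDeclaringClass();
            Handler handler = handlers.get(typeClass);
            if (handler == null) {
                throw new IllegalStateException("No handler registered for " + typeClass);
            }
            handler.handle(event);
        }
    }

    /** Registers two handlers, dispatches two events, and returns what each handler saw. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register(AppEventType.class, e -> log.add("app:" + e.getType()));
        dispatcher.register(SchedulerEventType.class, e -> log.add("scheduler:" + e.getType()));
        dispatcher.dispatch(new SimpleEvent(AppEventType.APP_NEW_SAVED));
        dispatcher.dispatch(new SimpleEvent(SchedulerEventType.APP_ADDED));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Keying on the enum class rather than on individual constants is why a single registration covers every RMAppEventType value, START and APP_NEW_SAVED alike.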

RMAppImpl declares .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED, RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition()). On receiving the RMAppEventType.APP_NEW_SAVED event, RMAppImpl moves its own state from NEW_SAVING to SUBMITTED and invokes the callback class AddApplicationToSchedulerTransition, an inner class of RMAppImpl. Its transition() calls app.handler.handle(new AppAddedSchedulerEvent(app.applicationId, app.submissionContext.getQueue(), app.user, app.submissionContext.getReservationID())), as shown below:

//RMAppImpl.java
private static final class AddApplicationToSchedulerTransition extends
    RMAppTransition {
  @Override
  public void transition(RMAppImpl app, RMAppEvent event) {
    app.handler.handle(new AppAddedSchedulerEvent(app.applicationId,
      app.submissionContext.getQueue(), app.user,
      app.submissionContext.getReservationID()));
  }
}

The constructors of AppAddedSchedulerEvent are as follows:

//AppAddedSchedulerEvent.java
public AppAddedSchedulerEvent(ApplicationId applicationId, String queue,
    String user, ReservationId reservationID) {
  this(applicationId, queue, user, false, reservationID);
}

public AppAddedSchedulerEvent(ApplicationId applicationId, String queue,
    String user, boolean isAppRecovering, ReservationId reservationID) {
  super(SchedulerEventType.APP_ADDED);
  this.applicationId = applicationId;
  this.queue = queue;
  this.user = user;
  this.reservationID = reservationID;
  this.isAppRecovering = isAppRecovering;
}

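  When the scheduler receives the SchedulerEventType.APP_ADDED event, it first performs a permission check, then saves the application's information in its internal data structures, and finally sends an APP_ACCEPTED event to RMAppImpl. The details are as follows: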

public enum SchedulerEventType {

  // Source: Node
  NODE_ADDED,
  NODE_REMOVED,
  NODE_UPDATE,
  NODE_RESOURCE_UPDATE,
  NODE_LABELS_UPDATE,

  // Source: RMApp
  APP_ADDED,
  APP_REMOVED,

  // Source: RMAppAttempt
  APP_ATTEMPT_ADDED,
  APP_ATTEMPT_REMOVED,

  // Source: ContainerAllocationExpirer
  CONTAINER_EXPIRED,

  // Source: RMContainer
  CONTAINER_RESCHEDULED,

  // Source: SchedulingEditPolicy
  DROP_RESERVATION,
  PREEMPT_CONTAINER,
  KILL_CONTAINER
}
SchedulerEventType.java

SchedulerEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
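The scheduler's handling of APP_ADDED described above (checks first, then bookkeeping, then APP_ACCEPTED) can be sketched roughly as below. Everything here is a hypothetical reduction: SketchQueue replaces the real queue hierarchy, a plain set of allowed users stands in for the ACL check, and the method returns an outcome instead of sending an event to RMAppImpl. It only mirrors the order of the checks: unknown queue, non-leaf queue, access denied, then accept.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AddApplicationSketch {
    enum Outcome { APP_ACCEPTED, APP_REJECTED }

    /** Hypothetical, much-reduced queue: a leaf flag plus the users allowed to submit. */
    static final class SketchQueue {
        final boolean leaf;
        final Set<String> allowedUsers = new HashSet<>();

        SketchQueue(boolean leaf, String... users) {
            this.leaf = leaf;
            for (String user : users) {
                allowedUsers.add(user);
            }
        }
    }

    final Map<String, SketchQueue> queues = new HashMap<>();
    // Accepted applications: application id mapped to the queue it was placed in.
    final Map<String, String> applications = new HashMap<>();

    Outcome addApplication(String appId, String queueName, String user) {
        SketchQueue queue = queues.get(queueName);
        if (queue == null) {
            return Outcome.APP_REJECTED;   // submitted to an unknown queue
        }
        if (!queue.leaf) {
            return Outcome.APP_REJECTED;   // only leaf queues accept applications
        }
        if (!queue.allowedUsers.contains(user)) {
            return Outcome.APP_REJECTED;   // stands in for the access-control check
        }
        applications.put(appId, queueName);  // remember the accepted application
        return Outcome.APP_ACCEPTED;
    }

    /** Builds a tiny two-queue setup for demonstration. */
    static AddApplicationSketch demo() {
        AddApplicationSketch sketch = new AddApplicationSketch();
        sketch.queues.put("root", new SketchQueue(false));
        sketch.queues.put("default", new SketchQueue(true, "alice"));
        return sketch;
    }

    public static void main(String[] args) {
        AddApplicationSketch sketch = demo();
        System.out.println(sketch.addApplication("app_1", "default", "alice"));
        System.out.println(sketch.addApplication("app_2", "root", "alice"));
        System.out.println(sketch.addApplication("app_3", "missing", "alice"));
        System.out.println(sketch.addApplication("app_4", "default", "bob"));
    }
}
```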

  The newly created event class AppAddedSchedulerEvent calls super(SchedulerEventType.APP_ADDED), so its event type is SchedulerEventType.APP_ADDED. ResourceManager binds events of type SchedulerEventType to schedulerDispatcher, an object of type EventHandler<SchedulerEvent>. The binding is done in serviceInit() of RMActiveServices, an inner class of ResourceManager, as shown below:

//Binding of SchedulerEventType events in serviceInit() of RMActiveServices, an inner class of ResourceManager
schedulerDispatcher = createSchedulerEventDispatcher();
addIfService(schedulerDispatcher);
rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

  Let us step into createSchedulerEventDispatcher():

//ResourceManager.java
protected EventHandler<SchedulerEvent> createSchedulerEventDispatcher() {
  return new SchedulerEventDispatcher(this.scheduler);
}

  The scheduler passed in is this.scheduler, which is initialized just before the SchedulerEventType events are bound, as shown below:

//serviceInit() of RMActiveServices, an inner class of ResourceManager
// Initialize the scheduler
scheduler = createScheduler();
scheduler.setRMContext(rmContext);
addIfService(scheduler);
rmContext.setScheduler(scheduler);

schedulerDispatcher = createSchedulerEventDispatcher();
addIfService(schedulerDispatcher);
rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

// Register event handler for RmAppEvents.
// RMAppManager adds an RMAppEvent of type RMAppEventType.START to the
// async dispatcher, so it is handled by ApplicationEventDispatcher(rmContext).
rmDispatcher.register(RMAppEventType.class,
    new ApplicationEventDispatcher(rmContext));

// Register event handler for RmAppAttemptEvents
rmDispatcher.register(RMAppAttemptEventType.class,
    new ApplicationAttemptEventDispatcher(rmContext));

  Next, step into createScheduler():

//ResourceManager.java
protected ResourceScheduler createScheduler() {
  String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER,
      YarnConfiguration.DEFAULT_RM_SCHEDULER);
  LOG.info("Using Scheduler: " + schedulerClassName);
  try {
    Class<?> schedulerClazz = Class.forName(schedulerClassName);
    if (ResourceScheduler.class.isAssignableFrom(schedulerClazz)) {
      return (ResourceScheduler) ReflectionUtils.newInstance(schedulerClazz,
          this.conf);
    } else {
      throw new YarnRuntimeException("Class: " + schedulerClassName
          + " not instance of " + ResourceScheduler.class.getCanonicalName());
    }
  } catch (ClassNotFoundException e) {
    throw new YarnRuntimeException("Could not instantiate Scheduler: "
        + schedulerClassName, e);
  }
}

  The default scheduler is YarnConfiguration.DEFAULT_RM_SCHEDULER, which is defined in YarnConfiguration.java with the value org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, as shown below:

//YarnConfiguration.java
/** The class to use as the resource scheduler.*/
public static final String RM_SCHEDULER =
  RM_PREFIX + "scheduler.class";

public static final String DEFAULT_RM_SCHEDULER =
    "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
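The pattern used by createScheduler() (read a class name from configuration, fall back to a default, load it reflectively, and verify it implements the expected interface) can be tried in isolation with the sketch below. The names Scheduler, FifoLikeScheduler, and create are hypothetical; real Hadoop instantiates the class through ReflectionUtils.newInstance and hands it the configuration, whereas this sketch simply calls the no-argument constructor.

```java
class PluggableSchedulerSketch {
    /** Hypothetical stand-in for the ResourceScheduler interface. */
    interface Scheduler { String name(); }

    /** Hypothetical default implementation. */
    public static final class FifoLikeScheduler implements Scheduler {
        public FifoLikeScheduler() {}
        @Override public String name() { return "fifo-like"; }
    }

    /** Loads the configured class, or the default when nothing is configured. */
    static Scheduler create(String configuredClassName, String defaultClassName) {
        String className =
            (configuredClassName != null) ? configuredClassName : defaultClassName;
        try {
            Class<?> clazz = Class.forName(className);
            // Reject classes that exist but do not implement the expected interface.
            if (!Scheduler.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(
                    "Class: " + className + " is not a " + Scheduler.class.getName());
            }
            return (Scheduler) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Could not instantiate scheduler: " + className, e);
        }
    }

    public static void main(String[] args) {
        String defaultName = FifoLikeScheduler.class.getName();
        System.out.println(create(null, defaultName).name());
    }
}
```

The interface check is what turns a misconfigured class name into a clear error at startup instead of a ClassCastException later on.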

  Now let us look at the CapacityScheduler class:

   1 @LimitedPrivate("yarn")
   2 @Evolving
   3 @SuppressWarnings("unchecked")
   4 public class CapacityScheduler extends
   5     AbstractYarnScheduler<FiCaSchedulerApp, FiCaSchedulerNode> implements
   6     PreemptableResourceScheduler, CapacitySchedulerContext, Configurable {
   7 
   8   private static final Log LOG = LogFactory.getLog(CapacityScheduler.class);
   9   private YarnAuthorizationProvider authorizer;
  10  
  11   private CSQueue root;
  12   // timeout to join when we stop this service
  13   protected final long THREAD_JOIN_TIMEOUT_MS = 1000;
  14 
  15   static final Comparator<CSQueue> queueComparator = new Comparator<CSQueue>() {
  16     @Override
  17     public int compare(CSQueue q1, CSQueue q2) {
  18       if (q1.getUsedCapacity() < q2.getUsedCapacity()) {
  19         return -1;
  20       } else if (q1.getUsedCapacity() > q2.getUsedCapacity()) {
  21         return 1;
  22       }
  23 
  24       return q1.getQueuePath().compareTo(q2.getQueuePath());
  25     }
  26   };
  27 
  28   static final Comparator<FiCaSchedulerApp> applicationComparator = 
  29     new Comparator<FiCaSchedulerApp>() {
  30     @Override
  31     public int compare(FiCaSchedulerApp a1, FiCaSchedulerApp a2) {
  32       return a1.getApplicationId().compareTo(a2.getApplicationId());
  33     }
  34   };
  35 
  36   @Override
  37   public void setConf(Configuration conf) {
  38       yarnConf = conf;
  39   }
  40   
  41   private void validateConf(Configuration conf) {
  42     // validate scheduler memory allocation setting
  43     int minMem = conf.getInt(
  44       YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
  45       YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
  46     int maxMem = conf.getInt(
  47       YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
  48       YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB);
  49 
  50     if (minMem <= 0 || minMem > maxMem) {
  51       throw new YarnRuntimeException("Invalid resource scheduler memory"
  52         + " allocation configuration"
  53         + ", " + YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB
  54         + "=" + minMem
  55         + ", " + YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB
  56         + "=" + maxMem + ", min and max should be greater than 0"
  57         + ", max should be no smaller than min.");
  58     }
  59 
  60     // validate scheduler vcores allocation setting
  61     int minVcores = conf.getInt(
  62       YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
  63       YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
  64     int maxVcores = conf.getInt(
  65       YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES,
  66       YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES);
  67 
  68     if (minVcores <= 0 || minVcores > maxVcores) {
  69       throw new YarnRuntimeException("Invalid resource scheduler vcores"
  70         + " allocation configuration"
  71         + ", " + YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES
  72         + "=" + minVcores
  73         + ", " + YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES
  74         + "=" + maxVcores + ", min and max should be greater than 0"
  75         + ", max should be no smaller than min.");
  76     }
  77   }
  78 
  79   @Override
  80   public Configuration getConf() {
  81     return yarnConf;
  82   }
  83 
  84   private CapacitySchedulerConfiguration conf;
  85   private Configuration yarnConf;
  86 
  87   private Map<String, CSQueue> queues = new ConcurrentHashMap<String, CSQueue>();
  88 
  89   private AtomicInteger numNodeManagers = new AtomicInteger(0);
  90 
  91   private ResourceCalculator calculator;
  92   private boolean usePortForNodeName;
  93 
  94   private boolean scheduleAsynchronously;
  95   private AsyncScheduleThread asyncSchedulerThread;
  96   private RMNodeLabelsManager labelManager;
  97   
  98   /**
  99    * EXPERT
 100    */
 101   private long asyncScheduleInterval;
 102   private static final String ASYNC_SCHEDULER_INTERVAL =
 103       CapacitySchedulerConfiguration.SCHEDULE_ASYNCHRONOUSLY_PREFIX
 104           + ".scheduling-interval-ms";
 105   private static final long DEFAULT_ASYNC_SCHEDULER_INTERVAL = 5;
 106   
 107   private boolean overrideWithQueueMappings = false;
 108   private List<QueueMapping> mappings = null;
 109   private Groups groups;
 110 
 111   @VisibleForTesting
 112   public synchronized String getMappedQueueForTest(String user)
 113       throws IOException {
 114     return getMappedQueue(user);
 115   }
 116 
 117   public CapacityScheduler() {
 118     super(CapacityScheduler.class.getName());
 119   }
 120 
 121   @Override
 122   public QueueMetrics getRootQueueMetrics() {
 123     return root.getMetrics();
 124   }
 125 
 126   public CSQueue getRootQueue() {
 127     return root;
 128   }
 129   
 130   @Override
 131   public CapacitySchedulerConfiguration getConfiguration() {
 132     return conf;
 133   }
 134 
 135   @Override
 136   public synchronized RMContainerTokenSecretManager 
 137   getContainerTokenSecretManager() {
 138     return this.rmContext.getContainerTokenSecretManager();
 139   }
 140 
 141   @Override
 142   public Comparator<FiCaSchedulerApp> getApplicationComparator() {
 143     return applicationComparator;
 144   }
 145 
 146   @Override
 147   public ResourceCalculator getResourceCalculator() {
 148     return calculator;
 149   }
 150 
 151   @Override
 152   public Comparator<CSQueue> getQueueComparator() {
 153     return queueComparator;
 154   }
 155 
 156   @Override
 157   public int getNumClusterNodes() {
 158     return numNodeManagers.get();
 159   }
 160 
 161   @Override
 162   public synchronized RMContext getRMContext() {
 163     return this.rmContext;
 164   }
 165 
 166   @Override
 167   public synchronized void setRMContext(RMContext rmContext) {
 168     this.rmContext = rmContext;
 169   }
 170 
 171   private synchronized void initScheduler(Configuration configuration) throws
 172       IOException {
 173     this.conf = loadCapacitySchedulerConfiguration(configuration);
 174     validateConf(this.conf);
 175     this.minimumAllocation = this.conf.getMinimumAllocation();
 176     initMaximumResourceCapability(this.conf.getMaximumAllocation());
 177     this.calculator = this.conf.getResourceCalculator();
 178     this.usePortForNodeName = this.conf.getUsePortForNodeName();
 179     this.applications =
 180         new ConcurrentHashMap<ApplicationId,
 181             SchedulerApplication<FiCaSchedulerApp>>();
 182     this.labelManager = rmContext.getNodeLabelManager();
 183     authorizer = YarnAuthorizationProvider.getInstance(yarnConf);
 184     initializeQueues(this.conf);
 185 
 186     scheduleAsynchronously = this.conf.getScheduleAynschronously();
 187     asyncScheduleInterval =
 188         this.conf.getLong(ASYNC_SCHEDULER_INTERVAL,
 189             DEFAULT_ASYNC_SCHEDULER_INTERVAL);
 190     if (scheduleAsynchronously) {
 191       asyncSchedulerThread = new AsyncScheduleThread(this);
 192     }
 193 
 194     LOG.info("Initialized CapacityScheduler with " +
 195         "calculator=" + getResourceCalculator().getClass() + ", " +
 196         "minimumAllocation=<" + getMinimumResourceCapability() + ">, " +
 197         "maximumAllocation=<" + getMaximumResourceCapability() + ">, " +
 198         "asynchronousScheduling=" + scheduleAsynchronously + ", " +
 199         "asyncScheduleInterval=" + asyncScheduleInterval + "ms");
 200   }
 201 
 202   private synchronized void startSchedulerThreads() {
 203     if (scheduleAsynchronously) {
 204       Preconditions.checkNotNull(asyncSchedulerThread,
 205           "asyncSchedulerThread is null");
 206       asyncSchedulerThread.start();
 207     }
 208   }
 209 
 210   @Override
 211   public void serviceInit(Configuration conf) throws Exception {
 212     Configuration configuration = new Configuration(conf);
 213     super.serviceInit(conf);
 214     initScheduler(configuration);
 215   }
 216 
 217   @Override
 218   public void serviceStart() throws Exception {
 219     startSchedulerThreads();
 220     super.serviceStart();
 221   }
 222 
 223   @Override
 224   public void serviceStop() throws Exception {
 225     synchronized (this) {
 226       if (scheduleAsynchronously && asyncSchedulerThread != null) {
 227         asyncSchedulerThread.interrupt();
 228         asyncSchedulerThread.join(THREAD_JOIN_TIMEOUT_MS);
 229       }
 230     }
 231     super.serviceStop();
 232   }
 233 
 234   @Override
 235   public synchronized void
 236   reinitialize(Configuration conf, RMContext rmContext) throws IOException {
 237     Configuration configuration = new Configuration(conf);
 238     CapacitySchedulerConfiguration oldConf = this.conf;
 239     this.conf = loadCapacitySchedulerConfiguration(configuration);
 240     validateConf(this.conf);
 241     try {
 242       LOG.info("Re-initializing queues...");
 243       refreshMaximumAllocation(this.conf.getMaximumAllocation());
 244       reinitializeQueues(this.conf);
 245     } catch (Throwable t) {
 246       this.conf = oldConf;
 247       refreshMaximumAllocation(this.conf.getMaximumAllocation());
 248       throw new IOException("Failed to re-init queues", t);
 249     }
 250   }
 251   
 252   long getAsyncScheduleInterval() {
 253     return asyncScheduleInterval;
 254   }
 255 
 256   private final static Random random = new Random(System.currentTimeMillis());
 257   
 258   /**
 259    * Schedule on all nodes by starting at a random point.
 260    * @param cs
 261    */
 262   static void schedule(CapacityScheduler cs) {
 263     // First randomize the start point
 264     int current = 0;
 265     Collection<FiCaSchedulerNode> nodes = cs.getAllNodes().values();
 266     int start = random.nextInt(nodes.size());
 267     for (FiCaSchedulerNode node : nodes) {
 268       if (current++ >= start) {
 269         cs.allocateContainersToNode(node);
 270       }
 271     }
 272     // Now, just get everyone to be safe
 273     for (FiCaSchedulerNode node : nodes) {
 274       cs.allocateContainersToNode(node);
 275     }
 276     try {
 277       Thread.sleep(cs.getAsyncScheduleInterval());
 278     } catch (InterruptedException e) {}
 279   }
 280   
 281   static class AsyncScheduleThread extends Thread {
 282 
 283     private final CapacityScheduler cs;
 284     private AtomicBoolean runSchedules = new AtomicBoolean(false);
 285 
 286     public AsyncScheduleThread(CapacityScheduler cs) {
 287       this.cs = cs;
 288       setDaemon(true);
 289     }
 290 
 291     @Override
 292     public void run() {
 293       while (true) {
 294         if (!runSchedules.get()) {
 295           try {
 296             Thread.sleep(100);
 297           } catch (InterruptedException ie) {}
 298         } else {
 299           schedule(cs);
 300         }
 301       }
 302     }
 303 
 304     public void beginSchedule() {
 305       runSchedules.set(true);
 306     }
 307 
 308     public void suspendSchedule() {
 309       runSchedules.set(false);
 310     }
 311 
 312   }
 313   
 314   @Private
 315   public static final String ROOT_QUEUE = 
 316     CapacitySchedulerConfiguration.PREFIX + CapacitySchedulerConfiguration.ROOT;
 317 
 318   static class QueueHook {
 319     public CSQueue hook(CSQueue queue) {
 320       return queue;
 321     }
 322   }
 323   private static final QueueHook noop = new QueueHook();
 324 
 325   private void initializeQueueMappings() throws IOException {
 326     overrideWithQueueMappings = conf.getOverrideWithQueueMappings();
 327     LOG.info("Initialized queue mappings, override: "
 328         + overrideWithQueueMappings);
 329     // Get new user/group mappings
 330     List<QueueMapping> newMappings = conf.getQueueMappings();
 331     //check if mappings refer to valid queues
 332     for (QueueMapping mapping : newMappings) {
 333       if (!mapping.queue.equals(CURRENT_USER_MAPPING) &&
 334           !mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
 335         CSQueue queue = queues.get(mapping.queue);
 336         if (queue == null || !(queue instanceof LeafQueue)) {
 337           throw new IOException(
 338               "mapping contains invalid or non-leaf queue " + mapping.queue);
 339         }
 340       }
 341     }
 342     //apply the new mappings since they are valid
 343     mappings = newMappings;
 344     // initialize groups if mappings are present
 345     if (mappings.size() > 0) {
 346       groups = new Groups(conf);
 347     }
 348   }
 349 
 350   @Lock(CapacityScheduler.class)
 351   private void initializeQueues(CapacitySchedulerConfiguration conf)
 352     throws IOException {
 353 
 354     root = 
 355         parseQueue(this, conf, null, CapacitySchedulerConfiguration.ROOT, 
 356             queues, queues, noop);
 357     labelManager.reinitializeQueueLabels(getQueueToLabels());
 358     LOG.info("Initialized root queue " + root);
 359     initializeQueueMappings();
 360     setQueueAcls(authorizer, queues);
 361   }
 362 
 363   @Lock(CapacityScheduler.class)
 364   private void reinitializeQueues(CapacitySchedulerConfiguration conf) 
 365   throws IOException {
 366     // Parse new queues
 367     Map<String, CSQueue> newQueues = new HashMap<String, CSQueue>();
 368     CSQueue newRoot = 
 369         parseQueue(this, conf, null, CapacitySchedulerConfiguration.ROOT, 
 370             newQueues, queues, noop); 
 371     
 372     // Ensure all existing queues are still present
 373     validateExistingQueues(queues, newQueues);
 374 
 375     // Add new queues
 376     addNewQueues(queues, newQueues);
 377     
 378     // Re-configure queues
 379     root.reinitialize(newRoot, clusterResource);
 380     initializeQueueMappings();
 381 
 382     // Re-calculate headroom for active applications
 383     root.updateClusterResource(clusterResource, new ResourceLimits(
 384         clusterResource));
 385 
 386     labelManager.reinitializeQueueLabels(getQueueToLabels());
 387     setQueueAcls(authorizer, queues);
 388   }
 389 
 390   @VisibleForTesting
 391   public static void setQueueAcls(YarnAuthorizationProvider authorizer,
 392       Map<String, CSQueue> queues) throws IOException {
 393     for (CSQueue queue : queues.values()) {
 394       AbstractCSQueue csQueue = (AbstractCSQueue) queue;
 395       authorizer.setPermission(csQueue.getPrivilegedEntity(),
 396         csQueue.getACLs(), UserGroupInformation.getCurrentUser());
 397     }
 398   }
 399 
 400   private Map<String, Set<String>> getQueueToLabels() {
 401     Map<String, Set<String>> queueToLabels = new HashMap<String, Set<String>>();
 402     for (CSQueue queue : queues.values()) {
 403       queueToLabels.put(queue.getQueueName(), queue.getAccessibleNodeLabels());
 404     }
 405     return queueToLabels;
 406   }
 407 
 408   /**
 409    * Ensure all existing queues are present. Queues cannot be deleted
 410    * @param queues existing queues
 411    * @param newQueues new queues
 412    */
 413   @Lock(CapacityScheduler.class)
 414   private void validateExistingQueues(
 415       Map<String, CSQueue> queues, Map<String, CSQueue> newQueues) 
 416   throws IOException {
 417     // check that all static queues are included in the newQueues list
 418     for (Map.Entry<String, CSQueue> e : queues.entrySet()) {
 419       if (!(e.getValue() instanceof ReservationQueue)) {
 420         String queueName = e.getKey();
 421         CSQueue oldQueue = e.getValue();
 422         CSQueue newQueue = newQueues.get(queueName); 
 423         if (null == newQueue) {
 424           throw new IOException(queueName + " cannot be found during refresh!");
 425         } else if (!oldQueue.getQueuePath().equals(newQueue.getQueuePath())) {
 426           throw new IOException(queueName + " is moved from:"
 427               + oldQueue.getQueuePath() + " to:" + newQueue.getQueuePath()
 428               + " after refresh, which is not allowed.");
 429         }
 430       }
 431     }
 432   }
 433 
 434   /**
 435    * Add the new queues (only) to our list of queues...
 436    * ... be careful, do not overwrite existing queues.
 437    * @param queues
 438    * @param newQueues
 439    */
 440   @Lock(CapacityScheduler.class)
 441   private void addNewQueues(
 442       Map<String, CSQueue> queues, Map<String, CSQueue> newQueues) 
 443   {
 444     for (Map.Entry<String, CSQueue> e : newQueues.entrySet()) {
 445       String queueName = e.getKey();
 446       CSQueue queue = e.getValue();
 447       if (!queues.containsKey(queueName)) {
 448         queues.put(queueName, queue);
 449       }
 450     }
 451   }
 452   
 453   @Lock(CapacityScheduler.class)
 454   static CSQueue parseQueue(
 455       CapacitySchedulerContext csContext,
 456       CapacitySchedulerConfiguration conf, 
 457       CSQueue parent, String queueName, Map<String, CSQueue> queues,
 458       Map<String, CSQueue> oldQueues, 
 459       QueueHook hook) throws IOException {
 460     CSQueue queue;
 461     String fullQueueName =
 462         (parent == null) ? queueName
 463             : (parent.getQueuePath() + "." + queueName);
 464     String[] childQueueNames = 
 465       conf.getQueues(fullQueueName);
 466     boolean isReservableQueue = conf.isReservable(fullQueueName);
 467     if (childQueueNames == null || childQueueNames.length == 0) {
 468       if (null == parent) {
 469         throw new IllegalStateException(
 470             "Queue configuration missing child queue names for " + queueName);
 471       }
 472       // Check if the queue will be dynamically managed by the Reservation
 473       // system
 474       if (isReservableQueue) {
 475         queue =
 476             new PlanQueue(csContext, queueName, parent,
 477                 oldQueues.get(queueName));
 478       } else {
 479         queue =
 480             new LeafQueue(csContext, queueName, parent,
 481                 oldQueues.get(queueName));
 482 
 483         // Used only for unit tests
 484         queue = hook.hook(queue);
 485       }
 486     } else {
 487       if (isReservableQueue) {
 488         throw new IllegalStateException(
 489             "Only Leaf Queues can be reservable for " + queueName);
 490       }
 491       ParentQueue parentQueue = 
 492         new ParentQueue(csContext, queueName, parent, oldQueues.get(queueName));
 493 
 494       // Used only for unit tests
 495       queue = hook.hook(parentQueue);
 496 
 497       List<CSQueue> childQueues = new ArrayList<CSQueue>();
 498       for (String childQueueName : childQueueNames) {
 499         CSQueue childQueue = 
 500           parseQueue(csContext, conf, queue, childQueueName, 
 501               queues, oldQueues, hook);
 502         childQueues.add(childQueue);
 503       }
 504       parentQueue.setChildQueues(childQueues);
 505     }
 506 
 507     if(queue instanceof LeafQueue == true && queues.containsKey(queueName)
 508       && queues.get(queueName) instanceof LeafQueue == true) {
 509       throw new IOException("Two leaf queues were named " + queueName
 510         + ". Leaf queue names must be distinct");
 511     }
 512     queues.put(queueName, queue);
 513 
 514     LOG.info("Initialized queue: " + queue);
 515     return queue;
 516   }
 517 
 518   public CSQueue getQueue(String queueName) {
 519     if (queueName == null) {
 520       return null;
 521     }
 522     return queues.get(queueName);
 523   }
 524 
 525   private static final String CURRENT_USER_MAPPING = "%user";
 526 
 527   private static final String PRIMARY_GROUP_MAPPING = "%primary_group";
 528 
 529   private String getMappedQueue(String user) throws IOException {
 530     for (QueueMapping mapping : mappings) {
 531       if (mapping.type == MappingType.USER) {
 532         if (mapping.source.equals(CURRENT_USER_MAPPING)) {
 533           if (mapping.queue.equals(CURRENT_USER_MAPPING)) {
 534             return user;
 535           }
 536           else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
 537             return groups.getGroups(user).get(0);
 538           }
 539           else {
 540             return mapping.queue;
 541           }
 542         }
 543         if (user.equals(mapping.source)) {
 544           return mapping.queue;
 545         }
 546       }
 547       if (mapping.type == MappingType.GROUP) {
 548         for (String userGroups : groups.getGroups(user)) {
 549           if (userGroups.equals(mapping.source)) {
 550             return mapping.queue;
 551           }
 552         }
 553       }
 554     }
 555     return null;
 556   }
 557 
 558   private String getQueueMappings(ApplicationId applicationId, String queueName,
 559       String user) {
 560     if (mappings != null && mappings.size() > 0) {
 561       try {
 562         String mappedQueue = getMappedQueue(user);
 563         if (mappedQueue != null) {
 564           // We have a mapping, should we use it?
 565           if (queueName.equals(YarnConfiguration.DEFAULT_QUEUE_NAME)
 566               || overrideWithQueueMappings) {
 567             LOG.info("Application " + applicationId + " user " + user
 568                 + " mapping [" + queueName + "] to [" + mappedQueue
 569                 + "] override " + overrideWithQueueMappings);
 570             queueName = mappedQueue;
 571             RMApp rmApp = rmContext.getRMApps().get(applicationId);
 572             rmApp.setQueue(queueName);
 573           }
 574         }
 575       } catch (IOException ioex) {
 576         String message = "Failed to submit application " + applicationId +
 577             " submitted by user " + user + " reason: " + ioex.getMessage();
 578         this.rmContext.getDispatcher().getEventHandler()
 579             .handle(new RMAppEvent(applicationId,
 580                 RMAppEventType.APP_REJECTED, message));
 581         return null;
 582       }
 583     }
 584     return queueName;
 585   }
 586 
 587   private synchronized void addApplicationOnRecovery(
 588       ApplicationId applicationId, String queueName, String user) {
 589     queueName = getQueueMappings(applicationId, queueName, user);
 590     if (queueName == null) {
 591       // Exception encountered while getting queue mappings.
 592       return;
 593     }
 594     // sanity checks.
 595     CSQueue queue = getQueue(queueName);
 596     if (queue == null) {
 597       //During a restart, this indicates a queue was removed, which is
 598       //not presently supported
 599       if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
 600         this.rmContext.getDispatcher().getEventHandler().handle(
 601             new RMAppEvent(applicationId, RMAppEventType.KILL,
 602             "Application killed on recovery as it was submitted to queue " +
 603             queueName + " which no longer exists after restart."));
 604         return;
 605       } else {
 606         String queueErrorMsg = "Queue named " + queueName
 607             + " missing during application recovery."
 608             + " Queue removal during recovery is not presently supported by the"
 609             + " capacity scheduler, please restart with all queues configured"
 610             + " which were present before shutdown/restart.";
 611         LOG.fatal(queueErrorMsg);
 612         throw new QueueInvalidException(queueErrorMsg);
 613       }
 614     }
 615     if (!(queue instanceof LeafQueue)) {
 616       // During RM restart, this means leaf queue was converted to a parent
 617       // queue, which is not supported for running apps.
 618       if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
 619         this.rmContext.getDispatcher().getEventHandler().handle(
 620             new RMAppEvent(applicationId, RMAppEventType.KILL,
 621             "Application killed on recovery as it was submitted to queue " +
 622             queueName + " which is no longer a leaf queue after restart."));
 623         return;
 624       } else {
 625         String queueErrorMsg = "Queue named " + queueName
 626             + " is no longer a leaf queue during application recovery."
 627             + " Changing a leaf queue to a parent queue during recovery is"
 628             + " not presently supported by the capacity scheduler. Please"
 629             + " restart with leaf queues before shutdown/restart continuing"
 630             + " as leaf queues.";
 631         LOG.fatal(queueErrorMsg);
 632         throw new QueueInvalidException(queueErrorMsg);
 633       }
 634     }
 635    // Submit to the queue
 636     try {
 637       queue.submitApplication(applicationId, user, queueName);
 638    } catch (AccessControlException ace) {
 639       // Ignore the exception for recovered app as the app was previously
 640       // accepted.
 641     }
 642     queue.getMetrics().submitApp(user);
 643     SchedulerApplication<FiCaSchedulerApp> application =
 644         new SchedulerApplication<FiCaSchedulerApp>(queue, user);
 645     applications.put(applicationId, application);
 646     LOG.info("Accepted application " + applicationId + " from user: " + user
 647         + ", in queue: " + queueName);
 648     if (LOG.isDebugEnabled()) {
 649       LOG.debug(applicationId + " is recovering. Skip notifying APP_ACCEPTED");
 650     }
 651   }
 652 
 653   private synchronized void addApplication(ApplicationId applicationId,
 654       String queueName, String user) {
 655     queueName = getQueueMappings(applicationId, queueName, user);
 656     if (queueName == null) {
 657       // Exception encountered while getting queue mappings.
 658       return;
 659     }
 660     // sanity checks.
 661     CSQueue queue = getQueue(queueName);
 662     if (queue == null) {
 663       String message = "Application " + applicationId + 
 664       " submitted by user " + user + " to unknown queue: " + queueName;
 665       this.rmContext.getDispatcher().getEventHandler()
 666           .handle(new RMAppEvent(applicationId,
 667               RMAppEventType.APP_REJECTED, message));
 668       return;
 669     }
 670     if (!(queue instanceof LeafQueue)) {
 671       String message = "Application " + applicationId + 
 672           " submitted by user " + user + " to non-leaf queue: " + queueName;
 673       this.rmContext.getDispatcher().getEventHandler()
 674           .handle(new RMAppEvent(applicationId,
 675               RMAppEventType.APP_REJECTED, message));
 676       return;
 677     }
 678     // Submit to the queue
 679     try {
 680       queue.submitApplication(applicationId, user, queueName);
 681     } catch (AccessControlException ace) {
 682       LOG.info("Failed to submit application " + applicationId + " to queue "
 683           + queueName + " from user " + user, ace);
 684       this.rmContext.getDispatcher().getEventHandler()
 685           .handle(new RMAppEvent(applicationId,
 686               RMAppEventType.APP_REJECTED, ace.toString()));
 687       return;
 688     }
 689     // update the metrics
 690     queue.getMetrics().submitApp(user);
 691     SchedulerApplication<FiCaSchedulerApp> application =
 692         new SchedulerApplication<FiCaSchedulerApp>(queue, user);
 693     applications.put(applicationId, application);
 694     LOG.info("Accepted application " + applicationId + " from user: " + user
 695         + ", in queue: " + queueName);
 696     rmContext.getDispatcher().getEventHandler()
 697         .handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED));
 698   }
 699 
 700   private synchronized void addApplicationAttempt(
 701       ApplicationAttemptId applicationAttemptId,
 702       boolean transferStateFromPreviousAttempt,
 703       boolean isAttemptRecovering) {
 704     SchedulerApplication<FiCaSchedulerApp> application =
 705         applications.get(applicationAttemptId.getApplicationId());
 706     if (application == null) {
 707       LOG.warn("Application " + applicationAttemptId.getApplicationId() +
 708           " cannot be found in scheduler.");
 709       return;
 710     }
 711     CSQueue queue = (CSQueue) application.getQueue();
 712 
 713     FiCaSchedulerApp attempt =
 714         new FiCaSchedulerApp(applicationAttemptId, application.getUser(),
 715           queue, queue.getActiveUsersManager(), rmContext);
 716     if (transferStateFromPreviousAttempt) {
 717       attempt.transferStateFromPreviousAttempt(application
 718         .getCurrentAppAttempt());
 719     }
 720     application.setCurrentAppAttempt(attempt);
 721 
 722     queue.submitApplicationAttempt(attempt, application.getUser());
 723     LOG.info("Added Application Attempt " + applicationAttemptId
 724         + " to scheduler from user " + application.getUser() + " in queue "
 725         + queue.getQueueName());
 726     if (isAttemptRecovering) {
 727       if (LOG.isDebugEnabled()) {
 728         LOG.debug(applicationAttemptId
 729             + " is recovering. Skipping notifying ATTEMPT_ADDED");
 730       }
 731     } else {
 732       rmContext.getDispatcher().getEventHandler().handle(
 733         new RMAppAttemptEvent(applicationAttemptId,
 734             RMAppAttemptEventType.ATTEMPT_ADDED));
 735     }
 736   }
 737 
 738   private synchronized void doneApplication(ApplicationId applicationId,
 739       RMAppState finalState) {
 740     SchedulerApplication<FiCaSchedulerApp> application =
 741         applications.get(applicationId);
 742     if (application == null){
 743       // The AppRemovedSchedulerEvent maybe sent on recovery for completed apps,
 744       // ignore it.
 745       LOG.warn("Couldn't find application " + applicationId);
 746       return;
 747     }
 748     CSQueue queue = (CSQueue) application.getQueue();
 749     if (!(queue instanceof LeafQueue)) {
 750       LOG.error("Cannot finish application " + "from non-leaf queue: "
 751           + queue.getQueueName());
 752     } else {
 753       queue.finishApplication(applicationId, application.getUser());
 754     }
 755     application.stop(finalState);
 756     applications.remove(applicationId);
 757   }
 758 
 759   private synchronized void doneApplicationAttempt(
 760       ApplicationAttemptId applicationAttemptId,
 761       RMAppAttemptState rmAppAttemptFinalState, boolean keepContainers) {
 762     LOG.info("Application Attempt " + applicationAttemptId + " is done." +
 763         " finalState=" + rmAppAttemptFinalState);
 764     
 765     FiCaSchedulerApp attempt = getApplicationAttempt(applicationAttemptId);
 766     SchedulerApplication<FiCaSchedulerApp> application =
 767         applications.get(applicationAttemptId.getApplicationId());
 768 
 769     if (application == null || attempt == null) {
 770       LOG.info("Unknown application " + applicationAttemptId + " has completed!");
 771       return;
 772     }
 773 
 774     // Release all the allocated, acquired, running containers
 775     for (RMContainer rmContainer : attempt.getLiveContainers()) {
 776       if (keepContainers
 777           && rmContainer.getState().equals(RMContainerState.RUNNING)) {
 778         // do not kill the running container in the case of work-preserving AM
 779         // restart.
 780         LOG.info("Skip killing " + rmContainer.getContainerId());
 781         continue;
 782       }
 783       completedContainer(
 784         rmContainer,
 785         SchedulerUtils.createAbnormalContainerStatus(
 786           rmContainer.getContainerId(), SchedulerUtils.COMPLETED_APPLICATION),
 787         RMContainerEventType.KILL);
 788     }
 789 
 790     // Release all reserved containers
 791     for (RMContainer rmContainer : attempt.getReservedContainers()) {
 792       completedContainer(
 793         rmContainer,
 794         SchedulerUtils.createAbnormalContainerStatus(
 795           rmContainer.getContainerId(), "Application Complete"),
 796         RMContainerEventType.KILL);
 797     }
 798 
 799     // Clean up pending requests, metrics etc.
 800     attempt.stop(rmAppAttemptFinalState);
 801 
 802     // Inform the queue
 803     String queueName = attempt.getQueue().getQueueName();
 804     CSQueue queue = queues.get(queueName);
 805     if (!(queue instanceof LeafQueue)) {
 806       LOG.error("Cannot finish application " + "from non-leaf queue: "
 807           + queueName);
 808     } else {
 809       queue.finishApplicationAttempt(attempt, queue.getQueueName());
 810     }
 811   }
 812 
 813   @Override
 814   @Lock(Lock.NoLock.class)
 815   public Allocation allocate(ApplicationAttemptId applicationAttemptId,
 816       List<ResourceRequest> ask, List<ContainerId> release, 
 817       List<String> blacklistAdditions, List<String> blacklistRemovals) {
 818 
 819     FiCaSchedulerApp application = getApplicationAttempt(applicationAttemptId);
 820     if (application == null) {
 821       LOG.info("Calling allocate on removed " +
 822           "or non existant application " + applicationAttemptId);
 823       return EMPTY_ALLOCATION;
 824     }
 825     
 826     // Sanity check
 827     SchedulerUtils.normalizeRequests(
 828         ask, getResourceCalculator(), getClusterResource(),
 829         getMinimumResourceCapability(), getMaximumResourceCapability());
 830 
 831     // Release containers
 832     releaseContainers(release, application);
 833 
 834     synchronized (application) {
 835 
 836       // make sure we aren't stopping/removing the application
 837       // when the allocate comes in
 838       if (application.isStopped()) {
 839         LOG.info("Calling allocate on a stopped " +
 840             "application " + applicationAttemptId);
 841         return EMPTY_ALLOCATION;
 842       }
 843 
 844       if (!ask.isEmpty()) {
 845 
 846         if(LOG.isDebugEnabled()) {
 847           LOG.debug("allocate: pre-update" +
 848             " applicationAttemptId=" + applicationAttemptId + 
 849             " application=" + application);
 850         }
 851         application.showRequests();
 852   
 853         // Update application requests
 854         application.updateResourceRequests(ask);
 855   
 856         LOG.debug("allocate: post-update");
 857         application.showRequests();
 858       }
 859 
 860       if(LOG.isDebugEnabled()) {
 861         LOG.debug("allocate:" +
 862           " applicationAttemptId=" + applicationAttemptId + 
 863           " #ask=" + ask.size());
 864       }
 865 
 866       application.updateBlacklist(blacklistAdditions, blacklistRemovals);
 867 
 868       return application.getAllocation(getResourceCalculator(),
 869                    clusterResource, getMinimumResourceCapability());
 870     }
 871   }
 872 
 873   @Override
 874   @Lock(Lock.NoLock.class)
 875   public QueueInfo getQueueInfo(String queueName, 
 876       boolean includeChildQueues, boolean recursive) 
 877   throws IOException {
 878     CSQueue queue = null;
 879     queue = this.queues.get(queueName);
 880     if (queue == null) {
 881       throw new IOException("Unknown queue: " + queueName);
 882     }
 883     return queue.getQueueInfo(includeChildQueues, recursive);
 884   }
 885 
 886   @Override
 887   @Lock(Lock.NoLock.class)
 888   public List<QueueUserACLInfo> getQueueUserAclInfo() {
 889     UserGroupInformation user = null;
 890     try {
 891       user = UserGroupInformation.getCurrentUser();
 892     } catch (IOException ioe) {
 893       // should never happen
 894       return new ArrayList<QueueUserACLInfo>();
 895     }
 896 
 897     return root.getQueueUserAclInfo(user);
 898   }
 899 
 900   private synchronized void nodeUpdate(RMNode nm) {
 901     if (LOG.isDebugEnabled()) {
 902       LOG.debug("nodeUpdate: " + nm + " clusterResources: " + clusterResource);
 903     }
 904 
 905     FiCaSchedulerNode node = getNode(nm.getNodeID());
 906     
 907     List<UpdatedContainerInfo> containerInfoList = nm.pullContainerUpdates();
 908     List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
 909     List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
 910     for(UpdatedContainerInfo containerInfo : containerInfoList) {
 911       newlyLaunchedContainers.addAll(containerInfo.getNewlyLaunchedContainers());
 912       completedContainers.addAll(containerInfo.getCompletedContainers());
 913     }
 914     
 915     // Processing the newly launched containers
 916     for (ContainerStatus launchedContainer : newlyLaunchedContainers) {
 917       containerLaunchedOnNode(launchedContainer.getContainerId(), node);
 918     }
 919 
 920     // Process completed containers
 921     for (ContainerStatus completedContainer : completedContainers) {
 922       ContainerId containerId = completedContainer.getContainerId();
 923       LOG.debug("Container FINISHED: " + containerId);
 924       completedContainer(getRMContainer(containerId), 
 925           completedContainer, RMContainerEventType.FINISHED);
 926     }
 927 
 928     // Now node data structures are upto date and ready for scheduling.
 929     if(LOG.isDebugEnabled()) {
 930       LOG.debug("Node being looked for scheduling " + nm
 931         + " availableResource: " + node.getAvailableResource());
 932     }
 933   }
 934   
 935   /**
 936    * Process resource update on a node.
 937    */
 938   private synchronized void updateNodeAndQueueResource(RMNode nm, 
 939       ResourceOption resourceOption) {
 940     updateNodeResource(nm, resourceOption);
 941     root.updateClusterResource(clusterResource, new ResourceLimits(
 942         clusterResource));
 943   }
 944   
 945   /**
 946    * Process node labels update on a node.
 947    * 
 948    * TODO: Currently capacity scheduler will kill containers on a node when
 949    * labels on the node changed. It is a simply solution to ensure guaranteed
 950    * capacity on labels of queues. When YARN-2498 completed, we can let
 951    * preemption policy to decide if such containers need to be killed or just
 952    * keep them running.
 953    */
 954   private synchronized void updateLabelsOnNode(NodeId nodeId,
 955       Set<String> newLabels) {
 956     FiCaSchedulerNode node = nodes.get(nodeId);
 957     if (null == node) {
 958       return;
 959     }
 960     
 961     // labels is same, we don't need do update
 962     if (node.getLabels().size() == newLabels.size()
 963         && node.getLabels().containsAll(newLabels)) {
 964       return;
 965     }
 966     
 967     // Kill running containers since label is changed
 968     for (RMContainer rmContainer : node.getRunningContainers()) {
 969       ContainerId containerId = rmContainer.getContainerId();
 970       completedContainer(rmContainer, 
 971           ContainerStatus.newInstance(containerId,
 972               ContainerState.COMPLETE, 
 973               String.format(
 974                   "Container=%s killed since labels on the node=%s changed",
 975                   containerId.toString(), nodeId.toString()),
 976               ContainerExitStatus.KILLED_BY_RESOURCEMANAGER),
 977           RMContainerEventType.KILL);
 978     }
 979     
 980     // Unreserve container on this node
 981     RMContainer reservedContainer = node.getReservedContainer();
 982     if (null != reservedContainer) {
 983       dropContainerReservation(reservedContainer);
 984     }
 985     
 986     // Update node labels after we've done this
 987     node.updateLabels(newLabels);
 988   }
 989 
 990   private synchronized void allocateContainersToNode(FiCaSchedulerNode node) {
 991     if (rmContext.isWorkPreservingRecoveryEnabled()
 992         && !rmContext.isSchedulerReadyForAllocatingContainers()) {
 993       return;
 994     }
 995 
 996     // Assign new containers...
 997     // 1. Check for reserved applications
 998     // 2. Schedule if there are no reservations
 999 
1000     RMContainer reservedContainer = node.getReservedContainer();
1001     if (reservedContainer != null) {
1002       FiCaSchedulerApp reservedApplication =
1003           getCurrentAttemptForContainer(reservedContainer.getContainerId());
1004       
1005       // Try to fulfill the reservation
1006       LOG.info("Trying to fulfill reservation for application " + 
1007           reservedApplication.getApplicationId() + " on node: " + 
1008           node.getNodeID());
1009       
1010       LeafQueue queue = ((LeafQueue)reservedApplication.getQueue());
1011       CSAssignment assignment =
1012           queue.assignContainers(
1013               clusterResource,
1014               node,
1015               // TODO, now we only consider limits for parent for non-labeled
1016               // resources, should consider labeled resources as well.
1017               new ResourceLimits(labelManager.getResourceByLabel(
1018                   RMNodeLabelsManager.NO_LABEL, clusterResource)));
1019       
1020       RMContainer excessReservation = assignment.getExcessReservation();
1021       if (excessReservation != null) {
1022         Container container = excessReservation.getContainer();
1023         queue.completedContainer(
1024             clusterResource, assignment.getApplication(), node,
1025             excessReservation,
1026             SchedulerUtils.createAbnormalContainerStatus(
1027                 container.getId(),
1028                 SchedulerUtils.UNRESERVED_CONTAINER),
1029             RMContainerEventType.RELEASED, null, true);
1030       }
1031 
1032     }
1033 
1034     // Try to schedule more if there are no reservations to fulfill
1035     if (node.getReservedContainer() == null) {
1036       if (calculator.computeAvailableContainers(node.getAvailableResource(),
1037         minimumAllocation) > 0) {
1038         if (LOG.isDebugEnabled()) {
1039           LOG.debug("Trying to schedule on node: " + node.getNodeName() +
1040               ", available: " + node.getAvailableResource());
1041         }
1042         root.assignContainers(
1043             clusterResource,
1044             node,
1045             // TODO, now we only consider limits for parent for non-labeled
1046             // resources, should consider labeled resources as well.
1047             new ResourceLimits(labelManager.getResourceByLabel(
1048                 RMNodeLabelsManager.NO_LABEL, clusterResource)));
1049       }
1050     } else {
1051       LOG.info("Skipping scheduling since node " + node.getNodeID() + 
1052           " is reserved by application " + 
1053           node.getReservedContainer().getContainerId().getApplicationAttemptId()
1054           );
1055     }
1056   
1057   }
1058 
1059   @Override
1060   public void handle(SchedulerEvent event) {
1061     switch(event.getType()) {
1062     case NODE_ADDED:
1063     {
1064       NodeAddedSchedulerEvent nodeAddedEvent = (NodeAddedSchedulerEvent)event;
1065       addNode(nodeAddedEvent.getAddedRMNode());
1066       recoverContainersOnNode(nodeAddedEvent.getContainerReports(),
1067         nodeAddedEvent.getAddedRMNode());
1068     }
1069     break;
1070     case NODE_REMOVED:
1071     {
1072       NodeRemovedSchedulerEvent nodeRemovedEvent = (NodeRemovedSchedulerEvent)event;
1073       removeNode(nodeRemovedEvent.getRemovedRMNode());
1074     }
1075     break;
1076     case NODE_RESOURCE_UPDATE:
1077     {
1078       NodeResourceUpdateSchedulerEvent nodeResourceUpdatedEvent = 
1079           (NodeResourceUpdateSchedulerEvent)event;
1080       updateNodeAndQueueResource(nodeResourceUpdatedEvent.getRMNode(),
1081         nodeResourceUpdatedEvent.getResourceOption());
1082     }
1083     break;
1084     case NODE_LABELS_UPDATE:
1085     {
1086       NodeLabelsUpdateSchedulerEvent labelUpdateEvent =
1087           (NodeLabelsUpdateSchedulerEvent) event;
1088       
1089       for (Entry<NodeId, Set<String>> entry : labelUpdateEvent
1090           .getUpdatedNodeToLabels().entrySet()) {
1091         NodeId id = entry.getKey();
1092         Set<String> labels = entry.getValue();
1093         updateLabelsOnNode(id, labels);
1094       }
1095     }
1096     break;
1097     case NODE_UPDATE:
1098     {
1099       NodeUpdateSchedulerEvent nodeUpdatedEvent = (NodeUpdateSchedulerEvent)event;
1100       RMNode node = nodeUpdatedEvent.getRMNode();
1101       nodeUpdate(node);
1102       if (!scheduleAsynchronously) {
1103         allocateContainersToNode(getNode(node.getNodeID()));
1104       }
1105     }
1106     break;
1107     case APP_ADDED:
1108     {
1109       AppAddedSchedulerEvent appAddedEvent = (AppAddedSchedulerEvent) event;
1110       String queueName =
1111           resolveReservationQueueName(appAddedEvent.getQueue(),
1112               appAddedEvent.getApplicationId(),
1113               appAddedEvent.getReservationID());
1114       if (queueName != null) {
1115         if (!appAddedEvent.getIsAppRecovering()) {
1116           addApplication(appAddedEvent.getApplicationId(), queueName,
1117               appAddedEvent.getUser());
1118         } else {
1119           addApplicationOnRecovery(appAddedEvent.getApplicationId(), queueName,
1120               appAddedEvent.getUser());
1121         }
1122       }
1123     }
1124     break;
1125     case APP_REMOVED:
1126     {
1127       AppRemovedSchedulerEvent appRemovedEvent = (AppRemovedSchedulerEvent)event;
1128       doneApplication(appRemovedEvent.getApplicationID(),
1129         appRemovedEvent.getFinalState());
1130     }
1131     break;
1132     case APP_ATTEMPT_ADDED:
1133     {
1134       AppAttemptAddedSchedulerEvent appAttemptAddedEvent =
1135           (AppAttemptAddedSchedulerEvent) event;
1136       addApplicationAttempt(appAttemptAddedEvent.getApplicationAttemptId(),
1137         appAttemptAddedEvent.getTransferStateFromPreviousAttempt(),
1138         appAttemptAddedEvent.getIsAttemptRecovering());
1139     }
1140     break;
1141     case APP_ATTEMPT_REMOVED:
1142     {
1143       AppAttemptRemovedSchedulerEvent appAttemptRemovedEvent =
1144           (AppAttemptRemovedSchedulerEvent) event;
1145       doneApplicationAttempt(appAttemptRemovedEvent.getApplicationAttemptID(),
1146         appAttemptRemovedEvent.getFinalAttemptState(),
1147         appAttemptRemovedEvent.getKeepContainersAcrossAppAttempts());
1148     }
1149     break;
1150     case CONTAINER_EXPIRED:
1151     {
1152       ContainerExpiredSchedulerEvent containerExpiredEvent = 
1153           (ContainerExpiredSchedulerEvent) event;
1154       ContainerId containerId = containerExpiredEvent.getContainerId();
1155       completedContainer(getRMContainer(containerId), 
1156           SchedulerUtils.createAbnormalContainerStatus(
1157               containerId, 
1158               SchedulerUtils.EXPIRED_CONTAINER), 
1159           RMContainerEventType.EXPIRE);
1160     }
1161     break;
1162     case DROP_RESERVATION:
1163     {
1164       ContainerPreemptEvent dropReservationEvent = (ContainerPreemptEvent)event;
1165       RMContainer container = dropReservationEvent.getContainer();
1166       dropContainerReservation(container);
1167     }
1168     break;
1169     case PREEMPT_CONTAINER:
1170     {
1171       ContainerPreemptEvent preemptContainerEvent =
1172           (ContainerPreemptEvent)event;
1173       ApplicationAttemptId aid = preemptContainerEvent.getAppId();
1174       RMContainer containerToBePreempted = preemptContainerEvent.getContainer();
1175       preemptContainer(aid, containerToBePreempted);
1176     }
1177     break;
1178     case KILL_CONTAINER:
1179     {
1180       ContainerPreemptEvent killContainerEvent = (ContainerPreemptEvent)event;
1181       RMContainer containerToBeKilled = killContainerEvent.getContainer();
1182       killContainer(containerToBeKilled);
1183     }
1184     break;
1185     case CONTAINER_RESCHEDULED:
1186     {
1187       ContainerRescheduledEvent containerRescheduledEvent =
1188           (ContainerRescheduledEvent) event;
1189       RMContainer container = containerRescheduledEvent.getContainer();
1190       recoverResourceRequestForContainer(container);
1191     }
1192     break;
1193     default:
1194       LOG.error("Invalid eventtype " + event.getType() + ". Ignoring!");
1195     }
1196   }
1197 
1198   private synchronized void addNode(RMNode nodeManager) {
1199     FiCaSchedulerNode schedulerNode = new FiCaSchedulerNode(nodeManager,
1200         usePortForNodeName, nodeManager.getNodeLabels());
1201     this.nodes.put(nodeManager.getNodeID(), schedulerNode);
1202     Resources.addTo(clusterResource, schedulerNode.getTotalResource());
1203 
1204     // update this node to node label manager
1205     if (labelManager != null) {
1206       labelManager.activateNode(nodeManager.getNodeID(),
1207           schedulerNode.getTotalResource());
1208     }
1209     
1210     root.updateClusterResource(clusterResource, new ResourceLimits(
1211         clusterResource));
1212     int numNodes = numNodeManagers.incrementAndGet();
1213     updateMaximumAllocation(schedulerNode, true);
1214     
1215     LOG.info("Added node " + nodeManager.getNodeAddress() + 
1216         " clusterResource: " + clusterResource);
1217 
1218     if (scheduleAsynchronously && numNodes == 1) {
1219       asyncSchedulerThread.beginSchedule();
1220     }
1221   }
1222 
1223   private synchronized void removeNode(RMNode nodeInfo) {
1224     // update this node to node label manager
1225     if (labelManager != null) {
1226       labelManager.deactivateNode(nodeInfo.getNodeID());
1227     }
1228     
1229     FiCaSchedulerNode node = nodes.get(nodeInfo.getNodeID());
1230     if (node == null) {
1231       return;
1232     }
1233     Resources.subtractFrom(clusterResource, node.getTotalResource());
1234     root.updateClusterResource(clusterResource, new ResourceLimits(
1235         clusterResource));
1236     int numNodes = numNodeManagers.decrementAndGet();
1237 
1238     if (scheduleAsynchronously && numNodes == 0) {
1239       asyncSchedulerThread.suspendSchedule();
1240     }
1241     
1242     // Remove running containers
1243     List<RMContainer> runningContainers = node.getRunningContainers();
1244     for (RMContainer container : runningContainers) {
1245       completedContainer(container, 
1246           SchedulerUtils.createAbnormalContainerStatus(
1247               container.getContainerId(), 
1248               SchedulerUtils.LOST_CONTAINER), 
1249           RMContainerEventType.KILL);
1250     }
1251     
1252     // Remove reservations, if any
1253     RMContainer reservedContainer = node.getReservedContainer();
1254     if (reservedContainer != null) {
1255       completedContainer(reservedContainer, 
1256           SchedulerUtils.createAbnormalContainerStatus(
1257               reservedContainer.getContainerId(), 
1258               SchedulerUtils.LOST_CONTAINER), 
1259           RMContainerEventType.KILL);
1260     }
1261 
1262     this.nodes.remove(nodeInfo.getNodeID());
1263     updateMaximumAllocation(node, false);
1264 
1265     LOG.info("Removed node " + nodeInfo.getNodeAddress() + 
1266         " clusterResource: " + clusterResource);
1267   }
1268   
1269   @Lock(CapacityScheduler.class)
1270   @Override
1271   protected synchronized void completedContainer(RMContainer rmContainer,
1272       ContainerStatus containerStatus, RMContainerEventType event) {
1273     if (rmContainer == null) {
1274       LOG.info("Null container completed...");
1275       return;
1276     }
1277     
1278     Container container = rmContainer.getContainer();
1279     
1280     // Get the application for the finished container
1281     FiCaSchedulerApp application =
1282         getCurrentAttemptForContainer(container.getId());
1283     ApplicationId appId =
1284         container.getId().getApplicationAttemptId().getApplicationId();
1285     if (application == null) {
1286       LOG.info("Container " + container + " of" + " unknown application "
1287           + appId + " completed with event " + event);
1288       return;
1289     }
1290     
1291     // Get the node on which the container was allocated
1292     FiCaSchedulerNode node = getNode(container.getNodeId());
1293     
1294     // Inform the queue
1295     LeafQueue queue = (LeafQueue)application.getQueue();
1296     queue.completedContainer(clusterResource, application, node, 
1297         rmContainer, containerStatus, event, null, true);
1298 
1299     LOG.info("Application attempt " + application.getApplicationAttemptId()
1300         + " released container " + container.getId() + " on node: " + node
1301         + " with event: " + event);
1302   }
1303 
1304   @Lock(Lock.NoLock.class)
1305   @VisibleForTesting
1306   @Override
1307   public FiCaSchedulerApp getApplicationAttempt(
1308       ApplicationAttemptId applicationAttemptId) {
1309     return super.getApplicationAttempt(applicationAttemptId);
1310   }
1311   
1312   @Lock(Lock.NoLock.class)
1313   public FiCaSchedulerNode getNode(NodeId nodeId) {
1314     return nodes.get(nodeId);
1315   }
1316   
1317   @Lock(Lock.NoLock.class)
1318   Map<NodeId, FiCaSchedulerNode> getAllNodes() {
1319     return nodes;
1320   }
1321 
1322   @Override
1323   @Lock(Lock.NoLock.class)
1324   public void recover(RMState state) throws Exception {
1325     // NOT IMPLEMENTED
1326   }
1327 
1328   @Override
1329   public void dropContainerReservation(RMContainer container) {
1330     if(LOG.isDebugEnabled()){
1331       LOG.debug("DROP_RESERVATION:" + container.toString());
1332     }
1333     completedContainer(container,
1334         SchedulerUtils.createAbnormalContainerStatus(
1335             container.getContainerId(),
1336             SchedulerUtils.UNRESERVED_CONTAINER),
1337         RMContainerEventType.KILL);
1338   }
1339 
1340   @Override
1341   public void preemptContainer(ApplicationAttemptId aid, RMContainer cont) {
1342     if(LOG.isDebugEnabled()){
1343       LOG.debug("PREEMPT_CONTAINER: application:" + aid.toString() +
1344           " container: " + cont.toString());
1345     }
1346     FiCaSchedulerApp app = getApplicationAttempt(aid);
1347     if (app != null) {
1348       app.addPreemptContainer(cont.getContainerId());
1349     }
1350   }
1351 
1352   @Override
1353   public void killContainer(RMContainer cont) {
1354     if (LOG.isDebugEnabled()) {
1355       LOG.debug("KILL_CONTAINER: container" + cont.toString());
1356     }
1357     completedContainer(cont, SchedulerUtils.createPreemptedContainerStatus(
1358       cont.getContainerId(), SchedulerUtils.PREEMPTED_CONTAINER),
1359       RMContainerEventType.KILL);
1360   }
1361 
1362   @Override
1363   public synchronized boolean checkAccess(UserGroupInformation callerUGI,
1364       QueueACL acl, String queueName) {
1365     CSQueue queue = getQueue(queueName);
1366     if (queue == null) {
1367       if (LOG.isDebugEnabled()) {
1368         LOG.debug("ACL not found for queue access-type " + acl
1369             + " for queue " + queueName);
1370       }
1371       return false;
1372     }
1373     return queue.hasAccess(acl, callerUGI);
1374   }
1375 
1376   @Override
1377   public List<ApplicationAttemptId> getAppsInQueue(String queueName) {
1378     CSQueue queue = queues.get(queueName);
1379     if (queue == null) {
1380       return null;
1381     }
1382     List<ApplicationAttemptId> apps = new ArrayList<ApplicationAttemptId>();
1383     queue.collectSchedulerApplications(apps);
1384     return apps;
1385   }
1386 
1387   private CapacitySchedulerConfiguration loadCapacitySchedulerConfiguration(
1388       Configuration configuration) throws IOException {
1389     try {
1390       InputStream CSInputStream =
1391           this.rmContext.getConfigurationProvider()
1392               .getConfigurationInputStream(configuration,
1393                   YarnConfiguration.CS_CONFIGURATION_FILE);
1394       if (CSInputStream != null) {
1395         configuration.addResource(CSInputStream);
1396         return new CapacitySchedulerConfiguration(configuration, false);
1397       }
1398       return new CapacitySchedulerConfiguration(configuration, true);
1399     } catch (Exception e) {
1400       throw new IOException(e);
1401     }
1402   }
1403 
1404   private synchronized String resolveReservationQueueName(String queueName,
1405       ApplicationId applicationId, ReservationId reservationID) {
1406     CSQueue queue = getQueue(queueName);
1407     // Check if the queue is a plan queue
1408     if ((queue == null) || !(queue instanceof PlanQueue)) {
1409       return queueName;
1410     }
1411     if (reservationID != null) {
1412       String resQName = reservationID.toString();
1413       queue = getQueue(resQName);
1414       if (queue == null) {
1415         String message =
1416             "Application "
1417                 + applicationId
1418                 + " submitted to a reservation which is not yet currently active: "
1419                 + resQName;
1420         this.rmContext.getDispatcher().getEventHandler()
1421             .handle(new RMAppEvent(applicationId,
1422                 RMAppEventType.APP_REJECTED, message));
1423         return null;
1424       }
1425       if (!queue.getParent().getQueueName().equals(queueName)) {
1426         String message =
1427             "Application: " + applicationId + " submitted to a reservation "
1428                 + resQName + " which does not belong to the specified queue: "
1429                 + queueName;
1430         this.rmContext.getDispatcher().getEventHandler()
1431             .handle(new RMAppEvent(applicationId,
1432                 RMAppEventType.APP_REJECTED, message));
1433         return null;
1434       }
1435       // use the reservation queue to run the app
1436       queueName = resQName;
1437     } else {
1438       // use the default child queue of the plan for unreserved apps
1439       queueName = queueName + ReservationConstants.DEFAULT_QUEUE_SUFFIX;
1440     }
1441     return queueName;
1442   }
1443 
1444   @Override
1445   public synchronized void removeQueue(String queueName)
1446       throws SchedulerDynamicEditException {
1447     LOG.info("Removing queue: " + queueName);
1448     CSQueue q = this.getQueue(queueName);
1449     if (!(q instanceof ReservationQueue)) {
1450       throw new SchedulerDynamicEditException("The queue that we are asked "
1451           + "to remove (" + queueName + ") is not a ReservationQueue");
1452     }
1453     ReservationQueue disposableLeafQueue = (ReservationQueue) q;
1454     // at this point we should have no more apps
1455     if (disposableLeafQueue.getNumApplications() > 0) {
1456       throw new SchedulerDynamicEditException("The queue " + queueName
1457           + " is not empty " + disposableLeafQueue.getApplications().size()
1458           + " active apps " + disposableLeafQueue.pendingApplications.size()
1459           + " pending apps");
1460     }
1461 
1462     ((PlanQueue) disposableLeafQueue.getParent()).removeChildQueue(q);
1463     this.queues.remove(queueName);
1464     LOG.info("Removal of ReservationQueue " + queueName + " has succeeded");
1465   }
1466 
1467   @Override
1468   public synchronized void addQueue(Queue queue)
1469       throws SchedulerDynamicEditException {
1470 
1471     if (!(queue instanceof ReservationQueue)) {
1472       throw new SchedulerDynamicEditException("Queue " + queue.getQueueName()
1473           + " is not a ReservationQueue");
1474     }
1475 
1476     ReservationQueue newQueue = (ReservationQueue) queue;
1477 
1478     if (newQueue.getParent() == null
1479         || !(newQueue.getParent() instanceof PlanQueue)) {
1480       throw new SchedulerDynamicEditException("ParentQueue for "
1481           + newQueue.getQueueName()
1482           + " is not properly set (should be set and be a PlanQueue)");
1483     }
1484 
1485     PlanQueue parentPlan = (PlanQueue) newQueue.getParent();
1486     String queuename = newQueue.getQueueName();
1487     parentPlan.addChildQueue(newQueue);
1488     this.queues.put(queuename, newQueue);
1489     LOG.info("Creation of ReservationQueue " + newQueue + " succeeded");
1490   }
1491 
1492   @Override
1493   public synchronized void setEntitlement(String inQueue,
1494       QueueEntitlement entitlement) throws SchedulerDynamicEditException,
1495       YarnException {
1496     LeafQueue queue = getAndCheckLeafQueue(inQueue);
1497     ParentQueue parent = (ParentQueue) queue.getParent();
1498 
1499     if (!(queue instanceof ReservationQueue)) {
1500       throw new SchedulerDynamicEditException("Entitlement can not be"
1501           + " modified dynamically since queue " + inQueue
1502           + " is not a ReservationQueue");
1503     }
1504 
1505     if (!(parent instanceof PlanQueue)) {
1506       throw new SchedulerDynamicEditException("The parent of ReservationQueue "
1507           + inQueue + " must be an PlanQueue");
1508     }
1509 
1510     ReservationQueue newQueue = (ReservationQueue) queue;
1511 
1512     float sumChilds = ((PlanQueue) parent).sumOfChildCapacities();
1513     float newChildCap = sumChilds - queue.getCapacity() + entitlement.getCapacity();
1514 
1515     if (newChildCap >= 0 && newChildCap < 1.0f + CSQueueUtils.EPSILON) {
1516       // note: epsilon checks here are not ok, as the epsilons might accumulate
1517       // and become a problem in aggregate
1518       if (Math.abs(entitlement.getCapacity() - queue.getCapacity()) == 0
1519           && Math.abs(entitlement.getMaxCapacity() - queue.getMaximumCapacity()) == 0) {
1520         return;
1521       }
1522       newQueue.setEntitlement(entitlement);
1523     } else {
1524       throw new SchedulerDynamicEditException(
1525           "Sum of child queues would exceed 100% for PlanQueue: "
1526               + parent.getQueueName());
1527     }
1528     LOG.info("Set entitlement for ReservationQueue " + inQueue + "  to "
1529         + queue.getCapacity() + " request was (" + entitlement.getCapacity() + ")");
1530   }
1531 
1532   @Override
1533   public synchronized String moveApplication(ApplicationId appId,
1534       String targetQueueName) throws YarnException {
1535     FiCaSchedulerApp app =
1536         getApplicationAttempt(ApplicationAttemptId.newInstance(appId, 0));
1537     String sourceQueueName = app.getQueue().getQueueName();
1538     LeafQueue source = getAndCheckLeafQueue(sourceQueueName);
1539     String destQueueName = handleMoveToPlanQueue(targetQueueName);
1540     LeafQueue dest = getAndCheckLeafQueue(destQueueName);
1541     // Validation check - ACLs, submission limits for user & queue
1542     String user = app.getUser();
1543     try {
1544       dest.submitApplication(appId, user, destQueueName);
1545     } catch (AccessControlException e) {
1546       throw new YarnException(e);
1547     }
1548     // Move all live containers
1549     for (RMContainer rmContainer : app.getLiveContainers()) {
1550       source.detachContainer(clusterResource, app, rmContainer);
1551       // attach the Container to another queue
1552       dest.attachContainer(clusterResource, app, rmContainer);
1553     }
1554     // Detach the application..
1555     source.finishApplicationAttempt(app, sourceQueueName);
1556     source.getParent().finishApplication(appId, app.getUser());
1557     // Finish app & update metrics
1558     app.move(dest);
1559     // Submit to a new queue
1560     dest.submitApplicationAttempt(app, user);
1561     applications.get(appId).setQueue(dest);
1562     LOG.info("App: " + app.getApplicationId() + " successfully moved from "
1563         + sourceQueueName + " to: " + destQueueName);
1564     return targetQueueName;
1565   }
1566 
1567   /**
1568    * Check that the String provided in input is the name of an existing,
1569    * LeafQueue, if successful returns the queue.
1570    *
1571    * @param queue
1572    * @return the LeafQueue
1573    * @throws YarnException
1574    */
1575   private LeafQueue getAndCheckLeafQueue(String queue) throws YarnException {
1576     CSQueue ret = this.getQueue(queue);
1577     if (ret == null) {
1578       throw new YarnException("The specified Queue: " + queue
1579           + " doesn't exist");
1580     }
1581     if (!(ret instanceof LeafQueue)) {
1582       throw new YarnException("The specified Queue: " + queue
1583           + " is not a Leaf Queue. Move is supported only for Leaf Queues.");
1584     }
1585     return (LeafQueue) ret;
1586   }
1587 
1588   /** {@inheritDoc} */
1589   @Override
1590   public EnumSet<SchedulerResourceTypes> getSchedulingResourceTypes() {
1591     if (calculator.getClass().getName()
1592       .equals(DefaultResourceCalculator.class.getName())) {
1593       return EnumSet.of(SchedulerResourceTypes.MEMORY);
1594     }
1595     return EnumSet
1596       .of(SchedulerResourceTypes.MEMORY, SchedulerResourceTypes.CPU);
1597   }
1598   
1599   @Override
1600   public Resource getMaximumResourceCapability(String queueName) {
1601     CSQueue queue = getQueue(queueName);
1602     if (queue == null) {
1603       LOG.error("Unknown queue: " + queueName);
1604       return getMaximumResourceCapability();
1605     }
1606     if (!(queue instanceof LeafQueue)) {
1607       LOG.error("queue " + queueName + " is not an leaf queue");
1608       return getMaximumResourceCapability();
1609     }
1610     return ((LeafQueue)queue).getMaximumAllocation();
1611   }
1612 
1613   private String handleMoveToPlanQueue(String targetQueueName) {
1614     CSQueue dest = getQueue(targetQueueName);
1615     if (dest != null && dest instanceof PlanQueue) {
1616       // use the default child reservation queue of the plan
1617       targetQueueName = targetQueueName + ReservationConstants.DEFAULT_QUEUE_SUFFIX;
1618     }
1619     return targetQueueName;
1620   }
1621 
1622   @Override
1623   public Set<String> getPlanQueues() {
1624     Set<String> ret = new HashSet<String>();
1625     for (Map.Entry<String, CSQueue> l : queues.entrySet()) {
1626       if (l.getValue() instanceof PlanQueue) {
1627         ret.add(l.getKey());
1628       }
1629     }
1630     return ret;
1631   }
1632 }
CapacityScheduler.java

CapacityScheduler.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java

  As we saw earlier, the createSchedulerEventDispatcher() method of the ResourceManager class instantiates SchedulerEventDispatcher, an inner class of ResourceManager, as shown below:

// SchedulerEventDispatcher, an inner class of ResourceManager
public SchedulerEventDispatcher(ResourceScheduler scheduler) {
      super(SchedulerEventDispatcher.class.getName());
      this.scheduler = scheduler;
      // The dispatcher owns a dedicated thread that runs EventProcessor
      this.eventProcessor = new Thread(new EventProcessor());
      this.eventProcessor.setName("ResourceManager Event Processor");
    }

  The constructor in turn creates an EventProcessor, an inner class of SchedulerEventDispatcher, as shown below:

// EventProcessor, an inner class of SchedulerEventDispatcher (itself an inner class of ResourceManager)
private final class EventProcessor implements Runnable {
      @Override
      public void run() {

        SchedulerEvent event;

        while (!stopped && !Thread.currentThread().isInterrupted()) {
          try {
            event = eventQueue.take();
          } catch (InterruptedException e) {
            LOG.error("Returning, interrupted : " + e);
            return; // TODO: Kill RM.
          }

          try {
            scheduler.handle(event);
          } catch (Throwable t) {
            // An error occurred, but we are shutting down anyway.
            // If it was an InterruptedException, the very act of 
            // shutdown could have caused it and is probably harmless.
            if (stopped) {
              LOG.warn("Exception during shutdown: ", t);
              break;
            }
            LOG.fatal("Error in handling event type " + event.getType()
                + " to the scheduler", t);
            if (shouldExitOnError
                && !ShutdownHookManager.get().isShutdownInProgress()) {
              LOG.info("Exiting, bbye..");
              System.exit(-1);
            }
          }
        }
      }
    }
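
  To make the pattern easier to see in isolation, here is a minimal sketch of a single consumer thread draining a blocking queue. It is not Hadoop code: MiniEventLoop and all of its members are hypothetical names. It also assumes a simpler shutdown than the ResourceManager uses: instead of a stopped flag plus an interrupt, it enqueues an empty Optional as a sentinel, so every event posted before stop() is handled first.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Illustrative sketch only: MiniEventLoop is a hypothetical name, not a Hadoop class.
final class MiniEventLoop<E> {
    // Optional.empty() is used as a shutdown sentinel (an assumption of this sketch).
    private final BlockingQueue<Optional<E>> eventQueue = new LinkedBlockingQueue<>();
    private final Consumer<E> handler; // plays the role of scheduler.handle(event)
    private final Thread eventProcessor;

    MiniEventLoop(Consumer<E> handler) {
        this.handler = handler;
        this.eventProcessor = new Thread(this::runLoop, "Mini Event Processor");
    }

    void start() {
        eventProcessor.start();
    }

    // Producers only enqueue; the handler always runs on the processor thread.
    void post(E event) {
        eventQueue.add(Optional.of(event));
    }

    // Enqueue the sentinel and wait until the processor thread has drained the queue.
    void stop() {
        eventQueue.add(Optional.empty());
        try {
            eventProcessor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void runLoop() {
        while (true) {
            Optional<E> next;
            try {
                next = eventQueue.take(); // blocks until something is queued
            } catch (InterruptedException e) {
                return;
            }
            if (!next.isPresent()) {
                return; // sentinel reached: shut down
            }
            try {
                handler.accept(next.get());
            } catch (RuntimeException e) {
                // Simplification: log and keep going; the real RM may exit here.
                System.err.println("Error handling event " + next.get() + ": " + e);
            }
        }
    }

    public static void main(String[] args) {
        MiniEventLoop<String> loop =
            new MiniEventLoop<String>(e -> System.out.println("handled " + e));
        loop.start();
        loop.post("APP_ADDED");
        loop.post("NODE_UPDATE");
        loop.stop();
    }
}
```

  The point of the design is that callers never run scheduler logic themselves; they only enqueue, and one thread applies the events in order.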

  Inside run() there is a call to scheduler.handle(event). As we know, scheduler here is an instance of the default scheduler, CapacityScheduler, so the method that finally runs is CapacityScheduler's handle(), shown below:

// CapacityScheduler.java
public void handle(SchedulerEvent event) {
    switch(event.getType()) {
    ......
    case APP_ADDED:
    {
      AppAddedSchedulerEvent appAddedEvent = (AppAddedSchedulerEvent) event;
      String queueName =
          resolveReservationQueueName(appAddedEvent.getQueue(),
              appAddedEvent.getApplicationId(),
              appAddedEvent.getReservationID());
      if (queueName != null) {
        if (!appAddedEvent.getIsAppRecovering()) {
          addApplication(appAddedEvent.getApplicationId(), queueName,
              appAddedEvent.getUser());
        } else {
          addApplicationOnRecovery(appAddedEvent.getApplicationId(), queueName,
              appAddedEvent.getUser());
        }
      }
    }
    break;
    .......
  }

  This is the branch that handles APP_ADDED. For an application that is not being recovered, handle() calls addApplication(appAddedEvent.getApplicationId(), queueName, appAddedEvent.getUser()). Stepping into addApplication():

// CapacityScheduler.java
private synchronized void addApplication(ApplicationId applicationId,
      String queueName, String user) {
    queueName = getQueueMappings(applicationId, queueName, user);
    if (queueName == null) {
      // Exception encountered while getting queue mappings.
      return;
    }
    // sanity checks.
    CSQueue queue = getQueue(queueName);
    if (queue == null) {
      String message = "Application " + applicationId + 
      " submitted by user " + user + " to unknown queue: " + queueName;
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, message));
      return;
    }
    if (!(queue instanceof LeafQueue)) {
      String message = "Application " + applicationId + 
          " submitted by user " + user + " to non-leaf queue: " + queueName;
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, message));
      return;
    }
    // Submit to the queue
    try {
      queue.submitApplication(applicationId, user, queueName);
    } catch (AccessControlException ace) {
      LOG.info("Failed to submit application " + applicationId + " to queue "
          + queueName + " from user " + user, ace);
      this.rmContext.getDispatcher().getEventHandler()
          .handle(new RMAppEvent(applicationId,
              RMAppEventType.APP_REJECTED, ace.toString()));
      return;
    }
    // update the metrics
    queue.getMetrics().submitApp(user);
    SchedulerApplication<FiCaSchedulerApp> application =
        new SchedulerApplication<FiCaSchedulerApp>(queue, user);
    applications.put(applicationId, application);
    LOG.info("Accepted application " + applicationId + " from user: " + user
        + ", in queue: " + queueName);
    rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED));
  }
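
  The checks in addApplication() boil down to three gates: the queue must exist, it must be a leaf queue, and the user must be allowed to submit to it. The following is only a rough model of that decision, under my own simplifying assumptions: QueueAdmissionSketch, SimpleQueue and Outcome are hypothetical names, the ACL is reduced to a set of user names, and the result is returned directly instead of being sent as an RMAppEvent.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; none of these types exist in Hadoop.
final class QueueAdmissionSketch {
    enum Outcome { APP_ACCEPTED, APP_REJECTED }

    // Hypothetical stand-in for CSQueue: a leaf flag plus a set of permitted users.
    static final class SimpleQueue {
        final boolean leaf;
        final Set<String> allowedUsers;

        SimpleQueue(boolean leaf, String... allowedUsers) {
            this.leaf = leaf;
            this.allowedUsers = new HashSet<>(Arrays.asList(allowedUsers));
        }
    }

    private final Map<String, SimpleQueue> queues = new HashMap<>();
    private final Map<String, String> applications = new HashMap<>(); // appId -> queue name

    void addQueue(String name, SimpleQueue queue) {
        queues.put(name, queue);
    }

    Outcome addApplication(String appId, String queueName, String user) {
        SimpleQueue queue = queues.get(queueName);
        if (queue == null) {
            return Outcome.APP_REJECTED; // unknown queue
        }
        if (!queue.leaf) {
            return Outcome.APP_REJECTED; // applications may only go to leaf queues
        }
        if (!queue.allowedUsers.contains(user)) {
            return Outcome.APP_REJECTED; // models the AccessControlException branch
        }
        applications.put(appId, queueName); // remember the app only once it is accepted
        return Outcome.APP_ACCEPTED;
    }

    boolean isTracked(String appId) {
        return applications.containsKey(appId);
    }

    public static void main(String[] args) {
        QueueAdmissionSketch scheduler = new QueueAdmissionSketch();
        scheduler.addQueue("root", new SimpleQueue(false));
        scheduler.addQueue("default", new SimpleQueue(true, "alice"));
        System.out.println(scheduler.addApplication("app_1", "default", "alice"));
        System.out.println(scheduler.addApplication("app_2", "root", "alice"));
        System.out.println(scheduler.addApplication("app_3", "missing", "alice"));
    }
}
```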

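  Several branches of this function call this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_REJECTED, message)). We have analyzed similar calls before: this.rmContext is an RMContextImpl, this.rmContext.getDispatcher() returns the AsyncDispatcher, and this.rmContext.getDispatcher().getEventHandler() returns an AsyncDispatcher$GenericEventHandler. The handler for RMAppEventType events was registered in the serviceInit() method of RMActiveServices, an inner class of ResourceManager, so the event ends up in the handle() method of RMAppImpl. There (the previous transition took the application from RMAppState.NEW_SAVING to RMAppState.SUBMITTED) the event is matched against the transitions registered through the .addTransition(...) calls of StateMachineFactory.

   Likewise, rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_ACCEPTED)) is resolved in RMAppImpl against the transitions registered through StateMachineFactory's .addTransition(...) calls.

  As noted, the application is currently in RMAppState.SUBMITTED. The relevant transitions are: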

// RMAppImpl.java
private static final StateMachineFactory<RMAppImpl,
                                           RMAppState,
                                           RMAppEventType,
                                           RMAppEvent> stateMachineFactory
                               = new StateMachineFactory<RMAppImpl,
                                           RMAppState,
                                           RMAppEventType,
                                           RMAppEvent>(RMAppState.NEW)


     // Transitions from NEW state
    .......
    .addTransition(RMAppState.NEW, RMAppState.NEW_SAVING,
        RMAppEventType.START, new RMAppNewlySavingTransition())
    ......

    // Transitions from NEW_SAVING state
    ......
    .addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED,
        RMAppEventType.APP_NEW_SAVED, new AddApplicationToSchedulerTransition())
    ......

     // Transitions from SUBMITTED state
    ......
    .addTransition(RMAppState.SUBMITTED, RMAppState.FINAL_SAVING,
        RMAppEventType.APP_REJECTED,
        new FinalSavingTransition(
          new AppRejectedTransition(), RMAppState.FAILED))
    .addTransition(RMAppState.SUBMITTED, RMAppState.ACCEPTED,
        RMAppEventType.APP_ACCEPTED, new StartAppAttemptTransition())
    ......
    .installTopology();
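
  The transition table above is easier to reason about with a toy version. The sketch below is not the Hadoop StateMachineFactory. It is a hypothetical MiniStateMachine that keeps a map keyed by (current state, event type), runs the registered hook and then moves to the target state. It only models single-target transitions, and the state and event names are a small subset borrowed from the listing for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these are hypothetical names, not Hadoop classes.
final class StateMachineSketch {
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, FINAL_SAVING }
    enum AppEvent { START, APP_NEW_SAVED, APP_ACCEPTED, APP_REJECTED }

    static final class MiniStateMachine<S, T> {
        // One registered arc: the target state plus the hook to run on the way.
        private static final class Arc<S> {
            final S target;
            final Runnable hook;

            Arc(S target, Runnable hook) {
                this.target = target;
                this.hook = hook;
            }
        }

        private final Map<S, Map<T, Arc<S>>> table = new HashMap<>();
        private S current;

        MiniStateMachine(S initial) {
            this.current = initial;
        }

        // Registers "in state from, on eventType, run hook and go to state to".
        MiniStateMachine<S, T> addTransition(S from, S to, T eventType, Runnable hook) {
            table.computeIfAbsent(from, k -> new HashMap<>())
                 .put(eventType, new Arc<>(to, hook));
            return this;
        }

        // Looks up the arc for (current state, event type) and applies it.
        S doTransition(T eventType) {
            Map<T, Arc<S>> arcs = table.get(current);
            Arc<S> arc = (arcs == null) ? null : arcs.get(eventType);
            if (arc == null) {
                throw new IllegalStateException(
                    "Invalid event " + eventType + " at " + current);
            }
            arc.hook.run();
            current = arc.target;
            return current;
        }

        S getCurrentState() {
            return current;
        }
    }

    // Builds a machine with a few of the arcs from the listing; hooks just record a label.
    static MiniStateMachine<AppState, AppEvent> newAppStateMachine(List<String> log) {
        return new MiniStateMachine<AppState, AppEvent>(AppState.NEW)
            .addTransition(AppState.NEW, AppState.NEW_SAVING,
                AppEvent.START, () -> log.add("save app"))
            .addTransition(AppState.NEW_SAVING, AppState.SUBMITTED,
                AppEvent.APP_NEW_SAVED, () -> log.add("add app to scheduler"))
            .addTransition(AppState.SUBMITTED, AppState.FINAL_SAVING,
                AppEvent.APP_REJECTED, () -> log.add("reject app"))
            .addTransition(AppState.SUBMITTED, AppState.ACCEPTED,
                AppEvent.APP_ACCEPTED, () -> log.add("start app attempt"));
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        MiniStateMachine<AppState, AppEvent> sm = newAppStateMachine(log);
        sm.doTransition(AppEvent.START);
        sm.doTransition(AppEvent.APP_NEW_SAVED);
        sm.doTransition(AppEvent.APP_ACCEPTED);
        System.out.println(sm.getCurrentState() + " after hooks " + log);
    }
}
```

  An event that has no arc registered for the current state is rejected with an exception, which is why the same APP_ACCEPTED event only has an effect while the application is in SUBMITTED.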

  So when RMAppImpl receives an RMAppEventType.APP_ACCEPTED event, it moves from RMAppState.SUBMITTED to RMAppState.ACCEPTED. The callback for this transition is StartAppAttemptTransition, an inner class of RMAppImpl, shown below:

// StartAppAttemptTransition, an inner class of RMAppImpl (RMAppImpl.java)
private static final class StartAppAttemptTransition extends RMAppTransition {
    @Override
    public void transition(RMAppImpl app, RMAppEvent event) {
      app.createAndStartNewAttempt(false);
    };
  }

   Stepping into createAndStartNewAttempt():

// RMAppImpl.java
private void createNewAttempt() {
    ApplicationAttemptId appAttemptId =
        ApplicationAttemptId.newInstance(applicationId, attempts.size() + 1);
    RMAppAttempt attempt =
        new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
          submissionContext, conf,
          // The newly created attempt maybe last attempt if (number of
          // previously failed attempts(which should not include Preempted,
          // hardware error and NM resync) + 1) equal to the max-attempt
          // limit.
          maxAppAttempts == (getNumFailedAppAttempts() + 1), amReq);
    attempts.put(appAttemptId, attempt);
    currentAttempt = attempt;
  }
  
  private void
      createAndStartNewAttempt(boolean transferStateFromPreviousAttempt) {
    createNewAttempt();
    handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(),
      transferStateFromPreviousAttempt));
  }
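
  Two small pieces of arithmetic in createNewAttempt() are worth spelling out: the attempt number is attempts.size() + 1, and the "maybe last attempt" flag is true exactly when maxAppAttempts equals the number of counted failures plus one. The sketch below models only that bookkeeping. AttemptBookkeepingSketch and its methods are hypothetical, and the decision of which failures count toward the limit is reduced to a boolean parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; a hypothetical model of the bookkeeping, not Hadoop code.
final class AttemptBookkeepingSketch {
    private final int maxAppAttempts;
    private final List<Integer> attempts = new ArrayList<>();
    private int countedFailures = 0; // failures that count toward the limit
    private boolean currentMaybeLast = false;

    AttemptBookkeepingSketch(int maxAppAttempts) {
        this.maxAppAttempts = maxAppAttempts;
    }

    // Mirrors the two expressions in createNewAttempt(): the id and the flag.
    int createNewAttempt() {
        int attemptNumber = attempts.size() + 1;
        currentMaybeLast = (maxAppAttempts == countedFailures + 1);
        attempts.add(attemptNumber);
        return attemptNumber;
    }

    // Preemption, hardware errors and NM resync would pass countsTowardLimit = false.
    void markCurrentAttemptFailed(boolean countsTowardLimit) {
        if (countsTowardLimit) {
            countedFailures++;
        }
    }

    boolean isCurrentMaybeLast() {
        return currentMaybeLast;
    }

    public static void main(String[] args) {
        AttemptBookkeepingSketch app = new AttemptBookkeepingSketch(2);
        System.out.println("attempt " + app.createNewAttempt()
            + ", maybe last: " + app.isCurrentMaybeLast());
        app.markCurrentAttemptFailed(true);
        System.out.println("attempt " + app.createNewAttempt()
            + ", maybe last: " + app.isCurrentMaybeLast());
    }
}
```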

  createNewAttempt() creates a new attempt object, an RMAppAttemptImpl. createAndStartNewAttempt() then calls handler.handle(new RMAppStartAttemptEvent(currentAttempt.getAppAttemptId(), transferStateFromPreviousAttempt)), which constructs an RMAppStartAttemptEvent, shown below:

// RMAppStartAttemptEvent.java
public class RMAppStartAttemptEvent extends RMAppAttemptEvent {

  private final boolean transferStateFromPreviousAttempt;

  public RMAppStartAttemptEvent(ApplicationAttemptId appAttemptId,
      boolean transferStateFromPreviousAttempt) {
    super(appAttemptId, RMAppAttemptEventType.START);
    this.transferStateFromPreviousAttempt = transferStateFromPreviousAttempt;
  }

  public boolean getTransferStateFromPreviousAttempt() {
    return transferStateFromPreviousAttempt;
  }
}

  The event type here is RMAppAttemptEventType.START. In ResourceManager, events of type RMAppAttemptEventType are bound to the ApplicationAttemptEventDispatcher class, as shown below:

public enum RMAppAttemptEventType {
  // Source: RMApp
  START,
  KILL,

  // Source: AMLauncher
  LAUNCHED,
  LAUNCH_FAILED,

  // Source: AMLivelinessMonitor
  EXPIRE,
  
  // Source: ApplicationMasterService
  REGISTERED,
  STATUS_UPDATE,
  UNREGISTERED,

  // Source: Containers
  CONTAINER_ALLOCATED,
  CONTAINER_FINISHED,
  
  // Source: RMStateStore
  ATTEMPT_NEW_SAVED,
  ATTEMPT_UPDATE_SAVED,

  // Source: Scheduler
  ATTEMPT_ADDED,
  
  // Source: RMAttemptImpl.recover
  RECOVER

}
RMAppAttemptEventType.java

RMAppAttemptEventType.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java

// serviceInit() of RMActiveServices, an inner class of ResourceManager (ResourceManager.java)
// Register event handler for RmAppAttemptEvents
      rmDispatcher.register(RMAppAttemptEventType.class,
          new ApplicationAttemptEventDispatcher(rmContext));

  Stepping into ApplicationAttemptEventDispatcher:

// ApplicationAttemptEventDispatcher, an inner class of ResourceManager (ResourceManager.java)
@Private
  public static final class ApplicationAttemptEventDispatcher implements
      EventHandler<RMAppAttemptEvent> {

    private final RMContext rmContext;

    public ApplicationAttemptEventDispatcher(RMContext rmContext) {
      this.rmContext = rmContext;
    }

    @Override
    public void handle(RMAppAttemptEvent event) {
      ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
      ApplicationId appAttemptId = appAttemptID.getApplicationId();
      RMApp rmApp = this.rmContext.getRMApps().get(appAttemptId);
      if (rmApp != null) {
        RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptID);
        if (rmAppAttempt != null) {
          try {
            rmAppAttempt.handle(event);
          } catch (Throwable t) {
            LOG.error("Error in handling event type " + event.getType()
                + " for applicationAttempt " + appAttemptId, t);
          }
        }
      }
    }
  }
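
  The dispatcher above performs a two-level lookup: first the application, then the attempt inside it, and it silently drops the event if either lookup fails. A minimal sketch of that routing follows. AttemptRoutingSketch and AttemptHandler are hypothetical names, ids are plain strings and integers, and the boolean return value is my own addition so the outcome is observable (the real handle() returns void).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; hypothetical types that mimic the two-level lookup.
final class AttemptRoutingSketch {
    // Stand-in for RMAppAttempt.handle(event).
    interface AttemptHandler {
        void handle(String eventType);
    }

    // appId -> (attempt number -> handler), like getRMApps() then getRMAppAttempt().
    private final Map<String, Map<Integer, AttemptHandler>> apps = new HashMap<>();

    void register(String appId, int attemptNumber, AttemptHandler handler) {
        apps.computeIfAbsent(appId, k -> new HashMap<>()).put(attemptNumber, handler);
    }

    // Returns true if a handler was found; unknown targets are ignored.
    boolean dispatch(String appId, int attemptNumber, String eventType) {
        Map<Integer, AttemptHandler> attempts = apps.get(appId);
        if (attempts == null) {
            return false; // no such application
        }
        AttemptHandler handler = attempts.get(attemptNumber);
        if (handler == null) {
            return false; // application exists, attempt does not
        }
        try {
            handler.handle(eventType);
        } catch (RuntimeException e) {
            // Like the original, log the failure instead of propagating it.
            System.err.println("Error in handling event type " + eventType
                + " for attempt " + appId + "#" + attemptNumber + ": " + e);
        }
        return true;
    }

    public static void main(String[] args) {
        AttemptRoutingSketch dispatcher = new AttemptRoutingSketch();
        List<String> received = new ArrayList<>();
        dispatcher.register("app_1", 1, received::add);
        dispatcher.dispatch("app_1", 1, "START");
        dispatcher.dispatch("app_1", 2, "START"); // dropped: unknown attempt
        System.out.println(received);
    }
}
```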

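  Inside handle(), this class calls rmAppAttempt.handle(event), where rmAppAttempt is an instance of the RMAppAttempt interface. That interface has a single implementation, public class RMAppAttemptImpl implements RMAppAttempt, Recoverable {...}, so the call ends up in the handle() method of RMAppAttemptImpl.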

   1 @SuppressWarnings({"unchecked", "rawtypes"})
   2 public class RMAppAttemptImpl implements RMAppAttempt, Recoverable {
   3 
   4   private static final Log LOG = LogFactory.getLog(RMAppAttemptImpl.class);
   5 
   6   private static final RecordFactory recordFactory = RecordFactoryProvider
   7       .getRecordFactory(null);
   8 
   9   public final static Priority AM_CONTAINER_PRIORITY = recordFactory
  10       .newRecordInstance(Priority.class);
  11   static {
  12     AM_CONTAINER_PRIORITY.setPriority(0);
  13   }
  14 
  15   private final StateMachine<RMAppAttemptState,
  16                              RMAppAttemptEventType,
  17                              RMAppAttemptEvent> stateMachine;
  18 
  19   private final RMContext rmContext;
  20   private final EventHandler eventHandler;
  21   private final YarnScheduler scheduler;
  22   private final ApplicationMasterService masterService;
  23 
  24   private final ReadLock readLock;
  25   private final WriteLock writeLock;
  26 
  27   private final ApplicationAttemptId applicationAttemptId;
  28   private final ApplicationSubmissionContext submissionContext;
  29   private Token<AMRMTokenIdentifier> amrmToken = null;
  30   private volatile Integer amrmTokenKeyId = null;
  31   private SecretKey clientTokenMasterKey = null;
  32 
  33   private ConcurrentMap<NodeId, List<ContainerStatus>>
  34       justFinishedContainers =
  35       new ConcurrentHashMap<NodeId, List<ContainerStatus>>();
  36   // Tracks the previous finished containers that are waiting to be
  37   // verified as received by the AM. If the AM sends the next allocate
  38   // request it implicitly acks this list.
  39   private ConcurrentMap<NodeId, List<ContainerStatus>>
  40       finishedContainersSentToAM =
  41       new ConcurrentHashMap<NodeId, List<ContainerStatus>>();
  42   private Container masterContainer;
  43 
  44   private float progress = 0;
  45   private String host = "N/A";
  46   private int rpcPort = -1;
  47   private String originalTrackingUrl = "N/A";
  48   private String proxiedTrackingUrl = "N/A";
  49   private long startTime = 0;
  50   private long finishTime = 0;
  51   private long launchAMStartTime = 0;
  52   private long launchAMEndTime = 0;
  53 
  54   // Set to null initially. Will eventually get set
  55   // if an RMAppAttemptUnregistrationEvent occurs
  56   private FinalApplicationStatus finalStatus = null;
  57   private final StringBuilder diagnostics = new StringBuilder();
  58   private int amContainerExitStatus = ContainerExitStatus.INVALID;
  59 
  60   private Configuration conf;
  61   // Since AM preemption, hardware error and NM resync are not counted towards
  62   // AM failure count, even if this flag is true, a new attempt can still be
  63   // re-created if this attempt is eventually failed because of preemption,
  64   // hardware error or NM resync. So this flag indicates that this may be
  65   // last attempt.
  66   private final boolean maybeLastAttempt;
  67   private static final ExpiredTransition EXPIRED_TRANSITION =
  68       new ExpiredTransition();
  69 
  70   private RMAppAttemptEvent eventCausingFinalSaving;
  71   private RMAppAttemptState targetedFinalState;
  72   private RMAppAttemptState recoveredFinalState;
  73   private RMAppAttemptState stateBeforeFinalSaving;
  74   private Object transitionTodo;
  75   
  76   private RMAppAttemptMetrics attemptMetrics = null;
  77   private ResourceRequest amReq = null;
  78 
  79   private static final StateMachineFactory<RMAppAttemptImpl,
  80                                            RMAppAttemptState,
  81                                            RMAppAttemptEventType,
  82                                            RMAppAttemptEvent>
  83        stateMachineFactory  = new StateMachineFactory<RMAppAttemptImpl,
  84                                             RMAppAttemptState,
  85                                             RMAppAttemptEventType,
  86                                      RMAppAttemptEvent>(RMAppAttemptState.NEW)
  87 
  88        // Transitions from NEW State
  89       .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.SUBMITTED,
  90           RMAppAttemptEventType.START, new AttemptStartedTransition())
  91       .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.FINAL_SAVING,
  92           RMAppAttemptEventType.KILL,
  93           new FinalSavingTransition(new BaseFinalTransition(
  94             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
  95       .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.FINAL_SAVING,
  96           RMAppAttemptEventType.REGISTERED,
  97           new FinalSavingTransition(
  98             new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))
  99       .addTransition( RMAppAttemptState.NEW,
 100           EnumSet.of(RMAppAttemptState.FINISHED, RMAppAttemptState.KILLED,
 101             RMAppAttemptState.FAILED, RMAppAttemptState.LAUNCHED),
 102           RMAppAttemptEventType.RECOVER, new AttemptRecoveredTransition())
 103           
 104       // Transitions from SUBMITTED state
 105       .addTransition(RMAppAttemptState.SUBMITTED, 
 106           EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
 107                      RMAppAttemptState.SCHEDULED),
 108           RMAppAttemptEventType.ATTEMPT_ADDED,
 109           new ScheduleTransition())
 110       .addTransition(RMAppAttemptState.SUBMITTED, RMAppAttemptState.FINAL_SAVING,
 111           RMAppAttemptEventType.KILL,
 112           new FinalSavingTransition(new BaseFinalTransition(
 113             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 114       .addTransition(RMAppAttemptState.SUBMITTED, RMAppAttemptState.FINAL_SAVING,
 115           RMAppAttemptEventType.REGISTERED,
 116           new FinalSavingTransition(
 117             new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))
 118           
 119        // Transitions from SCHEDULED State
 120       .addTransition(RMAppAttemptState.SCHEDULED,
 121           EnumSet.of(RMAppAttemptState.ALLOCATED_SAVING,
 122             RMAppAttemptState.SCHEDULED),
 123           RMAppAttemptEventType.CONTAINER_ALLOCATED,
 124           new AMContainerAllocatedTransition())
 125       .addTransition(RMAppAttemptState.SCHEDULED, RMAppAttemptState.FINAL_SAVING,
 126           RMAppAttemptEventType.KILL,
 127           new FinalSavingTransition(new BaseFinalTransition(
 128             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 129       .addTransition(RMAppAttemptState.SCHEDULED,
 130           RMAppAttemptState.FINAL_SAVING,
 131           RMAppAttemptEventType.CONTAINER_FINISHED,
 132           new FinalSavingTransition(
 133             new AMContainerCrashedBeforeRunningTransition(),
 134             RMAppAttemptState.FAILED))
 135 
 136        // Transitions from ALLOCATED_SAVING State
 137       .addTransition(RMAppAttemptState.ALLOCATED_SAVING, 
 138           RMAppAttemptState.ALLOCATED,
 139           RMAppAttemptEventType.ATTEMPT_NEW_SAVED, new AttemptStoredTransition())
 140           
 141        // App could be killed by the client. So need to handle this. 
 142       .addTransition(RMAppAttemptState.ALLOCATED_SAVING, 
 143           RMAppAttemptState.FINAL_SAVING,
 144           RMAppAttemptEventType.KILL,
 145           new FinalSavingTransition(new BaseFinalTransition(
 146             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 147       .addTransition(RMAppAttemptState.ALLOCATED_SAVING, 
 148           RMAppAttemptState.FINAL_SAVING,
 149           RMAppAttemptEventType.CONTAINER_FINISHED,
 150           new FinalSavingTransition(
 151             new AMContainerCrashedBeforeRunningTransition(), 
 152             RMAppAttemptState.FAILED))
 153 
 154        // Transitions from LAUNCHED_UNMANAGED_SAVING State
 155       .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING, 
 156           RMAppAttemptState.LAUNCHED,
 157           RMAppAttemptEventType.ATTEMPT_NEW_SAVED, 
 158           new UnmanagedAMAttemptSavedTransition())
 159       // attempt should not try to register in this state
 160       .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING, 
 161           RMAppAttemptState.FINAL_SAVING,
 162           RMAppAttemptEventType.REGISTERED,
 163           new FinalSavingTransition(
 164             new UnexpectedAMRegisteredTransition(), RMAppAttemptState.FAILED))
 165       // App could be killed by the client. So need to handle this. 
 166       .addTransition(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING, 
 167           RMAppAttemptState.FINAL_SAVING,
 168           RMAppAttemptEventType.KILL,
 169           new FinalSavingTransition(new BaseFinalTransition(
 170             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 171 
 172        // Transitions from ALLOCATED State
 173       .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.LAUNCHED,
 174           RMAppAttemptEventType.LAUNCHED, new AMLaunchedTransition())
 175       .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
 176           RMAppAttemptEventType.LAUNCH_FAILED,
 177           new FinalSavingTransition(new LaunchFailedTransition(),
 178             RMAppAttemptState.FAILED))
 179       .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
 180           RMAppAttemptEventType.KILL,
 181           new FinalSavingTransition(
 182             new KillAllocatedAMTransition(), RMAppAttemptState.KILLED))
 183           
 184       .addTransition(RMAppAttemptState.ALLOCATED, RMAppAttemptState.FINAL_SAVING,
 185           RMAppAttemptEventType.CONTAINER_FINISHED,
 186           new FinalSavingTransition(
 187             new AMContainerCrashedBeforeRunningTransition(), RMAppAttemptState.FAILED))
 188 
 189        // Transitions from LAUNCHED State
 190       .addTransition(RMAppAttemptState.LAUNCHED, RMAppAttemptState.RUNNING,
 191           RMAppAttemptEventType.REGISTERED, new AMRegisteredTransition())
 192       .addTransition(RMAppAttemptState.LAUNCHED,
 193           EnumSet.of(RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING),
 194           RMAppAttemptEventType.CONTAINER_FINISHED,
 195           new ContainerFinishedTransition(
 196             new AMContainerCrashedBeforeRunningTransition(),
 197             RMAppAttemptState.LAUNCHED))
 198       .addTransition(
 199           RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING,
 200           RMAppAttemptEventType.EXPIRE,
 201           new FinalSavingTransition(EXPIRED_TRANSITION,
 202             RMAppAttemptState.FAILED))
 203       .addTransition(RMAppAttemptState.LAUNCHED, RMAppAttemptState.FINAL_SAVING,
 204           RMAppAttemptEventType.KILL,
 205           new FinalSavingTransition(new FinalTransition(
 206             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 207 
 208        // Transitions from RUNNING State
 209       .addTransition(RMAppAttemptState.RUNNING,
 210           EnumSet.of(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINISHED),
 211           RMAppAttemptEventType.UNREGISTERED, new AMUnregisteredTransition())
 212       .addTransition(RMAppAttemptState.RUNNING, RMAppAttemptState.RUNNING,
 213           RMAppAttemptEventType.STATUS_UPDATE, new StatusUpdateTransition())
 214       .addTransition(RMAppAttemptState.RUNNING, RMAppAttemptState.RUNNING,
 215           RMAppAttemptEventType.CONTAINER_ALLOCATED)
 216       .addTransition(
 217           RMAppAttemptState.RUNNING,
 218           EnumSet.of(RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING),
 219           RMAppAttemptEventType.CONTAINER_FINISHED,
 220           new ContainerFinishedTransition(
 221             new AMContainerCrashedAtRunningTransition(),
 222             RMAppAttemptState.RUNNING))
 223       .addTransition(
 224           RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING,
 225           RMAppAttemptEventType.EXPIRE,
 226           new FinalSavingTransition(EXPIRED_TRANSITION,
 227             RMAppAttemptState.FAILED))
 228       .addTransition(
 229           RMAppAttemptState.RUNNING, RMAppAttemptState.FINAL_SAVING,
 230           RMAppAttemptEventType.KILL,
 231           new FinalSavingTransition(new FinalTransition(
 232             RMAppAttemptState.KILLED), RMAppAttemptState.KILLED))
 233 
 234       // Transitions from FINAL_SAVING State
 235       .addTransition(RMAppAttemptState.FINAL_SAVING,
 236           EnumSet.of(RMAppAttemptState.FINISHING, RMAppAttemptState.FAILED,
 237             RMAppAttemptState.KILLED, RMAppAttemptState.FINISHED),
 238             RMAppAttemptEventType.ATTEMPT_UPDATE_SAVED,
 239             new FinalStateSavedTransition())
 240       .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
 241           RMAppAttemptEventType.CONTAINER_FINISHED,
 242           new ContainerFinishedAtFinalSavingTransition())
 243       .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
 244           RMAppAttemptEventType.EXPIRE,
 245           new AMExpiredAtFinalSavingTransition())
 246       .addTransition(RMAppAttemptState.FINAL_SAVING, RMAppAttemptState.FINAL_SAVING,
 247           EnumSet.of(
 248               RMAppAttemptEventType.UNREGISTERED,
 249               RMAppAttemptEventType.STATUS_UPDATE,
 250               RMAppAttemptEventType.LAUNCHED,
 251               RMAppAttemptEventType.LAUNCH_FAILED,
 252             // should be fixed to reject container allocate request at Final
 253             // Saving in scheduler
 254               RMAppAttemptEventType.CONTAINER_ALLOCATED,
 255               RMAppAttemptEventType.ATTEMPT_NEW_SAVED,
 256               RMAppAttemptEventType.KILL))
 257 
 258       // Transitions from FAILED State
 259       // For work-preserving AM restart, failed attempts still capture
 260       // CONTAINER_FINISHED events and record the finished containers for
 261       // use by the next new attempt.
 262       .addTransition(RMAppAttemptState.FAILED, RMAppAttemptState.FAILED,
 263           RMAppAttemptEventType.CONTAINER_FINISHED,
 264           new ContainerFinishedAtFinalStateTransition())
 265       .addTransition(
 266           RMAppAttemptState.FAILED,
 267           RMAppAttemptState.FAILED,
 268           EnumSet.of(
 269               RMAppAttemptEventType.EXPIRE,
 270               RMAppAttemptEventType.KILL,
 271               RMAppAttemptEventType.UNREGISTERED,
 272               RMAppAttemptEventType.STATUS_UPDATE,
 273               RMAppAttemptEventType.CONTAINER_ALLOCATED))
 274 
 275       // Transitions from FINISHING State
 276       .addTransition(RMAppAttemptState.FINISHING,
 277           EnumSet.of(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED),
 278           RMAppAttemptEventType.CONTAINER_FINISHED,
 279           new AMFinishingContainerFinishedTransition())
 280       .addTransition(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED,
 281           RMAppAttemptEventType.EXPIRE,
 282           new FinalTransition(RMAppAttemptState.FINISHED))
 283       .addTransition(RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHING,
 284           EnumSet.of(
 285               RMAppAttemptEventType.UNREGISTERED,
 286               RMAppAttemptEventType.STATUS_UPDATE,
 287               RMAppAttemptEventType.CONTAINER_ALLOCATED,
 288             // ignore Kill as we have already saved the final Finished state in
 289             // state store.
 290               RMAppAttemptEventType.KILL))
 291 
 292       // Transitions from FINISHED State
 293       .addTransition(
 294           RMAppAttemptState.FINISHED,
 295           RMAppAttemptState.FINISHED,
 296           EnumSet.of(
 297               RMAppAttemptEventType.EXPIRE,
 298               RMAppAttemptEventType.UNREGISTERED,
 299               RMAppAttemptEventType.CONTAINER_ALLOCATED,
 300               RMAppAttemptEventType.KILL))
 301       .addTransition(RMAppAttemptState.FINISHED, 
 302           RMAppAttemptState.FINISHED, 
 303           RMAppAttemptEventType.CONTAINER_FINISHED, 
 304           new ContainerFinishedAtFinalStateTransition())
 305 
 306       // Transitions from KILLED State
 307       .addTransition(
 308           RMAppAttemptState.KILLED,
 309           RMAppAttemptState.KILLED,
 310           EnumSet.of(RMAppAttemptEventType.ATTEMPT_ADDED,
 311               RMAppAttemptEventType.LAUNCHED,
 312               RMAppAttemptEventType.LAUNCH_FAILED,
 313               RMAppAttemptEventType.EXPIRE,
 314               RMAppAttemptEventType.REGISTERED,
 315               RMAppAttemptEventType.CONTAINER_ALLOCATED,
 316               RMAppAttemptEventType.UNREGISTERED,
 317               RMAppAttemptEventType.KILL,
 318               RMAppAttemptEventType.STATUS_UPDATE))
 319       .addTransition(RMAppAttemptState.KILLED, 
 320           RMAppAttemptState.KILLED, 
 321           RMAppAttemptEventType.CONTAINER_FINISHED, 
 322           new ContainerFinishedAtFinalStateTransition())
 323     .installTopology();
 324 
 325   public RMAppAttemptImpl(ApplicationAttemptId appAttemptId,
 326       RMContext rmContext, YarnScheduler scheduler,
 327       ApplicationMasterService masterService,
 328       ApplicationSubmissionContext submissionContext,
 329       Configuration conf, boolean maybeLastAttempt, ResourceRequest amReq) {
 330     this.conf = conf;
 331     this.applicationAttemptId = appAttemptId;
 332     this.rmContext = rmContext;
 333     this.eventHandler = rmContext.getDispatcher().getEventHandler();
 334     this.submissionContext = submissionContext;
 335     this.scheduler = scheduler;
 336     this.masterService = masterService;
 337 
 338     ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
 339     this.readLock = lock.readLock();
 340     this.writeLock = lock.writeLock();
 341 
 342     this.proxiedTrackingUrl = generateProxyUriWithScheme();
 343     this.maybeLastAttempt = maybeLastAttempt;
 344     this.stateMachine = stateMachineFactory.make(this);
 345 
 346     this.attemptMetrics =
 347         new RMAppAttemptMetrics(applicationAttemptId, rmContext);
 348     
 349     this.amReq = amReq;
 350   }
 351 
 352   @Override
 353   public ApplicationAttemptId getAppAttemptId() {
 354     return this.applicationAttemptId;
 355   }
 356 
 357   @Override
 358   public ApplicationSubmissionContext getSubmissionContext() {
 359     return this.submissionContext;
 360   }
 361 
 362   @Override
 363   public FinalApplicationStatus getFinalApplicationStatus() {
 364     this.readLock.lock();
 365     try {
 366       return this.finalStatus;
 367     } finally {
 368       this.readLock.unlock();
 369     }
 370   }
 371 
 372   @Override
 373   public RMAppAttemptState getAppAttemptState() {
 374     this.readLock.lock();
 375     try {
 376       return this.stateMachine.getCurrentState();
 377     } finally {
 378       this.readLock.unlock();
 379     }
 380   }
 381 
 382   @Override
 383   public String getHost() {
 384     this.readLock.lock();
 385 
 386     try {
 387       return this.host;
 388     } finally {
 389       this.readLock.unlock();
 390     }
 391   }
 392 
 393   @Override
 394   public int getRpcPort() {
 395     this.readLock.lock();
 396 
 397     try {
 398       return this.rpcPort;
 399     } finally {
 400       this.readLock.unlock();
 401     }
 402   }
 403 
 404   @Override
 405   public String getTrackingUrl() {
 406     this.readLock.lock();
 407     try {
 408       return (getSubmissionContext().getUnmanagedAM()) ? 
 409               this.originalTrackingUrl : this.proxiedTrackingUrl;
 410     } finally {
 411       this.readLock.unlock();
 412     }
 413   }
 414   
 415   @Override
 416   public String getOriginalTrackingUrl() {
 417     this.readLock.lock();
 418     try {
 419       return this.originalTrackingUrl;
 420     } finally {
 421       this.readLock.unlock();
 422     }    
 423   }
 424   
 425   @Override
 426   public String getWebProxyBase() {
 427     this.readLock.lock();
 428     try {
 429       return ProxyUriUtils.getPath(applicationAttemptId.getApplicationId());
 430     } finally {
 431       this.readLock.unlock();
 432     }    
 433   }
 434   
 435   private String generateProxyUriWithScheme() {
 436     this.readLock.lock();
 437     try {
 438       final String scheme = WebAppUtils.getHttpSchemePrefix(conf);
 439       String proxy = WebAppUtils.getProxyHostAndPort(conf);
 440       URI proxyUri = ProxyUriUtils.getUriFromAMUrl(scheme, proxy);
 441       URI result = ProxyUriUtils.getProxyUri(null, proxyUri,
 442           applicationAttemptId.getApplicationId());
 443       return result.toASCIIString();
 444     } catch (URISyntaxException e) {
 445       LOG.warn("Could not proxify the uri for "
 446           + applicationAttemptId.getApplicationId(), e);
 447       return null;
 448     } finally {
 449       this.readLock.unlock();
 450     }
 451   }
 452 
 453   private void setTrackingUrlToRMAppPage(RMAppAttemptState stateToBeStored) {
 454     originalTrackingUrl = pjoin(
 455         WebAppUtils.getResolvedRMWebAppURLWithScheme(conf),
 456         "cluster", "app", getAppAttemptId().getApplicationId());
 457     switch (stateToBeStored) {
 458     case KILLED:
 459     case FAILED:
 460       proxiedTrackingUrl = originalTrackingUrl;
 461       break;
 462     default:
 463       break;
 464     }
 465   }
 466 
 467   private void setTrackingUrlToAHSPage(RMAppAttemptState stateToBeStored) {
 468     originalTrackingUrl = pjoin(
 469         WebAppUtils.getHttpSchemePrefix(conf) +
 470         WebAppUtils.getAHSWebAppURLWithoutScheme(conf),
 471         "applicationhistory", "app", getAppAttemptId().getApplicationId());
 472     switch (stateToBeStored) {
 473     case KILLED:
 474     case FAILED:
 475       proxiedTrackingUrl = originalTrackingUrl;
 476       break;
 477     default:
 478       break;
 479     }
 480   }
 481 
 482   private void invalidateAMHostAndPort() {
 483     this.host = "N/A";
 484     this.rpcPort = -1;
 485   }
 486 
 487   // This is only used for RMStateStore. Normal operation must invoke the secret
 488   // manager to get the key and not use the local key directly.
 489   @Override
 490   public SecretKey getClientTokenMasterKey() {
 491     return this.clientTokenMasterKey;
 492   }
 493 
 494   @Override
 495   public Token<AMRMTokenIdentifier> getAMRMToken() {
 496     this.readLock.lock();
 497     try {
 498       return this.amrmToken;
 499     } finally {
 500       this.readLock.unlock();
 501     }
 502   }
 503 
 504   @Private
 505   public void setAMRMToken(Token<AMRMTokenIdentifier> lastToken) {
 506     this.writeLock.lock();
 507     try {
 508       this.amrmToken = lastToken;
 509       this.amrmTokenKeyId = null;
 510     } finally {
 511       this.writeLock.unlock();
 512     }
 513   }
 514 
 515   @Private
 516   public int getAMRMTokenKeyId() {
 517     Integer keyId = this.amrmTokenKeyId;
 518     if (keyId == null) {
 519       this.readLock.lock();
 520       try {
 521         if (this.amrmToken == null) {
 522           throw new YarnRuntimeException("Missing AMRM token for "
 523               + this.applicationAttemptId);
 524         }
 525         keyId = this.amrmToken.decodeIdentifier().getKeyId();
 526         this.amrmTokenKeyId = keyId;
 527       } catch (IOException e) {
 528         throw new YarnRuntimeException("AMRM token decode error for "
 529             + this.applicationAttemptId, e);
 530       } finally {
 531         this.readLock.unlock();
 532       }
 533     }
 534     return keyId;
 535   }
 536 
 537   @Override
 538   public Token<ClientToAMTokenIdentifier> createClientToken(String client) {
 539     this.readLock.lock();
 540 
 541     try {
 542       Token<ClientToAMTokenIdentifier> token = null;
 543       ClientToAMTokenSecretManagerInRM secretMgr =
 544           this.rmContext.getClientToAMTokenSecretManager();
 545       if (client != null &&
 546           secretMgr.getMasterKey(this.applicationAttemptId) != null) {
 547         token = new Token<ClientToAMTokenIdentifier>(
 548             new ClientToAMTokenIdentifier(this.applicationAttemptId, client),
 549             secretMgr);
 550       }
 551       return token;
 552     } finally {
 553       this.readLock.unlock();
 554     }
 555   }
 556 
 557   @Override
 558   public String getDiagnostics() {
 559     this.readLock.lock();
 560 
 561     try {
 562       return this.diagnostics.toString();
 563     } finally {
 564       this.readLock.unlock();
 565     }
 566   }
 567 
 568   public int getAMContainerExitStatus() {
 569     this.readLock.lock();
 570     try {
 571       return this.amContainerExitStatus;
 572     } finally {
 573       this.readLock.unlock();
 574     }
 575   }
 576 
 577   @Override
 578   public float getProgress() {
 579     this.readLock.lock();
 580 
 581     try {
 582       return this.progress;
 583     } finally {
 584       this.readLock.unlock();
 585     }
 586   }
 587 
 588   @VisibleForTesting
 589   @Override
 590   public List<ContainerStatus> getJustFinishedContainers() {
 591     this.readLock.lock();
 592     try {
 593       List<ContainerStatus> returnList = new ArrayList<ContainerStatus>();
 594       for (Collection<ContainerStatus> containerStatusList :
 595           justFinishedContainers.values()) {
 596         returnList.addAll(containerStatusList);
 597       }
 598       return returnList;
 599     } finally {
 600       this.readLock.unlock();
 601     }
 602   }
 603 
 604   @Override
 605   public ConcurrentMap<NodeId, List<ContainerStatus>>
 606   getJustFinishedContainersReference
 607       () {
 608     this.readLock.lock();
 609     try {
 610       return this.justFinishedContainers;
 611     } finally {
 612       this.readLock.unlock();
 613     }
 614   }
 615 
 616   @Override
 617   public ConcurrentMap<NodeId, List<ContainerStatus>>
 618   getFinishedContainersSentToAMReference() {
 619     this.readLock.lock();
 620     try {
 621       return this.finishedContainersSentToAM;
 622     } finally {
 623       this.readLock.unlock();
 624     }
 625   }
 626 
 627   @Override
 628   public List<ContainerStatus> pullJustFinishedContainers() {
 629     this.writeLock.lock();
 630 
 631     try {
 632       List<ContainerStatus> returnList = new ArrayList<ContainerStatus>();
 633 
 634       // A new allocate means the AM received the previously sent
 635       // finishedContainers. We can ack this to NM now
 636       sendFinishedContainersToNM();
 637 
 638       // Mark every containerStatus as being sent to AM though we may return
 639       // only the ones that belong to the current attempt
 640       boolean keepContainersAcressAttempts = this.submissionContext
 641           .getKeepContainersAcrossApplicationAttempts();
 642       for (NodeId nodeId:justFinishedContainers.keySet()) {
 643 
 644         // Clear and get current values
 645         List<ContainerStatus> finishedContainers = justFinishedContainers.put
 646             (nodeId, new ArrayList<ContainerStatus>());
 647 
 648         if (keepContainersAcressAttempts) {
 649           returnList.addAll(finishedContainers);
 650         } else {
 651           // Filter out containers from previous attempt
 652           for (ContainerStatus containerStatus: finishedContainers) {
 653             if (containerStatus.getContainerId().getApplicationAttemptId()
 654                 .equals(this.getAppAttemptId())) {
 655               returnList.add(containerStatus);
 656             }
 657           }
 658         }
 659 
 660         finishedContainersSentToAM.putIfAbsent(nodeId, new ArrayList
 661               <ContainerStatus>());
 662         finishedContainersSentToAM.get(nodeId).addAll(finishedContainers);
 663       }
 664 
 665       return returnList;
 666     } finally {
 667       this.writeLock.unlock();
 668     }
 669   }
 670 
 671   @Override
 672   public Container getMasterContainer() {
 673     this.readLock.lock();
 674 
 675     try {
 676       return this.masterContainer;
 677     } finally {
 678       this.readLock.unlock();
 679     }
 680   }
 681 
 682   @InterfaceAudience.Private
 683   @VisibleForTesting
 684   public void setMasterContainer(Container container) {
 685     masterContainer = container;
 686   }
 687 
 688   @Override
 689   public void handle(RMAppAttemptEvent event) {
 690 
 691     this.writeLock.lock();
 692 
 693     try {
 694       ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
 695       LOG.debug("Processing event for " + appAttemptID + " of type "
 696           + event.getType());
 697       final RMAppAttemptState oldState = getAppAttemptState();
 698       try {
 699         /* keep the master in sync with the state machine */
 700         this.stateMachine.doTransition(event.getType(), event);
 701       } catch (InvalidStateTransitonException e) {
 702         LOG.error("Can't handle this event at current state", e);
 703         /* TODO fail the application on the failed transition */
 704       }
 705 
 706       if (oldState != getAppAttemptState()) {
 707         LOG.info(appAttemptID + " State change from " + oldState + " to "
 708             + getAppAttemptState());
 709       }
 710     } finally {
 711       this.writeLock.unlock();
 712     }
 713   }
 714 
 715   @Override
 716   public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
 717     this.readLock.lock();
 718     try {
 719       ApplicationResourceUsageReport report =
 720           scheduler.getAppResourceUsageReport(this.getAppAttemptId());
 721       if (report == null) {
 722         report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
 723       }
 724       AggregateAppResourceUsage resUsage =
 725           this.attemptMetrics.getAggregateAppResourceUsage();
 726       report.setMemorySeconds(resUsage.getMemorySeconds());
 727       report.setVcoreSeconds(resUsage.getVcoreSeconds());
 728       return report;
 729     } finally {
 730       this.readLock.unlock();
 731     }
 732   }
 733 
 734   @Override
 735   public void recover(RMState state) {
 736     ApplicationStateData appState =
 737         state.getApplicationState().get(getAppAttemptId().getApplicationId());
 738     ApplicationAttemptStateData attemptState =
 739         appState.getAttempt(getAppAttemptId());
 740     assert attemptState != null;
 741     LOG.info("Recovering attempt: " + getAppAttemptId() + " with final state: "
 742         + attemptState.getState());
 743     diagnostics.append("Attempt recovered after RM restart");
 744     diagnostics.append(attemptState.getDiagnostics());
 745     this.amContainerExitStatus = attemptState.getAMContainerExitStatus();
 746     if (amContainerExitStatus == ContainerExitStatus.PREEMPTED) {
 747       this.attemptMetrics.setIsPreempted();
 748     }
 749 
 750     Credentials credentials = attemptState.getAppAttemptTokens();
 751     setMasterContainer(attemptState.getMasterContainer());
 752     recoverAppAttemptCredentials(credentials, attemptState.getState());
 753     this.recoveredFinalState = attemptState.getState();
 754     this.originalTrackingUrl = attemptState.getFinalTrackingUrl();
 755     this.finalStatus = attemptState.getFinalApplicationStatus();
 756     this.startTime = attemptState.getStartTime();
 757     this.finishTime = attemptState.getFinishTime();
 758     this.attemptMetrics.updateAggregateAppResourceUsage(
 759         attemptState.getMemorySeconds(),attemptState.getVcoreSeconds());
 760   }
 761 
 762   public void transferStateFromPreviousAttempt(RMAppAttempt attempt) {
 763     this.justFinishedContainers = attempt.getJustFinishedContainersReference();
 764     this.finishedContainersSentToAM =
 765         attempt.getFinishedContainersSentToAMReference();
 766   }
 767 
 768   private void recoverAppAttemptCredentials(Credentials appAttemptTokens,
 769       RMAppAttemptState state) {
 770     if (appAttemptTokens == null || state == RMAppAttemptState.FAILED
 771         || state == RMAppAttemptState.FINISHED
 772         || state == RMAppAttemptState.KILLED) {
 773       return;
 774     }
 775 
 776     if (UserGroupInformation.isSecurityEnabled()) {
 777       byte[] clientTokenMasterKeyBytes = appAttemptTokens.getSecretKey(
 778           RMStateStore.AM_CLIENT_TOKEN_MASTER_KEY_NAME);
 779       if (clientTokenMasterKeyBytes != null) {
 780         clientTokenMasterKey = rmContext.getClientToAMTokenSecretManager()
 781             .registerMasterKey(applicationAttemptId, clientTokenMasterKeyBytes);
 782       }
 783     }
 784 
 785     setAMRMToken(rmContext.getAMRMTokenSecretManager().createAndGetAMRMToken(
 786         applicationAttemptId));
 787   }
 788 
 789   private static class BaseTransition implements
 790       SingleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent> {
 791 
 792     @Override
 793     public void transition(RMAppAttemptImpl appAttempt,
 794         RMAppAttemptEvent event) {
 795     }
 796 
 797   }
 798 
 799   private static final class AttemptStartedTransition extends BaseTransition {
 800     @Override
 801     public void transition(RMAppAttemptImpl appAttempt,
 802         RMAppAttemptEvent event) {
 803 
 804       boolean transferStateFromPreviousAttempt = false;
 805       if (event instanceof RMAppStartAttemptEvent) {
 806         transferStateFromPreviousAttempt =
 807             ((RMAppStartAttemptEvent) event)
 808               .getTransferStateFromPreviousAttempt();
 809       }
 810       appAttempt.startTime = System.currentTimeMillis();
 811 
 812       // Register with the ApplicationMasterService
 813       appAttempt.masterService
 814           .registerAppAttempt(appAttempt.applicationAttemptId);
 815 
 816       if (UserGroupInformation.isSecurityEnabled()) {
 817         appAttempt.clientTokenMasterKey =
 818             appAttempt.rmContext.getClientToAMTokenSecretManager()
 819               .createMasterKey(appAttempt.applicationAttemptId);
 820       }
 821 
 822       // Add the applicationAttempt to the scheduler and inform the scheduler
 823       // whether to transfer the state from previous attempt.
 824       appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(
 825         appAttempt.applicationAttemptId, transferStateFromPreviousAttempt));
 826     }
 827   }
 828 
 829   private static final List<ContainerId> EMPTY_CONTAINER_RELEASE_LIST =
 830       new ArrayList<ContainerId>();
 831 
 832   private static final List<ResourceRequest> EMPTY_CONTAINER_REQUEST_LIST =
 833       new ArrayList<ResourceRequest>();
 834 
 835   @VisibleForTesting
 836   public static final class ScheduleTransition
 837       implements
 838       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
 839     @Override
 840     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
 841         RMAppAttemptEvent event) {
 842       ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
 843       if (!subCtx.getUnmanagedAM()) {
 844         // Reset #containers before creating a new attempt: this request is
 845         // passed to the scheduler, which decrements the count once the AM
 846         // container has been allocated.
 847 
 848         // Currently the following fields are all hard-coded.
 849         // TODO: change these fields when we want to support
 850         // priority/resource-name/relax-locality specification for AM container
 851         // allocation.
 852         appAttempt.amReq.setNumContainers(1);
 853         appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
 854         appAttempt.amReq.setResourceName(ResourceRequest.ANY);
 855         appAttempt.amReq.setRelaxLocality(true);
 856         
 857         // The AM resource was already checked at submission time
 858         Allocation amContainerAllocation =
 859             appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
 860                 Collections.singletonList(appAttempt.amReq),
 861                 EMPTY_CONTAINER_RELEASE_LIST, null, null);
 862         if (amContainerAllocation != null
 863             && amContainerAllocation.getContainers() != null) {
 864           assert (amContainerAllocation.getContainers().size() == 0);
 865         }
 866         return RMAppAttemptState.SCHEDULED;
 867       } else {
 868         // save state and then go to LAUNCHED state
 869         appAttempt.storeAttempt();
 870         return RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING;
 871       }
 872     }
 873   }
 874 
 875   private static final class AMContainerAllocatedTransition
 876       implements
 877       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
 878     @Override
 879     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
 880         RMAppAttemptEvent event) {
 881       // Acquire the AM container from the scheduler.
 882       Allocation amContainerAllocation =
 883           appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
 884             EMPTY_CONTAINER_REQUEST_LIST, EMPTY_CONTAINER_RELEASE_LIST, null,
 885             null);
 886       // There must be at least one container allocated, because a
 887       // CONTAINER_ALLOCATED is emitted after an RMContainer is constructed,
 888       // and is put in SchedulerApplication#newlyAllocatedContainers.
 889 
 890       // Note that YarnScheduler#allocate is not guaranteed to be able to
 891       // fetch it: the container may not be fetchable for reasons such as
 892       // DNS being unavailable, so that no container token is generated. We
 893       // therefore return to the previous state and keep retrying until the
 894       // AM container is fetched.
 895       if (amContainerAllocation.getContainers().size() == 0) {
 896         appAttempt.retryFetchingAMContainer(appAttempt);
 897         return RMAppAttemptState.SCHEDULED;
 898       }
 899 
 900       // Set the masterContainer
 901       appAttempt.setMasterContainer(amContainerAllocation.getContainers()
 902           .get(0));
 903       RMContainerImpl rmMasterContainer = (RMContainerImpl)appAttempt.scheduler
 904           .getRMContainer(appAttempt.getMasterContainer().getId());
 905       rmMasterContainer.setAMContainer(true);
 906       // The node set in NMTokenSecretManager marks whether the NMToken for
 907       // a node has already been issued to the AM.
 908       // When the AM container was allocated to the RM itself, the node that
 909       // hosts this AM container was marked as having had its NMToken sent.
 910       // Clear this node set so that subsequent allocate requests from the AM
 911       // can retrieve the corresponding NMToken.
 912       appAttempt.rmContext.getNMTokenSecretManager()
 913         .clearNodeSetForAttempt(appAttempt.applicationAttemptId);
 914       appAttempt.getSubmissionContext().setResource(
 915         appAttempt.getMasterContainer().getResource());
 916       appAttempt.storeAttempt();
 917       return RMAppAttemptState.ALLOCATED_SAVING;
 918     }
 919   }
 920 
 921   private void retryFetchingAMContainer(final RMAppAttemptImpl appAttempt) {
 922     // Start a new thread so that we do not block the main dispatcher thread.
 923     new Thread() {
 924       @Override
 925       public void run() {
 926         try {
 927           Thread.sleep(500);
 928         } catch (InterruptedException e) {
 929           LOG.warn("Interrupted while waiting to resend the"
 930               + " ContainerAllocated Event.");
 931         }
 932         appAttempt.eventHandler.handle(
 933             new RMAppAttemptEvent(appAttempt.applicationAttemptId,
 934                 RMAppAttemptEventType.CONTAINER_ALLOCATED));
 935       }
 936     }.start();
 937   }
 938 
 939   private static final class AttemptStoredTransition extends BaseTransition {
 940     @Override
 941     public void transition(RMAppAttemptImpl appAttempt,
 942         RMAppAttemptEvent event) {
 943       appAttempt.launchAttempt();
 944     }
 945   }
 946 
 947   private static class AttemptRecoveredTransition
 948       implements
 949       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
 950     @Override
 951     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
 952         RMAppAttemptEvent event) {
 953       RMApp rmApp = appAttempt.rmContext.getRMApps().get(
 954           appAttempt.getAppAttemptId().getApplicationId());
 955 
 956       /*
 957        * If the recovered final state of the last attempt is null, the attempt
 958        * was started but its AM container may or may not have started or
 959        * finished, so we should wait for it to finish.
 960        */
 961       if (appAttempt.recoveredFinalState != null) {
 962         appAttempt.progress = 1.0f;
 963         // We will replay the final attempt only if last attempt is in final
 964         // state but application is not in final state.
 965         if (rmApp.getCurrentAppAttempt() == appAttempt
 966             && !RMAppImpl.isAppInFinalState(rmApp)) {
 967           // Add the previous finished attempt to scheduler synchronously so
 968           // that scheduler knows the previous attempt.
 969           appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
 970             appAttempt.getAppAttemptId(), false, true));
 971           (new BaseFinalTransition(appAttempt.recoveredFinalState)).transition(
 972               appAttempt, event);
 973         }
 974         return appAttempt.recoveredFinalState;
 975       } else if (RMAppImpl.isAppInFinalState(rmApp))  {
 976         // Somehow attempt final state was not saved but app final state was saved.
 977         // Skip adding the attempt into scheduler
 978         RMAppState appState = ((RMAppImpl) rmApp).getRecoveredFinalState();
 979         LOG.warn(rmApp.getApplicationId() + " final state (" + appState
 980             + ") was recorded, but " + appAttempt.applicationAttemptId
 981             + " final state (" + appAttempt.recoveredFinalState
 982             + ") was not recorded.");
 983         switch (appState) {
 984         case FINISHED:
 985           return RMAppAttemptState.FINISHED;
 986         case FAILED:
 987           return RMAppAttemptState.FAILED;
 988         case KILLED:
 989           return RMAppAttemptState.KILLED;
 990         }
 991         return RMAppAttemptState.FAILED;
 992       } else {
 993         // Add the current attempt to the scheduler.
 994         if (appAttempt.rmContext.isWorkPreservingRecoveryEnabled()) {
 995           // Need to register an app attempt before AM can register
 996           appAttempt.masterService
 997               .registerAppAttempt(appAttempt.applicationAttemptId);
 998 
 999           // Add attempt to scheduler synchronously to guarantee scheduler
1000           // knows attempts before AM or NM re-registers.
1001           appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
1002             appAttempt.getAppAttemptId(), false, true));
1003         }
1004 
1005         /*
1006          * Since the application attempt's final state was not saved, the AM
1007          * container (of the previous attempt) must be in one of these states.
1008          * 1) AM container may not have been launched (RM failed right before
1009          * this).
1010          * 2) AM container was successfully launched but may or may not have
1011          * registered / unregistered.
1012          * In whichever case we will wait (by moving attempt into LAUNCHED
1013          * state) and mark this attempt failed (assuming non work preserving
1014          * restart) only after
1015          * 1) Node manager during re-registration heart beats back saying
1016          * am container finished.
1017          * 2) OR AMLivelinessMonitor expires this attempt (when am doesn't
1018          * heart beat back).  
1019          */
1020         (new AMLaunchedTransition()).transition(appAttempt, event);
1021         return RMAppAttemptState.LAUNCHED;
1022       }
1023     }
1024   }
1025 
1026 
1027   private void rememberTargetTransitions(RMAppAttemptEvent event,
1028       Object transitionToDo, RMAppAttemptState targetFinalState) {
1029     transitionTodo = transitionToDo;
1030     targetedFinalState = targetFinalState;
1031     eventCausingFinalSaving = event;
1032   }
1033 
1034   private void rememberTargetTransitionsAndStoreState(RMAppAttemptEvent event,
1035       Object transitionToDo, RMAppAttemptState targetFinalState,
1036       RMAppAttemptState stateToBeStored) {
1037 
1038     rememberTargetTransitions(event, transitionToDo, targetFinalState);
1039     stateBeforeFinalSaving = getState();
1040 
1041     // As of today, finalState, diagnostics, final-tracking-url and
1042     // finalAppStatus are the only things that we store into the StateStore
1043     // AFTER the initial saving on app-attempt-start
1044     // These fields can be visible from outside only after they are saved in
1045     // StateStore
1046     String diags = null;
1047 
1048     // don't leave the tracking URL pointing to a non-existent AM
1049     if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
1050             YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
1051       setTrackingUrlToAHSPage(stateToBeStored);
1052     } else {
1053       setTrackingUrlToRMAppPage(stateToBeStored);
1054     }
1055     String finalTrackingUrl = getOriginalTrackingUrl();
1056     FinalApplicationStatus finalStatus = null;
1057     int exitStatus = ContainerExitStatus.INVALID;
1058     switch (event.getType()) {
1059     case LAUNCH_FAILED:
1060       diags = event.getDiagnosticMsg();
1061       break;
1062     case REGISTERED:
1063       diags = getUnexpectedAMRegisteredDiagnostics();
1064       break;
1065     case UNREGISTERED:
1066       RMAppAttemptUnregistrationEvent unregisterEvent =
1067           (RMAppAttemptUnregistrationEvent) event;
1068       diags = unregisterEvent.getDiagnosticMsg();
1069       // reset finalTrackingUrl to url sent by am
1070       finalTrackingUrl = sanitizeTrackingUrl(unregisterEvent.getFinalTrackingUrl());
1071       finalStatus = unregisterEvent.getFinalApplicationStatus();
1072       break;
1073     case CONTAINER_FINISHED:
1074       RMAppAttemptContainerFinishedEvent finishEvent =
1075           (RMAppAttemptContainerFinishedEvent) event;
1076       diags = getAMContainerCrashedDiagnostics(finishEvent);
1077       exitStatus = finishEvent.getContainerStatus().getExitStatus();
1078       break;
1079     case KILL:
1080       break;
1081     case EXPIRE:
1082       diags = getAMExpiredDiagnostics(event);
1083       break;
1084     default:
1085       break;
1086     }
1087     AggregateAppResourceUsage resUsage =
1088         this.attemptMetrics.getAggregateAppResourceUsage();
1089     RMStateStore rmStore = rmContext.getStateStore();
1090     setFinishTime(System.currentTimeMillis());
1091 
1092     ApplicationAttemptStateData attemptState =
1093         ApplicationAttemptStateData.newInstance(
1094             applicationAttemptId,  getMasterContainer(),
1095             rmStore.getCredentialsFromAppAttempt(this),
1096             startTime, stateToBeStored, finalTrackingUrl, diags,
1097             finalStatus, exitStatus,
1098           getFinishTime(), resUsage.getMemorySeconds(),
1099           resUsage.getVcoreSeconds());
1100     LOG.info("Updating application attempt " + applicationAttemptId
1101         + " with final state: " + targetedFinalState + ", and exit status: "
1102         + exitStatus);
1103     rmStore.updateApplicationAttemptState(attemptState);
1104   }
1105 
1106   private static class FinalSavingTransition extends BaseTransition {
1107 
1108     Object transitionToDo;
1109     RMAppAttemptState targetedFinalState;
1110 
1111     public FinalSavingTransition(Object transitionToDo,
1112         RMAppAttemptState targetedFinalState) {
1113       this.transitionToDo = transitionToDo;
1114       this.targetedFinalState = targetedFinalState;
1115     }
1116 
1117     @Override
1118     public void transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1119       // For cases Killed/Failed, targetedFinalState is the same as the state to
1120       // be stored
1121       appAttempt.rememberTargetTransitionsAndStoreState(event, transitionToDo,
1122         targetedFinalState, targetedFinalState);
1123     }
1124   }
1125 
1126   private static class FinalStateSavedTransition implements
1127       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
1128     @Override
1129     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
1130         RMAppAttemptEvent event) {
1131       RMAppAttemptEvent causeEvent = appAttempt.eventCausingFinalSaving;
1132 
1133       if (appAttempt.transitionTodo instanceof SingleArcTransition) {
1134         ((SingleArcTransition) appAttempt.transitionTodo).transition(
1135           appAttempt, causeEvent);
1136       } else if (appAttempt.transitionTodo instanceof MultipleArcTransition) {
1137         ((MultipleArcTransition) appAttempt.transitionTodo).transition(
1138           appAttempt, causeEvent);
1139       }
1140       return appAttempt.targetedFinalState;
1141     }
1142   }
1143   
1144   private static class BaseFinalTransition extends BaseTransition {
1145 
1146     private final RMAppAttemptState finalAttemptState;
1147 
1148     public BaseFinalTransition(RMAppAttemptState finalAttemptState) {
1149       this.finalAttemptState = finalAttemptState;
1150     }
1151 
1152     @Override
1153     public void transition(RMAppAttemptImpl appAttempt,
1154         RMAppAttemptEvent event) {
1155       ApplicationAttemptId appAttemptId = appAttempt.getAppAttemptId();
1156 
1157       // Tell the AMS. Unregister from the ApplicationMasterService
1158       appAttempt.masterService.unregisterAttempt(appAttemptId);
1159 
1160       // Tell the application and the scheduler
1161       ApplicationId applicationId = appAttemptId.getApplicationId();
1162       RMAppEvent appEvent = null;
1163       boolean keepContainersAcrossAppAttempts = false;
1164       switch (finalAttemptState) {
1165         case FINISHED:
1166         {
1167           appEvent =
1168               new RMAppEvent(applicationId, RMAppEventType.ATTEMPT_FINISHED,
1169               appAttempt.getDiagnostics());
1170         }
1171         break;
1172         case KILLED:
1173         {
1174           appAttempt.invalidateAMHostAndPort();
1175           // Forward diagnostics received in attempt kill event.
1176           appEvent =
1177               new RMAppFailedAttemptEvent(applicationId,
1178                   RMAppEventType.ATTEMPT_KILLED,
1179                   event.getDiagnosticMsg(), false);
1180         }
1181         break;
1182         case FAILED:
1183         {
1184           appAttempt.invalidateAMHostAndPort();
1185 
1186           if (appAttempt.submissionContext
1187             .getKeepContainersAcrossApplicationAttempts()
1188               && !appAttempt.submissionContext.getUnmanagedAM()) {
1189             // See if we should retain containers for non-unmanaged applications
1190             if (!appAttempt.shouldCountTowardsMaxAttemptRetry()) {
1191               // Preemption, hardware failures, and NM resync don't count towards
1192               // app-failures and so we should retain containers.
1193               keepContainersAcrossAppAttempts = true;
1194             } else if (!appAttempt.maybeLastAttempt) {
1195               // Not preemption, hardware failures or NM resync.
1196               // Not last-attempt too - keep containers.
1197               keepContainersAcrossAppAttempts = true;
1198             }
1199           }
1200           appEvent =
1201               new RMAppFailedAttemptEvent(applicationId,
1202                 RMAppEventType.ATTEMPT_FAILED, appAttempt.getDiagnostics(),
1203                 keepContainersAcrossAppAttempts);
1204 
1205         }
1206         break;
1207         default:
1208         {
1209           LOG.error("Cannot get this state!! Error!!");
1210         }
1211         break;
1212       }
1213 
1214       appAttempt.eventHandler.handle(appEvent);
1215       appAttempt.eventHandler.handle(new AppAttemptRemovedSchedulerEvent(
1216         appAttemptId, finalAttemptState, keepContainersAcrossAppAttempts));
1217       appAttempt.removeCredentials(appAttempt);
1218 
1219       appAttempt.rmContext.getRMApplicationHistoryWriter()
1220           .applicationAttemptFinished(appAttempt, finalAttemptState);
1221       appAttempt.rmContext.getSystemMetricsPublisher()
1222           .appAttemptFinished(appAttempt, finalAttemptState,
1223               appAttempt.rmContext.getRMApps().get(
1224                   appAttempt.applicationAttemptId.getApplicationId()),
1225               System.currentTimeMillis());
1226     }
1227   }
1228 
1229   private static class AMLaunchedTransition extends BaseTransition {
1230     @Override
1231     public void transition(RMAppAttemptImpl appAttempt,
1232                             RMAppAttemptEvent event) {
1233       if (event.getType() == RMAppAttemptEventType.LAUNCHED) {
1234         appAttempt.launchAMEndTime = System.currentTimeMillis();
1235         long delay = appAttempt.launchAMEndTime -
1236             appAttempt.launchAMStartTime;
1237         ClusterMetrics.getMetrics().addAMLaunchDelay(delay);
1238       }
1239       // Register with AMLivelinessMonitor
1240       appAttempt.attemptLaunched();
1241 
1242       // register the ClientTokenMasterKey after it is saved in the store,
1243       // otherwise client may hold an invalid ClientToken after RM restarts.
1244       if (UserGroupInformation.isSecurityEnabled()) {
1245         appAttempt.rmContext.getClientToAMTokenSecretManager()
1246             .registerApplication(appAttempt.getAppAttemptId(),
1247             appAttempt.getClientTokenMasterKey());
1248       }
1249     }
1250   }
1251 
1252   @Override
1253   public boolean shouldCountTowardsMaxAttemptRetry() {
1254     this.readLock.lock();
1255     try {
1256       int exitStatus = getAMContainerExitStatus();
1257       return !(exitStatus == ContainerExitStatus.PREEMPTED
1258           || exitStatus == ContainerExitStatus.ABORTED
1259           || exitStatus == ContainerExitStatus.DISKS_FAILED
1260           || exitStatus == ContainerExitStatus.KILLED_BY_RESOURCEMANAGER);
1261     } finally {
1262       this.readLock.unlock();
1263     }
1264   }
1265 
1266   private static final class UnmanagedAMAttemptSavedTransition 
1267                                                 extends AMLaunchedTransition {
1268     @Override
1269     public void transition(RMAppAttemptImpl appAttempt,
1270                             RMAppAttemptEvent event) {
1271       // create AMRMToken
1272       appAttempt.amrmToken =
1273           appAttempt.rmContext.getAMRMTokenSecretManager().createAndGetAMRMToken(
1274             appAttempt.applicationAttemptId);
1275 
1276       super.transition(appAttempt, event);
1277     }    
1278   }
1279 
1280   private static final class LaunchFailedTransition extends BaseFinalTransition {
1281 
1282     public LaunchFailedTransition() {
1283       super(RMAppAttemptState.FAILED);
1284     }
1285 
1286     @Override
1287     public void transition(RMAppAttemptImpl appAttempt,
1288         RMAppAttemptEvent event) {
1289 
1290       // Use diagnostic from launcher
1291       appAttempt.diagnostics.append(event.getDiagnosticMsg());
1292 
1293       // Tell the app, scheduler
1294       super.transition(appAttempt, event);
1295 
1296     }
1297   }
1298 
1299   private static final class KillAllocatedAMTransition extends
1300       BaseFinalTransition {
1301     public KillAllocatedAMTransition() {
1302       super(RMAppAttemptState.KILLED);
1303     }
1304 
1305     @Override
1306     public void transition(RMAppAttemptImpl appAttempt,
1307         RMAppAttemptEvent event) {
1308 
1309       // Tell the application and scheduler
1310       super.transition(appAttempt, event);
1311 
1312       // Tell the launcher to cleanup.
1313       appAttempt.eventHandler.handle(new AMLauncherEvent(
1314           AMLauncherEventType.CLEANUP, appAttempt));
1315 
1316     }
1317   }
1318 
1319   private static final class AMRegisteredTransition extends BaseTransition {
1320     @Override
1321     public void transition(RMAppAttemptImpl appAttempt,
1322         RMAppAttemptEvent event) {
1323       long delay = System.currentTimeMillis() - appAttempt.launchAMEndTime;
1324       ClusterMetrics.getMetrics().addAMRegisterDelay(delay);
1325       RMAppAttemptRegistrationEvent registrationEvent
1326           = (RMAppAttemptRegistrationEvent) event;
1327       appAttempt.host = registrationEvent.getHost();
1328       appAttempt.rpcPort = registrationEvent.getRpcport();
1329       appAttempt.originalTrackingUrl =
1330           sanitizeTrackingUrl(registrationEvent.getTrackingurl());
1331 
1332       // Let the app know
1333       appAttempt.eventHandler.handle(new RMAppEvent(appAttempt
1334           .getAppAttemptId().getApplicationId(),
1335           RMAppEventType.ATTEMPT_REGISTERED));
1336 
1337       // TODO:FIXME: Note for future. Unfortunately we only do a state-store
1338       // write at AM launch time, so we don't save the AM's tracking URL anywhere
1339       // as that would mean an extra state-store write. For now, we hope that in
1340       // work-preserving restart, AMs are forced to reregister.
1341 
1342       appAttempt.rmContext.getRMApplicationHistoryWriter()
1343           .applicationAttemptStarted(appAttempt);
1344       appAttempt.rmContext.getSystemMetricsPublisher()
1345           .appAttemptRegistered(appAttempt, System.currentTimeMillis());
1346     }
1347   }
1348 
1349   private static final class AMContainerCrashedBeforeRunningTransition extends
1350       BaseFinalTransition {
1351 
1352     public AMContainerCrashedBeforeRunningTransition() {
1353       super(RMAppAttemptState.FAILED);
1354     }
1355 
1356     @Override
1357     public void transition(RMAppAttemptImpl appAttempt,
1358         RMAppAttemptEvent event) {
1359       RMAppAttemptContainerFinishedEvent finishEvent =
1360           ((RMAppAttemptContainerFinishedEvent)event);
1361 
1362       // UnRegister from AMLivelinessMonitor
1363       appAttempt.rmContext.getAMLivelinessMonitor().unregister(
1364           appAttempt.getAppAttemptId());
1365 
1366       // Setup diagnostic message and exit status
1367       appAttempt.setAMContainerCrashedDiagnosticsAndExitStatus(finishEvent);
1368 
1369       // Tell the app, scheduler
1370       super.transition(appAttempt, finishEvent);
1371     }
1372   }
1373 
1374   private void setAMContainerCrashedDiagnosticsAndExitStatus(
1375       RMAppAttemptContainerFinishedEvent finishEvent) {
1376     ContainerStatus status = finishEvent.getContainerStatus();
1377     String diagnostics = getAMContainerCrashedDiagnostics(finishEvent);
1378     this.diagnostics.append(diagnostics);
1379     this.amContainerExitStatus = status.getExitStatus();
1380   }
1381 
1382   private String getAMContainerCrashedDiagnostics(
1383       RMAppAttemptContainerFinishedEvent finishEvent) {
1384     ContainerStatus status = finishEvent.getContainerStatus();
1385     StringBuilder diagnosticsBuilder = new StringBuilder();
1386     diagnosticsBuilder.append("AM Container for ").append(
1387       finishEvent.getApplicationAttemptId()).append(
1388       " exited with ").append(" exitCode: ").append(status.getExitStatus()).
1389       append("\n");
1390     if (this.getTrackingUrl() != null) {
1391       diagnosticsBuilder.append("For more detailed output,").append(
1392         " check application tracking page:").append(
1393         this.getTrackingUrl()).append(
1394         "Then, click on links to logs of each attempt.\n");
1395     }
1396     diagnosticsBuilder.append("Diagnostics: ").append(status.getDiagnostics())
1397         .append("Failing this attempt");
1398     return diagnosticsBuilder.toString();
1399   }
1400 
1401   private static class FinalTransition extends BaseFinalTransition {
1402 
1403     public FinalTransition(RMAppAttemptState finalAttemptState) {
1404       super(finalAttemptState);
1405     }
1406 
1407     @Override
1408     public void transition(RMAppAttemptImpl appAttempt,
1409         RMAppAttemptEvent event) {
1410 
1411       appAttempt.progress = 1.0f;
1412 
1413       // Tell the app and the scheduler
1414       super.transition(appAttempt, event);
1415 
1416       // UnRegister from AMLivelinessMonitor. Perhaps for
1417       // FAILING/KILLED/UnManaged AMs
1418       appAttempt.rmContext.getAMLivelinessMonitor().unregister(
1419           appAttempt.getAppAttemptId());
1420       appAttempt.rmContext.getAMFinishingMonitor().unregister(
1421           appAttempt.getAppAttemptId());
1422 
1423       if(!appAttempt.submissionContext.getUnmanagedAM()) {
1424         // Tell the launcher to cleanup.
1425         appAttempt.eventHandler.handle(new AMLauncherEvent(
1426             AMLauncherEventType.CLEANUP, appAttempt));
1427       }
1428     }
1429   }
1430 
1431   private static class ExpiredTransition extends FinalTransition {
1432 
1433     public ExpiredTransition() {
1434       super(RMAppAttemptState.FAILED);
1435     }
1436 
1437     @Override
1438     public void transition(RMAppAttemptImpl appAttempt,
1439         RMAppAttemptEvent event) {
1440       appAttempt.diagnostics.append(getAMExpiredDiagnostics(event));
1441       super.transition(appAttempt, event);
1442     }
1443   }
1444 
1445   private static String getAMExpiredDiagnostics(RMAppAttemptEvent event) {
1446     String diag =
1447         "ApplicationMaster for attempt " + event.getApplicationAttemptId()
1448             + " timed out";
1449     return diag;
1450   }
1451 
1452   private static class UnexpectedAMRegisteredTransition extends
1453       BaseFinalTransition {
1454 
1455     public UnexpectedAMRegisteredTransition() {
1456       super(RMAppAttemptState.FAILED);
1457     }
1458 
1459     @Override
1460     public void transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1461       assert appAttempt.submissionContext.getUnmanagedAM();
1462       appAttempt.diagnostics.append(getUnexpectedAMRegisteredDiagnostics());
1463       super.transition(appAttempt, event);
1464     }
1465 
1466   }
1467 
1468   private static String getUnexpectedAMRegisteredDiagnostics() {
1469     return "Unmanaged AM must register after AM attempt reaches LAUNCHED state.";
1470   }
1471 
1472   private static final class StatusUpdateTransition extends
1473       BaseTransition {
1474     @Override
1475     public void transition(RMAppAttemptImpl appAttempt,
1476         RMAppAttemptEvent event) {
1477 
1478       RMAppAttemptStatusupdateEvent statusUpdateEvent
1479         = (RMAppAttemptStatusupdateEvent) event;
1480 
1481       // Update progress
1482       appAttempt.progress = statusUpdateEvent.getProgress();
1483 
1484       // Ping to AMLivelinessMonitor
1485       appAttempt.rmContext.getAMLivelinessMonitor().receivedPing(
1486           statusUpdateEvent.getApplicationAttemptId());
1487     }
1488   }
1489 
1490   private static final class AMUnregisteredTransition implements
1491       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
1492 
1493     @Override
1494     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
1495         RMAppAttemptEvent event) {
1496       // Tell the app
1497       if (appAttempt.getSubmissionContext().getUnmanagedAM()) {
1498         // Unmanaged AMs have no container to wait for, so they skip
1499         // the FINISHING state and go straight to FINISHED.
1500         appAttempt.updateInfoOnAMUnregister(event);
1501         new FinalTransition(RMAppAttemptState.FINISHED).transition(
1502             appAttempt, event);
1503         return RMAppAttemptState.FINISHED;
1504       }
1505       // Saving the attempt final state
1506       appAttempt.rememberTargetTransitionsAndStoreState(event,
1507         new FinalStateSavedAfterAMUnregisterTransition(),
1508         RMAppAttemptState.FINISHING, RMAppAttemptState.FINISHED);
1509       ApplicationId applicationId =
1510           appAttempt.getAppAttemptId().getApplicationId();
1511 
1512       // Tell the app immediately that AM is unregistering so that app itself
1513       // can save its state as soon as possible. Whether we do it like this, or
1514       // we wait till AppAttempt is saved, it doesn't make any difference on the
1515       // app side w.r.t failure conditions. The only event going out of
1516       // AppAttempt to App after this point of time is AM/AppAttempt Finished.
1517       appAttempt.eventHandler.handle(new RMAppEvent(applicationId,
1518         RMAppEventType.ATTEMPT_UNREGISTERED));
1519       return RMAppAttemptState.FINAL_SAVING;
1520     }
1521   }
1522 
1523   private static class FinalStateSavedAfterAMUnregisterTransition extends
1524       BaseTransition {
1525     @Override
1526     public void
1527         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1528       // Unregister from the AMLivelinessMonitor and register with AMFinishingMonitor
1529       appAttempt.rmContext.getAMLivelinessMonitor().unregister(
1530         appAttempt.applicationAttemptId);
1531       appAttempt.rmContext.getAMFinishingMonitor().register(
1532         appAttempt.applicationAttemptId);
1533 
1534       // Do not make any more changes to this transition code. Make all changes
1535       // to the following method. Unless you are absolutely sure that you have
1536       // stuff to do that shouldn't be used by the callers of the following
1537       // method.
1538       appAttempt.updateInfoOnAMUnregister(event);
1539     }
1540   }
1541 
1542   private void updateInfoOnAMUnregister(RMAppAttemptEvent event) {
1543     progress = 1.0f;
1544     RMAppAttemptUnregistrationEvent unregisterEvent =
1545         (RMAppAttemptUnregistrationEvent) event;
1546     diagnostics.append(unregisterEvent.getDiagnosticMsg());
1547     originalTrackingUrl = sanitizeTrackingUrl(unregisterEvent.getFinalTrackingUrl());
1548     finalStatus = unregisterEvent.getFinalApplicationStatus();
1549   }
1550 
1551   private static final class ContainerFinishedTransition
1552       implements
1553       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
1554 
1555     // The transition To Do after attempt final state is saved.
1556     private BaseTransition transitionToDo;
1557     private RMAppAttemptState currentState;
1558 
1559     public ContainerFinishedTransition(BaseTransition transitionToDo,
1560         RMAppAttemptState currentState) {
1561       this.transitionToDo = transitionToDo;
1562       this.currentState = currentState;
1563     }
1564 
1565     @Override
1566     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
1567         RMAppAttemptEvent event) {
1568 
1569       RMAppAttemptContainerFinishedEvent containerFinishedEvent =
1570           (RMAppAttemptContainerFinishedEvent) event;
1571       ContainerStatus containerStatus =
1572           containerFinishedEvent.getContainerStatus();
1573 
1574       // Is this container the AM container? If the finished container is the
1575       // same as the AM container, the AppAttempt fails
1576       if (appAttempt.masterContainer != null
1577           && appAttempt.masterContainer.getId().equals(
1578               containerStatus.getContainerId())) {
1579         appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);
1580 
1581         // Remember the follow up transition and save the final attempt state.
1582         appAttempt.rememberTargetTransitionsAndStoreState(event,
1583             transitionToDo, RMAppAttemptState.FAILED, RMAppAttemptState.FAILED);
1584         return RMAppAttemptState.FINAL_SAVING;
1585       }
1586 
1587       // Add all finished containers so that they can be acked to NM
1588       addJustFinishedContainer(appAttempt, containerFinishedEvent);
1589       return this.currentState;
1590     }
1591   }
1592 
1593 
1594   // Ack NM to remove finished containers from context.
1595   private void sendFinishedContainersToNM() {
1596     for (NodeId nodeId : finishedContainersSentToAM.keySet()) {
1597 
1598       // Clear and get current values
1599       List<ContainerStatus> currentSentContainers =
1600           finishedContainersSentToAM.put(nodeId,
1601             new ArrayList<ContainerStatus>());
1602       List<ContainerId> containerIdList =
1603           new ArrayList<ContainerId>(currentSentContainers.size());
1604       for (ContainerStatus containerStatus : currentSentContainers) {
1605         containerIdList.add(containerStatus.getContainerId());
1606       }
1607       eventHandler.handle(new RMNodeFinishedContainersPulledByAMEvent(nodeId,
1608         containerIdList));
1609     }
1610   }
1611 
1612   // Add am container to the list so that am container instance will be
1613   // removed from NMContext.
1614   private void sendAMContainerToNM(RMAppAttemptImpl appAttempt,
1615       RMAppAttemptContainerFinishedEvent containerFinishedEvent) {
1616     NodeId nodeId = containerFinishedEvent.getNodeId();
1617     finishedContainersSentToAM.putIfAbsent(nodeId,
1618       new ArrayList<ContainerStatus>());
1619     appAttempt.finishedContainersSentToAM.get(nodeId).add(
1620       containerFinishedEvent.getContainerStatus());
1621     if (!appAttempt.getSubmissionContext()
1622       .getKeepContainersAcrossApplicationAttempts()) {
1623       appAttempt.sendFinishedContainersToNM();
1624     }
1625   }
1626 
1627   private static void addJustFinishedContainer(RMAppAttemptImpl appAttempt,
1628       RMAppAttemptContainerFinishedEvent containerFinishedEvent) {
1629     appAttempt.justFinishedContainers.putIfAbsent(containerFinishedEvent
1630         .getNodeId(), new ArrayList<ContainerStatus>());
1631     appAttempt.justFinishedContainers.get(containerFinishedEvent
1632             .getNodeId()).add(containerFinishedEvent.getContainerStatus());
1633   }
1634 
1635   private static final class ContainerFinishedAtFinalStateTransition
1636       extends BaseTransition {
1637     @Override
1638     public void
1639         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1640       RMAppAttemptContainerFinishedEvent containerFinishedEvent =
1641           (RMAppAttemptContainerFinishedEvent) event;
1642       
1643       // Normal container. Add it in completed containers list
1644       addJustFinishedContainer(appAttempt, containerFinishedEvent);
1645     }
1646   }
1647 
1648   private static class AMContainerCrashedAtRunningTransition extends
1649       BaseTransition {
1650     @Override
1651     public void
1652         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1653       RMAppAttemptContainerFinishedEvent finishEvent =
1654           (RMAppAttemptContainerFinishedEvent) event;
1655       // container associated with AM. must not be unmanaged
1656       assert appAttempt.submissionContext.getUnmanagedAM() == false;
1657       // Setup diagnostic message and exit status
1658       appAttempt.setAMContainerCrashedDiagnosticsAndExitStatus(finishEvent);
1659       new FinalTransition(RMAppAttemptState.FAILED).transition(appAttempt,
1660         event);
1661     }
1662   }
1663 
1664   private static final class AMFinishingContainerFinishedTransition
1665       implements
1666       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
1667 
1668     @Override
1669     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
1670         RMAppAttemptEvent event) {
1671 
1672       RMAppAttemptContainerFinishedEvent containerFinishedEvent
1673         = (RMAppAttemptContainerFinishedEvent) event;
1674       ContainerStatus containerStatus =
1675           containerFinishedEvent.getContainerStatus();
1676 
1677       // Is this container the ApplicationMaster container?
1678       if (appAttempt.masterContainer.getId().equals(
1679           containerStatus.getContainerId())) {
1680         new FinalTransition(RMAppAttemptState.FINISHED).transition(
1681             appAttempt, containerFinishedEvent);
1682         appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);
1683         return RMAppAttemptState.FINISHED;
1684       }
1685       // Add all finished containers so that they can be acked to NM.
1686       addJustFinishedContainer(appAttempt, containerFinishedEvent);
1687 
1688       return RMAppAttemptState.FINISHING;
1689     }
1690   }
1691 
1692   private static class ContainerFinishedAtFinalSavingTransition extends
1693       BaseTransition {
1694     @Override
1695     public void
1696         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1697       RMAppAttemptContainerFinishedEvent containerFinishedEvent =
1698           (RMAppAttemptContainerFinishedEvent) event;
1699       ContainerStatus containerStatus =
1700           containerFinishedEvent.getContainerStatus();
1701 
1702       // If this is the AM container, it means the AM container is finished,
1703       // but we have not yet been notified that the final state has been saved.
1704       // Thus, we still stay in the FINAL_SAVING state here.
1705       if (appAttempt.masterContainer.getId().equals(
1706         containerStatus.getContainerId())) {
1707         appAttempt.sendAMContainerToNM(appAttempt, containerFinishedEvent);
1708 
1709         if (appAttempt.targetedFinalState.equals(RMAppAttemptState.FAILED)
1710             || appAttempt.targetedFinalState.equals(RMAppAttemptState.KILLED)) {
1711           // ignore Container_Finished Event if we were supposed to reach
1712           // FAILED/KILLED state.
1713           return;
1714         }
1715 
1716         // pass in the earlier AMUnregistered Event also, as this is needed for
1717         // AMFinishedAfterFinalSavingTransition later on
1718         appAttempt.rememberTargetTransitions(event,
1719           new AMFinishedAfterFinalSavingTransition(
1720             appAttempt.eventCausingFinalSaving), RMAppAttemptState.FINISHED);
1721         return;
1722       }
1723 
1724       // Add all finished containers so that they can be acked to NM.
1725       addJustFinishedContainer(appAttempt, containerFinishedEvent);
1726     }
1727   }
1728 
1729   private static class AMFinishedAfterFinalSavingTransition extends
1730       BaseTransition {
1731     RMAppAttemptEvent amUnregisteredEvent;
1732     public AMFinishedAfterFinalSavingTransition(
1733         RMAppAttemptEvent amUnregisteredEvent) {
1734       this.amUnregisteredEvent = amUnregisteredEvent;
1735     }
1736 
1737     @Override
1738     public void
1739         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1740       appAttempt.updateInfoOnAMUnregister(amUnregisteredEvent);
1741       new FinalTransition(RMAppAttemptState.FINISHED).transition(appAttempt,
1742           event);
1743     }
1744   }
1745 
1746   private static class AMExpiredAtFinalSavingTransition extends
1747       BaseTransition {
1748     @Override
1749     public void
1750         transition(RMAppAttemptImpl appAttempt, RMAppAttemptEvent event) {
1751       if (appAttempt.targetedFinalState.equals(RMAppAttemptState.FAILED)
1752           || appAttempt.targetedFinalState.equals(RMAppAttemptState.KILLED)) {
1753         // ignore Container_Finished Event if we were supposed to reach
1754         // FAILED/KILLED state.
1755         return;
1756       }
1757 
1758       // pass in the earlier AMUnregistered Event also, as this is needed for
1759       // AMFinishedAfterFinalSavingTransition later on
1760       appAttempt.rememberTargetTransitions(event,
1761         new AMFinishedAfterFinalSavingTransition(
1762         appAttempt.eventCausingFinalSaving), RMAppAttemptState.FINISHED);
1763     }
1764   }
1765 
1766   @Override
1767   public long getStartTime() {
1768     this.readLock.lock();
1769     try {
1770       return this.startTime;
1771     } finally {
1772       this.readLock.unlock();
1773     }
1774   }
1775 
1776   @Override
1777   public RMAppAttemptState getState() {
1778     this.readLock.lock();
1779 
1780     try {
1781       return this.stateMachine.getCurrentState();
1782     } finally {
1783       this.readLock.unlock();
1784     }
1785   }
1786 
1787   @Override
1788   public YarnApplicationAttemptState createApplicationAttemptState() {
1789     RMAppAttemptState state = getState();
1790     // If AppAttempt is in FINAL_SAVING state, return its previous state.
1791     if (state.equals(RMAppAttemptState.FINAL_SAVING)) {
1792       state = stateBeforeFinalSaving;
1793     }
1794     return RMServerUtils.createApplicationAttemptState(state);
1795   }
1796 
1797   private void launchAttempt(){
1798     launchAMStartTime = System.currentTimeMillis();
1799     // Send event to launch the AM Container
1800     eventHandler.handle(new AMLauncherEvent(AMLauncherEventType.LAUNCH, this));
1801   }
1802   
1803   private void attemptLaunched() {
1804     // Register with AMLivelinessMonitor
1805     rmContext.getAMLivelinessMonitor().register(getAppAttemptId());
1806   }
1807   
1808   private void storeAttempt() {
1809     // store attempt data in a non-blocking manner to prevent dispatcher
1810     // thread starvation and wait for state to be saved
1811     LOG.info("Storing attempt: AppId: " + 
1812               getAppAttemptId().getApplicationId() 
1813               + " AttemptId: " + 
1814               getAppAttemptId()
1815               + " MasterContainer: " + masterContainer);
1816     rmContext.getStateStore().storeNewApplicationAttempt(this);
1817   }
1818 
1819   private void removeCredentials(RMAppAttemptImpl appAttempt) {
1820     // Unregister from the ClientToAMTokenSecretManager
1821     if (UserGroupInformation.isSecurityEnabled()) {
1822       appAttempt.rmContext.getClientToAMTokenSecretManager()
1823         .unRegisterApplication(appAttempt.getAppAttemptId());
1824     }
1825 
1826     // Remove the AppAttempt from the AMRMTokenSecretManager
1827     appAttempt.rmContext.getAMRMTokenSecretManager()
1828       .applicationMasterFinished(appAttempt.getAppAttemptId());
1829   }
1830 
1831   private static String sanitizeTrackingUrl(String url) {
1832     return (url == null || url.trim().isEmpty()) ? "N/A" : url;
1833   }
1834 
1835   @Override
1836   public ApplicationAttemptReport createApplicationAttemptReport() {
1837     this.readLock.lock();
1838     ApplicationAttemptReport attemptReport = null;
1839     try {
1840       // AM container maybe not yet allocated. and also unmangedAM doesn't have
1841       // am container.
1842       ContainerId amId =
1843           masterContainer == null ? null : masterContainer.getId();
1844       attemptReport = ApplicationAttemptReport.newInstance(this
1845           .getAppAttemptId(), this.getHost(), this.getRpcPort(), this
1846           .getTrackingUrl(), this.getOriginalTrackingUrl(), this.getDiagnostics(),
1847           YarnApplicationAttemptState .valueOf(this.getState().toString()), amId);
1848     } finally {
1849       this.readLock.unlock();
1850     }
1851     return attemptReport;
1852   }
1853 
1854   // for testing
1855   public boolean mayBeLastAttempt() {
1856     return maybeLastAttempt;
1857   }
1858 
1859   @Override
1860   public RMAppAttemptMetrics getRMAppAttemptMetrics() {
1861     // didn't use read/write lock here because RMAppAttemptMetrics has its own
1862     // lock
1863     return attemptMetrics;
1864   }
1865 
1866   @Override
1867   public long getFinishTime() {
1868     try {
1869       this.readLock.lock();
1870       return this.finishTime;
1871     } finally {
1872       this.readLock.unlock();
1873     }
1874   }
1875 
1876   private void setFinishTime(long finishTime) {
1877     try {
1878       this.writeLock.lock();
1879       this.finishTime = finishTime;
1880     } finally {
1881       this.writeLock.unlock();
1882     }
1883   }
1884 }
RMAppAttemptImpl.java

RMAppAttemptImpl.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java

  After RMAppAttemptImpl receives the RMAppAttemptEventType.START event, it performs a series of initialization steps. Like the classes analyzed earlier, RMAppAttemptImpl has its own state machine factory, and events reach it through its handle() method, shown below:

// RMAppAttemptImpl.java
  @Override
  public void handle(RMAppAttemptEvent event) {

    this.writeLock.lock();

    try {
      ApplicationAttemptId appAttemptID = event.getApplicationAttemptId();
      LOG.debug("Processing event for " + appAttemptID + " of type "
          + event.getType());
      final RMAppAttemptState oldState = getAppAttemptState();
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitonException e) {
        LOG.error("Can't handle this event at current state", e);
        /* TODO fail the application on the failed transition */
      }

      if (oldState != getAppAttemptState()) {
        LOG.info(appAttemptID + " State change from " + oldState + " to "
            + getAppAttemptState());
      }
    } finally {
      this.writeLock.unlock();
    }
  }
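
The locking pattern used by handle() and by getters such as getState() can be looked at in isolation. The following is a minimal sketch, not YARN code: LockedAttempt, its enums, and its single hard-coded transition are hypothetical names made up for illustration. It applies events under the write lock, reads state under the read lock, and records an invalid event instead of propagating it, in the same spirit as handle() above.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for RMAppAttemptImpl; not part of Hadoop.
class LockedAttempt {
  enum State { NEW, SUBMITTED }
  enum EventType { START, UNKNOWN }

  private final Lock readLock;
  private final Lock writeLock;
  private State state = State.NEW;
  private int invalidEvents = 0;

  LockedAttempt() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    this.readLock = lock.readLock();
    this.writeLock = lock.writeLock();
  }

  // Writers: every event is applied while holding the write lock.
  void handle(EventType type) {
    writeLock.lock();
    try {
      State oldState = state;
      try {
        state = doTransition(state, type);
      } catch (IllegalStateException e) {
        // An invalid event is only recorded; the state stays unchanged.
        invalidEvents++;
      }
      if (oldState != state) {
        System.out.println("State change from " + oldState + " to " + state);
      }
    } finally {
      writeLock.unlock();
    }
  }

  // Readers: getters only take the read lock.
  State getState() {
    readLock.lock();
    try {
      return state;
    } finally {
      readLock.unlock();
    }
  }

  int getInvalidEvents() {
    readLock.lock();
    try {
      return invalidEvents;
    } finally {
      readLock.unlock();
    }
  }

  // One hard-coded arc (NEW --START--> SUBMITTED) standing in for a real table.
  private static State doTransition(State current, EventType type) {
    if (current == State.NEW && type == EventType.START) {
      return State.SUBMITTED;
    }
    throw new IllegalStateException("Invalid event " + type + " at " + current);
  }

  public static void main(String[] args) {
    LockedAttempt attempt = new LockedAttempt();
    attempt.handle(EventType.START);
    attempt.handle(EventType.UNKNOWN);
    System.out.println(attempt.getState());
  }
}
```

Because ReentrantReadWriteLock allows many concurrent readers, status queries do not block each other; they only wait while an event is being processed.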

  handle() calls this.stateMachine.doTransition(event.getType(), event). The this.stateMachine field is initialized in the RMAppAttemptImpl constructor, shown below:

// RMAppAttemptImpl.java
  public RMAppAttemptImpl(ApplicationAttemptId appAttemptId,
      RMContext rmContext, YarnScheduler scheduler,
      ApplicationMasterService masterService,
      ApplicationSubmissionContext submissionContext,
      Configuration conf, boolean maybeLastAttempt, ResourceRequest amReq) {
    this.conf = conf;
    this.applicationAttemptId = appAttemptId;
    this.rmContext = rmContext;
    this.eventHandler = rmContext.getDispatcher().getEventHandler();
    this.submissionContext = submissionContext;
    this.scheduler = scheduler;
    this.masterService = masterService;

    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    this.readLock = lock.readLock();
    this.writeLock = lock.writeLock();

    this.proxiedTrackingUrl = generateProxyUriWithScheme();
    this.maybeLastAttempt = maybeLastAttempt;
    this.stateMachine = stateMachineFactory.make(this);

    this.attemptMetrics =
        new RMAppAttemptMetrics(applicationAttemptId, rmContext);

    this.amReq = amReq;
  }

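  The call this.stateMachine = stateMachineFactory.make(this) does not perform a transition by itself. It creates a state machine instance for this attempt from the shared static StateMachineFactory, whose transition table decides what later doTransition() calls do. The table is declared as follows: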

//RMAppAttemptImpl.java
private static final StateMachineFactory<RMAppAttemptImpl,
                                           RMAppAttemptState,
                                           RMAppAttemptEventType,
                                           RMAppAttemptEvent>
       stateMachineFactory  = new StateMachineFactory<RMAppAttemptImpl,
                                            RMAppAttemptState,
                                            RMAppAttemptEventType,
                                     RMAppAttemptEvent>(RMAppAttemptState.NEW)

       // Transitions from NEW State
      .addTransition(RMAppAttemptState.NEW, RMAppAttemptState.SUBMITTED,
          RMAppAttemptEventType.START, new AttemptStartedTransition())
      ......
      .installTopology();    
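
To make the idea of a transition table concrete, here is a minimal sketch of a table keyed by (current state, event type), filled through a fluent addTransition() and used by per-object machines created with make(). It is not Hadoop's StateMachineFactory: TinyStateMachineFactory, Machine, and the StringBuilder operand are hypothetical, and features such as multi-arc transitions and installTopology() are left out.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, simplified analogue of a transition-table factory; not Hadoop's class.
class TinyStateMachineFactory {
  enum State { NEW, SUBMITTED, SCHEDULED }
  enum EventType { START, ATTEMPT_ADDED }

  // One table entry: the state to move to and the hook to run on the operand.
  private static final class Arc {
    final State post;
    final Consumer<StringBuilder> hook;

    Arc(State post, Consumer<StringBuilder> hook) {
      this.post = post;
      this.hook = hook;
    }
  }

  private final Map<State, Map<EventType, Arc>> table =
      new EnumMap<State, Map<EventType, Arc>>(State.class);
  private final State initial;

  TinyStateMachineFactory(State initial) {
    this.initial = initial;
  }

  // Fluent registration: (pre-state, post-state, event type, hook).
  TinyStateMachineFactory addTransition(State pre, State post, EventType type,
      Consumer<StringBuilder> hook) {
    table.computeIfAbsent(pre, k -> new EnumMap<EventType, Arc>(EventType.class))
        .put(type, new Arc(post, hook));
    return this;
  }

  // make() only binds a new machine to one operand; it performs no transition.
  Machine make(StringBuilder operand) {
    return new Machine(operand);
  }

  final class Machine {
    private final StringBuilder operand;
    private State current = initial;

    Machine(StringBuilder operand) {
      this.operand = operand;
    }

    State getCurrentState() {
      return current;
    }

    // Look up the arc for (current state, event type), run its hook, then move.
    State doTransition(EventType type) {
      Map<EventType, Arc> arcs = table.get(current);
      Arc arc = (arcs == null) ? null : arcs.get(type);
      if (arc == null) {
        throw new IllegalStateException("Invalid event " + type + " at " + current);
      }
      arc.hook.accept(operand);
      current = arc.post;
      return current;
    }
  }

  public static void main(String[] args) {
    StringBuilder log = new StringBuilder();
    TinyStateMachineFactory factory = new TinyStateMachineFactory(State.NEW)
        .addTransition(State.NEW, State.SUBMITTED, EventType.START,
            sb -> sb.append("attempt started;"));
    Machine machine = factory.make(log);
    System.out.println(machine.doTransition(EventType.START) + " " + log);
  }
}
```

One table is shared by all machines made from the same factory, which is why the real factory can be a static final field while each attempt object still holds its own current state.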

  On receiving RMAppAttemptEventType.START, the attempt moves from RMAppAttemptState.NEW to RMAppAttemptState.SUBMITTED and runs AttemptStartedTransition. The state enum and the transition class are shown below:

public enum RMAppAttemptState {
  NEW, SUBMITTED, SCHEDULED, ALLOCATED, LAUNCHED, FAILED, RUNNING, FINISHING,
  FINISHED, KILLED, ALLOCATED_SAVING, LAUNCHED_UNMANAGED_SAVING, FINAL_SAVING
}
RMAppAttemptState.java

RMAppAttemptState.java is located at hadoop-2.7.3-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptState.java

// RMAppAttemptImpl.java, inner class AttemptStartedTransition
  private static final class AttemptStartedTransition extends BaseTransition {
    @Override
    public void transition(RMAppAttemptImpl appAttempt,
        RMAppAttemptEvent event) {

      boolean transferStateFromPreviousAttempt = false;
      if (event instanceof RMAppStartAttemptEvent) {
        transferStateFromPreviousAttempt =
            ((RMAppStartAttemptEvent) event)
              .getTransferStateFromPreviousAttempt();
      }
      appAttempt.startTime = System.currentTimeMillis();

      // Register with the ApplicationMasterService
      appAttempt.masterService
          .registerAppAttempt(appAttempt.applicationAttemptId);

      if (UserGroupInformation.isSecurityEnabled()) {
        appAttempt.clientTokenMasterKey =
            appAttempt.rmContext.getClientToAMTokenSecretManager()
              .createMasterKey(appAttempt.applicationAttemptId);
      }

      // Add the applicationAttempt to the scheduler and inform the scheduler
      // whether to transfer the state from previous attempt.
      appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(
        appAttempt.applicationAttemptId, transferStateFromPreviousAttempt));
    }
  }

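  Finally, the transition calls appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(appAttempt.applicationAttemptId, transferStateFromPreviousAttempt)), which constructs an AppAttemptAddedSchedulerEvent and hands it to the dispatcher. The event's constructors are shown below: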

// AppAttemptAddedSchedulerEvent.java
  public AppAttemptAddedSchedulerEvent(
      ApplicationAttemptId applicationAttemptId,
      boolean transferStateFromPreviousAttempt) {
    this(applicationAttemptId, transferStateFromPreviousAttempt, false);
  }

  public AppAttemptAddedSchedulerEvent(
      ApplicationAttemptId applicationAttemptId,
      boolean transferStateFromPreviousAttempt,
      boolean isAttemptRecovering) {
    super(SchedulerEventType.APP_ATTEMPT_ADDED);
    this.applicationAttemptId = applicationAttemptId;
    this.transferStateFromPreviousAttempt = transferStateFromPreviousAttempt;
    this.isAttemptRecovering = isAttemptRecovering;
  }

  The event sent here has type SchedulerEventType.APP_ATTEMPT_ADDED. Events of type SchedulerEventType are registered in the serviceInit() method of RMActiveServices, an inner class of ResourceManager, which binds them to schedulerDispatcher, an object of type EventHandler<SchedulerEvent>. As in the earlier analysis, the event is eventually delivered to the handle() method of the default scheduler, CapacityScheduler, shown below:

// CapacityScheduler.java
  public void handle(SchedulerEvent event) {
    switch(event.getType()) {
    ......
    case APP_ATTEMPT_ADDED:
    {
      AppAttemptAddedSchedulerEvent appAttemptAddedEvent =
          (AppAttemptAddedSchedulerEvent) event;
      addApplicationAttempt(appAttemptAddedEvent.getApplicationAttemptId(),
        appAttemptAddedEvent.getTransferStateFromPreviousAttempt(),
        appAttemptAddedEvent.getIsAttemptRecovering());
    }
    break;
    ......
    }
  }
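
The binding of an event-type enum to a handler, followed by a switch on the concrete type inside that handler, can be sketched as follows. This is a simplified, synchronous illustration under my own assumptions: TinyDispatcher, Event, and schedulerHandle() are hypothetical names, and the ResourceManager's dispatcher additionally queues events and delivers them on a separate thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical synchronous dispatcher for illustration; not Hadoop code.
class TinyDispatcher {
  enum SchedulerEventType { APP_ADDED, APP_ATTEMPT_ADDED }
  enum AttemptEventType { START, ATTEMPT_ADDED }

  // Minimal event: it only carries its type.
  static final class Event {
    final Enum<?> type;

    Event(Enum<?> type) {
      this.type = type;
    }
  }

  private final Map<Class<?>, Consumer<Event>> handlers = new HashMap<>();

  // Bind all events whose type belongs to the given enum class to one handler.
  void register(Class<? extends Enum<?>> eventTypeClass, Consumer<Event> handler) {
    handlers.put(eventTypeClass, handler);
  }

  // Route an event by the enum class that declares its type.
  void dispatch(Event event) {
    Consumer<Event> handler = handlers.get(event.type.getDeclaringClass());
    if (handler == null) {
      throw new IllegalStateException("No handler registered for " + event.type);
    }
    handler.accept(event);
  }

  // Stand-in for a scheduler's handle(): switch on the concrete event type.
  static String schedulerHandle(Event event) {
    switch ((SchedulerEventType) event.type) {
      case APP_ATTEMPT_ADDED:
        return "addApplicationAttempt";
      default:
        return "ignored:" + event.type;
    }
  }

  public static void main(String[] args) {
    List<String> calls = new ArrayList<>();
    TinyDispatcher dispatcher = new TinyDispatcher();
    dispatcher.register(SchedulerEventType.class, e -> calls.add(schedulerHandle(e)));
    dispatcher.register(AttemptEventType.class, e -> calls.add("attempt:" + e.type));
    dispatcher.dispatch(new Event(SchedulerEventType.APP_ATTEMPT_ADDED));
    dispatcher.dispatch(new Event(AttemptEventType.ATTEMPT_ADDED));
    System.out.println(calls);
  }
}
```

Routing by the declaring enum class is what lets two different enums use similar constant names (here APP_ATTEMPT_ADDED and ATTEMPT_ADDED) without their events reaching the wrong component.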

  As we can see, handle() calls addApplicationAttempt(), shown below:

 1 //CapacityScheduler.java  
 2 private synchronized void addApplicationAttempt(
 3       ApplicationAttemptId applicationAttemptId,
 4       boolean transferStateFromPreviousAttempt,
 5       boolean isAttemptRecovering) {
 6     SchedulerApplication<FiCaSchedulerApp> application =
 7         applications.get(applicationAttemptId.getApplicationId());
 8     if (application == null) {
 9       LOG.warn("Application " + applicationAttemptId.getApplicationId() +
10           " cannot be found in scheduler.");
11       return;
12     }
13     CSQueue queue = (CSQueue) application.getQueue();
14 
15     FiCaSchedulerApp attempt =
16         new FiCaSchedulerApp(applicationAttemptId, application.getUser(),
17           queue, queue.getActiveUsersManager(), rmContext);
18     if (transferStateFromPreviousAttempt) {
19       attempt.transferStateFromPreviousAttempt(application
20         .getCurrentAppAttempt());
21     }
22     application.setCurrentAppAttempt(attempt);
23 
24     queue.submitApplicationAttempt(attempt, application.getUser());
25     LOG.info("Added Application Attempt " + applicationAttemptId
26         + " to scheduler from user " + application.getUser() + " in queue "
27         + queue.getQueueName());
28     if (isAttemptRecovering) {
29       if (LOG.isDebugEnabled()) {
30         LOG.debug(applicationAttemptId
31             + " is recovering. Skipping notifying ATTEMPT_ADDED");
32       }
33     } else {
34       rmContext.getDispatcher().getEventHandler().handle(
35         new RMAppAttemptEvent(applicationAttemptId,
36             RMAppAttemptEventType.ATTEMPT_ADDED));
37     }
38   }

  Lines 24~27 of addApplicationAttempt() submit the attempt to its queue and log a message such as: Added Application Attempt appattempt_1487944669971_0001_000001 to scheduler from user root in queue default

Lines 34~36 send an RMAppAttemptEventType.ATTEMPT_ADDED event to RMAppAttemptImpl. As before, events of type RMAppAttemptEventType are registered in the serviceInit() method of RMActiveServices (an inner class of ResourceManager), which binds them to the inner class ApplicationAttemptEventDispatcher. That class calls rmAppAttempt.handle(event), i.e. the handle() method of RMAppAttemptImpl, which in turn calls this.stateMachine.doTransition(event.getType(), event) and looks up the matching entry in the transition table built by StateMachineFactory. After the previous step the attempt is in state RMAppAttemptState.SUBMITTED, so the relevant entry is:

// RMAppAttemptImpl.java
private static final StateMachineFactory<RMAppAttemptImpl,
                                           RMAppAttemptState,
                                           RMAppAttemptEventType,
                                           RMAppAttemptEvent>
       stateMachineFactory  = new StateMachineFactory<RMAppAttemptImpl,
                                            RMAppAttemptState,
                                            RMAppAttemptEventType,
                                     RMAppAttemptEvent>(RMAppAttemptState.NEW)

       ......
      // Transitions from SUBMITTED state
      .addTransition(RMAppAttemptState.SUBMITTED,
          EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING,
                     RMAppAttemptState.SCHEDULED),
          RMAppAttemptEventType.ATTEMPT_ADDED,
          new ScheduleTransition())

      ......
    .installTopology();

   The attempt moves from RMAppAttemptState.SUBMITTED to one of the states in EnumSet.of(RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING, RMAppAttemptState.SCHEDULED). Which one is chosen depends on the value returned by ScheduleTransition, shown below:

 

 1 // RMAppAttemptImpl.java, inner class ScheduleTransition
 2  @VisibleForTesting
 3   public static final class ScheduleTransition
 4       implements
 5       MultipleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent, RMAppAttemptState> {
 6     @Override
 7     public RMAppAttemptState transition(RMAppAttemptImpl appAttempt,
 8         RMAppAttemptEvent event) {
 9       ApplicationSubmissionContext subCtx = appAttempt.submissionContext;
10       if (!subCtx.getUnmanagedAM()) {
11         // Need reset #containers before create new attempt, because this request
12         // will be passed to scheduler, and scheduler will deduct the number after
13         // AM container allocated
14         
15         // Currently, following fields are all hard code,
16         // TODO: change these fields when we want to support
17         // priority/resource-name/relax-locality specification for AM containers
18         // allocation.
19         appAttempt.amReq.setNumContainers(1);
20         appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
21         appAttempt.amReq.setResourceName(ResourceRequest.ANY);
22         appAttempt.amReq.setRelaxLocality(true);
23         
24         // AM resource has been checked when submission
25         Allocation amContainerAllocation =
26             appAttempt.scheduler.allocate(appAttempt.applicationAttemptId,
27                 Collections.singletonList(appAttempt.amReq),
28                 EMPTY_CONTAINER_RELEASE_LIST, null, null);
29         if (amContainerAllocation != null
30             && amContainerAllocation.getContainers() != null) {
31           assert (amContainerAllocation.getContainers().size() == 0);
32         }
33         return RMAppAttemptState.SCHEDULED;
34       } else {
35         // save state and then go to LAUNCHED state
36         appAttempt.storeAttempt();
37         return RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING;
38       }
39     }
40   }
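
ScheduleTransition differs from the earlier transitions in that it returns the next state instead of having it fixed in the table. A minimal sketch of such a multi-arc transition follows; MultiArcSketch, Attempt, and MultiArc are hypothetical names, and the check that the returned state belongs to the declared set is my simplified rendering of how a multi-arc table entry can be validated.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of a transition with several possible post-states.
class MultiArcSketch {
  enum State { SUBMITTED, SCHEDULED, LAUNCHED_UNMANAGED_SAVING, FAILED }

  // Hypothetical operand: it only carries the flag the hook branches on.
  static final class Attempt {
    final boolean unmanagedAM;

    Attempt(boolean unmanagedAM) {
      this.unmanagedAM = unmanagedAM;
    }
  }

  // A multi-arc: the permitted post-states plus a hook that picks one of them.
  static final class MultiArc {
    private final Set<State> validPostStates;
    private final Function<Attempt, State> hook;

    MultiArc(Set<State> validPostStates, Function<Attempt, State> hook) {
      this.validPostStates = validPostStates;
      this.hook = hook;
    }

    State apply(Attempt attempt) {
      State next = hook.apply(attempt);
      if (!validPostStates.contains(next)) {
        throw new IllegalStateException("Hook returned undeclared state " + next);
      }
      return next;
    }
  }

  // Same branching as ScheduleTransition: managed AMs go to SCHEDULED,
  // unmanaged AMs go to LAUNCHED_UNMANAGED_SAVING.
  static final MultiArc SCHEDULE = new MultiArc(
      EnumSet.of(State.LAUNCHED_UNMANAGED_SAVING, State.SCHEDULED),
      attempt -> attempt.unmanagedAM
          ? State.LAUNCHED_UNMANAGED_SAVING
          : State.SCHEDULED);

  public static void main(String[] args) {
    System.out.println(SCHEDULE.apply(new Attempt(false)));
    System.out.println(SCHEDULE.apply(new Attempt(true)));
  }
}
```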

 

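  ScheduleTransition requests a container for the AM. The request is described on lines 19~22 of the listing: one container, with priority AM_CONTAINER_PRIORITY (value 0), which may be placed on any node (ResourceRequest.ANY) with relaxed locality.

The transition calls Allocation amContainerAllocation = appAttempt.scheduler.allocate(appAttempt.applicationAttemptId, Collections.singletonList(appAttempt.amReq), EMPTY_CONTAINER_RELEASE_LIST, null, null), i.e. it runs allocate on the scheduler, which returns an Allocation object. No container is assigned at this point; the assert on line 31 expects the returned container list to be empty. Only when a NodeManager later sends a heartbeat and the ResourceManager finds that this NM still has free resources does the scheduler assign the container to that NM, after which the AM is launched.

  The allocate() called here is declared in the YarnScheduler interface. As analyzed earlier, the default scheduler is CapacityScheduler, so the call ends up in the allocate() method of CapacityScheduler.java, which implements YarnScheduler. It is shown below: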

// CapacityScheduler.java
  @Override
  @Lock(Lock.NoLock.class)
  public Allocation allocate(ApplicationAttemptId applicationAttemptId,
      List<ResourceRequest> ask, List<ContainerId> release,
      List<String> blacklistAdditions, List<String> blacklistRemovals) {

    // Look up the scheduler-side object for this attempt
    FiCaSchedulerApp application = getApplicationAttempt(applicationAttemptId);
    if (application == null) {
      LOG.info("Calling allocate on removed " +
          "or non existant application " + applicationAttemptId);
      return EMPTY_ALLOCATION;
    }

    // Sanity check
    SchedulerUtils.normalizeRequests(
        ask, getResourceCalculator(), getClusterResource(),
        getMinimumResourceCapability(), getMaximumResourceCapability());

    // Release containers
    releaseContainers(release, application);

    synchronized (application) {

      // make sure we aren't stopping/removing the application
      // when the allocate comes in
      if (application.isStopped()) {
        LOG.info("Calling allocate on a stopped " +
            "application " + applicationAttemptId);
        return EMPTY_ALLOCATION;
      }

      if (!ask.isEmpty()) {

        if(LOG.isDebugEnabled()) {
          LOG.debug("allocate: pre-update" +
            " applicationAttemptId=" + applicationAttemptId +
            " application=" + application);
        }
        application.showRequests();

        // Update application requests
        application.updateResourceRequests(ask);

        LOG.debug("allocate: post-update");
        application.showRequests();
      }

      if(LOG.isDebugEnabled()) {
        LOG.debug("allocate:" +
          " applicationAttemptId=" + applicationAttemptId +
          " #ask=" + ask.size());
      }

      application.updateBlacklist(blacklistAdditions, blacklistRemovals);

      return application.getAllocation(getResourceCalculator(),
                   clusterResource, getMinimumResourceCapability());
    }
  }

  Finally, allocate() calls application.getAllocation(getResourceCalculator(), clusterResource, getMinimumResourceCapability()). The getAllocation() method is shown below:

// FiCaSchedulerApp.java
  /**
   * This method produces an Allocation that includes the current view
   * of the resources that will be allocated to and preempted from this
   * application.
   *
   * @param rc
   * @param clusterResource
   * @param minimumAllocation
   * @return an allocation
   */
  public synchronized Allocation getAllocation(ResourceCalculator rc,
      Resource clusterResource, Resource minimumAllocation) {

    Set<ContainerId> currentContPreemption = Collections.unmodifiableSet(
        new HashSet<ContainerId>(containersToPreempt));
    containersToPreempt.clear();
    Resource tot = Resource.newInstance(0, 0);
    for(ContainerId c : currentContPreemption){
      Resources.addTo(tot,
          liveContainers.get(c).getContainer().getResource());
    }
    int numCont = (int) Math.ceil(
        Resources.divide(rc, clusterResource, tot, minimumAllocation));
    ResourceRequest rr = ResourceRequest.newInstance(
        Priority.UNDEFINED, ResourceRequest.ANY,
        minimumAllocation, numCont);
    ContainersAndNMTokensAllocation allocation =
        pullNewlyAllocatedContainersAndNMTokens();
    Resource headroom = getHeadroom();
    setApplicationHeadroomForMetrics(headroom);
    return new Allocation(allocation.getContainerList(), headroom, null,
      currentContPreemption, Collections.singletonList(rr),
      allocation.getNMTokenList());
  }
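
The numCont computation in getAllocation() converts the total resources of the containers marked for preemption into an equivalent number of minimum-size containers. The worked example below is only a sketch under a stated assumption: resources are compared by memory alone, which as far as I can tell is what the default resource calculator does, whereas a dominant-resource calculator would behave differently. PreemptionCountSketch and its method are hypothetical names, and the numbers are made up.

```java
// Hypothetical worked example of the numCont arithmetic; memory-only assumption.
class PreemptionCountSketch {

  // ceil(total memory to preempt / minimum allocation), both in MB.
  static int minimumContainersFor(long[] preemptedMemoryMb, long minimumAllocationMb) {
    long totalMb = 0;
    for (long mb : preemptedMemoryMb) {
      totalMb += mb;
    }
    return (int) Math.ceil((double) totalMb / minimumAllocationMb);
  }

  public static void main(String[] args) {
    // 1536 MB + 1024 MB = 2560 MB; with a 1024 MB minimum, ceil(2.5) = 3.
    System.out.println(minimumContainersFor(new long[] {1536, 1024}, 1024));
    // Nothing marked for preemption: ceil(0) = 0.
    System.out.println(minimumContainersFor(new long[] {}, 1024));
  }
}
```

For the AM request traced here nothing is being preempted, so the total is zero and the resulting ResourceRequest asks for zero containers.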

  getAllocation() in turn calls ContainersAndNMTokensAllocation allocation = pullNewlyAllocatedContainersAndNMTokens(), shown below:

// SchedulerApplicationAttempt.java
  // Create container token and NMToken altogether, if either of them fails for
  // some reason like DNS unavailable, do not return this container and keep it
  // in the newlyAllocatedContainers waiting to be refetched.
  public synchronized ContainersAndNMTokensAllocation
      pullNewlyAllocatedContainersAndNMTokens() {
    List<Container> returnContainerList =
        new ArrayList<Container>(newlyAllocatedContainers.size());
    List<NMToken> nmTokens = new ArrayList<NMToken>();
    for (Iterator<RMContainer> i = newlyAllocatedContainers.iterator(); i
      .hasNext();) {
      RMContainer rmContainer = i.next();
      Container container = rmContainer.getContainer();
      try {
        // create container token and NMToken altogether.
        container.setContainerToken(rmContext.getContainerTokenSecretManager()
          .createContainerToken(container.getId(), container.getNodeId(),
            getUser(), container.getResource(), container.getPriority(),
            rmContainer.getCreationTime(), this.logAggregationContext));
        NMToken nmToken =
            rmContext.getNMTokenSecretManager().createAndGetNMToken(getUser(),
              getApplicationAttemptId(), container);
        if (nmToken != null) {
          nmTokens.add(nmToken);
        }
      } catch (IllegalArgumentException e) {
        // DNS might be down, skip returning this container.
        LOG.error("Error trying to assign container token and NM token to" +
            " an allocated container " + container.getId(), e);
        continue;
      }
      returnContainerList.add(container);
      i.remove();
      rmContainer.handle(new RMContainerEvent(rmContainer.getContainerId(),
        RMContainerEventType.ACQUIRED));
    }
    return new ContainersAndNMTokensAllocation(returnContainerList, nmTokens);
  }
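
The loop in pullNewlyAllocatedContainersAndNMTokens() drains a list through its iterator, but leaves an element in place when token creation fails so that it can be fetched again later. The sketch below isolates that pattern. PullSketch and createToken() are hypothetical, and the rule that ids starting with "bad" fail is only a stand-in for a real failure such as an unresolvable host.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of "drain what succeeds, keep what fails".
class PullSketch {
  // Pending items; an item leaves this list only once its token was created.
  private final List<String> newlyAllocated = new ArrayList<>();

  synchronized void add(String containerId) {
    newlyAllocated.add(containerId);
  }

  synchronized int pendingCount() {
    return newlyAllocated.size();
  }

  // Hypothetical token creation that fails for ids starting with "bad".
  private static String createToken(String containerId) {
    if (containerId.startsWith("bad")) {
      throw new IllegalArgumentException("cannot create token for " + containerId);
    }
    return "token-" + containerId;
  }

  synchronized List<String> pull() {
    List<String> result = new ArrayList<>(newlyAllocated.size());
    for (Iterator<String> i = newlyAllocated.iterator(); i.hasNext();) {
      String id = i.next();
      String token;
      try {
        token = createToken(id);
      } catch (IllegalArgumentException e) {
        // Skip this item but leave it in the list for a later pull.
        continue;
      }
      result.add(id + ":" + token);
      // Iterator.remove() is the safe way to delete while iterating.
      i.remove();
    }
    return result;
  }

  public static void main(String[] args) {
    PullSketch app = new PullSketch();
    app.add("c1");
    app.add("bad2");
    app.add("c3");
    System.out.println(app.pull() + " pending=" + app.pendingCount());
  }
}
```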

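  At this point the trace hits a dead end: the call chain cannot be followed any further from here. The analysis is as follows:

We are using the Capacity scheduler. CapacityScheduler.allocate() mainly does two things:

  • It calls FiCaSchedulerApp.updateResourceRequests() to update the resource requests of the APP (the application as seen from the scheduler's point of view).
  • Through FiCaSchedulerApp.pullNewlyAllocatedContainersAndNMTokens(), it takes the containers out of the FiCaSchedulerApp.newlyAllocatedContainers list, wraps them, and returns them.

The FiCaSchedulerApp.newlyAllocatedContainers list holds exactly the containers that were most recently assigned to the application. So how do elements get into this list? The answer starts with the NM heartbeat.

  That is, at some point a node (call it the "AM-NODE") reports the resource usage of its own machine to the ResourceManager's ResourceTrackerService through a heartbeat.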
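
The interplay described above, where allocate() only records the request and containers appear later when a node heartbeats, can be sketched as follows. This is a deliberately simplified model, not scheduler code: HeartbeatAllocationSketch is a hypothetical class, the ask is modelled as a plain counter, and each node is modelled as a number of free slots.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of request-now, receive-after-heartbeat allocation.
class HeartbeatAllocationSketch {
  private int pendingContainers = 0;
  private int nextId = 1;
  private final List<String> newlyAllocated = new ArrayList<>();

  // Application side: record the ask, then return what was assigned earlier.
  synchronized List<String> allocate(int additionalContainers) {
    pendingContainers += additionalContainers;
    List<String> pulled = new ArrayList<>(newlyAllocated);
    newlyAllocated.clear();
    return pulled;
  }

  // Node side: on a heartbeat, satisfy pending asks while the node has room.
  synchronized void nodeHeartbeat(String node, int freeSlots) {
    while (pendingContainers > 0 && freeSlots > 0) {
      newlyAllocated.add("container_" + (nextId++) + "@" + node);
      pendingContainers--;
      freeSlots--;
    }
  }

  public static void main(String[] args) {
    HeartbeatAllocationSketch scheduler = new HeartbeatAllocationSketch();
    // The first call only registers the request, so nothing is returned yet.
    System.out.println(scheduler.allocate(1));
    // A node with free capacity reports in and the pending ask is satisfied.
    scheduler.nodeHeartbeat("AM-NODE", 2);
    // A later call picks up the container assigned during the heartbeat.
    System.out.println(scheduler.allocate(0));
  }
}
```

This is consistent with ScheduleTransition expecting an empty container list from its first allocate() call: the AM container can only show up after some node has sent a heartbeat.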

So the next place to look is NodeManager.java.

   The rest of this analysis is paused for now, because my current task requires analyzing the scheduler source code.

posted @ 2017-05-18 18:19  秦时明月0515