HadoopSourceAnalyse --- Mapreduce ApplicationMaster Task FSM
Overview
Figure 1-1
After a Task is created, it is in the NEW state and waits for a T_SCHEDULE event, which is triggered by the Job object.
T_SCHEDULE Handle
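On receiving this event, the Task first creates and registers an Attempt object, which is used to execute the task and track its progress. Map tasks and Reduce tasks each have their own implementation; the Map case is used as the example here: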
    TaskAttemptImpl attempt = createAttempt();
    attempt.setAvataar(avataar);
    if (LOG.isDebugEnabled()) {
      LOG.debug("Created attempt " + attempt.getID());
    }
    switch (attempts.size()) {
      case 0:
        attempts = Collections.singletonMap(attempt.getID(),
            (TaskAttempt) attempt);
        break;

      case 1:
        Map<TaskAttemptId, TaskAttempt> newAttempts
            = new LinkedHashMap<TaskAttemptId, TaskAttempt>(maxAttempts);
        newAttempts.putAll(attempts);
        attempts = newAttempts;
        attempts.put(attempt.getID(), attempt);
        break;

      default:
        attempts.put(attempt.getID(), attempt);
        break;
    }
    @Override
    protected TaskAttemptImpl createAttempt() {
      return new MapTaskAttemptImpl(getID(), nextAttemptNumber,
          eventHandler, jobFile,
          partition, taskSplitMetaInfo, conf, taskAttemptListener,
          jobToken, credentials, clock, appContext);
    }
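The switch in the snippet above grows the attempts map lazily: an immutable single-entry map for the first attempt, and a LinkedHashMap only once a second attempt appears. The following is a minimal sketch of that pattern in isolation. The class name AttemptRegistrySketch, the use of String in place of TaskAttemptId and TaskAttempt, and the assumption that the map starts out as an empty immutable map are all illustrative and not taken from the Hadoop source.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Task's attempt bookkeeping.
// String replaces TaskAttemptId / TaskAttempt to keep the sketch self-contained.
final class AttemptRegistrySketch {
    private final int maxAttempts;
    // Assumption: the map starts as an empty immutable map.
    private Map<String, String> attempts = Collections.emptyMap();

    AttemptRegistrySketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void register(String id, String attempt) {
        switch (attempts.size()) {
            case 0:
                // First attempt: an immutable single-entry map is enough.
                attempts = Collections.singletonMap(id, attempt);
                break;
            case 1:
                // Second attempt: switch to a mutable, insertion-ordered map.
                Map<String, String> grown = new LinkedHashMap<>(maxAttempts);
                grown.putAll(attempts);
                grown.put(id, attempt);
                attempts = grown;
                break;
            default:
                // Already mutable: just add the entry.
                attempts.put(id, attempt);
                break;
        }
    }

    Map<String, String> view() {
        return Collections.unmodifiableMap(attempts);
    }

    boolean isMutableMap() {
        return attempts instanceof LinkedHashMap;
    }
}
```

Since most tasks presumably succeed on their first attempt, this layout would avoid allocating a hash table in the common case, although the source does not state that motivation.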
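The Task then notifies the Attempt object, and the event enters the TaskAttempt FSM. The Attempt asks the Master for the resources needed to run the task. The Task moves to the SCHEDULED state and waits for the T_ATTEMPT_LAUNCHED event, which arrives once the Attempt has obtained its resources.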
T_ATTEMPT_LAUNCHED Handle
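On receiving this event, the Task records that the corresponding Attempt has been launched and enters the RUNNING state. There it handles T_ATTEMPT_COMMIT_PENDING and T_ADD_SPEC_ATTEMPT events until it receives a T_ATTEMPT_SUCCEEDED event.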
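The transitions described so far can be summarized as a small table-driven state machine. The sketch below covers only the states and events named in this article (failure and kill transitions are left out), and the class TaskFsmSketch with its transition table is an illustrative assumption rather than Hadoop's actual implementation.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: models the happy-path Task transitions named in the text.
final class TaskFsmSketch {
    enum State { NEW, SCHEDULED, RUNNING, SUCCEEDED }

    enum Event {
        T_SCHEDULE,
        T_ATTEMPT_LAUNCHED,
        T_ATTEMPT_COMMIT_PENDING,
        T_ADD_SPEC_ATTEMPT,
        T_ATTEMPT_SUCCEEDED
    }

    // (current state, event) -> next state
    private static final Map<State, Map<Event, State>> TABLE =
        new EnumMap<State, Map<Event, State>>(State.class);

    static {
        add(State.NEW, Event.T_SCHEDULE, State.SCHEDULED);
        add(State.SCHEDULED, Event.T_ATTEMPT_LAUNCHED, State.RUNNING);
        // These two events are handled without leaving RUNNING.
        add(State.RUNNING, Event.T_ATTEMPT_COMMIT_PENDING, State.RUNNING);
        add(State.RUNNING, Event.T_ADD_SPEC_ATTEMPT, State.RUNNING);
        add(State.RUNNING, Event.T_ATTEMPT_SUCCEEDED, State.SUCCEEDED);
    }

    private static void add(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class))
             .put(on, to);
    }

    private State state = State.NEW;

    State getState() {
        return state;
    }

    // Applies the event, or throws if the table has no such transition.
    State handle(Event event) {
        Map<Event, State> row = TABLE.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event " + event + " in state " + state);
        }
        state = next;
        return state;
    }
}
```

Feeding it T_SCHEDULE, T_ATTEMPT_LAUNCHED, and T_ATTEMPT_SUCCEEDED in order walks a fresh instance from NEW to SUCCEEDED; any event the table does not list for the current state is rejected.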
T_ATTEMPT_COMMIT_PENDING Handle
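On receiving this event, the Task records the Attempt that is allowed to commit its output. If the task already has a commit Attempt, it tries to kill the newly arriving Attempt instead: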
    if (task.commitAttempt == null) {
      // TODO: validate attemptID
      task.commitAttempt = attemptID;
      LOG.info(attemptID + " given a go for committing the task output.");
    } else {
      // Don't think this can be a pluggable decision, so simply raise an
      // event for the TaskAttempt to delete its output.
      LOG.info(task.commitAttempt
          + " already given a go for committing the task output, so killing "
          + attemptID);
      task.eventHandler.handle(new TaskAttemptEvent(
          attemptID, TaskAttemptEventType.TA_KILL));
    }
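The rule in the snippet above is "first to ask wins": the first attempt to report commit-pending becomes the commit attempt, and every later request is answered with a kill. A minimal standalone sketch of that arbitration follows. CommitArbiterSketch is a hypothetical class, String stands in for TaskAttemptId, and the killed list stands in for raising a TA_KILL event through the event handler.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the commit arbitration; not Hadoop code.
final class CommitArbiterSketch {
    private String commitAttempt;                          // null until someone asks
    private final List<String> killed = new ArrayList<>(); // stand-in for TA_KILL events

    // Returns true if the attempt may commit, false if it is to be killed.
    boolean onCommitPending(String attemptId) {
        if (commitAttempt == null) {
            commitAttempt = attemptId;
            return true;
        }
        // Mirrors the else branch above: any request that arrives after the
        // commit slot is taken results in a kill for the requesting attempt.
        killed.add(attemptId);
        return false;
    }

    String getCommitAttempt() {
        return commitAttempt;
    }

    List<String> getKilled() {
        return Collections.unmodifiableList(killed);
    }
}
```

With two speculative attempts a0 and a1, calling onCommitPending("a0") and then onCommitPending("a1") leaves a0 as the commit attempt and records a1 as killed, so only one attempt ever writes the task output.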
T_ADD_SPEC_ATTEMPT Handle
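On receiving this event, the task tries to create another new Attempt object to run the same task. For now this is used only for speculation; later it is meant to be used to process the current task concurrently.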
T_ATTEMPT_SUCCEEDED Handle
The Task updates its state and finishes:
    task.handleTaskAttemptCompletion(
        taskAttemptId, TaskAttemptCompletionEventStatus.SUCCEEDED);
    task.finishedAttempts.add(taskAttemptId);
    task.inProgressAttempts.remove(taskAttemptId);
    task.successfulAttempt = taskAttemptId;
    task.sendTaskSucceededEvents();
    for (TaskAttempt attempt : task.attempts.values()) {
      if (attempt.getID() != task.successfulAttempt &&
          // This is okay because it can only talk us out of sending a
          // TA_KILL message to an attempt that doesn't need one for
          // other reasons.
          !attempt.isFinished()) {
        LOG.info("Issuing kill to other attempt " + attempt.getID());
        task.eventHandler.handle(
            new TaskAttemptEvent(attempt.getID(),
                TaskAttemptEventType.TA_KILL));
      }
    }
    task.finished(TaskStateInternal.SUCCEEDED);
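The loop in the snippet above selects every attempt that is neither the successful one nor already finished, and sends each a TA_KILL. The selection step can be sketched on its own as below. SuccessCleanupSketch and attemptsToKill are hypothetical names, and a map from attempt id to a finished flag stands in for the TaskAttempt objects. The sketch compares ids with equals, whereas the original compares references with !=.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper that isolates the "who gets killed" decision.
final class SuccessCleanupSketch {
    private SuccessCleanupSketch() {
    }

    // finishedById: attempt id -> whether that attempt has already finished.
    static List<String> attemptsToKill(Map<String, Boolean> finishedById,
                                       String successfulId) {
        List<String> toKill = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : finishedById.entrySet()) {
            boolean isWinner = entry.getKey().equals(successfulId);
            boolean isFinished = entry.getValue();
            // Only still-running losers need a kill.
            if (!isWinner && !isFinished) {
                toKill.add(entry.getKey());
            }
        }
        return toKill;
    }
}
```

For a task with attempts a0 (the winner), a1 (already finished), and a2 (still running), only a2 would be selected.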
If an error occurs while the task is running, the task tries to create a new Attempt object and run the task again.
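The retry behaviour can be sketched as a counter checked against a limit. This is an illustration under an assumption the article does not spell out: that retries stop once the number of failed attempts reaches maxAttempts (the same name that sizes the attempts map in the first snippet). RetrySketch and its methods are hypothetical.

```java
// Hypothetical model of "create a new Attempt after a failure".
final class RetrySketch {
    private final int maxAttempts;
    private int failedAttempts = 0;
    private int nextAttemptNumber = 0;

    RetrySketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Creates the next attempt and returns its attempt number.
    int scheduleAttempt() {
        return nextAttemptNumber++;
    }

    // Returns the number of the replacement attempt, or -1 when the
    // assumed retry budget is exhausted and the task gives up.
    int onAttemptFailed() {
        failedAttempts++;
        if (failedAttempts < maxAttempts) {
            return scheduleAttempt();
        }
        return -1;
    }

    int getFailedAttempts() {
        return failedAttempts;
    }
}
```

With maxAttempts set to 2, the first failure yields a replacement attempt and the second failure yields -1.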