江山疯宇晴

HadoopSourceAnalyse --- Mapreduce ApplicationMaster Task FSM

Overview


Figure 1-1
After a Task is created, it is in the NEW state and waits for the T_SCHEDULE event, which is triggered by the Job object.

T_SCHEDULE Handle

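When the Task receives this event, it first creates an Attempt object and registers it. The Attempt object runs the task and tracks its execution. Map tasks and Reduce tasks each have their own implementation; Map is used as the example here:
    TaskAttemptImpl attempt = createAttempt();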
    attempt.setAvataar(avataar);
    if (LOG.isDebugEnabled()) {
      LOG.debug("Created attempt " + attempt.getID());
    }
    switch (attempts.size()) {
      case 0:
        attempts = Collections.singletonMap(attempt.getID(),
            (TaskAttempt) attempt);
        break;
        
      case 1:
        Map<TaskAttemptId, TaskAttempt> newAttempts
            = new LinkedHashMap<TaskAttemptId, TaskAttempt>(maxAttempts);
        newAttempts.putAll(attempts);
        attempts = newAttempts;
        attempts.put(attempt.getID(), attempt);
        break;

      default:
        attempts.put(attempt.getID(), attempt);
        break;
    }

  @Override
  protected TaskAttemptImpl createAttempt() {
    return new MapTaskAttemptImpl(getID(), nextAttemptNumber,
        eventHandler, jobFile,
        partition, taskSplitMetaInfo, conf, taskAttemptListener,
        jobToken, credentials, clock, appContext);
  }
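
The registration logic above grows the attempts map in stages. The following is a minimal, self-contained sketch of that pattern, not the Hadoop implementation: `AttemptRegistry`, `register`, and `view` are hypothetical names, and plain strings stand in for `TaskAttemptId` and `TaskAttempt`. Presumably the staged growth exists because most tasks need only one attempt, so an immutable one-entry map is cheaper than a full hash map.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the attempt bookkeeping shown above.
// Strings replace TaskAttemptId/TaskAttempt so the sketch compiles alone.
class AttemptRegistry {
    private final int maxAttempts;
    private Map<String, String> attempts = Collections.emptyMap();

    AttemptRegistry(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void register(String id, String attempt) {
        switch (attempts.size()) {
            case 0:
                // First attempt: an immutable one-entry map is enough.
                attempts = Collections.singletonMap(id, attempt);
                break;
            case 1:
                // Second attempt: copy into a mutable, insertion-ordered map.
                Map<String, String> grown = new LinkedHashMap<>(maxAttempts);
                grown.putAll(attempts);
                grown.put(id, attempt);
                attempts = grown;
                break;
            default:
                // Third and later attempts: the map is already mutable.
                attempts.put(id, attempt);
                break;
        }
    }

    Map<String, String> view() {
        return attempts;
    }

    public static void main(String[] args) {
        AttemptRegistry registry = new AttemptRegistry(4);
        registry.register("attempt_0", "first");
        registry.register("attempt_1", "second");
        System.out.println(registry.view().keySet());
    }
}
```

Note that the one-entry map cannot be modified in place, which is why the second registration has to copy it before adding the new entry.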

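Afterwards, the Attempt object is notified and the event enters the TaskAttempt FSM. The Attempt asks the Master for the resources needed to run the task. The Task enters the SCHEDULED state and waits for the T_ATTEMPT_LAUNCHED event, which arrives once the Attempt has successfully obtained its resources.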

T_ATTEMPT_LAUNCHED Handle

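After receiving this event, the Task records that the corresponding Attempt has been launched and enters the RUNNING state. It then waits for T_ATTEMPT_COMMIT_PENDING and T_ADD_SPEC_ATTEMPT events until it receives the T_ATTEMPT_SUCCEEDED event.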
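
The transitions described in this post can be summarized as a small table-driven state machine. This is only a sketch under stated assumptions: `TaskFsmSketch` is a hypothetical class, it does not use Hadoop's own state machine classes, and it models only the states and events named in this post (the real Task FSM has more of both, for example for failures and kills).

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model of the Task FSM. Only transitions
// mentioned in this post are included.
class TaskFsmSketch {
    enum State { NEW, SCHEDULED, RUNNING, SUCCEEDED }

    enum Event {
        T_SCHEDULE,
        T_ATTEMPT_LAUNCHED,
        T_ATTEMPT_COMMIT_PENDING,
        T_ADD_SPEC_ATTEMPT,
        T_ATTEMPT_SUCCEEDED
    }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TABLE =
        new EnumMap<>(State.class);

    private static void add(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
    }

    static {
        add(State.NEW, Event.T_SCHEDULE, State.SCHEDULED);
        add(State.SCHEDULED, Event.T_ATTEMPT_LAUNCHED, State.RUNNING);
        // These two events are handled without leaving RUNNING.
        add(State.RUNNING, Event.T_ATTEMPT_COMMIT_PENDING, State.RUNNING);
        add(State.RUNNING, Event.T_ADD_SPEC_ATTEMPT, State.RUNNING);
        add(State.RUNNING, Event.T_ATTEMPT_SUCCEEDED, State.SUCCEEDED);
    }

    private State current = State.NEW;

    State handle(Event event) {
        Map<Event, State> row = TABLE.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event " + event + " in state " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        TaskFsmSketch fsm = new TaskFsmSketch();
        for (Event e : new Event[] {
                Event.T_SCHEDULE, Event.T_ATTEMPT_LAUNCHED,
                Event.T_ATTEMPT_COMMIT_PENDING, Event.T_ATTEMPT_SUCCEEDED}) {
            System.out.println(e + " -> " + fsm.handle(e));
        }
    }
}
```

Keeping the transitions in a table makes an unexpected event fail loudly instead of being silently ignored.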

T_ATTEMPT_COMMIT_PENDING Handle

After receiving this event, the Task records the running Attempt as its commit attempt. If the task already has a commit Attempt, it tries to kill the new one:
      if (task.commitAttempt == null) {
        // TODO: validate attemptID
        task.commitAttempt = attemptID;
        LOG.info(attemptID + " given a go for committing the task output.");
      } else {
        // Don't think this can be a pluggable decision, so simply raise an
        // event for the TaskAttempt to delete its output.
        LOG.info(task.commitAttempt
            + " already given a go for committing the task output, so killing "
            + attemptID);
        task.eventHandler.handle(new TaskAttemptEvent(
            attemptID, TaskAttemptEventType.TA_KILL));
      }
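
The arbitration above amounts to "the first attempt to ask is allowed to commit, every later request is killed". A minimal standalone sketch of that rule follows; `CommitArbiter`, `Decision`, and `onCommitPending` are hypothetical names, and a string stands in for the attempt ID. Instead of raising a `TA_KILL` event, the sketch simply returns the decision.

```java
// Hypothetical stand-in for the commit arbitration shown above.
class CommitArbiter {
    enum Decision { GO, KILL }

    // null until some attempt has been given the go-ahead to commit
    private String commitAttempt;

    Decision onCommitPending(String attemptId) {
        if (commitAttempt == null) {
            // First request wins and is remembered.
            commitAttempt = attemptId;
            return Decision.GO;
        }
        // A commit attempt already exists, so the requester is killed.
        return Decision.KILL;
    }

    String commitAttempt() {
        return commitAttempt;
    }

    public static void main(String[] args) {
        CommitArbiter arbiter = new CommitArbiter();
        System.out.println(arbiter.onCommitPending("attempt_0"));
        System.out.println(arbiter.onCommitPending("attempt_1"));
    }
}
```

This guarantees that at most one attempt ever commits the task output, which matters once speculative attempts run the same task in parallel.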


T_ADD_SPEC_ATTEMPT Handle

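On receiving this event, the task tries to create another new Attempt object to run the task. For now this is used only for speculation; later it will be used to process the current task concurrently.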

T_ATTEMPT_SUCCEEDED Handle

The task state is updated and the task finishes:
      task.handleTaskAttemptCompletion(
          taskAttemptId, 
          TaskAttemptCompletionEventStatus.SUCCEEDED);
      task.finishedAttempts.add(taskAttemptId);
      task.inProgressAttempts.remove(taskAttemptId);
      task.successfulAttempt = taskAttemptId;
      task.sendTaskSucceededEvents();
      for (TaskAttempt attempt : task.attempts.values()) {
        if (attempt.getID() != task.successfulAttempt &&
            // This is okay because it can only talk us out of sending a
            //  TA_KILL message to an attempt that doesn't need one for
            //  other reasons.
            !attempt.isFinished()) {
          LOG.info("Issuing kill to other attempt " + attempt.getID());
          task.eventHandler.handle(
              new TaskAttemptEvent(attempt.getID(), 
                  TaskAttemptEventType.TA_KILL));
        }
      }
      task.finished(TaskStateInternal.SUCCEEDED);
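
The loop in the handler above selects which attempts still need a kill event once one attempt has succeeded. A simplified sketch of that selection is shown here; `SuccessHandlerSketch` and `attemptsToKill` are hypothetical names, attempt IDs are plain strings, and each attempt is reduced to a "finished" flag. Unlike the excerpt, which compares IDs with `!=`, this sketch uses `equals`, since string IDs are not guaranteed to be the same object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper mirroring the kill loop in the success handler.
class SuccessHandlerSketch {
    // Returns the IDs of attempts that are neither the winner nor finished.
    static List<String> attemptsToKill(Map<String, Boolean> finishedById,
                                       String successfulId) {
        List<String> toKill = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : finishedById.entrySet()) {
            boolean isWinner = entry.getKey().equals(successfulId);
            boolean isFinished = entry.getValue();
            if (!isWinner && !isFinished) {
                toKill.add(entry.getKey());
            }
        }
        return toKill;
    }

    public static void main(String[] args) {
        Map<String, Boolean> attempts = new LinkedHashMap<>();
        attempts.put("attempt_0", true);   // failed earlier, already finished
        attempts.put("attempt_1", false);  // the attempt that just succeeded
        attempts.put("attempt_2", false);  // speculative attempt, still running
        System.out.println(attemptsToKill(attempts, "attempt_1"));
    }
}
```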


If an error occurs while the task is running, the task tries to create a new Attempt object and rerun the task.
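
A rough sketch of such a retry decision is given below. It rests on an assumption not stated in this post: that retries are bounded by the `maxAttempts` value seen in the registration code, so the task gives up once that many attempts have failed. `RetryPolicySketch`, `Outcome`, and `onAttemptFailed` are hypothetical names.

```java
// Hypothetical retry policy; the maxAttempts bound is an assumption.
class RetryPolicySketch {
    enum Outcome { RETRY, FAIL_TASK }

    private final int maxAttempts;
    private int failedAttempts;

    RetryPolicySketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    Outcome onAttemptFailed() {
        failedAttempts++;
        // Retry while the failure budget is not exhausted.
        return failedAttempts < maxAttempts ? Outcome.RETRY : Outcome.FAIL_TASK;
    }

    public static void main(String[] args) {
        RetryPolicySketch policy = new RetryPolicySketch(3);
        for (int i = 0; i < 3; i++) {
            System.out.println("failure " + (i + 1) + ": " + policy.onAttemptFailed());
        }
    }
}
```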

Posted on 2013-05-20 12:59 by 江山疯宇晴
