江山疯宇晴

Hadoop Source Analysis --- MapReduce ApplicationMaster Init & Startup

Overview

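The ApplicationMaster is the central component for running a MapReduce job. It is mainly responsible for requesting Containers from the ResourceManager to run tasks, and for monitoring the execution of those tasks: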


Figure 1-1
The figure above shows what the ApplicationMaster does during initialization. When a new job is submitted, the ApplicationMaster starts the following Services in order:

  • Dispatcher: AsyncDispatcher
  • clientService: MRClientService
  • CommitterEventHandler Service: CommitterEventHandler
  • TaskAttemptListener Service: TaskAttemptListenerImpl
  • ContainerAllocator Service: ContainerAllocatorRouter
  • ContainerLauncher Service: ContainerLauncherRouter
  • CleanupStaging Service: StagingDirCleaningService
  • History service: JobHistoryEventHandler
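The ordered startup of these services can be pictured with a minimal sketch. It assumes a composite-style lifecycle in which every child service is initialized and then started in registration order. `ServiceOrderSketch`, its `Service` interface and the `recording` helper are hypothetical stand-ins written for illustration, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a composite service lifecycle; not Hadoop code.
final class ServiceOrderSketch {
    interface Service {
        void init();
        void start();
    }

    private final List<Service> services = new ArrayList<>();

    void addService(Service s) {
        services.add(s);
    }

    // Initialize every child, then start every child, both in registration order.
    void initAndStart() {
        for (Service s : services) {
            s.init();
        }
        for (Service s : services) {
            s.start();
        }
    }

    // Hypothetical helper: a service that only records its lifecycle calls.
    static Service recording(String name, List<String> log) {
        return new Service() {
            @Override public void init() { log.add("init " + name); }
            @Override public void start() { log.add("start " + name); }
        };
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        ServiceOrderSketch am = new ServiceOrderSketch();
        am.addService(recording("Dispatcher", log));
        am.addService(recording("clientService", log));
        am.initAndStart();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Keeping the children in a list means the registration order alone decides the init and start order, which is why the Dispatcher comes first in the list above.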
EventHandler registrations (event type: handler):
  • History EventType: JobHistoryEventHandler
  • JobEventType: JobEventDispatcher
  • TaskEventType: TaskEventDispatcher
  • TaskAttemptEventType: TaskAttemptEventDispatcher
  • CommitterEventType: CommitterEventHandler
  • Speculator.EventType: SpeculatorEventDispatcher
  • ContainerAllocator.EventType: ContainerAllocatorRouter
  • ContainerLauncher.EventType: ContainerLauncherRouter
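The pairing of event types with handlers can be illustrated with a small sketch of the register/dispatch pattern. This is a synchronous simplification written for illustration: the real AsyncDispatcher queues events and handles them on its own thread. The class `DispatcherSketch`, the `Handler` interface and the enum constants `JOB_INIT` and `T_SCHEDULE` are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, synchronous stand-in for the register/dispatch pattern; not Hadoop's AsyncDispatcher.
final class DispatcherSketch {
    // Hypothetical event enums, named after two of the event types listed above.
    enum JobEventType { JOB_INIT }
    enum TaskEventType { T_SCHEDULE }

    interface Handler {
        void handle(Enum<?> event);
    }

    private final Map<Class<?>, Handler> handlers = new HashMap<>();

    // Associate one handler with every event of the given enum type.
    void register(Class<? extends Enum<?>> eventType, Handler handler) {
        handlers.put(eventType, handler);
    }

    // Route the event to the handler registered for its enum type.
    void dispatch(Enum<?> event) {
        Handler handler = handlers.get(event.getDeclaringClass());
        if (handler == null) {
            throw new IllegalStateException("No handler for " + event.getDeclaringClass().getName());
        }
        handler.handle(event);
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        DispatcherSketch dispatcher = new DispatcherSketch();
        dispatcher.register(JobEventType.class, e -> log.add("JobEventDispatcher:" + e));
        dispatcher.register(TaskEventType.class, e -> log.add("TaskEventDispatcher:" + e));
        dispatcher.dispatch(JobEventType.JOB_INIT);
        dispatcher.dispatch(TaskEventType.T_SCHEDULE);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Keying the handler map by the enum class lets one handler receive every event of that type, which mirrors the one-row-per-event-type layout of the list above.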

Client Service Init and Startup


Figure 2-1
During initialization, the Client Service reads the following settings:
  • yarn.app.mapreduce.am.job.client.thread-count: Default is 1
  • yarn.app.mapreduce.am.job.client.port-range
  • hadoop.security.authorization: Default is false
Clients can communicate with the ApplicationMaster through this ClientService to monitor the execution of a job.
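Reading a property with a fallback default, as in the list above, can be sketched as follows. A plain `Map` stands in for the job configuration here; the helper class `ClientServiceConfigSketch` and its `getInt` and `getBoolean` methods are hypothetical and are not the Hadoop Configuration API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration lookup; not Hadoop's Configuration class.
final class ClientServiceConfigSketch {
    static final String THREAD_COUNT = "yarn.app.mapreduce.am.job.client.thread-count";
    static final String AUTHORIZATION = "hadoop.security.authorization";

    // Return the configured integer, or the default when the key is absent.
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Return the configured boolean, or the default when the key is absent.
    static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(THREAD_COUNT, "4"); // explicit override; AUTHORIZATION stays unset
        System.out.println(getInt(conf, THREAD_COUNT, 1));
        System.out.println(getBoolean(conf, AUTHORIZATION, false));
    }
}
```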

CommitterEventHandler Init and Startup



Figure 3-1
During initialization, CommitterEventHandler reads the following settings:
  • yarn.app.mapreduce.am.job.committer.cancel-timeout: Default is 60000 ms
  • yarn.app.mapreduce.am.job.committer.commit-window: Default is 10000 ms

TaskAttemptListenerImpl Init and Startup


Figure 4-1
During initialization, TaskAttemptListener reads the following settings:
  • yarn.app.mapreduce.am.job.committer.commit-window
  • yarn.app.mapreduce.am.job.task.listener.thread-count: Default is 30

ContainerAllocatorRouter Init and Startup

The Router's main function is to start either a LocalContainerAllocator or an RMContainerAllocator, depending on the job type.
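The routing decision can be sketched as a simple factory. This sketch assumes that the "job type" check is whether the job runs in uber mode (all tasks inside the ApplicationMaster's own JVM). The nested types only borrow the names used above and are hypothetical stand-ins, not the Hadoop implementations.

```java
// Illustrative factory for the router's choice; the nested types are stand-ins, not Hadoop classes.
final class AllocatorRouterSketch {
    interface ContainerAllocator {
        String describe();
    }

    static final class LocalContainerAllocator implements ContainerAllocator {
        @Override public String describe() { return "local: tasks run inside the ApplicationMaster"; }
    }

    static final class RMContainerAllocator implements ContainerAllocator {
        @Override public String describe() { return "rm: containers are requested from the ResourceManager"; }
    }

    // Assumption: the job type is reduced to a single "uber" flag.
    static ContainerAllocator create(boolean uberJob) {
        return uberJob ? new LocalContainerAllocator() : new RMContainerAllocator();
    }

    public static void main(String[] args) {
        System.out.println(create(true).describe());
        System.out.println(create(false).describe());
    }
}
```

Because both allocators sit behind one interface, the rest of the ApplicationMaster can send allocation events to the router without knowing which implementation was chosen.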

RMContainerAllocator Init and Startup


Figure 5-1
During initialization, RMContainerAllocator reads the following settings:
  • yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms: Default is 1000 ms
  • yarn.app.mapreduce.am.job.node-blacklisting.enable: Default is true
  • mapreduce.job.maxtaskfailures.per.tracker: Default is 3
  • yarn.app.mapreduce.am.job.node-blacklisting.ignore-threshold-node-percent: Default is 33%
  • mapreduce.job.reduce.slowstart.completedmaps: Default is 0.05
  • yarn.app.mapreduce.am.job.reduce.rampup.limit: Default is 0.5
  • yarn.app.mapreduce.am.job.reduce.preemption.limit: Default is 0.5
  • yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms: Default is 360000ms
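Two of the settings above are easier to grasp with the arithmetic written out. The sketch below assumes that the slowstart fraction is multiplied by the total number of map tasks and rounded up, and that node blacklisting is ignored once the blacklisted share of cluster nodes reaches the threshold percentage. The class and method names are hypothetical, and the formulas are an illustration under those assumptions, not a copy of the allocator's code.

```java
// Illustrative arithmetic for the slowstart and blacklist-threshold settings; names are hypothetical.
final class AllocatorThresholdSketch {
    // Number of map tasks that must complete before reduces are scheduled.
    static int completedMapsForSlowstart(double slowstartFraction, int totalMaps) {
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    // True when the blacklisted share of nodes reaches the ignore threshold (in percent).
    static boolean ignoreBlacklisting(int blacklistedNodes, int clusterNodes, int thresholdPercent) {
        if (clusterNodes <= 0) {
            return false; // no cluster size known yet, keep blacklisting active
        }
        int percent = (int) ((float) blacklistedNodes / clusterNodes * 100);
        return percent >= thresholdPercent;
    }

    public static void main(String[] args) {
        System.out.println(completedMapsForSlowstart(0.05, 10));
        System.out.println(ignoreBlacklisting(1, 4, 33));
    }
}
```

Under these assumptions, the default slowstart of 0.05 with 10 map tasks means one map must finish before reduces are scheduled, and one blacklisted node out of four (25%) stays below the default 33% threshold, so blacklisting remains in effect.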

ContainerLauncherRouter Init and Startup

Like ContainerAllocatorRouter, ContainerLauncherRouter initializes either a LocalContainerLauncher or a ContainerLauncherImpl object, depending on the job type.

ContainerLauncherImpl Init and Startup


Figure 6-1
During initialization, ContainerLauncherImpl reads the following configuration setting:
  • yarn.app.mapreduce.am.containerlauncher.thread-count-limit: Default is 500

JobHistoryEventHandler Init and Startup


Figure 7-1
During initialization, JobHistoryEventHandler reads the following settings:
  • yarn.app.mapreduce.am.staging-dir: Default is /tmp/hadoop-yarn/staging/<user>/.staging
  • mapreduce.jobhistory.intermediate-done-dir: if this property is not configured, yarn.app.mapreduce.am.staging-dir is read instead; if that property is not configured either, the default value /tmp/hadoop-yarn/staging/history/done_intermediate/ is used
  • yarn.app.mapreduce.am.history.max-unflushed-events: Default is 200
  • yarn.app.mapreduce.am.history.job-complete-unflushed-multiplier: Default is 30
  • yarn.app.mapreduce.am.history.complete-event-flush-timeout: Default is 30000 ms
  • yarn.app.mapreduce.am.history.use-batched-flush.queue-size.threshold: Default is 50
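The fallback chain for the intermediate done directory can be sketched as below. A plain `Map` stands in for the configuration. The `/history/done_intermediate` suffix and the `/tmp/hadoop-yarn/staging` base are inferred from the default path quoted above, so treat them, along with the class and method names, as assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative fallback resolution; a Map stands in for the configuration object.
final class IntermediateDoneDirSketch {
    static final String DONE_DIR = "mapreduce.jobhistory.intermediate-done-dir";
    static final String STAGING_DIR = "yarn.app.mapreduce.am.staging-dir";
    // Assumed values, inferred from the default path quoted in the list above.
    static final String DEFAULT_STAGING = "/tmp/hadoop-yarn/staging";
    static final String SUFFIX = "/history/done_intermediate";

    static String resolve(Map<String, String> conf) {
        String doneDir = conf.get(DONE_DIR);
        if (doneDir != null) {
            return doneDir; // 1. an explicit setting wins
        }
        // 2. otherwise derive from the staging dir, 3. or from the built-in default
        return conf.getOrDefault(STAGING_DIR, DEFAULT_STAGING) + SUFFIX;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf));
        conf.put(STAGING_DIR, "/user/stage");
        System.out.println(resolve(conf));
    }
}
```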

posted on 2013-05-13 11:25  江山疯宇晴
