Hadoop Source Code Analysis 28: How JobTracker Handles JobClient Requests

Submit command: hadoop jar /opt/hadoop-1.0.0/hadoop-examples-1.0.0.jar wordcount /user/admin/in/yellow.txt /user/admin/out/3

 

RPC request: getProtocolVersion(org.apache.hadoop.mapred.JobSubmissionProtocol, 28) from 10.1.1.101:37721

Handling: returns the JobSubmissionProtocol version

Returns: 28

 

RPC request: getStagingAreaDir() from 10.1.1.101:37722

Handling: reads conf.get("mapreduce.jobtracker.staging.root.dir")

Returns: hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging
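The returned value looks like the staging root with the submitting user's name and `.staging` appended, qualified by the default file system. A minimal sketch of that composition in plain Java follows. It is not the Hadoop source: `StagingDirSketch` and `stagingAreaDir` are hypothetical names, and the join rule is an assumption inferred from the path in the trace above.

```java
// Hypothetical helper, not part of Hadoop: rebuilds the staging path seen in the trace.
class StagingDirSketch {
    // fsUri:       default file system URI, e.g. "hdfs://server1:9000"
    // stagingRoot: value of mapreduce.jobtracker.staging.root.dir
    // user:        short name of the submitting user
    // Assumption: the result is <fsUri><stagingRoot>/<user>/.staging
    static String stagingAreaDir(String fsUri, String stagingRoot, String user) {
        // Drop a trailing slash so the joined path has no "//" in the middle.
        String root = stagingRoot.endsWith("/")
                ? stagingRoot.substring(0, stagingRoot.length() - 1)
                : stagingRoot;
        return fsUri + root + "/" + user + "/.staging";
    }
}
```

Called with "hdfs://server1:9000", "/tmp/hadoop-admin/mapred/staging" and "admin", this sketch reproduces the path returned in the trace.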

 

RPC request: getNewJobId() from 10.1.1.101:37725

Handling: new JobID(getTrackerIdentifier(), nextJobId++)

Returns: job_201404152128_0001
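The ID in the trace combines the tracker identifier with a counter that is incremented on every call. The sketch below imitates that in plain Java. `JobIdSketch` is a hypothetical class, and the four-digit zero padding and the starting value of 1 are assumptions read off the returned string `job_201404152128_0001`.

```java
import java.util.Locale;

// Hypothetical stand-in for the JobTracker's ID allocation, not Hadoop code.
class JobIdSketch {
    private final String trackerIdentifier; // e.g. "201404152128"
    private int nextJobId = 1;              // assumption: the first job gets number 1

    JobIdSketch(String trackerIdentifier) {
        this.trackerIdentifier = trackerIdentifier;
    }

    // synchronized so that concurrent RPC handlers never receive the same number
    synchronized String getNewJobId() {
        return String.format(Locale.ROOT, "job_%s_%04d", trackerIdentifier, nextJobId++);
    }
}
```

Two successive calls on `new JobIdSketch("201404152128")` give IDs ending in `_0001` and `_0002`.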

 

RPC request: getQueueAdmins(default) from 10.1.1.101:37730

Handling: aclsEnabled=false, allAllowed=true

Returns: All users are allowed

 

RPC request: submitJob(job_201404152128_0001, hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging/job_201404152128_0001, org.apache.hadoop.security.Credentials@6e28575) from 10.1.1.101:37734

Handling:

Creates a JobInfo object: {id=job_201404152128_0001; jobSubmitDir=hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging/job_201404152128_0001; user=admin}


Creates a JobInProgress object, which contains a JobStatus object: {jobid=job_201404152128_0001; mapProgress=0; reduceProgress=0; cleanupProgress=0; setupProgress=0; runState=4; startTime=1397612941994; user=admin; priority=NORMAL; schedulingInfo="NA"; failureInfo="NA"}


Copies job.xml from HDFS to the local file system: fs.copyToLocalFile(jobFilePath, localJobFile);

jobFilePath=hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging/job_201404152128_0001/job.xml

localJobFile=/tmp/hadoop-admin/mapred/local/jobTracker/job_201404152128_0001.xml

Loads that file into the configuration: conf = new JobConf(localJobFile);

Then reads the settings in job.xml


Creates a JobProfile object: {jobFile="hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging/job_201404152128_0001/job.xml"; jobid=job_201404152128_0001; name="wordcount"; queueName="default"; url="http://server1:50030/jobdetails.jsp?jobid=job_201404152128_0001"; user="admin"}


totalSubmissions++

Adds the JobInProgress to jobs, whose contents are now

{job_201404152128_0001=org.apache.hadoop.mapred.JobInProgress@7e9b4b1f}

Passes the JobInProgress to JobQueueJobInProgressListener and EagerTaskInitializationListener


 

The EagerTaskInitializationListener thread takes the JobInProgress from jobInitQueue and initializes it:

threadPool.execute(new InitJob(job));

This ends up calling JobTracker.initJob(JobInProgress job),

which reads TaskSplitMetaInfo from hdfs://server1:9000/tmp/hadoop-admin/mapred/staging/admin/.staging/job_201404152128_0001/job.splitmetainfo and creates the corresponding TaskInProgress objects (maps, reduces, cleanup, setup)


Returns: the JobStatus member of JobInProgress
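The hand-off inside submitJob, where the job is given to two listeners and one of them initializes it on a thread pool, can be sketched in plain Java. This is a simplified model, not the Hadoop implementation. All class names here (`SketchJob`, `QueueListener`, `EagerInitListener`, `SubmitSketch`) are hypothetical, the pool size of 4 is an arbitrary assumption, and the intermediate jobInitQueue is left out so the listener submits straight to the pool.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for JobInProgress: only records whether tasks were created.
class SketchJob {
    final String id;
    volatile boolean initialized = false;
    SketchJob(String id) { this.id = id; }
    void initTasks() { initialized = true; } // the real code reads job.splitmetainfo here
}

interface SketchJobListener {
    void jobAdded(SketchJob job);
}

// Plays the role of JobQueueJobInProgressListener: remembers the job for the scheduler.
class QueueListener implements SketchJobListener {
    final List<SketchJob> queue = new CopyOnWriteArrayList<>();
    public void jobAdded(SketchJob job) { queue.add(job); }
}

// Plays the role of EagerTaskInitializationListener: initializes the job asynchronously.
class EagerInitListener implements SketchJobListener {
    private final ExecutorService threadPool = Executors.newFixedThreadPool(4);

    public void jobAdded(SketchJob job) {
        threadPool.execute(job::initTasks); // submitJob itself does not wait for this
    }

    // Stops accepting work and waits briefly for running initializations to finish.
    void shutdownAndWait() {
        threadPool.shutdown();
        try {
            threadPool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class SubmitSketch {
    // Notifies every listener about the new job, in registration order.
    static void submit(SketchJob job, List<SketchJobListener> listeners) {
        for (SketchJobListener listener : listeners) {
            listener.jobAdded(job);
        }
    }
}
```

Because initialization runs on another thread, the status handed back to the client can still describe a job that has not been initialized yet, which fits the runState=4 value shown in the JobStatus above.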

 

RPC request: getJobProfile(job_201404152128_0001) from 10.1.1.101:37744

Handling:

Returns: the JobProfile member of JobInProgress

 

RPC request: getJobStatus(job_201404152128_0001) from 10.1.1.101:37744

Handling:

Returns: the JobStatus member of JobInProgress
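The JobStatus carries its run state as an integer, such as the runState=4 seen in the submitJob step. A small lookup sketch makes those values easier to read. `RunStateSketch` is a hypothetical helper. The numbers follow the constants of org.apache.hadoop.mapred.JobStatus as I recall them for Hadoop 1.x, so check them against your version before relying on them.

```java
// Hypothetical helper, not part of Hadoop: maps a JobStatus run state to a readable name.
class RunStateSketch {
    static String name(int runState) {
        switch (runState) {
            case 1: return "RUNNING";
            case 2: return "SUCCEEDED";
            case 3: return "FAILED";
            case 4: return "PREP";     // submitted, tasks not yet running
            case 5: return "KILLED";
            default: return "UNKNOWN"; // anything this sketch does not know about
        }
    }
}
```

With this mapping, the runState=4 in the trace reads as PREP.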

 

RPC request: getTaskCompletionEvents(job_201404152128_0001, 0, 10) from 10.1.1.101:37744

Handling:

Returns: the taskCompletionEvents member of JobInProgress

..................

From this point on, JobClient keeps calling getJobStatus and getTaskCompletionEvents
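The repeated polling can be pictured as a loop that checks the job state and then fetches completion events in batches, starting from the last event index it has seen. The sketch below is a plain-Java model under assumptions, not the JobClient source. `TrackerSketch`, `FixedTracker` and `PollingSketch` are hypothetical names, events are simplified to strings, and the batch size of 10 is taken from the logged call getTaskCompletionEvents(job_201404152128_0001, 0, 10).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two RPC calls the client repeats while monitoring.
interface TrackerSketch {
    boolean isJobComplete();
    List<String> getTaskCompletionEvents(int fromEventId, int maxEvents);
}

// Test double: a tracker whose job has already finished with a fixed list of events.
class FixedTracker implements TrackerSketch {
    private final List<String> events;
    FixedTracker(List<String> events) { this.events = events; }

    public boolean isJobComplete() { return true; }

    public List<String> getTaskCompletionEvents(int fromEventId, int maxEvents) {
        int start = Math.min(fromEventId, events.size());
        int end = Math.min(events.size(), start + maxEvents);
        return new ArrayList<>(events.subList(start, end));
    }
}

class PollingSketch {
    private static final int MAX_EVENTS = 10; // batch size, as in the logged call

    // Polls until the job is complete and no unseen events remain; returns all events seen.
    static List<String> monitor(TrackerSketch tracker, long sleepMillis) {
        List<String> seen = new ArrayList<>();
        int fromEventId = 0;
        while (true) {
            // Read the state first so events published before completion are still drained.
            boolean complete = tracker.isJobComplete();
            List<String> batch = tracker.getTaskCompletionEvents(fromEventId, MAX_EVENTS);
            seen.addAll(batch);
            fromEventId += batch.size();
            if (batch.isEmpty()) {
                if (complete) {
                    return seen;
                }
                try {
                    Thread.sleep(sleepMillis); // nothing new yet, wait before the next poll
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return seen;
                }
            }
        }
    }
}
```

Advancing fromEventId by the size of each batch is what keeps the client from receiving the same event twice across polls.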

 

 

posted @ 2014-05-28 08:48  lihui1625