A Source-Level Analysis of TaskTracker Task Initialization and Launch
In the previous post, a source-level analysis of listener-based job initialization, the JobTracker's response to TaskTracker heartbeats, and the scheduler's task assignment, we analyzed the mechanism by which the TaskTracker sends heartbeats. In this section we analyze what the TaskTracker does after it receives the JobTracker's response.
TaskTracker's transmitHeartBeat method calls JobTracker.heartbeat to obtain the heartbeat response, a HeartbeatResponse, which it returns to TaskTracker.offerService(). The HeartbeatResponse contains the following important pieces of information:
(1) Possibly one cleanup task or one setup task; a single heartbeat can carry at most one task of this kind. Priority goes to a map cleanup task, then map setup, then reduce cleanup, then reduce setup;
(2) MapTasks assigned by the scheduler (there may be several, with at most one non-local map; once such a map is assigned, no more maps are assigned and the map list is returned) or a ReduceTask (at most one per heartbeat);
(3) Tasks on this TaskTracker that should be killed;
(4) Jobs on this TaskTracker that should be killed;
(5) Tasks on this TaskTracker whose output data should be saved (committed);
(6) The interval until the next heartbeat;
(7) If the JobTracker has restarted, a list of jobs that need to be recovered;
(8) Possibly nothing but a re-initialization command, ReinitTrackerAction. If this is not the TaskTracker's first heartbeat, the JobTracker has not restarted, and yet the JobTracker has no record of this TaskTracker's previous heartbeat, something may be seriously wrong, so the TaskTracker is told to re-initialize itself.
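For orientation, here is a minimal sketch of the Hadoop 1.x HeartbeatResponse structure that carries the information above. Field names follow the actual class as far as I recall, but treat the details as assumptions; getters, setters, and the Writable serialization plumbing are omitted.

// Minimal sketch of the response a JobTracker sends back to a TaskTracker.
// Items (1)-(5) and (8) above all arrive as subclasses of TaskTrackerAction:
// LaunchTaskAction, CommitTaskAction, KillTaskAction, KillJobAction,
// and ReinitTrackerAction.
class HeartbeatResponse /* implements Writable in the real source */ {
  short responseId;            // echoes the heartbeat sequence number
  int heartbeatInterval;       // (6) interval until the next heartbeat
  TaskTrackerAction[] actions; // (1)-(5), (8) as typed action objects
  Set<JobID> recoveredJobs;    // (7) jobs to recover after a JobTracker restart
}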
TaskTracker.offerService() is a while loop that repeatedly waits out the heartbeat interval, sends a heartbeat, receives the response, and processes the tasks described in the response. After a HeartbeatResponse is received:
1. Fetch the list of jobs to recover (if the response contains one), reset each job's state, and roll back all running reduce tasks that are in the SHUFFLE phase by putting them into shouldReset;
2. Call HeartbeatResponse.getActions() to obtain all the directives sent by the JobTracker as a TaskTrackerAction array: TaskTrackerAction[] actions = heartbeatResponse.getActions().
3. If actions holds a re-initialization command, State.STALE is returned directly to run(), which breaks out of the inner while loop; the outer while loop then continues, calls initialize() to re-initialize, and runs offerService() again.
4. Reset the heartbeat interval: heartbeatInterval = heartbeatResponse.getHeartbeatInterval().
5. Set both justStarted and justInited to false, indicating that the service has started and is connected to the JobTracker.
6. Iterate over the actions array:
(1) If the action is a LaunchTaskAction, addToTaskQueue((LaunchTaskAction)action) adds it to the execution queue of a TaskLauncher thread. Depending on the action's type, addToTaskQueue hands it to either mapLauncher or reduceLauncher; both are instances of TaskLauncher (which extends Thread) created during initialize(). addToTaskQueue(action) appends the action to the List<TaskInProgress> tasksToLaunch; note that this TaskInProgress is TaskTracker.TaskInProgress, not the TaskInProgress class in the mapred package. TaskLauncher's run() method continuously monitors tasksToLaunch; as soon as a new task appears, it takes the first one, waits until there are enough free slots to run it, and checks via canBeLaunched() that the task's state is one of UNASSIGNED, FAILED_UNCLEAN, or KILLED_UNCLEAN before it may run (a simplified sketch of this loop appears after this list). The task is finally launched through startNewTask(tip).
(2) If the action is a CommitTaskAction, commitResponses.add(commitAction.getTaskID()) is executed. This kind of action concerns moving a task's final results from a temporary directory to the final directory after the data has been processed; only tasks that write their output directly to HDFS go through this step, namely two kinds: reduce tasks and the map tasks of map-only jobs. Whether it is a map task, reduce task, setup task, job-cleanup task, or task-cleanup task, upon completion it calls done(umbilical, reporter), which through a chain of calls reaches commitResponses and waits there for the JobTracker's commit command.
(3) Any other action is put into tasksToCleanup via tasksToCleanup.put(action); this covers killing tasks or jobs. The taskCleanupThread continuously monitors the tasksToCleanup queue and takes one TaskTrackerAction at a time. If the action is a KillJobAction, purgeJob((KillJobAction) action) handles it: it fetches the corresponding RunningJob from runningJobs, deletes all of the job's files if cleanup is allowed, and clears all the tasks belonging to that RunningJob. If the action is a KillTaskAction, processKillTaskAction((KillTaskAction) action) handles it: it fetches the corresponding TaskInProgress from tasks, finds the owning RunningJob in runningJobs, and removes the task from the RunningJob's task list.
7. markUnresponsiveTasks(): kill tasks that have not reported progress for a certain period of time.
8. killOverflowingTasks(): when the remaining disk space drops below mapred.local.dir.minspacekill (default 0), pick suitable tasks to kill in order to free space.
9. At this point the cleanup and recovery work is done, so if acceptNewTasks == false and this TaskTracker is idle, set acceptNewTasks = true so that new tasks can be accepted.
10. checkJettyPort(server.getPort()). The official comment explains this as a precaution, because in some situations the Jetty port obtained is inconsistent. The check: if the port number is less than 0, shuttingDown is set to true, which makes the two loops in run() and the while loop in offerService() exit, so that main() finishes and this TaskTracker shuts down.
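As mentioned in item (1) of step 6, the TaskLauncher is a classic producer/consumer thread. The following is a self-contained, much-simplified analog of its loop; it is a sketch of the pattern only, with the Hadoop-specific types reduced to a Runnable and the slot count hard-coded, not the actual 1.x implementation:

import java.util.LinkedList;
import java.util.List;

// Simplified analog of TaskTracker.TaskLauncher: addToTaskQueue() enqueues
// work; run() blocks until a task is queued AND enough slots are free, then
// launches it. The real code also checks tip.canBeLaunched() (state must be
// UNASSIGNED, FAILED_UNCLEAN, or KILLED_UNCLEAN) before startNewTask(tip).
class TaskLauncherSketch extends Thread {
  private final List<Runnable> tasksToLaunch = new LinkedList<Runnable>();
  private int numFreeSlots;
  private static final int SLOTS_PER_TASK = 1; // the real code asks the task

  TaskLauncherSketch(int numSlots) {
    this.numFreeSlots = numSlots;
  }

  public synchronized void addToTaskQueue(Runnable tip) {
    tasksToLaunch.add(tip);
    notifyAll(); // wake the consumer loop
  }

  public synchronized void addFreeSlots(int n) { // called when a task finishes
    numFreeSlots += n;
    notifyAll();
  }

  @Override
  public void run() {
    try {
      while (true) {
        Runnable tip;
        synchronized (this) {
          while (tasksToLaunch.isEmpty()) {
            wait(); // block until a task is queued
          }
          tip = tasksToLaunch.remove(0); // take the first task
          while (numFreeSlots < SLOTS_PER_TASK) {
            wait(); // block until enough slots free up
          }
          numFreeSlots -= SLOTS_PER_TASK;
        }
        tip.run(); // stands in for startNewTask(tip)
      }
    } catch (InterruptedException e) {
      // launcher shutting down
    }
  }
}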
Step 6 above covered the various action types; among them, map tasks and reduce tasks are both launched through startNewTask(tip). For each TaskTracker.TaskInProgress this method starts a separate thread, whose run() method does the work below; if anything goes wrong during execution, the exception handler kills the tip and cleans up its associated data:
RunningJob rjob = localizeJob(tip);
tip.getTask().setJobFile(rjob.getLocalizedJobConf().toString());
// Localization is done. Neither rjob.jobConf nor rjob.ugi can be null
launchTaskForJob(tip, new JobConf(rjob.getJobConf()), rjob); // run the task
(1) localizeJob(tip) ensures that the job itself is localized first: the first tip of a job localizes the job, while subsequent tips only localize their task. It calls initializeJob(t, rjob, ttAddr) to localize the job, which downloads the JobToken and job.xml from HDFS to the local disk and then delegates the remaining work to TaskController.initializeJob (DefaultTaskController by default). That initializeJob creates some local directories, downloads job.jar to the local disk, and creates job-acls.xml to store the job's access-control information. Apart from job localization, this method does essentially no task-level initialization.
(2) launchTaskForJob(tip, new JobConf(rjob.getJobConf()), rjob) then runs the task by calling TaskTracker.TaskInProgress's launchTask(). If the task's state is one of UNASSIGNED, FAILED_UNCLEAN, or KILLED_UNCLEAN, localizeTask(task) is called to set up some task-specific configuration, and then a TaskRunner is created and started: a MapTaskRunner for a map task or a ReduceTaskRunner for a reduce task, though in both cases the launch is ultimately performed by the parent class's TaskRunner.run() method. TaskRunner is a thread class; its run() method is as follows:
@Override
public final void run() {
  String errorInfo = "Child Error";
  try {
    //before preparing the job localize
    //all the archives
    TaskAttemptID taskid = t.getTaskID();
    final LocalDirAllocator lDirAlloc = new LocalDirAllocator("mapred.local.dir");
    //simply get the location of the workDir and pass it to the child. The
    //child will do the actual dir creation
    final File workDir =
      new File(new Path(localdirs[rand.nextInt(localdirs.length)],
          TaskTracker.getTaskWorkDir(t.getUser(), taskid.getJobID().toString(),
              taskid.toString(),
              t.isTaskCleanupTask())).toString());

    String user = tip.getUGI().getUserName();

    // Set up the child task's configuration. After this call, no localization
    // of files should happen in the TaskTracker's process space. Any changes to
    // the conf object after this will NOT be reflected to the child.
    // setupChildTaskConfiguration(lDirAlloc);

    if (!prepare()) {
      return;
    }

    // Accumulates class paths for child.
    List<String> classPaths = getClassPaths(conf, workDir,
                                            taskDistributedCacheManager);

    long logSize = TaskLog.getTaskLogLength(conf);

    // Build exec child JVM args.
    Vector<String> vargs = getVMArgs(taskid, workDir, classPaths, logSize);

    tracker.addToMemoryManager(t.getTaskID(), t.isMapTask(), conf);

    // set memory limit using ulimit if feasible and necessary ...
    String setup = getVMSetupCmd();
    // Set up the redirection of the task's stdout and stderr streams
    File[] logFiles = prepareLogFiles(taskid, t.isTaskCleanupTask());
    File stdout = logFiles[0];
    File stderr = logFiles[1];
    tracker.getTaskTrackerInstrumentation().reportTaskLaunch(taskid, stdout,
        stderr);

    Map<String, String> env = new HashMap<String, String>();
    errorInfo = getVMEnvironment(errorInfo, user, workDir, conf, env, taskid,
                                 logSize);

    // flatten the env as a set of export commands
    List<String> setupCmds = new ArrayList<String>();
    for (Entry<String, String> entry : env.entrySet()) {
      StringBuffer sb = new StringBuffer();
      sb.append("export ");
      sb.append(entry.getKey());
      sb.append("=\"");
      sb.append(entry.getValue());
      sb.append("\"");
      setupCmds.add(sb.toString());
    }
    setupCmds.add(setup);

    launchJvmAndWait(setupCmds, vargs, stdout, stderr, logSize, workDir);
    tracker.getTaskTrackerInstrumentation().reportTaskEnd(t.getTaskID());
    if (exitCodeSet) {
      if (!killed && exitCode != 0) {
        if (exitCode == 65) {
          tracker.getTaskTrackerInstrumentation().taskFailedPing(t.getTaskID());
        }
        throw new IOException("Task process exit with nonzero status of " +
            exitCode + ".");
      }
    }
  } catch (FSError e) {
    LOG.fatal("FSError", e);
    try {
      tracker.fsErrorInternal(t.getTaskID(), e.getMessage());
    } catch (IOException ie) {
      LOG.fatal(t.getTaskID() + " reporting FSError", ie);
    }
  } catch (Throwable throwable) {
    LOG.warn(t.getTaskID() + " : " + errorInfo, throwable);
    Throwable causeThrowable = new Throwable(errorInfo, throwable);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    causeThrowable.printStackTrace(new PrintStream(baos));
    try {
      tracker.reportDiagnosticInfoInternal(t.getTaskID(), baos.toString());
    } catch (IOException e) {
      LOG.warn(t.getTaskID() + " Reporting Diagnostics", e);
    }
  } finally {
    // It is safe to call TaskTracker.TaskInProgress.reportTaskFinished with
    // *false* since the task has either
    // a) SUCCEEDED - which means commit has been done
    // b) FAILED - which means we do not need to commit
    tip.reportTaskFinished(false);
  }
}
The run() method mainly does preparation: getVMArgs builds the JVM argument list, getVMEnvironment collects the environment variables, and these are combined into the launch commands setupCmds; finally launchJvmAndWait(setupCmds, vargs, stdout, stderr, logSize, workDir) hands everything to the jvmManager object to start a JVM.
JvmManager manages all the JVMs in use on a TaskTracker, including starting, stopping, and killing them. Since map and reduce tasks generally consume different amounts of resources, JvmManager uses mapJvmManager and reduceJvmManager to manage the JVMs of the two task types separately, subject to the following constraints:
A. For each task type, the number of slots in use must not exceed this TaskTracker's maximum slot count for that type;
B. Each JVM can run only one task at a time;
C. A JVM may be reused, but only a limited number of times, and only by tasks of the same type from the same job.
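The split into two per-type managers shows up directly in how launchJvm dispatches. The following sketch reflects my reading of the 1.x JvmManager; it is simplified and the exact signature is an assumption from memory:

// Dispatch by task type: map and reduce JVMs are tracked by two separate
// per-type managers, each bounded by that type's slot count (constraint A).
public void launchJvm(TaskRunner t, JvmEnv env)
    throws IOException, InterruptedException {
  if (t.getTask().isMapTask()) {
    mapJvmManager.reapJvm(t, env);    // per-type manager for map slots
  } else {
    reduceJvmManager.reapJvm(t, env); // per-type manager for reduce slots
  }
}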
launchJvmAndWait calls jvmManager.launchJvm(this, jvmManager.constructJvmEnv(setup, vargs, stdout, stderr, logSize, workDir, conf)) to launch the task. Based on the task's type, it selects reapJvm(t, env) on either mapJvmManager or reduceJvmManager to start the JVM; both managers share the same method, whose code is as follows:
private synchronized void reapJvm(
    TaskRunner t, JvmEnv env) throws IOException, InterruptedException {
  if (t.getTaskInProgress().wasKilled()) {
    //the task was killed in-flight
    //no need to do the rest of the operations
    return;
  }
  boolean spawnNewJvm = false;
  JobID jobId = t.getTask().getJobID();
  //Check whether there is a free slot to start a new JVM.
  //,or, Kill a (idle) JVM and launch a new one
  //When this method is called, we *must*
  // (1) spawn a new JVM (if we are below the max)
  // (2) find an idle JVM (that belongs to the same job), or,
  // (3) kill an idle JVM (from a different job)
  // (the order of return is in the order above)
  int numJvmsSpawned = jvmIdToRunner.size();
  JvmRunner runnerToKill = null;
  if (numJvmsSpawned >= maxJvms) {
    //go through the list of JVMs for all jobs.
    Iterator<Map.Entry<JVMId, JvmRunner>> jvmIter =
      jvmIdToRunner.entrySet().iterator();

    while (jvmIter.hasNext()) {
      JvmRunner jvmRunner = jvmIter.next().getValue();
      JobID jId = jvmRunner.jvmId.getJobId();
      //look for a free JVM for this job; if one exists then just break
      if (jId.equals(jobId) && !jvmRunner.isBusy() && !jvmRunner.ranAll()) {
        setRunningTaskForJvm(jvmRunner.jvmId, t); //reserve the JVM
        LOG.info("No new JVM spawned for jobId/taskid: " +
                 jobId + "/" + t.getTask().getTaskID() +
                 ". Attempting to reuse: " + jvmRunner.jvmId);
        return;
      }
      //Cases when a JVM is killed:
      // (1) the JVM under consideration belongs to the same job
      //     (passed in the argument). In this case, kill only when
      //     the JVM ran all the tasks it was scheduled to run (in terms
      //     of count).
      // (2) the JVM under consideration belongs to a different job and is
      //     currently not busy
      //But in both the above cases, we see if we can assign the current
      //task to an idle JVM (hence we continue the loop even on a match)
      if ((jId.equals(jobId) && jvmRunner.ranAll()) ||
          (!jId.equals(jobId) && !jvmRunner.isBusy())) {
        runnerToKill = jvmRunner;
        spawnNewJvm = true;
      }
    }
  } else {
    spawnNewJvm = true;
  }

  if (spawnNewJvm) {
    if (runnerToKill != null) {
      LOG.info("Killing JVM: " + runnerToKill.jvmId);
      killJvmRunner(runnerToKill);
    }
    spawnNewJvm(jobId, env, t); // this is where the Child process is launched
    return;
  }
  //*MUST* never reach this
  LOG.fatal("Inconsistent state!!! " +
            "JVM Manager reached an unstable state " +
            "while reaping a JVM for task: " + t.getTask().getTaskID() +
            " " + getDetails() + ". Aborting. ");
  System.exit(-1);
}
A. First check whether the number of JVMs already started is below the slot limit for the corresponding type (map or reduce); if so, start a new JVM directly, otherwise go to B;
B. Scan all started JVMs (jvmIdToRunner) for one that satisfies: (1) it is currently idle (jvmRunner.isBusy() is false); (2) its reuse count has not reached the limit (jvmRunner.ranAll() is false); (3) it belongs to the same job as the task to be launched (jId.equals(jobId)). Such a JVM can be reused directly, with no need to start a new one; it is reserved via setRunningTaskForJvm(jvmRunner.jvmId, t).
C. Scan all started JVMs on this TaskTracker for one that satisfies either of: (1) its reuse count has reached the limit and it belongs to the same job as the new task; or (2) it is currently idle but belongs to a different job. That JVM is killed via killJvmRunner(runnerToKill), and a new JVM is started.
spawnNewJvm(jobId, env, t) creates a JvmRunner thread, adds it to jvmIdToRunner, calls setRunningTaskForJvm to update some data structures, and starts the JvmRunner. Its run() method simply calls runChild(env), whose code is as follows:
public void runChild(JvmEnv env) throws IOException, InterruptedException {
  int exitCode = 0;
  try {
    env.vargs.add(Integer.toString(jvmId.getId()));
    TaskRunner runner = jvmToRunningTask.get(jvmId);
    if (runner != null) {
      Task task = runner.getTask();
      //Launch the task controller to run task JVM
      String user = task.getUser();
      TaskAttemptID taskAttemptId = task.getTaskID();
      String taskAttemptIdStr = task.isTaskCleanupTask() ?
          (taskAttemptId.toString() + TaskTracker.TASK_CLEANUP_SUFFIX) :
          taskAttemptId.toString();
      // DefaultTaskController by default; this is where the task is executed
      exitCode = tracker.getTaskController().launchTask(user,
          jvmId.jobId.toString(), taskAttemptIdStr, env.setup,
          env.vargs, env.workDir, env.stdout.toString(),
          env.stderr.toString());
    }
  } catch (IOException ioe) {
    // do nothing
    // error and output are appropriately redirected
  } finally { // handle the exit code
    // although the process has exited before we get here,
    // make sure the entire process group has also been killed.
    kill();
    updateOnJvmExit(jvmId, exitCode);
    LOG.info("JVM : " + jvmId + " exited with exit code " + exitCode
             + ". Number of tasks it ran: " + numTasksRan);
    deleteWorkDir(tracker, firstTask);
  }
}
The most important call is tracker.getTaskController().launchTask (DefaultTaskController by default); its code is as follows:
/**
 * Create all of the directories for the task and launches the child jvm.
 * @param user the user name
 * @param attemptId the attempt id
 * @throws IOException
 */
@Override
public int launchTask(String user,
                      String jobId,
                      String attemptId,
                      List<String> setup,
                      List<String> jvmArguments,
                      File currentWorkDirectory,
                      String stdout,
                      String stderr) throws IOException {
  ShellCommandExecutor shExec = null;
  try {
    FileSystem localFs = FileSystem.getLocal(getConf());

    //create the attempt dirs
    new Localizer(localFs,
        getConf().getStrings(JobConf.MAPRED_LOCAL_DIR_PROPERTY)).
        initializeAttemptDirs(user, jobId, attemptId);

    // create the working-directory of the task
    if (!currentWorkDirectory.mkdir()) {
      throw new IOException("Mkdirs failed to create "
                            + currentWorkDirectory.toString());
    }

    //mkdir the loglocation
    String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
    if (!localFs.mkdirs(new Path(logLocation))) {
      throw new IOException("Mkdirs failed to create "
                            + logLocation);
    }
    //read the configuration for the job
    FileSystem rawFs = FileSystem.getLocal(getConf()).getRaw();
    long logSize = 0; //TODO MAPREDUCE-1100
    // get the JVM command line.
    String cmdLine =
      TaskLog.buildCommandLine(setup, jvmArguments,
          new File(stdout), new File(stderr), logSize, true);

    // write the command to a file in the
    // task specific cache directory
    // TODO copy to user dir
    Path p = new Path(allocator.getLocalPathForWrite(
        TaskTracker.getPrivateDirTaskScriptLocation(user, jobId, attemptId),
        getConf()), COMMAND_FILE); // COMMAND_FILE is the "taskjvm.sh" file

    // write the command into "taskjvm.sh"; p is the file path
    String commandFile = writeCommand(cmdLine, rawFs, p);
    rawFs.setPermission(p, TaskController.TASK_LAUNCH_SCRIPT_PERMISSION);
    shExec = new ShellCommandExecutor(new String[]{
        "bash", "-c", commandFile},
        currentWorkDirectory);
    shExec.execute();
  } catch (Exception e) {
    if (shExec == null) {
      return -1;
    }
    int exitCode = shExec.getExitCode();
    LOG.warn("Exit code from task is : " + exitCode);
    LOG.info("Output from DefaultTaskController's launchTask follows:");
    logOutput(shExec.getOutput());
    return exitCode;
  }
  return 0;
}
launchTask first creates the task's working directory on disk, then writes the task launch command into the shell script "taskjvm.sh", and constructs a ShellCommandExecutor whose execute() method runs the command "bash -c taskjvm.sh" through a ProcessBuilder, thereby starting a JVM to execute the task. The script ultimately launches an org.apache.hadoop.mapred.Child class to run the task. Its main method is rather long; the code is as follows:
// The real map and reduce tasks both run inside the Child process;
// the main logic of Child's main function is as follows
public static void main(String[] args) throws Throwable {
  LOG.debug("Child starting");
  // create the RPC client and start the log-sync thread
  final JobConf defaultConf = new JobConf();
  String host = args[0];
  int port = Integer.parseInt(args[1]);
  final InetSocketAddress address = NetUtils.makeSocketAddr(host, port);
  final TaskAttemptID firstTaskid = TaskAttemptID.forName(args[2]);
  final String logLocation = args[3];
  final int SLEEP_LONGER_COUNT = 5;
  int jvmIdInt = Integer.parseInt(args[4]);
  JVMId jvmId = new JVMId(firstTaskid.getJobID(), firstTaskid.isMap(), jvmIdInt);
  String prefix = firstTaskid.isMap() ? "MapTask" : "ReduceTask";

  cwd = System.getenv().get(TaskRunner.HADOOP_WORK_DIR);
  if (cwd == null) {
    throw new IOException("Environment variable " +
                          TaskRunner.HADOOP_WORK_DIR + " is not set");
  }

  // file name is passed thru env
  String jobTokenFile =
    System.getenv().get(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION);
  Credentials credentials =
    TokenCache.loadTokens(jobTokenFile, defaultConf);
  LOG.debug("loading token. # keys =" + credentials.numberOfSecretKeys() +
            "; from file=" + jobTokenFile);

  Token<JobTokenIdentifier> jt = TokenCache.getJobToken(credentials);
  SecurityUtil.setTokenService(jt, address);
  UserGroupInformation current = UserGroupInformation.getCurrentUser();
  current.addToken(jt);

  UserGroupInformation taskOwner
    = UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
  taskOwner.addToken(jt);

  // Set the credentials
  defaultConf.setCredentials(credentials);

  final TaskUmbilicalProtocol umbilical =
    taskOwner.doAs(new PrivilegedExceptionAction<TaskUmbilicalProtocol>() {
      @Override
      public TaskUmbilicalProtocol run() throws Exception {
        return (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
            TaskUmbilicalProtocol.versionID,
            address,
            defaultConf);
      }
    });

  int numTasksToExecute = -1; //-1 signifies "no limit"
  int numTasksExecuted = 0;
  Runtime.getRuntime().addShutdownHook(new Thread() {
    public void run() {
      try {
        if (taskid != null) {
          TaskLog.syncLogs
            (logLocation, taskid, isCleanup, currentJobSegmented);
        }
      } catch (Throwable throwable) {
      }
    }
  });
  Thread t = new Thread() {
    public void run() {
      //every so often wake up and syncLogs so that we can track
      //logs of the currently running task
      while (true) {
        try {
          Thread.sleep(5000);
          if (taskid != null) {
            TaskLog.syncLogs
              (logLocation, taskid, isCleanup, currentJobSegmented);
          }
        } catch (InterruptedException ie) {
        } catch (IOException iee) {
          LOG.error("Error in syncLogs: " + iee);
          System.exit(-1);
        }
      }
    }
  };
  t.setName("Thread for syncLogs");
  t.setDaemon(true);
  t.start();

  String pid = "";
  if (!Shell.WINDOWS) {
    pid = System.getenv().get("JVM_PID");
  }
  JvmContext context = new JvmContext(jvmId, pid);
  int idleLoopCount = 0;
  Task task = null;

  UserGroupInformation childUGI = null;

  final JvmContext jvmContext = context;
  try {
    while (true) { // keep polling the TaskTracker for new tasks
      taskid = null;
      currentJobSegmented = true;
      // obtain a JvmTask object from the TaskTracker over the network
      JvmTask myTask = umbilical.getTask(context); // fetch a new task
      if (myTask.shouldDie()) { // the JVM's job no longer exists or was killed
        break;
      } else {
        if (myTask.getTask() == null) { // no new task for the moment
          taskid = null;
          currentJobSegmented = true;
          // wait a while, then ask the TaskTracker again
          if (++idleLoopCount >= SLEEP_LONGER_COUNT) {
            //we sleep for a bigger interval when we don't receive
            //tasks for a while
            Thread.sleep(1500);
          } else {
            Thread.sleep(500);
          }
          continue;
        }
      }
      // got a new task; localize it
      idleLoopCount = 0;
      task = myTask.getTask();
      task.setJvmContext(jvmContext);
      taskid = task.getTaskID();

      // Create the JobConf and determine if this job gets segmented task logs
      final JobConf job = new JobConf(task.getJobFile());
      currentJobSegmented = logIsSegmented(job);

      isCleanup = task.isTaskCleanupTask();
      // reset the statistics for the task
      FileSystem.clearStatistics();

      // Set credentials
      job.setCredentials(defaultConf.getCredentials());
      //forcefully turn off caching for localfs. All cached FileSystems
      //are closed during the JVM shutdown. We do certain
      //localfs operations in the shutdown hook, and we don't
      //want the localfs to be "closed"
      job.setBoolean("fs.file.impl.disable.cache", false);

      // set the jobTokenFile into task
      task.setJobTokenSecret(JobTokenSecretManager.
                             createSecretKey(jt.getPassword()));

      // setup the child's mapred-local-dir. The child is now sandboxed and
      // can only see files down and under attemtdir only.
      TaskRunner.setupChildMapredLocalDirs(task, job);

      // setup the child's attempt directories
      localizeTask(task, job, logLocation);

      //setupWorkDir actually sets up the symlinks for the distributed
      //cache. After a task exits we wipe the workdir clean, and hence
      //the symlinks have to be rebuilt.
      TaskRunner.setupWorkDir(job, new File(cwd));

      //create the index file so that the log files
      //are viewable immediately
      TaskLog.syncLogs
        (logLocation, taskid, isCleanup, logIsSegmented(job));

      numTasksToExecute = job.getNumTasksToExecutePerJvm();
      assert(numTasksToExecute != 0);

      task.setConf(job);

      // Initiate Java VM metrics
      initMetrics(prefix, jvmId.toString(), job.getSessionId());

      LOG.debug("Creating remote user to execute task: " + job.get("user.name"));
      childUGI = UserGroupInformation.createRemoteUser(job.get("user.name"));
      // Add tokens to new user so that it may execute its task correctly.
      for (Token<?> token : UserGroupInformation.getCurrentUser().getTokens()) {
        childUGI.addToken(token);
      }

      // Create a final reference to the task for the doAs block
      final Task taskFinal = task;
      childUGI.doAs(new PrivilegedExceptionAction<Object>() {
        @Override
        public Object run() throws Exception {
          try {
            // use job-specified working directory
            FileSystem.get(job).setWorkingDirectory(job.getWorkingDirectory());
            taskFinal.run(job, umbilical); // run the task
          } finally {
            TaskLog.syncLogs
              (logLocation, taskid, isCleanup, logIsSegmented(job));
            TaskLogsTruncater trunc = new TaskLogsTruncater(defaultConf);
            trunc.truncateLogs(new JVMInfo(
                TaskLog.getAttemptDir(taskFinal.getTaskID(),
                    taskFinal.isTaskCleanupTask()), Arrays.asList(taskFinal)));
          }

          return null;
        }
      });
      // if the JVM's reuse count has reached the limit, exit
      if (numTasksToExecute > 0 && ++numTasksExecuted == numTasksToExecute) {
        break;
      }
    }
  } catch (FSError e) {
    LOG.fatal("FSError from child", e);
    umbilical.fsError(taskid, e.getMessage(), jvmContext);
  } catch (Exception exception) {
    LOG.warn("Error running child", exception);
    try {
      if (task != null) {
        // do cleanup for the task
        if (childUGI == null) {
          task.taskCleanup(umbilical);
        } else {
          final Task taskFinal = task;
          childUGI.doAs(new PrivilegedExceptionAction<Object>() {
            @Override
            public Object run() throws Exception {
              taskFinal.taskCleanup(umbilical);
              return null;
            }
          });
        }
      }
    } catch (Exception e) {
      LOG.info("Error cleaning up", e);
    }
    // Report back any failures, for diagnostic purposes
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    exception.printStackTrace(new PrintStream(baos));
    if (taskid != null) {
      umbilical.reportDiagnosticInfo(taskid, baos.toString(), jvmContext);
    }
  } catch (Throwable throwable) {
    LOG.fatal("Error running child : "
              + StringUtils.stringifyException(throwable));
    if (taskid != null) {
      Throwable tCause = throwable.getCause();
      String cause = tCause == null
                     ? throwable.getMessage()
                     : StringUtils.stringifyException(tCause);
      umbilical.fatalError(taskid, cause, jvmContext);
    }
  } finally {
    RPC.stopProxy(umbilical);
    shutdownMetrics();
    // Shutting down log4j of the child-vm...
    // This assumes that on return from Task.run()
    // there is no more logging done.
    LogManager.shutdown();
  }
}
The task-localization work in the code above includes: (1) adding task-specific configuration parameters to the job's JobConf, overriding any identically named ones, to form the task's own JobConf, and choosing a directory in round-robin fashion to store that task's configuration file; in other words, a task's configuration consists of two parts, the job's JobConf plus the task's own specific parameters; (2) creating links in the working directory to all the data files in the distributed cache, so those files can be used directly. taskFinal.run(job, umbilical) then invokes the run method of the corresponding MapTask or ReduceTask, which we will analyze later.
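The "job conf plus task-specific overrides" idea in point (1) can be illustrated with the JobConf copy constructor. This is an illustration only, not the actual localization code; the two property names are ones I believe 1.x task localization sets, so treat them as assumptions:

import org.apache.hadoop.mapred.JobConf;

public class TaskConfOverlay {
  // Form a task-level conf from the job-level conf: copy the job's settings,
  // then overwrite with task-specific values (same-name properties are
  // overridden, as described above).
  public static JobConf taskConf(JobConf jobConf, String attemptId,
                                 boolean isMap) {
    JobConf taskConf = new JobConf(jobConf);          // start from the job's conf
    taskConf.set("mapred.task.id", attemptId);        // assumed property name
    taskConf.setBoolean("mapred.task.is.map", isMap); // assumed property name
    return taskConf;
  }
}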
Branches A and C of the reapJvm method above start a new JVM, while B reuses an old one; how, then, does a reused JVM execute new tasks? The answer lies in Child's main method. int jvmIdInt = Integer.parseInt(args[4]) parses an id that the parent process generated when it first created the JvmRunner; it is a random integer which, together with the jobID, identifies the particular process running tasks of a particular job. The while loop in main then keeps calling JvmTask myTask = umbilical.getTask(context), which on the TaskTracker side goes through jvmManager.getTaskForJvm(jvmId) to fetch a new task assigned to that specific JVM; this is how a reused JVM comes to execute new tasks.
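Here is a hedged sketch of the TaskTracker side of umbilical.getTask(context), reconstructed from memory of the 1.x source; validation, logging, and some checks are omitted, and the exact details should be treated as assumptions:

// Simplified sketch: the JVMId in the context identifies the calling child
// process; the returned JvmTask tells it to run a task, idle, or exit.
public synchronized JvmTask getTask(JvmContext context) throws IOException {
  JVMId jvmId = context.jvmId;
  if (!jvmManager.isJvmKnown(jvmId)) {
    return new JvmTask(null, true);           // unknown JVM: tell it to die
  }
  RunningJob rjob = runningJobs.get(jvmId.getJobId());
  if (rjob == null) {
    return new JvmTask(null, true);           // job gone or killed: die
  }
  TaskInProgress tip = jvmManager.getTaskForJvm(jvmId);
  if (tip == null) {
    return new JvmTask(null, false);          // no task yet: child sleeps, retries
  }
  return new JvmTask(tip.getTask(), false);   // hand the child its next task
}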
So far we have gained a preliminary understanding of how the TaskTracker receives the JobTracker's heartbeat response and launches each type of task; the next step is the execution of map and reduce.
References:
1. Dong Xicheng (董西成), 《Hadoop技术内幕:深入解析MapReduce架构设计与实现原理》 (Hadoop Internals: In-Depth Analysis of MapReduce Architecture Design and Implementation Principles).
2. http://guoyunsky.iteye.com/blog/1729457 , which has an explanation of JVM reuse.