YARN-SLS中NM和AM如何进入调度
RM是scheduler的执行者:
SLSrunner.java
private void startRM() throws ClassNotFoundException, YarnException {
Configuration rmConf = new YarnConfiguration(getConf());
String schedulerClass = rmConf.get(YarnConfiguration.RM_SCHEDULER);
if (Class.forName(schedulerClass) == CapacityScheduler.class) {
rmConf.set(YarnConfiguration.RM_SCHEDULER,
SLSCapacityScheduler.class.getName());
rmConf.setBoolean(YarnConfiguration.RM_SCHEDULER_ENABLE_MONITORS, true);
rmConf.set(YarnConfiguration.RM_SCHEDULER_MONITOR_POLICIES,
ProportionalCapacityPreemptionPolicy.class.getName());
} else if (Class.forName(schedulerClass) == FairScheduler.class) {
rmConf.set(YarnConfiguration.RM_SCHEDULER,
SLSFairScheduler.class.getName());
} else if (Class.forName(schedulerClass) == FifoScheduler.class) {
// TODO add support for FifoScheduler
throw new YarnException("Fifo Scheduler is not supported yet.");
}
那么NM和AM如何加入调度呢?
SLSrunner.java->startNM()
// create NM simulators
Random random = new Random();
Set<String> rackSet = new ConcurrentHashSet<>();
int threadPoolSize = Math.max(poolSize,
SLSConfiguration.RUNNER_POOL_SIZE_DEFAULT);
ExecutorService executorService = Executors.
newFixedThreadPool(threadPoolSize);
for (Map.Entry<String, Resource> entry : nodeResourceMap.entrySet()) {
executorService.submit(new Runnable() {
@Override public void run() {
try {
// we randomize the heartbeat start time from zero to 1 interval
NMSimulator nm = new NMSimulator();
Resource nmResource = nodeManagerResource;
String hostName = entry.getKey();
if (entry.getValue() != null) {
nmResource = entry.getValue();
}
nm.init(hostName, nmResource,
random.nextInt(heartbeatInterval),
heartbeatInterval, rm, resourceUtilizationRatio);
nmMap.put(nm.getNode().getNodeID(), nm);
runner.schedule(nm);
rackSet.add(nm.getNode().getRackName());
通过runner也就是TaskRunner类中的schedule方法加入调度。
AM:
SLSrunner.java->startNM()->....->runNewAM()
private void runNewAM(String jobType, String user,
String jobQueue, String oldJobId, long jobStartTimeMS,
long jobFinishTimeMS, List<ContainerSimulator> containerList,
ReservationId reservationId, long deadline, Resource amContainerResource,
Map<String, String> params) {
AMSimulator amSim = (AMSimulator) ReflectionUtils.newInstance(
amClassMap.get(jobType), new Configuration());
if (amSim != null) {
int heartbeatInterval = getConf().getInt(
SLSConfiguration.AM_HEARTBEAT_INTERVAL_MS,
SLSConfiguration.AM_HEARTBEAT_INTERVAL_MS_DEFAULT);
boolean isTracked = trackedApps.contains(oldJobId);
if (oldJobId == null) {
oldJobId = Integer.toString(AM_ID);
}
AM_ID++;
amSim.init(heartbeatInterval, containerList, rm, this, jobStartTimeMS,
jobFinishTimeMS, user, jobQueue, isTracked, oldJobId,
runner.getStartTimeMS(), amContainerResource, params);
if(reservationId != null) {
// if we have a ReservationId, delegate reservation creation to
// AMSim (reservation shape is impl specific)
UTCClock clock = new UTCClock();
amSim.initReservation(reservationId, deadline, clock.getTime());
}
runner.schedule(amSim);
maxRuntime = Math.max(maxRuntime, jobFinishTimeMS);
numTasks += containerList.size();
amMap.put(oldJobId, amSim);
那么TaskRunner类中的schedule方法长啥样?
@SuppressWarnings("unchecked")
private void schedule(Task task, long timeNow) {
task.timeRebase(timeNow);
task.setQueue(queue);
queue.add(task);
}
public void schedule(Task task) {
schedule(task, System.currentTimeMillis());
}
可以看到,加入了queue。queue是DelayQueue类的具体对象。
DelayQueue类:
DelayQueue是一个无界阻塞队列,只有在延迟期满时才能从中提取元素。该队列的头部是延迟期满后保存时间最长的Delayed 元素。
DelayQueue阻塞队列在我们系统开发中也常常会用到,例如:缓存系统的设计,缓存中的对象,超过了空闲时间,需要从缓存中移出;任务调度系统,能够准确的把握任务的执行时间。我们可能需要通过线程处理很多时间上要求很严格的数据,如果使用普通的线程,我们就需要遍历所有的对象,一个一个的检 查看数据是否过期等,首先这样在执行上的效率不会太高,其次就是这种设计的风格也大大的影响了数据的精度。一个需要12:00点执行的任务可能12:01 才执行,这样对数据要求很高的系统有更大的弊端。由此我们可以使用DelayQueue。
在sls定义的job和container中,都是定时的,因此很适合采用DelayQueue模式进行调度。
posted on 2018-07-12 23:15 sichenzhao 阅读(256) 评论(0) 编辑 收藏 举报