逖靖寒's World

Using CapacityTaskScheduler

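The Hadoop version used here is 0.19.2.

For a detailed description of this scheduler, see: http://hadoop.apache.org/common/docs/r0.19.2/capacity_scheduler.html

This article only covers how to set up a system that uses CapacityTaskScheduler.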

Perform the following steps on the master machine:

1 Copy contrib/capacity-scheduler/hadoop-0.19.2-capacity-scheduler.jar into the lib directory (note: if a FairScheduler jar is present there, delete it first).

2 Add the following to hadoop-site.xml:

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>

<property>
  <name>mapred.queue.names</name>
  <value>logQueue1,logQueue2,algQueue1,algQueue2,default</value>
</property>

3 Put the following in capacity-scheduler.xml:

<?xml version="1.0"?>
<configuration>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue1.guaranteed-capacity</name>
        <value>20</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue1.reclaim-time-limit</name>
        <value>5</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue1.supports-priority</name>
        <value>true</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue2.guaranteed-capacity</name>
        <value>20</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue2.reclaim-time-limit</name>
        <value>5</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.logQueue2.supports-priority</name>
        <value>true</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue1.guaranteed-capacity</name>
        <value>20</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue1.reclaim-time-limit</name>
        <value>5</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue1.supports-priority</name>
        <value>true</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue2.guaranteed-capacity</name>
        <value>20</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue2.reclaim-time-limit</name>
        <value>5</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.algQueue2.supports-priority</name>
        <value>true</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.default.guaranteed-capacity</name>
        <value>20</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.default.reclaim-time-limit</name>
        <value>5</value>
    </property>

    <property>
        <name>mapred.capacity-scheduler.queue.default.supports-priority</name>
        <value>true</value>
    </property>

</configuration>

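4 Restart Hadoop so that the JobTracker picks up the new scheduler and queue settings.

5 In the job's code, set the queue that the job belongs to: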

conf.setQueueName("QueueName");
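
As a sketch of an alternative to hard-coding the queue: in Hadoop 0.19, JobConf.setQueueName stores its argument under the mapred.job.queue.name property, so the same setting can live in a job configuration file. The file name job-queue.xml is hypothetical, and logQueue1 is just one of the queues defined in step 2.

```xml
<?xml version="1.0"?>
<!-- job-queue.xml: hypothetical file name, passed to the job at submit time -->
<configuration>

    <!-- Same effect as conf.setQueueName("logQueue1") in the job code -->
    <property>
        <name>mapred.job.queue.name</name>
        <value>logQueue1</value>
    </property>

</configuration>
```

This assumes the job is launched through ToolRunner / GenericOptionsParser, which accepts a -conf <file> option; a job that builds its JobConf by hand would not read the file. The value has to be one of the names listed in mapred.queue.names, and a job that sets no queue at all is submitted to the queue named default.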

After these five steps, the scheduler is configured.

The JobTracker web interface then shows the following:

Scheduling Information

Queue Name
Scheduling Information

algQueue1
Guaranteed Capacity (%) : 20.0
Guaranteed Capacity Maps : 7
Guaranteed Capacity Reduces : 7
User Limit : 100
Reclaim Time limit : 5
Number of Running Maps : 0
Number of Running Reduces : 0
Number of Waiting Maps : 0
Number of Waiting Reduces : 0
Priority Supported : YES

algQueue2
Guaranteed Capacity (%) : 20.0
Guaranteed Capacity Maps : 7
Guaranteed Capacity Reduces : 7
User Limit : 100
Reclaim Time limit : 5
Number of Running Maps : 0
Number of Running Reduces : 0
Number of Waiting Maps : 0
Number of Waiting Reduces : 0
Priority Supported : YES

default
Guaranteed Capacity (%) : 20.0
Guaranteed Capacity Maps : 7
Guaranteed Capacity Reduces : 7
User Limit : 100
Reclaim Time limit : 0
Number of Running Maps : 0
Number of Running Reduces : 0
Number of Waiting Maps : 0
Number of Waiting Reduces : 0
Priority Supported : YES

logQueue1
Guaranteed Capacity (%) : 20.0
Guaranteed Capacity Maps : 7
Guaranteed Capacity Reduces : 7
User Limit : 100
Reclaim Time limit : 5
Number of Running Maps : 0
Number of Running Reduces : 0
Number of Waiting Maps : 0
Number of Waiting Reduces : 0
Priority Supported : YES

logQueue2
Guaranteed Capacity (%) : 20.0
Guaranteed Capacity Maps : 7
Guaranteed Capacity Reduces : 7
User Limit : 100
Reclaim Time limit : 5
Number of Running Maps : 0
Number of Running Reduces : 0
Number of Waiting Maps : 0
Number of Waiting Reduces : 0
Priority Supported : YES
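
The User Limit : 100 rows above are not set anywhere in the capacity-scheduler.xml from step 3. As far as I can tell, they correspond to the per-queue minimum-user-limit-percent property described in the 0.19.2 scheduler documentation linked at the top, where 100 means that no per-user limit is imposed. A minimal sketch of setting it explicitly for one queue, assuming that property name; the value 50 is only an illustration:

```xml
<!-- Goes inside <configuration> in capacity-scheduler.xml -->
<!-- Assumed property name; 50 is an illustrative value, not from the original setup -->
<property>
    <name>mapred.capacity-scheduler.queue.logQueue1.minimum-user-limit-percent</name>
    <value>50</value>
</property>
```

With a value of 50, once two or more users have jobs competing in logQueue1, no single user should get more than about half of that queue's capacity. Like the other settings, it would only take effect after Hadoop is restarted.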

Posted on 2010-01-27 21:45 by 逖靖寒