Enterprise Search Engine Development: The Connector (Part 14)

Looking back at the start method of the Context class, one part remains: the method that starts the scheduler.

/**
   * Start up the Scheduler.
   */
  private void startScheduler() {
    traversalScheduler =
        (TraversalScheduler) getRequiredBean("TraversalScheduler",
            TraversalScheduler.class);
    if (traversalScheduler != null) {
      traversalScheduler.init();
    }
  }

In other words, it calls init() on the TraversalScheduler object, which in turn starts the scheduler thread.

TraversalScheduler implements the Runnable interface, so its run() method executes on a thread of its own. Its source code is as follows:

/**
 * Scheduler that schedules connector traversal.  This class is thread safe.
 * Must initialize TraversalScheduler before running it.
 *
 * <p> This facility includes a schedule thread that runs a loop.
 * Each iteration it asks the instantiator for the schedule
 * for each Connector Instance and runs batches for those that
 * are
 * <OL>
 * <LI> scheduled to run.
 * <LI> have not exhausted their quota for the current time interval.
 * <LI> are not currently running.
 * </OL>
 * The implementation must handle the situation that a Connector
 * Instance is running.
 */
public class TraversalScheduler implements Runnable {
  public static final String SCHEDULER_CURRENT_TIME = "/Scheduler/currentTime";

  private static final Logger LOGGER =
    Logger.getLogger(TraversalScheduler.class.getName());

  private final Instantiator instantiator;

  private boolean isInitialized; // Protected by instance lock.
  private boolean isShutdown; // Protected by instance lock.

  /**
   * Create a scheduler object.
   *
   * @param instantiator used to get schedule for connector instances
   */
  public TraversalScheduler(Instantiator instantiator) {
    this.instantiator = instantiator;
    this.isInitialized = false;
    this.isShutdown = false;
  }

  public synchronized void init() {
    if (isInitialized) {
      return;
    }
    isInitialized = true;
    isShutdown = false;
    new Thread(this, "TraversalScheduler").start();
  }

  public synchronized void shutdown() {
    if (isShutdown) {
      return;
    }
    isInitialized = false;
    isShutdown = true;
  }

  /**
   * Determines whether scheduler should run.
   *
   * @return true if we are in a running state and scheduler should run or
   *         continue running.
   */
  private synchronized boolean isRunningState() {
    return isInitialized && !isShutdown;
  }

  private void scheduleBatches() {
    for (String connectorName : instantiator.getConnectorNames()) {
      NDC.pushAppend(connectorName);
      try {
        instantiator.getConnectorCoordinator(connectorName).startBatch();
      } catch (ConnectorNotFoundException e) {
        // Looks like the connector just got deleted.  Don't schedule it.
      } finally {
        NDC.pop();
      }
    }
  }

  public void run() {
    NDC.push("Traverse");
    try {
      while (true) {
        try {
          if (!isRunningState()) {
            LOGGER.info("TraversalScheduler thread is stopping due to "
                + "shutdown or not being initialized.");
            return;
          }
          scheduleBatches();
          // Give someone else a chance to run.
          try {
            synchronized (this) {
              wait(1000);
            }
          } catch (InterruptedException e) {
            // May have been interrupted for shutdown.
          }
        } catch (Throwable t) {
          LOGGER.log(Level.SEVERE,
              "TraversalScheduler caught unexpected Throwable: ", t);
        }
      }
    } finally {
      NDC.remove();
    }
  }
}

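TraversalScheduler depends on Instantiator: it asks the Instantiator for all connector names, obtains the coordinator for each one (a ConnectorCoordinatorImpl object), and calls its startBatch() method. If a connector has just been deleted, the resulting ConnectorNotFoundException is swallowed and that connector is simply skipped.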

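The run() method is an endless loop that keeps polling scheduleBatches(), waiting up to 1000 ms between iterations, and returns only once the scheduler has been shut down or is no longer initialized.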
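The init/shutdown/poll pattern described above can be sketched in isolation. This is only a minimal illustration, not the Connector Manager's code: the class name PollingSchedulerSketch, the awaitStop helper, and the task list that stands in for the per-connector startBatch() calls are all hypothetical. It also assumes one deliberate difference from the quoted source: shutdown() calls notifyAll() so the loop wakes immediately instead of sleeping out its wait.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the init/shutdown/poll pattern; not the real class. */
final class PollingSchedulerSketch implements Runnable {
  private final List<Runnable> tasks; // Stand-ins for per-connector startBatch() calls.
  private final long pollMillis;      // Must be positive; wait(0) would block forever.
  private boolean isInitialized;      // Protected by instance lock.
  private boolean isShutdown;         // Protected by instance lock.
  private Thread thread;              // Protected by instance lock.

  PollingSchedulerSketch(List<Runnable> tasks, long pollMillis) {
    this.tasks = tasks;
    this.pollMillis = pollMillis;
  }

  synchronized void init() {
    if (isInitialized) {
      return;
    }
    isInitialized = true;
    isShutdown = false;
    thread = new Thread(this, "PollingSchedulerSketch");
    thread.start();
  }

  synchronized void shutdown() {
    if (isShutdown) {
      return;
    }
    isInitialized = false;
    isShutdown = true;
    notifyAll(); // Assumption: wake the loop early; the quoted source does not notify.
  }

  private synchronized boolean isRunningState() {
    return isInitialized && !isShutdown;
  }

  /** Waits up to millis for the polling thread to exit; returns true if it stopped. */
  boolean awaitStop(long millis) {
    Thread t;
    synchronized (this) {
      t = thread;
    }
    if (t == null) {
      return true;
    }
    try {
      t.join(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !t.isAlive();
  }

  @Override
  public void run() {
    while (true) {
      if (!isRunningState()) {
        return;
      }
      for (Runnable task : tasks) {
        try {
          task.run();
        } catch (RuntimeException e) {
          // One failing task must not stop the polling loop.
        }
      }
      synchronized (this) {
        // Re-check under the lock so a shutdown() issued meanwhile is not missed.
        if (!isRunningState()) {
          return;
        }
        try {
          wait(pollMillis);
        } catch (InterruptedException e) {
          return;
        }
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger polls = new AtomicInteger();
    Runnable task = polls::incrementAndGet;
    PollingSchedulerSketch scheduler =
        new PollingSchedulerSketch(Collections.singletonList(task), 50);
    scheduler.init();
    Thread.sleep(200);
    scheduler.shutdown();
    System.out.println("stopped=" + scheduler.awaitStop(1000)
        + ", polls=" + polls.get());
  }
}
```

The two boolean flags are kept, as in the original, so that calling init() twice does not start a second thread while the scheduler is already initialized.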

Let's revisit the startBatch() method of the ConnectorCoordinatorImpl class covered earlier:

//@Override
  public synchronized boolean startBatch() throws ConnectorNotFoundException {
    verifyConnectorInstanceAvailable();
    if (!shouldRun()) {
      return false;
    }

    BatchSize batchSize = loadManager.determineBatchSize();
    if (batchSize.getMaximum() == 0) {
      return false;
    }
    taskHandle = null;
    currentBatchKey = new Object();

    try {
      BatchCoordinator batchCoordinator = new BatchCoordinator(this);
      TraversalManager traversalManager =
          getConnectorInterfaces().getTraversalManager();
      Traverser traverser = new QueryTraverser(pusherFactory,
          traversalManager, batchCoordinator, name,
          Context.getInstance().getTraversalContext());
      TimedCancelable batch =  new CancelableBatch(traverser, name,
          batchCoordinator, batchCoordinator, batchSize);
      taskHandle = threadPool.submit(batch);
      return true;
    } catch (ConnectorNotFoundException cnfe) {
      LOGGER.log(Level.WARNING, "Connector not found - this is normal if you "
          + " recently reconfigured your connector instance: " + cnfe);
    } catch (InstantiatorException ie) {
      LOGGER.log(Level.WARNING,
          "Failed to perform connector content traversal.", ie);
      delayTraversal(TraversalDelayPolicy.ERROR);
    }
    return false;
  }

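After verifying that the connector instance is available, the method checks !shouldRun() and returns false right away if no traversal should start. Let's analyze the source of shouldRun():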

/**
   * Returns {@code true} if it is OK to start a traversal,
   * {@code false} otherwise.
   */
  // Package access because this is called by tests.
  synchronized boolean shouldRun() {
    // Are we already running? If so, we shouldn't run again.
    if (taskHandle != null && !taskHandle.isDone()) {
      return false;
    }

    // Don't run if we have postponed traversals.
    if (System.currentTimeMillis() < traversalDelayEnd) {
      return false;
    }

    Schedule schedule = getSchedule();

    // Don't run if traversals are disabled.
    if (schedule.isDisabled()) {
      return false;
    }

    // Don't run if we have exceeded our configured host load.
    if (loadManager.shouldDelay()) {
      return false;
    }

    // OK to run if we are within one of the Schedule's traversal intervals.
    Calendar now = Calendar.getInstance();
    int hour = now.get(Calendar.HOUR_OF_DAY);
    for (ScheduleTimeInterval interval : schedule.getTimeIntervals()) {
      int startHour = interval.getStartTime().getHour();
      int endHour = interval.getEndTime().getHour();
      if (0 == endHour) {
        endHour = 24;
      }
      if (endHour < startHour) {
        // The traversal interval straddles midnight.
        if ((hour >= startHour) || (hour < endHour)) {
          return true;
        }
      } else {
        // The traversal interval falls wholly within the day.
        if ((hour >= startHour) && (hour < endHour)) {
          return true;
        }
      }
    }

    return false;
  }

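This method decides whether the connector may be scheduled. It checks, in order: whether the previous batch is still running, whether a postponed-traversal delay is still in effect, whether the schedule is disabled, whether the load manager asks for a delay, and whether the current hour falls within one of the schedule's traversal intervals.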
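The interval test at the end of shouldRun() is the least obvious part, because an interval may straddle midnight. The sketch below isolates just that hour comparison in a hypothetical helper, isWithinInterval (the name and class are assumptions made for illustration; the original keeps this logic inline). It follows the same rules as the quoted source: an end hour of 0 is treated as 24, and the end hour itself is excluded.

```java
import java.util.Calendar;

/** Hypothetical helper mirroring the hour-window test in shouldRun(). */
final class IntervalCheckSketch {
  /**
   * Returns true if hour (0-23) falls in the half-open window
   * [startHour, endHour). An endHour of 0 means midnight at the end of the day.
   */
  static boolean isWithinInterval(int hour, int startHour, int endHour) {
    if (endHour == 0) {
      endHour = 24;
    }
    if (endHour < startHour) {
      // The interval straddles midnight, for example 22 to 6.
      return hour >= startHour || hour < endHour;
    }
    // The interval falls wholly within the day, for example 9 to 17.
    return hour >= startHour && hour < endHour;
  }

  public static void main(String[] args) {
    System.out.println(isWithinInterval(23, 22, 6)); // true: late evening
    System.out.println(isWithinInterval(3, 22, 6));  // true: early morning
    System.out.println(isWithinInterval(12, 22, 6)); // false: midday
    System.out.println(isWithinInterval(17, 9, 17)); // false: end hour is excluded
    System.out.println(isWithinInterval(23, 20, 0)); // true: end hour 0 means 24

    int now = Calendar.getInstance().get(Calendar.HOUR_OF_DAY);
    System.out.println("Inside a 22-to-6 window right now? "
        + isWithinInterval(now, 22, 6));
  }
}
```

Note that the comparison works on whole hours only, so under this logic a schedule cannot start or stop a traversal window partway through an hour.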

---------------------------------------------------------------------------

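This series, Enterprise Search Engine Development: The Connector, is my original work.

When reposting, please credit the source: 博客园, 刺猬的温驯

Link to this article: http://www.cnblogs.com/chenying99/archive/2013/03/20/2970378.html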

posted on 2013-03-20 01:11 by 刺猬的温驯
