Enterprise Search Engine Development: The Connector (Part 6)

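Before continuing with the source code analysis, it is worth getting familiar with the connector's UML model; otherwise the tangled web of dependencies makes it hard to keep track of what is going on.

Start with the UML model below:

The diagram I drew is deliberately incomplete: leaving out the details keeps the connector's UML model clear.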

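The ConnectorCoordinatorImpl class carries out the connector's crawling work through its ThreadPool member, which runs a CancelableBatch, a task class that implements the TimedCancelable interface.

The CancelableBatch task calls a Traverser object (here a QueryTraverser), which in turn calls the connector's TraversalManager instance to traverse the data.

The rest are supporting types: BatchResultRecorder records the outcome of a crawl; TraversalStateStore stores the traversal state; BatchSize carries the batch size;

FeedConnection is the interface for publishing XML feed data to the search engine application; TraversalContext holds context information (a class that implements TraversalManager must also implement TraversalContextAware to receive it).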
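The call chain described above can be sketched in a few lines. This is only an illustration under stated assumptions: every type prefixed with Sketch is a simplified stand-in invented here, not the connector manager's real API, and java.util.concurrent.ExecutorService stands in for the connector manager's own ThreadPool class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallChainSketch {

  /** Hypothetical stand-in for the connector's TraversalManager: supplies documents. */
  interface SketchTraversalManager {
    List<String> nextDocuments(int max);
  }

  /** Hypothetical stand-in for Traverser: runs one batch, returns the document count. */
  interface SketchTraverser {
    int runBatch(int batchSize);
  }

  /** Plays the role of QueryTraverser: asks the traversal manager for documents. */
  static class SketchQueryTraverser implements SketchTraverser {
    private final SketchTraversalManager manager;

    SketchQueryTraverser(SketchTraversalManager manager) {
      this.manager = manager;
    }

    @Override
    public int runBatch(int batchSize) {
      List<String> docs = manager.nextDocuments(batchSize);
      // The real traverser would push these documents to the feed here.
      return docs.size();
    }
  }

  /** Plays the role of CancelableBatch: the unit of work handed to the thread pool. */
  static class SketchBatch implements Callable<Integer> {
    private final SketchTraverser traverser;
    private final int batchSize;

    SketchBatch(SketchTraverser traverser, int batchSize) {
      this.traverser = traverser;
      this.batchSize = batchSize;
    }

    @Override
    public Integer call() {
      return traverser.runBatch(batchSize);
    }
  }

  /** Plays the role of the coordinator: wires the pieces and submits the batch. */
  static int runOneBatch(int batchSize) {
    // A toy traversal manager whose repository holds at most three documents.
    SketchTraversalManager manager = max -> {
      List<String> docs = new ArrayList<>();
      for (int i = 0; i < Math.min(max, 3); i++) {
        docs.add("doc-" + i);
      }
      return docs;
    };
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<Integer> taskHandle =
          pool.submit(new SketchBatch(new SketchQueryTraverser(manager), batchSize));
      return taskHandle.get();
    } catch (Exception e) {
      throw new IllegalStateException("batch failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("documents in batch: " + runOneBatch(2));
  }
}
```

The point of the sketch is the direction of the calls: the coordinator only knows about the batch, the batch only knows about the traverser, and the traverser only knows about the traversal manager.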

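The real connector class model is far more complex than the one above; some classes implement several of these interfaces at once. Does that violate the single responsibility principle?

The pivotal class is BatchCoordinator, which implements three interfaces: TraversalStateStore, BatchResultRecorder, and BatchTimeout. The UML model of the related interfaces and classes is as follows:

This lets BatchCoordinator be passed wherever any of these three types is expected, so it plays a many-faced role in the program.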
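To make the "many-faced" idea concrete, here is a minimal sketch of one object being handed to three consumers, each of which sees only the interface it needs. The three interfaces are declared locally as simplified copies (the result is a plain String rather than a BatchResult), and the Coordinator class with its log list is hypothetical, written only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ThreeRolesSketch {

  // Simplified local copies of the three interfaces; not the real declarations.
  interface TraversalStateStore {
    String getTraversalState();
    void storeTraversalState(String state);
  }

  interface BatchResultRecorder {
    void recordResult(String result);
  }

  interface BatchTimeout {
    void timeout();
  }

  /** One class, three roles: every call is appended to a log so it can be inspected. */
  static class Coordinator
      implements TraversalStateStore, BatchResultRecorder, BatchTimeout {
    private String state = "";
    final List<String> log = new ArrayList<>();

    @Override
    public String getTraversalState() {
      return state;
    }

    @Override
    public void storeTraversalState(String state) {
      this.state = state;
      log.add("store:" + state);
    }

    @Override
    public void recordResult(String result) {
      log.add("record:" + result);
    }

    @Override
    public void timeout() {
      log.add("timeout");
    }
  }

  // Each consumer depends on exactly one interface, never on Coordinator itself.
  static void traverse(TraversalStateStore store) {
    store.storeTraversalState("checkpoint-1");
  }

  static void finish(BatchResultRecorder recorder) {
    recorder.recordResult("3 docs");
  }

  static void expire(BatchTimeout timeout) {
    timeout.timeout();
  }

  static List<String> demo() {
    Coordinator coordinator = new Coordinator();
    traverse(coordinator); // seen as a TraversalStateStore
    finish(coordinator);   // seen as a BatchResultRecorder
    expire(coordinator);   // seen as a BatchTimeout
    return coordinator.log;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because each consumer is typed against a narrow interface, the single-responsibility question raised above is softened somewhat: the class is broad, but no caller can use more of it than its own role allows.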

With that in place, going back to the startBatch() method of ConnectorCoordinatorImpl mentioned in the previous post, it becomes much easier to follow:

/**
   * Starts running a batch for this {@link ConnectorCoordinator} if a batch is
   * not already running.
   *
   * @return true if this call started a batch
   * @throws ConnectorNotFoundException if this {@link ConnectorCoordinator}
   *         does not exist.
   */
  //@Override
  public synchronized boolean startBatch() throws ConnectorNotFoundException {
    verifyConnectorInstanceAvailable();
    if (!shouldRun()) {
      return false;
    }

    BatchSize batchSize = loadManager.determineBatchSize();
    if (batchSize.getMaximum() == 0) {
      return false;
    }
    taskHandle = null;
    currentBatchKey = new Object();

    try {
      BatchCoordinator batchCoordinator = new BatchCoordinator(this);
      TraversalManager traversalManager =
          getConnectorInterfaces().getTraversalManager();
      Traverser traverser = new QueryTraverser(pusherFactory,
          traversalManager, batchCoordinator, name,
          Context.getInstance().getTraversalContext());
      TimedCancelable batch =  new CancelableBatch(traverser, name,
          batchCoordinator, batchCoordinator, batchSize);
      taskHandle = threadPool.submit(batch);
      return true;
    } catch (ConnectorNotFoundException cnfe) {
      LOGGER.log(Level.WARNING, "Connector not found - this is normal if you "
          + " recently reconfigured your connector instance: " + cnfe);
    } catch (InstantiatorException ie) {
      LOGGER.log(Level.WARNING,
          "Failed to perform connector content traversal.", ie);
      delayTraversal(TraversalDelayPolicy.ERROR);
    }
    return false;
  }

The method body first instantiates BatchCoordinator with BatchCoordinator batchCoordinator = new BatchCoordinator(this); as the class name suggests, it plays a coordinator role.

Traverser traverser = new QueryTraverser(pusherFactory,traversalManager, batchCoordinator, name,Context.getInstance().getTraversalContext());

When QueryTraverser is instantiated, batchCoordinator serves as the TraversalStateStore, used to store the connector's traversal state;

TimedCancelable batch = new CancelableBatch(traverser, name,batchCoordinator, batchCoordinator, batchSize);

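When CancelableBatch is instantiated, the first batchCoordinator argument serves as the BatchResultRecorder, used to record the connector's crawl results;

the second batchCoordinator argument serves as the BatchTimeout; judging from its timeout() method, which calls resetBatch() on the coordinator, it is used to forcibly reset the batch when it times out.

Here is the BatchCoordinator source, to see how it implements these three interfaces: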

/**
 * Coordinate operations that apply to a running batch with other changes that
 * affect this {@link ConnectorCoordinatorImpl}.
 * <p>
 * The {@link ConnectorCoordinatorImpl} monitor is used to guard batch
 * operations.
 * <p>
 * To avoid long held locks the {@link ConnectorCoordinatorImpl} monitor is
 * not held while a batch runs or even between the time a batch is canceled
 * and the time its background processing completes. Therefore, a lingering
 * batch may attempt to record completion information, modify the checkpoint
 * or timeout after the lingering batch has been canceled. These operations
 * may even occur after a new batch has started. To avoid corrupting the
 * {@link ConnectorCoordinatorImpl} state this class employs the batchKey
 * protocol to disable completion operations that are performed on behalf of
 * lingering batches. Here is how the protocol works.
 * <OL>
 * <LI>To start a batch while holding the
 * {@link ConnectorCoordinatorImpl} monitor assign the batch a unique key.
 * Store the key in ConnectorCoordinator.this.currentBatchKey. Also create a
 * {@link BatchCoordinator} with BatchCoordinator.requiredBatchKey set to the
 * key for the batch.
 * <LI>To cancel a batch while holding the ConnectorCoordinatorImpl monitor,
 * null out ConnectorCoordinator.this.currentBatchKey.
 * <LI>The {@link BatchCoordinator} performs all completion operations for a
 * batch and prevents operations on behalf of non current batches. To check
 * while holding the {@link ConnectorCoordinatorImpl} monitor it
 * verifies that
 * BatchCoordinator.requiredBatchKey equals
 * ConnectorCoordinator.this.currentBatchKey.
 * </OL>
 */
class BatchCoordinator implements TraversalStateStore,
    BatchResultRecorder, BatchTimeout {

  private static final Logger LOGGER =
      Logger.getLogger(BatchCoordinator.class.getName());

  private final Object requiredBatchKey;
  private final ConnectorCoordinatorImpl connectorCoordinator;

  /**
   * Creates a BatchCoordinator
   */
  BatchCoordinator(ConnectorCoordinatorImpl connectorCoordinator) {
    this.requiredBatchKey = connectorCoordinator.currentBatchKey;
    this.connectorCoordinator = connectorCoordinator;
  }

  public String getTraversalState() {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        try {
          return connectorCoordinator.getConnectorState();
        } catch (ConnectorNotFoundException cnfe) {
          // Connector disappeared while we were away.
          throw new BatchCompletedException();
        }
      } else {
        throw new BatchCompletedException();
      }
    }
  }

  public void storeTraversalState(String state) {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        try {
          connectorCoordinator.setConnectorState(state);
        } catch (ConnectorNotFoundException cnfe) {
          // Connector disappeared while we were away.
          // Don't try to store results.
          throw new BatchCompletedException();
        }
      } else {
        throw new BatchCompletedException();
      }
    }
  }

  public void recordResult(BatchResult result) {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        connectorCoordinator.recordResult(result);
      } else {
        LOGGER.fine("Ignoring a BatchResult returned from a "
            + "prevously canceled traversal batch.  Connector = "
            + connectorCoordinator.getConnectorName()
            + "  result = " + result + "  batchKey = " + requiredBatchKey);
      }
    }
  }

  public void timeout() {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        connectorCoordinator.resetBatch();
      } else {
        LOGGER.warning("Ignoring Timeout for previously prevously canceled"
            + " or completed traversal batch.  Connector = "
            + connectorCoordinator.getConnectorName()
            + "  batchKey = "+ requiredBatchKey);
      }
    }
  }

  // TODO(strellis): Add this Exception to throws for BatchRecorder,
  //     TraversalStateStore, BatchTimeout interfaces and catch this
  //     specific exception rather than IllegalStateException.
  private static class BatchCompletedException extends IllegalStateException {
  }
}

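As the code above shows, BatchCoordinator mostly works through the ConnectorCoordinatorImpl instance it depends on: each method takes that instance's monitor, checks that its own batch key is still the current one, and only then calls the corresponding ConnectorCoordinatorImpl method. This approach is somewhat reminiscent of the Decorator pattern. That is all for this post; the rest will be analyzed in the next one.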
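The batchKey protocol described in the class comment can be reduced to a small sketch. The names Coordinator, Guard, and storeState are hypothetical, and for brevity the guard returns false for a stale batch, whereas the real storeTraversalState throws BatchCompletedException; treat this as an illustration of the idea, not as the library's implementation.

```java
class BatchKeySketch {

  /** Plays the role of ConnectorCoordinatorImpl: owns the current batch key. */
  static class Coordinator {
    Object currentBatchKey;
    String state = "start";

    /** Starting a batch assigns a fresh key and hands out a guard bound to it. */
    synchronized Guard startBatch() {
      currentBatchKey = new Object();
      return new Guard(this);
    }

    /** Canceling a batch nulls out the key, which invalidates every older guard. */
    synchronized void cancelBatch() {
      currentBatchKey = null;
    }
  }

  /** Plays the role of BatchCoordinator: remembers the key it was created with. */
  static class Guard {
    private final Object requiredBatchKey;
    private final Coordinator coordinator;

    Guard(Coordinator coordinator) {
      this.requiredBatchKey = coordinator.currentBatchKey;
      this.coordinator = coordinator;
    }

    /** Stores state only if this guard's batch is still the current one. */
    boolean storeState(String newState) {
      synchronized (coordinator) {
        if (coordinator.currentBatchKey != requiredBatchKey) {
          return false; // lingering batch: ignore its late write
        }
        coordinator.state = newState;
        return true;
      }
    }
  }

  public static void main(String[] args) {
    Coordinator coordinator = new Coordinator();
    Guard first = coordinator.startBatch();
    first.storeState("a");
    coordinator.cancelBatch();
    Guard second = coordinator.startBatch();
    // The first batch lingers and tries to write after a new batch has started.
    System.out.println("late write accepted: " + first.storeState("late"));
    System.out.println("current write accepted: " + second.storeState("b"));
    System.out.println("state: " + coordinator.state);
  }
}
```

Comparing keys by identity (a fresh Object per batch) means no two batches can ever share a key, so a canceled batch cannot be confused with its successor even if it finishes much later.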

---------------------------------------------------------------------------

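This series, Enterprise Search Engine Development: The Connector, is my original work.

When reposting, please credit the source: 刺猬的温驯 on cnblogs (博客园).

Link to this post: http://www.cnblogs.com/chenying99/archive/2013/03/18/2965328.html

Posted on 2013-03-18 22:13 by 刺猬的温驯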

