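Before going further into the source code, it helps to get familiar with the connector's UML model; otherwise the tangled dependencies make it hard to keep the overall picture straight.
Start with the UML model below:
The diagram is deliberately incomplete: leaving out the details keeps the connector's UML model clear.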
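ConnectorCoordinatorImpl implements the connector's crawling function by using its ThreadPool member to run CancelableBatch, a task class that the thread pool can execute.
CancelableBatch calls a Traverser object (here a QueryTraverser), which in turn calls the connector's TraversalManager implementation to traverse the data.
The rest are helper classes: BatchResultRecorder records the results of a crawl; TraversalStateStore stores the traversal state; BatchSize passes in the batch size;
FeedConnection is the interface for publishing XML feed data to the search engine application; TraversalContext carries context information (a class that implements TraversalManager and needs this context must also implement TraversalContextAware).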
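The call chain described above (coordinator, thread pool, batch, traverser, traversal manager) can be sketched in a few lines. This is only an illustration under simplifying assumptions: SimpleTraversalManager, SimpleTraverser, SimpleBatch and runOneBatch are hypothetical stand-ins invented for this example, not the connector manager's real classes or signatures, and the pool is a plain java.util.concurrent executor rather than the project's own ThreadPool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TraversalFlowSketch {
  /** Hypothetical stand-in for the connector's TraversalManager. */
  interface SimpleTraversalManager {
    List<String> nextDocuments(int maxDocs);
  }

  /** Hypothetical stand-in for Traverser (QueryTraverser in the article). */
  static class SimpleTraverser {
    private final SimpleTraversalManager manager;

    SimpleTraverser(SimpleTraversalManager manager) {
      this.manager = manager;
    }

    /** Asks the manager for at most batchSize documents and returns the count. */
    int runBatch(int batchSize) {
      return manager.nextDocuments(batchSize).size();
    }
  }

  /** Hypothetical stand-in for CancelableBatch: the unit of work given to the pool. */
  static class SimpleBatch implements Callable<Integer> {
    private final SimpleTraverser traverser;
    private final int batchSize;

    SimpleBatch(SimpleTraverser traverser, int batchSize) {
      this.traverser = traverser;
      this.batchSize = batchSize;
    }

    @Override
    public Integer call() {
      return traverser.runBatch(batchSize);
    }
  }

  /** Plays the coordinator: wires the chain together and submits the batch. */
  static int runOneBatch(List<String> docs, int batchSize) {
    SimpleTraversalManager manager =
        max -> docs.subList(0, Math.min(max, docs.size()));
    SimpleBatch batch = new SimpleBatch(new SimpleTraverser(manager), batchSize);
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<Integer> taskHandle = pool.submit(batch);
      return taskHandle.get();  // wait for the background batch to finish
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("batch failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
    System.out.println("processed " + runOneBatch(docs, 3) + " documents");
  }
}
```

The sketch waits for the result only to keep the example observable; the real coordinator keeps the task handle and returns immediately.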
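The real connector class model is far more complex than the one above; a few classes implement several of these interfaces at once, which raises the question of whether they violate the single responsibility principle.
The hub is the BatchCoordinator class, which implements three interfaces: TraversalStateStore, BatchResultRecorder, and BatchTimeout. The UML model of the related interfaces and classes is shown below:
As a result, BatchCoordinator can be passed wherever any of these three types is expected, playing several roles at once.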
With that in mind, let's go back to the startBatch() method of ConnectorCoordinatorImpl mentioned in the previous post; it should now be easier to follow.
/**
 * Starts running a batch for this {@link ConnectorCoordinator} if a batch is
 * not already running.
 *
 * @return true if this call started a batch
 * @throws ConnectorNotFoundException if this {@link ConnectorCoordinator}
 *         does not exist.
 */
//@Override
public synchronized boolean startBatch() throws ConnectorNotFoundException {
  verifyConnectorInstanceAvailable();
  if (!shouldRun()) {
    return false;
  }

  BatchSize batchSize = loadManager.determineBatchSize();
  if (batchSize.getMaximum() == 0) {
    return false;
  }
  taskHandle = null;
  currentBatchKey = new Object();

  try {
    BatchCoordinator batchCoordinator = new BatchCoordinator(this);
    TraversalManager traversalManager =
        getConnectorInterfaces().getTraversalManager();
    Traverser traverser = new QueryTraverser(pusherFactory,
        traversalManager, batchCoordinator, name,
        Context.getInstance().getTraversalContext());
    TimedCancelable batch = new CancelableBatch(traverser, name,
        batchCoordinator, batchCoordinator, batchSize);
    taskHandle = threadPool.submit(batch);
    return true;
  } catch (ConnectorNotFoundException cnfe) {
    LOGGER.log(Level.WARNING, "Connector not found - this is normal if you "
        + " recently reconfigured your connector instance: " + cnfe);
  } catch (InstantiatorException ie) {
    LOGGER.log(Level.WARNING,
        "Failed to perform connector content traversal.", ie);
    delayTraversal(TraversalDelayPolicy.ERROR);
  }
  return false;
}
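The method body first instantiates BatchCoordinator with BatchCoordinator batchCoordinator = new BatchCoordinator(this); as the class name suggests, it plays a coordinator role.
Traverser traverser = new QueryTraverser(pusherFactory, traversalManager, batchCoordinator, name, Context.getInstance().getTraversalContext());
When QueryTraverser is instantiated, batchCoordinator serves as the TraversalStateStore and stores the connector's traversal state;
TimedCancelable batch = new CancelableBatch(traverser, name, batchCoordinator, batchCoordinator, batchSize);
When CancelableBatch is instantiated, the first batchCoordinator argument serves as the BatchResultRecorder and records the crawl results;
the second batchCoordinator argument serves as the BatchTimeout; judging from its timeout() method, which calls resetBatch() on the ConnectorCoordinatorImpl, it is used to reset the batch when it times out.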
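To make the "one object, several roles" idea concrete, here is a minimal sketch. The interfaces StateStore, ResultRecorder and Timeout and the classes Coordinator, Traverser and Batch are simplified, hypothetical stand-ins for this example only; they do not reproduce the real connector manager signatures.

```java
import java.util.ArrayList;
import java.util.List;

class MultiRoleSketch {
  // Hypothetical, simplified versions of the three interfaces.
  interface StateStore { void storeState(String state); }
  interface ResultRecorder { void recordResult(int docCount); }
  interface Timeout { void timeout(); }

  /** One class, three roles, in the spirit of BatchCoordinator. */
  static class Coordinator implements StateStore, ResultRecorder, Timeout {
    final List<String> events = new ArrayList<>();

    @Override public void storeState(String state) { events.add("store:" + state); }
    @Override public void recordResult(int docCount) { events.add("result:" + docCount); }
    @Override public void timeout() { events.add("timeout"); }
  }

  /** The traverser only ever sees the StateStore role. */
  static class Traverser {
    private final StateStore store;

    Traverser(StateStore store) { this.store = store; }

    int traverse() {
      store.storeState("checkpoint-1");
      return 1;  // pretend one document was processed
    }
  }

  /** The batch receives the recorder and timeout roles as separate parameters. */
  static class Batch {
    private final Traverser traverser;
    private final ResultRecorder recorder;
    private final Timeout timeout;

    Batch(Traverser traverser, ResultRecorder recorder, Timeout timeout) {
      this.traverser = traverser;
      this.recorder = recorder;
      this.timeout = timeout;
    }

    void run() { recorder.recordResult(traverser.traverse()); }
    void expire() { timeout.timeout(); }
  }

  static List<String> demo() {
    Coordinator coordinator = new Coordinator();
    // The same instance is passed three times, once per role.
    Batch batch = new Batch(new Traverser(coordinator), coordinator, coordinator);
    batch.run();
    batch.expire();
    return coordinator.events;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Each collaborator depends only on the narrow interface it needs, so the fact that a single object stands behind all three is invisible to it.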
Now look at the BatchCoordinator source to see how it implements these three interfaces:
/**
 * Coordinate operations that apply to a running batch with other changes that
 * affect this [@link {@link ConnectorCoordinatorImpl}.
 * <p>
 * The {@link ConnectorCoordinatorImpl} monitor is used to guard batch
 * operations.
 * <p>
 * To avoid long held locks the {@link ConnectorCoordinatorImpl} monitor is
 * not held while a batch runs or even between the time a batch is canceled
 * and the time its background processing completes. Therefore, a lingering
 * batch may attempt to record completion information, modify the checkpoint
 * or timeout after the lingering batch has been canceled. These operations
 * may even occur after a new batch has started. To avoid corrupting the
 * {@link ConnectorCoordinatorImpl} state this class employs the batchKey
 * protocol to disable completion operations that are performed on behalf of
 * lingering batches. Here is how the protocol works.
 * <OL>
 * <LI>To start a batch starts while holding the
 * {@link ConnectorCoordinatorImpl} monitor assign the batch a unique key.
 * Store the key in ConnectorCoordinator.this.currentBatchKey. Also create a
 * {@link BatchCoordinator} with BatchCoordinator.requiredBatchKey set to the
 * key for the batch.
 * <LI>To cancel a batch while holding the ConnectorCoordinatorImpl monitor,
 * null out ConnectorCoordinator.this.currentBatchKey.
 * <LI>The {@link BatchCoordinator} performs all completion operations for a
 * batch and prevents operations on behalf of non current batches. To check
 * while holding the {@link ConnectorCoordinatorImpl} monitor it
 * verifies that
 * BatchCoordinator.requiredBatchKey equals
 * ConnectorCoordinator.this.currentBatchKey.
 * </OL>
 */
class BatchCoordinator implements TraversalStateStore,
    BatchResultRecorder, BatchTimeout {

  private static final Logger LOGGER =
      Logger.getLogger(BatchCoordinator.class.getName());

  private final Object requiredBatchKey;
  private final ConnectorCoordinatorImpl connectorCoordinator;

  /**
   * Creates a BatchCoordinator
   */
  BatchCoordinator(ConnectorCoordinatorImpl connectorCoordinator) {
    this.requiredBatchKey = connectorCoordinator.currentBatchKey;
    this.connectorCoordinator = connectorCoordinator;
  }

  public String getTraversalState() {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        try {
          return connectorCoordinator.getConnectorState();
        } catch (ConnectorNotFoundException cnfe) {
          // Connector disappeared while we were away.
          throw new BatchCompletedException();
        }
      } else {
        throw new BatchCompletedException();
      }
    }
  }

  public void storeTraversalState(String state) {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        try {
          connectorCoordinator.setConnectorState(state);
        } catch (ConnectorNotFoundException cnfe) {
          // Connector disappeared while we were away.
          // Don't try to store results.
          throw new BatchCompletedException();
        }
      } else {
        throw new BatchCompletedException();
      }
    }
  }

  public void recordResult(BatchResult result) {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        connectorCoordinator.recordResult(result);
      } else {
        LOGGER.fine("Ignoring a BatchResult returned from a "
            + "prevously canceled traversal batch. Connector = "
            + connectorCoordinator.getConnectorName()
            + " result = " + result + " batchKey = " + requiredBatchKey);
      }
    }
  }

  public void timeout() {
    synchronized (connectorCoordinator) {
      if (connectorCoordinator.currentBatchKey == requiredBatchKey) {
        connectorCoordinator.resetBatch();
      } else {
        LOGGER.warning("Ignoring Timeout for previously prevously canceled"
            + " or completed traversal batch. Connector = "
            + connectorCoordinator.getConnectorName()
            + " batchKey = "+ requiredBatchKey);
      }
    }
  }

  // TODO(strellis): Add this Exception to throws for BatchRecorder,
  // TraversalStateStore, BatchTimeout interfaces and catch this
  // specific exception rather than IllegalStateException.
  private static class BatchCompletedException extends IllegalStateException {
  }
}
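The batchKey protocol described in the Javadoc above can be tried out in isolation. The following is a reduced sketch, not the real implementation: Owner and Guard are hypothetical names standing in for ConnectorCoordinatorImpl and BatchCoordinator, and a stale write is reported by returning false where the original throws BatchCompletedException.

```java
class BatchKeySketch {
  /** Hypothetical stand-in for ConnectorCoordinatorImpl. */
  static class Owner {
    private Object currentBatchKey;
    private String state = "";

    /** Step 1: under the monitor, assign a fresh key and capture it in a guard. */
    synchronized Guard startBatch() {
      currentBatchKey = new Object();
      return new Guard(this);
    }

    /** Step 2: under the monitor, null out the key to cancel the batch. */
    synchronized void cancelBatch() {
      currentBatchKey = null;
    }

    synchronized String getState() {
      return state;
    }
  }

  /** Hypothetical stand-in for BatchCoordinator. */
  static class Guard {
    private final Owner owner;
    private final Object requiredBatchKey;

    Guard(Owner owner) {
      this.owner = owner;
      this.requiredBatchKey = owner.currentBatchKey;
    }

    /** Step 3: act only if this guard's key is still the current one. */
    boolean storeState(String newState) {
      synchronized (owner) {
        if (owner.currentBatchKey != requiredBatchKey) {
          return false;  // lingering batch: ignore its update
        }
        owner.state = newState;
        return true;
      }
    }
  }

  public static void main(String[] args) {
    Owner owner = new Owner();
    Guard first = owner.startBatch();
    System.out.println(first.storeState("a"));   // current batch
    owner.cancelBatch();
    Guard second = owner.startBatch();
    System.out.println(first.storeState("b"));   // lingering batch
    System.out.println(second.storeState("c"));  // new current batch
    System.out.println(owner.getState());
  }
}
```

Comparing keys by identity (== on a fresh Object) means a key can never collide with an earlier one, so a canceled batch cannot corrupt the state of the batch that replaced it.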
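As the code shows, BatchCoordinator mostly works through the ConnectorCoordinatorImpl member it depends on: each method checks the batch key and then calls the corresponding ConnectorCoordinatorImpl method, an approach somewhat similar to the decorator pattern. That's all for this post; the rest is left for the next one.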
---------------------------------------------------------------------------
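This series, Enterprise Search Engine Development: Connectors, is my original work
If you repost it, please credit the source: cnblogs (博客园), 刺猬的温驯
Link to this post: http://www.cnblogs.com/chenying99/archive/2013/03/18/2965328.html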