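In this session we look at fault tolerance on the Driver side. ReceivedBlockTracker, JobGenerator, and DStream are all critical to the fault tolerance of Driver-side metadata.
Overview of this session:
1 ReceivedBlockTracker: driver fault tolerance at the metadata level
2 JobGenerator and DStream: driver fault tolerance at the business-logic level and at the physical-execution level
We start with the source code of ReceivedBlockTracker: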
/**
* Class that keep track of all the received blocks, and allocate them to batches
* when required. All actions taken by this class can be saved to a write ahead log
* (if a checkpoint directory has been provided), so that the state of the tracker
* (received blocks and block-to-batch allocations) can be recovered after driver failure.
*
* Note that when any instance of this class is created with a checkpoint directory,
* it will try reading events from logs in the directory.
*/
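Source comments are where the hidden gems are. The comment above tells us that this class manages metadata and keeps state, and that it can restore that state after a driver failure. It does so through the write ahead log (WAL) mechanism.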
/** Add received block. This event will get written to the write ahead log (if enabled). */
val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
if (writeResult) {
  synchronized {
    getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
  }
  logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
    s"block ${receivedBlockInfo.blockStoreResult.blockId}")
} else {
  logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
    s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
}
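Note the ordering here:
The metadata is added to the queue (getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo) only after writeToLog(BlockAdditionEvent(receivedBlockInfo)) has written to the WAL successfully. If the write fails, the block is not enqueued. Keeping the metadata in an in-memory queue makes it easy for JobGenerator and other components to schedule jobs.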
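The write-first, enqueue-second ordering described above, together with recovery by replaying the log, can be illustrated with a small self-contained sketch. This is not the Spark implementation: BlockInfo, BlockAdded, ToyWriteAheadLog, and ToyBlockTracker are hypothetical names invented for illustration, and the "log" here is just an in-memory buffer standing in for durable storage.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for illustration only; not Spark classes.
final case class BlockInfo(streamId: Int, blockId: String)

sealed trait TrackerEvent
final case class BlockAdded(info: BlockInfo) extends TrackerEvent

// Toy write ahead log. A real one would append to durable storage;
// this one keeps records in memory and can be told to fail every write.
final class ToyWriteAheadLog(failWrites: Boolean = false) {
  private val records = mutable.ArrayBuffer.empty[TrackerEvent]

  def write(event: TrackerEvent): Boolean =
    if (failWrites) false else { records += event; true }

  def readAll(): List[TrackerEvent] = records.toList
}

final class ToyBlockTracker(log: ToyWriteAheadLog) {
  private val queues = mutable.HashMap.empty[Int, mutable.Queue[BlockInfo]]

  // On construction, replay whatever is already in the log to rebuild state.
  log.readAll().foreach { case BlockAdded(info) => enqueue(info) }

  private def enqueue(info: BlockInfo): Unit = synchronized {
    queues.getOrElseUpdate(info.streamId, mutable.Queue.empty[BlockInfo]) += info
  }

  // Write to the log first; touch the in-memory queue only if the write succeeded.
  def addBlock(info: BlockInfo): Boolean = {
    val written = log.write(BlockAdded(info))
    if (written) enqueue(info)
    written
  }

  def unallocatedBlocks(streamId: Int): List[BlockInfo] = synchronized {
    queues.get(streamId).map(_.toList).getOrElse(Nil)
  }
}
```

With this ordering, anything visible in the in-memory queue is guaranteed to be in the log as well, so a second ToyBlockTracker built over the same log ends up with the same queue contents.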
/**
* Allocate all unallocated blocks to the given batch.
* This event will get written to the write ahead log (if enabled).
*/
When upgrading the system, the source code marked by the red box below can be rewritten.
The final step of the whole lifecycle performs a checkpoint.