From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure

To achieve that, Flink injects checkpoint barriers into the streams at the sources, which travel through the entire topology and eventually reach the sinks. These barriers divide the stream into a pre-checkpoint epoch (all events that are persisted in state or emitted into sinks) and a post-checkpoint epoch (events not reflected in the state, to be re-processed when resuming from the checkpoint).

Flink injects barriers at the source nodes; the barriers split the data stream into a pre-checkpoint part and a post-checkpoint part.
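The epoch split can be illustrated with a minimal sketch. It is not Flink code: `BARRIER` and `split_epochs` are hypothetical names introduced here, and the stream is modeled as a plain list on the assumption that it carries a single barrier.

```python
# Hypothetical marker standing in for a checkpoint barrier (not a Flink type).
BARRIER = "BARRIER"


def split_epochs(stream):
    """Split a stream at its first barrier into (pre, post) epochs.

    Events before the barrier belong to the checkpoint; events after it
    would be re-processed when resuming from that checkpoint.
    """
    idx = stream.index(BARRIER)
    return stream[:idx], stream[idx + 1:]


pre_checkpoint, post_checkpoint = split_epochs(["a", "b", BARRIER, "c", "d"])
```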

 

Operators need to make sure that they take the checkpoint exactly when all pre-checkpoint events are processed and no post-checkpoint events have yet been processed. When the first barrier reaches the head of the input buffer queue and is consumed by the operator, the operator starts the so-called alignment phase. During that phase, the operator will not consume any data from the channels where it already received a barrier, until it has received a barrier from all input channels.

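An operator must ensure that all pre-checkpoint data has been processed and no post-checkpoint data has been processed yet. When the first barrier reaches the head of the input buffer queue and is consumed by the operator, the operator enters the alignment phase. During alignment, the operator stops consuming data from any channel that has already delivered its barrier, until it has received a barrier from all input channels.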
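The alignment rule can be sketched as a small simulation. This is an illustrative model under simplifying assumptions, not Flink's implementation: `AligningOperator`, its fields, and the sentinel `BARRIER` are hypothetical names, channels are in-memory queues, and the operator's "state" is just the list of events it has processed.

```python
from collections import deque

# Hypothetical sentinel standing in for a checkpoint barrier.
BARRIER = object()


class AligningOperator:
    """Toy operator with several input channels and aligned barriers."""

    def __init__(self, num_channels):
        self.channels = [deque() for _ in range(num_channels)]
        self.blocked = set()   # channels that already delivered the barrier
        self.state = []        # events processed so far
        self.snapshots = []    # state captured at each completed alignment

    def step(self):
        """Consume one element from the first unblocked, non-empty channel.

        Returns True if an element was consumed, False otherwise.
        """
        for i, channel in enumerate(self.channels):
            if i in self.blocked or not channel:
                continue
            item = channel.popleft()
            if item is BARRIER:
                # Stop reading this channel until all barriers have arrived.
                self.blocked.add(i)
                if len(self.blocked) == len(self.channels):
                    # Alignment done: snapshot, then unblock every input.
                    self.snapshots.append(list(self.state))
                    self.blocked.clear()
            else:
                self.state.append(item)
            return True
        return False


op = AligningOperator(2)
op.channels[0].extend(["a1", BARRIER, "a2"])
op.channels[1].extend(["b1", "b2", BARRIER, "b3"])
while op.step():
    pass
```

In this run, `a2` sits behind the barrier on channel 0 and is held back until channel 1 has delivered its barrier, so the snapshot contains only `a1`, `b1`, and `b2`.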

 

Once all barriers are received, the operator snapshots its state, forwards the barrier to the output, and ends the alignment phase, which unblocks all inputs. An operator state snapshot is written into the checkpoint storage, typically asynchronously while data processing continues. Once all operators have successfully written their state snapshot to the checkpoint storage, the checkpoint is successfully completed and can be used for recovery.

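Once the operator has received the barrier from every input channel, it snapshots its current state, forwards the barrier downstream to its outputs, ends the alignment phase, and unblocks all inputs. The operator's state snapshot is written to checkpoint storage, usually asynchronously so that data processing is not interrupted. Once all operators have successfully written their state snapshots to checkpoint storage, the checkpoint is complete and can be used to restore the job state.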
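A rough sketch of the "snapshot synchronously, write asynchronously, complete when everyone has written" idea follows. It assumes an in-memory dict as the checkpoint storage and a thread pool for the asynchronous writes; `write_snapshot` and `run_checkpoint` are hypothetical helpers, not Flink APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def write_snapshot(storage, operator_name, snapshot):
    """Persist one operator's snapshot (here: store it in a dict)."""
    storage[operator_name] = snapshot
    return operator_name  # acts as the acknowledgement


def run_checkpoint(operator_states):
    """Return (completed, storage) for one checkpoint over all operators."""
    storage = {}
    with ThreadPoolExecutor() as pool:
        futures = []
        for name, state in operator_states.items():
            snapshot = dict(state)  # synchronous part: copy the state
            futures.append(pool.submit(write_snapshot, storage, name, snapshot))
            # Processing continues while the write is in flight; the copy
            # above keeps the snapshot unaffected by this update.
            state["count"] = state.get("count", 0) + 1
        acknowledged = {f.result() for f in futures}
    # The checkpoint is complete only when every operator has acknowledged.
    return acknowledged == set(operator_states), storage


states = {"map": {"count": 1}, "sink": {"count": 5}}
completed, storage = run_checkpoint(states)
```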

 

One important thing to note here is that the barriers flow with the events, strictly in line. In a healthy setup without backpressure, barriers flow and align in milliseconds. The checkpoint duration is dominated by the time it takes to write the state snapshots to the checkpoint storage, which becomes faster with incremental checkpoints. If the events flow slowly under backpressure, so will the barriers. That means barriers can take a long time to travel from sources to sinks, so the alignment phase takes even longer to complete.

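An important point here is that barriers flow with the events, strictly in line (how exactly is this strict ordering kept?). Without backpressure, barriers flow and align within milliseconds. The time a checkpoint takes is then mostly the time needed to write the state snapshots to checkpoint storage, and it is shorter with incremental checkpoints. Under backpressure, both events and barriers flow slowly, so barriers take longer to travel from source to sink, and the alignment phase takes longer to complete as well.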
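Because a barrier cannot overtake the events queued ahead of it, its delay at each hop can be approximated as the backlog divided by the processing rate. The sketch below is only a back-of-the-envelope model with made-up numbers; `barrier_delay_seconds` and `end_to_end_delay_seconds` are hypothetical helpers, and real pipelines have variable rates and buffer sizes.

```python
def barrier_delay_seconds(queued_events_ahead, events_per_second):
    """Time for a barrier to reach the head of one input queue."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return queued_events_ahead / events_per_second


def end_to_end_delay_seconds(hops):
    """Sum the per-hop delays for (queued_events_ahead, rate) pairs."""
    return sum(barrier_delay_seconds(q, r) for q, r in hops)


# Illustrative numbers only: the same backlog of 100 events per hop,
# across three hops, with a fast and a slow (backpressured) consumer.
healthy = end_to_end_delay_seconds([(100, 100_000)] * 3)
backpressured = end_to_end_delay_seconds([(100, 50)] * 3)
```

With these assumed figures the healthy pipeline delivers the barrier in a few milliseconds, while the backpressured one needs several seconds before alignment can even finish.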

 

ref:https://flink.apache.org/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html

posted @ 2020-11-04 21:11  王晓天