Barriers
A core element of Flink’s distributed snapshotting is the stream barrier. Barriers are injected into the data stream and flow with the records as part of that stream. Barriers never overtake records; they flow strictly in line. A barrier separates the records in the data stream into the set of records that go into the current snapshot and the set of records that go into the next snapshot. Each barrier carries the ID of the snapshot whose records it pushes in front of it. Barriers do not interrupt the flow of the stream and are therefore very lightweight. Multiple barriers from different snapshots can be in the stream at the same time, which means that several snapshots may be in progress concurrently.
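To make the "barrier separates records" idea concrete, here is a minimal sketch in plain Python. It is not Flink code: `Record`, `Barrier`, and `split_by_barrier` are hypothetical names introduced only for illustration, and the stream is modeled as an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    value: str


@dataclass(frozen=True)
class Barrier:
    snapshot_id: int


def split_by_barrier(stream):
    """Group record values by the snapshot whose barrier follows them.

    Returns (groups, pending), where pending holds the records that came
    after the last barrier and therefore belong to a future snapshot.
    """
    groups = {}
    pending = []
    for element in stream:
        if isinstance(element, Barrier):
            # Everything seen since the previous barrier belongs to this snapshot.
            groups[element.snapshot_id] = pending
            pending = []
        else:
            pending.append(element.value)
    return groups, pending


# Two barriers are in the stream at once, so two snapshots are in flight.
stream = [Record("a"), Record("b"), Barrier(1), Record("c"), Barrier(2), Record("d")]
groups, pending = split_by_barrier(stream)
```

In this toy stream, records `a` and `b` sit in front of barrier 1, `c` sits in front of barrier 2, and `d` waits for a later snapshot.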
Stream barriers are injected into the parallel data flow at the stream sources. The point where the barriers for snapshot n are injected (let’s call it Sn) is the position in the source stream up to which the snapshot covers the data. For example, in Apache Kafka, this position would be the offset of the last record in the partition. This position Sn is reported to the checkpoint coordinator (Flink’s JobManager).
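The following sketch shows one way a source could inject a barrier and record the position Sn. It is a simplified model, not Flink's source API: `ListSource` is a hypothetical class, the Python list stands in for a Kafka partition, and the `reported` dictionary stands in for the message sent to the checkpoint coordinator.

```python
class ListSource:
    """Toy source reading from an in-memory 'partition' (assumed stand-in for Kafka)."""

    def __init__(self, partition):
        self.partition = partition
        self.next_offset = 0
        # snapshot id -> S_n; stands in for the report to the checkpoint coordinator
        self.reported = {}

    def emit_record(self):
        record = self.partition[self.next_offset]
        self.next_offset += 1
        return ("record", record)

    def inject_barrier(self, snapshot_id):
        # S_n is the offset of the last record emitted before the barrier
        # (-1 if nothing has been emitted yet).
        self.reported[snapshot_id] = self.next_offset - 1
        return ("barrier", snapshot_id)


source = ListSource(["a", "b", "c", "d"])
emitted = [
    source.emit_record(),
    source.emit_record(),
    source.inject_barrier(1),
    source.emit_record(),
    source.inject_barrier(2),
]
```

Here snapshot 1 covers offsets 0 and 1, so the source reports S1 = 1; snapshot 2 additionally covers offset 2, so S2 = 2.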
The barriers then flow downstream. When an intermediate operator has received a barrier for snapshot n from all of its input streams, it emits a barrier for snapshot n into all of its outgoing streams. Once a sink operator (the end of a streaming DAG) has received barrier n from all of its input streams, it acknowledges snapshot n to the checkpoint coordinator. After all sinks have acknowledged a snapshot, it is considered completed.
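The completion rule described above ("all sinks have acknowledged") can be sketched as a small bookkeeping class. `ToyCoordinator` is a hypothetical name and deliberately not Flink's real coordinator; it only tracks the sink acknowledgements mentioned in the text.

```python
class ToyCoordinator:
    """Tracks which sinks have acknowledged each snapshot (illustrative only)."""

    def __init__(self, sink_ids):
        self.sink_ids = set(sink_ids)
        self.acks = {}          # snapshot id -> set of sinks that acknowledged it
        self.completed = set()  # snapshot ids acknowledged by every sink

    def acknowledge(self, snapshot_id, sink_id):
        if sink_id not in self.sink_ids:
            raise ValueError(f"unknown sink: {sink_id}")
        acked = self.acks.setdefault(snapshot_id, set())
        acked.add(sink_id)
        if acked == self.sink_ids:
            self.completed.add(snapshot_id)
        # True once the snapshot is complete
        return snapshot_id in self.completed


coordinator = ToyCoordinator(["sink-a", "sink-b"])
after_first_ack = coordinator.acknowledge(1, "sink-a")
after_second_ack = coordinator.acknowledge(1, "sink-b")
```

After the first acknowledgement, snapshot 1 is still pending; it counts as completed only once the second sink has acknowledged it as well.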
Once snapshot n has been completed, the job will never again ask the source for records from before Sn, since at that point these records (and their descendant records) will have passed through the entire data flow topology.
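As a small illustration of why Sn matters, the sketch below computes where a source would resume reading after restoring the latest completed snapshot. It assumes that Sn is stored as the offset of the last covered record, so reading restarts at Sn + 1; `resume_offset` and the offsets used are made up for the example.

```python
def resume_offset(completed_positions):
    """Return the first offset to read after restoring the latest completed snapshot.

    completed_positions maps snapshot id -> S_n (offset of the last covered record).
    With no completed snapshot, reading starts from offset 0.
    """
    if not completed_positions:
        return 0
    latest_snapshot = max(completed_positions)
    return completed_positions[latest_snapshot] + 1


# Hypothetical offsets reported for two completed snapshots.
completed_positions = {1: 41, 2: 87}
restart_at = resume_offset(completed_positions)
```

Under these assumptions, records at or before offset 87 are never requested from the source again.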
Operators that receive more than one input stream need to align the input streams on the snapshot barriers. The figure above illustrates this:
- As soon as the operator receives snapshot barrier n from an incoming stream, it cannot process any further records from that stream until it has received barrier n from the other inputs as well. Otherwise, it would mix records that belong to snapshot n with records that belong to snapshot n+1.
- Streams that have delivered barrier n are temporarily set aside. Records received from these streams are not processed but are put into an input buffer.
- Once barrier n has arrived on the last input stream, the operator emits all pending outgoing records and then emits barriers for snapshot n itself.
- After that, it resumes processing records from all input streams, processing records from the input buffers before processing new records from the streams.
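The alignment steps above can be sketched as a single-threaded toy operator. This is a simplified model under stated assumptions, not Flink's implementation: `AligningOperator` is a hypothetical class, elements are `("record", value)` or `("barrier", id)` tuples, the operator forwards records unchanged, and barriers are assumed to arrive in the same order on every input.

```python
from collections import deque


class AligningOperator:
    """Toy multi-input operator that aligns its inputs on snapshot barriers."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.blocked = set()  # inputs that already delivered the current barrier
        self.buffers = [deque() for _ in range(num_inputs)]
        self.output = []

    def on_element(self, channel, element):
        if channel in self.blocked:
            # This input is set aside: buffer instead of processing.
            self.buffers[channel].append(element)
        else:
            self._process(channel, element)

    def _process(self, channel, element):
        kind, payload = element
        if kind == "record":
            self.output.append(("record", payload))
            return
        # A barrier arrived: block this input until all inputs have caught up.
        self.blocked.add(channel)
        if len(self.blocked) == self.num_inputs:
            self.output.append(("barrier", payload))
            self._release()

    def _release(self):
        # Unblock all inputs, then drain buffered elements before new ones.
        self.blocked.clear()
        for channel in range(self.num_inputs):
            buffer = self.buffers[channel]
            while buffer and channel not in self.blocked:
                self._process(channel, buffer.popleft())


op = AligningOperator(num_inputs=2)
op.on_element(0, ("record", "a1"))
op.on_element(0, ("barrier", 1))   # input 0 is now set aside
op.on_element(0, ("record", "a2")) # buffered, belongs to snapshot 2
op.on_element(1, ("record", "b1")) # still processed, belongs to snapshot 1
op.on_element(1, ("barrier", 1))   # last barrier: emit barrier 1, drain buffers
op.on_element(1, ("record", "b2"))
```

In this run the record `a2` is held back until barrier 1 has arrived on both inputs, so the operator's output never mixes snapshot 1 records with snapshot 2 records.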
ref:
https://ci.apache.org/projects/flink/flink-docs-stable/internals/stream_checkpointing.html
https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html
https://kaiwu.lagou.com/course/courseInfo.htm?courseId=81#/detail/pc?id=2049