Apache Flink: Introduction

1 Introduction







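To support the big data ecosystem more broadly, Flink also implements many Connector subprojects, such as integration with Hadoop HDFS. Flink has also announced support for Tachyon, S3, and MapRFS.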

True stream processing engine

Stream processing is a rapidly growing data processing paradigm for real-time analytics and event-driven applications. Apache Flink is a distributed data processing framework for stateful computations over unbounded and bounded data streams.

2 Features of Flink

  • Next-generation engine for stream processing
  • Low latency & high throughput
  • Robust fault tolerance: applications restart exactly where they failed
  • Rich set of libraries
  • Application state is re-scalable: resources can be added while the application is running
  • Exactly-once semantics
  • Event time handling
  • State management

3 Batch Vs Stream

Streaming: a type of data processing engine designed with infinite data sets in mind.

In batch processing, we first collect the data and then push it for processing. Since the data being processed is accumulated in advance, it is bounded data. This mode of processing is useful for historical fact-finding. It is a query-driven strategy.

In stream processing, we feed the data to the processing engine as soon as it arrives. Since the data is not accumulated beforehand, it is unbounded data. This mode of processing is useful for live fact-finding. It is a data-driven strategy.

Stream processing can be either stateless or stateful.

Stateful stream processing means that state is shared between events (stream entities), so past events can influence the way current events are processed.
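The difference between stateless and stateful processing can be illustrated with a minimal plain-Python sketch. It does not use the Flink API; the function names and the sample events are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def stateless_double(events):
    """Stateless: each output depends only on the current event."""
    return [(key, value * 2) for key, value in events]


def stateful_running_sum(events):
    """Stateful: a per-key running total is remembered between events."""
    state = defaultdict(int)  # hypothetical in-memory state, one entry per key
    out = []
    for key, value in events:
        state[key] += value  # past events influence the current result
        out.append((key, state[key]))
    return out


events = [("a", 1), ("b", 5), ("a", 2), ("a", 3)]
print(stateless_double(events))
print(stateful_running_sum(events))
```

In the stateful version, the result for the third event depends on the first one because both share the key "a".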

4 Flink Ecosystem

Flink’s architecture is based on a distributed dataflow programming model where data is processed as a series of transformations on distributed data streams.

5 Components of Flink

  1. Client: Submits jobs (as a Data Flow Graph, DFG).
  2. Job manager: Accepts the job. It transforms the DFG submitted by the client into an ExecutionGraph for execution. It has the following services:
     • Dispatcher: Starts a new Job Master for each submitted job. It also powers a web dashboard and an HTTP endpoint that provide information about job executions.
     • Job Master: Manages the execution of a single job.
     • Resource Manager: Manages the task slots of the Task Managers and allocates them to jobs for execution.
     • Checkpoint coordinator: Ensures fault tolerance.
  3. Task manager: Executes the tasks (there can be one or more). Task Managers contain slots. A slot is an abstraction that represents the CPU and memory resources available on a Task Manager. Slots are the smallest unit of resource scheduling. Each task runs within a task slot.
  4. Actor system: Part of the Akka framework. Provides services such as scheduling, configuration, logging, etc.

A Flink application is represented as a job graph, where the nodes are operators and the links determine the inputs and outputs of the operators. A Task Manager has one or more task slots that provide an execution environment for the tasks.

Apache Flink is built on Kappa Architecture.

6 Data Transfer

A TaskManager uses a pool of network buffers to send and receive data. The sender task serializes the records into a byte buffer and puts the buffer into a queue. The receiver task takes the buffer from the queue and deserializes the incoming records. If the tasks are in the same process, there is no need for a network connection.
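The serialize, enqueue, dequeue, deserialize cycle described above can be sketched in plain Python. This is only an illustration of the idea, not Flink's network stack: JSON stands in for Flink's serializers, a `queue.Queue` stands in for the buffer queue, and the `send` and `receive` helpers are hypothetical names.

```python
import json
import queue


def send(records, channel):
    """Sender task: serialize the records into one byte buffer and enqueue it."""
    buffer = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    channel.put(buffer)


def receive(channel):
    """Receiver task: dequeue one buffer and deserialize its records."""
    buffer = channel.get()
    return [json.loads(line) for line in buffer.decode("utf-8").split("\n")]


channel = queue.Queue()  # stand-in for the queue between sender and receiver
send([{"user": "a", "clicks": 3}, {"user": "b", "clicks": 1}], channel)
received = receive(channel)
print(received)
```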

6.1 Flink Data Exchange Strategy

Forward | Broadcast | Key-based | Random

Flink also uses credit-based flow control and task chaining for data transfer.
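As a rough illustration of the four exchange strategies listed above, the following plain-Python sketch returns the receiver indices a record would be sent to. The function names and signatures are hypothetical, and `zlib.crc32` is only a stand-in for whatever hash function the engine actually uses.

```python
import random
import zlib


def forward(record, sender_index, num_receivers):
    """Forward: send to the receiver with the same index as the sender."""
    return [sender_index]


def broadcast(record, sender_index, num_receivers):
    """Broadcast: send a copy to every receiver."""
    return list(range(num_receivers))


def key_based(record, sender_index, num_receivers, key=lambda r: r[0]):
    """Key-based: records with the same key always reach the same receiver."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return [zlib.crc32(str(key(record)).encode("utf-8")) % num_receivers]


def random_strategy(record, sender_index, num_receivers, rng=random):
    """Random: pick any receiver, which spreads the load evenly on average."""
    return [rng.randrange(num_receivers)]


record = ("user-42", 1)
print(forward(record, 2, 4), broadcast(record, 2, 4))
print(key_based(record, 2, 4), random_strategy(record, 2, 4))
```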

7 Program flow

8 Operators

In Flink, each function such as map, filter, or reduce is implemented as a long-running operator. The major operators are:

Map | FlatMap | Reduce | Filter | Iterate | Window | KeyBy

A task is one parallel instance of an Operator or Operator chain.
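To make the operator names more concrete, here is a small word-count pipeline in plain Python that mimics FlatMap, Map, Filter, and KeyBy followed by Reduce. It is a single-process sketch built on generators, not the Flink API; `flat_map` and `key_by_reduce` are hypothetical helpers.

```python
def flat_map(fn, stream):
    """FlatMap: each input element yields zero or more output elements."""
    for item in stream:
        yield from fn(item)


def key_by_reduce(key_fn, reduce_fn, stream):
    """KeyBy + Reduce: emit the rolling reduced value for each key."""
    state = {}  # last reduced value per key
    for item in stream:
        k = key_fn(item)
        state[k] = item if k not in state else reduce_fn(state[k], item)
        yield state[k]


lines = ["to be", "or not to be"]
words = flat_map(str.split, lines)               # FlatMap: line -> words
pairs = map(lambda w: (w, 1), words)             # Map: word -> (word, 1)
pairs = filter(lambda p: p[0] != "or", pairs)    # Filter: drop the word "or"
counts = list(
    key_by_reduce(lambda p: p[0], lambda a, b: (a[0], a[1] + b[1]), pairs)
)
print(counts)
```

Each stage consumes the previous one lazily, which loosely mirrors how records flow from operator to operator one at a time.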

9 Window

Windows split a stream into “buckets” of finite size, over which computations can be applied.

Tumbling | Sliding | Session | Global
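How an element is assigned to tumbling, sliding, and session windows can be sketched with integer timestamps in plain Python. The function names are hypothetical and windows are represented as half-open `(start, end)` intervals; a global window would simply place every element in one window, so it is omitted.

```python
def tumbling_window(ts, size):
    """Tumbling: fixed-size, non-overlapping; each element is in one window."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Sliding: fixed-size, overlapping; an element can be in several windows."""
    last_start = ts - ts % slide
    return [(s, s + size) for s in range(last_start, ts - size, -slide)]


def session_windows(timestamps, gap):
    """Session: group timestamps that are less than `gap` apart."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)   # continues the current session
        else:
            sessions.append([ts])     # inactivity gap exceeded: new session
    return sessions


print(tumbling_window(7, 5))
print(sliding_windows(7, 10, 5))
print(session_windows([1, 2, 10, 11, 30], 5))
```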

Trigger: A Trigger determines when a window should be evaluated to emit its results.

Evictors: An Evictor can remove elements from a window after the Trigger fires, either before or after the window function is applied.

Watermarks: Watermark(t) indicates that all data with a timestamp not greater than t has arrived and no more such data will come, so any window that ends at or before t can be safely triggered and destroyed.
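The interplay of watermarks and windows can be sketched in plain Python. This sketch assumes tumbling event-time windows, a watermark equal to the largest timestamp seen minus a fixed allowed delay, and that late events are dropped; `run`, `size`, and `max_delay` are hypothetical names, and none of this is the Flink API.

```python
def run(events, size, max_delay):
    """events: (timestamp, value) pairs in arrival order.

    Returns the fired windows as (window_start, sum) in firing order.
    """
    windows = {}               # window_start -> running sum
    fired = []
    watermark = float("-inf")
    for ts, value in events:
        start = ts - ts % size
        if start + size - 1 <= watermark:
            continue           # late event: its window has already fired
        windows[start] = windows.get(start, 0) + value
        watermark = max(watermark, ts - max_delay)
        # Fire every window whose last timestamp is covered by the watermark.
        for s in sorted(w for w in windows if w + size - 1 <= watermark):
            fired.append((s, windows.pop(s)))
    return fired


events = [(1, 10), (4, 20), (12, 5), (3, 7), (16, 1), (2, 100), (25, 3)]
print(run(events, size=10, max_delay=5))
```

The out-of-order event `(3, 7)` is still counted because the watermark has not yet passed its window, while `(2, 100)` arrives after the window has fired and is discarded.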

10 Time Notions

  • Processing time: The clock time of the machine that processes the element. This was the default in Flink before version 1.12; newer versions default to event time.
  • Event time: The time when an event is generated at the source. It is the actual time of the event.
  • Ingestion time: The time when Flink receives an event for processing.

11 Fault tolerance

State: State can be thought of as memory in an operator that remembers information about past input. With the RocksDB backend, each Task Manager maintains its own RocksDB files and a backup of its state. RocksDB is a good backend option when:

  • The state of our job is larger than what can fit in local memory
  • We need incremental checkpointing
  • We need predictable latency

Flink supports two types of state: operator state and keyed state.

Checkpoints: At regular intervals, Task Managers take a backup of their state and store it in a durable store (such as HDFS). These backups are called checkpoints.
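The idea of periodic checkpoints and recovery can be sketched in plain Python. The sketch assumes a replayable input (a list), a local JSON file as the durable store, and a snapshot that holds both the state and the input offset; `process` and its parameters are hypothetical and this is not Flink's checkpointing mechanism.

```python
import json
import os
import tempfile


def process(events, path, checkpoint_every, fail_at=None):
    """Count events per key, snapshotting state and offset to `path`."""
    if os.path.exists(path):               # recover from the last checkpoint
        with open(path) as f:
            snapshot = json.load(f)
        offset, counts = snapshot["offset"], snapshot["counts"]
    else:
        offset, counts = 0, {}
    for i in range(offset, len(events)):
        if i == fail_at:
            raise RuntimeError("simulated failure")
        counts[events[i]] = counts.get(events[i], 0) + 1
        if (i + 1) % checkpoint_every == 0:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"offset": i + 1, "counts": counts}, f)
            os.replace(tmp, path)          # publish the snapshot atomically
    return counts


events = ["a", "b", "a", "c", "a", "b", "c"]
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "checkpoint.json")
    try:
        process(events, path, checkpoint_every=2, fail_at=5)
    except RuntimeError:
        pass                               # the job crashed mid-stream
    recovered = process(events, path, checkpoint_every=2)
print(recovered)
```

Because the offset is stored together with the state, the work done after the last checkpoint is replayed exactly once after the restart, and the final counts match a run without any failure.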

Savepoints: When a user manually triggers a backup of the job state through an API, it is called a savepoint.

State persisted in the filesystem has the following file pattern.


12 Scalability

In Flink, re-scaling is done by redistributing the state across worker machines. When Flink receives a re-scaling request, it takes the saved state from HDFS (a checkpoint) and redistributes it to the new number of operator instances. This requires a job restart.
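Redistributing keyed state can be sketched in plain Python using key groups: every key is hashed into a fixed number of groups, and contiguous ranges of groups are assigned to operator instances. The value of `MAX_PARALLELISM`, the use of `zlib.crc32` as the hash, and the helper names are assumptions made for this sketch rather than Flink's actual implementation.

```python
import zlib

MAX_PARALLELISM = 128  # number of key groups; an assumed value for this sketch


def key_group(key):
    """Map a key to a key group; the mapping never changes when rescaling."""
    return zlib.crc32(key.encode("utf-8")) % MAX_PARALLELISM


def operator_index(group, parallelism):
    """Assign contiguous ranges of key groups to operator instances."""
    return group * parallelism // MAX_PARALLELISM


def redistribute(checkpoint, parallelism):
    """checkpoint: key -> state. Returns one state dict per operator instance."""
    instances = [{} for _ in range(parallelism)]
    for key, value in checkpoint.items():
        instances[operator_index(key_group(key), parallelism)][key] = value
    return instances


checkpoint = {"user-%d" % i: i for i in range(10)}   # state saved at parallelism 2
rescaled = redistribute(checkpoint, parallelism=3)   # restart with 3 instances
print([sorted(part) for part in rescaled])
```

Since a key's group is fixed, rescaling only changes which instance owns each group, so every key's state ends up on exactly one instance.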

The adaptive scheduler adjusts a job's parallelism to the slots available, which lets a Flink application scale up or down as the amount of data to process increases or decreases.

Flink also supports a reactive mode, in which a job always uses all the Task Managers available to it and rescales as Task Managers are added or removed. In a K8s-based deployment, a Horizontal Pod Autoscaler can be used to add and remove them.

13 Data & Task Parallelism

14 Spark & Flink

The entries below read Spark → Flink.

RDD → Dataflows

Spark MLlib → FlinkML

Custom Memory Management → Automatic Memory Manager

DAG-based Execution Engine → Cyclic Dependency Graph

Second latency → Sub-second latency

Explicit Backpressure → Built-in backpressure setting

15 Deployment Mode

Session — Shared job manager

Application — Dedicated Job Manager


Deployments also have active and passive modes in terms of resource readiness.


posted @ 2023-06-11 22:09  ImreW