|NO.Z.00049|——————————|BigDataEnd|——|Hadoop&Flink.V04|——|Flink.v04|Flink Connector|kafka|Source Code Walkthrough|Source Notes.V2|

I. Source code extraction notes
### --- Starting the consumer directly

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 160~161
    /** The startup mode for the consumer (default is {@link StartupMode#GROUP_OFFSETS}). */
    private StartupMode startupMode = StartupMode.GROUP_OFFSETS;
### --- The StartupMode enum has five values:

~~~     GROUP_OFFSETS: start from the offsets committed by the consumer group, stored in
~~~     ZooKeeper or in the Kafka brokers; this is the default.
~~~     EARLIEST: start from the earliest available offset
~~~     LATEST: start from the latest offset
~~~     TIMESTAMP: start from a user-provided timestamp
~~~     SPECIFIC_OFFSETS: start from user-provided offsets (see the sketches below)
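### --- Example: selecting the startup mode (illustrative, not from the Flink source)

~~~     A minimal sketch showing how each startup mode is chosen through the public
~~~     FlinkKafkaConsumer API; the broker address, group id and topic name are hypothetical.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class StartupModeExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
            props.setProperty("group.id", "demo-group");              // hypothetical group

            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props);

            // GROUP_OFFSETS (the default): resume from the group's committed offsets
            consumer.setStartFromGroupOffsets();

            // EARLIEST / LATEST: ignore committed offsets and start from either end of the log
            // consumer.setStartFromEarliest();
            // consumer.setStartFromLatest();

            // TIMESTAMP: start from the first record whose timestamp is >= the given epoch millis
            // consumer.setStartFromTimestamp(1_600_000_000_000L);

            // SPECIFIC_OFFSETS: see the dedicated sketch further below
        }
    }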
### --- Based on the startup mode, determine where consumption starts; the partition discoverer then fetches the initial seed partitions

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 583~605
            LOG.info("Consumer subtask {} will start reading {} partitions with offsets in restored state: {}",
                getRuntimeContext().getIndexOfThisSubtask(), subscribedPartitionsToStartOffsets.size(), subscribedPartitionsToStartOffsets);
        } else {
            // use the partition discoverer to fetch the initial seed partitions,
            // and set their initial offsets depending on the startup mode.
            // for SPECIFIC_OFFSETS and TIMESTAMP modes, we set the specific offsets now;
            // for other modes (EARLIEST, LATEST, and GROUP_OFFSETS), the offset is lazily determined
            // when the partition is actually read.
            switch (startupMode) {
                case SPECIFIC_OFFSETS:
                    if (specificStartupOffsets == null) {
                        throw new IllegalStateException(
                            "Startup mode for the consumer set to " + StartupMode.SPECIFIC_OFFSETS +
                                ", but no specific offsets were specified.");
                    }

                    for (KafkaTopicPartition seedPartition : allPartitions) {
                        Long specificOffset = specificStartupOffsets.get(seedPartition);
                        if (specificOffset != null) {
                            // since the specified offsets represent the next record to read, we subtract
                            // it by one so that the initial state of the consumer will be correct
                            subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
                        } else {
~~~     If the startup mode is SPECIFIC_OFFSETS:
~~~     Error case: no specific starting offsets were configured
### --- Normal case: look up the configured starting offset for each partition

Long specificOffset = specificStartupOffsets.get(seedPartition);
~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 163~164
    /** Specific startup offsets; only relevant when startup mode is {@link StartupMode#SPECIFIC_OFFSETS}. */
    private Map<KafkaTopicPartition, Long> specificStartupOffsets;

~~~     # Lines 599~610
                    for (KafkaTopicPartition seedPartition : allPartitions) {
                        Long specificOffset = specificStartupOffsets.get(seedPartition);
                        if (specificOffset != null) {   // an offset was configured for this partition: start from it
                            // since the specified offsets represent the next record to read, we subtract
                            // it by one so that the initial state of the consumer will be correct
                            subscribedPartitionsToStartOffsets.put(seedPartition, specificOffset - 1);
                        } else {    // no offset configured for this partition: fall back to GROUP_OFFSET behaviour
                            // default to group offset behaviour if the user-provided specific offsets
                            // do not contain a value for this partition
                            subscribedPartitionsToStartOffsets.put(seedPartition, KafkaTopicPartitionStateSentinel.GROUP_OFFSET);
                        }
                    }
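### --- Example: supplying SPECIFIC_OFFSETS from the user side (illustrative, not from the Flink source)

~~~     A minimal sketch: the map passed to setStartFromSpecificOffsets becomes the
~~~     specificStartupOffsets field above; topic name, partitions and offsets are hypothetical.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
    import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;

    public class SpecificOffsetsExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical
            props.setProperty("group.id", "demo-group");              // hypothetical

            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props);

            // each value is the offset of the NEXT record to read from that partition,
            // which is why the source above stores specificOffset - 1 as the initial state
            Map<KafkaTopicPartition, Long> specificOffsets = new HashMap<>();
            specificOffsets.put(new KafkaTopicPartition("demo-topic", 0), 23L);
            specificOffsets.put(new KafkaTopicPartition("demo-topic", 1), 31L);
            // partition 2 is left out on purpose: it falls back to GROUP_OFFSET behaviour

            consumer.setStartFromSpecificOffsets(specificOffsets);
        }
    }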
### --- The run() method:

~~~     Check whether the map holding the subscribed partitions and their starting offsets is empty.
~~~     Counters record the numbers of successful and failed Kafka offset commits.
~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 212~216

    /** Counter for successful Kafka offset commits. */
    private transient Counter successfulCommits;

    /** Counter for failed Kafka offset commits. */
    private transient Counter failedCommits;
### --- Get the index of the current subtask

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Line 707
        final int subtaskIndex = this.getRuntimeContext().getIndexOfThisSubtask();
### --- Register a commit callback: on success the successful-commit counter is incremented; on failure the failed-commit counter is incremented

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 709~720
        this.offsetCommitCallback = new KafkaCommitCallback() {
            @Override
            public void onSuccess() {
                successfulCommits.inc();
            }

            @Override
            public void onException(Throwable cause) {
                LOG.warn(String.format("Consumer subtask %d failed async Kafka commit.", subtaskIndex), cause);
                failedCommits.inc();
            }
        };
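### --- Example: when the commit callback fires (illustrative, not from the Flink source)

~~~     A hedged sketch: with checkpointing enabled and setCommitOffsetsOnCheckpoints(true)
~~~     (the default), offsets are committed back to Kafka when a checkpoint completes, and each
~~~     commit outcome drives the two counters through the callback above. Broker address,
~~~     group id, topic name and checkpoint interval are hypothetical.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class CommitOnCheckpointExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(10_000); // offsets are committed in notifyCheckpointComplete

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical
            props.setProperty("group.id", "demo-group");              // hypothetical

            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("demo-topic", new SimpleStringSchema(), props);

            // commit back to Kafka on checkpoint completion; successes and failures
            // are counted by successfulCommits / failedCommits via the callback above
            consumer.setCommitOffsetsOnCheckpoints(true);

            env.addSource(consumer).print();
            env.execute("commit-on-checkpoint-example");
        }
    }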
### --- Next, check whether the subscribedPartitionsToStartOffsets map is empty; if it is, mark the source as temporarily idle.

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 722~727
        // mark the subtask as temporarily idle if there are no initial seed partitions;
        // once this subtask discovers some partitions and starts collecting records, the subtask's
        // status will automatically be triggered back to be active.
        if (subscribedPartitionsToStartOffsets.isEmpty()) {
            sourceContext.markAsTemporarilyIdle();
        }
### --- Create a KafkaFetcher, which uses the KafkaConsumer API to pull data from the Kafka brokers

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 731~743
        // from this point forward:
        //   - 'snapshotState' will draw offsets from the fetcher,
        //     instead of being built from `subscribedPartitionsToStartOffsets`
        //   - 'notifyCheckpointComplete' will start to do work (i.e. commit offsets to
        //     Kafka through the fetcher, if configured to do so)
        this.kafkaFetcher = createFetcher(
                sourceContext,
                subscribedPartitionsToStartOffsets,
                watermarkStrategy,
                (StreamingRuntimeContext) getRuntimeContext(),
                offsetCommitMode,
                getRuntimeContext().getMetricGroup().addGroup(KAFKA_CONSUMER_METRICS_GROUP),
                useMetrics);
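### --- Example: where the watermarkStrategy argument comes from (illustrative, not from the Flink source)

~~~     A hedged sketch: in connector versions that pass a WatermarkStrategy into createFetcher
~~~     (as above), the strategy is attached directly to the consumer so that watermarks are
~~~     generated per Kafka partition inside the fetcher. "consumer" is the FlinkKafkaConsumer
~~~     built in the earlier sketches; the five-second bound is an arbitrary example value.

    import java.time.Duration;

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;

    // attach an event-time strategy to the consumer itself (per-partition watermarking)
    consumer.assignTimestampsAndWatermarks(
            WatermarkStrategy.<String>forBoundedOutOfOrderness(Duration.ofSeconds(5)));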
### --- Based on the partition discovery interval, decide whether to start the periodic partition discovery task

~~~     If no partition discovery interval is configured, only the data fetch loop is started;
~~~     otherwise, both the periodic partition discovery task and the fetch loop are started.
~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 749~759
        // depending on whether we were restored with the current state version (1.3),
        // remaining logic branches off into 2 paths:
        //  1) New state - partition discovery loop executed as separate thread, with this
        //                 thread running the main fetcher loop
        //  2) Old state - partition discovery is disabled and only the main fetcher loop is executed
        if (discoveryIntervalMillis == PARTITION_DISCOVERY_DISABLED) {
            kafkaFetcher.runFetchLoop();
        } else {
            runWithPartitionDiscovery();
        }
    }
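### --- Example: enabling periodic partition discovery (illustrative, not from the Flink source)

~~~     A hedged sketch: discoveryIntervalMillis is driven by the consumer property
~~~     flink.partition-discovery.interval-millis; if the property is absent it stays at
~~~     PARTITION_DISCOVERY_DISABLED and only runFetchLoop() is executed. The interval, broker
~~~     address, group id and topic pattern below are hypothetical.

    import java.util.Properties;
    import java.util.regex.Pattern;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class PartitionDiscoveryExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical
            props.setProperty("group.id", "demo-group");              // hypothetical

            // a positive value enables the periodic discovery loop (runWithPartitionDiscovery)
            props.setProperty("flink.partition-discovery.interval-millis", "30000");

            // subscribing with a pattern lets discovery pick up new matching topics as well
            FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                    Pattern.compile("demo-topic-.*"), new SimpleStringSchema(), props);
        }
    }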
### --- Source of the fetch loop:

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 761~777
    private void runWithPartitionDiscovery() throws Exception {
        final AtomicReference<Exception> discoveryLoopErrorRef = new AtomicReference<>();
        createAndStartDiscoveryLoop(discoveryLoopErrorRef); // start the partition discovery thread

        kafkaFetcher.runFetchLoop();            // run the loop that pulls data from the Kafka brokers

        // make sure that the partition discoverer is waked up so that
        // the discoveryLoopThread exits
        partitionDiscoverer.wakeup();   // wake up the partition discoverer so the discovery loop thread can exit
        joinDiscoveryLoopThread();  // wait for the partition discovery thread to finish

        // rethrow any fetcher errors
        final Exception discoveryLoopError = discoveryLoopErrorRef.get();
        if (discoveryLoopError != null) {
            throw new RuntimeException(discoveryLoopError);
        }
    }
### --- createAndStartDiscoveryLoop: the method that starts the partition discovery task:

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 786~797
    private void createAndStartDiscoveryLoop(AtomicReference<Exception> discoveryLoopErrorRef) {
        discoveryLoopThread = new Thread(() -> {
            try {
                // --------------------- partition discovery loop ---------------------

                // throughout the loop, we always eagerly check if we are still running before
                // performing the next operation, so that we can escape the loop as soon as possible

                while (running) {
                    if (LOG.isDebugEnabled()) {
                        LOG.debug("Consumer subtask {} is trying to discover new partitions ...", getRuntimeContext().getIndexOfThisSubtask());
                    }
### --- Try to discover new partitions:

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 799~806
                    final List<KafkaTopicPartition> discoveredPartitions;
                    try {
                        discoveredPartitions = partitionDiscoverer.discoverPartitions();
                    } catch (AbstractPartitionDiscoverer.WakeupException | AbstractPartitionDiscoverer.ClosedException e) {
                        // the partition discoverer may have been closed or woken up before or during the discovery;
                        // this would only happen if the consumer was canceled; simply escape the loop
                        break;
                    }
### --- Add the newly discovered partitions to the kafkaFetcher

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 808~822
                    // no need to add the discovered partitions if we were closed during the meantime
                    if (running && !discoveredPartitions.isEmpty()) {
                        kafkaFetcher.addDiscoveredPartitions(discoveredPartitions);
                    }

                    // do not waste any time sleeping if we're not running anymore
                    if (running && discoveryIntervalMillis != 0) {
                        try {
                            Thread.sleep(discoveryIntervalMillis);
                        } catch (InterruptedException iex) {
                            // may be interrupted if the consumer was canceled midway; simply escape the loop
                            break;
                        }
                    }
                }
### --- Record discovery errors and start the partition discovery thread

~~~     # Source excerpt: FlinkKafkaConsumerBase.java
~~~     # Lines 823~835
            } catch (Exception e) {
                discoveryLoopErrorRef.set(e);
            } finally {
                // calling cancel will also let the fetcher loop escape
                // (if not running, cancel() was already called)
                if (running) {
                    cancel();
                }
            }
        }, "Kafka Partition Discovery for " + getRuntimeContext().getTaskNameWithSubtasks());

        discoveryLoopThread.start();
    }
### --- The call to partitionDiscoverer.discoverPartitions(), i.e. how partition discovery is carried out.

~~~     # Source excerpt: AbstractPartitionDiscoverer.java
~~~     # Lines 118~163
    /**
     * Execute a partition discovery attempt for this subtask.
     * This method lets the partition discoverer update what partitions it has discovered so far.
     *
     * @return List of discovered new partitions that this subtask should subscribe to.
     */
    public List<KafkaTopicPartition> discoverPartitions() throws WakeupException, ClosedException {
        if (!closed && !wakeup) { // make sure the discoverer has not been closed or woken up
            try {
                List<KafkaTopicPartition> newDiscoveredPartitions;

                // (1) get all possible partitions, based on whether we are subscribed to fixed topics or a topic pattern
                if (topicsDescriptor.isFixedTopics()) { // fixed topics configured: get the partitions of exactly those topics
                    newDiscoveredPartitions = getAllPartitionsForTopics(topicsDescriptor.getFixedTopics());
                } else { // otherwise list all topics and keep only those matching the pattern
                    List<String> matchedTopics = getAllTopics();

                    // retain topics that match the pattern
                    Iterator<String> iter = matchedTopics.iterator();
                    while (iter.hasNext()) {
                        if (!topicsDescriptor.isMatchingTopic(iter.next())) {
                            iter.remove(); // drop topics that do not match the pattern
                        }
                    }

                    if (matchedTopics.size() != 0) {    // if any topics matched, get their partitions
                        // get partitions only for matched topics
                        newDiscoveredPartitions = getAllPartitionsForTopics(matchedTopics);
                    } else { // otherwise set newDiscoveredPartitions to null
                        newDiscoveredPartitions = null;
                    }
                }
                
                // (2) eliminate partition that are old partitions or should not be subscribed by this subtask
                if (newDiscoveredPartitions == null || newDiscoveredPartitions.isEmpty()) {
                    throw new RuntimeException("Unable to retrieve any partitions with KafkaTopicsDescriptor: " + topicsDescriptor);
                } else {
                    Iterator<KafkaTopicPartition> iter = newDiscoveredPartitions.iterator();
                    KafkaTopicPartition nextPartition;
                    while (iter.hasNext()) {          // setAndCheckDiscoveredPartition records the partition as discovered
                        nextPartition = iter.next();  // and returns whether this subtask should consume it
                        if (!setAndCheckDiscoveredPartition(nextPartition)) {
                            iter.remove();
                        }
                    }
                }