|NO.Z.00099|----------|BigDataEnd|--|Hadoop&kafka.V06|--|kafka.v06|Kafka Source Code Analysis|Consumer Flow.v02|
1. Subscribing to Topics
### --- Subscribing to Topics
~~~ Let's start with the logic inside the subscribe method.
public void subscribe(Collection<String> topics, ConsumerRebalanceListener listener) {
    // lightweight lock
    acquireAndEnsureOpen();
    try {
        if (topics == null) {
            throw new IllegalArgumentException("Topic collection to subscribe to cannot be null");
        } else if (topics.isEmpty()) {
            // an empty collection is treated as a request to unsubscribe
            this.unsubscribe();
        } else {
            // validate each topic: a null or blank name throws immediately
            for (String topic : topics) {
                if (topic == null || topic.trim().isEmpty())
                    throw new IllegalArgumentException("Topic collection to subscribe to cannot contain null or empty topic");
            }
            // throw if no partition assignor is configured
            throwIfNoAssignorsConfigured();
            log.debug("Subscribed to topic(s): {}", Utils.join(topics, ", "));
            // record the subscription
            this.subscriptions.subscribe(new HashSet<>(topics), listener);
            // update the metadata; if it does not yet cover all of these topics, a forced update is flagged
            metadata.setTopics(subscriptions.groupSubscription());
        }
    } finally {
        release();
    }
}
public void subscribe(Set<String> topics, ConsumerRebalanceListener listener) {
    if (listener == null)
        throw new IllegalArgumentException("RebalanceListener cannot be null");
    // subscribe by topic name; partitions are assigned automatically
    setSubscriptionType(SubscriptionType.AUTO_TOPICS);
    // rebalance listener
    this.listener = listener;
    // update the subscription
    changeSubscription(topics);
}
private void changeSubscription(Set<String> topicsToSubscribe) {
    if (!this.subscription.equals(topicsToSubscribe)) {
        // in AUTO_TOPICS or AUTO_PATTERN mode, this set records all subscribed topics
        this.subscription = topicsToSubscribe;
        // The consumer group elects a leader. On the leader, this set records the topics subscribed by
        // every consumer in the group; on the followers it only holds their own subscribed topics.
        this.groupSubscription.addAll(topicsToSubscribe);
    }
}
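~~~ KafkaConsumer is not thread-safe, so subscribe first takes the lightweight lock. A null topics argument throws an exception; an empty collection starts the unsubscribe logic.
~~~ Each topic name in the collection is then validated, and the method checks that at least one partition assignor is configured. Only then is the subscription recorded.
~~~ The listener defaults to NoOpConsumerRebalanceListener, which does nothing.
~~~ # Lightweight lock:
~~~ KafkaConsumer records the id of the thread currently using it and the re-entry count.
~~~ Its acquire() and release() methods implement a "lightweight lock". It is not a real lock;
~~~ it only detects whether several threads are operating on the KafkaConsumer concurrently.
~~~ Every KafkaConsumer instance owns a SubscriptionState object.
~~~ KafkaConsumer.subscribe calls SubscriptionState.subscribe, which records the subscription in the SubscriptionState;
~~~ subscribing again overwrites the previously subscribed topic set.
~~~ Finally the metadata is updated: if it does not contain the current groupSubscription,
~~~ it is flagged for update (the update logic comes later). Topics on the consumer side do not expire.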
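~~~ The "lightweight lock" described above can be illustrated with a small stand-alone class. This is a minimal sketch, assuming the guard only needs a thread id and a re-entry counter as the text says; the class name SingleThreadGuard and the depth() helper are hypothetical and do not appear in Kafka.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-alone guard: it mirrors the idea described above, not the exact Kafka source.
final class SingleThreadGuard {
    private static final long NO_CURRENT_THREAD = -1L;

    // id of the thread currently using the guarded object
    private final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);
    // re-entry depth of that thread
    private final AtomicInteger refCount = new AtomicInteger(0);

    void acquire() {
        long threadId = Thread.currentThread().getId();
        // Fail fast instead of blocking: a second thread is a usage error, not something to wait for.
        if (threadId != currentThread.get()
                && !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId)) {
            throw new ConcurrentModificationException("Guarded object is not safe for multi-threaded access");
        }
        refCount.incrementAndGet();
    }

    void release() {
        // Only the outermost release frees the guard for other threads.
        if (refCount.decrementAndGet() == 0) {
            currentThread.set(NO_CURRENT_THREAD);
        }
    }

    // Current re-entry depth (illustration only).
    int depth() {
        return refCount.get();
    }
}
```

~~~ The owning thread may call acquire() repeatedly as long as each call is paired with release(); any other thread gets an exception immediately instead of waiting, which is why this is a misuse detector rather than a lock.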
2. Message Consumption
### --- Message consumption
~~~ Next, how poll, the core method of KafkaConsumer, fetches messages. Start with the code below:
### --- poll
public ConsumerRecords<K, V> poll(long timeout) {
    // use the lightweight lock to detect whether another thread is using this KafkaConsumer
    acquireAndEnsureOpen();
    try {
        // a negative timeout throws
        if (timeout < 0)
            throw new IllegalArgumentException("Timeout must not be negative");
        // subscription type NONE throws: the consumer has neither subscribed to a topic nor been assigned partitions
        if (this.subscriptions.hasNoSubscriptionOrUserAssignment())
            throw new IllegalStateException("Consumer is not subscribed to any topics or assigned any partitions");
        // poll for new data until the timeout expires
        long start = time.milliseconds();
        long remaining = timeout;
        do {
            // core call: fetch messages
            Map<TopicPartition, List<ConsumerRecord<K, V>>> records = pollOnce(remaining);
            if (!records.isEmpty()) {
                // before returning the fetched records, we can send off the next round of fetches
                // and avoid block waiting for their responses to enable pipelining while the user
                // is handling the fetched records.
                //
                // NOTE: since the consumed position has already been updated, we must not allow
                // wakeups or any other errors to be triggered prior to returning the fetched records.
                // records were fetched: send the next fetch request now, without blocking and without
                // allowing wakeups, so the user's thread does not block on its next poll
                if (fetcher.sendFetches() > 0 || client.hasPendingRequests())
                    client.pollNoWakeup();
                // return the records after the interceptors have processed them
                if (this.interceptors == null)
                    return new ConsumerRecords<>(records);
                else
                    return this.interceptors.onConsume(new ConsumerRecords<>(records));
            }
            long elapsed = time.milliseconds() - start;
            // stop once the timeout has elapsed
            remaining = timeout - elapsed;
        } while (remaining > 0);
        return ConsumerRecords.empty();
    } finally {
        release();
    }
}
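### --- As the code shows, the real work of poll is done in pollOnce
~~~ poll obtains the available data through pollOnce.
~~~ Use the lightweight lock to detect whether another thread is using the KafkaConsumer.
~~~ Check whether the timeout is negative; if so, throw an exception and stop consuming.
~~~ Check that this consumer has subscribed to topics or been assigned topic-partitions.
~~~ Call pollOnce() to fetch the records.
~~~ Before returning the fetched records, send the next fetch request,
~~~ so that the user's thread does not block inside pollOnce() on the next call.
~~~ If no records become available within the given time (timeout), return an empty result.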
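~~~ The retry-until-timeout structure of poll can be isolated from Kafka entirely. The following is a minimal sketch, assuming a caller-supplied pollOnce function that receives the remaining time budget in milliseconds; TimedPollLoop and its signature are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.LongFunction;

// Hypothetical helper illustrating the retry-until-timeout loop; the names are not from Kafka.
final class TimedPollLoop {
    /**
     * Calls pollOnce repeatedly, passing the remaining time budget, until it
     * returns a non-empty batch or the timeout is used up.
     */
    static <T> List<T> poll(long timeoutMs, LongFunction<List<T>> pollOnce) {
        if (timeoutMs < 0)
            throw new IllegalArgumentException("Timeout must not be negative");
        long start = System.currentTimeMillis();
        long remaining = timeoutMs;
        do {
            List<T> records = pollOnce.apply(remaining);
            if (!records.isEmpty())
                return records;
            // recompute the budget from the start time so rounding errors do not accumulate
            remaining = timeoutMs - (System.currentTimeMillis() - start);
        } while (remaining > 0);
        return Collections.emptyList();
    }
}
```

~~~ Because the loop is a do/while, pollOnce runs at least once even with a timeout of 0, which matches the behavior of the poll code above.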
### --- pollOnce
// besides fetching new data, this also performs the necessary offset-commit and reset-offset work
private Map<TopicPartition, List<ConsumerRecord<K, V>>> pollOnce(long timeout) {
    client.maybeTriggerWakeup();
    // 1. find the GroupCoordinator and connect, join the group, sync the group, auto-commit;
    //    the group rebalances during join and sync
    coordinator.poll(time.milliseconds(), timeout);
    // 2. update the offsets of the assigned topic-partitions (when some of them have no valid offset)
    // fetch positions if we have partitions we're subscribed to that we
    // don't know the offset for
    if (!subscriptions.hasAllFetchPositions())
        updateFetchPositions(this.subscriptions.missingFetchPositions());
    // 3. return the data the fetcher has already fetched
    // if data is available already, return it immediately
    Map<TopicPartition, List<ConsumerRecord<K, V>>> records = fetcher.fetchedRecords();
    if (!records.isEmpty())
        return records;
    // 4. send fetch requests; data is fetched from multiple topic-partitions
    //    (those that have no outstanding request)
    // send any new fetches (won't resend pending fetches)
    fetcher.sendFetches();
    long now = time.milliseconds();
    long pollTimeout = Math.min(coordinator.timeToNextPoll(now), timeout);
    // 5. call poll to send the requests (the low-level request-sending interface)
    client.poll(pollTimeout, now, new PollCondition() {
        @Override
        public boolean shouldBlock() {
            // since a fetch might be completed by the background thread, we need this poll condition
            // to ensure that we do not block unnecessarily in poll()
            return !fetcher.hasCompletedFetches();
        }
    });
    // 6. if the group needs to rebalance, return empty data so the group can stabilize faster
    // after the long poll, we should check whether the group needs to rebalance
    // prior to returning data so that the group can stabilize faster
    if (coordinator.needRejoin())
        return Collections.emptyMap();
    // return the result of the requests
    return fetcher.fetchedRecords();
}
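### --- pollOnce can be broken into six steps, described below
### --- coordinator.poll()
~~~ Looks up the GroupCoordinator address, opens a TCP connection to it, and sends join-group and sync-group.
~~~ Only after that has the consumer really joined a group; at this point it obtains the list of topic-partitions it should consume.
~~~ If auto commit is enabled, the commit also happens in this step.
~~~ In short, for a newly created group the group state goes Empty -> PreparingRebalance -> AwaitingSync -> Stable.
~~~ Step by step:
~~~ look up the GroupCoordinator address and open a TCP connection;
~~~ send a join-group request, after which the group rebalances;
~~~ send a sync-group request; only then has the consumer really joined the group, and the response carries the topic-partition list it should consume;
~~~ if auto commit is enabled, offsets are also committed in this step.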
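~~~ The state sequence of a newly created group can be written down as a tiny enum. This is only a simplified sketch of the path named above; the enum GroupPhase and its next() method are hypothetical, and the real broker-side state machine has more states and transitions (for example, going back to a rebalance when members change).

```java
// Simplified model of the group states named above, covering only the path of a new group.
enum GroupPhase {
    EMPTY, PREPARING_REBALANCE, AWAITING_SYNC, STABLE;

    /** Next phase on the happy path of a newly created group. */
    GroupPhase next() {
        switch (this) {
            case EMPTY:
                return PREPARING_REBALANCE; // a member sends join-group
            case PREPARING_REBALANCE:
                return AWAITING_SYNC;       // join completed, waiting for sync-group
            case AWAITING_SYNC:
                return STABLE;              // sync-group delivered the assignment
            default:
                return STABLE;              // already stable
        }
    }
}
```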
### --- updateFetchPositions()
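~~~ This method updates the fetch-offset information of the topic-partitions assigned to this consumer instance.
~~~ Its purpose is to obtain the position of each of those topic-partitions,
~~~ so that the Fetcher knows from which offset to start fetching each topic-partition.
~~~ # In SubscriptionState, every topic-partition of this consumer instance has a corresponding TopicPartitionState object.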
private void updateFetchPositions(Set<TopicPartition> partitions) {
    // first reset the partitions for which seekToBeginning or seekToEnd was called, setting their fetch position
    // lookup any positions for partitions which are awaiting reset (which may be the
    // case if the user called seekToBeginning or seekToEnd. We do this check first to
    // avoid an unnecessary lookup of committed offsets (which typically occurs when
    // the user is manually assigning partitions and managing their own offsets).
    fetcher.resetOffsetsIfNeeded(partitions);
    if (!subscriptions.hasAllFetchPositions(partitions)) {
        // if we still don't have offsets for the given partitions, then we should either
        // seek to the last committed position or reset using the auto reset policy
        // fetch the committed offset of every assigned partition and store it in the
        // committed field of its TopicPartitionState
        // first refresh commits for all assigned partitions
        coordinator.refreshCommittedOffsetsIfNeeded();
        // if the fetch position is not valid, use the committed offset obtained above as the fetch position
        // then do any offset lookups in case some positions are not known
        fetcher.updateFetchPositions(partitions);
    }
}
### --- The TopicPartitionState object records the following
private static class TopicPartitionState {
    // the offset the Fetcher will fetch from next; the Fetcher needs this value when fetching
    private Long position; // last consumed position
    // the high watermark returned by the last fetch
    private Long highWatermark; // the high watermark from last fetch
    private Long lastStableOffset;
    // the offset up to which the consumer has finished processing; updated when the consumer commits offsets
    private OffsetAndMetadata committed; // last committed position
    // whether the partition is paused
    private boolean paused; // whether this partition has been paused by the user
    // the offset reset strategy for this topic-partition; set to null after the reset so it is not applied again
    private OffsetResetStrategy resetStrategy; // the strategy to use if the offset needs resetting
}
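~~~ The order in which updateFetchPositions decides a partition's starting position can be condensed into one function. This is a minimal sketch under simplifying assumptions: offsets are passed in directly instead of being looked up from the broker, the "none" reset policy (which raises an error) is left out, and FetchPositionResolver with all of its names is hypothetical.

```java
import java.util.OptionalLong;

// Hypothetical names; a sketch of the decision order only, not the Kafka implementation.
final class FetchPositionResolver {
    enum ResetPolicy { EARLIEST, LATEST }

    /**
     * @param pendingReset  reset requested through seekToBeginning/seekToEnd, or null if none
     * @param committed     last committed offset, empty if nothing was committed
     * @param defaultPolicy fallback policy (the role that auto.offset.reset plays)
     * @param logStart      earliest available offset of the partition
     * @param logEnd        offset of the next record to be written to the partition
     */
    static long resolve(ResetPolicy pendingReset, OptionalLong committed,
                        ResetPolicy defaultPolicy, long logStart, long logEnd) {
        if (pendingReset != null)      // 1. an explicit reset wins; no committed-offset lookup is needed
            return offsetFor(pendingReset, logStart, logEnd);
        if (committed.isPresent())     // 2. otherwise resume from the committed offset
            return committed.getAsLong();
        return offsetFor(defaultPolicy, logStart, logEnd); // 3. otherwise apply the fallback policy
    }

    private static long offsetFor(ResetPolicy policy, long logStart, long logEnd) {
        return policy == ResetPolicy.EARLIEST ? logStart : logEnd;
    }
}
```

~~~ Checking the pending reset first is what lets the real code skip the committed-offset lookup for users who manage their own offsets.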
### --- fetcher.fetchedRecords()
~~~ Returns the fetched records and advances the fetch position.
~~~ The committed offset is only updated by an offset commit (with auto commit, that happens in step 1).
public Map<TopicPartition, List<ConsumerRecord<K, V>>> fetchedRecords() {
    Map<TopicPartition, List<ConsumerRecord<K, V>>> fetched = new HashMap<>();
    // max.poll.records caps the number of records returned by a single poll
    int recordsRemaining = maxPollRecords;
    try {
        while (recordsRemaining > 0) {
            if (nextInLineRecords == null || nextInLineRecords.isFetched) {
                // look at the head of the queue without removing it; null if the queue is empty
                CompletedFetch completedFetch = completedFetches.peek();
                if (completedFetch == null) break;
                // parse it into the next nextInLineRecords to process
                nextInLineRecords = parseCompletedFetch(completedFetch);
                completedFetches.poll();
            } else {
                // take records and advance the position
                List<ConsumerRecord<K, V>> records = fetchRecords(nextInLineRecords, recordsRemaining);
                TopicPartition partition = nextInLineRecords.partition;
                if (!records.isEmpty()) {
                    List<ConsumerRecord<K, V>> currentRecords = fetched.get(partition);
                    if (currentRecords == null) {
                        fetched.put(partition, records);
                    } else {
                        // this case shouldn't usually happen because we only send one fetch at a time per partition,
                        // but it might conceivably happen in some rare cases (such as partition leader changes).
                        // we have to copy to a new list because the old one may be immutable
                        List<ConsumerRecord<K, V>> newRecords = new ArrayList<>(records.size() + currentRecords.size());
                        newRecords.addAll(currentRecords);
                        newRecords.addAll(records);
                        fetched.put(partition, newRecords);
                    }
                    recordsRemaining -= records.size();
                }
            }
        }
    } catch (KafkaException e) {
        if (fetched.isEmpty())
            throw e;
    }
    return fetched;
}
private List<ConsumerRecord<K, V>> fetchRecords(PartitionRecords partitionRecords, int maxRecords) {
    if (!subscriptions.isAssigned(partitionRecords.partition)) {
        // this can happen when a rebalance happened before fetched records are returned to the consumer's poll call
        log.debug("Not returning fetched records for partition {} since it is no longer assigned",
                partitionRecords.partition);
    } else {
        // note that the consumed position should always be available as long as the partition is still assigned
        long position = subscriptions.position(partitionRecords.partition);
        if (!subscriptions.isFetchable(partitionRecords.partition)) {
            // this partition cannot be consumed right now, for example because pause() was called on it
            // this can happen when a partition is paused before fetched records are returned to the consumer's poll call
            log.debug("Not returning fetched records for assigned partition {} since it is no longer fetchable",
                    partitionRecords.partition);
        } else if (partitionRecords.nextFetchOffset == position) {
            // take the records of this partition and update nextFetchOffset of partitionRecords
            // (used to check that records are consumed in order)
            List<ConsumerRecord<K, V>> partRecords = partitionRecords.fetchRecords(maxRecords);
            long nextOffset = partitionRecords.nextFetchOffset;
            log.trace("Returning fetched records at offset {} for assigned partition {} and update " +
                    "position to {}", position, partitionRecords.partition, nextOffset);
            // advance the consumed offset (the fetch position)
            subscriptions.position(partitionRecords.partition, nextOffset);
            // compute the lag (the difference between the position and the high watermark); null only when the high watermark is null
            Long partitionLag = subscriptions.partitionLag(partitionRecords.partition, isolationLevel);
            if (partitionLag != null)
                this.sensors.recordPartitionLag(partitionRecords.partition, partitionLag);
            return partRecords;
        } else {
            // these records aren't next in line based on the last consumed position, ignore them
            // they must be from an obsolete request
            log.debug("Ignoring fetched records for {} at offset {} since the current position is {}",
                    partitionRecords.partition, partitionRecords.nextFetchOffset, position);
        }
    }
    partitionRecords.drain();
    return emptyList();
}
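~~~ Two ideas from fetchedRecords and fetchRecords are easy to miss in the full code: a single call hands out at most max.poll.records records and keeps the rest buffered, and a batch whose next offset does not match the current position is dropped as stale. The following is a minimal single-partition sketch of both; BoundedFetchBuffer, Batch and the String payloads are hypothetical simplifications.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical single-partition buffer: it imitates the bounded drain and the stale-batch check only.
final class BoundedFetchBuffer {
    /** A fetched batch whose record offsets start at baseOffset and are consecutive. */
    static final class Batch {
        final long baseOffset;
        final List<String> values;
        int consumed = 0;

        Batch(long baseOffset, List<String> values) {
            this.baseOffset = baseOffset;
            this.values = values;
        }

        long nextOffset() { return baseOffset + consumed; }
        boolean exhausted() { return consumed == values.size(); }
    }

    private final Deque<Batch> completed = new ArrayDeque<>();
    private long position; // offset of the next record to hand to the caller

    BoundedFetchBuffer(long startPosition) { this.position = startPosition; }

    void add(Batch batch) { completed.add(batch); }

    long position() { return position; }

    /** Returns at most maxRecords records; whatever is left stays buffered for the next call. */
    List<String> drain(int maxRecords) {
        List<String> out = new ArrayList<>();
        while (out.size() < maxRecords && !completed.isEmpty()) {
            Batch head = completed.peek();
            if (head.nextOffset() != position) {
                completed.poll(); // not next in line: it must come from an obsolete request
                continue;
            }
            int take = Math.min(maxRecords - out.size(), head.values.size() - head.consumed);
            out.addAll(head.values.subList(head.consumed, head.consumed + take));
            head.consumed += take;
            position += take;     // advance the fetch position as records are handed out
            if (head.exhausted())
                completed.poll();
        }
        return out;
    }
}
```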
### --- fetcher.sendFetches()
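~~~ For every subscribed topic-partition that has no outstanding fetch request,
~~~ a fetch request for that topic-partition is sent.
~~~ The actual sending still happens at node level:
~~~ topic-partitions whose leader is the same node are combined into a single request.
// send fetch requests to the leaders of all subscribed partitions (as long as the leader has no fetch request in flight)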
public int sendFetches() {
    // 1. create the fetch requests
    Map<Node, FetchRequest.Builder> fetchRequestMap = createFetchRequests();
    for (Map.Entry<Node, FetchRequest.Builder> fetchEntry : fetchRequestMap.entrySet()) {
        final FetchRequest.Builder request = fetchEntry.getValue();
        final Node fetchTarget = fetchEntry.getKey();
        log.debug("Sending {} fetch for partitions {} to broker {}", isolationLevel, request.fetchData().keySet(),
                fetchTarget);
        // 2. send the fetch request
        client.send(fetchTarget, request)
                .addListener(new RequestFutureListener<ClientResponse>() {
                    @Override
                    public void onSuccess(ClientResponse resp) {
                        FetchResponse response = (FetchResponse) resp.responseBody();
                        if (!matchesRequestedPartitions(request, response)) {
                            // obviously we expect the broker to always send us valid responses, so this check
                            // is mainly for test cases where mock fetch responses must be manually crafted.
                            log.warn("Ignoring fetch response containing partitions {} since it does not match " +
                                    "the requested partitions {}", response.responseData().keySet(),
                                    request.fetchData().keySet());
                            return;
                        }
                        Set<TopicPartition> partitions = new HashSet<>(response.responseData().keySet());
                        FetchResponseMetricAggregator metricAggregator = new FetchResponseMetricAggregator(sensors, partitions);
                        for (Map.Entry<TopicPartition, FetchResponse.PartitionData> entry : response.responseData().entrySet()) {
                            TopicPartition partition = entry.getKey();
                            long fetchOffset = request.fetchData().get(partition).fetchOffset;
                            FetchResponse.PartitionData fetchData = entry.getValue();
                            log.debug("Fetch {} at offset {} for partition {} returned fetch data {}",
                                    isolationLevel, fetchOffset, partition, fetchData);
                            completedFetches.add(new CompletedFetch(partition, fetchOffset, fetchData, metricAggregator,
                                    resp.requestHeader().apiVersion()));
                        }
                        sensors.fetchLatency.record(resp.requestLatencyMs());
                    }
                    @Override
                    public void onFailure(RuntimeException e) {
                        log.debug("Fetch request {} to {} failed", request.fetchData(), fetchTarget, e);
                    }
                });
    }
    return fetchRequestMap.size();
}
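~~~ # createFetchRequests():
~~~ Creates fetch requests for all subscribed topic-partitions (those that have no request still being processed);
~~~ the requests are still created at node level.
~~~ # client.send():
~~~ Sends the fetch request and registers a listener. If the request succeeds,
~~~ the result is added to completedFetches.
~~~ Entries are added to completedFetches per topic-partition, which simplifies the later processing.
~~~ As this shows, every call sends fetch requests for all topic-partitions that can currently be fetched.
~~~ The data fetched by one call to fetcher.sendFetches may take several pollOnce rounds to process completely.
~~~ Because the next fetch is already sent before poll returns and its response can be completed in the background, the user's processing thread is blocked as little as possible.
~~~ If the Fetcher has no data ready to process, however, the user's thread does block inside poll.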
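~~~ The node-level grouping performed by createFetchRequests can be sketched with plain collections. This is a minimal illustration, assuming partitions and nodes are identified by strings and that the set of partitions with an outstanding request is already known; FetchRequestPlanner and its plan method are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical types: partitions and nodes are plain strings in this sketch.
final class FetchRequestPlanner {
    /**
     * Builds one request (represented as a list of partitions) per leader node,
     * skipping partitions that still have an outstanding fetch.
     */
    static Map<String, List<String>> plan(Map<String, String> leaderByPartition, Set<String> pending) {
        Map<String, List<String>> requests = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : leaderByPartition.entrySet()) {
            String partition = entry.getKey();
            String leader = entry.getValue();
            if (pending.contains(partition))
                continue; // a fetch for this partition is still outstanding
            // partitions that share a leader end up in the same request
            requests.computeIfAbsent(leader, node -> new ArrayList<>()).add(partition);
        }
        return requests;
    }
}
```

~~~ The size of the returned map corresponds to the number of requests sent, which is also what sendFetches returns.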
### --- client.poll()
~~~ Calls the interface provided by the underlying NetworkClient to send the queued requests.
### --- coordinator.needRejoin()
~~~ If the list of topic-partitions assigned to the current instance has changed,
~~~ the consumer group needs to rebalance.