RocketMQ之消息接收源码分析
一、概述
对于任何一款消息中间件而言,消费者客户端一般有两种方式从消息中间件获取消息并消费:
- Push方式:由消息中间件(MQ消息服务器代理)主动地将消息推送给消费者。采用Push方式,可以尽可能实时地将消息发送给消费者进行消费。但是,在消费者处理消息的能力较弱的时候(比如,消费者端的业务系统处理一条消息的流程比较复杂,其中的调用链路比较多,导致消费时间比较久,概括起来说就是"慢消费问题"),而MQ又不断地向消费者Push消息,消费者端的缓冲区可能会溢出,导致异常;
- Pull方式:由消费者客户端主动向消息中间件(MQ消息服务器代理)拉取消息。采用Pull方式,如何设置Pull消息的频率需要重点去考虑。举个例子来说,可能1分钟内连续来了1000条消息,然后2小时内没有新消息产生(概括起来说就是"消息延迟与忙等待")。如果每次Pull的时间间隔比较久,会增加消息的延迟,即消息到达消费者的时间加长,MQ中消息的堆积量变大;若每次Pull的时间间隔较短,但是在一段时间内MQ中并没有任何消息可以消费,那么会产生很多无效的Pull请求的RPC开销,影响MQ整体的网络性能。
从严格意义上说,RocketMQ并没有实现真正的消息消费的Push模式,而是对Pull模式进行了一定的优化。
一方面,在Consumer端开启后台独立的线程PullMessageService,不断地从阻塞队列pullRequestQueue中获取PullRequest请求,并通过网络通信模块发送Pull消息的RPC请求给Broker端。
另一方面,Consumer端后台还有另外一个独立线程RebalanceService,根据Topic中消息队列个数和当前消费组内消费者个数进行负载均衡,将产生的对应PullRequest实例放入阻塞队列pullRequestQueue中。这里算是比较典型的生产者-消费者模型,实现了准实时的自动消息拉取。然后,再根据业务反馈是否成功消费来推动消费进度。
在Broker端,PullMessageProcessor业务处理器收到Pull消息的RPC请求后,通过MessageStore实例从commitLog获取消息。如果第一次尝试Pull消息失败(比如Broker端没有可以消费的消息),则通过长轮询机制先hold住并挂起该请求,之后由Broker端的后台线程PullRequestHoldService重新尝试,以及后台线程ReputMessageService做二次处理。
消费消息可以分成pull和push两种方式。push方式使用比较简单,因为RocketMQ已经帮助我们封装了大部分流程,我们只要重写回调函数即可;pull方式的简单用法可以参考下面的示例。后文我们就以push消费方式为例,分析这部分源代码流程。
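下面是一个Pull方式消费的用法示意,基于DefaultLitePullConsumer,其中消费组、主题、Namesrv地址均为示例值,仅用来说明"主动拉取"这种消费形态:
//示意代码:使用DefaultLitePullConsumer主动拉取消息,参数均为示例值
public class LitePullConsumerDemo {
    public static void main(String[] args) throws MQClientException {
        DefaultLitePullConsumer litePullConsumer = new DefaultLitePullConsumer("lite_pull_consumer_group");
        litePullConsumer.setNamesrvAddr("127.0.0.1:9876");
        litePullConsumer.subscribe("TopicTest", "*");
        litePullConsumer.start();
        try {
            while (true) {
                //poll在超时时间内返回拉取到的消息(可能为空列表),默认自动提交消费位点
                List<MessageExt> messageExts = litePullConsumer.poll();
                System.out.printf("%s%n", messageExts);
            }
        } finally {
            litePullConsumer.shutdown();
        }
    }
}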
二、流程
2.1 消费者启动流程图
2.2 消费者类图
RebalanceService:均衡消息队列服务,负责通过MQClientInstance分配当前Consumer可消费的消息队列(MessageQueue)。当有新的Consumer加入或移除时,都会重新分配消息队列。
PullMessageService:拉取消息服务,不断地从Broker拉取消息。它内部包含一个存放拉取请求的阻塞队列pullRequestQueue(LinkedBlockingQueue<PullRequest>),由RebalanceService往里面放PullRequest对象,PullMessageService则不断依次从队列中取出请求向Broker发送拉取请求,并把拉取到的消息提交给ConsumeMessageService消费。只有在PUSH模式下才会使用PullMessageService服务线程:当队列里有PullRequest对象时,从Broker中拉取消息;如果队列为空,则阻塞。同时该线程也提供了立即拉取和延迟拉取两种拉取方式。
ConsumeMessageService:消费消息服务,不断的消费消息,并处理消费结果。
RemoteBrokerOffsetStore:Consumer消费进度管理,负责从Broker获取消费进度,并把消费进度同步到Broker。
ProcessQueue:消息处理队列。
MQClientInstance:单例模式,封装对Namesrv、Broker的API调用,提供给Producer、Consumer使用。
RebalanceImpl:消费端负载均衡的逻辑。该类的调用轨迹如下:
MQClientInstance.start()(this.rebalanceService.start()) ---> RebalanceService.run()(this.mqClientFactory.doRebalance()) ---> MQConsumerInner.doRebalance()(DefaultMQPushConsumerImpl) ---> RebalanceImpl.doRebalance()
在这里着重说明一点消息队列数量与消费者的关系:1个消费者可以消费多个队列,但1个消息队列只会被同一消费组内的一个消费者消费;如果消费者数量大于消息队列数量,则有的消费者会消费不到消息(集群模式)。
三、消费者源码流程
consumer启动的时候会启动两个service:
- RebalanceService:主要实现consumer的负载均衡,但是并不会直接发送获取消息的请求,而是构造request之后放到PullMessageService中,等待PullMessageService的线程取出执行;
- PullMessageService:主要负责从broker获取message,包含一个需要获取消息的请求队列(是阻塞的),并不断依次从队列中取出请求,向broker发送拉取请求。
3.1 消费客户端启动
根据官方提供的例子,Consumer.java里面使用DefaultMQPushConsumer启动消息消费者,如下:
public class Consumer {
public static final String CONSUMER_GROUP = "please_rename_unique_group_name_4";
public static final String DEFAULT_NAMESRVADDR = "127.0.0.1:9876";
public static final String TOPIC = "TopicTest";
public static void main(String[] args) {
//初始化DefaultMQPushConsumer
DefaultMQPushConsumer consumer = new DefaultMQPushConsumer(CONSUMER_GROUP);
//设置命名服务,参考namesrv的启动
//consumer.setNamesrvAddr(DEFAULT_NAMESRVADDR);
//设置消费起始位置
consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);
//订阅消费的主题和过滤符
consumer.subscribe(TOPIC, "*");
//设置消息回调函数
consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
System.out.printf("%s Receive New Messages: %s %n", Thread.currentThread().getName(), msgs);
return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
});
//启动消费者
consumer.start();
}
}
3.2 消费者启动
我们接着看consumer.start()方法:
//org.apache.rocketmq.client.consumer;
public class DefaultMQPushConsumer extends ClientConfig
implements MQPushConsumer {
//...
@Override
public void start() throws MQClientException {
setConsumerGroup(NamespaceUtil.wrapNamespace(this.getNamespace(), this.consumerGroup));
//
this.defaultMQPushConsumerImpl.start();
if (null != traceDispatcher) {
try {
traceDispatcher.start(this.getNamesrvAddr(), this.getAccessChannel());
} catch (MQClientException e) {
log.warn("trace dispatcher start failed ", e);
}
}
}
}
DefaultMQPushConsumerImpl.java
//org.apache.rocketmq.client.impl.consumer;
public class DefaultMQPushConsumerImpl implements MQConsumerInner {
//...
private MQClientInstance mQClientFactory;
public synchronized void start() throws MQClientException {
switch (this.serviceState) {
case CREATE_JUST:
this.serviceState = ServiceState.START_FAILED;
//检查参数
this.checkConfig();
this.copySubscription();
if (this.defaultMQPushConsumer.getMessageModel() == MessageModel.CLUSTERING) {
this.defaultMQPushConsumer.changeInstanceNameToPID();
}
this.mQClientFactory = MQClientManager.getInstance()
.getAndCreateMQClientInstance(this.defaultMQPushConsumer, this.rpcHook);
this.rebalanceImpl.setConsumerGroup(this.defaultMQPushConsumer.getConsumerGroup());
this.rebalanceImpl.setMessageModel(this.defaultMQPushConsumer.getMessageModel());
this.rebalanceImpl.setAllocateMessageQueueStrategy(this.defaultMQPushConsumer
.getAllocateMessageQueueStrategy());
this.rebalanceImpl.setmQClientFactory(this.mQClientFactory);
if (this.pullAPIWrapper == null) {
this.pullAPIWrapper = new PullAPIWrapper(
mQClientFactory,
this.defaultMQPushConsumer.getConsumerGroup(), isUnitMode());
}
this.pullAPIWrapper.registerFilterMessageHook(filterMessageHookList);
if (this.defaultMQPushConsumer.getOffsetStore() != null) {
this.offsetStore = this.defaultMQPushConsumer.getOffsetStore();
} else {
//5、消费进度存储offsetStore,广播和集群不同
switch (this.defaultMQPushConsumer.getMessageModel()) {
case BROADCASTING:
this.offsetStore = new LocalFileOffsetStore(this.mQClientFactory,
this.defaultMQPushConsumer.getConsumerGroup());
break;
case CLUSTERING:
this.offsetStore = new RemoteBrokerOffsetStore(this.mQClientFactory,
this.defaultMQPushConsumer.getConsumerGroup());
break;
default:
break;
}
this.defaultMQPushConsumer.setOffsetStore(this.offsetStore);
}
this.offsetStore.load();
if (this.getMessageListenerInner() instanceof MessageListenerOrderly) {
this.consumeOrderly = true;
this.consumeMessageService = new ConsumeMessageOrderlyService(this,
(MessageListenerOrderly) this.getMessageListenerInner());
//POPTODO reuse Executor ?
this.consumeMessagePopService = new ConsumeMessagePopOrderlyService(this,
(MessageListenerOrderly) this.getMessageListenerInner());
} else if (this.getMessageListenerInner() instanceof MessageListenerConcurrently) {
this.consumeOrderly = false;
this.consumeMessageService = new ConsumeMessageConcurrentlyService(this,
(MessageListenerConcurrently) this.getMessageListenerInner());
//POPTODO reuse Executor ?
this.consumeMessagePopService = new ConsumeMessagePopConcurrentlyService(this,
(MessageListenerConcurrently) this.getMessageListenerInner());
}
this.consumeMessageService.start();
this.consumeMessagePopService.start();
boolean registerOK = mQClientFactory.registerConsumer(
this.defaultMQPushConsumer.getConsumerGroup(), this);
if (!registerOK) {
this.serviceState = ServiceState.CREATE_JUST;
this.consumeMessageService.shutdown(defaultMQPushConsumer
.getAwaitTerminationMillisWhenShutdown());
throw new MQClientException("The consumer group[" +
this.defaultMQPushConsumer.getConsumerGroup()
+ "] has been created before, specify another name please."
+ FAQUrl.suggestTodo(FAQUrl.GROUP_NAME_DUPLICATE_URL),
null);
}
mQClientFactory.start();
this.serviceState = ServiceState.RUNNING;
break;
case RUNNING:
case START_FAILED:
case SHUTDOWN_ALREADY:
throw new MQClientException("The PushConsumer service state not OK, maybe started once, "
+ this.serviceState
+ FAQUrl.suggestTodo(FAQUrl.CLIENT_SERVICE_NOT_OK),
null);
default:
break;
}
this.updateTopicSubscribeInfoWhenSubscriptionChanged();
this.mQClientFactory.checkClientInBroker();
this.mQClientFactory.sendHeartbeatToAllBrokerWithLock();
this.mQClientFactory.rebalanceImmediately();
}
}
在初始化一堆参数之后,调用mQClientFactory.start()。
3.3 MQClientInstance
//org.apache.rocketmq.client.impl.factory;
public class MQClientInstance {
public void start() throws MQClientException {
synchronized (this) {
switch (this.serviceState) {
case CREATE_JUST:
this.serviceState = ServiceState.START_FAILED;
// If not specified,looking address from name server
if (null == this.clientConfig.getNamesrvAddr()) {
this.mQClientAPIImpl.fetchNameServerAddr();
}
// Start request-response channel
this.mQClientAPIImpl.start();
// Start various schedule tasks
this.startScheduledTask();
// Start pull service
this.pullMessageService.start();
// Start rebalance service
this.rebalanceService.start();
// Start push service
this.defaultMQProducer.getDefaultMQProducerImpl().start(false);
log.info("the client factory [{}] start OK", this.clientId);
this.serviceState = ServiceState.RUNNING;
break;
case START_FAILED:
throw new MQClientException("The Factory object["
+ this.getClientId() + "] has been created before, and failed.", null);
default:
break;
}
}
}
}
各行代码的作用就像源代码里面的注释所写,这里重点看下pullMessageService.start()和rebalanceService.start()。
pullMessageService.start()的作用是不断从一个阻塞队列里面获取pullRequest请求,然后去RocketMQ Broker里面拉取消息;如果没有pullRequest的话,那么它将阻塞。那么,pullRequest请求是怎么放进去的呢?这个就要看rebalanceService了。
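补充一点:不管是rebalance新产生的PullRequest,还是一次拉取结束后的重新入队,最终都会经过PullMessageService提供的"立即拉取"和"延迟拉取"两个入口,大致逻辑如下(节选示意,细节以实际源码为准):
//org.apache.rocketmq.client.impl.consumer.PullMessageService(节选示意)
public void executePullRequestImmediately(final PullRequest pullRequest) {
    try {
        //直接放入阻塞队列(即前文所说的pullRequestQueue,新版本为messageRequestQueue),run()中的take()随即返回
        this.messageRequestQueue.put(pullRequest);
    } catch (InterruptedException e) {
        log.error("executePullRequestImmediately pullRequestQueue.put", e);
    }
}
public void executePullRequestLater(final PullRequest pullRequest, final long timeDelay) {
    if (!isStopped()) {
        //延迟timeDelay毫秒后再入队,用于拉取异常、流控等场景下的退避
        this.scheduledExecutorService.schedule(
            () -> this.executePullRequestImmediately(pullRequest), timeDelay, TimeUnit.MILLISECONDS);
    } else {
        log.warn("PullMessageServiceScheduledThread has shutdown");
    }
}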
3.4 pullMessageService.start
//org.apache.rocketmq.client.impl.consumer;
public class PullMessageService extends ServiceThread {
    //新版本中PullRequest、PopRequest都实现了MessageRequest接口,该队列在早期版本中名为pullRequestQueue
    private final LinkedBlockingQueue<MessageRequest> messageRequestQueue =
        new LinkedBlockingQueue<MessageRequest>();
@Override
public void run() {
while (!this.isStopped()) {
try {
MessageRequest messageRequest = this.messageRequestQueue.take();
if (messageRequest.getMessageRequestMode() == MessageRequestMode.POP) {
this.popMessage((PopRequest) messageRequest);
} else {
this.pullMessage((PullRequest) messageRequest);
}
} catch (InterruptedException e) {
} catch (Exception e) {
//...
}
}
}
}
顺便说一句,pullMessageService和rebalanceService都是继承自ServiceThread。ServiceThread简单封装了线程的启动:调用start方法,就会调用它的run方法。
//org.apache.rocketmq.common;
public abstract class ServiceThread implements Runnable {
public void start() {
//...
this.thread.start();
}
}
这样启动线程就方便一点,下面继续之前的分析。
从pullMessageService的run方法可以看出,它是从阻塞队列pullRequestQueue里面获取pullRequest,如果队列为空则阻塞;执行完一次pullRequest之后,再继续从阻塞队列中取下一个,因为它是个while循环。
所以,我们需要分析下pullRequest放进队列的流程,也就是rebalanceService。
3.5 rebalanceService(消费端负载均衡)
关于消费者的Rebalance过程,入口在RebalanceService,这是个线程,默认每隔20s做一次rebalance。
//org.apache.rocketmq.client.impl.consumer;
public class RebalanceService extends ServiceThread {
private static long waitInterval = Long.parseLong(
System.getProperty("rocketmq.client.rebalance.waitInterval", "20000"));
@Override
public void run() {
while (!this.isStopped()) {
this.waitForRunning(waitInterval);
this.mqClientFactory.doRebalance();
}
}
}
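这里的waitForRunning和wakeup来自父类ServiceThread,大致实现如下(节选示意,细节以实际源码为准):没有被唤醒时最多等待waitInterval(默认20s)再做一次rebalance;而消费者启动流程最后调用的mQClientFactory.rebalanceImmediately(),就是通过rebalanceService.wakeup()提前唤醒这个线程。
//org.apache.rocketmq.common.ServiceThread(节选示意)
protected void waitForRunning(long interval) {
    //如果已经被wakeup过,直接返回,立刻执行一次doRebalance
    if (hasNotified.compareAndSet(true, false)) {
        this.onWaitEnd();
        return;
    }
    waitPoint.reset();
    try {
        //否则最多等待interval毫秒(RebalanceService默认20000ms)
        waitPoint.await(interval, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
        log.error("Interrupted", e);
    } finally {
        hasNotified.set(false);
        this.onWaitEnd();
    }
}
public void wakeup() {
    if (hasNotified.compareAndSet(false, true)) {
        waitPoint.countDown();
    }
}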
//org.apache.rocketmq.client.impl.factory;
public class MQClientInstance {
//...
public void doRebalance() {
for (Map.Entry<String, MQConsumerInner> entry : this.consumerTable.entrySet()) {
MQConsumerInner impl = entry.getValue();
if (impl != null) {
try {
impl.doRebalance();
} catch (Throwable e) {
log.error("doRebalance exception", e);
}
}
}
}
}
MQConsumerInner是个接口,最后调用到DefaultMQPushConsumerImpl.java:
//org.apache.rocketmq.client.impl.consumer;
public class DefaultMQPushConsumerImpl implements MQConsumerInner {
//...
@Override
public void doRebalance() {
if (!this.pause) {
this.rebalanceImpl.doRebalance(this.isConsumeOrderly());
}
}
}
一路跟下来,来到了RebalanceImpl.java的rebalanceByTopic方法,这个方法里面有两个case(BROADCASTING和CLUSTERING),也就是消息消费的两种模式:广播和集群。广播模式下,所有的订阅者都会收到消息;集群模式下,一条消息只会被组内的一个消费者消费。我们以集群消息为例。
先大概解释下在rebalanceByTopic里面要做什么:
- 获取当前消费组内订阅这个topic的所有消费者clientId(通过broker查询);
- 获取这个topic的消息队列集合(来自namesrv同步下来的路由信息);
- 根据前两步获取的数据进行负载均衡计算,计算出当前消费者客户端分配到的消息队列;
- 按照分配到的消息队列,去broker请求这个消息队列里面的消息。
广播消息:
//org.apache.rocketmq.client.impl.consumer;
public abstract class RebalanceImpl {
//...
public boolean doRebalance(final boolean isOrder) {
boolean balanced = true;
Map<String, SubscriptionData> subTable = this.getSubscriptionInner();
if (subTable != null) {
for (final Map.Entry<String, SubscriptionData> entry : subTable.entrySet()) {
final String topic = entry.getKey();
try {
if (!clientRebalance(topic) && tryQueryAssignment(topic)) {
balanced = this.getRebalanceResultFromBroker(topic, isOrder);
} else {
balanced = this.rebalanceByTopic(topic, isOrder);
}
} catch (Throwable e) {
if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
log.warn("rebalance Exception", e);
balanced = false;
}
}
}
}
this.truncateMessageQueueNotMyTopic();
return balanced;
}
private boolean rebalanceByTopic(final String topic, final boolean isOrder) {
boolean balanced = true;
switch (messageModel) {
case BROADCASTING: {
Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
if (mqSet != null) {
//topicSubscribeInfoTable的更新操作(更新topic对应的MessageQueue)信息,
// 发生在发送消息时(updateTopicRouteInfoFromNameServer方法)
boolean changed = this.updateProcessQueueTableInRebalance(topic, mqSet, isOrder);
if (changed) {
this.messageQueueChanged(topic, mqSet, mqSet);
log.info("messageQueueChanged {} {} {} {}", consumerGroup, topic,
mqSet, mqSet);
}
balanced = mqSet.equals(getWorkingMessageQueue(topic));
} else {
this.messageQueueChanged(topic, Collections.<MessageQueue>emptySet(),
Collections.<MessageQueue>emptySet());
log.warn("doRebalance, {}, but the topic[{}] not exist.", consumerGroup, topic);
}
break;
}
case CLUSTERING: {
//...
}
default:
break;
}
return balanced;
}
private boolean updateProcessQueueTableInRebalance(final String topic,
final Set<MessageQueue> mqSet,
final boolean isOrder) {
boolean changed = false;
HashMap<MessageQueue, ProcessQueue> removeQueueMap =
new HashMap<>(this.processQueueTable.size());
//移除 在processQueueTable && 不存在于 mqSet 里的消息队列
Iterator<Entry<MessageQueue, ProcessQueue>> it = this.processQueueTable.entrySet().iterator();
while (it.hasNext()) {
Entry<MessageQueue, ProcessQueue> next = it.next();
MessageQueue mq = next.getKey();
ProcessQueue pq = next.getValue();
if (mq.getTopic().equals(topic)) {
if (!mqSet.contains(mq)) {//不包含的队列
pq.setDropped(true);
removeQueueMap.put(mq, pq);
} else if (pq.isPullExpired() &&
this.consumeType() == ConsumeType.CONSUME_PASSIVELY) {//拉取的队列超时,同样清理
//PUSH模式下,移除拉取超时的
pq.setDropped(true);
removeQueueMap.put(mq, pq);
log.error("[BUG]doRebalance, {}, try remove unnecessary mq, {}," +
" because pull is pause, so try to fixed it",
consumerGroup, mq);
}
}
}
for (Entry<MessageQueue, ProcessQueue> entry : removeQueueMap.entrySet()) {
MessageQueue mq = entry.getKey();
ProcessQueue pq = entry.getValue();
if (this.removeUnnecessaryMessageQueue(mq, pq)) {
this.processQueueTable.remove(mq);
changed = true;
log.info("doRebalance, {}, remove unnecessary mq, {}", consumerGroup, mq);
}
}
// add new message queue
// 把远端新增的队列加入到`processQueueTable`中
boolean allMQLocked = true;
List<PullRequest> pullRequestList = new ArrayList<>();
for (MessageQueue mq : mqSet) {
//如果processQueueTable不包括这个mq
if (!this.processQueueTable.containsKey(mq)) {
if (isOrder && !this.lock(mq)) {
log.warn("doRebalance, {}, add a new mq failed, {}, because lock failed",
consumerGroup, mq);
allMQLocked = false;
continue;
}
//把这个mq的offset先干掉,再添加
this.removeDirtyOffset(mq);
ProcessQueue pq = createProcessQueue(topic);
pq.setLocked(true);
long nextOffset = this.computePullFromWhere(mq);
if (nextOffset >= 0) {
ProcessQueue pre = this.processQueueTable.putIfAbsent(mq, pq);
if (pre != null) {
log.info("doRebalance, {}, mq already exists, {}", consumerGroup, mq);
} else {
log.info("doRebalance, {}, add a new mq, {}", consumerGroup, mq);
PullRequest pullRequest = new PullRequest();
pullRequest.setConsumerGroup(consumerGroup);
pullRequest.setNextOffset(nextOffset);
pullRequest.setMessageQueue(mq);
pullRequest.setProcessQueue(pq);
pullRequestList.add(pullRequest);
//返回是否有变化
changed = true;
}
} else {
log.warn("doRebalance, {}, add new mq failed, {}", consumerGroup, mq);
}
}
}
if (!allMQLocked) {
mQClientFactory.rebalanceLater(500);
}
// 将pullRequest放在pullRequestQueue中等待去取数据
this.dispatchPullRequest(pullRequestList, 500);
return changed;
}
//...
}
集群模式的更新队列方式使用的同样是updateProcessQueueTableInRebalance。
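上面代码里CLUSTERING分支被省略了,它的大致逻辑如下(节选示意,以实际源码为准):先取出该topic的所有消息队列,再向Broker查询该消费组内所有消费者的clientId,两边排序后交给负载均衡策略计算出分配给当前消费者的队列集合,最后同样调用updateProcessQueueTableInRebalance:
//RebalanceImpl#rebalanceByTopic的CLUSTERING分支(节选示意)
case CLUSTERING: {
    Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
    //向Broker查询该消费组下所有消费者的clientId
    List<String> cidAll = this.mQClientFactory.findConsumerIdList(topic, consumerGroup);
    if (mqSet != null && cidAll != null) {
        List<MessageQueue> mqAll = new ArrayList<>(mqSet);
        //两边都排序,保证每个消费者看到一致的视图,分配结果才不会互相冲突
        Collections.sort(mqAll);
        Collections.sort(cidAll);
        AllocateMessageQueueStrategy strategy = this.allocateMessageQueueStrategy;
        List<MessageQueue> allocateResult = null;
        try {
            allocateResult = strategy.allocate(this.consumerGroup,
                this.mQClientFactory.getClientId(), mqAll, cidAll);
        } catch (Throwable e) {
            log.error("allocate message queue exception. strategy name: {}, ex: {}",
                strategy.getName(), e);
            return false;
        }
        Set<MessageQueue> allocateResultSet = new HashSet<>();
        if (allocateResult != null) {
            allocateResultSet.addAll(allocateResult);
        }
        //按分配结果更新processQueueTable,并为新增队列生成PullRequest
        boolean changed = this.updateProcessQueueTableInRebalance(topic, allocateResultSet, isOrder);
        if (changed) {
            this.messageQueueChanged(topic, mqSet, allocateResultSet);
        }
        balanced = allocateResultSet.equals(getWorkingMessageQueue(topic));
    }
    break;
}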
那我们继续3.4节pullMessageService.start的分析,因为rebalanceService已经把pullRequest放到了阻塞队列。
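顺带补充:上面updateProcessQueueTableInRebalance最后调用的dispatchPullRequest,在PUSH模式(RebalancePushImpl)下大致就是把PullRequest逐个交给PullMessageService入队,示意如下:
//org.apache.rocketmq.client.impl.consumer.RebalancePushImpl(节选示意)
@Override
public void dispatchPullRequest(final List<PullRequest> pullRequestList, final long delay) {
    for (PullRequest pullRequest : pullRequestList) {
        if (delay <= 0) {
            //立即放入PullMessageService的阻塞队列
            this.defaultMQPushConsumerImpl.executePullRequestImmediately(pullRequest);
        } else {
            //延迟delay毫秒后再入队
            this.defaultMQPushConsumerImpl.executePullRequestLater(pullRequest, delay);
        }
    }
}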
3.6 PullMessageService.run
//org.apache.rocketmq.client.impl.consumer;
public class PullMessageService extends ServiceThread {
    //新版本中PullRequest、PopRequest都实现了MessageRequest接口,该队列在早期版本中名为pullRequestQueue
    private final LinkedBlockingQueue<MessageRequest> messageRequestQueue =
        new LinkedBlockingQueue<MessageRequest>();
@Override
public void run() {
while (!this.isStopped()) {
try {
MessageRequest messageRequest = this.messageRequestQueue.take();
if (messageRequest.getMessageRequestMode() == MessageRequestMode.POP) {
this.popMessage((PopRequest) messageRequest);
} else {
this.pullMessage((PullRequest) messageRequest);
}
} catch (InterruptedException e) {
} catch (Exception e) {
//...
}
}
}
private void pullMessage(final PullRequest pullRequest) {
final MQConsumerInner consumer = this.mQClientFactory
.selectConsumer(pullRequest.getConsumerGroup());
if (consumer != null) {
DefaultMQPushConsumerImpl impl = (DefaultMQPushConsumerImpl) consumer;
impl.pullMessage(pullRequest);
} else {
}
}
}
接着调用到DefaultMQPushConsumerImpl.pullMessage(pullRequest)这个方法里面。
//org.apache.rocketmq.client.impl.consumer;
public class DefaultMQPushConsumerImpl implements MQConsumerInner {
public void pullMessage(final PullRequest pullRequest) {
//...
final long beginTimestamp = System.currentTimeMillis();
PullCallback pullCallback = new PullCallback() {
@Override
public void onSuccess(PullResult pullResult) {
if (pullResult != null) {
pullResult = DefaultMQPushConsumerImpl.this.pullAPIWrapper.processPullResult(
pullRequest.getMessageQueue(), pullResult, subscriptionData);
switch (pullResult.getPullStatus()) {
case FOUND:
long prevRequestOffset = pullRequest.getNextOffset();
pullRequest.setNextOffset(pullResult.getNextBeginOffset());
long pullRT = System.currentTimeMillis() - beginTimestamp;
DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullRT(pullRequest.getConsumerGroup(),
pullRequest.getMessageQueue().getTopic(), pullRT);
long firstMsgOffset = Long.MAX_VALUE;
if (pullResult.getMsgFoundList() == null
|| pullResult.getMsgFoundList().isEmpty()) {
DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
} else {
firstMsgOffset = pullResult.getMsgFoundList().get(0).getQueueOffset();
DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullTPS(
pullRequest.getConsumerGroup(),
pullRequest.getMessageQueue().getTopic(),
pullResult.getMsgFoundList().size());
boolean dispatchToConsume = processQueue.putMessage(
pullResult.getMsgFoundList());
DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(
pullResult.getMsgFoundList(),
processQueue,
pullRequest.getMessageQueue(),
dispatchToConsume);
if (DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval() > 0) {
    DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
        DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval());
} else {
    DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
}
}
}
if (pullResult.getNextBeginOffset() < prevRequestOffset
|| firstMsgOffset < prevRequestOffset) {
log.warn("[BUG] pull message result maybe data wrong, " +
"nextBeginOffset: {} " +
"firstMsgOffset: {} prevRequestOffset: {}",
pullResult.getNextBeginOffset(),
firstMsgOffset,
prevRequestOffset);
}
break;
case NO_NEW_MSG:
case NO_MATCHED_MSG:
//...
break;
case OFFSET_ILLEGAL:
//...
break;
default:
break;
}
}
}
@Override
public void onException(Throwable e) {
if (!pullRequest.getMessageQueue().getTopic()
.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
log.warn("execute the pull request exception", e);
}
if (e instanceof MQBrokerException
&& ((MQBrokerException) e).getResponseCode() == ResponseCode.FLOW_CONTROL) {
DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
PULL_TIME_DELAY_MILLS_WHEN_BROKER_FLOW_CONTROL);
} else {
DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
pullTimeDelayMillsWhenException);
}
}
};
//...
try {
this.pullAPIWrapper.pullKernelImpl(
pullRequest.getMessageQueue(),
subExpression,
subscriptionData.getExpressionType(),
subscriptionData.getSubVersion(),
pullRequest.getNextOffset(),
this.defaultMQPushConsumer.getPullBatchSize(),
sysFlag,
commitOffsetValue,
BROKER_SUSPEND_MAX_TIME_MILLIS,
CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND,
CommunicationMode.ASYNC,
pullCallback
);
} catch (Exception e) {
this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
}
}
}
上面这段代码主要就是设置消息获取后的回调函数PullCallback pullCallback,然后调用pullAPIWrapper.pullKernelImpl去Broker里面获取消息。
获取成功后,就会回调pullCallback的onSuccess方法,进入FOUND这个case分支。在这个分支里,会根据回调是同步还是异步分为两种情况,同步消息和异步消息区别的源代码实现以后再讲。
四、消费进度
首先消费者订阅消息消费队列(MessageQueue),当生产者将消息负载发送到MessageQueue中时,消息订阅者开始消费消息。消息消费过程中,为了避免重复消费,需要一个地方存储消费进度(消费偏移量)。
广播模式:每条消息都会被每一个消费者消费,消费进度保存在消费者本地文件中。
集群模式:一条消息只会被集群中的一个消费者消费,消费进度保存在Broker上。
广播模式使用本地的消费进度即可,因为消费者之间互相独立;集群模式则不同:正常情况下,一条消息在一个消费者上消费成功(一条消息只能被集群内的一个消费者消费),就不会再发送给其他消费者,所以进度不能保存在消费端,只能集中保存在一个地方,比较合适的是Broker端。接下来我们先分析一下消息消费进度接口:OffsetStore.java。
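OffsetStore接口的主要方法大致如下(节选示意),LocalFileOffsetStore和RemoteBrokerOffsetStore分别是它在广播模式和集群模式下的实现:
//org.apache.rocketmq.client.consumer.store.OffsetStore(节选示意)
public interface OffsetStore {
    //从本地文件或Broker加载消费进度到内存
    void load() throws MQClientException;
    //更新内存中的消费进度,increaseOnly=true时进度只增不减
    void updateOffset(final MessageQueue mq, final long offset, final boolean increaseOnly);
    //读取消费进度,type指定从内存读、从存储读或内存优先
    long readOffset(final MessageQueue mq, final ReadOffsetType type);
    //持久化指定消息队列集合的消费进度(本地文件或Broker)
    void persistAll(final Set<MessageQueue> mqs);
    void persist(final MessageQueue mq);
    //移除某个消息队列的消费进度
    void removeOffset(MessageQueue mq);
    Map<MessageQueue, Long> cloneOffsetTable(String topic);
    //集群模式下把消费进度同步到Broker
    void updateConsumeOffsetToBroker(MessageQueue mq, long offset, boolean isOneway)
        throws RemotingException, MQBrokerException, InterruptedException, MQClientException;
}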
入口在DefaultMQPushConsumerImpl#start()的第5点注释处,根据消息消费模式(集群模式、广播模式)会创建不同的OffsetStore实现。
由于上篇文章谈到,广播模式下的消息如果返回CONSUME_LATER,竟然不会重试,而是直接丢弃,为什么呢?由于这个原因,这次破天荒地从广播模式的OffsetStore开始学习。
4.1 LocalFileOffsetStore(广播模式)
消息进度以本地文件方式保存。
4.1.1 核心属性与构造函数
//org.apache.rocketmq.client.consumer.store;
public class LocalFileOffsetStore implements OffsetStore {
public final static String LOCAL_OFFSET_STORE_DIR = System.getProperty(
"rocketmq.client.localOffsetStoreDir",
System.getProperty("user.home") + File.separator + ".rocketmq_offsets");
private final static InternalLogger log = ClientLogger.getLog();
private final MQClientInstance mQClientFactory;
private final String groupName;
private final String storePath;
private ConcurrentMap<MessageQueue, AtomicLong> offsetTable =
new ConcurrentHashMap<MessageQueue, AtomicLong>();
public LocalFileOffsetStore(MQClientInstance mQClientFactory, String groupName) {
this.mQClientFactory = mQClientFactory;
this.groupName = groupName;
this.storePath = LOCAL_OFFSET_STORE_DIR + File.separator +
this.mQClientFactory.getClientId() + File.separator +
this.groupName + File.separator +
"offsets.json";
}
}
- LOCAL_OFFSET_STORE_DIR:offset存储根目录,默认为用户主目录下的.rocketmq_offsets目录(例如/home/dingw/.rocketmq_offsets),可以在消费者启动的JVM参数中通过-Drocketmq.client.localOffsetStoreDir=路径来指定。
- groupName:消费组名称。
- storePath:具体的消费进度保存文件名(全路径)。
- offsetTable:内存中的offset进度保存,以MessageQueue为键,偏移量为值。
LocalFileOffsetStore首先在DefaultMQPushConsumerImpl#start方法中创建,并执行load方法加载消费进度。
接下来介绍一下几个关键的实现方法。
4.1.2 load()方法
//org.apache.rocketmq.client.consumer.store;
public class LocalFileOffsetStore implements OffsetStore {
//...
@Override
public void load() throws MQClientException {
OffsetSerializeWrapper offsetSerializeWrapper = this.readLocalOffset();
if (offsetSerializeWrapper != null && offsetSerializeWrapper.getOffsetTable() != null) {
offsetTable.putAll(offsetSerializeWrapper.getOffsetTable());
for (MessageQueue mq : offsetSerializeWrapper.getOffsetTable().keySet()) {
AtomicLong offset = offsetSerializeWrapper.getOffsetTable().get(mq);
log.info("load consumer's offset, {} {} {}",
this.groupName, mq, offset.get());
}
}
}
}
该方法主要就是读取offsets.json或备份文件offsets.json.bak中的内容,然后将JSON反序列化后放入内存中的offsetTable(Map)。
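其中readLocalOffset的大致逻辑如下(节选示意):优先读offsets.json,读不到或解析失败时再尝试offsets.json.bak:
//org.apache.rocketmq.client.consumer.store.LocalFileOffsetStore(节选示意)
private OffsetSerializeWrapper readLocalOffset() throws MQClientException {
    String content = null;
    try {
        //读取offsets.json文件内容
        content = MixAll.file2String(this.storePath);
    } catch (IOException e) {
        log.warn("Load local offset store file exception", e);
    }
    if (null == content || content.length() == 0) {
        //主文件为空,尝试读取备份文件offsets.json.bak
        return this.readLocalOffsetBak();
    } else {
        try {
            //JSON反序列化成OffsetSerializeWrapper,内部就是MessageQueue->offset的Map
            return OffsetSerializeWrapper.fromJson(content, OffsetSerializeWrapper.class);
        } catch (Exception e) {
            log.warn("readLocalOffset Exception, and try to correct", e);
            return this.readLocalOffsetBak();
        }
    }
}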
之后更新或读取消息队列的消费进度,都是先操作内存中的这个Map。接下来看一下消费进度是如何持久化到offsets.json文件的(persistAll方法):
//org.apache.rocketmq.client.consumer.store;
public class LocalFileOffsetStore implements OffsetStore {
//...
@Override
public void persistAll(Set<MessageQueue> mqs) {
if (null == mqs || mqs.isEmpty())
return;
OffsetSerializeWrapper offsetSerializeWrapper = new OffsetSerializeWrapper();
for (Map.Entry<MessageQueue, AtomicLong> entry : this.offsetTable.entrySet()) {
if (mqs.contains(entry.getKey())) {
AtomicLong offset = entry.getValue();
offsetSerializeWrapper.getOffsetTable().put(entry.getKey(), offset);
}
}
String jsonString = offsetSerializeWrapper.toJson(true);
if (jsonString != null) {
try {
MixAll.string2File(jsonString, this.storePath);
} catch (IOException e) {
log.error("persistAll consumer offset Exception, " + this.storePath, e);
}
}
}
}
保存逻辑很简单,就没必要一一分析,重点看一下该方法在什么时候调用:MQClientInstance#startScheduledTask。
顺藤摸瓜,原来是一个定时任务:默认消费端启动10秒后开始,以每隔5s的频率持久化一次。
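MQClientInstance#startScheduledTask中与消费进度相关的定时任务大致如下(节选示意),persistConsumerOffsetInterval默认为5000ms,可通过客户端配置调整:
//org.apache.rocketmq.client.impl.factory.MQClientInstance#startScheduledTask(节选示意)
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        try {
            //遍历所有消费者,调用其OffsetStore的persistAll持久化消费进度
            MQClientInstance.this.persistAllConsumerOffset();
        } catch (Exception e) {
            log.error("ScheduledTask persistAllConsumerOffset exception", e);
        }
    }
}, 1000 * 10, this.clientConfig.getPersistConsumerOffsetInterval(), TimeUnit.MILLISECONDS);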
广播模式的消费进度存储比较简单,但其实还是不太明白为什么RocketMQ在广播模式下消费失败就直接丢弃:广播模式有时候也必须确保每个消费者都成功消费,通常的场景为通过MQ刷新本地缓存等。
4.2 集群模式消费进度存储(RemoteBrokerOffsetStore)
在阅读RemoteBrokerOffsetStore之前,我们先思考一下下面这个问题:
在集群模式下,多个消费者会负载到不同的消费队列上,因为消息消费进度是基于消息队列进行保存的,也就是不同的消费者之间的消费进度保存是不会存在并发的,但是在同一个消费者,非顺序消息消费时,一个消费者(多个线程)并发消费消息,比如m1 < m2,但m2先消费完,此时是如何保存的消费进度呢?举个例子,如果m2的offset为5,而m1的offset为4,如果m2先消费完,保存进度为5,那m1消息消费完,保存进度为4,这样岂不乱来了。
4.2.1 RemoteBrokerOffsetStore(核心属性)
//org.apache.rocketmq.client.consumer.store;
public class RemoteBrokerOffsetStore implements OffsetStore {
private final static Logger log = ClientLogger.getLog();
// MQ客户端实例,该实例被同一个客户端的消费者、生产者共用
private final MQClientInstance mQClientFactory;
// MQ消费组
private final String groupName;
// 消费进度存储(内存中)
private ConcurrentMap<MessageQueue, AtomicLong> offsetTable =
new ConcurrentHashMap<MessageQueue, AtomicLong>();
// 构造方法
public RemoteBrokerOffsetStore(MQClientInstance mQClientFactory, String groupName) {
this.mQClientFactory = mQClientFactory;
this.groupName = groupName;
}
//...
}
4.2.2 updateOffset(更新offset)
//org.apache.rocketmq.client.consumer.store;
public class RemoteBrokerOffsetStore implements OffsetStore {
//...
@Override
public void updateOffset(MessageQueue mq, long offset, boolean increaseOnly) {
if (mq != null) {
AtomicLong offsetOld = this.offsetTable.get(mq);
if (null == offsetOld) { // @1
offsetOld = this.offsetTable.putIfAbsent(mq, new AtomicLong(offset)); // @2
}
if (null != offsetOld) { // @3
if (increaseOnly) {
MixAll.compareAndIncreaseOnly(offsetOld, offset); // @4
} else {
offsetOld.set(offset); // @5
}
}
}
}
}
代码@1、@2:如果当前内存中并没有该mq的offset,则通过putIfAbsent把传入的offset放入offsetTable(Map)中。
代码@3:如果offsetOld不为空,说明内存中已经存在该消费队列的进度,也意味着可能有多个线程在并发地更新同一个消费队列的进度。
代码@4、@5:根据increaseOnly决定更新方式:increaseOnly为true时,通过MixAll.compareAndIncreaseOnly只在传入的offset大于当前值时才更新,保证进度只增不减;否则直接set覆盖。offsetOld虽然是个局部变量,但它引用的是offsetTable中的AtomicLong对象,所以对它的修改会直接反映到内存进度表里。
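MixAll.compareAndIncreaseOnly的实现大致如下(节选示意),通过CAS保证只有在新offset更大时才更新,避免并发消费时进度被回退:
//org.apache.rocketmq.common.MixAll(节选示意)
public static boolean compareAndIncreaseOnly(final AtomicLong target, final long value) {
    long prev = target.get();
    while (value > prev) {
        //只有value大于当前值时才尝试CAS更新,失败则重读当前值再比较
        boolean updated = target.compareAndSet(prev, value);
        if (updated) {
            return true;
        }
        prev = target.get();
    }
    return false;
}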
4.2.3 readOffset(读取消费进度)
根据读取来源,读取消费队列的消费进度
//org.apache.rocketmq.client.consumer.store;
public class RemoteBrokerOffsetStore implements OffsetStore {
//...
public long readOffset(final MessageQueue mq, final ReadOffsetType type) {
if (mq != null) {
switch (type) {
// 先从内存中读取,如果内存中不存在,再尝试从Broker读取
case MEMORY_FIRST_THEN_STORE:
// 从内存中读取
case READ_FROM_MEMORY: {
AtomicLong offset = this.offsetTable.get(mq);
if (offset != null) {
return offset.get();
} else if (ReadOffsetType.READ_FROM_MEMORY == type) {
return -1;
}
}
// 从Broker读取
case READ_FROM_STORE: {
try {
long brokerOffset = this.fetchConsumeOffsetFromBroker(mq);
AtomicLong offset = new AtomicLong(brokerOffset);
this.updateOffset(mq, offset.get(), false);
return brokerOffset;
}
// No offset in broker
catch (MQBrokerException e) {
return -1;
}
//Other exceptions
catch (Exception e) {
log.warn("fetchConsumeOffsetFromBroker exception, " + mq, e);
return -2;
}
}
default:
break;
}
}
return -1;
}
}
这里主要关注从Broker读取消费进度,核心入口方法:fetchConsumeOffsetFromBroker:
//org.apache.rocketmq.client.consumer.store;
public class RemoteBrokerOffsetStore implements OffsetStore {
//...
private long fetchConsumeOffsetFromBroker(MessageQueue mq) throws RemotingException,
MQBrokerException, InterruptedException, MQClientException {
FindBrokerResult findBrokerResult = this.mQClientFactory.findBrokerAddressInAdmin(
this.mQClientFactory.getBrokerNameFromMessageQueue(mq), MixAll.MASTER_ID, true);
if (null == findBrokerResult) {
this.mQClientFactory.updateTopicRouteInfoFromNameServer(mq.getTopic());
findBrokerResult = this.mQClientFactory.findBrokerAddressInSubscribe(
this.mQClientFactory.getBrokerNameFromMessageQueue(mq), MixAll.MASTER_ID, false);
}
if (findBrokerResult != null) {
QueryConsumerOffsetRequestHeader requestHeader = new QueryConsumerOffsetRequestHeader();
requestHeader.setTopic(mq.getTopic());
requestHeader.setConsumerGroup(this.groupName);
requestHeader.setQueueId(mq.getQueueId());
requestHeader.setBname(mq.getBrokerName());
return this.mQClientFactory.getMQClientAPIImpl().queryConsumerOffset(
findBrokerResult.getBrokerAddr(), requestHeader, 1000 * 5);
} else {
throw new MQClientException("The broker[" + mq.getBrokerName() + "] not exist", null);
}
}
}
这里主要是先根据mq的broker名称获取broker地址,然后发送请求查询消费进度。我们重点关注一下消费进度保存在Broker的哪个地方:Broker端的offset管理参照ConsumerOffsetManager,保存逻辑其实与广播模式差不多,就不深入研究了,重点说一下offset保存的路径:/rocketmq_home/store/config/consumerOffset.json。
综上所述,我们了解到的情况是:广播模式下消费进度存放在消费者本地,集群模式下存储在Broker上,存储的都是JSON文件。也就是说,OffsetStore负责保存消费进度,保存内容的大致结构为{"topic@consumerGroup":{queueId:offset}}。
4.3 拓展
现在我们思考如下问题(下面的讨论还是基于非顺序消息):
集群模式下,一个消费者用多个线程并发消费同一个队列中的消息,例如q1中存在m1,m2,m3,m4,m5,最后消费成功的顺序有可能是m1,m3,m2,m5,m4。如果每消费完一条消息,就把该消息的offset存入消费进度,岂不是会乱?如果一批拉取了多条消息,消费进度又是如何保存的?要解决上述问题,我们移步到调用offsetStore.updateOffset方法的地方,重点看一下那块逻辑:
ConsumeMessageConcurrentlyService#processConsumeResult
也就是消息处理完后,先从处理队列ProcessQueue中移除这一批消息,然后返回要更新的offset。
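processConsumeResult中与消费进度相关的核心片段大致如下(节选示意):removeMessage返回的offset会以increaseOnly=true的方式交给OffsetStore更新:
//org.apache.rocketmq.client.impl.consumer.ConsumeMessageConcurrentlyService#processConsumeResult(节选示意)
long offset = consumeRequest.getProcessQueue().removeMessage(consumeRequest.getMsgs());
if (offset >= 0 && !consumeRequest.getProcessQueue().isDropped()) {
    //increaseOnly=true,保证消费进度只增不减
    this.defaultMQPushConsumerImpl.getOffsetStore().updateOffset(
        consumeRequest.getMessageQueue(), offset, true);
}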
那我们重点看一下removeMessage方法:
public long removeMessage(final List<MessageExt> msgs) {
long result = -1;
final long now = System.currentTimeMillis();
try {
this.lockTreeMap.writeLock().lockInterruptibly();
this.lastConsumeTimestamp = now;
try {
if (!msgTreeMap.isEmpty()) {
result = this.queueOffsetMax + 1;
int removedCnt = 0;
for (MessageExt msg : msgs) {
MessageExt prev = msgTreeMap.remove(msg.getQueueOffset());
if (prev != null) {
removedCnt--;
}
}
msgCount.addAndGet(removedCnt);
if (!msgTreeMap.isEmpty()) {
result = msgTreeMap.firstKey();
}
}
} finally {
this.lockTreeMap.writeLock().unlock();
}
} catch (Throwable t) {
log.error("removeMessage exception", t);
}
return result;
}
注意一下msgTreeMap的类型是TreeMap,按消息的offset升序排序。返回的result:如果treemap中不存在任何消息,就返回该处理队列最大的偏移量+1;如果移除本批消息后处理队列中还存在消息,则返回该处理队列中最小的偏移量。也就是说,此时返回的偏移量有可能不是消息本身的偏移量,而是处理队列中最小的偏移量。
优点:防止消息丢失(也就是没有消费到);
缺点:会造成消息重复消费。
回到前面updateProcessQueueTableInRebalance的代码,里面的mqSet就是这个topic的消费队列,一般是4个,但是这个值是可以修改的,存储的位置在~/store/config/topics.json里面,比如:
"TopicTest":{
"order":false,
"perm":6,
"readQueueNums":4,
"topicFilterType":"SINGLE_TAG",
"topicName":"TopicTest",
"topicSysFlag":0,
"writeQueueNums":4
}
可以修改readQueueNums和writeQueueNums为其他值。再回头看rebalanceByTopic集群分支里调用负载均衡策略的这段代码:
try {
allocateResult = strategy.allocate(
this.consumerGroup,
this.mQClientFactory.getClientId(),
mqAll,
cidAll);
} catch (Throwable e) {
return;
}
这段代码就是客户端根据获取到的这个topic的消费者数量和消息队列数量,使用负载均衡策略计算出当前客户端能够消费的消息队列。
负载均衡策略的实现代码在org.apache.rocketmq.client.consumer.rebalance包下。
consumer负载均衡有6种模式:
- 平均分配模式(默认)
- 手动配置模式
- 指定机房模式
- 就近机房模式
- 一致性哈希模式
- 环形平均模式
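如果想换成非默认的分配策略,可以在启动消费者前通过setAllocateMessageQueueStrategy指定,下面是一个示意(沿用前面示例中的CONSUMER_GROUP和TOPIC,以一致性哈希策略为例):
//示意:为消费者指定负载均衡策略,默认是AllocateMessageQueueAveragely(平均分配)
DefaultMQPushConsumer consumer = new DefaultMQPushConsumer(CONSUMER_GROUP);
//换成一致性哈希分配策略
consumer.setAllocateMessageQueueStrategy(new AllocateMessageQueueConsistentHash());
consumer.subscribe(TOPIC, "*");
//...注册MessageListener后再调用consumer.start()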