The meaning of session.timeout.ms, heartbeat.interval.ms, and max.poll.interval.ms, and how they relate
If you use a Kafka consumer, you will certainly run into these parameters:
session.timeout.ms, heartbeat.interval.ms, and max.poll.interval.ms. Let's first look at what each one means.
session.timeout.ms:
The timeout used to detect worker failures. The worker sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove the worker from the group and initiate a rebalance. Note that the value must be in the allowable range as configured in the broker configuration by group.min.session.timeout.ms and group.max.session.timeout.ms.
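Name: session timeout
Purpose: the timeout used to detect whether a consumer has failed
Mechanism: the consumer periodically sends heartbeats to prove it is alive. If the broker receives no heartbeat within this period, it removes the consumer from the group and triggers a rebalance.
Note: the value must fall within the range set by the broker-side group.min.session.timeout.ms and group.max.session.timeout.ms.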
heartbeat.interval.ms:
The expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.
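Name: heartbeat interval
Purpose: when the consumer uses group management, heartbeats keep the consumer's session active and help drive rebalancing when consumers join or leave the group
Note: the value must be lower than session.timeout.ms, and typically should be no higher than 1/3 of session.timeout.ms. Lowering it shortens the expected time for normal rebalances.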
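The constraints on session.timeout.ms and heartbeat.interval.ms can be written down as a small check. This is only a sketch: check_heartbeat_settings is a hypothetical helper, not part of any Kafka client, and the numbers passed to it are illustrative values rather than the defaults of any particular Kafka version.

```python
def check_heartbeat_settings(session_timeout_ms, heartbeat_interval_ms,
                             group_min_session_timeout_ms,
                             group_max_session_timeout_ms):
    """Return a list of problems; an empty list means the settings look fine.

    Hypothetical helper that only encodes the rules quoted above.
    """
    problems = []
    # session.timeout.ms must lie inside the broker-configured range.
    if not (group_min_session_timeout_ms <= session_timeout_ms
            <= group_max_session_timeout_ms):
        problems.append("session.timeout.ms is outside the broker's allowed range")
    # heartbeat.interval.ms must be strictly lower than session.timeout.ms ...
    if heartbeat_interval_ms >= session_timeout_ms:
        problems.append("heartbeat.interval.ms must be lower than session.timeout.ms")
    # ... and is typically kept at or below one third of it.
    elif heartbeat_interval_ms * 3 > session_timeout_ms:
        problems.append("heartbeat.interval.ms is higher than 1/3 of session.timeout.ms")
    return problems


# Illustrative values only (milliseconds).
ok = check_heartbeat_settings(10_000, 3_000, 6_000, 300_000)
too_slow = check_heartbeat_settings(10_000, 5_000, 6_000, 300_000)
```

Here ok is an empty list, while too_slow contains the 1/3 warning, because a 5-second heartbeat leaves room for at most one missed heartbeat inside a 10-second session.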
max.poll.interval.ms:
The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If poll() is not called before expiration of this timeout, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member.
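Name: maximum poll interval
Purpose: detects whether the consumer has failed to call poll() in time
Mechanism: if the time between two poll() calls exceeds this value, the consumer is considered failed, typically because it cannot process records fast enough. It is removed from the group, a rebalance is triggered, and its partitions are reassigned to other members. Offset commits it attempts for those partitions after that point fail.
Note: the larger this value, the longer a rebalance can take.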
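A rough way to reason about this limit is to compare it with the worst-case time spent processing one batch between two poll() calls. The sketch below assumes a fixed batch size (in a real consumer this is bounded by max.poll.records) and a fixed per-record processing time; both function names and all numbers are hypothetical and only illustrate the arithmetic.

```python
def worst_case_batch_ms(records_per_poll, ms_per_record):
    """Worst-case processing time for one poll() batch, in milliseconds."""
    return records_per_poll * ms_per_record


def fits_poll_interval(records_per_poll, ms_per_record, max_poll_interval_ms):
    """True if a full batch can be processed before the poll deadline."""
    return worst_case_batch_ms(records_per_poll, ms_per_record) < max_poll_interval_ms


# Illustrative values: 500 records per batch, 5-minute poll interval.
fast_enough = fits_poll_interval(500, 200, 300_000)  # 100 s of work per batch
too_slow = fits_poll_interval(500, 800, 300_000)     # 400 s of work per batch
```

In the second case the consumer would miss the deadline, so either the batch has to shrink, the processing has to get faster, or max.poll.interval.ms has to grow.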
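So the purpose of these three parameters is to ensure that the group only contains consumers that can consume normally:
1. Heartbeat check: the consumer sends a heartbeat to the broker every heartbeat.interval.ms. The broker tracks how long it has been since the last heartbeat; if that exceeds session.timeout.ms, it considers the consumer unavailable and removes it.
2. poll() interval check: if poll() has not been called within max.poll.interval.ms, the consumer is removed from the group. Since 0.10.1 this check runs inside the consumer client, which then sends a leave-group request to the broker; the broker itself does not track poll() calls.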
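The two checks can be condensed into a toy model. This is a simplified sketch, not Kafka's actual implementation: member_status is a hypothetical function, timestamps are plain millisecond integers, and both checks are folded into one place even though they do not run in the same component in a real deployment.

```python
def member_status(now_ms, last_heartbeat_ms, last_poll_ms,
                  session_timeout_ms, max_poll_interval_ms):
    """Classify a group member using the two liveness checks."""
    # Check 1: no heartbeat seen for longer than session.timeout.ms.
    if now_ms - last_heartbeat_ms > session_timeout_ms:
        return "removed: session timeout"
    # Check 2: poll() not called for longer than max.poll.interval.ms.
    if now_ms - last_poll_ms > max_poll_interval_ms:
        return "removed: poll timeout"
    return "alive"


# Illustrative values: 10 s session timeout, 5 min poll interval.
healthy = member_status(20_000, 18_000, 15_000, 10_000, 300_000)
crashed = member_status(20_000, 5_000, 5_000, 10_000, 300_000)
stuck = member_status(400_000, 399_000, 50_000, 10_000, 300_000)
```

The stuck case is the interesting one: heartbeats are recent, yet the member is still removed because poll() has not been called for 350 seconds.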
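Some readers may ask: if the heartbeat check says the consumer is alive but poll() has timed out, will the consumer still be removed?
Since 0.10.1, session.timeout.ms and max.poll.interval.ms have been decoupled: heartbeats are sent from a background thread, so a consumer is not removed for missing heartbeats while it is busy processing records. However, once the gap between two poll() calls exceeds max.poll.interval.ms, the consumer is still treated as failed: it leaves the group, its partitions are reassigned to other members, and it has to rejoin the group on a later poll(). So the answer is yes, it will be removed.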
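The scenario can be played out as a small timeline. The sketch below assumes a background thread that wakes up once per heartbeat.interval.ms and, on each wake-up, either sends a heartbeat or gives up because the poll deadline has passed. simulate_slow_processing is a hypothetical function and the model ignores network delays and rebalances in progress. session.timeout.ms does not appear in it, since heartbeats never stop until the poll deadline is hit.

```python
def simulate_slow_processing(processing_ms, heartbeat_interval_ms,
                             max_poll_interval_ms):
    """Simulate one long processing step that starts right after poll().

    Returns (leave_time_ms, heartbeats_sent). leave_time_ms is None if
    processing finishes before the poll deadline is exceeded.
    """
    t = heartbeat_interval_ms  # first wake-up of the background thread
    heartbeats_sent = 0
    while t < processing_ms:
        if t > max_poll_interval_ms:
            # Poll deadline exceeded: leave the group instead of heartbeating.
            return t, heartbeats_sent
        heartbeats_sent += 1
        t += heartbeat_interval_ms
    return None, heartbeats_sent


# Illustrative values: 3 s heartbeats, 5 min poll interval.
slow = simulate_slow_processing(400_000, 3_000, 300_000)
quick = simulate_slow_processing(10_000, 3_000, 300_000)
```

With these numbers, slow is (303000, 100): one hundred heartbeats go out on schedule, and the consumer still leaves the group at the first wake-up after the 300-second deadline. quick is (None, 3), because processing ends long before the deadline.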