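Lettuce Command Latency Measurement (CommandLatency)
Lettuce uses LatencyUtils to measure command latency. LatencyUtils is a latency statistics tracking package that provides a number of useful tracking tools. Its LatencyStats class is designed as a simple, drop-in latency recording object for common in-process latency recording and tracking. LatencyStats handles, under the hood, the tracking and correction of pause effects and compensation for coordinated omission. It does so by combining pluggable pause detectors with interval estimators, which together with LatencyStats produce a corrected latency histogram. This article will be updated over time (https://www.cnblogs.com/wei-zw/p/9159234.html).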
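What the measurement data contains
Here is a sample of the command latency statistics. For each connection and command type it reports the number of commands, the minimum, maximum and percentile latencies of the first response, and the same statistics for response completion.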
```
{[local:any -> localhost/127.0.0.1:6379, commandType=GET]=[count=5, timeUnit=MICROSECONDS,
    firstResponse=[min=348, max=518, percentiles={50.0=462, 90.0=518, 95.0=518, 99.0=518, 99.9=518}],
    completion=[min=440, max=8978, percentiles={50.0=544, 90.0=8978, 95.0=8978, 99.0=8978, 99.9=8978}]],
 [local:any -> localhost/127.0.0.1:6379, commandType=SET]=[count=6, timeUnit=MICROSECONDS,
    firstResponse=[min=501, max=15925, percentiles={50.0=581, 90.0=15925, 95.0=15925, 99.0=15925, 99.9=15925}],
    completion=[min=540, max=19267, percentiles={50.0=622, 90.0=19267, 95.0=19267, 99.0=19267, 99.9=19267}]]}
```
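To see why every percentile from 90.0 upward equals the maximum in this sample, it helps to compute percentiles over a handful of values by hand. The sketch below uses the simple nearest-rank rule. That rule is an assumption chosen for illustration and is not necessarily how LatencyUtils or HdrHistogram compute their values. The class name `PercentileSketch` is hypothetical, and the five sample latencies are made up so that only the minimum, median and maximum match the GET `firstResponse` line.

```java
import java.util.Arrays;

final class PercentileSketch {

    /** Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        // rank is 1-based; clamp so that very small p still maps to the first sample
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical first-response latencies in microseconds (count = 5)
        long[] latencies = {348, 402, 462, 490, 518};
        for (double p : new double[] {50.0, 90.0, 95.0, 99.0, 99.9}) {
            System.out.println(p + " = " + percentile(latencies, p));
        }
    }
}
```

With only five samples, the 90th percentile already lands on the largest one, so a single slow command (such as the first command after connecting) dominates every high percentile until more commands have been recorded.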
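How to use command latency measurement
As mentioned above, Lettuce uses LatencyUtils for its default command latency collector, and unless you have a better option the default collector is the recommended choice. Does that mean there is nothing left to do when using the default? Not quite. Let's walk through the source to see what is required. Below is the part of DefaultClientResources that deals with command latency measurement.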
```java
// If no command latency collector was supplied
if (builder.commandLatencyCollector == null) {
    // If the default command latency collector is available
    if (DefaultCommandLatencyCollector.isAvailable()) {
        // Use the user-supplied collector options when they are set
        if (builder.commandLatencyCollectorOptions != null) {
            commandLatencyCollector = new DefaultCommandLatencyCollector(builder.commandLatencyCollectorOptions);
        } else {
            // Otherwise fall back to the default options
            commandLatencyCollector = new DefaultCommandLatencyCollector(DefaultCommandLatencyCollectorOptions.create());
        }
    } else {
        // The default collector is unavailable: use disabled options and a disabled collector
        logger.debug("LatencyUtils/HdrUtils are not available, metrics are disabled");
        builder.commandLatencyCollectorOptions = DefaultCommandLatencyCollectorOptions.disabled();
        commandLatencyCollector = DefaultCommandLatencyCollector.disabled();
    }
    // The collector was created here, so it is not shared
    sharedCommandLatencyCollector = false;
} else {
    // A collector was supplied by the user: use it and mark it as shared
    sharedCommandLatencyCollector = true;
    commandLatencyCollector = builder.commandLatencyCollector;
}
// Command latency publisher options
commandLatencyPublisherOptions = builder.commandLatencyPublisherOptions;

// Create the publisher only if the collector is enabled and publisher options are set
if (commandLatencyCollector.isEnabled() && commandLatencyPublisherOptions != null) {
    metricEventPublisher = new DefaultCommandLatencyEventPublisher(eventExecutorGroup, commandLatencyPublisherOptions,
            eventBus, commandLatencyCollector);
} else {
    // Collector disabled or no publisher options: no metric event publisher
    metricEventPublisher = null;
}
```
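The source above shows that using the default command latency measurement only requires the default command latency collector to be available. So when is it available? It turns out that adding the LatencyUtils dependency to the POM is enough.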
```java
/**
 * Returns true if both HdrUtils and LatencyUtils are available on the classpath.
 */
public static boolean isAvailable() {
    return LATENCY_UTILS_AVAILABLE && HDR_UTILS_AVAILABLE;
}
```
```xml
<dependency>
    <groupId>org.latencyutils</groupId>
    <artifactId>LatencyUtils</artifactId>
    <version>2.0.3</version>
</dependency>
```
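An availability flag of this kind is typically computed by probing the classpath for one class from each optional library. The following is a minimal sketch of that idea and not Lettuce's actual code: the class name `ClasspathProbe` is hypothetical, and which classes Lettuce itself probes for is not shown in the excerpt above, so the two class names used in `main` are assumptions.

```java
final class ClasspathProbe {

    /** Returns true if the named class can be located, without running its static initializers. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed probe targets: one well-known class from each library
        boolean latencyUtils = isPresent("org.LatencyUtils.LatencyStats");
        boolean hdrHistogram = isPresent("org.HdrHistogram.Histogram");
        System.out.println("metrics available: " + (latencyUtils && hdrHistogram));
    }
}
```

Running this with and without the dependency on the classpath is a quick way to confirm whether the collector would be enabled in a given deployment.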
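How the latency data is published
The source below shows that latency data is published through the event bus, and that it is published at a fixed rate. The interval is user-configurable and defaults to 10 minutes.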
```java
/**
 * Default command latency event publisher.
 */
public class DefaultCommandLatencyEventPublisher implements MetricEventPublisher {

    // Executor group used to schedule event emission
    private final EventExecutorGroup eventExecutorGroup;

    // Event publisher options
    private final EventPublisherOptions options;

    // Event bus
    private final EventBus eventBus;

    // Command latency collector
    private final CommandLatencyCollector commandLatencyCollector;

    // The emitter task
    private final Runnable EMITTER = this::emitMetricsEvent;

    private volatile ScheduledFuture<?> scheduledFuture;

    public DefaultCommandLatencyEventPublisher(EventExecutorGroup eventExecutorGroup, EventPublisherOptions options,
            EventBus eventBus, CommandLatencyCollector commandLatencyCollector) {

        this.eventExecutorGroup = eventExecutorGroup;
        this.options = options;
        this.eventBus = eventBus;
        this.commandLatencyCollector = commandLatencyCollector;

        // Only schedule emission when the emit interval is non-zero
        if (!options.eventEmitInterval().isZero()) {
            // Emit metric events at a fixed rate
            scheduledFuture = this.eventExecutorGroup.scheduleAtFixedRate(EMITTER,
                    options.eventEmitInterval().toMillis(), options.eventEmitInterval().toMillis(),
                    TimeUnit.MILLISECONDS);
        }
    }

    @Override
    public boolean isEnabled() {
        // Enabled when the emit interval is non-zero and the task is scheduled
        return !options.eventEmitInterval().isZero() && scheduledFuture != null;
    }

    @Override
    public void shutdown() {

        if (scheduledFuture != null) {
            scheduledFuture.cancel(true);
            scheduledFuture = null;
        }
    }

    @Override
    public void emitMetricsEvent() {

        if (!isEnabled() || !commandLatencyCollector.isEnabled()) {
            return;
        }
        // Publish the command latency event
        eventBus.publish(new CommandLatencyEvent(commandLatencyCollector.retrieveMetrics()));
    }
}
```
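The scheduling pattern in this publisher (a zero interval disables emission, any other interval is used as both the initial delay and the period) can be tried out in isolation with the JDK's own scheduler. This is a minimal sketch under that assumption: `PeriodicEmitter` is a hypothetical class, it uses `ScheduledExecutorService` in place of Netty's `EventExecutorGroup`, and the `Runnable` stands in for publishing a metrics event.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class PeriodicEmitter {

    private final Duration interval;
    private volatile ScheduledFuture<?> scheduledFuture;

    PeriodicEmitter(ScheduledExecutorService executor, Duration interval, Runnable emit) {
        this.interval = interval;
        // A zero interval means "never emit", so nothing is scheduled
        if (!interval.isZero()) {
            long millis = interval.toMillis();
            scheduledFuture = executor.scheduleAtFixedRate(emit, millis, millis, TimeUnit.MILLISECONDS);
        }
    }

    boolean isEnabled() {
        return !interval.isZero() && scheduledFuture != null;
    }

    void shutdown() {
        if (scheduledFuture != null) {
            scheduledFuture.cancel(true);
            scheduledFuture = null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch twoEmissions = new CountDownLatch(2);
        PeriodicEmitter emitter = new PeriodicEmitter(executor, Duration.ofMillis(50), twoEmissions::countDown);
        twoEmissions.await(5, TimeUnit.SECONDS); // wait for two emissions
        System.out.println("enabled before shutdown: " + emitter.isEnabled());
        emitter.shutdown();
        System.out.println("enabled after shutdown: " + emitter.isEnabled());
        executor.shutdownNow();
    }
}
```

Because the first emission is delayed by a full interval, a subscriber sees nothing until one interval has elapsed. With the 10 minute default mentioned above, a short-lived test program may exit before any event is published.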
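How to receive the latency data
We already know that the latency data is published on the event bus, so all that remains is to subscribe to the events on that bus.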
```java
// "events" is a collection declared elsewhere by the caller; it keeps every received event
client.getResources().eventBus().get()
        .filter(redisEvent -> redisEvent instanceof CommandLatencyEvent)
        .cast(CommandLatencyEvent.class)
        .doOnNext(events::add)
        .subscribe(System.out::println);
```
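The subscription above relies on a filter-then-cast step because the bus carries many event types and the subscriber only wants one of them. The sketch below shows that step with plain JDK types. It is not the Lettuce or Reactor API: `TinyEventBus`, `ConnectedEvent` and `LatencyEvent` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class TinyEventBus {

    /** Hypothetical event types standing in for the events a real bus would carry. */
    static final class ConnectedEvent { }

    static final class LatencyEvent {
        final String summary;

        LatencyEvent(String summary) {
            this.summary = summary;
        }
    }

    private final List<Consumer<Object>> subscribers = new CopyOnWriteArrayList<>();

    /** Registers a handler that only sees events of the given type. */
    <T> void subscribe(Class<T> type, Consumer<? super T> handler) {
        subscribers.add(event -> {
            if (type.isInstance(event)) {        // the "filter" step
                handler.accept(type.cast(event)); // the "cast" step
            }
        });
    }

    void publish(Object event) {
        for (Consumer<Object> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }

    public static void main(String[] args) {
        TinyEventBus bus = new TinyEventBus();
        List<LatencyEvent> received = new ArrayList<>();
        bus.subscribe(LatencyEvent.class, received::add);
        bus.publish(new ConnectedEvent());            // ignored by the latency subscriber
        bus.publish(new LatencyEvent("GET count=5")); // delivered
        System.out.println("latency events received: " + received.size());
    }
}
```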
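How Lettuce measures latency
CommandHandler extends ChannelDuplexHandler, so both sending and receiving are handled in CommandHandler. That is why latency measurement is implemented there as well.
Let's start with the write method.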
```java
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {

    if (debugEnabled) {
        logger.debug("{} write(ctx, {}, promise)", logPrefix(), msg);
    }

    // A message implementing RedisCommand is a single command
    if (msg instanceof RedisCommand) {
        writeSingleCommand(ctx, (RedisCommand<?, ?, ?>) msg, promise);
        return;
    }

    // A message implementing List is a batch of commands
    if (msg instanceof List) {

        List<RedisCommand<?, ?, ?>> batch = (List<RedisCommand<?, ?, ?>>) msg;
        // A batch of size 1 is still written as a single command
        if (batch.size() == 1) {

            writeSingleCommand(ctx, batch.get(0), promise);
            return;
        }
        // Write the batch
        writeBatch(ctx, batch, promise);
        return;
    }

    if (msg instanceof Collection) {
        writeBatch(ctx, (Collection<RedisCommand<?, ?, ?>>) msg, promise);
    }
}
```
```java
private void writeSingleCommand(ChannelHandlerContext ctx, RedisCommand<?, ?, ?> command, ChannelPromise promise) {

    if (!isWriteable(command)) {
        promise.trySuccess();
        return;
    }
    // Enqueue the command on the stack
    addToStack(command, promise);
    ctx.write(command, promise);
}
```
```java
private void addToStack(RedisCommand<?, ?, ?> command, ChannelPromise promise) {

    try {

        validateWrite(1);

        if (command.getOutput() == null) {
            // fire&forget commands are excluded from metrics
            complete(command);
        }

        RedisCommand<?, ?, ?> redisCommand = potentiallyWrapLatencyCommand(command);

        if (promise.isVoid()) {
            stack.add(redisCommand);
        } else {
            promise.addListener(AddToStack.newInstance(stack, redisCommand));
        }
    } catch (Exception e) {
        command.completeExceptionally(e);
        throw e;
    }
}
```
```java
/**
 * Wraps the command in a latency-metered command if metrics are enabled.
 */
private RedisCommand<?, ?, ?> potentiallyWrapLatencyCommand(RedisCommand<?, ?, ?> command) {

    // Latency metrics disabled: return the command unchanged
    if (!latencyMetricsEnabled) {
        return command;
    }

    // The command already carries latency data
    if (command instanceof WithLatency) {

        WithLatency withLatency = (WithLatency) command;
        // Reset the timestamps
        withLatency.firstResponse(-1);
        withLatency.sent(nanoTime());

        return command;
    }

    // Create a latency-metered command and initialize its timestamps
    LatencyMeteredCommand<?, ?, ?> latencyMeteredCommand = new LatencyMeteredCommand<>(command);
    latencyMeteredCommand.firstResponse(-1);
    latencyMeteredCommand.sent(nanoTime());

    return latencyMeteredCommand;
}
```
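At this point the command has been wrapped in a latency-metered command and its send time has been recorded. The response time is recorded when the response is received.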
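The wrap-or-reset behaviour is worth a closer look: a command that already carries timestamps is not wrapped a second time, its timestamps are simply reset. The sketch below reproduces that logic with stand-in types so it can be run without Lettuce. `Command`, `Timed` and `TimedCommand` are hypothetical substitutes for `RedisCommand`, `WithLatency` and `LatencyMeteredCommand`, and the clock is passed in as a parameter so the result is deterministic.

```java
import java.util.function.LongSupplier;

final class LatencyWrapping {

    /** Hypothetical stand-in for RedisCommand. */
    interface Command {
        String name();
    }

    /** Hypothetical stand-in for WithLatency. */
    interface Timed {
        void sent(long nanos);
        long getSent();
        void firstResponse(long nanos);
        long getFirstResponse();
    }

    /** Decorator that adds timestamps to a command; stand-in for LatencyMeteredCommand. */
    static final class TimedCommand implements Command, Timed {
        private final Command delegate;
        private long sent = -1;
        private long firstResponse = -1;

        TimedCommand(Command delegate) {
            this.delegate = delegate;
        }

        @Override public String name() { return delegate.name(); }
        @Override public void sent(long nanos) { this.sent = nanos; }
        @Override public long getSent() { return sent; }
        @Override public void firstResponse(long nanos) { this.firstResponse = nanos; }
        @Override public long getFirstResponse() { return firstResponse; }
    }

    static Command wrap(Command command, LongSupplier clock) {
        if (command instanceof Timed) {
            // Already wrapped (for example, written again): reset instead of double-wrapping
            Timed timed = (Timed) command;
            timed.firstResponse(-1);
            timed.sent(clock.getAsLong());
            return command;
        }
        TimedCommand wrapped = new TimedCommand(command);
        wrapped.firstResponse(-1);
        wrapped.sent(clock.getAsLong());
        return wrapped;
    }

    public static void main(String[] args) {
        Command get = () -> "GET";
        Command first = wrap(get, System::nanoTime);
        Command second = wrap(first, System::nanoTime);
        System.out.println("same instance after re-wrap: " + (first == second));
    }
}
```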
```java
private boolean decode(ChannelHandlerContext ctx, ByteBuf buffer, RedisCommand<?, ?, ?> command) {

    // Latency metrics are enabled and the command implements WithLatency
    if (latencyMetricsEnabled && command instanceof WithLatency) {

        WithLatency withLatency = (WithLatency) command;
        // If the first response time is still unset (-1), set it to the current time in nanoseconds
        if (withLatency.getFirstResponse() == -1) {
            withLatency.firstResponse(nanoTime());
        }
        // Decode; if the response is not yet complete, return without recording latency
        if (!decode0(ctx, buffer, command)) {
            return false;
        }
        // Record the latency data
        recordLatency(withLatency, command.getType());

        return true;
    }

    return decode0(ctx, buffer, command);
}
```
```java
private void recordLatency(WithLatency withLatency, ProtocolKeyword commandType) {

    // withLatency is set, the collector is enabled, and both channel and remote() are non-null
    if (withLatency != null && clientResources.commandLatencyCollector().isEnabled() && channel != null
            && remote() != null) {
        // First response latency = first response time - send time
        long firstResponseLatency = withLatency.getFirstResponse() - withLatency.getSent();
        // Completion latency = current time - send time
        long completionLatency = nanoTime() - withLatency.getSent();
        // Record both values with the latency collector
        clientResources.commandLatencyCollector().recordCommandLatency(local(), remote(), commandType,
                firstResponseLatency, completionLatency);
    }
}
```
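The difference between the two latencies is easiest to see when a reply arrives in more than one chunk: decode is attempted once per chunk, the `== -1` guard keeps the timestamp of the first attempt, and completion is measured when the last attempt succeeds. The following sketch replays that timeline with a scripted clock. `DecodeTimeline` and its `onChunk` method are hypothetical, and the tick values are invented for the example.

```java
import java.util.function.LongSupplier;

final class DecodeTimeline {

    private final LongSupplier clock;
    private final long sent;
    private long firstResponse = -1;

    long firstResponseLatency = -1;
    long completionLatency = -1;

    DecodeTimeline(LongSupplier clock) {
        this.clock = clock;
        this.sent = clock.getAsLong(); // the command is "sent" at construction time
    }

    /** Called once per received chunk; complete tells whether the reply is now fully decoded. */
    boolean onChunk(boolean complete) {
        if (firstResponse == -1) {
            firstResponse = clock.getAsLong(); // only the first chunk sets this
        }
        if (!complete) {
            return false; // wait for more data, nothing is recorded yet
        }
        firstResponseLatency = firstResponse - sent;
        completionLatency = clock.getAsLong() - sent;
        return true;
    }

    public static void main(String[] args) {
        // Scripted clock (nanoseconds): sent, first chunk seen, reply fully decoded
        long[] ticks = {1_000, 1_400, 1_900};
        int[] next = {0};
        DecodeTimeline timeline = new DecodeTimeline(() -> ticks[next[0]++]);

        timeline.onChunk(false); // partial reply
        timeline.onChunk(true);  // rest of the reply

        System.out.println("firstResponse latency: " + timeline.firstResponseLatency);
        System.out.println("completion latency:    " + timeline.completionLatency);
    }
}
```

In this model the completion latency can never be smaller than the first response latency, which matches the sample output earlier in the article, where every completion value is at least as large as the corresponding firstResponse value.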
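In short, the first response time is the moment decoding starts, and the completion time is the moment decoding finishes. If I have missed anything, feel free to leave a comment.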