How to achieve smooth service releases with Spring Cloud
An earlier article covered taking a service offline gracefully; see:
"How to take a Spring Cloud service offline from Eureka safely and gracefully"
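That approach is still not smooth for Ribbon calls. The service shuts down as soon as the shutdown request arrives, but consumers have not yet noticed that it is offline. They keep sending requests to the instance, which then fail.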
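There are several solutions.
Option 1: enable retries (only if the interfaces are idempotent).
Option 2: take the service offline with pause (recommended)
The steps are as follows:
1. Service provider configuration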
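# Disable security checks on the management endpoints
management.security.enabled=false
# Enable the pause endpoint
endpoints.pause.enabled=true
# Disable password verification for the pause endpoint
endpoints.pause.sensitive=false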
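Because these management endpoints are sensitive, add a filter that restricts them to an IP whitelist.
Reference code: "Restricting actuator management endpoints with an IP whitelist (adding a filter in Spring Boot)"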
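The whitelist decision itself can be sketched in a few lines. This is a minimal illustration, not the code from the referenced article: the class name `ManagementIpWhitelist`, the protected paths and the allowed addresses are all assumptions. In a real application this check would be called from a servlet filter, which would reject the request (for example with 403) when `isAllowed` returns false.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper: decides whether a caller may reach a management endpoint.
 * Intended to be called from a servlet filter's doFilter method.
 */
public class ManagementIpWhitelist {

    private final Set<String> allowedIps;
    private final Set<String> protectedPaths;

    public ManagementIpWhitelist(Set<String> allowedIps, Set<String> protectedPaths) {
        this.allowedIps = new HashSet<>(allowedIps);
        this.protectedPaths = new HashSet<>(protectedPaths);
    }

    /** Non-management paths are always allowed; protected paths only from whitelisted IPs. */
    public boolean isAllowed(String path, String remoteIp) {
        if (!protectedPaths.contains(path)) {
            return true;
        }
        return allowedIps.contains(remoteIp);
    }

    public static void main(String[] args) {
        // Assumed values for illustration only
        ManagementIpWhitelist whitelist = new ManagementIpWhitelist(
                new HashSet<>(Arrays.asList("127.0.0.1")),
                new HashSet<>(Arrays.asList("/pause", "/shutdown")));

        System.out.println(whitelist.isAllowed("/pause", "127.0.0.1")); // true
        System.out.println(whitelist.isAllowed("/pause", "10.0.0.9"));  // false
        System.out.println(whitelist.isAllowed("/orders", "10.0.0.9")); // true
    }
}
```

If the service sits behind a reverse proxy, the filter would need to decide which address to trust (the socket address or a forwarded header); that choice is left out of this sketch.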
2. Service consumer configuration
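# Fetch the latest registry information every 2 seconds
eureka.client.registry-fetch-interval-seconds=2
# Refresh Ribbon's cached server list every 2 seconds
ribbon.ServerListRefreshInterval=2000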
3. Release procedure
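curl -X POST http://127.0.0.1:<port>/pause
sleep 6
kill -9 <pid>        # pid of the paused service process
java -jar xx.jar     # start the service
curl -I -m 10 -o /dev/null -s -w %{http_code} http://127.0.0.1:<port>/health
Use the last command to check that the service returns 200, and keep checking for N seconds. If the check fails, roll back the release and stop releasing to the remaining nodes.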
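Note: the theoretical maximum for the sleep here is eureka.client.registry-fetch-interval-seconds + (ribbon.ServerListRefreshInterval + eureka.client.registry-fetch-interval-seconds) = 6s.
The sum in parentheses only applies when the two timers happen to just miss each other. To be safe, add another second or two on top of the formula.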
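The arithmetic can be written out as a small helper, which is handy if the intervals are changed later. This is only a sketch; the class and method names are hypothetical, and the inputs are the values configured in this article (both intervals at 2 seconds).

```java
/**
 * Hypothetical helper that evaluates the sleep formula from this article:
 * fetch + (ribbonRefresh + fetch) + safetyMargin, all in seconds.
 */
public class PauseWaitCalculator {

    public static long maxWaitSeconds(long registryFetchSeconds,
                                      long ribbonRefreshSeconds,
                                      long safetyMarginSeconds) {
        // Worst case: the registry fetch and the Ribbon refresh just miss each other
        long worstCaseMiss = ribbonRefreshSeconds + registryFetchSeconds;
        return registryFetchSeconds + worstCaseMiss + safetyMarginSeconds;
    }

    public static void main(String[] args) {
        // eureka.client.registry-fetch-interval-seconds=2, Ribbon refresh every 2 seconds
        System.out.println(maxWaitSeconds(2, 2, 0)); // 6
        // Same settings with a 2-second safety margin
        System.out.println(maxWaitSeconds(2, 2, 2)); // 8
    }
}
```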
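Why call /health? Mainly to warm up the service (chiefly the database connection pool, the Jedis connection pool and so on), so that the first request to a service with a tight timeout does not time out.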
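The "returns 200 for N seconds, otherwise roll back" check from the release procedure could be expressed as a small gate like the one below. This is a sketch under assumptions, not part of the original scripts: `HealthGate`, `waitUntilHealthy` and `httpProbe` are hypothetical names, and the probe is passed in as an `IntSupplier` so the polling logic can be exercised without a running service.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.function.IntSupplier;

/** Hypothetical release gate: passes only after several consecutive 200 responses. */
public class HealthGate {

    /**
     * Polls the probe up to maxAttempts times and returns true as soon as it has
     * seen requiredConsecutive 200 responses in a row. Any other status resets the streak.
     */
    public static boolean waitUntilHealthy(IntSupplier probe, int requiredConsecutive,
                                           int maxAttempts, long delayMillis) {
        int streak = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            streak = (probe.getAsInt() == 200) ? streak + 1 : 0;
            if (streak >= requiredConsecutive) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false; // treat interruption as a failed check
                }
            }
        }
        return false;
    }

    /** Probe that sends a HEAD request (like curl -I) and returns the status, or -1 on error. */
    public static IntSupplier httpProbe(String url) {
        return () -> {
            try {
                HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
                conn.setRequestMethod("HEAD");
                conn.setConnectTimeout(10_000); // mirrors curl -m 10
                conn.setReadTimeout(10_000);
                try {
                    return conn.getResponseCode();
                } finally {
                    conn.disconnect();
                }
            } catch (IOException | IllegalArgumentException e) {
                return -1;
            }
        };
    }
}
```

A release script could call `HealthGate.waitUntilHealthy(HealthGate.httpProbe(healthUrl), 5, 30, 1000)` with the node's /health URL and trigger the rollback when it returns false; the counts and delay here are placeholders to tune per service.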
4. Eureka server configuration
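# Evict expired registrations every 5 seconds
# If you follow the release procedure above, this is optional and the default value works
eureka.server.eviction-interval-timer-in-ms=5000
# Disable self-preservation
# Intranet services do not need partition protection
eureka.server.enable-self-preservation=false
# A newly registered service becomes discoverable within 5 seconds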
Option 3: extend Tomcat's shutdown hook (not recommended; it stops working if you switch to another container)
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.catalina.connector.Connector;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.context.embedded.tomcat.TomcatConnectorCustomizer;
import org.springframework.context.ApplicationListener;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.event.ContextClosedEvent;

import lombok.extern.slf4j.Slf4j;

/**
 * Gracefully shuts down Tomcat.
 *
 * @author yangzl
 * @date April 2, 2019
 */
@Slf4j
@Configuration
public class TomcatGracefulShutdown implements TomcatConnectorCustomizer, ApplicationListener<ContextClosedEvent> {

    // Holds the configured wait time in seconds
    @Autowired
    private ShutdownProperties properties;

    private volatile Connector connector;

    @Override
    public void customize(Connector connector) {
        this.connector = connector;
    }

    @Override
    public void onApplicationEvent(final ContextClosedEvent event) {
        LocalDateTime startShutdown = LocalDateTime.now();
        LocalDateTime stopShutdown = LocalDateTime.now();
        try {
            log.info("We are now in down mode, please wait " + properties.getWaitSecond() + " second(s)...");
            if (connector == null) {
                // No connector was customized, e.g. when running unit tests
                log.info("We are running unit test ... ");
                Thread.sleep(properties.getWaitSecond() * 1000);
                return;
            }
            // Stop accepting new requests
            connector.pause();
            final Executor executor = connector.getProtocolHandler().getExecutor();
            if (executor instanceof ThreadPoolExecutor) {
                log.info("executor is ThreadPoolExecutor");
                final ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) executor;
                // Let in-flight requests finish, waiting at most the configured time
                threadPoolExecutor.shutdown();
                if (!threadPoolExecutor.awaitTermination(properties.getWaitSecond(), TimeUnit.SECONDS)) {
                    log.warn("Tomcat thread pool did not shut down gracefully within " + properties.getWaitSecond()
                            + " second(s). Proceeding with force shutdown");
                } else {
                    log.debug("Tomcat thread pool is empty, we stop now");
                }
            }
            stopShutdown = LocalDateTime.now();
        } catch (final InterruptedException ex) {
            log.error("The await termination has been interrupted : " + ex.getMessage());
            Thread.currentThread().interrupt();
        } finally {
            final long seconds = Duration.between(startShutdown, stopShutdown).getSeconds();
            log.info("Shutdown performed in " + seconds + " second(s)");
        }
    }
}
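When shutdown is called, Tomcat waits M seconds before exiting. The effect is roughly the same as Option 2, but the final exit sometimes reports an error, and the approach only works with Tomcat, so it is not general enough.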