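Notes on a Spring Cloud microservice module that hangs after running for a while
One module would hang after running for some time, which caused Feign calls from the other modules to time out.
At first I assumed it was a Feign configuration problem, so I raised the timeouts: Feign's connectTimeout and readTimeout,
as well as ribbon's ReadTimeout and ConnectTimeout.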
# feign configuration
feign:
  sentinel:
    enabled: true
  okhttp:
    enabled: true
  httpclient:
    enabled: false
  client:
    config:
      default:
        connectTimeout: 30000
        readTimeout: 30000
  compression:
    request:
      enabled: true
    response:
      enabled: true
# request handling timeouts
ribbon:
  ReadTimeout: 30000
  ConnectTimeout: 30000
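That did not help. After running for a while the module hung again; this time I noticed that its Swagger page would not open either, and the system printed no error logs at all. What?
At this point it was clear that things were not that simple, so I exported a JVM thread dump for the module's pid.
Command: jstack -l pid > dump.log
Part of the dump: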
2022-06-16 14:51:15
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode):

"Attach Listener" #1460 daemon prio=9 os_prio=0 tid=0x00007f90047c3800 nid=0x394e92 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"XNIO-1 task-16" #1459 prio=5 os_prio=0 tid=0x00007f8ff8544800 nid=0x3949f8 waiting on condition [0x00007f8fdf9bd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e1149748> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at com.alibaba.druid.pool.DruidDataSource.takeLast(DruidDataSource.java:2175)
    at com.alibaba.druid.pool.DruidDataSource.getConnectionInternal(DruidDataSource.java:1672)
    at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1409)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1389)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1379)
    at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:109)
    at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79)
    at org.mybatis.spring.transaction.SpringManagedTransaction.openConnection(SpringManagedTransaction.java:80)
    at org.mybatis.spring.transaction.SpringManagedTransaction.getConnection(SpringManagedTransaction.java:67)
    at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:336)
    at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.prepareStatement(MybatisSimpleExecutor.java:91)
    at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doQuery(MybatisSimpleExecutor.java:66)
    at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
    at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
    at com.baomidou.mybatisplus.core.executor.MybatisCachingExecutor.query(MybatisCachingExecutor.java:163)
    at com.baomidou.mybatisplus.core.executor.MybatisCachingExecutor.query(MybatisCachingExecutor.java:90)
    at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63)
    at com.sun.proxy.$Proxy259.query(Unknown Source)
.....
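A full dump is long, so it helps to check how many threads are stuck in the same place rather than reading it by eye. The following is only a rough sketch of that idea, not a tool I actually used: the class name ThreadDumpSummary and the method threadsWithFrame are made up for illustration, and it assumes the usual jstack text layout (a quoted thread name on the header line, followed by "at ..." frame lines).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the threads in a jstack dump whose stack contains a given frame.
class ThreadDumpSummary {

    /** Returns the names of threads that have at least one "at ..." line containing frame. */
    static List<String> threadsWithFrame(String dump, String frame) {
        List<String> hits = new ArrayList<>();
        String current = null;   // name of the thread whose stack is being read
        boolean matched = false; // true once the current thread has been recorded
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Thread header line: "name" #id prio=... ; take the text between the first two quotes
                int end = line.indexOf('"', 1);
                current = end > 0 ? line.substring(1, end) : line;
                matched = false;
            } else if (current != null && !matched
                    && line.startsWith("at ") && line.contains(frame)) {
                hits.add(current);
                matched = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Usage (assumed): java ThreadDumpSummary dump.log
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> waiting = threadsWithFrame(dump, "DruidDataSource.takeLast");
        System.out.println(waiting.size() + " thread(s) waiting for a pooled connection: " + waiting);
    }
}
```

If most of the request threads (XNIO-1 task-N here) show up in that list, the module is blocked on acquiring database connections rather than on Feign itself.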
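The request thread is parked in DruidDataSource.takeLast, that is, it is waiting for the connection pool to hand out a connection. I put this down to a DruidDataSource connection pool bug. Then I checked the version I was on: 1.1.22. That old? Time to sacrifice a programmer to the gods. I upgraded to 1.2.5
and added some configuration as follows (this is just my own setup; adjust the values to your project's needs).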
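# data source configuration
spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource
    druid:
      driver-class-name: xxxxxxxxxxxxxxxxx
      username: xxxxxxxxxxxxxxxxx
      password: xxxxxxxxxxxxxxxxx
      url: xxxxxxxxxxxxxxxxx
      # initial number of connections
      initialSize: 5
      # minimum number of connections kept in the pool
      minIdle: 10
      # maximum number of connections in the pool
      maxActive: 20
      # how long to wait when acquiring a connection before timing out, in milliseconds
      maxWait: 60000
      # interval between checks for idle connections that should be closed, in milliseconds
      timeBetweenEvictionRunsMillis: 60000
      # minimum time a connection stays in the pool, in milliseconds
      minEvictableIdleTimeMillis: 300000
      # maximum time a connection stays in the pool, in milliseconds
      maxEvictableIdleTimeMillis: 900000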
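To show why a setting like maxWait matters, here is a toy pool built on a BlockingQueue. It is not Druid's implementation; ToyPool, borrow and giveBack are hypothetical names, and the sketch only assumes the general idea that a wait without a timeout parks the caller until a connection is returned, while a bounded wait gives up after the limit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy connection pool for illustration only; strings stand in for real connections.
class ToyPool {
    private final BlockingQueue<String> idle;

    ToyPool(int maxActive) {
        idle = new ArrayBlockingQueue<>(maxActive);
        for (int i = 0; i < maxActive; i++) {
            idle.add("conn-" + i);
        }
    }

    /**
     * Borrows a connection. With maxWaitMillis <= 0 the caller blocks until one is
     * returned (possibly forever); otherwise null is returned once the wait expires.
     */
    String borrow(long maxWaitMillis) {
        try {
            if (maxWaitMillis <= 0) {
                return idle.take();
            }
            return idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    void giveBack(String conn) {
        idle.add(conn);
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(1);
        String leaked = pool.borrow(100);  // borrowed and never returned
        String second = pool.borrow(100);  // pool is empty, so this gives up after 100 ms
        System.out.println("first=" + leaked + ", second=" + second);
        // pool.borrow(0) at this point would park the thread indefinitely.
    }
}
```

With a bounded wait the caller can fail and log an error, which is easier to diagnose than request threads that silently stay parked.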
OK, problem solved.