准备
本次测试具备的环境很简单,项目都是 java 项目,jdk 为1.7(1.8 也成),tomcat 为7版本,war 包放进 tomcat 路径下的 webapps 就成。
然后改数据库配置文件
sql 文件放到数据库内执行,生成相应的数据库以及表
首页:http://www.ganziwen.cn:8082/dangdang/main/main.jsp
注册页面:http://www.ganziwen.cn:8082/dangdang/user/registForm.jsp
线程栈死锁
问题描述
数据库保证已经完全连上,但是首页的图片无法加载出来
排查方案
出现这种问题尝试结合之前讲的架构从几个方面去分析,主要是排查法:
一、请求没法送到服务器
- 负载机 cpu、内存、网络、磁盘
二、请求发送到服务器但是无法被处理
- 应用服务器/数据库服务器 cpu、内存、网络、磁盘
- 容器连接池
- 数据库连接池
- 代码逻辑
- sql 语句慢
- JVM 的堆内存(jmap -histo:live pid ,jmap -dump:live,format=b,file=heap.bin pid)
- gc 频繁(jstat -gcutil pid 1000)
- 线程栈问题(jstack pid)
- 磁盘 io(nmon/sar -d/iostat -x)
排查过程
这里我们的负载机排查很好排查:直接访问一个服务器上的其他链接:
从这张图片,可以侧面论证,我们的负载机是没有问题的,最起码可以访问服务器,同时也说明 web 容器的连接池并没有满,因为可以访问我们的 82 端口,那么尝试从其他方面去考虑,看看 jvm 和线程栈
首先我们查一下 java 进程,得到 pid 为 8071:
# ps -ef|grep java|grep -v grep root 8071 1 0 13:10 pts/0 00:00:10 /opt/jdk1.8/bin/java -Djava.util.logging.config.file=/opt/tomcat7/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djdk.tls.ephemeralDHKeySize=2048 -Dignore.endorsed.dirs= -classpath /opt/tomcat7/bin/bootstrap.jar:/opt/tomcat7/bin/tomcat-juli.jar -Dcatalina.base=/opt/tomcat7 -Dcatalina.home=/opt/tomcat7 -Djava.io.tmpdir=/opt/tomcat7/temp org.apache.catalina.startup.Bootstrap start
然后我们看一下 jvm 的 gc 情况:
# jstat -gcutil 8071 3000 S0 S1 E O M CCS YGC YGCT FGC FGCT GCT 0.00 100.00 64.96 82.81 96.73 92.25 25 0.189 1 0.051 0.239 0.00 100.00 64.96 82.81 96.73 92.25 25 0.189 1 0.051 0.239 0.00 100.00 65.23 82.81 96.73 92.25 25 0.189 1 0.051 0.239 0.00 100.00 65.23 82.81 96.73 92.25 25 0.189 1 0.051 0.239
可以发现。,gc 的频率并不是很高,暂时排除是 gc 频繁导致的
接下来看看线程栈问题:
jstack 8071 > 1.log
打开 1.log:
我们可以发现,在 http-nio-8082-exec-XXX 这种的线程中状态几乎都为 blocked,说明所有的 nio 的线程锁住了,没有可供使用的线程都被死锁了,代表请求发送到服务器,没有线程可供处理。而且有一个 deadblock
Found one Java-level deadlock: ============================= "http-bio-8082-exec-10": waiting to lock monitor 0x00007fa8a4005c98 (object 0x00000000e161a920, a java.lang.Object), which is held by "http-bio-8082-exec-8" "http-bio-8082-exec-8": waiting to lock monitor 0x00007fa888005c98 (object 0x00000000e16159d8, a java.lang.Object), which is held by "http-bio-8082-exec-10"
我们看到这里,大概可以判断到应该是线程栈死锁导致的,而且可以看到在锁住的线程调用的方法的路径以及方法名是:org.tarena.common 路径下的 DbUtil类 路径下的 getConnection 方法的 43 行(DbUtil.java:43)
"http-nio-8082-exec-109" #139 daemon prio=5 os_prio=0 tid=0x00007f915c840800 nid=0x2087 waiting for monitor entry [0x00007f91336b2000] java.lang.Thread.State: BLOCKED (on object monitor) at org.tarena.common.DbUtil.getConnection(DbUtil.java:43) - waiting to lock <0x00000000ec36cbb8> (a java.lang.Object) at org.tarena.dao.impl.DangDAOImpl.getConnection(DangDAOImpl.java:18) at org.tarena.dao.impl.BookDAOImpl.findHotBoardBooks(BookDAOImpl.java:159) at org.tarena.action.main.HotBoardBook.execute(HotBoardBook.java:20) at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.opensymphony.xwork2.DefaultActionInvocation.invokeAction(DefaultActionInvocation.java:441) at com.opensymphony.xwork2.DefaultActionInvocation.invokeActionOnly(DefaultActionInvocation.java:280) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:243) at com.opensymphony.xwork2.interceptor.DefaultWorkflowInterceptor.doIntercept(DefaultWorkflowInterceptor.java:165) at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:87) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.validator.ValidationInterceptor.doIntercept(ValidationInterceptor.java:252) at org.apache.struts2.interceptor.validation.AnnotationValidationInterceptor.doIntercept(AnnotationValidationInterceptor.java:68) at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:87) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ConversionErrorInterceptor.intercept(ConversionErrorInterceptor.java:122) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ParametersInterceptor.doIntercept(ParametersInterceptor.java:195) at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:87) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ParametersInterceptor.doIntercept(ParametersInterceptor.java:195) at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:87) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.StaticParametersInterceptor.intercept(StaticParametersInterceptor.java:179) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.interceptor.MultiselectInterceptor.intercept(MultiselectInterceptor.java:75) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.interceptor.CheckboxInterceptor.intercept(CheckboxInterceptor.java:94) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.interceptor.FileUploadInterceptor.intercept(FileUploadInterceptor.java:235) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ModelDrivenInterceptor.intercept(ModelDrivenInterceptor.java:89) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ScopedModelDrivenInterceptor.intercept(ScopedModelDrivenInterceptor.java:130) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.interceptor.debugging.DebuggingInterceptor.intercept(DebuggingInterceptor.java:267) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ChainingInterceptor.intercept(ChainingInterceptor.java:126) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.PrepareInterceptor.doIntercept(PrepareInterceptor.java:138) at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:87) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.I18nInterceptor.intercept(I18nInterceptor.java:165) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.interceptor.ServletConfigInterceptor.intercept(ServletConfigInterceptor.java:164) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.AliasInterceptor.intercept(AliasInterceptor.java:179) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at com.opensymphony.xwork2.interceptor.ExceptionMappingInterceptor.intercept(ExceptionMappingInterceptor.java:176) at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:237) at org.apache.struts2.impl.StrutsActionProxy.execute(StrutsActionProxy.java:52) at org.apache.struts2.dispatcher.Dispatcher.serviceAction(Dispatcher.java:488) at org.apache.struts2.dispatcher.ng.ExecuteOperations.executeAction(ExecuteOperations.java:77) at org.apache.struts2.dispatcher.ng.filter.StrutsPrepareAndExecuteFilter.doFilter(StrutsPrepareAndExecuteFilter.java:91) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:494) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:1025) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:445) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1137) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1775) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1734) - locked <0x00000000e0f1c108> (a org.apache.tomcat.util.net.NioChannel) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
我们切换到该路径下:这里不可以直接 vi ,将文件 down 到本地利用 jd-gui 反编译 .class文件成为一个 .java 文件
# pwd /opt/tomcat7/webapps/dangdang/WEB-INF/classes/org/tarena/common
# ll
total 36
-rw-r--r-- 1 root root 737 Jun 24 2016 BooToStrUtil.class
-rw-r--r-- 1 root root 338 Jun 24 2016 Constant.class
-rw-r--r-- 1 root root 2697 Jun 24 2016 CookieUtil.class
-rw-r--r-- 1 root root 371 Jun 24 2016 DangException.class
-rw-r--r-- 1 root root 2954 Jun 24 2016 DbUtil.class
-rw-r--r-- 1 root root 1397 Jun 24 2016 DegistUtil.class
-rw-r--r-- 1 root root 440 Jun 24 2016 EmailUtil.class
-rw-r--r-- 1 root root 2998 Jun 24 2016 ImageUtil.class
-rw-r--r-- 1 root root 2237 Jun 24 2016 VerifyUtil.clas
反编译打开如下:
1 // 2 // Source code recreated from a .class file by IntelliJ IDEA 3 // (powered by Fernflower decompiler) 4 // 5 6 package org.tarena.common; 7 8 import java.sql.Connection; 9 import java.sql.DriverManager; 10 import java.sql.SQLException; 11 import java.util.Properties; 12 13 public class DbUtil { 14 private static Object ds = new Object(); 15 private static Object connLocal = new Object(); 16 private static String driver; 17 private static String url; 18 private static String dbUser; 19 private static String dbPwd; 20 public static int maxConnection; 21 public static int countConnection; 22 23 static { 24 Properties pro = new Properties(); 25 26 try { 27 pro.load(DbUtil.class.getClassLoader().getResourceAsStream("dbcp.properties")); 28 url = pro.getProperty("url"); 29 driver = pro.getProperty("driverClassName"); 30 dbUser = pro.getProperty("username"); 31 dbPwd = pro.getProperty("password"); 32 maxConnection = Integer.parseInt(pro.getProperty("maxActive")); 33 Class.forName(driver); 34 } catch (Exception var2) { 35 var2.printStackTrace(); 36 } 37 38 } 39 40 public DbUtil() { 41 } 42 43 public static Connection getConnection() throws SQLException { 44 synchronized(ds) { 45 Connection connection = null; 46 47 try { 48 synchronized(connLocal) { 49 connection = DriverManager.getConnection(url, dbUser, dbPwd); 50 System.out.println(countConnection + ":" + maxConnection + ":" + (countConnection > maxConnection)); 51 if (countConnection > maxConnection) { 52 throw new RuntimeException(); 53 } 54 55 ++countConnection; 56 System.out.println(countConnection); 57 } 58 } catch (SQLException var4) { 59 var4.printStackTrace(); 60 } 61 62 return connection; 63 } 64 } 65 66 public static void closeConnection(Connection conn) throws SQLException { 67 synchronized(connLocal) { 68 if (conn != null) { 69 synchronized(ds) { 70 System.out.println("->cloase"); 71 } 72 } 73 74 } 75 } 76 }
这里是因为加了一个同步锁:synchronized(ds) ,要解决把这里删掉就成了。
如果是其他问题:
应用服务器 cpu :top,看 CPU 使用率和平均负载
服务器 io 问题:iostat -x或者 sar -d
- avgrq-sz:每个IO的平均扇区数,即所有请求的平均大小,以扇区(512字节)为单位
- avgqu-sz:平均为完成的IO请求数量,即平均意义山的请求队列长度
# iostat -x Linux 2.6.32-754.9.1.el6.x86_64 (xiaowenshu) 02/03/2019 _x86_64_ (1 CPU)avg-cpu: %user %nice %system %iowait %steal %idle
0.49 0.00 0.28 0.02 0.00 99.21Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.01 0.70 0.06 0.38 3.68 8.65 28.25 0.00 5.68 1.67 6.30 1.11 0.05
# sar -d
Linux 2.6.32-754.9.1.el6.x86_64 (xiaowenshu) 02/03/2019 x86_64 (1 CPU)
03:20:36 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
03:20:38 PM dev252-0 1.51 0.00 44.22 29.33 0.00 3.00 3.00 0.45
03:20:40 PM dev252-0 2.01 0.00 40.20 20.00 0.00 1.00 0.75 0.15
03:20:42 PM dev252-0 1.99 0.00 31.84 16.00 0.00 0.50 0.50 0.10
Average: dev252-0 1.84 0.00 38.73 21.09 0.00 1.36 1.27 0.23
数据库连接池:我们的连接池有 151 个,远远没有达到
堆内存溢出
重现问题:压测该页面(里面有俩请求)
压测后报错:OOM(heap space)
java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:3332) at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) at java.lang.StringBuilder.append(StringBuilder.java:136) at org.apache.catalina.valves.ErrorReportValve.getPartialServletStackTrace(ErrorReportValve.java:332) at org.apache.catalina.valves.ErrorReportValve.report(ErrorReportValve.java:251) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:148) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:1025) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:445) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1137) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1775) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1734) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748)
而且压测过程中,看我们的 cpu :
# top top - 17:14:41 up 8 days, 5:15, 7 users, load average: 0.22, 0.05, 0.02 Tasks: 113 total, 2 running, 111 sleeping, 0 stopped, 0 zombie Cpu(s): 99.3%us, 0.4%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st Mem: 2054084k total, 1978048k used, 76036k free, 135632k buffers Swap: 0k total, 0k used, 0k free, 724640k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8591 root 20 0 2551m 640m 14m S 99.0 31.9 3:50.78 java
并且 full gc 次数夸张:
# jstat -gcutil 8591 1000
S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2485 269.686 270.630
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2495 271.024 271.967
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2503 272.043 272.986
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2513 273.024 273.967
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2523 274.110 275.053
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2533 275.096 276.040
100.00 0.00 100.00 100.00 96.53 91.74 71 0.944 2541 276.077 277.020
那么,我们有两种方式去分析堆内存溢出
- jmap -histo:live pid
- jmap -dump:live,format=b,file=heap.bin pid
先看相对简单的第一种:
num #instances #bytes class name ---------------------------------------------- 1: 56192 473581544 [I 2: 76980 12271064 [C 3: 1657 3424216 [B 4: 74385 1785240 java.lang.String 5: 12044 1127904 [Ljava.util.concurrent.ConcurrentHashMap$Node; 6: 11899 1047112 org.apache.catalina.session.StandardSession 7: 31696 1014272 java.util.concurrent.ConcurrentHashMap$Node 8: 29348 939136 java.util.HashMap$Node 9: 5899 849456 java.awt.image.DirectColorModel 10: 12146 837528 [Ljava.util.Hashtable$Entry; 11: 7973 819408 [Ljava.util.HashMap$Node; 12: 12191 780224 java.util.concurrent.ConcurrentHashMap 13: 5898 613392 sun.awt.image.IntegerInterleavedRaster 14: 5153 586696 java.lang.Class 15: 12083 579984 java.util.Hashtable 16: 8503 527832 [Ljava.lang.Object; 17: 5409 475992 java.lang.reflect.Method 18: 5898 424656 sun.awt.image.BufImgSurfaceData 19: 8740 419520 java.util.HashMap 20: 16394 393456 java.util.ArrayList
这里,我们没有发现开发写的方法,类似 org 的这种,有也是 tomcat 自带的一些。最多的是 int 类型和 char等
所以我们用方法2,dump 下来后用 mat 分析,文件较大,dump 时间较长,如图,可以发现是 ImageUtil 这个类导致的,创建的对象全部写入了一个 ArrayList List ,调用之后没有置为 NULL,所以把堆内存占满了,要修复的话,把 ArrayList 置为 NULL 就行
栈溢出
点击我的当当:
结果:状态码为 500 ,提示如下,为栈溢出,同时显示了类的路径以及名字
java.lang.StackOverflowError
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
org.tarena.action.order.OrderInfoAction.recursion(OrderInfoAction.java:29)
数据库连接池不释放
过程:压测一个商品的详情页请求,看看报错如何
结果:
没有响应
分析过程:
按照我们刚刚的过程
一、请求没法送到服务器
- 负载机 cpu、内存、网络、磁盘
二、请求发送到服务器但是无法被处理
- 应用服务器/数据库服务器 cpu、内存、网络、磁盘
- 容器连接池
- 数据库连接池
- 代码逻辑
- sql 语句慢
- JVM 的堆内存(jmap -histo:live pid ,jmap -dump:live,format=b,file=heap.bin pid)
- gc 频繁(jstat -gcutil pid 1000)
- 线程栈问题(jstack pid)
- 磁盘 io(nmon/sar -d/iostat -x)
先看 8080 能不能访问:可以访问
这说明 tomcat 连接池没问题,网路内存也没问题
接下来看 cpu ,top:没问题
]# top top - 18:54:44 up 8 days, 6:55, 3 users, load average: 0.00, 0.00, 0.00 Tasks: 101 total, 1 running, 100 sleeping, 0 stopped, 0 zombie Cpu(s): 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 2054084k total, 1817816k used, 236268k free, 186872k buffers Swap: 0k total, 0k used, 0k free, 833836k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1178 root 10 -10 123m 9.9m 5456 S 0.3 0.5 47:35.04 AliYunDun
1884 root 20 0 221m 9.9m 5064 S 0.3 0.5 2:29.35 docker
9136 root 20 0 2587m 316m 16m S 0.3 15.8 0:41.86 java
线程栈以及 jvm 都没问题,问题在哪?
我们看下数据库连接池,发现连接池很多被 e6mall 占用,但是 sleep 了,我们可以看到有 30 个连接池被占用了(ip和db都要符合规则)。
之前我们有提到过,数据库连接池除了在数据库内部设置最大连接数,在程序内也有,我们的数据库最大连接数查询命令为:show variables like '%max_connections%';并且可以查到,是 151 个。接下来我们去程序中查看数据库的属性连接
我们的最大连接数一般是在配置文件内,去 e6mall 内查看 jdbc.properties:(xxx为我的 ip)
jdbc.driverClassName=com.mysql.jdbc.Driver jdbc.url=jdbc\:mysql\://xx.xxx.xxx.xxx\:3306/e6mall?useUnicode\=true&characterEncoding\=utf-8&autoReconnect\=true&zeroDateTimeBehavior\=round jdbc.username=root jdbc.password=123456
可以看到只有基础的属性配置,但是没有详细的连接配置,我们的 dangdang 是有的,那么这里在哪配的呢?
e6mall/WEB-INF/classes/applicationContext.xml
<?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:p="http://www.springframework.org/schema/p" xmlns:context="http://www.springframework.org/schema/context" xmlns:aop="http://www.springframework.org/schema/aop" xmlns:tx="http://www.springframework.org/schema/tx" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-3.0.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.0.xsd"><span style="color: #0000ff;"><</span><span style="color: #800000;">bean </span><span style="color: #ff0000;">id</span><span style="color: #0000ff;">="propertyConfigurer"</span><span style="color: #ff0000;"> class</span><span style="color: #0000ff;">="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="locations"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">list</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>classpath:jdbc.properties<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">list</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">bean</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">bean </span><span style="color: #ff0000;">id</span><span style="color: #0000ff;">="velocityEngine"</span><span style="color: #ff0000;"> class</span><span style="color: #0000ff;">="org.springframework.ui.velocity.VelocityEngineFactoryBean"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="velocityProperties"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">props</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">prop </span><span style="color: #ff0000;">key</span><span style="color: #0000ff;">="resource.loader"</span><span style="color: #0000ff;">></span>class<span style="color: #0000ff;"></</span><span style="color: #800000;">prop</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">prop </span><span style="color: #ff0000;">key</span><span style="color: #0000ff;">="class.resource.loader.class"</span><span style="color: #0000ff;">></span><span style="color: #000000;"> org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader </span><span style="color: #0000ff;"></</span><span style="color: #800000;">prop</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">prop </span><span style="color: #ff0000;">key</span><span style="color: #0000ff;">="velocimacro.library"</span><span style="color: #0000ff;">></</span><span style="color: #800000;">prop</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">props</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">bean</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">bean </span><span style="color: #ff0000;">id</span><span style="color: #0000ff;">="dataSource"</span><span style="color: #ff0000;"> class</span><span style="color: #0000ff;">="com.mchange.v2.c3p0.ComboPooledDataSource"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="driverClass"</span><span style="color: #ff0000;"> value</span><span style="color: #0000ff;">="${jdbc.driverClassName}"</span> <span style="color: #0000ff;">/></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="jdbcUrl"</span><span style="color: #ff0000;"> value</span><span style="color: #0000ff;">="${jdbc.url}"</span> <span style="color: #0000ff;">/></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="user"</span><span style="color: #ff0000;"> value</span><span style="color: #0000ff;">="${jdbc.username}"</span> <span style="color: #0000ff;">/></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="password"</span><span style="color: #ff0000;"> value</span><span style="color: #0000ff;">="${jdbc.password}"</span> <span style="color: #0000ff;">/></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">连接池中保留的最小连接数。 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="minPoolSize"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>5<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">连接池中保留的最大连接数。Default: 15 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="maxPoolSize"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span><span style="background-color: #ff0000;">30</span><span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">初始化时获取的连接数,取值应在minPoolSize与maxPoolSize之间。Default: 3 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="initialPoolSize"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>10<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">最大空闲时间,60秒内未使用则连接被丢弃。若为0则永不丢弃。Default: 0 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="maxIdleTime"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>60<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">当连接池中的连接耗尽的时候c3p0一次同时获取的连接数。Default: 3 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="acquireIncrement"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>5<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;"> JDBC的标准参数,用以控制数据源内加载的PreparedStatements数量。但由于预缓存的statements 属于单个connection而不是整个连接池。所以设置这个参数需要考虑到多方面的因素。 如果maxStatements与maxStatementsPerConnection均为0,则缓存被关闭。Default: 0 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="maxStatements"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>0<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">每60秒检查所有连接池中的空闲连接。Default: 0 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="idleConnectionTestPeriod"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>60<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;">定义在从数据库获取新连接失败后重复尝试的次数。Default: 30 </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="acquireRetryAttempts"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>30<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;"> 获取连接失败将会引起所有等待连接池来获取连接的线程抛出异常。但是数据源仍有效 保留,并在下次调用getConnection()的时候继续尝试获取连接。如果设为true,那么在尝试 获取连接失败后该数据源将申明已断开并永久关闭。Default: false </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="breakAfterAcquireFailure"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>false<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #008000;"><!--</span><span style="color: #008000;"> 因性能消耗大请只在需要的时候使用它。如果设为true那么在每个connection提交的 时候都将校验其有效性。建议使用idleConnectionTestPeriod或automaticTestTable 等方法来提升连接测试的性能。Default: false </span><span style="color: #008000;">--></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">property </span><span style="color: #ff0000;">name</span><span style="color: #0000ff;">="testConnectionOnCheckout"</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"><</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span>false<span style="color: #0000ff;"></</span><span style="color: #800000;">value</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">property</span><span style="color: #0000ff;">></span> <span style="color: #0000ff;"></</span><span style="color: #800000;">bean</span><span style="color: #0000ff;">></span></pre>
说明最大连接数我配了 30 ,就连上了 30 ,初步判断是数据库连接池不释放,那么怎么验证?我们把 30 改成 120 再压测,如果数据库连接池又被占满了,则可以判断是连接池不释放导致的。
果然,120个又被占满,说明是数据库连接池不释放导致的问题
总结
根据问题现象,去定位问题。具体方式是根据我们的架构拓扑图去排查,有些是经验推断,以及日志内的分析去定位大致范围。还有比较重要的能定位问题的:日志内的埋点的时间信息。在日志内打印接口的调用时间。如果响应时间长,在日志内打出来接口的调用时间,再减去数据库内发生的时间,就是接受请求之前所花费的时间。