Notes on a "No buffer space available" problem
Preface:
The interface service on our server had always worked. Then one day it stopped responding. The log showed this error:
Caused by: java.net.SocketException: No buffer space available (maximum connections reached?): connect
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Unknown Source)
    at sun.nio.ch.Net.connect(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
    at java.nio.channels.SocketChannel.open(Unknown Source)
    at sun.nio.ch.PipeImpl$Initializer$LoopbackConnector.run(Unknown Source)
    ... 36 common frames omitted
2019-07-02 13:10:18 [main] INFO o.a.c.h.Http11NioProtocol - Pausing ProtocolHandler ["http-nio-8556"]
2019-07-02 13:10:18 [main] ERROR o.a.c.c.Connector - Protocol handler pause failed
java.lang.NullPointerException: null
    at org.apache.tomcat.util.net.AbstractEndpoint.unlockAccept(AbstractEndpoint.java:899)
    at org.apache.tomcat.util.net.AbstractEndpoint.pause(AbstractEndpoint.java:1185)
    at org.apache.coyote.AbstractProtocol.pause(AbstractProtocol.java:612)
    at org.apache.catalina.connector.Connector.pause(Connector.java:944)
    at org.apache.catalina.core.StandardService.stopInternal(StandardService.java:467)
    at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
    at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:814)
    at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:226)
    at org.apache.catalina.startup.Tomcat.stop(Tomcat.java:377)
    at org.springframework.boot.web.embedded.tomcat.TomcatWebServer.stopTomcat(TomcatWebServer.java:247)
    at org.springframework.boot.web.embedded.tomcat.TomcatWebServer.stopSilently(TomcatWebServer.java:235)
    at org.springframework.boot.web.embedded.tomcat.TomcatWebServer.start(TomcatWebServer.java:210)
    at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.startWebServer(ServletWebServerApplicationContext.java:300)
    at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.finishRefresh(ServletWebServerApplicationContext.java:162)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:553)
    at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:140)
    at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:762)
    at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:398)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:330)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1258)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1246)
    at com.winning.platwebservice.DqmsServiceApplication.main(DqmsServiceApplication.java:10)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
    at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
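Troubleshooting steps:
1. "No buffer space available" literally means that buffer memory has run out, so I started by checking the machine's resources. This is a Windows server: the disk still had 50 GB free, physical memory had 10 GB free, and virtual memory had 5 GB free. None of these looked like the cause. Ruled out.
2. Looking at connections in the TIME_WAIT state, I found that PID 8561 had a large number of them. Further checking showed that it was zabbix_client. I asked the colleague who deployed zabbix_client whether it could be stopped for a while. After it was stopped, Tomcat still failed to start. Ruled out.
3. Changed the port. No effect.
4. Adjusted the Xms (JVM initial heap size) parameter of the deployment, both lower and higher. No effect.
5. Searched Baidu. The results said that this problem requires editing the registry and rebooting. Because the server hosts many other people's applications, we did not pursue this option.
6. Restarting Tomcat produced another error: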
Caused by: java.net.BindException: Address already in use: connect
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Unknown Source)
    at sun.nio.ch.Net.connect(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
    at java.nio.channels.SocketChannel.open(Unknown Source)
    at sun.nio.ch.PipeImpl$Initializer$LoopbackConnector.run(Unknown Source)
    ... 37 common frames omitted
2019-07-02 13:10:46 [main] INFO o.a.c.h.Http11NioProtocol - Pausing ProtocolHandler ["http-nio-8556"]
2019-07-02 13:10:46 [main] INFO o.a.c.c.StandardService - Stopping service [Tomcat]
2019-07-02 13:10:47 [main] INFO o.a.c.u.LifecycleBase - The stop() method was called on component [StandardServer[-1]] after stop() had already been called. The second call will be ignored.
2019-07-02 13:10:47 [main] INFO o.a.c.h.Http11NioProtocol - Stopping ProtocolHandler ["http-nio-8556"]
2019-07-02 13:10:47 [main] INFO o.a.c.h.Http11NioProtocol - Destroying ProtocolHandler ["http-nio-8556"]
2019-07-02 13:10:47 [main] INFO o.s.b.a.l.ConditionEvaluationReportLoggingListener - Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2019-07-02 13:10:47 [main] ERROR o.s.b.d.LoggingFailureAnalysisReporter -
***************************
APPLICATION FAILED TO START
***************************
Description:
The Tomcat connector configured to listen on port 8556 failed to start. The port may already be in use or the connector may be misconfigured.
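My guess was that the port was already in use, but a lookup showed that it was not, and changing the port had not helped either. One thing stood out: whenever we looked up our own port or any other port, a port with a leading 5 was always in use (for example, our system uses port 8556 and we found port 58556 in use; looking up port 8561 likewise turned up port 58561 in use). We later found that a process with PID 12122 was holding almost all of the five-digit ports starting with 5. It was a java.exe. After we killed it and restarted Tomcat, everything worked.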
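The search for the process holding the "5xxxx" ports can be sketched in code. The following is a minimal sketch, assuming the input is the text output of `netstat -ano` on Windows, where the local address is the second column and the PID is the last one. The class name `PortRangeOwners`, its method names, and the range 50000 to 59999 are hypothetical choices for illustration, not something taken from the original investigation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: finds which PIDs hold local ports inside a given range.
class PortRangeOwners {

    /** Extracts the port from "ip:port" or "[ipv6]:port"; returns -1 if there is none. */
    static int localPort(String address) {
        int idx = address.lastIndexOf(':');
        if (idx < 0) {
            return -1;
        }
        try {
            return Integer.parseInt(address.substring(idx + 1));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Maps each PID to the number of distinct local ports it holds in [low, high]. */
    static Map<Integer, Integer> portsHeldByPid(List<String> lines, int low, int high) {
        Map<Integer, Set<Integer>> ports = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Proto, Local Address, Foreign Address, [State], PID
            if (cols.length < 4) {
                continue;
            }
            if (!cols[0].equalsIgnoreCase("TCP") && !cols[0].equalsIgnoreCase("UDP")) {
                continue;
            }
            int port = localPort(cols[1]);
            if (port < low || port > high) {
                continue;
            }
            try {
                int pid = Integer.parseInt(cols[cols.length - 1]);
                ports.computeIfAbsent(pid, k -> new HashSet<>()).add(port);
            } catch (NumberFormatException e) {
                // Last column is not a PID (header or malformed row): skip it.
            }
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        ports.forEach((pid, set) -> counts.put(pid, set.size()));
        return counts;
    }
}
```

Calling `PortRangeOwners.portsHeldByPid(lines, 50000, 59999)` on the captured netstat lines gives a per-PID count, and a PID with a count far above the others is the candidate to inspect. On Windows, `tasklist /FI "PID eq 12122"` should then show which executable that PID belongs to.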
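Summary:
This is a test server that hosts many projects, so it handles many connections and HTTP requests at the same time, and these reached the Windows system limit. The fix is either to edit the registry and reboot, or to stop the application that consumes the most resources.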
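To decide which application to stop, it helps to rank processes by how many connections they hold. The following is a minimal sketch, assuming the data comes from `netstat -ano` and that TCP rows have five columns (Proto, Local Address, Foreign Address, State, PID). The class name `ConnectionTally` and its methods are hypothetical names for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: tallies TCP connections per PID from netstat output.
class ConnectionTally {

    /** Counts TCP rows per PID; a null state counts rows in every state. */
    static Map<Integer, Integer> countByPid(List<String> lines, String state) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed TCP row layout: Proto, Local Address, Foreign Address, State, PID
            if (cols.length != 5 || !cols[0].equalsIgnoreCase("TCP")) {
                continue;
            }
            if (state != null && !cols[3].equalsIgnoreCase(state)) {
                continue;
            }
            try {
                counts.merge(Integer.parseInt(cols[4]), 1, Integer::sum);
            } catch (NumberFormatException e) {
                // Header or malformed row: skip it.
            }
        }
        return counts;
    }

    /** Returns the PID with the highest count, or -1 if the map is empty. */
    static int busiestPid(Map<Integer, Integer> counts) {
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(-1);
    }

    public static void main(String[] args) throws IOException {
        // Runs netstat and prints the tallies; requires netstat on the PATH.
        Process p = new ProcessBuilder("netstat", "-ano").redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            List<String> lines = r.lines().collect(Collectors.toList());
            Map<Integer, Integer> all = countByPid(lines, null);
            System.out.println("Connections per PID: " + all);
            System.out.println("TIME_WAIT per PID:   " + countByPid(lines, "TIME_WAIT"));
            System.out.println("Busiest PID:         " + busiestPid(all));
        }
    }
}
```

Passing `"TIME_WAIT"` as the state reproduces the kind of check done in step 2, while passing `null` gives the overall ranking. The parsing is kept separate from the process launch so that it can also be run against a saved netstat dump.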