Java program backend error: java.net.SocketException: Too many open files

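Problem description:

  Today a colleague reported that a program was misbehaving and asked me to check the backend log. The log contained the following error: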

java.net.SocketException: Too many open files
        at java.net.Socket.createImpl(Socket.java:460)
        at java.net.Socket.connect(Socket.java:587)
        at org.apache.commons.net.SocketClient.connect(SocketClient.java:163)
        at org.apache.commons.net.SocketClient.connect(SocketClient.java:184)
        at com.asiainfo.goods.wo.store.scheduler.util.FtpUtil.downloadFileByFileName(FtpUtil.java:270)
        at com.asiainfo.goods.wo.store.scheduler.job.TargetUserJob.dealWithFtpByRequestId(TargetUserJob.java:186)
        at com.asiainfo.goods.wo.store.scheduler.job.TargetUserJob.execute(TargetUserJob.java:80)
        at com.asiainfo.goods.presale.scheduler.job.QuartzJobFactory.execute(QuartzJobFactory.java:68)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
2018-04-08 18:45:00 [com.asiainfo.goods.wo.store.scheduler.job.TargetUserJob]-[ERROR]:216 - file doesnot exist===TARGETCUSTU00011201804081800575321.zip

 

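Problem analysis:

  The error message shows that the process has opened too many files. On Linux a socket also occupies a file descriptor, so once the process reaches its descriptor limit, new socket connections fail as well (here in Socket.createImpl).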

Resolution steps:

1. Check the open-file limit configured for the current system user

[aiprd@host-10-191-5-227 log]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256705
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 20000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Note: under the current user, each process can open at most 65536 file descriptors.

2. Check how many files the application process currently has open

[aiprd@host-10-191-5-227 log]$ lsof -p 2526 | wc -l
65785

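Note: lsof lists 65785 entries for the application, which is above the 65536 limit, so the process cannot open any new files. (The lsof output also contains a header line and entries such as the working directory and memory-mapped files that do not consume descriptors, which is why the count can be higher than the limit itself.)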
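The descriptor count can also be read from inside the JVM, which avoids the extra entries that lsof reports. The following is a minimal sketch, assuming a HotSpot/OpenJDK runtime on a Unix-like system where the platform bean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdUsage and the method snapshot are hypothetical names chosen for illustration; they are not part of the application in this post.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper (not from the original application).
class FdUsage {
    /**
     * Returns {openDescriptors, maxDescriptors} for the current JVM process,
     * or null when the platform bean is not the Unix variant.
     */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        if (s == null) {
            System.out.println("descriptor counts are not available on this JVM");
            return;
        }
        System.out.printf("open=%d max=%d (%.1f%% used)%n",
                s[0], s[1], 100.0 * s[0] / s[1]);
    }
}
```

Logging these two numbers periodically from a scheduled job would show a leak growing long before the limit is reached.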

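3. Running lsof against this single process shows a large number of deleted files

Note: many of these files no longer exist on disk, but their file descriptors are still open.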

[aiprd@host-10-191-5-227 log]$ lsof -p 2526 | grep deleted | wc -l
65274

 

Note: 65274 of the entries are deleted files, so deleted files account for almost all of the file descriptor usage.
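A typical way to end up with "(deleted)" entries is code that opens a stream on a file, removes the file, and never closes the stream. The source of FtpUtil and TargetUserJob is not shown in this post, so the following is only a sketch of the general pattern, not the actual fix. The class CloseBeforeDelete and its methods are hypothetical. It uses try-with-resources so the descriptor is released before the file is deleted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example (not the code from the original application).
class CloseBeforeDelete {
    /** Reads the whole file, closes the stream, then deletes the file. */
    static long countBytesAndDelete(Path file) {
        long total = 0;
        try {
            try (InputStream in = Files.newInputStream(file)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    total += n;
                }
            } // the stream is closed here, even if read() throws
            Files.delete(file); // delete only after the descriptor is released
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Creates a temporary file of the given size for the demonstration. */
    static Path newTempFile(int size) {
        try {
            Path p = Files.createTempFile("fd-demo", ".bin");
            Files.write(p, new byte[size]);
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = newTempFile(5);
        System.out.println(countBytesAndDelete(p) + " bytes read, exists=" + Files.exists(p));
    }
}
```

The same try-with-resources pattern applies to sockets and other Closeable resources. Without it, each processed file can leave one descriptor behind until the process exits.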

4. Stop the application process to release the open files

[aiprd@host-10-191-5-227 log]$ kill -9 2526
[aiprd@host-10-191-5-227 log]$ lsof -p 2526 | wc -l
0

 

5. Restart the application and check its open files

[aiprd@host-10-191-5-227 log]$ ps -ef | grep scheduler_hdfs | grep -v grep | awk '{print $2}'
29639
[aiprd@host-10-191-5-227 log]$ lsof -p 29639 | wc -l
485
[aiprd@host-10-191-5-227 log]$ lsof -p 29639 | grep deleted | wc -l
0

 

Note: after the restart, all of the previously opened files were released, and the backend program processes its jobs correctly again.

 

Document created: 2018-04-08 21:25:33

posted @ 2018-04-08 21:27  Zhai_David