2024-07-03 11:08:16.066 [DEBUG] [[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'] [70930e830d7c5249,70930e830d7c5249,false] SecurityContextHolder now cleared, as request processing completed
七月 03, 2024 11:08:07 上午 org.jboss.netty.channel.socket.nio.NioWorker
警告: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.HashMap.newKeyIterator(HashMap.java:968)
    at java.util.HashMap$KeySet.iterator(HashMap.java:1002)
    at java.util.HashSet.iterator(HashSet.java:170)
    at sun.nio.ch.Util$2.iterator(Util.java:303)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:274)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
   

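    OutOfMemoryError is a subclass of java.lang.VirtualMachineError and is thrown when the JVM runs into a resource problem. More specifically, "GC overhead limit exceeded" is thrown when the JVM spends too much time in garbage collection while reclaiming very little heap (with the HotSpot defaults, more than 98% of the time spent in GC with less than 2% of the heap recovered).
    The following code reproduces java.lang.OutOfMemoryError: GC overhead limit exceeded.
    It uses an infinite while loop to keep adding random numbers to a HashMap. Before running main, set the JVM options to -Xmx300m -XX:+UseParallelGC (a 300 MB heap with the ParallelGC collector); running main then fails with java.lang.OutOfMemoryError: GC overhead limit exceeded.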

    package com.galaxy.concurrency.jvm;

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Random;

    public class OutOfMemoryGCLimitExceed {

        public static void addRandomDataToMap() {
            Map<Integer, String> dataMap = new HashMap<>();
            Random r = new Random();
            while (true) {
                dataMap.put(r.nextInt(), String.valueOf(r.nextInt()));
            }
        }

        public static void main(String[] args) {
            addRandomDataToMap();
        }
    }
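
    To watch how much time a JVM is spending in GC before it reaches this error, the standard java.lang.management API can be polled. The following is a minimal sketch, not part of the original incident work; the class name GcOverheadProbe and its method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverheadProbe {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Accumulated GC time divided by JVM uptime, clamped to [0, 1]. */
    public static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms (%.1f%% of uptime)%n",
                totalGcMillis(), gcTimeFraction() * 100);
    }
}
```

    This ratio is a lifetime average, while the JVM's own check looks at recent collections, so treat it only as a rough early-warning indicator.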

    Solution:
        Find the underlying problem by inspecting code that may be leaking memory.
        Questions to ask:
            1. What are the objects in the application that occupy large portions of the heap?
            2. In which parts of the source code are these objects being allocated?
        Tools:
            Graphical tools such as JVisualVM and JConsole help detect performance problems in the code, including java.lang.OutOfMemoryError.
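
        To answer the first question, a heap dump can be loaded into a tool such as JVisualVM. The sketch below triggers a dump from inside the application through the HotSpot diagnostic MXBean. It assumes a HotSpot-based JVM; the class name HeapDumper and the default file name heap.hprof are illustrative only.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {

    // Object name under which HotSpot registers its diagnostic MXBean.
    private static final String HOTSPOT_BEAN = "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Writes a heap dump to the given path. The file must not exist yet,
     * and newer JDKs require the name to end in ".hprof".
     */
    public static File dump(String path, boolean liveObjectsOnly) {
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    HOTSPOT_BEAN,
                    HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(path, liveObjectsOnly);
            return new File(path);
        } catch (IOException e) {
            throw new IllegalStateException("Heap dump to " + path + " failed", e);
        }
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "heap.hprof";
        System.out.println("Heap dump written to " + dump(path, true).getAbsolutePath());
    }
}
```

        Passing true for liveObjectsOnly dumps only reachable objects, which keeps the file smaller and focuses the analysis on what is actually being retained.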
        
        Quick fix:
            Approach: change the JVM startup configuration to increase the heap size, or add the -XX:-UseGCOverheadLimit option to disable the GC overhead limit check.
                For example, these JVM options give the Java application a 1 GB heap: java -Xmx1024m com.xyz.TheClassName
                      These options give it a 1 GB heap and also add -XX:-UseGCOverheadLimit to disable the check: java -Xmx1024m -XX:-UseGCOverheadLimit com.xyz.TheClassName
            If the problem persists:
                Adding -XX:-UseGCOverheadLimit treats the symptom, not the cause; the JVM will eventually throw java.lang.OutOfMemoryError: Java heap space.
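
            After changing the startup options, it is worth confirming that the running JVM actually picked them up. A minimal sketch using only standard APIs follows; the class name JvmSettingsCheck and its helper methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmSettingsCheck {

    /** Maximum heap the JVM will try to use, in megabytes (roughly the -Xmx value). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** True if the exact flag string appears in the given JVM argument list. */
    public static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the main class.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Max heap (MB): " + maxHeapMb());
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("GC overhead limit disabled: "
                + hasFlag(jvmArgs, "-XX:-UseGCOverheadLimit"));
    }
}
```

            The reported maximum can be slightly below the -Xmx value, because some collectors exclude part of the heap from this figure.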
                
Resolving the production incident: process and summary
    1. Exception log

 2024-07-03 11:08:16.066 [DEBUG] [[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'] [70930e830d7c5249,70930e830d7c5249,false] SecurityContextHolder now cleared, as request processing completed
        七月 03, 2024 11:08:07 上午 org.jboss.netty.channel.socket.nio.NioWorker
        警告: Unexpected exception in the selector loop.
        java.lang.OutOfMemoryError: GC overhead limit exceeded
            at java.util.HashMap.newKeyIterator(HashMap.java:968)
            at java.util.HashMap$KeySet.iterator(HashMap.java:1002)
            at java.util.HashSet.iterator(HashSet.java:170)
            at sun.nio.ch.Util$2.iterator(Util.java:303)
            at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:274)
            at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)

        <Jul 3, 2024 11:08:23 AM CST> <Error> <JDBC> <BEA-001112> <Test "SELECT 1 FROM DUAL" set up for pool "sinosoftDataSource" failed with exception: "java.sql.SQLException: Protocol violation: [1]".>                 

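    2. Troubleshooting
        1. OOM Killer
            On Linux, an OOM is not necessarily caused by the Java service itself; other programs may have requested a lot of memory. When the total memory demanded by all applications exceeds physical memory and the Java service is a heavy memory user, the OOM killer selects it and kills it.
            This is the memory protection mechanism Linux uses to keep physical memory exhaustion from crashing the system.
        2. Deployment environment
            The service is deployed on its own under WebLogic, with JDK: export JAVA_HOME=/app/jdk1.6.0_24
            We therefore turned to the JVM memory configuration. The application gets little traffic and the production server has 15 GB of memory, so we first checked the JVM configuration with the command-line tools bundled with the JDK.
            
            jps showed that several projects run on this JDK; the port a project listens on can be used to identify its PID.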

            [weblogic@newcoreSIT01 bin]$ sudo ./jps
            23989 Server
            3672 Server
            26137 Jps
            29041 Server
            3403 Bootstrap
            29418 Server
            20363 Server
            2175 Server
            4293 Server
            27986 Bootstrap
            27422 Server
            
            [weblogic@newcoreSIT01 bin]$ lsof -i:8001
            COMMAND  PID     USER   FD   TYPE     DEVICE SIZE/OFF NODE NAME
            java    2175 weblogic  703u  IPv6 3943359544      0t0  TCP localhost:vcom-tunnel (LISTEN)
            java    2175 weblogic  704u  IPv6 3943359545      0t0  TCP [fe80::250:56ff:fea5:35bf]:vcom-tunnel (LISTEN)
            java    2175 weblogic  705u  IPv6 3943359546      0t0  TCP newcoreSIT01.sinosafe.local:vcom-tunnel (LISTEN)
            java    2175 weblogic  706u  IPv6 3943359547      0t0  TCP localhost:vcom-tunnel (LISTEN)

            Ways to view the JVM configuration:
            Method 1:

[weblogic@newcoreSIT01 bin]$ sudo ./jps -v | grep 2175

   2175 Server -Xms512m -Xmx1024m -XX:CompileThreshold=8000 -XX:PermSize=512m -XX:MaxPermSize=1024m -Dweblogic.Name=AdminServer -Djava.security.policy=/app/weblogic/Oracle/Middleware/wlserver_10.3/server/lib/weblogic.policy -Xverify:none -da -Dplatform.home=/app/weblogic/Oracle/Middleware/wlserver_10.3 -Dwls.home=/app/weblogic/Oracle/Middleware/wlserver_10.3/server -Dweblogic.home=/app/weblogic/Oracle/Middleware/wlserver_10.3/server -Dweblogic.management.discover=true -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=/app/weblogic/Oracle/Middleware/patch_wls1036/profiles/default/sysext_manifest_classpath:/app/weblogic/Oracle/Middleware/patch_ocp371/profiles/default/sysext_manifest_classpath -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,address=10899,server=y,suspend=n -Djava.compiler=NONE

            Method 2:
                # View the JVM flags; pid is the process ID of the target JVM
                sudo jinfo -flags pid
                
                [weblogic@newcoreSIT01 bin]$ sudo ./jinfo -flags 2175
                Attaching to process ID 2175, please wait...
                Exception in thread "main" java.lang.reflect.InvocationTargetException
                        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                        at java.lang.reflect.Method.invoke(Method.java:597)
                        at sun.tools.jinfo.JInfo.runTool(JInfo.java:79)
                        at sun.tools.jinfo.JInfo.main(JInfo.java:53)
                Caused by: java.lang.RuntimeException: Type "nmethodBucket*", referenced in VMStructs::localHotSpotVMStructs in the remote VM, was not present in the remote VMStructs::localHotSpotVMTypes table (should have been caught in the debug build of that VM). Can not continue.
                        at sun.jvm.hotspot.HotSpotTypeDataBase.lookupOrFail(HotSpotTypeDataBase.java:361)
                        at sun.jvm.hotspot.HotSpotTypeDataBase.readVMStructs(HotSpotTypeDataBase.java:252)
                        at sun.jvm.hotspot.HotSpotTypeDataBase.<init>(HotSpotTypeDataBase.java:87)
                        at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:568)
                        at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
                        at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:332)
                        at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
                        at sun.jvm.hotspot.tools.JInfo.main(JInfo.java:128)         

            Method 3:
                Check the JVM settings in the WebLogic startup file (setDomainEnv.sh)

                if [ "${JAVA_VENDOR}" = "Sun" ] ; then
                        WLS_MEM_ARGS_64BIT="-Xms512m -Xmx1024m"
                        export WLS_MEM_ARGS_64BIT
                        WLS_MEM_ARGS_32BIT="-Xms512m -Xmx1024m"
                        export WLS_MEM_ARGS_32BIT
                else
                        WLS_MEM_ARGS_64BIT="-Xms512m -Xmx1024m"
                        export WLS_MEM_ARGS_64BIT
                        WLS_MEM_ARGS_32BIT="-Xms512m -Xmx1024m"
                        export WLS_MEM_ARGS_32BIT
                fi      

        3. Memory usage

            [weblogic@newcoreSIT01 bin]$ free -m
                         total       used       free     shared    buffers     cached
            Mem:         15947      15348        599          0        225       3010
            -/+ buffers/cache:      12111       3835
            Swap:        10239       1488       8751

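        4. Configuring JConsole for WebLogic (SunOS/Solaris + WebLogic)
            1. JConsole monitors through JMX, which requires startup parameters to be set when the application launches. Since the server is WebLogic, they go in ${DOMAIN_HOME}/bin/setDomainEnv.sh.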

            JAVA_OPTIONS="${JAVA_OPTIONS} -Dcom.sun.management.jmxremote.port=9000"
            JAVA_OPTIONS="${JAVA_OPTIONS} -Dcom.sun.management.jmxremote.authenticate=false"
            JAVA_OPTIONS="${JAVA_OPTIONS} -Dcom.sun.management.jmxremote.ssl=false"
            NBZ SIT : -Dcom.sun.management.jmxremote.port=9991

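            JConsole connection settings
            IP address:port
            Username and password (can be left blank here, since authentication is disabled above)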
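
            The same JMX port that JConsole connects to can also be queried from code, for example from a small monitoring job. The following is a sketch only: it assumes the unauthenticated, non-SSL settings shown above, and the class name RemoteHeapCheck and its defaults (localhost, port 9000) are illustrative.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RemoteHeapCheck {

    /** Builds the standard RMI-based JMX service URL for a host and port. */
    public static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Connects to the remote JVM and returns its current heap usage. */
    public static MemoryUsage readHeap(String host, int port) throws IOException {
        JMXConnector connector =
                JMXConnectorFactory.connect(new JMXServiceURL(serviceUrl(host, port)));
        try {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            return memory.getHeapMemoryUsage();
        } finally {
            connector.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9000;
        MemoryUsage heap = readHeap(host, port);
        System.out.printf("Heap used: %d MB of max %d MB%n",
                heap.getUsed() / (1024 * 1024), heap.getMax() / (1024 * 1024));
    }
}
```

            Running it with the server's IP address and the configured jmxremote port prints the same heap figures JConsole shows on its Memory tab.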
        
    Root cause in the code

            // Special handling for business type 04 (liability insurance)
            if ("04".equals(dto.getRiskCode().substring(0, 2))) {
                GuXXXXXDto guXXXXXDto = new GuXXXXXDto();
                guXXXXXDto.setProposalNo(proposalNo);
                List<GuXXXXXDto> XXXXXList = guXXXXXDao.find(guXXXXXDto, null);
                
                GuXXXXicListDto guPXXXListDto = new GuXXXDto();
                // The line marked "+" was added as the fix: without the proposalNo filter, the find() below ran with an empty condition
                +  guXXXcListDao.setProposalNo(proposalNo);
                List<GuXXXXicListDto> proposalDynamiList = guXXXcListDao.find(guPrXXXXynamicListDto, null);  // full-table query caused the OOM
                ServiceManager.prpall.getXXXXindService().proceXXXlProBusinessType04(XXXXXList,propoXXXnamiList,dto);
            }
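
    One way to keep this kind of bug from recurring is to refuse to run a query when its key filter is missing, so a forgotten setter fails fast instead of loading the whole table. The sketch below is not the project's actual DAO code; FilterGuard, Finder and findByProposalNo are hypothetical names standing in for the redacted ones.

```java
import java.util.Arrays;
import java.util.List;

public class FilterGuard {

    /** Hypothetical stand-in for a DAO lookup keyed by proposal number. */
    public interface Finder<T> {
        List<T> find(String proposalNo);
    }

    /** True only if the key is non-null and contains a non-blank value. */
    public static boolean hasUsableKey(String key) {
        return key != null && !key.trim().isEmpty();
    }

    /** Runs the lookup only when the filter is present; otherwise fails fast. */
    public static <T> List<T> findByProposalNo(String proposalNo, Finder<T> finder) {
        if (!hasUsableKey(proposalNo)) {
            throw new IllegalArgumentException(
                    "proposalNo is required; refusing to run an unfiltered query");
        }
        return finder.find(proposalNo);
    }

    public static void main(String[] args) {
        // Fake finder used only for the demonstration.
        Finder<String> fake = new Finder<String>() {
            @Override
            public List<String> find(String proposalNo) {
                return Arrays.asList("row for " + proposalNo);
            }
        };
        System.out.println(findByProposalNo("P0001", fake));
        try {
            findByProposalNo(null, fake);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

    A guard like this costs one check per call, and it turns a silent full-table scan into an immediate, easy-to-trace exception.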

 

posted on 2024-10-09 14:58  Old-Kang