一个大对象引起的血案,GC的踩坑实录
/** * 模拟当系统中使用大对象时,对JVM造成的影响 * * @author 包子(何锦彬). 2017.01.07 * @QQ 277803242 */ @WebServlet("/Test") public class Test extends HttpServlet { private static final long serialVersionUID = 1L; private Logger logger = Logger.getLogger(Test.class.getName()); protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { long startTime = System.currentTimeMillis(); // deal try { // 模拟业务花费时间 Thread.sleep(500); } catch (InterruptedException e) { } long costTime = (System.currentTimeMillis() - startTime); // 接口1秒timeout,打印出日志 if (costTime > 1000) { logger.warning("cost time:" + costTime); } Writer out = response.getWriter(); out.append("ok"); } }
JAVA_OPTS="-Xmx1200m -Xms1200m -Xmn200m -Xss228k -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+PrintGCDetails"
/** * 模拟定时对账, 模拟来了个大对象对账 * * @author 包子(何锦彬). 2017.01.07 * @QQ 277803242 */ @WebServlet(name = "init", loadOnStartup = 1, description = "init", urlPatterns = "/task") public class BigObject extends HttpServlet { class ReconciliationTask extends TimerTask { @Override public void run() { logger.warning("a big object have commit,exp: reconciliation"); byte[] bigObject = new byte[1024 * 1024 * 500];// 年老代 500M,刚刚好达到年老代的50%,不断触发GC try { Thread.sleep(60 * 1000); } catch (InterruptedException e) { // TODO Auto-generated catch block e.printStackTrace(); } } } private static final long serialVersionUID = 1L; private Logger logger = Logger.getLogger(Test.class.getName()); @Override public void init() throws ServletException { Timer timer = new Timer(); timer.schedule(new ReconciliationTask(), 60 * 1000);// 1分钟后模拟开始对账任务 logger.log(Level.INFO, "task have registered"); super.init(); } }
GC [1 CMS-initial-mark: 1048576K(1740800K)] 1051853K(1832960K), 0.0042630 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] [CMS-concurrent-mark: 0.023/0.023 secs] [Times: user=0.02 sys=0.00, real=0.02 secs] [CMS-concurrent-preclean: 0.007/0.007 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] CMS: abort preclean due to time [CMS-concurrent-abortable-preclean: 0.222/5.020 secs] [Times: user=0.20 sys=0.03, real=5.02 secs] [GC[YG occupancy: 3277 K (92160 K)][Rescan (parallel) , 0.0067540 secs][weak refs processing, 0.0000400 secs][scrub string table, 0.0004160 secs] [1 CMS-remark: 1048576K(1740800K)] 1051853K(1832960K), 0.0076820 secs] [Times: user=0.02 sys=0.01, real=0.01 secs] [CMS-concurrent-sweep: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] [CMS-concurrent-reset: 0.014/0.014 secs] [Times: user=0.00 sys=0.01, real=0.01 secs]
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
20480.0 20480.0 0.0 17257.9 163840.0 93241.7 1024000.0 512000.0 29356.0 17612.0 1 0.032 5 0.159 0.191 20480.0 20480.0 0.0 17257.9 163840.0 93241.8 1024000.0 512000.0 29356.0 17612.0 1 0.032 5 0.159 0.191 20480.0 20480.0 0.0 17257.9 163840.0 93241.9 1024000.0 512000.0 29356.0 17612.0 1 0.032 5 0.159 0.191 20480.0 20480.0 0.0 17257.9 163840.0 93242.0 1024000.0 512000.0 29356.0 17612.0 1 0.032 5 0.159 0.191 20480.0 20480.0 0.0 17257.9 163840.0 93242.0 1024000.0 512000.0 29356.0 17612.0 1 0.032 5 0.159 0.191 20480.0 20480.0 0.0 17257.9 163840.0 93242.1 1024000.0 512000.0 29356.0 17612.0 1 0.032 6 0.183 0.214
13-Jan-2017 01:26:43.692 WARNING [http-nio-8080-exec-78] org.hjb.memory.Test.doGet cost time:1482 13-Jan-2017 01:26:43.693 WARNING [http-nio-8080-exec-28] org.hjb.memory.Test.doGet cost time:1434 13-Jan-2017 01:26:43.694 WARNING [http-nio-8080-exec-23] org.hjb.memory.Test.doGet cost time:1492 13-Jan-2017 01:26:43.699 WARNING [http-nio-8080-exec-85] org.hjb.memory.Test.doGet cost time:1514 13-Jan-2017 01:26:43.700 WARNING [http-nio-8080-exec-97] org.hjb.memory.Test.doGet cost time:1515 13-Jan-2017 01:26:43.701 WARNING [http-nio-8080-exec-83] org.hjb.memory.Test.doGet cost time:1517 13-Jan-2017 01:26:43.704 WARNING [http-nio-8080-exec-87] org.hjb.memory.Test.doGet cost time:1547 13-Jan-2017 01:26:43.705 WARNING [http-nio-8080-exec-41] org.hjb.memory.Test.doGet cost time:1548 13-Jan-2017 01:26:43.707 WARNING [http-nio-8080-exec-35] org.hjb.memory.Test.doGet cost time:1552 13-Jan-2017 01:26:43.707 WARNING [http-nio-8080-exec-26] org.hjb.memory.Test.doGet cost time:1557 13-Jan-2017 01:26:43.709 WARNING [http-nio-8080-exec-101] org.hjb.memory.Test.doGet cost time:1559 13-Jan-2017 01:26:43.690 WARNING [http-nio-8080-exec-42] org.hjb.memory.Test.doGet cost time:1482 13-Jan-2017 01:26:43.716 WARNING [http-nio-8080-exec-80] org.hjb.memory.Test.doGet cost time:1578 13-Jan-2017 01:26:43.720 WARNING [http-nio-8080-exec-3] org.hjb.memory.Test.doGet cost time:1583 13-Jan-2017 01:26:43.690 WARNING [http-nio-8080-exec-22] org.hjb.memory.Test.doGet cost time:1420 13-Jan-2017 01:26:43.690 WARNING [http-nio-8080-exec-59] org.hjb.memory.Test.doGet cost time:1295 13-Jan-2017 01:26:43.689 WARNING [http-nio-8080-exec-51] org.hjb.memory.Test.doGet cost time:1294 13-Jan-2017 01:26:43.727 WARNING [http-nio-8080-exec-79] org.hjb.memory.Test.doGet cost time:1559 13-Jan-2017 01:26:43.731 WARNING [http-nio-8080-exec-63] org.hjb.memory.Test.doGet cost time:1630 13-Jan-2017 01:26:43.687 WARNING [http-nio-8080-exec-77] org.hjb.memory.Test.doGet cost time:1429 13-Jan-2017 01:26:43.686 WARNING [http-nio-8080-exec-95] org.hjb.memory.Test.doGet cost time:1371 13-Jan-2017 01:26:43.737 WARNING [http-nio-8080-exec-8] org.hjb.memory.Test.doGet cost time:1595 13-Jan-2017 01:26:43.686 WARNING [http-nio-8080-exec-84] org.hjb.memory.Test.doGet cost time:1429 13-Jan-2017 01:26:43.686 WARNING [http-nio-8080-exec-92] org.hjb.memory.Test.doGet cost time:1295 13-Jan-2017 01:26:43.678 WARNING [http-nio-8080-exec-20] org.hjb.memory.Test.doGet cost time:1390
踩过GC的几个坑
1, 内存不够用,JVM用到SWAP
如果每次GC时间不合理时,如: FGC=10,FGCT=1, 基本可以肯定是用到了SWAP内存区了 (当然也和你内存大小有关, 指一般系统内存在8G附近)
2, 用到大对象,导致频繁FGC( FULLGC和 Major GC 对性能都有严重影响)
看很多资料上说是60%触发Major GC, 但经测试发现, 在OU到了50%时开始不断触发Major GC
4,YGC变动越来越久, 如用到了string.intern()方法. (比如在 fastjson 中有用到),
原因: string.intern() 往常量池加如数据。 而ROOT GC时, 需要扫描所有常量作为根节点, 当常量池越大量增加时,扫描的数据时间增加
5,GC里面出现有
Concurrent mode failed
原因:由于GC在CMS时,往OU区放入对象,然后不可用。
发现大量这种现象,
1,增加OLD区的内存压缩参数, UseCMSCompactAtFullCollection 或 CMSFullGCsBeforeCompaction
2, 建议调大年轻代内存调大,或增加OLD区的回收频率
Prommotion failed
1. 年轻代对象晋升失败, 当启用了担保分配,每次晋升会检查年老代的内存是否够大于年轻代平均晋升对象的大小,如果小于将会进行FULLGC,日志会显示Prommotion faile引起fullGC
2, JDK6后默认都开启了担保分配
再解释下GC的几个名词
有几个误区:
FULL GC , Major GC , Minor GC
Minor GC 就是年轻代清理,这个好理解
FULL GC 指整个永久代(或G1的Metaspace代)的和堆内存 ,堆外内存一起清理. 在GC日志里的FULL GC指的是这个
Major GC 年老代的清理, 在jstat -gc 里的FGC只是指这个,只是年老代的GC而已
详细:https://plumbr.eu/handbook/garbage-collection-in-java
误区:jstat 的FGC, 它统计的仅仅是STW的次数,即 两个init-mark和 remark的阶段
问题:
引起FULL GC的原因有哪些呢?
1、System.gc()方法的调用
2、老年代代空间不足
3、永生区空间不足
4、CMS GC时出现,如GC日志中找到promotion failed(晋升失败)和concurrent mode failure(回收时有对象要分配)
5、统计得到的Minor GC晋升到旧生代的平均大小大于老年代的剩余空间
6、空间碎片太多,堆中分配很大的对象(如果如此,建议每次FULLGC后开启压缩)
如何查看GC 发生的原因呢?
目前能想到的,最简单是 jstat -gccause
1,这种跑批的任务不应该和服务放一块后面,应该是单独的模块
2. 对于对账,我的解决是后面放mongoDB,然后拿出来进行对账,完全是个离线处理的过程了
3,感谢各位的疑问。其实不一定是对账,很多系统会把跑批和服务放一块,如果跑批涉及到把数据load到内存的话,一定要注意垃圾回收了。不然容易触发CMS的GC。 而且经过我测试,每次到了老年代的50%就会不断触发GC,如果跑批任务持续时间比较久,不释放对象,将不停的Major GC。
有个疑问
上面例子中, 我加上 +XX:CMSInitiatingOccupancyFraction=80 后,还是会不停的Major GC( 例子中只占50%),待解答
持续更新留言问题,解答疑问
欢迎关注我的公众号,专注重现各种线上的BUG
或搜 “包子的实验室”