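JVM garbage collection consists of two main parts. The first is determining whether an object is dead. (Heap collection accounts for most of this work; reclamation of the method area (metaspace) is not explained in detail in the latest official documentation, so it is out of scope here.) The second is reclaiming the memory so that it can be used for later allocations.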
I. Determining whether an object is dead
The VM implementation in JDK 8 is the HotSpot VM, which determines object liveness with the reachability analysis algorithm.
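a. Reference counting algorithm
Each object is given a reference counter. The counter is incremented whenever a new reference to the object is created and decremented whenever a reference becomes invalid; an object whose counter is 0 is no longer in use. What this scheme cannot handle in a VM is circular references, which is also the main reason HotSpot does not use it: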
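public class ReferenceCountingGC {
    public Object instance = null;
    private static final int _1MB = 1024 * 1024;
    // This field exists only to occupy heap memory, so that it is easy
    // to observe later whether the JVM reclaimed it.
    private byte[] size = new byte[_1MB];

    public static void main(String[] args) {
        ReferenceCountingGC obj1 = new ReferenceCountingGC();
        ReferenceCountingGC obj2 = new ReferenceCountingGC();
        // These two lines create the circular reference.
        obj1.instance = obj2;
        obj2.instance = obj1;
        // After these two lines the objects can no longer be accessed, yet their
        // counters would not be 0. Under reference counting, the heap memory
        // would never be reclaimed.
        obj1 = null;
        obj2 = null;
        // Garbage collection is requested on this line. It covers the young
        // generation, the rest of the heap, and metaspace.
        System.gc();
    }
}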
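To make the counter arithmetic in the example above concrete, here is a minimal sketch of a toy reference-counting model. The `Counted` class and the `setField` helper are hypothetical names invented for this illustration; HotSpot keeps no such counters. The sketch only shows that both counters remain at 1 after the local variables are cleared, which is why a pure reference-counting collector would never free the pair.

```java
// Toy model only: HotSpot does not keep per-object reference counts.
class RefCountSketch {
    /** A hypothetical object that tracks how many references point at it. */
    static final class Counted {
        int refCount = 0;
        Counted field; // plays the role of the `instance` field
    }

    /** Models `holder.field = target`, updating the counters on both sides. */
    static void setField(Counted holder, Counted target) {
        if (holder.field != null) {
            holder.field.refCount--;
        }
        holder.field = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Replays the circular-reference scenario and returns the final counters. */
    static int[] runCycleScenario() {
        Counted a = new Counted();
        Counted b = new Counted();
        a.refCount++;   // local variable obj1 refers to a
        b.refCount++;   // local variable obj2 refers to b
        setField(a, b); // obj1.instance = obj2
        setField(b, a); // obj2.instance = obj1
        a.refCount--;   // obj1 = null
        b.refCount--;   // obj2 = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = runCycleScenario();
        System.out.println("obj1 count = " + counts[0] + ", obj2 count = " + counts[1]);
    }
}
```

Neither counter reaches 0 because each object is still referenced by the other's field, even though the program can no longer reach either of them.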
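b. Reachability analysis algorithm
The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from them. The paths traversed by the search are called reference chains. When no reference chain connects an object to any GC Root, the object is unreachable, that is, dead.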
For example, if Object5, Object6, and Object7 have no reference chain to any GC Root, all three are eligible for collection.
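In Java, the following kinds of objects can serve as GC Roots:
Objects referenced from the virtual machine stack (the local variable table of a stack frame)
Objects referenced through JNI (that is, by native methods) in the native method stack
Objects referenced by constants in the method area
Objects referenced by static class fields in the method area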
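The search from GC Roots can be sketched as a plain graph traversal. The object graph below is hypothetical (the edges are invented for this illustration, with strings standing in for objects), and a real collector walks actual object references rather than a map.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilitySketch {
    /** Returns every node reachable from the roots by following outgoing edges. */
    static Set<String> reachable(List<String> roots, Map<String, List<String>> edges) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!marked.add(current)) {
                continue; // already visited, so cycles cannot loop forever
            }
            List<String> targets = edges.get(current);
            for (String next : targets == null ? Collections.<String>emptyList() : targets) {
                pending.push(next);
            }
        }
        return marked;
    }

    /** Builds a hypothetical graph and returns the objects no root can reach. */
    static Set<String> unreachableInExample() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("GCRoot", Arrays.asList("Object1"));
        edges.put("Object1", Arrays.asList("Object2", "Object3"));
        edges.put("Object3", Arrays.asList("Object4"));
        // Object5, Object6 and Object7 reference each other but hang off no root.
        edges.put("Object5", Arrays.asList("Object6", "Object7"));
        edges.put("Object6", Arrays.asList("Object5"));

        Set<String> dead = new TreeSet<>(Arrays.asList(
                "Object1", "Object2", "Object3", "Object4",
                "Object5", "Object6", "Object7"));
        dead.removeAll(reachable(Arrays.asList("GCRoot"), edges));
        return dead;
    }

    public static void main(String[] args) {
        System.out.println(unreachableInExample()); // prints [Object5, Object6, Object7]
    }
}
```

Unlike reference counting, the cycle between Object5 and Object6 does not keep them alive, because liveness is decided only by whether a root can reach them.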
c. More on references
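Whether an object is judged dead by reference counting or by reachability analysis, the decision depends on references. Since JDK 1.2, Java has distinguished four kinds of reference: strong reference, soft reference, weak reference, and phantom reference:
A strong reference is the kind found everywhere in program code, such as "Object obj = new Object()". As long as a strong reference exists, the garbage collector never reclaims the referenced object.
A soft reference describes an object that is still useful but not essential. Before the system would throw an out-of-memory error, softly referenced objects are added to the set of candidates for a second collection pass; only if that pass still does not free enough memory is the out-of-memory error thrown. Since JDK 1.2, the SoftReference class implements soft references.
A weak reference also describes a non-essential object, but it is weaker than a soft reference: an object that is only weakly referenced survives only until the next garbage collection. Since JDK 1.2, the WeakReference class implements weak references.
A phantom reference has no effect on the lifetime of the object it refers to. The only purpose of setting a phantom reference on an object is to receive a system notification when the garbage collector reclaims that object.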
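The reference classes in `java.lang.ref` can be tried out with a short sketch. The part that is asserted here is deterministic: while a strong reference is still held, soft and weak references return the referent, `PhantomReference.get()` always returns null, and nothing has been enqueued. What happens after `System.gc()` is only a request to the VM, so the final print makes no promise about its output.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsSketch {
    /**
     * Observes the reference kinds while the referent is still strongly reachable.
     * Returns {nothingEnqueued, phantomGetIsNull, softSeesReferent, weakSeesReferent}.
     */
    static boolean[] observeWhileStronglyReachable() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        // The referent is still in use below, so the phantom reference
        // cannot have been enqueued yet.
        boolean nothingEnqueued = queue.poll() == null;
        // A phantom reference never hands out its referent.
        boolean phantomGetIsNull = phantom.get() == null;
        boolean softSeesReferent = soft.get() == strong;
        boolean weakSeesReferent = weak.get() == strong;
        return new boolean[] {
            nothingEnqueued, phantomGetIsNull, softSeesReferent, weakSeesReferent
        };
    }

    public static void main(String[] args) {
        for (boolean observed : observeWhileStronglyReachable()) {
            System.out.println(observed);
        }
        // No strong reference is kept, so the referent is only weakly reachable.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // a request only; the VM decides whether and when to collect
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```

A phantom reference is always created with a `ReferenceQueue`; the notification described above arrives by the reference appearing on that queue.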
II. Memory reclamation algorithms
a. Mark-sweep algorithm
b. Copying algorithm
c. Mark-compact algorithm
d. Generational collection algorithm
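Modern commercial virtual machines generally use generational collection: the young generation of the heap is collected with the copying algorithm, and the old generation with mark-sweep or mark-compact.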
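Note: the young and old generations are concepts defined by the garbage collector, not by the runtime data areas. Different collectors divide the heap differently and concentrate their collection effort on different parts of it.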
The blue area in Figure 3-1, "Typical Distribution for Lifetimes of Objects" is a typical distribution for the lifetimes of objects. The x-axis is object lifetimes measured in bytes allocated. The byte count on the y-axis is the total bytes in objects with the corresponding lifetime. The sharp peak at the left represents objects that can be reclaimed (in other words, have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.
Figure 3-1 Typical Distribution for Lifetimes of Objects
To optimize for this scenario, memory is managed in generations (memory pools holding objects of different ages). Garbage collection occurs in each generation when the generation fills up. The vast majority of objects are allocated in a pool dedicated to young objects (the young generation), and most objects die there. When the young generation fills up, it causes a minor collection in which only the young generation is collected; garbage in other generations is not reclaimed. Minor collections can be optimized, assuming that the weak generational hypothesis holds and most objects in the young generation are garbage and can be reclaimed. The costs of such collections are, to the first order, proportional to the number of live objects being collected; a young generation full of dead objects is collected very quickly. Typically, some fraction of the surviving objects from the young generation are moved to the tenured generation during each minor collection. Eventually, the tenured generation will fill up and must be collected, resulting in a major collection, in which the entire heap is collected. Major collections usually last much longer than minor collections because a significantly larger number of objects are involved.
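One way to watch minor and major collections happen is the standard `java.lang.management` API, which exposes one `GarbageCollectorMXBean` per collector. The sketch below only lists whatever beans the running VM reports; the bean names and how they map to young or old generation depend on the collector in use, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcCountersSketch {
    /** Returns one line per collector reported by the running VM. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived objects; whether this triggers a minor
        // collection depends on the heap size and the collector.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
            checksum += garbage.length;
        }
        System.out.println("allocated bytes: " + checksum);
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```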
Figure 3-2, "Default Arrangement of Generations, Except for Parallel Collector and G1" shows the default arrangement of generations (for all collectors with the exception of the parallel collector and G1):
Figure 3-2 Default Arrangement of Generations, Except for Parallel Collector and G1
At initialization, a maximum address space is virtually reserved but not allocated to physical memory unless it is needed. The complete address space reserved for object memory can be divided into the young and tenured generations.
The young generation consists of eden and two survivor spaces. Most objects are initially allocated in eden. One survivor space is empty at any time, and serves as the destination of any live objects in eden; the other survivor space is the destination during the next copying collection. Objects are copied between survivor spaces in this way until they are old enough to be tenured (copied to the tenured generation).
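The movement between eden, the two survivor spaces, and the tenured generation can be sketched with a toy model. Everything here is hypothetical: the `SurvivorCopySketch` class, the lists standing in for memory regions, and the caller-supplied liveness predicate. Space limits are ignored, and an object is tenured as soon as its age reaches a fixed threshold, which is a simplification of what HotSpot does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SurvivorCopySketch {
    /** A hypothetical heap object that remembers how many collections it survived. */
    static final class Obj {
        final String name;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor space that is empty between collections
    final List<Obj> tenured = new ArrayList<>();
    final int tenuringThreshold;

    SurvivorCopySketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of eden and `from`, then swaps the survivor roles. */
    void minorGc(Predicate<Obj> isLive) {
        copyLive(eden, isLive);
        copyLive(from, isLive);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> source, Predicate<Obj> isLive) {
        for (Obj o : source) {
            if (!isLive.test(o)) {
                continue; // dead objects are simply left behind
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                tenured.add(o);
            } else {
                to.add(o);
            }
        }
    }

    static String demo() {
        SurvivorCopySketch heap = new SurvivorCopySketch(2);
        Obj longLived = heap.allocate("longLived");
        heap.allocate("temp1");
        heap.minorGc(o -> o == longLived); // longLived moves to a survivor space, age 1
        heap.allocate("temp2");
        heap.minorGc(o -> o == longLived); // longLived reaches age 2 and is tenured
        return "tenured=" + heap.tenured.size()
                + ", survivor=" + heap.from.size()
                + ", eden=" + heap.eden.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints tenured=1, survivor=0, eden=0
    }
}
```

The cost of each `minorGc` call in this model depends only on the live objects it copies, which mirrors the point made above that a young generation full of dead objects is collected quickly.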
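Objects move to the old generation in the following cases:
Large objects are allocated directly in the old generation (the size threshold for a "large object" depends on the collector)
Long-lived objects that have reached the tenuring age
Objects that still do not fit in the survivor space after a minor GC, which are moved to the old generation through the allocation guarantee
Dynamic age determination: if the total size of survivor objects up to a given age exceeds half of a survivor space, objects of that age or older move to the old generation without waiting for the maximum tenuring age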
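The dynamic age rule can be illustrated with a small calculation. This is a sketch under assumptions: it accumulates object sizes from age 1 upward and stops at the first age where the running total exceeds the target share of a survivor space, which is my understanding of how HotSpot derives the threshold. The method and parameter names are hypothetical; the 50 percent target and the cap correspond to the HotSpot flags `TargetSurvivorRatio` and `MaxTenuringThreshold`.

```java
class DynamicAgeSketch {
    /**
     * Computes a tenuring threshold from per-age byte totals.
     *
     * @param bytesByAge            bytesByAge[n] is the total size of survivor objects of age n (index 0 unused)
     * @param survivorCapacity      capacity of one survivor space in bytes
     * @param targetSurvivorPercent share of the survivor space allowed to stay occupied
     * @param maxThreshold          upper bound on the returned age
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity,
                                 int targetSurvivorPercent, int maxThreshold) {
        long desired = survivorCapacity * targetSurvivorPercent / 100;
        long total = 0;
        int age = 1;
        while (age < bytesByAge.length) {
            total += bytesByAge[age];
            if (total > desired) {
                break; // objects up to this age already exceed the target
            }
            age++;
        }
        return Math.min(age, maxThreshold);
    }

    public static void main(String[] args) {
        // Ages 1 and 2 together hold 600 bytes, more than half of 1000.
        long[] bytesByAge = { 0, 300, 300, 200 };
        System.out.println(tenuringThreshold(bytesByAge, 1000, 50, 15)); // prints 2
    }
}
```

In the example, the threshold drops to 2 even though the cap is 15, so objects of age 2 and above would be promoted at the next minor collection.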