.net CLR 4.0对垃圾回收机制的改进
A survey of garbage collection and the changes CLR 4.0 brings in - series of what is new in CLR 4.0
导言Introduction
垃圾回收(Garbage Collection)在.net中是一个很重要的机制. 本文将要谈到CLR4.0对垃圾回收做了哪些改进. 为了更好地理解这些改进, 本文也要介绍垃圾回收的历史. 这样我们对整个垃圾回收有一个大的印象. 这个大印象对于我们掌握.net架构是有帮助的.
Garbage Collection is an important component of .net. The post will talk about what has been improved in CLR 4.0. To understand it, I will take a survey of the history of garbage collection. This way we can have a big picture of garbage collection. This will help us master .net architecture in comprehensive manner.
关于垃圾回收About Garbage collection
在C++时代,我们需要自己来管理申请内存和释放内存. 于是有了new, delete关键字. 还有的一些内存申请和释放函数(malloc/free). C++程序必须很好地管理自己的内存, 不然就会造成内存泄漏(Memory leak). 在.net时代, 微软为开发人员提供了一个强有力的机制--垃圾回收. 垃圾回收机制是CLR的一部分, 我们不用操心内存何时释放, 我们可以花更多精力关注应用程序的业务逻辑. CLR里面的垃圾回收机制用一定的算法判断某些内存程序不再使用,回收这些内存并交给我们的程序再使用.
In the times of C++, we need to allocate and release memory by ourselves carefully, therefore there are new, delete keywords in C++, and fuctions(malloc/free) to allocate and release memory. C++ program has to manage its memory well, otherwise there will be memory leak. In .net, Microsoft provides a strong machanism to developers—Garbage collection. The Garbage collection is part of CLR. We do not need to worry about when to release memory. We can spend more time on buisness logic of applications. The Garbage colleciton of CLR adopts algorithms to decide which part of memory the program does not need any more, and then release these memory for further use.
垃圾回收的功能The functionalities of Garbage collection
用来管理托管资源和非托管资源所占用的内存分配和释放。In charging of the releasing and re-allocation of memory of managed and unmanaged resources.
寻找不再使用的对象,释放其占用的内存, 以及释放非托管资源所占用的内存. Find the objects no longer needed, release the memory the objects occupied, and affranchise memory occupied by unmanaged resources.
垃圾回收器释放内存之后, 出现了内存碎片, 垃圾回收器移动一些对象, 以得到整块的内存,同时所有的对象引用都将被调整为指向对象新的存储位置。After releasing the memory no longer needed, there is memory scrap. Garbage collector shifts objects to get consecutive memory space, and then the references of objects will be adjusted according to the shifted address of objects.
下面我们来看看CLR是如何管理托管资源的. Let’s see how CLR takes care of managed resources.
托管堆和托管栈Managed heap and Managed stack:
.net CLR在运行我们的程序时,在内存中开辟了两块地方作不同的用处--托管栈和托管堆. 托管栈用来存放局部变量, 跟踪程序调用与返回. 托管堆用来存放引用类型. 引用类型总是存放于托管堆. 值类型通常是放在托管栈上面的. 如果一个值类型是一个引用类型的一部分,则此值类型随该引用类型存放于托管堆中. 哪些东西是值类型? 就是定义于System.ValueType之下的这些类型:
bool byte char decimal double enum float int long sbyte short struct uint ulong ushort
When .net CLR runs our program, CLR declares two ranges of memory for different purposes. Managed stack is to store local variables, and trace the call and return of routines. Managed heap is to store reference types. Usually value types was put on managed stack. If a value type is a part of a reference type, then the value type will be stored in managed heap along with the reference type. What are value types? They are the types defined in System.ValueType:
bool byte char decimal double enum float int long sbyte short struct uint ulong ushort
什么是引用类型呢? 只要用class, interface, delegate, object, string声明的类型, 就是引用类型. What are reference types? The types declared with class, interface, delegate, object, stirng, are reference types.
我们定义一个局部变量, 其类型是引用类型. 当我们给它赋一个值, 如下例:We declare a local variable, which is a reference type, and we assign a value to the local variable, like the following:
private void MyMethod()
{
MyType myType = new MyType();
myType.DoSomeThing();
}
在此例中, myType 是局部变量, new实例化出来的对象存储于托管堆, 而myType变量存储于托管栈. 在托管栈的myType变量存储了一个指向托管堆上new实例化出来对象的引用. CLR运行此方法时, 将托管栈指针移动, 为局部变量myType分配空间, 当执行new时, CLR先查看托管堆是否有足够空间, 足够的话就只是简单地移动下托管堆的指针, 来为MyType对象分配空间, 如果托管堆没有足够空间, 会引起垃圾收集器工作. CLR在分配空间之前,知道所有类型的元数据,所以能知道每个类型的大小, 即占用空间的大小.
In this sample, myType is a local variable. the object instantiated by new operation is stored in managed heap, and the myType local variable is stored in managed stack. The myType local variable on managed stack has a pointer pointing to the address of the object instantiated by new operation. When CLR executes the method, CLR moves the pointer of managed stack to allocate memory for the local variable myType. When CLR executes new operation, CLR checks first whether managed heap has enough space, if enough then do a simple action – move the pointer of managed heap to allocate space for the object of MyType. If managed heap does not have space, this triggers garbage collector to function. CLR knows all the metadata of types, and knows the size of all the types, and then knows how big space the types need.
当CLR完成MyMethod方法的执行时, 托管栈上的myType局部变量被立即删除, 但是托管堆上的MyType对象却不一定马上删除. 这取决于垃圾收集器的触发条件.后面要介绍此触发条件.When CLR finishs execution of MyMethod method, the local variable myType on managed stack is deleted immediately, but the object of MyType on managed heap may not be deleted immediately. This depends on the trigger condition of garbage collector. I will talk about the trigger condition later.
上面我们了解了CLR如何管理托管资源. 下面我们来看垃圾收集器如何寻找不再使用的托管对象,并释放其占用的内存. In previous paragraphs, we learn how CLR manages managed resources. In following paragraphs, we will see how garbage collector find objects no longer needed, and release the memory.
垃圾收集器如何寻找不再使用的托管对象,并释放其占用的内存How garbage collector find objects no longer needed and release memory
前面我们了解了CLR如何管理托管栈上的对象.按照先进后出原则即可比较容易地管理托管栈的内存. 托管堆的管理比托管栈的管理复杂多了.下面所谈都是针对托管堆的管理. In previous paragraphs, we learn how CLR manages the objects on managed stack. It is easy to manage managed stack as long as you utilize the rule “first in last out”. The management of managed heap is much more complicated than the management of managed stack. The following is all about the management of managed heap.
根The root
垃圾收集器寻找不再使用的托管对象时, 其判断依据是当一个对象不再有引用指向它, 就说明此对象是可以释放了. 一些复杂的情况下可以出现一个对象指向第二个对象,第二个对象指向第三个对象,…就象一个链表. 那么, 垃圾收集器从哪里开始查找不再使用的托管对象呢? 以刚才所说的链表为例, 显然是应该从链表的开头开始查找. 那么,在链表开头的是些什么东东呢? The criteria garbage collector uses to judge whether an object is no longer needed is that an object can be released when the object does have any reference. In some complicated cases, it happends that the first object refers to the second object, and the second object points to the third object, etc. It is looking like a chain of single linked nodes. Then the question is : where does the garbage collector begins to find objects no longer needed? For the example of the single linked node chain, we can say it is obvious garbage collector starts from the beginning of the chain. Then the next question is: what are the stuff at the beginning of the chain.
是局部变量, 全局变量, 静态变量, 指向托管堆的CPU寄存器. 在CLR中,它们被称之为根. The answer is : local variables, global variables, static variables, the CPU registers pointing to managed heap. In CLR, they are called “the roots”.
有了开始点, 垃圾收集器接下来怎么做呢? Got the roots, what will garbage collector do next?
创建一个图, 一个描述对象间引用关系的图. Build a graph, which shows the reference relationship among objects.
垃圾收集器首先假定所有在托管堆里面的对象都是不可到达的(或者说没有被引用的,不再需要的), 然后从根上的那些变量开始, 针对每一个根上的变量, 找出其引用的托管堆上的对象, 将找到的对象加入这个图, 然后再沿着这个对象往下找,看看它有没有引用另外一个对象, 有的话,继续将找到的对象加入图中,如果没有的话, 就说明这条链已经找到尾部了. 垃圾收集器就去从根上的另外一个变量开始找, 直到根上的所有变量都找过了, 然后垃圾收集器才停止查找. 值得一提的是, 在查找过程中, 垃圾收集器有些小的优化, 如: 由于对象间的引用关系可能是比较复杂的, 所以有可能找到一个对象, 而此对象已经加入图了, 那么垃圾收集器就不再在此条链上继续查找, 转去其他的链上继续找. 这样对垃圾收集器的性能有所改善.
First garbage collector supposes all the objects in managed heap are not reachable( do not have reference, or no longer needed). Then start from the variables in the roots. For each of the variable in the roots, search the object the variable refers to, and add the found object into the graph, and search again after the found object for next refered object, etc. Check whether the found object has next reference. If has, continue to add the next found object into the graph. If not, it means this is the end of the chain, then stop searching on the chain, continue on next variable in the roots, keep searching on roots, until all the searching are finished. In the searching process, garbage collector has some optimization to improve the performance. Like: Because the reference relationship could be complicated among objects, it is possible to find an object that has been added into the graph, then garbage collector stops searching on the chain, continue to search next chain. This way helps on performance of garbage collection.
垃圾收集器建好这个图之后, 剩下那些没有在这个图中的对象就是不再需要的. 垃圾收集器就可以回收它们占用的空间.After buidling the reference graph among objects, the objects not in the graph are no longer needed objects. Garbage collector could release the memory space occupied by the no longer needed objects.
未完待续To be continued…
参考文献References
Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework By Jeffrey Richter http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework By Jeffrey Richter http://msdn.microsoft.com/en-us/magazine/bb985011.aspx
Garbage Collector Basics and Performance Hints By Rico Mariani at Microsoft http://msdn.microsoft.com/en-us/library/ms973837.aspx
www.3kk.com原创
posted on 2010-02-19 19:11 passer.net 阅读(417) 评论(0) 编辑 收藏 举报