[No0000144]深入浅出图解C#堆与栈 C# Heap(ing) VS Stack(ing)理解堆与栈1/4

前言

 
虽然在.Net Framework 中我们不必考虑内在管理和垃圾回收(GC),但是为了优化应用程序性能我们始终需要了解内存管理和垃圾回收(GC)。另外,了解内存管理可以帮助我们理解在每一个程序中定义的每一个变量是怎样工作的。
 
 

简介

这篇文章会包含堆与栈的基础知识,变量类型,变量工作原理。

在程序运行时,.NET FRAMEWORK把对象存储在内存中的两个位置:堆与栈,并且它们都会帮助我们更好的运行程序。堆与栈寄存在电脑的操作内存中,并包含我们需要的信息使整个程序运行正常。

 

堆与栈:有什么不同?

 
栈,或多或少负责跟踪正在程序中运行的代码。
堆,或多或少负责跟踪程序对象或数据。
 
栈,把它想像成叠在一起的盒子(像搭积木一样)。每一次调用一个方法就会在最上面叠一个盒子,用来跟踪程序运行情况。我们只能使用栈中叠在最上面的盒子里的东西。当某一最上面的盒子里的代码执行完毕(如方法执行完成),就把它扔掉并继续去使用下一个盒子。
堆,与栈类似,只是它是用来保存信息而不是跟踪执行。所以,堆里的任何信息都可以在任何时间被访问。有了堆,访问信息没有约束,而不像栈只能访问最上面的盒子。
堆的情况就像你把一堆刚洗完的衣服放在床上还没有时间来的及收走,你可以迅速拿到你想要拿的衣服。栈的情况就像你叠在一起的鞋盒子,你需要拿走最上面的盒子才能拿到下一个盒子。
 
 
上图并不上真正的内存运行情况,只是为了让大家区分堆和栈。
栈,会自我管理,它有自己的内存管理机制。当最上面的盒子不再使用时,会自动被扔掉。
堆,相反,我们要控制它的垃圾回收(GC)。我们要去管理堆是否干净,就像管理床上的脏衣服。你不手动扔掉它,就会在床上变臭。
 

什么在堆和栈里

 
当程序执行时,我们主要有4种类型的东西放进堆和栈里:值类型,引用类型,指针,指令。
 

值类型:

  • bool
  • byte
  • char
  • decimal
  • double
  • enum
  • float
  • int
  • long
  • sbyte
  • short
  • struct
  • uint
  • ulong
  • ushort
它们都衍生于System.ValueType。
 

引用类型:

  • class
  • interface
  • delegate
  • object
  • string
它们都衍生于System.Object。当然object就是System.Object。
 
 

指针:

 

第三种被放于内存管理体制中的是类型的引用。这个引用通常被叫作指针。我们并不具体的使用指针,它们由CLR管理。一个指针(引用)是不同于引用类型的。我们定义它是一个引用类型,意味着我们可以通过指针访问它。一个指针占有一小块内存,这块内存指向另一块内存。指针占用在内存中的存储和其它的相同,只是存放的值既不是内存地址也不是空null。

 

 

 

指令:

 
我们会在后面的文章中介绍指令怎么工作。
 
 
 

总结

 
第一节到此结束,以后的章节里会介绍不同对象在堆和栈里存放的不同。

前言

 
虽然在.Net Framework 中我们不必考虑内在管理和垃圾回收(GC),但是为了优化应用程序性能我们始终需要了解内存管理和垃圾回收(GC)。另外,了解内存管理可以帮助我们理解在每一个程序中定义的每一个变量是怎样工作的。
 

简介


这一节介绍栈的基本工作原理。
 
 

两个黄金规则

 
  1. 引用类型永远存储在堆里。
  2. 值类型和指针永远存储在它们声明时所在的堆或栈里。
 

栈工作原理

 
栈,如第一节所说,在代码运行时负责跟踪每一个线程的所在(什么被调用了)。你可以把它想像成一个线程“状态”,而每一个线程都有它自己的栈。当我们的代码执行一次方法调用,线程开始执行寄存在方法(Method)表里的JIT编译过的指令,并且把该方法的参数存放到当前线程栈里。然后,随着代码的执行每遇见方法中的变量,该变量都会被放到栈的最上面,如此重复把所有变量都放到栈上(当然引用类型只存放指针)。
为了方便理解,让我们看代码与图例。
 
执行下面的方法:
[csharp] view plain copy
 
  1. public int AddFive(int pValue)  
  2.           {  
  3.                 int result;  
  4.                 result = pValue + 5;  
  5.                 return result;  
  6.           }  

下面是栈里发生的情况.  有必要提醒的是,我们现在假设当前代码产生的栈存储会放到所有既有项(栈里已经存储的数据)之上。一旦我们开始执行该方法,方法参数pValue会被放到栈上(以后的文章里会介绍参数传递)。

注意:方法并不存在栈里,图只是为了阐述原理而放的引用。

下一步,控制(线程执行方法)被传递到寄存在方法类型表里的AddFive()方法对应的指令集中。如果方法是第一次被触发,会执行JIT编译。

随着方法的执行,栈会分配一块内存给变量result存放。

方法执行完成,返回result。

该次任务在栈里所占的所有内存将被清理,仅一个指针被移动到AddFive()开始时所在的可用内存地址上。接着会执行栈里AddFive()下面一个方法(图里看不到)。

在这个例子当中,变量result被放到了栈里。事实上,方法体内每次定义的值类型变量都会被放到栈里。

 

当然值类型有时候也会被放到堆里,我们将会在下一节提到。

 

总结

 
栈可以想像成一个严格顺序执行的序列,不允许跳跃穿插访问。栈有自我清理功能。本文以执行一个简单C#方法为例阐述了栈的基本工作原理。下一节继续介绍堆栈工作原理以及一个更复杂一些的例子。

前言

 
虽然在.Net Framework 中我们不必考虑内在管理和垃圾回收(GC),但是为了优化应用程序性能我们始终需要了解内存管理和垃圾回收(GC)。另外,了解内存管理可以帮助我们理解在每一个程序中定义的每一个变量是怎样工作的。
 
 

简介

 
本文将介绍值类型与引用类型在堆栈里的基本存储原理。
 

值类型会存储在堆里?

 
是的,值类型有时候就是会存储在堆里。上一节中介绍的黄金规则2:值类型和指针永远存储在它们声明时所在的堆或栈里。如果一个值类型不是在方法中定义的,而是在一个引用类型里,那么此值类型将会被放在这个引用类型里并存储在堆上。
 

代码图例

 
我们定义一个引用类型:
[csharp] view plain copy
 
  1. public class MyInt  
  2.           {            
  3.              public int MyValue;  
  4.           }  

里面包含一个值类型MyValue。
执行下面的方法:
[csharp] view plain copy
 
  1. public MyInt AddFive(int pValue)  
  2.           {  
  3.                 MyInt result = new MyInt();  
  4.                 result.MyValue = pValue + 5;  
  5.                 return result;  
  6.           }  

就像上一节介绍的一样,线程开始执行此方法,参数pValue将会被放到当前线程栈上。
 
接下来不同于上一节所介绍的是MyInt是一个引用类型,它将被放到堆上并在栈上放一个指针指向它在堆里的存储。
 
当AddFive()执行完成后,如上一节所讲栈开始清理。
现在是需要C#垃圾回收GC的时候了。当我们的程序所占内存到达临界值时(即将溢出),我们会需要更多的堆空间,GC就会开始运行。GC停止所有当前运行线程(整体停止),找到堆里所有主程序不会访问到的对象并删除它们。然后,GC会识别所有堆里剩下的对象并分配内存空间给它们,同时调整堆和栈里指向它们的指针。你可以想像这是非常耗资源的,这会影响到程序的性能。这就是为什么我们需要理解和注意堆栈的使用,进而写出高性能代码。
 

堆栈原理对代码的影响

 
当我们使用引用类型时,我们在和指向引用类型的指针打交道,而不是引用类型本身。
当我们使用值类型时,我们就是在和值类型本身打交道。
 

代码图例

 
假设执行方法:
[csharp] view plain copy
 
  1. public int ReturnValue()  
  2.           {  
  3.                 int x = new int();  
  4.                 x = 3;  
  5.                 int y = new int();  
  6.                 y = x;        
  7.                 y = 4;            
  8.                 return x;  
  9.           }  

我们会得到值 3。
 
 
使用引用类型:
[csharp] view plain copy
 
  1. public class MyInt  
  2.           {  
  3.                 public int MyValue;  
  4.           }  


如果执行方法:
[csharp] view plain copy
 
  1. public int ReturnValue2()  
  2. {  
  3.       MyInt x = new MyInt();  
  4.       x.MyValue = 3;  
  5.       MyInt y = new MyInt();  
  6.       y = x;                   
  7.       y.MyValue = 4;                
  8.       return x.MyValue;  
  9. }  

我们得到的值是4而不是3!(译外话:这是很简单,但相信还是有很多人不知道原理的)
 
第一个示例中:
[csharp] view plain copy
 
  1. public int ReturnValue()  
  2.           {  
  3.                 int x = 3;  
  4.                 int y = x;      
  5.                 y = 4;  
  6.                 return x;  
  7.           }  

 
x就是3,y就是4。操作两个不同对象。
 
第二个示例:
[csharp] view plain copy
 
  1. public int ReturnValue2()  
  2.           {  
  3.                 MyInt x;  
  4.                 x.MyValue = 3;  
  5.                 MyInt y;  
  6.                 y = x;                  
  7.                 y.MyValue = 4;  
  8.                 return x.MyValue;  
  9.           }  
 
得到的值是4不是3是因为我们操作栈里两个指针并且它们指向堆里同一个对象。
 
 

总结

 
希望这篇文章能帮助你更好的理解值类型变量与引用类型变量的不同,同时理解什么是指针,什么时候用到指针。以后的文章里会更深入的介绍C#内存管理并详细阐述方法参数。
 

Even though with the .NET framework we don't have to actively worry about memory management and garbage collection (GC), we still have to keep memory management and GC in mind in order to optimize the performance of our applications. Also, having a basic understanding of how memory management works will help explain the behavior of the variables we work with in every program we write.  In this article I'll cover the basics of the Stack and Heap, types of variables and why some variables work as they do.

There are two places the .NET framework stores items in memory as your code executes.  If you are not yet familiar with them, let me introduce you to the Stack and the Heap. Both the Stack and Heap help us run our code. They reside in the operating memory on our machine and contain the pieces of information we need to make it all happen.

Stack vs. Heap: What's the difference?

The Stack is more or less responsible for keeping track of what's executing in our code (or what's been "called").  The Heap is more or less responsible for keeping track of our objects (our data, well... most of it; we'll get to that later).

Think of the Stack as a series of boxes stacked one on top of the next.  We keep track of what's going on in our application by stacking another box on top every time we call a method (called a Frame).  We can only use what's in the top box on the Stack.  When we're done with the top box (the method is done executing) we throw it away and proceed to use the stuff in the previous box on the top of the Stack. The Heap is similar except that its purpose is to hold information (not keep track of execution most of the time) so anything in our Heap can be accessed at  any time.  With the Heap, there are no constraints as to what can be accessed like in the Stack.  The Heap is like the heap of clean laundry on our bed that we have not taken the time to put away yet; we can grab what we need quickly. The Stack is like the Stack of shoe boxes in the closet where we have to take off the top one to get to the one underneath it.

heapvsstack1.gif

The picture above, while not really a true representation of what's happening in memory, helps us distinguish a Stack from a Heap.
 
The Stack is self-maintaining, meaning that it basically takes care of its own memory management.  When the top box is no longer used, it's thrown out.  The Heap, on the other hand, must worry about Garbage collection (GC), which deals with how to keep the Heap clean (no one wants dirty laundry laying around, it stinks!).

What goes on the Stack and Heap?

We have four main types of things we'll be putting in the Stack and Heap as our code is executing: Value Types, Reference Types, Pointers, and Instructions. 

Value Types:

In C#, all the "things" declared with the following list of type declarations are Value types (because they are from System.ValueType):

  • bool
  • byte
  • char
  • decimal
  • double
  • enum
  • float
  • int
  • long
  • sbyte
  • short
  • struct
  • uint
  • ulong
  • ushort

Reference Types:

All the "things" declared with the types in this list are Reference types (and inherit from System.Object, except, of course, for object which is the System.Object object):

  • class
  • interface
  • delegate
  • object
  • string

Pointers:

The third type of "thing" to be put in our memory management scheme is a Reference to a Type. A Reference is often referred to as a Pointer. We don't explicitly use Pointers, they are managed by the Common Language Runtime (CLR). A Pointer (or Reference) is different than a Reference Type in that when we say something is a Reference Type, it means we access it through a Pointer. A Pointer is a chunk of space in memory that points to another space in memory.  A Pointer takes up space just like any other thing that we're putting in the Stack and Heap and its value is either a memory address or null. 

heapvsstack2.gif

Instructions:

You'll see how the  "Instructions" work later in this article...

How is it decided what goes where? (Huh?)

Ok, one last thing and we'll get to the fun stuff.

Here are our two golden rules:

  1. A Reference Type always goes on the Heap; easy enough, right?  

  2. Value Types and Pointers always go where they were declared. This is a little more complex and needs a bit more understanding of how the Stack works to figure out where "things" are declared.

The Stack, as we mentioned earlier, is responsible for keeping track of where each thread is during the execution of our code (or what's been called).  You can think of it as a thread "state" and each thread has its own Stack.  When our code makes a call to execute a method the thread starts executing the instructions that have been JIT compiled and live on the method table, it also puts  the method's parameters on the thread Stack. Then, as we go through the code and run into variables within the method, they are placed on top of the Stack. This will be easiest to understand with an example.

Take the following method:

           public int AddFive(int pValue)
          {
                int result;
                result = pValue + 5;
                
return result;
          }

Here's what happens at the very top of the Stack.  Keep in mind that what we are looking at is placed on top of many other items already living in the Stack:

Once we start executing the method, the method's parameters are placed on the Stack (we'll talk more about passing parameters later).

NOTE : the method does not live on the stack and is illustrated just for reference.

heapvsstack3.gif

Next, control (the thread executing the method) is passed to the instructions to the AddFive() method which lives in our type's method table, a JIT compilation is performed if this is the first time we are hitting the method.

heapvsstack4.gif

As the method executes, we need some memory for the "result" variable and it is allocated on the Stack.

 heapvsstack5.gif

The method finishes execution and our result is returned.

heapvsstack6.gif

And all memory allocated on the Stack is cleaned up by moving a pointer to the available memory address where AddFive() started and we go down to the previous method on the stack (not seen here).

heapvsstack7.gif

In this example, our "result" variable is placed on the stack.  As a matter of fact, every time a Value Type is declared within the body of a method, it will be placed on the stack.

Now, Value Types are also sometimes placed on the Heap.  Remember the rule, Value Types always go where they were declared?  Well, if a Value Type is declared outside of a method, but inside a Reference Type then it will be placed within the Reference Type on the Heap.

Here's another example.

If we have the following MyInt class (which is a Reference Type because it is a class):

          public class MyInt
          {          
             
public int MyValue;
          }

and the following method is executing:

          public MyInt AddFive(int pValue)
          {
                MyInt result = new MyInt();
                result.MyValue = pValue + 5;
                return result;
          }

Then just as before, the thread starts executing the method and its parameters are placed on sthe thread's stack.

heapvsstack8.gif

Now is when it gets interesting.

Because MyInt is a Reference Type, it is placed on the Heap and referenced by a Pointer on the Stack.

heapvsstack9.gif

After AddFive() is finished executing (like in the first example), and we are cleaning up...

heapvsstack10.gif

we're left with an orphaned MyInt in the Heap (there is no longer anyone in the Stack standing around pointing to MyInt)!

heapvsstack11.gif

This is where the Garbage Collection (GC) comes into play.  Once our program reaches a certain memory threshold and we need more Heap space, our GC will kick off.  The GC will stop all running threads (a FULL STOP), find all objects in the Heap that are not being accessed by the main program and delete them.  The GC will then reorganize all the objects left in the Heap to make space and adjust all the Pointers to these objects in both the Stack and the Heap.  As you can imagine, this can be quite expensive in terms of performance, so now you can see why it can be important to pay attention to what's in the Stack and Heap when trying to write high-performance code.

Ok, that's great, but how does it really affect me?

Good question. 

When we are using Reference Types, we're dealing with Pointers to the type, not the thing itself.  When we're using Value Types, we're using the thing itself.  Clear as mud, right?

Again, this is best described by example.

If we execute the following method:

          public int ReturnValue()
          {
                int x = new int();
                x = 3;
                int y = new int();
                y = x;      
                y = 4;          
                
return x;
    
      }

We'll get the value 3.  Simple enough, right?

However, if we are using the MyInt class from before:

     public class MyInt
          {

                public int MyValue;
          }

and we are executing the following method:

          public int ReturnValue2()
          {
                MyInt x = new MyInt();
                x.MyValue = 3;
                MyInt y = new MyInt();
                y = x;                 
                y.MyValue = 4;              
                
return x.MyValue;
          }

What do we get?    4!

Why?...  How does x.MyValue get to be 4? Take a look at what we're doing and see if it makes sense:

In the first example everything goes as planned:

          public int ReturnValue()
          {
                int x = 3;
                int y = x;    
                y = 4;
                
return x;
          }

heapvsstack12.gif

In the next example, we don't get "3" because both variables "x" and "y" point to the same object in the Heap.

          public int ReturnValue2()
          {
                MyInt x;
                x.MyValue = 3;
                MyInt y;
                y = x;                
                y.MyValue = 4;
                
return x.MyValue;
          }

heapvsstack13.gif

Hopefully this gives you a better understanding of a basic difference between Value Type and Reference Type variables in C# and a basic understanding of what a Pointer is and when it is used.  In the next part of this series, we'll get further into memory management and specifically talk about method parameters.

For now...

Happy coding.

posted @ 2018-04-24 15:52  CharyGao  阅读(580)  评论(0编辑  收藏  举报