JMM Instruction Reordering: Exploring a Concurrency Phenomenon

From:http://blog.csdn.net/yxc135/article/details/17720327

 

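In everyday single-threaded programming we tacitly assume an optimistic model: sequential consistency. Under this model, the operations of a program execute in a single, unique order, and every read of a variable returns the value most recently written to it in that execution order, regardless of which processor performed the write. Neither the JMM nor any modern multiprocessor architecture underneath it guarantees sequential consistency for unsynchronized code. As a result, concurrent programs can exhibit behavior that looks baffling from a single-threaded point of view.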

 

  For example, consider the following Java code:

 

    package pack;

    public class Main {
        // Shared fields, deliberately neither volatile nor guarded by a lock.
        static int x = 0, y = 0;
        static int a = 0, b = 0;

        public static void main(String[] args) {
            for (int i = 0; i < 100000; i++) {
                Thread one = new Thread(new Runnable() {
                    public void run() {
                        a = 1;
                        x = b;
                    }
                });
                Thread other = new Thread(new Runnable() {
                    public void run() {
                        b = 1;
                        y = a;
                    }
                });

                // Reset the shared state before each round.
                x = 0;
                y = 0;
                a = 0;
                b = 0;

                one.start();
                other.start();
                try {
                    one.join();
                    other.join();
                } catch (InterruptedException e) {
                    System.out.println("exception");
                }
                // Both reads returned 0: no in-order interleaving explains this.
                if ((x == 0) && (y == 0)) {
                    System.out.println("pass");
                }
            }
            System.out.println("end");
        }
    }

 

 

  Here we test the outcome of two concurrently running threads, which operate on the four static variables a, b, x, and y:

 

    Thread one = new Thread(new Runnable() {
        public void run() {
            a = 1;
            x = b;
        }
    });
    Thread other = new Thread(new Runnable() {
        public void run() {
            b = 1;
            y = a;
        }
    });



 

  If you run the program several times, you will find "pass" in the output, meaning that the result x==0, y==0 did occur.

 

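  The cause of this behavior is instruction reordering. First, let's look at a definition of happens-before, the relation the JMM uses to constrain reordering: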

 

"Happens before" is a partial order describing program events,
invented by Leslie Lamport.

Consider multithreaded executions as traces R of events E, as defined
below.  (A trace is just a sequence.)

Events E ::= start(T)
          |  end(T)
          |  read(T,x,v)
          |  write(T,x,v)
          |  spawn(T1,T2)
          |  join(T1,T2)
          |  lock(T,x)
          |  unlock(T,x)

Here T is a thread identifier, x is a variable, and v is a value.  So
the event read(T,x,v) indicates that thread T read value v from
variable x.  We also assume that traces R are well-formed by requiring
the first event by a thread T in R must be start(T).  No events by T
may follow end(T) in the trace.

Let E1 < E2 be the ordering of events as they appear in the trace,
which is transitive, irreflexive, and antisymmetric, as usual.  Define
happens-before ordering <: in a trace R as follows: E1 <: E2 iff E1 < E2
and one of the following holds:

  a) thread(E1) = thread(E2)
  b) E1 is spawn(T1,T2), and E2 is start(T2)
  c) E2 is join(T1,T2), and E1 is end(T2)
  d) E1 is unlock(T1,x) and E2 lock(T2,x)
  e) there exists E3 with E1 <: E3 and E3 <: E2 (i.e., the
     happens-before ordering is transitive)


VISIBILITY

Given EW == write(T1,x,v1) and ER == read(T2,x,v2) in trace R, we have
that EW "is not visible" to ER (i.e., v1 != v2) if

  a) ER <: EW   (i.e., the read happens before the write), or
  b) there exists some intervening event EW2 == write(T,x,v3) such
     that EW <: EW2 <: ER   (i.e., the first write is overwritten by
     the second)

Otherwise EW is visible at ER, and thus the read could "see" the value
written in EW.
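
The rules above can be turned into a small program. The following is only a sketch under stated assumptions: the class `HappensBeforeSketch`, its `Event` type, and the sample trace are hypothetical names introduced for illustration, and the values v are left out because they do not affect the ordering. It computes <: for a trace by applying rules (a) to (d) to every pair with E1 < E2 and then taking the transitive closure for rule (e).

```java
import java.util.ArrayList;
import java.util.List;

public class HappensBeforeSketch {
    // Event kinds from the grammar above.
    enum Kind { START, END, READ, WRITE, SPAWN, JOIN, LOCK, UNLOCK }

    // 'target' is the variable for READ/WRITE/LOCK/UNLOCK, the other thread
    // (T2) for SPAWN/JOIN, and null for START/END.
    static final class Event {
        final Kind kind;
        final String thread;
        final String target;

        Event(Kind kind, String thread, String target) {
            this.kind = kind;
            this.thread = thread;
            this.target = target;
        }
    }

    // Returns hb where hb[i][j] is true iff trace.get(i) <: trace.get(j).
    static boolean[][] happensBefore(List<Event> trace) {
        int n = trace.size();
        boolean[][] hb = new boolean[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) { // j > i encodes E1 < E2
                Event e1 = trace.get(i);
                Event e2 = trace.get(j);
                boolean ruleA = e1.thread.equals(e2.thread);
                boolean ruleB = e1.kind == Kind.SPAWN && e2.kind == Kind.START
                        && e1.target.equals(e2.thread);
                boolean ruleC = e1.kind == Kind.END && e2.kind == Kind.JOIN
                        && e2.target.equals(e1.thread);
                boolean ruleD = e1.kind == Kind.UNLOCK && e2.kind == Kind.LOCK
                        && e1.target.equals(e2.target);
                hb[i][j] = ruleA || ruleB || ruleC || ruleD;
            }
        }
        // Rule (e): transitive closure.
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (hb[i][k] && hb[k][j]) hb[i][j] = true;
        return hb;
    }

    // A prefix of one possible trace of the example program (hypothetical).
    static List<Event> exampleTrace() {
        List<Event> t = new ArrayList<>();
        t.add(new Event(Kind.START, "main", null));    // index 0
        t.add(new Event(Kind.SPAWN, "main", "one"));   // index 1
        t.add(new Event(Kind.SPAWN, "main", "other")); // index 2
        t.add(new Event(Kind.START, "one", null));     // index 3
        t.add(new Event(Kind.WRITE, "one", "a"));      // index 4
        t.add(new Event(Kind.START, "other", null));   // index 5
        t.add(new Event(Kind.WRITE, "other", "b"));    // index 6
        t.add(new Event(Kind.READ, "other", "a"));     // index 7
        t.add(new Event(Kind.READ, "one", "b"));       // index 8
        return t;
    }

    public static void main(String[] args) {
        boolean[][] hb = happensBefore(exampleTrace());
        System.out.println("spawn(main,one) <: write(one,a): " + hb[1][4]);
        System.out.println("write(one,a) <: read(other,a): " + hb[4][7]);
    }
}
```

In this trace the spawn is ordered before the write to a (rule (b) followed by rule (a)), but the write to a in thread one and the read of a in thread other are not ordered by <: in either direction, since no rule links the two threads to each other.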

 

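  Since no locks are used, the order in which one thread's instructions execute is clearly independent of the other thread. The possible results therefore include:

(1) x==0, y==1, from a possible execution order: a=1, x=b, b=1, y=a

(2) x==1, y==1, from a possible execution order: b=1, a=1, x=b, y=a

(3) x==1, y==0, from a possible execution order: b=1, y=a, a=1, x=b

  These three cases are easy to see. But given the definition of happens-before, why can x==0, y==0 occur?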
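
A brute-force check makes the puzzle concrete. The sketch below (the class name `InterleavingEnumerator` is hypothetical) enumerates every interleaving of the four statements that keeps each thread's own statements in program order, which is what sequential consistency would allow, and collects the resulting (x, y) pairs.

```java
import java.util.Set;
import java.util.TreeSet;

public class InterleavingEnumerator {
    // Collects the (x, y) result of every interleaving that preserves
    // the program order of each thread.
    static Set<String> outcomes() {
        Set<String> results = new TreeSet<>();
        explore(0, 0, new int[4], results);
        return results;
    }

    // state = {a, b, x, y}; i and j count the statements already executed
    // by thread one and thread other respectively.
    private static void explore(int i, int j, int[] state, Set<String> results) {
        if (i == 2 && j == 2) {
            results.add("x=" + state[2] + ",y=" + state[3]);
            return;
        }
        if (i < 2) {
            int[] next = state.clone();
            if (i == 0) {
                next[0] = 1;        // a = 1
            } else {
                next[2] = next[1];  // x = b
            }
            explore(i + 1, j, next, results);
        }
        if (j < 2) {
            int[] next = state.clone();
            if (j == 0) {
                next[1] = 1;        // b = 1
            } else {
                next[3] = next[0];  // y = a
            }
            explore(i, j + 1, next, results);
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomes());
    }
}
```

Running it prints `[x=0,y=1, x=1,y=0, x=1,y=1]`. None of the six in-order interleavings yields x==0, y==0, so that result cannot be explained by interleaving alone.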

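  Rule (a) of happens-before is easy to misread. You might reason as follows: within thread one, event E1: a=1 and event E2: x=b satisfy E1 < E2 and thread(E1) = thread(E2), so a=1 and x=b cannot be reordered; likewise, b=1 and y=a in thread other cannot be reordered. This reasoning is wrong. An answer on Stack Overflow, quoted below, explains what actually happens: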

 

Imagine this simple program (all variables initially 0):

T1:

x = 5;
y = 6;

T2:

if (y == 6) System.out.println(x);

 

The program order rule boils down to:

If x and y are actions of the same thread and x comes before y in program order, then hb(x, y) (i.e. x happens-before y).

happens-before has a very specific meaning in the JMM. In particular, it does not mean that y=6 must be subsequent to x=5 in T1 from a wall clock perspective. It only means that the sequence of actions executed by T1 must be consistent with that order. You can also refer to JLS 17.4.5:

It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal.

In the example I gave above, you will agree that from T1's perspective (i.e. in a single threaded program), x=5;y=6; is consistent with y=6;x=5; since you don't read the values.

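  In short: even when its instructions are reordered, the underlying mechanisms guarantee that the thread itself observes them as if they executed in program order. No such guarantee holds for other threads observing it.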
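
For the quoted T1/T2 example, one way to give T2 a guarantee is to make y volatile, so that the write y = 6 and a later read that sees 6 are linked by a happens-before edge (JLS 17.4.5). The sketch below is an illustration, not part of the quoted answer: the class `VolatileFlag` and the method `observe` are hypothetical names, and T2 spins until it sees y == 6 so that the result is deterministic.

```java
public class VolatileFlag {
    static int x = 0;
    static volatile int y = 0; // volatile: a read that sees 6 also sees x = 5

    // Runs T1 and T2 once and returns the value of x that T2 observed.
    static int observe() throws InterruptedException {
        x = 0;
        y = 0;
        final int[] seen = new int[1];
        Thread t2 = new Thread(() -> {
            while (y != 6) {
                Thread.yield(); // wait until T1's write to y becomes visible
            }
            seen[0] = x;
        });
        Thread t1 = new Thread(() -> {
            x = 5;
            y = 6;
        });
        t2.start();
        t1.start();
        t1.join();
        t2.join(); // join makes T2's write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());
    }
}
```

With y volatile, T2 can only read x == 5 after it has seen y == 6. Without volatile, the JMM would allow T2 to see y == 6 and still read x == 0, and the spin loop might not even terminate.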

 

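  Modern multiprocessor architectures do reorder memory operations at the hardware level.

  Let's look at how Intel's official documentation describes memory reordering.

  Intel lists eight rules concerning memory reordering: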
  Neither Loads Nor Stores Are Reordered with Like Operations
  Stores Are Not Reordered With Earlier Loads
  Loads May Be Reordered with Earlier Stores to Different Locations
  Intra-Processor Forwarding Is Allowed
  Stores Are Transitively Visible
  Stores Are Seen in a Consistent Order by Other Processors
  Locked Instructions Have a Total Order
  Loads and Stores Are Not Reordered with Locked Instructions
  
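  As you can see, the third rule can produce the strange behavior above. Take thread one as an example: at the hardware level x=b executes in two steps, load b and store x, so the thread's instruction sequence is store a, load b, store x. Because a load may be reordered ahead of an earlier store to a different location, the sequence can become load b, store a, store x. Likewise, thread other can end up as load a, store b, store y. The actual combined execution can then be: load a, load b, store a, store x, store b, store y. (Note that a load places the value in a register, and the later store writes the value held in that register.)

  This produces the strange result x==0, y==0!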
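
The combined sequence can be replayed step by step in a single thread. This is a plain simulation rather than a demonstration of real hardware behavior, and the class name `ReorderedTrace` and the register variables are hypothetical names used for illustration.

```java
public class ReorderedTrace {
    // Replays: load a, load b, store a, store x, store b, store y.
    // Returns {x, y}.
    static int[] replay() {
        int a = 0, b = 0, x = 0, y = 0;
        int regOther = a; // other: load a (moved ahead of its store to b)
        int regOne = b;   // one:   load b (moved ahead of its store to a)
        a = 1;            // one:   store a
        x = regOne;       // one:   store x, using the register value
        b = 1;            // other: store b
        y = regOther;     // other: store y, using the register value
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] r = replay();
        System.out.println("x=" + r[0] + ", y=" + r[1]);
    }
}
```

Both loads run before either store, so both registers hold 0 and the program prints `x=0, y=0`.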

 

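  Therefore, without any synchronization, one thread must not assume anything about the order in which another thread's instructions execute. (It is tempting to assume that instructions execute in the order written in the code; that assumption is wrong.)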
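
One possible way to add the missing synchronization, shown here as a sketch and not as the original author's fix, is to declare a and b volatile. The program then has no data race, so the JMM guarantees sequentially consistent behavior and x==0, y==0 can no longer occur. The class `VolatileMain`, the method `countBothZero`, and the iteration count are assumptions chosen for illustration.

```java
public class VolatileMain {
    static int x = 0, y = 0;
    static volatile int a = 0, b = 0; // volatile accesses are not reordered with each other

    // Runs the two-thread experiment and counts how often x == 0 && y == 0.
    static int countBothZero(int iterations) throws InterruptedException {
        int count = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            a = 0;
            b = 0;
            Thread one = new Thread(() -> {
                a = 1;
                x = b;
            });
            Thread other = new Thread(() -> {
                b = 1;
                y = a;
            });
            one.start();
            other.start();
            one.join();
            other.join();
            if (x == 0 && y == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countBothZero(2000));
    }
}
```

x and y can stay non-volatile because each is written by a single thread and read by the main thread only after join(). Guarding the accesses with a common lock would be another option, at the cost of serializing the two threads.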

 

posted on 2016-07-29 14:10  alvin.zhang
