VEX IR Language Syntax

/*---------------------------------------------------------------*/
/*--- High-level IR description ---*/
/*---------------------------------------------------------------*/

/* Vex IR is an architecture-neutral intermediate representation.
Unlike some IRs in systems similar to Vex, it is not like assembly
language (ie. a list of instructions). Rather, it is more like the
IR that might be used in a compiler.

Compared with assembly language, VEX IR is closer to the intermediate language of a compiler.

Code blocks
~~~~~~~~~~~
The code is broken into small code blocks ("superblocks", type:
'IRSB'). Each code block typically represents from 1 to perhaps 50
instructions. IRSBs are single-entry, multiple-exit code blocks.
Each IRSB contains three things:

An IRSB is a single-entry, multiple-exit code block, comparable in granularity to a Trace in Intel Pin.

- a type environment, which indicates the type of each temporary
value present in the IRSB

[Example:]

(*ir_block).tyenv
    -types
      -[0] Ity_I32
      -[1] Ity_I32
    -types_size 0x00000008
    -types_used 0x00000002

 

types_used is the number of temporaries in use, and the types array holds the type of each temporary.

 

- a list of statements, which represent code
[Example:]

stmts_size    0x00000003    int
stmts_used    0x00000003    int
-     (*ir_block).stmts[0]
    tag Ist_IMark
-     (*ir_block).stmts[1]
    tag Ist_WrTmp
-     (*ir_block).stmts[2]
    tag Ist_Put

 

Statements are likewise stored in the stmts array; stmts_used is the number of statements actually in use.

 

 

- a jump that exits from the end of the IRSB
[Example:]

jumpkind    Ijk_Boring
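To make the two counters in the type environment dump concrete, here is a minimal sketch in C of a growable type table. The names MiniIRType, MiniTypeEnv and the mini_tyenv_* functions are hypothetical stand-ins for illustration, not the libVEX definitions, and the initial capacity of 8 is an assumption chosen only to match types_size in the dump.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the libVEX definitions. */
typedef enum { MiniTy_I8, MiniTy_I16, MiniTy_I32, MiniTy_I64 } MiniIRType;

typedef struct {
    MiniIRType *types;   /* types[i] is the type of temporary t<i> */
    int types_size;      /* number of allocated slots */
    int types_used;      /* number of temporaries in use */
} MiniTypeEnv;

/* Create an empty environment. The capacity of 8 is an assumption
   chosen to match types_size in the dump. */
MiniTypeEnv *mini_tyenv_new(void) {
    MiniTypeEnv *env = malloc(sizeof *env);
    if (env == NULL) abort();
    env->types_size = 8;
    env->types_used = 0;
    env->types = malloc((size_t)env->types_size * sizeof *env->types);
    if (env->types == NULL) abort();
    return env;
}

/* Register a new temporary of type ty and return its index. */
int mini_tyenv_new_temp(MiniTypeEnv *env, MiniIRType ty) {
    assert(env != NULL);
    if (env->types_used == env->types_size) {
        /* Out of slots: double the array. */
        int new_size = env->types_size * 2;
        MiniIRType *grown = realloc(env->types, (size_t)new_size * sizeof *grown);
        if (grown == NULL) abort();
        env->types = grown;
        env->types_size = new_size;
    }
    env->types[env->types_used] = ty;
    return env->types_used++;
}

/* Release the environment and its type array. */
void mini_tyenv_free(MiniTypeEnv *env) {
    if (env != NULL) {
        free(env->types);
        free(env);
    }
}
```

Calling mini_tyenv_new_temp twice with MiniTy_I32 on a fresh environment leaves types_used at 2 and types_size at 8, which is the state shown in the dump.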

 

The final printed result is as follows:

0x77D699A0: movl %esi,%esp

IRSB {
  t0:I32   t1:I32           [2 temporaries]

------ IMark(0x77D699A0, 2, 0) ------ [3 statements, counting the IMark but not the last line, which updates the IP register automatically]
  t0 = GET:I32(32)           [the whole line is one statement; GET:I32(32) is an expression]
  PUT(24) = t0
  PUT(68) = 0x77D699A2:I32; exit-Boring
}

 

The second statement can be decomposed further:

-     (*ir_block).stmts[1]
    tag    Ist_WrTmp
        .tmp    0  
            .tag    Iex_Get
                .offset    32
                .ty    Ity_I32
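The nesting in this dump (a statement tag, then a payload that itself carries an expression tag) is the usual tagged-union pattern. The following is a minimal sketch of that pattern in C, with a formatter that produces the same text as the printed IRSB. All Mini* names and the reduced set of tags are hypothetical simplifications, not the libVEX types.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced tag sets for illustration only. */
typedef enum { MiniIex_Get, MiniIex_RdTmp } MiniExprTag;
typedef enum { MiniIst_IMark, MiniIst_WrTmp, MiniIst_Put } MiniStmtTag;

typedef struct {
    MiniExprTag tag;
    union {
        struct { int offset; int bits; } Get;  /* read guest state at a byte offset */
        struct { int tmp; } RdTmp;             /* read a temporary */
    } Iex;
} MiniExpr;

typedef struct {
    MiniStmtTag tag;
    union {
        struct { unsigned addr; int len; int delta; } IMark;
        struct { int tmp; MiniExpr data; } WrTmp;     /* t<tmp> = data */
        struct { int offset; MiniExpr data; } Put;    /* PUT(offset) = data */
    } Ist;
} MiniStmt;

/* Format an expression; returns what snprintf returns, or -1 on a bad tag. */
static int mini_format_expr(const MiniExpr *e, char *buf, size_t n) {
    switch (e->tag) {
    case MiniIex_Get:
        return snprintf(buf, n, "GET:I%d(%d)", e->Iex.Get.bits, e->Iex.Get.offset);
    case MiniIex_RdTmp:
        return snprintf(buf, n, "t%d", e->Iex.RdTmp.tmp);
    }
    return -1;
}

/* Format a statement by dispatching on its tag, then on the expression tag. */
int mini_format_stmt(const MiniStmt *s, char *buf, size_t n) {
    char rhs[64];
    assert(s != NULL && buf != NULL);
    switch (s->tag) {
    case MiniIst_IMark:
        return snprintf(buf, n, "------ IMark(0x%X, %d, %d) ------",
                        s->Ist.IMark.addr, s->Ist.IMark.len, s->Ist.IMark.delta);
    case MiniIst_WrTmp:
        if (mini_format_expr(&s->Ist.WrTmp.data, rhs, sizeof rhs) < 0) return -1;
        return snprintf(buf, n, "t%d = %s", s->Ist.WrTmp.tmp, rhs);
    case MiniIst_Put:
        if (mini_format_expr(&s->Ist.Put.data, rhs, sizeof rhs) < 0) return -1;
        return snprintf(buf, n, "PUT(%d) = %s", s->Ist.Put.offset, rhs);
    }
    return -1;
}
```

A MiniIst_WrTmp statement with tmp 0 and a MiniIex_Get payload of offset 32 and 32 bits formats as the line t0 = GET:I32(32) from the printed block.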

 

Because the blocks are multiple-exit, there can be additional
conditional exit statements that cause control to leave the IRSB
before the final exit. Also because of this, IRSBs can cover
multiple non-consecutive sequences of code (up to 3). These are
recorded in the type VexGuestExtents (see libvex.h).
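The single-entry, multiple-exit behaviour can be sketched as a walk over the statement list: the first conditional exit whose guard holds leaves the block, otherwise control reaches the final jump. This is a simplified model in C under stated assumptions: the Mini* names are hypothetical, and the guard values are passed in precomputed rather than being produced by earlier statements as they would be in real IR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced statement model for illustration only. */
typedef enum { MiniSt_Other, MiniSt_Exit } MiniKind;

typedef struct {
    MiniKind kind;
    int guard_tmp;      /* MiniSt_Exit: index of the temporary holding the guard */
    unsigned target;    /* MiniSt_Exit: destination if the guard is non-zero */
} MiniStmt;

typedef struct {
    const MiniStmt *stmts;
    int stmts_used;
    unsigned next;      /* destination of the final jump at the end of the block */
} MiniBlock;

/* Return the address control transfers to after running the block.
   tmp_values[i] is the (assumed precomputed) value of temporary t<i>. */
unsigned mini_block_successor(const MiniBlock *b, const int *tmp_values) {
    assert(b != NULL && tmp_values != NULL);
    for (int i = 0; i < b->stmts_used; i++) {
        const MiniStmt *s = &b->stmts[i];
        if (s->kind == MiniSt_Exit && tmp_values[s->guard_tmp] != 0) {
            /* Side exit taken: the remaining statements do not run. */
            return s->target;
        }
    }
    /* No side exit fired: leave through the final jump. */
    return b->next;
}
```

With two exit statements in the list, the function can return three different addresses for the same block, depending only on the guard values.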

Statements and expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~
Statements (type 'IRStmt') represent operations with side-effects,
eg. guest register writes, stores, and assignments to temporaries.
Expressions (type 'IRExpr') represent operations without
side-effects, eg. arithmetic operations, loads, constants.
Expressions can contain sub-expressions, forming expression trees,
eg. (3 + (4 * load(addr1))).

Statements may have side effects, whereas expressions are pure and have no side effects.
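As an illustration of an expression tree and of purity, the example (3 + (4 * load(addr1))) can be built and evaluated with a small recursive function. This is only a sketch in C: MiniExpr, mini_eval and the word-indexed memory array are hypothetical and much simpler than IRExpr. The memory is passed as const, so evaluation can read it but cannot change it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression model for illustration only. */
typedef enum { MiniE_Const, MiniE_Load, MiniE_Add, MiniE_Mul } MiniExprKind;

typedef struct MiniExpr {
    MiniExprKind kind;
    unsigned value;               /* MiniE_Const: the constant */
    const struct MiniExpr *lhs;   /* Add/Mul: left operand; Load: address expression */
    const struct MiniExpr *rhs;   /* Add/Mul: right operand; unused otherwise */
} MiniExpr;

/* Evaluate an expression tree. mem is a read-only array of mem_words
   words indexed by address, an assumption made to keep the sketch small. */
unsigned mini_eval(const MiniExpr *e, const unsigned *mem, size_t mem_words) {
    assert(e != NULL);
    switch (e->kind) {
    case MiniE_Const:
        return e->value;
    case MiniE_Load: {
        unsigned addr = mini_eval(e->lhs, mem, mem_words);
        assert(addr < mem_words);
        return mem[addr];         /* reading memory is not a side effect */
    }
    case MiniE_Add:
        return mini_eval(e->lhs, mem, mem_words) + mini_eval(e->rhs, mem, mem_words);
    case MiniE_Mul:
        return mini_eval(e->lhs, mem, mem_words) * mini_eval(e->rhs, mem, mem_words);
    }
    return 0;
}
```

With addr1 = 2 and mem[2] = 5, the tree evaluates to 3 + 4 * 5 = 23 and mem is unchanged afterwards.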

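ST denotes a store, which transfers data from a register to memory; LD denotes a load, which transfers data from memory to a register.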

 


 

Expression types

typedef
   enum { 
      Iex_Binder=0x15000,
      Iex_Get,
      Iex_GetI,
      Iex_RdTmp,
      Iex_Qop,
      Iex_Triop,
      Iex_Binop,
      Iex_Unop,
      Iex_Load,
      Iex_Const,
      Iex_Mux0X,
      Iex_CCall
   }
   IRExprTag;

 

 

 

Statement types

typedef 
   enum {
      Ist_NoOp=0x19000,
      Ist_IMark,     /* META */
      Ist_AbiHint,   /* META */
      Ist_Put,
      Ist_PutI,
      Ist_WrTmp,
      Ist_Store,
      Ist_CAS,
      Ist_LLSC,
      Ist_Dirty,
      Ist_MBE,       /* META (maybe) */
      Ist_Exit
   } 
   IRStmtTag;
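The two enums start at different bases (0x15000 and 0x19000), so a statement tag and an expression tag never share a value. One consequence is that a lookup keyed on statement tags can reject an expression tag passed by mistake. The sketch below shows this in C; the tag values are copied from the listings above, while stmt_tag_name and stmt_tag_from_name are hypothetical helpers, not part of libVEX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* First two expression tags, copied from the listing above. */
enum { Iex_Binder = 0x15000, Iex_Get };

/* Statement tags, copied from the listing above. */
typedef enum {
    Ist_NoOp = 0x19000, Ist_IMark, Ist_AbiHint, Ist_Put, Ist_PutI, Ist_WrTmp,
    Ist_Store, Ist_CAS, Ist_LLSC, Ist_Dirty, Ist_MBE, Ist_Exit
} IRStmtTag;

/* Names in the same order as the enum, so index = tag - Ist_NoOp. */
static const char *const stmt_tag_names[] = {
    "Ist_NoOp", "Ist_IMark", "Ist_AbiHint", "Ist_Put", "Ist_PutI", "Ist_WrTmp",
    "Ist_Store", "Ist_CAS", "Ist_LLSC", "Ist_Dirty", "Ist_MBE", "Ist_Exit"
};
#define STMT_TAG_COUNT ((int)(sizeof stmt_tag_names / sizeof stmt_tag_names[0]))

/* Hypothetical helper: name of a statement tag, or NULL if tag is not one. */
const char *stmt_tag_name(int tag) {
    int idx = tag - Ist_NoOp;
    if (idx < 0 || idx >= STMT_TAG_COUNT) return NULL;
    return stmt_tag_names[idx];
}

/* Hypothetical helper: statement tag for a name, or -1 if unknown. */
int stmt_tag_from_name(const char *name) {
    assert(name != NULL);
    for (int i = 0; i < STMT_TAG_COUNT; i++) {
        if (strcmp(stmt_tag_names[i], name) == 0) return Ist_NoOp + i;
    }
    return -1;
}
```

Passing Iex_Get (0x15001) to stmt_tag_name yields NULL because it falls below the statement range, whereas with both enums starting at 0 it would have been silently read as Ist_IMark.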

 

  

posted @ 2014-06-18 15:31  Daniel King