A look at callee-saved registers through __builtin_eh_return

Problem

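C++ exception handling looks almost magical: at run time it can unwind the stack and go straight from the point where the exception was thrown to the point where it is handled. The gcc source shows that a key step of this unwinding is the uw_install_context macro, which in turn uses the gcc built-in function __builtin_eh_return. There is little material online about this built-in. Judging from the gcc source, it behaves much like an ordinary return: at function exit it restores the registers that the function clobbered and is responsible for preserving, that is, the callee-saved registers.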


/* Install TARGET into CURRENT so that we can return to it.  This is a
   macro because __builtin_eh_return must be invoked in the context of
   our caller.  */

#define uw_install_context(CURRENT, TARGET)				\
  do									\
    {									\
      long offset = uw_install_context_1 ((CURRENT), (TARGET));		\
      void *handler = uw_frob_return_addr ((CURRENT), (TARGET));	\
      _Unwind_DebugHook ((TARGET)->cfa, handler);			\
      __builtin_eh_return (offset, handler);				\
    }									\
  while (0)

Test

An answer on SO says that the callee must preserve the EBP/ESI/EDI registers, and even the heavyweight Raymond Chen shows up to back it:

The Windows and SystemV calling convention for x86-32 requires functions to preserve the ebx, esi, edi, and ebp registers. But these are just conventions. – Raymond Chen

A simple test, however, suggests otherwise: the generated code clearly modifies rsi, yet the code emitted for __builtin_eh_return does not restore it.

        movq    %rdx, %rsi
        movq    %rax, %rdi

The complete test example:

tsecer@harry: cat eh_return.c 
void foo(long xx, void * yy)
{
    extern int bar(long, long);
    bar(xx, xx);
    __builtin_eh_return (1111, (void*)(2222L));
}
tsecer@harry: g++ -S eh_return.c 
tsecer@harry: cat eh_return.s 
        .file   "eh_return.c"
        .text
        .globl  _Z3foolPv
        .type   _Z3foolPv, @function
_Z3foolPv:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        pushq   %rdx
        pushq   %rax
        subq    $16, %rsp
        .cfi_offset 1, -24
        .cfi_offset 0, -32
        movq    %rdi, -24(%rbp)
        movq    %rsi, -32(%rbp)
        movq    -24(%rbp), %rdx
        movq    -24(%rbp), %rax
        movq    %rdx, %rsi
        movq    %rax, %rdi
        call    _Z3barll
        movl    $1111, %edx
        movl    $2222, %eax
        movq    %rdx, %rcx
        movq    %rax, 8(%rbp,%rcx)
        movq    -16(%rbp), %rax
        movq    -8(%rbp), %rdx
        leaq    8(%rbp,%rcx), %rcx
        movq    0(%rbp), %rbp
        .cfi_restore 6
        .cfi_def_cfa 2, 8
        movq    %rcx, %rsp
        ret
        .cfi_endproc
.LFE0:
        .size   _Z3foolPv, .-_Z3foolPv
        .section        .note.GNU-stack,"",@progbits
tsecer@harry: 
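
The tail of the listing above (from `movq %rdx, %rcx` to `ret`) can be modeled in plain C to make the stack arithmetic explicit. This is only a sketch: `struct cpu`, `stack_mem`, `store64`, `load64` and `eh_return_model` are hypothetical names invented for illustration, the stack is a small simulated byte array, and the reloads of rax and rdx are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 256 };
static unsigned char stack_mem[STACK_BYTES];   /* simulated, byte-addressed stack */

/* Hypothetical model of the registers the epilogue touches. */
struct cpu { uint64_t rsp, rbp, pc; };

static void store64(uint64_t addr, uint64_t value)
{
    assert(addr + 8 <= STACK_BYTES);
    memcpy(&stack_mem[addr], &value, sizeof value);
}

static uint64_t load64(uint64_t addr)
{
    uint64_t value;
    assert(addr + 8 <= STACK_BYTES);
    memcpy(&value, &stack_mem[addr], sizeof value);
    return value;
}

/* Mirrors the instruction order of the listing's epilogue. */
void eh_return_model(struct cpu *c, int64_t offset, uint64_t handler)
{
    uint64_t slot = c->rbp + 8 + (uint64_t)offset; /* address 8(%rbp,%rcx)        */
    store64(slot, handler);                        /* movq %rax, 8(%rbp,%rcx)     */
    c->rbp = load64(c->rbp);                       /* movq 0(%rbp), %rbp          */
    c->rsp = slot;                                 /* movq %rcx, %rsp             */
    c->pc  = load64(c->rsp);                       /* ret: pop the target address */
    c->rsp += 8;                                   /* ret: release the popped slot */
}
```

With offset 0 this degenerates into an ordinary return to a replaced return address; a non-zero offset additionally shifts the stack pointer that the handler sees. To try it, set `rbp` to an address inside `stack_mem`, store a saved rbp there with `store64`, and call `eh_return_model` from a small `main`.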

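If this test looks suspect, look at the code in gcc's own runtime library. The disassembly shows that the function modifies rsi and rdi near its prologue, but the code for __builtin_eh_return only reloads rbx and r12 to r15 (plus rbp, and the rax and rdx that the prologue pushed); the modified rsi and rdi are clearly not restored.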

(gdb) disas
Dump of assembler code for function _Unwind_RaiseException:
   0x00007ffff73431a0 <+0>:     push   %rbp
   0x00007ffff73431a1 <+1>:     mov    %rsp,%rbp
   0x00007ffff73431a4 <+4>:     push   %r15
   0x00007ffff73431a6 <+6>:     push   %r14
   0x00007ffff73431a8 <+8>:     push   %r13
   0x00007ffff73431aa <+10>:    push   %r12
   0x00007ffff73431ac <+12>:    lea    -0x3a0(%rbp),%r14
   0x00007ffff73431b3 <+19>:    push   %rbx
   0x00007ffff73431b4 <+20>:    push   %rdx
   0x00007ffff73431b5 <+21>:    lea    0x10(%rbp),%rsi
   0x00007ffff73431b9 <+25>:    push   %rax
   0x00007ffff73431ba <+26>:    mov    %rdi,%r12
   0x00007ffff73431bd <+29>:    mov    %r14,%rdi
   0x00007ffff73431c0 <+32>:    lea    -0x1c0(%rbp),%r13
###.....
   0x00007ffff734344b <+683>:   call   0x7ffff7342f00 <_Unwind_RaiseException_Phase2>
   0x00007ffff7343450 <+688>:   cmp    $0x7,%eax
   0x00007ffff7343453 <+691>:   jne    0x7ffff7343310 <_Unwind_RaiseException+368>
   0x00007ffff7343459 <+697>:   mov    %rbx,%rsi
   0x00007ffff734345c <+700>:   mov    %r14,%rdi
   0x00007ffff734345f <+703>:   call   0x7ffff7340d80 <uw_install_context_1>
   0x00007ffff7343464 <+708>:   mov    -0x218(%rbp),%r8
   0x00007ffff734346b <+715>:   mov    -0x220(%rbp),%rdi
   0x00007ffff7343472 <+722>:   mov    %r8,%rsi
   0x00007ffff7343475 <+725>:   call   0x7ffff7343190 <_Unwind_DebugHook>
=> 0x00007ffff734347a <+730>:   mov    %rax,%rcx
   0x00007ffff734347d <+733>:   mov    %r8,0x8(%rbp,%rax,1)
   0x00007ffff7343482 <+738>:   mov    -0x38(%rbp),%rax
   0x00007ffff7343486 <+742>:   lea    0x8(%rbp,%rcx,1),%rcx
   0x00007ffff734348b <+747>:   mov    -0x30(%rbp),%rdx
   0x00007ffff734348f <+751>:   mov    -0x28(%rbp),%rbx
   0x00007ffff7343493 <+755>:   mov    -0x20(%rbp),%r12
   0x00007ffff7343497 <+759>:   mov    -0x18(%rbp),%r13
   0x00007ffff734349b <+763>:   mov    -0x10(%rbp),%r14
   0x00007ffff734349f <+767>:   mov    -0x8(%rbp),%r15
   0x00007ffff73434a3 <+771>:   mov    0x0(%rbp),%rbp
   0x00007ffff73434a7 <+775>:   mov    %rcx,%rsp
   0x00007ffff73434aa <+778>:   ret    
End of assembler dump.
(gdb) 

Answer

Another Q&A explains this:

But in x86-64 System V, the designers chose registers from scratch, and (as my answer on that linked question shows) found that using RDI and RSI for the first 2 args saved instructions (when building SPECint with an early x86-64 port of gcc). Probably because gcc at the time liked to inline memset or memcpy using rep stosd, or the library implementation used that.

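Roughly speaking: 32-bit SysV treats esi/edi differently from how 64-bit SysV treats rsi/rdi, and Windows and SysV also disagree about rsi/rdi in 64-bit mode, so it is easy to get confused. In x86-64 System V, rdi and rsi carry the first two arguments and are caller-saved, so __builtin_eh_return has nothing to restore for them. Assuming that because esi/edi are callee-saved in 32-bit code the corresponding rsi/rdi must also be callee-saved in 64-bit code is a mistaken leap.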
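
To keep the three conventions apart, the callee-saved general-purpose registers can be written down as a small lookup table. This is a sketch: the `enum abi` names and the `is_callee_saved` helper are hypothetical, the register lists are filled in from the usual ABI descriptions rather than from anything in this post, the stack pointer is left out, and the xmm registers that Windows x64 also preserves are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the three conventions discussed in the text. */
enum abi { ABI_I386_CDECL, ABI_SYSV_X86_64, ABI_WIN_X64 };

/* Callee-saved general-purpose registers per ABI, NULL terminated. */
static const char *const callee_saved[][10] = {
    [ABI_I386_CDECL]  = { "ebx", "esi", "edi", "ebp", NULL },
    [ABI_SYSV_X86_64] = { "rbx", "rbp", "r12", "r13", "r14", "r15", NULL },
    [ABI_WIN_X64]     = { "rbx", "rbp", "rsi", "rdi",
                          "r12", "r13", "r14", "r15", NULL },
};

/* Returns 1 if `reg` must be preserved by the callee under `abi`, else 0. */
int is_callee_saved(enum abi abi, const char *reg)
{
    assert(reg != NULL);
    for (size_t i = 0; callee_saved[abi][i] != NULL; i++) {
        if (strcmp(callee_saved[abi][i], reg) == 0)
            return 1;
    }
    return 0;
}
```

Looking up "rsi" gives 0 for `ABI_SYSV_X86_64` but 1 for `ABI_WIN_X64`, which is exactly the mismatch that makes the 32-bit intuition misleading.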

Addendum

The gdb source shows that Intel registers are not numbered in strict alphabetical order:

static const char *att_names64[] = {
  "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi",
  "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
};

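The machine code below shows the same thing: rbx is the fourth register, not the second (push %rbx encodes as 0x53, after 0x51 for rcx and 0x52 for rdx).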

tsecer@harry: cat gcc_inline_push_reg.c 
void foo()
{
    __asm__(
            "push %rax\n\t"
            "push %rbx\n\t"
            "push %rcx\n\t"
            "push %rdx\n\t"
            "push %rsp\n\t"
            "push %rbp\n\t"
            "push %rsi\n\t"
            "push %rdi\n\t"
    );
}
tsecer@harry: gcc -g -c gcc_inline_push_reg.c
tsecer@harry: gdb gcc_inline_push_reg.o -quiet
Registered pretty printers for UE classes
Registered pretty printers for UE classes
Reading symbols from gcc_inline_push_reg.o...
(gdb) disas/r foo
Dump of assembler code for function foo:
   0x0000000000000000 <+0>:     55      push   %rbp
   0x0000000000000001 <+1>:     48 89 e5        mov    %rsp,%rbp
   0x0000000000000004 <+4>:     50      push   %rax
   0x0000000000000005 <+5>:     53      push   %rbx
   0x0000000000000006 <+6>:     51      push   %rcx
   0x0000000000000007 <+7>:     52      push   %rdx
   0x0000000000000008 <+8>:     54      push   %rsp
   0x0000000000000009 <+9>:     55      push   %rbp
   0x000000000000000a <+10>:    56      push   %rsi
   0x000000000000000b <+11>:    57      push   %rdi
   0x000000000000000c <+12>:    90      nop
   0x000000000000000d <+13>:    5d      pop    %rbp
   0x000000000000000e <+14>:    c3      ret    
End of assembler dump.
(gdb) 
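
The opcode bytes in the dump follow a simple rule: a one-byte push of a 64-bit register is 0x50 plus the register number. The sketch below decodes that byte using the same name table as above; `push_reg_name` and `reg_number` are hypothetical helper names, and only the eight registers reachable without a REX prefix are handled by the decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char *const att_names64[16] = {
    "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi",
    "%r8",  "%r9",  "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
};

/* Name of the register pushed by a one-byte PUSH opcode (0x50..0x57),
   or NULL if the byte is not such an opcode. */
const char *push_reg_name(unsigned char opcode)
{
    if (opcode < 0x50 || opcode > 0x57)
        return NULL;
    return att_names64[opcode - 0x50];
}

/* Hardware number of a register given its AT&T name, or -1 if unknown. */
int reg_number(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < 16; i++) {
        if (strcmp(att_names64[i], name) == 0)
            return i;
    }
    return -1;
}
```

Feeding it the bytes 0x50, 0x53, 0x51, 0x52 from the dump yields %rax, %rbx, %rcx, %rdx, matching the order of the push instructions in the source.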

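So why are AX through DX not encoded as 0 to 3 in alphabetical order? These discussions suggest that the names running A, B, C, D may be just a coincidence of the mnemonics: they abbreviate Accumulator, Base, Counter, and Data (DX pairs with the accumulator to form a longer value). Perhaps, logically, or given that code was mostly written in assembly when the 8086 was released, the accumulator was simply used together with the counter more often?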

This SE thread goes into more depth; one view is that the registers were numbered according to how often they are used:

i always learned these registers as accumulate, count, data, and base. They weren't ordered alphabetically so much as ordered by usage, ax for most arithmetic operations, cx for loop counters, dx for either left over arithmetic (think of the remainder or carry for div/mul) or i/o data, and bx for a base pointer to memory. Roughly, the ACDB is the order of importance for your average use case – Steve Cox Dec 1, 2017 at 18:54

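Someone also points out a detail of the pusha instruction: it pushes the registers in the same ACDB order, which indirectly supports the idea that the hardware uses this encoding order internally.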

The AX/CX/DX/BX order also makes an appearance in PUSHA, which suggests it might correspond to the internal register file implementation... – Stephen Kitt Dec 1, 2017 at 15:31

posted on 2024-03-04 20:19 by tsecer