How the linker's --relax option is implemented on PowerPC

1. The problem

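When building the executable of a large project, a DEBUG build can make the code segment, plus the read-only data placed with it (strings, constant global variables, and so on), very large. The crti object in glibc contains a relocation against _GLOBAL_OFFSET_TABLE_ (the GOT), and as the log below shows, that relocation has type R_PPC_LOCAL24PC. Error output of the following kind can be found online (from http://meld.org/discussion/linking-error-cge-51-p2020-rdb-tool-chain):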

oot@Desktop # ppc_85xx-gcc -o sampleprog sampleprog.c
/opt/cge5.1p2020rdb/montavista/cge/devkit/ppc/85xx/bin/../target/usr/lib/crt1.o: In function `_start':
(.text+0x20): relocation truncated to fit: R_PPC_REL24 against symbol `__libc_start_main@@GLIBC_2.0' defined in .plt section in /opt/cge5.1p2020rdb/montavista/cge/devkit/ppc/85xx/bin/../target/usr/lib/crt1.o
/opt/cge5.1p2020rdb/montavista/cge/devkit/ppc/85xx/bin/../target/usr/lib/crti.o: In function `_init':
/home/build/BUILD/glibc-2.5.90/objdir/csu/crti.S:18: relocation truncated to fit: R_PPC_LOCAL24PC against symbol `_GLOBAL_OFFSET_TABLE_' defined in .got section in /opt/cge5.1p2020rdb/montavista/cge/devkit/ppc/85xx/bin/../target/usr/lib/crt1.o+fffffffc
/tmp/cckOiWe5.o: In function `main':
sampleprog.c:(.text+0x28): relocation truncated to fit: R_PPC_REL24 against symbol `printf@@GLIBC_2.4' defined in .plt section in /opt/cge5.1p2020rdb/montavista/cge/devkit/ppc/85xx/bin/../target/usr/lib/crt1.o
collect2: ld returned 1 exit status

2. Cause

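PowerPC is a RISC architecture, and one important characteristic of RISC is fixed-length instructions: on a 32-bit system every instruction is 32 bits wide. An instruction generally consists of an opcode part and an operand part (there are exceptions with no operands, such as the system call instruction sc and the interrupt return instruction rfi). The opcode selects the action and the operands say what it acts on. For example, mr r1, r2 copies the value of register r2 into r1. There are 32 general-purpose registers, so a register number takes 5 bits; this instruction therefore needs 10 bits for its operands, leaving up to 22 bits for the opcode. RISC instruction sets are small, so nowhere near that many opcode bits are actually used, and in this example the limit on operand width never shows.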

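Now consider branch instructions. Most architectures prefer relative branches, which transfer control to the address obtained by adding an offset to the PC of the current instruction (the destination may be a function, or the target of the unconditional jump inside an if/else). The reason is that such code is shorter and easy to relocate: when the start address of the whole code segment changes, the relative displacement does not, so no relocation is needed. The 386 has relative jump instructions of this kind for the same reason.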

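A 32-bit PowerPC branch instruction is laid out roughly as follows (fields in order: opcode, LI, AA, LK):

010010 000000000000000000000000 0 0

The top 6 bits are the opcode (18, binary 010010) that marks the instruction as a branch. The next 24 bits are the LI field, and the last two bits are the AA (absolute address) and LK (link) flags. For a relative branch, LI is extended with two zero bits, because instructions are 4-byte aligned, to form a signed 26-bit byte offset from the current PC. At compile time a .o file does not know where the code it references will end up in the final executable, so it boldly assumes that the distance from the branch to its target is within range. If the target does turn out to be too far from the branch, the linker cannot fit the offset into the 24-bit field and reports the error shown above.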
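To make the encoding concrete, here is a minimal sketch in C of how a b or bl instruction could be assembled from a byte offset and decoded again. The function names ppc_encode_branch and ppc_decode_branch_offset and the two macros are made up for this illustration; they are not binutils APIs. The range check mirrors the condition under which the linker would report "relocation truncated to fit".

```c
#include <stdint.h>

/* Primary opcode 18 (binary 010010) in the top 6 bits marks an I-form branch. */
#define PPC_B_OPCODE 0x48000000u
/* The 24-bit LI field occupies bits 2..25 of the instruction word. */
#define PPC_LI_MASK  0x03fffffcu

/* Hypothetical helper: encode "b" (link == 0) or "bl" (link != 0) with a
   PC-relative byte offset.  Returns 0 on success, -1 if the offset is
   misaligned or does not fit in the signed 26-bit range. */
int ppc_encode_branch(int32_t offset, int link, uint32_t *insn)
{
    if (offset & 3)
        return -1;                          /* instructions are 4-byte aligned */
    if (offset < -0x02000000 || offset > 0x01fffffc)
        return -1;                          /* this is the "truncated to fit" case */
    *insn = PPC_B_OPCODE | ((uint32_t)offset & PPC_LI_MASK) | (link ? 1u : 0u);
    return 0;
}

/* Hypothetical helper: recover the signed byte offset from an encoded branch. */
int32_t ppc_decode_branch_offset(uint32_t insn)
{
    int32_t field = (int32_t)(insn & PPC_LI_MASK);
    if (field & 0x02000000)                 /* sign bit of the 26-bit offset */
        field -= 0x04000000;                /* sign-extend to 32 bits */
    return field;
}
```

With this sketch, an offset of 8 with the link bit set encodes to 0x48000009, and an offset of 0x02000000 (exactly 32 MB forward) is rejected.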

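3. How the linker implements --relax

The PowerPC toolchain has to pay for its own wrong assumption, so it must solve the problem that assumption introduces. Note that a linker generally does not change opcodes (for example, it will not turn an unconditional b into a conditional bc). The linker therefore provides --relax to deal with the problem. Let us look at how it does so.

The function ppc_elf_relax_section in BinUtils\binutils-2.21.1\bfd\elf32-ppc.c is the core of the PowerPC solution. Its most relevant code is: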

static const int stub_entry[] =
{
    0x3d800000, /* lis 12,xxx@ha */
    0x398c0000, /* addi 12,12,xxx@l */
    0x7d8903a6, /* mtctr 12 */
    0x4e800420, /* bctr */
};

static bfd_boolean
ppc_elf_relax_section (bfd *abfd,
         asection *isec,
         struct bfd_link_info *link_info,
         bfd_boolean *again)
{

……

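/* Walk every relocation of this input section, pick out the
   relocation types of interest, and collect them in the list that
   fixups points to.  The list exists to remove redundancy: one
   module may contain several "bl target" instructions, and branches
   to the same target can share a single trampoline.  */
for (irel = internal_relocs; irel < irelend; irel++)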
{

……

      /* Look for an existing fixup to this address. */
      for (f = fixups; f ; f = f->next)
        if (f->tsec == tsec && f->toff == toff)
          break;

if (f == NULL)
{

……

stub_rtype = R_PPC_RELAX;
   if (tsec == htab->plt
       || tsec == htab->glink)
     {
       stub_rtype = R_PPC_RELAX_PLT;
       if (r_type == R_PPC_PLTREL24)
  stub_rtype = R_PPC_RELAX_PLTREL24;
     }

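   /* Hijack the old relocation. Since we need two
      relocations for this use a "composite" reloc. */
   /* The trampoline is patched later and needs two relocated fields,
      the high 16 bits and the low 16 bits of the 32-bit target
      address, while the original branch carries only one relocation.
      Hence the new relocation type R_PPC_RELAX: a single entry that
      asks the linker to patch two places.  */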
   irel->r_info = ELF32_R_INFO (ELF32_R_SYM (irel->r_info),
           stub_rtype);
   irel->r_offset = trampoff + insn_offset;
   if (r_type == R_PPC_PLTREL24
       && stub_rtype != R_PPC_RELAX_PLTREL24)
     irel->r_addend = 0;

   /* Record the fixup so we don't do it again this section. */
   f = bfd_malloc (sizeof (*f));
   f->next = fixups;
   f->tsec = tsec;
   f->toff = toff;
   f->trampoff = trampoff;
   fixups = f;

trampoff += size;
changes++;   /* The new fixup is now recorded. */

}

……

/* Write out the trampolines. */
if (fixups != NULL)
    {
      const int *stub;
      bfd_byte *dest;
      int i, size;

      do
{
   struct one_fixup *f = fixups;
   fixups = fixups->next;
   free (f);   /* As noted above, fixups only serves to detect duplicates, so it can be freed here. */
}
      while (fixups);

/* trampoff is now the size of the original section contents after
   all the trampolines (stubs) have been appended.  */
contents = bfd_realloc_or_free (contents, trampoff);
if (contents == NULL)
goto error_return;

      isec->size = (isec->size + 3) & (bfd_vma) -4;
      dest = contents + isec->size;
      /* Branch around the trampolines. */
      if (maybe_pasted)
{
   bfd_vma val = B + trampoff - isec->size;
   bfd_put_32 (abfd, val, dest);
   dest += 4;
}
      isec->size = trampoff;

      if (link_info->shared)
{
   stub = shared_stub_entry;
   size = ARRAY_SIZE (shared_stub_entry);
}
      else
{
   stub = stub_entry;
   size = ARRAY_SIZE (stub_entry);
}

      i = 0;
      while (dest < contents + trampoff)
{
   /* Copy out each stub.  In the non-shared case the stub body is
      the four instructions of stub_entry, one copy per target.  */
   bfd_put_32 (abfd, stub[i], dest);
   i++;
   if (i == size)
     i = 0;
   dest += 4;
}
      BFD_ASSERT (i == 0);
    }
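The de-duplication step above can be sketched in isolation. The following is a simplified model, not the binutils code: struct fixup, trampoline_for, free_fixups and the integer section id are stand-ins invented for this example (the real list stores an asection pointer), and STUB_SIZE assumes the four-instruction stub_entry.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the one_fixup list: one node per distinct
   branch target that already has a trampoline. */
struct fixup {
    struct fixup *next;
    int tsec;            /* hypothetical section id (binutils keeps an asection *) */
    uint32_t toff;       /* target offset inside that section */
    uint32_t trampoff;   /* offset of this target's trampoline */
};

#define STUB_SIZE 16u    /* assumed: four 4-byte instructions, as in stub_entry */

/* Return the trampoline offset for (tsec, toff).  A new slot is taken at
   *next_trampoff only when no earlier branch used the same target.
   Returns UINT32_MAX if memory runs out. */
uint32_t trampoline_for(struct fixup **list, uint32_t *next_trampoff,
                        int tsec, uint32_t toff)
{
    struct fixup *f;

    for (f = *list; f != NULL; f = f->next)
        if (f->tsec == tsec && f->toff == toff)
            return f->trampoff;          /* share the existing trampoline */

    f = malloc(sizeof *f);
    if (f == NULL)
        return UINT32_MAX;
    f->next = *list;
    f->tsec = tsec;
    f->toff = toff;
    f->trampoff = *next_trampoff;
    *list = f;
    *next_trampoff += STUB_SIZE;         /* reserve room for the next stub */
    return f->trampoff;
}

/* Free the list once every branch has been assigned a trampoline. */
void free_fixups(struct fixup *list)
{
    while (list != NULL) {
        struct fixup *next = list->next;
        free(list);
        list = next;
    }
}
```

Calling trampoline_for twice with the same target returns the same offset and advances next_trampoff only once, which is the sharing behaviour the annotation on the loop describes.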

ppc_elf_relocate_section handles the R_PPC_RELAX relocation type as follows:

case R_PPC_RELAX:
   if (info->shared)
     relocation -= (input_section->output_section->vma
      + input_section->output_offset
      + rel->r_offset - 4);

   {
     unsigned long t0;
     unsigned long t1;

     t0 = bfd_get_32 (output_bfd, contents + rel->r_offset);
     t1 = bfd_get_32 (output_bfd, contents + rel->r_offset + 4);

     /* We're clearing the bits for R_PPC_ADDR16_HA
        and R_PPC_ADDR16_LO here. */
     t0 &= ~0xffff;
     t1 &= ~0xffff;

     /* t0 is HA, t1 is LO */
     relocation += addend;
     t0 |= ((relocation + 0x8000) >> 16) & 0xffff;
     t1 |= relocation & 0xffff;

     bfd_put_32 (output_bfd, t0, contents + rel->r_offset);
     bfd_put_32 (output_bfd, t1, contents + rel->r_offset + 4);

     /* Rewrite the reloc and convert one of the trailing nop
        relocs to describe this relocation. */
     BFD_ASSERT (ELF32_R_TYPE (relend[-1].r_info) == R_PPC_NONE);
     /* The relocs are at the bottom 2 bytes */
     rel[0].r_offset += 2;
     memmove (rel + 1, rel, (relend - rel - 1) * sizeof (*rel));
     rel[0].r_info = ELF32_R_INFO (r_symndx, R_PPC_ADDR16_HA);
     rel[1].r_offset += 4;
     rel[1].r_info = ELF32_R_INFO (r_symndx, R_PPC_ADDR16_LO);
     rel++;
   }
   continue;
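The t0/t1 arithmetic in this case can be tried out on its own. The sketch below assumes a non-shared link and a plain 32-bit target address; patch_ha_lo is a name made up for this example and does not exist in binutils.

```c
#include <stdint.h>

/* Hypothetical helper: fill the 16-bit immediate fields of a "lis; addi"
   pair with the high-adjusted (@ha) and low (@l) halves of target,
   mirroring the t0/t1 updates in the R_PPC_RELAX case. */
void patch_ha_lo(uint32_t *lis_insn, uint32_t *addi_insn, uint32_t target)
{
    /* @ha: adding 0x8000 before shifting compensates for the sign
       extension that addi applies to its 16-bit immediate. */
    *lis_insn  = (*lis_insn  & ~0xffffu) | (((target + 0x8000u) >> 16) & 0xffffu);
    /* @l: the low 16 bits, stored unchanged. */
    *addi_insn = (*addi_insn & ~0xffffu) | (target & 0xffffu);
}
```

For a target of 0x12348765 the two stub words 0x3d800000 and 0x398c0000 become 0x3d801235 and 0x398c8765: the high half is bumped from 0x1234 to 0x1235 because the low half, 0x8765, is negative when read as a signed 16-bit value.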

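4. Summary of the approach

When the distance is too great, the linker first appends a trampoline (the stub in the code) for each out-of-range branch target at the end of the code section that contains the branch (the bl instruction). The placement matters: the distance from an address inside a section to the end of that same section is fixed at link time and does not depend on any other section. The linker then rewrites the operand of the bl so that it branches to the stub, and once there the CPU executes the instructions shown in the comments below: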

static const int stub_entry[] =
{
    0x3d800000, /* lis 12,xxx@ha */
    0x398c0000, /* addi 12,12,xxx@l */
    0x7d8903a6, /* mtctr 12 */
    0x4e800420, /* bctr */
};

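Here xxx is the real branch target. The first instruction, lis, loads the high 16 bits of the absolute target address (adjusted by @ha to compensate for the sign extension performed by addi) into the upper half of r12; addi then adds the low 16 bits, giving the full 32-bit absolute target address. That address is moved into the ctr register, and bctr branches to it. The length of this sequence also shows why the PowerPC toolchain boldly assumes that targets lie within the 24-bit range and uses a relative branch: what was a single instruction becomes four here.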
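To check that the first two stub instructions really rebuild the target address, one can model them in a few lines of C. This is only a sketch of the arithmetic performed by lis and addi, and stub_r12_value is a name invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: compute the value r12 holds after "lis 12,imm"
   followed by "addi 12,12,imm", given the two patched instruction words. */
uint32_t stub_r12_value(uint32_t lis_insn, uint32_t addi_insn)
{
    uint32_t r12 = (lis_insn & 0xffffu) << 16;   /* lis: immediate goes to the upper half */
    uint32_t lo  = addi_insn & 0xffffu;

    if (lo & 0x8000u)
        lo |= 0xffff0000u;                       /* addi sign-extends its immediate */
    return r12 + lo;                             /* wraps modulo 2^32, like the register */
}
```

Feeding it the words 0x3d801235 and 0x398c8765 yields 0x12348765, so mtctr and bctr would then branch to that absolute address.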

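5. A more precise calculation

We can now work out the exact reach of such a branch. The 24-bit LI field, extended with two zero bits, gives a 26-bit byte offset, which covers 64 MB. Branches go both forward and backward, so one bit serves as the sign, leaving a reach of 32 MB in either direction. When the distance between the branch and its target exceeds this, the linker reports an error, and the --relax option has to be passed to the linker. Since crti is among the very first sections in the linker's input, a DEBUG build of a large project can indeed exceed this limit.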
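The same calculation can be written down as a small range test. This is a sketch under the assumption of plain 32-bit addresses with no wraparound; the macro names and branch_out_of_range are made up for illustration.

```c
#include <stdint.h>

/* Reach of a relative b/bl: a signed 26-bit byte offset whose low two
   bits are always zero. */
#define PPC_BRANCH_MIN (-(INT64_C(1) << 25))        /* -32 MB           */
#define PPC_BRANCH_MAX ((INT64_C(1) << 25) - 4)     /* +32 MB - 4 bytes */

/* Hypothetical helper: return 1 if a relative branch located at address
   from cannot reach address to, 0 if it can. */
int branch_out_of_range(uint32_t from, uint32_t to)
{
    int64_t delta = (int64_t)to - (int64_t)from;

    return delta < PPC_BRANCH_MIN || delta > PPC_BRANCH_MAX;
}
```

A branch at 0x10000000 can reach 0x11fffffc but not 0x12000000; in the second case a link without --relax would fail with the truncation error from section 1.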

Posted on 2019-03-06 20:42 by tsecer
