Reading Notes on Linkers & Loaders

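I use C++ at work and sometimes run into odd link-related problems that I cannot resolve well. I needed to study the topic systematically, so I started reading Linkers and Loaders in November 2014 and finally finished it on 2014-12-31.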

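Goal: become familiar with the ELF file format, static libraries, and dynamic libraries on Linux, along with basic use of the related tools, so that I have an approach to try when problems come up.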

link: http://www.iecc.com/linker/

Chapter 1 Intro

Chapter 2 Architectural Issues

ABI
Intel x86 architecture //TODO
Paging and Virtual Memory
PIC p.252

Chapter 3 Object Files

BSS Segment (Block Started by Symbol): the segment used for uninitialized data
ELF format //TODO
PE

Chapter 4 Storage allocation

C++ duplicate removal

Templates are essentially macros with arguments that are datatypes, and that expand into a distinct routine for every distinct set of type arguments.

The GNU linker deals with the template problem by defining a "link once" type of section similar to common blocks.

Chapter 5 Symbol Management

DWARF

DWARF is a widely used, standardized debugging data format. DWARF was originally designed along with Executable and Linkable Format (ELF), although it is independent of object file formats. The name is a medieval fantasy complement to "ELF" that has no official meaning, although the backronym 'Debugging With Attributed Record Formats' was later proposed.

  • Annotated C++ Reference Manual

Chapter 6 Libraries

readelf

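List which .o files an ELF-format .a archive contains (readelf -h prints the ELF header of each archive member, preceded by a "File:" line naming that member):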
$ readelf -h libcommon.a |grep File

GNU Binary Utilities

Manual for ar, nm, ranlib, readelf, and related tools:
https://sourceware.org/binutils/docs/binutils/index.html
ar: Create, modify, and extract from archives
nm: List symbols from object files
ranlib: Generate index to archive contents
readelf: Display the contents of ELF format files

Chapter 7 Relocation

Once a linker has scanned all of the input files to determine segment sizes, symbol definitions and symbol references, figured out which library modules to include, and decided where in the output address space all of the segments will go, the next stage is the heart of the linking process, relocation.

  • Hardware and software relocation
    Hardware relocation allows an operating system to give each process a separate address space that starts at a fixed known address, which makes program loading easier and prevents buggy programs in one address space from damaging programs in other address spaces. Software linker or loader relocation combines input files into one large file that's ready to be loaded into the address space provided by hardware relocation, frequently with no load-time fixing up at all.

Chapter 8 Loading and overlays

Loading is a little different depending on whether a program is loaded by mapping into a process address space via the virtual memory system or just read in using normal I/O calls.

  • PIC costs and benefits
    The advantages of PIC are straightforward: it makes it possible to load code without having to do load-time relocation, and to share memory pages of code among processes even though they don't all have the same address space allocated.

Chapter 9 Shared libraries

  • static linked shared libraries
    that is, libraries where program and data addresses in libraries are bound to executables at link time.

  • dynamic linked libraries

  • Binding time
    A much more interesting problem occurs when the library is present, but the library has changed since the program was linked. In a conventionally linked program, symbols are bound to addresses and library code is bound to the executable at link time, so the library the program was linked with is the one it uses regardless of subsequent changes to the library. With static shared libraries, symbols are still bound to addresses at link time, but library code isn't bound to the executable until run time. (With dynamic shared libraries, they're both delayed until runtime.)

Chapter 10 Dynamic Linking and Loading

PLT: Procedure Linkage Table

Chapter 11 Advanced techniques

Techniques for C++

  • Trial linking
  • Duplicate code elimination

The Java linking model

What makes it interesting is that Java also defines a portable binary object code format, a virtual machine that executes programs in that binary format, and a loading system that permits a Java program to add code to itself on the fly.

A Java application can either use the built-in bootstrap class loader, which loads classes from files on the local disk, or it can provide its own class loader, which can create or retrieve classes any way it wants. Most commonly a custom class loader retrieves class files over a network connection, but it could equally well generate code on the fly or extract code from compressed or encrypted files. When a class is loaded due to a reference from another class, the system uses the same loader that loaded the referring class. Each class loader has its own separate name space, so even if an application run from the disk and one run over the net have identically named classes or class members, there's no name collision.

posted @ 2014-12-06 23:45  apricot