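[Repost] The GCC compilation process
Original: https://www.jianshu.com/p/00ee0ec582a1
Compiling several source files produces several object files, each containing the machine code for one source file plus a symbol table for its related data. Unless the -c option tells GCC to compile without linking, GCC writes the object files to temporary files that are discarded after linking. With -c, the object files are kept: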
$ gcc -c main.c
$ gcc -c func.c
These commands create two object files in the current directory, main.o and func.o. Passing both source file names to a single GCC command gives the same result:
$ gcc -c main.c func.c
To disassemble a binary named a: objdump -D a (or pipe the output into vim: objdump -D a | vim -)
————————————————————————————————————————————————————————————————————————————————
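A C/C++ file goes through four steps to become an executable: preprocessing, compilation, assembly, and linking. The word "compile" is commonly used for all four steps together.
- Preprocessing
In a C/C++ source file, directives that start with "#" are preprocessor directives, such as the include directive "#include", the macro definition directive "#define", and the conditional compilation directives "#if" and "#ifdef". Preprocessing inserts the included files into the source file, expands macros, selects code according to the conditional compilation directives, and writes the result to a ".i" file for further processing.
- Compilation
Compilation translates C/C++ code (for example the ".i" file above) into assembly code.
- Assembly
Assembly translates the assembly code from step two into machine code in a defined format. On Linux this is usually an ELF object file (OBJ file). "Disassembly" is the reverse: converting machine code back into assembly code.
- Linking
Linking combines the OBJ files from the previous step with the system's OBJ files and libraries to produce an executable that runs on a specific platform.
The compiler runs one or more of these four steps on each input file. The file suffix indicates the language of the source file and controls the compiler's default action.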
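File suffixes and their types:
Suffix | Type |
---|---|
.c | C source file |
.h | Header file (read by the preprocessor) |
.cpp | C++ source file |
.i | Preprocessed C file |
.ii | Preprocessed C++ file |
.s | Assembly source file |
.o | Object file |
.a | Static library (Linux) |
.so | Shared (dynamic) library (Linux) |
.lib | Static library (Windows) |
.dll | Dynamic-link library (Windows) |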
GCC usage:
gcc --help
Usage: gcc [options] file...
Options:
-pass-exit-codes Exit with highest error code from a phase.
--help Display this information.
--target-help Display target specific command line options.
--help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...].
Display specific types of command line options.
(Use '-v --help' to display command line options of sub-processes).
--version Display compiler version information.
-dumpspecs Display all of the built in spec strings.
-dumpversion Display the version of the compiler.
-dumpmachine Display the compiler's target processor.
-print-search-dirs Display the directories in the compiler's search path.
-print-libgcc-file-name Display the name of the compiler's companion library.
-print-file-name=<lib> Display the full path to library <lib>.
-print-prog-name=<prog> Display the full path to compiler component <prog>.
-print-multiarch Display the target's normalized GNU triplet, used as
a component in the library path.
-print-multi-directory Display the root directory for versions of libgcc.
-print-multi-lib Display the mapping between command line options and
multiple library search directories.
-print-multi-os-directory Display the relative path to OS libraries.
-print-sysroot Display the target libraries directory.
-print-sysroot-headers-suffix Display the sysroot suffix used to find headers.
-Wa,<options> Pass comma-separated <options> on to the assembler.
-Wp,<options> Pass comma-separated <options> on to the preprocessor.
-Wl,<options> Pass comma-separated <options> on to the linker.
-Xassembler <arg> Pass <arg> on to the assembler.
-Xpreprocessor <arg> Pass <arg> on to the preprocessor.
-Xlinker <arg> Pass <arg> on to the linker.
-save-temps Do not delete intermediate files.
-save-temps=<arg> Do not delete intermediate files.
-no-canonical-prefixes Do not canonicalize paths when building relative
prefixes to other gcc components.
-pipe Use pipes rather than intermediate files.
-time Time the execution of each subprocess.
-specs=<file> Override built-in specs with the contents of <file>.
-std=<standard> Assume that the input sources are for <standard>.
--sysroot=<directory> Use <directory> as the root directory for headers
and libraries.
-B <directory> Add <directory> to the compiler's search paths.
-v Display the programs invoked by the compiler.
-### Like -v but options quoted and commands not executed.
-E Preprocess only; do not compile, assemble or link.
-S Compile only; do not assemble or link.
-c Compile and assemble, but do not link.
-o <file> Place the output into <file>.
-pie Create a position independent executable.
-shared Create a shared library.
-x <language> Specify the language of the following input files.
Permissible languages include: c c++ assembler none
'none' means revert to the default behavior of
guessing the language based on the file's extension.
Options starting with -g, -f, -m, -O, -W, or --param are automatically
passed on to the various sub-processes invoked by gcc. In order to pass
other options on to these processes the -W<letter> options must be used.
For bug reporting instructions, please see:
<http://gcc.gnu.org/bugs.html>.
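Common options:
Option | Meaning |
---|---|
-v | Print the GCC version and the commands GCC runs at each step |
-E | Preprocess only; do not compile, assemble, or link |
-S | Compile only; do not assemble or link |
-c | Compile and assemble; do not link |
-o <file> | Write the output to file |
-static | Link statically: use static libraries and do not link against shared libraries |
-shared | Produce a shared library instead of an executable. Preferring a shared library over a static library of the same name is the linker's default behavior and does not need this option |
-Ldir | Add dir to the library search path |
-lname | Link against the library named libname.a (static) or libname.so (shared). If both exist, the shared library is used by default; with -static the static library is used |
-fPIC | Generate position-independent code (PIC), which uses relative addressing |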
The following example (helloworld.c) walks through the GCC compilation process:
#include <stdio.h>
#define TRUE 1
#define FALSE 0
#define DEBUG_ENABLE
int main(void){
    int i = 0;
    if(i == TRUE){
        printf("hello\n");
    }else{
#ifdef DEBUG_ENABLE
        printf("i = %d\n",i);
#endif
        printf("hello world\n");
    }
    return 0;
}
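1. Preprocessing
gcc -E -o helloworld.i helloworld.c
Open helloworld.i (here in Sublime Text). The included file has been inserted into the source, the macros have been expanded, and the conditional compilation directives have selected the code to keep: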
__attribute__((__cdecl__)) __attribute__((__nothrow__)) int putw (int, FILE *);





# 2 "helloworld.c" 2







# 8 "helloworld.c"
int main(void){
    int i = 0;
    if(i == 1){
        printf("hello\n");
    }else{

        printf("i = %d\n",i);

        printf("hello world\n");
    }
    return 0;
}
2. Compilation
gcc -S -o helloworld.s helloworld.i
The generated assembly code (opened in Sublime Text):
.file "helloworld.c"
.def ___main; .scl 2; .type 32; .endef
.section .rdata,"dr"
LC0:
.ascii "hello\0"
LC1:
.ascii "i = %d\12\0"
LC2:
.ascii "hello world\0"
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB10:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $32, %esp
call ___main
movl $0, 28(%esp)
cmpl $1, 28(%esp)
jne L2
movl $LC0, (%esp)
call _puts
jmp L3
L2:
movl 28(%esp), %eax
movl %eax, 4(%esp)
movl $LC1, (%esp)
call _printf
movl $LC2, (%esp)
call _puts
L3:
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE10:
.ident "GCC: (MinGW.org GCC-6.3.0-1) 6.3.0"
.def _puts; .scl 2; .type 32; .endef
.def _printf; .scl 2; .type 32; .endef
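3. Assembly
gcc -c -o helloworld.o helloworld.s
The .o file is binary; it can be inspected with a hex editor such as WinHex.
4. Linking
gcc -o helloworld helloworld.o
This produces helloworld.exe. Running it (here from the console in Notepad++):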
helloworld
helloworld
Process started (PID=15044) >>>
i = 0
hello world
<<< Process finished (PID=15044). (Exit code 0)
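Unless one of the options "-E", "-S", or "-c" is used, or a compiler error stops the process early, the last step is always linking.
For example:
gcc helloworld.c
gcc -o helloworld helloworld.c
Both commands run all the way through linking. The first writes the executable under the default name (a.out, or a.exe with MinGW); the second names it helloworld.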
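Author: Mr_Bluyee
Link: https://www.jianshu.com/p/00ee0ec582a1
Source: Jianshu
Copyright belongs to the author. For commercial reposts, contact the author for authorization; for non-commercial reposts, credit the source.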
1. Source code
.text # section declaration
# we must export the entry point to the ELF linker or
.global _start # loader. They conventionally recognize _start as their
# entry point. Use ld -e foo to override the default.
_start:
# write our string to stdout
movl $len,%edx # third argument: message length
movl $msg,%ecx # second argument: pointer to message to write
movl $1,%ebx # first argument: file handle (stdout)
movl $4,%eax # system call number (sys_write)
int $0x80 # call kernel
# and exit
movl $0,%ebx # first argument: exit code
movl $1,%eax # system call number (sys_exit)
int $0x80 # call kernel
.data # section declaration
msg:
.ascii "Hello, world!\n" # our dear string
len = . - msg # length of our dear string
2. Assembling and linking
3. Disassemble with objdump -D hello