Understanding C/C++ from the Perspective of Assembly and Memory

Author: ChrisZZ (https://cnblogs.com/zjutzz)
Created: 2024.04.01 09:00:00
Updated: 2024.04.03 17:15:00

0. Purpose
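1. Viewing function addresses
2. main and ordinary functions compile to the same assembly
3. Using a function other than main() as the program entry point
4. The assembly differs depending on whether "Link to binary" is checked
5. A for loop and its equivalent goto form
6. if/else and its equivalent goto form
7. What push rbp does
8. Inspecting memory in gdb
9. Suppressing the gdb welcome banner at startup
10. Understanding virtual memory
11. Viewing the address ranges available to a process
12. Understanding volatile at the assembly level
13. Modifying a volatile const / const variable fails
14. Checking the stack size on macOS
15. Stack growth direction, revisited
16. An array name is not a pointer
17. A function name is not a function pointer
18. How to use Compiler Explorer
- 18.1 Official wiki
- 18.2 Jumping from source code to the corresponding assembly
- 18.3 Jumping from assembly to the corresponding source code
- 18.4 Including an external file by URL
- 18.5 Creating a second editor/compiler
- 18.6 mq白's godbolt usage guide
19. 羽夏's blog series on understanding C
20. Generating assembly with the NDK
21. Using global variables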

0. Purpose

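While reading 《CPU眼里的C/C++》 ("C/C++ in the Eyes of the CPU"), I could not reproduce some of the book's examples. It uses x86-64 gcc (trunk), so the exact GCC version is unknown, and the author did not provide Compiler Explorer (Godbolt) links. These are my notes.

Compiler Explorer cannot debug a program. Debugging with GDB, and printing and comparing memory and register values before and after a breakpoint, gives a deeper understanding, so I also record some notes on using GDB.

I also watched videos by the Bilibili uploader mq白cpp that correct some common misconceptions, such as 数组名是指针？典 ("Array names are pointers? Classic"). It pays to hear more than one side, so I note those down too.

The ultimate goal is a deeper understanding of C/C++ programs, so that a crash can be taken apart as deftly as the cook Pao Ding carves up an ox.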

1. Viewing function addresses

Check "Link to binary".

https://godbolt.org/z/v5bGqEv48

2. main and ordinary functions compile to the same assembly

https://godbolt.org/z/fTW8dYYMP

int main()
{
    return 0;
}
 
int func()
{
    return 0;
}

 main:
        push    rbp
        mov     rbp, rsp
        mov     eax, 0
        pop     rbp
        ret
func():
        push    rbp
        mov     rbp, rsp
        mov     eax, 0
        pop     rbp
        ret

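3. Using a function other than main() as the program entry point

The book says that passing -nostartfiles -efunc makes func() the entry point, and that calling exit() at the end of func() avoids the segfault.

https://godbolt.org/z/qqr1qYE1v

In my test it still segfaulted, even with the exit() call.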

4. The assembly differs depending on whether `Link to binary` is checked

Test code:

int a;
 
void write()
{
    a = 1;
}

Without "Link to binary" checked:

https://godbolt.org/z/ETKb18KG7

a:
        .zero   4
write:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR a[rip], 1
        nop
        pop     rbp
        ret

With "Link to binary" checked:

https://godbolt.org/z/fx799vz46

write:
 push   rbp
 mov    rbp,rsp
 mov    DWORD PTR [rip+0x2f00],0x1        # 404014 <a>
 nop
 pop    rbp
 ret
main:
 push   rbp
 mov    rbp,rsp
 mov    eax,0x0
 pop    rbp
 ret

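Difference

The symbol a is gone, and a concrete memory address is used instead. In essence a variable name stands for a memory address; when the executable is produced (which includes linking), the symbol is resolved to a concrete address.

Verification

https://godbolt.org/z/YK7EsYGGr

A main function is added that prints the address of a.

...
int main()
{
    printf("addr of a is %p\n", (void*)&a);
    return 0;
}

The printed address of a matches the hand calculation rip+0x2ee8 = 0x401134 + 0x2ee8; both give 0x40401c. Note that in RIP-relative addressing, rip is the address of the next instruction, not of the current one.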
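The hand calculation above can be written down as a small helper. This is only a sketch: rip_relative_target is a name made up for this note, and it assumes that the rip value used is the address of the instruction following the mov.

```cpp
#include <cstdint>

// RIP-relative addressing: effective address = address of the NEXT instruction
// plus the signed 32-bit displacement encoded in the instruction.
// (rip_relative_target is a hypothetical helper, not part of any toolchain.)
constexpr std::uint64_t rip_relative_target(std::uint64_t next_instruction_addr,
                                            std::int32_t displacement)
{
    // Unsigned arithmetic wraps, so negative displacements work as well.
    return next_instruction_addr + static_cast<std::uint64_t>(
                                       static_cast<std::int64_t>(displacement));
}

// Values from the note above: rip = 0x401134, displacement = 0x2ee8.
static_assert(rip_relative_target(0x401134, 0x2ee8) == 0x40401c,
              "matches the address printed for &a");
```

The static_assert makes the compiler itself confirm the arithmetic, so no program has to be run.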

5. A for loop and its equivalent goto form

First the code: func1() is an ordinary for loop, and func2() is the equivalent written with goto.

#include <stdio.h>
 
void func1()
{
    int i = 0;
    for (; i < 10; i++)
    {
        printf("hello %d\n", i);
    }
}
 
void func2()
{
    int i = 0;
    goto L7;
L8:
    printf("hello %d\n", i);
    i++;
L7:
    if (i < 10)
    {
        goto L8;
    }
}

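Looking at the assembly, func1() and func2() are practically identical; the differences are negligible (different label names and an extra nop).

https://godbolt.org/z/q7T9n93Mb

Explanation:

void func2()
{
    int i = 0; // code before the for loop, including the part before the first semicolon of the for statement
    goto L7;  // jump to the condition check, i.e. the part between the two semicolons
L8:
    printf("hello %d\n", i); // the body of the for loop
    i++;  // the part between the second semicolon and `)` goes here
L7:
    if (i < 10) // the condition between the two semicolons
    {
        goto L8; // if the condition holds, run the loop body
    }
}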
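To check the equivalence by running code rather than by reading assembly, here is a sketch of the same for/goto pair, changed so that each function returns a value. sum_with_for and sum_with_goto are names invented for this note; they are not from the book.

```cpp
// Both functions compute 0 + 1 + ... + (n - 1).
int sum_with_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
    {
        total += i;
    }
    return total;
}

int sum_with_goto(int n)
{
    int total = 0;
    int i = 0;      // init part of the for statement
    goto check;     // test the condition before the first iteration
body:
    total += i;     // loop body
    i++;            // increment part
check:
    if (i < n)      // condition part
    {
        goto body;
    }
    return total;
}
```

Calling both with the same argument, including 0 so that the body never runs, should give the same result.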

6. if/else and its equivalent goto form

First the C++ code:

int func3(int x)
{
    if (x > 1)
    {
        return 1;
    }
    else
    {
        return 0;
    }
}
 
int func4(int x)
{
    if (x <= 1)
    {
        goto L9;
    }
    return 1;
L9:
    return 0;
}

https://godbolt.org/z/8dhe5o7d9

Comparing the assembly of func3() and func4(), only the label names differ; everything else is the same.

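7. What push rbp does

On x86-64, a function that sets up a frame pointer (for example, GCC output at -O0) starts with push rbp. rbp is a 64-bit register, i.e. 64/8 = 8 bytes.

gdb's x command examines memory, so it can show how stack memory changes before and after push rbp. The syntax of x:

x/<n/f/u> <addr>

x: short for examine
<addr>: the address to examine. Using a stack address shows where push rbp stored the value of the rbp register.
n: the number of memory units to display. The unit size is given by u.
f: the format used to display each unit
- we use x, meaning hexadecimal
u: the size of each unit
- we use g (giant word), meaning 8 bytes, because registers such as rsp and rbp are 8 bytes wide

Use x/6xg to examine the memory around rsp:

(gdb) x/6xg  $rsp-0x10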
0x7fffffffd7f0: 0x00007fffffffdc09      0x0000000000000064
0x7fffffffd800: 0x00007fffffffd810      0x0000555555555156
0x7fffffffd810: 0x0000000000000001      0x00007ffff7da8d90

Start debugging:

b main
r
ni
si
ni
i reg # info register
# note the value of rbp at this point, e.g. 0x7fffffffd850

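Changes to the rsp register:

- The value of rsp decreased by 8 (0x...d848 -> 0x...d840)
- The content stored at the address rsp points to changed

register	old value (content at that address)	new value (content at that address)
rsp	0x7fffffffd848 (0x0000555555555156)	0x7fffffffd840 (0x00007fffffffd850)
rbp	0x7fffffffd850	0x7fffffffd850

The new content, 0x00007fffffffd850, is exactly the value of rbp: push rbp first subtracts 8 from rsp and then stores rbp at the new top of the stack.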
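The effect of push on rsp and on stack memory can be modeled in a few lines. This is only a toy sketch: ToyStack, its 64-byte buffer, and the member names are made up for illustration, and rsp here is an offset into a buffer rather than a real register.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model of a downward-growing stack (hypothetical type, for illustration).
struct ToyStack
{
    unsigned char memory[64] = {};
    std::size_t rsp = sizeof(memory);   // an empty stack starts at the high end

    // push: move rsp down by 8 first, then store the value at the new rsp.
    // No bounds check: the sketch assumes at most 8 pushes.
    void push(std::uint64_t value)
    {
        rsp -= sizeof(value);
        std::memcpy(memory + rsp, &value, sizeof(value));
    }

    // Read back the 8 bytes that rsp currently points at.
    std::uint64_t top() const
    {
        std::uint64_t value = 0;
        std::memcpy(&value, memory + rsp, sizeof(value));
        return value;
    }
};
```

Pushing the rbp value from the table and then reading top() mirrors what the two gdb snapshots show: rsp is 8 lower, and the memory it points to holds the pushed value.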

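8. Inspecting memory in gdb

gdb's x command can examine memory, but the output feels awkward. Books and blog posts usually draw memory one byte per line, whereas gdb's x prints 4-byte words by default.

Showing 4 bytes as a single value avoids misreadings caused by endianness, but in more than ten years of coding I have never come across a big-endian machine. So let's just assume little-endian and show one byte per line!

A gdb configuration script can do this. It defines a custom xbytes command, used like xbytes 4 &a. Put the following in ~/.gdbinit:

#----------------------------------------------------------------------
# => define xbytes command, to view memory bytes, one byte per line
# usage example:
# (gdb) xbytes 4 &a
# 0x7fffffffd848: 0x78
# 0x7fffffffd849: 0x56
# 0x7fffffffd84a: 0x34
# 0x7fffffffd84b: 0x12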
#----------------------------------------------------------------------
python
import gdb
 
class XBytes(gdb.Command):
    """xbytes NUM_BYTES ADDRESS
    Display NUM_BYTES bytes starting at ADDRESS, one per line, with each address in blue."""
 
    def __init__(self):
        super(XBytes, self).__init__("xbytes", gdb.COMMAND_DATA, gdb.COMPLETE_SYMBOL)
 
    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        if len(argv) != 2:
            raise gdb.GdbError("xbytes requires 2 arguments: NUM_BYTES and ADDRESS")
 
        num_bytes = int(argv[0])
        address = gdb.parse_and_eval(argv[1])
        address = int(address.cast(gdb.lookup_type('void').pointer()))
 
        # ANSI escape code for blue color and reset
        blue = '\033[34m'
        reset = '\033[0m'
 
        # Read and display the specified number of bytes
        for addr in range(address, address + num_bytes):
            byte = gdb.selected_inferior().read_memory(addr, 1)
            byte_value = int.from_bytes(byte, byteorder='little')
            # Print address in blue
            print(f"{blue}0x{addr:x}:{reset} 0x{byte_value:02x}")
 
# Register the command
XBytes()
end

To test it, use the following test.cpp:

#include <stdio.h>
int main()
{
    int a = 0x12345678;
    int b = 0x0001;
    int c = 0xabcdef;
    printf("a = %d\n", a);
    printf("b = %d\n", b);
    return 0;
}

Compile:

g++ test.cpp -O0 -g

Debug:

gdb a.out
b main
r
n
n
n
(gdb) xbytes 4 &a
0x7fffffffd844: 0x78
0x7fffffffd845: 0x56
0x7fffffffd846: 0x34
0x7fffffffd847: 0x12
(gdb) xbytes 4 &b
0x7fffffffd848: 0x01
0x7fffffffd849: 0x00
0x7fffffffd84a: 0x00
0x7fffffffd84b: 0x00
(gdb) xbytes 4 &c
0x7fffffffd84c: 0xef
0x7fffffffd84d: 0xcd
0x7fffffffd84e: 0xab
0x7fffffffd84f: 0x00
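The byte order seen in the xbytes output above can also be checked from inside a program, without a debugger. A minimal sketch follows; bytes_in_memory and is_little_endian are names invented here, not standard functions.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Copy the object representation of a 32-bit value, lowest address first.
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value)
{
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(value));
    return bytes;
}

// True when the least significant byte sits at the lowest address.
bool is_little_endian()
{
    return bytes_in_memory(0x12345678u)[0] == 0x78;
}
```

On a little-endian machine such as x86-64, bytes_in_memory(0x12345678) should come back as 78 56 34 12, the same order xbytes printed for a.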

9. Suppressing the gdb welcome banner at startup

Either patch the gdb source or create a command alias (recommended, and simpler). Run vim ~/.aliasrc and add:

alias gdb='gdb -q'

https://stackoverflow.com/questions/34199640/how-to-specify-silent-quiet-in-gdbinit

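10. Understanding virtual memory

https://godbolt.org/z/YGvM3e66j

Two nearly identical programs, compiled and run separately. The global variable a has the same address in both, but holds different values. Why is &a the same? Because of virtual memory: each process has its own virtual address space, so the same virtual address in two processes refers to different physical memory.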
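The same effect can be seen inside a single program on a POSIX system. Note that this sketch swaps in a different technique from the Godbolt link: instead of two separately built programs it uses fork(), so the address is identical by construction, and the point is only that the two processes hold different values at that address. The names counter_like_global, Report, and same_address_different_value are made up for this note.

```cpp
#include <cstdint>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int counter_like_global = 1;

struct Report
{
    std::uintptr_t address;
    int value;
};

// Returns true when parent and child report the same virtual address for the
// global but different values stored there.
bool same_address_different_value()
{
    int fds[2];
    if (pipe(fds) != 0) return false;

    const pid_t pid = fork();
    if (pid < 0) return false;

    if (pid == 0)
    {
        // Child: modify its own copy of the global and report back.
        counter_like_global = 2;
        const Report r{reinterpret_cast<std::uintptr_t>(&counter_like_global),
                       counter_like_global};
        const ssize_t written = write(fds[1], &r, sizeof(r));
        _exit(written == static_cast<ssize_t>(sizeof(r)) ? 0 : 1);
    }

    // Parent: read the child's report, then compare with its own view.
    close(fds[1]);
    Report child{};
    const ssize_t got = read(fds[0], &child, sizeof(child));
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    if (got != static_cast<ssize_t>(sizeof(child))) return false;

    const Report parent{reinterpret_cast<std::uintptr_t>(&counter_like_global),
                        counter_like_global};
    return child.address == parent.address &&
           child.value == 2 && parent.value == 1;
}
```

The child's write to the global never shows up in the parent, even though both use the same address.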

11. Viewing the address ranges available to a process

The test code is test2.c:

long test()
{
    long a = 1;
    a += 2;
    return a;
}
 
int main()
{
    test();
    return 0;
}

Memory is accessible starting at address 0x7fffff7ff000:

(gdb) xbytes 1 0x7fffff7ff000
0x7fffff7ff000: 0x00

One byte lower, it no longer is:

(gdb) xbytes 1 0x7fffff7ff000-1
Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x7fffff7fefff
Error occurred in Python: Cannot access memory at address 0x7fffff7fefff

info proc mappings lists the mapped address ranges; 0x7fffff7ff000 is the start of the [stack] mapping in the last row:

(gdb) info proc mappings
process 2606573
Mapped address spaces:
 
          Start Addr           End Addr       Size     Offset  Perms  objfile
      0x555555554000     0x555555555000     0x1000        0x0  r--p   /home/zz/dbg/a.out
      0x555555555000     0x555555556000     0x1000     0x1000  r-xp   /home/zz/dbg/a.out
      0x555555556000     0x555555557000     0x1000     0x2000  r--p   /home/zz/dbg/a.out
      0x555555557000     0x555555558000     0x1000     0x2000  r--p   /home/zz/dbg/a.out
      0x555555558000     0x555555559000     0x1000     0x3000  rw-p   /home/zz/dbg/a.out
      0x7ffff7d7b000     0x7ffff7d7e000     0x3000        0x0  rw-p   
      0x7ffff7d7e000     0x7ffff7da6000    0x28000        0x0  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7da6000     0x7ffff7f3b000   0x195000    0x28000  r-xp   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f3b000     0x7ffff7f93000    0x58000   0x1bd000  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f93000     0x7ffff7f94000     0x1000   0x215000  ---p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f94000     0x7ffff7f98000     0x4000   0x215000  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f98000     0x7ffff7f9a000     0x2000   0x219000  rw-p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f9a000     0x7ffff7fa7000     0xd000        0x0  rw-p   
      0x7ffff7fbb000     0x7ffff7fbd000     0x2000        0x0  rw-p   
      0x7ffff7fbd000     0x7ffff7fc1000     0x4000        0x0  r--p   [vvar]
      0x7ffff7fc1000     0x7ffff7fc3000     0x2000        0x0  r-xp   [vdso]
      0x7ffff7fc3000     0x7ffff7fc5000     0x2000        0x0  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fc5000     0x7ffff7fef000    0x2a000     0x2000  r-xp   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fef000     0x7ffff7ffa000     0xb000    0x2c000  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffb000     0x7ffff7ffd000     0x2000    0x37000  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffd000     0x7ffff7fff000     0x2000    0x39000  rw-p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7fffff7ff000     0x7ffffffff000   0x800000        0x0  rw-p   [stack]
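On Linux, a process can read the same kind of table about itself from /proc/self/maps. The sketch below looks up which mapping contains a given address; Mapping and find_mapping are names invented for this note, and the parsing assumes the usual "start-end perms ..." line layout.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>

// One row of /proc/self/maps, reduced to the fields used here.
struct Mapping
{
    std::uintptr_t start;
    std::uintptr_t end;
    std::string perms;   // e.g. "rw-p"
    bool found;
};

// Find the mapping that contains `address` (Linux only).
Mapping find_mapping(const void* address)
{
    const auto target = reinterpret_cast<std::uintptr_t>(address);
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line))
    {
        // Each line starts with "start-end perms", addresses in hex.
        std::istringstream fields(line);
        std::string range, perms;
        if (!(fields >> range >> perms)) continue;
        const auto dash = range.find('-');
        if (dash == std::string::npos) continue;
        const std::uintptr_t start = std::stoull(range.substr(0, dash), nullptr, 16);
        const std::uintptr_t end = std::stoull(range.substr(dash + 1), nullptr, 16);
        if (target >= start && target < end)
        {
            return Mapping{start, end, perms, true};
        }
    }
    return Mapping{0, 0, "", false};
}
```

Passing the address of a local variable should land in a readable and writable mapping, like the [stack] row above.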

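12. Understanding `volatile` at the assembly level

https://godbolt.org/z/c1K4e8T1K

volatile tells the compiler not to "touch" accesses to the variable: it must not optimize them away, merge them, or keep the value cached in a register. Every read and write written in the source has to be carried out as an actual memory access.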
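A small pair of functions makes the difference easy to see when pasted into Compiler Explorer with optimization enabled. This is a sketch with made-up names (sum_plain, sum_volatile); the return values are identical, and what differs is how many loads the compiler is allowed to emit.

```cpp
// Plain pointer: with optimization on, the compiler may load *p once
// and reuse the value for the second read.
int sum_plain(const int* p)
{
    const int first = *p;
    const int second = *p;
    return first + second;
}

// Pointer to volatile: each read is an observable access, so both loads
// must appear in the generated code.
int sum_volatile(const volatile int* p)
{
    const int first = *p;
    const int second = *p;
    return first + second;
}
```

The two reads are kept in separate statements on purpose, so that the volatile accesses are clearly sequenced one after the other.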

13. Modifying a volatile const / const variable fails

https://godbolt.org/z/cE3Mv6a85

#include <stdio.h>
 
volatile const int a = 1;
int b = 1;
 
int func1()
{
    *(int*)&a = 1;
    return a;
}
 
int func2()
{
    b = 1;
    return b;
}

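Modifying a leads to a segfault (no crash with GCC 11.4.0; crashes with clang 14.0 and with GCC 13.2). Writing to a const object is undefined behavior, which is why the outcome varies between compilers.

Explanation:

The address of the constant a, &a, is 0x555555556004, which falls in the range 0x555555556000 ~ 0x555555557000. That range is read-only (r--p), so modifying a, i.e. writing to it, triggers the segfault.
The address of the variable b, &b, falls in a readable and writable range (rw-p, where w means writable), so modifying it does not crash:

(gdb) xbytes 1 &a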
0x555555556004: 0x01
(gdb) xbytes 1 &b
0x555555558028: 0x01
(gdb) info proc mappings 
      ...
      0x555555556000     0x555555557000     0x1000     0x2000  r--p   /home/zz/dbg/a.out
      0x555555558000     0x555555559000     0x1000     0x3000  rw-p   /home/zz/dbg/a.out

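14. Checking the stack size on macOS

Earlier, on Linux, I used gdb's info proc mappings command to see the stack size.

macOS does not ship gdb by default; it uses LLDB, which has no direct equivalent of info proc mappings.

How to check: start the program under lldb to get its PID, then open another terminal and query the stack size with vmmap:

clang test.c -g -O0
lldb ./a.out
b main
r
process status # shows that the PID of a.out is 73167

The PID can be double-checked in Activity Monitor.

Then inspect the memory layout with vmmap:

vmmap 73167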

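15. Stack growth direction, revisited

How stack memory grows:

- Across function calls: the stack grows from high addresses toward low addresses; a newly called function's frame sits at a lower address.
- Within a function: in the two runs below, local variables were laid out from low to high, with later-declared locals at higher addresses. This layout is chosen by the compiler and is not guaranteed.

Test code:

#include <stdio.h>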
#include <stdlib.h>
 
 
void functionB(int b) {
    int localB = b;
    printf("Address of localB in functionB: %p\n", (void*)&localB);
}
 
void functionA(int a) {
    int b = 2;
    // int* c = (int*)malloc(4);
    // *c = 0x11223344;
    int d = 0x12345678;
    int e = 0x11223344;
 
    printf("Address of b: %p\n", (void*)&b);
    printf("Address of d: %p\n", (void*)&d);
    printf("Address of e: %p\n", (void*)&e);
 
    int localA = a;
    printf("Address of localA in functionA: %p\n", (void*)&localA);
    functionB(a + 1);
}
 
int main() {
    int localMain = 1;
    printf("Address of localMain in main: %p\n", (void*)&localMain);
    functionA(localMain + 1);
    return 0;
}

Output on vs2022-x64:

Address of localMain in main: 000000E2D74FFDB4
Address of b: 000000E2D74FFD04
Address of d: 000000E2D74FFD24
Address of e: 000000E2D74FFD44
Address of localA in functionA: 000000E2D74FFD64
Address of localB in functionB: 000000E2D74FFCB4

Output on linux-x64:

Address of localMain in main: 0x7ffdfee15714
Address of b: 0x7ffdfee156e8
Address of d: 0x7ffdfee156ec
Address of e: 0x7ffdfee156f0
Address of localA in functionA: 0x7ffdfee156f4
Address of localB in functionB: 0x7ffdfee156b4
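The call-direction half of this can also be turned into a check that returns true or false instead of printing addresses. A sketch, with assumptions stated: __attribute__((noinline)) is a GCC/Clang extension used to keep the callee in its own frame, the function names are invented here, and the expected result (true) assumes a platform whose stack grows downward, such as x86-64.

```cpp
#include <cstdint>

// Returns the address of a local in the callee's frame, as an integer.
// noinline (GCC/Clang attribute) prevents the frame from being merged
// into the caller's.
__attribute__((noinline)) std::uintptr_t address_of_callee_local()
{
    volatile int callee_local = 0;
    return reinterpret_cast<std::uintptr_t>(&callee_local);
}

// True when the callee's local lives at a lower address than the caller's.
bool callee_frame_is_lower()
{
    volatile int caller_local = 0;
    const auto caller_addr = reinterpret_cast<std::uintptr_t>(&caller_local);
    const auto callee_addr = address_of_callee_local();
    return callee_addr < caller_addr;
}
```

The addresses are compared as integers because relational comparison of pointers to unrelated objects is not well specified in C++.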

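16. An array name is not a pointer

数组名是指针？典 ("Array names are pointers? Classic") - mq白cpp

https://godbolt.org/z/EjsMraxxY

An array name is not a pointer. When an array name is used as a pointer, a type conversion takes place at the C++ level. Whether the generated assembly contains any instructions for that conversion is an implementation matter: it may emit some, or it may simply use the address directly. The Compiler Explorer link above covers both cases.

typeid(T).name() can be used to obtain and print the types of the array and of the pointer; they are different:

    int arr[10]{};
    using T = decltype(arr); // with decltype(+arr), the `+` performs the array-to-pointer conversion
    // print T
    fmt::print("{}\n", typeid(T).name());

Here the + in +arr is unary plus, which performs the array-to-pointer conversion. The type of arr prints as A10_i, and after conversion to a pointer it prints as Pi (these are Itanium ABI mangled names, as produced by GCC and Clang).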
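The same distinction can be checked at compile time with <type_traits>, without relying on mangled names. A minimal sketch; the function name is made up for this note.

```cpp
#include <type_traits>

// Same address, different types: that is the whole point.
bool array_and_pointer_differ_only_in_type()
{
    int arr[10]{};
    int* p = arr;   // implicit array-to-pointer conversion

    static_assert(std::is_array<decltype(arr)>::value, "arr has type int[10]");
    static_assert(!std::is_pointer<decltype(arr)>::value, "arr is not a pointer");
    static_assert(std::is_same<decltype(+arr), int*>::value,
                  "unary plus forces the array-to-pointer conversion");
    static_assert(std::is_same<decltype(&arr), int (*)[10]>::value,
                  "&arr points to the whole array");
    static_assert(sizeof(arr) == 10 * sizeof(int), "sizeof sees the whole array");
    static_assert(sizeof(p) == sizeof(int*), "sizeof(p) is just a pointer size");

    // The converted pointer and &arr hold the same address.
    return static_cast<void*>(&arr) == static_cast<void*>(p);
}
```

If arr really were a pointer, the first two static_asserts would fail to compile.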

17. A function name is not a function pointer

https://godbolt.org/z/xPEjo874W

#include <iostream>
#include <typeinfo>
#include <fmt/core.h>
 
void f(){}
 
int main()
{
    int arr[10]{};
    using T = decltype(arr);
    // print T
    fmt::print("{}\n", typeid(T).name());
 
    int* p = arr;
    // print p
    fmt::print("{}\n", typeid(p).name());
 
    using T2 = decltype(&f);
    fmt::print("{}\n", typeid(T2).name());
    using T3 = decltype(f);
    fmt::print("{}\n", typeid(T3).name());
    fmt::print("{}\n", sizeof(+f));
 
    return 0;
}

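This is similar to the previous section. When a function name is used as a function pointer, what happens at the C++ conceptual level is an implicit function-to-pointer conversion:

Different types: f and &f do not have the same type. decltype(f) prints as FvvE, while decltype(&f) prints as PFvvE.

Applying unary plus to f performs the function-to-pointer conversion and yields a pointer type, so its sizeof() can be printed; sizeof(f), by contrast, is ill-formed and does not compile.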
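As with arrays, <type_traits> can confirm this at compile time. A minimal sketch, reusing the empty function f from the example above; function_name_converts_to_pointer is a name invented here.

```cpp
#include <type_traits>

void f() {}

static_assert(std::is_function<decltype(f)>::value, "f has function type void()");
static_assert(!std::is_pointer<decltype(f)>::value, "f itself is not a pointer");
static_assert(std::is_same<decltype(&f), void (*)()>::value,
              "&f is a pointer to function");
static_assert(std::is_same<decltype(+f), void (*)()>::value,
              "unary plus forces the function-to-pointer conversion");
static_assert(sizeof(+f) == sizeof(void (*)()), "sizeof works on the pointer");

// The implicit conversion yields the same pointer as taking the address.
bool function_name_converts_to_pointer()
{
    void (*p)() = f;   // implicit function-to-pointer conversion
    return p == &f;
}
```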

18. How to use Compiler Explorer

18.1 Official wiki

https://github.com/compiler-explorer/compiler-explorer/wiki/

18.2 Jumping from source code to the corresponding assembly

In the source editor, right-click and choose "Reveal linked code".

18.3 Jumping from assembly to the corresponding source code

In the assembly pane, right-click and choose "Scroll to source".

https://godbolt.org/z/1roTc3be1

18.4 Including an external file by URL

https://godbolt.org/z/Pv0K0c

#include <https://raw.githubusercontent.com/hanickadot/compile-time-regular-expressions/master/single-header/ctre.hpp>
#include <string_view>
 
constexpr auto match(std::string_view sv) noexcept {
	return ctre::match<"h.*">(sv);
}

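Very cool.

18.5 Creating a second editor/compiler

By default Compiler Explorer shows only one editor. You may want a second one in scenarios such as:

- Two pieces of code that differ only slightly, where you want to compare the resulting assembly or program output
- The same code, where you want to compare the results under different compilers

18.6 mq白's godbolt usage guide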

https://mq-b.github.io/Loser-HomeWork/src/卢瑟日经/godbolt使用文档

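19. 羽夏's blog series on understanding C

合集-羽夏看C语言 (Collection: 羽夏 Looks at the C Language)

It approaches the subject by relating C/C++ and assembly to each other.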

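20. Generating assembly with the NDK

Pass -save-temps through CMAKE_CXX_FLAGS so the compiler keeps its intermediate files, including the generated assembly (.s):

-DCMAKE_CXX_FLAGS="-save-temps"

e.g.

cmake ^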
    -S . ^
    -B build-android ^
    -D CMAKE_BUILD_TYPE=Release ^
    -P ./zzbuild.cmake ^
    -p android ^
    -a arm64 ^
    -r D:\soft\android-ndk\r21e ^
    -G Ninja ^
    -DCMAKE_CXX_FLAGS="-save-temps"
 
cmake --build build-android

ref: https://groups.google.com/g/android-ndk/c/AVRKyKNuQtk

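21. Using global variables

I have long avoided global variables, which to some extent kept me away from a bad design smell. But when facing throwaway code in an OEM project whose design cannot be changed, it helps to know at least a little about how to use them.

testbed.cpp:

#include <string>

std::string binPath; // definition: exactly one source file defines the global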
 
int main()
{
    for (int frameCnt=0; frameCnt<1000; frameCnt++)
    {
        binPath = std::to_string(map_array[frameCnt]); // map_array is defined elsewhere (not shown)
    }
}

fake_vpt.cpp:

#include <cstdio>
#include <string>

extern std::string binPath; // declaration only: the definition lives in testbed.cpp
 
void vpt_run()
{
    printf("binPath is: %s\n", binPath.c_str());
}
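To keep the declaration and the definition from drifting apart, one common arrangement is to put the extern declaration in a header that both files include. The sketch below squeezes that layout into a single block so it can be compiled as is; the header name globals.h and the function describe_bin_path are hypothetical and not part of the project above.

```cpp
#include <string>

// ---- would live in a shared header, e.g. a hypothetical globals.h ----
extern std::string binPath;   // declaration: no storage is allocated here

// ---- consumer side (fake_vpt.cpp in the example above) ----
// Only the declaration is needed to use the variable.
std::string describe_bin_path()
{
    return "binPath is: " + binPath;
}

// ---- owner side (testbed.cpp in the example above) ----
std::string binPath;          // the single definition, where the storage lives
```

Assigning to binPath on the owner side and then calling describe_bin_path() shows that both sides refer to the same object.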

Posted 2024-03-29 09:38 by ChrisZZ
