关于C++中的string的小知识点
这是GCC版本5.x的情况下的分析,在GCC版本4.x的情况下std::string的内存布局将不同。逆向C++的过程中经常遇到std::string,它在内存中的状态是什么样呢?我先简单地写了一个程序,里面用到了string,使用:clang++ -Xclang -fdump-record-layouts xxx.cpp 可以将xxx.cpp中涉及到的C++的对象的内存布局给打印出来,当然看源码也可以看得到,不过没这个直接。
*** Dumping AST Record Layout
0 | class __gnu_cxx::new_allocator<char> (empty)
| [sizeof=1, dsize=0, align=1,
| nvsize=0, nvalign=1]
*** Dumping AST Record Layout
0 | class std::allocator<char> (empty)
0 | class __gnu_cxx::new_allocator<char> (base) (empty)
| [sizeof=1, dsize=0, align=1,
| nvsize=1, nvalign=1]
*** Dumping AST Record Layout
0 | struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider
0 | class std::allocator<char> (base) (empty)
0 | class __gnu_cxx::new_allocator<char> (base) (empty)
0 | pointer _M_p
| [sizeof=8, dsize=8, align=8,
| nvsize=8, nvalign=8]
*** Dumping AST Record Layout
0 | union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/basic_string.h:119:7)
0 | char [16] _M_local_buf
0 | size_type _M_allocated_capacity
| [sizeof=16, dsize=16, align=8,
| nvsize=16, nvalign=8]
*** Dumping AST Record Layout
0 | class std::__cxx11::basic_string<char>
0 | struct std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::_Alloc_hider _M_dataplus
0 | class std::allocator<char> (base) (empty)
0 | class __gnu_cxx::new_allocator<char> (base) (empty)
0 | pointer _M_p
8 | size_type _M_string_length
16 | union std::__cxx11::basic_string<char, struct std::char_traits<char>, class std::allocator<char> >::(anonymous at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/basic_string.h:119:7)
16 | char [16] _M_local_buf
16 | size_type _M_allocated_capacity
| [sizeof=32, dsize=32, align=8,
| nvsize=32, nvalign=8]
class std::__cxx11::basic_string<char>就是我们提到的string,上面的如果仔细看的话,能看出string的结构就是:
+0 pointer _M_P
+8 size_type _M_string_length
+16 union {
char [16] _M_local_buf;
size_type _M_allocated_capacity;
};
当字符串的长度小于等于15时,union的变量就是_M_local_buf,而字符串的长度大于15的时候,union的变量就是_M_allocated_capacity;_M_P是字符串内容指针,_M_string_length是字符串的长度。在string的构造函数中,_M_P一般会指向+16的位置,随着string的长度变化,会动态地申请内存存放string的内容,这个时候_M_P将会指向新的位置,而+16的位置则存放当前申请到内存的大小,也就是string的capacity。所以在字符串析构函数中,如果_M_P不指向+16的位置,那么就要动态free(_M_P)。