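typeof, offsetof, and container_of
To understand the doubly linked circular list implemented in Linux (an "intrusive" list), you first need to understand the container_of macro. This article starts from the gcc keyword typeof and the offsetof macro and works step by step toward how container_of is implemented.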
1. typeof (from: https://en.wikipedia.org/wiki/Typeof)
typeof is an operator provided by several programming languages to determine the data type of a variable. This is useful when constructing programs that must accept multiple types of data without explicitly specifying the type. The GNU compiler (GCC) extensions for the C programming language provide typeof:

    #define max(a, b) \
        ({ typeof (a) _a = (a); \
           typeof (b) _b = (b); \
           _a > _b ? _a : _b; })
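Like sizeof, typeof is a keyword. Unlike sizeof, it was not part of standard C before C23; it is an extension keyword that gcc supports. typeof yields the data type of its operand, typically a variable. For example:

    unsigned int a = 1;  // typeof (a) is unsigned int
    short b = 2;         // typeof (b) is short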
2. offsetof (from: include/linux/stddef.h)
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
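The offsetof macro yields the offset of a member (MEMBER) within the structure that contains it (TYPE). The definition above is quite clever (readers familiar with assembly will probably recognize the idea at once). Step by step:

    (1) P = (TYPE *)0    // treat address 0x0 as the start address of a structure X of type TYPE
    (2) M = P->MEMBER    // access member MEMBER of X
    (3) A = &M           // take the memory address of member MEMBER of X
    (4) O = (size_t)A    // convert the address of MEMBER into an offset
          = (size_t)&M   // X starts at address 0 and an address is an unsigned integer,
                         // so the offset of MEMBER from the start of X equals &M
          = (size_t)&(P->MEMBER)
          = (size_t)&P->MEMBER    // note: -> has higher precedence than &
          = (size_t)&(P)->MEMBER
          = (size_t)&((TYPE *)0)->MEMBER

To make this easier to read, the offsetof macro can be written with a few extra pairs of parentheses: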
#define offsetof(TYPE, MEMBER) ((size_t)(&(((TYPE *)0)->MEMBER)))
The diagram below illustrates this. (The diagram is the author's original work; please credit the source if you reproduce it.)
3. container_of (from: include/linux/kernel.h)
     1 /**
     2  * container_of - cast a member of a structure out to the containing structure
     3  * @ptr:        the pointer to the member.
     4  * @type:       the type of the container struct this is embedded in.
     5  * @member:     the name of the member within the struct.
     6  *
     7  */
     8 #define container_of(ptr, type, member) ({                      \
     9         const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
    10         (type *)( (char *)__mptr - offsetof(type,member) );})
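- Line 9: a temporary variable __mptr holds the member pointer ptr. Because __mptr is declared with typeof, the compiler can also check that ptr really points to something of the member's type.
- Line 10: offsetof(type, member) computes the offset of the member within its containing structure; call it OFFSET.
- Line 10: (char *)__mptr - OFFSET is the start address of the containing structure. (Note the cast to char *: it makes the subtraction count in bytes.)
So container_of takes the memory address of a member and works back to the start address of the structure variable that contains it. The diagram below illustrates this. (The diagram is the author's original work; please credit the source if you reproduce it.)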
Example code: foo.c
     1 #include <stdio.h>
     2
     3 #define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
     4
     5 #define container_of(ptr, type, member) ({                      \
     6         const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
     7         (type *)( (char *)__mptr - offsetof(type,member) );})
     8
     9 typedef struct foo_s {
    10         int             m_int;
    11         short           m_short;
    12         char            m_char;
    13         long long       m_longlong;
    14 } foo_t;
    15
    16 int
    17 main(int argc, char *argv[])
    18 {
    19         foo_t ox = {0x12345678, 0x1234, 'A', 0xfedcba9876543210};
    20         char *p3 = &ox.m_char;
    21         foo_t *p = container_of(p3, foo_t, m_char);
    22
    23         printf("foo_t ox (%p) sizeof(foo_t) = %zu\n", (void *)&ox, sizeof (foo_t));
    24         printf("foo_t *p (%p) p3(%p)\n", (void *)p, (void *)p3);
    25         printf("foo_t->m_int = %#x\t\t(%zu)(%p)\n",
    26             p->m_int, offsetof(foo_t, m_int), (void *)&ox.m_int);
    27         printf("foo_t->m_short = %#x\t\t(%zu)(%p)\n",
    28             p->m_short, offsetof(foo_t, m_short), (void *)&ox.m_short);
    29         printf("foo_t->m_char = %c\t\t\t(%zu)(%p)\n",
    30             p->m_char, offsetof(foo_t, m_char), (void *)&ox.m_char);
    31         printf("foo_t->m_longlong = %#llx\t(%zu)(%p)\n",
    32             p->m_longlong, offsetof(foo_t, m_longlong), (void *)&ox.m_longlong);
    33
    34         return 0;
    35 }
Compile and run

    $ gcc -g -Wall -m32 -o foo foo.c
    $ ./foo
    foo_t ox (0xbfdfe990) sizeof(foo_t) = 16
    foo_t *p (0xbfdfe990) p3(0xbfdfe996)
    foo_t->m_int = 0x12345678               (0)(0xbfdfe990)
    foo_t->m_short = 0x1234                 (4)(0xbfdfe994)
    foo_t->m_char = A                       (6)(0xbfdfe996)
    foo_t->m_longlong = 0xfedcba9876543210  (8)(0xbfdfe998)
Disassemble, and compare with the output of gcc -E foo.c

    (gdb) set disassembly-flavor intel
    (gdb) disas /m main
    Dump of assembler code for function main:
    18      {
       0x0804841d <+0>:     push   ebp
       0x0804841e <+1>:     mov    ebp,esp
       0x08048420 <+3>:     and    esp,0xfffffff0
       0x08048423 <+6>:     sub    esp,0x40

    19              foo_t ox = {0x12345678, 0x1234, 'A', 0xfedcba9876543210};
       0x08048426 <+9>:     mov    DWORD PTR [esp+0x30],0x12345678
       0x0804842e <+17>:    mov    WORD PTR [esp+0x34],0x1234
       0x08048435 <+24>:    mov    BYTE PTR [esp+0x36],0x41
       0x0804843a <+29>:    mov    DWORD PTR [esp+0x38],0x76543210
       0x08048442 <+37>:    mov    DWORD PTR [esp+0x3c],0xfedcba98

    20              char *p3 = &ox.m_char;
       0x0804844a <+45>:    lea    eax,[esp+0x30]
       0x0804844e <+49>:    add    eax,0x6
       0x08048451 <+52>:    mov    DWORD PTR [esp+0x24],eax

    21              foo_t *p = container_of(p3, foo_t, m_char);
       0x08048455 <+56>:    mov    eax,DWORD PTR [esp+0x24]
       0x08048459 <+60>:    mov    DWORD PTR [esp+0x28],eax
       0x0804845d <+64>:    mov    eax,DWORD PTR [esp+0x28]
       0x08048461 <+68>:    sub    eax,0x6
       0x08048464 <+71>:    mov    DWORD PTR [esp+0x2c],eax
    #
    # --- line 21 of foo.c as expanded by "gcc -E foo.c" ---
    # 001 foo_t *p = ({
    # 002     const typeof( ((foo_t *)0)->m_char ) *__mptr = (p3);
    # 003     (foo_t *)( (char *)__mptr - ((size_t)&((foo_t *)0)->m_char) );
    # 004 });
    #
The following two lines are easier to follow when read side by side with the assembly :-)

    002 const typeof( ((foo_t *)0)->m_char ) *__mptr = (p3);
    003 (foo_t *)( (char *)__mptr - ((size_t)&((foo_t *)0)->m_char) );
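Summary
- typeof (VAR): yields the data type of the variable VAR (note that typeof is an extension keyword supported by gcc)
- offsetof(TYPE, MEMBER): yields the offset of the member MEMBER within its containing structure (of type TYPE)
- container_of(PTR, TYPE, MEMBER): given PTR, the start address of the member MEMBER, works back to the start address of the containing structure variable (of type TYPE)
References
1. $ man -s3 offsetof
2. C Operators (copied from the book C Programming: A Modern Approach, Second Edition)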
APPENDIX A  C Operators
--------------------------------------------------------------------------------
Precedence  Name                         Symbol(s)            Associativity
--------------------------------------------------------------------------------
1           Array subscripting           []                   Left
1           Function call                ()                   Left
1           Structure and union member   . ->                 Left
1           Increment (postfix)          ++                   Left
1           Decrement (postfix)          --                   Left
--------------------------------------------------------------------------------
2           Increment (prefix)           ++                   Right
2           Decrement (prefix)           --                   Right
2           Address                      &                    Right
2           Indirection                  *                    Right
2           Unary plus                   +                    Right
2           Unary minus                  -                    Right
2           Bitwise complement           ~                    Right
2           Logical negation             !                    Right
2           Size                         sizeof               Right
--------------------------------------------------------------------------------
3           Cast                         ()                   Right
--------------------------------------------------------------------------------
4           Multiplicative               * / %                Left
--------------------------------------------------------------------------------
5           Additive                     + -                  Left
--------------------------------------------------------------------------------
6           Bitwise shift                << >>                Left
--------------------------------------------------------------------------------
7           Relational                   < > <= >=            Left
--------------------------------------------------------------------------------
8           Equality                     == !=                Left
--------------------------------------------------------------------------------
9           Bitwise and                  &                    Left
--------------------------------------------------------------------------------
10          Bitwise exclusive or         ^                    Left
--------------------------------------------------------------------------------
11          Bitwise inclusive or         |                    Left
--------------------------------------------------------------------------------
12          Logical and                  &&                   Left
--------------------------------------------------------------------------------
13          Logical or                   ||                   Left
--------------------------------------------------------------------------------
14          Conditional                  ?:                   Right
--------------------------------------------------------------------------------
15          Assignment                   = *= /= %=           Right
                                         += -= <<= >>=
                                         &= ^= |=
--------------------------------------------------------------------------------
16          Comma                        ,                    Left
--------------------------------------------------------------------------------