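3. The relationship between task_struct, thread_info, and the kernel stack

Source: https://blog.csdn.net/u014426028/article/details/108037971

In the Linux kernel, both processes and threads (multithreading is itself implemented as a group of lightweight processes) are described by the task_struct structure, which we call the process descriptor.

thread_info is a small data structure associated with the process descriptor. It lives together with the process's kernel-mode stack in a memory region allocated separately for each process. Because this region holds both thread_info and the stack, it is defined as a union. The relevant data structures are shown below (based on kernel version 4.4.87):

Definition of the thread_union union: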

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE/sizeof(long)];
};

Definition of the thread_info structure:

struct thread_info {
    unsigned long flags; /* low level flags */
    mm_segment_t addr_limit; /* address limit */
    struct task_struct *task; /* main task structure */
    int preempt_count; /* 0 => preemptable, <0 => bug */
    int cpu; /* cpu */
};

task_struct is fairly complex, so only some of its members are listed here:

struct task_struct {
    volatile long state;
    void *stack;
    //...
#ifdef CONFIG_SMP
    int on_cpu;
    int wake_cpu;
#endif
    int on_rq;
    //...
#ifdef CONFIG_SCHED_INFO
    struct sched_info sched_info;
#endif
    //...
    pid_t pid;
    pid_t tgid;
    //...
};

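Shown as a diagram:

The benefit of this design is that, given the address of any one of the stack, thread_info, or task_struct, the addresses of the other two can be obtained quickly.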
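The navigation described above can be sketched in user space. This is only an illustration under stated assumptions: the structures are cut down to the two linking pointers, the `demo_` helpers are hypothetical names that do not exist in the kernel, and `aligned_alloc` stands in for the kernel's page allocator to obtain a THREAD_SIZE-aligned block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192UL          /* assumed 8 KiB kernel stack */

struct task_struct;                 /* forward declaration */

/* Simplified stand-ins for the kernel structures (illustration only). */
struct thread_info {
    struct task_struct *task;       /* thread_info -> task_struct */
};

struct task_struct {
    void *stack;                    /* task_struct -> stack block */
    int pid;
};

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(long)];
};

/* Hypothetical helper: allocate an aligned block and link it to tsk. */
union thread_union *demo_attach_stack(struct task_struct *tsk)
{
    union thread_union *tu = aligned_alloc(THREAD_SIZE, sizeof(*tu));
    if (!tu)
        return NULL;
    /* The mask trick below only works if the block is aligned. */
    assert(((uintptr_t)tu & (THREAD_SIZE - 1)) == 0);
    tu->thread_info.task = tsk;
    tsk->stack = tu;
    return tu;
}

/* Any address inside the block -> thread_info at the block's base. */
struct thread_info *demo_thread_info_of(const void *addr_in_stack)
{
    uintptr_t base = (uintptr_t)addr_in_stack & ~(uintptr_t)(THREAD_SIZE - 1);
    return (struct thread_info *)base;
}
```

Starting from a task, `tsk->stack` gives the block. Starting from any address inside the block, `demo_thread_info_of()` gives thread_info, and its `task` member leads back to the task.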

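We can run an experiment with the crash tool on an Ubuntu system to take a look at the process descriptor of a particular process.

For how to analyze kernel data structures with crash, see:

http://www.cnblogs.com/yanghaizhou/p/7704421.html

Here we take the systemd process, whose pid is 1, as the example: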

crash> task 1
PID: 1 TASK: ffff88007c898000 CPU: 1 COMMAND: "systemd"
struct task_struct {
  state = 1,
  stack = 0xffff88007c894000,
  usage = {
    counter = 2
  ...

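We can see that the task_struct pointer of the systemd process is task = 0xffff88007c898000.

Through the task->stack member we can locate the process's kernel stack: stack = 0xffff88007c894000.

As the earlier diagram shows, thread_info and the stack share the same memory region, thread_info sits at the lowest address of that region, and the region is aligned to THREAD_SIZE. So clearing the lowest N bits of any address inside the stack (where 2^N = THREAD_SIZE) yields the address of thread_info.

For example, when THREAD_SIZE = 8K, the thread_info address of systemd is 0xffff88007c894000 & ~0x1FFF = 0xffff88007c894000.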
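The masking arithmetic can be checked with a few lines of C. This is a minimal sketch assuming THREAD_SIZE is 8K; `demo_stack_base` is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 0x2000ULL   /* 8 KiB, the value assumed in the text */

/* The mask only works when THREAD_SIZE is a power of two. */
static_assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0,
              "THREAD_SIZE must be a power of two");

/* Clear the low N bits (2^N == THREAD_SIZE) of a kernel-stack address. */
uint64_t demo_stack_base(uint64_t addr)
{
    return addr & ~(THREAD_SIZE - 1);
}
```

Applied to the systemd stack address 0xffff88007c894000, the helper returns the same value, because that address is already aligned. A made-up address further up the same stack, such as 0xffff88007c895f38, also maps back to 0xffff88007c894000.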

crash> * thread_info 0xffff88007c894000
struct thread_info {
  task = 0xffff88007c898000,
  flags = 0,
  status = 0,
  cpu = 0,
  addr_limit = {
    seg = 140737488351232
  },
  sig_on_uaccess_error = 0,
  uaccess_err = 0
}

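Conversely, the thread_info->task member leads back to the process's task_struct. This links task_struct, thread_info, and the stack to one another: knowing any one of them gives quick access to the other two, which makes these lookups efficient.

Analysis of the kernel's current macro

In the kernel, the current macro returns the task_struct pointer of the currently executing process. A brief analysis follows.

The original definition is: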

#define current get_current()

#define get_current() (current_thread_info()->task)

As we can see, current calls the current_thread_info function, which is defined in arch/arm/include/asm/thread_info.h (kernel version 2.6.32.65):

static inline struct thread_info *current_thread_info(void)
{
    register unsigned long sp asm ("sp"); /* read the stack pointer register */
    return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}

The thread_info structure is as follows:

struct thread_info {
    unsigned long flags; /* low level flags */
    int preempt_count; /* 0 => preemptable, <0 => bug */
    mm_segment_t addr_limit; /* address limit */
    struct task_struct *task; /* main task structure */
    struct exec_domain *exec_domain; /* execution domain */
    __u32 cpu; /* cpu */
    __u32 cpu_domain; /* cpu domain */
    struct cpu_context_save cpu_context; /* cpu context */
    __u32 syscall; /* syscall number */
    __u8 used_cp[16]; /* thread used copro */
    unsigned long tp_value;
    struct crunch_state crunchstate;
    union fp_state fpstate __attribute__((aligned(8)));
    union vfp_state vfpstate;
#ifdef CONFIG_ARM_THUMBEE
    unsigned long thumbee_state; /* ThumbEE Handler Base register */
#endif
    struct restart_block restart_block;
};

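When kernel code reaches this point, the SP register points to the current top of the kernel stack belonging to the calling process. The expression sp & ~(THREAD_SIZE - 1) rounds that address down to a THREAD_SIZE boundary, which is the lowest address of the stack region and the place where thread_info is stored, as shown in the figure below.

The result is cast to a struct thread_info pointer. That structure has a member named task, of type struct task_struct *, which is the task_struct pointer of the currently running process.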
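The macro chain can be imitated in user space to see how the pieces fit. In this sketch the stack pointer register is replaced by an ordinary variable, the structures are reduced to what the lookup needs, and every `demo_` name is hypothetical. It is an illustration of the idea, not the kernel's implementation.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define THREAD_SIZE 8192UL   /* assumed 8 KiB kernel stack */

struct task_struct { int pid; };                   /* simplified stand-in */
struct thread_info { struct task_struct *task; };  /* simplified stand-in */

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(long)];
};

/* One THREAD_SIZE-aligned block standing in for a kernel stack. */
static alignas(THREAD_SIZE) union thread_union demo_stack;

/* Stand-in for the hardware stack pointer register. */
static uintptr_t demo_sp;

struct thread_info *demo_current_thread_info(void)
{
    return (struct thread_info *)(demo_sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

#define demo_get_current() (demo_current_thread_info()->task)
#define demo_current       demo_get_current()

/* Pretend tsk is running with `depth` bytes of its kernel stack in use. */
struct task_struct *demo_run_at_depth(struct task_struct *tsk, size_t depth)
{
    assert(depth > 0 && depth <= THREAD_SIZE - sizeof(struct thread_info));
    demo_stack.thread_info.task = tsk;
    demo_sp = (uintptr_t)&demo_stack + THREAD_SIZE - depth;
    return demo_current;
}
```

Whatever depth is chosen within the block, `demo_current` resolves to the same task, because the mask discards the offset inside the block.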

Notes:

In the kernel, a process's task_struct is allocated by the slab allocator, whose advantages are object reuse and cache coloring.

The union:

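#define THREAD_SIZE 8192 // The kernel stack size can be configured as 4K or 8K; here it is 8K. On x86, the 32-bit kernel stack is 8K and the 64-bit one is 16K.

union thread_union {
    struct thread_info thread_info; // sizeof(thread_info) =
    unsigned long stack[THREAD_SIZE/sizeof(long)]; // The stack is 8K. Every member of a union starts at the same address, so thread_info occupies the lowest addresses of the kernel stack region.
};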

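Within the whole 8K region, the top is used as the process's stack and the very bottom holds thread_info. When the process switches from user mode to kernel mode, its kernel stack is still empty, so the sp register points to the top of the stack. As data is written, sp decreases and the kernel stack grows downward as needed. In theory it can grow to at most 8192 - sizeof(thread_info) bytes, but because function calls must also save their context there, the space actually available is usually smaller. Code running in the kernel on behalf of the process and all interrupt service routines share this 8K kernel stack.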
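The layout claims above can be checked at compile time. The sketch below assumes an 8K THREAD_SIZE and a cut-down thread_info whose fields are placeholders, so the computed sizes will not match a real kernel. The `demo_` functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define THREAD_SIZE 8192UL   /* assumed 8 KiB kernel stack */

/* Cut-down thread_info; the real structure is architecture specific. */
struct thread_info {
    void *task;
    unsigned long flags;
    int preempt_count;
    int cpu;
};

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(long)];
};

/* The union is exactly as large as its biggest member, the stack array. */
static_assert(sizeof(union thread_union) == THREAD_SIZE,
              "union must span the whole stack block");
/* Both members start at offset 0, so thread_info is at the lowest address. */
static_assert(offsetof(union thread_union, thread_info) == 0, "offset 0");
static_assert(offsetof(union thread_union, stack) == 0, "offset 0");

/* Upper bound on stack bytes before the stack would overwrite thread_info. */
size_t demo_max_stack_bytes(void)
{
    return THREAD_SIZE - sizeof(struct thread_info);
}

/* Where sp points while the stack is still empty: one past the top. */
unsigned long *demo_initial_sp(union thread_union *tu)
{
    return tu->stack + THREAD_SIZE / sizeof(long);
}
```

Because the stack grows down from `demo_initial_sp()` toward offset 0, a stack that uses more than `demo_max_stack_bytes()` bytes would run into thread_info.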

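Linux source analysis: the kernel stack and the thread_info structure in detail

1. What is a process's kernel stack?

While running in kernel mode (for example, when an application process executes a system call), a process needs its own stack. It does not use its user-space stack but a stack in kernel space, and this stack is the process's kernel stack.

2. How is a process's kernel stack described?

In Linux a process is described by the task_struct data structure, which contains a stack pointer: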

struct task_struct
{
    // ...
    void *stack; // pointer to the kernel stack
    // ...
};

The stack member of task_struct points to a thread_union (the Linux kernel represents a process's kernel stack with the thread_union union):

union thread_union {
struct thread_info thread_info;
unsigned long stack[THREAD_SIZE/sizeof(long)];
};

struct thread_info records part of the process's information, including the process context:

struct thread_info {
    struct pcb_struct pcb; /* palcode state */
    struct task_struct *task; /* main task structure; important: points to the struct task_struct of the created process */
    unsigned int flags; /* low level flags */
    unsigned int ieee_state; /* see fpu.h */
    struct exec_domain *exec_domain; /* execution domain: which executable convention the current process follows; the differences between executables produced by different systems are recorded here */
    mm_segment_t addr_limit; /* thread address space */
    unsigned cpu; /* current CPU */
    int preempt_count; /* 0 => preemptable, <0 => BUG */
    int bpt_nsaved;
    unsigned long bpt_addr[2]; /* breakpoint handling */
    unsigned int bpt_insn[2];
    struct restart_block restart_block;
};

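Right after a switch from user mode to kernel mode, the process's kernel stack is always empty. The esp register therefore points to the top of this stack, and as soon as data is written to the stack the value of esp decreases.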
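The downward growth can be modelled with a tiny array-backed stack. This is a toy sketch: the `demo_kstack` type, its eight-slot size, and the push/pop helpers are all made up for illustration, and a plain pointer plays the role of esp.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 8   /* tiny stack so the limits are easy to reach */

struct demo_kstack {
    unsigned long slots[DEMO_SLOTS];
    unsigned long *sp;   /* stand-in for esp */
};

/* Empty stack: sp points one past the highest slot (the "top"). */
void demo_kstack_init(struct demo_kstack *s)
{
    assert(s != NULL);
    s->sp = s->slots + DEMO_SLOTS;
}

/* Push decrements sp first, then stores, so the stack grows downward. */
int demo_push(struct demo_kstack *s, unsigned long value)
{
    if (s->sp == s->slots)
        return -1;               /* no room left */
    *--s->sp = value;
    return 0;
}

/* Pop loads the value, then increments sp back toward the top. */
int demo_pop(struct demo_kstack *s, unsigned long *out)
{
    if (s->sp == s->slots + DEMO_SLOTS)
        return -1;               /* stack is empty */
    *out = *s->sp++;
    return 0;
}
```

After one push, `sp` sits one slot below the initial top. After the matching pop it is back at the top, which is the empty state described above.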

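3. What is thread_info for?

This structure holds the fields of the process descriptor that are accessed frequently and need to be accessed quickly. The kernel relies on it to obtain the descriptor of the current process: to get the task_struct of the process running on the current CPU, the kernel provides the current macro.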

#define get_current() (current_thread_info()->task)
#define current get_current()

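The kernel also needs to store PCB information for every process. Linux supports many architectures, and the information a process needs to store can differ from one architecture to another. This calls for a generic approach: the architecture-dependent part is separated from the architecture-independent part, and the process is described in a generic way by struct task_struct, while thread_info holds the per-process data that architecture-specific assembly code needs to access. Because thread_info embeds a pointer to task_struct, task_struct is easy to find from thread_info.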

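4. How large is the kernel stack?

A process allocates its kernel stack with the alloc_thread_info function and frees it with the free_thread_info function. Looking at the source:

alloc_thread_info calls __get_free_pages to allocate 2 pages of memory (8192 bytes).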
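__get_free_pages takes its size as an order, meaning a power-of-two number of pages, so 2 pages corresponds to order 1. The sketch below shows that arithmetic under the assumption of 4096-byte pages. `demo_get_order` is a hypothetical user-space helper written for this example and is not the kernel's own code.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumed page size */

static_assert((DEMO_PAGE_SIZE & (DEMO_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size. */
unsigned int demo_get_order(unsigned long size)
{
    unsigned int order = 0;

    while ((DEMO_PAGE_SIZE << order) < size)
        order++;
    return order;
}
```

With these assumptions an 8192-byte stack needs order 1 (two pages), and a 16384-byte stack needs order 2 (four pages).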

https://blog.csdn.net/gatieme/article/details/51577479

posted @ 2022-09-02 14:57 atomxing
