MIT 6.828 Lab 03: User Environments (Part A)

Part 1

A user environment here means almost the same thing as a process. Each one is identified by an envid_t:

// An environment ID 'envid_t' has three parts:
//
// +1+---------------21-----------------+--------10--------+
// |0|          Uniqueifier             |   Environment    |
// | |                                  |      Index       |
// +------------------------------------+------------------+
//                                       \--- ENVX(eid) --/
//
// 01:  The environment index ENVX(eid) equals the environment's index in the
// 'envs[]' array.  
// 02:  The uniqueifier distinguishes environments that were
// created at different times, but share the same environment index.
//
// All real environments are greater than 0 (so the sign bit is zero).
// envid_ts less than 0 signify errors.  The envid_t == 0 is special, and
// stands for the current environment.
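
As a minimal sketch of the layout above (not JOS source, although LOG2NENV, NENV and ENVX mirror the definitions in inc/env.h), an ID can be split into its two fields with a mask and a shift. The helpers envid_make and envid_uniqueifier are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

typedef int32_t envid_t;

#define LOG2NENV 10
#define NENV     (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))   /* low 10 bits: index into envs[] */

/* Hypothetical helper: everything above the 10 index bits is the uniqueifier. */
static inline int32_t envid_uniqueifier(envid_t id)
{
	return id >> LOG2NENV;
}

/* Hypothetical helper: combine a uniqueifier and an index into one ID. */
static inline envid_t envid_make(int32_t uniq, int32_t index)
{
	return (envid_t)((uniq << LOG2NENV) | (index & (NENV - 1)));
}
```

Two environments that reuse the same envs[] slot at different times share ENVX(id) but differ in the uniqueifier, so a stale ID can be detected.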

Environment State

Virtual memory map
/*
 * Virtual memory map:                                Permissions
 *                                                    kernel/user
 *
 *    4 Gig -------->  +------------------------------+
 *                     |                              | RW/--
 *                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 *                     :              .               :
 *                     :              .               :
 *                     :              .               :
 *                     |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| RW/--
 *                     |                              | RW/--
 *                     |   Remapped Physical Memory   | RW/--
 *                     |                              | RW/--
 *    KERNBASE, ---->  +------------------------------+ 0xf0000000      --+
 *    KSTACKTOP        |     CPU0's Kernel Stack      | RW/--  KSTKSIZE   |
 *                     | - - - - - - - - - - - - - - -|                   |
 *                     |      Invalid Memory (*)      | --/--  KSTKGAP    |
 *                     +------------------------------+                   |
 *                     |     CPU1's Kernel Stack      | RW/--  KSTKSIZE   |
 *                     | - - - - - - - - - - - - - - -|                 PTSIZE
 *                     |      Invalid Memory (*)      | --/--  KSTKGAP    |
 *                     +------------------------------+                   |
 *                     :              .               :                   |
 *                     :              .               :                   |
 *    MMIOLIM ------>  +------------------------------+ 0xefc00000      --+
 *                     |       Memory-mapped I/O      | RW/--  PTSIZE
 * ULIM, MMIOBASE -->  +------------------------------+ 0xef800000
 *                     |  Cur. Page Table (User R-)   | R-/R-  PTSIZE
 *    UVPT      ---->  +------------------------------+ 0xef400000
 *                     |          RO PAGES            | R-/R-  PTSIZE
 *    UPAGES    ---->  +------------------------------+ 0xef000000
 *                     |           RO ENVS            | R-/R-  PTSIZE
 * UTOP,UENVS ------>  +------------------------------+ 0xeec00000
 * UXSTACKTOP -/       |     User Exception Stack     | RW/RW  PGSIZE
 *                     +------------------------------+ 0xeebff000
 *                     |       Empty Memory (*)       | --/--  PGSIZE
 *    USTACKTOP  --->  +------------------------------+ 0xeebfe000
 *                     |      Normal User Stack       | RW/RW  PGSIZE
 *                     +------------------------------+ 0xeebfd000
 *                     |                              |
 *                     |                              |
 *                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 *                     .                              .
 *                     .                              .
 *                     .                              .
 *                     |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
 *                     |     Program Data & Heap      |
 *    UTEXT -------->  +------------------------------+ 0x00800000
 *    PFTEMP ------->  |       Empty Memory (*)       |        PTSIZE
 *                     |                              |
 *    UTEMP -------->  +------------------------------+ 0x00400000      --+
 *                     |       Empty Memory (*)       |                   |
 *                     | - - - - - - - - - - - - - - -|                   |
 *                     |  User STAB Data (optional)   |                 PTSIZE
 *    USTABDATA ---->  +------------------------------+ 0x00200000        |
 *                     |       Empty Memory (*)       |                   |
 *    0 ------------>  +------------------------------+                 --+

🚩 Everything above ULIM is kernel space!

The struct Env definition

struct Env is defined in inc/env.h. Its fields are:

struct Env {
	struct Trapframe env_tf;	// Saved registers
	struct Env *env_link;		// Next free Env
	envid_t env_id;			// Unique environment identifier
	envid_t env_parent_id;		// env_id of this env's parent
	enum EnvType env_type;		// Indicates special system environments
	unsigned env_status;		// Status of the environment
	uint32_t env_runs;		// Number of times environment has run

	// Address space
	pde_t *env_pgdir;		// Kernel virtual address of page dir
};

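Notes:

  1. env_tf: when control passes from user mode to the kernel, the kernel saves the environment's registers here so that the environment can be resumed later.
  2. env_id is unique to each environment. When an environment exits, its Env structure can be handed to a new environment, but the new environment is given a fresh env_id.
  3. env_pgdir: every environment has its own page directory.
  4. env_status: the current state of the environment.

Differences between a JOS environment and an xv6 process:

Unlike xv6 processes, JOS environments do not have their own kernel stacks. Only one JOS environment can be active in the kernel at a time, so JOS needs just a single kernel stack.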

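Exercise 01

A bug 😭

Error message: kernel panic at kern/pmap.c:157: PADDR called with invalid kva 00000000

After switching to the lab2 branch and testing, the same code leaves kern_pgdir unchanged. Does that mean the change comes from the code added for Exercise 1?

memset() is defined in lib/string.c.

I found a blog post describing exactly the same problem. I am not sure its explanation is right, but after a small change to boot_alloc() the check passed.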

https://blog.csdn.net/qq_33765199/article/details/104743134

Exercise 02

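We do not have a file system yet, so the kernel loads a static binary image that is embedded within the kernel itself. The Lab 3 GNUmakefile generates a number of binary images in the obj/user/ directory, and these files are linked into the kernel executable as if they were .o files.

env_init()

Initialize the envs array and build the env_free_list linked list. Mind the order: envs[0] must end up at the head of the list.

⚠ I got cute here at first and did not insert at the head, and the result was wrong!!! 🐖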

void
env_init(void)
{
	// Set up envs array
	// LAB 3: Your code here.
	env_free_list = NULL;
	for (int i = NENV - 1; i >= 0; i--) {	// insert at the head while walking backwards, so envs[0] ends up first
		envs[i].env_id = 0;
		envs[i].env_status = ENV_FREE;
		envs[i].env_link = env_free_list;
		env_free_list = &envs[i];
	}

	// Per-CPU part of the initialization
	env_init_percpu();
}
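
To see why walking the array backwards with head insertion leaves element 0 at the front, here is a small standalone sketch of the same loop. DemoEnv, NENV_DEMO and the demo_* names are hypothetical stand-ins for the kernel's struct Env, NENV, envs and env_free_list; nothing here is JOS code.

```c
#include <stddef.h>

#define NENV_DEMO 8   /* hypothetical small table; JOS uses NENV = 1024 */

struct DemoEnv {
	int env_id;
	struct DemoEnv *env_link;   /* next free entry */
};

static struct DemoEnv demo_envs[NENV_DEMO];
static struct DemoEnv *demo_free_list;

/* Same shape as env_init(): push entries NENV_DEMO-1 .. 0 onto the head. */
static inline void demo_env_init(void)
{
	demo_free_list = NULL;
	for (int i = NENV_DEMO - 1; i >= 0; i--) {
		demo_envs[i].env_id = 0;
		demo_envs[i].env_link = demo_free_list;
		demo_free_list = &demo_envs[i];
	}
}

/* Array index of the k-th entry on the free list, or -1 past the end. */
static inline int demo_free_index(int k)
{
	struct DemoEnv *e = demo_free_list;
	while (e != NULL && k-- > 0)
		e = e->env_link;
	return e ? (int)(e - demo_envs) : -1;
}
```

Because the last entry pushed becomes the head, pushing index 0 last makes the list order identical to the array order, which is what the first env_alloc() call relies on to return envs[0].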
env_setup_vm()

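Allocate a page directory for the Env that e points to, inheriting the kernel's page directory layout. The only entry that has to change is the one for UVPT, which must map the physical address of this environment's own page directory, e->env_pgdir, rather than that of the kernel's kern_pgdir. Once the page directory is set up, the mapping from this environment's linear address space to physical memory is determined.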

static int
env_setup_vm(struct Env *e)
{
	int i;
	struct PageInfo *p = NULL;

	// Allocate a page for the page directory
	if (!(p = page_alloc(ALLOC_ZERO)))
		return -E_NO_MEM;

	// Now, set e->env_pgdir and initialize the page directory.
	//
	// Hint:
	//    - The VA space of all envs is identical above UTOP
	//	(except at UVPT, which we've set below).
	//	See inc/memlayout.h for permissions and layout.
	//	Can you use kern_pgdir as a template?  Hint: Yes.
	//	(Make sure you got the permissions right in Lab 2.)
	//    - The initial VA below UTOP is empty.
	//    - You do not need to make any more calls to page_alloc.
	//    - Note: In general, pp_ref is not maintained for
	//	physical pages mapped only above UTOP, but env_pgdir
	//	is an exception -- you need to increment env_pgdir's
	//	pp_ref for env_free to work correctly.
	//    - The functions in kern/pmap.h are handy.

	// LAB 3: Your code here.
	p->pp_ref++;
	e->env_pgdir = (pde_t*) page2kva(p);
	memcpy(e->env_pgdir, kern_pgdir, PGSIZE);	// start from a copy of the kernel page directory
	

	// UVPT maps the env's own page table read-only.
	// Permissions: kernel R, user R
	e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;

	return 0;
}
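
The UVPT entry is a self-mapping: one page directory slot points back at the page directory itself. The following sketch only does the address arithmetic, assuming the standard 10/10/12 split of a 32-bit linear address. PDXSHIFT, PDX and UVPT follow the names in inc/mmu.h and inc/memlayout.h; uvpd_address is a hypothetical helper for this illustration.

```c
#include <stdint.h>

#define PDXSHIFT 22                                   /* page directory index starts at bit 22 */
#define PDX(la)  ((((uint32_t)(la)) >> PDXSHIFT) & 0x3FF)
#define UVPT     0xef400000u

/*
 * Hypothetical helper: with the self-mapping installed, the page tables
 * appear as one flat array of 4-byte entries starting at UVPT, and the
 * page directory is the "page table" that maps the UVPT region itself.
 * Its virtual address is therefore UVPT + (UVPT >> 12) * 4.
 */
static inline uint32_t uvpd_address(void)
{
	return UVPT + (UVPT >> 12) * 4;
}
```

PDX(UVPT) is the slot that env_setup_vm() overwrites; every other slot above UTOP stays identical to the kernel's copy.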
region_alloc()

Allocate len bytes of physical memory for environment e and map it at virtual address va in that environment's address space.

// Allocate len bytes of physical memory for environment env,
// and map it at virtual address va in the environment's address space.
// Does not zero or otherwise initialize the mapped pages in any way.
// Pages should be writable by user and kernel.
// Panic if any allocation attempt fails.
//
static void
region_alloc(struct Env *e, void *va, size_t len)
{
	// LAB 3: Your code here.
	// (But only if you need it for load_icode.)
	//
	// Hint: It is easier to use region_alloc if the caller can pass
	//   'va' and 'len' values that are not page-aligned.
	//   You should round va down, and round (va + len) up.
	//   (Watch out for corner-cases!)
	
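	void *begin = ROUNDDOWN(va, PGSIZE);
	void *end = ROUNDUP(va + len, PGSIZE);
	while (begin < end)
	{
	    struct PageInfo *pg = page_alloc(0);	// allocate one physical page, not zeroed
	    if (!pg)
	    {
	        panic("region_alloc() out of memory!\n");
	    }
	    // page_insert() can itself fail if it cannot allocate a page table
	    if (page_insert(e->env_pgdir, pg, begin, PTE_W | PTE_U) < 0)
	    {
	        panic("region_alloc() page_insert failed!\n");
	    }
	    begin += PGSIZE;
	}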
}
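
The rounding in region_alloc() decides how many pages get mapped, and an unaligned va can cost one extra page. A minimal sketch of that arithmetic, using plain integers instead of pointers; round_down, round_up and region_pages are hypothetical helpers, not the JOS ROUNDDOWN/ROUNDUP macros, and overflow near the top of the address space is deliberately ignored.

```c
#include <stdint.h>

#define PGSIZE 4096u

static inline uint32_t round_down(uint32_t a, uint32_t n)
{
	return a - a % n;
}

static inline uint32_t round_up(uint32_t a, uint32_t n)
{
	return round_down(a + n - 1, n);
}

/* Number of pages needed to cover [va, va + len). */
static inline uint32_t region_pages(uint32_t va, uint32_t len)
{
	return (round_up(va + len, PGSIZE) - round_down(va, PGSIZE)) / PGSIZE;
}
```

For example, a 4096-byte segment that starts at 0x00800020 straddles two pages, while the same length starting at 0x00800000 fits in one.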
load_icode()
Background: struct Elf

Defined in inc/elf.h:

#define ELF_MAGIC 0x464C457FU	/* "\x7FELF" in little endian */

struct Elf {
	uint32_t e_magic;	// must equal ELF_MAGIC
	uint8_t e_elf[12];
	uint16_t e_type;
	uint16_t e_machine;
	uint32_t e_version;
	uint32_t e_entry;
	uint32_t e_phoff;   // file offset of the program header table
	uint32_t e_shoff;
	uint32_t e_flags;
	uint16_t e_ehsize;
	uint16_t e_phentsize;
	uint16_t e_phnum;
	uint16_t e_shentsize;
	uint16_t e_shnum;
	uint16_t e_shstrndx;
};
//
// Set up the initial program binary, stack, and processor flags
// for a user process.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
//
// This function loads all loadable segments from the ELF binary image
// into the environment's user memory, starting at the appropriate
// virtual addresses indicated in the ELF program header.
// At the same time it clears to zero any portions of these segments
// that are marked in the program header as being mapped
// but not actually present in the ELF file - i.e., the program's bss section.
//
// All this is very similar to what our boot loader does, except the boot
// loader also needs to read the code from disk.  Take a look at
// boot/main.c to get ideas.
//
// Finally, this function maps one page for the program's initial stack.
//
// load_icode panics if it encounters problems.  
//  - How might load_icode fail?  What might be wrong with the given input?
//
static void
load_icode(struct Env *e, uint8_t *binary)
{
	// Hints:
	//  Load each program segment into virtual memory
	//  at the address specified in the ELF segment header.
	//  You should only load segments with ph->p_type == ELF_PROG_LOAD.
	//  Each segment's virtual address can be found in ph->p_va
	//  and its size in memory can be found in ph->p_memsz.
	//  The ph->p_filesz bytes from the ELF binary, starting at
	//  'binary + ph->p_offset', should be copied to virtual address
	//  ph->p_va.  Any remaining memory bytes should be cleared to zero.
	//  (The ELF header should have ph->p_filesz <= ph->p_memsz.)
	//  Use functions from the previous lab to allocate and map pages.
	//
	//  All page protection bits should be user read/write for now.
	//  ELF segments are not necessarily page-aligned, but you can
	//  assume for this function that no two segments will touch
	//  the same virtual page.
	//
	//  You may find a function like region_alloc useful.
	//
	//  Loading the segments is much simpler if you can move data
	//  directly into the virtual addresses stored in the ELF binary.
	//  So which page directory should be in force during
	//  this function?
	//
	//  You must also do something with the program's entry point,
	//  to make sure that the environment starts executing there.
	//  What?  (See env_run() and env_pop_tf() below.)

	// LAB 3: Your code here.
	struct Elf *ELFHDR = (struct Elf *) binary;
	struct Proghdr *ph;	// program header table
	int ph_num;		// number of program headers (segments)

	// Check the magic number first to make sure this really is an ELF image
	if (ELFHDR->e_magic != ELF_MAGIC)
	{
	    panic("load_icode() error! binary is not in ELF format!\n");
	}

	ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
	ph_num = ELFHDR->e_phnum;

	// Load the environment's page directory into cr3 (as a physical address).
	// The pages mapped by region_alloc() exist only in e->env_pgdir, so the
	// memset/memcpy below must run with that page directory in force.
	lcr3(PADDR(e->env_pgdir));

	for (int i = 0; i < ph_num; i++)	// for every segment
	{
	    if (ph[i].p_type == ELF_PROG_LOAD)	// only load loadable segments
	    {
	        if (ph[i].p_filesz > ph[i].p_memsz)
	            panic("load_icode() error! p_filesz > p_memsz\n");
	        region_alloc(e, (void *) ph[i].p_va, ph[i].p_memsz);	// allocate and map
	        memset((void *) ph[i].p_va, 0, ph[i].p_memsz);	// zero everything first (covers .bss)
	        memcpy((void *) ph[i].p_va, binary + ph[i].p_offset, ph[i].p_filesz);
	    }
	}

	lcr3(PADDR(kern_pgdir));	// switch back to the kernel page directory
	e->env_tf.tf_eip = ELFHDR->e_entry;	// the environment starts executing at the ELF entry point

	// Now map one page for the program's initial stack
	// at virtual address USTACKTOP - PGSIZE.

	// LAB 3: Your code here.
	region_alloc(e, (void *) (USTACKTOP - PGSIZE), PGSIZE);
}
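
The relationship between p_filesz and p_memsz is easier to see outside the kernel. The sketch below copies one "segment" into an ordinary buffer: the first filesz bytes come from the file image, and the rest of the memsz bytes (the .bss part) are zeroed. load_segment is a hypothetical helper written for this note; it is not part of JOS.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: place one loadable segment at dst.
 * Copies filesz bytes from src, then zero-fills up to memsz.
 * Returns 0 on success, -1 if the header is inconsistent
 * (filesz must never exceed memsz).
 */
static inline int load_segment(uint8_t *dst, const uint8_t *src,
                               size_t filesz, size_t memsz)
{
	if (filesz > memsz)
		return -1;
	memcpy(dst, src, filesz);
	memset(dst + filesz, 0, memsz - filesz);
	return 0;
}
```

Zeroing only the tail, as here, and zeroing the whole region before the copy, as load_icode() above does, produce the same memory contents.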
env_create()
// Allocates a new env with env_alloc, loads the named elf
// binary into it with load_icode, and sets its env_type.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
// The new env's parent ID is set to 0.
//
void
env_create(uint8_t *binary, enum EnvType type)
{
	// LAB 3: Your code here.
	struct Env *e;
	if (env_alloc(&e, 0) != 0)
	{
	    panic("env_create() error!\n");
	}
	
	load_icode(e, binary);
	e->env_type = type;
}
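Summary of the call graph
env_create()  // outermost call
    --> env_alloc()
       -->env_setup_vm()
    -->load_icode()
       -->region_alloc()

env_alloc()

This is where e->env_tf is set up. Once it has been filled in, executing iret loads those values into the registers.

The env_run(&envs[0]); call in i386_init() should now work and transfer control to the hello program (user/hello.c). Set a GDB breakpoint at env_pop_tf(), single-step with si, and watch how the registers change across the iret instruction. The first instruction executed after iret should be the cmpl instruction at the start label in lib/entry.S, after which execution continues into hello (see the disassembly in obj/user/hello.asm). If all goes well, execution reaches an int instruction: a system call that prints a character to the console. It does not work yet, and the result is a "triple fault".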

triple fault

When the CPU discovers that it is not set up to handle this system call interrupt, it will generate a general protection exception, find that it can't handle that, generate a double fault exception, find that it can't handle that either, and finally give up with what's known as a "triple fault".(lab3 toolguide)
Verifying the switch to user mode

(I did not save my own output at the time, so this is borrowed from another blogger: https://www.cnblogs.com/gatsby123/p/9838304.html. Thanks!)

The target architecture is assumed to be i8086
[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) b env_pop_tf                // set a breakpoint
Breakpoint 1 at 0xf0102d5f: file kern/env.c, line 470.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf0102d5f <env_pop_tf>:	push   %ebp

Breakpoint 1, env_pop_tf (tf=0xf01b2000) at kern/env.c:470
470	{
(gdb) si                        // single-step
=> 0xf0102d60 <env_pop_tf+1>:	mov    %esp,%ebp
0xf0102d60	470	{
(gdb)                           // single-step
=> 0xf0102d62 <env_pop_tf+3>:	sub    $0xc,%esp
0xf0102d62	470	{
(gdb)                           // single-step
=> 0xf0102d65 <env_pop_tf+6>:	mov    0x8(%ebp),%esp
471		asm volatile(
(gdb)                           // single-step
=> 0xf0102d68 <env_pop_tf+9>:	popa   
0xf0102d68	471		asm volatile(
(gdb)                           // single-step
=> 0xf0102d69 <env_pop_tf+10>:	pop    %es
0xf0102d69 in env_pop_tf (tf=<error reading variable: Unknown argument list address for `tf'.>)
    at kern/env.c:471
471		asm volatile(
(gdb)                           // single-step
=> 0xf0102d6a <env_pop_tf+11>:	pop    %ds
0xf0102d6a	471		asm volatile(
(gdb)                           // single-step
=> 0xf0102d6b <env_pop_tf+12>:	add    $0x8,%esp
0xf0102d6b	471		asm volatile(
(gdb)                           // single-step
=> 0xf0102d6e <env_pop_tf+15>:	iret   
0xf0102d6e	471		asm volatile(
(gdb) info registers            // register state just before iret executes
eax            0x0	0
ecx            0x0	0
edx            0x0	0
ebx            0x0	0
esp            0xf01b2030	0xf01b2030
ebp            0x0	0x0
esi            0x0	0
edi            0x0	0
eip            0xf0102d6e	0xf0102d6e <env_pop_tf+15>
eflags         0x96	[ PF AF SF ]
cs             0x8	8        // 0x8 is the kernel code segment selector
ss             0x10	16
ds             0x23	35
es             0x23	35
fs             0x23	35
gs             0x23	35
(gdb) si                          // this step executes iret
=> 0x800020:	cmp    $0xeebfe000,%esp
0x00800020 in ?? ()
(gdb) info registers              // register state after iret
eax            0x0	0
ecx            0x0	0
edx            0x0	0
ebx            0x0	0
esp            0xeebfe000	0xeebfe000
ebp            0x0	0x0
esi            0x0	0
edi            0x0	0
eip            0x800020	0x800020
eflags         0x2	[ ]
cs             0x1b	27      // 0x18 is the user code segment's offset in the GDT; with user privilege level 0x3 the selector is 0x1b
ss             0x23	35     // these register values were all set up in env_alloc()
ds             0x23	35
es             0x23	35
fs             0x23	35
gs             0x23	35
(gdb) b *0x800a1c              // breakpoint address found in obj/user/hello.asm
Breakpoint 2 at 0x800a1c
(gdb) c
Continuing.
=> 0x800a1c:	int    $0x30   // the system call instruction, which does not work yet

Breakpoint 2, 0x00800a1c in ?? ()
(gdb) 

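Compare cs before and after iret. Before iret, cs is 0x8, the kernel code segment selector (GD_KT, defined in inc/memlayout.h). Afterwards cs is 0x1b: 0x18 is the offset of the user code segment in the GDT (GD_UT, also in inc/memlayout.h), and the requested privilege level is 0x3 (ring 3), so the selector is 0x18 | 0x3 = 0x1b.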
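
A segment selector packs three fields: the low two bits are the requested privilege level, bit 2 selects GDT or LDT, and the remaining bits are the table index. The sketch below decodes the values seen in the GDB session. The GD_* constants match inc/memlayout.h; the sel_* helpers are hypothetical names used only here.

```c
#include <stdint.h>

#define GD_KT 0x08   /* kernel text */
#define GD_KD 0x10   /* kernel data */
#define GD_UT 0x18   /* user text */
#define GD_UD 0x20   /* user data */

/* Requested privilege level: bits 0-1. */
static inline unsigned sel_rpl(uint16_t sel)   { return sel & 0x3; }
/* Table indicator: bit 2 (0 = GDT, 1 = LDT). */
static inline unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 0x1; }
/* Descriptor index: bits 3-15. */
static inline unsigned sel_index(uint16_t sel) { return sel >> 3; }
```

This is why env_alloc() stores GD_UT | 3 in tf_cs and GD_UD | 3 in the data segment fields: the values 0x1b and 0x23 in the register dump are those two expressions.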

Handling Interrupts and Exceptions

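At this point the first int $0x30 system call is a dead end: once the processor has entered user mode there is no way to get back into the kernel. The next step is to set up basic exception and system call handling, which requires understanding the x86 interrupt and exception mechanism.

The IDT can reside anywhere in physical memory. The processor locates it through the IDT register (IDTR).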


The IDT can contain three kinds of descriptors:

  • Task gates
  • Interrupt gates
  • Trap gates

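Each entry is 8 bytes long. The key fields of an interrupt or trap gate are:
Bits 16-31: code segment selector
Bits 0-15 and 48-63: low and high halves of the offset within that segment (the selector and offset together give the address of the handler)
Type (bits 8-11 of the upper 32-bit word, i.e. bits 40-43): distinguishes interrupt gates, trap gates, task gates, and so on
DPL (bits 45-46): Descriptor Privilege Level, the privilege level required to use the gate
P (bit 47): whether the descriptor is present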
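
The split offset is the least intuitive part of the gate layout, so here is a minimal sketch that packs and unpacks a 32-bit interrupt or trap gate as a single 64-bit value. The gate_* helpers are hypothetical and are not the JOS SETGATE macro, which fills in a struct Gatedesc with bitfields instead. The type values 0xE (32-bit interrupt gate) and 0xF (32-bit trap gate) are the standard x86 encodings.

```c
#include <stdint.h>

#define STS_IG32 0xE   /* 32-bit interrupt gate */
#define STS_TG32 0xF   /* 32-bit trap gate */

/* Hypothetical helper: build a present gate descriptor. */
static inline uint64_t gate_pack(uint32_t off, uint16_t sel,
                                 unsigned type, unsigned dpl)
{
	uint64_t d = 0;
	d |= (uint64_t)(off & 0xFFFF);        /* bits 0-15:  offset low  */
	d |= (uint64_t)sel << 16;             /* bits 16-31: selector    */
	d |= (uint64_t)(type & 0xF) << 40;    /* bits 40-43: gate type   */
	d |= (uint64_t)(dpl & 0x3) << 45;     /* bits 45-46: DPL         */
	d |= (uint64_t)1 << 47;               /* bit 47:     present     */
	d |= (uint64_t)(off >> 16) << 48;     /* bits 48-63: offset high */
	return d;
}

static inline uint32_t gate_offset(uint64_t d)
{
	return (uint32_t)(d & 0xFFFF) | ((uint32_t)(d >> 48) << 16);
}

static inline uint16_t gate_selector(uint64_t d) { return (uint16_t)(d >> 16); }
static inline unsigned gate_type(uint64_t d)     { return (unsigned)(d >> 40) & 0xF; }
static inline unsigned gate_dpl(uint64_t d)      { return (unsigned)(d >> 45) & 0x3; }
```

Round-tripping a handler address through gate_pack and gate_offset shows that the processor recovers the full 32-bit entry point from the two halves.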

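Basics of Protected Control Transfer

Exceptions and interrupts are both protected control transfers: they switch the processor from user mode to kernel mode without giving user-mode code any chance to interfere with the kernel or with other environments.

Interrupt: a protected control transfer caused by an asynchronous event, usually external to the processor, such as a device signalling I/O activity.

Exception: a protected control transfer caused synchronously by the currently running code, such as a divide by zero or an invalid memory access.

Two mechanisms work together to keep the transfer protected:

  • The Interrupt Descriptor Table (IDT)

    The processor ensures that interrupts and exceptions can only enter the kernel at a few specific, well-defined entry points determined by the kernel itself. The x86 allows up to 256 different entry points, each with its own interrupt vector (0-255). The CPU uses the vector as an index into the IDT, and from the matching entry it loads:

    • the value for the EIP register, which points at the kernel code that handles this type of exception;
    • the value for the CS (code segment) register, whose low two bits give the privilege level at which the handler is to run.
      • In JOS, all exceptions are handled in ring 0 (kernel mode).
  • The Task State Segment (TSS)

    When an x86 exception occurs and causes a switch from user mode to kernel mode, the processor also switches stacks. A structure called the task state segment (TSS) specifies where the new stack lives. The TSS is a large data structure, but because kernel mode in JOS simply means privilege level 0, the processor uses only its ESP0 and SS0 fields to locate the kernel stack, and JOS uses none of the others. How does the processor find the TSS? The JOS kernel keeps a variable static struct Taskstate ts; and trap_init_percpu() loads the TSS selector with the ltr instruction.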

    void
    trap_init_percpu(void)
    {
        // Setup a TSS so that we get the right stack
        // when we trap to the kernel.
        ts.ts_esp0 = KSTACKTOP;
        ts.ts_ss0 = GD_KD;
        ts.ts_iomb = sizeof(struct Taskstate);
        // Initialize the TSS slot of the gdt.
        gdt[GD_TSS0 >> 3] = SEG16(STS_T32A, (uint32_t) (&ts),
                        sizeof(struct Taskstate) - 1, 0);
        gdt[GD_TSS0 >> 3].sd_s = 0;
        // Load the TSS selector (like other segment selectors, the
        // bottom three bits are special; we leave them 0)
        ltr(GD_TSS0);       // load the TSS selector into the task register
        // Load the IDT
        lidt(&idt_pd);
    }
    

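Register summary:

CPU model

https://pdos.csail.mit.edu/6.828/2018/lec/l-x86.pdf (official lecture slides)

  1. The TSS selector (task register) is what the ltr instruction above just loaded. When an interrupt arrives from user mode, the processor uses it to find the TSS (the ts variable in JOS) and loads SS and ESP from the SS0 and ESP0 fields, which switches to the kernel stack.
  2. GDTR is the global descriptor table register; it was already set up earlier.
  3. PDBR (cr3) is the page directory base register. The processor uses it to find the page directory and page tables that translate virtual addresses into physical addresses.
  4. IDTR is the interrupt descriptor table register; its value locates the IDT.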

Types of Exceptions and Interrupts

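Vectors 0-31 are reserved for the synchronous exceptions defined by the processor; the page fault, for example, is vector 14. Vectors above 31 are used for software interrupts raised by the int instruction and for asynchronous hardware interrupts raised by external devices. JOS uses vector 48 (0x30) as its system call interrupt.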

An Example

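Suppose the processor is executing code in a user environment and hits a divide instruction that attempts to divide by zero. The processor then does the following:

  1. It switches to the kernel stack defined by the SS0 and ESP0 fields of the TSS, which in JOS hold GD_KD and KSTACKTOP respectively.
  2. It pushes the following onto the kernel stack: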
                     +--------------------+ KSTACKTOP             
                     | 0x00000 | old SS   |     " - 4
                     |      old ESP       |     " - 8
                     |     old EFLAGS     |     " - 12
                     | 0x00000 | old CS   |     " - 16
                     |      old EIP       |     " - 20 <---- ESP 
                     +--------------------+    
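  3. Divide error is interrupt vector 0, so the processor reads IDT entry 0 and loads CS:EIP from it.
  4. The handler at CS:EIP runs.
    For some exceptions the processor pushes an error code in addition to the five words above: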
                     +--------------------+ KSTACKTOP             
                     | 0x00000 | old SS   |     " - 4
                     |      old ESP       |     " - 8
                     |     old EFLAGS     |     " - 12
                     | 0x00000 | old CS   |     " - 16
                     |      old EIP       |     " - 20
                     |     error code     |     " - 24 <---- ESP
                     +--------------------+   

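⚠ 🚩: KSTACKTOP is the highest address of the kernel stack, and the stack grows downward toward lower addresses.

Compare what is pushed with the Trapframe structure and you will see that they match.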
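
That correspondence can be checked with offsetof. The sketch below repeats the two structures from inc/trap.h, with uintptr_t replaced by uint32_t (an assumption made here so the layout stays 32-bit even when compiled on a 64-bit host), and computes how many bytes of the Trapframe the hardware fills in. HW_FRAME_NO_ERR and HW_FRAME_WITH_ERR are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct PushRegs {
	uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
	uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
} __attribute__((packed));

struct Trapframe {
	struct PushRegs tf_regs;
	uint16_t tf_es, tf_padding1;
	uint16_t tf_ds, tf_padding2;
	uint32_t tf_trapno;
	/* below here defined by x86 hardware */
	uint32_t tf_err;
	uint32_t tf_eip;     /* uintptr_t in JOS; fixed at 32 bits for this sketch */
	uint16_t tf_cs, tf_padding3;
	uint32_t tf_eflags;
	/* below here only when crossing rings */
	uint32_t tf_esp;
	uint16_t tf_ss, tf_padding4;
} __attribute__((packed));

enum {
	/* old SS, old ESP, old EFLAGS, old CS, old EIP */
	HW_FRAME_NO_ERR   = sizeof(struct Trapframe) - offsetof(struct Trapframe, tf_eip),
	/* the same five words plus the error code */
	HW_FRAME_WITH_ERR = sizeof(struct Trapframe) - offsetof(struct Trapframe, tf_err),
};
```

The two sizes correspond to the "- 20" and "- 24" marks in the stack diagrams. The GDB session earlier agrees as well: tf is 0xf01b2000 and %esp is 0xf01b2030 just before iret, a difference of 0x30 = 48 bytes, which is the offset of tf_eip.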

Nested Exceptions and Interrupts

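The processor switches stacks automatically only when going from user mode to kernel mode. If the exception occurs while the processor is already in the kernel, the CPU just pushes what it needs onto the current stack. This is what lets the kernel handle nested exceptions raised by code inside the kernel itself.

Because the stack does not change, the old SS and ESP registers do not need to be saved.

For exception types that do not push an error code, the kernel stack looks like this on entry to the handler:

                     +--------------------+ <---- old ESP
                     |     old EFLAGS     |     " - 4
                     | 0x00000 | old CS   |     " - 8
                     |      old EIP       |     " - 12
                     +--------------------+             

For exception types that push an error code, the processor pushes the error code immediately after the old EIP, as before.

One caveat: if the processor takes an exception while already in kernel mode and cannot push its old state onto the kernel stack for some reason, such as running out of stack space, there is nothing it can do to recover and it simply resets itself. The kernel has to be designed so that this cannot happen.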

Setting Up the IDT

https://pdos.csail.mit.edu/6.828/2018/lec/x86_idt.pdf (official lecture slides)

Exercise 04

We need to modify trapentry.S and trap.c to create the exception handlers, then build and load the IDT in trap_init().

trapentry.S

  1. Refer to struct Trapframe
inc/trap.h
    
struct PushRegs {
	/* registers as pushed by pusha */
	uint32_t reg_edi;
	uint32_t reg_esi;
	uint32_t reg_ebp;
	uint32_t reg_oesp;		/* Useless */
	uint32_t reg_ebx;
	uint32_t reg_edx;
	uint32_t reg_ecx;
	uint32_t reg_eax;
} __attribute__((packed));
    
struct Trapframe {
	struct PushRegs tf_regs;  // the first member of Trapframe is a PushRegs: eax through edi as pushed by pusha, stored in reverse order (edi first)
	uint16_t tf_es;
	uint16_t tf_padding1;
	uint16_t tf_ds;
	uint16_t tf_padding2;
	uint32_t tf_trapno;
	/* below here defined by x86 hardware */
	uint32_t tf_err;
	uintptr_t tf_eip;
	uint16_t tf_cs;
	uint16_t tf_padding3;
	uint32_t tf_eflags;
	/* below here only when crossing rings, such as from user to kernel */
	uintptr_t tf_esp;
	uint16_t tf_ss;
	uint16_t tf_padding4;
} __attribute__((packed));
  2. Register order (check with info registers)

  3. Read the source: env_pop_tf()
void
env_pop_tf(struct Trapframe *tf)
{
	asm volatile(
		"\tmovl %0,%%esp\n"				// point %esp at the Trapframe
		"\tpopal\n"						// pop tf_regs into the general-purpose registers
		"\tpopl %%es\n"					// pop tf_es into %es
		"\tpopl %%ds\n"					// pop tf_ds into %ds
		"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
		"\tiret\n"						// return from interrupt: pops tf_eip, tf_cs, tf_eflags, tf_esp, tf_ss in that order
		: : "g" (tf) : "memory");
	panic("iret failed");  /* mostly to placate the compiler */
}

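The PushRegs structure holds the general-purpose registers. The first instruction of env_pop_tf() points %esp at tf, so the top of the stack is now the start of the Trapframe, which is exactly a PushRegs structure. popal pops the saved general-purpose registers out of PushRegs, and the next two instructions pop %es and %ds. After the addl skips tf_trapno and tf_err, iret, the interrupt return instruction, pops tf_eip, tf_cs, tf_eflags, tf_esp and tf_ss into the corresponding registers. This is exactly the order of the Trapframe fields from top to bottom.

Result: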
kern/trapentry.S

/*
 * Lab 3: Your code here for generating entry points for the different traps.
 */
TRAPHANDLER_NOEC(th0, 0)
TRAPHANDLER_NOEC(th1, 1)	/* debug exception */
TRAPHANDLER_NOEC(th2, 2)	/* non-maskable interrupt */
TRAPHANDLER_NOEC(th3, 3)
TRAPHANDLER_NOEC(th4, 4)
TRAPHANDLER_NOEC(th5, 5)
TRAPHANDLER_NOEC(th6, 6)
TRAPHANDLER_NOEC(th7, 7)
TRAPHANDLER(th8, 8)
TRAPHANDLER_NOEC(th9, 9)	/* coprocessor segment overrun, reserved on recent processors */
TRAPHANDLER(th10, 10)
TRAPHANDLER(th11, 11)
TRAPHANDLER(th12, 12)
TRAPHANDLER(th13, 13)
TRAPHANDLER(th14, 14)
TRAPHANDLER_NOEC(th15, 15)	/* reserved */
TRAPHANDLER_NOEC(th16, 16)
TRAPHANDLER(th17, 17)
TRAPHANDLER_NOEC(th18, 18)
TRAPHANDLER_NOEC(th19, 19)

TRAPHANDLER_NOEC(th_syscall, T_SYSCALL)


/*
 * Lab 3: Your code here for _alltraps
 */
 
 _alltraps:
 pushl %ds
 pushl %es
 pushal
 movl $GD_KD, %eax
 movw %ax, %ds
 movw %ax, %es
 pushl %esp
 call trap
 popal
 popl %es
 popl %ds
 addl $8, %esp 
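 iret  /* return from interrupt */
  1. Whether a vector pushes an error code comes from lec/x86_idt (the official slides linked above).

  2. Refer to the Trapframe structure in inc/trap.h. tf_ss, tf_esp, tf_eflags, tf_cs, tf_eip and tf_err are pushed by the processor when the interrupt occurs, so _alltraps only has to push the remaining registers (%ds, %es and the general-purpose registers).

    • As described earlier, when an exception is raised the CPU pushes SS through EIP, plus an error code where one applies (TRAPHANDLER_NOEC pushes a 0 in its place), and the macros above have already pushed trapno. Only the remaining registers are left to push.

    • They must be pushed in the reverse order of the Trapframe fields. pushal pushes eax through edi in that order.

    • The instructions after call trap restore the registers in case trap() ever returns.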

trap_init()

void
trap_init(void)
{
	extern struct Segdesc gdt[];

	// LAB 3: Your code here.
	void th0();
	void th1();
	void th2();
	void th3();
	void th4();
	void th5();
	void th6();
	void th7();
	void th8();
	void th9();
	void th10();
	void th11();
	void th12();
	void th13();
	void th14();
	void th16();
	void th17();
	void th18();
	void th19();
	void th_syscall();
	SETGATE(idt[0], 0, GD_KT, th0, 0);	// SETGATE(gate, istrap, sel, off, dpl) is defined in inc/mmu.h
	SETGATE(idt[1], 0, GD_KT, th1, 0);	// idt[1]: selector = kernel code segment, offset = th1
	SETGATE(idt[2], 0, GD_KT, th2, 0);
	SETGATE(idt[3], 0, GD_KT, th3, 3);	// breakpoint: user code may raise it, so dpl is 3
	SETGATE(idt[4], 0, GD_KT, th4, 0);
	SETGATE(idt[5], 0, GD_KT, th5, 0);
	SETGATE(idt[6], 0, GD_KT, th6, 0);
	SETGATE(idt[7], 0, GD_KT, th7, 0);
	SETGATE(idt[8], 0, GD_KT, th8, 0);
	SETGATE(idt[9], 0, GD_KT, th9, 0);
	SETGATE(idt[10], 0, GD_KT, th10, 0);
	SETGATE(idt[11], 0, GD_KT, th11, 0);
	SETGATE(idt[12], 0, GD_KT, th12, 0);
	SETGATE(idt[13], 0, GD_KT, th13, 0);
	SETGATE(idt[14], 0, GD_KT, th14, 0);
	SETGATE(idt[16], 0, GD_KT, th16, 0);
	SETGATE(idt[17], 0, GD_KT, th17, 0);
	SETGATE(idt[18], 0, GD_KT, th18, 0);
	SETGATE(idt[19], 0, GD_KT, th19, 0);

	SETGATE(idt[T_SYSCALL], 0, GD_KT, th_syscall, 3);	// dpl 3 so user code can invoke it with int $0x30; see 《x86汇编语言-从实模式到保护模式》, p. 345


	// Per-CPU setup 
	trap_init_percpu();
}

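→ About SETGATE: #define SETGATE(gate, istrap, sel, off, dpl)

  • istrap: 1 for a trap (= exception) gate, 0 for an interrupt gate.
  • sel: code segment selector for the interrupt/trap handler
  • off: offset in the code segment for the interrupt/trap handler
  • dpl: descriptor privilege level, the privilege level required for software to invoke this gate explicitly with an int instruction

→ Note that th3 handles the breakpoint exception, which ordinary user code is allowed to raise, so its gate needs dpl = 3.

Part A passes!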

Question

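  1. What is the purpose of having an individual handler function for each exception/interrupt? (i.e., if all exceptions/interrupts were delivered to the same handler, what feature that exists in the current implementation could not be provided?)

Different interrupts and exceptions need different handling: whether execution can resume and where, the privilege level required to raise them, whether the processor pushes an error code, and so on. The processor does not push the vector number, so with a single shared entry point the kernel could not tell which vector fired, and it could not pad the vectors that lack an error code to give every trap the same stack layout.

  2. Did you have to do anything to make the user/softint program behave correctly? The grade script expects it to produce a general protection fault (trap 13), but softint’s code says int $14. Why should this produce interrupt vector 13? What happens if the kernel actually allows softint’s int $14 instruction to invoke the kernel’s page fault handler (which is interrupt vector 14)?

The IDT descriptor for vector 14 has DPL = 0, so only ring 0 code may raise it with an int instruction. softint runs with CPL = 3, which is insufficient privilege, so the int $14 instruction raises trap 13 (general protection fault) instead. If the kernel did allow it, user code could enter the page fault handler without a real page fault: a software interrupt pushes no error code and CR2 holds no meaningful faulting address, so the handler would be working from a bogus stack frame.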
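
The privilege check behind this answer fits in a few lines. The sketch below is a simplified model, assuming the only rule that matters is "a software int is allowed when CPL <= gate DPL, otherwise the processor raises a general protection fault". The T_* values match inc/trap.h; demo_fill_dpl and soft_int_vector are hypothetical helpers, and the table mirrors the DPLs chosen in trap_init() above.

```c
enum { T_BRKPT = 3, T_GPFLT = 13, T_PGFLT = 14, T_SYSCALL = 48 };

/* Hypothetical helper: DPL of every gate as set up in trap_init(). */
static inline void demo_fill_dpl(unsigned char gate_dpl[256])
{
	for (int i = 0; i < 256; i++)
		gate_dpl[i] = 0;           /* kernel-only by default */
	gate_dpl[T_BRKPT] = 3;         /* user code may use int3 */
	gate_dpl[T_SYSCALL] = 3;       /* user code may use int $0x30 */
}

/*
 * Hypothetical helper: the vector actually delivered when code running
 * at privilege level cpl executes "int n".
 */
static inline int soft_int_vector(unsigned char n, unsigned cpl,
                                  const unsigned char gate_dpl[256])
{
	return cpl > gate_dpl[n] ? T_GPFLT : n;
}
```

With this table, int $14 from ring 3 turns into vector 13, while int $0x30 from ring 3 is delivered as vector 48, which matches what the grade script expects from softint.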


posted @ 2020-08-18 15:01  Cindy's