MIT_JOS_Lab4_PartB_and_PartC
Copy-on-Write Fork
As mentioned earlier, Unix provides the fork() system call as its primary process creation primitive. The fork() system call copies the address space of the calling process (the parent) to create a new process (the child).
xv6 Unix implements fork() by copying all data from the parent's pages into new pages allocated for the child. This is essentially the same approach that dumbfork() takes, using system calls such as sys_page_alloc and sys_page_map. However, a call to fork() is frequently followed almost immediately by a call to exec() in the child process, which replaces the child's memory with a new program. In that case, most of the time spent copying the parent's address space is wasted, because the child uses very little of its memory before calling exec().
Copying everything up front is therefore expensive and mostly unnecessary. A better approach is copy-on-write. When we create a new process, we copy only the address-space mappings (the page directory and page table entries), not the contents of the mapped pages, so parent and child share the same physical pages. At the same time, we mark the shared pages read-only in both processes. When either process tries to write to one of these shared pages, it takes a page fault. The page fault handler then allocates a new private page for the faulting process, copies the contents of the shared page into it, and maps it writable, so the write can proceed.
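The idea can be tried out on a host machine with a small model. The sketch below is not JOS code: struct page, struct mapping, map_new(), cow_share() and cow_write() are hypothetical names invented for illustration, and a "page" is only 16 bytes. It assumes a reference count per page and copies the page on the first write through a copy-on-write mapping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 16   /* tiny "page" size, for illustration only */

/* A physical page shared by reference counting (hypothetical model). */
struct page {
    int refs;
    char data[PAGE_BYTES];
};

/* One address-space slot: the page it maps and whether a write must copy first. */
struct mapping {
    struct page *pg;
    int cow;
};

/* Create a fresh private page holding `text` (truncated to fit). */
struct mapping map_new(const char *text)
{
    struct mapping m;
    m.pg = calloc(1, sizeof(*m.pg));
    assert(m.pg != NULL);
    m.pg->refs = 1;
    strncpy(m.pg->data, text, PAGE_BYTES - 1);
    m.cow = 0;
    return m;
}

/* "fork": share the page instead of copying it; both sides become copy-on-write. */
struct mapping cow_share(struct mapping *parent)
{
    struct mapping child;
    parent->cow = 1;
    parent->pg->refs++;
    child.pg = parent->pg;
    child.cow = 1;
    return child;
}

/* Write one byte; a write through a COW mapping first takes a private copy. */
void cow_write(struct mapping *m, size_t off, char c)
{
    assert(off < PAGE_BYTES);
    if (m->cow) {
        struct page *old = m->pg;
        struct page *copy = malloc(sizeof(*copy));
        assert(copy != NULL);
        memcpy(copy->data, old->data, PAGE_BYTES);
        copy->refs = 1;
        if (--old->refs == 0)
            free(old);          /* nobody else maps the old page any more */
        m->pg = copy;
        m->cow = 0;
    }
    m->pg->data[off] = c;
}
```

After cow_share(), parent and child point at the same page; the first cow_write() on either side gives that side its own copy and leaves the other side's data untouched.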
User-level page fault handling
It's common to set up an address space so that page faults indicate when some action needs to take place. For example, most Unix kernels initially map only a single page in a new process's stack region, and allocate and map additional stack pages later "on demand" as the process's stack consumption increases and causes page faults on stack addresses that are not yet mapped. A typical Unix kernel must keep track of what action to take when a page fault occurs in each region of a process's space.
This is a lot of information for the kernel to keep track of. Instead of taking the traditional Unix approach, we will decide what to do about each page fault in user space, where bugs are less damaging. This design has the added benefit of allowing programs great flexibility in defining their memory regions; you'll use user-level page fault handling later for mapping and accessing files on a disk-based file system.
Setting the Page Fault Handler
In Lab 3, we did not handle user-level page faults, and for kernel page faults we simply panic. Now we need to handle page faults that occur in user environments. The user environment registers its page fault entrypoint via the new sys_env_set_pgfault_upcall system call. A new member of the Env structure, env_pgfault_upcall, records this information, so when a page fault occurs in an Env, the kernel can find the entry address of the page fault handler in Env->env_pgfault_upcall. An Env's env_pgfault_upcall must be set before a page fault occurs, so fork() calls sys_env_set_pgfault_upcall when it creates a new process.
Exercise 8
// Set the page fault upcall for 'envid' by modifying the corresponding struct
// Env's 'env_pgfault_upcall' field. When 'envid' causes a page fault, the
// kernel will push a fault record onto the exception stack, then branch to
// 'func'.
//
// Returns 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
static int
sys_env_set_pgfault_upcall(envid_t envid, void *func)
{
// LAB 4: Your code here.
struct Env *e;
int r;
if ((r = envid2env(envid, &e, 1)) != 0) {
return r;
}
// func is the address of the user-level page fault entry point (_pgfault_upcall)
e->env_pgfault_upcall = func;
return 0;
// panic("sys_env_set_pgfault_upcall not implemented");
}
Normal and Exception Stacks in User Environments
When an exception or interrupt occurs in JOS, we trap into the kernel. The kernel code must run on the kernel stack, whose location the processor reads from the TSS; each CPU has its own TSS and kernel stack. This kernel entry sequence always happens first, before any user-level page fault handling. A user program has a similar dedicated stack for its handler: the JOS user exception stack. It is also one page in size, and its top is defined to be at virtual address UXSTACKTOP, so the valid bytes of the user exception stack are from UXSTACKTOP-PGSIZE through UXSTACKTOP-1 inclusive. While running on this exception stack, the user-level page fault handler can use JOS's regular system calls to map new pages or adjust mappings so as to fix whatever problem originally caused the page fault. Then the user-level page fault handler returns, via an assembly language stub, to the faulting code on the original stack.
Invoking the User Page Fault Handler
We will now need to change the page fault handling code in kern/trap.c to handle page faults from user mode as follows. We will call the state of the user environment at the time of the fault the trap-time state. This term is new in this lab.
If there is no page fault handler registered, the JOS kernel destroys the user environment with a message as before. Otherwise, the kernel sets up a trap frame on the exception stack that looks like a struct UTrapframe from inc/trap.h.
The kernel then re-enters user mode and runs the page fault handler on the exception stack. After the handler finishes, execution returns to the point where the page fault occurred and continues. If the user program is already running its page fault handler when another page fault occurs, that is, the fault happens while on the exception stack, the kernel must first leave an empty 32-bit word below the current stack top and only then push the next struct UTrapframe. To tell whether a fault occurred on the exception stack, we only need to check the trap-time %esp: if it lies within the exception stack range, the fault is recursive.
Before:
   Previous Frame                 User Trap Frame
+------------------+            +------------------+
| stack data       |      +---- | trap-time esp    |
| ...              |      |     | trap-time eflags |
+------------------+ <----+     | trap-time eip    |
                                | trap-time eax    |
                                | ...              |
                                | trap-time esi    |
                                | trap-time edi    |
                                | tf_err           |
                                | fault_va         |
                                +------------------+ <-- %esp
After:
   Previous Frame                 User Trap Frame
+------------------+            +------------------+
| stack data       |      +---- | trap-time esp-4  |
| ...              |      |     | trap-time eflags |
| trap-time eip    |      |     | trap-time eip    |
+------------------+ <----+     | trap-time eax    |
                                | ...              |
                                | trap-time esi    |
                                | trap-time edi    |
                                | tf_err           |
                                | fault_va         |
                                +------------------+ <-- %esp
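The rule for where the kernel places a new frame can be checked with plain integer arithmetic. The sketch below is a host-side model, not kernel code: on_exception_stack() and next_utf_addr() are hypothetical helper names, and the constants restate what I believe are the 32-bit JOS values of PGSIZE, UXSTACKTOP and sizeof(struct UTrapframe), so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror JOS's inc/memlayout.h and inc/trap.h on 32-bit x86. */
#define PGSIZE      4096u
#define UXSTACKTOP  0xeec00000u
#define UTF_SIZE    52u   /* fault_va, err, 8 registers, eip, eflags, esp: 13 words */

/* Nonzero if `esp` already points into the user exception stack page. */
int on_exception_stack(uint32_t esp)
{
    return esp >= UXSTACKTOP - PGSIZE && esp < UXSTACKTOP;
}

/* Address at which the next UTrapframe would be placed. */
uint32_t next_utf_addr(uint32_t trap_esp)
{
    assert(UTF_SIZE % 4 == 0);
    if (on_exception_stack(trap_esp))
        return trap_esp - 4 - UTF_SIZE;   /* recursive fault: leave one scratch word */
    return UXSTACKTOP - UTF_SIZE;          /* first fault: start at the top */
}
```

For a fault taken on the normal user stack the frame starts at UXSTACKTOP - 52; for a fault taken while that frame is the stack top, the next frame starts 56 bytes lower, the extra 4 bytes being the scratch word.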
Exercise 9
void page_fault_handler(struct Trapframe *tf)
{
uint32_t fault_va;
// Read processor's CR2 register to find the faulting address
fault_va = rcr2();
// Handle kernel-mode page faults.
// LAB 3: Your code here.
if ((tf->tf_cs & 0x3) == 0) {
panic("page fault in kernel mode!");
}
// We've already handled kernel-mode exceptions, so if we get here,
// the page fault happened in user mode.
// Call the environment's page fault upcall, if one exists. Set up a
// page fault stack frame on the user exception stack (below
// UXSTACKTOP), then branch to curenv->env_pgfault_upcall.
//
// The page fault upcall might cause another page fault, in which case
// we branch to the page fault upcall recursively, pushing another
// page fault stack frame on top of the user exception stack.
//
// It is convenient for our code which returns from a page fault
// (lib/pfentry.S) to have one word of scratch space at the top of the
// trap-time stack; it allows us to more easily restore the eip/esp. In
// the non-recursive case, we don't have to worry about this because
// the top of the regular user stack is free. In the recursive case,
// this means we have to leave an extra word between the current top of
// the exception stack and the new stack frame because the exception
// stack _is_ the trap-time stack.
//
// If there's no page fault upcall, the environment didn't allocate a
// page for its exception stack or can't write to it, or the exception
// stack overflows, then destroy the environment that caused the fault.
// Note that the grade script assumes you will first check for the page
// fault upcall and print the "user fault va" message below if there is
// none. The remaining three checks can be combined into a single test.
//
// Hints:
// user_mem_assert() and env_run() are useful here.
// To change what the user environment runs, modify 'curenv->env_tf'
// (the 'tf' variable points at 'curenv->env_tf').
// LAB 4: Your code here.
if (curenv->env_pgfault_upcall) {
struct UTrapframe *utf;
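// Determine where the new frame goes
if (tf->tf_esp >= UXSTACKTOP - PGSIZE && tf->tf_esp < UXSTACKTOP) {
// Recursive fault: leave one empty 32-bit scratch word below the trap-time esp.
// Nothing is written to user memory until user_mem_assert() has checked it.
utf = (struct UTrapframe *)(tf->tf_esp - 4 - sizeof(struct UTrapframe));
} else {
utf = (struct UTrapframe *)(UXSTACKTOP - sizeof(struct UTrapframe));
}
// Check permission; this destroys the environment if the exception stack
// is missing, not writable, or has overflowed
user_mem_assert(curenv, (void *)utf, sizeof(struct UTrapframe), PTE_W | PTE_U);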
// Set up the user trap frame
utf->utf_esp = tf->tf_esp;
utf->utf_eflags = tf->tf_eflags;
utf->utf_eip = tf->tf_eip;
utf->utf_regs = tf->tf_regs;
utf->utf_err = tf->tf_err;
utf->utf_fault_va = fault_va;
// Switch the environment
tf->tf_esp = (uint32_t)utf;
tf->tf_eip = (uint32_t)curenv->env_pgfault_upcall;
env_run(curenv);
}
// Destroy the environment that caused the fault.
cprintf("[%08x] user fault va %08x ip %08x\n",
curenv->env_id, fault_va, tf->tf_eip);
print_trapframe(tf);
env_destroy(curenv);
}
User-mode Page Fault Entrypoint
Exercise 10
Implement the _pgfault_upcall routine in lib/pfentry.S. The interesting part is returning to the original point in the user code that caused the page fault. You'll return directly there, without going back through the kernel. The hard part is simultaneously switching stacks and re-loading the EIP.
.text
.globl _pgfault_upcall
_pgfault_upcall:
// Call the C page fault handler.
pushl %esp // function argument: pointer to UTF
movl _pgfault_handler, %eax
call *%eax
addl $4, %esp // pop function argument
// Now the C page fault handler has returned and you must return
// to the trap time state.
// Push trap-time %eip onto the trap-time stack.
//
// Explanation:
// We must prepare the trap-time stack for our eventual return to
// re-execute the instruction that faulted.
// Unfortunately, we can't return directly from the exception stack:
// We can't call 'jmp', since that requires that we load the address
// into a register, and all registers must have their trap-time
// values after the return.
// We can't call 'ret' from the exception stack either, since if we
// did, %esp would have the wrong value.
// So instead, we push the trap-time %eip onto the *trap-time* stack!
// Below we'll switch to that stack and call 'ret', which will
// restore %eip to its pre-fault value.
//
// In the case of a recursive fault on the exception stack,
// note that the word we're pushing now will fit in the
// blank word that the kernel reserved for us.
//
// Throughout the remaining code, think carefully about what
// registers are available for intermediate calculations. You
// may find that you have to rearrange your code in non-obvious
// ways as registers become unavailable as scratch space.
//
// LAB 4: Your code here.
movl 0x30(%esp), %ecx // save trap-time esp in ecx
subl $4, %ecx // enlarge the previous stack for 4 bytes
movl %ecx, 0x30(%esp) // write the modified esp back
movl 0x28(%esp), %edx // save trap-time eip in edx
movl %edx, (%ecx) // save eip at new esp for return
// Restore the trap-time registers. After you do this, you
// can no longer modify any general-purpose registers.
// LAB 4: Your code here.
addl $8, %esp // skip fault_va and tf_err
popal // pop PushRegs
// Restore eflags from the stack. After you do this, you can
// no longer use arithmetic operations or anything else that
// modifies eflags.
// LAB 4: Your code here.
addl $4, %esp // skip eip
popfl
// Switch back to the adjusted trap-time stack.
// LAB 4: Your code here.
pop %esp
// Return to re-execute the instruction that faulted.
// LAB 4: Your code here.
ret
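The constants 0x28 and 0x30 used in the stub are the byte offsets of the saved eip and esp inside the frame. A quick way to convince yourself is to restate the layout and ask the compiler. The struct definitions below are my restatement of what inc/trap.h declares (an assumption, not a copy), and the three helper functions are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers in the order pushal leaves them in memory (lowest address first). */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

/* Frame the kernel builds on the exception stack; %esp points at utf_fault_va. */
struct UTrapframe {
    uint32_t utf_fault_va;
    uint32_t utf_err;
    struct PushRegs utf_regs;
    uint32_t utf_eip;
    uint32_t utf_eflags;
    uint32_t utf_esp;
};

size_t utf_eip_offset(void)    { return offsetof(struct UTrapframe, utf_eip); }
size_t utf_eflags_offset(void) { return offsetof(struct UTrapframe, utf_eflags); }
size_t utf_esp_offset(void)    { return offsetof(struct UTrapframe, utf_esp); }
```

This also explains the later steps: skipping 8 bytes passes fault_va and err, popal consumes the 32 bytes of registers, skipping 4 more bytes passes eip, and popfl and the final pop then find eflags and esp.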
Exercise 11
// Set the page fault handler function.
// If there isn't one yet, _pgfault_handler will be 0.
// The first time we register a handler, we need to
// allocate an exception stack (one page of memory with its top
// at UXSTACKTOP), and tell the kernel to call the assembly-language
// _pgfault_upcall routine when a page fault occurs.
//
void set_pgfault_handler(void (*handler)(struct UTrapframe *utf))
{
int r;
if (_pgfault_handler == 0) {
// First time through!
// LAB 4: Your code here.
// first allocate a page just below UXSTACKTOP to serve as the exception stack
// for any page fault that may happen in this (calling) environment
if ((r = sys_page_alloc(thisenv->env_id, (void *)(UXSTACKTOP - PGSIZE), PTE_W | PTE_U | PTE_P)) != 0) {
panic("set_pgfault_handler: %e", r);
}
// register the assembly-language page fault entrypoint defined in lib/pfentry.S
if ((r = sys_env_set_pgfault_upcall(thisenv->env_id, _pgfault_upcall)) != 0) {
panic("set_pgfault_handler: %e", r);
}
// panic("set_pgfault_handler not implemented");
}
// Save handler pointer for assembly to call. Note that _pgfault_upcall (the assembly stub)
// and _pgfault_handler are different things: _pgfault_handler is a global function pointer variable
_pgfault_handler = handler;
}
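The steps of handling a user page fault can be summarized as follows: (1) the environment registers a handler with set_pgfault_handler(), which allocates the exception stack and installs _pgfault_upcall; (2) on a fault, the kernel builds a UTrapframe on the exception stack and resumes the environment at _pgfault_upcall; (3) _pgfault_upcall calls the C handler stored in _pgfault_handler; (4) the stub restores the trap-time state and returns to the faulting instruction.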
Implementing Copy-on-Write Fork
We now have the kernel facilities to implement copy-on-write fork() entirely in user space. We have provided a skeleton for your fork() in lib/fork.c. Like dumbfork(), fork() should create a new environment, then scan through the parent environment's entire address space and set up corresponding page mappings in the child. The key difference is that, while dumbfork() copied pages, fork() will initially only copy page mappings. fork() will copy each page only when one of the environments tries to write it. In other words, we only copy the page directory and page table entries, and we change the permissions of the shared mappings in both environments.
The basic control flow for fork() is as follows:
1. The parent installs pgfault() as the C-level page fault handler, using the set_pgfault_handler() function you implemented above. This is the first step of the page fault flow described above.
2. The parent calls sys_exofork() to create a child environment. This step uses the system call implemented in Part A.
3. For each writable or copy-on-write page in its address space below UTOP, the parent calls duppage, which should map the page copy-on-write into the address space of the child and then remap the page copy-on-write in its own address space. [ Note: The ordering here (i.e., marking a page as COW in the child before marking it in the parent) actually matters! Can you see why? Try to think of a specific case where reversing the order could cause trouble. ] duppage sets both PTEs so that the page is not writeable, and to contain PTE_COW in the "avail" field to distinguish copy-on-write pages from genuine read-only pages. It helps to keep in mind where the child's page directory and page tables come from: sys_exofork() creates the child's page directory, and each sys_page_map() call made by duppage fills in the child's page table entries. The exception stack is not remapped this way; instead, fork() allocates a fresh page in the child for the exception stack with sys_page_alloc().
4. The parent sets the user page fault entrypoint for the child to look like its own, using sys_env_set_pgfault_upcall().
5. The child is now ready to run, so the parent marks it runnable.
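The permission decision in step 3 is small enough to isolate. The sketch below is a stand-alone model rather than the lab's duppage: dup_perm() and parent_needs_remap() are hypothetical helpers, and the bit values restate what I believe are the x86 PTE bits and the PTE_COW value used in lib/fork.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86 page table entry bits; PTE_COW is one of the "avail" bits. */
#define PTE_P   0x001u
#define PTE_W   0x002u
#define PTE_U   0x004u
#define PTE_COW 0x800u

/* Permissions the child's mapping should get for a page with entry `pte`. */
uint32_t dup_perm(uint32_t pte)
{
    assert(pte & PTE_P);                    /* only present pages are duplicated */
    if (pte & (PTE_W | PTE_COW))
        return PTE_P | PTE_U | PTE_COW;     /* writable or already COW: share as COW */
    return PTE_P | PTE_U;                   /* genuinely read-only: share read-only */
}

/* Nonzero if the parent must also remap its own page as COW. */
int parent_needs_remap(uint32_t pte)
{
    return (pte & (PTE_W | PTE_COW)) != 0;
}
```

Note that the result never contains PTE_W: after duppage, neither side may write a shared page without faulting first.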
Exercise 12
Implement fork, duppage and pgfault in lib/fork.c. These functions are not hard to write once the steps above are clear.
envid_t fork(void)
{
// LAB 4: Your code here.
envid_t envid;
uint32_t addr;
int r;
// install pgfault() as the page fault handler
set_pgfault_handler(pgfault);
// construct a new env
envid = sys_exofork();
if (envid < 0) {
panic("sys_exofork: %e", envid);
}
if (envid == 0) {
// fix thisenv in child
thisenv = &envs[ENVX(sys_getenvid())];
return 0;
}
// copy the address space mappings to the child; only page table entries are copied, not page contents
for (addr = 0; addr < USTACKTOP; addr += PGSIZE) {
if ((uvpd[PDX(addr)] & PTE_P) == PTE_P && (uvpt[PGNUM(addr)] & PTE_P) == PTE_P) {
duppage(envid, PGNUM(addr));
}
}
// declare the assembly entry point defined in lib/pfentry.S
extern void _pgfault_upcall(void);
// allocate new page for child's user exception stack
if ((r = sys_page_alloc(envid, (void *)(UXSTACKTOP - PGSIZE), PTE_W | PTE_U | PTE_P)) != 0) {
panic("fork: %e", r);
}
// set the _pgfault_upcall address and save it in Env->env_pgfault_upcall
if ((r = sys_env_set_pgfault_upcall(envid, _pgfault_upcall)) != 0) {
panic("fork: %e", r);
}
// mark the child as runnable
if ((r = sys_env_set_status(envid, ENV_RUNNABLE)) != 0)
panic("fork: %e", r);
return envid;
}
static int duppage(envid_t envid, unsigned pn)
{
int r;
// get the parent's (this environment's) envid; the mapping system calls need it
envid_t parent_envid = sys_getenvid();
void *va = (void *)(pn * PGSIZE);
// map writable or copy-on-write pages as PTE_COW, first in the child and then in the parent
if ((uvpt[pn] & PTE_W) == PTE_W || (uvpt[pn] & PTE_COW) == PTE_COW) {
if ((r = sys_page_map(parent_envid, va, envid, va, PTE_COW | PTE_U | PTE_P)) != 0) {
panic("duppage: %e", r);
}
if ((r = sys_page_map(parent_envid, va, parent_envid, va, PTE_COW | PTE_U | PTE_P)) != 0) {
panic("duppage: %e", r);
}
} else {
if ((r = sys_page_map(parent_envid, va, envid, va, PTE_U | PTE_P)) != 0) {
panic("duppage: %e", r);
}
}
// LAB 4: Your code here.
return 0;
}
static void pgfault(struct UTrapframe *utf)
{
void *addr = (void *) utf->utf_fault_va;
uint32_t err = utf->utf_err;
int r;
// Check that the faulting access was (1) a write, and (2) to a
// copy-on-write page. If not, panic.
// Hint:
// Use the read-only page table mappings at uvpt
// (see <inc/memlayout.h>).
// LAB 4: Your code here.
// Allocate a new page, map it at a temporary location (PFTEMP),
// copy the data from the old page to the new page, then move the new
// page to the old page's address.
// Hint:
// You should make three system calls.
// LAB 4: Your code here.
pte_t pte = uvpt[PGNUM(addr)];
envid_t envid = sys_getenvid();
if ((err & FEC_WR) == 0 || (pte & PTE_COW) == 0) {
panic("pgfault: bad faulting access\n");
}
// Allocate a fresh page and map it at the temporary address PFTEMP
if ((r = sys_page_alloc(envid, PFTEMP, PTE_W | PTE_U | PTE_P)) != 0) {
panic("pgfault: %e", r);
}
// Copy the old page into the new one while addr still maps the old page
memcpy(PFTEMP, ROUNDDOWN(addr, PGSIZE), PGSIZE);
// Remap addr to the new page, now writable; this replaces the old copy-on-write mapping
if ((r = sys_page_map(envid, PFTEMP, envid, ROUNDDOWN(addr, PGSIZE), PTE_W | PTE_U | PTE_P)) != 0) {
panic("pgfault: %e", r);
}
// Unmap PFTEMP; it was only a temporary mapping
if ((r = sys_page_unmap(envid, PFTEMP)) != 0) {
panic("pgfault: %e", r);
}
// panic("pgfault not implemented");
}
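In summary, pgfault() allocates a fresh page at PFTEMP, copies the faulting page into it, maps the new page over the faulting address with write permission, and finally unmaps PFTEMP.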
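The order of these steps matters: the copy has to happen while the faulting address still maps the shared page, otherwise the old contents are lost. The sketch below models this outside JOS. struct aspace, TEMP_SLOT and resolve_cow() are hypothetical names; an address space is just a small array of pointers, and TEMP_SLOT plays the role of PFTEMP.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 8
#define NSLOTS     4
#define TEMP_SLOT  3   /* plays the role of PFTEMP */

/* Toy address space: each slot maps one "page" or nothing. */
struct aspace {
    char *slot[NSLOTS];
};

/* Resolve a copy-on-write fault on slot `va`.
 * Returns the shared page that was mapped there before. */
char *resolve_cow(struct aspace *as, int va)
{
    assert(va >= 0 && va < NSLOTS && va != TEMP_SLOT);
    char *shared = as->slot[va];
    assert(shared != NULL);

    /* 1. allocate a fresh page at the temporary slot */
    as->slot[TEMP_SLOT] = calloc(PAGE_BYTES, 1);
    assert(as->slot[TEMP_SLOT] != NULL);

    /* 2. copy while `va` still maps the shared page */
    memcpy(as->slot[TEMP_SLOT], as->slot[va], PAGE_BYTES);

    /* 3. remap `va` to the private copy */
    as->slot[va] = as->slot[TEMP_SLOT];

    /* 4. drop the temporary mapping */
    as->slot[TEMP_SLOT] = NULL;
    return shared;
}
```

Swapping steps 2 and 3 would make source and destination of the copy the same fresh, zero-filled page, which is exactly the mistake to watch for in pgfault().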
Part C: Preemptive Multitasking and Inter-Process communication
Run the user/spin test program. This test program forks off a child environment, which simply spins forever in a tight loop once it receives control of the CPU. Neither the parent environment nor the kernel ever regains the CPU. This is obviously not an ideal situation in terms of protecting the system from bugs or malicious code in user-mode environments, because any user-mode environment can bring the whole system to a halt simply by getting into an infinite loop and never giving back the CPU. In order to allow the kernel to preempt a running environment, forcefully retaking control of the CPU from it, we must extend the JOS kernel to support external hardware interrupts from the clock hardware.
Clock Interrupts and Preemption
Interrupt discipline
External interrupts (i.e., device interrupts) are referred to as IRQs. There are 16 possible IRQs, numbered 0 through 15. The mapping from IRQ number to IDT entry is not fixed. pic_init in picirq.c maps IRQs 0-15 to IDT entries IRQ_OFFSET through IRQ_OFFSET+15.
In inc/trap.h, IRQ_OFFSET is defined to be decimal 32. Thus the IDT entries 32-47 correspond to the IRQs 0-15. For example, the clock interrupt is IRQ 0. Thus, IDT[IRQ_OFFSET+0] (i.e., IDT[32]) contains the address of the clock's interrupt handler routine in the kernel. This IRQ_OFFSET is chosen so that the device interrupts do not overlap with the processor exceptions, which could obviously cause confusion. External device interrupts are always disabled when in the kernel (and, like xv6, enabled when in user space). In JOS, external interrupts are controlled by the FL_IF flag bit of the %eflags register (see inc/mmu.h). When this bit is set, external interrupts are enabled. While the bit can be modified in several ways, because of our simplification, we will handle it solely through the process of saving and restoring the %eflags register as we enter and leave user mode.
We will have to ensure that the FL_IF flag is set in user environments when they run so that when an interrupt arrives, it gets passed through to the processor and handled by your interrupt code. Otherwise, interrupts are masked, or ignored until interrupts are re-enabled. We masked interrupts with the very first instruction of the bootloader, and so far we have never gotten around to re-enabling them.
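The two pieces of arithmetic in this discussion, the IRQ-to-vector mapping and the FL_IF bit, can be written down as tiny helpers. This is only an illustrative sketch: the function names are hypothetical, IRQ_OFFSET is the value quoted above, and FL_IF is restated as bit 9 of %eflags, which is my understanding of the x86 definition.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_OFFSET 32            /* value quoted from inc/trap.h above */
#define FL_IF      0x00000200u   /* interrupt-enable flag, bit 9 of %eflags */

/* IDT vector used for hardware IRQ `irq` (0 through 15). */
int irq_to_vector(int irq)
{
    assert(irq >= 0 && irq < 16);
    return IRQ_OFFSET + irq;
}

/* Nonzero if trap number `trapno` falls in the external IRQ range. */
int is_external_irq(int trapno)
{
    return trapno >= IRQ_OFFSET && trapno < IRQ_OFFSET + 16;
}

/* %eflags value for an environment that should run with interrupts enabled. */
uint32_t user_eflags(uint32_t eflags)
{
    return eflags | FL_IF;
}

/* Nonzero if external interrupts are enabled in `eflags`. */
int interrupts_enabled(uint32_t eflags)
{
    return (eflags & FL_IF) != 0;
}
```

The timer, IRQ 0, therefore arrives as trap number 32, while a processor exception such as the page fault (14) stays outside the IRQ range.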
Exercise 13
For kern/trapentry.S and kern/trap.c, the pattern is the same as for the other traps: define the entry points in trapentry.S, then install them in the IDT in trap.c.
// in trapentry.S
TRAPHANDLER_NOEC(th_irq_timer, IRQ_OFFSET + IRQ_TIMER)
TRAPHANDLER_NOEC(th_irq_kbd, IRQ_OFFSET + IRQ_KBD)
TRAPHANDLER_NOEC(th_irq_serial, IRQ_OFFSET + IRQ_SERIAL)
TRAPHANDLER_NOEC(th_irq_spurious, IRQ_OFFSET + IRQ_SPURIOUS)
TRAPHANDLER_NOEC(th_irq_ide, IRQ_OFFSET + IRQ_IDE)
TRAPHANDLER_NOEC(th_irq_error, IRQ_OFFSET + IRQ_ERROR)
// in trap.c: declare the handlers defined in trapentry.S
void th_irq_timer();
void th_irq_kbd();
void th_irq_serial();
void th_irq_spurious();
void th_irq_ide();
void th_irq_error();
// in trap_init(): install the IDT entries as interrupt gates
SETGATE(idt[IRQ_OFFSET + IRQ_TIMER], 0, GD_KT, &th_irq_timer, 0);
SETGATE(idt[IRQ_OFFSET + IRQ_KBD], 0, GD_KT, &th_irq_kbd, 0);
SETGATE(idt[IRQ_OFFSET + IRQ_SERIAL], 0, GD_KT, &th_irq_serial, 0);
SETGATE(idt[IRQ_OFFSET + IRQ_SPURIOUS], 0, GD_KT, &th_irq_spurious, 0);
SETGATE(idt[IRQ_OFFSET + IRQ_IDE], 0, GD_KT, &th_irq_ide, 0);
SETGATE(idt[IRQ_OFFSET + IRQ_ERROR], 0, GD_KT, &th_irq_error, 0);
In env_alloc(), when we allocate an Env, we set the FL_IF bit in its saved %eflags so that interrupts are enabled while it runs in user mode. When an interrupt arrives while the Env is running, the processor switches to the kernel and executes the interrupt handler.
// Enable interrupts while in user mode.
// LAB 4: Your code here.
e->env_tf.tf_eflags |= FL_IF;
Handling Clock Interrupts
When a clock interrupt occurs, the processor traps into the kernel. We need to modify trap_dispatch() to recognize the timer interrupt by its trap number and handle it.
Exercise 14
// Handle clock interrupts. Don't forget to acknowledge the
// interrupt using lapic_eoi() before calling the scheduler!
// LAB 4: Your code here.
if (tf->tf_trapno == IRQ_OFFSET + IRQ_TIMER) {
// Acknowledge interrupt.
lapic_eoi();
// Choose a user environment to run and run it.
sched_yield();
return;
}
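What sched_yield() does after the timer fires is a round-robin search, which can be modeled on its own. The sketch below is a simplified stand-in, not the kernel's scheduler: pick_next() is a hypothetical function, the status enum is reduced to four values, and environments are identified by array index.

```c
#include <assert.h>

/* Simplified environment states (assumed subset, for illustration). */
enum env_status { ENV_FREE, ENV_RUNNABLE, ENV_RUNNING, ENV_NOT_RUNNABLE };

/* Round-robin choice: search circularly starting just after `cur`
 * (or from index 0 if cur is -1) for a runnable environment.
 * Falls back to `cur` if it is still running; returns -1 if nothing can run. */
int pick_next(const enum env_status *st, int n, int cur)
{
    assert(n > 0 && cur >= -1 && cur < n);
    int start = (cur < 0) ? 0 : (cur + 1) % n;
    for (int i = 0; i < n; i++) {
        int j = (start + i) % n;
        if (st[j] == ENV_RUNNABLE)
            return j;
    }
    if (cur >= 0 && st[cur] == ENV_RUNNING)
        return cur;
    return -1;
}
```

With preemption, a spinning environment is interrupted by the clock, and the search moves on to the next runnable environment instead of letting the spinner keep the CPU forever.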
Inter-Process communication (IPC)
We've been focusing on the isolation aspects of the operating system, the ways it provides the illusion that each program has a machine all to itself. Another important service of an operating system is to allow programs to communicate with each other when they want to. It can be quite powerful to let programs interact with other programs. The Unix pipe model is the canonical example.
IPC in JOS
The "messages" that user environments can send to each other using JOS's IPC mechanism consist of two components: a single 32-bit value, and optionally a single page mapping. Allowing environments to pass page mappings in messages provides an efficient way to transfer more data than will fit into a single 32-bit integer, and also allows environments to set up shared memory arrangements easily.
Sending and Receiving Messages
To receive a message, an environment calls sys_ipc_recv. This system call de-schedules the current environment and does not run it again until a message has been received. When an environment is waiting to receive a message, any other environment can send it a message - not just a particular environment, and not just environments that have a parent/child arrangement with the receiving environment.
To try to send a value, an environment calls sys_ipc_try_send with both the receiver's environment id and the value to be sent. If the named environment is actually receiving (it has called sys_ipc_recv and not gotten a value yet), then the send delivers the message and returns 0. Otherwise the send returns -E_IPC_NOT_RECV to indicate that the target environment is not currently expecting to receive a value.
Transferring Pages
When an environment calls sys_ipc_recv with a valid dstva parameter (below UTOP), the environment is stating that it is willing to receive a page mapping. If the sender sends a page, then that page should be mapped at dstva in the receiver's address space. If the receiver already had a page mapped at dstva, then that previous page is unmapped.
When an environment calls sys_ipc_try_send with a valid srcva (below UTOP), it means the sender wants to send the page currently mapped at srcva to the receiver, with permissions perm. After a successful IPC, the sender keeps its original mapping for the page at srcva in its address space, but the receiver also obtains a mapping for this same physical page at the dstva originally specified by the receiver, in the receiver's address space. As a result this page becomes shared between the sender and receiver.
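The handshake for a plain 32-bit value can be modeled without any kernel. In the sketch below, struct env, model_recv() and model_try_send() are hypothetical stand-ins for the real structures and system calls, and the numeric value of E_IPC_NOT_RECV is an assumption made only so the example compiles on its own. Page transfer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E_IPC_NOT_RECV 7   /* error code value assumed for this model */

/* Minimal per-environment IPC state (hypothetical model). */
struct env {
    int id;
    int recving;      /* blocked in recv, waiting for a message */
    int from;         /* id of the last sender */
    uint32_t value;   /* last value received */
    int runnable;
};

/* Receiver side: announce readiness and block. */
void model_recv(struct env *self)
{
    assert(self != NULL);
    self->recving = 1;
    self->runnable = 0;
}

/* Sender side: deliver only if the target is currently receiving. */
int model_try_send(const struct env *sender, struct env *target, uint32_t value)
{
    assert(sender != NULL && target != NULL);
    if (!target->recving)
        return -E_IPC_NOT_RECV;
    target->recving = 0;
    target->from = sender->id;
    target->value = value;
    target->runnable = 1;   /* wake the receiver */
    return 0;
}
```

A send before the receiver has called model_recv() fails, the same send afterwards succeeds and wakes the receiver, and a second send fails again until the receiver asks for another message.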
Implementing IPC
Exercise 15
The IPC process and its data flow are simple. Note that ipc_recv() only announces that the environment is ready to receive and then blocks; the actual data transfer is performed on the sending side by ipc_send(), through sys_ipc_try_send.
With that in mind, the code is straightforward:
// If 'dstva' is < UTOP, then you are willing to receive a page of data.
// 'dstva' is the virtual address at which the sent page should be mapped.
static int sys_ipc_recv(void *dstva)
{
// if dstva is below UTOP, it must be page-aligned (a multiple of PGSIZE)
if ((uint32_t)dstva < UTOP && PGOFF(dstva) != 0) {
return -E_INVAL;
}
curenv->env_ipc_recving = 1;
curenv->env_ipc_dstva = dstva;
curenv->env_status = ENV_NOT_RUNNABLE;
return 0;
}
// envid is the target Env
static int sys_ipc_try_send(envid_t envid, uint32_t value, void *srcva, unsigned perm)
{
struct Env *e;
struct PageInfo *pp;
pte_t *pte;
int r;
// e is the target Env
if ((r = envid2env(envid, &e, 0)) != 0) {
return r;
}
// env is not ready to receive
if (e->env_ipc_recving == 0) {
return -E_IPC_NOT_RECV;
}
// when srcva is below UTOP, the sender wants to send a page: validate the request
if ((uint32_t)srcva < UTOP) {
if (PGOFF(srcva) != 0) {
return -E_INVAL;
}
// perm must include both PTE_U and PTE_P
if ((perm & (PTE_U | PTE_P)) != (PTE_U | PTE_P)) {
return -E_INVAL;
}
if ((perm & ~(PTE_SYSCALL)) != 0) {
return -E_INVAL;
}
// get the source virtual address's physical memory
if ((pp = page_lookup(curenv->env_pgdir, srcva, &pte)) == NULL) {
return -E_INVAL;
}
if ((*pte & PTE_W) == 0 && (perm & PTE_W) == PTE_W) {
return -E_INVAL;
}
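// map the page only if the receiver asked for one (its dstva is below UTOP)
if ((uint32_t)e->env_ipc_dstva < UTOP) {
if ((r = page_insert(e->env_pgdir, pp, e->env_ipc_dstva, perm)) != 0) {
return r;
}
e->env_ipc_perm = perm;
} else {
e->env_ipc_perm = 0;
}
} else {
e->env_ipc_perm = 0;
}
// the message is delivered: clear the receiving flag and wake the receiver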
e->env_ipc_recving = 0;
e->env_ipc_from = curenv->env_id;
e->env_ipc_value = value;
e->env_status = ENV_RUNNABLE;
return 0;
}
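The permission checks in sys_ipc_try_send can be collected into one predicate, which makes them easy to test in isolation. ipc_perm_ok() below is a hypothetical helper for illustration, and the PTE_* values restate what I believe inc/mmu.h defines, so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86 page table entry bits. */
#define PTE_P       0x001u
#define PTE_W       0x002u
#define PTE_U       0x004u
#define PTE_AVAIL   0xE00u
#define PTE_SYSCALL (PTE_AVAIL | PTE_P | PTE_W | PTE_U)

/* Nonzero if `perm` is acceptable for sending a page whose
 * current page table entry in the sender is `src_pte`. */
int ipc_perm_ok(uint32_t perm, uint32_t src_pte)
{
    assert(src_pte & PTE_P);                       /* the source page must be mapped */
    if ((perm & (PTE_U | PTE_P)) != (PTE_U | PTE_P))
        return 0;                                  /* PTE_U and PTE_P are mandatory */
    if (perm & ~PTE_SYSCALL)
        return 0;                                  /* no bits outside PTE_SYSCALL */
    if ((perm & PTE_W) && !(src_pte & PTE_W))
        return 0;                                  /* cannot grant write on a read-only page */
    return 1;
}
```

The last rule is the important one for isolation: a sender cannot hand out write access to a page it can only read itself.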
We also need to add the new system calls to the dispatch in syscall():
case SYS_ipc_recv:
return sys_ipc_recv((void *)a1);
case SYS_ipc_try_send:
return sys_ipc_try_send(a1, a2, (void *)a3, a4);
// Receive a value via IPC and return it.
// If 'pg' is nonnull, then any page sent by the sender will be mapped at that address.
// If 'from_env_store' is nonnull, then store the IPC sender's envid in *from_env_store.
// If 'perm_store' is nonnull, then store the IPC sender's page permission
// in *perm_store (this is nonzero iff a page was successfully transferred to 'pg').
// If the system call fails, then store 0 in *fromenv and *perm (if they're nonnull) and return the error.
// Otherwise, return the value sent by the sender
//
int32_t ipc_recv(envid_t *from_env_store, void *pg, int *perm_store)
{
// LAB 4: Your code here.
// panic("ipc_recv not implemented");
int r;
if (pg == NULL) {
pg = (void *)UTOP;
}
// If the sys_ipc_recv() system call fails
if ((r = sys_ipc_recv(pg)) < 0) {
if (from_env_store != NULL) {
*from_env_store = 0;
}
if (perm_store != NULL) {
*perm_store = 0;
}
return r;
}
// The sys_ipc_recv() system call succeeded.
// The kernel set thisenv->env_ipc_from (and the other env_ipc_* fields) when the sender delivered the message,
// so report the sender's envid through from_env_store.
if (from_env_store != NULL) {
*from_env_store = thisenv->env_ipc_from;
}
if (perm_store != NULL) {
*perm_store = thisenv->env_ipc_perm;
}
return thisenv->env_ipc_value;
}
// Send 'val' (and 'pg' with 'perm', if 'pg' is nonnull) to 'toenv'.
// This function keeps trying until it succeeds.
// It should panic() on any error other than -E_IPC_NOT_RECV.
//
// Hint:
// Use sys_yield() to be CPU-friendly.
// If 'pg' is null, pass sys_ipc_try_send a value that it will understand
// as meaning "no page". (Zero is not the right value.)
void ipc_send(envid_t to_env, uint32_t val, void *pg, int perm)
{
// LAB 4: Your code here.
int r;
if (pg == NULL) {
pg = (void *)UTOP;
}
while ((r = sys_ipc_try_send(to_env, val, pg, perm)) != 0) {
if (r != -E_IPC_NOT_RECV) {
panic("ipc_send: %e", r);
}
// the receiver is not ready yet; give up the CPU and retry
sys_yield();
}
// panic("ipc_send not implemented");
}
An example that uses these functions is in user/sendpage.c:
void umain(int argc, char **argv)
{
envid_t who;
if ((who = fork()) == 0) {
// Child
// receive first; ipc_recv() overwrites who with the sender's envid,
// which the kernel recorded in env_ipc_from when the message was delivered
ipc_recv(&who, TEMP_ADDR_CHILD, 0);
cprintf("%x got message: %s\n", who, TEMP_ADDR_CHILD);
if (strncmp(TEMP_ADDR_CHILD, str1, strlen(str1)) == 0)
cprintf("child received correct message\n");
memcpy(TEMP_ADDR_CHILD, str2, strlen(str2) + 1);
ipc_send(who, 0, TEMP_ADDR_CHILD, PTE_P | PTE_W | PTE_U);
return;
}
// Parent
// allocate a page to hold the message
sys_page_alloc(thisenv->env_id, TEMP_ADDR, PTE_P | PTE_W | PTE_U);
memcpy(TEMP_ADDR, str1, strlen(str1) + 1);
// send the message to the child first; the kernel records the parent's envid in the child's env_ipc_from
ipc_send(who, 0, TEMP_ADDR, PTE_P | PTE_W | PTE_U);
// receive the message from the child
ipc_recv(&who, TEMP_ADDR, 0);
cprintf("%x got message: %s\n", who, TEMP_ADDR);
if (strncmp(TEMP_ADDR, str2, strlen(str2)) == 0)
cprintf("parent received correct message\n");
return;
}