Operating Systems: Process API

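Overview

The basic interfaces for operating on processes:

Process creation: fork (spawn, vfork, clone)
Program execution: exec
Synchronization between processes: wait
Process termination: exit/abort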

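Process creation: fork()

fork() semantics: creates a new process that is an almost exact copy of the calling process. After fork() the two are independent processes: the caller is the parent and the new process is the child.

Function prototype: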

#include <sys/types.h>
#include <unistd.h>

pid_t fork(void); // returns -1 on error, 0 in the child, and the child's PID in the parent

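Characteristics of fork():

Called once, returns twice
The parent and child run concurrently
Identical but separate address spaces
Shared files (the child inherits the parent's open file descriptors)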
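The "returns twice" and "separate address spaces" properties can be checked with a small experiment. The sketch below is only illustrative; the helper name parent_value_after_child_write is made up for this example and is not part of any API.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of x after the child
 * has modified its own copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    int x = 100;
    pid_t pid = fork();          /* one call, two returns */

    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0) {
        x = 200;                 /* changes only the child's copy of x */
        _exit(x == 200 ? 0 : 1); /* leave the child without returning */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return x;                    /* the parent's copy is untouched */
}
```

Calling this helper from main should yield 100, because the child's write lands in its own address space.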

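fork() establishes a parent-child relationship between processes, so the processes in a system form a tree.

Several processes can belong to the same process group:

A child belongs to its parent's process group by default
A signal can be sent to every process in a process group
Process groups are mainly used by shells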
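A quick way to see the default group membership is to compare getpgrp() in the parent and in the child. This is a minimal sketch; child_inherits_pgid is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child reports the same process
 * group ID as the parent, 0 if it differs, -1 on error. */
int child_inherits_pgid(void)
{
    pid_t parent_pgid = getpgrp();   /* process group of the caller */
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)                    /* child: compare and report via exit status */
        _exit(getpgrp() == parent_pgid ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A shell that wants to signal a whole job can then pass the group ID to killpg(), or call kill() with a negative PID.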

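Copy-on-write (COW): fork() copies only the memory mappings, not the underlying memory. A page is actually copied only when one of the two processes writes to it.

Better performance: each mapping entry covers at least one 4 KB page, so copying mappings is much cheaper than copying the pages themselves
When the child calls exec() right after fork(), the pointless copy is avoided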

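Program execution: exec()

exec() semantics: specifies the executable file and arguments for a process. Calling exec() after fork() loads the executable and resets the address space of the calling process.

Function prototype: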

#include <unistd.h>

// Does not return on success; returns -1 on error
int execve(const char *filename, char *const argv[], char *const envp[]);

exec() is called once and, on success, never returns.
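The only way control comes back from execve() is failure. The sketch below triggers that case on purpose by naming a path that is assumed not to exist; both the path and the helper name exec_missing_program are made up for this example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: tries to exec a program that is assumed not to
 * exist and returns the errno value left behind by execve(). */
int exec_missing_program(void)
{
    char *const argv[] = { "no-such-program", NULL };
    char *const envp[] = { NULL };

    execve("/no/such/program", argv, envp);

    /* Reaching this line means execve() failed and returned -1. */
    return errno;
}
```

With a missing path the returned value should be ENOENT. Had the path named a real executable, the rest of the function would never have run.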

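How exec() runs:

execve() loads and runs the executable file filename with the argument list argv and the environment variable list envp.

argv points to a null-terminated array of pointers, each pointing to an argument string. By convention, argv[0] is the name of the executable object file.
envp points to a null-terminated array of pointers, each pointing to an environment variable string of the form name=value.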
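To make the shape of argv and envp concrete, here is a minimal sketch that runs /bin/sh with a hand-built environment. It assumes /bin/sh exists; the variable GREETING and the helper name run_with_custom_env are invented for the illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the exit status of the shell command,
 * or -1 on error. */
int run_with_custom_env(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Both arrays end with NULL; argv[0] is the program name. */
        char *const argv[] = { "sh", "-c", "test \"$GREETING\" = hello", NULL };
        char *const envp[] = { "GREETING=hello", NULL };

        execve("/bin/sh", argv, envp);
        _exit(127);                  /* only reached if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The shell sees exactly the environment passed in envp, so the test command succeeds and the helper should return 0.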

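How a program is loaded:

The loader creates the memory image.
Guided by the program header table, the loader copies chunks of the executable file into the code and data segments.
The loader jumps to the program's entry point, which is the address of the _start function (defined in the system object file crt1.o).
_start calls the system startup function __libc_start_main (defined in libc.so).
__libc_start_main initializes the execution environment, calls the user-level main function, handles its return value, and returns control to the kernel when necessary.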

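Loading, described another way:

The loader deletes the child's existing virtual memory segments and creates a new set of code, data, heap, and stack segments. The new stack and heap segments are initialized to zero. The new code and data segments are initialized to the contents of the executable file by mapping pages of the virtual address space to page-sized chunks of the executable. Finally, the loader jumps to the _start address, which eventually calls the application's main function. Apart from some header information, no data is copied from disk into memory during loading. The copy is deferred until the CPU references a mapped virtual page, at which point the operating system's paging mechanism transfers the page from disk to memory automatically.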

Prototype of main:

int main(int argc, char **argv, char **envp);
// or, equivalently
int main(int argc, char *argv[], char *envp[]);

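When main starts running, the user stack is organized as follows. The middle of the stack holds the pointer arrays envp[] and argv[], with each pointer referring to a variable string near the bottom of the stack; the top of the stack holds the stack frame of the system startup function __libc_start_main.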
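Since both arrays are null-terminated, a program can walk them without knowing their length in advance. The sketch below uses two hypothetical helpers, count_strings and print_vector. Note that the three-argument form of main is a widespread convention rather than something every platform guarantees.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the entries of a NULL-terminated
 * pointer array such as argv or envp. */
size_t count_strings(char *const vec[])
{
    size_t n = 0;
    while (vec[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: prints every string in the array, one per line. */
void print_vector(const char *label, char *const vec[])
{
    for (size_t i = 0; vec[i] != NULL; i++)
        printf("%s[%zu] = %s\n", label, i, vec[i]);
}
```

From main(int argc, char *argv[], char *envp[]), calling print_vector("argv", argv) and print_vector("envp", envp) lists both arrays, and count_strings(argv) equals argc.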

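Reaping child processes: wait()

Semantics: wait for a child process to terminate or stop. Use the wait() system call, or its more complete sibling waitpid().

wait() prototype: when the parent calls wait(), the call checks whether any child of the calling process has already exited. If it finds a child that has become a zombie, wait() collects that child's status information, destroys the zombie completely, and returns. If no such child exists yet, wait() blocks until one appears.

#include <sys/types.h> /* provides the definition of pid_t */
#include <sys/wait.h>

pid_t wait(int *status);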

waitpid() prototype:

#include <sys/types.h>
#include <sys/wait.h>

// Returns the child's PID on success, 0 if WNOHANG was given and no child has changed state yet, and -1 on error.
pid_t waitpid(pid_t pid, int *statusp, int options);
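The three return cases of waitpid() can be exercised in one short sketch. To keep the WNOHANG case deterministic, the child is held back on a pipe until the parent has polled once. The helper name poll_then_reap and the use of a pipe as a gate are choices made for this example only.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the child's exit code, or -1 if any
 * step behaves unexpectedly. */
int poll_then_reap(int code)
{
    int gate[2];
    if (pipe(gate) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char c;
        close(gate[1]);                    /* keep only the read end */
        if (read(gate[0], &c, 1) < 0)      /* blocks until the parent closes its end */
            _exit(255);
        _exit(code);
    }

    close(gate[0]);

    int status;
    /* The child is still blocked, so a non-blocking poll reports 0. */
    if (waitpid(pid, &status, WNOHANG) != 0)
        return -1;

    close(gate[1]);                        /* EOF on the pipe releases the child */

    /* A blocking wait now returns the child's PID and fills in status. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

WIFEXITED and WEXITSTATUS decode the status word; related macros such as WIFSIGNALED cover children that were killed by a signal.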

Example

Create a child process that counts the words in a file; the parent waits for the child to finish and then reaps it.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int
main(int argc, char *argv[])
{
  printf("hello world (pid:%d)\n", (int) getpid());
  int rc = fork();
  if (rc < 0) {
    // fork failed; exit
    fprintf(stderr, "fork failed\n");
    exit(1);
  } else if (rc == 0) {
    // child (new process)
    printf("hello, I am child (pid:%d)\n", (int) getpid());
    char *myargs[3];
    myargs[0] = strdup("wc");   // program: "wc" (word count)
    myargs[1] = strdup("p3.c"); // argument: file to count
    myargs[2] = NULL;           // marks end of array
    execvp(myargs[0], myargs);  // runs word count
    printf("this shouldn't print out");
  } else {
    // parent goes down this path (original process)
    int wc = wait(NULL);
    printf("hello, I am parent of %d (wc:%d) (pid:%d)\n",
           rc, wc, (int) getpid());
  }
  return 0;
}

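Running the program in a terminal prints the parent's greeting first, then the child's greeting followed by the output of wc, and finally the parent's line once wait() returns.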

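Why this design?

Why did the system designers create such an odd interface (fork + exec) for the simple task of creating a new process?

Lampson's law: Get it right. Neither abstraction nor simplicity is a substitute for getting it right.

Separating fork() and exec() turns out to be very useful for building a UNIX shell, because it gives the shell a chance to run code after fork() but before exec(). That code can alter the environment of the program about to run, which makes a range of interesting features easy to implement, such as redirection: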

wc p3.c > newfile.txt

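How redirection works: the child's writes to the standard output file descriptor are transparently routed to the newly opened file (dup2() can be used for this).

This leads to the following code. It closes standard output and then opens the target file, which receives the lowest free descriptor, so when the program runs the result is stored in a file: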

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
#include <sys/wait.h>

int
main(int argc, char *argv[])
{
  int rc = fork();
  if (rc < 0) {
    // fork failed; exit
    fprintf(stderr, "fork failed\n");
    exit(1);
  } else if (rc == 0) {
    // child: redirect standard output to a file
    close(STDOUT_FILENO); 
    open("./p4.output", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);

    // now exec "wc"...
    char *myargs[3];
    myargs[0] = strdup("wc");   // program: "wc" (word count)
    myargs[1] = strdup("p4.c"); // argument: file to count
    myargs[2] = NULL;           // marks end of array
    execvp(myargs[0], myargs);  // runs word count
  } else {
    // parent goes down this path (original process)
    int wc = wait(NULL);
    assert(wc >= 0);
  }
  return 0;
}
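The program above relies on open() handing back the lowest free descriptor. The dup2() route mentioned earlier makes the redirection explicit instead. The following is a minimal sketch under that approach; the helper name run_echo_to_file is hypothetical, and echo stands in for wc so the result is easy to check.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "echo hello" with standard output
 * redirected to path. Returns the child's exit status, or -1 on error. */
int run_echo_to_file(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);

        if (fd != STDOUT_FILENO) {
            /* Make descriptor 1 refer to the file, then drop the spare. */
            if (dup2(fd, STDOUT_FILENO) < 0)
                _exit(126);
            close(fd);
        }

        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);                  /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The redirection happens entirely between fork() and exec(), so the program being run needs no knowledge of it.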

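UNIX pipes are implemented in a similar way, but with the pipe() system call. In this case, the output of one process is connected to an in-kernel pipe, and the input of another process is connected to the same pipe. The output of the first process seamlessly becomes the input of the second, so many commands can be chained together this way to accomplish a task: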

grep -o foo file | wc -l
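The wiring behind such a pipeline can be sketched with one child and one pipe. In this simplified version the parent plays the role of wc -l and counts the lines itself, rather than exec'ing a second program. The helper name count_lines_from_child is hypothetical, and /bin/sh is assumed to exist.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs a child whose standard output feeds a pipe,
 * and returns the number of lines the parent reads from it (-1 on error). */
int count_lines_from_child(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(fds[0]);                   /* the child only writes */
        if (dup2(fds[1], STDOUT_FILENO) < 0)
            _exit(126);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo foo; echo foo", (char *)NULL);
        _exit(127);                      /* only reached if exec failed */
    }

    close(fds[1]);                       /* otherwise read() never sees EOF */

    int lines = 0;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;

    close(fds[0]);
    waitpid(pid, NULL, 0);               /* reap the child */
    return lines;
}
```

A real shell would fork a second child, dup2() the read end onto its standard input, and exec wc there; the descriptor handling is the same.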

Posted 2021-10-14 21:44 by zju_cxl