Task、Thread、Process、Program
Before entering into a discussion about concurrency, it is necessary to define some relevant terminology to prevent confusion. Developers who are more familiar with UNIX systems or older Mac OS X technologies may find the terms “task”, “process”, and “thread” used somewhat differently in this document. This document uses these terms in the following way:
-
The term thread is used to refer to a separate path of execution for code. The underlying implementation for threads in Mac OS X is based on the POSIX threads API.
-
The term process is used to refer to a running executable, which can encompass multiple threads.
-
The term task is used to refer to the abstract concept of work that needs to be performed.
For complete definitions of these and other key terms used by this document, see “Glossary.”
thread
A flow of execution in a process. Each thread has its own stack space but otherwise shares memory with other threads in the same process.
process
The runtime instance of an application or program. A process has its own virtual memory space and system resources (including port rights) that are independent of those assigned to other programs. A process always contains at least one thread (the main thread) and may contain any number of additional threads.
task
A quantity of work to be performed. Although some technologies (most notably Carbon Multiprocessing Services) use this term differently, the preferred usage is as an abstract concept indicating some quantity of work to be performed.
program
A combination of code and resources that can be run to perform some task. Programs need not have a graphical user interface, although graphical applications are also considered programs.
本文的主要目的是介绍在Linux内核中,task,process, thread这3个名字之间的区别和联系。并且和WINDOWS中的相应观念进行比较。如果你已经很清楚了,那么就不用往下看了。
LINUX版本:2.6.18
ARCH: X86
首先要明确的是,按照LKD 2里面的说法,LINUX和其他OS 比如WINDOWS, SOLARIS之间一个很大的不同是没有严格定义的线程(thread)。那么你也许会问,如果LINUX中没有线程,那么如何来表示类似WINDOWS线程的那种执行观念呢?答案是LINUX中,PROCESS(进程)可以当作线程。
那么你也许又会问,WINDOWS中的多线程程序在LINUX中是怎样表示的呢?具体来说,LINUX中的PROCESS有2种。一种是独立的PROCESS。自己有自己的地址空间,资源列表,代码等。另外一种PROCESS是和其他PROCESS共享一个地址空间,资源列表的。这种PROCESS就类似于WINDOWS中的线程。
在看LINUX内核代码的时候,你会同时看到process, task, thread这3个名字。下面简要介绍下他们之间的区别:
1、task 可以理解为一个LINUX PROCESS。最著名的定义TASK的数据结构叫做struct task_struct, 在linux\sched.h中。我觉得这个名字起得不好。因为大家都已经对PROCESS, THREAD之类得观念很熟悉了。现在又冒出来个TASK,很容易让人搞混。不过也许是历史原因吧。这个TASK一直保留着。
在 task_struct 中有一堆的成员。其中有PID 和TGID. PID实际上类似于WINDOWS中的THREAD ID。而TGID (thead group id) 对应于WINDOWS中的PID。PID对于独立的PROCESS来说,就是它的PID。这时PID == TGID。
对于和其他PROCESS共享地址空间的PROCESS来说,每个都有独立的PID,但是他们的TGID是一样的。
2、thead
虽然说LINUX不支持THREAD. 但是在内核代码里又可以看到THREAD这个名字。这时可以把他们和WINDOWS中的THREAD对应起来。一个比较著名的是thread_info 结构。
3、kernel thread
在LINUX中,kernel thread是一个专门的名词。它的特点是没有独立的地址空间(MM结构为NULL). 他们只运行在KERNEL SPACE.不能切换到USER SPACE。
最后,总结下,在LINUX中,一个PROCESS即可能是一个WINDOWS PROCESS类似的观念,也可能是一个与WINDOWS THREAD类似的观念。而且有时还被叫做TASK(感觉有点乱)。
不过最常用的还是与WINDOWS PROCESS类似的观念。比如在内核代码中有一个for_each_process宏。它就是只遍历那些主要的,独立的PROCESS。