Linux System Programming Study Notes (7): Threads
1. Threading is the creation and management of multiple units of execution within a single process
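A binary is a dormant program residing on a storage medium, compiled into a format the operating system can use and ready to execute, but not currently running.
A process is the operating system's abstraction of a binary in execution: the loaded binary, virtualized memory, and kernel resources.
A thread is the unit of execution within a process.
Processes are running binaries; threads are the smallest unit of execution schedulable by an operating system's process scheduler.
Modern operating systems provide user-space programs with two fundamental virtualized abstractions: virtualized memory and a virtualized processor. Together they create the illusion that each running process has the system's resources to itself.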
Virtualized memory affords each process a unique view of memory
A virtualized processor lets processes act as if they alone run on the system
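Virtualized memory is associated with the process, not the thread. Each process has a unique memory image, but all threads within a process share that process's address space.
A virtualized processor is associated with the thread, not the process. Each thread is an independently schedulable entity.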
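2. Multithreading
Multithreading offers the following benefits:
(1) Programming abstraction: thread-per-connection and thread pool patterns
(2) Parallelism: each thread has its own virtualized processor and is an independently schedulable entity, so on a machine with multiple processors threads can run in parallel and raise system throughput
(3) Improved responsiveness: long-running operations can be delegated to a worker thread, leaving at least one thread free to respond to user input and GUI operations
(4) Blocking I/O: when one thread blocks, the other threads can still continue to run. Multiplexed I/O and nonblocking I/O are alternative solutions to the problem of blocking I/O stalling a single-threaded process
(5) Context switching: switching between threads of the same process costs far less than switching between processes
(6) Memory savings: threads share their process's address space, which can save a considerable amount of memory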
Failing to synchronize threads can lead to corrupt output, incorrect execution, and program crash.
Multithreaded programs are also notoriously difficult to understand and debug.
3. Threading Models
(1) Thread-per-Connection
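A unit of work is assigned to one thread, and that thread is assigned at most one unit of work, for the duration of the unit of work's execution.
In this model, the number of threads is an implementation detail. Most implementations place an upper limit on the number of threads created; once the number of connections (and hence threads) reaches that limit, further connections are queued or rejected until the count drops below the limit.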
(2) Event-Driven Threading
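In the thread-per-connection model, most of the threads spend much of their time waiting, and using more threads than there are processors on the system provides no additional parallelism.
Because the load in thread-per-connection usually lies in waiting for I/O, the event-driven model decouples that waiting from the threads. Request processing is expressed as a series of callbacks, which wait for I/O through multiplexed I/O (select, epoll); this pattern is called an event loop. When the I/O requests are returned, the event loop hands the callback off to a waiting thread.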
4. Concurrency, Parallelism, and Races
Concurrency is the ability of two or more threads to execute in overlapping time periods
Parallelism is the ability to execute two or more threads simultaneously
Concurrency can occur without parallelism: for example, multitasking on a single processor system
With parallelism, threads literally execute in parallel
By enabling overlapping execution, threads can execute in an unpredictable order with respect to each other
A race condition is a situation in which the unsynchronized access of a shared resource by two or more threads leads to erroneous program behavior.
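Critical region: the region of code that must be synchronized. Consider the statement:
x++; // x is an integer
It takes the current value of x, increments it by one, and stores that new value back in x.
Because these are three separate steps, two threads running concurrently on a single processor can interleave them so that both read the same old value and one increment is lost. With parallelism on multiple processors, both threads can perform the steps at the same instant, with the same erroneous result.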
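5. Synchronization
To prevent race conditions, the programmer must synchronize access to critical regions, that is, ensure mutually exclusive access to them.
There are many techniques for making critical regions atomic. The most common technique is the lock.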
A deadlock is a situation in which two threads are each waiting for the other to finish, and thus neither does.
In the case of mutexes, a deadlock occurs when two threads are each waiting for a different mutex that the other thread holds; this is known as the ABBA deadlock.
Because each thread that holds a mutex is also waiting for a mutex, neither is ever released, and the threads deadlock
6. Pthreads
(1) Creating Threads
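#include <pthread.h>
int pthread_create (pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
On success, a new thread is created and begins executing the start_routine callback, with arg as its sole argument. On failure, pthread_create() returns a nonzero error code directly rather than setting errno.
The pthread_attr_t argument changes the default attributes of the new thread, such as stack size and scheduling parameters; it is usually NULL, which requests the default attributes.
The new thread inherits most attributes, capabilities, and state from its parent.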
(2) Thread IDs
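POSIX does not require pthread_t to be an arithmetic type. A thread ID is therefore not necessarily an integer; for portability, always use pthread_t and compare IDs with pthread_equal() rather than ==.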
const pthread_t me = pthread_self ();
int pthread_equal (pthread_t t1, pthread_t t2);
(3) Terminating Threads
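A thread can terminate in the following ways:
a. It returns from its start routine, which is akin to a process running off the end of main()
b. It calls pthread_exit(), which is akin to a process calling exit()
c. It is canceled by another thread via pthread_cancel(), which is akin to a process being sent a SIGKILL signal via kill()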
#include <pthread.h>
int pthread_cancel (pthread_t thread);
pthread_cancel() sends a cancellation request to the thread represented by the thread ID thread.
(4) Joining and Detaching Threads
Joining allows one thread to block while waiting for the termination of another
#include <pthread.h>
int pthread_join (pthread_t thread, void **retval);
The calling thread blocks until the specified thread terminates.
All threads in Pthreads are peers; any thread may join any other
int ret;

/* join with `thread' and we don't care about its return value */
ret = pthread_join (thread, NULL);
if (ret) {
        errno = ret;
        perror ("pthread_join");
        return -1;
}
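Detaching threads:
By default, newly created threads are joinable. A detached thread cannot be joined; its resources are released automatically when it terminates.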
#include <pthread.h>
int pthread_detach (pthread_t thread);
A complete example:
#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>

void * start_thread (void *message)
{
        printf ("%s\n", (const char *) message);
        return message;
}

int main (void)
{
        pthread_t thing1, thing2;
        const char *message1 = "Thing 1";
        const char *message2 = "Thing 2";

        /* Create two threads, each with a different message. */
        pthread_create (&thing1, NULL, start_thread, (void *) message1);
        pthread_create (&thing2, NULL, start_thread, (void *) message2);

        /*
         * Wait for the threads to exit. If we didn't join here,
         * we'd risk terminating this main thread before the
         * other two threads finished.
         */
        pthread_join (thing1, NULL);
        pthread_join (thing2, NULL);

        return 0;
}
gcc -Wall -O2 -pthread example.c -o example
Output (the two lines may appear in either order, depending on scheduling):
Thing 1
Thing 2
or
Thing 2
Thing 1
(5) Initializing Mutexes
/* define and initialize a mutex named `mutex' */
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
(6) Locking and Unlocking
int pthread_mutex_lock (pthread_mutex_t *mutex);
int pthread_mutex_unlock (pthread_mutex_t *mutex);
Resource Acquisition Is Initialization (RAII) is a C++ programming pattern.
With RAII, a wrapper object acquires a mutex on creation and automatically releases it when the object falls out of scope:
class ScopedMutex {
public:
        ScopedMutex (pthread_mutex_t& mutex)
                :mutex_ (mutex)
        {
                pthread_mutex_lock (&mutex_);
        }

        ~ScopedMutex ()
        {
                pthread_mutex_unlock (&mutex_);
        }

private:
        pthread_mutex_t& mutex_;
};