Pthread
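Shared-memory programming (threads share part of their memory)
POSIX Threads = Pthreads
Pthreads defines an API (application programming interface) for multithreaded programming.
Basic concepts#
Pthreads supports:
- creating concurrent threads of execution
- synchronization
- implicit rather than explicit communication: memory is shared, so a pointer to the shared data is simply passed to the thread
Pthreads is relatively low-level and fairly portable; development is slower, and it is widely used in systems-level code.
OpenMP is a newer, higher-level standard suited to scientific computing on shared-memory architectures. The programmer states the parallelism and data properties at a high level and guides scheduling, while the system handles the actual task decomposition and scheduling.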
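Basic API#
Compile with -lpthread to link the Pthreads library, for example gcc -o pth pth.c -lpthread (the file name pth.c is assumed here).
Run: ./pth <number of threads>
Passing arguments to main in Code::Blocks:
Project -> Set programs' arguments -> enter the arguments under Program arguments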
hello world#
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h> // malloc and strtol
// Global variable, shared by all threads
int thread_count;
// A void * can be cast to and from any object pointer type
void *Hello(void *rank) {
long my_rank = (long) rank;
printf("Hello from thread %ld of %d\n", my_rank, thread_count);
return NULL;
}
int main(int argc, char *argv[]) {
long thread;
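pthread_t *thread_handles; // pthread_t is an opaque type that identifies a thread
if (argc < 2) { // the thread count is a required argument
fprintf(stderr, "usage: %s <number of threads>\n", argv[0]);
return 1;
}
// Parse the thread count from the command line as a base-10 long
thread_count = strtol(argv[1], NULL, 10);
thread_handles = (pthread_t*) malloc(thread_count * sizeof(pthread_t)); // one handle per thread
// Create the threads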
for (thread = 0; thread < thread_count; thread++)
pthread_create(&thread_handles[thread], NULL, Hello, (void*) thread);
printf("Hello from the main thread\n");
// Join the threads: the main thread waits for each child thread in turn
for (thread = 0; thread < thread_count; thread++)
pthread_join(thread_handles[thread], NULL);
free(thread_handles);
return 0;
}
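Whether long must be changed to long long depends on the platform: on 64-bit Windows (for example MinGW under Code::Blocks) long is 32 bits while pointers are 64 bits, so the casts need long long (or intptr_t); on 64-bit Linux long is already 64 bits.
Output: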
Hello from thread 0 of 8
Hello from thread 2 of 8
Hello from thread 1 of 8
Hello from thread 5 of 8
Hello from the main thread
Hello from thread 4 of 8
Hello from thread 6 of 8
Hello from thread 3 of 8
Hello from thread 7 of 8
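- The order of the child threads' output relative to each other and to the main thread is nondeterministic; it depends on how the operating system schedules the threads.
- The order may differ from run to run because the threads execute concurrently.
Note: creating a thread is expensive, so in practice each thread needs to do a substantial amount of work to justify the cost.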
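- Creating a thread:
int pthread_create ( pthread_t *thread_id,const pthread_attr_t *thread_attribute,void *(*thread_fun )(void *),void *fun_arg);
- thread_id: pointer to the location that receives the thread ID
- thread_attribute: thread attributes; NULL selects the defaults
- thread_fun: the function the thread runs
- fun_arg: the argument passed to that function
pthread_create returns 0 on success and a nonzero error number on failure.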
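- Waiting for a thread to finish (joining)
int pthread_join ( pthread_t pthread ,void **value_ptr);
- value_ptr lets the target thread pass information back to the caller when it exits; pass NULL if no return value is needed
A thread function returns a void *, so value_ptr must be a pointer to a void * (a void **) in which that return value is stored.
pthread_join returns 0 on success and a nonzero error number on failure.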
- Cancelling a thread
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
void *thread_function(void *arg) {
while (1) {
printf("I am the child thread\n");
pthread_testcancel(); // check for a pending cancellation request; if there is one, the thread exits here
sleep(1);
}
}
int main(int argc, char *argv[]) {
void *status;
pthread_t thread;
pthread_create(&thread, NULL, thread_function, NULL);
sleep(3);
pthread_cancel(thread); // send the cancellation request
pthread_join(thread, &status); // wait until the thread has actually exited
if (status == PTHREAD_CANCELED) {
printf("Thread was canceled\n");
} else {
printf("unexpected thread status\n");
}
return 0;
}
Output:
I am the child thread
I am the child thread
I am the child thread
I am the child thread
Thread was canceled
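The child thread printed only four times before it was cancelled.
pthread_cancel only sends a request and returns immediately. The target thread does not stop at once: with the default (deferred) cancellation type it acts on the request only when it reaches a cancellation point, for example a call to pthread_testcancel.
Some system functions, such as sleep and read, check for pending cancellation requests automatically; these functions are called cancellation points. A thread that never reaches a cancellation point and never calls pthread_testcancel explicitly may never be cancelled.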
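Synchronization#
Basic concepts#
Atomicity: a group of operations is atomic if either all of them are executed or none are, so a partially executed result is never observed.
Mutual exclusion: at any moment at most one thread is executing the protected code.
Critical section: a section of code that updates shared memory. Only one thread at a time may execute it.
Thread: an instance of a program running on one processor (MPI calls this a process).
Timing: the specific order and points in time at which multiple threads access a resource.
Race condition:
- A race condition exists when the result of execution depends on the timing of two or more events.
- When several processes/threads try to update the same shared resource, the result may be unpredictable.
- More generally, when several threads access a shared resource and at least one of the accesses is an update, the accesses may lead to an error.
Data dependence: an ordering between two memory operations that must be preserved to keep the result correct.
Synchronization: forcing processes/threads to wait for one another at some point in time, which guarantees the intended ordering and correct access to shared writable data.
Deadlock: a standstill in which two or more processes compete for resources during execution. Each process waits for another to release a resource, and none can make progress because each needs a resource held by another.
Now consider this problem: estimating π.
Serial code: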
double factor = 1.0;
double sum = 0.0;
for (k = 0; k < n; k++, factor = -factor)
sum += factor / (2 * k + 1);
pi_approx = 4.0 * sum;
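Busy-waiting#
Set up a shared flag variable, which the main thread initializes to 0.
A thread keeps testing a condition and does not enter the critical section until the condition holds.
The threads enter the critical section one after another, in rank order.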
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
int thread_count = 8;
volatile int flag; // volatile so that the busy-wait loop re-reads flag from memory on every test
double sum;
long long n = 1e8;
void *Thread_sum(void *rank) {
long long my_rank = (long long) rank;
double factor, my_sum = 0.0;
long long my_num = n / thread_count;
long long my_first_i = my_num * my_rank;
long long my_last_i = my_first_i + my_num;
if (my_first_i % 2 == 0) factor = 1.0;
else
factor = -1.0;
for (long long i = my_first_i; i < my_last_i; i++, factor = -factor)
my_sum += factor / (2 * i + 1);
while (flag != my_rank); // busy-wait: the while loop deliberately has an empty body
sum += my_sum;
flag = (flag + 1) % thread_count;
return NULL;
}
int main(int argc, char *argv[]) {
pthread_t *thread_handles;
thread_handles = (pthread_t *) malloc(thread_count * sizeof(pthread_t));
flag = 0;
for (long long thread = 0; thread < thread_count; thread++)
pthread_create(&thread_handles[thread], NULL, Thread_sum, (void *) thread);
for (long long thread = 0; thread < thread_count; thread++)
pthread_join(thread_handles[thread], NULL);
free(thread_handles);
printf("%lf\n", 4.0 * sum); // the estimate of pi is 4 times the series sum
return 0;
}
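Clearly, this performs poorly: while the condition is not met, the thread keeps looping and wastes CPU cycles.
To minimize the number of times the critical section is executed, give each thread a private variable that holds its partial sum.
Mutexes#
Also called mutual-exclusion locks.
A mutex is a variable of a special type (pthread_mutex_t) that allows only one thread at a time into the critical section.
The order in which waiting threads enter the critical section is not specified; it is decided by the system.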
API#
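- Static initialization
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
- Dynamic initialization
pthread_mutex_t mutex;
pthread_mutex_init(&mutex, NULL);
- Destruction
pthread_mutex_destroy(&mutex);
- Entering the critical section: when the call returns, the calling thread holds the mutex
pthread_mutex_lock(&mutex);
- Leaving the critical section: releases the mutex
pthread_mutex_unlock(&mutex);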
Code for estimating π with a mutex (the lines that differ are marked with //):
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
int thread_count = 8;
pthread_mutex_t mutex;//
double sum;
long long n = 1e8;
void *Thread_sum(void *rank) {
long long my_rank = (long long) rank;
double factor, my_sum = 0.0;
long long my_num = n / thread_count;
long long my_first_i = my_num * my_rank;
long long my_last_i = my_first_i + my_num;
if (my_first_i % 2 == 0) factor = 1.0;
else
factor = -1.0;
for (long long i = my_first_i; i < my_last_i; i++, factor = -factor)
my_sum += factor / (2 * i + 1);
pthread_mutex_lock(&mutex);//
sum += my_sum;
pthread_mutex_unlock(&mutex);//
return NULL;
}
int main(int argc, char *argv[]) {
pthread_t *thread_handles;
thread_handles = (pthread_t *) malloc(thread_count * sizeof(pthread_t));
pthread_mutex_init(&mutex,NULL);//
for (long long thread = 0; thread < thread_count; thread++)
pthread_create(&thread_handles[thread], NULL, Thread_sum, (void *) thread);
for (long long thread = 0; thread < thread_count; thread++)
pthread_join(thread_handles[thread], NULL);
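free(thread_handles);
pthread_mutex_destroy(&mutex); // release the mutex once all threads have been joined
printf("%lf\n", 4.0 * sum); // the estimate of pi is 4 times the series sum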
return 0;
}
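Holding several mutexes at once can lead to deadlock.
For example:
Thread 1: lock(a); ... lock(b)
Thread 2: lock(b); ... lock(a)
Each thread wants the mutex the other one holds, and neither releases the one it already holds.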
Semaphores#
A semaphore can be thought of as a special kind of unsigned int variable.
Unlike a mutex, a semaphore has no owner: any thread may call sem_post and sem_wait on any semaphore.
Header: #include <semaphore.h>
API#
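While the semaphore's value is 0, sem_wait blocks; once the value is greater than 0, a waiting thread is woken and proceeds.
- Initialization
int sem_init(sem_t *semp,int shared,unsigned initial_val);
- shared = 0: the semaphore is shared among the threads of one process; nonzero: it is shared between processes
- Destruction
int sem_destroy(sem_t *semp);
- Post (notify)
int sem_post(sem_t *semp);
Increments the semaphore by 1.
- Wait
int sem_wait(sem_t *semp);
Decrements the semaphore by 1, blocking first if its value is 0.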
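Consider this problem: every produce[i] starts out empty, and each thread performs one produce operation and one consume operation.
produce[i]=food; // produce item i
notify thread i that it can consume
wait for a notification from another thread
printf("thread %ld consumes item: %s",my_rank,produce[my_rank]);
(This kind of synchronization, where one thread waits for another to perform some action, is called producer-consumer synchronization.)
Suppose we try to solve it with mutexes:
pthread_mutex_lock(mutex[i])
produce[i]=food;
pthread_mutex_unlock(mutex[i])
pthread_mutex_lock(mutex[my_rank])
printf(my_rank,produce[my_rank]);
A problem can arise: thread 1, say, may already have reached the consume line (and acquired the mutex) before its item has been produced, and dereferencing the null pointer is then an error.
Semaphores solve this. (This is only an introduction; I made up this variant of the problem myself.)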
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <semaphore.h>
int thread_count = 8;
double sum;
long long n = 1e8;
sem_t semp[30];
int produce[30];
void *Thread_sum(void *rank) {
long long my_rank = (long long) rank;
long long i=(my_rank+1)%thread_count; // assume each thread produces for the next rank, in order
produce[i]=i;
sem_post(&semp[i]);
sem_wait(&semp[my_rank]);
printf("Thread %lld consumes item: %d\n",my_rank,produce[my_rank]); // produce[] holds int, so the format is %d
return NULL;
}
int main(int argc, char *argv[]) {
pthread_t *thread_handles;
thread_handles = (pthread_t *) malloc(thread_count * sizeof(pthread_t));
for (int i = 0; i < thread_count; i++)
sem_init(&semp[i], 0,0); // shared = 0 (threads of this process only), initial value 0
for (long long thread = 0; thread < thread_count; thread++)
pthread_create(&thread_handles[thread], NULL, Thread_sum, (void *) thread);
for (long long thread = 0; thread < thread_count; thread++)
pthread_join(thread_handles[thread], NULL);
free(thread_handles);
return 0;
}
Example from the lecture slides:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <semaphore.h>
int thread_count = 4;
sem_t sem_parent,sem_children;
typedef struct{
int thread_id;
}threadParm_t;
void *threadFunc(void *parm) {
threadParm_t* p=(threadParm_t*)parm;
printf("I'm the child thread %d\n",p->thread_id);
sem_post(&sem_parent);// tell the main thread that this child has finished its first phase
sem_wait(&sem_children);// wait for the main thread's signal before continuing
printf("Thread %d is about to exit\n",p->thread_id);
pthread_exit(NULL);
}
int main(int argc, char *argv[]) {
sem_init(&sem_parent,0,0);
sem_init(&sem_children,0,0);
pthread_t thread[10];
threadParm_t parm[10];// arguments passed to the threads
for (int i = 0; i < thread_count; i++) {
parm[i].thread_id=i;
pthread_create(&thread[i],NULL,threadFunc,(void*)&parm[i]);
}
//wait until every child thread has notified the parent
for (int i = 0; i < thread_count; i++)
sem_wait(&sem_parent);
printf("All children threads have been created\n");
//wake the child threads so they continue and print their second message
for (int i = 0; i < thread_count; i++)
sem_post(&sem_children);
for (int i = 0; i < thread_count; i++)
pthread_join(thread[i],NULL);
sem_destroy(&sem_parent);
sem_destroy(&sem_children);
return 0;
}
Output:
I'm the child thread 3
I'm the child thread 0
I'm the child thread 1
I'm the child thread 2
All children threads have been created
Thread 3 is about to exit
Thread 2 is about to exit
Thread 1 is about to exit
Thread 0 is about to exit
Barriers#
A thread may continue only after all threads have reached the barrier; until then it blocks at the barrier.
API#
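- Initializing a barrier (dynamic; count is the number of threads that must arrive, and POSIX defines no static initializer for barriers):
int pthread_barrier_init(pthread_barrier_t *bar, const pthread_barrierattr_t *attr, unsigned count);
Pass NULL as attr for the default attributes.
- Waiting
int pthread_barrier_wait(pthread_barrier_t *bar);
- Destroying
int pthread_barrier_destroy(pthread_barrier_t *bar);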
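The previous example really just wants all children to print before the parent message is printed; a barrier makes this kind of wait easy to implement.
The same idea using a barrier: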
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
int thread_count = 4;
typedef struct{
int thread_id;
}threadParm_t;
pthread_barrier_t bar; // the barrier shared by all child threads
void *threadFunc(void *parm) {
threadParm_t* p=(threadParm_t*)parm;
printf("thread %d has entered step 1\n",p->thread_id);
pthread_barrier_wait(&bar); // this single line is the whole synchronization
printf("Thread %d has entered step 2\n",p->thread_id);
pthread_exit(NULL);
}
int main(int argc, char *argv[]) {
pthread_barrier_init(&bar,NULL,thread_count); // thread_count threads must arrive
pthread_t thread[10];
threadParm_t parm[10];
for (int i = 0; i < thread_count; i++) {
parm[i].thread_id=i;
pthread_create(&thread[i],NULL,threadFunc,(void*)&parm[i]);
}
for (int i = 0; i < thread_count; i++)
pthread_join(thread[i],NULL);
pthread_barrier_destroy(&bar); // release the barrier after all threads have been joined
return 0;
}
Output:
thread 1 has entered step 1
thread 0 has entered step 1
thread 3 has entered step 1
thread 2 has entered step 1
Thread 2 has entered step 2
Thread 3 has entered step 2
Thread 0 has entered step 2
Thread 1 has entered step 2
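Condition variables#
Although busy-waiting, mutexes, and semaphores can all implement a barrier, condition variables are a better approach.
A condition variable is a data object that lets a thread stay suspended until a particular condition occurs. When it does, another thread can signal to wake the suspended thread. A condition variable is always associated with a mutex.
(The API is unlikely to be on the exam.)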
API#
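- Data type
pthread_cond_t
- Initialization
pthread_cond_init(pthread_cond_t*,attr)
- Destruction
pthread_cond_destroy(pthread_cond_t*)
- Wake one blocked thread
pthread_cond_signal(pthread_cond_t*)
- Wake all blocked threads
pthread_cond_broadcast(pthread_cond_t*)
- Block on the condition variable; the mutex is released while waiting and reacquired before the call returns
pthread_cond_wait(pthread_cond_t*,pthread_mutex_t*)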
Example#
- Two threads (inc_count) increment the global counter count.
- One thread (watch_count) watches the value of count and acts once it reaches the threshold (COUNT_LIMIT = 12).
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 3
#define TCOUNT 10
#define COUNT_LIMIT 12
int count = 0;
int thread_ids[NUM_THREADS] = {0, 1, 2};
pthread_mutex_t count_mutex;
pthread_cond_t count_threshold_cv;
void *inc_count(void *idp) {
int j, i;
double result = 0.0;
int my_id = *((int *) idp); // a pointer is passed in; dereference it to get the thread ID
for (i = 0; i < TCOUNT; i++) {
pthread_mutex_lock(&count_mutex);
count++;
if (count == COUNT_LIMIT) {
pthread_cond_signal(&count_threshold_cv); // wake the waiting thread
printf("inc_count(): thread %d, count = %d Threshold reached.\n", my_id, count);
}
printf("inc_count(): thread %d, count = %d, unlocking mutex\n", my_id, count);
pthread_mutex_unlock(&count_mutex);
// Simulate some computation
for (j = 0; j < 1000; j++) {
result = result + (double) rand();
}
}
pthread_exit(NULL);
}
void *watch_count(void *idp) {
int my_id = *((int *) idp);
printf("watch_count(): thread %d, count = %d, waiting\n", my_id, count);
pthread_mutex_lock(&count_mutex);
while (count < COUNT_LIMIT) {
pthread_cond_wait(&count_threshold_cv, &count_mutex); // wait for a signal on the condition variable
printf("watch_count(): thread %d, count = %d, Threshold reached.\n", my_id, count);
}
pthread_mutex_unlock(&count_mutex);
pthread_exit(NULL);
}
int main(int argc, char *argv[]) {
int i;
pthread_t threads[NUM_THREADS];
pthread_attr_t attr;
// Initialize the mutex and the condition variable
pthread_mutex_init(&count_mutex, NULL);
pthread_cond_init(&count_threshold_cv, NULL);
// Initialize the thread attributes
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); // make the threads joinable explicitly
// Create the threads
pthread_create(&threads[0], &attr, inc_count, (void *) &thread_ids[0]);
pthread_create(&threads[1], &attr, inc_count, (void *) &thread_ids[1]);
pthread_create(&threads[2], &attr, watch_count, (void *) &thread_ids[2]);
// Wait for the threads to finish
for (i = 0; i < NUM_THREADS; i++) {
pthread_join(threads[i], NULL);
}
printf("Main(): waited for %d threads to complete\n", NUM_THREADS);
// Destroy the mutex, the condition variable, and the thread attributes
pthread_mutex_destroy(&count_mutex);
pthread_cond_destroy(&count_threshold_cv);
pthread_attr_destroy(&attr);
return 0;
}
Output:
inc_count(): thread 0, count = 1, unlocking mutex
watch_count(): thread 2, count = 1, waiting
inc_count(): thread 0, count = 2, unlocking mutex
inc_count(): thread 1, count = 3, unlocking mutex
inc_count(): thread 1, count = 4, unlocking mutex
inc_count(): thread 0, count = 5, unlocking mutex
inc_count(): thread 1, count = 6, unlocking mutex
inc_count(): thread 0, count = 7, unlocking mutex
inc_count(): thread 1, count = 8, unlocking mutex
inc_count(): thread 0, count = 9, unlocking mutex
inc_count(): thread 1, count = 10, unlocking mutex
inc_count(): thread 0, count = 11, unlocking mutex
inc_count(): thread 1, count = 12 Threshold reached.
inc_count(): thread 1, count = 12, unlocking mutex
inc_count(): thread 0, count = 13, unlocking mutex
watch_count(): thread 2, count = 13, Threshold reached.
inc_count(): thread 1, count = 14, unlocking mutex
inc_count(): thread 0, count = 15, unlocking mutex
inc_count(): thread 1, count = 16, unlocking mutex
inc_count(): thread 0, count = 17, unlocking mutex
inc_count(): thread 1, count = 18, unlocking mutex
inc_count(): thread 0, count = 19, unlocking mutex
inc_count(): thread 1, count = 20, unlocking mutex
Main(): waited for 3 threads to complete
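Read-write locks#
A read-write lock provides two lock functions: one locks for reading, the other for writing.
Several threads can hold the lock at the same time through the read-lock function, but only one thread at a time can hold it through the write-lock function.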
Author: AuroraKelsey
Source: https://www.cnblogs.com/AuroraKelsey/p/18669399
License: this work is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International license.