Threads, Locks, and Condition Variables in C++

Created: 2024-06-19T17:17+08:00
Published: 2024-11-18T10:39+08:00
Categories: C-CPP

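Creating and running threads
Locks
- lock guard example
- How mutexes are implemented (explanation by GPT)
Condition variables
-pthread and -lpthread
Debugging multithreaded programs with gdb
Exercise
- Multithreaded data storage and retrieval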

#include <thread>
#include <mutex>
#include <condition_variable>
using std::thread;
using std::mutex;
using std::condition_variable;

Reference: C++多线程并发基础入门教程 (an introductory tutorial on C++ multithreading and concurrency), 知乎 (Zhihu)

Creating and running threads

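The std::thread constructor takes a callable followed by the arguments to pass to it.
A thread starts running as soon as it is constructed.
If main needs to wait for a thread to finish, it has to call join() on the thread object. join() may be called only once; a second call throws std::system_error, and if nothing catches it the program terminates with a rather unhelpful message: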
```
terminate called after throwing an instance of 'std::system_error'
what():  Invalid argument
```
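Never calling join() is an error too. Destroying a thread object that is still joinable (neither joined nor detached) calls std::terminate: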
```
terminate called without an active exception
```
std::this_thread::get_id() returns the ID of the calling thread.
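
Both error messages above depend on whether the thread object is still joinable. The following is a minimal sketch of guarding join() with joinable(); join_if_joinable and count_joins are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <thread>

// Joins t only if it is still joinable; returns whether a join happened.
bool join_if_joinable(std::thread &t)
{
    if (t.joinable())
    {
        t.join();
        return true;
    }
    return false;
}

// Hypothetical demo: the second call is a no-op instead of an exception.
int count_joins()
{
    std::thread t{[] {}}; // a thread that does nothing
    int joins = 0;
    joins += join_if_joinable(t); // joinable: joins the thread
    joins += join_if_joinable(t); // no longer joinable: skipped
    return joins;
}
```

Here count_joins() returns 1, because only the first call actually joins.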

The example below starts a dedicated thread from main() to print a vector<int>:

#include <thread>
#include <iostream>
#include <vector>

using std::thread; // needed for the unqualified `thread` in main()
using std::vector;

void log_vec(const vector<int> &vec)
{
    std::cout << "thread_id: " << std::this_thread::get_id() << std::endl;
    for (auto &x : vec)
    {
        std::cout << x << std::endl;
    }
    return;
}


int main()
{
    vector<int> v{1, 2, 3};
    thread t{log_vec, v};
    t.join();
    return 0;
}
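
One detail worth knowing about the example above: std::thread copies its arguments, so log_vec receives a reference to the thread's own copy of v. If the thread is meant to work on the caller's object, the argument can be wrapped in std::ref. A minimal sketch, where append_one and run_append are hypothetical names:

```cpp
#include <cassert>
#include <functional> // std::ref
#include <thread>
#include <vector>

// Hypothetical helper: appends one element to the vector it is given.
void append_one(std::vector<int> &vec)
{
    vec.push_back(42);
}

// Starts a thread that modifies the caller's vector, then waits for it.
std::vector<int> run_append()
{
    std::vector<int> v{1, 2, 3};
    // std::ref makes the thread receive a reference to v instead of a copy.
    // Passing plain v would not compile here, because the copied argument
    // cannot bind to a non-const lvalue reference.
    std::thread t{append_one, std::ref(v)};
    t.join(); // v must outlive the thread, so join before v goes out of scope
    return v;
}
```

After run_append() the vector holds four elements, the last one being the 42 written by the thread.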

Locks

#include<mutex>

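mutex: the simplest mutual-exclusion lock. lock() and unlock() have to be called by hand.
A thread may lock() a std::mutex only once; locking it again from the same thread is undefined behavior and in practice usually blocks forever.
unlock() must likewise be paired with a lock(): unlocking a mutex that the calling thread does not hold is undefined behavior, even if it appears to work.

lock_guard<mutex>: guards against forgetting to unlock a mutex, because lock_guard unlocks it in its destructor.
This is an instance of RAII.
The constructor locks the mutex automatically; passing std::adopt_lock skips the locking, in which case the calling thread must already own the mutex.
Note: lock_guard has no lock() or unlock() members, and the mutex should not be locked or unlocked by hand while a lock_guard manages it.
Its definition is shown below: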

 /** @brief A simple scoped lock type.
*
* A lock_guard controls mutex ownership within a scope, releasing
* ownership in the destructor.
*/
 template<typename _Mutex>
 class lock_guard
 {
 public:
   typedef _Mutex mutex_type;

   explicit lock_guard(mutex_type& __m) : _M_device(__m)
   { _M_device.lock(); }

   lock_guard(mutex_type& __m, adopt_lock_t) noexcept : _M_device(__m)
   { } // calling thread owns mutex

   ~lock_guard()
   { _M_device.unlock(); }

   lock_guard(const lock_guard&) = delete;
   lock_guard& operator=(const lock_guard&) = delete;

 private:
   mutex_type&  _M_device;
 };
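
The adopt_lock constructor in the definition above is typically useful when the mutexes were already locked by other means, for example by std::lock. A minimal sketch under that assumption; Account and transfer are hypothetical names, and the two accounts are assumed to be different objects:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type for illustration: a balance protected by its own mutex.
struct Account
{
    std::mutex m;
    int balance = 0;
};

// Moves `amount` from one account to another. Assumes &from != &to.
void transfer(Account &from, Account &to, int amount)
{
    // std::lock acquires both mutexes with a deadlock-avoidance algorithm.
    std::lock(from.m, to.m);
    // adopt_lock: the mutexes are already locked, so the guards do not lock
    // again; they only take over the job of unlocking at scope exit.
    std::lock_guard<std::mutex> g1(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}
```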

unique_lock: offers everything lock_guard does, and additionally supports manual lock() and unlock() as well as move.
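
A minimal sketch of the two extra abilities of unique_lock, moving and manual unlocking. demo_mutex, acquire and unique_lock_demo are hypothetical names used only here:

```cpp
#include <cassert>
#include <mutex>

std::mutex demo_mutex; // hypothetical mutex for the demo

// Locks demo_mutex and hands ownership of the lock to the caller by move.
std::unique_lock<std::mutex> acquire()
{
    std::unique_lock<std::mutex> ul(demo_mutex); // locks on construction
    return ul;                                   // moved out, still locked
}

// Returns true if the lock state changed as expected at every step.
bool unique_lock_demo()
{
    std::unique_lock<std::mutex> ul = acquire();
    bool owned_after_move = ul.owns_lock();   // ownership arrived with the move
    ul.unlock();                              // release early, by hand
    bool owned_after_unlock = ul.owns_lock(); // no longer owned
    ul.lock();                                // take it again
    bool owned_after_relock = ul.owns_lock();
    return owned_after_move && !owned_after_unlock && owned_after_relock;
    // The destructor unlocks only if the lock is owned at that point.
}
```

This flexibility is also why condition_variable::wait takes a unique_lock rather than a lock_guard: wait has to unlock and relock the mutex itself.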

lock guard example

Here lock_guard manages the lock. There is no lock_guard::unlock(); the lock is released when the guard goes out of scope.

#include <thread>
#include <mutex>
#include <iostream>
#include <chrono>

using std::mutex;
using std::thread;

mutex m;
int cnt = 0;
int max = 10;

void count(int tid)
{
    while (true)
    {
        { // use this bracket pair to create a scope
            std::lock_guard<mutex> lg(m);
            if (cnt >= max)
            {
                return;
            }
            cnt += 1;
            std::cout << "thread " << tid << " count " << cnt << std::endl;
        } // out of scope, lock_guard unlock
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
    return;
}

int main()
{

    std::thread t1{count, 1};
    std::thread t2{count, 2};
    t1.join();
    t2.join();
    return 0;
}

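How mutexes are implemented (explanation by GPT)

Mutex implementations differ. On a modern operating system, a thread that tries to lock an already locked mutex does not burn much CPU time spinning; it is normally put to sleep instead.

Modern operating systems offer several low-level mechanisms for implementing a mutex. Some common ones:

Test-and-Set: based on a hardware instruction. An atomic operation reads the lock state and sets it in a single step. If the lock is already taken, the thread busy-waits, retrying until it gets the lock. Busy waiting is only reasonable when the lock is held very briefly; under contention it wastes CPU time.

Compare-and-Swap: also based on a hardware instruction. An atomic operation compares the current lock state with an expected state and swaps in the new state only if the two are equal. If the lock is already taken, the thread busy-waits, retrying until it gets the lock. CAS is more flexible than test-and-set because the update happens only when the state matches the expected value.

Semaphore: a counter that controls access to a shared resource. A mutex can be built from a binary semaphore whose counter is either 0 or 1. When a thread tries to acquire the lock and the counter is 0, the thread blocks until the counter becomes 1. When a thread releases the lock, the counter is incremented and a waiting thread is woken.

Futex (fast userspace mutex): a Linux mechanism in which the uncontended path is handled entirely in user space with atomic operations, and the kernel gets involved only under contention. A thread that finds the lock taken asks the kernel to put it to sleep on a wait queue. When the lock is released, the kernel wakes a waiting thread.

These mechanisms vary with the operating system and the hardware platform. A modern operating system usually picks whichever suits the situation in order to provide an efficient mutex.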
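
To make the test-and-set variant concrete, here is a minimal spinlock sketch built on std::atomic_flag. It illustrates the busy-waiting approach only and is not how std::mutex is implemented; SpinLock and spin_count are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical test-and-set spinlock for illustration.
class SpinLock
{
public:
    void lock()
    {
        // test_and_set atomically sets the flag and returns its old value.
        // An old value of true means someone else holds the lock: keep trying.
        while (flag_.test_and_set(std::memory_order_acquire))
        {
            std::this_thread::yield(); // be polite while busy-waiting
        }
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Two threads increment a plain int, protected only by the spinlock.
int spin_count(int per_thread)
{
    SpinLock sl;
    int counter = 0;
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i)
        {
            sl.lock();
            ++counter;
            sl.unlock();
        }
    };
    std::thread a{work};
    std::thread b{work};
    a.join();
    b.join();
    return counter;
}
```

Because every increment happens while the lock is held, spin_count(n) returns 2 * n. A waiting thread never sleeps here, which is exactly the CPU cost that futex-style implementations avoid.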

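Condition variables

A condition variable lets a thread wait and be woken up.

Take the producer-consumer model. A consumer acquires the lock, checks the queue and finds nothing to consume. It then needs to release the lock and go into a waiting state
until a producer tells it that there is something to consume, at which point it resumes.

Without a condition variable, this can be done by polling:

// consumer polls the queue until it has something to consume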

while (true):
    lock queue
    if (queue.size() == 0):
        unlock queue
    else:
        elem = queue.pop()
        unlock queue
        consume elem

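When the queue is empty, a crowd of consumers keeps fighting over the lock in a while loop just to check whether the queue has anything in it, which wastes CPU time.
When there is nothing to consume, we would rather have every consumer sleep until someone wakes it up.
Consumer: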

// consumer can wait

while (true):
    lock queue
    while (queue.size() == 0):
        unlock & sleep
        wakeup & get lock

    // at this point the queue is locked and queue.size() != 0
    elem = queue.pop()
    unlock queue
    consume elem

Producer:

while(true):
    lock queue
    if queue is full:
        unlock queue
        notify all consumers
    else:
        add 1 element to queue
        unlock queue and notify 1 consumer

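This is exactly what condition_variable is for.

condition_variable.wait(unique_lock): wait() releases the lock and puts the thread to sleep.
Once woken, the thread leaves the sleeping state and tries to reacquire the lock; wait() returns only after the lock has been reacquired.
The mutex must already be locked by the calling thread before wait() is called.
condition_variable.notify_all(): wakes every thread sleeping on the cv; they then all compete for the lock.
It is therefore best to release the lock before notifying; otherwise the woken threads immediately block on the mutex until the notifying thread releases it. Notifying while still holding the lock is correct, just potentially slower.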

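A thread woken from `cv.wait(lock)` automatically reacquires the lock

If several threads are blocked in wait(lock), a single notify_all() wakes all of them and they compete for the lock.
A thread that loses the race has still left the waiting state; it now blocks on lock.lock() and continues once the lock becomes available, without needing another notification.

The code below verifies this: a single notify_all() wakes every thread and each one goes on to acquire the lock. See ./wai-notify.cpp.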

#include <thread>
#include <mutex>
#include <iostream>
#include <chrono>
#include <vector>
#include <condition_variable>

using std::condition_variable;
using std::lock_guard;
using std::mutex;
using std::thread;
using std::unique_lock;
using std::vector;

condition_variable cv;
const int num_thread = 3; // const so that threads[num_thread] is a standard fixed-size array
mutex m;

void wakeup_resume() {
    unique_lock<mutex> ql(m);
    cv.wait(ql);
    std::cout << "resume" << std::endl;
}

int main()
{
    thread threads[num_thread];
    for (int i = 0; i < num_thread; ++i) {
        threads[i] = thread{wakeup_resume};
    }

    // Give every thread time to block in wait(); otherwise notify_all() may run before any thread is waiting.
    std::this_thread::sleep_for(std::chrono::seconds(1));
    cv.notify_all();

    for (int i = 0; i < num_thread; ++i) {
        threads[i].join();
    }

    return 0;
}

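Spurious wakeups

References on spurious wakeups:

In short, a thread that has called cv.wait(lock) may wake up without any cv.notify_*() call, and even after a real notification another thread may have changed the condition before this thread reacquires the lock. The condition therefore has to be rechecked after waking, which is why cv.wait is placed inside a while loop.

Condition variables exist to support an event-style mechanism.
The while loop exists to guard against spurious wakeups.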
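
The standard library also provides an overload of wait that takes a predicate and runs this while loop internally: cv.wait(lock, pred) behaves like while (!pred()) cv.wait(lock);. A minimal sketch, where ready_mutex, ready_cv, ready and wait_for_ready are hypothetical names:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex ready_mutex;           // hypothetical names for the demo
std::condition_variable ready_cv;
bool ready = false;               // the condition, protected by ready_mutex

// Starts a waiter thread, sets the condition, and returns 1 once the waiter ran.
int wait_for_ready()
{
    int observed = 0;
    std::thread waiter{[&] {
        std::unique_lock<std::mutex> ul(ready_mutex);
        // Equivalent to: while (!ready) ready_cv.wait(ul);
        // A spurious wakeup simply rechecks `ready` and goes back to sleep.
        ready_cv.wait(ul, [] { return ready; });
        observed = 1;
    }};
    {
        std::lock_guard<std::mutex> lg(ready_mutex);
        ready = true; // change the condition while holding the mutex
    }
    ready_cv.notify_one(); // notify after releasing the lock
    waiter.join();
    return observed;
}
```

Because the predicate is checked before the first sleep, the waiter does not hang even if notify_one() runs before it reaches wait(), so no sleep_for is needed to order the two threads.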

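Producer-consumer example

The code below assumes one producer and several consumers operating on a single queue.
Because the number of jobs is finite, the producer sets no_more_job once the jobs run out so that the consumers can exit.

The code is in ./producer-consumer.cpp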

#include <thread>
#include <mutex>
#include <iostream>
#include <chrono>
#include <vector>
#include <queue>
#include <condition_variable>
#include <atomic>
using std::atomic;
using std::condition_variable;
using std::lock_guard;
using std::mutex;
using std::queue;
using std::thread;
using std::unique_lock;
using std::vector;

mutex m;
condition_variable cv;
queue<int> q{};
int q_cap = 3;

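atomic<bool> no_more_job{false}; // written by the producer without the lock, so it must be atomic
const int consumer_num = 3;      // const so that it can be used as an array size
atomic<int> quit_consumer_num{0};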

void consumer()
{
    unique_lock<mutex> ul(m, std::defer_lock);
    while (true)
    {
        ul.lock();
        while (q.empty()) // use while to avoid "spurious wakeup"
        {
            cv.wait(ul); // wait will automatically release lock
            if (no_more_job && q.empty()) // logical AND, not bitwise
            {
                ul.unlock();
                quit_consumer_num.fetch_add(1);
                return;
            }
        }
        auto x = q.front();
        q.pop();
        std::cout << "consumer " << std::this_thread::get_id() << " get " << x << std::endl;
        ul.unlock();
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

void producer()
{
    queue<int> to_add({1, 2, 3, 4, 5, 6, 7, 8, 9});

    while (!to_add.empty())
    {
        m.lock();
        if (q.size() < q_cap)
        {
            int x = to_add.front();
            to_add.pop();
            q.push(x);
            std::cout << "producer add " << x << std::endl;
            m.unlock();
            cv.notify_all();
        }
        else
        { // q is full
            m.unlock();
            cv.notify_all();
        }
    }
    no_more_job = true;
    // Keep notifying until every consumer has seen no_more_job and exited.
    while (quit_consumer_num != consumer_num)
    {
        cv.notify_one();
    }

    return;
}

int main()
{
    thread p{producer};
    thread threads[consumer_num];
    for (int i = 0; i < consumer_num; ++i)
    {
        threads[i] = std::thread{consumer};
    }

    for (int i = 0; i < consumer_num; ++i)
    {
        threads[i].join();
    }
    p.join();

    return 0;
}

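`-pthread` and `-lpthread`

How are -pthread and -lpthread related?

Threading has more than one implementation, and some versions of the headers are not compatible with the POSIX API, so the build has to select a POSIX-compatible threading implementation. The -pthread option does this at compile time by defining the necessary macros.
-pthread also tells the linker to link against the library that implements the POSIX threads API, so a separate -lpthread is not needed.

References: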

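Debugging multithreaded programs with gdb

info threads                   lists the threads of the current process
thread <ID>                    switches debugging to the thread with the given ID
break test.c:100               sets a breakpoint on that line; by default it applies to all threads
break test.c:100 thread <ID>   restricts the breakpoint to the given thread

set scheduler-locking off|on
  off   no thread is locked and all threads may run (the default behavior in ordinary debugging)
  on    only the thread currently being debugged runs

Reference: GDB 多线程之旅 (a tour of multithreaded debugging with GDB), 知乎 (Zhihu)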

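Exercise

Multithreaded data storage and retrieval

Two APIs are given, one that takes data out and one that puts data in. Implement an intermediate function that lets a large amount of data be filled in order while limiting how much data goes in per call. It should use little memory and run fast.

This exercise comes from the internet and its background is described very vaguely.
The solution is a producer and a consumer operating on a ring buffer, with the consumer sleeping instead of polling.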
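
Since the exercise statement is vague, the following is only a sketch of one possible reading: a fixed-capacity ring buffer where the producer sleeps while the buffer is full and the consumer sleeps while it is empty. RingBuffer and transfer_in_order are hypothetical names, and the element type int is an assumption.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical fixed-capacity blocking ring buffer of ints.
class RingBuffer
{
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity) {}

    // Blocks while the buffer is full, then stores x at the tail.
    void push(int x)
    {
        std::unique_lock<std::mutex> ul(m_);
        not_full_.wait(ul, [this] { return size_ < buf_.size(); });
        buf_[(head_ + size_) % buf_.size()] = x;
        ++size_;
        ul.unlock();
        not_empty_.notify_one(); // wake a sleeping consumer, if any
    }

    // Blocks while the buffer is empty, then removes the oldest element.
    int pop()
    {
        std::unique_lock<std::mutex> ul(m_);
        not_empty_.wait(ul, [this] { return size_ > 0; });
        int x = buf_[head_];
        head_ = (head_ + 1) % buf_.size();
        --size_;
        ul.unlock();
        not_full_.notify_one(); // wake a sleeping producer, if any
        return x;
    }

private:
    std::mutex m_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::vector<int> buf_;  // storage, allocated once
    std::size_t head_ = 0;  // index of the oldest element
    std::size_t size_ = 0;  // number of stored elements
};

// One producer pushes 0..n-1; the caller consumes and collects them in order.
std::vector<int> transfer_in_order(int n, std::size_t capacity)
{
    RingBuffer rb(capacity);
    std::thread producer{[&] {
        for (int i = 0; i < n; ++i)
        {
            rb.push(i);
        }
    }};
    std::vector<int> out;
    for (int i = 0; i < n; ++i)
    {
        out.push_back(rb.pop());
    }
    producer.join();
    return out;
}
```

Memory use is bounded by the capacity regardless of how much data passes through, and neither side polls: each one sleeps on its own condition variable until the other makes progress.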

posted @ 2024-11-18 14:08 dutrmp19
