AQS (Part 3): Condition Queues (based on JDK 8)

1 Introduction
- 1.1 Node
2 await
3 signal/signalAll
- 3.1 doSignal/doSignalAll
- 3.2 transferForSignal
4 Summary and usage

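1 Introduction

References: https://segmentfault.com/a/1190000016058789 and https://blog.csdn.net/anlian523/article/details/106598910.

AQS works with two kinds of queues: the doubly linked CLH queue (the sync queue) and the singly linked condition queue.

The previous articles covered how the doubly linked CLH queue implements exclusive and shared locks. This article covers only the AQS condition queue.

For the locks discussed here there are three correspondences: lock.lock() and lock.unlock() correspond to synchronized, await corresponds to wait, and signal/signalAll correspond to notify/notifyAll.

The await/signal mechanism can be understood by analogy with wait/notify:

- A thread that calls wait must already be inside a synchronized block, that is, it must already hold the monitor lock. Likewise, a thread that calls await must already hold the lock.
- A thread that calls wait releases the monitor lock it holds and enters that monitor's wait set. Likewise, a thread that calls await releases the lock it holds and enters the condition queue of the current Condition.
- Calling notify on the monitor wakes a thread waiting on that monitor; the thread then competes for the lock again and, once it has the lock, resumes from the wait call. Likewise, calling signal on a Condition wakes a thread in the corresponding condition queue; the thread then competes for the lock again and, once it has the lock, resumes from the await call.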
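
The analogy can be made concrete with a small sketch. The classes below (MonitorSlot, ConditionSlot and SlotDemo) are hypothetical names made up for illustration, not part of the JDK: the same one-slot mailbox is written once with synchronized/wait/notifyAll and once with ReentrantLock/await/signalAll.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical one-slot mailbox guarded by the object's monitor. */
class MonitorSlot {
    private Object value;                        // null means "empty"

    public synchronized void put(Object v) throws InterruptedException {
        while (value != null)
            wait();                              // releases the monitor, joins the wait set
        value = v;
        notifyAll();                             // woken threads compete for the monitor again
    }

    public synchronized Object take() throws InterruptedException {
        while (value == null)
            wait();
        Object v = value;
        value = null;
        notifyAll();
        return v;
    }
}

/** The same mailbox with an explicit lock and a Condition. */
class ConditionSlot {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private Object value;

    public void put(Object v) throws InterruptedException {
        lock.lock();                             // corresponds to entering synchronized
        try {
            while (value != null)
                changed.await();                 // releases the lock, joins the condition queue
            value = v;
            changed.signalAll();                 // corresponds to notifyAll
        } finally {
            lock.unlock();                       // corresponds to leaving synchronized
        }
    }

    public Object take() throws InterruptedException {
        lock.lock();
        try {
            while (value == null)
                changed.await();
            Object v = value;
            value = null;
            changed.signalAll();
            return v;
        } finally {
            lock.unlock();
        }
    }
}

class SlotDemo {
    /** Passes one value through each mailbox using a producer thread. */
    static String roundTrip() {
        MonitorSlot m = new MonitorSlot();
        ConditionSlot c = new ConditionSlot();
        Thread producer = new Thread(() -> {
            try {
                m.put("monitor");
                c.put("condition");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            return m.take() + "," + c.take();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

One practical difference the sketch does not show: a monitor has a single wait set, while one lock can create several Condition objects, which the BoundedBuffer example at the end of this article relies on.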

1.1 Node

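First, a quick recap of the earlier articles.

Node is a nested class of AQS. It holds waitStatus, which takes one of five values; prev and next, the links between nodes in the sync queue; thread, the thread the node represents; and nextWaiter.

When a Node is used in the sync queue, nextWaiter marks whether the lock mode is exclusive or shared. When a Node is used in a condition queue, nextWaiter points to the next node of the singly linked list.

The most important part is the five states: CANCELLED, SIGNAL, CONDITION, PROPAGATE and 0.

- CANCELLED(1): the node has been cancelled. A timeout or an interrupt (when the wait responds to interrupts) triggers this state, and a node never leaves it afterwards.
- SIGNAL(-1): the successor is waiting for this node to wake it. When a successor enqueues, it updates its predecessor's status to SIGNAL.
- CONDITION(-2): the node is waiting on a Condition. When another thread calls signal() on that Condition, the CONDITION node is moved from the condition queue to the sync queue, where it waits to acquire the lock.
- PROPAGATE(-3): in shared mode, a predecessor wakes not only its successor but possibly the successor's successor as well.
- 0: the default state of a new node entering the sync queue.

Also note that waitStatus > 0 always means CANCELLED.

The states that matter in this article are CANCELLED, CONDITION and 0.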

    static final class Node {
        // In a sync queue node, these markers tell shared mode from exclusive mode
        /** Marker to indicate a node is waiting in shared mode */
        static final Node SHARED = new Node();
        /** Marker to indicate a node is waiting in exclusive mode */
        static final Node EXCLUSIVE = null;

        // The five waitStatus values follow
        /** waitStatus value to indicate thread has cancelled */
        static final int CANCELLED =  1;
        /** waitStatus value to indicate successor's thread needs unparking */
        static final int SIGNAL    = -1;
        /** waitStatus value to indicate thread is waiting on condition */
        static final int CONDITION = -2;
        /**
         * waitStatus value to indicate the next acquireShared should
         * unconditionally propagate
         */
        static final int PROPAGATE = -3;
        // The fifth value, the initial state 0, has no named constant

        volatile int waitStatus;
        volatile Node prev;
        volatile Node next;
        volatile Thread thread;
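        // The meaning depends on which queue the node is in:
        // in the sync queue it holds the SHARED/EXCLUSIVE mode marker,
        // in a condition queue it points to the next waiting node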
        Node nextWaiter;

ConditionObject stores the head and tail of the condition queue in firstWaiter and lastWaiter.

public class ConditionObject implements Condition, java.io.Serializable {
    private static final long serialVersionUID = 1173984872572414699L;
    /** First node of condition queue. */
    private transient Node firstWaiter;
    /** Last node of condition queue. */
    private transient Node lastWaiter;

2 await

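A thread calls lock.lock() before it calls await. In other words, the thread first tries to acquire the lock; if the quick attempt fails it enters the sync queue, where it repeatedly parks, is woken and retries until it succeeds. By then the thread has left the sync queue and is recorded as exclusiveOwnerThread.

await throws an exception when it is interrupted. The flow is:

1 Check whether the thread is already interrupted; if so, throw InterruptedException

2 Wrap the current thread in a node and append it to the condition queue

3 Release the lock completely, however many times it was re-entered, and remember that count (savedState)

4 Loop: park, then wait for an interrupt or an unpark. On an interrupt, leave the loop immediately; after an unpark, check again whether the node has been moved from the condition queue to the sync queue.

Note that when the thread has been interrupted, transferAfterCancelledWait (called from checkInterruptWhileWaiting) guarantees that the node is on the sync queue before the loop is left. The while condition therefore only matters for the case where park returned without an interrupt.

5 Reacquire the lock from the sync queue with acquireQueued. If the thread was interrupted during that acquisition and interruptMode is not THROW_IE, set interruptMode to REINTERRUPT

6 If node.nextWaiter is not null, the node was cancelled while still linked into the condition queue, so call unlinkCancelledWaiters

7 If interruptMode is not 0, either throw InterruptedException or restore the interrupt status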

        /**
         * Implements interruptible condition wait.
         * <ol>
         * <li> If current thread is interrupted, throw InterruptedException.
         * <li> Save lock state returned by {@link #getState}.
         * <li> Invoke {@link #release} with saved state as argument,
         *      throwing IllegalMonitorStateException if it fails.
         * <li> Block until signalled or interrupted.
         * <li> Reacquire by invoking specialized version of
         *      {@link #acquire} with saved state as argument.
         * <li> If interrupted while blocked in step 4, throw InterruptedException.
         * </ol>
         */
        public final void await() throws InterruptedException {
            // If the thread is already interrupted, throw immediately
            if (Thread.interrupted())
                throw new InterruptedException();
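            // Create a node and append it to the tail of the condition queue
            Node node = addConditionWaiter();
            // Record the reentrancy count, then release the lock completely
            int savedState = fullyRelease(node);
            int interruptMode = 0;
            // Park until the node is on the sync queue or the thread is interrupted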
            while (!isOnSyncQueue(node)) {
                LockSupport.park(this);
                if ((interruptMode = checkInterruptWhileWaiting(node)) != 0)
                    break;
            }
            // acquireQueued keeps trying to reacquire the lock with the saved state
            if (acquireQueued(node, savedState) && interruptMode != THROW_IE)
                interruptMode = REINTERRUPT;
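            if (node.nextWaiter != null) // clean up if cancelled
                // Traverse the condition queue and unlink every non-CONDITION node
                unlinkCancelledWaiters();
            if (interruptMode != 0)
                // Either throw InterruptedException or re-interrupt the current thread
                reportInterruptAfterWait(interruptMode);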
        }

2.1 addConditionWaiter

Create a new node and append it to the tail of the condition queue.

/**
         * Adds a new waiter to wait queue.
         * @return its new wait node
         */
        private Node addConditionWaiter() {
            Node t = lastWaiter;
            // If lastWaiter is cancelled, clean out.
            // If t has been cancelled, unlink every cancelled node and re-read lastWaiter
            if (t != null && t.waitStatus != Node.CONDITION) {
                unlinkCancelledWaiters();
                t = lastWaiter;
            }
            // Create a node for the current thread with status CONDITION
            Node node = new Node(Thread.currentThread(), Node.CONDITION);
            // Append the node to the tail
            if (t == null)
                firstWaiter = node;
            else
                t.nextWaiter = node;
            lastWaiter = node;
            return node;
        }

2.2 unlinkCancelledWaiters

/**
         * Unlinks cancelled waiter nodes from condition queue.
         * Called only while holding lock. This is called when
         * cancellation occurred during condition wait, and upon
         * insertion of a new waiter when lastWaiter is seen to have
         * been cancelled. This method is needed to avoid garbage
         * retention in the absence of signals. So even though it may
         * require a full traversal, it comes into play only when
         * timeouts or cancellations occur in the absence of
         * signals. It traverses all nodes rather than stopping at a
         * particular target to unlink all pointers to garbage nodes
         * without requiring many re-traversals during cancellation
         * storms.
         */
        // Remove every node whose status is not CONDITION from the condition queue
        private void unlinkCancelledWaiters() {
            Node t = firstWaiter;
            Node trail = null;
            while (t != null) {
                Node next = t.nextWaiter;
                if (t.waitStatus != Node.CONDITION) {
                    t.nextWaiter = null;
                    if (trail == null)
                        firstWaiter = next;
                    else
                        trail.nextWaiter = next;
                    if (next == null)
                        lastWaiter = trail;
                }
                else
                    trail = t;
                t = next;
            }
        }
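
To see the pointer surgery in isolation, here is a minimal sketch on a toy list. ToyConditionQueue and ToyNode are hypothetical stand-ins, not the JDK classes; only the traversal mirrors unlinkCancelledWaiters, with trail tracking the last node that was kept.

```java
/** Toy model of a condition queue; NOT the JDK classes, only the same list surgery. */
class ToyConditionQueue {
    static final int CONDITION = -2;
    static final int CANCELLED = 1;

    static final class ToyNode {
        final String name;
        int waitStatus;
        ToyNode nextWaiter;

        ToyNode(String name, int waitStatus) {
            this.name = name;
            this.waitStatus = waitStatus;
        }
    }

    ToyNode firstWaiter;
    ToyNode lastWaiter;

    /** Appends a node at the tail, like addConditionWaiter without the cleanup. */
    void add(String name, int waitStatus) {
        ToyNode n = new ToyNode(name, waitStatus);
        if (lastWaiter == null)
            firstWaiter = n;
        else
            lastWaiter.nextWaiter = n;
        lastWaiter = n;
    }

    /** Same traversal as unlinkCancelledWaiters: trail is the last kept node. */
    void unlinkCancelled() {
        ToyNode t = firstWaiter;
        ToyNode trail = null;
        while (t != null) {
            ToyNode next = t.nextWaiter;
            if (t.waitStatus != CONDITION) {
                t.nextWaiter = null;            // detach the removed node
                if (trail == null)
                    firstWaiter = next;         // removed node was the head
                else
                    trail.nextWaiter = next;    // bridge over the removed node
                if (next == null)
                    lastWaiter = trail;         // removed node was the tail
            } else {
                trail = t;
            }
            t = next;
        }
    }

    /** Names of the remaining nodes from head to tail, comma separated. */
    String names() {
        StringBuilder sb = new StringBuilder();
        for (ToyNode n = firstWaiter; n != null; n = n.nextWaiter) {
            if (sb.length() > 0)
                sb.append(',');
            sb.append(n.name);
        }
        return sb.toString();
    }
}
```

With the nodes A, B, C, D, E where A, C and E are cancelled, one call to unlinkCancelled leaves B,D and moves lastWaiter back to D. The head, middle and tail removals are all handled by the same loop.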

2.3 fullyRelease

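In an exclusive lock, state > 0 means the owning thread has acquired the lock state times (reentrancy). fullyRelease releases all of them at once, after which the thread queued behind head in the sync queue is woken and tries to acquire the lock.

If release fails, the current thread is not the recorded owner (getExclusiveOwnerThread). An IllegalMonitorStateException is thrown and the node is marked CANCELLED.

release calls tryRelease. Taking ReentrantLock as the example, its overridden tryRelease checks whether the current thread is the recorded owner and throws the exception itself if it is not.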

    /**
     * Invokes release with current state value; returns saved state.
     * Cancels node and throws exception on failure.
     * @param node the condition node for this wait
     * @return previous sync state
     */
    // Release every reentrant hold of the current thread, that is, release the lock
    final int fullyRelease(Node node) {
        boolean failed = true;
        try {
            // Record the reentrancy count so it can be restored later
            int savedState = getState();
            // Release all holds at once
            if (release(savedState)) {
                failed = false;
                return savedState;
            } else {
                throw new IllegalMonitorStateException();
            }
        } finally {
            // On failure the node is marked CANCELLED
            if (failed)
                node.waitStatus = Node.CANCELLED;
        }
    }

// ReentrantLock.java
protected final boolean tryRelease(int releases) {
            int c = getState() - releases;
            if (Thread.currentThread() != getExclusiveOwnerThread())
                throw new IllegalMonitorStateException();
            boolean free = false;
            if (c == 0) {
                free = true;
                setExclusiveOwnerThread(null);
            }
            setState(c);
            return free;
        }
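
The effect of fullyRelease and the saved state can be observed from outside through the public ReentrantLock API. The sketch below is only an illustration under assumptions: FullReleaseDemo and its polling loop are made up for this article, and getHoldCount and hasWaiters are used purely for observation. A thread locks twice and awaits; while it waits, another thread can take the lock, and after the wait the hold count is back to 2.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class FullReleaseDemo {
    /** Returns {hold count before await, hold count after await, 1 if main got the lock meanwhile}. */
    static int[] run() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        int[] result = new int[3];
        boolean[] ready = {false};                  // guarded by lock

        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.lock();                            // re-enter: the lock state is now 2
            try {
                result[0] = lock.getHoldCount();
                while (!ready[0])
                    cond.await();                   // releases both holds at once
                result[1] = lock.getHoldCount();    // restored from the saved state
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
                lock.unlock();
            }
        });
        waiter.start();

        try {
            for (;;) {                              // poll until the waiter is in await
                lock.lock();
                try {
                    if (lock.hasWaiters(cond)) {
                        result[2] = 1;              // we hold the lock although the waiter locked twice
                        ready[0] = true;
                        cond.signal();
                        break;
                    }
                } finally {
                    lock.unlock();
                }
                Thread.sleep(10);
            }
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

The polling loop is a simplification to keep the sketch short; it only exists so that the main thread signals after the waiter has really entered await.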

2.4 isOnSyncQueue

Checks whether node is on the sync queue.

findNodeFromTail handles the tail-fork case; see the first article in this AQS series for what a tail fork is.

    /**
     * Returns true if a node, always one that was initially placed on
     * a condition queue, is now waiting to reacquire on sync queue.
     * @param node the node
     * @return true if is reacquiring
     */
    final boolean isOnSyncQueue(Node node) {
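        // Not on the sync queue if the status is still CONDITION or prev is null.
        // The second test works because enq sets prev before it CASes the tail.
        if (node.waitStatus == Node.CONDITION || node.prev == null)
            return false;
        // A node with a successor must be on the sync queue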
        if (node.next != null) // If has successor, it must be on queue
            return true;
        /*
         * node.prev can be non-null, but not yet on queue because
         * the CAS to place it on queue can fail. So we have to
         * traverse from tail to make sure it actually made it.  It
         * will always be near the tail in calls to this method, and
         * unless the CAS failed (which is unlikely), it will be
         * there, so we hardly ever traverse much.
         */
        // Otherwise prev is set but the tail CAS may not have succeeded yet (tail fork): search from the tail
        return findNodeFromTail(node);
    }

    /**
     * Returns true if node is on sync queue by searching backwards from tail.
     * Called only when needed by isOnSyncQueue.
     * @return true if present
     */
    // Search backwards from the tail. Finding node means it is on the sync queue;
    // reaching null means the search ran past the head, so return false.
    private boolean findNodeFromTail(Node node) {
        Node t = tail;
        for (;;) {
            if (t == node)
                return true;
            if (t == null)
                return false;
            t = t.prev;
        }
    }

    /**
     * Inserts node into queue, initializing if necessary. See picture above.
     * @param node the node to insert
     * @return node's predecessor
     */ 
    private Node enq(final Node node) {
        for (;;) {
            Node t = tail;
            if (t == null) { // Must initialize
                if (compareAndSetHead(new Node()))
                    tail = head;
            } else {
                // Set prev first, before the CAS on tail
                node.prev = t;
                if (compareAndSetTail(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }

2.5 checkInterruptWhileWaiting

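park returns in one of two situations that matter here: an unpark or an interrupt. (It may also return spuriously, which the loop in await tolerates.)

checkInterruptWhileWaiting determines how the thread left park. If the thread was interrupted, transferAfterCancelledWait decides which mode to return: REINTERRUPT (restore the interrupt status later) or THROW_IE (throw InterruptedException later). If the thread was not interrupted, the method returns 0.

What transferAfterCancelledWait does: if the node is still on the condition queue when the interrupt happens, it enqueues the node on the sync queue itself and returns true. Otherwise a signal is already transferring the node, so it waits until the signalling thread has finished moving the node to the sync queue and then returns false.

Either way, when transferAfterCancelledWait returns, the node is on the sync queue.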

    /** Mode meaning to reinterrupt on exit from wait */
    // The interrupt status must be restored: the interrupt arrived after the signal
    private static final int REINTERRUPT =  1;
    /** Mode meaning to throw InterruptedException on exit from wait */
    // An exception must be thrown: the interrupt arrived before the signal, while the node was still on the condition queue
    private static final int THROW_IE    = -1;

		/**
    * Checks for interrupt, returning THROW_IE if interrupted
    * before signalled, REINTERRUPT if after signalled, or
    * 0 if not interrupted.
    */
    private int checkInterruptWhileWaiting(Node node) {
         return Thread.interrupted() ?
         (transferAfterCancelledWait(node) ? THROW_IE : REINTERRUPT) :
          0;
     }

    /**
     * Transfers node, if necessary, to sync queue after a cancelled wait.
     * Returns true if thread was cancelled before being signalled.
     *
     * @param node the node
     * @return true if cancelled before the node was signalled
     */
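    // When this method returns, the node is guaranteed to be on the sync queue.
    // It returns true if the interrupt happened before the signal, that is,
    // while the node was still on the condition queue.
    final boolean transferAfterCancelledWait(Node node) {
        // If the CAS succeeds, the node was still on the condition queue.
        // It is enqueued on the sync queue here with enq. Unlinking it from
        // the condition queue does not happen here but later, in the
        // unlinkCancelledWaiters call near the end of await.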
        if (compareAndSetWaitStatus(node, Node.CONDITION, 0)) {
            enq(node);
            return true;
        }
        /*
         * If we lost out to a signal(), then we can't proceed
         * until it finishes its enq().  Cancelling during an
         * incomplete transfer is both rare and transient, so just
         * spin.
         */
        // Not on the sync queue yet: yield and wait for the
        // signalling thread's transferForSignal to finish
        while (!isOnSyncQueue(node))
            Thread.yield();
        return false;
    }
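
The two interrupt modes can be observed from outside. The following is a sketch under assumptions: InterruptModeDemo is a made-up class, the strings it returns merely borrow the names of the private constants, and the waiter deliberately calls await once without a loop so that a single wakeup can be observed. When the interrupt arrives before any signal, await throws; when the signal comes first and the interrupt second, await returns normally with the interrupt status set.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class InterruptModeDemo {
    /**
     * Starts a waiter, then either interrupts it before any signal
     * (signalFirst = false) or signals it and then interrupts it while
     * still holding the lock (signalFirst = true).
     * Returns "THROW_IE" if await threw, "REINTERRUPT" if await returned
     * normally with the interrupt status set, "NONE" otherwise.
     */
    static String run(boolean signalFirst) {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        String[] outcome = {"NONE"};

        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                cond.await();                       // single await on purpose
                if (Thread.currentThread().isInterrupted())
                    outcome[0] = "REINTERRUPT";
            } catch (InterruptedException e) {
                outcome[0] = "THROW_IE";
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        try {
            for (;;) {                              // poll until the waiter is in await
                lock.lock();
                try {
                    if (lock.hasWaiters(cond)) {
                        if (signalFirst)
                            cond.signal();          // node goes to the sync queue first
                        waiter.interrupt();
                        break;
                    }
                } finally {
                    lock.unlock();
                }
                Thread.sleep(10);
            }
            waiter.join();                          // makes outcome[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }
}
```

Because the main thread holds the lock while it signals and interrupts, the order of the two events is fixed, which is what lets the sketch tell the two modes apart.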

2.6 reportInterruptAfterWait

Depending on the interrupt mode, either throw InterruptedException or re-interrupt the current thread.

  /**
  * Throws InterruptedException, reinterrupts current thread, or
  * does nothing, depending on mode.
   */
  private void reportInterruptAfterWait(int interruptMode)
    throws InterruptedException {
    if (interruptMode == THROW_IE)
      throw new InterruptedException();
    else if (interruptMode == REINTERRUPT)
      selfInterrupt();
  }

3 signal/signalAll

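signal and signalAll are almost the same. The difference is that signal wakes one waiter and signalAll wakes all of them. "Wake" here means moving the node from the condition queue to the sync queue.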

/**
 * Moves the longest-waiting thread, if one exists, from the
 * wait queue for this condition to the wait queue for the
 * owning lock.
 *
 * @throws IllegalMonitorStateException if {@link #isHeldExclusively}
 *         returns {@code false}
 */
public final void signal() {
	// First check that the current thread holds the lock
	if (!isHeldExclusively())
		throw new IllegalMonitorStateException();
	Node first = firstWaiter;
	if (first != null)
		doSignal(first);
}

/**
 * Moves all threads from the wait queue for this condition to
 * the wait queue for the owning lock.
 *
 * @throws IllegalMonitorStateException if {@link #isHeldExclusively}
 *         returns {@code false}
 */
public final void signalAll() {
	// First check that the current thread holds the lock
	if (!isHeldExclusively())
		throw new IllegalMonitorStateException();
	Node first = firstWaiter;
	if (first != null)
		doSignalAll(first);
}
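
The isHeldExclusively check is easy to trigger. A minimal sketch, with MissingLockDemo as a made-up class name: calling signal or await on a Condition of a ReentrantLock without holding that lock ends in IllegalMonitorStateException.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class MissingLockDemo {
    /** Returns true if signal() without the lock throws IllegalMonitorStateException. */
    static boolean signalWithoutLockThrows() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        try {
            cond.signal();                  // the lock is not held by this thread
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }

    /** Returns true if await() without the lock throws IllegalMonitorStateException. */
    static boolean awaitWithoutLockThrows() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        try {
            cond.await();                   // fails while trying to release a lock it does not own
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

For await the exception comes from the release path described in 2.3 rather than from an explicit check at the top of the method.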

3.1 doSignal/doSignalAll

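doSignal transfers only one node, while doSignalAll transfers all of them.

If transferForSignal fails in doSignal, it moves on to the next node until one succeeds or the end of the queue is reached. doSignalAll does not care whether each individual transfer succeeds.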

/**
 * Removes and transfers nodes until hit non-cancelled one or
 * null. Split out from signal in part to encourage compilers
 * to inline the case of no waiters.
 * @param first (non-null) the first node on condition queue
 */
// Each iteration first advances the head: firstWaiter = first.nextWaiter.
// If transferForSignal(first) succeeds, the loop exits; if it fails,
// first = firstWaiter moves on to the next node.
private void doSignal(Node first) {
	do {
		if ( (firstWaiter = first.nextWaiter) == null)
			lastWaiter = null;
		first.nextWaiter = null;
	} while (!transferForSignal(first) &&
			 (first = firstWaiter) != null);
}

/**
 * Removes and transfers all nodes.
 * @param first (non-null) the first node on condition queue
 */
private void doSignalAll(Node first) {
	lastWaiter = firstWaiter = null;
	do {
		// Each iteration reads next = first.nextWaiter, transfers first,
		// then sets first = next and checks first != null,
		// so every node is transferred, one at a time
		Node next = first.nextWaiter;
		first.nextWaiter = null;
		transferForSignal(first);
		first = next;
	} while (first != null);
}
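
The difference between the two methods shows up in the length of the condition queue. The sketch below assumes a made-up class SignalCountDemo and uses ReentrantLock.getWaitQueueLength only as an observation tool: with three waiters parked, signal() removes one node from the condition queue and signalAll() removes the rest.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class SignalCountDemo {
    /** Returns the condition queue length {before, after signal(), after signalAll()}. */
    static int[] run() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        boolean[] go = {false};                     // guarded by lock
        Thread[] waiters = new Thread[3];

        for (int i = 0; i < waiters.length; i++) {
            waiters[i] = new Thread(() -> {
                lock.lock();
                try {
                    while (!go[0])
                        cond.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            });
            waiters[i].start();
        }

        int[] lengths = new int[3];
        try {
            for (;;) {                              // poll until all three are in await
                lock.lock();
                try {
                    if (lock.getWaitQueueLength(cond) == waiters.length) {
                        lengths[0] = lock.getWaitQueueLength(cond);
                        go[0] = true;
                        cond.signal();              // moves one node to the sync queue
                        lengths[1] = lock.getWaitQueueLength(cond);
                        cond.signalAll();           // moves every remaining node
                        lengths[2] = lock.getWaitQueueLength(cond);
                        break;
                    }
                } finally {
                    lock.unlock();
                }
                Thread.sleep(10);
            }
            for (Thread t : waiters)
                t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return lengths;
    }
}
```

The lengths are read while the lock is held, so no waiter can enter or leave the condition queue between the calls.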

3.2 transferForSignal

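Moves node from the condition queue to the sync queue. Returns true on success.

Moving a node takes two steps: remove it from the condition queue and add it to the sync queue. Note that the first step, removal from the condition queue, happens in the two methods of 3.1; this method performs only the second step.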

/**
 * Transfers a node from a condition queue onto sync queue.
 * Returns true if successful.
 * @param node the node
 * @return true if successfully transferred (else the node was
 * cancelled before signal)
 */
final boolean transferForSignal(Node node) {
	/*
	 * If cannot change waitStatus, the node has been cancelled.
	 */
  // The CAS fails when the node was cancelled (its status is no longer CONDITION); return false
	if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))
		return false;

	/*
	 * Splice onto queue and try to set waitStatus of predecessor to
	 * indicate that thread is (probably) waiting. If cancelled or
	 * attempt to set waitStatus fails, wake up to resync (in which
	 * case the waitStatus can be transiently and harmlessly wrong).
	 */
  // Enqueue node on the sync queue; enq returns its predecessor
	Node p = enq(node);
	int ws = p.waitStatus;
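  // If the predecessor is cancelled, or setting its status to SIGNAL fails, unpark the thread.
  // The status may be briefly wrong in that case, which is harmless: the unparked
  // thread still has to compete for the lock and parks again if it fails.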
	if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
		LockSupport.unpark(node.thread);
	return true;
}
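
That a signalled thread is only moved to the sync queue, and does not run until the signaller releases the lock, can be checked with ReentrantLock.hasQueuedThread. This is a sketch under assumptions: TransferDemo is a made-up class and the query methods are used only for observation.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class TransferDemo {
    /** Returns {waiter queued for the lock before signal, waiter queued after signal}. */
    static boolean[] run() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        boolean[] signalled = {false};              // guarded by lock
        boolean[] seen = new boolean[2];

        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                while (!signalled[0])
                    cond.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        try {
            for (;;) {                              // poll until the waiter is in await
                lock.lock();
                try {
                    if (lock.hasWaiters(cond)) {
                        // Only on the condition queue, not waiting for the lock
                        seen[0] = lock.hasQueuedThread(waiter);
                        signalled[0] = true;
                        cond.signal();
                        // Now on the sync queue, although we still hold the lock
                        seen[1] = lock.hasQueuedThread(waiter);
                        break;
                    }
                } finally {
                    lock.unlock();                  // only now can the waiter reacquire
                }
                Thread.sleep(10);
            }
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```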

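4 Summary and usage

A node moves from the condition queue to the sync queue in two ways. In await, when the thread is interrupted, checkInterruptWhileWaiting calls transferAfterCancelledWait, which guarantees that the move completes: if no signal has started the transfer, the thread enqueues itself; otherwise it lets the signalling thread finish. In signal, doSignal performs the move, and the inner transferForSignal does only one of the two steps, inserting the node into the sync queue. In both cases the insertion itself is done by enq.

await blocks the calling thread, while signal wakes other threads. Once blocked in await, a thread is released either by a signal or by an interrupt.

Both await and signal hold the lock on entry and on exit. In detail:

- From the start of await until fullyRelease runs, the lock is held.
- From after fullyRelease until acquireQueued succeeds, the lock is not held.
- From after acquireQueued returns until the end of await, the lock is held.
- signal holds the lock throughout.

Finally, an example: a producer-consumer model built with ReentrantLock and Condition. One Lock has two Conditions, and the buffer holds at most 100 items. If the buffer is full when producing, the thread waits on notFull; if the buffer is empty when consuming, the thread waits on notEmpty.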

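import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class BoundedBuffer {
   final Lock lock = new ReentrantLock();
   final Condition notFull  = lock.newCondition();
   final Condition notEmpty = lock.newCondition();

   final Object[] items = new Object[100];
   int putptr, takeptr, count;

   public void put(Object x) throws InterruptedException {
     // Acquire the lock
     lock.lock();
     try {
       // Buffer full: wait on notFull.
       // The while loop rechecks the condition after every wakeup;
       // an interrupt makes await throw InterruptedException instead
       while (count == items.length)
         notFull.await();
       items[putptr] = x;
       if (++putptr == items.length) putptr = 0;
       ++count;
       // Wake one thread waiting on notEmpty
       notEmpty.signal();
     } finally {
       // Release the lock
       lock.unlock();
     }
   }

   public Object take() throws InterruptedException {
     // Acquire the lock
     lock.lock();
     try {
       // Buffer empty: wait on notEmpty.
       // The while loop rechecks the condition after every wakeup;
       // an interrupt makes await throw InterruptedException instead
       while (count == 0)
         notEmpty.await();
       Object x = items[takeptr];
       if (++takeptr == items.length) takeptr = 0;
       --count;
       // Wake one thread waiting on notFull
       notFull.signal();
       return x;
     } finally {
       // Release the lock
       lock.unlock();
     }
   }
 }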
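
A short usage sketch may help here. TinyBuffer and TinyBufferDemo are hypothetical names made up for illustration: TinyBuffer repeats the same two-Condition layout for int values with a caller-chosen capacity, and the demo runs one producer thread against the calling thread as consumer. The capacity of 2 is an arbitrary choice that makes it likely, though not certain, that the producer has to wait on notFull.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical int buffer with the same two-Condition layout as BoundedBuffer. */
class TinyBuffer {
    private final Lock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final int[] items;
    private int putptr, takeptr, count;

    TinyBuffer(int capacity) {
        items = new int[capacity];
    }

    void put(int x) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length)
                notFull.await();                 // full: wait until a consumer takes
            items[putptr] = x;
            if (++putptr == items.length) putptr = 0;
            ++count;
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    int take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0)
                notEmpty.await();                // empty: wait until a producer puts
            int x = items[takeptr];
            if (++takeptr == items.length) takeptr = 0;
            --count;
            notFull.signal();
            return x;
        } finally {
            lock.unlock();
        }
    }
}

class TinyBufferDemo {
    /** One producer puts 0..n-1; the caller consumes them and returns the sum. */
    static int sumThroughBuffer(int n) {
        TinyBuffer buffer = new TinyBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++)
                    buffer.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++)
                sum += buffer.take();
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

Calling TinyBufferDemo.sumThroughBuffer(10) passes the values 0 to 9 through the buffer in order and returns their sum.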

In other words, the usage boils down to this pattern:

lock.lock();
try{
	// call await or signal here
}finally{
	lock.unlock();
}

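Suppose there are only two threads, producer A and consumer B, and both start working. The analysis below follows A.

If the array is already full and A keeps producing, A first calls lock.lock() and gets the lock, either directly or by entering the sync queue, being woken, acquiring the lock and leaving the queue. At notFull.await() it runs await: it enters the condition queue, fullyRelease releases the lock and wakes the next node in the sync queue, and A blocks at LockSupport.park(this) waiting for an interrupt or a wakeup. Either way, by the time A leaves while (!isOnSyncQueue(node)) the move from the condition queue to the sync queue is complete. A then acquires the lock in acquireQueued and leaves the sync queue.

If B never consumes, nothing signals notFull, so A can leave await only through an interrupt. In that case await reacquires the lock and throws InterruptedException, and put exits through the finally block. The while (count == items.length) loop covers the other case, a wakeup while the buffer is still full: A simply calls notFull.await() again.

Once B has consumed an item and called signal, A can leave the loop and finish the remaining steps; lock.unlock() then wakes the next node in the sync queue.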

posted @ 2021-03-30 21:33 Java与大数据进阶
