SQLSERVER自旋锁等待 SOS_SCHEDULER_YIELD waits and the LOCK_HASH spinlock

SQLSERVER自旋锁等待 SOS_SCHEDULER_YIELD waits and the LOCK_HASH spinlock

http://www.sqlskills.com/blogs/paul/sos_scheduler_yield-waits-and-the-lock_hash-spinlock/

Continuing my series on advanced performance troubleshooting – see these two posts for the scripts I’ll be using and an introduction to the series:

In this blog post I’d like to show you an example of SOS_SCHEDULER_YIELD waits occurring and how it can seem like a spinlock is the cause.

I originally published this blog post and then had a discussion with good friend Bob Ward from Product Support who questioned my conclusions given what he’s seen (thanks Bob!). After digging in further, I found that my original post was incorrect, so this is the corrected version.

The SOS_SCHEDULER_YIELD wait means that an executing thread voluntarily gave up the CPU to allow other threads to execute. The SQL Server code is sprinkled with “voluntary yields” in places where high CPU usage may occur.

SOS_SCHEDULER_YIELD 等待意味着一个执行线程自愿放弃CPU以允许其他线程执行。在CPU使用率较高的地方,SQL Server到处都会出现  线程自愿YIELD!!!

 

One such place where a thread will sleep but not explicitly yield is when backing off after a spinlock collision waiting to see if it can get access to the spinlock. A spinlock is a very lightweight synchronization mechanism deep inside SQL Server that protects access to a data structure (not database data itself). See the end of the second blog post above for more of an explanation on spinlocks.

When a thread voluntarily yields, it does not go onto the waiting tasks list – as it’s not waiting for anything - but instead goes to the bottom of the Runnable Queue for that scheduler. SOS_SCHEDULER_YIELD waits by themselves are not cause for concern unless they are the majority of waits on the system, and performance is suffering.


当线程获得自旋锁后,会自动进入到scheduler的Runnable队列

 

To set up the test, I’ll create a simple database and table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
USE [master];
GO
DROP DATABASE [YieldTest];
GO
 
CREATE DATABASE [YieldTest];
GO
USE [YieldTest];
GO
 
CREATE TABLE [SampleTable] ([c1] INT IDENTITY);
GO
CREATE NONCLUSTERED INDEX [SampleTable_NC] ON [SampleTable] ([c1]);
GO
 
SET NOCOUNT ON;
GO
INSERT INTO [SampleTable] DEFAULT VALUES;
GO 100

Then I’ll clear out wait stats and latch stats:

1
2
3
4
5
DBCC SQLPERF (N'sys.dm_os_latch_stats', CLEAR);
GO
 
DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR);
GO

And then fire up 50 clients running the following code (I just have a CMD script that fires up 50 CMD windows, each running the T-SQL code):

1
2
3
4
5
6
7
8
9
10
11
USE [YieldTest];
GO
SET NOCOUNT ON;
GO
 
DECLARE @a INT;
WHILE (1=1)
BEGIN
    SELECT @a = COUNT (*) FROM [YieldTest]..[SampleTable] WHERE [c1] = 1;
END;
GO

And the CPUs in my laptop are jammed solid (snapped from my desktop and rotated 90 degrees to save space):

lotsofcpu SOS SCHEDULER YIELD waits and the LOCK HASH spinlock

Wow!

Looking at perfmon:

perfmon SOS SCHEDULER YIELD waits and the LOCK HASH spinlock

The CPU usage is not Windows (% User Time), and of all the counters I usually monitor (I’ve cut them all out here for clarity) I can see a sustained very high Lock Requests/sec for Object and Page locks (almost 950 thousand requests per second for both types! Gotta love my laptop :-)

So let’s dig in. First off, looking at wait stats (using the script in the wait stats post referenced above):

WaitType            Wait_S  Resource_S Signal_S WaitCount Percentage AvgWait_S AvgRes_S AvgSig_S
------------------- ------- ---------- -------- --------- ---------- --------- -------- --------
SOS_SCHEDULER_YIELD 4574.77 0.20       4574.57  206473    99.43      0.0222    0.0000   0.0222

SOS_SCHEDULER_YIELD at almost 100% of the waits on the system means that I’ve got CPU pressure – as I saw from the CPU graphs above. The fact that nothing else is showing up makes me suspect this is a spinlock issue.

Checking in the sys.dm_os_waiting_tasks output (see script in the second blog post referenced above), I see nothing waiting, and if I refresh a few times I see the occasional ASYNC_NETWORK_IO and/or PREEMPTIVE_OS_WAITFORSINGLEOBJECT wait type – which I’d expect from the CMD windows running the T-SQL code.

Checking the latch stats (again, see script in the second blog post referenced above) shows no latch waits apart from a few BUFFER latch waits.

So now let’s look at spinlocks. First off I’m going to take a snapshot of the spinlocks on the system:

1
2
3
4
5
6
7
8
9
10
-- Baseline
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##TempSpinlockStats1')
    DROP TABLE [##TempSpinlockStats1];
GO
SELECT * INTO [##TempSpinlockStats1]
FROM sys.dm_os_spinlock_stats
WHERE [collisions] > 0
ORDER BY [name];
GO

(On 2005 you’ll need to use DBCC SQLPERF (‘spinlockstats’) and use INSERT/EXEC to get the results into a table – see the post above for example code.)

Then wait 10 seconds or so for the workload to continue… and then grab another snapshot of the spinlocks:

1
2
3
4
5
6
7
8
9
10
-- Capture updated stats
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##TempSpinlockStats2')
    DROP TABLE [##TempSpinlockStats2];
GO
SELECT * INTO [##TempSpinlockStats2]
FROM sys.dm_os_spinlock_stats
WHERE [collisions] > 0
ORDER BY [name];
GO

Now running the code I came up with to show the difference between the two snapshots:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- Diff them
SELECT
    '***' AS [New],
    [ts2].[name] AS [Spinlock],
    [ts2].[collisions] AS [DiffCollisions],
    [ts2].[spins] AS [DiffSpins],
    [ts2].[spins_per_collision] AS [SpinsPerCollision],
    [ts2].[sleep_time] AS [DiffSleepTime],
    [ts2].[backoffs] AS [DiffBackoffs]
FROM [##TempSpinlockStats2] [ts2]
LEFT OUTER JOIN [##TempSpinlockStats1] [ts1]
    ON [ts2].[name] = [ts1].[name]
WHERE [ts1].[name] IS NULL
UNION
SELECT
    '' AS [New],
    [ts2].[name] AS [Spinlock],
    [ts2].[collisions] - [ts1].[collisions] AS [DiffCollisions],
    [ts2].[spins] - [ts1].[spins] AS [DiffSpins],
    CASE ([ts2].[spins] - [ts1].[spins]) WHEN 0 THEN 0
        ELSE ([ts2].[spins] - [ts1].[spins]) /
            ([ts2].[collisions] - [ts1].[collisions]) END
            AS [SpinsPerCollision],
    [ts2].[sleep_time] - [ts1].[sleep_time] AS [DiffSleepTime],
    [ts2].[backoffs] - [ts1].[backoffs] AS [DiffBackoffs]
FROM [##TempSpinlockStats2] [ts2]
LEFT OUTER JOIN [##TempSpinlockStats1] [ts1]
    ON [ts2].[name] = [ts1].[name]
WHERE [ts1].[name] IS NOT NULL
    AND [ts2].[collisions] - [ts1].[collisions] > 0
ORDER BY [New] DESC, [Spinlock] ASC;
GO

 

New  Spinlock                        DiffCollisions DiffSpins  SpinsPerCollision DiffSleepTime DiffBackoffs
---- ------------------------------- -------------- ---------- ----------------- ------------- ------------
     LOCK_HASH                       6191134        4005774890 647               686           1601383
     OPT_IDX_STATS                   1164849        126549245  108               57            7555
     SOS_OBJECT_STORE                73             305        4                 0             0
     SOS_WAITABLE_ADDRESS_HASHBUCKET 115            44495      386               0             3
     XDESMGR                         1              0          0                 0             0

This is telling me that there is massive contention for the LOCK_HASH spinlock, further backed up by the SOS_WAITABLE_ADDRESS_HASHBUCKET spinlock. TheLOCK_HASH spinlock protects access to the hash buckets used by the lock manager to efficiently keep track of the lock resources for locks held inside SQL Server and to allow efficient searching for lock hash collisions (i.e. does someone hold the lock we want in an incompatible mode?). In this case the contention is so bad that instead of just spinning, the threads are actually backing off and letting other threads execute to allow progress to be made.

And that makes perfect sense because of what my workload is doing - 50 concurrent connections all trying to read the same row on the same page in the same nonclustered index.

But is that the cause of the SOS_SCHEDULER_YIELD waits? To prove it one way or the other, I created an Extended Event session that would capture call stacks when a wait occurs:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- Note that before SQL 2012, the wait_type to use is 120
-- On SQL 2012 the target name is 'histogram' but the old name still works.
CREATE EVENT SESSION [MonitorWaits] ON SERVER
ADD EVENT [sqlos].[wait_info]
    (ACTION ([package0].[callstack])
    WHERE [wait_type] = 124) -- SOS_SCHEDULER_YIELD only
ADD TARGET [package0].[asynchronous_bucketizer] (
    SET filtering_event_name = N'sqlos.wait_info',
    source_type = 1, -- source_type = 1 is an action
    source = N'package0.callstack') -- bucketize on the callstack
WITH (MAX_MEMORY = 50MB, max_dispatch_latency = 5 seconds)
GO
 
-- Start the session
ALTER EVENT SESSION [MonitorWaits] ON SERVER STATE = START;
GO
 
-- TF to allow call stack resolution
DBCC TRACEON (3656, -1);
GO
 
-- Let the workload run for a few seconds
 
-- Get the callstacks from the bucketizer target
-- Are they showing calls into the lock manager?
SELECT
    [event_session_address],
    [target_name],
    [execution_count],
    CAST ([target_data] AS XML)
FROM sys.dm_xe_session_targets [xst]
INNER JOIN sys.dm_xe_sessions [xs]
    ON ([xst].[event_session_address] = [xs].[address])
WHERE [xs].[name] = N'MonitorWaits';
GO
 
-- Stop the event session
ALTER EVENT SESSION [MonitorWaits] ON SERVER STATE = STOP;
GO

I also made sure to have the correct symbol files in the \binn directory (see How to download a sqlservr.pdb symbol file). After running the workload and examining the callstacks, I found the majority of the waits were coming from voluntary yields deep in the Access Methods code. An example call stack is:

IndexPageManager::GetPageWithKey+ef [ @ 0+0x0
GetRowForKeyValue+146 [ @ 0+0x0
IndexRowScanner::EstablishInitialKeyOrderPosition+10a [ @ 0+0x0
IndexDataSetSession::GetNextRowValuesInternal+7d7 [ @ 0+0x0
RowsetNewSS::FetchNextRow+12a [ @ 0+0x0
CQScanRangeNew::GetRow+6a1 [ @ 0+0x0
CQScanCountStarNew::GetRowHelper+44 [ @ 0+0x0
CQScanStreamAggregateNew::Open+70 [ @ 0+0x0
CQueryScan::Uncache+32f [ @ 0+0x0
CXStmtQuery::SetupQueryScanAndExpression+2a2 [ @ 0+0x0
CXStmtQuery::ErsqExecuteQuery+2f8 [ @ 0+0x0
CMsqlExecContext::ExecuteStmts<1,1>+cca [ @ 0+0x0
CMsqlExecContext::FExecute+58b [ @ 0+0x0
CSQLSource::Execute+319

[Edit: check out how to do this in the spinlock whitepaper]

This is clearly (to me) nothing to do with the LOCK_HASH spinlock, so that's a red herring. In this case, I'm just CPU bound. When a thread goes to sleep when backing off from a spinlock, it directly calls Windows Sleep() - so it does not show up as a SQL Server wait type at all, even though the sleep call is made from the SQL OS layer. Which is a bummer.

当一个线程从自旋锁后退休眠时候,线程会直接调用Windows Sleep() 函数,所以这个线程不会显示等待类型,尽管这是从sql os层调用的,这样就不知道线程为什么进入自旋锁休眠

How to get around that and reduce CPU usage? This is really contrived workload, but this can occur for real. Even if I try using WITH (NOLOCK), the NOLOCK seek will take a table SCH_S (schema-stability) lock to make sure the table structure doesn't change while the seek is occurring, so that only gets rid of the page locks, not the object locks and doesn't help CPU usage. With this (arguably weird) workload, there are a few things I could do (just off the top of my head):


with  nolock会获取表的SCH_S 架构稳定锁,确保查询期间表结构不会改变

  • Scale-out the connections to a few copies of the database, updated using replication
  • Do some mid-tier or client-side caching, with some notification mechanism of data changes (e.g. a DDL trigger firing a Service Broker message)

So this was an example of where wait stats lead to having to look at spinlock stats, but that the spinlock, on even *deeper* investigation, wasn't the issue at all. Cool stuff.

Next time we'll look at another cause of SOS_SCHEDULER_YIELD waits.

Hope you enjoyed this - let me know your thoughts!

Cheers

PS In the latest SQLskills Insider Quick Tips newsletter from last week, I did a video demo of looking at transaction log IOs and waits - you can get it here.

 


 

简单介绍SQLServer中的自旋锁


https://m.html.cn/softprog/database/112408067719025.html


为什么我们需要自旋锁?
用闩锁同步多个线程间数据结构访问,在每个共享数据结构前都放置一个闩锁。闩锁与此紧密关联:当你不能获得闩锁(因为其他人已经有一个不兼容的闩锁拿到),查询就会强制等待,并进入挂起(SUSPENDED)状态。
查询在挂起状态等待直到可以拿到闩锁,然后就会进入可执行(RUNNABLE)状态。对于查询执行只要没有可用的CPU,查询就一直在可执行(RUNNABLE)状态。
一旦CPU有空闲,查询会进入运行(RUNNING)状态,最后用闩锁来保护访问的共享数据结构。下图展示了SQLOS对协调线程调度实现的状态机。

 


因为对“忙碌”数据结构需要等待CPU使用闩锁保护没有意义。
因此SQL Server实现所谓自旋锁(Spinlocks)。自旋锁就像一个闩锁,存储引擎使用的一个轻量级同步对象,用来同步对共享数据结构线程访问。
和闩锁的主要区别是你积极等待自旋锁——不离开CPU。在自旋锁上的“等待”总发生在运行(RUNNING)状态的线程。
在你循环里旋转直到获得自旋锁。这就是所谓的忙碌等待(busy wait)。自旋锁的最大优点是当查询在自旋锁上等待时,不会涉及到上下文切换。
另一方面忙碌等待(busy wait)浪费CPU周期,其他线程查询也许对CPU正在使用。

为了避免太多的CPU周期浪费,SQL Server 2008 R2及后续版本实现所谓的指数后退机制(exponential backoff mechanism),那里在CPU上一些时间的休眠后,线程停止自旋。
在线程进入休眠期间,增加了尝试获得自旋锁的超时,线程每休眠退避一次增加一次超时,这个线程就一直不得运行。这个行为可以降低对CPU性能的影响。

(补充说明:Spinlock中文可以称为自旋锁。它是一个轻量级的,用户态的同步对象,和Critical Section/Region临界区类似,但是粒度比前者小多了。
它主要用来保护某些特定的内存对象的多线程并发访问。Spinlock是排他性的。一次只能一个线程拥有。

Spinlock的设计目标是非常快和高效率。Spinlock内部如何工作呢?
它首先试图获得某个对象的锁,如果目标被其它线程占有,就在那里轮询(spin)一定时间。如果还得不到锁,就sleep一小会,然后继续spin。反复这个过程直到得到对象的占有权。)

自旋锁与故障排除
对自旋锁故障排除的主要DMV是 sys.dm_os_spinlock_stats(sql2005是 DBCC SQLPERF (‘spinlockstats’) )。
查看每种自旋锁当前的运作情况
这个DMV里返回的每一行都代表SQL Server里的一个自旋锁。SQL 2012实现了212个自旋锁 SQL 2014实现了262个不同自旋锁。
我们来详细看下这个DMV里的各个列:
name:自旋锁名称
collision:当尝试访问保护的数据结构时,被自旋锁阻塞的线程次数
spins:在循环里尝试获得自旋锁的自旋锁线程次数
spins_per_collision:旋转和碰撞之间的比率
sleep_time:因为退避线程休眠时间
backoffs:为了其他线程在CPU上继续,线程退避次数
在这个DMV里最重要的列是backoffs,对于特定的自旋锁类型,这列告诉你退避发生频率。高频率的退避会导致CPU消耗高引起SQL Server里的自旋锁竞争(Spinlock Contention)。
我就见过一个32核的SQL Server服务器,CPU运行在100%而不进行任何工作——典型的自旋锁竞争症状。

select * from sys.dm_os_spinlock_stats order by collisions desc

对自旋锁问题进行故障排除你可以使用扩展事件提供的sqlos.spinlock_backoff。当退避(backoff)发生时,就会触发这个扩展事件。因为在SQL Server里退避会经常发生
下列代码给你展示了如何创建这样的扩展事件会话。

-- Retrieve the type value for the LOCK_HASH spinlock.
-- That value is used by the next XEvent session
SELECT * FROM sys.dm_xe_map_values
WHERE name = 'spinlock_types'
AND map_value = 'LOCK_HASH'
GO

-- Tracks the spinlock_backoff event
CREATE EVENT SESSION SpinlockContention ON SERVER
ADD EVENT sqlos.spinlock_backoff
(
ACTION
(
package0.callstack
)
WHERE
(
[type] = 129 -- <<< Value from the previous query
)
)
ADD TARGET package0.histogram
(
SET source = 'package0.callstack', source_type = 1
)
GO


从代码里可以看到,这里我在调用堆栈(callstack)上使用了直方图(histogram)目标来bucktize。因此对于特定的自旋锁,你可以到SQL Serve里生成的最高退避(backoffs)
代码路径。
你甚至可以通过启用3656跟踪标记(trace flag)来标识调用堆栈。这里你可以看到来自这个扩展会话的输出:

sqldk.dll!XeSosPkg::spinlock_backoff::Publish+0x138
sqldk.dll!SpinlockBase::Sleep+0xc5
sqlmin.dll!Spinlock<129,7,1>::SpinToAcquireWithExponentialBackoff+0x169
sqlmin.dll!lck_lockInternal+0x841
sqlmin.dll!XactWorkspaceImp::GetSharedDBLockFromLockManager+0x18d
sqlmin.dll!XactWorkspaceImp::GetDBLockLocal+0x15b
sqlmin.dll!XactWorkspaceImp::GetDBLock+0x5a
sqlmin.dll!lockdb+0x4a sqlmin.dll!DBMgr::OpenDB+0x1ec
sqlmin.dll!sqlusedb+0xeb
sqllang.dll!usedb+0xb3
sqllang.dll!LoginUseDbHelper::UseByMDDatabaseId+0x93
sqllang.dll!LoginUseDbHelper::FDetermineSessionDb+0x3e1
sqllang.dll!FRedoLoginImpl+0xa1b
sqllang.dll!FRedoLogin+0x1c1
sqllang.dll!process_request+0x3ec
sqllang.dll!process_commands+0x4a3
sqldk.dll!SOS_Task::Param::Execute+0x21e
sqldk.dll!SOS_Scheduler::RunTask+0xa8
sqldk.dll!SOS_Scheduler::ProcessTasks+0x279
sqldk.dll!SchedulerManager::WorkerEntryPoint+0x24c
sqldk.dll!SystemThread::RunWorker+0x8f
sqldk.dll!SystemThreadDispatcher::ProcessWorker+0x3ab
sqldk.dll!SchedulerManager::ThreadEntryPoint+0x226

使用调用堆栈,不难找出自旋锁竞争发生的地方。在那个指定的调用堆栈里竞争发生在LOCK_HASH自旋锁类型里,它是锁管理器的哈希表。
每次在锁管理器里加锁或解锁被执行时,自旋锁必须在对应的哈希桶里获得。
如你所见,在调用堆栈里,当从XactWorkspacelmp类调用GetSharedDBLockFromLockManager函数时,自旋锁被获得。
这表示当竞争到数据库时,共享数据库锁被尝试获取。最后用很高的退避(backoffs)的LOCK_HASH自旋锁里,这屈服于自旋锁竞争。


spinlock自旋锁也会引起cpu高
我就见过一个32核的SQL Server服务器,CPU运行在100%而不进行任何工作——典型的自旋锁竞争症状。

 

select * from sys.dm_os_spinlock_stats where collisions > 0 order by collisions desc

select * from sys.dm_os_spinlock_stats where name = 'SOS_SCHEDULER'

 

f

http://t.zoukankan.com/caoxb-p-13139575.html
自旋锁的优缺点
避免上下文切换:在自旋状态下,当一个线程A尝试进入同步代码块,但是当前的锁已经被线程B占有时,线程A不进入阻塞状态,而是不停的空转,等待线程B释放锁。
如果锁的线程能在很短时间内释放资源,那么等待竞争锁的线程就不需要做内核态和用户态之间的切换进入阻塞状态,只需自旋,
等持有锁的线程释放后即可立即获取锁,避免了用户线程和内核的切换消耗。


自旋等待最大时间:线程自旋会消耗cpu,若自旋太久,则会让cpu做太多无用功,因此要设置自旋等待最大时间。
sqlserver使用指数后退机制(exponential backoff mechanism)自旋等待最大时间 ,超时机制,当前线程后退让其他线程执行


优点:
开启自旋锁后能减少线程的阻塞,在对于锁的竞争不激烈且占用锁时间很短的代码块来说,能提升很大的性能,在这种情况下自旋的消耗小于线程阻塞挂起的消耗。
减少线程用户态和内核态的切换消耗
锁资源处理时间非常短代码块,性能极大提升。

缺点:
在线程竞争锁激烈,或持有锁的线程需要长时间执行同步代码块的情况下,使用自旋会使得cpu做的无用功太多
不适合锁资源处理时间长的场景
自旋锁状态导致CPU空转消耗

 

 

posted @ 2014-07-03 14:08  桦仔  阅读(998)  评论(0编辑  收藏  举报