oracle deadlock with TM lock in SX/SSX mode
转自 http://blog.itpub.net/15415488/viewspace-631260
通常情况下,dead lock是由应用的不合理造成,应该由PD解决,而非DBA。但何时为非通常情况呢?
今天遇到一个case,即为非通常情况。
而他的通常的标志即为:在trc文件里的deadlock graph里,看见有SX/SSX的TM lock出现,而不全是X mode的TX lock。
而这,常常是因为foreign key没有index。(相信很多人都已经有这个经验了)
但我今天遇到的更加特殊一点的情况是,deadlock graph里,一个session等在SSX mode TM lock上,另一个等在X mode TX lock上。
而对于foreign key没有index的情况,我们常常见到的是两个sessions都等在SSX mode TM lock上。
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-000298c4-00000000 143 3185 SX 149 3061 SX SSX
TX-000a0000-0006b43c 149 3061 X 143 3185 X
而如果是因为应用的不合理导致的行级别的冲突所产生的deadlock graph为:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-00090022-00000079 31 3274 X 29 3267 X
TX-000a0000-000010c0 29 3267 X 31 3274 X
而如果是通常的因为foreign key没有index所产生的deadlock graph常常为:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-000047c5-00000000 31 3274 SX SSX 29 3267 SX SSX
TM-000047c5-00000000 29 3267 SX SSX 31 3274 SX SSX
所以,如果大家看到deadlock graph里有出现等在SSX TM lock上时,最好先看看产生deadlock的两条SQL的表里有没有foreign key的约束,如果有,确认是否在子表上有无index。
在确认了问题之后,然后带着好奇,可以自己在QA环境做几个实验,模拟出完全一样的deadlock graph。
于是,我根据trc文件后面的信息,发现产生deadlock的两条SQL分别是两条update 父表和子表的update语句。
通过N多实验,终于模拟出通过两条对父表和子表的update语句产生死锁,并产生那种诡异的deadlock graph的方法了。
直接公布答案:
create table b as select * from a;
alter table a add constraints apk primary key(object_id);
alter table b add constraint bfk foreign key (object_id) references a (object_id) on delete cascade;
首先b上在object_id列上没有index。
--Session A:
delete from a where object_id=123;
--Session B:
delete from b where object_id=456;
--Session A:
update a set object_id=789 where object_id=789;(-- it will hung here)
--Session B:
update b set object_id=123 where object_id=123; (-- it will hung also, and dead lock happenes)
如上产生的trc文件里即有我遇见的case一模一样的deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-000a0018-0000115e 29 3267 X 31 3274 X
TM-000047c5-00000000 31 3274 SX 29 3267 SX SSX
session 3267: DID 0001-001D-00000024 session 3274: DID 0001-001F-0000000B
session 3274: DID 0001-001F-0000000B session 3267: DID 0001-001D-00000024
Rows waited on:
Session 3274: obj - rowid = 000047C5 - AAAEfFAAFAAACDXABS
(dictionary objn - 18373, file - 5, block - 8407, slot - 82)
Session 3267: no row
Information on the OTHER waiting sessions:
Session 3274:
pid=31 serial=2735 audsid=20123 user: 36/HAOZHU_USER
O/S info: user: oracle, term: pts/5, ospid: 9809, machine: xxx
program: xxx[/email] (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
Current SQL Statement:
update b set object_id=123 where object_id=123
End of information on OTHER waiting sessions.
Current SQL statement for this session:
update a set object_id=789 where object_id=789
如果我们加上index:
create index bidx on b(object_id);
--Session A:
delete from a where object_id=123;
--Session B:
delete from b where object_id=456;
--Session A:
update a set object_id=789 where object_id=789; (--it will NOT hung, and is successful.)
--Session B:
update b set object_id=123 where object_id=123; (-- it will hung also, but not dead lock)
那么不会出现deadlock,只会出现一般的lock。
其实,对于这个case,除了foreign key没有index这个问题之外,也不排除应用逻辑的问题。
也有可能在解决了SX/SSX mode TM lock的deadlock之后,应用也会产生X mode TX lock的deadlock也说不定。
毕竟,他也会产生上述比较危险的lock,如果Session A再在object_id=456的行上做dml,也会引发X mode TX deadlock。
综上,deadlock是个挺有趣的东西。
What to do with "ORA-60 Deadlock Detected" Errors [ID 62365.1]
for a list of when TX locks are requested in
mode 4.
TM SSX (mode 5) This is usually related to the existence of
or foreign key constraints where the columns
S (mode 4) are not indexed on the child table.
See Note:33453.1 for how to locate such
constraints. See below for locating
the OBJECT being waited on.
Although other deadlock scenarios can happen the above are the most common.