Deadlocks--Rows waited on displaying garbage info
from http://jonathanlewis.wordpress.com/2013/04/13/deadlocks/
Here’s a deadlock graph that might cause a little confusion:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-001a0015-00014787 34 90 X 32 3 S
TX-00190008-0000601b 32 3 X 34 90 S
session 90: DID 0001-0022-00000327 session 3: DID 0001-0020-000009E9
session 3: DID 0001-0020-000009E9 session 90: DID 0001-0022-00000327
Rows waited on:
Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
(dictionary objn - 199909, file - 6, block - 276654, slot - 0)
Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
(dictionary objn - 199909, file - 6, block - 433952, slot - 0)
Both sessions are holding X and waiting S on a TX lock. There are several well-known reasons why you might see a share (mode 4) lock on a transaction table slot: collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples, and problems with interested transaction lists (ITLs) or freelists are the common “internal” ones – so the presence of the waits in share mode shouldn’t, of themselves, be a source of confusion.
The confusion is in the reported rowids. If you try to interpret them as real rowids you may be unlucky and discover that they seem to be related to the SQL reported for the deadlocked sessions when really the information they hold is garbage. (Just to avoid confusion, let me make it clear that there are cases where the rowids reported definitely WILL be garbage; on the other hand, there may be some scenarios where the rowids are relevant – although I haven’t done any exhaustive check to see if there really are such scenarios.)
When I see a deadlock graph on transaction locks and the waits are for S mode I tend to assume that the information about the rows waited on is probably misleading; when the slot number for the rowid is zero this increases my confidence that the rowid is rubbish. (Zero is a legal value for a rowid slot, of course, so a zero doesn’t prove that the rowid is rubbish, it’s just a coincidence that allows me to continue following a hypothesis.)
The problem is that Oracle doesn’t waste resources tidying up after itself, and in the case of deadlock graphs this laziness shows up in the rowids reported. The trace is simply reporting whatever happens to be in the row_wait_obj#, row_wait_file#, row_wait_block# and row_wait_row# columns of v$session; and if the waiting process hasn’t updated these columns with current row information you could be looking at the details of the last row (or block) that the session waited for. Here’s the description of a test to demonstrate the behaviour:
create table t1 (n1 number, n2 number);
insert into t1 values(1,1);
create unique index t1_i1 on t1(n1);
create unique index t1_i2 on t1(n2);
session 1: insert into t1 values(2,11);
session 2: insert into t1 values(3,21);
session 1: insert into t1 values(4,21);
session 2: insert into t1 values(5,11);
With this table, and sequence of events following it, session 1 raised a deadlock error, and dumped the following trace:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-000a0021-0000c4aa 16 93 X 13 90 S
TX-00020020-0000fdb9 13 90 X 16 93 S
session 93: DID 0001-0010-00000057 session 90: DID 0001-000D-000000F2
session 90: DID 0001-000D-000000F2 session 93: DID 0001-0010-00000057
Rows waited on:
Session 90: obj - rowid = 00000009 - AAAAAJAABAAAQJcAAA
(dictionary objn - 9, file - 1, block - 66140, slot - 0)
Session 93: obj - rowid = 0002E7DC - AAAufaAAFAAAAAJAAA
(dictionary objn - 190428, file - 5, block - 9, slot - 0)
Object 9 is the I_FILE#_BLOCK# index in the data dictionary – and session 90 is definitely not doing anything with that object in this transaction.
Object 190428 is another table in the test schema, but session 93 didn’t access it in this transaction, and the block referenced is the segment header block, not a block that could hold a row.
In fact, just before I started the sequence of inserts I ran this query from a third session (connected as sys) with the following results:
select
sid,
row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
from
v$session
where
sid in (90,93)
;
SID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
---------- ------------- -------------- --------------- -------------
90 9 1 66140 0
93 190428 5 9 0
The “rows” reported for the deadlock simply echoed the values that were already in the row_wait columns before the test started. (The sessions were waiting on “SQL*Net message from client” at the time.)
Footnote:
Session 90 created the tables and indexes, that’s probably why it happened to have an outstanding reference to the i_file#_block# index.
Session 93 had just run a script to drop all the objects in the schema, which may explain why it happened to have an outstanding reference to a segment header block
If the row_wait_obj# had been set to -1 for either session then the deadlock graph would have reported "No row" for that session.