修改页面checksum值来避开824错误

修改页面checksum值来避开824错误

http://www.sqlnotes.info/2013/05/02/fix-page-checksum/

A checksum value is tagged with every data page in SQL Server. This is used to detect issues on the storage system to ensure what has been read from the disk is what has been written to the disk. When the checksum is detected which is not the one it’s supposed to be, the page will not be read into the buffer pool. SQL Server will return error. If you are sure the data on the page are all correct or the checksum is broken purposely, you do have the option to fix it even you don’t know how to calculate it.

 

Now let’s create our experimental database to get this fun stuff done.

01 use master
02 drop database Test1
03 create database Test1
04 go
05 use Test1
06 go
07 create table t1(Field1 char(10))
08 go
09 insert into t1 values('AAAAAAAAAA')
10 select * from t1
11 /*
12 Field1
13 ----------
14 AAAAAAAAAA
15  
16 (1 row(s) affected)
17  
18 */

Now let’s break it

01 -- first find out where it is
02 use master
03 go
04 alter database Test1 set single_user with rollback immediate
05 go
06 dbcc traceon(3604)
07 go
08 dbcc ind(test1, t1,0) -- page 118 in my case
09 go
10 -- then, check the record
11 dbcc page (test1, 1, 118, 1)
12 /*
13 Slot 0, Offset 0x60, Length 17, DumpStyle BYTE
14  
15 Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 17
16  
17 Memory Dump @0x0000000010F0A060
18  
19 0000000000000000:   10000e00 41414141 41414141 41410100 00        ....AAAAAAAAAA...
20 */
21 --Now let's modify one of the A to B
22 dbcc writepage(test1, 1, 118, 100, 1, 0x42, 1)
23 --Notice that the last parameter I used is 1 not the default value
24 --this will bypass the the buffer pool and direct write the data to the page on disk
25 --Buffer pool calculates the checksum before writting. If I bypass it
26 --it will mean the checksum is broken.
27 go
28 --Let's check the page again
29 dbcc page (test1, 1, 118, 1)
30 /*
31 Slot 0, Offset 0x60, Length 17, DumpStyle BYTE
32  
33 Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 17
34  
35 Memory Dump @0x0000000009D7A060
36  
37 0000000000000000:   10000e00 42414141 41414141 41410100 00        ....BAAAAAAAAA...
38 */
39 go

Ok, now we successfully break the page. if you run DBCC CheckDB(Test1), you will see following errors

Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 245575913, index ID 0, partition ID 72057594039042048, alloc unit ID 72057594043432960 (type In-row data), page (1:118). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 2057 and -4.
Msg 8928, Level 16, State 1, Line 1
Object ID 245575913, index ID 0, partition ID 72057594039042048, alloc unit ID 72057594043432960 (type In-row data): Page (1:118) could not be processed. See other errors for details.

It does not give you too much information. What if you run select * from Test1.dbo.t1. The error you you get is very mouthful

Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x5f1b00a8; actual: 0x5f1a80a8). It occurred during a read of page (1:118) in database ID 5 at offset 0x000000000ec000 in file 'F:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\Test1.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

In this error message, there are very meaningful messages delivered – (expected: 0x5f1b00a8; actual: 0x5f1a80a8). One of those 2 numbers must be correct. Those 2 numbers are integer numbers – least significant digits will be saved at lower address. So the values of storage format of those numbers are a8001b5f and a8801a5f. Now let’s fire up DBCC PAGE to see whether we can find one of them. If so, the other one of those 2 will be correct checksum.

1 dbcc page (test1, 1, 118, 2)
2 /*
3 000000000FA3A000:   01010000 00820001 00000000 00000e00 00000000  .....‚..............
4 000000000FA3A014:   00000100 54000000 8d1f7100 76000000 01000000  ....T.....q.v.......
5 000000000FA3A028:   21000000 52000000 18000000 00000000 00000000  !...R...............
6 000000000FA3A03C:   a8001b5f 00000000 00000000 00000000 00000000  ¨.._................
7 */

Look at the last line above from DBCC PAGE result. At offset 3C, you can see A8001B5F – aha, that’s the incorrect checksum. Now let’s correct it with value a8801a5f

01 dbcc writepage(test1, 1, 118, 60, 4, 0xa8801a5f, 1)
02 go
03 dbcc checkdb(test1)
04 go
05 select  * from test1..t1
06 /*
07 Field1
08 ----------
09 BAAAAAAAAA
10  
11 (1 row(s) affected)
12 */

Now the error goes away and you can see the modified record. CheckSum is managed by buffer pool. DBCC WritePage with bufferpool(last parameter set to 0) can’t modify that value. You will have to use 1 for the last parameter under single database mode. Enjoy.

John Huang – SQL MCM & MVP, http://www.sqlnotes.info

 Posted by  at 5:00 am Add comments Tagged with: 

 5 Responses to “Fix Page Checksum”

  1. hi sir, how can i convert that 2 integers to a storage format so i can check it in the result of dbcc page

  2. ONE OF THE GREATest ARTICLE ON THE NET

  3. hi john, how do u find the particular offset from the dbcc page result

  4.  
 

DBCC WritePage 不会修改checksum值的方法:
1、将数据库改为read_only模式,然后使用0默认值来修改数据库数据
修改checksum值的方法
1、将数据库设置为单用户,使用选项1来直接修改

CheckSum is managed by buffer pool


修改页面checksum值来避开824错误

 

修改页面号可以避开checkdb7天的停机时间

 

消息 824,级别 24,状态 2,第 2 行
SQL Server 检测到基于一致性的逻辑 I/O 错误 pageid 不正确(应为 6:33778661,但实际为 0:0)。在文件 'E:\DataBase\xxxchive\xxxchive_Id_04_data.ndf' 中、偏移量为 0x0000406d7ca000 的位置对数据库 ID 7 中的页 (6:33778661) 执行 读取 期间,发生了该错误。SQL Server 错误日志或系统事件日志中的其他消息可能提供了更详细信息。这是一个威胁数据库完整性的严重错误条件,必须立即纠正。请执行完整的数据库一致性检查(DBCC CHECKDB)。此错误可以由许多因素导致;有关详细信息,请参阅 SQL Server 联机丛书。

 

posted @ 2014-10-25 15:41  桦仔  阅读(762)  评论(0编辑  收藏  举报