SQL Server 存储引擎:数据页和数据行
SQL Server 存储引擎:数据页和数据行
http://aboutsqlserver.com/2013/10/15/sql-server-storage-engine-data-pages-and-data-rows/
The space in the database divided into logical 8KB pages. Those pages are continuously numbered starting with zero and can be referenced by specifying a file ID and a page number. The page numbering is always continuous – when SQL Server grows the database file, the new pages would have the numbers starting from the last highest page number in the file plus one. Similar, when SQL Server shrinks the file, it removes the highest number pages from the file.
Let’s look at the structure of a data page. All images are clickable.
96-bytes page header contains the various information about a page , such as the object page belongs; number of rows and amount of free space available on the page; links to the previous and next pages if the page is in an index page chain, and so on.
Following the page header is the area where actual data is stored. It is followed by the free space. Finally, there is the slot array, which is the block of 2-byte entries indicating the offset at which the corresponding data rows begin on the page.
Slot array indicates the logical order of the data rows on the page. In case, if data on the page needs to be sorted in the order of the index key, SQL Server does not physically sort the data rows on the page but rather populates slot array based on the index sort order. The slot 0 (right-most) stores the offset for the data row with the lowest key value on the page, slot 1 – to the second lowest key value and so forth.
SQL Server system data types can be logically separated into two different groups, such as fixed-length and variable-length types. Fixed-length data types, such as int, datetime, char and others always use the same storage space regardless of the value even when it is NULL. For example, int column always uses 4 bytes and nchar(10) column always uses 20 bytes to store the information.
As the opposite, variable-length data types, such as varchar, varbinary and a few others, use as much storage space as required to store the data plus two extra bytes. For example nvarchar(4000) column would use only 12 bytes to store five characters string and, in most part of the cases, 2 bytes to store NULL value. We will discuss the case when variable-length columns do not use storage space for NULL values later.
Let’s look at the structure of the data row
The first 2 bytes of the row, called Status Bits A and Status Bits B, are the bitmaps containing the information about the row, such as row type; if the row has been logically deleted (ghosted); if the row has NULL values, variable-length columns and versioning tag.
The next two bytes in the row are used to store the length of the fixed-length portion of the data. They are followed by fixed-length data itself.
After the fixed-length data portion, there is the null bitmap, which includes two different data elements. The first 2-byte element is the number of columns in the row. It is followed by null bitmap array. That array is using one bit per every column from the table regardless if it is nullable or not.
The null bitmap is always present in the data rows in heap tables or clustered index leaf rows even when table does not have nullable columns. Although, the null bitmap is not present in non-leaf index rows nor leaf level rows of nonclustered indexes when there are no nullable columns in the index.
Following the null bitmap, there is the variable-length data portion of the row. It starts with two-byte number of variable-length columns in the row followed by variable-length column offset array. SQL Server stores two-byte offset value per each variable-length column in the row even when value is null. It followed by the actual variable-length portion of the data.
Finally, there is optional 14-bytes versioning tag at the end of the row. That tag is used during the operations, which require row-versioning, such as online index rebuild(在线重建索引), optimistic isolation level and others.
Let’s look at the example. First, let’s create the table, populate it with some data and look at the actual row data.
use tempdb go create table dbo.DataRows ( ID int not null, Col1 varchar(255) null, Col2 varchar(255) null, Col3 varchar(255) null ); insert into dbo.DataRows(ID, Col1, Col3) values (1,replicate('a',255),replicate('c',255)); insert into dbo.DataRows(ID, Col2) values (2,replicate('b',255)); dbcc ind ( 'tempdb' -- Database name ,'dbo.DataRows' -- Table Name ,-1 -- Display info about all pages from the table );
Undocumented but well-known DBCC IND command returns us the information about table page allocations.
There are two pages that belong to the table. The first one with PageType=10 is the special type of the page called IAM allocation map. This page tracks the pages that belong to particular object. Let’s not focus on it now – we will cover allocation map pages in one of the following blog posts.
The page with PageType=1 is the actual data page that contains the data rows. PageFID and PagePID column shows the actual file and page numbers for the page. You can use another undocumented command DBCC PAGE to examine its content
-- Redirecting DBCC PAGE output to console rather than error log dbcc traceon(3604); dbcc page ( 'tempdb' -- Database name ,1 -- File ID ,214643 -- Page ID ,3 -- Output mode: 3 - display page header and row details );
You can see the output of DBCC PAGE that corresponds to the first data row below. SQL Server stores the data in byte-swapped order. For example, two-byte value of 0001 would be stored as0100.
Slot 0 Offset 0x60 Length 39 Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS Record Size = 39 Memory Dump @0x000000000EABA060 0000000000000000:30000800 01000000 04000403 001d001d 00270061 0................'.a 0000000000000014:61616161 61616161 61636363 63636363 636363 aaaaaaaaacccccccccc Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4 ID = 1 Slot 0 Column 2 Offset 0x13 Length 10 Length (physical) 10 Col1 = aaaaaaaaaa Slot 0 Column 3 Offset 0x0 Length 0 Length (physical) 0 Col2 = [NULL] Slot 0 Column 4 Offset 0x1d Length 10 Length (physical) 10 Col3 = cccccccccc
Let’s look at the data row structure
As you see, the row starts with the two status bits bytes following by two-byte value of 0800. This is byte-swapped value of 0008, which is the offset for the number of columns attribute in the row. This offset tells SQL Server where fixed-length data part of the row ends.
Next four bytes are used to store fixed-length data, which is ID column in our case. After that, there is the two-byte value that shows that data row has four columns followed by one-byte NULL bitmap. With just four columns one byte in the bitmap is enough. It stores the value of 04, which is00000100 in the binary format. It indicates that the third column in the row contains NULL value.
The next two bytes stores the number of variable-length columns in the row, which is 3 (0300 in byte-swapped order). It follows by offset array, each two bytes there stores the offset where variable-length column data ends. As you see, even though Col2 is NULL, it still uses the slot in the offset-array. Finally, there is the actual data from variable-length columns.
Now let’s look at the second data row.
Slot 1 Offset 0x87 Length 27 Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS Record Size = 27 Memory Dump @0x000000000EABA087 0000000000000000: 30000800 02000000 04000a02 0011001b 00626262 0................bbb 0000000000000014: 62626262 626262 bbbbbbb Slot 1 Column 1 Offset 0x4 Length 4 Length (physical) 4 ID = 2 Slot 1 Column 2 Offset 0x0 Length 0 Length (physical) 0 Col1 = [NULL] Slot 1 Column 3 Offset 0x11 Length 10 Length (physical) 10 Col2 = bbbbbbbbbb Slot 1 Column 4 Offset 0x0 Length 0 Length (physical) 0 Col3 = [NULL]
The NULL bitmap in the second row represents binary value of 00001010, which shows that Col1 and Col3 are NULL. Even though the table has three variable-length columns, number of variable-length columns in the row indicates, that there are just two columns/slots in the offset-array. SQL Server does not maintain the information about the trailing NULL variable-length columns in the row.
You can reduce the size of data row by creating tables in the way, when variable-length columns that often have null values are the last ones in the table definition.
Let’s do the exercise and calculate the actual size of the data row in dbo.DataRows table. We will have:
2 bytes for Status Bits bytes + 2 bytes for fixed-length data length + 4 bytes for ID column storage + 2 bytes for number of column + 1 byte for null bitmap + 2 bytes for number of variable-length columns + 6 bytes (3 * 2 bytes) for variable-length offset array – (2 bytes * number of trailing variable-length columns with null value) + variable-length data + 2 bytes for slot array = 21 bytes to store fixed-length data and overhead + length of variable-length data – (2 bytes * number of trailing variable-length columns with null value).
This approach can help you to calculate the actual size of the data rows in the table. Do not forget, that non-clustered indexes do not have null bitmap array in case if there are no nullable columns in the index.
The fixed-length data and internal attributes must fit into 8,060 bytes available on the single data page. SQL Server does not allow you to create the table when this is not the case. For example, the code below produces an error.
create table dbo.BadTable ( Col1 char(4000), Col2 char(4060) ); Msg 1701, Level 16, State 1, Line 1 Creating or altering table 'BadTable' failed because the minimum row size would be 8067, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.
Next: Large Objects Storage
create table dbo.BadTable
(
Col1 char(4000),
Col2 char(4060)
);
消息 1701,级别 16,状态 1,第 1 行
创建或更改表 'BadTable' 失败,因为最小行大小是 8067,包括 7 字节的内部开销。而此值超出了允许的 8060 字节的最大表行大小。