MySQL 8.0 Reference Manual(读书笔记80节-- InnoDB Row Formats)
1. 概述
The row format of a table determines how its rows are physically stored, which in turn can affect the performance of queries and DML operations. As more rows fit into a single disk page, queries and index lookups can work faster, less cache memory is required in the buffer pool, and less I/O is required to write out updated values.
The data in each table is divided into pages. The pages that make up each table are arranged in a tree data structure called a B-tree index. Table data and secondary indexes both use this type of structure. The B-tree index that represents【reprɪˈzents 代表;意味着;相当于;等于;】 an entire table is known as the clustered index, which is organized according to the primary key columns. The nodes of a clustered index data structure contain the values of all columns in the row. The nodes of a secondary index structure contain the values of index columns and primary key columns.
Variable-length columns【可变长度的字段】 are an exception to the rule that column values are stored in B-tree index nodes. Variable-length columns that are too long to fit on a B-tree page are stored on separately allocated disk pages called overflow pages. Such columns are referred to as off-page columns. The values of off-page columns are stored in singly-linked【ˈsɪŋɡli lɪŋkt 单向链接】 lists 【单链表】of overflow pages, with each such column having its own list of one or more overflow pages. Depending on column length, all or a prefix【ˈpriːfɪks 前缀(缀于单词前以改变其意义的字母或字母组合);(人名前的)称谓;前置代号(置于前面的单词或字母、数字)】 of variable-length column values are stored in the B-tree to avoid wasting storage and having to read a separate page.
The InnoDB storage engine supports four row formats: REDUNDANT【rɪˈdʌndənt 冗余的;多余的;不需要的;】, COMPACT【ˈkɑːmpækt 紧凑的;小型的;紧密的;体积小的;袖珍的;坚实的;矮小而健壮的;】, DYNAMIC【daɪˈnæmɪk] 动态的;动力的;力的;充满活力的;发展变化的;精力充沛的;个性强的;】, and COMPRESSED【kəmˈprest 压缩(文件等);(被)压紧;浓缩;精简;】
Row Format | Compact Storage Characteristics | Enhanced Variable-Length Column Storage | Large Index Key Prefix Support | Compression Support | Supported Tablespace Types |
REDUNDANT | No | No | No | No | system, file-per-table, genera |
COMPACT | Yes | No | No | No | system, file-per-table, genera |
DYNAMIC | Yes | Yes | Yes | No | system, file-per-table, genera |
COMPRESSED | Yes | Yes | Yes | Yes | file-per-table, general |
2.REDUNDANT Row Format
The REDUNDANT format provides compatibility with older versions of MySQL.
Tables that use the REDUNDANT row format store the first 768 bytes of variable-length column values (VARCHAR, VARBINARY, and BLOB and TEXT types) in the index record within the B-tree node, with the remainder stored on overflow pages. Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
If the value of a column is 768 bytes or less, an overflow page is not used, and some savings in I/O may result, since the value is stored entirely in the B-tree node. This works well for relatively short BLOB column values, but may cause B-tree nodes to fill with data rather than key values, reducing their efficiency. Tables with many BLOB columns could cause B-tree nodes to become too full, and contain too few rows, making the entire index less efficient than if rows were shorter or column values were stored off-page.
3.REDUNDANT Row Format Storage Characteristics
The REDUNDANT row format has the following storage characteristics:
• Each index record contains a 6-byte header. The header is used to link together consecutive【kənˈsekjətɪv 连续的;连续不断的】 records, and for row-level locking.
• Records in the clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte roll pointer field.
• If no primary key is defined for a table, each clustered index record also contains a 6-byte row ID field.
• Each secondary index record contains all the primary key columns defined for the clustered index key that are not in the secondary index.
• A record contains a pointer to each field of the record. If the total length of the fields in a record is less than 128 bytes, the pointer is one byte; otherwise, two bytes. The array【əˈreɪ 阵列;数组;大量;大堆;大群】 of pointers is called the record directory. The area where the pointers point is the data part of the record.
• Internally【ɪnˈtɜrnəli 在内部;内部的;内部地】, fixed-length character columns such as CHAR(10) in stored in fixed-length format. Trailing【treɪlɪŋ 尾随;尾部;拖尾;后续】 spaces are not truncated from VARCHAR columns.
• Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
• An SQL NULL value reserves one or two bytes in the record directory. An SQL NULL value reserves zero bytes in the data part of the record if stored in a variable-length column. For a fixed-length column, the fixed length of the column is reserved in the data part of the record. Reserving fixed space for NULL values permits columns to be updated in place from NULL to non-NULL values without causing index page fragmentation.
4.COMPACT Row Format
The COMPACT row format reduces row storage space by about 20% compared to the REDUNDANT row format, at the cost of increasing CPU use for some operations. If your workload is a typical one that is limited by cache hit rates and disk speed, COMPACT format is likely to be faster. If the workload is limited by CPU speed, compact format might be slower.
Tables that use the COMPACT row format store the first 768 bytes of variable-length column values (VARCHAR, VARBINARY, and BLOB and TEXT types) in the index record within the B-tree node, with the remainder stored on overflow pages. Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
If the value of a column is 768 bytes or less, an overflow page is not used, and some savings in I/O may result, since the value is stored entirely in the B-tree node. This works well for relatively【ˈrelətɪvli 相对地;相当地;相当程度上】 short BLOB column values, but may cause B-tree nodes to fill with data rather than key values, reducing their efficiency. Tables with many BLOB columns could cause B-tree nodes to become too full, and contain too few rows, making the entire index less efficient than if rows were shorter or column values were stored off-page.
5. COMPACT Row Format Storage Characteristics
The COMPACT row format has the following storage characteristics:
• Each index record contains a 5-byte header that may be preceded by a variable-length header. The header is used to link together consecutive records, and for row-level locking.
• The variable-length part of the record header contains a bit vector for indicating NULL columns. If the number of columns in the index that can be NULL is N, the bit vector occupies CEILING(N/8) bytes. (For example, if there are anywhere from 9 to 16 columns that can be NULL, the bit vector uses two bytes.) Columns that are NULL do not occupy space other than the bit in this vector. The variable-length part of the header also contains the lengths of variable-length columns. Each length takes one or two bytes, depending on the maximum length of the column. If all columns in the index are NOT NULL and have a fixed length, the record header has no variable-length part.
• For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes are only needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the 2-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.
• The record header is followed by the data contents of non-NULL columns.
• Records in the clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte roll pointer field.
• If no primary key is defined for a table, each clustered index record also contains a 6-byte row ID field.
• Each secondary index record contains all the primary key columns defined for the clustered index key that are not in the secondary index. If any of the primary key columns are variable length, the record header for each secondary index has a variable-length part to record their lengths, even if the secondary index is defined on fixed-length columns.
• Internally, for nonvariable-length character sets, fixed-length character columns such as CHAR(10) are stored in a fixed-length format.Trailing spaces are not truncated from VARCHAR columns.
• Internally, for variable-length character sets such as utf8mb3 and utf8mb4, InnoDB attempts to store CHAR(N) in N bytes by trimming trailing spaces. If the byte length of a CHAR(N) column value exceeds N bytes, trailing spaces are trimmed to a minimum of the column value byte length. The maximum length of a CHAR(N) column is the maximum character byte length × N.
A minimum of N bytes is reserved for CHAR(N). Reserving the minimum space N in many cases enables column updates to be done in place without causing index page fragmentation. By comparison, CHAR(N) columns occupy the maximum character byte length × N when using the REDUNDANT row format.
Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
6. DYNAMIC Row Format
The DYNAMIC row format offers the same storage characteristics as the COMPACT row format but adds enhanced storage capabilities for long variable-length columns and supports large index key prefixes.
When a table is created with ROW_FORMAT=DYNAMIC, InnoDB can store long variable-length column values (for VARCHAR, VARBINARY, and BLOB and TEXT types) fully off-page, with the clustered index record containing only a 20-byte pointer to the overflow page. Fixed-length fields greater than or equal to 768 bytes are encoded as variable-length fields. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
Whether columns are stored off-page depends on the page size and the total size of the row. When a row is too long, the longest columns are chosen for off-page storage until the clustered index record fits on the B-tree page. TEXT and BLOB columns that are less than or equal to 40 bytes are stored in line.
The DYNAMIC row format maintains the efficiency of storing the entire row in the index node if it fits (as do the COMPACT and REDUNDANT formats), but the DYNAMIC row format avoids the problem of filling B-tree nodes with a large number of data bytes of long columns. The DYNAMIC row format is based on the idea that if a portion of a long data value is stored off-page, it is usually most efficient to store the entire value off-page. With DYNAMIC format, shorter columns are likely to remain in the B-tree node, minimizing the number of overflow pages required for a given row.
The DYNAMIC row format supports index key prefixes up to 3072 bytes.
Tables that use the DYNAMIC row format can be stored in the system tablespace, file-per-table tablespaces, and general tablespaces. To store DYNAMIC tables in the system tablespace, either disable innodb_file_per_table and use a regular CREATE TABLE or ALTER TABLE statement, or use the TABLESPACE [=] innodb_system table option with CREATE TABLE or ALTER TABLE. The innodb_file_per_table variable is not applicable to general tablespaces, nor is it applicable when using the TABLESPACE [=] innodb_system table option to store DYNAMIC tables in the system tablespace.
补充:【DYNAMIC Row Format Storage Characteristics】The DYNAMIC row format is a variation of the COMPACT row format. For storage characteristics, see COMPACT Row Format Storage Characteristics.
7. COMPRESSED Row Format
The COMPRESSED row format offers the same storage characteristics and capabilities as the DYNAMIC row format but adds support for table and index data compression.
The COMPRESSED row format uses similar internal details for off-page storage as the DYNAMIC row format, with additional storage and performance considerations from the table and index data being compressed and using smaller page sizes. With the COMPRESSED row format, the KEY_BLOCK_SIZE option controls how much column data is stored in the clustered index, and how much is placed on overflow pages.
The COMPRESSED row format supports index key prefixes up to 3072 bytes.
Tables that use the COMPRESSED row format can be created in file-per-table tablespaces or general tablespaces. The system tablespace does not support the COMPRESSED row format. To store a COMPRESSED table in a file-per-table tablespace, the innodb_file_per_table variable must be enabled. The innodb_file_per_table variable is not applicable to general tablespaces. General tablespaces support all row formats with the caveat that compressed and uncompressed tables cannot coexist in the same general tablespace due to different physical page sizes.
补充:【Compressed Row Format Storage Characteristics】 The COMPRESSED row format is a variation of the COMPACT row format. For storage characteristics, see COMPACT Row Format Storage Characteristics.
8.Defining the Row Format of a Table 【怎么去定义 Row Format】
The default row format for InnoDB tables is defined by innodb_default_row_format variable, which has a default value of DYNAMIC. The default row format is used when the ROW_FORMAT table option is not defined explicitly or when ROW_FORMAT=DEFAULT is specified.
The row format of a table can be defined explicitly【ɪkˈsplɪsətli 明确地;明白地】 using the ROW_FORMAT table option in a CREATE TABLE or ALTER TABLE statement. For example:
CREATE TABLE t1 (c1 INT) ROW_FORMAT=DYNAMIC;
An explicitly【ɪkˈsplɪsətli 明确地;明白地】 defined ROW_FORMAT setting overrides【oʊvərˈraɪdz 推翻,不理会;超驰控制,超控;重写】 the default row format. Specifying ROW_FORMAT=DEFAULT is equivalent to using the implicit【ɪmˈplɪsɪt 含蓄的;完全的;内含的;无疑问的;不直接言明的;成为一部分的】 default.
The innodb_default_row_format variable can be set dynamically:
mysql> SET GLOBAL innodb_default_row_format=DYNAMIC;
Valid innodb_default_row_format options include DYNAMIC, COMPACT, and REDUNDANT. The COMPRESSED row format, which is not supported for use in the system tablespace, cannot be defined as the default. It can only be specified explicitly in a CREATE TABLE or ALTER TABLE statement. Attempting to set the innodb_default_row_format variable to COMPRESSED returns an error:
不能将COMPRESSED设置为默认的row format
mysql> SET GLOBAL innodb_default_row_format=COMPRESSED; ERROR 1231 (42000): Variable 'innodb_default_row_format' can't be set to the value of 'COMPRESSED'
Newly created tables use the row format defined by the innodb_default_row_format variable when a ROW_FORMAT option is not specified explicitly, or when ROW_FORMAT=DEFAULT is used. For example, the following CREATE TABLE statements use the row format defined by the innodb_default_row_format variable.
CREATE TABLE t1 (c1 INT); CREATE TABLE t2 (c1 INT) ROW_FORMAT=DEFAULT;
When a ROW_FORMAT option is not specified explicitly, or when ROW_FORMAT=DEFAULT is used, an operation that rebuilds a table silently【ˈsaɪləntli 默默地;静静地;悄悄地;无声地;不用言语表达地;不说话地】 changes the row format of the table to the format defined by the innodb_default_row_format variable.
Table-rebuilding operations include ALTER TABLE operations that use ALGORITHM=COPY or ALGORITHM=INPLACE where table rebuilding is required. OPTIMIZE TABLE is also a table-rebuilding operation. ----表的重建,新的表用采用默认的ROW_FORMAT,应特别小心新旧表,可能ROW_FORMAT不一致了。
Consider the following potential issues before changing the row format of existing tables from REDUNDANT or COMPACT to DYNAMIC.--小心自动切换
• The REDUNDANT and COMPACT row formats support a maximum index key prefix length of 767 bytes whereas DYNAMIC and COMPRESSED row formats support an index key prefix length of 3072 bytes. In a replication environment, if the innodb_default_row_format variable is set to DYNAMIC on the source, and set to COMPACT on the replica, the following DDL statement, which does not explicitly define a row format, succeeds on the source but fails on the replica: ---主从架构下,小心主从表的ROW_FORMAT不一致的情况
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 VARCHAR(5000), KEY i1(c2(3070)));
• Importing a table that does not explicitly define a row format results in a schema mismatch error if the innodb_default_row_format setting on the source server differs from the setting on the destination server.
9.查看 the Row Format of a Table
To determine the row format of a table, use SHOW TABLE STATUS IN tablename.
Alternatively, query the Information Schema INNODB_TABLES table:
SELECT NAME, ROW_FORMAT FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME='????';
-----https://dev.mysql.com/doc/refman/8.0/en/innodb-row-format.html