InnoDB On-Disk Structures (2): Indexes (Repost)
Reposted and excerpted from https://dev.mysql.com/doc/refman/8.0/en/innodb-indexes.html
This section covers topics related to InnoDB indexes.
1. Clustered and Secondary Indexes
Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key. To get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup and DML operations for each table.
- When you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.
- If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.
- If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
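The three rules above can be read as a simple decision procedure. The following Python sketch is only an illustration of that reading, not InnoDB code; the Index class and the choose_clustered_index function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Index:
    # Hypothetical, simplified description of an index definition.
    name: str
    columns: List[str]
    unique: bool = False
    primary: bool = False


def choose_clustered_index(indexes: List[Index], not_null_columns: Set[str]) -> str:
    """Return the name of the index that would act as the clustered index."""
    # Rule 1: an explicit PRIMARY KEY is used as the clustered index.
    for idx in indexes:
        if idx.primary:
            return idx.name
    # Rule 2: otherwise, the first UNIQUE index whose key columns are all NOT NULL.
    for idx in indexes:
        if idx.unique and all(col in not_null_columns for col in idx.columns):
            return idx.name
    # Rule 3: otherwise, the hidden index on the synthetic row ID column.
    return "GEN_CLUST_INDEX"
```

For a table with two UNIQUE indexes where only the second one covers NOT NULL columns, the sketch picks the second index; for a table with no unique index at all, it falls back to GEN_CLUST_INDEX.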
How the Clustered Index Speeds Up Queries
Accessing a row through the clustered index is fast because the index search leads directly to the page with all the row data. If a table is large, the clustered index architecture often saves a disk I/O operation when compared to storage organizations that store row data using a different page from the index record.
How Secondary Indexes Relate to the Clustered Index
All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.
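The two-step lookup described above can be sketched in Python. This is a toy model under loud assumptions: plain dictionaries stand in for the B-trees, and the table contents and the find_by_email function are invented for illustration.

```python
# Clustered index: primary key -> full row (the row data lives here).
clustered = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
}

# Secondary index on email: indexed column value -> primary key value.
secondary_email = {row["email"]: pk for pk, row in clustered.items()}


def find_by_email(email):
    """Two-step lookup: secondary index first, then the clustered index."""
    pk = secondary_email.get(email)  # step 1: the secondary index yields the primary key
    if pk is None:
        return None
    return clustered[pk]             # step 2: the clustered index yields the row
```

Because every secondary index entry carries a copy of the primary key, a longer key makes each of these entries larger, which is why a short primary key is advantageous.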
2. The Physical Structure of an InnoDB Index
With the exception of spatial indexes, InnoDB indexes are B-tree data structures. Spatial indexes use R-trees, which are specialized data structures for indexing multi-dimensional data. Index records are stored in the leaf pages of their B-tree or R-tree data structure. The default size of an index page is 16KB.
When new records are inserted into an InnoDB clustered index, InnoDB tries to leave 1/16 of the page free for future insertions and updates of the index records. If index records are inserted in a sequential order (ascending or descending), the resulting index pages are about 15/16 full. If records are inserted in a random order, the pages are from 1/2 to 15/16 full.
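To put numbers on these fractions, the following Python sketch works them out for the default 16KB page. It is plain arithmetic that ignores page headers and other per-page overhead; the function names are made up for this example.

```python
PAGE_SIZE = 16 * 1024  # default index page size in bytes


def reserved_bytes(page_size=PAGE_SIZE):
    """Bytes left free when 1/16 of the page is reserved."""
    return page_size // 16


def occupancy_range(page_size=PAGE_SIZE):
    """(min, max) bytes used per page for random-order inserts: 1/2 to 15/16."""
    return page_size // 2, page_size * 15 // 16
```

With the default page size, the reserved 1/16 is 1024 bytes, and a page filled by random-order inserts holds between 8192 and 15360 bytes of the 16384 available.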
InnoDB performs a bulk load when creating or rebuilding B-tree indexes. This method of index creation is known as a sorted index build. The innodb_fill_factor configuration option defines the percentage of space on each B-tree page that is filled during a sorted index build, with the remaining space reserved for future index growth. Sorted index builds are not supported for spatial indexes.
An innodb_fill_factor setting of 100 leaves 1/16 of the space in clustered index pages free for future index growth.
If the fill factor of an InnoDB index page drops below the MERGE_THRESHOLD, which is 50% by default if not specified, InnoDB tries to contract the index tree to free the page. The MERGE_THRESHOLD setting applies to both B-tree and R-tree indexes.
You can define the page size for all InnoDB tablespaces in a MySQL instance by setting the innodb_page_size configuration option prior to initializing the MySQL instance. Once the page size for an instance is defined, you cannot change it without reinitializing the instance. Supported sizes are 64KB, 32KB, 16KB (default), 8KB, and 4KB.
A MySQL instance using a particular InnoDB page size cannot use data files or log files from an instance that uses a different page size.
3. Sorted Index Builds
InnoDB performs a bulk load instead of inserting one index record at a time when creating or rebuilding indexes. This method of index creation is also known as a sorted index build. Sorted index builds are not supported for spatial indexes.
There are three phases to an index build. In the first phase, the clustered index is scanned, and index entries are generated and added to the sort buffer. When the sort buffer becomes full, entries are sorted and written out to a temporary intermediate file. This process is also known as a “run”. In the second phase, with one or more runs written to the temporary intermediate file, a merge sort is performed on all entries in the file. In the third and final phase, the sorted entries are inserted into the B-tree.
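The three phases follow the shape of a classic external merge sort. The Python sketch below is a simplified model, not InnoDB's implementation: in-memory lists stand in for the sort buffer and the temporary intermediate file, and the function name and the buffer_size parameter are assumptions made for illustration.

```python
import heapq


def sorted_index_build(rows, key, buffer_size):
    """Return index entries in the order they would be loaded into the B-tree."""
    runs = []    # stands in for the temporary intermediate file
    buffer = []  # stands in for the sort buffer

    # Phase 1: scan the clustered index, fill the sort buffer, and write out
    # a sorted run each time the buffer fills up.
    for pk, row in rows:
        buffer.append((row[key], pk))
        if len(buffer) == buffer_size:
            runs.append(sorted(buffer))
            buffer = []
    if buffer:
        runs.append(sorted(buffer))

    # Phase 2: merge the sorted runs into a single sorted stream.
    merged = heapq.merge(*runs)

    # Phase 3: insert the sorted entries into the index (a plain list here).
    return list(merged)
```

Each entry pairs the indexed value with the primary key, mirroring how secondary index records carry the primary key columns.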
Prior to the introduction of sorted index builds, index entries were inserted into the B-tree one record at a time using insert APIs. This method involved opening a B-tree cursor to find the insert position and then inserting entries into a B-tree page using an optimistic insert. If an insert failed due to a page being full, a pessimistic insert would be performed, which involves opening a B-tree cursor and splitting and merging B-tree nodes as necessary to find space for the entry. The drawbacks of this “top-down” method of building an index are the cost of searching for an insert position and the constant splitting and merging of B-tree nodes.
Sorted index builds use a “bottom-up” approach to building an index. With this approach, a reference to the right-most leaf page is held at all levels of the B-tree. The right-most leaf page at the necessary B-tree depth is allocated and entries are inserted according to their sorted order. Once a leaf page is full, a node pointer is appended to the parent page and a sibling leaf page is allocated for the next insert. This process continues until all entries are inserted, which may result in inserts up to the root level. When a sibling page is allocated, the reference to the previously pinned leaf page is released, and the newly allocated leaf page becomes the right-most leaf page and new default insert location.
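The bottom-up idea can be illustrated with a small Python sketch. It is deliberately simplified: it builds one level at a time rather than appending node pointers to parent pages while the leaves are still being filled, and it models a node pointer as the first key of the child page. The function name and the page_capacity parameter are hypothetical.

```python
def build_bottom_up(sorted_keys, page_capacity):
    """Build B-tree levels from sorted keys; returns a list of levels, leaves first."""
    if page_capacity < 2:
        raise ValueError("page_capacity must be at least 2")

    def pack(items):
        # Fill the right-most page until it is full, then allocate a sibling.
        pages = [[]]
        for item in items:
            if len(pages[-1]) == page_capacity:
                pages.append([])
            pages[-1].append(item)
        return pages

    levels = [pack(sorted_keys)]
    # Each level above holds one node pointer per child page; the pointer
    # is modeled here as the first key of that child page.
    while len(levels[-1]) > 1:
        pointers = [page[0] for page in levels[-1]]
        levels.append(pack(pointers))
    return levels
```

Because the input is already sorted, entries are only ever appended to the right-most page of a level, so no search for an insert position and no page splits are needed.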
Reserving B-tree Page Space for Future Index Growth
To set aside space for future index growth, you can use the innodb_fill_factor configuration option to reserve a percentage of B-tree page space. For example, setting innodb_fill_factor to 80 reserves 20 percent of the space in B-tree pages during a sorted index build. This setting applies to both B-tree leaf and non-leaf pages. It does not apply to external pages used for TEXT or BLOB entries. The amount of space that is reserved may not be exactly as configured, as the innodb_fill_factor value is interpreted as a hint rather than a hard limit.
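As a rough illustration of the percentage, the Python sketch below computes how many entries a sorted build would place in a page for a given fill factor. It treats the value as an exact limit, which the text above says it is not, and it counts entries rather than bytes; the function name and the page_capacity parameter are assumptions made for this example.

```python
def entries_per_page(page_capacity, fill_factor):
    """Entries placed in a page during a sorted build for a fill factor of 10 to 100."""
    if not 10 <= fill_factor <= 100:
        raise ValueError("fill_factor must be between 10 and 100")
    # Always place at least one entry so that the build can make progress.
    return max(1, page_capacity * fill_factor // 100)
```

With a hypothetical capacity of 100 entries per page, a fill factor of 80 places 80 entries in each page and leaves room for 20 more.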
Sorted Index Builds and Full-Text Index Support
Sorted index builds are supported for fulltext indexes. Previously, SQL was used to insert entries into a fulltext index.
Sorted Index Builds and Compressed Tables
For compressed tables, the previous index creation method appended entries to both compressed and uncompressed pages. When the modification log (representing free space on the compressed page) became full, the compressed page would be recompressed. If compression failed due to a lack of space, the page would be split. With sorted index builds, entries are only appended to uncompressed pages. When an uncompressed page becomes full, it is compressed. Adaptive padding is used to ensure that compression succeeds in most cases, but if compression fails, the page is split and compression is attempted again. This process continues until compression is successful.
Sorted Index Builds and Redo Logging
Redo logging is disabled during a sorted index build. Instead, there is a checkpoint to ensure that the index build can withstand a crash or failure. The checkpoint forces a write of all dirty pages to disk. During a sorted index build, the page cleaner thread is signaled periodically to flush dirty pages to ensure that the checkpoint operation can be processed quickly. Normally, the page cleaner thread flushes dirty pages when the number of clean pages falls below a set threshold. For sorted index builds, dirty pages are flushed promptly to reduce checkpoint overhead and to parallelize I/O and CPU activity.
Sorted Index Builds and Optimizer Statistics
Sorted index builds may result in optimizer statistics that differ from those generated by the previous method of index creation. The difference in statistics, which is not expected to affect workload performance, is due to the different algorithm used to populate the index.
4. InnoDB FULLTEXT Indexes
FULLTEXT indexes are created on text-based columns (CHAR, VARCHAR, or TEXT columns) to help speed up queries and DML operations on data contained within those columns, omitting any words that are defined as stopwords.
A FULLTEXT index is defined as part of a CREATE TABLE statement or added to an existing table using ALTER TABLE or CREATE INDEX.
Full-text search is performed using MATCH() ... AGAINST syntax.
InnoDB Full-Text Index Design
InnoDB FULLTEXT indexes have an inverted index design. Inverted indexes store a list of words, and for each word, a list of documents that the word appears in. To support proximity search, position information for each word is also stored, as a byte offset.
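A minimal Python sketch of an inverted index with byte-offset positions follows. It is an illustration only: the tokenizer is a simple regular expression that lowercases words, and it ignores stopwords and token size limits; the function name is hypothetical.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (doc_id, byte_offset) postings."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for match in re.finditer(r"\w+", text):
            word = match.group().lower()
            # Byte offset of the word within the UTF-8 encoded document.
            offset = len(text[:match.start()].encode("utf-8"))
            index[word].append((doc_id, offset))
    return dict(index)
```

Looking up a word returns every document that contains it together with the position of each occurrence, which is the information a proximity search needs.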
InnoDB Full-Text Index Tables
When creating an InnoDB FULLTEXT index, a set of index tables is created, as shown in the following example:
mysql> CREATE TABLE opening_lines (
       id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200),
       FULLTEXT idx (opening_line)
       ) ENGINE=InnoDB;

mysql> SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_TABLES
       WHERE name LIKE 'test/%';
+----------+----------------------------------------------------+-------+
| table_id | name                                               | space |
+----------+----------------------------------------------------+-------+
|      333 | test/fts_0000000000000147_00000000000001c9_index_1 |   289 |
|      334 | test/fts_0000000000000147_00000000000001c9_index_2 |   290 |
|      335 | test/fts_0000000000000147_00000000000001c9_index_3 |   291 |
|      336 | test/fts_0000000000000147_00000000000001c9_index_4 |   292 |
|      337 | test/fts_0000000000000147_00000000000001c9_index_5 |   293 |
|      338 | test/fts_0000000000000147_00000000000001c9_index_6 |   294 |
|      330 | test/fts_0000000000000147_being_deleted            |   286 |
|      331 | test/fts_0000000000000147_being_deleted_cache      |   287 |
|      332 | test/fts_0000000000000147_config                   |   288 |
|      328 | test/fts_0000000000000147_deleted                  |   284 |
|      329 | test/fts_0000000000000147_deleted_cache            |   285 |
|      327 | test/opening_lines                                 |   283 |
+----------+----------------------------------------------------+-------+
The first six tables represent the inverted index and are referred to as auxiliary index tables. When incoming documents are tokenized, the individual words (also referred to as “tokens”) are inserted into the index tables along with position information and the associated Document ID (DOC_ID). The words are fully sorted and partitioned among the six index tables based on the character set sort weight of the word's first character.
The inverted index is partitioned into six auxiliary index tables to support parallel index creation. By default, two threads tokenize, sort, and insert words and associated data into the index tables. The number of threads is configurable using the innodb_ft_sort_pll_degree option. Consider increasing the number of threads when creating FULLTEXT indexes on large tables.
Auxiliary index table names are prefixed with fts_ and postfixed with index_*. Each index table is associated with the indexed table by a hex value in the index table name that matches the table_id of the indexed table. For example, the table_id of the test/opening_lines table is 327, for which the hex value is 0x147. As shown in the preceding example, the “147” hex value appears in the names of index tables that are associated with the test/opening_lines table.
A hex value representing the index_id of the FULLTEXT index also appears in auxiliary index table names. For example, in the auxiliary table name test/fts_0000000000000147_00000000000001c9_index_1, the hex value 1c9 has a decimal value of 457. The index defined on the opening_lines table (idx) can be identified by querying the INFORMATION_SCHEMA.INNODB_INDEXES table for this value (457).
mysql> SELECT index_id, name, table_id, space from INFORMATION_SCHEMA.INNODB_INDEXES
       WHERE index_id=457;
+----------+------+----------+-------+
| index_id | name | table_id | space |
+----------+------+----------+-------+
|      457 | idx  |      327 |   283 |
+----------+------+----------+-------+
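The hex-to-decimal mapping between auxiliary table names and the table_id and index_id values can be checked with a few lines of Python. The helper names are hypothetical, and the 16-digit zero-padded format is read off the example output above rather than taken from a specification.

```python
def aux_index_table_name(schema, table_id, index_id, partition):
    """Name of one auxiliary index table, following the pattern in the example output."""
    return f"{schema}/fts_{table_id:016x}_{index_id:016x}_index_{partition}"


def ids_from_aux_name(name):
    """Recover (table_id, index_id) as decimal integers from an auxiliary table name."""
    # After the schema prefix: ['fts', '<table hex>', '<index hex>', 'index', 'N']
    parts = name.split("/")[-1].split("_")
    return int(parts[1], 16), int(parts[2], 16)
```

Applied to test/fts_0000000000000147_00000000000001c9_index_1, the second helper returns the pair 327 and 457, matching the table_id and index_id shown in the query results.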
Index tables are stored in their own tablespace if the primary table is created in a file-per-table tablespace.
The other index tables shown in the preceding example are referred to as common index tables and are used for deletion handling and storing the internal state of FULLTEXT indexes. Unlike the inverted index tables, which are created for each full-text index, this set of tables is common to all full-text indexes created on a particular table.
Common auxiliary tables are retained even if full-text indexes are dropped. When a full-text index is dropped, the FTS_DOC_ID column that was created for the index is retained, as removing the FTS_DOC_ID column would require rebuilding the table. Common auxiliary tables are required to manage the FTS_DOC_ID column.
- fts_*_deleted and fts_*_deleted_cache
  Contain the document IDs (DOC_ID) for documents that are deleted but whose data is not yet removed from the full-text index. The fts_*_deleted_cache is the in-memory version of the fts_*_deleted table.
- fts_*_being_deleted and fts_*_being_deleted_cache
  Contain the document IDs (DOC_ID) for documents that are deleted and whose data is currently in the process of being removed from the full-text index. The fts_*_being_deleted_cache table is the in-memory version of the fts_*_being_deleted table.
- fts_*_config
  Stores information about the internal state of the FULLTEXT index. Most importantly, it stores the FTS_SYNCED_DOC_ID, which identifies documents that have been parsed and flushed to disk. In case of crash recovery, FTS_SYNCED_DOC_ID values are used to identify documents that have not been flushed to disk so that the documents can be re-parsed and added back to the FULLTEXT index cache. To view the data in this table, query the INFORMATION_SCHEMA.INNODB_FT_CONFIG table.
InnoDB Full-Text Index Cache
When a document is inserted, it is tokenized, and the individual words and associated data are inserted into the FULLTEXT index. This process, even for small documents, could result in numerous small insertions into the auxiliary index tables, making concurrent access to these tables a point of contention. To avoid this problem, InnoDB uses a FULLTEXT index cache to temporarily cache index table insertions for recently inserted rows. This in-memory cache structure holds insertions until the cache is full and then batch flushes them to disk (to the auxiliary index tables). You can query the INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE table to view tokenized data for recently inserted rows.
The caching and batch flushing behavior avoids frequent updates to auxiliary index tables, which could result in concurrent access issues during busy insert and update times. The batching technique also avoids multiple insertions for the same word, and minimizes duplicate entries. Instead of flushing each word individually, insertions for the same word are merged and flushed to disk as a single entry, improving insertion efficiency while keeping auxiliary index tables as small as possible.
The innodb_ft_cache_size variable is used to configure the full-text index cache size (on a per-table basis), which affects how often the full-text index cache is flushed. You can also define a global full-text index cache size limit for all tables in a given instance using the innodb_ft_total_cache_size option.
The full-text index cache stores the same information as auxiliary index tables. However, the full-text index cache only caches tokenized data for recently inserted rows. The data that is already flushed to disk (to the full-text auxiliary tables) is not brought back into the full-text index cache when queried. The data in auxiliary index tables is queried directly, and results from the auxiliary index tables are merged with results from the full-text index cache before being returned.
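The caching, batch flushing, and result merging described in this section can be modeled with a short Python sketch. It is a toy under stated assumptions: the class name, the max_entries limit (an entry count standing in for a cache size in bytes), and the dictionaries standing in for the cache and the auxiliary index tables are all invented for illustration.

```python
from collections import defaultdict


class FullTextCache:
    """Toy full-text index cache in front of an on-disk auxiliary index."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.cache = defaultdict(list)  # word -> doc IDs for recently inserted rows
        self.disk = defaultdict(list)   # stands in for the auxiliary index tables
        self.flushes = 0

    def insert(self, doc_id, words):
        for word in words:
            self.cache[word].append(doc_id)
        # Flush only once the cache has reached its configured limit.
        if sum(len(ids) for ids in self.cache.values()) >= self.max_entries:
            self.flush()

    def flush(self):
        # All cached postings for a word are written as one merged entry.
        for word, ids in self.cache.items():
            self.disk[word].extend(ids)
        self.cache.clear()
        self.flushes += 1

    def search(self, word):
        # Flushed data is read directly; results from the cache are merged in.
        return sorted(set(self.disk.get(word, []) + self.cache.get(word, [])))
```

A search never moves flushed data back into the cache; it reads both sources and merges them, so a word that appears in both flushed and recent rows still yields one combined result.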
InnoDB Full-Text Index Document ID and FTS_DOC_ID Column
InnoDB uses a unique document identifier referred to as a Document ID (DOC_ID) to map words in the full-text index to document records where the word appears. The mapping requires an FTS_DOC_ID column on the indexed table. If an FTS_DOC_ID column is not defined, InnoDB automatically adds a hidden FTS_DOC_ID column when the full-text index is created. The following example demonstrates this behavior.
The following table definition does not include an FTS_DOC_ID column:
mysql> CREATE TABLE opening_lines (
       id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200)
       ) ENGINE=InnoDB;
When you create a full-text index on the table using CREATE FULLTEXT INDEX syntax, a warning is returned which reports that InnoDB is rebuilding the table to add the FTS_DOC_ID column.
mysql> CREATE FULLTEXT INDEX idx ON opening_lines(opening_line);
Query OK, 0 rows affected, 1 warning (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 1

mysql> SHOW WARNINGS;
+---------+------+--------------------------------------------------+
| Level   | Code | Message                                          |
+---------+------+--------------------------------------------------+
| Warning |  124 | InnoDB rebuilding table to add column FTS_DOC_ID |
+---------+------+--------------------------------------------------+
The same warning is returned when using ALTER TABLE to add a full-text index to a table that does not have an FTS_DOC_ID column. If you create a full-text index at CREATE TABLE time and do not specify an FTS_DOC_ID column, InnoDB adds a hidden FTS_DOC_ID column, without warning.
Defining an FTS_DOC_ID column at CREATE TABLE time is less expensive than creating a full-text index on a table that is already loaded with data. If an FTS_DOC_ID column is defined on a table prior to loading data, the table and its indexes do not have to be rebuilt to add the new column. If you are not concerned with CREATE FULLTEXT INDEX performance, leave out the FTS_DOC_ID column to have InnoDB create it for you. InnoDB creates a hidden FTS_DOC_ID column along with a unique index (FTS_DOC_ID_INDEX) on the FTS_DOC_ID column. If you want to create your own FTS_DOC_ID column, the column must be defined as BIGINT UNSIGNED NOT NULL and named FTS_DOC_ID (all upper case), as in the following example:
mysql> CREATE TABLE opening_lines (
       FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200)
       ) ENGINE=InnoDB;
If you choose to define the FTS_DOC_ID column yourself, you are responsible for managing the column to avoid empty or duplicate values. FTS_DOC_ID values cannot be reused, which means FTS_DOC_ID values must be ever increasing.
Note: The FTS_DOC_ID column does not need to be defined as an AUTO_INCREMENT column, but AUTO_INCREMENT could make loading data easier.
Optionally, you can create the required unique FTS_DOC_ID_INDEX (all upper case) on the FTS_DOC_ID column.
mysql> CREATE UNIQUE INDEX FTS_DOC_ID_INDEX on opening_lines(FTS_DOC_ID);
If you do not create the FTS_DOC_ID_INDEX, InnoDB creates it automatically.
Note: FTS_DOC_ID_INDEX cannot be defined as a descending index because the InnoDB SQL parser does not use descending indexes.
The permitted gap between the largest used FTS_DOC_ID value and new FTS_DOC_ID value is 65535.
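If you manage FTS_DOC_ID values yourself, the two constraints stated in this section (values must keep increasing, and the gap to the largest used value is limited) can be checked before inserting. The Python sketch below is a hypothetical client-side helper, not a MySQL API, and whether a gap of exactly 65535 is accepted is an assumption.

```python
MAX_DOC_ID_GAP = 65535  # permitted gap stated in the text


def check_new_doc_id(largest_used, new_id):
    """Return True if new_id looks acceptable as the next FTS_DOC_ID value."""
    if new_id <= largest_used:
        return False  # values cannot be reused and must keep increasing
    # Assumption: a gap of exactly MAX_DOC_ID_GAP is treated as permitted.
    return new_id - largest_used <= MAX_DOC_ID_GAP
```

For example, with 100 as the largest used value, 101 passes the check, while 100 (a reused value) and 70100 (a gap far above the limit) do not.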
To avoid rebuilding the table, the FTS_DOC_ID column is retained when dropping a full-text index.
InnoDB Full-Text Index Deletion Handling
Deleting a record that has a full-text index column could result in numerous small deletions in the auxiliary index tables, making concurrent access to these tables a point of contention. To avoid this problem, the Document ID (DOC_ID) of a deleted document is logged in a special FTS_*_DELETED table whenever a record is deleted from an indexed table, and the indexed record remains in the full-text index. Before returning query results, information in the FTS_*_DELETED table is used to filter out deleted Document IDs. The benefit of this design is that deletions are fast and inexpensive. The drawback is that the size of the index is not immediately reduced after deleting records. To remove full-text index entries for deleted records, run OPTIMIZE TABLE on the indexed table with innodb_optimize_fulltext_only=ON to rebuild the full-text index.
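The trade-off between cheap deletes and a lazily shrinking index can be seen in a small Python model. This is a sketch only: the class and method names are hypothetical, a set stands in for the FTS_*_DELETED table, and optimize() stands in for the rebuild triggered by OPTIMIZE TABLE.

```python
class FullTextIndex:
    """Toy model of lazy deletion in a full-text index."""

    def __init__(self):
        self.postings = {}    # word -> set of doc IDs
        self.deleted = set()  # stands in for the FTS_*_DELETED table

    def add(self, doc_id, words):
        for word in words:
            self.postings.setdefault(word, set()).add(doc_id)

    def delete(self, doc_id):
        # Cheap: only record the doc ID; index entries stay in place.
        self.deleted.add(doc_id)

    def search(self, word):
        # Filter out deleted doc IDs before returning results.
        return sorted(self.postings.get(word, set()) - self.deleted)

    def optimize(self):
        # Physically remove entries for deleted documents.
        for word in list(self.postings):
            self.postings[word] -= self.deleted
            if not self.postings[word]:
                del self.postings[word]
        self.deleted.clear()

    def entry_count(self):
        return sum(len(ids) for ids in self.postings.values())
```

After delete(), searches already hide the document, but entry_count() does not drop until optimize() runs, which mirrors the drawback described above.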
InnoDB Full-Text Index Transaction Handling
InnoDB FULLTEXT indexes have special transaction handling characteristics due to their caching and batch processing behavior. Specifically, updates and insertions on a FULLTEXT index are processed at transaction commit time, which means that a FULLTEXT search can only see committed data. The following example demonstrates this behavior. The FULLTEXT search only returns a result after the inserted lines are committed.
mysql> CREATE TABLE opening_lines (
       id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200),
       FULLTEXT idx (opening_line)
       ) ENGINE=InnoDB;

mysql> BEGIN;

mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES
       ('Call me Ishmael.','Herman Melville','Moby-Dick'),
       ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),
       ('I am an invisible man.','Ralph Ellison','Invisible Man'),
       ('Where now? Who now? When now?','Samuel Beckett','The Unnamable'),
       ('It was love at first sight.','Joseph Heller','Catch-22'),
       ('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'),
       ('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'),
       ('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451');

mysql> SELECT COUNT(*) FROM opening_lines WHERE MATCH(opening_line) AGAINST('Ishmael');
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+

mysql> COMMIT;

mysql> SELECT COUNT(*) FROM opening_lines WHERE MATCH(opening_line) AGAINST('Ishmael');
+----------+
| COUNT(*) |
+----------+
|        1 |
+----------+
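The commit-time behavior in the session above can be mimicked with a toy Python model. It is an illustration under assumptions, not a description of InnoDB internals: the Session class, the match_count function, and the whitespace tokenizer are all made up for this sketch.

```python
class Session:
    """Toy model: full-text index changes become visible only at commit."""

    def __init__(self, index):
        self.index = index  # shared dict: word -> set of row IDs
        self.pending = []   # (row_id, words) collected inside the transaction

    def insert(self, row_id, text):
        words = text.lower().replace(".", "").split()
        self.pending.append((row_id, words))

    def commit(self):
        # Index updates are applied only now, at commit time.
        for row_id, words in self.pending:
            for word in words:
                self.index.setdefault(word, set()).add(row_id)
        self.pending = []

    def rollback(self):
        self.pending = []


def match_count(index, word):
    """Number of rows whose indexed text contains the word."""
    return len(index.get(word.lower(), set()))
```

Calling match_count for 'Ishmael' between insert() and commit() finds no rows, and finds one row after commit(), which matches the two COUNT(*) results in the session.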
Monitoring InnoDB Full-Text Indexes
You can monitor and examine the special text-processing aspects of InnoDB FULLTEXT indexes by querying the following INFORMATION_SCHEMA tables:
- INNODB_FT_CONFIG
- INNODB_FT_INDEX_TABLE
- INNODB_FT_INDEX_CACHE
- INNODB_FT_DEFAULT_STOPWORD
- INNODB_FT_DELETED
- INNODB_FT_BEING_DELETED
You can also view basic information for FULLTEXT indexes and tables by querying INNODB_INDEXES and INNODB_TABLES.