InnoDB内存结构

InnoDB 内存结构

5.7

InnoDB架构图5.7

8.0

InnoDB架构图8.0

InnoDB 内存结构

缓冲池 Buffer Pool

InnoDB缓冲池被midPoint划分为new和old两个部分。

缓冲池链表

默认情况下：

3/8的缓冲池被用于old部分。
中点就是new old两个部分的边界。
当InnoDB从磁盘读取页到内存缓冲池的时候，插入到midpoint位置。一个页从磁盘读取到内存可能是SQL查询或者是预读操作导致的。
访问old部分的页，会移动它到new部分head的位置。如果页是由于SQL查询操作加载到内存的，那么第一次访问这个页会立刻发生，接着页会移动到new部分（其实刚读入的页还要经过 innodb_old_blocks_time 时间才可以移动到new部分）。如果是预读，那么第一次访问在移除之前可能都不会发生。
随着数据库的运行，缓冲池中不被访问的页会逐渐“老化”，移动到链表的尾部，然后从内存中被剔除。

全表扫描或者一个没有where条件的select语句会读取大量的页，为了避免热点数据被刷出内存，MySQL使用了一些策略避免全表扫描和预读带来的不良影响。

配置参数 innodb_old_blocks_time 指定了时间窗口（毫秒），在内存中的缓冲页第一次被访问后的多少时间内，它都不能被移动到new部分。默认值是1000。这样保证了全表扫描的页不会把之前全部的热点数据刷新出内存。

innodb_old_blocks_time 经常和 innodb_old_blocks_pct 搭配使用， innodb_old_blocks_pct 指定old部分占整个LRU链表的比例，默认值是37，表示3/8的比例。点击查看如何配置InnoDB缓冲池大小。

写缓冲 Change Buffer

如果二级索引页不在缓冲池中，写缓冲会缓存对应的变化。二级索引页的变化，可能是因为insert, update, 或者delete导致的。写缓冲会在读取对应页的时候合并这些变化。

二级索引通常是non-unique，并且对于二级索引的操作往往是随机顺序发生的（局部性不如聚簇索引），写缓冲有利于避免（推迟）大量的随机IO访问。

当系统空闲（或者slow shutdown）的时候，清理操作会将已更新的索引页写回磁盘。此时写缓冲已经积攒大量的修改，写回磁盘是批量写入，效率更快。

但如果受影响的行数过多，有大量的二级索引需要更新，那么合并操作可能需要几个小时。合并操作可能发生在事务提交后，服务器关机重启，更多信息请点我。

内存中，写缓冲是缓冲池的一部分；在磁盘中，写缓冲是系统表(system table)的一部分。

写缓冲不支持降序索引。 Change buffering is not supported for a secondary index if the index contains a descending index column or if the primary key includes a descending index column.

Command-Line Format	`--innodb-change-buffering=value`
System Variable	`innodb_change_buffering`
Scope	Global
Dynamic	Yes
`SET_VAR` Hint Applies	No
Type	Enumeration
Default Value	`all`
Valid Values	none``inserts``deletes``changes``purges``all

Command-Line Format	`--innodb-change-buffer-max-size=#`
System Variable	`innodb_change_buffer_max_size`
Scope	Global
Dynamic	Yes
`SET_VAR` Hint Applies	No
Type	Integer
Default Value	`25`
Minimum Value	`0`
Maximum Value	`50`

FAQ

A.16.1.	What types of operations modify secondary indexes and result in change buffering?
	`INSERT`, `UPDATE`, and `DELETE` operations can modify secondary indexes. If an affected index page is not in the buffer pool, the changes can be buffered in the change buffer.
A.16.2.	What is the benefit of the `InnoDB` change buffer?
	Buffering secondary index changes when secondary index pages are not in the buffer pool avoids expensive random access I/O operations that would be required to immediately read in affected index pages from disk. Buffered changes can be applied later, in batches, as pages are read into the buffer pool by other read operations.
A.16.3.	Does the change buffer support other types of indexes?
	No. The change buffer only supports secondary indexes. Clustered indexes, full-text indexes, and spatial indexes are not supported. Full-text indexes have their own caching mechanism.
A.16.4.	How much space does `InnoDB` use for the change buffer?
	Prior to the introduction of the `innodb_change_buffer_max_size` configuration option in MySQL 5.6, the maximum size of the on-disk change buffer in the system tablespace was 1/3 of the `InnoDB` buffer pool size.In MySQL 5.6 and later, the `innodb_change_buffer_max_size` configuration option defines the maximum size of the change buffer as a percentage of the total buffer pool size. By default, `innodb_change_buffer_max_size` is set to 25. The maximum setting is 50.`InnoDB` does not buffer an operation if it would cause the on-disk change buffer to exceed the defined limit.Change buffer pages are not required to persist in the buffer pool and may be evicted by LRU operations.
A.16.5.	How do I determine the current size of the change buffer?
	The current size of the change buffer is reported by `SHOW ENGINE INNODB STATUS \G`, under the `INSERT BUFFER AND ADAPTIVE HASH INDEX` heading. For example:`------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 0 merges`Relevant data points include:`size`: The number of pages used within the change buffer. Change buffer size is equal to `seg size - (1 + free list len)`. The `1 +` value represents the change buffer header page.`seg size`: The size of the change buffer, in pages.For information about monitoring change buffer status, see Section 15.5.2, “Change Buffer”.
A.16.6.	When does change buffer merging occur?
	When a page is read into the buffer pool, buffered changes are merged upon completion of the read, before the page is made available.Change buffer merging is performed as a background task. The `innodb_io_capacity` parameter sets an upper limit on the I/O activity performed by `InnoDB` background tasks such as merging data from the change buffer.A change buffer merge is performed during crash recovery. Changes are applied from the change buffer (in the system tablespace) to leaf pages of secondary indexes as index pages are read into the buffer pool.The change buffer is fully durable and can survive a system crash. Upon restart, change buffer merge operations resume as part of normal operations.A full merge of the change buffer can be forced as part of a slow server shutdown using `--innodb-fast-shutdown=0`.The buffered changes are merged when the page is loaded into the buffer pool, and the updated page is later flushed to disk. The `InnoDB` main thread merges buffered changes when the server is nearly idle, and during a slow shutdown.
A.16.7.	When is the change buffer flushed?
	Updated pages are flushed by the same flushing mechanism that flushes the other pages that occupy the buffer pool.
A.16.8.	When should the change buffer be used?
	The change buffer is a feature designed to reduce random I/O to secondary indexes as indexes grow larger and no longer fit in the `InnoDB` buffer pool. Generally, the change buffer should be used when the entire data set does not fit into the buffer pool, when there is substantial DML activity that modifies secondary index pages, or when there are lots of secondary indexes that are regularly changed by DML activity.
A.16.9.	When should the change buffer not be used?
	You might consider disabling the change buffer if the entire data set fits within the `InnoDB` buffer pool, if you have relatively few secondary indexes, or if you are using solid-state storage, where random reads are about as fast as sequential reads. Before making configuration changes, it is recommended that you run tests using a representative workload to determine if disabling the change buffer provides any benefit.
A.16.10.	Where can I find additional information about the change buffer?
	See Section 15.5.2, “Change Buffer”.

自适应哈希索引

哈希索引使用索引key的前缀构建。哈希索引由经常访问的索引页构建。如果InnoDB注意到查询可以通过构建哈希索引来优化，那么它就会这样做。哈希索引的指针指向Buffer Pool中的页。

有些情况下，性能的提升远大于构建哈希索引的开销。

但是，访问自适应哈希索引有时候会造成线程间竞争资源，比如多个并发的联表操作。LIKE查询通配符往往不会因此受益。如果性能没有提升，建议关闭。但最好之前性能测试一下。

日志缓冲区

redo log 对应内存区域，默认大小16MB。如果事务修改了大量的行，建议修改innodb_log_buffer_size增加日志缓冲区的大小。

For related information, see Memory Configuration, and Section 8.5.4, “Optimizing InnoDB Redo Logging”.

相关参数

innodb_flush_log_at_timeout

Command-Line Format	`--innodb-flush-log-at-timeout=#`
System Variable	`innodb_flush_log_at_timeout`
Scope	Global
Dynamic	Yes
`SET_VAR` Hint Applies	No
Type	Integer
Default Value	`1`
Minimum Value	`1`
Maximum Value	`2700`

Write and flush the logs every N seconds. innodb_flush_log_at_timeout allows the timeout period between flushes to be increased in order to reduce flushing and avoid impacting performance of binary log group commit. The default setting for innodb_flush_log_at_timeout is once per second.

默认情况下，后台线程每隔1秒会刷新一次log到磁盘（independent）。log0write.cc中的log_flusher函数。

innodb_flush_log_at_trx_commit

Command-Line Format	`--innodb-flush-log-at-trx-commit=#`
System Variable	`innodb_flush_log_at_trx_commit`
Scope	Global
Dynamic	Yes
`SET_VAR` Hint Applies	No
Type	Enumeration
Default Value	`1`
Valid Values	0``1``2

Controls the balance between strict ACID compliance for commit operations and higher performance that is possible when commit-related I/O operations are rearranged and done in batches. You can achieve better performance by changing the default value but then you can lose transactions in a crash.

The default setting of 1 is required for full ACID compliance. Logs are written and flushed to disk at each transaction commit.
With a setting of 0, logs are written and flushed to disk once per second. Transactions for which logs have not been flushed can be lost in a crash.
With a setting of 2, logs are written after each transaction commit and flushed to disk once per second. Transactions for which logs have not been flushed can be lost in a crash.
For settings 0 and 2, once-per-second flushing is not 100% guaranteed. Flushing may occur more frequently due to DDL changes and other internal InnoDB activities that cause logs to be flushed independently of the innodb_flush_log_at_trx_commit setting, and sometimes less frequently due to scheduling issues. If logs are flushed once per second, up to one second of transactions can be lost in a crash. If logs are flushed more or less frequently than once per second, the amount of transactions that can be lost varies accordingly.
Log flushing frequency is controlled by innodb_flush_log_at_timeout, which allows you to set log flushing frequency to N seconds (where N is 1 ... 2700, with a default value of 1). However, any unexpected mysqld process exit can erase up to N seconds of transactions.
DDL changes and other internal InnoDB activities flush the log independently of the innodb_flush_log_at_trx_commit setting.
InnoDB crash recovery works regardless of the innodb_flush_log_at_trx_commit setting. Transactions are either applied entirely or erased entirely.

For durability and consistency in a replication setup that uses InnoDB with transactions:

If binary logging is enabled, set sync_binlog=1.
Always set innodb_flush_log_at_trx_commit=1.

For information on the combination of settings on a replica that is most resilient to unexpected halts, see Section 17.4.2, “Handling an Unexpected Halt of a Replica”.

innodb_flush_log_at_trx_commit=1 每次都刷新到磁盘。

innodb_flush_log_at_trx_commit=2 每次写到系统缓存，每秒刷新到磁盘

innodb_flush_log_at_trx_commit=0 每次写道log buffer，每秒刷新到磁盘

posted @ 2021-01-12 23:49 dewxin 阅读(154) 评论(0) 编辑收藏举报

会员力量，点亮园子希望

刷新页面返回顶部

Dewxin