事务基础

传统关系型数据库两类典型的应用场景

OLTP(On-Line Transaction Processing) 在线交易处理，特点是并发高，写入/读取操作较多，单个事务的数据规模较小(通常情况下在byte/k byte左右)，常见于业务系统
OLAP(On-Line Analytical Processing) 在线分析处理，特点是数据量大(单次数据分析的规模基本都在GB级别往上)，无并发，写入频率极低,常见于报表类型的系统

传统事务

数据库的事务通常针对单机数据库而言，也是传统意义上的大家熟知的事务，通常情况下需要满足如下四个条件:

A(Atomic): 原子性，所有的语句作为原子工作单元去执行，要么全部执行，要么全部不执行
C(Consistent): 一致性，事务完成后，所有数据的状态都是一致的，例如银行转账，a账户转b账户100，a账户扣减100，b账户必须加上100
I(Isolation):隔离性，如果有多个事务并发执行，每个事务作出的修改必须与其他事务隔离
D(Duration):持久性，即事务完成后，对数据库数据的修改被持久化存储。

持久性与一致性通常较容易理解，原子性与隔离型在数据库事务中的底层实现较复杂

通常情况下，数据库的CRUD操作分别对应数据的insert select update delete 四种操作，上述四种操作代表数据库中的四种最基本的原子操作，不可再继续分割，类比于自然界中的化学反应，原子是构成物质的最基本的元素，不可再分割(虽说在物理学中原子可以在进一步分割)

注:下述所有的操作使用的MySql版本为5.7.36 理论上来说，5.7.x的版本应该都适用

数据库事务

提交型的事务示例

begin;
	update account set balance = balance-100 where user_name = 'A'; -- 语句1
	update account set balance = balance+100 where user_name = 'B'; -- 语句2
commit;

回滚型的事务示例

begin;
	update account set balance = balance-100 where user_name = 'A'; -- 语句1
	update account set balance = balance+100 where user_name = 'B'; -- 语句2
rollback;

上述操作即是模拟一个银行账户的转账操作，A->B账户转账100，将两个语句放到begin/end块中即构成了一个最简单的事物操作，转账操作要么成功，要么失败，不会存在账户A扣款100而账户B增加100失败的场景，在传统的关系型支持事物的数据库中(Oracle/Mysql/SqlServer)，上述事务(ACID特性)由数据库软件保证，开发人员只需要关注上层的业务逻辑，事务特性交由底层的数据库软件保证

事务隔离级别

在看事物隔离级别之前，先了解下下述三种读取数据常见的名词:

脏读(Dirty Read): 当前事务内允许读取其他事物还未提交的修改
不可重复读(NonRepeatable Read):当前事务内同一个查询操作,读取的数据可以发生变化
可重复读(Repeatable Read): 当前事务内同一个查询操作,读取的数据不会发生改变
幻读(Phantom Read):当前事务内读取到了其他事物insert操作产生的数据

四种事务隔离级别定义

读未提交(Read UnCommited):允许脏读，也就是可能读取到其他会话中未提交事务修改的数据
读提交(Read Commited):只能读取到其他事物已经提交的数据,未提交的数据无法读取
可重复读(Repeatable Read):在同一个事务内的查询都是事务开始时刻一致的，InnoDB默认级别。在SQL标准中，该隔离级别消除了不可重复读，但是还存在幻象读
串行读(Serializable):完全串行化的读，每次读都需要获得表级共享锁，读写相互都会阻塞

隔离级别对照表

隔离级别	脏读	不可重复读	幻读
读未提交	可能	可能	可能
读提交	不可能	可能	可能
可重复读	不可能	不可能	可能
串行读	不可能	不可能	不可能

读未提交这种隔离级别不对数据一致性做任何保障,主流生产环境的数据库一般不会使用，串行读会极大降低数据库对于事物的处理能力(所有事物串行执行,读加读锁，写加写锁，读写互斥)，
主流的关系型数据库默认的事务隔离级别要么是RC(SqlServer/Oracle),要么是RR(Mysql)

示例表的建表语句

CREATE TABLE `account` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `balance` int(11) NOT NULL DEFAULT '0',
  `username` varchar(30) NOT NULL DEFAULT '',
  `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `idx_username` (`username`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4

初始化的一条数据

insert into account values(NULL,10,'alice',now(),now());

读未提交

mysql innodb 存储引擎默认的事物隔离级别为RR(可重复读),在讨论RUC的事物隔离级别的时候需要手动调整session默认的隔离级别

set session transaction isolation level read uncommitted;

关闭session的自动提交功能

set autocommit=off;

模拟两个并发的场景(并发通常在数据库层面体现的就是两个并发的事物)

step	Tx	Session1	Tx	Session2
1	tx1	begin;	tx2	begin;
2	tx1	update account set balance = 9,update_time = now() where id = 1;	tx2
3	tx1		tx2	select id,balance,username from account where id = 1;
4	tx1		tx2	update account set balance = 8,update_time = now() where id = 1;
5	tx1		tx2	-- waiting until session1 commit or rollback;
6	tx1	commit;	tx2
7			tx2	select id,balance,username from account where id = 1;
8	tx3	begin;
9	tx3	select id,balance,username from account where id = 1;	tx2
10	tx3		tx2	commit;
11	tx3	select id,balance,username from account where id = 1;
12	tx3	commit;

下述列举了几个关键性环节的select查询语句的输出

step3 session2 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      9  | alice    |
+----+---------+----------+

step7 session2 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

step9 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

step11 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

可以看到,在"读未提交"的事物隔离级别下,事务每次都是读取行的最新数据，而不关注事物是否已提交,上述示例咋一看可能没什么特别大的问题，仔细想一下存在下述问题

数据的最终结果取决于tx1/tx2提交的顺序，上述示例演示的实际上是两个并发事物在"读未提交"隔离级别下的一种场景，亦即不确定性增加,这类场景与并发编程中的共享资源未正确同步类似，结果是不确定(亦或者是随机的)
上述示例只是演示了两个事物，一条共享数据,当系统中并发事物越多，单事物更新的数据条数越多,最终每条数据更新的结果不确定性就越大，基本上是呈指数级别增加
事务不管commit还是rollback，tx2均会读取最新的内容,这种场景对于大部分数据一致性要求较高的场景都无法接受，违反的隔离性的约束

读提交

调整session的隔离级别为读提交

set session transaction isolation level read committed;

更新数据记录为默认值10

update account set balance = 10 where id = 1;

模拟两个并发的场景

step	Tx	Session1	Tx	Session2
1	tx1	begin;	tx2	begin;
2	tx1	update account set balance = balance-1,update_time = now() where balance>0 and id = 1;	tx2
3	tx1		tx2	update account set balance = balance-1,update_time = now() where balance>0 and id = 1;
4	tx1		tx2	-- waiting unitl session1 commit;
5	tx1	commit;	tx2
6			tx2	select id,balance,username from account where id = 1;
7	tx3	begin;
8	tx3	select id,balance,username from account where id = 1;	tx2
9	tx3		tx2	commit;
10	tx3	select id,balance,username from account where id = 1;
11	tx3	commit;

step6 session2 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

step8 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      9  | alice    |
+----+---------+----------+

step10 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

可以看到两个session在事物提交前后，读取到的内容是不一样的，

初始阶段，系统里面只有一条初始化的数据，此时session1/session2读到的数据均为初始化的数据，亦即余额=10的那条数据
step5 session1 提交更新，此时session1事物(tx1)提交,step6阶段session2输出见上述step6 session2 输出，读取到的balance=8
step7 session1 开启一个新的事务tx3(上一个tx提交,mysql隐式开启tx)，此时step8 session1(tx3)读取到的内容为balance=9
step8 session1 读取id=1的数据，此时tx3读取到的余额为balance=8
step9 session2 tx2 提交，step10 session1 tx3 重新读取数据,此时读取到的余额为balance=8

通过上述分析，读提交的事物隔离级别体现在下述2点

step6 session2(tx2)读取的数据为8，此时tx2在不同时间点读取到的内容不一样(初始balance=10，step6 读取到tx1提交的变更)
step8 session1(tx3)读取到的数据为9，此时tx2还未提交,读取的数据是tx1已提交的数据
step10 session1(tx3)读取到的数据为8，此时tx2已提交，读取到最新的内容，示例与tx1类似

通过上述几点可以观察到对于读提交的事物隔离级别,对于不同事物对同一行的更新，通常情况下由于"锁等待"的存在,两个事物看似像是"串行"执行了一样,更新的总量与总体的事物量相对应,不存在读未提交中的更新丢失，事实上读提交的这种特性符合现实当中的大部分业务场景,这也是Oracle/SqlServer将默认的隔离级别设置为读提交的场景

对比读提交与读未提交可以清晰的发现

数据的可见性不在以单条sql语句执行成功与否来决定,而由事务是否提交来决定,通常情况下应用程序开发人员可以控制事物提交的时机，但无法控制单条Sql何时执行成功
读提交的大部分场景结果是可控的，尤其是在多个事物中包含多条语句的时候，最终的执行结果取决于最后一个事物执行的结果，从应用程序的维度去看，只需要控制事物的执行顺序就能得到我们想要的结果
读提交隔离级别某种程度上违反了上述ACID四特性中的I的要求,亦即隔离性

可重复读

设置当前session的事物隔离级别

set session transaction isolation level repeatable read;

更新数据记录为默认值10

update account set balance = 10 where id = 1;

模拟两个并发的场景

step	Tx	Session1	Tx	Session2
1	tx1	begin;	tx2	begin;
2	tx1	update account set balance=balance-1,update_time = now() where id = 1;	tx2
3	tx1		tx2	select id,balance,username from account where id = 1;
4	tx1		tx2	update account set balance = balance-1,update_time = now() where id = 1;
5	tx1		tx2	-- waiting until session1 commit or rollback;
6	tx1	commit;	tx2
7	tx1		tx2	select id,balance,username from account where id = 1;
8	tx3	begin;	tx2
9	tx3	select id,balance,username from account where id = 1;	tx2
10	tx3		tx2	commit;
11	tx3	select id,balance,username from account where id = 1;
12	tx3	commit;

step3 session2 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      10 | alice    |
+----+---------+----------+

step7 session2 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      8  | alice    |
+----+---------+----------+

step9 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      9  | alice    |
+----+---------+----------+

step11 session1 输出:

+----+---------+----------+
| id | balance | username |
+----+---------+----------+
|  1 |      9  | alice    |
+----+---------+----------+

上述输出需要注意一下几个点

session2 step3 输出的balance是10，并未读取到脏数据(tx1更新的数据)
tx1 step2与tx2 step4的两个update操作实际上都存在一个读操作(balance-1),
session2 step7 输出的balance是8,看似读取到了tx1的提交，这个跟mvcc以及锁有关系，后续会锁章节会解释
session1 tx3 step9 step11 两次读操作体现的就是可重复读,与读提交不同的是tx3 step11与step9两次读取操作不会随着tx2事物的提交而发生变更
此处未演示tx2 insert的情形,innodb在repeatable read通过next-key机制解决了幻读问题(标准的事物隔离级别幻读只在Serializable隔离级别才能解决)

串行读(SERIALIZABLE)

照例设置session的默认隔离级别

set session transaction isolation level serializable

更新数据记录为默认值10并insert一条新的数据

begin;
  update account set balance = 10 where id = 1;
  insert into account values(NULL,10,'bob',now(),now());
commit;

模拟两个并发的场景

step	Tx	Session1	Tx	Session2
1	tx1	begin;	tx2	begin;
2	tx1	update account set balance=balance-1,update_time = now() where id = 1;	tx2
3	tx1		tx2	select id,balance,username from account where id = 1;
4	tx1		tx2	-- waiting until session1 commit or rollback;
6	tx1	commit;	tx2
7	tx1		tx2	select id,balance,username from account where id = 1;
8			tx2	update account set balance=balance-1,update_time = now() where id = 1;
8	tx3	begin;	tx2
9	tx3	update account set balance=balance-1,update_time = now() where id = 1;	tx2
10	tx3	-- Deadlock found when trying to get lock; try restarting transaction	tx2
11			tx2	commit;

注意下述几个步骤

step2 session1 tx1 有一条更新语句
step3 session2 tx2 在读取操作的时候会出现waiing状态,这在前述三种隔离级别未曾发生
step6 session1 tx1 commit后，释放行记录上的锁(X锁)，session2 tx2 才可以继续进行
step10 session1 tx3 在对记录进行update时候，直接被系统检测到死锁发生,回滚并重新开启新的事物(tx2持有记录id=1的X锁直到ste11才释放)

整体来看,并发场景下对同一个记录行进行操作时，读写操作互斥，需要等待资源上的锁定被释放才能继续执行下一步，而在前述三种事物隔离级别的场景下，两个事物，只有存在并发写才会存在锁等待，读写并不会造成锁等待，默认只有在串行读的事物隔离级别下才会出现读写操作的互斥，锁的兼容性介绍请参见后续innodb关于锁的分析

posted @ 2022-08-28 11:10 dev_song 阅读(20) 评论(0) 编辑收藏举报

刷新页面返回顶部

dev_song

事务基础

传统关系型数据库两类典型的应用场景

传统事务

数据库事务

事务隔离级别

示例表的建表语句

读未提交

读提交

可重复读

串行读(SERIALIZABLE)

公告