SQL Server Availability Group Failover 测试
兼容性测试:
测试脚本:
环境:windows failover cluster
主库执行脚本:
USE [master]
GO
ALTER AVAILABILITY GROUP [test_AG]
MODIFY REPLICA ON N'host1' WITH (FAILOVER_MODE = AUTOMATIC)
GO
报错:
Msg 35215, Level 16, State 17, Line 3
The Alter operation is not allowed on availability replica 'UELT1WASFSD01VS', because automatic failover mode is an invalid configuration on a SQL Server Failover Cluster Instance. Retry the operation by specifying manual failover mode.
结果证明AG 自动failover的属性和windows failover cluster不兼容,如果要启用AG自动failover必须使用单实例模式的sql server.
2.AG自动failover功能测试
测试1:
在sscm中手动停止主库sql service,观察是否自动failover到备库。
测试2:
直接关闭主库所在的主机,观察是否failover到备库
测试1和测试2均实现自动failover。
在failover期间,会有应用程序连接拒绝出现:
11/01/19 06:04:10.368 [0x00003688] [spid 769] SQLState: 08S01, Native Error: 10054, Severity: 0, State: 10, Line: 0
[Microsoft][SQL Server Native Client 11.0]TCP Provider: An existing connection was forcibly closed by the remote host.
3.AG Failover性能测试
table: 5218283 rows clustered.
query: select count(*) from test
command:.\ostress -Suedv1wasfsdba01 -dtest -E -Q"select count(*) from dbo.test" -ooutput -mstress -n500 -r100 -T10 -T88 -T146
Primary Session number |
Secondary session number |
1st Failover time(s) |
2nd Failover time(s) |
3rd Failover time(s) |
300 |
0 |
5 |
5 |
5 |
0 |
300 |
25 |
25 |
23 |
500 |
0 |
5 |
5 |
5 |
0 |
500 |
44 |
46 |
31 |
1000 |
0 |
5 |
5 |
5 |
0 |
1000 |
89 |
88 |
91 |
session 1000的时候,客户端报连接超时错误:
解决方案:
exec sp_configure 'max worker threads',4096
go
reconfigure
总结:
1.Avalibility group 自动failover 和 FCI failover不兼容,只能应用于 standalone instance.
2.Avalibility group automatic failover 执行sql service 宕机 failover and 主机宕机 failover.
3. failover 时间取决于备库上只读会话并发数目,可以降低主要备库上的只读负载,而在非主要备库上执行只读负载的方式来降低failover的时间。