生产环境临时性忽略MySQL主从复制模拟故障实验
1、主服务器(192.168.43.100)
(1)安装数据库
# yum install -y mariadb-server # systemctl restart mariadb # mysql_secure_installation
(2)配置my.cnf文件
# vim /etc/my.cnf [mysqld] server-id=1 log-bin=/data/logbin/mysql-bin # systemctl restart mariadb
(3)创建用户并授权
[root@node1 ~]# mysql -uroot -p000000
# MariaDB [(none)]> grant replication slave on *.* to repluser@'192.168.43.%' identified by 'centos';
(4)备份数据库
# mysqldump -uroot -p000000 -A --single-transaction --master-data=1 -F > /data/all.sql
(5)拷贝备份文件过从主机
# scp /data/all.sql 192.168.43.200:/data/
2、从服务器(192.168.43.200)
(1)安装数据库
# yum install -y mariadb-server
# systemctl restart mariadb
# mysql_secure_installation
(2)往all.sql备份文件添加主从复制相关信息
# vim /data/all.sql CHANGE MASTER TO MASTER_HOST='192.168.43.100', MASTER_USER='repluser', MASTER_PASSWORD='centos', MASTER_PORT=3306, MASTER_LOG_FILE='mariadb-bin.000002', MASTER_LOG_POS=245;
(3)配置my.cnf文件
# vim /etc/my.cnf [mysqld] server-id=2 read-only # systemctl restart mariadb
(4)导入数据
# mysql -uroot -p000000 < /data/all.sql
(5)启动slave
mysql -uroot -p000000 MariaDB [(none)]> start slave; MariaDB [(none)]> show processlist; +----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+----------+ | 3 | root | localhost | NULL | Query | 0 | NULL | show processlist | 0.000 | | 6 | system user | | NULL | Connect | 1017 | Waiting for master to send event | NULL | 0.000 | | 7 | system user | | NULL | Connect | 1017 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL | 0.000 | +----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+----------+ 3 rows in set (0.00 sec) MariaDB [(none)]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.43.100 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000002 Read_Master_Log_Pos: 407 Relay_Log_File: mariadb-relay-bin.000003 Relay_Log_Pos: 531 Relay_Master_Log_File: mariadb-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 407 Relay_Log_Space: 1275 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec)
3、模拟故障
当在从数据库服务器误操作的情况下执行了一条创建数据库的语句,正好主的数据库服务器也操作了这条语句。
MariaDB [(none)]> create database db2;
则会发生如下报错
MariaDB [(none)]> show status slave\G ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'slave' at line 1 MariaDB [(none)]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.43.100 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mariadb-bin.000002 Read_Master_Log_Pos: 407 Relay_Log_File: mariadb-relay-bin.000002 Relay_Log_Pos: 612 Relay_Master_Log_File: mariadb-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1007 Last_Error: Error 'Can't create database 'db2'; database exists' on query. Default database: 'db2'. Query: 'create database db2' Skip_Counter: 0 Exec_Master_Log_Pos: 326 Relay_Log_Space: 989 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1007 Last_SQL_Error: Error 'Can't create database 'db2'; database exists' on query. Default database: 'db2'. Query: 'create database db2' Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec)
由于做了主从架构,主节点做的操作数据,只会单向的同步给从节点;从节点所做的操作并不会返回给主节点。所以当主节点也执行了从节点一模一样的语句时,则会提示db2数据库文件已存在发生冲突。
故:主从复制服务已经不能够正常运转。
通常 Last_Errno: 1007代码提示,都是指对象已存在。
临时性解决方法:
虽说上面的例子造成的故障,通过查看报错可以快速定位到故障所在。但是在生产环境中,也许不会像我们想象中那么简单,可以暂时选择性忽略;因为有大量的事务在往主服务器读写,如果从服务器不能够及时正常的同步数据,则会造成一定的数据丢失。等后续再去排查故障
mysql> STOP SLAVE; mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; # 从服务器忽略1个主服务器的复制事件 mysql> START SLAVE;
别在该奋斗的年纪,选择安逸。