Handling a MySQL Replication Error: A Case Study
Symptoms
Output of SHOW SLAVE STATUS on the slave
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.10.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000010
Read_Master_Log_Pos: 143861468
Relay_Log_File: slave-relay-bin.000525
Relay_Log_Pos: 128835579
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%,information_schema.%,test.%
Last_Errno: 0
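Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.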
Skip_Counter: 0
Exec_Master_Log_Pos: 128835442
Relay_Log_Space: 143862393
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
1 row in set (0.00 sec)
mysql>
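The fields that matter in this output are Slave_IO_Running (Yes), Slave_SQL_Running (No), and Seconds_Behind_Master (NULL): the slave is still receiving events but has stopped applying them. As a minimal sketch, assuming the vertical output has been captured as text, the check could be automated in Python. The function names parse_slave_status and sql_thread_stopped are made up for illustration and are not part of any MySQL tool.

```python
def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of strings."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator, prompts, and the row-count footer.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only; Last_Error may contain more colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def sql_thread_stopped(status):
    """True when the I/O thread is running but the SQL thread is not."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "No")


# Excerpt of the output shown above.
sample = """\
Slave_IO_Running: Yes
Slave_SQL_Running: No
Relay_Master_Log_File: mysql-bin.000010
Exec_Master_Log_Pos: 128835442
Seconds_Behind_Master: NULL
"""
status = parse_slave_status(sample)
print(sql_thread_stopped(status), status["Exec_Master_Log_Pos"])
```

All values are kept as strings, so a numeric field such as Exec_Master_Log_Pos has to be converted with int() before doing arithmetic on it.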
The MySQL error log reveals the root cause: the disk was full.
thatsit-mysql:/var/lib/mysql # grep -i disk mysqld.log-20160114
/usr/sbin/mysqld: Disk is full writing './mysql-bin.000319' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
/usr/sbin/mysqld: Disk is full writing './mysql-bin.000320' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
thatsit-mysql:/var/lib/mysql #
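Errcode 28 is the Linux ENOSPC error ("No space left on device"). The grep above can be approximated in Python as a rough sketch that also reports how much space is left on a given path. The helper names disk_full_events and free_fraction are hypothetical, and the path passed to free_fraction is an assumption to be replaced with the real datadir.

```python
import re
import shutil

# Matches the message format seen in the error log above.
DISK_FULL = re.compile(r"Disk is full writing '([^']+)' \(Errcode: (\d+)\)")


def disk_full_events(log_text):
    """Return a (file, errcode) pair for every 'Disk is full' message."""
    return [(m.group(1), int(m.group(2))) for m in DISK_FULL.finditer(log_text)]


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


log_sample = (
    "/usr/sbin/mysqld: Disk is full writing './mysql-bin.000319' (Errcode: 28). "
    "Waiting for someone to free space...\n"
    "/usr/sbin/mysqld: Disk is full writing './mysql-bin.000320' (Errcode: 28). "
    "Waiting for someone to free space...\n"
)
events = disk_full_events(log_sample)
print(events)
# "." stands in for the datadir here; on the host above it would be /var/lib/mysql.
print(f"{free_fraction('.'):.1%} free")
```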
The error log also records the position at which binlog execution stopped.
thatsit-mysql:/var/lib/mysql # grep ERROR mysqld.log-20160114|grep stopped|tail -1
160114 13:16:16 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000010' position 128835442
thatsit-mysql:/var/lib/mysql #
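The grep, grep, tail pipeline above keeps only the last "stopped" message. A small Python sketch of the same idea, assuming the log uses exactly the message format shown (the function name last_stop_position is made up for illustration), extracts the file name and position as structured values:

```python
import re

# Matches the tail of the "slave SQL thread aborted" message shown above.
STOPPED = re.compile(r"We stopped at log '([^']+)' position (\d+)")


def last_stop_position(log_text):
    """Return (log_file, position) from the last 'We stopped at' message, or None."""
    matches = STOPPED.findall(log_text)
    if not matches:
        return None
    log_file, position = matches[-1]
    return log_file, int(position)


line = (
    "160114 13:16:16 [ERROR] Error running query, slave SQL thread aborted. "
    "Fix the problem, and restart the slave SQL thread with \"SLAVE START\". "
    "We stopped at log 'mysql-bin.000010' position 128835442"
)
print(last_stop_position(line))
```

The position found here matches Exec_Master_Log_Pos in the SHOW SLAVE STATUS output, which is a useful cross-check before restarting replication from it.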
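Fix
① Free up space on the disk that holds the MySQL datadir.
② Restore replication on the slave (STOP SLAVE → CHANGE MASTER → START SLAVE).
※ The position comes from "Exec_Master_Log_Pos" in the SHOW SLAVE STATUS output above, or from the error log; the matching file name is "Relay_Master_Log_File".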
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql>
mysql> CHANGE MASTER TO
    -> MASTER_HOST='10.10.10.101',
    -> MASTER_PORT=3306,
    -> MASTER_USER='REP_USER',
    -> MASTER_PASSWORD='REP_PASSWORD',
    -> MASTER_LOG_FILE='mysql-bin.000010',
    -> MASTER_LOG_POS=128835442;
Query OK, 0 rows affected (0.41 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql>
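Issuing CHANGE MASTER TO with MASTER_LOG_FILE and MASTER_LOG_POS discards the existing relay logs, so the slave fetches events again from the given master coordinates; this is what gets rid of the unparseable relay log. The statement must use the executed coordinates (Relay_Master_Log_File and Exec_Master_Log_Pos), not the read coordinates. As a hedged sketch, the statement can be assembled from those values in Python; build_change_master and quote are hypothetical helper names, and the credentials are the same placeholders used in the session above.

```python
def quote(value):
    """Quote a string as a MySQL literal (escapes backslashes and single quotes)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_change_master(host, port, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement from the executed master coordinates."""
    if int(log_pos) < 0:
        raise ValueError("log position must not be negative")
    return (
        "CHANGE MASTER TO\n"
        f"  MASTER_HOST={quote(host)},\n"
        f"  MASTER_PORT={int(port)},\n"
        f"  MASTER_USER={quote(user)},\n"
        f"  MASTER_PASSWORD={quote(password)},\n"
        f"  MASTER_LOG_FILE={quote(log_file)},\n"
        f"  MASTER_LOG_POS={int(log_pos)};"
    )


# Relay_Master_Log_File and Exec_Master_Log_Pos from the failed slave.
stmt = build_change_master(
    host="10.10.10.101",
    port=3306,
    user="REP_USER",          # placeholder, as in the session above
    password="REP_PASSWORD",  # placeholder, as in the session above
    log_file="mysql-bin.000010",
    log_pos=128835442,
)
print(stmt)
```

The generated text is meant to be reviewed and then run between STOP SLAVE and START SLAVE, exactly as in the manual session.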
Status check after the fix
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.10.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000010
Read_Master_Log_Pos: 144444705
Relay_Log_File: slave-relay-bin.000002
Relay_Log_Pos: 38259
Relay_Master_Log_File: mysql-bin.000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%,information_schema.%,test.%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 128873466
Relay_Log_Space: 15609498
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 99299
1 row in set (0.00 sec)
mysql>
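Both threads now report Yes, and Seconds_Behind_Master has changed from NULL to a number, so the slave is catching up. The remaining backlog can be estimated from the figures above with a little arithmetic, sketched here in Python. The function name replication_backlog is made up for illustration, and the byte difference is only meaningful while Master_Log_File and Relay_Master_Log_File name the same file, as they do in this output.

```python
from datetime import timedelta


def replication_backlog(read_pos, exec_pos, seconds_behind):
    """Summarise how far the SQL thread trails the I/O thread.

    Assumes Master_Log_File and Relay_Master_Log_File are the same file,
    so the two positions are directly comparable.
    """
    return {
        "bytes_pending": read_pos - exec_pos,
        "lag": str(timedelta(seconds=seconds_behind)),
    }


# Read_Master_Log_Pos, Exec_Master_Log_Pos and Seconds_Behind_Master
# from the output above.
backlog = replication_backlog(144444705, 128873466, 99299)
print(backlog)
```

Roughly 15 MB of binlog events are still waiting to be applied. Re-running SHOW SLAVE STATUS periodically should show Exec_Master_Log_Pos rising and Seconds_Behind_Master falling toward 0.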
With that, the fault is resolved.