MHA: Common Problems and Fixes
1. After passwordless SSH is configured successfully (expected check output)
masterha_check_ssh --conf=/etc/masterha/app1.cnf
Wed Jan 8 18:00:57 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Jan 8 18:00:57 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Jan 8 18:00:57 2020 - [info] Updating application default configuration from /etc/masterha/pm/load_cnf..
Can't exec "/etc/masterha/pm/load_cnf": No such file or directory at /usr/share/perl5/vendor_perl/MHA/Config.pm line 365.
Wed Jan 8 18:00:57 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Jan 8 18:00:57 2020 - [info] Starting SSH connection tests..
Wed Jan 8 18:00:58 2020 - [debug]
Wed Jan 8 18:00:57 2020 - [debug] Connecting via SSH from root@192.168.1.147(192.168.1.147:22) to root@192.168.1.58(192.168.1.58:22)..
Wed Jan 8 18:00:57 2020 - [debug] ok.
Wed Jan 8 18:00:58 2020 - [debug]
Wed Jan 8 18:00:58 2020 - [debug] Connecting via SSH from root@192.168.1.58(192.168.1.58:22) to root@192.168.1.147(192.168.1.147:22)..
Wed Jan 8 18:00:58 2020 - [debug] ok.
Wed Jan 8 18:00:58 2020 - [info] All SSH connection tests passed successfully.
2. Monitoring check problems (masterha_check_repl)
[root@dataexa-ccb-test-58 masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Jan 9 10:36:19 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 9 10:36:19 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Jan 9 10:36:19 2020 - [info] Updating application default configuration from /etc/masterha/pm/load_cnf..
Thu Jan 9 10:36:19 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Jan 9 10:36:19 2020 - [info] MHA::MasterMonitor version 0.58.
Thu Jan 9 10:36:20 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover
Thu Jan 9 10:36:20 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jan 9 10:36:20 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jan 9 10:36:20 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
Cause: app1.cnf does not define the MySQL user, password, and port. Add them:
vi app1.cnf
user=manager
password=123456
port=3306
# system SSH user
ssh_user=root
ssh_port=22
# replication user
repl_user=salve
repl_password=123456
port=3306
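A quick way to catch this before running masterha_check_repl is to scan the config for the keys listed above. The following is a minimal sketch, not part of MHA: the function name missing_mha_keys is made up for illustration, and it assumes each setting is written as key=value on its own line. port is left out of the list on the assumption that MHA falls back to 3306 when it is absent.

```shell
#!/bin/sh
# Hypothetical helper: print every key from this section that is
# missing from the given MHA config file.
missing_mha_keys() (
    conf="$1"
    for key in user password ssh_user repl_user repl_password; do
        # Match the key at the start of a line, followed by "=".
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
            echo "$key"
        fi
    done
)
```

Running missing_mha_keys /etc/masterha/app1.cnf prints nothing when all five keys are present.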
Another failure reported by the same check:
[root@dataexa-ccb-test-58 masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Jan 9 10:54:03 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 9 10:54:03 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Jan 9 10:54:03 2020 - [info] Updating application default configuration from /etc/masterha/pm/load_cnf..
Thu Jan 9 10:54:03 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Jan 9 10:54:03 2020 - [info] MHA::MasterMonitor version 0.58.
Thu Jan 9 10:54:04 2020 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln180] Got MySQL error when connecting 192.168.1.58(192.168.1.58:31061) :1045:Access denied for user 'root'@'master' (using password: YES), but this is not a MySQL crash. Check MySQL server settings.
Thu Jan 9 10:54:04 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297.
Thu Jan 9 10:54:04 2020 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln180] Got MySQL error when connecting 192.168.1.147(192.168.1.147:31061) :1045:Access denied for user 'root'@'master' (using password: YES), but this is not a MySQL crash. Check MySQL server settings.
Thu Jan 9 10:54:04 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297.
Thu Jan 9 10:54:05 2020 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Thu Jan 9 10:54:05 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jan 9 10:54:05 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jan 9 10:54:05 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
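Cause: the MySQL user name or password is wrong. Reconfigure the MySQL account and password so that the credentials in app1.cnf match the accounts on the servers.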
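When several servers are configured, it helps to pull the failing endpoints and MySQL error codes out of the log. The sketch below assumes the "Got MySQL error when connecting" line format shown above; the function name failed_mysql_connections is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read masterha_check_repl output on stdin and print
# "host:port error_code" for each MySQL connection error it contains.
failed_mysql_connections() {
    sed -n 's/.*Got MySQL error when connecting [^(]*(\([^)]*\)) :\([0-9][0-9]*\):.*/\1 \2/p'
}
```

For example: masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | failed_mysql_connections. Error code 1045 is the access-denied case described here.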
Problem 3:
[root@dataexa-ccb-test-58 masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Jan 9 11:06:25 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 9 11:06:25 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Jan 9 11:06:25 2020 - [info] Updating application default configuration from /etc/masterha/pm/load_cnf..
Thu Jan 9 11:06:25 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Jan 9 11:06:25 2020 - [info] MHA::MasterMonitor version 0.58.
Thu Jan 9 11:06:26 2020 - [info] GTID failover mode = 1
Thu Jan 9 11:06:26 2020 - [info] Dead Servers:
Thu Jan 9 11:06:26 2020 - [info] Alive Servers:
Thu Jan 9 11:06:26 2020 - [info] 192.168.1.147(192.168.1.147:31061)
Thu Jan 9 11:06:26 2020 - [info] 192.168.1.58(192.168.1.58:31061)
Thu Jan 9 11:06:26 2020 - [info] Alive Slaves:
Thu Jan 9 11:06:26 2020 - [info] 192.168.1.58(192.168.1.58:31061) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Thu Jan 9 11:06:26 2020 - [info] GTID ON
Thu Jan 9 11:06:26 2020 - [info] Replicating from 192.168.1.147(192.168.1.147:31061)
Thu Jan 9 11:06:26 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Jan 9 11:06:26 2020 - [info] Current Alive Master: 192.168.1.147(192.168.1.147:31061)
Thu Jan 9 11:06:26 2020 - [info] Checking slave configurations..
Thu Jan 9 11:06:26 2020 - [info] Checking replication filtering settings..
Thu Jan 9 11:06:26 2020 - [info] binlog_do_db= , binlog_ignore_db=
Thu Jan 9 11:06:26 2020 - [info] Replication filtering check ok.
Thu Jan 9 11:06:26 2020 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.1.58(192.168.1.58:31061): User salve does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Thu Jan 9 11:06:26 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.
Thu Jan 9 11:06:26 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jan 9 11:06:26 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
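Check the privileges of the replication user:
SELECT host, user, authentication_string, Grant_priv, Super_priv, Repl_slave_priv AS Value FROM mysql.user;
If Value is N for the replication user, change it to Y, then reload the privileges. Also confirm that the user name in repl_user is spelled as intended: the log above reports 'salve', while the change master statement later in this document uses 'slave'.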
FLUSH PRIVILEGES;
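An alternative to editing mysql.user directly is a GRANT statement, which MySQL applies without a separate FLUSH PRIVILEGES. This is a sketch of that alternative, not the method used above: grant_repl_slave_sql is a hypothetical helper, and the account passed to it is assumed to exist already on the server.

```shell
#!/bin/sh
# Hypothetical helper: print a GRANT statement that gives REPLICATION SLAVE
# to one account. $1 is the user name, $2 the host part of the account.
grant_repl_slave_sql() {
    printf "GRANT REPLICATION SLAVE ON *.* TO '%s'@'%s';\n" "$1" "$2"
}
```

The output can be piped into the mysql client on each server, for example: grant_repl_slave_sql slave '192.168.1.%' | mysql -uroot -p. The host pattern here is a placeholder; use the host recorded for the account in mysql.user.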
Problem 4:
[root@dataexa-ccb-test-58 masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Jan 9 12:03:07 2020 - [info] Checking master_ip_failover_script status:
Thu Jan 9 12:03:07 2020 - [info] /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.147 --orig_master_ip=192.168.1.147 --orig_master_port=31061
Gateway: 0.0.0.0 can't reached!!!Thu Jan 9 12:03:09 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln229] Failed to get master_ip_failover_script status with return code 10:0.
Thu Jan 9 12:03:09 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Thu Jan 9 12:03:09 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jan 9 12:03:09 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
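Cause: the gateway is set to 0.0.0.0 (the message is printed by the /etc/masterha/master_ip_failover script). Change it to the actual gateway address of your network.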
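If the real gateway address is not known offhand, it can be read from the routing table. The sketch below assumes a Linux host with the iproute2 ip command; default_gateway is a hypothetical helper that only parses text, so the routing table is passed in on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read "ip route" output on stdin and print the
# address of the first default gateway.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}
```

Usage: ip route show | default_gateway. The printed address is the value to put in place of 0.0.0.0.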
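3. Master/slave out-of-sync problems
Replication can fall out of sync when the databases on the master and the slave differ, so always confirm that the data is consistent.
Another possible cause is binlog files that were deleted manually.
Fix 1: reinitialize
Fix 2: stop the slave and resynchronize
On the master: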
flush logs;
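show master status\G
On the slave (for master_log_file and master_log_pos, use the File and Position values reported by show master status on the master):
stop slave;
change master to master_host='192.168.1.147', master_port=31061, master_user='slave', master_password='Aipf@123456', master_log_file='master-bin.0000004', master_log_pos=346;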
start slave;
show slave status\G
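Copying the file name and position by hand is error-prone, so the statement for Fix 2 can be generated from the master's status output instead. This is a minimal sketch: change_master_sql is a hypothetical helper, it assumes the vertical (\G) output format with File: and Position: fields, and the connection values are passed in as arguments.

```shell
#!/bin/sh
# Hypothetical helper: read "show master status\G" output on stdin and print
# a matching CHANGE MASTER TO statement.
# Arguments: master host, master port, replication user, replication password.
change_master_sql() {
    awk -v host="$1" -v port="$2" -v user="$3" -v pass="$4" '
        $1 == "File:"     { file = $2 }
        $1 == "Position:" { pos = $2 }
        END {
            # Fail if the coordinates were not found in the input.
            if (file == "" || pos == "") exit 1
            # \047 is a single quote.
            printf "CHANGE MASTER TO MASTER_HOST=\047%s\047, MASTER_PORT=%s, MASTER_USER=\047%s\047, MASTER_PASSWORD=\047%s\047, MASTER_LOG_FILE=\047%s\047, MASTER_LOG_POS=%s;\n", host, port, user, pass, file, pos
        }'
}
```

For example: mysql -h192.168.1.147 -P31061 -uroot -p -e 'show master status\G' | change_master_sql 192.168.1.147 31061 slave 'Aipf@123456'. Review the printed statement before running it on the slave between stop slave and start slave. Passwords containing backslashes would need extra care, because awk interprets escape sequences in -v values.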
Fix 3:
stop slave;
reset slave;
start slave;
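After any of these fixes, the result can be confirmed by checking that both replication threads are running. The sketch below assumes the show slave status\G field names Slave_IO_Running and Slave_SQL_Running as printed by MySQL 5.7; slave_threads_running is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read "show slave status\G" output on stdin and
# succeed only if both the IO thread and the SQL thread report "Yes".
slave_threads_running() {
    awk '
        $1 == "Slave_IO_Running:"  { io = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}
```

Usage: mysql -uroot -p -e 'show slave status\G' | slave_threads_running && echo "replication threads OK". A non-zero exit status means at least one thread is stopped or still connecting, and Last_IO_Error or Last_SQL_Error in the same output usually says why.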
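If replication is out of sync because the data itself is inconsistent:
On the master, lock the tables, back up the entire database, and import the backup into the slave. Keep the locking session open until the backup finishes, because the lock is released when that session ends.
flush tables with read lock;
On the slave: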
stop slave;
source sqlfile;
change master to master_auto_position=0;
reset slave;
start slave;
4. Database backup
Back up all databases:
mysqldump -uroot -pAipf@123 -A --master-data | gzip > ./all.sql.gz
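Before relying on the dump to rebuild a slave, it is worth checking that the archive is intact and that --master-data actually recorded the binlog coordinates. A minimal sketch follows; check_dump is a hypothetical helper, and it assumes the dump contains a CHANGE MASTER TO line, which mysqldump writes when --master-data is enabled.

```shell
#!/bin/sh
# Hypothetical helper: verify a gzip-compressed dump and print the first
# CHANGE MASTER TO line recorded in it.
check_dump() {
    # Fail early if the archive is corrupt or truncated.
    gzip -t "$1" || return 1
    # Print the first matching line, then stop reading.
    gzip -dc "$1" | sed -n '/CHANGE MASTER TO/{p;q;}'
}
```

Usage: check_dump ./all.sql.gz. If nothing is printed, the dump carries no coordinates and the slave position has to be obtained another way.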
Alternative backup tool: XtraBackup from Percona. Package:
percona-xtrabackup-24-2.4.11-1.el7.x86_64.rpm