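Notes on Setting Up a MySQL MHA Architecture on CentOS 6.8
Please credit the source when reposting. Original URL: http://www.cnblogs.com/ajiangg/p/6552855.html
The following are my notes on setting up a MySQL MHA architecture on CentOS 6.8.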
IP address plan:
192.168.206.139 master
192.168.206.140 slave01 (standby master)
192.168.206.141 slave02
192.168.206.142 manager
192.168.206.145 VIP
I. Preparation:
1. Disable SELinux
[root@localhost ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
[root@localhost ~]# reboot
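The same edit can be made non-interactively. The following is a minimal sketch, assuming the file contains the standard `SELINUX=` line shown above; the function name `disable_selinux` is hypothetical, and the path is passed as an argument so the change can be tried on a copy first.

```shell
# Hypothetical helper: rewrite the SELINUX= line of the given config file
# to "disabled". It goes through a temp file instead of relying on sed -i,
# and writes back with cat so the original file's permissions are kept.
disable_selinux() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # ^SELINUX= does not match the commented lines or SELINUXTYPE=
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage (as root), followed by a reboot as in the step above:
#   disable_selinux /etc/selinux/config
```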
2. Disable the firewall
[root@localhost ~]# chkconfig iptables off
[root@localhost ~]# service iptables stop
3. Install the EPEL yum repository
[root@localhost tmp]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@localhost tmp]# rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
4. Install the Perl modules required by MHA node (DBD::mysql)
[root@localhost tmp]# yum -y install perl-DBD-MySQL ncftp perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-devel
II. Setting up replication
1. Create the replication user (on master and slave01)
mysql> GRANT REPLICATION SLAVE ON *.* TO 'u_repl'@'192.168.206.%' IDENTIFIED BY 'replpass' with grant option;
2. Create the user that MHA uses for administration
mysql> GRANT ALL PRIVILEGES ON *.* to 'root'@'192.168.206.%' IDENTIFIED BY 'checkpass';
3. Install the semi-synchronous replication plugins
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
4. Edit the my.cnf configuration file
[root@localhost tmp]# vi /etc/my.cnf
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
server_id=139
log_bin=/var/lib/mysql/mysql-bin
binlog-format=ROW
binlog-checksum=CRC32
log_slave_updates=true
gtid-mode=on
enforce-gtid-consistency=true
master_info_repository=TABLE
sync-master-info=1
slave-parallel-workers=2
master-verify-checksum=1
slave-sql-verify-checksum=1
relay_log=/var/lib/mysql/localhost-relay-bin
relay_log_purge=0
relay_log_recovery=1
rpl_semi_sync_master_enabled=ON
rpl_semi_sync_slave_enabled=ON
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
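The file above is the master's copy (server_id=139). Every server in a replication topology needs a distinct server_id, so the copies on the slaves must use different values. One possible convention, assumed here only because 139 matches the last octet of the master's IP address, is to derive the ID from the address. `server_id_from_ip` is a hypothetical helper, not something shipped with MySQL or MHA.

```shell
# Hypothetical helper: print the last dot-separated field of an IPv4
# address, for use as that host's server_id.
server_id_from_ip() {
    echo "${1##*.}"
}

server_id_from_ip 192.168.206.140   # prints 140
```

With this convention slave01 would get server_id=140 and slave02 server_id=141; any other scheme works as long as the values are unique.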
5. Back up the master and restore the backup onto the slaves (omitted), then look up the master's binlog position
[root@localhost tmp]# mysql -uroot -p
mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 154
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
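6. Run the following on each slave to configure master-slave replication. (With GTID enabled, MySQL 5.7 can use the master_auto_position=1 option directly instead of a binlog file and position.)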
mysql> change master to master_host='192.168.206.139',master_user='u_repl',master_password='replpass',master_log_file='mysql-bin.000001' ,master_log_pos=154;
mysql> start slave;
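Because gtid-mode=on is set in my.cnf, the slaves could use GTID auto-positioning instead of a binlog file and position, as noted in step 6. The following is a sketch under that assumption; `change_master_sql` is a hypothetical helper that only prints the statement so it can be reviewed before being piped to mysql.

```shell
# Hypothetical helper: print a CHANGE MASTER TO statement that uses
# GTID auto-positioning.
# $1 = master host, $2 = replication user, $3 = replication password
change_master_sql() {
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_AUTO_POSITION=1;\n" "$1" "$2" "$3"
}

change_master_sql 192.168.206.139 u_repl replpass

# To apply it on a slave (mysql prompts for the root password):
#   { change_master_sql 192.168.206.139 u_repl replpass; echo "START SLAVE;"; } | mysql -uroot -p
```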
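III. Installation and configuration
1. Use ssh-keygen on each node so that the machines can log in to one another over SSH without a password
Memo: If you set this up by editing /root/.ssh/authorized_keys directly, the server where the manager is installed must also have its own key added; otherwise the masterha_check_ssh verification will not pass.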
[root@localhost tmp]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
d5:0d:37:75:67:54:47:f0:e4:72:34:26:29:0d:1b:ba root@localhost.localdomain
The key's randomart image is:
[RSA 2048 randomart image omitted]
[root@localhost .ssh]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.206.140
root@192.168.206.140's password:
Now try logging into the machine, with "ssh 'root@192.168.206.140'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@localhost .ssh]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.206.141
[root@localhost .ssh]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.206.142
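The ssh-copy-id commands above are run from a single node. For mutual trust they have to be repeated on every node against all of the others. The following is a sketch of that loop, assuming the four addresses from the IP plan; `peers_of` is a hypothetical helper.

```shell
# Node addresses taken from the IP plan at the top of these notes.
NODES="192.168.206.139 192.168.206.140 192.168.206.141 192.168.206.142"

# Hypothetical helper: print every node address except the one given as $1.
peers_of() {
    for ip in $NODES; do
        [ "$ip" = "$1" ] || echo "$ip"
    done
}

# On 192.168.206.139, for example (each call prompts for a password once):
#   for ip in $(peers_of 192.168.206.139); do
#       ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$ip"
#   done
```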
2. Download and install mha4mysql-node-0.56 on all nodes (including the manager)
Because Google is blocked, I downloaded the rpm file directly from elsewhere on the web; the download steps are omitted.
[root@localhost tmp]# yum localinstall mha4mysql-node-0.56-0.el6.noarch.rpm
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Local Package Process
Examining mha4mysql-node-0.56-0.el6.noarch.rpm: mha4mysql-node-0.56-0.el6.noarch
Marking mha4mysql-node-0.56-0.el6.noarch.rpm to be installed
Loading mirror speeds from cached hostfile
 * base: ftp.sjtu.edu.cn
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: centos.ustc.edu.cn
 * updates: ftp.sjtu.edu.cn
Resolving Dependencies
--> Running transaction check
---> Package mha4mysql-node.noarch 0:0.56-0.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package          Arch     Version      Repository                          Size
================================================================================
Installing:
 mha4mysql-node   noarch   0.56-0.el6   /mha4mysql-node-0.56-0.el6.noarch  102 k
Transaction Summary
================================================================================
Install       1 Package(s)
Total size: 102 k
Installed size: 102 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : mha4mysql-node-0.56-0.el6.noarch   1/1
  Verifying  : mha4mysql-node-0.56-0.el6.noarch   1/1
Installed:
  mha4mysql-node.noarch 0:0.56-0.el6
Complete!
Verify the installed files
[root@localhost tmp]# rpm -ql mha4mysql-node
/usr/bin/apply_diff_relay_logs
/usr/bin/filter_mysqlbinlog
/usr/bin/purge_relay_logs
/usr/bin/save_binary_logs
/usr/share/man/man1/apply_diff_relay_logs.1.gz
/usr/share/man/man1/filter_mysqlbinlog.1.gz
/usr/share/man/man1/purge_relay_logs.1.gz
/usr/share/man/man1/save_binary_logs.1.gz
/usr/share/perl5/vendor_perl/MHA/BinlogHeaderParser.pm
/usr/share/perl5/vendor_perl/MHA/BinlogManager.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFindManager.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinder.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinderElp.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinderXid.pm
/usr/share/perl5/vendor_perl/MHA/NodeConst.pm
/usr/share/perl5/vendor_perl/MHA/NodeUtil.pm
/usr/share/perl5/vendor_perl/MHA/SlaveUtil.pm
3. Install mha4mysql-manager-0.56 on the manager node
[root@localhost tmp]# yum localinstall mha4mysql-manager-0.56-0.el6.noarch.rpm
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Local Package Process
Examining mha4mysql-manager-0.56-0.el6.noarch.rpm: mha4mysql-manager-0.56-0.el6.noarch
Marking mha4mysql-manager-0.56-0.el6.noarch.rpm to be installed
Loading mirror speeds from cached hostfile
 * base: mirrors.163.com
 * epel: mirror01.idc.hinet.net
 * extras: mirrors.cn99.com
 * updates: mirrors.cn99.com
Resolving Dependencies
--> Running transaction check
---> Package mha4mysql-manager.noarch 0:0.56-0.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package             Arch     Version      Repository                             Size
================================================================================
Installing:
 mha4mysql-manager   noarch   0.56-0.el6   /mha4mysql-manager-0.56-0.el6.noarch  325 k
Transaction Summary
================================================================================
Install       1 Package(s)
Total size: 325 k
Installed size: 325 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : mha4mysql-manager-0.56-0.el6.noarch   1/1
  Verifying  : mha4mysql-manager-0.56-0.el6.noarch   1/1
Installed:
  mha4mysql-manager.noarch 0:0.56-0.el6
Complete!
Verify the installed files
[root@localhost tmp]# rpm -ql mha4mysql-manager
/usr/bin/masterha_check_repl
/usr/bin/masterha_check_ssh
/usr/bin/masterha_check_status
/usr/bin/masterha_conf_host
/usr/bin/masterha_manager
/usr/bin/masterha_master_monitor
/usr/bin/masterha_master_switch
/usr/bin/masterha_secondary_check
/usr/bin/masterha_stop
/usr/share/man/man1/masterha_check_repl.1.gz
/usr/share/man/man1/masterha_check_ssh.1.gz
/usr/share/man/man1/masterha_check_status.1.gz
/usr/share/man/man1/masterha_conf_host.1.gz
/usr/share/man/man1/masterha_manager.1.gz
/usr/share/man/man1/masterha_master_monitor.1.gz
/usr/share/man/man1/masterha_master_switch.1.gz
/usr/share/man/man1/masterha_secondary_check.1.gz
/usr/share/man/man1/masterha_stop.1.gz
/usr/share/perl5/vendor_perl/MHA/Config.pm
/usr/share/perl5/vendor_perl/MHA/DBHelper.pm
/usr/share/perl5/vendor_perl/MHA/FileStatus.pm
/usr/share/perl5/vendor_perl/MHA/HealthCheck.pm
/usr/share/perl5/vendor_perl/MHA/ManagerAdmin.pm
/usr/share/perl5/vendor_perl/MHA/ManagerAdminWrapper.pm
/usr/share/perl5/vendor_perl/MHA/ManagerConst.pm
/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm
/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm
/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm
/usr/share/perl5/vendor_perl/MHA/MasterRotate.pm
/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm
/usr/share/perl5/vendor_perl/MHA/Server.pm
/usr/share/perl5/vendor_perl/MHA/ServerManager.pm
4. Set up the MHA configuration files on the manager node
Because the packages were installed with yum, there is no samples folder, so I separately downloaded the mha4mysql-manager-0.56.tar.gz package (steps omitted).
[root@localhost tmp]# tar xf mha4mysql-manager-0.56.tar.gz
[root@localhost tmp]# cd mha4mysql-manager-0.56
[root@localhost mha4mysql-manager-0.56]# mkdir -p /etc/mha/{app1,scripts}
[root@localhost mha4mysql-manager-0.56]# cp -r samples/conf/* /etc/mha/
[root@localhost mha4mysql-manager-0.56]# cp -r samples/scripts/* /etc/mha/scripts/
[root@localhost mha4mysql-manager-0.56]# mv /etc/mha/app1.cnf /etc/mha/app1/
[root@localhost mha4mysql-manager-0.56]# mv /etc/mha/masterha_default.cnf /etc/masterha_default.cnf
5. Set up the global configuration on the manager
Edit the master_ip_failover script, adding the three parameters used for VIP failover (virtual_ip, orig_master_vip_eth, new_master_vip_eth)
[root@localhost mha4mysql-manager-0.56]# vi /etc/mha/scripts/master_ip_failover
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
my (
  $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port,
  $new_master_host, $new_master_ip, $new_master_port, $new_master_user,
  $new_master_password, $virtual_ip, $orig_master_vip_eth, $new_master_vip_eth
);
GetOptions(
  'command=s' => \$command,
  'ssh_user=s' => \$ssh_user,
  'orig_master_host=s' => \$orig_master_host,
  'orig_master_ip=s' => \$orig_master_ip,
  'orig_master_port=i' => \$orig_master_port,
  'new_master_host=s' => \$new_master_host,
  'new_master_ip=s' => \$new_master_ip,
  'new_master_port=i' => \$new_master_port,
  'new_master_user=s' => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
  'virtual_ip=s' => \$virtual_ip,
  'orig_master_vip_eth=s' => \$orig_master_vip_eth,
  'new_master_vip_eth=s' => \$new_master_vip_eth
);
exit &main();
sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {
    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      # updating global catalog, etc
      `ssh $orig_master_host -o "ConnectTimeout=5" '/sbin/ifconfig $orig_master_vip_eth down'`;
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 );
      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();
      ## Creating an app user on the new master
      print "Creating app user on the new master..\n";
      # FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();
      ## Update master ip on the catalog database, etc
      # FIXME_xxx;
      my $real_eth=$new_master_vip_eth;
      $real_eth=~ s/:.*//g;
      my $new_master_network= `ssh $new_master_host '/sbin/ifconfig $real_eth'`;
      my $vip_netmask = join(".", $new_master_network =~ /Mask\:(\d+)\.(\d+)\.(\d+)\.(\d+)/ );
      my $vip_broadcast = join(".", $new_master_network =~ /Bcast\:(\d+)\.(\d+)\.(\d+)\.(\d+)/ );
      `ssh $new_master_host -o "ConnectTimeout=15" '/sbin/ifconfig $new_master_vip_eth $virtual_ip netmask $vip_netmask up; /sbin/arping -c 3 -A $virtual_ip; ping -I $virtual_ip -b $vip_broadcast -c 3 -c 3 -i 0.01 -q -W 10 > /dev/null'`;
      $exit_code = 0;
    };
    if ($@) {
      warn $@;
      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}
sub usage {
  print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port --virtual_ip=ip --orig_master_vip_eth=eth --new_master_vip_eth=eth\n";
}
Edit the master_ip_online_change script, adding the same three parameters used for VIP failover (virtual_ip, orig_master_vip_eth, new_master_vip_eth)
[root@localhost scripts]# vi master_ip_online_change
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;
my $_tstart;
my $_running_interval = 0.1;
my (
  $command, $orig_master_is_new_slave, $orig_master_host, $orig_master_ip,
  $orig_master_port, $orig_master_user, $orig_master_password, $orig_master_ssh_user,
  $new_master_host, $new_master_ip, $new_master_port, $new_master_user,
  $new_master_password, $new_master_ssh_user, $virtual_ip, $orig_master_vip_eth,
  $new_master_vip_eth
);
GetOptions(
  'command=s' => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s' => \$orig_master_host,
  'orig_master_ip=s' => \$orig_master_ip,
  'orig_master_port=i' => \$orig_master_port,
  'orig_master_user=s' => \$orig_master_user,
  'orig_master_password=s' => \$orig_master_password,
  'orig_master_ssh_user=s' => \$orig_master_ssh_user,
  'new_master_host=s' => \$new_master_host,
  'new_master_ip=s' => \$new_master_ip,
  'new_master_port=i' => \$new_master_port,
  'new_master_user=s' => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
  'new_master_ssh_user=s' => \$new_master_ssh_user,
  'virtual_ip=s' => \$virtual_ip,
  'orig_master_vip_eth=s' => \$orig_master_vip_eth,
  'new_master_vip_eth=s' => \$new_master_vip_eth
);
exit &main();
sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}
sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}
sub get_threads_util {
  my $dbh = shift;
  my $my_connection_id = shift;
  my $running_time_threshold = shift;
  my $type = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type = 0 unless ($type);
  my @threads;
  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();
  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id = $ref->{Id};
    my $user = $ref->{User};
    my $host = $ref->{Host};
    my $command = $ref->{Command};
    my $state = $ref->{State};
    my $query_time = $ref->{Time};
    my $info = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command) && $command eq "Binlog Dump" );
    next if ( defined($user) && $user eq "system user" );
    next if ( defined($command) && $command eq "Sleep" && defined($query_time) && $query_time >= 1 );
    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }
    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }
    push @threads, $ref;
  }
  return @threads;
}
sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();
      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port, $orig_master_user, $orig_master_password, 1 );
      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();
      print current_time_us() . " Drpping app user on the orig master..\n";
      # FIXME_xxx_drop_app_user($orig_master_handler);
      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf "%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n", current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n" foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} );
      }
      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf "%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n", current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n" foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh}, $orig_master_handler->{connection_id} );
      }
      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();
      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database
    # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
    # If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port, $new_master_user, $new_master_password, 1 );
      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();
      ## Creating an app user on the new master
      print current_time_us() . " Creating app user on the new master..\n";
      # FIXME_xxx_create_app_user($new_master_handler);
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();
      ## Update master ip on the catalog database, etc
      `ssh $orig_master_host -o "ConnectTimeout=5" '/sbin/ifconfig $orig_master_vip_eth down'`;
      my $real_eth=$new_master_vip_eth;
      $real_eth=~ s/:.*//g;
      my $new_master_network= `ssh $new_master_host '/sbin/ifconfig $real_eth'`;
      my $vip_netmask = join(".", $new_master_network =~ /Mask\:(\d+)\.(\d+)\.(\d+)\.(\d+)/ );
      my $vip_broadcast = join(".", $new_master_network =~ /Bcast\:(\d+)\.(\d+)\.(\d+)\.(\d+)/ );
      `ssh $new_master_host -o "ConnectTimeout=15" '/sbin/ifconfig $new_master_vip_eth $virtual_ip netmask $vip_netmask up; /sbin/arping -c 3 -A $virtual_ip; ping -I $virtual_ip -b $vip_broadcast -c 3 -c 3 -i 0.01 -q -W 10 > /dev/null'`;
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}
sub usage {
  print "Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port --virtual_ip=ip --orig_master_vip_eth=eth --new_master_vip_eth=eth\n";
  die;
}
Edit the global configuration in masterha_default.cnf
[root@localhost mha4mysql-manager-0.56]# vi /etc/masterha_default.cnf
[server default]
user=root
password=checkpass
ssh_user=root
repl_user=u_repl
repl_password=replpass
master_binlog_dir= /var/lib/mysql,/var/log/mysql
remote_workdir=/data/log/masterha
secondary_check_script= masterha_secondary_check -s 192.168.206.140 -s 192.168.206.139 --user=root --master_host=192.168.206.139 --master_port=3306
ping_interval=3
master_ip_failover_script= /etc/mha/scripts/master_ip_failover --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234
# shutdown_script= /etc/mha/scripts/power_manager
# report_script= /etc/mha/scripts/send_report
master_ip_online_change_script= /etc/mha/scripts/master_ip_online_change --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234
Create the log directory:
[root@localhost mha4mysql-manager-0.56]# mkdir -p /var/log/mha/app1
Configure app1.cnf:
[root@localhost mha4mysql-manager-0.56]# vi /etc/mha/app1/app1.cnf
[server default]
manager_workdir=/var/log/mha/app1
manager_log=/var/log/mha/app1/manager.log
[server1]
hostname=192.168.206.139
master_binlog_dir=/var/lib/mysql/
candidate_master=1
[server2]
hostname=192.168.206.140
master_binlog_dir=/var/lib/mysql/
candidate_master=1
[server3]
hostname=192.168.206.141
master_binlog_dir=/var/lib/mysql/
no_master=1
6. Verify the installation
Verify SSH trust
[root@localhost mha4mysql-manager-0.56]# masterha_check_ssh --conf=/etc/mha/app1/app1.cnf
Thu Mar 16 03:53:28 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Mar 16 03:53:28 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf..
Thu Mar 16 03:53:28 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf..
Thu Mar 16 03:53:28 2017 - [info] Starting SSH connection tests..
Thu Mar 16 03:53:33 2017 - [debug]
Thu Mar 16 03:53:28 2017 - [debug]  Connecting via SSH from root@192.168.206.140(192.168.206.140:22) to root@192.168.206.139(192.168.206.139:22)..
Thu Mar 16 03:53:32 2017 - [debug]   ok.
Thu Mar 16 03:53:32 2017 - [debug]  Connecting via SSH from root@192.168.206.140(192.168.206.140:22) to root@192.168.206.141(192.168.206.141:22)..
Thu Mar 16 03:53:33 2017 - [debug]   ok.
Thu Mar 16 03:53:33 2017 - [debug]
Thu Mar 16 03:53:28 2017 - [debug]  Connecting via SSH from root@192.168.206.139(192.168.206.139:22) to root@192.168.206.140(192.168.206.140:22)..
Thu Mar 16 03:53:32 2017 - [debug]   ok.
Thu Mar 16 03:53:32 2017 - [debug]  Connecting via SSH from root@192.168.206.139(192.168.206.139:22) to root@192.168.206.141(192.168.206.141:22)..
Thu Mar 16 03:53:33 2017 - [debug]   ok.
Thu Mar 16 03:53:33 2017 - [debug]
Thu Mar 16 03:53:29 2017 - [debug]  Connecting via SSH from root@192.168.206.141(192.168.206.141:22) to root@192.168.206.139(192.168.206.139:22)..
Thu Mar 16 03:53:33 2017 - [debug]   ok.
Thu Mar 16 03:53:33 2017 - [debug]  Connecting via SSH from root@192.168.206.141(192.168.206.141:22) to root@192.168.206.140(192.168.206.140:22)..
Thu Mar 16 03:53:33 2017 - [debug]   ok.
Thu Mar 16 03:53:33 2017 - [info] All SSH connection tests passed successfully.
Verify master-slave replication
[root@localhost mha4mysql-manager-0.56]# masterha_check_repl --conf=/etc/mha/app1/app1.cnf
Thu Mar 16 03:56:02 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Mar 16 03:56:02 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf..
Thu Mar 16 03:56:02 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf..
Thu Mar 16 03:56:02 2017 - [info] MHA::MasterMonitor version 0.56.
Thu Mar 16 03:56:03 2017 - [info] GTID failover mode = 1
Thu Mar 16 03:56:03 2017 - [info] Dead Servers:
Thu Mar 16 03:56:03 2017 - [info] Alive Servers:
Thu Mar 16 03:56:03 2017 - [info]   192.168.206.139(192.168.206.139:3306)
Thu Mar 16 03:56:03 2017 - [info]   192.168.206.140(192.168.206.140:3306)
Thu Mar 16 03:56:03 2017 - [info]   192.168.206.141(192.168.206.141:3306)
Thu Mar 16 03:56:03 2017 - [info] Alive Slaves:
Thu Mar 16 03:56:03 2017 - [info]   192.168.206.140(192.168.206.140:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Mar 16 03:56:03 2017 - [info]     GTID ON
Thu Mar 16 03:56:03 2017 - [info]     Replicating from 192.168.206.139(192.168.206.139:3306)
Thu Mar 16 03:56:03 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Mar 16 03:56:03 2017 - [info]   192.168.206.141(192.168.206.141:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Mar 16 03:56:03 2017 - [info]     GTID ON
Thu Mar 16 03:56:03 2017 - [info]     Replicating from 192.168.206.139(192.168.206.139:3306)
Thu Mar 16 03:56:03 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Mar 16 03:56:03 2017 - [info] Current Alive Master: 192.168.206.139(192.168.206.139:3306)
Thu Mar 16 03:56:03 2017 - [info] Checking slave configurations..
Thu Mar 16 03:56:03 2017 - [info]  read_only=1 is not set on slave 192.168.206.140(192.168.206.140:3306).
Thu Mar 16 03:56:03 2017 - [info]  read_only=1 is not set on slave 192.168.206.141(192.168.206.141:3306).
Thu Mar 16 03:56:03 2017 - [info] Checking replication filtering settings..
Thu Mar 16 03:56:03 2017 - [info]  binlog_do_db= , binlog_ignore_db=
Thu Mar 16 03:56:03 2017 - [info]  Replication filtering check ok.
Thu Mar 16 03:56:03 2017 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu Mar 16 03:56:03 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Mar 16 03:56:08 2017 - [warning] HealthCheck: Got timeout on checking SSH connection to 192.168.206.139! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 342.
Thu Mar 16 03:56:08 2017 - [info]
192.168.206.139(192.168.206.139:3306) (current master)
 +--192.168.206.140(192.168.206.140:3306)
 +--192.168.206.141(192.168.206.141:3306)
Thu Mar 16 03:56:08 2017 - [info] Checking replication health on 192.168.206.140..
Thu Mar 16 03:56:08 2017 - [info]  ok.
Thu Mar 16 03:56:08 2017 - [info] Checking replication health on 192.168.206.141..
Thu Mar 16 03:56:08 2017 - [info]  ok.
Thu Mar 16 03:56:08 2017 - [info] Checking master_ip_failover_script status:
Thu Mar 16 03:56:08 2017 - [info]   /etc/mha/scripts/master_ip_failover --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=status --ssh_user=root --orig_master_host=192.168.206.139 --orig_master_ip=192.168.206.139 --orig_master_port=3306
Thu Mar 16 03:56:08 2017 - [info]  OK.
Thu Mar 16 03:56:08 2017 - [warning] shutdown_script is not defined.
Thu Mar 16 03:56:08 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
IV. Start MHA and test MHA failover
Start MHA:
[root@localhost mha4mysql-manager-0.56]# masterha_manager --conf=/etc/mha/app1/app1.cnf
Thu Mar 16 03:59:39 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Mar 16 03:59:39 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf..
Thu Mar 16 03:59:39 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf..
Alternatively, start it in the background:
[root@localhost mha4mysql-manager-0.56]# nohup masterha_manager --conf=/etc/mha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[3] 5977
[2] Exit 1 nohup masterha_manager --conf=/etc/mha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1
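A background start returns immediately, so the shell does not show whether the manager actually came up (the "Exit 1" above is an earlier background job that had already died). A minimal sketch of a polling helper is shown here. The function name wait_for_mha and its arguments are assumptions for illustration; it relies only on masterha_check_status printing PING_OK when the manager is healthy, as in the status output in this article.

```shell
# Hypothetical helper (not part of MHA): poll until the manager reports PING_OK.
#   $1  path to the application config
#   $2  number of attempts (default 10)
#   $3  seconds to sleep between attempts (default 1)
# Returns 0 as soon as PING_OK is seen, 1 if every attempt fails.
wait_for_mha() {
  conf=$1
  tries=${2:-10}
  interval=${3:-1}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if masterha_check_status --conf="$conf" 2>/dev/null | grep -q 'PING_OK'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage, right after the nohup line above:
#   wait_for_mha /etc/mha/app1/app1.cnf 10 1 || tail /var/log/mha/app1/manager.log
```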
Configure the VIP on the master:
[root@localhost mysql]# /sbin/ifconfig eth0:234 192.168.206.145 netmask 255.255.255.0 up
Verify the VIP:
[root@localhost mysql]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:81:A4:6E
          inet addr:192.168.206.139  Bcast:192.168.206.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe81:a46e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:23074 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11235 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1628296 (1.5 MiB)  TX bytes:1043306 (1018.8 KiB)

eth0:234  Link encap:Ethernet  HWaddr 00:0C:29:81:A4:6E
          inet addr:192.168.206.145  Bcast:192.168.206.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:240 (240.0 b)  TX bytes:240 (240.0 b)
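Reading the ifconfig output by eye works for one host, but after each switchover the same check has to be repeated on the old and the new master. A minimal sketch of a scripted check follows. The function name has_vip is made up for illustration, and the match assumes the CentOS 6 style "inet addr:" output shown above (newer net-tools versions print a different format).

```shell
# Hypothetical helper: succeed if the given IP is bound on this host.
# Reads ifconfig output on stdin; $1 is the address to look for.
# The trailing space keeps 192.168.206.14 from matching 192.168.206.145,
# and -F treats the dots literally.
has_vip() {
  grep -qF "inet addr:$1 "
}

# Usage:
#   ifconfig | has_vip 192.168.206.145 && echo "VIP is on this host"
```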
Check that MHA is running:
[root@localhost ~]# masterha_check_status --conf=/etc/mha/app1/app1.cnf
app1 (pid:5140) is running(0:PING_OK), master:192.168.206.139
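The status line also names the server that MHA currently treats as master, which is handy when scripting around switchovers. The sketch below extracts that address. The function name parse_mha_master is an assumption for illustration, and the pattern is derived only from the single status line shown above, so other states may print a different format and simply yield no output.

```shell
# Hypothetical helper: print the master IP from a masterha_check_status line.
#   $1  one line of status output
# Prints nothing if the line does not have the "is running(...), master:" form.
parse_mha_master() {
  printf '%s\n' "$1" |
    sed -n 's/.*is running(.*), master:\([0-9.]*\).*/\1/p'
}

# Usage:
#   parse_mha_master "$(masterha_check_status --conf=/etc/mha/app1/app1.cnf)"
```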
Check the process:
[root@localhost scripts]# ps aux|grep mha
root 6731 1.8 0.9 194240 18724 pts/1 S 10:16 0:00 perl /usr/bin/masterha_manager --conf=/etc/mha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover
root 6750 0.0 0.0 103316 844 pts/1 S+ 10:16 0:00 grep mha
Stop MHA Manager monitoring:
[root@localhost mha4mysql-manager-0.56]# masterha_stop --conf=/etc/mha/app1/app1.cnf
Stopped app1 successfully.
[3]+ Exit 1 nohup masterha_manager --conf=/etc/mha/app1/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1
Manual switchover with the master alive, interactive mode:
[root@localhost scripts]# masterha_master_switch --master_state=alive --conf=/etc/mha/app1/app1.cnf --new_master_host=192.168.206.140 Thu Mar 16 07:57:17 2017 - [info] MHA::MasterRotate version 0.56. Thu Mar 16 07:57:17 2017 - [info] Starting online master switch.. Thu Mar 16 07:57:17 2017 - [info] Thu Mar 16 07:57:17 2017 - [info] * Phase 1: Configuration Check Phase.. Thu Mar 16 07:57:17 2017 - [info] Thu Mar 16 07:57:17 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Thu Mar 16 07:57:17 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 07:57:17 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 07:57:17 2017 - [info] GTID failover mode = 1 Thu Mar 16 07:57:17 2017 - [info] Current Alive Master: 192.168.206.139(192.168.206.139:3306) Thu Mar 16 07:57:17 2017 - [info] Alive Slaves: Thu Mar 16 07:57:17 2017 - [info] 192.168.206.140(192.168.206.140:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 07:57:17 2017 - [info] GTID ON Thu Mar 16 07:57:17 2017 - [info] Replicating from 192.168.206.139(192.168.206.139:3306) Thu Mar 16 07:57:17 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Mar 16 07:57:17 2017 - [info] 192.168.206.141(192.168.206.141:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 07:57:17 2017 - [info] GTID ON Thu Mar 16 07:57:17 2017 - [info] Replicating from 192.168.206.139(192.168.206.139:3306) Thu Mar 16 07:57:17 2017 - [info] Not candidate for the new Master (no_master is set) It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.206.139(192.168.206.139:3306)? (YES/no): yes Thu Mar 16 07:57:19 2017 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time.. Thu Mar 16 07:57:19 2017 - [info] ok. 
Thu Mar 16 07:57:19 2017 - [info] Checking MHA is not monitoring or doing failover.. Thu Mar 16 07:57:19 2017 - [info] Checking replication health on 192.168.206.140.. Thu Mar 16 07:57:19 2017 - [info] ok. Thu Mar 16 07:57:19 2017 - [info] Checking replication health on 192.168.206.141.. Thu Mar 16 07:57:19 2017 - [info] ok. Thu Mar 16 07:57:19 2017 - [info] 192.168.206.140 can be new master. Thu Mar 16 07:57:19 2017 - [info] From: 192.168.206.139(192.168.206.139:3306) (current master) +--192.168.206.140(192.168.206.140:3306) +--192.168.206.141(192.168.206.141:3306) To: 192.168.206.140(192.168.206.140:3306) (new master) +--192.168.206.141(192.168.206.141:3306) Starting master switch from 192.168.206.139(192.168.206.139:3306) to 192.168.206.140(192.168.206.140:3306)? (yes/NO): yes Thu Mar 16 07:57:21 2017 - [info] Checking whether 192.168.206.140(192.168.206.140:3306) is ok for the new master.. Thu Mar 16 07:57:21 2017 - [info] ok. Thu Mar 16 07:57:21 2017 - [info] ** Phase 1: Configuration Check Phase completed. Thu Mar 16 07:57:21 2017 - [info] Thu Mar 16 07:57:21 2017 - [info] * Phase 2: Rejecting updates Phase.. Thu Mar 16 07:57:21 2017 - [info] Thu Mar 16 07:57:21 2017 - [info] Executing master ip online change script to disable write on the current master: Thu Mar 16 07:57:21 2017 - [info] /etc/mha/scripts/master_ip_online_change --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=stop --orig_master_host=192.168.206.139 --orig_master_ip=192.168.206.139 --orig_master_port=3306 --orig_master_user='root' --orig_master_password='replpass' --new_master_host=192.168.206.140 --new_master_ip=192.168.206.140 --new_master_port=3306 --new_master_user='root' --new_master_password='replpass' --orig_master_ssh_user=root --new_master_ssh_user=root Thu Mar 16 07:57:21 2017 836071 Set read_only on the new master.. ok. Thu Mar 16 07:57:21 2017 846572 Drpping app user on the orig master.. 
Thu Mar 16 07:57:21 2017 848400 Waiting all running 1 threads are disconnected.. (max 1500 milliseconds) {'Time' => '325','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '58','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:59184'} Thu Mar 16 07:57:22 2017 350495 Waiting all running 1 threads are disconnected.. (max 1000 milliseconds) {'Time' => '325','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '58','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:59184'} Thu Mar 16 07:57:22 2017 853113 Waiting all running 1 threads are disconnected.. (max 500 milliseconds) {'Time' => '326','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '58','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:59184'} Thu Mar 16 07:57:23 2017 355738 Set read_only=1 on the orig master.. ok. Thu Mar 16 07:57:23 2017 360013 Waiting all running 1 queries are disconnected.. (max 500 milliseconds) {'Time' => '326','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '58','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:59184'} Thu Mar 16 07:57:23 2017 858184 Killing all application threads.. Thu Mar 16 07:57:23 2017 860278 done. Thu Mar 16 07:57:23 2017 - [info] ok. Thu Mar 16 07:57:23 2017 - [info] Locking all tables on the orig master to reject updates from everybody (including root): Thu Mar 16 07:57:23 2017 - [info] Executing FLUSH TABLES WITH READ LOCK.. Thu Mar 16 07:57:23 2017 - [info] ok. Thu Mar 16 07:57:23 2017 - [info] Orig master binlog:pos is mysql-bin.000003:984. Thu Mar 16 07:57:23 2017 - [info] Waiting to execute all relay logs on 192.168.206.140(192.168.206.140:3306).. 
Thu Mar 16 07:57:23 2017 - [info] master_pos_wait(mysql-bin.000003:984) completed on 192.168.206.140(192.168.206.140:3306). Executed 0 events. Thu Mar 16 07:57:23 2017 - [info] done. Thu Mar 16 07:57:23 2017 - [info] Getting new master's binlog name and position.. Thu Mar 16 07:57:23 2017 - [info] mysql-bin.000002:1033 Thu Mar 16 07:57:23 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.206.140', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='u_repl', MASTER_PASSWORD='xxx'; Thu Mar 16 07:57:23 2017 - [info] Executing master ip online change script to allow write on the new master: Thu Mar 16 07:57:23 2017 - [info] /etc/mha/scripts/master_ip_online_change --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=start --orig_master_host=192.168.206.139 --orig_master_ip=192.168.206.139 --orig_master_port=3306 --orig_master_user='root' --orig_master_password='replpass' --new_master_host=192.168.206.140 --new_master_ip=192.168.206.140 --new_master_port=3306 --new_master_user='root' --new_master_password='replpass' --orig_master_ssh_user=root --new_master_ssh_user=root Thu Mar 16 07:57:24 2017 094535 Set read_only=0 on the new master. Thu Mar 16 07:57:24 2017 096832 Creating app user on the new master.. Thu Mar 16 07:57:24 2017 - [info] ok. Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] * Switching slaves in parallel.. Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] -- Slave switch on host 192.168.206.141(192.168.206.141:3306) started, pid: 6433 Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] Log messages from 192.168.206.141 ... Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] Waiting to execute all relay logs on 192.168.206.141(192.168.206.141:3306).. 
Thu Mar 16 07:57:24 2017 - [info] master_pos_wait(mysql-bin.000003:984) completed on 192.168.206.141(192.168.206.141:3306). Executed 0 events. Thu Mar 16 07:57:24 2017 - [info] done. Thu Mar 16 07:57:24 2017 - [info] Resetting slave 192.168.206.141(192.168.206.141:3306) and starting replication from the new master 192.168.206.140(192.168.206.140:3306).. Thu Mar 16 07:57:24 2017 - [info] Executed CHANGE MASTER. Thu Mar 16 07:57:24 2017 - [info] Slave started. Thu Mar 16 07:57:24 2017 - [info] End of log messages from 192.168.206.141 ... Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] -- Slave switch on host 192.168.206.141(192.168.206.141:3306) succeeded. Thu Mar 16 07:57:24 2017 - [info] Unlocking all tables on the orig master: Thu Mar 16 07:57:24 2017 - [info] Executing UNLOCK TABLES.. Thu Mar 16 07:57:24 2017 - [info] ok. Thu Mar 16 07:57:24 2017 - [info] All new slave servers switched successfully. Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] * Phase 5: New master cleanup phase.. Thu Mar 16 07:57:24 2017 - [info] Thu Mar 16 07:57:24 2017 - [info] 192.168.206.140: Resetting slave info succeeded. Thu Mar 16 07:57:24 2017 - [info] Switching master to 192.168.206.140(192.168.206.140:3306) completed successfully.
Re-add the old master to replication, pointing it at the new master:
mysql> change master to master_host='192.168.206.140',master_user='u_repl',master_password='replpass',master_auto_position=1;
mysql> start slave;
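After start slave it is worth confirming that both replication threads are running before switching back. A minimal sketch follows, assuming the standard Slave_IO_Running and Slave_SQL_Running fields of SHOW SLAVE STATUS\G in MySQL 5.7. The function name slave_threads_ok is made up for illustration.

```shell
# Hypothetical helper: succeed only if both replication threads report "Yes".
# Reads the output of SHOW SLAVE STATUS\G on stdin.
slave_threads_ok() {
  awk -F': ' '
    # The $ anchor skips Slave_SQL_Running_State, which is a different field.
    $1 ~ /Slave_IO_Running$/  { io = $2 }
    $1 ~ /Slave_SQL_Running$/ { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }
  '
}

# Usage on the old master (credentials are whatever your setup requires):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_ok && echo "replication running"
```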
Manual switchover with the master alive, non-interactive mode:
[root@localhost scripts]# masterha_master_switch --master_state=alive --conf=/etc/mha/app1/app1.cnf --new_master_host=192.168.206.139 --interactive=0 Thu Mar 16 07:51:50 2017 - [info] MHA::MasterRotate version 0.56. Thu Mar 16 07:51:50 2017 - [info] Starting online master switch.. Thu Mar 16 07:51:50 2017 - [info] Thu Mar 16 07:51:50 2017 - [info] * Phase 1: Configuration Check Phase.. Thu Mar 16 07:51:50 2017 - [info] Thu Mar 16 07:51:50 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Thu Mar 16 07:51:50 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 07:51:50 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 07:51:50 2017 - [info] GTID failover mode = 1 Thu Mar 16 07:51:50 2017 - [info] Current Alive Master: 192.168.206.140(192.168.206.140:3306) Thu Mar 16 07:51:50 2017 - [info] Alive Slaves: Thu Mar 16 07:51:50 2017 - [info] 192.168.206.139(192.168.206.139:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 07:51:50 2017 - [info] GTID ON Thu Mar 16 07:51:50 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 07:51:50 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Mar 16 07:51:50 2017 - [info] 192.168.206.141(192.168.206.141:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 07:51:50 2017 - [info] GTID ON Thu Mar 16 07:51:50 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 07:51:50 2017 - [info] Not candidate for the new Master (no_master is set) Thu Mar 16 07:51:50 2017 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time.. Thu Mar 16 07:51:50 2017 - [info] ok. Thu Mar 16 07:51:50 2017 - [info] Checking MHA is not monitoring or doing failover.. Thu Mar 16 07:51:50 2017 - [info] Checking replication health on 192.168.206.139.. Thu Mar 16 07:51:50 2017 - [info] ok. 
Thu Mar 16 07:51:50 2017 - [info] Checking replication health on 192.168.206.141.. Thu Mar 16 07:51:50 2017 - [info] ok. Thu Mar 16 07:51:50 2017 - [info] 192.168.206.139 can be new master. Thu Mar 16 07:51:50 2017 - [info] From: 192.168.206.140(192.168.206.140:3306) (current master) +--192.168.206.139(192.168.206.139:3306) +--192.168.206.141(192.168.206.141:3306) To: 192.168.206.139(192.168.206.139:3306) (new master) +--192.168.206.141(192.168.206.141:3306) Thu Mar 16 07:51:50 2017 - [info] Checking whether 192.168.206.139(192.168.206.139:3306) is ok for the new master.. Thu Mar 16 07:51:50 2017 - [info] ok. Thu Mar 16 07:51:50 2017 - [info] ** Phase 1: Configuration Check Phase completed. Thu Mar 16 07:51:50 2017 - [info] Thu Mar 16 07:51:50 2017 - [info] * Phase 2: Rejecting updates Phase.. Thu Mar 16 07:51:50 2017 - [info] Thu Mar 16 07:51:50 2017 - [info] Executing master ip online change script to disable write on the current master: Thu Mar 16 07:51:50 2017 - [info] /etc/mha/scripts/master_ip_online_change --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=stop --orig_master_host=192.168.206.140 --orig_master_ip=192.168.206.140 --orig_master_port=3306 --orig_master_user='root' --orig_master_password='replpass' --new_master_host=192.168.206.139 --new_master_ip=192.168.206.139 --new_master_port=3306 --new_master_user='root' --new_master_password='replpass' --orig_master_ssh_user=root --new_master_ssh_user=root Thu Mar 16 07:51:50 2017 428863 Set read_only on the new master.. ok. Thu Mar 16 07:51:50 2017 436946 Drpping app user on the orig master.. Thu Mar 16 07:51:50 2017 438015 Waiting all running 1 threads are disconnected.. 
(max 1500 milliseconds) {'Time' => '180','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '72','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:48684'} Thu Mar 16 07:51:50 2017 940921 Waiting all running 1 threads are disconnected.. (max 1000 milliseconds) {'Time' => '181','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '72','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:48684'} Thu Mar 16 07:51:51 2017 443500 Waiting all running 1 threads are disconnected.. (max 500 milliseconds) {'Time' => '181','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '72','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:48684'} Thu Mar 16 07:51:51 2017 946057 Set read_only=1 on the orig master.. ok. Thu Mar 16 07:51:51 2017 952395 Waiting all running 1 queries are disconnected.. (max 500 milliseconds) {'Time' => '182','Command' => 'Binlog Dump GTID','db' => undef,'Id' => '72','Info' => undef,'User' => 'u_repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Host' => '192.168.206.141:48684'} Thu Mar 16 07:51:52 2017 448683 Killing all application threads.. Thu Mar 16 07:51:52 2017 450864 done. Thu Mar 16 07:51:52 2017 - [info] ok. Thu Mar 16 07:51:52 2017 - [info] Locking all tables on the orig master to reject updates from everybody (including root): Thu Mar 16 07:51:52 2017 - [info] Executing FLUSH TABLES WITH READ LOCK.. Thu Mar 16 07:51:52 2017 - [info] ok. Thu Mar 16 07:51:52 2017 - [info] Orig master binlog:pos is mysql-bin.000002:1033. Thu Mar 16 07:51:52 2017 - [info] Waiting to execute all relay logs on 192.168.206.139(192.168.206.139:3306).. Thu Mar 16 07:51:52 2017 - [info] master_pos_wait(mysql-bin.000002:1033) completed on 192.168.206.139(192.168.206.139:3306). 
Executed 0 events. Thu Mar 16 07:51:52 2017 - [info] done. Thu Mar 16 07:51:52 2017 - [info] Getting new master's binlog name and position.. Thu Mar 16 07:51:52 2017 - [info] mysql-bin.000003:984 Thu Mar 16 07:51:52 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.206.139', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='u_repl', MASTER_PASSWORD='xxx'; Thu Mar 16 07:51:52 2017 - [info] Executing master ip online change script to allow write on the new master: Thu Mar 16 07:51:52 2017 - [info] /etc/mha/scripts/master_ip_online_change --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=start --orig_master_host=192.168.206.140 --orig_master_ip=192.168.206.140 --orig_master_port=3306 --orig_master_user='root' --orig_master_password='replpass' --new_master_host=192.168.206.139 --new_master_ip=192.168.206.139 --new_master_port=3306 --new_master_user='root' --new_master_password='replpass' --orig_master_ssh_user=root --new_master_ssh_user=root Thu Mar 16 07:51:52 2017 689824 Set read_only=0 on the new master. Thu Mar 16 07:51:52 2017 691842 Creating app user on the new master.. Thu Mar 16 07:51:56 2017 - [info] ok. Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] * Switching slaves in parallel.. Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] -- Slave switch on host 192.168.206.141(192.168.206.141:3306) started, pid: 6405 Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] Log messages from 192.168.206.141 ... Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] Waiting to execute all relay logs on 192.168.206.141(192.168.206.141:3306).. Thu Mar 16 07:51:56 2017 - [info] master_pos_wait(mysql-bin.000002:1033) completed on 192.168.206.141(192.168.206.141:3306). Executed 0 events. Thu Mar 16 07:51:56 2017 - [info] done. 
Thu Mar 16 07:51:56 2017 - [info] Resetting slave 192.168.206.141(192.168.206.141:3306) and starting replication from the new master 192.168.206.139(192.168.206.139:3306).. Thu Mar 16 07:51:56 2017 - [info] Executed CHANGE MASTER. Thu Mar 16 07:51:56 2017 - [info] Slave started. Thu Mar 16 07:51:56 2017 - [info] End of log messages from 192.168.206.141 ... Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] -- Slave switch on host 192.168.206.141(192.168.206.141:3306) succeeded. Thu Mar 16 07:51:56 2017 - [info] Unlocking all tables on the orig master: Thu Mar 16 07:51:56 2017 - [info] Executing UNLOCK TABLES.. Thu Mar 16 07:51:56 2017 - [info] ok. Thu Mar 16 07:51:56 2017 - [info] All new slave servers switched successfully. Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] * Phase 5: New master cleanup phase.. Thu Mar 16 07:51:56 2017 - [info] Thu Mar 16 07:51:56 2017 - [info] 192.168.206.139: Resetting slave info succeeded. Thu Mar 16 07:51:56 2017 - [info] Switching master to 192.168.206.139(192.168.206.139:3306) completed successfully.
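Since --interactive=0 asks no questions, this mode is the one most likely to be run from a script, where the result has to be read from the log rather than from the screen. The sketch below extracts the new master from the final "Switching master to ... completed successfully." line seen in the output above. The function name switched_master is an assumption for illustration; if the line is missing, it prints nothing.

```shell
# Hypothetical helper: print the new master IP from a masterha_master_switch log.
# Reads the log on stdin; prints nothing if the switch did not complete.
switched_master() {
  sed -n 's/.*Switching master to \([0-9.]*\)(.*) completed successfully\..*/\1/p'
}

# Usage:
#   new=$(masterha_master_switch --master_state=alive \
#           --conf=/etc/mha/app1/app1.cnf \
#           --new_master_host=192.168.206.139 --interactive=0 2>&1 | switched_master)
#   [ -n "$new" ] && echo "new master: $new"
```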
Failover when the master 192.168.206.140 is down:
[root@localhost scripts]# masterha_master_switch --master_state=dead --conf=/etc/mha/app1/app1.cnf --dead_master_host=192.168.206.140 --dead_master_ip=192.168.206.140 --dead_master_port=3306 --new_master_host=192.168.206.139 Thu Mar 16 08:09:36 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Thu Mar 16 08:09:36 2017 - [info] Reading application default configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 08:09:36 2017 - [info] Reading server configuration from /etc/mha/app1/app1.cnf.. Thu Mar 16 08:09:36 2017 - [info] MHA::MasterFailover version 0.56. Thu Mar 16 08:09:36 2017 - [info] Starting master failover. Thu Mar 16 08:09:36 2017 - [info] Thu Mar 16 08:09:36 2017 - [info] * Phase 1: Configuration Check Phase.. Thu Mar 16 08:09:36 2017 - [info] Thu Mar 16 08:09:37 2017 - [info] GTID failover mode = 1 Thu Mar 16 08:09:37 2017 - [info] Dead Servers: Thu Mar 16 08:09:37 2017 - [info] 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:37 2017 - [info] Checking master reachability via MySQL(double check)... Thu Mar 16 08:09:37 2017 - [info] ok. 
Thu Mar 16 08:09:37 2017 - [info] Alive Servers: Thu Mar 16 08:09:37 2017 - [info] 192.168.206.139(192.168.206.139:3306) Thu Mar 16 08:09:37 2017 - [info] 192.168.206.141(192.168.206.141:3306) Thu Mar 16 08:09:37 2017 - [info] Alive Slaves: Thu Mar 16 08:09:37 2017 - [info] 192.168.206.139(192.168.206.139:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:37 2017 - [info] GTID ON Thu Mar 16 08:09:37 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:37 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Mar 16 08:09:37 2017 - [info] 192.168.206.141(192.168.206.141:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:37 2017 - [info] GTID ON Thu Mar 16 08:09:37 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:37 2017 - [info] Not candidate for the new Master (no_master is set) Master 192.168.206.140(192.168.206.140:3306) is dead. Proceed? (yes/NO): yes Thu Mar 16 08:09:42 2017 - [info] Starting GTID based failover. Thu Mar 16 08:09:42 2017 - [info] Thu Mar 16 08:09:42 2017 - [info] ** Phase 1: Configuration Check Phase completed. Thu Mar 16 08:09:42 2017 - [info] Thu Mar 16 08:09:42 2017 - [info] * Phase 2: Dead Master Shutdown Phase.. Thu Mar 16 08:09:42 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] HealthCheck: SSH to 192.168.206.140 is reachable. Thu Mar 16 08:09:46 2017 - [info] Forcing shutdown so that applications never connect to the current master.. Thu Mar 16 08:09:46 2017 - [info] Executing master IP deactivation script: Thu Mar 16 08:09:46 2017 - [info] /etc/mha/scripts/master_ip_failover --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --orig_master_host=192.168.206.140 --orig_master_ip=192.168.206.140 --orig_master_port=3306 --command=stopssh --ssh_user=root Thu Mar 16 08:09:46 2017 - [info] done. 
Thu Mar 16 08:09:46 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master. Thu Mar 16 08:09:46 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed. Thu Mar 16 08:09:46 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] * Phase 3: Master Recovery Phase.. Thu Mar 16 08:09:46 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase.. Thu Mar 16 08:09:46 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000002:1033 Thu Mar 16 08:09:46 2017 - [info] Latest slaves (Slaves that received relay log files to the latest): Thu Mar 16 08:09:46 2017 - [info] 192.168.206.139(192.168.206.139:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:46 2017 - [info] GTID ON Thu Mar 16 08:09:46 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:46 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Mar 16 08:09:46 2017 - [info] 192.168.206.141(192.168.206.141:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:46 2017 - [info] GTID ON Thu Mar 16 08:09:46 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:46 2017 - [info] Not candidate for the new Master (no_master is set) Thu Mar 16 08:09:46 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000002:1033 Thu Mar 16 08:09:46 2017 - [info] Oldest slaves: Thu Mar 16 08:09:46 2017 - [info] 192.168.206.139(192.168.206.139:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:46 2017 - [info] GTID ON Thu Mar 16 08:09:46 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:46 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Mar 16 08:09:46 2017 - [info] 192.168.206.141(192.168.206.141:3306) 
Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Thu Mar 16 08:09:46 2017 - [info] GTID ON Thu Mar 16 08:09:46 2017 - [info] Replicating from 192.168.206.140(192.168.206.140:3306) Thu Mar 16 08:09:46 2017 - [info] Not candidate for the new Master (no_master is set) Thu Mar 16 08:09:46 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] * Phase 3.3: Determining New Master Phase.. Thu Mar 16 08:09:46 2017 - [info] Thu Mar 16 08:09:46 2017 - [info] 192.168.206.139 can be new master. Thu Mar 16 08:09:46 2017 - [info] New master is 192.168.206.139(192.168.206.139:3306) Thu Mar 16 08:09:46 2017 - [info] Starting master failover.. Thu Mar 16 08:09:46 2017 - [info] From: 192.168.206.140(192.168.206.140:3306) (current master) +--192.168.206.139(192.168.206.139:3306) +--192.168.206.141(192.168.206.141:3306) To: 192.168.206.139(192.168.206.139:3306) (new master) +--192.168.206.141(192.168.206.141:3306) Starting master switch from 192.168.206.140(192.168.206.140:3306) to 192.168.206.139(192.168.206.139:3306)? (yes/NO): yes Thu Mar 16 08:09:52 2017 - [info] New master decided manually is 192.168.206.139(192.168.206.139:3306) Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] * Phase 3.3: New Master Recovery Phase.. Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] Waiting all logs to be applied.. Thu Mar 16 08:09:52 2017 - [info] done. Thu Mar 16 08:09:52 2017 - [info] Getting new master's binlog name and position.. Thu Mar 16 08:09:52 2017 - [info] mysql-bin.000003:984 Thu Mar 16 08:09:52 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.206.139', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='u_repl', MASTER_PASSWORD='xxx'; Thu Mar 16 08:09:52 2017 - [info] Master Recovery succeeded. 
File:Pos:Exec_Gtid_Set: mysql-bin.000003, 984, 347cbac6-0906-11e7-b957-000c2981a46e:1-2, 9e2c7c0f-0908-11e7-8230-000c29ab7544:2-3 Thu Mar 16 08:09:52 2017 - [info] Executing master IP activate script: Thu Mar 16 08:09:52 2017 - [info] /etc/mha/scripts/master_ip_failover --virtual_ip=192.168.206.145 --orig_master_vip_eth=eth0:234 --new_master_vip_eth=eth0:234 --command=start --ssh_user=root --orig_master_host=192.168.206.140 --orig_master_ip=192.168.206.140 --orig_master_port=3306 --new_master_host=192.168.206.139 --new_master_ip=192.168.206.139 --new_master_port=3306 --new_master_user='root' --new_master_password='replpass' Set read_only=0 on the new master. Creating app user on the new master.. Undefined subroutine &main::FIXME_xxx_create_user called at /etc/mha/scripts/master_ip_failover line 93. Thu Mar 16 08:09:52 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln1588] Failed to activate master IP address for 192.168.206.139(192.168.206.139:3306) with return code 10:0 Thu Mar 16 08:09:52 2017 - [warning] Proceeding. Thu Mar 16 08:09:52 2017 - [info] ** Finished master recovery successfully. Thu Mar 16 08:09:52 2017 - [info] * Phase 3: Master Recovery Phase completed. Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] * Phase 4: Slaves Recovery Phase.. Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] * Phase 4.1: Starting Slaves in parallel.. Thu Mar 16 08:09:52 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] -- Slave recovery on host 192.168.206.141(192.168.206.141:3306) started, pid: 6478. Check tmp log /var/log/mha/app1/192.168.206.141_3306_20170316080936.log if it takes time.. Thu Mar 16 08:09:53 2017 - [info] Thu Mar 16 08:09:53 2017 - [info] Log messages from 192.168.206.141 ... 
Thu Mar 16 08:09:53 2017 - [info] Thu Mar 16 08:09:52 2017 - [info] Resetting slave 192.168.206.141(192.168.206.141:3306) and starting replication from the new master 192.168.206.139(192.168.206.139:3306).. Thu Mar 16 08:09:52 2017 - [info] Executed CHANGE MASTER. Thu Mar 16 08:09:52 2017 - [info] Slave started. Thu Mar 16 08:09:52 2017 - [info] gtid_wait(347cbac6-0906-11e7-b957-000c2981a46e:1-2, 9e2c7c0f-0908-11e7-8230-000c29ab7544:2-3) completed on 192.168.206.141(192.168.206.141:3306). Executed 0 events. Thu Mar 16 08:09:53 2017 - [info] End of log messages from 192.168.206.141. Thu Mar 16 08:09:53 2017 - [info] -- Slave on host 192.168.206.141(192.168.206.141:3306) started. Thu Mar 16 08:09:53 2017 - [info] All new slave servers recovered successfully. Thu Mar 16 08:09:53 2017 - [info] Thu Mar 16 08:09:53 2017 - [info] * Phase 5: New master cleanup phase.. Thu Mar 16 08:09:53 2017 - [info] Thu Mar 16 08:09:53 2017 - [info] Resetting slave info on the new master.. Thu Mar 16 08:09:53 2017 - [info] 192.168.206.139: Resetting slave info succeeded. Thu Mar 16 08:09:53 2017 - [info] Master failover to 192.168.206.139(192.168.206.139:3306) completed successfully. Thu Mar 16 08:09:53 2017 - [info] ----- Failover Report ----- app1: MySQL Master failover 192.168.206.140(192.168.206.140:3306) to 192.168.206.139(192.168.206.139:3306) succeeded Master 192.168.206.140(192.168.206.140:3306) is down! Check MHA Manager logs at localhost.localdomain for details. Started manual(interactive) failover. Invalidated master IP address on 192.168.206.140(192.168.206.140:3306) Selected 192.168.206.139(192.168.206.139:3306) as a new master. 192.168.206.139(192.168.206.139:3306): OK: Applying all logs succeeded. 
Failed to activate master IP address for 192.168.206.139(192.168.206.139:3306) with return code 10:0 192.168.206.141(192.168.206.141:3306): OK: Slave started, replicating from 192.168.206.139(192.168.206.139:3306) 192.168.206.139(192.168.206.139:3306): Resetting slave info succeeded. Master failover to 192.168.206.139(192.168.206.139:3306) completed successfully.
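Note that the failover report above is not entirely clean: activating the master IP failed with "Undefined subroutine &main::FIXME_xxx_create_user called at /etc/mha/scripts/master_ip_failover line 93". This looks like a placeholder call left over from the sample master_ip_failover script that has to be replaced or removed before the script can move the VIP. A minimal sketch of a pre-flight check is shown here. The function name find_fixme_placeholders is made up for illustration, and it only searches for the FIXME_xxx prefix that appears in the error message.

```shell
# Hypothetical helper: list leftover FIXME_xxx placeholders in a script.
#   $1  path to the script to inspect
# Prints "line:text" for every match and returns 0 if any are found,
# non-zero if the file is clean (grep's usual exit status).
find_fixme_placeholders() {
  grep -n 'FIXME_xxx' "$1"
}

# Usage before starting the manager:
#   if find_fixme_placeholders /etc/mha/scripts/master_ip_failover; then
#     echo "edit the script first: placeholder calls remain" >&2
#   fi
```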
Article URL: http://www.cnblogs.com/ajiangg/p/6552855.html
Reference: https://code.google.com/p/mysql-master-ha/wiki/TableOfContents?tm=6