MHA原理及搭建

MYSQL5.7下搭建MHA

环境说明

在主机1,主机2,主机3上安装MySQL服务端和客户端。

 

 

主机1

主机2

主机3

操作系统

CentOS7.4

CentOS7.4

CentOS7.4

主机名

mydb1

mydb2

mydb3

IP

192.168.1.101

192.168.1.102

192.168.1.103

VIP

192.168.1.201

192.168.1.202

192.168.1.203

角色

主,MHA Manager

主备,MHA Manager

从,MHA Node

数据库软件版本

mysql5.7.22

mysql5.7.22

mysql5.7.22

MySQL配置文件目位置

/app/mysqldata/3306/ my.cnf

/app/mysqldata/3306/ my.cnf

/app/mysqldata/3306/ my.cnf

MHA配置文件位置

/etc/app1.cnf

/etc/app1.cnf

 

MHA安装包

/soft/MHA/ mha4mysql-node-0.58-0.el7.centos.noarch.rpm

/soft/MHA/ mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

/soft/MHA/ mha4mysql-node-0.58-0.el7.centos.noarch.rpm

/soft/MHA/ mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

/soft/MHA/ mha4mysql-node-0.58-0.el7.centos.noarch.rpm

 

 

 

 

 

 

 

一.建立3MHA服务器之间ssh互信

mydb1mydb2mydb3服务器上分别执行:

ssh-keygen -t rsa

ssh-copy-id -i .ssh/id_rsa.pub root@192.168.1.101

ssh-copy-id -i .ssh/id_rsa.pub root@192.168.1.102

ssh-copy-id -i .ssh/id_rsa.pub root@192.168.1.103

 

配置master服务器hosts文件

mydb1mydb2mydb3服务器上:

cat /etc/hosts          

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.1.101 mydb1

192.168.1.102 mydb2

192.168.1.103 mydb3

 

. 配置主从关系

mydb1上操作                                     

修改mydb1配置文件:

vi /app/mysqldata/3306/my.cnf

在原有文件添加以下参数:

[mysqld]                  

server_id=1013306               

log-bin = /app/mysqldata/3306/binlog/mysql-bin

log-bin-index = /app/mysqldata/3306/binlog/mysql-bin.index

 

启动mysql服务:

[root@mydb1 ~]# mysql_db_startup.sh 3306

 

添加复制用户并授权:

[root@mydb1 ~]# mysqlplus.sh 3306

(root@localhost) [(none)]> create user xtrabk@'192.168.1.%' identified by 'onlybackup';

(root@localhost) [(none)]> grant reload,lock tables,process,Replication client,super on *.* to xtrabk@'192.168.1.%';

(root@localhost) [(none)]> create user rep1@'192.168.1.%' identified by 'rep1';

(root@localhost) [(none)]> grant replication slave on *.* to 'rep1'@'192.168.1.%' identified by 'rep1';

(root@localhost) [(none)]> flush privileges;                                                       

 

查看是否支持动态加载半同步复制模块:(YES为可以)

(root@localhost) [(none)]> show VARIABLES LIKE 'have_dynamic_loading';

+----------------------+-------+

| Variable_name           | Value   |

+----------------------+-------+

| have_dynamic_loading | YES   |

+----------------------+-------+

1 row in set (0.00 sec)

 

加载半同步复制模块:

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

 

判断安装是否成功

(root@localhost) [(none)]> show variables like '%semi%';

+-------------------------------------------+------------+

| Variable_name                             | Value      |

+-------------------------------------------+------------+

| rpl_semi_sync_master_enabled              | ON         |

| rpl_semi_sync_master_timeout              | 5000       |

| rpl_semi_sync_master_trace_level          | 32         |

| rpl_semi_sync_master_wait_for_slave_count | 1          |

| rpl_semi_sync_master_wait_no_slave        | ON         |

| rpl_semi_sync_master_wait_point           | AFTER_SYNC |

| rpl_semi_sync_slave_enabled               | ON         |

| rpl_semi_sync_slave_trace_level           | 32         |

+-------------------------------------------+------------+

8 rows in set (0.00 sec)

ON开启半同步OFF关闭半同步

SET GLOBAL rpl_semi_sync_master_enabled = ON;

 

毫秒单位,5秒该参数主服务器等待确认消息5秒后,不再等待,变为异步方式。

SET GLOBAL rpl_semi_sync_master_timeout = 5000 ;

 

授权远程登录:

(root@localhost) [(none)]> create user dba_user@'192.168.1.%' identified by 'msds007';

(root@localhost) [(none)]> grant all privileges on *.* to 'dba_user'@'192.168.1.%';

(root@localhost) [(none)]> flush privileges;      

 

mydb1上建表,插入数据,使用Python脚本

 

创建完整备份

[root@mydb1 ~]# /app/mysqldata/scripts/my_full_backup.sh

复制和准备备份集

[root@mydb1 tmp]# scp -r xtrabackup/ mydb2:`pwd`

[root@mydb1 tmp]# scp -r xtrabackup/ mydb3:`pwd`

 

mydb2上操作

mysql配置文件:

vi /app/mysqldata/3306/my.cnf

在原有文件添加以下参数:

[mysqld]   

server_id = 1023306                                             

read_only=1

relay_log_purge = 0                                    

relay_log = /app/mysqldata/3306/binlog/mysql-relay-bin             

relay_log_index = /app/mysqldata/3306/binlog/mysql-relay-bin.index

log-bin = /app/mysqldata/3306/binlog/mysql-bin

log-bin-index = /app/mysqldata/3306/binlog/mysql-bin.index

 

进行数据恢复

[root@mydb2 ~]# /app/mysqldata/scripts/my_full_recover.sh

 

[root@mydb2 ~]# mysqlplus.sh 3306

 

查看是否支持动态加载半同步复制模块:(YES为可以)

(root@localhost) [(none)]> show VARIABLES LIKE 'have_dynamic_loading';

+----------------------+-------+

| Variable_name           | Value   |

+----------------------+-------+

| have_dynamic_loading | YES   |

+----------------------+-------+

1 row in set (0.00 sec)

 

加载半同步复制模块:

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

 

判断安装是否成功

(root@localhost) [(none)]> show variables like '%semi%';

+-------------------------------------------+------------+

| Variable_name                             | Value      |

+-------------------------------------------+------------+

| rpl_semi_sync_master_enabled              | ON         |

| rpl_semi_sync_master_timeout              | 5000       |

| rpl_semi_sync_master_trace_level          | 32         |

| rpl_semi_sync_master_wait_for_slave_count | 1          |

| rpl_semi_sync_master_wait_no_slave        | ON         |

| rpl_semi_sync_master_wait_point           | AFTER_SYNC |

| rpl_semi_sync_slave_enabled               | ON         |

| rpl_semi_sync_slave_trace_level           | 32         |

+-------------------------------------------+------------+

8 rows in set (0.00 sec)

 

ON开启半同步 OFF关闭半同步

SET GLOBAL rpl_semi_sync_slave_enabled =ON;

 

配置Slave节点复制环境

查看备份文件中的位置信息

[root@mydb2 full]# cat xtrabackup_binlog_info              

mysql-bin.000002 236874 9831a804-8c25-11e8-9a69-000c2983201e:1-3,

a5d58971-9ee1-11e8-8767-000c2983201e:1-576                                                                                                       

 

设置GTID_PURGED

(root@localhost) [(none)]> prompt Slave>

Slave>stop slave;

Query OK, 0 rows affected (0.00 sec)

Slave>reset master;

Query OK, 0 rows affected (0.02 sec)

Slave>set global

GTID_PURGED='9831a804-8c25-11e8-9a69-000c2983201e:1-3,a5d58971-9ee1-11e8-8767-000c2983201e:1-576';

Query OK, 0 rows affected (0.00 sec)

Slave > change master to

master_host='192.168.1.101',master_port=3306,master_user='rep1',master_password='rep1',MASTER_AUTO_POSITION =1;

 

启动从库服务

Slave > start slave;                                                                                                                              

 

查看主从复制是否成功:

Slave > show slave status\G    

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

 

查看mydb2上的master日志状态

Slave>show master status;

+------------------+----------+--------------+------------------+--------------------------------------------------------------------------------------+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                    |

+------------------+----------+--------------+------------------+--------------------------------------------------------------------------------------+

| mysql-bin.000001 |      154 |              |                  | 9831a804-8c25-11e8-9a69-000c2983201e:1-3,

a5d58971-9ee1-11e8-8767-000c2983201e:1-576 |

+------------------+----------+--------------+------------------+--------------------------------------------------------------------------------------+

1 row in set (0.00 sec) 

 

mydb3

mysql配置文件:

vi /app/mysqldata/3306/my.cnf

在原有文件添加以下参数:

[mysqld]                                            

server_id = 1033306                                         

read_only = 1                                         

relay_log_purge = 0                                   

relay_log = /app/mysqldata/3306/binlog/mysql-relay-bin             

relay_log_index = /app/mysqldata/3306/binlog/mysql-relay-bin.index

log-bin = /app/mysqldata/3306/binlog/mysql-bin

log-bin-index = /app/mysqldata/3306/binlog/mysql-bin.index

 

进行数据恢复

[root@mydb3 ~]# /app/mysqldata/scripts/my_full_recover.sh

 

[root@mydb3 ~]# mysqlplus.sh 3306

 

查看是否支持动态加载半同步复制模块:(YES为可以)

(root@localhost) [(none)]> show VARIABLES LIKE 'have_dynamic_loading';

+----------------------+-------+

| Variable_name           | Value   |

+----------------------+-------+

| have_dynamic_loading | YES   |

+----------------------+-------+

1 row in set (0.00 sec)

 

加载半同步复制模块:

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

(root@localhost) [(none)]> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

 

判断安装是否成功

(root@localhost) [(none)]> show variables like '%semi%';

+-------------------------------------------+------------+

| Variable_name                             | Value      |

+-------------------------------------------+------------+

| rpl_semi_sync_master_enabled              | ON         |

| rpl_semi_sync_master_timeout              | 5000       |

| rpl_semi_sync_master_trace_level          | 32         |

| rpl_semi_sync_master_wait_for_slave_count | 1          |

| rpl_semi_sync_master_wait_no_slave        | ON         |

| rpl_semi_sync_master_wait_point           | AFTER_SYNC |

| rpl_semi_sync_slave_enabled               | ON         |

| rpl_semi_sync_slave_trace_level           | 32         |

+-------------------------------------------+------------+

8 rows in set (0.00 sec)

 

ON开启半同步 OFF关闭半同步

SET GLOBAL rpl_semi_sync_slave_enabled =ON;

 

配置Slave节点复制环境

查看备份文件中的位置信息

[root@mydb3 full]# cat xtrabackup_binlog_info              

mysql-bin.000002 236874 9831a804-8c25-11e8-9a69-000c2983201e:1-3,

a5d58971-9ee1-11e8-8767-000c2983201e:1-576 

 

设置GTID_PURGED

(root@localhost) [(none)]> prompt Slave>

Slave>stop slave;

Query OK, 0 rows affected (0.00 sec)

Slave>reset master;

Query OK, 0 rows affected (0.02 sec)

Slave>set global

GTID_PURGED='9831a804-8c25-11e8-9a69-000c2983201e:1-3,a5d58971-9ee1-11e8-8767-000c2983201e:1-576';

Query OK, 0 rows affected (0.00 sec)

Slave > change master to

master_host='192.168.1.101',master_port=3306,master_user='rep1',master_password='rep1',MASTER_AUTO_POSITION =1;

 

启动从库服务:

Slave > start slave;                                                                                                                          

 

查看主从复制是否成功:

Slave > show slave status\G    

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

 

.主从复制配置完成,开始配置MHA

mydb1mydb2上操作

使用阿里yum

1.删除自带的yum源:

#cd /etc/yum.repos.d/

#rm -rf *

2.配置远程yum源:

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

$releasever替换为7

3.进行验证yum源配置成功:

清除缓存yum

#yum clean all

4.链接远程的yum

#yum makecache

5.查看已配置好的yum源信息“

#yum repolist

 

#先安装依赖

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

rpm -ivh epel-release-latest-7.noarch.rpm

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager -y

 

需自行下载mysql MHA CentOS7相关依赖包,下载地址:https://centos.pkgs.org/

yum install -y perl-Params-Validate-1.08-4.el7.x86_64.rpm

yum install -y perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm

yum install -y perl-MIME-Types-1.38-2.el7.noarch.rpm

yum install -y perl-MIME-Lite-3.030-1.el7.noarch.rpm

yum install -y perl-Mail-Sendmail-0.79-21.el7.noarch.rpm

yum install -y perl-Mail-Sender-0.8.23-1.el7.noarch.rpm

yum install -y perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm

yum install -y perl-Email-Date-Format-1.002-15.el7.noarch.rpm

yum install -y perl-Config-Tiny-2.14-7.el7.noarch.rpm

yum install -y perl-Mail-Send-Loop-0.3-alt1.noarch.rpm    不需要这个

 

上传了百度云盘,包括上面的软件包和安装包

链接:https://pan.baidu.com/s/1p4ShAhAgNUnqMf6s1lP84Q 密码:11mr

 

使用yum安装,这步有点繁琐

yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm

yum install -y mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

 

/usr/bin目录下

save_binary_logs 保存和复制master的二进制日志

purge_relay_logs 清除中继日志(不会阻塞SQL线程)

filter_mysqlbinlog 去除不必要的ROLLBACK事件(已不再使用)

apply_diff_relay_logs 识别差异的中继日志事件并将其差异应用于其他的slave

masterha_stop 停止MHA

masterha_secondary_check 用于检查来自多个网络路由的主机可用性

masterha_master_switch 控制故障转移(自动或手动)

masterha_master_monitor 检测master是否宕机

masterha_manager 启动MHA

masterha_conf_host 添加或删除配置的server信息

masterha_check_status 检查当前MHA的运行状态

masterha_check_ssh 检查MHASSH配置状况

masterha_check_repl 检查MySQL复制状况

 

mydb3上操作

yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm

                                                                                                                                                                                               

在主和主备上配置MHA参数文件

vi /etc/masterha_default.cnf                 

[server default]                                     

user=dba_user                                          

password=msds007

ssh_user=root                                        

repl_user=rep1                                      

repl_password=rep1

ping_interval=1

master_ip_online_change_script="/masterha/app1/master_ip_online_change.pl"

master_ip_failover_script="/masterha/app1/master_ip_failover.pl"

report_script="/masterha/app1/send_report"

shutdown_script="/masterha/app1/power_manager"

secondary_check_script="masterha_secondary_check -s mydb1 -s mydb2"

 

vi /etc/app1.cnf

[server default]

manager_workdir=/var/log/masterha/app1               

manager_log=/var/log/masterha/app1/app1.log          

remote_workdir=/var/log/masterha/app1                

[server1]                                            

hostname=192.168.1.101                               

master_binlog_dir = /app/mysqldata/3306/binlog           

candidate_master=1                                   

check_repl_delay=0

[server2]                                            

hostname=192.168.1.102                               

master_binlog_dir = /app/mysqldata/3306/binlog           

candidate_master=1                                   

check_repl_delay=0

[server3]                                            

hostname=192.168.1.103                               

master_binlog_dir = /app/mysqldata/3306/binlog           

no_master=1

ignore_fail=1

 

mydb1mydb2上操作

mkdir -p /masterha/app1/

编辑/masterha/app1/master_ip_failover.pl

# chmod +x master_ip_failover.pl

注意修改vip及网卡

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

 

my (

    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,

    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port

);

 

my $vip = '192.168.1.201/24';  #此处为你要设置的虚拟ip

my $key = '1';

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; #此处改为你的网卡名称

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

 

GetOptions(

    'command=s'          => \$command,

    'ssh_user=s'         => \$ssh_user,

    'orig_master_host=s' => \$orig_master_host,

    'orig_master_ip=s'   => \$orig_master_ip,

    'orig_master_port=i' => \$orig_master_port,

    'new_master_host=s'  => \$new_master_host,

    'new_master_ip=s'    => \$new_master_ip,

    'new_master_port=i'  => \$new_master_port,

);

 

exit &main();

 

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;

        eval {

            print "Disabling the VIP on old master: $orig_master_host \n";

            &stop_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn "Got Error: $@\n";

            exit $exit_code;

        }

        exit $exit_code;

    }

    elsif ( $command eq "start" ) {

        my $exit_code = 10;

        eval {

            print "Enabling the VIP - $vip on the new master - $new_master_host \n";

            &start_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn $@;

            exit $exit_code;

        }

        exit $exit_code;

    }

    elsif ( $command eq "status" ) {

        print "Checking the Status of the script.. OK \n";

        exit 0;

    }

    else {

        &usage();

        exit 1;

    }

}

 

sub start_vip() {

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

     return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

sub usage {

    print

    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

 

编辑/masterha/app1/master_ip_online_change.pl

# chmod +x master_ip_online_change.pl

注意修改vip及网卡

#!/usr/bin/env perl

 

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

use MHA::DBHelper;

use MHA::NodeUtil;

use Time::HiRes qw( sleep gettimeofday tv_interval );

use Data::Dumper;

 

my $_tstart;

my $_running_interval = 0.1;

my (

  $command,          

  $orig_master_is_new_slave, $orig_master_host, $orig_master_ip,  $orig_master_port, $orig_master_user,    $orig_master_password, $orig_master_ssh_user,

  $new_master_host,          $new_master_ip,    $new_master_port, $new_master_user,  $new_master_password, $new_master_ssh_user,

);

 

my $vip = '192.168.1.201/24';  #此处为你要设置的虚拟ip

my $key = '1';

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; #此处改为你的网卡名称

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

 

GetOptions(

  'command=s'                => \$command,

  'orig_master_is_new_slave' => \$orig_master_is_new_slave,

  'orig_master_host=s'       => \$orig_master_host,

  'orig_master_ip=s'         => \$orig_master_ip,

  'orig_master_port=i'       => \$orig_master_port,

  'orig_master_user=s'       => \$orig_master_user,

  'orig_master_password=s'   => \$orig_master_password,

  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,

  'new_master_host=s'        => \$new_master_host,

  'new_master_ip=s'          => \$new_master_ip,

  'new_master_port=i'        => \$new_master_port,

  'new_master_user=s'        => \$new_master_user,

  'new_master_password=s'    => \$new_master_password,

  'new_master_ssh_user=s'    => \$new_master_ssh_user,

);

 

exit &main();

 

sub current_time_us {

  my ( $sec, $microsec ) = gettimeofday();

  my $curdate = localtime($sec);

  return $curdate . " " . sprintf( "%06d", $microsec );

}

 

sub sleep_until {

  my $elapsed = tv_interval($_tstart);

  if ( $_running_interval > $elapsed ) {

    sleep( $_running_interval - $elapsed );

  }

}

 

sub get_threads_util {

  my $dbh                    = shift;

  my $my_connection_id       = shift;

  my $running_time_threshold = shift;

  my $type                   = shift;

  $running_time_threshold = 0 unless ($running_time_threshold);

  $type                   = 0 unless ($type);

  my @threads;

 

  my $sth = $dbh->prepare("SHOW PROCESSLIST");

  $sth->execute();

 

  while ( my $ref = $sth->fetchrow_hashref() ) {

    my $id         = $ref->{Id};

    my $user       = $ref->{User};

    my $host       = $ref->{Host};

    my $command    = $ref->{Command};

    my $state      = $ref->{State};

    my $query_time = $ref->{Time};

    my $info       = $ref->{Info};

    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);

    next if ( $my_connection_id == $id );

    next if ( defined($query_time) && $query_time < $running_time_threshold );

    next if ( defined($command)    && $command eq "Binlog Dump" );

    next if ( defined($user)       && $user eq "system user" );

    next

      if ( defined($command)

      && $command eq "Sleep"

      && defined($query_time)

      && $query_time >= 1 );

 

    if ( $type >= 1 ) {

      next if ( defined($command) && $command eq "Sleep" );

      next if ( defined($command) && $command eq "Connect" );

    }

 

    if ( $type >= 2 ) {

      next if ( defined($info) && $info =~ m/^select/i );

      next if ( defined($info) && $info =~ m/^show/i );

    }

 

    push @threads, $ref;

  }

  return @threads;

}

 

sub main {

  if ( $command eq "stop" ) {

    ## Gracefully killing connections on the current master

    # 1. Set read_only= 1 on the new master

    # 2. DROP USER so that no app user can establish new connections

    # 3. Set read_only= 1 on the current master

    # 4. Kill current queries

    # * Any database access failure will result in script die.

    my $exit_code = 1;

    eval {

      ## Setting read_only=1 on the new master (to avoid accident)

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      print current_time_us() . " Set read_only on the new master.. ";

      $new_master_handler->enable_read_only();

      if ( $new_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

      $new_master_handler->disconnect();

 

      # Connecting to the orig master, die if any database error happens

      my $orig_master_handler = new MHA::DBHelper();

      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,

        $orig_master_user, $orig_master_password, 1 );

 

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand

      #$orig_master_handler->disable_log_bin_local();

      #print current_time_us() . " Drpping app user on the orig master..\n";

      #FIXME_xxx_drop_app_user($orig_master_handler);

 

      ## Waiting for N * 100 milliseconds so that current connections can exit

      my $time_until_read_only = 15;

      $_tstart = [gettimeofday];

      my @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_read_only > 0 && $#threads >= 0 ) {

        if ( $time_until_read_only % 5 == 0 ) {

          printf

"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_read_only * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_read_only--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write

      print current_time_us() . " Set read_only=1 on the orig master.. ";

      $orig_master_handler->enable_read_only();

      if ( $orig_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

 

      ## Waiting for M * 100 milliseconds so that current update queries can complete

      my $time_until_kill_threads = 5;

      @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {

        if ( $time_until_kill_threads % 5 == 0 ) {

          printf

"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_kill_threads--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      print "Disabling the VIP on old master: $orig_master_host \n";

      &stop_vip();     

 

      ## Terminating all threads

      print current_time_us() . " Killing all application threads..\n";

      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );

      print current_time_us() . " done.\n";

      #$orig_master_handler->enable_log_bin_local();

      $orig_master_handler->disconnect();

 

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

    ## Activating master ip on the new master

    # 1. Create app user with write privileges

    # 2. Moving backup script if needed

    # 3. Register new master's ip to the catalog database

 

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.

# If exit code is 0 or 10, MHA does not abort

    my $exit_code = 10;

    eval {

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

 

      ## Set read_only=0 on the new master

      #$new_master_handler->disable_log_bin_local();

      print current_time_us() . " Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

 

      ## Creating an app user on the new master

      #print current_time_us() . " Creating app user on the new master..\n";

      #FIXME_xxx_create_app_user($new_master_handler);

      #$new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc

                print "Enabling the VIP - $vip on the new master - $new_master_host \n";

                &start_vip();

                $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

# A simple system call that enable the VIP on the new master

sub start_vip() {

    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

# A simple system call that disable the VIP on the old_master

sub stop_vip() {

    `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

sub usage {

  print

"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --orig_master_user=user --orig_master_password=password --orig_master_ssh_user=sshuser --new_master_host=host --new_master_ip=ip --new_master_port=port --new_master_user=user --new_master_password=password --new_master_ssh_user=sshuser \n";

  die;

}

 

mydb1mydb2mydb3上操作

手动添加vip

[root@mydb1 ~]# ifconfig ens33:1 192.168.1.201 只设置主ip即可

[root@mydb2 ~]# ifconfig ens33:1 192.168.1.202 

[root@mydb3 ~]# ifconfig ens33:1 192.168.1.203 

 

.主备上用MHA工具验证互信,主从复制并且开启MHA管理进程

检查互信是否成功

masterha_check_ssh --conf=/etc/app1.cnf

判断输出有All SSH connection tests passed successfully.即检查互信成功

 

检查主从复制是否成功

masterha_check_repl --conf=/etc/app1.cnf

判断输出有MySQL Replication Health is OK.即主从复制搭建成功

 

开启管理模式(failover

masterha_manager --conf=/etc/app1.cnf &

 

检查是否启动

masterha_check_status --conf=/etc/app1.cnf

 

停止mha

masterha_stop --conf=/etc/app1.cnf

 

测试

1.手工failover测试

mydb1为主,mydb2mydb3为从

手工failover场景,master死掉,但是masterha_manager没有开启,可以通过手工failover

masterha_master_switch --conf=/etc/app1.cnf --dead_master_host=192.168.1.101 --master_state=dead --new_master_host=192.168.1.102 --ignore_last_failover

将原来的主加入到集群中

(root@localhost) [(none)]> prompt Slave >

Slave > change master to master_host='192.168.1.102', master_port=3306, master_user='rep1', master_password='rep1',master_auto_position=1;

(root@localhost) [(none)]> start slave;

(root@localhost) [(none)]> show slave status\G

 

2.手动在线切换

mydb1为主,mydb2mydb3为从

手动在线切换mha,切换时需要将在运行的masterha_manager停掉后才能切换

masterha_stop --conf=/etc/app1.cnf

masterha_master_switch --conf=/etc/app1.cnf --master_state=alive --new_master_host=192.168.1.102 --orig_master_is_new_slave --running_updates_limit=10000

 

3.自动failover测试

mydb1为主,mydb2mydb3为从

masterha_manager开启后,才能进行自动failover测试

masterha_check_ssh --conf=/etc/app1.cnf

masterha_check_repl --conf=/etc/app1.cnf

masterha_manager --conf=/etc/app1.cnf &

将主库mysql进程kill

 

每次failover切换后会在管理目录生成文件app1.failover.complete ,下次在切换的时候会发现有这个文件导致切换不成功,需要手动清理掉。

rm -rf /var/log/masterha/app1/app1.failover.complete

 

说明:每次测试完毕后,需要清理一下/var/log/masterha/app1/下的日志,然后启动mhamanager

1.关闭192.168.1.101上的mysql(观察从库从哪里同步,及mha日志输出)

2.恢复192.168.1.101192.168.1.102slave

(change master语句可以在/var/log/masterha/app1/app1.log里找到)

All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.1.102', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='rep1', MASTER_PASSWORD='xxx';

此时需要查看/etc/app1.cnf配置文件是否被修改

 

https://github.com/yoshinorim/mha4mysql-manager/releases

https://github.com/yoshinorim/mha4mysql-node/releases

https://github.com/yoshinorim/mha4mysql-manager/wiki

https://github.com/yoshinorim/mha4mysql-manager/wiki/Parameters

https://github.com/yoshinorim/mha4mysql-manager

https://github.com/yoshinorim/mha4mysql-node

posted @ 2019-08-12 11:57  AllenHU320  阅读(422)  评论(0编辑  收藏  举报