mysql mha 高可用集群

一、MHA介绍
注意:阅读本文需要有一定的mysql基础知识和主从集群知识

MHA(mysql High Availability)是一款开源的 MySQL 的高可用程序,由perl语言编写,它为 MySQL主从复制架构提供了自动化主故障转移功能。在MySQL故障切换过程中,MHA能做到在0~30秒内自动完成数据库的故障切换操作,并且能在最大程度上保证数据的一致性,以达到真正意义上的高可用。

mha官网:https://code.google.com/archive/p/mysql-master-ha/
github下载地址:https://github.com/yoshinorim/

1.2MHA软件构成
MHA 服务有两种角色,一是manager节点(提供管理功能),二是node节点(和mysql部署在一起) Manager:通常单独部署在一台独立机器上管理多个mysql集群, node:运行在每台 MySQL 服务器上

# mha各个软件包介绍
Manager工具包主要包括以下几个工具:
masterha_manger             启动MHA 
masterha_check_ssh          检查MHA的SSH配置状况 
masterha_check_repl         检查MySQL复制状况 
masterha_master_monitor     检测master是否宕机 
masterha_check_status       检测当前MHA运行状态 
masterha_conf_host          添加或删除配置的server信息
masterha_master_switch      控制故障转移(自动或者手动)


Node工具包主要包括以下几个工具:
这些工具通常由MHA Manager的脚本触发,无需人为操作
save_binary_logs            主库宕机后,保证从库能继续保存和复制master的剩余二进制日志,以防数据丢失
apply_diff_relay_logs       主库宕机后,识别剩余从库差异的中继日志事件,并将数据量多的那个从库的差异的事件应用于数据量少的那一个从库,以保证剩下的从库数据一致
purge_relay_logs            清除中继日志(不会阻塞SQL线程)
二、MHA部署
2.1、架构
ip 角色 备注
192.168.167.160 主库
192.168.167.161 备库1 manager在这个节点
192.168.167.161 备库2

mysql 版本8.0.32


2.2、部署mysql集群+MHA

1、部署mysql主从集群

# 1、部署三台mysql
部署mysql详情请参考我另一篇博客:https://www.cnblogs.com/sunjiwei/articles/18359401

# 2、部署主从集群
# 2.1主节点创建复制用账号
CREATE USER 'repl'@'%' IDENTIFIED WITH mysql_native_password BY '****';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';

# 2.2 创建主从
mysql有两种事务机制,一个是匿名事务,一个是gtid事务,不同的机制创建主从时命令也不同,这里不赘述,只写命令

匿名事务时使用的命令:
change master to master_host='192.168.167.160',master_port=3306,master_user='repl',master_password='*****',master_log_file='mysql-bin.000001',master_log_pos=660;

使用这个方式时,需要知道master_log_file和master_log_pos,如果是新搭建的集群,直接在主节点执行show master status; 命令就可以获取,在两个从节点上执行reset master; 来清空从节点自己的事务。
如果是已有主库,后面想要扩展为主从集群,则需要使用mysqldump 或者innobackup 备份主库的数据到从库,备份文件中也会有master_log_file和master_log_pos,这里不赘述。


gtid事务时使用的命令:
change master to master_host='192.168.167.160',master_port=3306,master_user='repl',master_password='*****',master_auto_position=1;

使用这个方式时,不需要知道master_log_file和master_log_pos,在从库执行命令reset master; set @@GLOBAL.GTID_PURGED='3a87563c-2333-11eb-a778-005056a13ac1:1-115437623';(3a87563c-2333-11eb-a778-005056a13ac1:1-115437623这个是主库的事务号)就会从这个事务自动往后查找。


# 主从搭建好后要在两个从库上设置只读模式
set global read_only = 1;


2.3、配置各个节点mysql账号下主机ssh互信免密登录

# 三个节点都做
su - mysql
ssh-keygen 
cd /home/mysql/.ssh
cat id_rsa.pub > authorized_keys 
chmod 644 authorized_keys

然后把每个节点的公钥(id_rsa.pub这个文件里的内容)拷贝到其他两个节点的authorized_keys 文件中

# 然后要手动都ssh一遍
各节点验证
主节点:
ssh 192.168.167.160 date
ssh 192.168.167.161 date
ssh 192.168.167.162 date
从节点1:
ssh 192.168.167.160 date
ssh 192.168.167.161 date
ssh 192.168.167.162 date
从节点2:
ssh 192.168.167.160 date
ssh 192.168.167.161 date
ssh 192.168.167.162 date


2.4、部署mha

1、所有机器配置软连接,在root账号下做
ln -s /opt/software/mysql-8.0.32/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /opt/software/mysql-8.0.32/bin/mysql /usr/bin/mysql

chown -R mysql:mysql /usr/bin/mysqlbinlog
chown -R mysql:mysql /usr/bin/mysql

2、在主库上创建mha所用账号
CREATE USER 'mha'@'%' IDENTIFIED WITH mysql_native_password BY '****';
GRANT all privileges ON *.* TO mha@'%';

3、mysql三个节点及manager节点都部署node软件包及依赖包
yum install perl-DBD-MySQL -y
tar -zxvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make && make install

安装完成后脚本都在/usr/local/bin下,修改owner:
chown -R mysql:mysql /usr/local/bin/apply_diff_relay_logs
chown -R mysql:mysql /usr/local/bin/filter_mysqlbinlog
chown -R mysql:mysql /usr/local/bin/purge_relay_logs
chown -R mysql:mysql /usr/local/bin/save_binary_logs



4、manager 节点 软件包解压及依赖包安装
yum install -y  perl-DBD-MySQL perl-Config-Tiny  perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
tar -zxvf  mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58
perl Makefile.PL
make && make install

安装完成后脚本都在/usr/local/bin下面,修改owner:
chown -R mysql:mysql /usr/local/bin/masterha_*



5、创建相应文件
# 创建配置文件目录,manager节点上做
mkdir -p /etc/mha
 
# 创建日志目录,manager节点上做
mkdir -p /opt/mha
chown -R mysql.mysql /opt/mha
chmod -R 777 /opt/mha


6、编辑配置文件
cat > /etc/mha/app.cnf << DOF
[server default]
manager_log=/opt/mha/manager.log      	  #设置manager的日志 
manager_workdir=/opt/mha             	  #设置manager的工作目录 
master_binlog_dir=/opt/mysql              #设置主库端 保存binlog的位置,以便MHA可以找到master的日志,
user=mha                                  #设置监控用户mha
password=111111                             #设置mysql中mha用户的密码,
ping_interval=1                           #设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应的时候自动进行failover
repl_user=repl
repl_password=111111
ssh_user=mysql                             #设置ssh的登录用户名
remote_workdir=/tmp               #设置在远程服务器上的工作目录。这个工作目录主要用于 MHA Manager 在执行故障转移(failover)操作期间保存一些临时文件和脚本
#report_script=/usr/local/send_report     #设置发生切换后发送的报警的脚本,用自己的告警就行,这里不用配置
#shutdown_script=                         #设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机防止发生脑裂,这里没有使用)
master_ip_failover_script=/usr/local/bin/master_ip_failover     #故障发生后,vip自动切换脚本。如果使用keepalived来做vip,这里就不用配置
master_ip_online_change_script=/usr/local/bin/master_ip_online_change     #手动切主时用的脚本,该脚本会自动阻塞以及kill原master session,配置原master为只读,停掉VIP(获取旧主库的binlog位置,使用master_log_wait()函数追赶主库)。同步完成之后,获取新主库的binlog位置,生成change master语句准备用于其他从库切换到新主库。

[server1]
hostname=192.168.167.160
port=3306 
                                 
[server2]            
hostname=192.168.167.161
port=3306

[server3]
hostname=192.168.167.162
port=3306

candidate_master=1   #设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中事件最新的slave
check_repl_delay=0   #默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
EOF

chown -R mysql:mysql /etc/mha


7、VIP配置,可以使用网卡配置,也可以使用keepalived配置
# 方案一:网卡配置,如果是使用网卡配置vip,发生故障时 需要vip 切换脚本 ,脚本如下:
cat > /usr/local/bin/master_ip_failover << EOF
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
my $vip = '192.168.168.100/24';   # 这里替换成自己的VIP
my $key = "1";

# 这里根据自己的操作系统和网卡修改命令
my $ssh_start_vip = "sudo /sbin/ip addr add $vip dev ens33";
my $ssh_stop_vip = "sudo /sbin/ip addr del $vip dev ens33";

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}


sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master 
sub stop_vip() {
   `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}


sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
EOF


# 如果想能够手动切主,需要配置手动切换脚本,注意只有采用网卡配置VIP才能手动切换, 如果是keepalived配置vip则不能手动切主
cat > /usr/local/bin/master_ip_online_change << EOF
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user,      $master_vip,
  $master_vip_interface
);

GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
  'master_vip=s'             => \$master_vip,
  'master_vip_interface=s'   => \$master_vip_interface,
);

# 根据自己的操作系统修改这里的命令
my $ssh_start_vip = "/sbin/ifconfig $master_vip_interface $master_vip";
my $ssh_stop_vip = "/sbin/ifconfig $master_vip_interface down";
my $ssh_flush_arp = "/sbin/arping -q -U -c 3 -I bond0 $master_vip";

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();
      #print current_time_us() . " Drpping app user on the orig master..\n";
      #FIXME_xxx_drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

                print "Disabling the VIP - $master_vip on old master: $orig_master_host \n";
                &stop_vip();     
      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      #FIXME_xxx_create_app_user($new_master_handler);
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      print "Enabling the VIP - $master_vip on the new master - $new_master_host \n";
      &start_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

# A simple system call that enable the VIP on the new master 
sub start_vip() {
    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    $ssh_flush_arp =~ s/\/24//g;
    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_flush_arp \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
EOF


chown -R mysql:mysql /usr/local/bin/master_ip_online_change

此种通过网卡配置vip的方式不会有脑裂情况


# 方案二、keepalived 配置vip
keepalived安装参考我另一篇博客:https://www.cnblogs.com/sunjiwei/p/18413813
 
keepalived 配置文件如下:

mkdir -p /etc/keepalived
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
        notification_email {
                dba@99im.net
        }

        notification_email_from dba@account.99bill.com
        smtp_server account.99bill.com
        smtp_connect_timeout 30

        router_id jqma01  # keepalived name,master-slave same
        script_user root
        enable_script_security 
}

vrrp_instance VI_2 {
        state BACKUP
        interface ens192
        smtp_alert
        virtual_router_id 25       # keepalived ID,master-slave same
        priority 80               # keepalived weight
        advert_int 1

        authentication {
                auth_type PASS
                auth_pass b3q14meiupedso7h
        }

        unicast_src_ip 192.168.167.160
        unicast_peer {
                192.168.167.162
        }

        virtual_ipaddress {
                192.168.167.159      # VIP
        }

}

virtual_server 192.168.167.159 3306 {
        delay_loop 2
        lb_algo wrr
        lb_kind DR
        persistence_timeout 60
        protocol TCP
        real_server 192.168.167.160 3306 {
                weight 3
                notify_down "/etc/keepalived/shutdown_keepalived.sh"

                TCP_CHECK {
                        connect_timeout 3
                        connect_port 3306
                }
        }
}
EOF


vim /etc/keepalived/shutdown_keepalived.sh
#!/bin/sh    
pkill keepalived 


chmod 777 shutdown_keepalived.sh


8、手动先在主库上生成一个vip,如果是keepalive配置vip 就不用做这一步了

   1.添加vip,以下两条命令任选一个即可,一个是ifconfig,一个是ip addr add,两个命令作用一样
   /sbin/ifconfig bond0:1 172.21.175.64/16 done     #bond用这个命令
   /sbin/ip addr add 192.168.167.160/24 dev ens192   # 单网卡用这个命令,ens192要根据自己的网卡名修改,我的网卡是ens192
   
   # 去掉vip命令
   /sbin/ip addr del 192.168.225.116/24 dev ens192
   /sbin/ifconfig bond0:1 172.21.175.64/16 down
   
   2.查看添加的虚拟ip

    ip addr


至此,mha就部署好了,下面是启动和运维mha的命令

以下步骤要切换到mysql账户下执行    sudo su - mysql 	

1、互信检查
[mysql@mysql-ha-0003 ~]$ masterha_check_ssh  --conf=/etc/mha/app.cnf 

2、主从状态检查
[mysql@mysql-ha-0003 ~]$ masterha_check_repl  --conf=/etc/mha/app.cnf

3、开启mha
[mysql@mysql-ha-0003 ~]$ nohup masterha_manager --conf=/etc/mha/app.cnf --remove_dead_master_conf --ignore_last_failover  < /dev/null> /var/log/mha/manager.log 2>&1 &

4、停止mha
masterha_stop --conf=/etc/mha/app.cnf

5、查看mha状态
[mysql@mysql-ha-0003 ~]$ masterha_check_status --conf=/etc/mha/app.cnf


6、额外介绍
说明:
主库宕机谁来接管?
1. 所有从节点日志都是一致的,默认会以配置文件的顺序去选择一个新主。
2. 从节点日志不一致,自动选择最接近于主库的从库
3. 如果对于某节点设定了权重(candidate_master=1),权重节点会优先选择。
但是此节点日志量落后主库100M日志的话,也不会被选择。可以配合check_repl_delay=0,关闭日志量的检查,强制选择候选节点。

三、测试主库故障切换
1、自动故障切换
模拟主库宕机

然后查看mha日志可以看到切祝成功
----- Failover Report -----

app: MySQL Master failover 192.168.167.160(192.168.167.160:3306) to 192.168.167.162(192.168.167.162:3306) succeeded

Master 192.168.167.160(192.168.167.160:3306) is down!

Check MHA Manager logs at st1-ykf-s:/opt/mha/manager.log for details.

Started automated(non-interactive) failover.
Selected 192.168.167.162(192.168.167.162:3306) as a new master.
192.168.167.162(192.168.167.162:3306): OK: Applying all logs succeeded.
192.168.167.161(192.168.167.161:3306): OK: Slave started, replicating from 192.168.167.162(192.168.167.162:3306)
192.168.167.162(192.168.167.162:3306): Resetting slave info succeeded.
Master failover to 192.168.167.162(192.168.167.162:3306) completed successfully.



2、手动切主
masterha_master_switch --help
Usage:
    # For master failover

    masterha_master_switch --master_state=dead
    --global_conf=/etc/masterha_default.cnf
    --conf=/usr/local/masterha/conf/app1.cnf --dead_master_host=host1

    # For online master switch

    masterha_master_switch --master_state=alive
    --global_conf=/etc/masterha_default.cnf
    --conf=/usr/local/masterha/conf/app1.cnf

    See online reference
    (http://code.google.com/p/mysql-master-ha/wiki/masterha_master_switch)
    for details.


/usr/local/bin/masterha_master_switch --conf=/opt/odsappback/odsapp-m_62.67_3306.cnf --master_state=alive --new_master_host=172.21.62.67 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000

参数解释:

--orig_master_is_new_slave #将老主切换成新主的slave节点。如果不指定这个参数,老主将不加入新的集群环境。
--running_updates_limit=(seconds)  
If the current master executes write queries that take more than this parameter, 
or any of the MySQL slaves behind master more than this parameter, master switch aborts. By default, 
it's 1 (1 second).当老的主库当前写操作执行超过N秒,或者任何一台slave的 slaves behind master 超过N秒,切换都将终止。
--skip_lock_all_tables  # 切换过程中,老的主库将被施加 FLUSH TABLES WITH READ LOCK ,这是个昂贵的操作,如果你可以确定没有写操作,可以指定这个参数跳过加全局锁。
--master_state=dead  #强制参数. 可选有: "dead" or "alive". 如果设置为 alive,将执行 在线切主操作
--new_master_host=(hostname) # 可选参数。如果不指定,程序自动选主。


注意:Online master switch 只有当所有下列条件都满足时才会开始执行:

1. IO threads on all slaves are running   # 所有slave上IO线程运行。
2. SQL threads on all slaves are running  # 所有的slave上SQL线程正常运行。
3. Seconds_Behind_Master on all slaves are less or equal than --running_updates_limit seconds  // 在所有的slave上 Seconds_Behind_Master 要小于等于  running_updates_limit seconds
4. On master, none of update queries take more than --running_updates_limit seconds in the show processlist output  // 在主上,没有更新查询操作多于running_updates_limit seconds,可以使用show processlist查看。

posted @ 2024-09-14 16:20  有形无形  阅读(44)  评论(0编辑  收藏  举报