MySQL高可用方案MHA的部署和原理

 

MySQL高可用方案MHA的部署和原理

MHA概念简介

MHA(Master High Availability)是一套相对成熟的MySQL高可用方案,能做到在0~30s内自动完成数据库的故障切换操作,在master服务器不宕机的情况下,基本能保证数据的一致性。

 

它由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。其中,MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave上。MHA Node则运行在每个mysql节点上MHA Manager会定时探测集群中的master节点,当master出现故障时,它自动将最新数据的slave提升为master,然后将其它所有的slave指向新的master。

 

MHA自动故障切换过程中,MHA试图保存master的二进制日志,从而最大程度地保证数据不丢失,当这并不总是可行的,譬如,主服务器硬件故障或无法通过ssh访问,MHA就没法保存二进制日志,这样就只进行了故障转移但丢失了最新数据。可结合MySQL 5.5中推出的半同步复制来降低数据丢失的风险。

 

MHA软件组成:

Manager工具包和Node工具包,具体说明如下:

 

MHA Manager:

1. masterha_check_ssh:检查MHA的SSH配置状况

2. masterha_check_repl:检查MySQL的复制状况

3. masterha_manager:启动MHA

4. masterha_check_status:检测当前MHA运行状态

5. masterha_master_monitor:检测master是否宕机

6. masterha_master_switch:控制故障转移(自动或手动)

7. masterha_conf_host:添加或删除配置的server信息

8. masterha_stop:关闭MHA

 

MHA Node:

save_binary_logs:保存或复制master的二进制日志

apply_diff_relay_logs:识别差异的relay log并将差异的event应用到其它slave中

filter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具)

purge_relay_logs:消除中继日志(不会堵塞SQL线程)

 

另有如下几个脚本需自定义:

1. master_ip_failover:管理VIP

2. master_ip_online_change:

3. masterha_secondary_check:当MHA manager检测到master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。

4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。

 

集群信息:

角色

IP地址

ServerID

类型

Master

(主库)

192.168.120.10

1

写入

Candicate master(候选主库)

192.168.120.20

2

Slave

(从库)

192.168.120.30

3

Monitor host

(监控机)

192.168.120.30

 

监控集群组

此处由于环境限制,监控搭建在slave上。

版本信息:

 

 

操作系统:

Red Hat Enterprise Linux Server release 7.4 (Maipo)

软件版本:

MySQL: 5.7.31

MHA:0.56

安装MySQL

1、下载安装

1 登录MySQL官网下载mysql

下载地址 https://dev.mysql.com/downloads/mysql/  这里使用的是rpm包安装,下载对应的社区版rpm

不能只下载mysql-server服务器,因为有依赖关系,得下载下面五个包

 

 

2、查看系统是否已经安装了mysql

# rpm -aq | grep mysql

如果有旧版本就需要像下面卸载mariadb一样卸载掉。

Mariadbrhel系统已经安装好了的,需要卸载掉。

百度百科是这么说,Mariadb也被视为开源数据库MySQL的替代品,安装MySQL时可能有冲突,

查看Mariadb,代码

# rpm  -qa | grep mariadb

 

 

 

3、卸载Mariadb

用rpm -e 卸载,因为有有依赖关系,末尾加上 --nodeps不顾依赖关系强制卸载,反正也不用了(注意是两个英文输入法下的横杠),三个都要卸载

rpm -e mariadb-server.* --nodeps

rpm -e mariadb-libs.* --nodeps

rpm -e mariadb-* --nodeps

rpm -e mariadb-devel-* --nodeps

rpm -e mariadb-test-* --nodeps

rpm -e mariadb-bench-* --nodeps

rpm -e mariadb-embedded-* --nodeps

rpm -e mariadb-embedded-devel-* --nodeps

 

再次检查,确定清空

# rpm -qa |grep mariadb 

 

 

 

 

4、创建mysql文件夹

有些mysql安装教程,创建了用户和组,但是在现在的版本中,官方文档已经没有说明必须要创建mysql组和mysql用户,所以可以不用创建组和用户的,注意切换的root用户下,进入 /usr/local下。

# mkdir /usr/local/mysql

# cd /usr/local/mysql

 

 

5、上传文件到mysql目录

 

 

 

6、开始安装mysql数据库

注意了官方文档说,各个文件是有依赖性的,必须按以下顺序安装

 

安装顺序

rpm -ivh mysql-community-common-*

rpm -ivh mysql-community-libs-*

rpm -ivh mysql-community-client-*

rpm -ivh mysql-community-server-*

rpm -ivh mysql-community-devel-*

 

 

到此为止安装就算完了,但是还没有结束。

官方安装文档地址 https://dev.mysql.com/doc/refman/5.7/en/linux-installation-rpm.html

 

mysql安装完毕后,是不会自动启动的,但是第一次启动后,以后开机都是自启的,没有第一次启动之前,也是没有密码的。

当你第一次启动MySQL服务器的时候,mysql自己就会自己做很多事情,其中之一重要的就是初始密码,A superuseraccount 'root'@'localhost' is created. Apassword for the superuser is set and stored in the error log file. 当你安装后没有启动 MySQL之前, 你执行 vim /var/log/mysql.log  你会发现该文件是空的或是新文件,现在就来启动mysql

 

7、启动mysql服务

注意!第一次启动mysql为初始化,查看/etc/my.cnf文件

 

 

 

# ls /var/lib |grep mysql

查看是否存在/var/lib/mysql这个目录,如存在则删除,rhel7里自带数据库,因此可能会有残留。

# mv /var/lib/mysql /var/lib/mysql.bak

 

确认没有后执行

# systemctl start mysqld

 

查看一下mysql服务是否启动了

# systemctl status mysqld.service

 

出现以下就表示成功了

 

 

 

8、登录mysql

启动之后密码就初始化好了,查看密码

# tail -n 100 /var/log/mysqld.log|grep password

 

 

 

# mysql -u root -p 

回车就让你输入密码,输入密码的时候仔细点,密码比较乱,可以复制shift+insert粘贴

9、修改密码

为了安全性,此文档未修改密码复杂度。

mysql> set password for 'root'@'localhost'='newpasswd'   

newpasswd就是你设置的新密码,密码必须要符合要求,八位及以上,需要大小写、数字和特殊字符

例如:

mysql> set password for 'root'@'localhost'='P@ssw0rd';

Query OK, 0 rows affected (0.00 sec)

到此就算mysql数据库安装完成了,

 

安装MHA(需要外网)

一、在所有节点上安装MHA node

1.在MySQL服务器上安装MHA node所需的perl模块(DBD:mysql)

# yum install perl-DBD-MySQL -y

 

2.在所有的节点上安装mha node

下载地址为:

http://pan.baidu.com/s/1boS31vT

 

# tar xvf mha4mysql-node-0.56.tar.gz

# cd mha4mysql-node-0.56

# perl Makefile.PL  

 

Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 6.

BEGIN failed--compilation aborted at inc/Module/Install/Can.pm line 6.

Compilation failed in require at inc/Module/Install.pm line 283.

Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Makefile.pm line 4.

BEGIN failed--compilation aborted at inc/Module/Install/Makefile.pm line 4.

Compilation failed in require at inc/Module/Install.pm line 283.

Can't locate ExtUtils/MM_Unix.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Metadata.pm line 349.

 

通过报错可以看出,是相关依赖包没有安装。

 

# yum install perl-ExtUtils-MakeMaker -y

# perl Makefile.PL  

 

*** Module::AutoInstall version 1.03

*** Checking for Perl dependencies...

Can't locate CPAN.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/AutoInstall.pm line 277.

    

# yum install perl-CPAN -y

# perl Makefile.PL

 

 

*** Module::AutoInstall version 1.03

*** Checking for Perl dependencies...

[Core Features]

- DBI        ...loaded. (1.609)

- DBD::mysql ...loaded. (4.013)

*** Module::AutoInstall configuration finished.

Checking if your kit is complete...

Looks good

Writing Makefile for mha4mysql::node

 

# make

# make install

 

至此,MHA node节点安装完毕,会在/usr/local/bin下生成以下脚本文件

 

# ll /usr/local/bin/

total 44

-r-xr-xr-x 1 root root 16367 Jul 20 07:00 apply_diff_relay_logs

-r-xr-xr-x 1 root root  4807 Jul 20 07:00 filter_mysqlbinlog

-r-xr-xr-x 1 root root  8261 Jul 20 07:00 purge_relay_logs

-r-xr-xr-x 1 root root  7525 Jul 20 07:00 save_binary_logs

 

二、在Monitor host(node3)节点上部署MHA Manager

manager节点建议安装到单独的一台机器上或者一个不用切换为主的从库上

此处选择在node3上安装

1、安装管理程序

# tar xvf mha4mysql-manager-0.56.tar.gz

# cd mha4mysql-manager-0.56

# perl Makefile.PL

 

*** Module::AutoInstall version 1.03

*** Checking for Perl dependencies...

[Core Features]

- DBI                   ...loaded. (1.609)

- DBD::mysql            ...loaded. (4.013)

- Time::HiRes           ...missing.

- Config::Tiny          ...missing.

- Log::Dispatch         ...missing.

- Parallel::ForkManager ...missing.

- MHA::NodeConst        ...missing.

==> Auto-install the 5 mandatory module(s) from CPAN? [y] y

*** Dependencies will be installed the next time you type 'make'.

*** Module::AutoInstall configuration finished.

Checking if your kit is complete...

Looks good

Warning: prerequisite Config::Tiny 0 not found.

Warning: prerequisite Log::Dispatch 0 not found.

Warning: prerequisite MHA::NodeConst 0 not found.

Warning: prerequisite Parallel::ForkManager 0 not found.

Warning: prerequisite Time::HiRes 0 not found.

Writing Makefile for mha4mysql::manager

 

# make && make install

根据网速需要等待一会

执行此命令可能遇到CPAN配置问题,如出现Looking for CPAN mirrors near you (please be patient),则根据CPAN配置进行操作

2、CPAN配置(可选)

[root@node3 CPAN]# cpan

Sorry, we have to rerun the configuration dialog for CPAN.pm due to

some missing parameters. Configuration will be written to

 <</root/.cpan/CPAN/MyConfig.pm>>

 

 

CPAN.pm requires configuration, but most of it can be done automatically.

If you answer 'no' below, you will enter an interactive dialog for each

configuration option instead.

 

Would you like to configure as much as possible automatically? [yes] yes

 

Autoconfigured everything but 'urllist'.

 

Now you need to choose your CPAN mirror sites.  You can let me

pick mirrors for you, you can select them from a list or you

can enter them by hand.

 

Would you like me to automatically choose some CPAN mirror

sites for you? (This means connecting to the Internet) [yes] n

 

Would you like to pick from the CPAN mirror list? [yes] n

Now you can enter your own CPAN URLs by hand. A local CPAN mirror can be

listed using a 'file:' URL like 'file:///path/to/cpan/'

 

CPAN.pm needs at least one URL where it can fetch CPAN files from.

 

Please enter your CPAN site: [] https://mirrors.aliyun.com/CPAN/

Enter another URL or ENTER to quit: [] [按回车]

New urllist

  https://mirrors.aliyun.com/CPAN/

 

Autoconfiguration complete.

 

commit: wrote '/root/.cpan/CPAN/MyConfig.pm'

 

You can re-run configuration any time with 'o conf init' in the CPAN shell

Terminal does not support AddHistory.

 

cpan shell -- CPAN exploration and modules installation (v1.9800)

Enter 'h' for help.

 

cpan[1]> quit

Terminal does not support GetHistory.

Lockfile removed.

 

完成配置

 

CPAN问题解决后再次执行

# make && make install

根据网速需要等待一会

3、查看新增文件

执行完毕后,会在/usr/local/bin下新增以下几个文件

 

# ll /usr/local/bin/

total 40

-r-xr-xr-x 1 root root 1991 Jul 20 00:50 masterha_check_repl

-r-xr-xr-x 1 root root 1775 Jul 20 00:50 masterha_check_ssh

-r-xr-xr-x 1 root root 1861 Jul 20 00:50 masterha_check_status

-r-xr-xr-x 1 root root 3197 Jul 20 00:50 masterha_conf_host

-r-xr-xr-x 1 root root 2513 Jul 20 00:50 masterha_manager

-r-xr-xr-x 1 root root 2161 Jul 20 00:50 masterha_master_monitor

-r-xr-xr-x 1 root root 2369 Jul 20 00:50 masterha_master_switch

-r-xr-xr-x 1 root root 5167 Jul 20 00:50 masterha_secondary_check

-r-xr-xr-x 1 root root 1735 Jul 20 00:50 masterha_stop

 

 

三、配置SSH登录无密码验证

1. 在manager上配置到所有Node节点的无密码验证

# ssh-keygen

一路按“Enter”

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node3

 

2. 在Master上配置

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node3

 

3. 在Candicate master上配置     

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node3

 

4. 在Slave上配置     

# ssh-keygen

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1

# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2

 

四、搭建主从从复制环境

1、修改主数据库配置文件

# cat /etc/my.cnf

[mysqld]

server-id=1

log-bin=binlog

binlog-ignore-db=mysql,information_schema,performance_schema

 

# systemctl restart mysqld

 

# mysql -u root -p

Enter password:

 

2. 在Master上创建复制用户

Master数据库中创建主从复制帐号(授权给从数据库服务器)

账号随意,此处sunyard只是测试

mysql> GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'sunyard'@'%'  IDENTIFIED BY 'Sunyard_23';

Query OK, 0 rows affected (0.09 sec)

mysql> SELECT Host,User FROM mysql.user;

+---------------+---------------+

| Host          | User          |

+---------------+---------------+

| 192.168.120.% | sunyard       |

| localhost     | mysql.session |

| localhost     | mysql.sys     |

| localhost     | root          |

+---------------+---------------+

4 rows in set (0.00 sec)

 

mysql> FLUSH PRIVILEGES;

Query OK, 0 rows affected (0.00 sec)

 

查询主数据库状态 记录下返回结果的File列和Position列的值

mysql> SHOW MASTER STATUS\G

*************************** 1. row ***************************

             File: binlog.000001

         Position: 154

     Binlog_Do_DB:

 Binlog_Ignore_DB: mysql,information_schema,performance_schema

Executed_Gtid_Set:

1 row in set (0.00 sec)

mysql> quit

Bye

 

用户sunyard,密码Sunyard_23

3、在Candicate master上搭建从库

# cat /etc/my.cnf

[mysqld]

server_id=2

replicate-ignore-db=mysql,information_schema,performance_schema

binlog-ignore-db=mysql,information_schema,performance_schema

 

# systemctl restart mysqld

 

在Candicate master数据库中设置主数据库信息,

mysql>change master to master_host="192.168.120.10",

master_port=3306,

master_user='sunyard',

master_password='Sunyard_23',

master_log_file=' binlog.000001',

master_log_pos=154,

master_connect_retry=30;

mysql>CHANGE MASTER TO MASTER_LOG_FILE='binlog.000001',MASTER_LOG_POS=154;

 

Candicate master数据库中查看主从同步状态

mysql> SHOW SLAVE STATUS\G

   Slave_IO_Running: No

   Slave_SQL_Running:NO

 Candicate master数据库中开启主从同步

mysql> START SLAVE;

 

再次查看Candicate master数据库主从同步状态

mysql> SHOW SLAVE STATUS\G

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

 

5. 在Slave上搭建从库

先查看master数据库状态

# mysql -u root -p

mysql> SHOW MASTER STATUS\G

*************************** 1. row ***************************

         File: binlog.000001

         Position: 154

 

再到Slave服务器上配置

# vim /etc/my.cnf

[mysqld]

server_id=3

log-bin=binlog

replicate-ignore-db=mysql,information_schema,performance_schema

binlog-ignore-db=mysql,information_schema,performance_schema

 

# mysql -u root -p

mysql> change master to master_host="192.168.120.10",

master_port=3306,

master_user='sunyard',

master_password='Sunyard_23',

master_log_file=' binlog.000001',

master_log_pos=154,

master_connect_retry=30;

mysql> CHANGE MASTER TO MASTER_LOG_FILE='binlog.000001',MASTER_LOG_POS=154;

mysql> START SLAVE;

再次查看Slave数据库主从同步状态

mysql> SHOW SLAVE STATUS\G

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

 

6. slave服务器设置为read only

mysql> set global read_only=1;

Query OK, 0 rows affected (0.04 sec)

7. 在Master中创建监控用户

mysql> grant all privileges on *.* to 'monitor'@'%' identified by 'Monitor_23';

Query OK, 0 rows affected (0.07 sec)

mysql> FLUSH PRIVILEGES;

 

注意,如果修改监控用户密码,其他库上的也得修改否则MHA无法连接

五、配置MHA

1、在Monitor host(192.168.120.30)上创建MHA工作目录,并且创建相关配置文件

# mkdir -p /etc/masterha

 

# vim /etc/masterha/app1.cnf

[server default]

manager_log=/masterha/app1/manager.log   //设置manager的日志

manager_workdir=/masterha/app1           //设置manager的工作目录

master_binlog_dir=/var/lib/mysql         //设置master默认保存binlog的位置,以便MHA可以找到master的日志

 

 

master_ip_failover_script= /usr/local/bin/master_ip_failover    //设置自动failover时候的切换脚本

master_ip_online_change_script= /usr/local/bin/master_ip_online_change  //设置手动切换时候的切换脚本

 

 

user=monitor               // 设置监控用户

password=Monitor_23        //设置监控用户的密码

ping_interval=1            //设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应的时候进行自动failover

remote_workdir=/tmp        //设置远端mysql在发生切换时binlog的保存位置

repl_user=sunyard          //设置复制环境中的复制用户名

repl_password=Sunyard_23   //设置复制用户的密码

 

 

report_script=/usr/local/bin/send_report    //设置发生切换后发送的报警的脚本

 

secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.120.20 -s 192.168.120.30 --user=root --master_host=192.168.120.10 --master_ip=192.168.120.10 --master_port=3306

#一旦MHA到master的监控之间出现问题,MHA Manager将会判断其它两个slave是否能建立到master_ip 3306端口的连接

 

shutdown_script=""      //设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机防止发生脑裂)

 

ssh_user=root           //设置ssh的登录用户名

 

[server1]

hostname=192.168.120.10

port=3306

 

[server2]

hostname=192.168.120.20

port=3306

candidate_master=1   //设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中最新的slave

 

check_repl_delay=0   //默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为它保证了这个候选主在切换过程中一定是最新的master

 

[server3]

hostname=192.168.120.30

port=3306

 

无注释配置

[server default]

manager_log=/masterha/app1/manager.log

manager_workdir=/masterha/app1

master_binlog_dir=/var/lib/mysql

master_ip_failover_script= /usr/local/bin/master_ip_failover

master_ip_online_change_script= /usr/local/bin/master_ip_online_change

user=monitor

password=Monitor_23

ping_interval=1

remote_workdir=/tmp

repl_user=sunyard

repl_password=Sunyard_23

report_script=/usr/local/bin/send_report

 

secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.120.20 -s 192.168.120.30 --user=root --master_host=192.168.120.10 --master_ip=192.168.120.10 --master_port=3306

shutdown_script=""

 

ssh_user=root

 

[server1]

hostname=192.168.120.10

port=3306

 

[server2]

hostname=192.168.120.20

port=3306

candidate_master=1

check_repl_delay=0

 

[server3]

hostname=192.168.120.30

port=3306

注意:

      1> 在编辑该文件时,后面的注释切记要去掉,MHA并不会将后面的内容识别为注释。

 

      2> 配置文件中设置了master_ip_failover_script,secondary_check_script,master_ip_online_change_script,report_script,对应的文件见文章末尾。

 

2、设置relay log清除方式

在每个Slave上,由于采用主从从架构,即Candicate master和Slave库都要设置

 

mysql> set global relay_log_purge=0;

 

      MHA在发生切换过程中,从库在恢复的过程中,依赖于relay log的相关信息,所以我们这里要将relay log的自动清楚设置为OFF,采用手动清楚relay log的方式。

 

      在默认情况下,从服务器上的中继日志会在SQL线程执行完后被自动删除。但是在MHA环境中,这些中继日志在恢复其它从服务器时可能会被用到,因此需要禁用中继日志的自动清除。改为定期手动清除SQL线程应用完的中继日志。

 

      ext3文件系统下,删除大的文件需要一定的时间,这样会导致严重的复制延迟,所以在Linux中,一般都是通过硬链接的方式来删除大文件。

 

3、设置定期清理relay脚本

      MHA节点中包含了purge_relay_logs脚本,它可以为relay log创建硬链接,执行set global relay_log_purge=1,等待几秒钟以便SQL线程切换到新的中继日志,再执行set global relay_log_purge=0。

 

下面看看脚本的使用方法:

 

# purge_relay_logs --user=monitor --password=Monitor_23 --disable_relay_log_purge --workdir=/tmp/

 

2020-10-16 16:27:15: purge_relay_logs script started.

 relay_log_purge is enabled. Disabling..

 Found relay_log.info: /var/lib/mysql/relay-log.info

 Opening /var/lib/mysql/node3-relay-bin.000001 ..

 Opening /var/lib/mysql/node3-relay-bin.000002 ..

 Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.

2020-10-16 16:27:18: All relay log purging operations succeeded.

 

 

 

 

其中:

--user:mysql用户名

--password:mysql用户的密码

--host: mysqlserver地址

 

--workdir:指定创建relay log的硬链接的位置,默认的是/var/tmp。由于系统不同分区创建硬链接文件会失败,故需要指定具体的硬链接的位置。

 

--disable_relay_log_purge:默认情况下,如果relay_log_purge=1,则脚本会直接退出。通过设置这个参数,该脚本会首先将relay_log_purge设置为1,清除掉relay log后,再将该参数设置为0。

 

设置crontab来定期清理relay log

 

        MHA在切换的过程中会直接调用mysqlbinlog命令,故需要在环境变量中指定mysqlbinlog的具体路径。

 

# vim /etc/cron.d/purge_relay_logs

0 4 * * * /usr/local/bin/purge_relay_logs --user=monitor --password=Monitor_23 -disable_relay_log_purge --workdir=/tmp/ >> /tmp/purge_relay_logs.log 2>&1

       

注意:最好是每台slave服务器在不同时间点执行该计划任务。

 

4、将mysqlbinlog的路径添加到环境变量中

# vim .bash_profile

export binlog_dir=/var/lib/mysql/binlog

六、检查SSH的配置

Monitor host上执行

# masterha_check_ssh --conf=/etc/masterha/app1.cnf

 

Fri Oct 16 10:58:29 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Oct 16 10:58:29 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Fri Oct 16 10:58:29 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Fri Oct 16 10:58:29 2020 - [info] Starting SSH connection tests..

Fri Oct 16 10:58:31 2020 - [debug]

Fri Oct 16 10:58:29 2020 - [debug]  Connecting via SSH from root@192.168.120.10(192.168.120.10:22) to root@192.168.120.20(192.168.120.20:22)..

Fri Oct 16 10:58:30 2020 - [debug]   ok.

Fri Oct 16 10:58:30 2020 - [debug]  Connecting via SSH from root@192.168.120.10(192.168.120.10:22) to root@192.168.120.30(192.168.120.30:22)..

Fri Oct 16 10:58:30 2020 - [debug]   ok.

Fri Oct 16 10:58:32 2020 - [debug]

Fri Oct 16 10:58:30 2020 - [debug]  Connecting via SSH from root@192.168.120.20(192.168.120.20:22) to root@192.168.120.10(192.168.120.10:22)..

Fri Oct 16 10:58:31 2020 - [debug]   ok.

Fri Oct 16 10:58:31 2020 - [debug]  Connecting via SSH from root@192.168.120.20(192.168.120.20:22) to root@192.168.120.30(192.168.120.30:22)..

Fri Oct 16 10:58:31 2020 - [debug]   ok.

Fri Oct 16 10:58:32 2020 - [debug]

Fri Oct 16 10:58:30 2020 - [debug]  Connecting via SSH from root@192.168.120.30(192.168.120.30:22) to root@192.168.120.10(192.168.120.10:22)..

Fri Oct 16 10:58:31 2020 - [debug]   ok.

Fri Oct 16 10:58:31 2020 - [debug]  Connecting via SSH from root@192.168.120.30(192.168.120.30:22) to root@192.168.120.20(192.168.120.20:22)..

Fri Oct 16 10:58:32 2020 - [debug]   ok.

Fri Oct 16 10:58:32 2020 - [info] All SSH connection tests passed successfully.

Use of uninitialized value in exit at /usr/local/bin/masterha_check_ssh line 44.

 

七、查看整个集群的状态

1、在Monitor host上执行

# masterha_check_repl --conf=/etc/masterha/app1.cnf

Fri Oct 16 11:00:37 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Oct 16 11:00:37 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Fri Oct 16 11:00:37 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Fri Oct 16 11:00:37 2020 - [info] MHA::MasterMonitor version 0.56.

Creating directory /masterha/app1.. done.

Fri Oct 16 11:00:37 2020 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.

Compilation failed in require at (eval 172) line 3.

 

 at /usr/local/share/perl5/MHA/DBHelper.pm line 205.

 at /usr/local/share/perl5/MHA/Server.pm line 166.

Fri Oct 16 11:00:37 2020 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.

Compilation failed in require at (eval 172) line 3.

 

 at /usr/local/share/perl5/MHA/DBHelper.pm line 205.

 at /usr/local/share/perl5/MHA/Server.pm line 166.

Fri Oct 16 11:00:37 2020 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.

Compilation failed in require at (eval 172) line 3.

 

 at /usr/local/share/perl5/MHA/DBHelper.pm line 205.

 at /usr/local/share/perl5/MHA/Server.pm line 166.

Fri Oct 16 11:00:38 2020 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations

Fri Oct 16 11:00:38 2020 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/local/share/perl5/MHA/MasterMonitor.pm line 326.

Fri Oct 16 11:00:38 2020 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.

Fri Oct 16 11:00:38 2020 - [info] Got exit code 1 (Not master dead).

 

MySQL Replication Health is NOT OK!

 

Slave没有启动log-bin,如果没有启动的话,后续就无法提升为主

 

2、设置log-bin后,重新执行

查看是否开启log-bin

# mysql -u root -p

mysql> show variables like '%log_bin%';

+---------------------------------+-------+

| Variable_name                   | Value |

+---------------------------------+-------+

| log_bin                         | OFF   |

| log_bin_basename                |       |

| log_bin_index                   |       |

| log_bin_trust_function_creators | OFF   |

| log_bin_use_v1_row_events       | OFF   |

| sql_log_bin                     | ON    |

+---------------------------------+-------+

6 rows in set (0.05 sec)

slave显示的是off,刚刚忘了,文档已补上

# vim /etc/my.cnf

重启mysql,然后再去检查一下。若为ON,则配置OK。

# systemctl restart mysqld

# masterha_check_repl --conf=/etc/masterha/app1.cnf

还是有报错

 

 

 

 # cpan

cpan[1]> force install GD

cpan[2]> quit

# cpan -D DBI

# cpan DBD::mysql

# masterha_check_repl --conf=/etc/masterha/app1.cnf

 

 

上述问题已解决,新的问题是检测到多主机配置,但两个或多个主机可写(只读)未设置或已死!经测试,双主机模式有问题,改为主从从,一个为备选主机

 

 

 

Candicate master设为readonly

mysql> set global read_only=1;

 

这里遇到了很多BUG,如果提示缺少master_ip_failover_script 脚本,看最下面的脚本

 

# masterha_check_repl --conf=/etc/masterha/app1.cnf

 

 

 

检查通过

 

八、检查MHA Manager的状态

# masterha_check_status --conf=/etc/masterha/app1.cnf

app1 is stopped(2:NOT_RUNNING).  

 

 

 

        如果正常,会显示“PING_OK”,否则会显示“NOT_RUNNING”,代表MHA监控还没有开启。

 

九、开启MHA Manager监控

# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 &

 

 

其中

remove_dead_master_conf:该参数代表当发生主从切换后,老的主库的IP将会从配置文件中移除。经测试如果移除后再恢复了会找不到库,需要重新往/etc/masterha/app1.cnf里写入库地址。

ignore_last_failover:在默认情况下,MHA发生切换后将会在/masterha/app1下产生app1.failover.complete文件,下次再次切换的时候如果发现该目录下存在该文件且两次切换的时间间隔不足8小时的话,将不允许触发切换。除非在第一次切换后手动rm -rf /masterha/app1/app1.failover.complete。该参数代表忽略上次MHA触发切换产生的文件。

 

开启后查看状态,有一定的初始化时间

# masterha_check_status --conf=/etc/masterha/app1.cnf

app1 (pid:56057) is running(0:PING_OK), master:192.168.120.10

 

 

 

 

 

十、关闭MHA Manager监控

 

# masterha_stop --conf=/etc/masterha/app1.cnf

Stopped app1 successfully.

[1]+  Exit 1                  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1

至此,MHA部分配置完毕,下面,来配置VIP。

 

十一、VIP配置

 

VIP配置可以采用两种方式,一是通过引入Keepalived来管理VIP,另一种是在脚本中手动管理。

 

对于keepalived管理VIP,存在脑裂情况,即当主从网络出现问题时,slave会抢占VIP,这样会导致主从数据库都持有VIP,造成IP冲突,所以在网络不是很好的情况下,不建议采用keepalived服务。

 

在实际生产中使用较多的也是第二种,即在脚本中手动管理VIP,

 

经测试,脚本vip不需要添加心跳线,通过link的方式添加转移vip,主机自动挂载vip地址,当主机故障切换后,将由新的主机继承VIP地址。

 

1. keepalived管理VIP,仅当了解

 

1> 安装keepalived

 

    因为我这里设置了Candicate master,故只在Master和Candicate master上安装。

 

    如果没有Candicate master,两个Slave的地位平等,则两个Slave上都需安装keepalived。

 

    # wget http://www.keepalived.org/software/keepalived-1.2.24.tar.gz

 

    # tar xvf keepalived-1.2.24.tar.gz

 

    # cd keepalived-1.2.24

 

    # ./configure --prefix=/usr/local/keepalived

 

    # make

 

    # make install

 

    # cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/

 

    # cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

 

    # mkdir /etc/keepalived

 

    # cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/

 

    # cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

 

2> 为keepalived设置单独的日志文件(非必需)

keepalived的日志默认是输出到/var/log/message中

# vim /etc/sysconfig/keepalived

KEEPALIVED_OPTIONS="-D -d -S 0"

设置syslog

# vim /etc/rsyslog.conf

添加如下内容:

local0.*           /var/log/keepalived.log

 

# service rsyslog restart

 

3> 配置keepalived

Master上修改

# vim /etc/keepalived/keepalived.conf

global_defs {

   notification_email {

     slowtech@qq.com

   }

   notification_email_from root@localhost.localdomain

   smtp_server 127.0.0.1

   smtp_connect_timeout 30

   router_id MySQL-HA

}

 

vrrp_instance VI_1 {

    state BACKUP

    interface ens33

    virtual_router_id 51

    priority 150

    advert_int 1

    nopreempt

    authentication {

        auth_type PASS

        auth_pass 1111

    }

    virtual_ipaddress {

        192.168.120.159/24

    }

}

 

关于keepalived的参数的详细介绍,可参考:

    LVS+Keepalived搭建MyCAT高可用负载均衡集群

    keepalived工作原理和配置说明

 

将配置文件scp到Candicate master上

# scp /etc/keepalived/keepalived.conf 192.168.120.20:/etc/keepalived/

 

只需将配置文件中的priority设置为90

 

注意:我们为什么在这里设置keepalived为backup模式呢?

    master-backup模式下,如果主库宕掉,VIP会自动漂移到Slave上,当主库修复,keepalived启动后,还会将VIP抢过来,即使设置了nopreempt(不抢占)的方式,该动作仍会发生。但在backup-backup模式下,当主库修改,并启动keepalived后,并不会抢占新主的VIP,即便原主的priority高于新主的。

 

4> 启动keepalived

先在Master上启动

# service keepalived start

env: /etc/init.d/keepalived: Permission denied

 

# chmod +x /etc/init.d/keepalived

# service keepalived start

 

查看绑定情况

# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:c6:47:04 brd ff:ff:ff:ff:ff:ff

    inet 192.168.120.10/24 brd 192.168.120.255 scope global ens33

    inet 192.168. 152.159/24 scope global secondary ens33

    inet6 fe80::20c:29ff:fec6:4704/64 scope link

       valid_lft forever preferred_lft forever     

 

可见,VIP(192168.120.159)已经绑定到Master的ens33网卡上了。

 

启动Candicate master的keepalived

# service keepalived start

 

5> MHA中引入keepalived

编辑/usr/local/bin/master_ip_failover

相对于原文件,修改地方为93-95行

 

  1 #!/usr/bin/env perl

  2

  3 #  Copyright (C) 2011 DeNA Co.,Ltd.

  4 #

  5 #  This program is free software; you can redistribute it and/or modify

  6 #  it under the terms of the GNU General Public License as published by

  7 #  the Free Software Foundation; either version 2 of the License, or

  8 #  (at your option) any later version.

  9 #

 10 #  This program is distributed in the hope that it will be useful,

 11 #  but WITHOUT ANY WARRANTY; without even the implied warranty of

 12 #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

 13 #  GNU General Public License for more details.

 14 #

 15 #  You should have received a copy of the GNU General Public License

 16 #   along with this program; if not, write to the Free Software

 17 #  Foundation, Inc.,

 18 #  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 19

 20 ## Note: This is a sample script and is not complete. Modify the script based on your environment.

 21

 22 use strict;

 23 use warnings FATAL => 'all';

 24

 25 use Getopt::Long;

 26 use MHA::DBHelper;

 27 my (

 28   $command,        $ssh_user,         $orig_master_host,

 29   $orig_master_ip, $orig_master_port, $new_master_host,

 30   $new_master_ip,  $new_master_port,  $new_master_user,

 31   $new_master_password

 32 );

 33

 34 GetOptions(

 35   'command=s'             => \$command,

 36   'ssh_user=s'            => \$ssh_user,

 37   'orig_master_host=s'    => \$orig_master_host,

 38   'orig_master_ip=s'      => \$orig_master_ip,

 39   'orig_master_port=i'    => \$orig_master_port,

 40   'new_master_host=s'     => \$new_master_host,

 41   'new_master_ip=s'       => \$new_master_ip,

 42   'new_master_port=i'     => \$new_master_port,

 43   'new_master_user=s'     => \$new_master_user,

 44   'new_master_password=s' => \$new_master_password,

 45 );

 46

 47 exit &main();

 48

 49 sub main {

 50   if ( $command eq "stop" || $command eq "stopssh" ) {

 51

 52     # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

 53     # If you manage master ip address at global catalog database,

 54     # invalidate orig_master_ip here.

 55     my $exit_code = 1;

 56     eval {

 57

 58       # updating global catalog, etc

 59       $exit_code = 0;

 60     };

 61     if ($@) {

 62       warn "Got Error: $@\n";

 63       exit $exit_code;

 64     }

 65     exit $exit_code;

 66   }

 67   elsif ( $command eq "start" ) {

 68

 69     # all arguments are passed.

 70     # If you manage master ip address at global catalog database,

 71     # activate new_master_ip here.

 72     # You can also grant write access (create user, set read_only=0, etc) here.

 73     my $exit_code = 10;

 74     eval {

 75       my $new_master_handler = new MHA::DBHelper();

 76

 77       # args: hostname, port, user, password, raise_error_or_not

 78       $new_master_handler->connect( $new_master_ip, $new_master_port,

 79         $new_master_user, $new_master_password, 1 );

 80

 81       ## Set read_only=0 on the new master

 82       $new_master_handler->disable_log_bin_local();

 83       print "Set read_only=0 on the new master.\n";

 84       $new_master_handler->disable_read_only();

 85

 86       ## Creating an app user on the new master

 87       #print "Creating app user on the new master..\n";

 88       #FIXME_xxx_create_user( $new_master_handler->{dbh} );

 89       $new_master_handler->enable_log_bin_local();

 90       $new_master_handler->disconnect();

 91

 92       ## Update master ip on the catalog database, etc

 93       my $cmd;

 94       $cmd = 'ssh '.$ssh_user.'@'.$orig_master_ip.' service keepalived stop';

 95       system($cmd);

 96       $exit_code = 0;

 97     };

 98     if ($@) {

 99       warn $@;

100

101       # If you want to continue failover, exit 10.

102       exit $exit_code;

103     }

104     exit $exit_code;

105   }

106   elsif ( $command eq "status" ) {

107

108     # do nothing

109     exit 0;

110   }

111   else {

112     &usage();

113     exit 1;

114   }

115 }

116

117 sub usage {

118   print

119 "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

120 }

 

 

2. 通过脚本的方式管理VIP

此脚本不明原因报错,已解决,是网卡名没写对,要使用当前系统的网卡。

如:

rhel6默认网卡是eth0

rhel7默认网卡是ens33

 

# vim /usr/local/bin/master_ip_failover

 

#!/usr/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

use MHA::DBHelper;

my (

  $command,        $ssh_user,         $orig_master_host,

  $orig_master_ip, $orig_master_port, $new_master_host,

  $new_master_ip,  $new_master_port,  $new_master_user,

  $new_master_password

);

 

my $vip = '192.168.120.159';

my $key = "2";

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip/24";

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

my $ssh_send_garp = "/sbin/arping -U $vip -I ens33 -c 1";

 

 

GetOptions(

  'command=s'             => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'new_master_host=s'     => \$new_master_host,

  'new_master_ip=s'       => \$new_master_ip,

  'new_master_port=i'     => \$new_master_port,

  'new_master_user=s'     => \$new_master_user,

  'new_master_password=s' => \$new_master_password,

);

 

exit &main();

 

sub main {

  if ( $command eq "stop" || $command eq "stopssh" ) {

 

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

    # If you manage master ip address at global catalog database,

    # invalidate orig_master_ip here.

    my $exit_code = 1;

    eval {

      print "Disabling the VIP an old master: $orig_master_host \n";

      &stop_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

 

    # all arguments are passed.

    # If you manage master ip address at global catalog database,

    # activate new_master_ip here.

    # You can also grant write access (create user, set read_only=0, etc) here.

    my $exit_code = 10;

    eval {

      

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

 

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print "Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

 

      ## Creating an app user on the new master

      # print "Creating app user on the new master..\n";

      # FIXME_xxx_create_user( $new_master_handler->{dbh} );

      $new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

 

      print "Enabling the VIP $vip on the new master: $new_master_host \n";

      &start_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn $@;

 

      # If you want to continue failover, exit 10.

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

 

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

sub start_vip(){

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

    `ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`;

}

 

sub stop_vip(){

    return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

sub usage {

  print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

3、使用官方sample脚本master_ip_failover_script(二者取一即可)

# cat /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

use MHA::DBHelper;

 

my (

  $command,        $ssh_user,         $orig_master_host,

  $orig_master_ip, $orig_master_port, $new_master_host,

  $new_master_ip,  $new_master_port,  $new_master_user,

  $new_master_password

);

 

my $vip = '192.168.120.159/24';

my $key = '88';

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

GetOptions(

  'command=s'             => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'new_master_host=s'     => \$new_master_host,

  'new_master_ip=s'       => \$new_master_ip,

  'new_master_port=i'     => \$new_master_port,

  'new_master_user=s'     => \$new_master_user,

  'new_master_password=s' => \$new_master_password,

);

 

exit &main();

 

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;

        eval {

            print "Disabling the VIP on old master: $orig_master_host \n";

            &stop_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn "Got Error: $@\n";

            exit $exit_code;

        }

        exit $exit_code;

    }

    elsif ( $command eq "start" ) {

        my $exit_code = 10;

        eval {

            print "Enabling the VIP - $vip on the new master - $new_master_host \n";

            &start_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn $@;

            exit $exit_code;

        }

        exit $exit_code;

    }

    elsif ( $command eq "status" ) {

        print "Checking the Status of the script.. OK \n";

        exit 0;

    }

    else {

        &usage();

        exit 1;

    }

}

sub start_vip() {

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

     return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

  print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

 

 

 

 

实际生产环境中,推荐这种方式来管理VIP,可有效防止脑裂情况的发生。

 

十二、额外脚本

脚本存放路径:/usr/local/bin

1. master_ip_failover:管理VIP

2. master_ip_online_change:

3. masterha_secondary_check:当MHA manager检测到master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。

4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。

 

master_ip_online_change

# vim /usr/local/bin/master_ip_online_change

 

#!/usr/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

use MHA::DBHelper;

use MHA::NodeUtil;

use Time::HiRes qw( sleep gettimeofday tv_interval );

use Data::Dumper;

 

my $_tstart;

my $_running_interval = 0.1;

 

my $vip = '192.168.120.159';

my $key = "2";

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip/24";

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

my $ssh_send_garp = "/sbin/arping -U $vip -I ens33 -c 1";

 

my (

  $command,              $orig_master_is_new_slave, $orig_master_host,

  $orig_master_ip,       $orig_master_port,         $orig_master_user,

  $orig_master_password, $orig_master_ssh_user,     $new_master_host,

  $new_master_ip,        $new_master_port,          $new_master_user,

  $new_master_password,  $new_master_ssh_user,

);

GetOptions(

  'command=s'                => \$command,

  'orig_master_is_new_slave' => \$orig_master_is_new_slave,

  'orig_master_host=s'       => \$orig_master_host,

  'orig_master_ip=s'         => \$orig_master_ip,

  'orig_master_port=i'       => \$orig_master_port,

  'orig_master_user=s'       => \$orig_master_user,

  'orig_master_password=s'   => \$orig_master_password,

  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,

  'new_master_host=s'        => \$new_master_host,

  'new_master_ip=s'          => \$new_master_ip,

  'new_master_port=i'        => \$new_master_port,

  'new_master_user=s'        => \$new_master_user,

  'new_master_password=s'    => \$new_master_password,

  'new_master_ssh_user=s'    => \$new_master_ssh_user,

);

 

exit &main();

 

sub start_vip(){

    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;

    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`;

}

 

sub stop_vip(){

    `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

 

sub current_time_us {

  my ( $sec, $microsec ) = gettimeofday();

  my $curdate = localtime($sec);

  return $curdate . " " . sprintf( "%06d", $microsec );

}

 

sub sleep_until {

  my $elapsed = tv_interval($_tstart);

  if ( $_running_interval > $elapsed ) {

    sleep( $_running_interval - $elapsed );

  }

}

 

sub get_threads_util {

  my $dbh                    = shift;

  my $my_connection_id       = shift;

  my $running_time_threshold = shift;

  my $type                   = shift;

  $running_time_threshold = 0 unless ($running_time_threshold);

  $type                   = 0 unless ($type);

  my @threads;

 

  my $sth = $dbh->prepare("SHOW PROCESSLIST");

  $sth->execute();

 

  while ( my $ref = $sth->fetchrow_hashref() ) {

    my $id         = $ref->{Id};

    my $user       = $ref->{User};

    my $host       = $ref->{Host};

    my $command    = $ref->{Command};

    my $state      = $ref->{State};

    my $query_time = $ref->{Time};

    my $info       = $ref->{Info};

    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);

    next if ( $my_connection_id == $id );

    next if ( defined($query_time) && $query_time < $running_time_threshold );

    next if ( defined($command)    && $command eq "Binlog Dump" );

    next if ( defined($user)       && $user eq "system user" );

    next

      if ( defined($command)

      && $command eq "Sleep"

      && defined($query_time)

      && $query_time >= 1 );

 

    if ( $type >= 1 ) {

      next if ( defined($command) && $command eq "Sleep" );

      next if ( defined($command) && $command eq "Connect" );

    }

 

    if ( $type >= 2 ) {

      next if ( defined($info) && $info =~ m/^select/i );

      next if ( defined($info) && $info =~ m/^show/i );

    }

 

    push @threads, $ref;

  }

  return @threads;

}

 

sub main {

  if ( $command eq "stop" ) {

    ## Gracefully killing connections on the current master

    # 1. Set read_only= 1 on the new master

    # 2. DROP USER so that no app user can establish new connections

    # 3. Set read_only= 1 on the current master

    # 4. Kill current queries

    # * Any database access failure will result in script die.

    my $exit_code = 1;

    eval {

      ## Setting read_only=1 on the new master (to avoid accident)

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      print current_time_us() . " Set read_only on the new master.. ";

      $new_master_handler->enable_read_only();

      if ( $new_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

      $new_master_handler->disconnect();

 

      # Connecting to the orig master, die if any database error happens

      my $orig_master_handler = new MHA::DBHelper();

      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,

        $orig_master_user, $orig_master_password, 1 );

 

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand

      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . " Drpping app user on the orig master..\n";

      #drop_app_user($orig_master_handler);

 

      ## Waiting for N * 100 milliseconds so that current connections can exit

      my $time_until_read_only = 15;

      $_tstart = [gettimeofday];

      my @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_read_only > 0 && $#threads >= 0 ) {

        if ( $time_until_read_only % 5 == 0 ) {

          printf

"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_read_only * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_read_only--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write

      print current_time_us() . " Set read_only=1 on the orig master.. ";

      $orig_master_handler->enable_read_only();

      if ( $orig_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

 

      ## Waiting for M * 100 milliseconds so that current update queries can complete

      my $time_until_kill_threads = 5;

      @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {

        if ( $time_until_kill_threads % 5 == 0 ) {

          printf

"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_kill_threads--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      ## Terminating all threads

      print current_time_us() . " Killing all application threads..\n";

      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );

      print current_time_us() . " done.\n";

      $orig_master_handler->enable_log_bin_local();

      $orig_master_handler->disconnect();

 

      ## Droping the VIP     

      print "Disabling the VIP an old master: $orig_master_host \n";

      &stop_vip();

 

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

    ## Activating master ip on the new master

    # 1. Create app user with write privileges

    # 2. Moving backup script if needed

    # 3. Register new master's ip to the catalog database

 

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.

# If exit code is 0 or 10, MHA does not abort

    my $exit_code = 10;

    eval {

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

 

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print current_time_us() . " Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

 

      ## Creating an app user on the new master

      #print current_time_us() . " Creating app user on the new master..\n";

      # create_app_user($new_master_handler);

      print "Enabling the VIP $vip on the new master: $new_master_host \n";

      &start_vip();

      $new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

 

      ## Update master ip on the catalog database, etc

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

 

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

sub usage {

  print

"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

  die;

}

 

masterha_secondary_check

# vim /usr/local/bin/masterha_secondary_check

 

#!/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

use strict;

use warnings FATAL => 'all';

 

use English qw(-no_match_vars);

use Getopt::Long;

use Pod::Usage;

use MHA::ManagerConst;

 

my @monitoring_servers;

my (

  $help,        $version,         $ssh_user,  $ssh_port,

  $ssh_options, $master_host,     $master_ip, $master_port,

  $master_user, $master_password, $ping_type

);

my $timeout = 5;

 

$| = 1;

GetOptions(

  'help'              => \$help,

  'version'           => \$version,

  'secondary_host=s'  => \@monitoring_servers,

  'user=s'            => \$ssh_user,

  'port=s'            => \$ssh_port,

  'options=s'         => \$ssh_options,

  'master_host=s'     => \$master_host,

  'master_ip=s'       => \$master_ip,

  'master_port=i'     => \$master_port,

  'master_user=s'     => \$master_user,

  'master_password=s' => \$master_password,

  'ping_type=s'       => \$ping_type,

  'timeout=i'         => \$timeout,

);

 

if ($version) {

  print "masterha_secondary_check version $MHA::ManagerConst::VERSION.\n";

  exit 0;

}

 

if ($help) {

  pod2usage(0);

}

 

unless ($master_host) {

  pod2usage(1);

}

 

sub exit_by_signal {

  exit 1;

}

local $SIG{INT} = $SIG{HUP} = $SIG{QUIT} = $SIG{TERM} = \&exit_by_signal;

 

$ssh_user    = "root" unless ($ssh_user);

$ssh_port    = 22     unless ($ssh_port);

$master_port = 3306   unless ($master_port);

 

if ($ssh_options) {

  $MHA::ManagerConst::SSH_OPT_CHECK = $ssh_options;

}

$MHA::ManagerConst::SSH_OPT_CHECK =~ s/VAR_CONNECT_TIMEOUT/$timeout/;

 

# 0: master is not reachable from all monotoring servers

# 1: unknown errors

# 2: at least one of monitoring servers is not reachable from this script

# 3: master is reachable from at least one of monitoring servers

my $exit_code = 0;

 

foreach my $monitoring_server (@monitoring_servers) {

  my $ssh_user_host = $ssh_user . '@' . $monitoring_server;

  my $command =

"ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \"perl -e "

    . "\\\"use IO::Socket::INET; my \\\\\\\$sock = IO::Socket::INET->new"

    . "(PeerAddr => \\\\\\\"$master_host\\\\\\\", PeerPort=> $master_port, "

    . "Proto =>'tcp', Timeout => $timeout); if(\\\\\\\$sock) { close(\\\\\\\$sock); "

    . "exit 3; } exit 0;\\\" \"";

  my $ret = system($command);

  $ret = $ret >> 8;

  if ( $ret == 0 ) {

    print

"Monitoring server $monitoring_server is reachable, Master is not reachable from $monitoring_server. OK.\n";

    next;

  }

  if ( $ret == 3 ) {

    if ( defined $ping_type

      && $ping_type eq $MHA::ManagerConst::PING_TYPE_INSERT )

    {

      my $ret_insert;

      my $command_insert =

          "ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \'"

        . "/usr/bin/mysql -u$master_user -p$master_password -h$master_host "

        . "-e \"CREATE DATABASE IF NOT EXISTS infra; "

        . "CREATE TABLE IF NOT EXISTS infra.chk_masterha (\\`key\\` tinyint NOT NULL primary key,\\`val\\` int(10) unsigned NOT NULL DEFAULT '0'\) engine=MyISAM; "

        . "INSERT INTO infra.chk_masterha values (1,unix_timestamp()) ON DUPLICATE KEY UPDATE val=unix_timestamp()\"\'";

      my $sigalrm_timeout = 3;

      eval {

        local $SIG{ALRM} = sub {

          die "timeout.\n";

        };

        alarm $sigalrm_timeout;

        $ret_insert = system($command_insert);

        $ret_insert = $ret_insert >> 8;

        alarm 0;

      };

      if ( $@ || $ret_insert != 0 ) {

        print

"Monitoring server $monitoring_server is reachable, Master is not writable from $monitoring_server. OK.\n";

        next;

      }

    }

    print "Master is reachable from $monitoring_server!\n";

    $exit_code = 3;

    last;

  }

  else {

    print "Monitoring server $monitoring_server is NOT reachable!\n";

    $exit_code = 2;

    last;

  }

}

 

exit $exit_code;

 

# ############################################################################

# Documentation

# ############################################################################

 

=pod

 

=head1 NAME

 

masterha_secondary_check - Checking master availability from additional network routes

 

=head1 SYNOPSIS

 

masterha_secondary_check -s secondary_host1 -s secondary_host2 .. --user=ssh_username --master_host=host --master_ip=ip --master_port=port

 

See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.

 

=head1 DESCRIPTION

 

See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.

 

send_report

# vim /usr/local/bin/send_report

#!/usr/bin/perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;

use warnings FATAL => 'all';

use Mail::Sender;

use Getopt::Long;

 

#new_master_host and new_slave_hosts are set only when recovering master succeeded

my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );

my $smtp='smtp.126.com';

my $mail_from='slowtech@126.com';

my $mail_user='slowtech@126.com';

my $mail_pass='xxxxx';

my $mail_to=['slowtech@126.com'];

GetOptions(

  'orig_master_host=s' => \$dead_master_host,

  'new_master_host=s'  => \$new_master_host,

  'new_slave_hosts=s'  => \$new_slave_hosts,

  'subject=s'          => \$subject,

  'body=s'             => \$body,

);

mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);

 

sub mailToContacts {

    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;

    open my $DEBUG, "> /tmp/monitormail.log"

        or die "Can't open the debug      file:$!\n";

    my $sender = new Mail::Sender {

        ctype       => 'text/plain; charset=utf-8',

        encoding    => 'utf-8',

        smtp        => $smtp,

        from        => $mail_from,

        auth        => 'LOGIN',

        TLS_allowed => '0',

        authid      => $user,

        authpwd     => $passwd,

        to          => $mail_to,

        subject     => $subject,

        debug       => $DEBUG

    };

 

    $sender->MailMsg(

        {   msg   => $msg,

            debug => $DEBUG

        }

    ) or print $Mail::Sender::Error;

    return 1;

}

 

 

 

# Do whatever you want here

exit 0;

 

至此,MHA高可用环境基本搭建完毕。

 

posted @ 2021-09-09 10:09  昌北F4  阅读(544)  评论(0编辑  收藏  举报