zhy2_rehat6_mysql03 - MHA Setup.txt


export LANG=en_US


Machine             VPN IP          LAN IP      Linux account/password
manager2            172.28.20.131   10.1.1.11
mysql2-z (master)   172.28.20.133   10.1.1.13
mysql2-c (slave)    172.28.20.135   10.1.1.15
mysql2-b (standby)  172.28.20.137   10.1.1.17


1. Deploy MHA

Next we deploy MHA. The environment is as follows (all operating systems are CentOS 6.2 64-bit, though that is not required; server03 and server04 are slaves of server02. The replication setup is demonstrated briefly later, but replication itself is not explained in detail; if you need that, see the earlier article, "Issues to watch for with MySQL Replication"):

Role              IP address           Hostname   server_id   Type
Monitor host      192.168.0.20 (148)   server01   -           monitors the replication group
Master            192.168.0.50 (145)   server02   1           writes
Candidate master  192.168.0.60 (146)   server03   2           reads
Slave             192.168.0.70 (147)   server04   3           reads


MHA consists of two parts, the Manager toolkit and the Node toolkit; their contents are described below.
The master serves writes; the candidate master (actually a slave, hostname server03) serves reads, and the slave also serves reads. Once the master goes down, the candidate master is promoted to the new master and the slave is repointed to the new master.


[root@jspospdb ~]# mysql -uroot -proot
[root@jspospdb-bak ~]# service mysqld start


(1) On all nodes, install the Perl module MHA Node needs (DBD::mysql). The install script is as follows:


export LANG=en_US


[root@192.168.0.145 ~]# cat install.sh
#!/bin/bash
# Fetch the cpanminus client, then install each Perl module listed in /root/list.
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
# One module name per line (the word-splitting loop below would otherwise
# treat a stray word like "install" as a module name).
cat > /root/list << EOF
DBD::mysql
EOF
for package in `cat /root/list`
do
    cpanm $package
done

 

If the EPEL repository is installed, you can also install it with yum:
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

rpm -qa |grep -i dbd

[root@ums-data-bak bai]# yum install perl-DBD-MySQL -y
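
Either way, a quick check that the module actually loads (a minimal sketch):

perl -MDBD::mysql -e 'print "DBD::mysql $DBD::mysql::VERSION\n"'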

(2) Install MHA Node on all nodes:
# tar zxvf mha4mysql-node-0.56.tar.gz
# cd mha4mysql-node-0.56
# perl Makefile.PL
# make && make install

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Error 1:
[root@jspospdb-manager mha4mysql-node-0.56]# perl Makefile.PL
Can't locate Module/Install/Admin.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install.pm line 160.
BEGIN failed--compilation aborted at Makefile.PL line 1.
[root@jspospdb-manager mha4mysql-node-0.56]#

Fix:
yum install cpan -y
yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker -y
cpan ExtUtils::Install

If you prefer not to install via cpan, use these commands instead:

yum install perl-ExtUtils-Embed -y

yum -y install perl-CPAN


Another fix (for missing mail/log Perl modules):

export LANG=en_US
yum install perl-Mail-Sender
yum install perl-Log-Dispatch


Note: MHA is written in Perl and therefore depends on Perl modules; we use the EPEL repository to install the relevant ones.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After installation the following scripts are generated under /usr/local/bin:

[root@192.168.0.50 bin]# cd /usr/local/bin
[root@192.168.0.50 bin]# ll
total 40
-r-xr-xr-x 1 root root 15498 Apr 20 10:05 apply_diff_relay_logs
-r-xr-xr-x 1 root root 4807 Apr 20 10:05 filter_mysqlbinlog
-r-xr-xr-x 1 root root 7401 Apr 20 10:05 purge_relay_logs
-r-xr-xr-x 1 root root 7263 Apr 20 10:05 save_binary_logs
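
The purpose of each Node script, for reference:

save_binary_logs        saves and copies the (dead) master's binary logs
apply_diff_relay_logs   identifies differential relay-log events and applies them to the other slaves
filter_mysqlbinlog      strips unnecessary ROLLBACK events (no longer used by MHA)
purge_relay_logs        purges relay logs without blocking the SQL thread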


(3) Install MHA Manager. First install the Perl modules MHA Manager depends on (I use yum here):

On the manager node:

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y

# These two failed to install via yum: perl-Log-Dispatch perl-Parallel-ForkManager
yum install perl-Mail-Sender
yum install perl-Log-Dispatch

wget http://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm

wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm

wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Mail-Sender-0.8.16-1.el6.rf.noarch.rpm

wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Mail-Sendmail-0.79-1.2.el6.rf.noarch.rpm


rpm -ivh perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm
rpm -ivh perl-Mail-Sender-0.8.16-1.el6.rf.noarch.rpm
rpm -ivh perl-Mail-Sendmail-0.79-1.2.el6.rf.noarch.rpm
rpm -ivh perl-Net-Telnet-3.03-2.el6.rfx.noarch.rpm
yum localinstall perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm
yum localinstall perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm

rpm -qa |grep -i perl-Net-Telnet
rpm -qa |grep -i perl-Mail
rpm -qa |grep -i perl-Log
rpm -qa |grep -i perl-Config
rpm -qa |grep -i perl-Parallel

 


Any missing package can be found by searching these sites:
http://rpm.pbone.net/index.php3/stat/4/idpl/16393956/dir/redhat_el_6/com/perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm.html

http://www.rpmfind.net/linux/rpm2html/search.php?query=perl(Log%3A%3ADispatch%3A%3AFile)


If a conflicting DBD build is already present, remove it and reinstall from yum:

[root@DB-mysql1-b bin]# rpm -qa |grep -i dbd
perl-DBD-SQLite-1.27-3.el6.x86_64
perl-DBD-MySQL-4.013-3.el6.x86_64
[root@DB-mysql1-b bin]# rpm -e --nodeps perl-DBD-SQLite-1.27-3.el6.x86_64
[root@DB-mysql1-b bin]# rpm -e --nodeps perl-DBD-MySQL-4.013-3.el6.x86_64
[root@DB-mysql1-b bin]# yum install perl-DBD-MySQL -y

 


Install the MHA Manager package:

wget http://mysql-master-ha.googlecode.com/files/mha4mysql-manager-0.56.tar.gz
tar zxvf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56
perl Makefile.PL
make && make install


After installation, the following scripts are generated under /usr/local/bin (their roles are summarized in the tool notes further down, so they are not repeated here):
[root@192.168.66.148 bin]# cd /usr/local/bin
[root@192.168.66.148 bin]# ll

[root@ums-data-manager ~]# cd /mha4mysql-manager-0.56/samples/scripts
[root@192.168.66.148 scripts]# cp * /usr/local/bin/


--------------------

3. Configure passwordless SSH login (key-based login, common in real work). My test environment already uses key login, so the servers need no password authentication between them. I will not repeat how to set up key login, but note one thing: do not disable password login in sshd, or errors will occur.

Configure passwordless SSH login (run on every node):


Machine             VPN IP          LAN IP      Linux account/password
manager1            172.28.20.130   10.1.1.10
mysql1-z (master)   172.28.20.132   10.1.1.12
mysql1-c (slave)    172.28.20.134   10.1.1.14
mysql1-b (standby)  172.28.20.136   10.1.1.16


On manager1:

ssh-keygen -t rsa (press Enter three times)
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.13
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.15
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.17

On the master:

ssh-keygen -t rsa (press Enter three times)
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.15
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.17
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.11


On slave1:

ssh-keygen -t rsa (press Enter three times)
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.12
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.16
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.10


On slave2:

ssh-keygen -t rsa (press Enter three times)
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.12
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.14
[root@ums-data ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.1.1.10
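
A quick loop to confirm key login works in every direction (a minimal sketch; run it on each host and adjust the IP list to that cluster's addresses):

for h in 10.1.1.11 10.1.1.13 10.1.1.15 10.1.1.17
do
    ssh -o BatchMode=yes root@$h hostname
done

BatchMode=yes makes ssh fail instead of prompting, so any host that still asks for a password shows up immediately.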


------------------------

4. Build the master-slave replication environment

(1) Check the current replication status:

[root@ums-data ~]# mysql -uroot -proot
mysql> show master status;
mysql> show slave status\G


(2) Take a full backup of the data on node1 (the master):
[root@ums-data mysql]# mysqldump -uroot -p --master-data=2 --single-transaction -R --triggers -A > all.sql
Enter password:
[root@ums-data mysql]#
Here --master-data=2 records the master's binlog file name and position (as a comment in the dump) at the moment of the backup.
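
To bring a slave in line with this backup, a minimal sketch (assumes the dump all.sql from above was copied to the slave):

# restore the full dump on the slave
mysql -uroot -p < all.sql
# read the binlog coordinates recorded by --master-data=2
head -n 30 all.sql | grep 'CHANGE MASTER'

The grep output supplies the MASTER_LOG_FILE and MASTER_LOG_POS values used in the CHANGE MASTER TO statement below.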

 

After restoring the database, grant the user privileges and flush them.

On the MySQL nodes, create the account that allows the manager to access the databases; it is used mainly for SHOW SLAVE STATUS and RESET SLAVE, so run the following commands:

 

GRANT REPLICATION SLAVE,REPLICATION CLIENT,SELECT ON *.* TO 'repl'@'10.1.1.%' IDENTIFIED BY '123456';

grant SUPER,RELOAD,REPLICATION CLIENT,SELECT,replication slave on *.* to 'repl'@'192.168.66.%' identified by '123456' WITH GRANT OPTION;

grant all privileges on *.* to 'root'@'127.0.0.1' identified by 'root' WITH GRANT OPTION;
grant all privileges on *.* to 'root'@'localhost' identified by 'root' WITH GRANT OPTION;
flush privileges;

select Host,User from mysql.user where User='root';


Test from the slave machines:
[root@DB-mysql1-c ~]# mysql -urepl -p123456 -h10.1.1.12
[root@DB-mysql1-c ~]# mysql -urepl -p123456 -h10.1.1.12 -P3306


On the first slave server:

CHANGE MASTER TO
MASTER_HOST='10.1.1.12',
MASTER_USER='repl',
MASTER_PASSWORD='123456',
MASTER_LOG_FILE='mysql_bin.000003',
MASTER_LOG_POS=634;

Note: POSITION 120 refers to position 120 in the master's binlog; the replication account repl and the database world_innodb in this example were created after that position.
Once replication starts (start slave), everything logged after position 120 (the account, world_innodb, and so on) is replicated over.


mysql> show master status\G
mysql> start slave;
mysql> show slave status\G
mysql> stop slave;
mysql> start slave io_thread;
mysql> start slave sql_thread;
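
A quick way to confirm that both replication threads are up (both values should be Yes):

mysql -uroot -proot -e "show slave status\G" | grep -E 'Slave_IO_Running|Slave_SQL_Running'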

 


You can see that the virtual IP 192.168.66.150 is bound only on the master machine (145), NIC eth1.
(4) Check the binding:

/sbin/ifconfig eth0:1 192.168.66.150 netmask 255.255.255.0 up

/sbin/ifconfig eth0:1 10.1.1.19 netmask 255.255.255.0 up


To delete the virtual IP you just added, use either of these commands:

[root@ums-data-slave network-scripts]# /sbin/ip addr del 192.168.66.149/24 dev eth0
[root@ums-data-slave network-scripts]# /sbin/ifconfig eth0:1 192.168.66.149 netmask 255.255.255.0 down
[root@ums-data-slave network-scripts]# ip add

[root@ums-data etc]# ip addr | grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.66.141/24 brd 192.168.66.255 scope global eth0
inet 192.168.66.149/24 brd 192.168.66.255 scope global secondary eth0:1
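
For reference, the equivalent iproute2 form for adding the VIP (a sketch; same effect as the ifconfig form above):

/sbin/ip addr add 10.1.1.19/24 dev eth0 label eth0:1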

-----------------------------------------------------------------------------

(7) Set read_only on both slave servers (the slaves serve reads; the reason this is not written into the config file is that either slave may be promoted to master at any time):

mysql> show variables like '%read_only%';

[root@192.168.0.60 ~]# mysql -e 'set global read_only=1'
[root@192.168.0.60 ~]#
[root@192.168.0.70 ~]# mysql -e 'set global read_only=1'
[root@192.168.0.70 ~]#


(8) Create the monitoring user (run on the master, i.e. 10.1.1.13):

mysql> grant all privileges on *.* to 'root'@'10.1.1.%' identified by 'root';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
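
To confirm the monitoring account works, a quick check from the manager (assumes the manager host 10.1.1.11 from the table above):

[root@DB-manger2 ~]# mysql -uroot -proot -h10.1.1.13 -e 'select 1'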

 

====================================================================================================
5. Configure MHA

(1) Create MHA's working directory and the related config file (the unpacked source tree contains sample config files).

[root@DB-manger1 ~]# mkdir -p /etc/masterha
[root@DB-manger1 ~]# cd /home/mha4mysql-manager-0.56/samples/conf/
[root@DB-manger1 conf]# ls
app1.cnf masterha_default.cnf
[root@DB-manger1 conf]# cp app1.cnf /etc/masterha/
[root@DB-manger1 conf]#

 

Edit the app1.cnf config file; the modified file is shown below (note: the inline comments must be removed in the real file; they are here only for explanation):


[root@192.168.66.148 ~]# cat /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1 //manager's working directory
manager_log=/var/log/masterha/app1/manager.log //manager's log file
master_binlog_dir=/data/mysql //where the master stores its binlogs, so MHA can find them; here this is also the mysql data directory
master_ip_failover_script= /usr/local/bin/master_ip_failover //switchover script used during automatic failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change //switchover script used during manual switchover
password=123456 //password of the mysql root user, i.e. the monitoring user created earlier
user=root //monitoring user: root
ping_interval=1 //interval in seconds for pinging the master (default 3); after three tries with no response MHA fails over automatically
remote_workdir=/tmp //where binlogs are saved on the remote mysql hosts when a switchover happens
repl_password=123456 //password of the replication user
repl_user=repl //replication user name in the replication setup
report_script=/usr/local/send_report //alert script invoked after a switchover
secondary_check_script= /usr/local/bin/masterha_secondary_check -s server03 -s server02
shutdown_script="" //script to shut down the failed host after a failure (mainly to prevent split-brain; not used here)
ssh_user=root //ssh login user

[server1]
hostname=192.168.0.50
port=3306

[server2]
hostname=192.168.0.60
port=3306
candidate_master=1 //candidate master: with this set, this slave is promoted to master during a master-slave switchover even if it is not the most up-to-date slave in the group
check_repl_delay=0 //by default MHA will not choose a slave as the new master if it is more than 100MB of relay logs behind, because recovering it would take a long time; with check_repl_delay=0 MHA ignores replication delay when picking the new master. Very useful together with candidate_master=1, since that host must become the new master during a switchover

[server3]
hostname=192.168.0.70
port=3306
[root@192.168.0.20 ~]#

---------------------------------------
On the monitor host 10.1.1.11:
[root@DB-manger2 conf]# mkdir -p /var/log/masterha/app1/
[root@DB-manger2 conf]# touch /var/log/masterha/app1/manager.log
[root@DB-manger2 conf]#


[root@ums-data-manager app1.log]# vi /etc/masterha/app1.cnf

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/mysql/log
#master_ip_failover_script=/usr/local/bin/master_ip_failover
password=root
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=repl
report_script=""
shutdown_script=""
ssh_user=root
user=root

[server1]
hostname=10.1.1.13
candidate_master=1
port=3306

[server2]
hostname=10.1.1.15
candidate_master=1
port=3306

[server3]
hostname=10.1.1.17
candidate_master=0
check_repl_delay=0
port=3306

---------------------------------------
(2) Set the relay-log purge mode (on every slave node). MHA relies on the slaves' relay logs during failover recovery, so automatic purging is disabled and cleanup is delegated to the purge_relay_logs tool:

mysql> show variables like '%relay_log%';

[root@192.168.0.60 ~]# mysql -e 'set global relay_log_purge=0'
[root@192.168.0.70 ~]# mysql -e 'set global relay_log_purge=0'

 

(3) Set up a script to purge relay logs periodically (on both slave servers):

[root@DB-mysql2-c bin]# mkdir -p /mysql/purge_relay

[root@DB-mysql1-c ~]# cd /usr/local/bin/
[root@192.168.0.60 ~]# cat purge_relay_log.sh
#!/bin/bash
user=root
passwd=Woyee@117
port=3306
log_dir='/mysql/purge_relay'
work_dir='/mysql/purge_relay'
purge='/usr/local/bin/purge_relay_logs'

# create the work/log directory on first run
if [ ! -d $log_dir ]
then
    mkdir $log_dir -p
fi

# --disable_relay_log_purge resets relay_log_purge=0 after the purge
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

[root@192.168.0.60 ~]#

Add it to crontab to run periodically:

[root@192.168.0.60 ~]# crontab -l
0 4 * * * /usr/local/bin/purge_relay_log.sh

[root@jspospdb-slave ~]# chmod 755 purge_relay_log.sh

The purge_relay_logs script deletes relay logs without blocking the SQL thread. Let's run it by hand and see what happens.

mysql> grant all privileges on *.* to 'root'@'127.0.0.1' identified by 'root' WITH GRANT OPTION;
mysql> flush privileges;
mysql> select Host,User from mysql.user ;


Test it:

[root@DB-mysql1-c bin]# /usr/local/bin/purge_relay_logs --user=root --password=root --port=3306 --disable_relay_log_purge --workdir=/mysql/purge_relay/

2019-05-17 14:23:25: purge_relay_logs script started.
Opening /mysql/data/DB-mysql1-c-relay-bin.000001 ..
Opening /mysql/data/DB-mysql1-c-relay-bin.000002 ..
Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.
2019-05-17 14:23:28: All relay log purging operations succeeded.


====================================================================================

mysql -uroot -proot
GRANT REPLICATION SLAVE,REPLICATION CLIENT,SELECT ON *.* TO 'repl'@'10.1.1.%' IDENTIFIED BY '123456';
flush privileges;
select host ,user from mysql.user;


6. Check the SSH configuration

Check the SSH connection status from the MHA Manager to every MHA Node:

[root@192.168.0.20 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf


# Verify mysql replication

wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/extras/RPMS/perl-Net-Telnet-3.03-2.el6.rfx.noarch.rpm

yum localinstall perl-Net-Telnet-3.03-2.el6.rfx.noarch.rpm


You should see that every node's SSH check is OK.

7. Check the status of the entire replication environment.

Use the masterha_check_repl script to view the status of the whole cluster:

[root@192.168.0.20 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Run this on every node:

[root@ums-data-slave ~]# vi /etc/bashrc
PATH="$PATH:/usr/local/mysql/bin"
export PATH
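
Then reload the profile and confirm the binaries resolve (a quick check):

source /etc/bashrc
which mysql mysqlbinlog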

~~~~~~~
Alternatively:
If you see the following error:

Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 99.
mysqlbinlog version not found!
Testing mysql connection and privileges..sh: mysql: command not found
Fix: add symlinks (on all nodes):

ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql


~~~~~~~
So for now comment out the master_ip_failover_script= /usr/local/bin/master_ip_failover option; it will be re-enabled after keepalived is introduced and the script is modified.

[root@192.168.0.20 ~]# grep master_ip_failover /etc/masterha/app1.cnf
#master_ip_failover_script= /usr/local/bin/master_ip_failover

 

[root@ums-data-manager etc]# masterha_check_repl --conf=/etc/masterha/app1.cnf

MySQL Replication Health is OK.
There are no more obvious errors, only two warnings, and replication is reported healthy.

 

----------------------------------------------------------------- Tool notes

masterha_check_ssh        checks MHA's SSH configuration

masterha_check_repl       checks the MySQL replication status

masterha_manager          starts MHA

masterha_check_status     checks the current MHA run status

masterha_master_monitor   checks whether the master is down

masterha_master_switch    controls failover (automatic or manual)

masterha_conf_host        adds or removes configured server entries
=====================================================

8. Check the status of MHA Manager:

Use the masterha_check_status script to view the Manager's status:

[root@192.168.0.20 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf


app1 is stopped(2:NOT_RUNNING).
[root@192.168.0.20 ~]#
Note: when it is healthy it shows "PING_OK"; otherwise it shows "NOT_RUNNING", which means MHA monitoring is not started.


==============================================================
9. Start MHA Manager monitoring

The --remove_dead_master_conf option removes the dead master's entry from the config file after a failover, and --ignore_last_failover lets the manager start even if a failover ran within the previous 8 hours (it ignores the leftover failover-complete lock file):

[root@ums-data-manager masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

[1] 9882
Check whether MHA Manager monitoring is running normally:
[root@192.168.0.20 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf


app1 (pid:20386) is running(0:PING_OK), master:192.168.0.50
[root@192.168.0.20 ~]#

[root@ums-data-manager masterha]# cd ~

[root@ums-data-manager ~]# tail -n20 /var/log/masterha/app1/manager.log
Checking slave recovery environment settings..
Opening /data/mysql/relay-log.info ... ok.
Relay log found at /data/mysql, up to relay-bin.000003
Temporary relay log file is /data/mysql/relay-bin.000003
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Jan 12 10:27:30 2018 - [info] Slaves settings check done.
Fri Jan 12 10:27:30 2018 - [info]
192.168.66.141(192.168.66.141:3306) (current master)
+--192.168.66.143(192.168.66.143:3306)
+--192.168.66.142(192.168.66.142:3306)

Fri Jan 12 10:27:30 2018 - [warning] master_ip_failover_script is not defined.
Fri Jan 12 10:27:30 2018 - [warning] shutdown_script is not defined.
Fri Jan 12 10:27:30 2018 - [info] Set master ping interval 1 seconds.
Fri Jan 12 10:27:30 2018 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s server03 -s server02
Fri Jan 12 10:27:30 2018 - [info] Starting ping health check on 192.168.66.141(192.168.66.141:3306)..
Fri Jan 12 10:27:30 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
[root@ums-data-manager ~]#


Fri Jan 19 02:00:42 2018 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln492] Server 192.168.66.146(192.168.66.146:3306) is dead, but must be alive! Check server settings.

In this case the node was up but the firewall was blocking the manager's connection; stopping iptables cleared the error:

[root@jspospdb-bak mysql]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@jspospdb-bak mysql]#


==============================================
11. Stop MHA Manager monitoring


[root@ums-data-manager ~]# masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
[1]+ Exit 1 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1.log/manager.log 2>&1 (wd: /etc/masterha)
(wd now: ~)
[root@ums-data-manager ~]#
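
Re-running the status check now should again report the manager as stopped:

[root@ums-data-manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).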

 


12. Configure the VIP
The VIP can be managed in two ways: one uses keepalived to manage the floating virtual IP; the other brings the virtual IP up via a script (no keepalived, heartbeat, or similar software required).


[root@192.168.0.50 ~]# tail -f /var/log/messages

 

Edit the script /usr/local/bin/master_ip_failover. I am not familiar with Perl,
so the complete modified script is pasted below (operate on the master, 192.168.0.50).

The content after modification on the MHA Manager is as follows (reference material on this is scarce):

[root@DB-mysql1-z ~]# /sbin/ifconfig eth0:1

bayaim: copy mine exactly, unchanged. Do not use the sample that ships with the package!!!

vi /usr/local/bin/master_ip_failover


----------------------------------------------
#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '10.1.1.19/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig eth0:$key down";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        #`ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
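
A quick standalone sanity test of the script (a sketch; the status branch only prints and exits, so it is safe to run, and the host values here are this cluster's master):

chmod +x /usr/local/bin/master_ip_failover
/usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.1.1.13 --orig_master_ip=10.1.1.13 --orig_master_port=3306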


Now that the script has been modified, re-enable the option mentioned above and check the cluster status again to see whether it reports any errors.

[root@192.168.0.20 ~]# grep 'master_ip_failover_script' /etc/masterha/app1.cnf
master_ip_failover_script= /usr/local/bin/master_ip_failover
[root@192.168.0.20 ~]#
[root@192.168.0.20 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
