MHA配置方法

MHA搭建步骤
一、安装软件包
-rwxr-x---.  1 root root  21272 Jul  6  2016 perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm*
-rwxr-x---.  1 root root  72328 Jul  6  2016 perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm*
-rwxr-x---.  1 root root  14840 Jul  6  2016 perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm*
-rwxr-x---.  1 root root  76728 Jul  6  2016 perl-Params-Validate-0.92-3.el6.x86_64.rpm*
-rwxr-x---.  1 root root 118521 Nov  3  2016 mha4mysql-manager-0.57.tar.gz*
-rwxr-x---.  1 root root  54484 Nov  3  2016 mha4mysql-node-0.57.tar.gz*
-rwxr-x---.  1 root root  10353 Aug  2  2018 master_ip_online_change*
-rwxr-x---.  1 root root   2350 Aug  2  2018 master_ip_failover*
首先需要安装依赖包
rpm -ivh *.rpm
安装管理节点软件
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install
安装node节点软件
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install
二、修改参数化参数文件
MASTER上修改配置,双一原则,保证数据库不丢失
innodb_flush_log_at_trx_commit = 1 --> redo log 1写磁盘、2写系统缓存(操作系统挂可能丢数据)、0写redo log buffer(mysql挂可能丢数据)
sync_binlog = 1 --> binlog 1写磁盘、0写系统缓存

SLAVE上修改配置
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON

确保binlog格式为row
binlog_format='ROW'

三、创建MAH管理用户
CREATE USER 'admin'@'192.168.10.%' IDENTIFIED BY 'admin';
GRANT SUPER,RELOAD,REPLICATION CLIENT,SELECT ON *.* TO 'admin'@'192.168.10.%';

四、配置root的免密码登陆,利用oracle rac的脚本
sh a.sh -user root -hosts "m1 m2 m3" -advanced -noPromptPassphrase 

五、修改主节点和候选节点的配置文件
主
 vi /etc/masterha/app1.cnf 
[server default]
manager_workdir=/masterha/app1
manager_log=/mysqlapp/mysql/log/manager.log
user=admin
password=admin
ssh_user=root
repl_user=rep
repl_password=rep
ping_interval=1
shutdown_script=""
master_ip_online_change_script=""
report_script=""


[server1]
hostname=192.168.10.100
port=3306
candidate_master=1
master_binlog_dir="/data/3306"


[server2]
hostname=192.168.10.101
port=3306
candidate_master=1
master_binlog_dir="/data/3306"

[server3]
hostname=192.168.10.103
port=3306

候选主
[server default]
manager_workdir=/masterha/app1
manager_log=/mysqlapp/mysql/log/manager.log
user=admin
password=admin
ssh_user=root
repl_user=rep
repl_password=rep
ping_interval=1
shutdown_script=""
master_ip_online_change_script=""
report_script=""


[server1]
hostname=192.168.10.101
port=3306
candidate_master=1
master_binlog_dir="/data/3306"


[server2]
hostname=192.168.10.100
port=3306
candidate_master=1
master_binlog_dir="/data/3306"

[server3]
hostname=192.168.10.103
port=3306

六、检查ssh连通性
[root@m1 mha]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Jun 22 22:04:02 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jun 22 22:04:02 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Jun 22 22:04:02 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Jun 22 22:04:02 2019 - [info] Starting SSH connection tests..
Sat Jun 22 22:04:03 2019 - [debug] 
Sat Jun 22 22:04:02 2019 - [debug]  Connecting via SSH from root@192.168.10.100(192.168.10.100:22) to root@192.168.10.101(192.168.10.101:22)..
Sat Jun 22 22:04:02 2019 - [debug]   ok.
Sat Jun 22 22:04:02 2019 - [debug]  Connecting via SSH from root@192.168.10.100(192.168.10.100:22) to root@192.168.10.103(192.168.10.103:22)..
Sat Jun 22 22:04:03 2019 - [debug]   ok.
Sat Jun 22 22:04:03 2019 - [debug] 
Sat Jun 22 22:04:03 2019 - [debug]  Connecting via SSH from root@192.168.10.101(192.168.10.101:22) to root@192.168.10.100(192.168.10.100:22)..
Sat Jun 22 22:04:03 2019 - [debug]   ok.
Sat Jun 22 22:04:03 2019 - [debug]  Connecting via SSH from root@192.168.10.101(192.168.10.101:22) to root@192.168.10.103(192.168.10.103:22)..
Sat Jun 22 22:04:03 2019 - [debug]   ok.
Sat Jun 22 22:04:04 2019 - [debug] 
Sat Jun 22 22:04:03 2019 - [debug]  Connecting via SSH from root@192.168.10.103(192.168.10.103:22) to root@192.168.10.100(192.168.10.100:22)..
Sat Jun 22 22:04:03 2019 - [debug]   ok.
Sat Jun 22 22:04:03 2019 - [debug]  Connecting via SSH from root@192.168.10.103(192.168.10.103:22) to root@192.168.10.101(192.168.10.101:22)..
Sat Jun 22 22:04:04 2019 - [debug]   ok.
Sat Jun 22 22:04:04 2019 - [info] All SSH connection tests passed successfully.
七、检查同步用户的连通情况
[root@m1 ~]#  masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Jun 22 22:09:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jun 22 22:09:57 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Jun 22 22:09:57 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Jun 22 22:09:57 2019 - [info] MHA::MasterMonitor version 0.57.
Sat Jun 22 22:09:57 2019 - [info] GTID failover mode = 1
Sat Jun 22 22:09:57 2019 - [info] Dead Servers:
Sat Jun 22 22:09:57 2019 - [info] Alive Servers:
Sat Jun 22 22:09:57 2019 - [info]   192.168.10.100(192.168.10.100:3306)
Sat Jun 22 22:09:57 2019 - [info]   192.168.10.101(192.168.10.101:3306)
Sat Jun 22 22:09:57 2019 - [info]   192.168.10.103(192.168.10.103:3306)
Sat Jun 22 22:09:57 2019 - [info] Alive Slaves:
Sat Jun 22 22:09:57 2019 - [info]   192.168.10.100(192.168.10.100:3306)  Version=5.7.20-log (oldest major version between slaves) log-bin:enabled
Sat Jun 22 22:09:57 2019 - [info]     GTID ON
Sat Jun 22 22:09:57 2019 - [info]     Replicating from 192.168.10.101(192.168.10.101:3306)
Sat Jun 22 22:09:57 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Jun 22 22:09:57 2019 - [info]   192.168.10.103(192.168.10.103:3306)  Version=5.7.20-log (oldest major version between slaves) log-bin:enabled
Sat Jun 22 22:09:57 2019 - [info]     GTID ON
Sat Jun 22 22:09:57 2019 - [info]     Replicating from 192.168.10.101(192.168.10.101:3306)
Sat Jun 22 22:09:57 2019 - [info] Current Alive Master: 192.168.10.101(192.168.10.101:3306)
Sat Jun 22 22:09:57 2019 - [info] Checking slave configurations..
Sat Jun 22 22:09:57 2019 - [info]  read_only=1 is not set on slave 192.168.10.100(192.168.10.100:3306).
Sat Jun 22 22:09:57 2019 - [info]  read_only=1 is not set on slave 192.168.10.103(192.168.10.103:3306).
Sat Jun 22 22:09:57 2019 - [info] Checking replication filtering settings..
Sat Jun 22 22:09:57 2019 - [info]  binlog_do_db= , binlog_ignore_db= 
Sat Jun 22 22:09:57 2019 - [info]  Replication filtering check ok.
Sat Jun 22 22:09:57 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Jun 22 22:09:57 2019 - [info] Checking SSH publickey authentication settings on the current master..
Sat Jun 22 22:09:57 2019 - [info] HealthCheck: SSH to 192.168.10.101 is reachable.
Sat Jun 22 22:09:57 2019 - [info] 
192.168.10.101(192.168.10.101:3306) (current master)
 +--192.168.10.100(192.168.10.100:3306)
 +--192.168.10.103(192.168.10.103:3306)

Sat Jun 22 22:09:57 2019 - [info] Checking replication health on 192.168.10.100..
Sat Jun 22 22:09:57 2019 - [info]  ok.
Sat Jun 22 22:09:57 2019 - [info] Checking replication health on 192.168.10.103..
Sat Jun 22 22:09:57 2019 - [info]  ok.
Sat Jun 22 22:09:57 2019 - [warning] master_ip_failover_script is not defined.
Sat Jun 22 22:09:57 2019 - [warning] shutdown_script is not defined.
Sat Jun 22 22:09:57 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
八、启动mha
主节点启动mha
[root@m1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log  2>&1 &
[1] 13272
[root@m1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf 
app1 (pid:13272) is running(0:PING_OK), master:192.168.10.101

删除老的切换记录文件
rm -rf  rm /masterha/app1/app1.failover.complete

关闭主节点,验证是否生效

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.101
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 234
               Relay_Log_File: m3-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

从节点会自动切换
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.100
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000007
          Read_Master_Log_Pos: 234
               Relay_Log_File: m3-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
候选节点会升级为主,原主节点会脱离。

将脱离的主机点重新加入到主从。
CHANGE MASTER TO
 MASTER_HOST='192.168.10.100',
  MASTER_USER='rep',
  MASTER_PASSWORD='rep',
  MASTER_AUTO_POSITION = 1;
start slave;
show slave status \G;


再在新的主节点启动mah管理进程

rm -rf  rm /masterha/app1/app1.failover.complete
nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log  2>&1 &
 masterha_check_status --conf=/etc/masterha/app1.cnf 
 
app1 (pid:23378) is running(0:PING_OK), master:192.168.10.100

CHANGE MASTER TO
 MASTER_HOST='192.168.10.101',
  MASTER_USER='rep',
  MASTER_PASSWORD='rep',
  MASTER_AUTO_POSITION = 1;
start slave;
show slave status \G;

 

posted on 2019-06-24 11:29  侯志清  阅读(432)  评论(0编辑  收藏  举报

导航