mysql GTID + MHA 集群

一,基本知识点

二 规划图:

三 集群描述:

机器规划和描述:

1. 从架构上来说MHA管理节点实时监控集群主从复制的一致性状态,当1.55主发送故障。从1.66会自动提升为主,vip切换到1.66上面.是业务不断线,达到了高可用。

2. 从数据线一致性来说:处于崩溃的master会以二进制日志事件保存,采用差异的中继日志方式,解析最新的slave上的中继日志relny log的日志头,从其他的从服务器确定出差异的位置,组成完整的二进制日志。其他的slave重新连接到新的master进行复制。

3. 从转移速度:MHA能做到在0~30秒之内自动完成数据库的故障切换操作

 

角色
ip地址
主机名
server_id 
类型
Monitor host 

172.16.1.38

yunwei

-

监控复制组
Master

172.16.1.55

info-slave

10

写入
Candicate master

172.16.1.66

info-master

20

Slave 

172.16.1.116

apache2-filing

102

vip

172.16.1.45

-

-

-

atlas

172.16.1.118

Redis2

-

中间件


 四.修改主从结构

  1. 采用GTID的主从复制,GTID用来代替传统复制方法,不再使用MASTER_LOG_FILE+MASTER_LOG_POS开启复制。而是使用MASTER_AUTO_POSTION=1的方式开始复制.MySQL-5.6.5开始支持的,MySQL-5.6.10后开始完善.
  2. 在传统的slave端,binlog是不用开启的,但是在GTID中slave端的binlog是必须开启的,目的是记录执行过的GTID(强制).
  3. GTID = source_id:transaction_id.
  4. 更简单的实现failover,不用以前那样在需要找log_file和log_pos.
  5. 更简单的搭建主从复制.
  6. 比传统的复制更加安全.
  7. GTID是连续的没有空洞的,保证数据的一致性,零丢失.

五.修改1.55主库的上my.cnf 配置开启GTID

 添加如下参数:

1.gtid_mode=ON  ### 开启GTID模块

enforce_gtid_consistency

log-slave-updates

binlog_format= row

skip-name-resolve

2.查看

3.重启数据库

4. 修改1.66 1.116 从库上修改的 my.cnf 配置

1.添加如下参数:

gtid_mode=on

enforce_gtid_consistency

log-slave-updates

skip-name-resolve

binlog_format=mixed

 六.主上授权

grant replication slave on *.* to repl@'172.16.1.%' identified by '123456';

七.从连接主库

1.change master to master_host='172.16.1.117',master_user='repl',master_password='123456',master_auto_position=1;

2.start slave

3.mysql -uroot -p密码 -e 'show slave status\G' | egrep 'Slave_IO|Slave_SQL'

4.配置MHA

1. 172.16.1.38 部署MHA

2.所有节点安装 yum -y install perl-DBD-MySQL

  1. 所有节点安装rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
  2. 管理节点38 安装
  3. yum -y install perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes

rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

  1. 在117主库上给mha 配置管理账号

 grant all privileges on *.* to mha@'172.16.1.%'identified by 'mha';

5.所有的机器上执行命令软连接

 ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

 ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql

  1. 创建配置文件目录

 mkdir -p /etc/mha

  1. 创建日志目录

mkdir -p /var/log/mha/app1

  1. 编辑mha配置文件 vim /etc/mha/app1.cnf

[server default]

manager_log=/var/log/mha/mha1/manager

manager_workdir=/var/log/mha/mha1

master_binlog_dir=/data/mysqldata

master_ip_failover_script=/usr/local/bin/master_ip_failover

#report_script=/usr/local/bin/send_report

password=mha

ping_interval=2

repl_password=123456

repl_user=slave

ssh_user=root

user=mha

 

[server1]

hostname=172.16.1.55

port=3306

[server2]

candidate_master=1  ###指定切换的主库

check_repl_delay=0

hostname=172.16.1.66

port=3306

 

[server3]

hostname=172.16.1.116

port=3306

 

  1. 配置ssh信任点四台机器执行所有的节点

*****************分发密钥***********************

  1. ssh-keygen -t rsa
  2. ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.1.55

   ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.1.66

   ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.1.116

ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.1.118

 

 检查SHH、主从复制:

         [root@mysql-db03 ~]# masterha_check_ssh  --conf=/etc/mha/app1.cnf#ssh检查命令

Wed Apr  4 14:58:48 2018 - [debug]  Connecting via SSH from root@172.16.1.55(172.16.1.55:22) to root@172.16.1.66(172.16.1.66:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.66] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:48 2018 - [debug]   ok.

Wed Apr  4 14:58:48 2018 - [debug]  Connecting via SSH from root@172.16.1.55(172.16.1.55:22) to root@172.16.1.116(172.16.1.116:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.116] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:49 2018 - [debug]   ok.

Wed Apr  4 14:58:49 2018 - [debug]

Wed Apr  4 14:58:49 2018 - [debug]  Connecting via SSH from root@172.16.1.66(172.16.1.66:22) to root@172.16.1.55(172.16.1.55:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.55] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:49 2018 - [debug]   ok.

Wed Apr  4 14:58:49 2018 - [debug]  Connecting via SSH from root@172.16.1.66(172.16.1.66:22) to root@172.16.1.116(172.16.1.116:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.116] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:49 2018 - [debug]   ok.

Wed Apr  4 14:58:50 2018 - [debug]

Wed Apr  4 14:58:49 2018 - [debug]  Connecting via SSH from root@172.16.1.116(172.16.1.116:22) to root@172.16.1.55(172.16.1.55:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.55] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:49 2018 - [debug]   ok.

Wed Apr  4 14:58:49 2018 - [debug]  Connecting via SSH from root@172.16.1.116(172.16.1.116:22) to root@172.16.1.66(172.16.1.66:22)..

reverse mapping checking getaddrinfo for bogon [172.16.1.66] failed - POSSIBLE BREAK-IN ATTEMPT!

Wed Apr  4 14:58:50 2018 - [debug]   ok.

Wed Apr  4 14:58:50 2018 - [info] All SSH connection tests passed successfully.

 

         [root@mysql-db03 ~]#masterha_check_repl --conf=/etc/mha/app1.cnf #主从复制检测

 

Wed Apr  4 15:00:48 2018 - [info] Checking replication health on 172.16.1.66..

Wed Apr  4 15:00:48 2018 - [info]  ok.

Wed Apr  4 15:00:48 2018 - [info] Checking replication health on 172.16.1.116..

Wed Apr  4 15:00:48 2018 - [info]  ok.

Wed Apr  4 15:00:48 2018 - [info] Checking master_ip_failover_script status:

Wed Apr  4 15:00:48 2018 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.1.55 --orig_master_ip=172.16.1.55 --orig_master_port=3306

Checking the Status of the script..OK

Wed Apr  4 15:00:48 2018 - [info]  OK.

Wed Apr  4 15:00:48 2018 - [warning] shutdown_script is not defined.

Wed Apr  4 15:00:48 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.  --->检测MHA 集群监控值

  1. 启动MHA

nohup masterha_manager --conf=/etc/mha/mha1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/mha1/manager.log 2>&1 &

六.配置VIP漂移

=======================================

=============配置VIP漂移===============

=======================================

1、从源码包中将需要的脚本拷贝出来

         [root@mha-03 ~]# tar xf mha4mysql-manager-0.56.tar.gz

         [root@mha-03 ~]# cd mha4mysql-manager-0.56

         [root@mha-03 mha4mysql-manager-0.56]# cd samples/scripts/

         [root@mha-03 scripts]# ll

         总用量 32

         -rwxr-xr-x. 1 4984 users  3648 4月   1 2014 master_ip_failover  #这就是管理虚拟IP的脚本

         -rwxr-xr-x. 1 4984 users  9870 4月   1 2014 master_ip_online_change

         -rwxr-xr-x. 1 4984 users 11867 4月   1 2014 power_manager

         -rwxr-xr-x. 1 4984 users  1360 4月   1 2014 send_report

 

2、将管理脚本拷贝到执行路径下

[root@mha-03 scripts]# cp master_ip_failover /usr/local/bin/  #复制到/usr/local/bin目录

[root@mha-03 scripts]# ll /usr/local/bin/master_ip_failover

-rwxr-xr-x. 1 root root 3648 10月 23 11:02 /usr/local/bin/master_ip_failover

 

3、在 /etc/mha/mha1.cnf 配置文件中添加脚本

master_ip_failover_script=/usr/local/bin/master_ip_failover     #添加脚本位置

 use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

 

my (

  $command,        $ssh_user,         $orig_master_host,

  $orig_master_ip, $orig_master_port, $new_master_host,

  $new_master_ip,  $new_master_port

);

 

my $vip = '172.16.1.45/24';

my $key = '0';

my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";  --->###根据网卡的设置改具体的参数

my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

 

GetOptions(

  'command=s'             => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'new_master_host=s'     => \$new_master_host,

  'new_master_ip=s'       => \$new_master_ip,

  'new_master_port=i'     => \$new_master_port,

);

 

exit &main();

 

sub main {

  if ( $command eq "stop" || $command eq "stopssh" ) {

   my $exit_code = 1;

  eval {

                     print "Disabling the VIP on old master:$orig_master_host \n";

                    &stop_vip();

                    $exit_code = 0;

};

if ($@) {

                    warn "Got Error: $@\n";

                    exit $exit_code;

             }

         exit $exit_code;

       }

       elsif ( $command eq "start" ) {

                       my $exit_code = 10;

             eval {

                    print "Enabling the VIP - $vip on the new master - $new_master_host \n";

                    &start_vip();                                                  ###############

                    $exit_code = 0;

             };

             if ($@) {

                    warn $@;

                    exit $exit_code;

             }

             exit $exit_code;

       }

        elsif ( $command eq "status" ) {

                       print "Checking the Status of the script..OK \n";

                       `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;

                       exit 0;

       }

       else {

                       &usage();

                       exit 1;

       }

         }

 

sub start_vip() {

         `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

         `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

  print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

 

  1. 先停止masterha_stop --conf=/etc/mha/app1.cnf
  2. 检查脚本状态

Tue Mar 27 17:22:30 2018 - [info] Checking master_ip_failover_script status:

Tue Mar 27 17:22:30 2018 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.1.117 --orig_master_ip=172.16.1.117 --orig_master_port=3307

  1. 配置vip

 /sbin/ifconfig eth0:0 172.16.1.45/24

 

 七.配置binlog-server远程实时备份1.55主库上binlog日志

=========配置binlog-server=================

===========================================

  1. 添加binlog-server
  2. 创建存放binlog日志的目录
  3.          [root@mha-03 ~]# mkdir /data/mysql/binlog/ -p
  4. 拉取主库的binlog日志:---->一定要在当前目录下执行这条命令
  5.          [root@mha-03 binlog]# mysqlbinlog -R --host=172.16.1.55 --user=mha --password=123123 --raw --stop-never mysql-bin.000001 &
  6. 拉取主库binlog日志的进程运行:
  7. [root@mha-03 binlog]# ps -ef | grep mysqlbinlog | grep -v grep
  8. root       9314   3444  0 15
  9. mysqlbinlog -R --host=172.16.1.55 --user=mha --password=x x --raw --stop-never mysql-bin.000014:25 pts/1    00:00:00
  10.  
  11. 添加的模块:
  12.          [binlog1]      #添加binlog模块
  13.          no_master=1     #不允许切换为主
  14.          hostname=mha-03.kun.cc    #binlog日志服务器地址
  15.          master_binlog_dir=/data/mysql/binlog/     #binlog存放位置优先级比全局的高

mysqlbinlog -R --host=172.16.1.117 --user=mha --password=mha --raw --stop-never mysql-bin.000001 &

  1. 测试masterha_check_repl --conf=/etc/mha/app1.cnf 检测当前主从一致性状态
  2. nohup masterha_manager --conf=/etc/mha/app1.cnf  -remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/mha1/manager.log 2>&1 &  ##后台运行mha
  3. 编写mha监控脚本

#!/bin/bash

function check_pid  {

pid=`ps -ef | grep /usr/bin/masterha_manager | grep -v grep | awk '{print $2}'`

echo $pid

if [ $pid -gt 0 ];then ##不存在

    echo "MHA PID is runing" >> /var/log/monitor.log

else

echo "MHA监控PID" | mail -s "MHA故障,进程已经宕机" 172771552@qq.com

masterha_manager --conf=/etc/mha/app1.cnf  -remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/mha1/manager.log 2>&1 &  ##启动进程

fi

}

function main {

       for i in {1..10}

       do

           check_pid

       done

       break

}

main

八.部署Atlas读写分离中间件

1.在1.55主库上授权

create user 'szx'@'172.16.1.118' identified by 'Pass@word';

grant all on *.* to 'szx'@'172.16.1.118'identified by 'Pass@word';

2.cd /usr/local/mysql-proxy/bin/

3../encrypt Pass@word 创建Atlas管理账号 --> atlas 授权程序连接入口密码Pass@word

4./usr/local/mysql-proxy/conf

修改atlas 配置文件

5.[mysql-proxy]

6#管理接口的用户名

7.admin-username = user

#管理接口的密码

admin-password = pwd

#Atlas后端连接的MySQL主库的IP和端口,可设置多项,用逗号分隔

proxy-backend-addresses = 172.16.1.45:3306  ---> 配置虚拟vip 达到主从切换 vip指定主库

#Atlas后端连接的MySQL从库的IP和端口,@后面的数字代表权重,用来作负载均衡,若省略则默认为1,可设置多项,用逗号分隔

proxy-read-only-backend-addresses = 172.16.1.166:3306@1,172.16.1.116:3306

#用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,下行的user1和user2为示例,将其替换为你的MySQL的用户名和加密密码!

pwds = root:/iZxz+0GRoA=,szx:xWQZYJHSQrrmTh762ogNww==  ---> 这里添加授权密码

daemon = true

#设置Atlas的运行方式,设为true时Atlas会启动两个进程,一个为monitor,一个为worker,monitor在worker意外退出后会自动将其重启,设为false时只有worker,没有monitor,一般开发调试时设为false,线上运行时设为true,true后面不能有空格。

keepalive = true

#工作线程数,对Atlas的性能有很大影响,可根据情况适当设置

event-threads = 32

#日志级别,分为message、warning、critical、error、debug五个级别

log-level = debug

#日志存放的路径

log-path = /usr/local/mysql-proxy/log

#SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记

录SQL日志,REALTIME代表记录SQL日志且实时写入磁盘,默认为OFF

sql-log = ON

#慢日志输出设置。当设置了该参数时,则日志只输出执行时间超过sql-log-slow(单位:ms)的日志记录。不设置该参数则输出全部日志。

#sql-log-slow = 10

#实例名称,用于同一台机器上多个Atlas实例间的区分

#instance = test

#Atlas监听的工作接口IP和端口

proxy-address = 0.0.0.0:3306

#Atlas监听的管理接口IP和端口

admin-address = 0.0.0.0:1234

#tables = person.mt.id.3

charset = utf8

#client-ips = 127.0.0.1, 192.168.1

#lvs-ips = 192.168.1.1

  1. 启动atlas  /usr/local/mysql-proxy/bin/mysql-proxyd test start
  2. 进入atlas管理界面

mysql -uuser -ppwd -P1234 -h172.16.1.118

  1. 查看后端

+-------------+-------------------+-------+------+

| backend_ndx | address           | state | type |

+-------------+-------------------+-------+------+

|           1 | 172.16.1.45:3306  | up    | rw   |

|           2 | 172.16.1.66:3306  | up    | ro   |

|           3 | 172.16.1.116:3306 | up    | ro   |

+-------------+-------------------+-------+------+

Rw 代表读写  ro 代表只读

  1. 授权程序连接sql线程
  2. Atlas上面./encrypt  设置允许连接的密码
  3. 配置文件添加允许连接的用户密码

pwds = root:/iZxz+0GRoA=,szx:xWQZYJHSQrrmTh762ogNww==

  1. 测试连接atlas

mysql -uszx -h172.16.1.118 -pPass@word 程序连接授权的用户和密码

  1. 修改程序连接数据库的文件

sed -i 's/172.16.1.55/172.16.1.118/g' DbConfig.php

  1. 看到atlas的sql日志

[04/04/2018 15:32:14] C:172.16.1.68:46490 S:172.16.1.66:3306 OK 0.970 "SHOW FULL COLUMNS FROM `MESSAGE_REMIND`"

[04/04/2018 15:32:12] C:172.16.1.64:57374 S:172.16.1.66:3306 OK 12.151 "SELECT * FROM `LOG_USER_LOGIN` `t` WHERE SessionID='9nvn3721s78280qq9iepdmonb2' and UserId='2633574085583' LIMIT 1"

[04/04/2018 15:31:10] C:172.16.1.70:34640 S:172.16.1.66:3306 OK 0.724 "SHOW FULL COLUMNS FROM `MESSAGE_REMIND`"

[04/04/2018 15:31:10] C:172.16.1.70:34640 S:172.16.1.116:3306 OK 0.194 "SHOW CREATE TABLE `MESSAGE_REMIND`"

[04/04/2018 15:31:10] C:172.16.1.70:34640 S:172.16.1.45:3306 OK 1.406 "UPDATE `LOG_USER_LOGIN` SET `Id`='41980', `UserId`='2627214699994', `LoginName`='15011436643',

九.问题

如果发现如下错误:

1.Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 99.

mysqlbinlog version not found!

解决方法如下,添加软连接(所有节点)

ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
2. 查看日志
tail -n20 /var/log/mha/mha1/manager
3.查看当前集群状态,显示主库
masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:20386) is running(0:PING_OK), master:192.168.1.55
3. 当主库切换了,恢复从库使用授权重新恢复主从一致性

change master to master_host='master_ip',master_user='repl',master_password='123456',master_auto_position=1;

4. 程序连接端口可以暂时使用55库,作为程序连接

mysql -h172.16.1.55 -uszx -pPass@word

5. Atlas 上开启了防火墙,安全加固 --> 只允许web主机连接

ACCEPT     all  --  172.16.1.68/24        0.0.0.0/0

ACCEPT     all  --  172.16.1.64/24        0.0.0.0/0

ACCEPT     all  --  172.16.1.92/24        0.0.0.0/0

posted @ 2018-07-24 02:20  人生信条~~  阅读(281)  评论(0编辑  收藏  举报