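MySQL high-availability cluster: heartbeat + DRBD
heartbeat + DRBD + MySQL is one of the earlier MySQL high-availability techniques.
Source: http://www.drbd.org
How DRBD works: DRBD replicates block-level disk operations and can be thought of as a network RAID 1. It replicates the write operations rather than copying the disk contents. The principle is shown in the figure below.
Architecture
Server list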
VIP | Host IP | Port | Role | DRBD device |
192.168.1.82 | 192.168.1.1 | 3306 | primary | /dev/drbd0 |
192.168.1.82 | 192.168.1.2 | 3306 | standby | /dev/drbd0 |
Architecture diagram
Installation and configuration:
Configuring DRBD
1. Check hostname resolution
sudo vi /etc/hosts
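192.168.1.1 mysql-2-1
192.168.1.2 mysql-2-2
The names must match the output of uname -n on each node; the rest of this article uses mysql-2-1 and mysql-2-2.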
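To confirm that a node name really resolves through the hosts file, a small lookup helper can be handy. This is only a sketch: the function name hosts_lookup is made up for illustration, and it ignores trailing comments on a line.

```shell
# Print the IP that a hosts file maps a name to (first match wins).
# $1 = hostname, $2 = hosts file (defaults to /etc/hosts).
# Returns non-zero when the name is not listed.
hosts_lookup() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

For example, hosts_lookup "$(uname -n)" should print this node's replication IP on both machines.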
2. Check the kernel
$ uname -a
Linux mysql-2-2 2.6.18-308.el5 #1 SMP Tue Feb 21 20:06:06 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
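The Linux kernel version determines which DRBD versions you can use.
You can build from a tarball (download: http://oss.linbit.com/drbd/), and some yum repositories also provide DRBD packages. The author uses CentOS 5.8, where DRBD can be installed directly with yum.
3. Install DRBD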
$ sudo yum list | grep drbd
drbd.x86_64 8.0.16-5.el5.centos extras
drbd82.x86_64 8.2.6-1.el5.centos extras
drbd83.x86_64 8.3.15-2.el5.centos extras
drbdlinks.noarch 1.26-1.el5 epel
kmod-drbd.x86_64 8.0.16-5.el5_3 extras
kmod-drbd-xen.x86_64 8.0.16-5.el5_3 extras
kmod-drbd82.x86_64 8.2.6-2 extras
kmod-drbd82-xen.x86_64 8.2.6-2 extras
kmod-drbd83.x86_64 8.3.15-3.el5.centos extras
kmod-drbd83-xen.x86_64 8.3.15-3.el5.centos extras
Here we choose drbd83.x86_64:
sudo yum install -y drbd83.x86_64
4. Configure DRBD
With an RPM install, the configuration file is /etc/drbd.conf.
With a tarball install, it is ./etc/drbd.conf under the installation directory.
global {
    # minor-count 64;
    # dialog-refresh 5; # 5 seconds
    # disable-ip-verification;
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        #size 3982G;
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        # timeout 60; # 6 seconds (unit = 0.1 seconds)
        # connect-int 10; # 10 seconds (unit = 1 second)
        # ping-int 10; # 10 seconds (unit = 1 second)
        # ping-timeout 5; # 500 ms (unit = 0.1 seconds)
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        # ko-count 4;
        # allow-two-primaries;
        cram-hmac-alg "sha1";
        shared-secret "hdhwXes23sYEhart8t";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        # data-integrity-alg "md5";
        # no-tcp-cork;
    }
    syncer {
        rate 120M;
        al-extents 517;
    }
}
resource data {
    on mysql-2-1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.1:7788;
        meta-disk internal;
    }
    on mysql-2-2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.2:7788;
        meta-disk internal;
    }
}
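DRBD picks the "on <host>" section whose name equals the output of uname -n, so a mismatch there is an easy mistake to make. The sketch below lists the host names found in a drbd.conf and checks that the local node is among them. The function names are hypothetical, and the parsing assumes each "on" statement starts its own line, as in the file above.

```shell
# List the host names used in "on <host> {" sections of a drbd.conf.
# $1 = path to the configuration (defaults to /etc/drbd.conf).
drbd_conf_hosts() {
    sed -n 's/^[[:space:]]*on[[:space:]]\{1,\}\([^[:space:]{]\{1,\}\).*/\1/p' \
        "${1:-/etc/drbd.conf}"
}

# Succeed only if this machine's uname -n appears among those names.
drbd_conf_has_this_host() {
    drbd_conf_hosts "$1" | grep -qxF "$(uname -n)"
}
```

Running drbd_conf_hosts on the configuration above should list mysql-2-1 and mysql-2-2; options such as on-io-error are not matched because "on" must be followed by whitespace.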
5. Prepare the partition to be replicated
sudo /sbin/fdisk -l
sudo /sbin/mkfs.ext3 /dev/sdb1
sudo dd if=/dev/zero of=/dev/sdb1 bs=1M count=1;sync
6. Start DRBD
Likewise, with an RPM install the init script is in /etc/init.d/; with a source install it is in etc/init.d/ under the installation directory.
sudo /etc/init.d/drbd start
Error 1:
Starting DRBD resources: Can not load the drbd module.
The cause is a missing kernel module. Install it with:
sudo yum install -y kmod-drbd83
Error 2:
Starting again fails with:
Starting DRBD resources: [
data
no suitable meta data found :(
Command '/sbin/drbdmeta 0 v08 /dev/sdb1 internal check-resize' terminated with exit code 255
drbdadm check-resize data: exited with code 255
d(data) 0: Failure: (119) No valid meta-data signature found.
==> Use 'drbdadm create-md res' to initialize meta-data area. <==
[data] cmd /sbin/drbdsetup 0 disk /dev/sdb1 /dev/sdb1 internal --set-defaults --create-device --no-md-flushes --no-disk-flushes --on-io-error=detach failed - continuing!
s(data) n(data) ]..........
Fix:
sudo /sbin/drbdadm create-md data
Here data is the resource name from drbd.conf. The metadata must be created on both nodes.
7. Check the service
Check the ports:
netstat -ant
...
tcp 0 0 192.168.1.1:7788 192.168.1.2:36040 ESTABLISHED
tcp 0 0 192.168.1.1:38371 192.168.1.2:7788 ESTABLISHED
...
tcp 0 0 192.168.1.2:36040 192.168.1.1:7788 ESTABLISHED
tcp 0 0 192.168.1.2:7788 192.168.1.1:38371 ESTABLISHED
Check the status:
sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:3888655588
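/proc/drbd sometimes also shows an operation in progress, for example right after the disk has been formatted.
8. Designate the primary (run on one node only)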
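The three fields worth watching in /proc/drbd are cs (connection state), ro (local/peer role) and ds (local/peer disk state). As a rough sketch, they can be pulled out for scripting like this; the function name drbd_state is hypothetical.

```shell
# Read /proc/drbd-style text on stdin and print one line per device:
#   <minor> <connection state> <roles> <disk states>
drbd_state() {
    awk '
        $1 ~ /^[0-9]+:$/ {
            minor = $1; sub(/:$/, "", minor)
            cs = ro = ds = "?"
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print minor, cs, ro, ds
        }
    '
}
```

Usage: drbd_state < /proc/drbd. For the output above it prints "0 Connected Secondary/Secondary Inconsistent/Inconsistent".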
sudo /sbin/drbdadm -- --overwrite-data-of-peer primary all
or
sudo /sbin/drbdsetup /dev/drbd0 primary -o
After designating the primary, look at /proc/drbd again.
sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:69312336 nr:0 dw:61154476 dr:8190020 al:53128 bm:501 lo:1 pe:21 ua:254 ap:0 ep:1 wo:b oos:3819537276
[>....................] sync'ed: 1.8% (3730016/3797512)M
finish: 9:21:51 speed: 113,284 (99,164) K/sec
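The "finish" estimate can be sanity-checked by hand: oos is the amount still out of sync in KiB, and speed is reported in K/sec, so dividing one by the other gives the remaining seconds. A minimal sketch (the function name is made up, and the result is only approximate because the sync speed fluctuates):

```shell
# Estimate the remaining sync time.
# $1 = oos value from /proc/drbd (KiB), $2 = current speed (KiB/s).
# Prints H:MM:SS; fails if the speed is zero.
drbd_sync_eta() {
    eta_oos=$1
    eta_speed=$2
    [ "$eta_speed" -gt 0 ] || return 1
    eta_secs=$((eta_oos / eta_speed))
    printf '%d:%02d:%02d\n' \
        $((eta_secs / 3600)) $((eta_secs % 3600 / 60)) $((eta_secs % 60))
}
```

With the numbers above, drbd_sync_eta 3819537276 113284 prints 9:21:56, which is close to the 9:21:51 reported by DRBD.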
9. Create the filesystem and mount it (run on the primary)
sudo /sbin/mkfs.ext3 /dev/drbd0
sudo mount /dev/drbd0 /data
10. Switchover test
Before the switchover, mysql-2-1 is the primary and mysql-2-2 is the standby.
(1) On mysql-2-1, create the file 1.txt in /data
-rw-r--r-- 1 root root 0 Jun 24 09:38 1.txt
(2) Unmount /data: sudo umount /data
(3) Demote mysql-2-1:
$ sudo /sbin/drbdadm secondary data
$ sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:1741363424 nr:0 dw:61154560 dr:1680209989 al:53130 bm:237344 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
(4) Promote mysql-2-2:
$ sudo /sbin/drbdadm primary data
$ sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:1741363424 dw:1741363424 dr:0 al:0 bm:237341 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
(5) Mount the device on mysql-2-2: sudo mount /dev/drbd0 /data
(6) Check that 1.txt is there
-rw-r--r-- 1 root root 0 Jun 24 09:38 1.txt
(7) Edit 1.txt, enter 123 and save
(8) Following the steps above, switch the primary back to mysql-2-1 and view 1.txt
$ sudo cat 1.txt
123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123
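(9) Switch the primary to mysql-2-2 once more; the content is the same.
Open question: I entered 123 only once, so why was 123 stored so many times?
Either way, the switchover worked. This completes the DRBD installation.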
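Before doing a manual switchover like the one above, it seems prudent to confirm that replication is connected and both disks are UpToDate. The following guard is a sketch under that assumption; the function name is hypothetical and it only inspects the text of /proc/drbd.

```shell
# Read /proc/drbd-style text on stdin.
# Succeed only if at least one device is listed and every device is
# cs:Connected with ds:UpToDate/UpToDate.
drbd_safe_to_switch() {
    awk '
        $1 ~ /^[0-9]+:$/ {
            seen = 1
            if ($0 !~ /cs:Connected / || $0 !~ /ds:UpToDate\/UpToDate /)
                bad = 1
        }
        END { exit !(seen && !bad) }
    '
}
```

A possible use on the current primary: drbd_safe_to_switch < /proc/drbd && sudo umount /data && sudo /sbin/drbdadm secondary data.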
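Configuring heartbeat
1. Download
heartbeat can also be installed from source (download addresses below) or from a yum repository. It depends little on the Linux kernel version.
Download libnet: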
http://sourceforge.jp/projects/sfnet_libnet-dev/releases/
Download heartbeat:
http://www.ultramonkey.org/download/heartbeat/2.1.3/heartbeat-2.1.3.tar.gz
2. Install heartbeat
$ sudo yum list | grep heartbeat
heartbeat.i386 2.1.4-11.el5 epel
heartbeat.x86_64 2.1.4-11.el5 epel
heartbeat-devel.i386 2.1.4-11.el5 epel
heartbeat-devel.x86_64 2.1.4-11.el5 epel
heartbeat-gui.x86_64 2.1.4-11.el5 epel
heartbeat-ldirectord.x86_64 2.1.4-11.el5 epel
heartbeat-pils.i386 2.1.4-11.el5 epel
heartbeat-pils.x86_64 2.1.4-11.el5 epel
heartbeat-stonith.i386 2.1.4-11.el5 epel
heartbeat-stonith.x86_64 2.1.4-11.el5 epel
Install:
sudo yum install -y heartbeat heartbeat-ldirectord heartbeat-pils heartbeat-stonith
3. Configure heartbeat
(1) Configure authkeys
$ sudo cat /etc/ha.d/authkeys
# This file sets the authentication method; three are supported: crc, md5 and sha1
auth 2
#1 crc
2 sha1 47e9336850f1db6fa58bc470bc9b7810eb397f04
#3 md5 Hellomysql
sudo chmod 600 /etc/ha.d/authkeys
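The sha1 key is just a shared secret string, so any sufficiently random value works. One way to produce an authkeys body is sketched below; the function name is made up, and key index 1 is chosen arbitrarily (the file above uses index 2).

```shell
# Print an authkeys body that uses sha1 with a freshly generated random key.
make_authkeys() {
    ak_key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{ print $1 }')
    printf 'auth 1\n1 sha1 %s\n' "$ak_key"
}
```

The same file must then be installed on both nodes with mode 600, for example by running make_authkeys once, saving the output as /etc/ha.d/authkeys, and copying that file to the peer.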
(2) Configure ha.cf
[leiche@mysql-2-1 3306]$ sudo cat /etc/ha.d/ha.cf
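# Logging
debugfile /var/log/ha-debug
logfile /var/log/ha-log
#
logfacility local0
# Heartbeat settings
# Send a heartbeat every 2 seconds
keepalive 2
# Declare the peer dead after 60 seconds without a heartbeat
deadtime 60
# Log a late-heartbeat warning after 10 seconds without a heartbeat
warntime 10
# Allow extra time after startup (for example a reboot) before declaring the peer dead
initdead 180
# Heartbeat link: ucast, mcast or bcast can be used. Here mcast on eth1; 694 is the default port, 1 is the TTL, 0 disables loopback
mcast eth1 225.0.0.37 694 1 0
# Do not fail back automatically
auto_failback off
node mysql-2-1
node mysql-2-2
# Disable the CRM
crm no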
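(3) Configure haresources
Format: [node-name] IPaddr drbddisk Filesystem startup-script
[node-name] must match one of the node values in ha.cf.
IPaddr: handled by /etc/ha.d/resource.d/IPaddr. The script name and its argument are separated by ::, and the argument here means IP 192.168.1.82, netmask 255.255.255.0 (/24), base interface eth0.
drbddisk: handled by /etc/ha.d/resource.d/drbddisk, with its argument after ::. It promotes the DRBD resource to primary on this node so that the device can be mounted; data is the resource name (resource data) in the DRBD configuration file.
Filesystem: handled by /etc/ha.d/resource.d/Filesystem, with arguments separated by ::. It mounts the replicated device, equivalent to mount -t ext3 /dev/drbd0 /data.
Startup script: looked up in /etc/init.d/ by default; it must accept the start and stop arguments.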
# Primary: node mysql-2-1
mysql-2-1 IPaddr::192.168.1.82/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 mysql3306
# Standby: node mysql-2-2
mysql-2-2 IPaddr::192.168.1.82/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/data::ext3 mysql3306
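heartbeat starts the resources on a haresources line from left to right and stops them in the reverse order, so the order on the line matters (IP, then DRBD promotion, then mount, then MySQL). To make a line easier to read, it can be split into its parts with a helper like this sketch; the function name is hypothetical.

```shell
# Read haresources lines on stdin and print, per line, the preferred node
# followed by each resource script and its ::-separated arguments,
# numbered in start order.
haresources_explain() {
    awk '
        /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
        {
            printf "node=%s\n", $1
            for (i = 2; i <= NF; i++) {
                n = split($i, part, "::")
                printf "  %d. script=%s args=", i - 1, part[1]
                for (j = 2; j <= n; j++)
                    printf "%s%s", part[j], (j < n ? " " : "")
                printf "\n"
            }
        }
    '
}
```

Usage: haresources_explain < /etc/ha.d/haresources. The Filesystem entry, for instance, comes out as script=Filesystem with args "/dev/drbd0 /data ext3".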
4. Configure startup at boot
chkconfig mysql off
chkconfig --add heartbeat
chkconfig heartbeat on
5. Start heartbeat
[leiche@mysql-2-1 ~]$ sudo /etc/init.d/heartbeat start
Starting High-Availability services:
2014/06/25_11:18:10 INFO: Resource is stopped
[ OK ]
Check the disk:
/dev/drbd0 3.6T 1.8G 3.4T 1% /data
Check the IP:
eth0:0 Link encap:Ethernet HWaddr F8:BC:12:48:65:B4
inet addr:192.168.1.82 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:194 Memory:d91a0000-d91b0000
Check the logs on both the primary and the standby for obvious errors.
Failover test:
State before stopping:
master1: 192.168.1.1
Disk:
Filesystem Size Used Avail Use% Mounted on
/dev/drbd0 3.6T 1.1T 2.3T 33% /data
Process: ps aux | grep mysql shows the mysql process running normally
Network interface:
eth0:0 Link encap:Ethernet HWaddr F8:BC:12:48:65:B4
inet addr:192.168.1.82 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:194 Memory:d91a0000-d91b0000
DRBD status:
sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:1289487736 nr:104088 dw:1289592052 dr:1265759530 al:1359461 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
master2: 192.168.1.2
The disk mount, the virtual IP and the MySQL service are all absent.
DRBD status:
[leiche@mysql-2-2 ~]$ sudo cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder10.centos.org, 2013-03-27 16:01:26
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:103512 nr:1291455972 dw:1291559484 dr:108510 al:662 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
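(1) Stop the heartbeat service on the primary: sudo /etc/init.d/heartbeat stop
After it stops, the roles are swapped: master2 is now the primary that takes writes, and master1 is down.
When we start the heartbeat service on master1 again, the primary role does not move back. The reason is this parameter in ha.cf: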
auto_failback off
To move the primary role back to master1, simulate a heartbeat failure on master2.
The relevant log entries look like this:
heartbeat[30459]: 2014/07/01_16:20:42 info: Heartbeat restart on node mysql-2-2
heartbeat[30459]: 2014/07/01_16:20:42 info: Link mysql-2-2:eth1 up.
heartbeat[30459]: 2014/07/01_16:20:42 info: Status update for node mysql-2-2: status init
heartbeat[30459]: 2014/07/01_16:20:42 info: Status update for node mysql-2-2: status up
heartbeat[30459]: 2014/07/01_16:20:42 debug: StartNextRemoteRscReq(): child count 1
heartbeat[25083]: 2014/07/01_16:20:42 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[25083]: 2014/07/01_16:20:42 info: Running /etc/ha.d/rc.d/status status
heartbeat[25099]: 2014/07/01_16:20:42 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[25099]: 2014/07/01_16:20:42 info: Running /etc/ha.d/rc.d/status status
heartbeat[30459]: 2014/07/01_16:20:43 debug: get_delnodelist: delnodelist=
heartbeat[30459]: 2014/07/01_16:20:43 info: all clients are now paused
heartbeat[30459]: 2014/07/01_16:20:43 info: Status update for node mysql-2-2: status active
heartbeat[25115]: 2014/07/01_16:20:43 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[25115]: 2014/07/01_16:20:43 info: Running /etc/ha.d/rc.d/status status
heartbeat[30459]: 2014/07/01_16:20:44 info: remote resource transition completed.
heartbeat[30459]: 2014/07/01_16:20:44 info: all clients are now resumed
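(2) Stop only the MySQL service
heartbeat does not react at all; nothing is recorded even in ha-debug or ha-log. With crm no, heartbeat only watches whether the peer node is alive; it does not monitor the resources themselves.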
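Because a stopped MySQL goes unnoticed in this setup, one common workaround is an external watchdog that gives up the resources when MySQL stops answering. The sketch below is not part of the original setup: the function name, the use of mysqladmin ping as the health check, and stopping heartbeat as the reaction are all assumptions for illustration.

```shell
# Run one watchdog pass.
# $1 = command that succeeds while MySQL is healthy
# $2 = command to run when the check fails (for example, stop heartbeat
#      so that the peer takes over the resources)
mysql_watchdog_once() {
    if sh -c "$1" >/dev/null 2>&1; then
        echo "mysql ok"
    else
        echo "mysql down, releasing resources"
        sh -c "$2"
    fi
}
```

A possible invocation from cron or a loop on the active node: mysql_watchdog_once 'mysqladmin ping' '/etc/init.d/heartbeat stop'. In practice the check would need credentials and probably a few retries before it triggers a failover.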
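Split-brain
What is split-brain? It is like the two halves of a brain each going their own way with no coordination, leaving the whole person confused.
In a heartbeat + DRBD architecture, split-brain means that a network or service failure temporarily breaks the heartbeat link, so both the primary and the standby become active and neither gives way.
In this example it means master1 and master2 are both active: both hold the IP 192.168.1.82 and both run the MySQL service.
How do we handle it?
1. Add an arbiter.
2. Monitor the logs, raise an alert when the resources move, then check manually and forcibly stop one of the nodes.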
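For the monitoring option, one simple symptom to alert on is the virtual IP being configured on more than one node at once. The sketch below counts how many saved ifconfig (or ip addr) outputs contain the VIP. The function names are hypothetical, and how the outputs are collected (for example over ssh) is left as an assumption.

```shell
# Succeed if the text on stdin (ifconfig or ip addr output) contains the VIP.
# $1 = virtual IP address
has_vip() {
    hv_re=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the regex
    grep -Eq "inet (addr:)?${hv_re}([/ ]|\$)"
}

# Count how many nodes hold the VIP.
# $1 = virtual IP, remaining arguments = one saved output file per node.
count_vip_holders() {
    cv_vip=$1
    shift
    cv_n=0
    for cv_file in "$@"; do
        if has_vip "$cv_vip" < "$cv_file"; then
            cv_n=$((cv_n + 1))
        fi
    done
    echo "$cv_n"
}
```

If count_vip_holders 192.168.1.82 node1.txt node2.txt prints 2, both nodes believe they are active and someone should step in and stop one of them.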