MFS + DRBD + keepalived: high availability for a MooseFS system

http://blog.sina.com.cn/s/blog_53c654720102wo1k.html

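MooseFS is an easy-to-use distributed file system, but only the Pro edition provides a high-availability solution for the master. In the free edition the master can only run on a single machine, which makes it a single point of failure.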

 

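Drawing on material available online, this article describes how to make mfsmaster highly available with DRBD + keepalived.

 

Environment:

CentOS 6

Master-primary IP: 172.18.18.201 (hostname test01)

Master-secondary IP: 172.18.18.202 (hostname test02)

Mfschunkserver IP: 172.18.18.203 (hostname test03)

Master virtual IP: 172.18.18.204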

 

1. Install the operating system

 Install CentOS 6. Note: when installing the mfsmaster servers, be sure to set aside a dedicated partition for DRBD. Here I use /dev/sda3, 500 MB in size.

 

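2. Install DRBD

 2.1 Prepare the environment

 Building DRBD requires the CentOS kernel sources, so download the kernel-devel package that matches the running kernel and install it locally; do not install it with yum.

 # rpm -i kernel-devel-2.6.32-504.el6.x86_64.rpm

 Install the tools needed for the build: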

 # yum -y install gcc flex perl

 

2.2 Build and install DRBD

 # tar zxvf drbd-8.4.2.tar.gz

# cd drbd-8.4.2

# ./configure --prefix=/usr/local/drbd --with-km

# make KDIR=/usr/src/kernels/2.6.32-504.el6.x86_64

-- Note: this is the actual kernel source path; adjust it to your system

# make install

# mkdir -p /usr/local/drbd/var/run/drbd

# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d

# chkconfig --add drbd

# chkconfig drbd on

 

Install the DRBD kernel module:

# cd /root/data/drbd-8.4.2/drbd

 # make clean

# make KDIR=/usr/src/kernels/2.6.32-504.el6.x86_64

# cp drbd.ko /lib/modules/2.6.32-504.el6.x86_64/kernel/lib/

-- Note: check the kernel version with uname -r

# modprobe drbd

 

Check that the module loaded successfully:

# lsmod |grep drbd

drbd                  299688  0

libcrc32c               1246  1 drbd

 

2.3 Configure DRBD

 # vi /usr/local/drbd/etc/drbd.d/global_common.conf

 

global { 

  usage-count yes; 

}

common { 

  net { 

    protocol C; 

  } 

}

 

# vi /usr/local/drbd/etc/drbd.d/r0.res

 

resource r0 {

   on test01 {

      device /dev/drbd1;

      disk /dev/sda3;

      address 172.18.18.201:7788;

      meta-disk internal;

      }

   on test02 {

      device /dev/drbd1;

      disk /dev/sda3;

      address 172.18.18.202:7788;

      meta-disk internal;

      }

}

 

Create the DRBD resource:

# dd if=/dev/zero of=/dev/sda3 bs=1M count=1    -- this step is required; otherwise the next step reports an error

# drbdadm create-md r0

# drbdadm up r0

 

Repeat all of the steps above on the secondary (test02).

 

2.4 Set the primary node

 # drbdadm primary --force r0

 

Check the DRBD status:

# cat /proc/drbd

 

 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-

    ns:108008 nr:0 dw:0 dr:109208 al:0 bm:6 lo:0 pe:1 ua:1 ap:0 ep:1 wo:f oos:404428

         [===>................] sync'ed: 21.6% (404428/511948)K

         finish: 0:00:33 speed: 11,944 (11,944) K/sec
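
The status line can also be read by a script, for example to wait for the initial sync to finish before moving on. The following is a minimal sketch rather than part of the original setup: the function names drbd_field and drbd_in_sync are made up for illustration, and the parsing assumes the cs:/ro:/ds: fields appear as in the /proc/drbd output above.

```shell
#!/bin/bash
# Hypothetical helpers (not part of DRBD) for reading /proc/drbd-style text.

# drbd_field NAME: print the value of the first NAME: field (cs, ro or ds)
# found in the status text on standard input.
drbd_field() {
    sed -n "s|.*[ :]$1:\([A-Za-z/]*\).*|\1|p" | head -n 1
}

# drbd_in_sync: succeed only if the status text on standard input shows a
# connected resource with both disks UpToDate.
drbd_in_sync() {
    _status=$(cat)
    [ "$(printf '%s\n' "$_status" | drbd_field cs)" = "Connected" ] &&
        [ "$(printf '%s\n' "$_status" | drbd_field ds)" = "UpToDate/UpToDate" ]
}

# Example use on a DRBD node:
#   drbd_field ro < /proc/drbd
#   until drbd_in_sync < /proc/drbd; do sleep 5; done
```

With several DRBD devices on one host, only the first device line is inspected, so a real script would need to select the line for the device it cares about.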

 

2.5 Create a file system on the DRBD device

 On the primary:

# mkfs.ext4 /dev/drbd1

 

# mkdir /drbddata

# mount /dev/drbd1 /drbddata

 

2.6 Test DRBD replication

 On the primary (test01), write a test file into /drbddata, then run:

# umount /dev/drbd1

# drbdadm secondary r0

 

Then, on the secondary (test02), run:

# mkdir /drbddata

# drbdadm primary r0

# mount /dev/drbd1 /drbddata

 

The test file written a moment ago should now be visible in /drbddata.

 

3. Install MooseFS

 3.1 Prepare the environment

 Install libpcap:

# yum -y install libpcap

 

3.2 Install mfsmaster

 # tar zxvf moosefs-packages-linux-3.0.77.tar.gz

# rpm -i moosefs-master-3.0.77-1.rhsysv.x86_64.rpm

 

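3.3 Configure mfsmaster

 # cd /etc/mfs

# vi mfsmaster.cfg

Make sure to set: DATA_PATH = /drbddata

 

Leave the other parameters at their defaults.

 

Repeat the steps above on the secondary (test02) so that mfsmaster is installed on both servers.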

 

3.4 mfsmaster startup test

 Test startup on the primary:

 

# drbdadm primary r0

# mount /dev/drbd1 /drbddata

# chown -R mfs:mfs /drbddata

# mfsmaster start

 

Check the log to confirm that mfsmaster started successfully.

 

Test startup on the secondary:

 

First stop the service on the primary:

# mfsmaster stop

# umount /drbddata

# drbdadm secondary r0

 

Then, on the secondary, run:

# drbdadm primary r0

# mount /dev/drbd1 /drbddata

# chown -R mfs:mfs /drbddata

# mfsmaster start

 

Once this test succeeds, move on to the next installation step.

 

4. Install keepalived

 

4.1 Install keepalived

 # tar zxvf keepalived-1.2.22.tar.gz

# cd keepalived-1.2.22

# ./configure --prefix=/usr/local/keepalived --disable-fwmark

# make

# make install

# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/

# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

# cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/

# chkconfig --add keepalived

# chkconfig keepalived on

 

4.2 Configure keepalived

 # mkdir -p /etc/keepalived

# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/

# vi /etc/keepalived/keepalived.conf

 

global_defs {

   notification_email {

   shuyb@sina.com

   }

   notification_email_from shuyb@sina.com

   smtp_server 192.168.200.1

   smtp_connect_timeout 30

   router_id test01

   vrrp_skip_check_adv_addr

   vrrp_strict

   vrrp_garp_interval 0

   vrrp_gna_interval 0

}

 

vrrp_script check_drbd {

   script "/etc/keepalived/check_drbd.sh"

   interval 15

}

 

vrrp_instance mfs {

    state MASTER

    interface p4p1

    virtual_router_id 51

    priority 100

    advert_int 1

    authentication {

        auth_type PASS

        auth_pass 1111

    }

    virtual_ipaddress {

    172.18.18.204

    }

    track_script {

    check_drbd

    }

}

 

Create a new file under /etc/keepalived:

# vi check_drbd.sh

 

#!/bin/bash

A=`ps -C mfsmaster --no-header |wc -l`

if [ $A -eq 0 ]; then

   umount /dev/drbd1

   drbdadm secondary r0

   killall keepalived

fi

 

# chmod +x check_drbd.sh
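
The script above fails over as soon as a single check finds no mfsmaster process. A hedged variant, sketched below, rechecks a few times before giving up, which may avoid a failover during a brief restart. The function names retry, mfsmaster_running and check_mfsmaster are hypothetical, and the choice of three tries two seconds apart is an assumption picked to fit inside the 15-second check interval configured above.

```shell
#!/bin/bash
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, at most
# ATTEMPTS times, sleeping DELAY seconds between tries.
retry() {
    _attempts=$1
    _delay=$2
    shift 2
    _try=1
    while [ "$_try" -le "$_attempts" ]; do
        "$@" && return 0
        [ "$_try" -lt "$_attempts" ] && sleep "$_delay"
        _try=$((_try + 1))
    done
    return 1
}

# Succeeds if at least one mfsmaster process exists (same test as the
# original check_drbd.sh).
mfsmaster_running() {
    [ "$(ps -C mfsmaster --no-header | wc -l)" -gt 0 ]
}

# Hypothetical replacement body for check_drbd.sh: give mfsmaster three
# chances, two seconds apart, before releasing DRBD and stopping keepalived.
check_mfsmaster() {
    retry 3 2 mfsmaster_running && return 0
    umount /dev/drbd1
    drbdadm secondary r0
    killall keepalived
}
```

To try it, the script would end with a call to check_mfsmaster; the failover actions themselves are unchanged from the original.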

 

On the secondary, the installation steps are the same as above; its configuration file is as follows:

 

# vi /etc/keepalived/keepalived.conf

global_defs {

   notification_email {

   shuyb@sina.com

   }

   notification_email_from shuyb@sina.com

   smtp_server 192.168.200.1

   smtp_connect_timeout 30

   router_id test02

   vrrp_skip_check_adv_addr

   vrrp_strict

   vrrp_garp_interval 0

   vrrp_gna_interval 0

}

 

vrrp_instance mfs {

    state BACKUP

    interface p4p1

    virtual_router_id 51

    priority 90

    advert_int 1

    authentication {

        auth_type PASS

        auth_pass 1111

    }

    virtual_ipaddress {

    172.18.18.204

    }

 

notify_master /etc/keepalived/master.sh

notify_backup /etc/keepalived/backup.sh

}

 

Create two new files under /etc/keepalived:

# vi backup.sh

 

#!/bin/bash

mfsmaster stop

umount /dev/drbd1

drbdadm secondary r0

 

# vi master.sh

 

#!/bin/bash

drbdadm primary r0

mount /dev/drbd1 /drbddata

mfsmaster start

 

# chmod +x backup.sh master.sh

 

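At this point, installation and configuration are complete. Note that you must either turn off the firewall on the servers or make sure that the ports MFS needs (9420-9425), the port DRBD uses (7788), and so on are open.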
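
If the firewall stays on, the rules might look like the sketch below. It only prints iptables commands for review instead of running them. The function name print_fw_rules is made up, restricting port 7788 to the peer master is an assumption, and the VRRP rule (IP protocol 112, which keepalived uses for its advertisements) is an addition not mentioned in the text above.

```shell
#!/bin/bash
# print_fw_rules PEER_IP: print (do not run) iptables commands that open the
# ports discussed above. PEER_IP is the address of the other master node.
print_fw_rules() {
    _peer=$1
    # MooseFS ports 9420-9425, as listed in the text
    echo "iptables -I INPUT -p tcp --dport 9420:9425 -j ACCEPT"
    # DRBD replication port from r0.res, limited to the peer (an assumption)
    echo "iptables -I INPUT -p tcp -s $_peer --dport 7788 -j ACCEPT"
    # VRRP advertisements sent by keepalived (IP protocol 112)
    echo "iptables -I INPUT -p 112 -j ACCEPT"
}

# Example on test01:
#   print_fw_rules 172.18.18.202
```

Review the printed commands before applying them, and save the resulting rules with the distribution's usual mechanism so they survive a reboot.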

 

5. Install mfschunkserver and the MFS client

 5.1 On the chunkserver (test03)

 # yum install libpcap

# rpm -i moosefs-chunkserver-3.0.77-1.rhsysv.x86_64.rpm

 

# vi /etc/mfs/mfschunkserver.cfg

Change:

 MASTER_HOST = 172.18.18.204

 

# vi /etc/mfs/mfshdd.cfg

Add:

/home

 

(Note: /home is the directory this server gives to the MFS chunkserver; it is mounted on a dedicated logical volume.)

 

# chown -R mfs:mfs /home

 

5.2 On the MFS client

 

# yum install fuse-libs

# rpm -i moosefs-client-3.0.77-1.rhsysv.x86_64.rpm

 

# mkdir -p /mfs

 

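6. Startup and failover test

 

Now the setup can be tested.

 

1.  Start mfsmaster on the primary first, following the same steps as above

2.  Start keepalived on the primary

# /etc/init.d/keepalived start

 

Check whether the virtual IP is bound:

# ip addr

 

3. Start keepalived on the secondary

# /etc/init.d/keepalived start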

 

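4. Start mfschunkserver

# mfschunkserver start

 

Check /var/log/messages to see whether the connection to the master succeeded

 

5. Connect from the client

# mfsmount /mfs -H 172.18.18.204

# df    -- check whether the mount succeeded

 

6. Stop mfsmaster on the primary

# mfsmaster stop

 

7. Check that the secondary, the chunkserver, and the client are all working normally

 

The test is complete. Afterwards, the service must be switched back to the primary manually.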
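
The virtual IP check from step 2 can be scripted so that it is easy to repeat on both nodes before and after a failover. This is a small sketch; the function name has_vip is made up, and it assumes the address appears in `ip addr` output in the usual "inet ADDRESS/PREFIX" form.

```shell
#!/bin/bash
# has_vip ADDRESS: succeed if the `ip addr`-style text on standard input
# lists ADDRESS as an IPv4 address on some interface.
has_vip() {
    grep -qF "inet $1/"
}

# Example use on either master node:
#   ip addr | has_vip 172.18.18.204 && echo "this node holds the virtual IP"
```

After step 6, the check should succeed on the secondary and fail on the primary.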

 

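P.S. After a failure has moved the service to the backup, and once the primary is healthy again, a manual switch back must follow these steps:

  1. On the backup (the node currently serving), stop the mfsmaster service and release DRBD:

     mfsmaster stop

     umount /dev/drbd1

     drbdadm secondary r0

  2. On the primary host:

     1. drbdadm primary r0

     2. mount /dev/drbd1 /drbddata

     3. mfsmaster start

     4. /etc/init.d/keepalived start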
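
Because the order of these steps matters, it can help to wrap them in two small functions that stop at the first failing command. The sketch below is illustrative only: the names run, release_master and take_over_master and the DRY_RUN switch are hypothetical, and it uses the resource name r0 and mount point /drbddata from the configuration earlier in this article.

```shell
#!/bin/bash
# run CMD...: execute CMD, or only print it when DRY_RUN is non-empty.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the node that is currently serving (the backup).
# Each step runs only if the previous one succeeded.
release_master() {
    run mfsmaster stop &&
        run umount /dev/drbd1 &&
        run drbdadm secondary r0
}

# Run on the primary host afterwards, once the backup has released DRBD.
take_over_master() {
    run drbdadm primary r0 &&
        run mount /dev/drbd1 /drbddata &&
        run mfsmaster start &&
        run /etc/init.d/keepalived start
}
```

Setting DRY_RUN=1 before calling either function prints the commands without executing them, which is a cheap way to review the sequence first.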

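One more thing worth mentioning is that split-brain can occur. It is usually caused by performing the operations out of order, although there can be other causes. The following write-up comes from an expert online: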

 

1. Status under normal conditions:
 [root@drbd1 ~]# cat /proc/drbd 
 version: 8.3.8 (api:88/proto:86-94)
 srcversion: 299AFE04D7AFD98B3CA0AF9 
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:2144476 nr:0 dw:36468 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 
[root@drbd2 ~]# cat /proc/drbd 
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
 
    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 
2. After drbd1 fails
 
Status of drbd1:
 
[root@drbd1 ~]# cat /proc/drbd 
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
 
    ns:4 nr:102664 dw:102668 dr:157 al:1 bm:8 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 
Status of drbd2:
 
[root@drbd2 ~]# cat /proc/drbd 
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----
 
    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 
3. How to handle it:
 
a. Promote the secondary to the primary role
 
[root@drbd2 ~]# drbdsetup /dev/drbd0 primary -o
 
[root@drbd2 ~]# cat /proc/drbd 
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r----
 
    ns:0 nr:2141684 dw:2141684 dr:0 al:0 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 
Mount it:
 
[root@drbd2 /]# mount /dev/drbd0 /data1
 
[root@drbd2 data1]# ll
 
total 10272
 
-rw-r--r-- 1 root root 10485760 Feb 13 11:26 aa.img
 
drwx------ 2 root root    16384 Feb 13 11:25 lost+found
 
At this point drbd2 starts serving and new data is written to it.
 
After the original primary, drbd1, recovers:
 
[root@drbd1 ~]# cat /proc/drbd
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
 
    ns:2144476 nr:0 dw:36484 dr:2115769 al:14 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8
 
drbd1 is in the StandAlone state; in this state drbd1 and drbd2 will not contact each other.
 
Let's look at the log:
 
[root@drbd1 ~]# tailf /var/log/messages
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: error receiving ReportState, l: 4!
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: asender terminated
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_asender
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: Connection closed
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: receiver terminated
 
Feb 13 16:14:27 drbd1 kernel: block drbd0: Terminating drbd0_receiver
 
Split-brain has occurred.
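
The "split-brain" helper line in the log above is a convenient thing to search for when checking whether a node has hit this condition. A minimal sketch follows; the function name split_brain_logged is made up, and it assumes kernel messages go to /var/log/messages as in the log excerpt.

```shell
#!/bin/bash
# split_brain_logged [FILE]: succeed if FILE (default /var/log/messages)
# contains the DRBD split-brain helper line shown in the log excerpt.
split_brain_logged() {
    grep -q 'drbdadm split-brain' "${1:-/var/log/messages}"
}

# Example use on a DRBD node:
#   split_brain_logged && echo "split-brain was reported; check /proc/drbd"
```

A match only says that split-brain was reported at some point in that log file, so the current state in /proc/drbd still needs to be checked.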
 
Solution:
 
1> First, change the role of drbd1 to secondary
 
[root@drbd1 ~]# drbdadm secondary r0
 
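[root@drbd1 ~]# drbdadm -- --discard-my-data connect r0  ## this tells DRBD that the local data on drbd1 (now the secondary) is to be discarded and that the primary's data takes precedence.
 
2> Then run the following on drbd2
 
[root@drbd2 /]# drbdadm connect r0
 
drbd1 can now reconnect to drbd2. The data on drbd2 (the current primary) is preserved, while changes made only on drbd1 during the split are discarded: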
 
[root@drbd1 ~]# cat /proc/drbd      
 
version: 8.3.8 (api:88/proto:86-94)
 
srcversion: 299AFE04D7AFD98B3CA0AF9 
 
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
 
    ns:0 nr:20592 dw:20592 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

 

Posted on 2017-08-24 15:29 by 林肯公园
