Cluster - HA - keepalived
Background reading
VRRP: https://www.cnblogs.com/aftree/p/9376427.html
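Requirements
In a cluster, the state of the back-end RealServers must be checked so that problems are detected and handled automatically. This covers problem detection, problem handling, and recovery handling.
Problem detection: an accurate mechanism that finds problems within milliseconds. For example, a port may still be open while the process behind it has hung and can no longer serve requests, so monitoring only whether the port is in the LISTEN state is unreliable.
Problem handling: following rules defined in advance, promptly reclaim resources from a failed server and remove it from the service pool. There must also be a mechanism that keeps monitoring the failed server's service, so that once the service recovers the server is promptly returned to normal status and serves traffic again.
Recovery handling: servers that have recovered must be managed and controlled. Master/backup roles and priorities are set according to predefined rules, including what priority a recovered server gets and whether it is added back to the service list if it fails again. For example, a server may recover, go down again a few minutes later, and repeat this cycle. Every such cycle costs the cluster resources when it re-elects master and backup, and frequent contention can also lead to problems such as split-brain. A good mechanism is needed to avoid these problems.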
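Chapter 1 The keepalived service
1.1 What is keepalived?
Keepalived was originally designed for the LVS load balancer, to manage and monitor the state of each service node in an LVS cluster. VRRP support for high availability was added later. As a result, besides managing LVS, Keepalived can also serve as a high-availability solution for other services (for example Nginx, HAProxy, MySQL).
Keepalived implements high availability mainly through the VRRP protocol. VRRP stands for Virtual Router Redundancy Protocol. It was created to solve the single point of failure of static routing, and it keeps the whole network running without interruption when individual nodes go down.
So Keepalived can on the one hand configure and manage LVS and health-check the nodes behind LVS, and on the other hand provide high availability for system network services.
keepalived website: http://www.keepalived.org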
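1.2 Three important functions of the keepalived service
Manage the LVS load balancer
Health-check the nodes of an LVS cluster
Provide high availability (failover) for system network services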
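1.3 How Keepalived high-availability failover works
Failover between a Keepalived high-availability pair is implemented through VRRP (Virtual Router Redundancy Protocol).
While the Keepalived service is working normally, the Master node keeps sending heartbeat messages (by multicast) to the Backup node to tell it that the Master is still alive. When the Master node fails, it can no longer send heartbeats, the Backup node stops detecting them, and it therefore invokes its own takeover routine to take over the Master node's IP resources and services. When the Master node recovers, the Backup node releases the IP resources and services it took over during the failure and returns to its backup role.
So what is VRRP?
VRRP, in full Virtual Router Redundancy Protocol, was created to solve the single point of failure of static routing. VRRP uses an election mechanism to hand the routing task to one of the VRRP routers.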
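1.4 How keepalived works
1.4.1 keepalived high-availability architecture diagram
1.4.2 Text description
How Keepalived works:
A Keepalived high-availability pair communicates through VRRP, so we start with VRRP:
1) VRRP, in full Virtual Router Redundancy Protocol, was created to solve the single point of failure of static routing.
2) VRRP uses an election protocol to hand the routing task to one of the VRRP routers.
3) VRRP uses IP multicast (default multicast address 224.0.0.18) for communication between the members of a high-availability pair.
4) In operation the master sends packets and the backup receives them. When the backup stops receiving packets from the master, it starts the takeover routine and takes over the master's resources. There can be several backup nodes, chosen by priority election, but in day-to-day Keepalived operations a single pair is the norm.
5) VRRP can protect its packets with a cryptographic protocol (IPsec AH), but the Keepalived project still recommends configuring the plain-text authentication type and password.
With VRRP covered, here is how the Keepalived service itself works:
A Keepalived high-availability pair communicates through VRRP, and VRRP uses an election to decide which node is master and which is backup. The master has a higher priority than the backup, so in operation the master obtains all the resources first and the backup waits. When the master goes down, the backup takes over the master's resources and serves traffic in its place.
Within a Keepalived pair, only the server acting as master keeps sending VRRP advertisements (multicast packets) to tell the backup that it is still alive, and during this time the backup does not preempt the master. When the master becomes unavailable, that is, when the backup no longer hears the master's advertisements, the backup starts the relevant services and takes over the resources to keep the business running. Takeover can complete in under 1 second at best.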
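The takeover delay can be estimated from the VRRP timers. As a rough sketch, assuming VRRPv2 as specified in RFC 3768, a backup declares the master dead after Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256 seconds. The function name below is illustrative only; its two arguments correspond to the advert_int and priority settings of the backup node.

```shell
# Estimate how long a backup waits before taking over from a silent master.
# Assumes the VRRPv2 timer rules from RFC 3768.
master_down_interval() {
    # $1: advert_int in seconds, $2: priority of the backup node (1-254)
    awk -v a="$1" -v p="$2" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 1 100
```

With advert_int 1 and priority 100 this prints 3.609, so a master that fails silently is detected after a few seconds. Sub-second takeover generally corresponds to a master that shuts down cleanly, in which case the backup waits only Skew_Time, or to a shorter advertisement interval.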
Chapter 2 Using keepalived
2.1 Deployment
2.1.1 Milestone 1: install keepalived
yum install keepalived -y
Milestone 2: test with the default configuration
2.1.2 Configuration file walkthrough
Lines 1-13: global configuration
Lines 15-30: virtual IP configuration (VRRP)
LVS management configuration
2.1.3 Final configuration files
Master load balancer configuration
Backup load balancer configuration
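The configuration listings themselves are not reproduced here. As a minimal sketch, a master configuration for this setup might look like the file written by the helper below. The VIP 10.0.0.3 matches the nginx listen address used elsewhere in these notes; the node name lb01, interface eth0, virtual_router_id 51, priority 150, and password 1111 are assumptions, and write_master_conf is an illustrative helper, not part of keepalived.

```shell
# Write a minimal master keepalived.conf to the given path.
# All concrete values are assumptions for illustration.
write_master_conf() {
    # $1: destination path, for example /etc/keepalived/keepalived.conf
    cat > "$1" <<'EOF'
global_defs {
    router_id lb01
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3/24
    }
}
EOF
}
```

Calling write_master_conf /etc/keepalived/keepalived.conf as root would install the file. The backup node uses the same structure with its own router_id, state BACKUP, and a lower priority.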
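2.1.4 Start keepalived
2.1.5 [Note] Before running access tests, make sure every back-end node can be reached on its own.
Test connectivity to the back-end nodes.
2.1.6 Check the virtual IP status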
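One way to check whether a node currently holds the VIP is to look for it in the output of ip addr. The helper below is a sketch: has_vip is a hypothetical name, and the VIP 10.0.0.3 and interface eth0 in the usage comment are assumptions. It reads ip addr style output on standard input so that it can also be tried on saved output.

```shell
# Succeeds if the given address appears as an "inet" entry in
# `ip addr` style output read from standard input.
has_vip() {
    # $1: the virtual IP to look for, without prefix length
    grep -Fq "inet $1/"
}

# Usage on a load balancer (eth0 and 10.0.0.3 are assumed values):
#   ip addr show eth0 | has_vip 10.0.0.3 && echo "VIP is on this node"
```

The trailing slash in the pattern keeps 10.0.0.3 from matching longer addresses such as 10.0.0.31, since ip addr prints every address with its prefix length.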
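2.1.7 [Summary] Configuration file changes
Differences between the master and backup Keepalived configuration files:
01. router_id differs
02. state differs
03. priority (the master/backup election priority) differs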
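Because only these three settings differ, the backup file can be derived mechanically from the master file. The sketch below assumes the master file uses router_id lb01, state MASTER, and priority 150, and that the backup should use lb02, BACKUP, and 100; those values and the helper name make_backup_conf are illustrative assumptions.

```shell
# Derive a backup configuration from a master configuration by
# rewriting the three settings that differ between the two nodes.
make_backup_conf() {
    # $1: path of the master config, $2: path to write the backup config
    sed -e 's/router_id lb01/router_id lb02/' \
        -e 's/state MASTER/state BACKUP/' \
        -e 's/priority 150/priority 100/' "$1" > "$2"
}
```

Everything else, in particular virtual_router_id, the authentication block, and the virtual IP, is copied unchanged, which is what the two nodes of one VRRP instance need.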
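2.2 Split-brain
In a high-availability (HA) system, when the "heartbeat link" connecting the 2 nodes is broken, the HA system, which used to act as one coordinated whole, splits into 2 independent nodes. Having lost contact, each assumes the other has failed. The HA software on both nodes then behaves like a "split-brain patient", fighting over the "shared resources" and racing to start the "application service", with serious consequences: either the shared resources get carved up and the "service" comes up on neither side, or the "service" comes up on both sides and both read and write the "shared storage" at the same time, corrupting data (a common case is errors in a database's rotating online logs).
The countermeasures for HA split-brain on which there is broad agreement are roughly as follows:
1) Add redundant heartbeat links, for example two heartbeat lines (making the heartbeat itself HA), to minimise the chance of split-brain.
2) Enable disk locking. The serving side locks the shared disk so that, when split-brain occurs, the other side cannot take the shared disk away at all. Disk locking has a significant problem of its own: if the side holding the shared disk does not release the lock, the other side can never obtain the disk. In practice, if the serving node suddenly hangs or crashes, it cannot run the unlock command, so the standby node cannot take over the shared resources and application service. For this reason some HA designs use a "smart" lock: the serving side enables the disk lock only when it finds that all heartbeat links are down (it can no longer see its peer), and leaves the disk unlocked the rest of the time.
3) Set up an arbitration mechanism, for example a reference IP (such as the gateway IP). When the heartbeat links are completely down, each of the 2 nodes pings the reference IP. If the ping fails, the break is on the local side: not only the heartbeat but also the local network link that carries the service is down, so starting (or continuing) the application service is pointless. That node gives up the contest and lets the side that can ping the reference IP run the service. To be safer still, the side that cannot ping the reference IP simply reboots itself to release any shared resources it may still hold.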
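2.2.1 Causes of split-brain
In general, split-brain has the following causes:
1) The heartbeat link between the HA pair fails, so the nodes cannot communicate:
   a) the heartbeat cable is faulty (broken or aged);
   b) the NIC or its driver is faulty, or there is an IP configuration problem or conflict (directly connected NICs);
   c) a device on the heartbeat path is faulty (NIC or switch);
   d) the arbitration machine fails (when an arbitration scheme is used).
2) An iptables firewall enabled on an HA server blocks the heartbeat messages.
3) The heartbeat NIC address or related settings are misconfigured on an HA server, so heartbeats cannot be sent.
4) Other misconfiguration, such as mismatched heartbeat methods, heartbeat broadcast conflicts, or software bugs.
Note: in the Keepalived configuration, a mismatch in virtual_router_id between the two ends of the same VRRP instance also causes split-brain.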
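2.2.2 Common solutions
In production, split-brain can be guarded against in the following ways:
1) Connect the nodes with both a serial cable and an Ethernet cable and use both as heartbeat links, so that if one link fails the other still carries heartbeat messages.
2) When split-brain is detected, forcibly shut down one of the nodes (this requires special equipment, such as STONITH or fence devices). In effect, when the backup node stops receiving heartbeat messages, it sends a power-off command over a separate line to cut the master node's power.
3) Monitor and alert on split-brain (by email, SMS, or an on-call rota), so that a person can step in and arbitrate as soon as the problem occurs and limit the damage. For example, Baidu's SMS alerts work in both directions, uplink and downlink: the alert is sent to the administrator's phone, and the administrator can reply with a number or a short string that is passed back to the server, which then handles the fault automatically according to the instruction, shortening the time to resolution.
Of course, when implementing a high-availability solution, decide from the actual business requirements whether such a loss can be tolerated. For the routine business of an ordinary website, this loss is tolerable.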
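2.3 How to monitor for split-brain
2.3.1 Which server should run the monitoring?
Run it on the backup server. Zabbix can be used for this; see http://www.cnblogs.com/clsn/p/7885990.html
2.3.2 What to monitor?
The VIP appearing on the backup, which happens when:
1) split-brain occurs
2) a normal master/backup switchover takes place
2.3.3 Write the split-brain monitoring script
After writing the script, give it execute permission.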
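The script itself is not reproduced here. The following is a sketch of one possible check for the backup node, not the original script. It treats "VIP present locally and master still answering ping" as a possible split-brain and "VIP present, master silent" as an ordinary failover. That distinction is a heuristic added for illustration, and the interface eth0, the master address 10.0.0.5, and all function names are assumptions.

```shell
#!/bin/sh
# Sketch of a split-brain check intended to run on the backup node.
VIP=10.0.0.3          # virtual IP (matches the nginx listen address)
IFACE=eth0            # assumed interface that carries the VIP
MASTER_IP=10.0.0.5    # hypothetical real address of the master node

split_brain_status() {
    # $1: 1 if the VIP is configured on this backup node, else 0
    # $2: 1 if the master still answers on its own address, else 0
    if [ "$1" -eq 1 ] && [ "$2" -eq 1 ]; then
        echo "possible split-brain"
    elif [ "$1" -eq 1 ]; then
        echo "failover"
    else
        echo "ok"
    fi
}

check_once() {
    vip_here=0
    master_up=0
    ip addr show "$IFACE" | grep -Fq "inet $VIP/" && vip_here=1
    ping -c 1 -W 1 "$MASTER_IP" >/dev/null 2>&1 && master_up=1
    split_brain_status "$vip_here" "$master_up"
}

# Call check_once from cron or from a monitoring agent and alert on
# any result other than "ok".
```

Keeping the decision in split_brain_status, separate from the commands that gather the two facts, makes the logic easy to try by hand, for example split_brain_status 1 1.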
2.3.4 Test: make sure both load balancers balance traffic correctly
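2.4 Troubleshooting procedure
1) On the load balancer, curl every back-end node (this reveals web server configuration problems)
2) curl the load balancer's own address and confirm that requests are balanced
3) On Windows, bind the virtual IP (for example in the hosts file) and test in a browser
keepalived log file location: /var/log/messages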
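2.5 Change the nginx reverse-proxy configuration to listen only on the VIP
Change the nginx listen directive: listen 10.0.0.3:80;
Change a kernel parameter so that a process can listen on an IP address that does not exist locally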
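On Linux the kernel parameter that allows binding to an address not present on the host is net.ipv4.ip_nonlocal_bind. The helper below is a sketch that adds the setting to a sysctl file only if it is not already there; the helper name is illustrative, and using /etc/sysctl.conf as the target is an assumption about the host.

```shell
# Append net.ipv4.ip_nonlocal_bind = 1 to a sysctl file unless the
# key is already present in it.
enable_nonlocal_bind() {
    # $1: sysctl configuration file, normally /etc/sysctl.conf
    grep -q '^net.ipv4.ip_nonlocal_bind' "$1" 2>/dev/null ||
        echo 'net.ipv4.ip_nonlocal_bind = 1' >> "$1"
}

# As root:
#   enable_nonlocal_bind /etc/sysctl.conf
#   sysctl -p
```

Without this setting, nginx on the node that does not currently hold 10.0.0.3 cannot bind to that address and fails to start.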
2.6 Let keepalived monitor nginx
Write the check script
Remember to make the script executable
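The check script is not reproduced here. A minimal sketch, under the assumption that the desired reaction to a dead nginx is to stop keepalived so that the VIP moves to the other node, could look like this. The function names are illustrative, and systemctl assumes a systemd host; older systems would call the keepalived init script instead.

```shell
#!/bin/sh
# Sketch of an nginx check: stop keepalived when no nginx process runs.

nginx_is_down() {
    # $1: number of running nginx processes
    [ "$1" -eq 0 ]
}

check_web() {
    # pgrep -x matches the process name exactly, so this script is not
    # counted even if its file name contains "nginx".
    count=$(pgrep -x nginx | wc -l)
    if nginx_is_down "$count"; then
        # Assumption: systemd manages keepalived on this host.
        systemctl stop keepalived
    fi
}

# Uncomment when installing this file as the check script:
# check_web
```

After saving the file, chmod +x makes it executable.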
2.6.1 Use keepalived's script monitoring
Note: the name of the script should not be the same as, or similar to, the name of the service
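keepalived runs an external check through a vrrp_script block that is referenced from track_script inside the VRRP instance. The helper below prints a sketch of such a fragment. The script path /server/scripts/check_web.sh, the block name check_web, and the 2 second interval are assumptions, and print_track_fragment is only a convenience for showing the text. The naming advice above presumably exists because a check that counts processes by matching the service name would otherwise count the check script itself.

```shell
# Print a keepalived.conf fragment that runs an external check script.
# Path, block name, and interval are illustrative assumptions.
print_track_fragment() {
    cat <<'EOF'
vrrp_script check_web {
    script "/server/scripts/check_web.sh"
    interval 2
}

# Inside the existing vrrp_instance block:
    track_script {
        check_web
    }
EOF
}

print_track_fragment
```

The vrrp_script block goes at the top level of keepalived.conf, before the vrrp_instance that refers to it.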
2.7 Multi-instance configuration
2.7.1 keepalived configuration file on lb01
2.7.2 Modify the keepalived configuration file on lb02
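The two listings are not reproduced here. As a sketch of the idea, each node is master for one VRRP instance and backup for the other, so both VIPs are served in normal operation and either node can carry both after a failure. The helper below prints such a configuration for lb01 or lb02. The second VIP 10.0.0.4, interface eth0, router IDs 51 and 52, the priorities, and the password are assumptions, and the helper name is illustrative.

```shell
# Print a two-instance keepalived.conf for the named node.
# lb01 is master for VI_1, lb02 is master for VI_2.
print_dual_conf() {
    # $1: node name, lb01 or lb02
    case "$1" in
        lb01) s1=MASTER; p1=150; s2=BACKUP; p2=100 ;;
        lb02) s1=BACKUP; p1=100; s2=MASTER; p2=150 ;;
        *) echo "usage: print_dual_conf lb01|lb02" >&2; return 1 ;;
    esac
    cat <<EOF
global_defs {
    router_id $1
}

vrrp_instance VI_1 {
    state $s1
    interface eth0
    virtual_router_id 51
    priority $p1
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3/24
    }
}

vrrp_instance VI_2 {
    state $s2
    interface eth0
    virtual_router_id 52
    priority $p2
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.4/24
    }
}
EOF
}
```

Running print_dual_conf lb01 and print_dual_conf lb02 side by side shows that the two files differ only in router_id and in the state and priority of each instance, while each instance keeps its own virtual_router_id on both nodes.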
Modify the nginx configuration so that bbs and www listen on different IP addresses
lb01
lb02
2.8 keepalived dual-master mode diagram
Software Design
Keepalived is written entirely in pure ANSI/ISO C. The software is articulated around a central I/O multiplexer that provides a realtime networking design. The main design focus was to provide homogeneous modularity between all elements, which is why a core library was created to remove code duplication. The other goal was to produce safe and secure code to ensure production robustness and stability.
To ensure robustness and stability, the daemon is split into 3 distinct processes. The global design is based on a minimalistic parent process in charge of monitoring its forked children. There are 2 child processes, one responsible for the VRRP framework and the other for healthchecking. Each child process has its own scheduling I/O multiplexer; that way VRRP scheduling jitter is optimized, since VRRP scheduling is more sensitive and critical than the healthcheckers. This split design also minimizes, for healthchecking, the use of foreign libraries and reduces its own activity to an idle mainloop in order to avoid malfunctions caused by itself.
The parent process monitoring framework is called the watchdog. The design is as follows: each child process opens an accept unix domain socket; then, during daemon bootstrap, the parent process connects to those unix domain sockets and sends periodic (5s) hello packets to the children. If the parent cannot send a hello packet to a connected unix domain socket, it simply restarts the child process. This watchdog design offers 2 benefits. First, hello packets sent from the parent process to the connected children go through the I/O multiplexer scheduler, so the parent can detect a dead loop in a child's scheduling framework. The second benefit comes from the use of sysV signals to detect dead children. When running, the process list shows the parent process and its two children.
All the atomic elements are introduced below:
Control Plane: Keepalived configuration is done through the file keepalived.conf. A compiler design is used for parsing. The parser works with a keyword tree hierarchy that maps each configuration keyword to a specific handler. A central multi-level recursive function reads the configuration file and traverses the keyword tree. During parsing, the configuration file is translated into an internal memory representation.
Scheduler - I/O Multiplexer: All events are scheduled within the same process, and each Keepalived process is single-threaded. Keepalived is network routing software, so it is very close to I/O. The design used here is a central select(...) that is in charge of scheduling all internal tasks. POSIX thread libs are NOT used. This framework provides its own thread abstraction optimized for networking purposes.
Memory Management: This framework provides access to some generic memory management functions such as allocation, reallocation, and release. It can be used in two modes: normal_mode and debug_mode. In debug_mode it provides a strong way to track and eradicate memory leaks. This low-level environment provides buffer under-run protection by tracking allocated and released memory. All buffers used have a fixed length to protect against buffer overflows.
Core components: This framework defines some common and global libraries that are used throughout the code. Those libraries are: html parsing, link-list, timer, vector, string formatting, buffer dump, networking utils, daemon management, pid handling, and low-level TCP layer4. The goal here is to factor the code as much as possible, limiting code duplication and increasing modularity.
WatchDog: The parent-process framework that monitors the child processes (VRRP and healthchecking), as described above.
Checkers: This is one of the main Keepalived functionalities. Checkers are in charge of realserver healthchecking. A checker tests whether a realserver is alive, and the test ends in a binary decision: remove the realserver from, or add it to, the LVS topology. The internal checker design is realtime networking software that uses a fully multi-threaded FSM (Finite State Machine) design. The checker stack manipulates the LVS topology according to layer4 to layer5/7 test results. It runs in an independent process monitored by the parent process.
VRRP Stack: The other most important Keepalived functionality. VRRP (Virtual Router Redundancy Protocol: RFC2338) is focused on director takeover and provides a low-level design for router backup. It implements the full IETF RFC2338 standard with some provisions and extensions for LVS and firewall designs. It implements the vrrp_sync_group extension, which guarantees a persistent routing path after protocol takeover. It implements IPSEC-AH using MD5-96bit crypto provision for securing the exchange of protocol adverts. For more information on VRRP, please read the RFC. Important: the VRRP code can be used without LVS support; it has been designed for independent use. It runs in an independent process monitored by the parent process.
System call: This framework offers the ability to launch extra system scripts. It is mainly used in the MISC checker. In the VRRP framework it provides the ability to launch extra scripts during protocol state transitions. The system call is made in a forked process so as not to perturb the global scheduling timer.
SMTP: The SMTP protocol is used for administration notifications. It implements IETF RFC821 using a multi-threaded FSM design. Administration notifications are sent for healthchecker activity and VRRP protocol state transitions. SMTP is commonly used and can be interfaced with any other notification sub-system such as GSM-SMS or pagers.
Netlink Reflector: Same as the IPVS wrapper. Keepalived works with its own network interface representation. IP addresses and interface flags are set and monitored through the kernel Netlink channel. The Netlink messaging sub-system is used for setting VRRP VIPs. In addition, the Netlink kernel messaging broadcast capability is used to reflect any interface-related event into Keepalived's internal userspace data representation. So any Netlink manipulation by another userspace program is reflected in the Keepalived data representation via the Netlink kernel broadcast (RTMGRP_LINK & RTMGRP_IPV4_IFADDR).
IPVS wrapper: This framework is used for sending rules to the kernel IPVS code. It translates between Keepalived's internal data representation and the IPVS rule_user representation. It uses the IPVS libipvs to keep generic integration with the IPVS code.
IPVS: The Linux kernel code provided by Wensong from the LinuxVirtualServer.org open source project.
NETLINK: The Linux kernel code provided by Alexey Kuznetov, with its very nice advanced routing framework and sub-system capabilities.