LightDB Distributed Deployment with High Availability and Load Balancing
Software version
LightDB 13.8-22.3
Installing the distributed multi-machine single-instance mode
Following section 6.3 of the LightDB installation guide, install the distributed multi-machine single-instance mode.
After installation, confirm that the environment variables $LTDATA and $LTHOME are configured correctly and that the worker nodes have been added.
This article assumes the CN (coordinator node, primary) is installed on machine 186 and the two DNs (data nodes) on machines 192 and 193, all listening on port 15858. The following sections describe how to set up CN high availability (with the CN standby installed on machine 187), including failover support, allowing the CN standby to accept DML, and configuring LVS for load balancing.
Query the data nodes on the CN:

```shell
ltsql -p 15858 -h 10.18.68.186
```

```
canopy@lt_test=# select nodeid,nodename,nodeport,isactive from pg_dist_node;
 nodeid |   nodename   | nodeport | isactive
--------+--------------+----------+----------
      2 | 10.18.68.192 |    15858 | t
      3 | 10.18.68.193 |    15858 | t
(2 rows)
```
Setting up CN high availability with failover support
Operations on the CN primary machine
In this example, perform the following steps on machine 186:
- Run lt_ctl stop to stop the CN instance, then edit $LTDATA/lightdb.conf and append ltcluster to shared_preload_libraries, for example:

```
shared_preload_libraries='canopy,ltcluster,lt_stat_statements,lt_stat_activity,lt_prewarm,lt_cron,ltaudit,lt_hint_plan'
```
- Run lt_ctl start to start the CN instance, then create the high-availability metadata with the following commands:

```shell
ltsql -p 15858 -h localhost -dpostgres -c"create extension ltcluster;"
ltsql -p 15858 -h localhost -dpostgres -c"create role ltcluster superuser password 'ltcluster' login;"
ltsql -p 15858 -h localhost -dpostgres -c"create database ltcluster owner ltcluster;"
```
- Add an authentication entry so the standby is allowed to replicate from the primary; after the echo, run lt_ctl reload to reload the configuration:

```shell
echo "
host replication ltcluster 10.18.68.0/24 trust
" >> $LTDATA/lt_hba.conf
lt_ctl reload
```
- Run the following shell script to generate the high-availability configuration file ltcluster.conf (note: the inner double quotes in failover_validation_command are escaped so they survive into the generated file):

```shell
id=186
NODE_NAME=cn186
ip=10.18.68.186
port=15858
ltclusterconf=$LTHOME/etc/ltcluster/ltcluster.conf
echo "
node_id=$id
node_name='$NODE_NAME'
conninfo='host=$ip port=$port user=ltcluster dbname=ltcluster connect_timeout=2'
data_directory='$LTDATA'
pg_bindir='$LTHOME/bin'
failover='automatic'
promote_command='$LTHOME/bin/ltcluster standby promote -f $ltclusterconf'
follow_command='$LTHOME/bin/ltcluster standby follow -f $ltclusterconf --upstream-node-id=%n'
restore_command='cp $LTHOME/archive/%f %p'
monitoring_history=true              # enable monitoring
monitor_interval_secs=2              # interval for writing monitoring data
connection_check_type='ping'
reconnect_attempts=3                 # attempts to reconnect to the primary before failover (default 6)
reconnect_interval=5
standby_disconnect_on_failover=true
log_level=INFO
log_facility=STDERR
log_file='$LTHOME/etc/ltcluster/ltcluster.log'
failover_validation_command='$LTHOME/etc/ltcluster/ltcluster_failover.sh \"$LTHOME\" \"$LTDATA\"'
shutdown_check_timeout=1800
use_replication_slots=true
check_lightdb_command='$LTHOME/etc/ltcluster/check_lightdb.sh'
check_lightdb_interval=10
" > $ltclusterconf
```
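Note that the generator uses a double-quoted echo, so $LTHOME, $LTDATA, and the script variables are expanded when the file is written; ltcluster.conf ends up containing literal paths and values, not variable references. A minimal, self-contained sketch of the same quoting pattern (the node_port key here is only illustrative, not a real config entry):

```shell
# Demonstrates the quoting used by the generator above: variables in a
# double-quoted echo are expanded at generation time, so the written
# file contains literal values rather than "$port".
port=15858
conf=$(mktemp)
echo "
node_port=$port
" > "$conf"
grep node_port "$conf"   # the stored line carries the literal value
```

The same applies to the promote_command and log_file entries, which embed the expanded $LTHOME path at generation time.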
- Register the CN primary with the following commands and check its status:

```shell
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf primary register -F
ltclusterd -d -f $LTHOME/etc/ltcluster/ltcluster.conf -p $LTHOME/etc/ltcluster/ltclusterd.pid
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf cluster show
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf service status
```
Operations on machine 187 (CN standby)
Machine 187 acts as the CN standby; perform the following steps on it:
- Modify the ltcluster.conf generator script from the previous section as shown below, then run it to generate ltcluster.conf:

```shell
# change the ip, node name, etc. to 187's values
id=187
NODE_NAME=cn187
ip=10.18.68.187
port=15858
# the rest is identical to the previous section
```
- Clone the CN primary; the -h argument is the primary's IP. Depending on the data volume, this may take anywhere from a few minutes to several hours:

```shell
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf standby clone -h 10.18.68.186 -p 15858 -U ltcluster
```
- Once the clone completes, start the database, register the standby, and check its status:

```shell
lt_ctl start
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf standby register -F
ltclusterd -d -f $LTHOME/etc/ltcluster/ltcluster.conf -p $LTHOME/etc/ltcluster/ltclusterd.pid
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf cluster show
ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf service status
```
Example output is shown below: the cluster monitoring daemon (ltclusterd) is running, the cluster contains one primary and one standby, and the standby's upstream is cn186.

```
[canopy@host187 ~]$ ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf service status
 ID  | Name  | Role    | Status    | Upstream | ltclusterd | PID     | Paused? | Upstream last seen
-----+-------+---------+-----------+----------+------------+---------+---------+--------------------
 187 | cn187 | standby |   running | cn186    | running    | 3310911 | no      | 0 second(s) ago
 186 | cn186 | primary | * running |          | running    | 1118590 | no      | n/a
```
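When scripting around the cluster, this role table can be parsed directly. Below is a small hypothetical helper (not part of ltcluster) that extracts the primary's node name, with the column layout (ID | Name | Role | ...) assumed from the sample output above:

```shell
# Print the Name column of the row whose Role column is "primary".
# Column positions are assumed from the service status sample above.
parse_primary() {
    awk -F'|' '$3 ~ /primary/ { gsub(/ /, "", $2); print $2 }'
}

# Demo on a captured sample with the same shape as the output above:
sample=' 187 | cn187 | standby |   running | cn186
 186 | cn186 | primary | * running |'
printf '%s\n' "$sample" | parse_primary
```

In practice the input would be piped in from `ltcluster -f ... cluster show` or `service status`.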
Verifying that the CN standby accepts DML
On the CN primary (186), connect with ltsql -p 15858 and run:

```sql
create table the_table(id int, code text, price numeric(8,2));
select create_distributed_table('the_table', 'id');
insert into the_table values (1, '1', 3.439);
insert into the_table values (2, '2', 6.86);
select * from the_table;
```
On the CN standby (187), connect with ltsql -p 15858 and run:

```sql
select * from the_table;
delete from the_table where id = 1;  -- fails
SET canopy.writable_standby_coordinator TO on;  -- allow DML on the standby; the statements below now succeed
delete from the_table where id = 1;
delete from the_table where id = 2;
select * from the_table;
insert into the_table values (3, '3', 6.86);
select * from the_table;
```
To make this setting permanent, add canopy.writable_standby_coordinator = on to lightdb.conf on both CN nodes and run lt_ctl reload.
Deploying LVS load balancing
LVS in DR (direct routing) mode is used for load balancing. First install ipvsadm with yum install ipvsadm, or install the RPM package from the installation media.
Director script: before running it, edit the VIP, RIP1, RIP2, ethx (the NIC, check with ifconfig), and port variables at the top.

```shell
#!/bin/sh
#
# Startup script handling the initialisation of LVS (Director, DR mode)
# chkconfig: - 28 72
# description: Initialise the Linux Virtual Server for DR
#
### BEGIN INIT INFO
# Provides: ipvsadm
# Required-Start: $local_fs $network $named
# Required-Stop: $local_fs $remote_fs $network
# Short-Description: Initialise the Linux Virtual Server
# Description: The Linux Virtual Server is a highly scalable and highly
#   available server built on a cluster of real servers, with the load
#   balancer running on Linux.
### END INIT INFO

LOCK=/var/lock/ipvsadm.lock
VIP=10.19.70.166
RIP1=10.18.68.186   # CN IP
RIP2=10.18.68.187   # CN IP
ethx=enp1s0         # NIC
port=15858          # CN port

. /etc/rc.d/init.d/functions

start() {
    PID=`ipvsadm -Ln | grep ${VIP} | wc -l`
    if [ $PID -gt 0 ]; then
        echo "The LVS-DR Server is already running !"
    else
        # Set the virtual IP address
        /sbin/ifconfig $ethx:1 $VIP broadcast $VIP netmask 255.255.255.255 up
        /sbin/route add -host $VIP dev $ethx:1
        # Clear the IPVS table
        /sbin/ipvsadm -C
        # Configure LVS: round-robin over the RealServers
        /sbin/ipvsadm -At $VIP:$port -s rr
        /sbin/ipvsadm -at $VIP:$port -r $RIP1:$port -g -w 1
        /sbin/ipvsadm -at $VIP:$port -r $RIP2:$port -g -w 1
        #/sbin/ipvsadm -at $VIP:$port -r $RIP3:$port -g -w 1
        /bin/touch $LOCK
        echo "starting LVS-DR Server is ok !"
    fi
}

stop() {
    # Clear LVS and remove the VIP
    /sbin/ipvsadm -C
    /sbin/route del -host $VIP dev $ethx:1
    /sbin/ifconfig $ethx:1 down >/dev/null
    rm -rf $LOCK
    echo "stopping LVS-DR server is ok !"
}

status() {
    if [ -e $LOCK ]; then
        echo "The LVS-DR Server is already running !"
    else
        echo "The LVS-DR Server is not running !"
    fi
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        status
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
esac
exit 0
```
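The Director script configures the rr (round-robin) scheduler (ipvsadm -At $VIP:$port -s rr), which hands each new connection to the next RealServer in turn. A toy model of that assignment for the two equal-weight CNs in this article (illustration only, not how ipvsadm is implemented internally):

```shell
# Toy model of the rr scheduler: zero-based connection n is assigned
# to real server n mod 2 (two RealServers with weight 1 each).
rr_pick() {
    if [ $(( $1 % 2 )) -eq 0 ]; then
        echo 10.18.68.186
    else
        echo 10.18.68.187
    fi
}

for n in 0 1 2 3; do
    echo "conn$n -> $(rr_pick $n)"
done
```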
RealServer script: before running it, edit the VIP and ethx (the NIC, check with ifconfig) variables at the top. In DR mode each RealServer also holds the VIP on lo:0, so the arp_ignore/arp_announce settings are needed to stop the RealServers from answering ARP requests for the VIP; only the Director should receive new connections.

```shell
#!/bin/sh
#
# Startup script handling the initialisation of LVS (RealServer, DR mode)
# chkconfig: - 28 72
# description: Initialise the Linux Virtual Server for DR-RIP
#
### BEGIN INIT INFO
# Provides: ipvsadm
# Required-Start: $local_fs $network $named
# Required-Stop: $local_fs $remote_fs $network
# Short-Description: Initialise the Linux Virtual Server
# Description: The Linux Virtual Server is a highly scalable and highly
#   available server built on a cluster of real servers, with the load
#   balancer running on Linux.
### END INIT INFO

LOCK=/var/lock/ipvsadm.lock
VIP=10.19.70.166
ethx=enp1s0   # NIC

. /etc/rc.d/init.d/functions

start() {
    PID=`ifconfig | grep lo:0 | wc -l`
    if [ $PID -ne 0 ]; then
        echo "The LVS-DR-RIP Server is already running !"
    else
        # Bind the VIP to lo:0 so this RealServer accepts traffic for it
        /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 broadcast $VIP up
        /sbin/route add -host $VIP dev lo:0
        # Suppress ARP replies for the VIP
        echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
        echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
        echo "1" >/proc/sys/net/ipv4/conf/$ethx/arp_ignore
        echo "2" >/proc/sys/net/ipv4/conf/$ethx/arp_announce
        echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
        echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
        /bin/touch $LOCK
        echo "starting LVS-DR-RIP server is ok !"
    fi
}

stop() {
    /sbin/route del -host $VIP dev lo:0
    /sbin/ifconfig lo:0 down >/dev/null
    # Restore default ARP behaviour
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" >/proc/sys/net/ipv4/conf/$ethx/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/$ethx/arp_announce
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
    rm -rf $LOCK
    echo "stopping LVS-DR-RIP server is ok !"
}

status() {
    if [ -e $LOCK ]; then
        echo "The LVS-DR-RIP Server is already running !"
    else
        echo "The LVS-DR-RIP Server is not running !"
    fi
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        status
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
esac
exit 0
```
Because machines are limited, 186 serves both as the CN primary (a RealServer) and as the LVS Director, while 187 is the CN standby (a RealServer). Upload the Director and RealServer scripts above to /etc/init.d on 186 and the RealServer script to /etc/init.d on 187, make them executable with chmod +x, and start the services:

```shell
# on 186
./lvs-dr start   # Director script
./lvs-rs start   # RealServer script
# on 187
./lvs-rs start
```
Use ip a to check whether the virtual IP has been added to the corresponding interface.
Open several clients (for example ltsql) connected to the VIP, then run ipvsadm -Ln --stats on the Director to inspect the load distribution:

```
# ipvsadm -Ln --stats
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  10.19.70.166:15858                  5       16       15     2918     5763
  -> 10.18.68.186:15858                  3        7        6     1320     2461
  -> 10.18.68.187:15858                  2        9        9     1598     3302
```
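For monitoring scripts it can be handy to pull the per-RealServer connection counts out of this output. A small sketch, with field positions assumed from the sample layout above:

```shell
# Print "address:port conns" for each RealServer line ("-> ...") of
# `ipvsadm -Ln --stats` output, skipping the header row.
conns_per_rs() {
    awk '$1 == "->" && $2 != "RemoteAddress:Port" { print $2, $3 }'
}

# Demo on a captured sample (same shape as the stats output above):
stats='TCP  10.19.70.166:15858  5 16 15 2918 5763
  -> 10.18.68.186:15858  3 7 6 1320 2461
  -> 10.18.68.187:15858  2 9 9 1598 3302'
printf '%s\n' "$stats" | conns_per_rs
```

In practice this would be fed from `ipvsadm -Ln --stats` directly; with the rr scheduler the counts should stay roughly balanced.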