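Setting up a LightDB high-availability environment on a single server
1. Install the standalone edition of LightDB. For detailed steps, see http://www.light-pg.com/docs/LightDB_Install_Manual/current/index.html
Selected parameters of the standalone instance used in this example: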
port:60001
$LTHOME:/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1
$LTDATA:/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/data
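2. Adjust shared_buffers to suit the server's resources, so that the standby node does not fail to start (this step is optional)
3. Add the ltcluster extension and enable it
3.1. Append ltcluster to shared_preload_libraries
3.2. Restart the database, then run the following commands from the command line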
ltsql -p 60001 -h localhost -dpostgres -c"create extension ltcluster;"
ltsql -p 60001 -h localhost -dpostgres -c"create role ltcluster superuser password 'ltcluster' login;"
ltsql -p 60001 -h localhost -dpostgres -c"create database ltcluster owner ltcluster;"
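Step 3.1 can also be scripted. The sketch below is one possible way to do it, not an official LightDB tool: add_preload_lib is a hypothetical helper, and the configuration file path in the usage line ($LTDATA/lightdb.conf) is an assumption to adjust for your installation. It appends a new shared_preload_libraries line instead of editing the old one, relying on the PostgreSQL convention (assumed here to carry over to LightDB) that the last occurrence of a setting in the file takes effect.

```shell
# Hypothetical helper: make sure a library is listed in shared_preload_libraries.
# Usage: add_preload_lib <config file> <library name>
add_preload_lib() {
    conf_file=$1
    lib=$2
    # Value of the last active (uncommented) setting, quotes and spaces removed.
    current=$(sed -n "s/^[[:space:]]*shared_preload_libraries[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$conf_file" | tail -n 1 | tr -d ' ')
    case ",$current," in
        *",$lib,"*) return 0 ;;   # already listed, nothing to do
    esac
    if [ -n "$current" ]; then
        new_value="$current,$lib"
    else
        new_value="$lib"
    fi
    # Append a new line; the earlier line is left untouched.
    printf "shared_preload_libraries = '%s'\n" "$new_value" >> "$conf_file"
}

# Example call (the file name is an assumption):
# add_preload_lib "$LTDATA/lightdb.conf" ltcluster
```

The database still has to be restarted afterwards, as described in step 3.2.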
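4. Generate the configuration file for the primary node
4.1. Create the configuration directory and copy the bundled ltcluster files into it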
mkdir -p $LTDATA/../etc/ltcluster
cp -rf $LTHOME/etc/ltcluster $LTDATA/../etc/
4.2. Replace the variables with your own values, mainly id, NODE_NAME, ip (127.0.0.1 throughout this article) and port
echo "
node_id=$id
node_name='$NODE_NAME'
conninfo='host=$ip port=$port user=ltcluster dbname=ltcluster connect_timeout=2'
data_directory='$LTDATA'
pg_bindir='$LTHOME/bin'
failover='automatic'
promote_command='$LTHOME/bin/ltcluster standby promote -f $LTDATA/../etc/ltcluster/ltcluster.conf'
follow_command='$LTHOME/bin/ltcluster standby follow -f $LTDATA/../etc/ltcluster/ltcluster.conf --upstream-node-id=%n'
restore_command='cp $LTHOME/archive/%f %p'
monitoring_history=true # enable recording of monitoring history
monitor_interval_secs=2 # interval, in seconds, at which monitoring data is written
connection_check_type='ping'
reconnect_attempts=3 # number of attempts to reconnect to the primary before failover (default 6)
reconnect_interval=5
standby_disconnect_on_failover=true
log_level=INFO
log_facility=STDERR
log_file='$LTHOME/etc/ltcluster/ltcluster.log'
failover_validation_command='$LTHOME/etc/ltcluster/ltcluster_failover.sh \"$LTHOME\" \"$LTDATA\"'
shutdown_check_timeout=1800
use_replication_slots=true
check_lightdb_command='$LTHOME/etc/ltcluster/check_lightdb.sh'
check_lightdb_interval=10
" > $LTDATA/../etc/ltcluster/ltcluster.conf
Example
node_id=1
node_name='lightdbCluster12700160001'
conninfo='host=127.0.0.1 port=60001 user=ltcluster dbname=ltcluster connect_timeout=2'
data_directory='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/data'
pg_bindir='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin'
failover='automatic'
promote_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin/ltcluster standby promote -f /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/etc/ltcluster/ltcluster.conf'
follow_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin/ltcluster standby follow -f /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/etc/ltcluster/ltcluster.conf --upstream-node-id=%n'
monitoring_history=true # enable recording of monitoring history
monitor_interval_secs=2 # interval, in seconds, at which monitoring data is written
connection_check_type='ping'
reconnect_attempts=3 # number of attempts to reconnect to the primary before failover (default 6)
reconnect_interval=5
standby_disconnect_on_failover=true
log_level=INFO
log_facility=STDERR
log_file='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/etc/ltcluster/ltcluster.log'
failover_validation_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/etc/ltcluster/ltcluster_failover.sh "/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1" "/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/data"'
shutdown_check_timeout=1800
use_replication_slots=true
check_lightdb_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/etc/ltcluster/check_lightdb.sh "/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster" "/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster/data"'
check_lightdb_interval=10
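The placeholders in step 4.2 ($id, $NODE_NAME, $ip, $port) are ordinary shell variables, so they must be set before the echo command runs; otherwise the generated file silently contains empty values. The following is a minimal sketch of that preparation for the example above. require_vars is a hypothetical helper, and the node-name pattern (prefix, IP address without dots, port) is only inferred from the example value, not a documented rule.

```shell
# Example values for the primary node; replace with your own.
id=1
ip=127.0.0.1
port=60001
# Naming pattern inferred from the example: prefix + IP without dots + port.
NODE_NAME="lightdbCluster$(printf '%s' "$ip" | tr -d '.')$port"

# Hypothetical helper: report every named variable that is empty or unset.
# Returns non-zero if at least one is missing.
require_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "variable $name is not set" >&2
            missing=1
        fi
    done
    return $missing
}

# LTHOME and LTDATA are expected to come from the LightDB environment.
require_vars id NODE_NAME ip port LTHOME LTDATA ||
    echo "set the missing variables before generating ltcluster.conf" >&2
```

With these values the template corresponds to the example listing, apart from paths being written through $LTDATA/.. rather than in resolved form.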
5. Register the primary node
ltcluster -f $LTDATA/../etc/ltcluster/ltcluster.conf primary register -F
ltclusterd -d -f $LTDATA/../etc/ltcluster/ltcluster.conf -p $LTDATA/../etc/ltcluster/ltclusterd.pid
ltcluster -f $LTDATA/../etc/ltcluster/ltcluster.conf service status
ltcluster -f $LTDATA/../etc/ltcluster/ltcluster.conf cluster show
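Because ltclusterd is started with -p, the PID file offers a quick way to confirm that the daemon is still alive, and to tell a live daemon from a stale file. The sketch below assumes the PID file holds the process ID on its first line (an assumption about the file format); pid_alive is a hypothetical helper, not an ltcluster command.

```shell
# Hypothetical helper: succeed only if the PID recorded in the file
# belongs to a process that currently exists.
pid_alive() {
    pid_file=$1
    [ -s "$pid_file" ] || return 1          # missing or empty file
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;            # not a plain number
    esac
    # Signal 0 sends nothing; it only checks that the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Example call:
# pid_alive "$LTDATA/../etc/ltcluster/ltclusterd.pid" && echo "ltclusterd is running"
```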
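6. Configure the standby node
6.1. The standby node's instance directory and ltcluster directory can be laid out as needed; this article follows the high-availability layout of LightDB 13.8-23.1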
mkdir -p $LTHOME/cluster1/archive
mkdir -p $LTHOME/cluster1/etc/ltcluster
cp -rf $LTDATA/../etc/ltcluster/ $LTHOME/cluster1/etc/
cd $LTHOME/cluster1/etc/ltcluster
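rm -f ltclusterd.pid # the copied PID file must be removed, otherwise it interferes with starting ltclusterd on the standby node
6.2. Edit ltcluster.conf, which is now located under $LTHOME/cluster1/etc/ltcluster
Use the primary node's configuration as the reference. The id, NODE_NAME, ip (127.0.0.1 throughout this article), port, the ltcluster.conf directory and the instance directory all differ from the primary node and must be changed. An example follows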
node_id=2
node_name='lightdbCluster12700160002'
conninfo='host=127.0.0.1 port=60002 user=ltcluster dbname=ltcluster connect_timeout=2'
data_directory='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/data'
pg_bindir='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin'
failover='automatic'
promote_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin/ltcluster standby promote -f /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/etc/ltcluster/ltcluster.conf'
follow_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/bin/ltcluster standby follow -f /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/etc/ltcluster/ltcluster.conf --upstream-node-id=%n'
monitoring_history=true # enable recording of monitoring history
monitor_interval_secs=2 # interval, in seconds, at which monitoring data is written
connection_check_type='ping'
reconnect_attempts=3 # number of attempts to reconnect to the primary before failover (default 6)
reconnect_interval=5
standby_disconnect_on_failover=true
log_level=INFO
log_facility=STDERR
log_file='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/etc/ltcluster/ltcluster.log'
failover_validation_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/etc/ltcluster/ltcluster_failover.sh /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1 /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/data'
shutdown_check_timeout=1800
use_replication_slots=true
check_lightdb_command='/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/etc/ltcluster/check_lightdb.sh /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1 /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/data'
check_lightdb_interval=10
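Most of the standby file differs from the primary's only in the node id, the port and the cluster1 directory, so it can be derived mechanically. The following is a rough sketch under this article's layout only: make_standby_conf is a hypothetical helper, and it assumes the instance directory is named cluster on the primary and cluster1 on the standby, and that the node name ends with the port. Review the result by hand before using it.

```shell
# Hypothetical helper: derive a standby ltcluster.conf from the primary's.
# Usage: make_standby_conf <primary conf> <standby conf> <new id> <old port> <new port>
make_standby_conf() {
    primary_conf=$1
    standby_conf=$2
    new_id=$3
    old_port=$4
    new_port=$5
    sed -e "s/^node_id=.*/node_id=$new_id/" \
        -e "/^node_name=/s/$old_port'/$new_port'/" \
        -e "/^conninfo=/s/port=$old_port/port=$new_port/" \
        -e "s#/cluster\([/\"' ]\)#/cluster1\1#g" \
        "$primary_conf" > "$standby_conf"
}

# Example call (write to a new file, then compare with the primary's):
# make_standby_conf "$LTDATA/../etc/ltcluster/ltcluster.conf" /tmp/ltcluster.standby.conf 2 60001 60002
# diff "$LTDATA/../etc/ltcluster/ltcluster.conf" /tmp/ltcluster.standby.conf
```

The last sed expression rewrites only path components that are exactly cluster, so directories such as etc/ltcluster are left alone.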
7. Run the clone. Depending on the data volume, this can take anywhere from a few minutes to several hours
ltcluster -f $LTHOME/cluster1/etc/ltcluster/ltcluster.conf standby clone -h 127.0.0.1 -p 60001 -U ltcluster
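Important: the ltcluster.conf used here is the one created in step 6, not the primary's, while -h and -p refer to the primary node
8. Change the standby node's port to 60002 and update the archive directory in archive_command, then start the standby instance (the clone copies the primary's settings, so the port is still 60001 and the archive directory still points to the primary's; both must be changed)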
archive_command = 'rm -f /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/archive/%f && cp %p /home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/archive/%f'
lightdb_archive_dir = '/home/lightdb/test/monitored/ha/lightdb-x/13.8-23.1/cluster1/archive'
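The edits in step 8 can be applied with a small script instead of by hand. This is a sketch only: set_conf_param is a hypothetical helper, and the file name lightdb.conf inside the standby's data directory is an assumption. It comments out any active line for the parameter and appends the new value, so the previous setting stays visible in the file.

```shell
# Hypothetical helper: set "key = value" in a PostgreSQL-style configuration file.
# Usage: set_conf_param <config file> <key> <value>
set_conf_param() {
    conf_file=$1
    key=$2
    value=$3
    tmp_file=$(mktemp)
    # Comment out existing active lines for this key.
    sed "s/^[[:space:]]*$key[[:space:]]*=/#&/" "$conf_file" > "$tmp_file"
    # Append the new setting; printf '%s' leaves % characters in the value untouched.
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Example calls (the path is an assumption):
# conf="$LTHOME/cluster1/data/lightdb.conf"
# set_conf_param "$conf" port 60002
# set_conf_param "$conf" lightdb_archive_dir "'$LTHOME/cluster1/archive'"
```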
9. Register the standby node, using the ltcluster.conf created in step 6
# Register the standby node
ltcluster -f $LTHOME/cluster1/etc/ltcluster/ltcluster.conf standby register -F
# Start the ltclusterd daemon
ltclusterd -d -f $LTHOME/cluster1/etc/ltcluster/ltcluster.conf -p $LTHOME/cluster1/etc/ltcluster/ltclusterd.pid
10. Check the cluster status
ltcluster -f $LTDATA/../etc/ltcluster/ltcluster.conf service status
ltcluster -f $LTDATA/../etc/ltcluster/ltcluster.conf cluster show