postgresql 物理备份 pg_rman

os: centos 7.4
postgresql: 9.6.9
pg_rman: REL9_6_STABLE

pg_rman 是一款优秀的postgresql 在线备份和恢复的工具,在github上可以找到该软件。
下面是pg_rman主页面的描述:

pg_rman is an online backup and restore tool for PostgreSQL.

The goal of the pg_rman project is to provide a method for online backup and PITR that is as easy as pg_dump. Also, it maintains a backup catalog per database cluster. Users can maintain old backups including archive logs with one command.

下载

https://github.com/ossc-db/pg_rman

由于本次的postgresql 版本为 9.6.9,所以需要下载相应的pg_rman REL9_6_STABLE : branch for PostgreSQL 9.6

# su - postgres
$ cd /tmp
$ git clone https://github.com/ossc-db/pg_rman.git
$ cd pg_rman/
$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/REL9_2_STABLE
  remotes/origin/REL9_3_STABLE
  remotes/origin/REL9_4_STABLE
  remotes/origin/REL9_5_STABLE
  remotes/origin/REL9_6_STABLE
  remotes/origin/REL_10_STABLE
  remotes/origin/master
  remotes/origin/pre-9.2
$ git checkout REL9_6_STABLE
Already on 'REL9_6_STABLE'
$ git status
$ On branch REL9_6_STABLE
nothing to commit, working directory clean

安装

前提是postgresl 9.6 已经安装好了。

# su - postgres
$ make
$ make installcheck
$ exit
# cd /tmp/pg_rman
# make install
/usr/bin/mkdir -p '/usr/pgsql-9.6/bin'
/usr/bin/install -c  pg_rman '/usr/pgsql-9.6/bin'

# ls -l /usr/pgsql-9.6/bin | grep -i rman
-rwxr-xr-x. 1 root root  633680 Jun 11 23:08 pg_rman

pg_rman --help

$ which pg_rman
/usr/pgsql-9.6/bin/pg_rman
$ pg_rman --help
pg_rman manage backup/recovery of PostgreSQL database.

Usage:
  pg_rman OPTION init
  pg_rman OPTION backup
  pg_rman OPTION restore
  pg_rman OPTION show [DATE]
  pg_rman OPTION show detail [DATE]
  pg_rman OPTION validate [DATE]
  pg_rman OPTION delete DATE
  pg_rman OPTION purge

Common Options:
  -D, --pgdata=PATH         location of the database storage area
  -A, --arclog-path=PATH    location of archive WAL storage area
  -S, --srvlog-path=PATH    location of server log storage area
  -B, --backup-path=PATH    location of the backup storage area
  -c, --check               show what would have been done
  -v, --verbose             show what detail messages
  -P, --progress            show progress of processed files

Backup options:
  -b, --backup-mode=MODE    full, incremental, or archive
  -s, --with-serverlog      also backup server log files
  -Z, --compress-data       compress data backup with zlib
  -C, --smooth-checkpoint   do smooth checkpoint before backup
  -F, --full-backup-on-error   switch to full backup mode
                               if pg_rman cannot find validate full backup
                               on current timeline
      NOTE: this option is only used in --backup-mode=incremental or archive.
  --keep-data-generations=NUM keep NUM generations of full data backup
  --keep-data-days=NUM        keep enough data backup to recover to N days ago
  --keep-arclog-files=NUM   keep NUM of archived WAL
  --keep-arclog-days=DAY    keep archived WAL modified in DAY days
  --keep-srvlog-files=NUM   keep NUM of serverlogs
  --keep-srvlog-days=DAY    keep serverlog modified in DAY days
  --standby-host=HOSTNAME   standby host when taking backup from standby
  --standby-port=PORT       standby port when taking backup from standby

Restore options:
  --recovery-target-time    time stamp up to which recovery will proceed
  --recovery-target-xid     transaction ID up to which recovery will proceed
  --recovery-target-inclusive whether we stop just after the recovery target
  --recovery-target-timeline  recovering into a particular timeline
  --hard-copy                 copying archivelog not symbolic link

Catalog options:
  -a, --show-all            show deleted backup too

Delete options:
  -f, --force               forcibly delete backup older than given DATE

Connection options:
  -d, --dbname=DBNAME       database to connect
  -h, --host=HOSTNAME       database server host or socket directory
  -p, --port=PORT           database server port
  -U, --username=USERNAME   user name to connect as
  -w, --no-password         never prompt for password
  -W, --password            force password prompt

Generic options:
  -q, --quiet               don't show any INFO or DEBUG messages
  --debug                   show DEBUG messages
  --help                    show this help, then exit
  --version                 output version information, then exit

Read the website for details. <http://github.com/ossc-db/pg_rman>
Report bugs to <http://github.com/ossc-db/pg_rman/issues>.

pg_rman init

pg_rman 需要一个备份目录

# mkdir -p /mnt/walbackup
# mkdir -p /mnt/pg_rman_backupset
# chown -R postgres:postgres /mnt
$ ls -l
total 0
drwxr-xr-x. 2 postgres postgres 6 Jun 11 23:24 pg_rman_backupset
drwxr-xr-x. 2 postgres postgres 6 Jun 11 23:05 walbackup

$ vi ~/.bash_profile

export BACKUP_PATH=/mnt/pg_rman_backupset


pg_rman init 初始化

$ pg_rman init
INFO: ARCLOG_PATH is set to '/mnt/walbackup'
INFO: SRVLOG_PATH is set to '/var/lib/pgsql/9.6/data/pg_log'

pg_rman backup

pg_rman 全量备份

$ pg_rman backup --backup-mode=full --with-serverlog --progress
INFO: copying database files
Processed 1166 of 1166 files, skipped 0
INFO: copying archived WAL files
Processed 3 of 3 files, skipped 0
INFO: copying server log files
Processed 1 of 1 files, skipped 0
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.

pg_rman 校验备份集
pg_rman 的备份必须都是经过验证过的,否则不能进行恢复和增量备份

$ pg_rman validate
INFO: validate: "2018-06-11 23:30:47" backup, archive log files and server log files by CRC
INFO: backup "2018-06-11 23:30:47" is valid

pg_rman 列出备份集

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status 
=====================================================================
2018-06-11 23:30:47  2018-06-11 23:30:49  FULL    58MB     1  OK

$ ls -l /mnt/pg_rman_backupset/
total 8
drwx------. 3 postgres postgres 20 Jun 11 23:30 20180611
drwx------. 4 postgres postgres 35 Jun 11 23:27 backup
-rw-r--r--. 1 postgres postgres 75 Jun 11 23:27 pg_rman.ini
-rw-r--r--. 1 postgres postgres 40 Jun 11 23:27 system_identifier
drwx------. 2 postgres postgres  6 Jun 11 23:27 timeline_history

pg_rman 增量备份
增量备份是基于文件系统的update time时间线
增量备份必须有个对应的全库备份

$ pg_rman backup --backup-mode incremental --progress --compress-data 
INFO: copying database files
Processed 1435 of 1435 files, skipped 1135
INFO: copying archived WAL files
Processed 6 of 6 files, skipped 3
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.

$ pg_rman validate
INFO: validate: "2018-06-11 23:40:57" backup and archive log files by CRC
INFO: backup "2018-06-11 23:40:57" is valid

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status 
=====================================================================
2018-06-11 23:40:57  2018-06-11 23:40:59  INCR   907kB     1  OK
2018-06-11 23:30:47  2018-06-11 23:30:49  FULL    58MB     1  OK

pg_rman 删除备份集
若果提示不能删除,请执行查看输出信息。如果实在手贱,可以指定 -f 参数。

$ pg_rman delete '2018-06-11 23:30:47'
WARNING: cannot delete backup with start time "2018-06-11 23:30:47"
DETAIL: This is the latest full backup necessary for successful recovery.

$ pg_rman delete -f '2018-06-11 23:30:47'
INFO: delete the backup with start time: "2018-06-11 23:30:47"

$ pg_rman show
=====================================================================
 StartTime           EndTime              Mode    Size   TLI  Status 
=====================================================================
2018-06-11 23:40:57  2018-06-11 23:40:59  INCR   907kB     1  OK

pg_rman restore

以下操作为模拟目录误删除。

# systemctl stop postgresql-9.6.service 
# ps -ef|grep -i postmaster |grep -v grep

# su - postgres
$ cd $PGDATA/..
$ ls -l
total 8
drwx------.  2 postgres postgres    6 May 10 03:36 backups
drwx------. 20 postgres postgres 4096 Jun 12 02:51 data
-rw-------.  1 postgres postgres  878 Jun 11 21:47 initdb.log

$ mv ./data ./data.bak
$ ls -l
total 8
drwx------.  2 postgres postgres    6 May 10 03:36 backups
drwx------. 20 postgres postgres 4096 Jun 12 02:51 data.bak
-rw-------.  1 postgres postgres  878 Jun 11 21:47 initdb.log

$ mkdir data
$ chmod 700 ./data
$ ls -l

使用 pg_rman restore 还原

$ pg_rman restore
WARNING: pg_controldata file "/var/lib/pgsql/9.6/data/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2018-06-12 02:36:40"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2018-06-12 02:36:40" backup, archive log files and server log files by SIZE
INFO: backup "2018-06-12 02:36:40" is valid
INFO: restoring database files from the full mode backup "2018-06-12 02:36:40"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2018-06-12 02:36:40" is valid
INFO: restoring WAL files from backup "2018-06-12 02:36:40"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.

$ cd $PGDATA/
$ ls -l
total 60
-rw-r--r--. 1 postgres postgres   213 Jun 12 02:57 backup_label
drwx------. 7 postgres postgres    67 Jun 12 02:57 base
drwx------. 2 postgres postgres  4096 Jun 12 02:57 global
drwx------. 2 postgres postgres    18 Jun 12 02:57 pg_clog
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_commit_ts
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_dynshmem
-rw-------. 1 postgres postgres  4224 Jun 12 02:57 pg_hba.conf
-rw-------. 1 postgres postgres  1636 Jun 12 02:57 pg_ident.conf
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_log
drwx------. 4 postgres postgres    39 Jun 12 02:57 pg_logical
drwx------. 4 postgres postgres    36 Jun 12 02:57 pg_multixact
drwx------. 2 postgres postgres    18 Jun 12 02:57 pg_notify
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_replslot
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_serial
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_snapshots
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_stat
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_stat_tmp
drwx------. 2 postgres postgres    18 Jun 12 02:57 pg_subtrans
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_tblspc
drwx------. 2 postgres postgres     6 Jun 12 02:57 pg_twophase
-rw-------. 1 postgres postgres     4 Jun 12 02:57 PG_VERSION
drwx------. 3 postgres postgres    28 Jun 12 02:57 pg_xlog
-rw-------. 1 postgres postgres    88 Jun 12 02:57 postgresql.auto.conf
-rw-------. 1 postgres postgres 22304 Jun 12 02:57 postgresql.conf
-rw-------. 1 postgres postgres    60 Jun 12 02:57 postmaster.opts
-rw-r--r--. 1 postgres postgres   118 Jun 12 02:57 recovery.conf

$ cat recovery.conf
# recovery.conf generated by pg_rman 1.3.6
restore_command = 'cp /mnt/walbackup/%f %p'
recovery_target_timeline = '1'

启动postgresql

# systemctl start postgresql-9.6.service
# ps -ef|grep -i postgres
postgres 29509     1  0 02:59 ?        00:00:00 /usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data/
postgres 29512 29509  0 02:59 ?        00:00:00 postgres: logger process   
postgres 29515 29509  0 02:59 ?        00:00:00 postgres: checkpointer process   
postgres 29516 29509  0 02:59 ?        00:00:00 postgres: writer process   
postgres 29518 29509  0 02:59 ?        00:00:00 postgres: stats collector process   
postgres 29525 29509  0 02:59 ?        00:00:00 postgres: wal writer process   
postgres 29526 29509  0 02:59 ?        00:00:00 postgres: autovacuum launcher process   
postgres 29527 29509  0 02:59 ?        00:00:00 postgres: archiver process   last was 00000002.history
postgres 29559 16928  0 03:00 pts/1    00:00:00 ps -ef
postgres 29560 16928  0 03:00 pts/1    00:00:00 grep --color=auto -i postgres

有时候restore后启动会碰到如下错误:

invalid primary checkpoint record
invalid secondary checkpoint record
could not locate a valid checkpoint record

此时只能重置xlog,并取消恢复模式

$ pg_resetxlog -f $PGDATA
$ mv $PGDATA/recovery.conf $PGDATA/recovery.done

这里有个问题需要煮一下,使用pg_rman备份时对wal的归档会是通过软链接来实现。建议添加 --hard-copy

参考:
https://github.com/ossc-db/pg_rman/tree/master
http://ossc-db.github.io/pg_rman/index.html

posted @ 2018-06-12 15:12  peiybpeiyb  阅读(581)  评论(0编辑  收藏  举报