PostgreSQL备份与恢复

Posted on 2021-12-06 15:13 高&玉阅读(5640) 评论(0) 编辑收藏举报

前言

PostgreSQL自带的备份工具有pg_basebackup、pg_dump、pg_dumpall。

区别：

pg_basebackup是文件系统级别的备份，可用于连续归档方案。
pg_dump、pg_dumpall都是是逻辑备份，前者支持多种备份格式，后者只支持sql文本。

1 pg_start_backup

pg_start_backup() 和 pg_stop_backup()是postgreSQL提供的一种备份方式，由于无法并行备份，现在已经逐渐被pg_basebackup工具（postgresql9.1之后）所替代。

pg_start_backup() 和 pg_stop_backup()的使用是不需要开启归档的（强烈建议开启），那么在进行备份的时候，应该进行归档的一部分日志就会无法保存直接被覆盖掉，如果再想恢复到“归档日志之后”的时间段数据也就会发生丢失。

pg_start_backup()
1、强制发生一次checkpoint点。将未写到磁盘上的脏数据全部刷到磁盘上去。这样从这之后产生的日志就是记录整个数据块。可以“确保”恢复的正确性。
2、置写日志标志为：XLogCtl->Insert.forcePageWrites = true，这样wal日志将会记录整个数据块。避免了在进行备份时候（读操作——旧数据）持续向数据库写入数据（写操作——新数据）造成的前后数据不一致
3、pg_start_backup()开启期间（不执行pg_stop_backup()），wal日志仍然会进行循环使用。从我们使用者的角度来看也许数据库应该是持续的将数据块变化记录到wal中，备份不停止，wal日志也不应该被覆盖，但事实上并不是如此，也许是postgreSQL为了不至于太复杂和为了避免撑爆xlog日志，pg_start_backup()开启期间wal仍会进行覆盖循环使用。

pg_stop_backup()的作用就是结束此次备份状态，以便进行下次备份（非并发性备份），一直不执行pg_stop_backup()也并不会撑爆xlog目录，但是是无法执行下次备份的。

2 pg_basebackup

2.1 介绍

pg_basebackup用于获得一个正在运行的PostgreSQL数据库的基础备份。获得这些备份不会影响连接到该数据库的其他客户端，并且可以被用于时间点恢复。

pg_basebackup建立数据库集簇文件的一份二进制副本，同时保证系统进入和退出备份模式。备份总是从整个数据库集簇获得，不可能备份单个数据库或数据库对象。关于个体数据库备份，必须使用一个像pg_dump的工具。

备份通过一个常规PostgreSQL连接制作，并且使用复制协议。该连接必须由一个超级用户或者一个具有REPLICATION权限的用户建立，并且pg_hba.conf必须显式地允许该复制连接。该服务器还必须被配置，使max_wal_senders设置得足够大以留出至少一个会话用于备份。

在同一时间可以有多个pg_basebackup运行，但是从性能的角度来说最好只做一个备份并且复制结果。

pg_basebackup不仅能从主控机也能从后备机创建一个基础备份。要从后备机获得一个备份，设置后备机让它能接受复制连接（也就是，设置max_wal_senders和hot_standby，并且配置基于主机的认证）。你将也需要在主控机上启用full_page_writes。

注意在来自后备机的在线备份中有一些限制：

不会在被备份的数据库集簇中创建备份历史文件。
不保证备份所需的所有 WAL 文件在备份结束时被归档。如果你计划将该备份用于一次归档恢复并且想要确保所有所需文件在那个时刻都可用，你需要通过使用-x将它们包括在备份中。
如果在在线备份期间后备机被提升为主控机，备份会失败。
备份所需的所有 WAL 记录必须包含足够的全页写，这要求你在主控机上启用full_page_writes并且不使用一个类似pg_compresslog的工具以archive_command从 WAL 文件中移除全页写。

2.2 参数介绍

pg_basebackup帮助手册

[root@guizhou_hp-pop-10-150-57-13 ~]# pg_basebackup --help
pg_basebackup takes a base backup of a running PostgreSQL server.

Usage:
  pg_basebackup [OPTION]...

Options controlling the output:
  -D, --pgdata=DIRECTORY receive base backup into directory
  -F, --format=p|t       output format (plain (default), tar)
  -r, --max-rate=RATE    maximum transfer rate to transfer data directory
                         (in kB/s, or use suffix "k" or "M")
  -R, --write-recovery-conf
                         write recovery.conf after backup
  -S, --slot=SLOTNAME    replication slot to use
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                         relocate tablespace in OLDDIR to NEWDIR
  -x, --xlog             include required WAL files in backup (fetch mode)
  -X, --xlog-method=fetch|stream
                         include required WAL files with specified method
      --xlogdir=XLOGDIR  location for the transaction log directory
  -z, --gzip             compress tar output
  -Z, --compress=0-9     compress tar output with given compression level

General options:
  -c, --checkpoint=fast|spread
                         set fast or spread checkpointing
  -l, --label=LABEL      set backup label
  -P, --progress         show progress information
  -v, --verbose          output verbose messages
  -V, --version          output version information, then exit
  -?, --help             show this help, then exit

Connection options:
  -d, --dbname=CONNSTR   connection string
  -h, --host=HOSTNAME    database server host or socket directory
  -p, --port=PORT        database server port number
  -s, --status-interval=INTERVAL
                         time between status packets sent to server (in seconds)
  -U, --username=NAME    connect as specified database user
  -w, --no-password      never prompt for password
  -W, --password         force password prompt (should happen automatically)

Report bugs to <pgsql-bugs@postgresql.org>.

参数注释：

类型	参数	说明
控制输出的选项	-D, --pgdata=DIRECTORY	接收基础备份的路径
	-F, --format=p\|t	指定备份格式（p=无格式，t=tar格式）
	-r, --max-rate=RATE	从该服务器传输数据的最大传输率（默认KB/s，可指定"K"或"M"）
	-R, --write-recovery-conf	备份完成后写入recovery.conf文件
	-S, --slot=SLOTNAME	指定复制槽
	-T, --tablespace-mapping=OLDDIR=NEWDIR	将OLDDIR中的表空间重新定位到NEWDIR
	-x, --xlog	在备份中包含所需的WAL文件（fetch模式）
	-X, --xlog-method=fetch\|stream	使用指定的方法包含所需的WAL文件（fetch，stream模式）
	--xlogdir=XLOGDIR	指定事务日志目录的路径
	-z, --gzip	tar压缩输出
	-Z, --compress=0-9	指定压缩级别
一般参数	-c, --checkpoint=fast\|spread	设置检查点的方式（fast或spread）
	-l, --label=LABEL	设置备份标签
	-P, --progress	显示进度信息
	-v, --verbose	输出详细信息
	-V, --version	查看版本信息
	-?, --help	帮助手册
连接参数	-d, --dbname=CONNSTR	pg_basebackup并不连接到集簇中的任何特定数据库，连接字符串中的数据库名将被忽略。
	-h, --host=HOSTNAME	指定数据库主机或socket
	-p, --port=PORT	指定数据库端口
	-s, --status-interval=INTERVAL	指定发送状态给服务器的时间间隔秒（）
	-U, --username=NAME	指定连接用户名
	-w, --no-password	不提示输入密码
	-W, --password	强制提示输入密码（默认）

2.3 备份案例

创建本地基础备份，并存储到/home/postgres/2021-12-06

[postgres]# pg_basebackup -U repl -D /home/postgres/2021-12-06 -P
Password: 
809339/809339 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

[postgres]# ls /home/postgres/2021-12-06/
arch_log      backup_label.old  global   pg_commit_ts  pg_hba.conf    pg_log      pg_multixact  pg_replslot  pg_snapshots  pg_stat_tmp  pg_tblspc    PG_VERSION  postgresql.auto.conf  recovery.done
backup_label  base              pg_clog  pg_dynshmem   pg_ident.conf  pg_logical  pg_notify     pg_serial    pg_stat       pg_subtrans  pg_twophase  pg_xlog     postgresql.conf       tablespace_map.old

创建本地基础备份并打包压缩，并存储到/home/postgres/2021-12-06

[postgres]# pg_basebackup -U repl -Ft -z -P -D /home/postgres/2021-12-06/
940414/940414 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

[postgres]# ls /home/postgres/2021-12-06/
base.tar.gz

用于复制流的基础备份

[postgres]# pg_basebackup -h 10.150.57.13 -U repl -X stream -P -D $PGDATA
Password: 
1202567/1202567 kB (100%), 1/1 tablespace

2.4 时间点恢复（PITR）

案例1：使用连续归档进行恢复

开启归档模式

[postgres]# vi $PGDATA/portgresql.conf
wal_level=hot_standby
archive_mode = on
archive_command = 'cp %p /usr/local/pgsql/data/arch_log/%f && find /usr/local/pgsql/data/arch_log -type f -mtime +3 | xargs rm -fr'

基础备份

[postgres]# pg_basebackup -U repl -Ft -z -P -D /home/postgres/2021-12-06/
940414/940414 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

[postgres]# ls /home/postgres/2021-12-06/
base.tar.gz

模拟故障，误删数据库数据文件，导致数据库异常（如果WAL日志丢失，恢复会丢失部分事务）。

拷贝基础备份到$PGDATA目录下，然后配置recovery.conf

[root]# vi $PGDATA/recovery.conf
restore_command = 'cp /usr/local/pgsql/data/arch_log/%f %p'

启动PostgreSQL

[postgres]# pg_ctl start

案例2：基于时间线恢复

与上面的例子区别在于recovery.conf

[root]# vi $PGDATA/recovery.conf
restore_command = 'cp /usr/local/pgsql/data/arch_log/%f %p'
#recovery_target_time = '2021-12-06 20:24:45.871979 CST'
#recovery_target_time = '2021-12-06 20:24:45.871979+08'
recovery_target_time = '2021-12-06 20:24:45'
recovery_target_inclusive = true
recovery_target_timeline = 'latest'

控制恢复停止的位置：

recovery_target_name：指pg_create_restore_point(text)创建的还原点，如果有重名的还原点，那么在recovery过程中第一个遇到的还原点即停止。
recovery_target_time：指XLOG中记录的recordXtime(xl_xact_commit_compact->xact_time)，配合recovery_target_inclusive使用,
recovery_target_xid：指XLogRecord->xl_xid，可以配合recovery_target_inclusive使用，但是recovery_target_inclusive只影响日志的输出，并不影响恢复进程截至点的选择,截至都截止于这个xid的xlog位置。也就是说无论如何都包含了这个事务的xlog信息的recovery。xid的信息体现在结束时，而不是分配xid时。所以恢复到xid=100提交|回滚点，可能xid=102已经先提交了。那么包含xid=102的xlog信息会被recovery。
recovery_target_inclusive：
- 如果在同一个时间点有多个事务回滚或提交，那么recovery_target_inclusive=false则恢复到这个时间点第一个回滚或提交的事务(含)，recovery_target_inclusive=true则恢复到这个时间点最后一个回滚或提交的事务(含)。
- 如果时间点上刚好只有1个事务回滚或提交，那么recovery_target_inclusive=true和false一样，恢复将处理到这个事务包含的xlog信息(含)。
- 如果时间点没有匹配的事务提交或回滚信息，那么recovery_target_inclusive=true和false一样，恢复将处理到这个时间后的下一个事务回滚或提交的xlog信息(含)。

3 pg_dump

3.1 介绍

pg_dump是逻辑备份工具，支持多种输出格式。

3.2 帮助手册

查看pg_dump帮助手册

[postgres]# pg_dump --help
pg_dump dumps a database as a text file or to other formats.

Usage:
  pg_dump [OPTION]... [DBNAME]

General options:
  -f, --file=FILENAME          output file or directory name
  -F, --format=c|d|t|p         output file format (custom, directory, tar,
                               plain text (default))
  -j, --jobs=NUM               use this many parallel jobs to dump
  -v, --verbose                verbose mode
  -V, --version                output version information, then exit
  -Z, --compress=0-9           compression level for compressed formats
  --lock-wait-timeout=TIMEOUT  fail after waiting TIMEOUT for a table lock
  -?, --help                   show this help, then exit

Options controlling the output content:
  -a, --data-only              dump only the data, not the schema
  -b, --blobs                  include large objects in dump
  -c, --clean                  clean (drop) database objects before recreating
  -C, --create                 include commands to create database in dump
  -E, --encoding=ENCODING      dump the data in encoding ENCODING
  -n, --schema=SCHEMA          dump the named schema(s) only
  -N, --exclude-schema=SCHEMA  do NOT dump the named schema(s)
  -o, --oids                   include OIDs in dump
  -O, --no-owner               skip restoration of object ownership in
                               plain-text format
  -s, --schema-only            dump only the schema, no data
  -S, --superuser=NAME         superuser user name to use in plain-text format
  -t, --table=TABLE            dump the named table(s) only
  -T, --exclude-table=TABLE    do NOT dump the named table(s)
  -x, --no-privileges          do not dump privileges (grant/revoke)
  --binary-upgrade             for use by upgrade utilities only
  --column-inserts             dump data as INSERT commands with column names
  --disable-dollar-quoting     disable dollar quoting, use SQL standard quoting
  --disable-triggers           disable triggers during data-only restore
  --enable-row-security        enable row security (dump only content user has
                               access to)
  --exclude-table-data=TABLE   do NOT dump data for the named table(s)
  --if-exists                  use IF EXISTS when dropping objects
  --inserts                    dump data as INSERT commands, rather than COPY
  --no-security-labels         do not dump security label assignments
  --no-synchronized-snapshots  do not use synchronized snapshots in parallel jobs
  --no-tablespaces             do not dump tablespace assignments
  --no-unlogged-table-data     do not dump unlogged table data
  --quote-all-identifiers      quote all identifiers, even if not key words
  --section=SECTION            dump named section (pre-data, data, or post-data)
  --serializable-deferrable    wait until the dump can run without anomalies
  --snapshot=SNAPSHOT          use given snapshot for the dump
  --strict-names               require table and/or schema include patterns to
                               match at least one entity each
  --use-set-session-authorization
                               use SET SESSION AUTHORIZATION commands instead of
                               ALTER OWNER commands to set ownership

Connection options:
  -d, --dbname=DBNAME      database to dump
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before dump

If no database name is supplied, then the PGDATABASE environment
variable value is used.

Report bugs to <pgsql-bugs@postgresql.org>.

参数解释：

类型	参数	说明
一般选项	-f, --file=FILENAME	指定输出文件名称
	-F, --format=c\|d\|t\|p	指定输出文件格式(custom, directory, tar,plain text (default))
	-j, --jobs=NUM	指定多个job并行备份
	-v, --verbose	输出备份详细信息
	-V, --version	输出版本信息
	-Z, --compress=0-9	指定压缩格式的压缩等级
	--lock-wait-timeout=TIMEOUT	指定锁超时的时间
控制输出文件的选项	-a, --data-only	只导出数据
	-b, --blobs	在dump文件中包含大对象
	-c, --clean	重建数据库之前删除数据库
	-C, --create	在dump文件中包含创建数据库的命令
	-E, --encoding=ENCODING	转储编码为encoding的数据
	-n, --schema=SCHEMA	导出此模式的数据库
	-N, --exclude-schema=SCHEMA	不导出此模式的数据
	-o, --oids	在dump文件包含OID信息
	-O, --no-owner	skip restoration of object ownership in plain-text format
	-s, --schema-only	只导出模式，不包含数据
	-S, --superuser=NAME	超级用户使用明文格式的用户名
	-t, --table=TABLE	指定导出的表名
	-T, --exclude-table=TABLE	指定不导出的表名
	-x, --no-privileges	不导出权限(grant/revoke)
	--binary-upgrade	仅供升级工具使用
	--column-inserts	将数据转储为带有列名的INSERT命令
	--disable-dollar-quoting	禁用美元引用，使用SQL标准引用
	--disable-triggers	在数据恢复期间禁用触发器
	--enable-row-security	启用行安全性(仅转储用户可以访问的内容)
	--exclude-table-data=TABLE	不转储指定表的数据
	--if-exists	如果对象存在则删除
	--inserts	将数据转储为INSERT命令，而不是COPY命令
	--no-security-labels	不转储安全标签分配
	--no-synchronized-snapshots	在并行作业中不使用同步快照
	--no-tablespaces	不备份表空间
	--no-unlogged-table-data	不转储不记录日志的表数据
	--quote-all-identifiers	引用所有标识符，即使不是关键字
	--section=SECTION	转储指定的部分(前数据、数据或后数据)
	--serializable-deferrable	等待直到转储可以正常运行
	--snapshot=SNAPSHOT	为转储使用给定的快照
	--strict-names	要求表和/或模式包含模式，以便每个模式至少匹配一个实体
	--use-set-session-authorization	使用SET SESSION AUTHORIZATION命令而不是ALTER OWNER命令来设置所有权
连接参数	-d, --dbname=DBNAME	备份的数据库名
	-h, --host=HOSTNAME	数据库的主机名或socket路径
	-p, --port=PORT	数据库服务端口
	-U, --username=NAME	连接数据库指定的用户
	-w, --no-password	不提示输入密码
	-W, --password	提示输入密码（默认会提示）
	--role=ROLENAME	dump指定ROLE

3.3 备份恢复

备份postgres库

[postgres]# pg_dump -h 127.0.0.1 -p 5432 -U postgres -f postgres.sql --column-inserts

备份postgres库并tar打包

[postgres]# pg_dump -h 127.0.0.1 -p 5432 -U postgres -f postgres.sql.tar -Ft

只备份postgres库对象数据

[postgres]# pg_dump -U postgres -d postgres -f postgres.sql -Ft --data-only --column-inserts

只备份postgres库对象结构

[postgres]# pg_dump -U postgres -d postgres -f postgres.sql -Ft --schema-only

导入SQL文件

[postgres]# psql -f postgre.sql postgres postgres

3.4 pg_restore恢复

备份

pg_dump -C -Fc postgres > postgres.db

恢复

[postgres]# pg_restore -l postgres.db > postgres.ini
[postgres]# pg_restore -L postgres.ini -d postgres postgres.db

4 pg_dumpall

4.1 介绍

pg_dumpall是逻辑备份工具，只支持导出SQL命令。

4.2 帮助手册

查看pg_dumpall帮助手册

[postgres]# pg_dumpall --help
pg_dumpall extracts a PostgreSQL database cluster into an SQL script file.

Usage:
  pg_dumpall [OPTION]...

General options:
  -f, --file=FILENAME          output file name
  -V, --version                output version information, then exit
  --lock-wait-timeout=TIMEOUT  fail after waiting TIMEOUT for a table lock
  -?, --help                   show this help, then exit

Options controlling the output content:
  -a, --data-only              dump only the data, not the schema
  -c, --clean                  clean (drop) databases before recreating
  -g, --globals-only           dump only global objects, no databases
  -o, --oids                   include OIDs in dump
  -O, --no-owner               skip restoration of object ownership
  -r, --roles-only             dump only roles, no databases or tablespaces
  -s, --schema-only            dump only the schema, no data
  -S, --superuser=NAME         superuser user name to use in the dump
  -t, --tablespaces-only       dump only tablespaces, no databases or roles
  -x, --no-privileges          do not dump privileges (grant/revoke)
  --binary-upgrade             for use by upgrade utilities only
  --column-inserts             dump data as INSERT commands with column names
  --disable-dollar-quoting     disable dollar quoting, use SQL standard quoting
  --disable-triggers           disable triggers during data-only restore
  --if-exists                  use IF EXISTS when dropping objects
  --inserts                    dump data as INSERT commands, rather than COPY
  --no-security-labels         do not dump security label assignments
  --no-tablespaces             do not dump tablespace assignments
  --no-unlogged-table-data     do not dump unlogged table data
  --quote-all-identifiers      quote all identifiers, even if not key words
  --use-set-session-authorization
                               use SET SESSION AUTHORIZATION commands instead of
                               ALTER OWNER commands to set ownership

Connection options:
  -d, --dbname=CONNSTR     connect using connection string
  -h, --host=HOSTNAME      database server host or socket directory
  -l, --database=DBNAME    alternative default database
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before dump

If -f/--file is not used, then the SQL script will be written to the standard
output.

Report bugs to <pgsql-bugs@postgresql.org>.

参数解释：

类型	参数	说明
一般参数	-f, --file=FILENAME	指定文件名称
一般参数	--lock-wait-timeout=TIMEOUT	指定等待表锁N秒后失败
控制输出参数	-a, --data-only	只备份数据
	-c, --clean	重建之前删除数据库
	-g, --globals-only	仅备份全局对象，不导出数据库
	-o, --oids	在转储文件中包含OID
	-O, --no-owner	不转储对象的权限信息
	-r, --roles-only	仅仅转储role，不转储数据库和表空间
	-s, --schema-only	只转储结构，不转储数据
	-S, --superuser=NAME	转储使用的超级用户
	-t, --tablespaces-only	只转储表空间，不包含database和role
	-x, --no-privileges	不转储权限（grant/revoke）
	--binary-upgrade	仅供升级工具使用
	--column-inserts	将数据转储为带有列名的INSERT命令
	--disable-dollar-quoting	禁用美元引用，使用SQL标准引用
	--disable-triggers	在数据恢复期间禁用触发器
	--if-exists	当删除对象时使用IF EXISTS
	--inserts	将数据转储为INSERT命令，而不是COPY命令
	--no-security-labels	不转储安全标签分配
	--no-tablespaces	不转储表空间分配
	--no-unlogged-table-data	不转储没记录日志的数据
	--quote-all-identifiers	引用所有标识符，即使不是关键字
	--use-set-session-authorization	使用SET SESSION AUTHORIZATION命令而不是ALTER OWNER命令来设置所有权
连接选项	-d, --dbname=DBNAME	备份的数据库名
	-h, --host=HOSTNAME	数据库的主机名或socket路径
	-p, --port=PORT	数据库服务端口
	-U, --username=NAME	连接数据库指定的用户
	-w, --no-password	不提示输入密码
	-W, --password	提示输入密码（默认会提示）
	--role=ROLENAME	dump指定ROLE
	-l, --database=DBNAME	选择默认数据库

4.3 备份恢复

备份postgres库，转储数据为带列名的INSERT命令

[postgres]# pg_dumpall -d postgres -U postgres -f postgres.sql --column-inserts

备份postges库，转储数据为INSERT命令

[postgres]# pg_dumpall -d postgres -U postgres -f postgres.sql --inserts

导入数据

[postgres]#psql -f postgres.sql

刷新页面返回顶部

疯狂的骆驼

公告