PostgreSQL PITR实验
磨砺技术珠矶,践行数据之道,追求卓越价值
回到上一级页面: PostgreSQL基础知识与基本操作索引页 回到顶级页面:PostgreSQL索引页
看PostgreSQL中与PITR相关的设定:
wal_level=archive
…
# - Archiving -
archive_mode = on # allows archiving to be done
# (change requires restart)
archive_command = 'cp %p "/usr/local/pgsql/arch/%f"' # command to use to archive a logfile segment
#archive_timeout = 0 # force a logfile segment switch after this
# number of seconds; 0 disables
开始,在运行之前,看xlog状况:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 32812
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
drwx------ 2 postgres postgres 4096 Aug 6 14:16 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000002
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 16404
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
[postgres@pg201 pgsql]$
为文件拷贝进行准备:
[postgres@pg201 pgsql]$ ./bin/psql
psql (9.1.2)
Type "help" for help.
postgres=# select pg_start_backup('gao');
pg_start_backup
-----------------
0/3000020
(1 row)
postgres=#
此时,xlog和archive log状况如下:
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49216
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
drwx------ 2 postgres postgres 4096 Aug 6 14:17 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000003
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 32808
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
[postgres@pg201 pgsql]$
完成文件拷贝:
[postgres@pg201 pgsql]$ tar -cvf basedata.tar ./data
结束基础备份:
postgres=# select pg_stop_backup();
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
pg_stop_backup
----------------
0/30000A0
(1 row)
postgres=#
再次查看log:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49220
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
drwx------ 2 postgres postgres 4096 Aug 6 14:20 archive_status
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 49216
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
[postgres@pg201 pgsql]$
可以看到产生了一个backup文件,看内容:
[postgres@pg201 pgsql]$ cat ./data/pg_xlog/000000010000000000000003.00000020.backup
START WAL LOCATION: 0/3000020 (file 000000010000000000000003)
STOP WAL LOCATION: 0/30000A0 (file 000000010000000000000003)
CHECKPOINT LOCATION: 0/3000020
BACKUP METHOD: pg_start_backup
START TIME: 2013-08-06 14:17:53 CST
LABEL: gao
STOP TIME: 2013-08-06 14:20:53 CST
[postgres@pg201 pgsql]$
强制日志切换,模拟大量数据产生导致日志switch:
postgres=# select pg_switch_xlog();
pg_switch_xlog
----------------
0/4000000
(1 row)
postgres=#
log发生了变化:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49220
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000006
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
drwx------ 2 postgres postgres 4096 Aug 6 14:27 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:27 000000010000000000000005
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 65620
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
[postgres@pg201 pgsql]$
再作几次log switch,最后是这样:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49220
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000006
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
drwx------ 2 postgres postgres 4096 Aug 6 14:27 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:27 000000010000000000000005
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 65620
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
[postgres@pg201 pgsql]$
经过一次正常关机后,日志再次变化:
[postgres@pg201 pgsql]$ ./bin/pg_ctl -D ./data stop
日志变为:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49220
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000007
drwx------ 2 postgres postgres 4096 Aug 6 14:39 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:39 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Aug 6 14:39 000000010000000000000005
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 82024
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:39 000000010000000000000005
[postgres@pg201 pgsql]$
重新启动PostgreSQL,添加数据,然后杀掉进程,模拟崩溃:
[postgres@pg201 pgsql]$ ./bin/psql
psql (9.1.2)
Type "help" for help.
postgres=# create table test(id integer);
CREATE TABLE
postgres=# insert into test values(1);
INSERT 0 1
postgres=# \q
[postgres@pg201 pgsql]$
[root@pg201 ~]# ps -ef | grep post
root 3120 3049 0 14:08 pts/1 00:00:00 su - postgres
postgres 3121 3120 0 14:08 pts/1 00:00:00 -bash
root 3216 3191 0 14:15 pts/2 00:00:00 su - postgres
postgres 3217 3216 0 14:15 pts/2 00:00:00 -bash
root 3299 3279 0 14:17 pts/3 00:00:00 su - postgres
postgres 3300 3299 0 14:17 pts/3 00:00:00 -bash
postgres 3516 1 0 15:00 pts/2 00:00:00 /usr/local/pgsql/bin/postgres -D ./data
postgres 3518 3516 0 15:00 ? 00:00:00 postgres: writer process
postgres 3519 3516 0 15:00 ? 00:00:00 postgres: wal writer process
postgres 3520 3516 0 15:00 ? 00:00:00 postgres: autovacuum launcher process
postgres 3521 3516 0 15:00 ? 00:00:00 postgres: archiver process
postgres 3522 3516 0 15:00 ? 00:00:00 postgres: stats collector process
root 3559 3535 0 15:01 pts/4 00:00:00 grep post
[root@pg201 ~]# kill -s SIGQUIT 3516
[root@pg201 ~]#
下面开始回复动作、找出基础备份:
[postgres@pg201 pgsql]$ tar -xvf basedata.tar ./data
其pg_xlog目录下,还是旧的online redo log:
[postgres@pg201 pg_xlog]$ ls -lrt
total 49216
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
drwx------ 2 postgres postgres 4096 Aug 6 14:17 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000003
[postgres@pg201 pg_xlog]$
把它清空后,不补足online redolog,仅仅用 archive log的效果:
在解压好的data目录下,建立 recovery.conf文件:
[postgres@pg201 data]$ cat recovery.conf
restore_command = 'cp /usr/local/pgsql/arch/%f %p'
[postgres@pg201 data]$
启动PostgreSQL:
[postgres@pg201 pgsql]$ ./bin/pg_ctl -D ./data status
pg_ctl: no server running
[postgres@pg201 pgsql]$ ./bin/pg_ctl -D ./data start
pg_ctl: another server might be running; trying to start server anyway
server starting
[postgres@pg201 pgsql]$ LOG: database system was interrupted; last known up at 2013-08-06 14:17:53 CST
LOG: creating missing WAL directory "pg_xlog/archive_status"
LOG: starting archive recovery
LOG: restored log file "000000010000000000000003" from archive
LOG: redo starts at 0/3000078
LOG: consistent recovery state reached at 0/4000000
LOG: restored log file "000000010000000000000004" from archive
LOG: restored log file "000000010000000000000005" from archive
cp: cannot stat `/usr/local/pgsql/arch/000000010000000000000006': No Such File or Directory
LOG: could not open file "pg_xlog/000000010000000000000006" (log file 0, segment 6): No Such File or Directory
LOG: redo done at 0/5000078 LOG: restored log file "000000010000000000000005" from archive cp: cannot stat `/usr/local/pgsql/arch/00000002.history': No Such File or Directory
LOG: selected new timeline ID: 2 cp: cannot stat `/usr/local/pgsql/arch/00000001.history':No Such File or Directory
LOG: archive recovery complete
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
因为我的创建table test 的动作虽然计入了 redo log中,但是由于突然的崩溃,没有来得及放入archive 日志中,所以,没有得到恢复:
[postgres@pg201 pgsql]$ ./bin/psql
psql (9.1.2)
Type "help" for help.
postgres=# select * from test;
ERROR: relation "test" does not exist
LINE 1: select * from test;
^
postgres=#
此时,看Log,就可以清晰地看到TimeLine的痕迹了:
[postgres@pg201 pgsql]$ ls -lrt ./data/pg_xlog
total 49220
-rw------- 1 postgres postgres 56 Aug 6 15:11 00000002.history
-rw------- 1 postgres postgres 16777216 Aug 6 15:11 000000020000000000000005
-rw------- 1 postgres postgres 16777216 Aug 6 15:17 000000020000000000000006
drwx------ 2 postgres postgres 4096 Aug 6 15:17 archive_status
-rw------- 1 postgres postgres 16777216 Aug 6 15:17 000000020000000000000007
[postgres@pg201 pgsql]$
[postgres@pg201 pgsql]$ ls -lrt ./arch
total 114836
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 267 Aug 6 14:20 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 14:20 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Aug 6 14:25 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:39 000000010000000000000005
-rw------- 1 postgres postgres 56 Aug 6 15:11 00000002.history
-rw------- 1 postgres postgres 16777216 Aug 6 15:17 000000020000000000000006
-rw------- 1 postgres postgres 16777216 Aug 6 15:17 000000020000000000000005
[postgres@pg201 pgsql]$ cat ./arch/00000002.history
1 000000010000000000000005 no recovery target specified
[postgres@pg201 pgsql]$
这次再看清空pg_xlog目录后,补足online redolog+archive log的效果:
还要把 pg_xlog/archive_status 目录清空:
[postgres@pg201 pgsql]$ cp ./data.bak/pg_xlog/* ./data/pg_xlog/
...
[postgres@pg201 pg_xlog]$ ls -lrt
total 98432
-rw------- 1 postgres postgres 16777216 Aug 6 14:16 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Aug 6 14:17 000000010000000000000003
-rw------- 1 postgres postgres 267 Aug 6 15:41 000000010000000000000003.00000020.backup
-rw------- 1 postgres postgres 16777216 Aug 6 15:41 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Aug 6 15:41 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Aug 6 15:41 000000010000000000000005
drwx------ 2 postgres postgres 4096 Aug 6 15:44 archive_status
[postgres@pg201 pg_xlog]$
此时,启动 PostgreSQL:
[postgres@pg201 pgsql]$ ./bin/pg_ctl -D ./data start
pg_ctl: another server might be running; trying to start server anyway
server starting
[postgres@pg201 pgsql]$ LOG: database system was interrupted; last known up at 2013-08-06 14:17:53 CST
LOG: starting archive recovery
LOG: restored log file "000000010000000000000003" from archive
LOG: redo starts at 0/3000078
LOG: consistent recovery state reached at 0/4000000
LOG: restored log file "000000010000000000000004" from archive
LOG: restored log file "000000010000000000000005" from archive
cp: cannot stat `/usr/local/pgsql/arch/000000010000000000000006': No such file or directory
LOG: record with zero length at 0/6011770
LOG: redo done at 0/6011730
LOG: last completed transaction was at log time 2013-08-06 15:01:17.757945+08
LOG: restored log file "00000002.history" from archive
cp: cannot stat `/usr/local/pgsql/arch/00000003.history': No such file or directory
LOG: selected new timeline ID: 3
cp: cannot stat `/usr/local/pgsql/arch/00000001.history': No such file or directory
LOG: archive recovery complete
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
看数据:
[postgres@pg201 pgsql]$ ./bin/psql
psql (9.1.2)
Type "help" for help.
postgres=# select * from test;
id
----
1
(1 row)
postgres=#
回到上一级页面: PostgreSQL基础知识与基本操作索引页 回到顶级页面:PostgreSQL索引页
磨砺技术珠矶,践行数据之道,追求卓越价值