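Redis Persistence: RDB
rdb - Redis DataBase
Official description:
RDB writes a snapshot of the in-memory dataset to disk at specified time intervals; this is what is usually called a Snapshot. On recovery, the snapshot file is read directly back into memory.
What it is:
Redis forks a separate child process to do the persistence. The child first writes the data to a temporary file on disk; once the whole dump has finished, that temporary file replaces the previously persisted file. Throughout the process the main process performs no disk I/O, which keeps performance very high. If you need to restore a large dataset and are not very sensitive about data completeness, RDB is more efficient than AOF. The drawback of RDB is that data written after the last snapshot may be lost.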
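The "write a temporary file, then replace the old file" step can be sketched in Python. This is only an illustration of the pattern, not Redis code: the function names save_snapshot and load_snapshot are made up, JSON stands in for the real RDB format, and the fork is left out.

```python
import json
import os
import tempfile


def save_snapshot(data: dict, path: str) -> None:
    """Write data to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)      # JSON is an assumption; the real RDB format is binary
            tmp.flush()
            os.fsync(tmp.fileno())    # make sure the bytes are on disk before the swap
        os.replace(tmp_path, path)    # atomic rename: readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by save_snapshot back into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Because the old file is only replaced after the new one is complete, a crash in the middle of a dump leaves the previous snapshot untouched.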
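Fork:
Fork creates a copy of the current process. The new process starts with the same data (variables, environment variables, program counter, and so on) as the original, but it is a brand-new process and becomes a child of the original. (One concern: if the original process holds a very large dataset, forking a copy of it is a cost to consider.)
RDB saves its data in the dump.rdb file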
Save the DB on disk
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
To disable the RDB persistence policy, either configure no save directive at all or pass save a single empty-string argument
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""
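An RDB file is a compressed snapshot of the entire in-memory dataset. Several snapshot trigger conditions can be combined.
The defaults are:
10000 changes within 1 minute,
or 10 changes within 5 minutes,
or 1 change within 15 minutes.
The save command takes a backup immediately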
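How the save points combine can be shown with a small Python check. This is a sketch of the rule as the redis.conf comments describe it (a snapshot is due when any one save point is satisfied); SAVE_POINTS and should_snapshot are illustrative names, not part of Redis.

```python
# (seconds, changes) pairs taken from the redis.conf comments:
# 900 s / 1 key, 300 s / 10 keys, 60 s / 10000 keys.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True when at least one save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

An empty list of save points never triggers, which matches the effect of save "".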
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes
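If a background save fails, Redis stops accepting writes
Setting it to no means you do not care about data inconsistency, or you have other means of detecting and handling it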
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes
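rdbcompression: controls whether snapshots stored on disk are compressed. If enabled, Redis compresses them with the LZF algorithm. If you do not want to spend CPU on compression, you can turn this off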
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes
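rdbchecksum: after writing a snapshot, Redis can also validate the data with a CRC64 checksum. This costs roughly 10% in performance when saving and loading; disable it if you want maximum performance, though normally there is no need to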
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./
Use config get dir to get the path
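Snapshots:
Snapshots follow the default settings in the configuration file. A cold copy of the file can be reused later, for example cp dump.rdb dump_new.rdb. Backup files are generally not kept on the same machine as the server.
The save or bgsave command generates dump.rdb immediately
SAVE: save only saves and does nothing else; everything is blocked while it runs
BGSAVE: Redis takes the snapshot asynchronously in the background and can still respond to client requests while doing so. The lastsave command returns the time of the last successful snapshot.
Running flushall also produces a dump.rdb file, but it is empty and meaningless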
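How to recover:
Move the backup file (dump.rdb) into the Redis working directory and start the server; the file is read automatically on startup.
Use CONFIG GET dir to find that directory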
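Advantages:
Suitable for large-scale data recovery when the requirements on data completeness and consistency are not high
Disadvantages:
Backups are taken at intervals, so if Redis goes down unexpectedly, all changes since the last snapshot are lost
On fork, the in-memory data is cloned, so memory growth of up to roughly 2x has to be taken into account
How to stop:
To remove all RDB save rules at runtime: redis-cli config set save ""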
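aof - Append Only File
What it is:
AOF records every write operation as a log: all write commands Redis has executed are recorded (reads are not). The file may only be appended to, never modified in place. On startup Redis reads this file and rebuilds the data; in other words, after a restart the write commands in the log are executed from first to last to restore the data.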
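The append-then-replay idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: commands are stored as one JSON array per line (the real AOF uses the Redis protocol format), only SET and DEL are handled, and append_command and replay are illustrative names.

```python
import json
import os
import tempfile


def append_command(path, *args):
    """Append one write command to the end of the log; existing lines are never changed."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(args) + "\n")


def replay(path):
    """Rebuild the dataset by executing the logged commands from first to last."""
    data = {}
    if not os.path.exists(path):
        return data
    with open(path, encoding="utf-8") as log:
        for line in log:
            op, key, *rest = json.loads(line)
            if op == "SET":
                data[key] = rest[0]
            elif op == "DEL":
                data.pop(key, None)
            else:
                raise ValueError(f"unsupported command: {op}")
    return data
```

Reads never appear in the log because they do not change the dataset, so replaying writes alone is enough to reach the same state.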
If the AOF file is corrupted, it can be repaired with the following commands
[root@node1 bin]# redis-server /myredis/redis_aof.conf
[root@node1 bin]# redis-cli -p 6379
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> exit
[root@node1 bin]# redis-check-aof --fix appendonly.aof
0x 51: Expected prefix ':', got: '*'
AOF analyzed: size=103, ok_up_to=81, diff=22
This will shrink the AOF from 103 bytes, with 22 bytes, to 81 bytes
Continue? [y/N]: y
Successfully truncated AOF
[root@node1 bin]#
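The repair in the transcript above keeps the part of the file that still parses and cuts off the rest (ok_up_to=81 of 103 bytes). The following Python sketch shows that idea only; it is not how redis-check-aof is implemented. It assumes the file holds plain commands in the Redis protocol multi-bulk layout (a `*<argc>` line, then a `$<len>` line and payload per argument), and the function names are made up for illustration.

```python
def _read_line(data: bytes, pos: int):
    """Return (line without CRLF, offset after CRLF) or raise ValueError."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        raise ValueError("line is not terminated")
    return data[pos:end], end + 2


def _read_command(data: bytes, pos: int) -> int:
    """Parse one command starting at pos and return the offset just past it."""
    header, pos = _read_line(data, pos)
    if not header.startswith(b"*"):
        raise ValueError("expected '*'")
    argc = int(header[1:])                 # raises ValueError on garbage
    for _ in range(argc):
        size_line, pos = _read_line(data, pos)
        if not size_line.startswith(b"$"):
            raise ValueError("expected '$'")
        size = int(size_line[1:])
        end = pos + size
        if size < 0 or data[end:end + 2] != b"\r\n":
            raise ValueError("argument is truncated")
        pos = end + 2
    return pos


def ok_up_to(data: bytes) -> int:
    """Return the length of the longest prefix made of complete commands."""
    pos = 0
    while pos < len(data):
        try:
            pos = _read_command(data, pos)
        except ValueError:
            break                          # pos still points past the last good command
    return pos
```

Truncating a damaged file to `data[:ok_up_to(data)]` drops the incomplete tail, which is why the damaged file should be backed up first.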
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
appendonly yes (the default is no; changing it to yes turns AOF persistence on)
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
# appendfsync always
appendfsync everysec
# appendfsync no
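Appendfsync:
Always: synchronous persistence; every data change is written to disk immediately. Performance is worse, but data integrity is better
Everysec: the factory default and the recommended setting; asynchronous, syncing once per second. If the server goes down within that second, data is lost
No: Redis does not sync on its own; the operating system flushes the output buffer when it wants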
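The difference between the three policies is only when fsync is called. A simplified Python sketch follows, with several assumptions: AofWriter is a made-up class, the once-per-second check happens inside append rather than in the background as Redis does it, and the clock parameter exists only so the behaviour can be driven by hand.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only writer that counts how often it forces data to disk."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.clock = clock
        self.fsync_calls = 0
        self._last_sync = clock()
        self._file = open(path, "ab")

    def append(self, command: bytes) -> None:
        self._file.write(command)
        self._file.flush()                  # hand the bytes to the operating system
        if self.policy == "always":
            self._sync()                    # one fsync per write: slow, safest
        elif self.policy == "everysec" and self.clock() - self._last_sync >= 1.0:
            self._sync()                    # at most about one fsync per second
        # policy "no": never fsync here; the OS decides when to flush

    def _sync(self) -> None:
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        self._last_sync = self.clock()

    def close(self) -> None:
        self._file.close()
```

With "everysec", whatever was written since the last fsync is the data at risk on a power failure, which is where the "lose about one second of writes" figure comes from.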
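AOF startup / repair / recovery
Normal recovery:
Enable: change the default appendonly no to yes
Copy the AOF file that holds the data into the corresponding directory (config get dir)
Recover: restart Redis, which reloads the file
Abnormal recovery:
Enable: set appendonly to yes
Back up the damaged AOF file
Repair: run redis-check-aof --fix
Recover: restart Redis, which reloads the file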
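Rewrite:
What it is:
AOF works by appending, so the file keeps growing. To avoid this, a rewrite mechanism was added: when the AOF file size exceeds the configured threshold, Redis compacts the file content, keeping only the minimal set of commands needed to restore the data. A rewrite can also be started with the bgrewriteaof command.
How rewriting works:
When the AOF file has grown too large, Redis forks a new process to rewrite it (again writing a temporary file first and renaming it at the end). The new process walks the data in its memory and emits one SET-style command per record. The rewrite does not read the old AOF file; it writes the entire in-memory database out as commands into a new AOF file, which is somewhat similar to a snapshot.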
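The key point, that the new file is produced from the in-memory data and not from the old log, can be illustrated in Python. This is a toy model: commands are tuples, only SET, INCR and DEL exist, and replay and rewrite are illustrative names.

```python
def replay(log):
    """Execute logged commands in order and return the resulting dataset."""
    data = {}
    for op, key, *args in log:
        if op == "SET":
            data[key] = args[0]
        elif op == "INCR":
            data[key] = str(int(data.get(key, "0")) + 1)
        elif op == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {op}")
    return data


def rewrite(data):
    """Emit one SET per live key; the old log is never consulted."""
    return [("SET", key, value) for key, value in data.items()]


old_log = [
    ("SET", "counter", "0"),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
    ("SET", "name", "redis"),
]
state = replay(old_log)      # what is in memory when the rewrite starts
new_log = rewrite(state)     # two commands instead of six
```

Replaying new_log yields the same dataset as replaying old_log, with the overwritten and deleted history gone.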
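Trigger:
Redis records the AOF size at the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has grown to twice its size after the last rewrite and is larger than 64 MB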
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
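The two settings work together: the file must be above the minimum size and must have grown by the given percentage relative to the base size. A Python sketch of that check, written from the comments above and not copied from the Redis source; should_rewrite and its parameter names are illustrative.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite is due.

    current_size: AOF size now, in bytes
    base_size:    AOF size right after the last rewrite, in bytes
    percentage:   auto-aof-rewrite-percentage (0 disables automatic rewrites)
    min_size:     auto-aof-rewrite-min-size, in bytes
    """
    if percentage == 0:
        return False
    if current_size <= min_size:
        return False                     # still small, not worth rewriting
    base = base_size or 1                # avoid dividing by zero before the first rewrite
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the defaults, a file that was 70 MB after its last rewrite is rewritten again once it reaches 140 MB, while a file that went from 5 MB to 20 MB is left alone because it is still under 64 MB.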
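1. If RDB and AOF are both present in a system, do they conflict or cooperate?
The two can coexist; when AOF is enabled, Redis loads the AOF on startup
2. Why was AOF introduced after RDB?
3. What are the advantages and disadvantages of AOF?
Advantages:
Sync on every change: appendfsync always, synchronous persistence; every data change is written to disk immediately, so performance is worse but data integrity is higher
Sync every second: appendfsync everysec (default), asynchronous, syncing once per second
No sync: appendfsync no, Redis never syncs on its own
Disadvantages:
For the same dataset, the AOF file is much larger than the RDB file, and recovery is slower than with RDB
AOF runs slower than RDB; the every-second sync policy performs well, and with no sync the performance is the same as RDB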
AOF
Which one should you use?