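High-Performance Key-Value Database: Redis

NoSQL

NoSQL ("Not Only SQL") refers to non-relational databases.

	The four main categories of NoSQL:
	1. Key-value stores: Redis
	2. Document stores: MongoDB. Mainly used to handle large volumes of documents; it sits at the intersection of relational and non-relational databases.
	3. Column-family stores: HBase, distributed file systems
	4. Graph databases: Neo4j, InfoGrid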

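Installing Redis

Redis (Remote Dictionary Server) is written in C, supports network access, keeps its data in memory with optional persistence, and is a log-structured key-value database.

	What Redis offers:

	In-memory storage with persistence (RDB, AOF)

	High efficiency, suitable for use as a high-speed cache

	Support for a variety of data types

	Persistence, clustering, and transactions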

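Redis Keys

The Redis key commands are used to manage keys.

127.0.0.1:6379> set k1 1  	Set the key k1 to the value 1
OK
127.0.0.1:6379> get k1  	Get the value of k1
"1"

del key 	Delete the key and its value if the key exists
dump key	Serialize the value stored at key and return the serialized value
exists key	Check whether the key exists; returns 1 if it does, 0 otherwise
expireat key timestamp 	Set the key to expire at the given UNIX timestamp (in seconds)
ttl key 	Return the key's remaining time to live, in seconds
randomkey	Return a random existing key
rename key newkey 	Rename key to newkey
type key 	Return the data type of the value stored at key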

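The Five Redis Data Types

String

getrange key start end // Return the substring of the value at key from index start to index end (both inclusive)
getset key value // Set key to value and return the old value
mget key [key ...] // Get the values of one or more keys
mset key value [key value ...] // Set one or more key-value pairs at once
strlen key // Return the length of the string stored at key
setrange key offset value // Overwrite the string stored at key with value, starting at offset
setex key seconds value // Set key to value with an expiration time in seconds
incr key // Increment the number stored at key by 1
decr key // Decrement the number stored at key by 1
append key value // Append value to the end of the existing value at key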
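
The index rules of getrange and setrange are easy to get wrong, so here is a small Python sketch of their semantics as I understand them: both getrange offsets are inclusive, negative offsets count from the end, and setrange pads with zero bytes when the offset lies beyond the current length. The function names are illustrative stand-ins, not a Redis client API.

```python
def getrange(value: str, start: int, end: int) -> str:
    """Mimic GETRANGE: both offsets are inclusive, negatives count from the end."""
    n = len(value)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = max(n + end, 0)
    end = min(end, n - 1)
    if n == 0 or start > end:
        return ""
    return value[start:end + 1]


def setrange(value: str, offset: int, patch: str) -> str:
    """Mimic SETRANGE: overwrite from offset, zero-padding a string that is too short."""
    padded = value.ljust(offset, "\x00")
    return padded[:offset] + patch + padded[offset + len(patch):]


print(getrange("Hello World", 0, 4))        # Hello
print(getrange("Hello World", -5, -1))      # World
print(setrange("Hello World", 6, "Redis"))  # Hello Redis
```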

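List

A list is ordered by insertion order. Elements can be added at the head (left) or the tail (right) of the list.

lpush key value // Insert value at the head of the list; paired with lpop this gives last-in, first-out order, paired with rpop it gives first-in, first-out order
lrange key start end // Return the elements from index start to index end
lpop key // Remove and return the first element of the list
rpop key // Remove and return the last element of the list
blpop key timeout // Remove and return the first element; if the list is empty, block until an element can be popped or the timeout expires
brpop key timeout // Remove and return the last element; if the list is empty, block until an element can be popped or the timeout expires
llen key // Return the length of the list
lindex key index // Get an element by its index
linsert key before|after pivot value // Insert value before or after the element pivot; does nothing if the key does not exist, returns -1 if pivot is not found, and returns an error if key is not a list
lrem key count value // Remove elements equal to value from the list
lset key index value // Set the value of the element at index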
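
Whether a list behaves as a stack or as a queue depends on which end you pop from. The sketch below mirrors the head/tail operations with Python's collections.deque; it only models the ordering and does not talk to a Redis server.

```python
from collections import deque

items = deque()
for value in ("a", "b", "c"):
    items.appendleft(value)    # like LPUSH key value: insert at the head

snapshot = list(items)         # like LRANGE key 0 -1
stack_pop = items.popleft()    # like LPOP: newest element first (LIFO)
queue_pop = items.pop()        # like RPOP: oldest element first (FIFO)

print(snapshot)    # ['c', 'b', 'a']
print(stack_pop)   # c
print(queue_pop)   # a
```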

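Set

A Redis Set is an unordered collection of strings. Members are unique, so a set never contains duplicate data.
Sets are implemented with hash tables, so adding, removing, and looking up a member are all O(1).

sadd key member [member ...] // Add one or more members to the set
scard key // Return the number of members in the set
sdiff key1 [key2 ...] // Return the members of key1 that are not in any of the other sets (the difference)
sinterstore destination key1 [key2 ...] // Store the intersection of the given sets in destination; if destination already exists, it is overwritten
sismember key member // Check whether member belongs to the set at key
smembers key // Return all members of the set
spop key // Remove and return a random member of the set
srandmember key [count] // Return one or more random members of the set
srem key member [member ...] // Remove one or more members from the set
sscan key cursor // Iterate over the members of the set
sunion key [key ...] // Return the union of all the given sets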
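
The difference, intersection, and union commands are easy to mix up. This sketch uses Python's built-in set type as a stand-in for Redis sets to show what each one returns; the two example sets are made up.

```python
key1 = {"a", "b", "c"}
key2 = {"b", "c", "d"}

difference = key1 - key2      # like SDIFF key1 key2
intersection = key1 & key2    # like SINTER key1 key2 (SINTERSTORE stores this result)
union = key1 | key2           # like SUNION key1 key2

print(sorted(difference))     # ['a']
print(sorted(intersection))   # ['b', 'c']
print(sorted(union))          # ['a', 'b', 'c', 'd']
```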

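Hash

A Redis hash is a mapping from string fields to string values, which makes it particularly well suited to storing objects.

hmset key field1 value1 [field2 value2 ...] // Set multiple field-value pairs in the hash at key
hset key field value // Set the value of field in the hash at key
hget key field // Get the value of the given field in the hash
hmget key field [field ...] // Get the values of all the given fields
hdel key field [field ...] // Delete one or more fields from the hash
hexists key field // Check whether field exists in the hash at key
hlen key // Return the number of fields in the hash at key
hvals key // Return all values in the hash at key
hgetall key // Return all fields and values in the hash at key
hincrby key field increment // Add increment to the integer value of field in the hash at key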

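Zset (Sorted Set)

Like a set, a Redis sorted set is a collection of string members that does not allow duplicates.
The difference is that every member is associated with a score of type double, and Redis uses the scores to order the members from lowest to highest.
Members of a sorted set are unique, but scores may repeat.

zadd key score1 member1 [score2 member2 ...] // Add members to the sorted set, or update the score of existing members
zcard key // Return the number of members in the sorted set at key
zcount key min max // Count the members with a score between min and max
zlexcount key min max // Count the members within the given lexicographical range
zremrangebylex key min max // Remove all members within the given lexicographical range
zremrangebyrank key start stop // Remove all members within the given rank range
zremrangebyscore key min max // Remove all members within the given score range
zrank key member // Return the rank (0-based index) of member, with scores ordered from low to high
zscore key member // Return the score of member
zrevrank key member // Return the rank of member, with scores ordered from high to low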
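
To make the two rank directions concrete, here is a toy model of a sorted set as a Python dict from member to score. It assumes that members with equal scores are ordered lexicographically, which is my understanding of Redis's tie-breaking; the member names and scores are invented.

```python
scores = {"alice": 90.0, "bob": 75.0, "carol": 90.0}


def ordered(zset):
    """Members sorted by (score, member), lowest score first."""
    return sorted(zset, key=lambda member: (zset[member], member))


def zrank(zset, member):
    """0-based rank with scores ascending, or None if the member is absent."""
    if member not in zset:
        return None
    return ordered(zset).index(member)


def zrevrank(zset, member):
    """0-based rank with scores descending, or None if the member is absent."""
    rank = zrank(zset, member)
    return None if rank is None else len(zset) - 1 - rank


print(ordered(scores))            # ['bob', 'alice', 'carol']
print(zrank(scores, "alice"))     # 1
print(zrevrank(scores, "alice"))  # 1
print(zrevrank(scores, "carol"))  # 0
```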

Understanding the configuration file: redis.conf

# Redis configuration file example

# Note on units: when memory size is needed, it is possible to specify
# it in the usual form of 1k 5GB 4M and so forth:
#
# 1k => 1000 bytes
# 1kb => 1024 bytes
# 1m => 1000000 bytes
# 1mb => 1024*1024 bytes
# 1g => 1000000000 bytes
# 1gb => 1024*1024*1024 bytes
#
# units are case insensitive so 1GB 1Gb 1gB are all the same.

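# By default Redis does not run as a daemon. Set this to 'yes' to enable daemon mode.
# When daemonized, Redis writes its pid to a pidfile, /var/run/redis.pid by default
daemonize no

# When running as a daemon, Redis writes the pid to /var/run/redis.pid by default; use pidfile to specify another location
pidfile /var/run/redis.pid

# The port Redis listens on; the default is 6379
# If port 0 is specified, Redis will not listen for TCP connections
port 6379

# The host address to bind to
# You can bind a single interface; if none is bound, Redis listens for incoming connections on all interfaces
# bind 127.0.0.1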

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /tmp/redis.sock
# unixsocketperm 755

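# Close the connection after a client has been idle for this many seconds; 0 disables the feature
timeout 0

# Log level. Redis supports four levels: debug, verbose, notice, and warning. The default in this sample is verbose
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel verbose

# Log destination; the default is standard output. If Redis runs as a daemon and this is set to standard output, logs are sent to /dev/null
logfile stdout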

# To enable logging to the system logger, just set 'syslog-enabled' to yes,
# and optionally update the other syslog parameters to suit your needs.
# syslog-enabled no

# Specify the syslog identity.
# syslog-ident redis

# Specify the syslog facility.  Must be USER or between LOCAL0-LOCAL7.
# syslog-facility local0

# Number of databases. The default database is 0; use select <dbid> on a connection to choose another one
# dbid is a number between 0 and 'databases'-1
databases 16

################################ SNAPSHOTTING  #################################
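# Save the data to the data file when the given number of updates occur within the given time; multiple conditions can be combined
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   With the settings below, the data is saved:
#   after 900 seconds (15 minutes) if at least 1 key changed
#   after 300 seconds (5 minutes) if at least 10 keys changed
#   after 60 seconds if at least 10000 keys changed
#   Note: you can disable saving entirely by commenting out all the "save" lines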

save 900 1
save 300 10
save 60 10000

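# Whether to compress data when saving to the local database; the default is yes. Redis uses LZF compression. Turn this off to save CPU time, at the cost of a much larger database file
rdbcompression yes

# Name of the local database file; the default is dump.rdb
dbfilename dump.rdb

# The working directory.
# The directory where the local database is stored; the file name is set by the dbfilename directive above
#
# Also the Append Only File will be created inside this directory.
#
# Note that you must specify a directory here, not a file name
dir ./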

################################# REPLICATION #################################

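# Master-slave replication. Use slaveof to make a Redis instance a copy of another Redis server. Note that this configuration is local to the slave,
# so for example it is possible to configure the slave to save the DB with a
# different interval, or to listen to another port, and so on.
# When this instance acts as a slave, set the IP address and port of the master; on startup Redis automatically syncs its data from the master
# slaveof <masterip> <masterport>


# If the master is password protected, this is the password the slave uses to connect to the master
# The "requirepass" directive below sets that password
# masterauth <master-password>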

# When a slave lost the connection with the master, or when the replication
# is still in progress, the slave can act in two different ways:
#
# 1) if slave-serve-stale-data is set to 'yes' (the default) the slave will
#    still reply to client requests, possibly with out of date data, or the
#    data set may just be empty if this is the first synchronization.
#
# 2) if slave-serve-stale-data is set to 'no' the slave will reply with
#    an error "SYNC with master in progress" to all the kind of commands
#    but to INFO and SLAVEOF.
#
slave-serve-stale-data yes

# Slaves send PINGs to server in a predefined interval. It's possible to change
# this interval with the repl-ping-slave-period option. The default value is 10
# seconds.
#
# repl-ping-slave-period 10

# The following option sets a timeout for both Bulk transfer I/O timeout and
# master data or ping response timeout. The default value is 60 seconds.
#
# It is important to make sure that this value is greater than the value
# specified for repl-ping-slave-period otherwise a timeout will be detected
# every time there is low traffic between the master and the slave.
#
# repl-timeout 60

################################## SECURITY ###################################

# Warning: since Redis is pretty fast an outside user can try up to
# 150k passwords per second against a good box. This means that you should
# use a very strong password otherwise it will be very easy to break.
# Sets the Redis connection password. If a password is configured, clients must supply it with the auth <password> command when connecting. Disabled by default
# requirepass foobared

# Command renaming.
#
# It is possible to change the name of dangerous commands in a shared
# environment. For instance the CONFIG command may be renamed into something
# hard to guess so that it will be still available for internal-use
# tools but not available for general clients.
#
# Example:
#
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
#
# It is also possible to completely kill a command renaming it into
# an empty string:
#
# rename-command CONFIG ""

################################### LIMITS ####################################

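# Maximum number of clients connected at the same time. Unlimited by default; the number of connections Redis can hold open is bounded by the maximum number of file descriptors the Redis process can open.
# Setting maxclients 0 means no limit. When the limit is reached, Redis closes new connections and returns the error "max number of clients reached" to the client
# maxclients 128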

# Don't use more memory than the specified amount of bytes.
# When the memory limit is reached Redis will try to remove keys with an
# EXPIRE set. It will try to start freeing keys that are going to expire
# in little time and preserve keys with a longer time to live.
# Redis will also try to remove objects from free lists if possible.
#
# If all this fails, Redis will start to reply with errors to commands
# that will use more memory, like SET, LPUSH, and so on, and will continue
# to reply to most read-only commands like GET.
#
# WARNING: maxmemory can be a good idea mainly if you want to use Redis as a
# 'state' server or cache, not as a real DB. When Redis is used as a real
# database the memory usage will grow over the weeks, it will be obvious if
# it is going to use too much memory in the long run, and you'll have the time
# to upgrade. With maxmemory after the limit is reached you'll start to get
# errors for write operations, and this may even lead to DB inconsistency.
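# Sets the maximum memory limit. Redis loads its data into memory at startup; once the limit is reached, Redis first tries to remove keys that have expired or are about to expire.
# If memory is still at the limit after that, write operations fail but read operations continue to work.
# With Redis's VM mechanism, keys are kept in memory and values are stored in the swap area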
# maxmemory <bytes>

# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached? You can select among six behaviors:
#
# volatile-lru -> remove the key with an expire set using an LRU algorithm
# allkeys-lru -> remove any key accordingly to the LRU algorithm
# volatile-random -> remove a random key with an expire set
# allkeys-random -> remove a random key, any key
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
# noeviction -> don't expire at all, just return an error on write operations
# 
# Note: with all the kind of policies, Redis will return an error on write
#       operations, when there are not suitable keys for eviction.
#
#       At the date of writing this commands are: set setnx setex append
#       incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
#       sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
#       zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
#       getset mset msetnx exec sort
#
# The default is:
#
# maxmemory-policy volatile-lru

# LRU and minimal TTL algorithms are not precise algorithms but approximated
# algorithms (in order to save memory), so you can select as well the sample
# size to check. For instance for default Redis will check three keys and
# pick the one that was used less recently, you can change the sample size
# using the following configuration directive.
#
# maxmemory-samples 3

############################## APPEND ONLY MODE ###############################

# 
# Note that you can have both the async dumps and the append only file if you
# like (you have to comment the "save" statements above to disable the dumps).
# Still if append only mode is enabled Redis will load the data from the
# log file at startup ignoring the dump.rdb file.
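# Whether to log every update operation. By default Redis writes data to disk asynchronously; if this is not enabled, a power failure may lose the data from a period of time,
# because Redis syncs the data file according to the save conditions above, so some data exists only in memory for a while. The default is no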
# IMPORTANT: Check the BGREWRITEAOF to check how to rewrite the append
# log file in background when it gets too big.

appendonly no

# Name of the append-only log file; the default is appendonly.aof
# appendfilename appendonly.aof

# The fsync() call tells the Operating System to actually write data on disk
# instead to wait for more data in the output buffer. Some OS will really flush 
# data on disk, some other OS will just try to do it ASAP.

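# The fsync policy for the append-only log. There are three possible values:
# no: let the operating system flush its cached data to disk when it chooses (fast)
# always: call fsync() after every update to write the data to disk (slow, safe)
# everysec: fsync once per second (the default)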

appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving the durability of Redis is
# the same as "appendfsync no", that in practical terms means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
# 
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
# 
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (or if no rewrite happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

################################## SLOW LOG ###################################

# The Redis Slow Log is a system to log queries that exceeded a specified
# execution time. The execution time does not include the I/O operations
# like talking with the client, sending the reply and so forth,
# but just the time needed to actually execute the command (this is the only
# stage of command execution where the thread is blocked and can not serve
# other requests in the meantime).
# 
# You can configure the slow log with two parameters: one tells Redis
# what is the execution time, in microseconds, to exceed in order for the
# command to get logged, and the other parameter is the length of the
# slow log. When a new command is logged the oldest one is removed from the
# queue of logged commands.

# The following time is expressed in microseconds, so 1000000 is equivalent
# to one second. Note that a negative number disables the slow log, while
# a value of zero forces the logging of every command.
slowlog-log-slower-than 10000

# There is no limit to this length. Just be aware that it will consume memory.
# You can reclaim memory used by the slow log with SLOWLOG RESET.
slowlog-max-len 1024

################################ VIRTUAL MEMORY ###############################

### WARNING! Virtual Memory is deprecated in Redis 2.4
### The use of Virtual Memory is strongly discouraged.

# Virtual Memory allows Redis to work with datasets bigger than the actual
# amount of RAM needed to hold the whole dataset in memory.
# In order to do so very used keys are taken in memory while the other keys
# are swapped into a swap file, similarly to what operating systems do
# with memory pages.
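# Whether to enable the virtual memory mechanism; the default is no.
# VM stores data in pages: Redis swaps rarely accessed pages (cold data) out to disk, and frequently accessed pages are swapped back from disk into memory
# To start VM, set vm-enabled to yes and configure the following three VM parameters as needed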
vm-enabled no
# vm-enabled yes

# This is the path of the Redis swap file. As you can guess, swap files
# can't be shared by different Redis instances, so make sure to use a swap
# file for every redis process you are running. Redis will complain if the
# swap file is already in use.
#
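# The best storage for the Redis swap file is an SSD (solid-state drive)
# Path of the virtual memory file; the default is /tmp/redis.swap. It cannot be shared by multiple Redis instances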
# *** WARNING *** if you are using a shared hosting the default of putting
# the swap file under /tmp is not secure. Create a dir with access granted
# only to Redis user and configure Redis to create the swap file there.
vm-swap-file /tmp/redis.swap

# With vm-max-memory 0 the system will swap everything it can. Not a good
# default, just specify the max amount of RAM you can in bytes, but it's
# better to leave some margin. For instance specify an amount of RAM
# that's more or less between 60 and 80% of your free RAM.
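# All data beyond vm-max-memory is stored in virtual memory. However small vm-max-memory is, all index data stays in memory (in Redis the index data is the keys)
# In other words, with vm-max-memory set to 0, all values actually live on disk. The default is 0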
vm-max-memory 0

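# The Redis swap file is divided into many pages. An object can span multiple pages, but a page cannot be shared by multiple objects, so set vm-page-size according to the size of the stored data.
# If you store many small objects, a page size of 32 or 64 bytes is recommended; if you store very large objects, use a larger page size; if unsure, use the default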
vm-page-size 32

# Number of pages in the swap file. The page table (a bitmap marking pages as free or used) is kept in memory, so every 8 pages on disk consume 1 byte of RAM
# The total swap size is vm-page-size * vm-pages
#
# With the default of 32-bytes memory pages and 134217728 pages Redis will
# use a 4 GB swap file, that will use 16 MB of RAM for the page table.
#
# It's better to use the smallest acceptable value for your application,
# but the default is large in order to work in most conditions.
vm-pages 134217728

# Max number of VM I/O threads running at the same time.
# This threads are used to read/write data from/to swap file, since they
# also encode and decode objects from disk to memory or the reverse, a bigger
# number of threads can help with big objects even if they can't help with
# I/O itself as the physical device may not be able to couple with many
# reads/writes operations at the same time.
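# Number of I/O threads used to access the swap file, preferably no more than the number of CPU cores. If set to 0, all operations on the swap file are serial, which may cause long delays. The default is 4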
vm-max-threads 4

############################### ADVANCED CONFIG ###############################

# Hashes are encoded in a special way (much more memory efficient) when they
# have at max a given number of elements, and the biggest element does not
# exceed a given threshold. You can configure these limits with the following
# configuration directives.
# A hash uses the special compact encoding only while its entry count and its largest element stay within these limits; beyond either limit it falls back to a regular hash table
hash-max-zipmap-entries 512
hash-max-zipmap-value 64

# Similarly to hashes, small lists are also encoded in a special way in order
# to save a lot of space. The special representation is only used when
# you are under the following limits:
list-max-ziplist-entries 512
list-max-ziplist-value 64

# Sets have a special encoding in just one case: when a set is composed
# of just strings that happens to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512

# Similarly to hashes and lists, sorted sets are also specially encoded in
# order to save a lot of space. This encoding is only used when the length and
# elements of a sorted set are below the following limits:
zset-max-ziplist-entries 128
zset-max-ziplist-value 64

# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation redis uses (see dict.c)
# performs a lazy rehashing: the more operations you run into a hash table
# that is rehashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
# 
# The default is to use this millisecond 10 times every second in order to
# active rehashing the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply from time to time
# to queries with 2 milliseconds delay.
# Whether to enable active rehashing; enabled by default
activerehashing yes

################################## INCLUDES ###################################

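# Include other configuration files. This lets multiple Redis instances on the same host share one common configuration file while each instance keeps its own specific settings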
# include /path/to/local.conf
# include /path/to/other.conf

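RDB Persistence

All Redis data lives in memory. Without persistence configured, all data is lost when Redis restarts, so persistence must be enabled to save the data to disk; after a restart, Redis can then restore the data from disk.

RDB persistence writes a snapshot of the in-memory dataset to disk at specified intervals. It is the default persistence mode: the in-memory data is written as a snapshot to a binary file, named dump.rdb by default.

Three ways to trigger a snapshot:

save command: blocks the Redis server. While save runs, Redis cannot process any other command until the RDB process completes.
bgsave command: takes the snapshot asynchronously. The Redis process forks a child process, the child performs the RDB persistence and exits when it is done. Blocking occurs only during the fork, which is brief.

bgsave is an optimization that addresses the blocking problem of save. All RDB-related operations inside Redis therefore use the bgsave approach, and the save command can be considered obsolete.

Automatic trigger: configured under the SNAPSHOTTING section of redis.conf.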
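
The save directives in the SNAPSHOTTING section combine as "any rule may fire, and a rule fires only when both its time and its change count are reached". The Python sketch below models that reading; should_snapshot is a hypothetical helper for illustration, not Redis's own code, and it assumes the elapsed time must exceed the configured seconds.

```python
# (seconds, changes) pairs taken from the sample configuration:
# save 900 1 / save 300 10 / save 60 10000
RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(rules, seconds_since_save, changes_since_save):
    """Return True if any save rule has both of its conditions met."""
    return any(
        seconds_since_save > seconds and changes_since_save >= changes
        for seconds, changes in rules
    )


print(should_snapshot(RULES, 901, 1))     # True: the 900/1 rule fires
print(should_snapshot(RULES, 120, 50))    # False: no rule has both conditions met
print(should_snapshot(RULES, 61, 10000))  # True: the 60/10000 rule fires
```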

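Advantages:

RDB is a full backup in a compact file, which makes it well suited to backups and disaster recovery
When generating the RDB file, the Redis main process fork()s a child process to do all the saving work, so the main process performs no disk I/O
RDB restores large datasets faster than AOF

Disadvantages:

An RDB snapshot is a full backup that stores a binary serialization of the in-memory data, which is very compact. During snapshotting, a child process is started specifically for the job. The child sees the parent's memory as it was at fork time, and later modifications made by the parent are not visible to it, so data modified during the snapshot is not saved. Any writes made after the latest snapshot can therefore be lost if the server fails.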

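AOF Persistence

AOF (append only file) persistence records every write command in a separate log and re-executes the commands in the AOF file at restart to restore the data. The main purpose of AOF is to make persistence closer to real time, and it is now the mainstream Redis persistence mode.

AOF must be enabled in the configuration with appendonly yes. The AOF file name is set with appendfilename; the default is appendonly.aof.

AOF workflow:

Command write (append): every write command is appended to aof_buf (a buffer)
File sync (sync): the AOF buffer is synced to disk according to the configured policy
File rewrite (rewrite): as the AOF file grows, it needs to be rewritten periodically to compact it
Load on restart (load): when the Redis server restarts, it can load the AOF file to restore the data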
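
The four workflow stages can be sketched with a toy in-memory model. ToyAof and replay are hypothetical names; a plain Python list stands in for the file on disk and only SET, INCR, and DEL are modeled, so this illustrates the idea rather than Redis's actual AOF format.

```python
def replay(log):
    """Load stage: rebuild the dataset by re-executing the logged commands."""
    state = {}
    for op, key, *args in log:
        if op == "SET":
            state[key] = args[0]
        elif op == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "DEL":
            state.pop(key, None)
    return state


class ToyAof:
    def __init__(self):
        self.buffer = []  # plays the role of aof_buf
        self.log = []     # stands in for the AOF file on disk

    def append(self, *command):
        """Append stage: write commands go to the buffer first."""
        self.buffer.append(command)

    def sync(self):
        """Sync stage: flush the buffer to the 'file'."""
        self.log.extend(self.buffer)
        self.buffer.clear()

    def rewrite(self):
        """Rewrite stage: keep only the commands needed to rebuild the current state."""
        self.log = [("SET", key, value) for key, value in replay(self.log).items()]


aof = ToyAof()
aof.append("SET", "counter", "0")
for _ in range(3):
    aof.append("INCR", "counter")
aof.append("SET", "tmp", "x")
aof.append("DEL", "tmp")
aof.sync()

before = replay(aof.log)
size_before = len(aof.log)
aof.rewrite()
print(size_before, len(aof.log), replay(aof.log) == before)  # 6 1 True
```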

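Three AOF sync policies (configured in the APPEND ONLY MODE section of redis.conf):

always: synchronous persistence; every data change is recorded to disk immediately. Performance is worse, but data integrity is better
everysec: asynchronous; the log is synced once per second, so a crash within that second loses data
no: Redis never calls fsync itself and leaves flushing to the operating system

Advantages:

AOF protects better against data loss. Typically AOF runs an fsync from a background thread once per second, so at most one second of data is lost
The AOF log is append-only, so writes incur no disk seek overhead; write performance is very high and the file is hard to corrupt
Even when the AOF log grows too large and a background rewrite starts, client reads and writes are not affected
The commands in the AOF log are recorded in a very readable form, which makes AOF well suited to emergency recovery from catastrophic accidental deletion. For example, if someone accidentally clears all data with flushall, then as long as no background rewrite has happened yet, you can copy the AOF file right away, delete the final flushall command, and put the file back; the recovery mechanism then restores all the data automatically

Disadvantages:

For the same dataset, the AOF log file is usually larger than the RDB snapshot file
With AOF enabled, the supported write QPS is lower than with RDB, because AOF is usually configured to fsync the log once per second. That said, performance with one fsync per second is still very high

Queries per second (QPS) is a measure of how much traffic a particular query server handles in a given amount of time.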

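Redis Transactions

The MULTI, EXEC, DISCARD, and WATCH commands are the foundation of Redis transactions. A Redis transaction allows a group of commands to be executed in a single step.

Redis serializes all commands in a transaction and executes them in order. A request issued by another client can never be served in the middle of the execution of a Redis transaction. Within a transaction, either all of the commands are processed or none are, so a Redis transaction is atomic in that sense. Note that Redis does not roll back commands that have already run if a later command in the same transaction fails at execution time.

The EXEC command triggers the execution of all commands in the transaction. So if a client in the middle of a transaction loses its connection to the Redis server before calling EXEC, none of the operations are performed; if instead it disconnects after calling EXEC, all of the operations are performed.

When using the append-only file (AOF), Redis makes sure to use a single write(2) system call to write the transaction to disk. However, if the Redis server crashes, or is killed by the system administrator in some hard way, it is possible that only part of the transaction's operations were recorded. Redis detects this condition at restart, exits, and prints an error message. The redis-check-aof tool can repair the append-only file: it removes the partial transaction from the file so that the Redis server can start again.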

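MULTI

Starts a transaction. Redis queues the subsequent commands one by one, and EXEC then executes them atomically.

redis 127.0.0.1:6379> multi
OK

EXEC

Executes all commands in the transaction, then returns the connection to its normal state.

redis 127.0.0.1:6379> multi
OK
redis 127.0.0.1:6379> command1
redis 127.0.0.1:6379> command2
...
redis 127.0.0.1:6379> EXEC
execution results

DISCARD

Discards all commands in the transaction, then returns the connection to its normal state.

redis 127.0.0.1:6379> multi
OK
redis 127.0.0.1:6379> command1
redis 127.0.0.1:6379> command2
...
redis 127.0.0.1:6379> DISCARD
OK

WATCH

Watches one or more keys. If any of these keys is modified by another command before the transaction executes, the transaction is aborted.

WATCH key [key ...]
OK

UNWATCH

Cancels the WATCH command's monitoring of all keys.

redis 127.0.0.1:6379> UNWATCH
OK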

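Optimistic Locking with CAS

Redis uses the WATCH command to give transactions check-and-set (CAS) behavior.

The keys passed to WATCH are monitored by Redis, which detects changes to them. If Redis detects that at least one watched key was modified before EXEC is executed, the whole transaction is aborted and EXEC returns a Null reply to tell the user that the transaction failed.

Redis optimistic locking: if there is a race, and another client modifies a watched key between our call to WATCH and our call to EXEC, the transaction fails.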
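
The check-and-set behavior described above can be modeled with per-key version counters. ToyStore is a hypothetical in-memory stand-in, not a Redis client: watch records the versions it saw, and execute refuses to run the queued writes if any of those versions has changed since.

```python
class ToyStore:
    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of modifications so far

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, *keys):
        """Like WATCH: remember the current version of each key."""
        return {key: self.versions.get(key, 0) for key in keys}

    def execute(self, watched, queued):
        """Like EXEC: return None if a watched key changed, else run every queued write."""
        if any(self.versions.get(key, 0) != seen for key, seen in watched.items()):
            return None
        results = []
        for key, value in queued:
            self.set(key, value)
            results.append("OK")
        return results


store = ToyStore()
store.set("balance", 100)

watched = store.watch("balance")
new_balance = store.data["balance"] + 10
store.set("balance", 0)  # another client writes in between
first = store.execute(watched, [("balance", new_balance)])

watched = store.watch("balance")  # retry with a fresh WATCH
second = store.execute(watched, [("balance", store.data["balance"] + 10)])

print(first, second, store.data["balance"])  # None ['OK'] 10
```

The usual client pattern is the retry shown at the end: on a Null reply, read the value again, recompute, and try the transaction once more.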

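Redis Replication: Master-Slave Replication

When several Redis servers are deployed together, one typically acts as the master and the others as slaves. In general, the master handles writes and the slaves handle reads.

Through data replication, a Redis master can have multiple slaves attached, and a slave can in turn have slaves of its own, forming a multi-level nested structure. All writes are performed on the master instance; after executing them, the master distributes the write commands to the slaves attached to it. A slave that has nested slaves forwards the write commands it receives to the slaves attached to it.

With multiple slaves, the data is kept in several copies, so the failure of any single node does not lose the data, and N slaves can raise read throughput roughly N-fold. With the master handling only writes, both the read and the write capacity of the master-slave group improve substantially.

When distributing writes, the master also copies the write commands into the replication backlog buffer. When a slave disconnects briefly and reconnects, as long as its replication position is still within the backlog, it can continue replicating from that position, which makes replication more efficient.

Redis replication consists of two operations:

Synchronization: brings the slave's database state up to the master's current database state.
Command propagation: when the master's database state is modified and the two states diverge, brings the master and slave back to a consistent state.

Synchronization

When a client sends the SLAVEOF command to a slave, asking it to replicate a master, the slave sends a SYNC command to the master. The steps are as follows:

The slave sends the SYNC command to the master.
On receiving SYNC, the master runs BGSAVE to generate an RDB file in the background, and uses a buffer to record every write command executed from that point on.
When BGSAVE finishes, the master sends the generated RDB file to the slave, which receives and loads it. At this point the slave's database state matches the master's state at the time BGSAVE ran.
The master sends all the write commands recorded in the buffer to the slave, which receives and executes them. At this point the slave's database state matches the master's current state.

[Figure: communication between the master and the slave while the SYNC command executes]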
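
The SYNC steps amount to "snapshot plus buffered writes". The toy Python function below replays them on plain dicts: a copy of the dict stands in for the RDB file and a list for the write buffer. The function name and data are invented for illustration.

```python
def full_sync(master_data, writes_during_bgsave):
    """Replay the SYNC steps and return the slave's resulting dataset."""
    snapshot = dict(master_data)       # BGSAVE: point-in-time copy (the "RDB file")
    buffered = []
    for key, value in writes_during_bgsave:
        master_data[key] = value       # the master keeps serving writes...
        buffered.append((key, value))  # ...and records them in the buffer

    slave_data = dict(snapshot)        # the slave loads the RDB file
    for key, value in buffered:        # then applies the buffered write commands
        slave_data[key] = value
    return slave_data


master = {"a": "1"}
slave = full_sync(master, [("b", "2"), ("a", "3")])
print(slave == master)  # True
```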

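Command Propagation

After synchronization completes, the master and slave databases are in a consistent state. When the master then executes a write command sent by a client, its database is modified and the two states are no longer consistent.

To bring them back to a consistent state, the master performs command propagation: it sends the write commands it executes to the slave, and once the slave has executed the same commands, the two databases are consistent again.

Shortcomings of Replication Before Redis 2.8

Before Redis 2.8, a slave's replication of a master fell into two cases:

Initial replication

The slave has never replicated any master before, or the master it is about to replicate differs from the one it last replicated.

Re-replication after a disconnect

A master and slave in the command propagation phase lose replication because of a network problem, but the slave reconnects to the master by retrying and continues replicating it.

The old replication handles initial replication well, but it is very inefficient at re-replication after a disconnect.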

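Consider a concrete example. Slave B has been replicating master A. At first everything is normal, and the write commands executed by master A are passed on to slave B through command propagation. Then, because of a network problem, replication between master A and slave B is suddenly interrupted. Suppose master A executes 10 more write commands during the outage, after which slave B reconnects to master A by retrying and resumes replication. How does it replicate?

Slave B sends a SYNC command to master A. On receiving it, master A runs BGSAVE, and all write commands executed while BGSAVE runs are recorded in a buffer. When BGSAVE finishes, master A sends the generated RDB file to slave B, which receives and loads it. Master A then sends the buffered write commands to slave B for execution. At this point the two databases are consistent again, and the servers return to the command propagation phase.

In other words, every re-replication after a disconnect runs SYNC and performs a full copy, even though slave B only needs the write commands that master A executed while the connection was down, which in the example above is just 10 write commands.

SYNC is also a very resource-intensive operation:

The master must run BGSAVE to generate the RDB file, which consumes a great deal of the master's CPU, memory, and disk I/O.
The master must send the generated RDB file to the slave, which consumes a great deal of network resources (bandwidth and traffic) on both servers.
The slave that receives the RDB file must load it; while loading, the slave blocks and cannot process command requests.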

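Redis 2.8 and Later

Starting with Redis 2.8, Redis uses the PSYNC command instead of SYNC to perform the synchronization step of replication.

PSYNC covers two scenarios:

Full resynchronization

Full resynchronization handles initial replication; its steps are essentially the same as those of the SYNC command.

Partial resynchronization

Partial resynchronization handles re-replication after a disconnect. When a slave reconnects to the master after a disconnect, the master can, if conditions allow, send the slave the write commands it executed while the connection was down. The slave only has to receive and execute these commands to bring its database up to the master's current state.

In the earlier example, with the new replication the master only needs to send the slave the 10 write commands executed during the outage, without generating and sending an entire RDB file, which is a large performance gain.

[Figure: communication between the master and the slave during partial resynchronization]

So how is partial resynchronization implemented?

Partial resynchronization is built from three parts:

The replication offsets of the master and the slave
The master's replication backlog buffer
The server run ID

The following sections cover each in turn.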

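Replication Offset

The master and the slave taking part in replication each maintain a replication offset:

Each time the master propagates N bytes of data to its slaves, it adds N to its own replication offset.
Each time a slave receives N bytes of data propagated by the master, it adds N to its own replication offset.

For example, suppose the master has 3 slaves and every server's replication offset is 10086.

[Figure: a master and three slaves, all with replication offset 10086]

The master then propagates 33 bytes of data to the 3 slaves, so the master's replication offset grows by 33 to 10119.

Slave A happens to be disconnected at that moment and does not receive the data, so its offset stays at 10086.

Slaves B and C receive the data normally, so their offsets are both updated to 10119.

[Figure: master, slave B, and slave C at offset 10119; slave A still at 10086]

Clearly, comparing the replication offsets of master and slave makes it easy to tell whether they are in a consistent state.

Slave A then reconnects to the master by retrying, sends it a PSYNC command, and reports its current replication offset of 10086. The master now has two questions to answer:

Should it perform a full or a partial resynchronization for slave A?
If it performs a partial resynchronization, where does it get the data that slave A missed while it was disconnected?

With these two questions in mind, let's look at the replication backlog buffer.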

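Replication Backlog Buffer

The replication backlog buffer is a fixed-length first-in, first-out queue maintained by the master; its default size is 1 MB.

When the master propagates commands, it not only sends the write commands to all slaves but also enqueues them in the replication backlog buffer.

[Figure: the master sends write commands to the slaves and also enqueues them in the replication backlog buffer]

The master's replication backlog buffer therefore holds a portion of the most recently propagated write commands, and records the replication offset of every byte in the queue, as shown below:

Offset	...	10087	10088	10089	10090	10091	...
Byte	...	'*'	3	'\r'	'\n'	'$'	...

When a slave reconnects to the master, it sends its replication offset to the master with the PSYNC command, and the master decides which kind of synchronization to perform according to these rules:

If the data after offset still exists in the replication backlog buffer, the master performs a partial resynchronization.
If the data after offset no longer exists in the replication backlog buffer, the master performs a full resynchronization.

Back to the earlier example:

Slave A reconnects to the master, sends a PSYNC command, and reports a replication offset of 10086.
On receiving the PSYNC command and the offset 10086, the master checks whether the data after offset 10086 still exists in the replication backlog buffer. It does, so the master replies to slave A with +CONTINUE, meaning that synchronization will proceed as a partial resynchronization.
The master then sends slave A all the data after offset 10086 in the replication backlog buffer (offsets 10087 through 10119).
Slave A receives these 33 missing bytes and is back in a state consistent with the master.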
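
The backlog lookup in this example can be sketched as a bounded byte buffer that remembers the offset of its newest byte. The Backlog class and its method names are hypothetical, and the buffer size of 64 bytes is chosen only to keep the demonstration small; the offsets follow the convention used in the text, where the slave reports the offset of the last byte it received.

```python
class Backlog:
    def __init__(self, size, start_offset=0):
        self.size = size
        self.buf = bytearray()
        self.offset = start_offset  # replication offset of the newest byte

    def feed(self, data: bytes):
        """Record propagated bytes, dropping the oldest ones when the buffer is full."""
        self.buf += data
        self.offset += len(data)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]

    def missing_after(self, slave_offset):
        """Bytes the slave lacks, or None when a full resynchronization is required."""
        first = self.offset - len(self.buf) + 1  # offset of the oldest retained byte
        if slave_offset > self.offset or slave_offset + 1 < first:
            return None
        return bytes(self.buf[slave_offset + 1 - first:])


backlog = Backlog(size=64, start_offset=10086)
backlog.feed(b"x" * 33)                    # master offset becomes 10119
partial = backlog.missing_after(10086)     # slave A reports 10086
print(backlog.offset, len(partial))        # 10119 33

backlog.feed(b"y" * 100)                   # older bytes fall out of the buffer
print(backlog.missing_after(10086))        # None
```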

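Server Run ID

Every Redis server, master or slave, has its own run ID. The run ID is generated automatically at server startup and consists of 40 hexadecimal characters.

[Figure: example of a server run ID]

When a slave replicates a master for the first time, the master sends its run ID to the slave, and the slave saves it.

When the slave disconnects and reconnects to a master, it sends the saved run ID to the master it is now connected to:

If the saved run ID matches the run ID of the current master, the slave was replicating the same master before and after the disconnect, and the master can go on to attempt a partial resynchronization.
If the saved run ID does not match the run ID of the current master, the slave was replicating a different master before the disconnect, and the master performs a full resynchronization.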

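PSYNC Execution Details

On the slave side, there are two ways PSYNC is invoked:

If the slave has never replicated any master, or has previously executed SLAVEOF no one, it sends PSYNC ? -1 when starting a new replication, actively requesting a full resynchronization.
If the slave has already replicated a master, it sends PSYNC {runid} {offset} when starting a new replication, where runid is the run ID of the master it last replicated and offset is the slave's current replication offset.

On the master side, one of three replies is returned to the slave after receiving PSYNC:

If the master returns +FULLRESYNC {runid} {offset}, it will perform a full resynchronization with the slave. runid is the master's run ID, which the slave saves for use in its next PSYNC command; offset is the master's current replication offset, which the slave uses as its own initial offset.
If the master returns +CONTINUE, it will perform a partial resynchronization and send the slave the data it is missing.
If the master returns -ERR, the master's version is older than Redis 2.8 and does not recognize the PSYNC command. The slave then sends a SYNC command and performs a full resynchronization with the master.

[Figure: flowchart of the PSYNC process described above]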
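
The master's side of the decision can be summarized as a small function. psync_reply and its parameters are hypothetical names used for illustration; the sketch follows the text's convention that the slave reports the offset of the last byte it received, and it covers only the two replies a 2.8+ master can give (the -ERR case arises only with an older master).

```python
def psync_reply(master_runid, master_offset, backlog_first_offset,
                requested_runid, requested_offset):
    """Decide between +FULLRESYNC and +CONTINUE for a PSYNC request."""
    full = f"+FULLRESYNC {master_runid} {master_offset}"
    # First replication ("PSYNC ? -1") or a different master: full resync.
    if requested_runid == "?" or requested_runid != master_runid:
        return full
    # The bytes after the slave's offset must still be in the backlog.
    if requested_offset + 1 < backlog_first_offset or requested_offset > master_offset:
        return full
    return "+CONTINUE"


RUNID = "0123456789abcdef0123456789abcdef01234567"  # made-up 40-hex-character run ID

print(psync_reply(RUNID, 10119, 10087, "?", -1) == f"+FULLRESYNC {RUNID} 10119")  # True
print(psync_reply(RUNID, 10119, 10087, RUNID, 10086))                             # +CONTINUE
```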

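Sentinel

1. Once Sentinel mode is running, if your master server goes down, the Sentinels automatically vote to elect a new master from among the slave Redis servers. The new master can serve both reads and writes, and the other slaves follow it.

2. If the master that went down has been repaired and is ready to run again, it can only serve reads and automatically follows the new master elected by the Sentinels.

1. What Sentinel does

Monitors the state of the master
If the master fails, performs a master-slave switch: one of the slaves becomes the master, and the previous master becomes a slave
After the switch, the contents of master_redis.conf, slave_redis.conf, and sentinel.conf all change: master_redis.conf gains a slaveof line, and the monitoring target in sentinel.conf is swapped accordingly

2. How Sentinel works

1): Each Sentinel sends a PING command once per second to every master, slave, and other Sentinel instance it knows of.

2): If the time since an instance's last valid reply to PING exceeds the value of the down-after-milliseconds option, the Sentinel marks that instance as subjectively down.

3): If a master is marked as subjectively down, all Sentinels monitoring that master confirm, once per second, whether the master has indeed entered the subjectively down state.

4): When enough Sentinels (at least the number specified in the configuration file) confirm within the specified time range that the master is subjectively down, the master is marked as objectively down.

5): Under normal conditions, each Sentinel sends an INFO command to every master and slave it knows of once every 10 seconds.

6): When a master is marked as objectively down by the Sentinels, the frequency at which they send INFO to all slaves of that master changes from once every 10 seconds to once per second.

7): If not enough Sentinels agree that the master is down, the master's objectively down status is removed.

If the master again returns a valid reply to a Sentinel's PING command, the master's subjectively down status is removed.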
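
The two down states reduce to a timeout check per Sentinel and a vote count across Sentinels. The helper names and the numbers below are hypothetical; the sketch only captures steps 2) and 4) above, not Sentinel's full protocol.

```python
def is_subjectively_down(now_ms, last_valid_reply_ms, down_after_ms):
    """One Sentinel's view: no valid PING reply for longer than down-after-milliseconds."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(votes, quorum):
    """Cluster view: at least `quorum` Sentinels consider the master down."""
    return sum(votes) >= quorum


# Three Sentinels, a quorum of 2, and a 30-second down-after-milliseconds.
last_replies = [5_000, 8_000, 39_000]  # last valid reply seen by each Sentinel (ms)
votes = [is_subjectively_down(40_000, t, 30_000) for t in last_replies]

print(votes)                                 # [True, True, False]
print(is_objectively_down(votes, quorum=2))  # True
```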

posted @ 2021-07-27 22:32 Leejk