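Redis Operations and Maintenance

Background

Redis is one of the most popular key-value stores, and beyond simply using it, it needs routine operational care. This article covers the following areas of day-to-day operations (for Redis 5.0 and later):

Memory
Client connections
Tools

Details

Memory

Note: in each pair below, the first name is the field reported by MEMORY STATS and the second is the corresponding field reported by INFO memory.

Server memory (unit: bytes)

Peak memory consumed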
```
peak.allocated
used_memory_peak
```
Total memory currently allocated
```
total.allocated
used_memory
```
Initial memory consumed at startup
```
startup.allocated
used_memory_startup
```
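Memory used by the replication backlog (repl_backlog_size, found in the replication section of INFO, is the configured size of the backlog; the INFO memory field is mem_replication_backlog)
```
replication.backlog
mem_replication_backlog
```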
Memory used by the data set
```
dataset.bytes
used_memory_dataset
```

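Total overhead memory used to manage internal data structures

-- The sum of startup.allocated, replication.backlog, clients.slaves, clients.normal, aof.buffer, and the internal data structures used to manage the Redis keyspace
overhead.total
used_memory_overhead

Memory used by the AOF buffer and the AOF rewrite buffer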
```
aof.buffer
mem_aof_buffer
```
Total memory overhead of all replicas (output and query buffers, connection contexts)
```
clients.slaves
mem_clients_slaves
```
Total memory overhead of all clients (output and query buffers, connection contexts)
```
clients.normal
mem_clients_normal
```
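Total memory overhead of the Lua script cache (used_memory_lua, by contrast, reports the memory of the Lua engine itself)
```
lua.caches
used_memory_scripts
```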
Ratio of the process RSS to the resident memory reported by the allocator
```
rss-overhead.ratio
rss_overhead_ratio
```

For more memory-related fields, see the documentation of MEMORY STATS and INFO memory.
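The INFO memory fields listed above can also be compared programmatically. The following is a minimal sketch, assuming the raw text of `INFO memory` has already been fetched (for example with `redis-cli info memory`). The function names and the sample values are made up for illustration and are not taken from a real server.

```python
def parse_info(text):
    """Parse the 'field:value' lines of an INFO section into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def overhead_share(fields):
    """Fraction of used_memory that is overhead rather than data."""
    used = int(fields["used_memory"])
    overhead = int(fields["used_memory_overhead"])
    return overhead / used if used else 0.0


# Illustrative values only (hypothetical, not real server output).
sample = (
    "# Memory\r\n"
    "used_memory:2000000\r\n"
    "used_memory_overhead:500000\r\n"
    "used_memory_dataset:1500000\r\n"
)
info = parse_info(sample)
print(overhead_share(info))
```

A high overhead share on an instance with few keys is normal, since the startup allocation dominates; the ratio becomes more meaningful as the data set grows.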

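Key memory: checking the size of a single key

From the command line
① DEBUG OBJECT (the option available before Redis 4.0) gives an estimate through the serializedlength field. That field is the length of the value as serialized for RDB rather than its in-memory size, so it is too far off to be of much use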
```
> get b
"cbd"

> DEBUG OBJECT b
Value at:0x7f24e2b33d40 refcount:1 encoding:embstr serializedlength:4 lru:445248 lru_seconds_idle:3
```
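② MEMORY USAGE key [SAMPLES count] (Redis 4.0 and later) estimates the memory a key actually occupies. Per the Redis documentation, the figure covers the allocations for the data plus the administrative overhead of the key and its value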
```
> get b
"cbd"
> MEMORY USAGE b
(integer) 48
```
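For collection types (everything other than string), MEMORY USAGE samples elements in a way similar to LRU SAMPLES: by default it inspects 5 (count) elements and extrapolates from their average, so the result is an approximation. The number of samples can be set explicitly. For example, take a hash key hkey with 1,000,000 fields whose values have random lengths between 1 and 1024 bytes:
```
> hlen hkey    // hkey has 1,000,000 fields, each value between 1 and 1024 bytes long
(integer) 1000000
> MEMORY usage hkey   // default SAMPLES of 5: hkey is estimated at 521588753 bytes
(integer) 521588753
> MEMORY usage hkey SAMPLES  1000 // SAMPLES of 1000: hkey is estimated at 617977753 bytes
(integer) 617977753
> MEMORY usage hkey SAMPLES  10000 // SAMPLES of 10000: hkey is estimated at 624950853 bytes
(integer) 624950853
```
Because the figure is extrapolated from a sampled average, a larger SAMPLES value gives a more accurate result. Larger is not always better, though: the more elements are sampled, the more CPU time the command takes, since the time complexity of MEMORY USAGE grows with the number of samples. SAMPLES 0 samples every element.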
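The effect of the sample count can be illustrated with a small model. This is a simplified sketch of the sample-and-extrapolate idea, not Redis's actual implementation: it assumes the first `samples` elements are inspected and their average is scaled to the full element count. The function name and the element sizes are hypothetical.

```python
def estimate_total(sizes, samples=5):
    """Estimate the total size of a collection from its first `samples` elements.

    samples=0 means "inspect every element", mirroring SAMPLES 0.
    """
    if not sizes:
        return 0
    if samples == 0 or samples >= len(sizes):
        return sum(sizes)                  # exact: every element is inspected
    sampled = sizes[:samples]
    average = sum(sampled) / len(sampled)
    return int(average * len(sizes))       # extrapolate the average to all elements


# Made-up element sizes in bytes: five small values followed by five large ones.
sizes = [10, 12, 11, 9, 13] + [1000] * 5
print(estimate_total(sizes))               # 5 samples: only the small values are seen
print(estimate_total(sizes, samples=0))    # every element inspected
```

With uneven element sizes the small sample badly underestimates the total, which is the same effect visible in the hkey figures above.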
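Alternatively, analyzing the RDB file with an RDB tool gives the actual memory used by a given key; see the article on the Redis RDB analysis tool rdbtools.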

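Client connections

Note: CLIENT is the command family for working with client connections, for example viewing the current connection ID and connection information. See the official documentation for the full command reference.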

> client help
 1) CLIENT <subcommand> arg arg ... arg. Subcommands are:
 -- Returns the ID of the current connection; usable with KILL
 2) ID                     -- Return the ID of the current connection.
 -- Returns the name of the current connection
 3) GETNAME                -- Return the name of the current connection.
 -- Kills the connection made from the given address
 4) KILL <ip:port>         -- Kill connection made from <ip:port>.
 -- Kills connections matching the options below
 5) KILL <option> <value> [option value ...] -- Kill connections. Options are:
 -- Kill by address
 6)      ADDR <ip:port>                      -- Kill connection made from <ip:port>
 -- Kill by connection type
 7)      TYPE (normal|master|replica|pubsub) -- Kill connections by type.
 -- Kill connections authenticated as the given user
 8)      USER <username>   -- Kill connections authenticated with such user.
 -- Whether to skip the current connection (default: yes)
 9)      SKIPME (yes|no)   -- Skip killing current connection (default: yes).
 -- Returns information about client connections
10) LIST [options ...]     -- Return information about client connections. Options:
 -- Returns only clients of the given type
11)      TYPE (normal|master|replica|pubsub) -- Return clients of specified type.
 -- Suspends all clients for <timeout> milliseconds
12) PAUSE <timeout>        -- Suspend all Redis clients for <timout> milliseconds.
 -- Controls the server's replies to the current client. ON (default): reply; OFF: no replies; SKIP: skip the reply to the next command
13) REPLY (on|off|skip)    -- Control the replies sent to the current connection.
-- Sets a name for the current connection
14) SETNAME <name>         -- Assign the name <name> to the current connection.
-- Unblocks the specified blocked client
15) UNBLOCK <clientid> [TIMEOUT|ERROR] -- Unblock the specified blocked client.
-- Enables key tracking for client-side caching
16) TRACKING (on|off) [REDIRECT <id>] [BCAST] [PREFIX first] [PREFIX second] [OPTIN] [OPTOUT]... -- Enable client keys tracking for client side caching.
-- When tracking is enabled, returns the ID of the client that notifications are redirected to
17) GETREDIR               -- Return the client ID we are redirecting to when tracking is enabled.
-- When tracking is enabled, controls key tracking for the next command executed on the connection (sets a connection state that is valid only for the next command)
18) CACHING (YES|NO)       -- Basically the command sets a state in the connection, that is valid only for the next command execution, that will modify the behavior of client tracking.

CLIENT ID: returns the ID of the current connection

-- IDs increase monotonically and are never repeated
> CLIENT ID
(integer) 701

CLIENT GETNAME: returns the name that CLIENT SETNAME assigned to the current connection
```
> CLIENT GETNAME
"zjy"
```

CLIENT SETNAME: assigns a name to the current connection

-- The name is shown in the name field of CLIENT LIST
> CLIENT SETNAME zjy
OK

CLIENT LIST: returns information and statistics about all clients connected to the server

-- All clients
> CLIENT LIST
id=701 addr=192.168.163.134:52722 fd=16 name=zjy age=1225 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client user=default
id=598 addr=192.168.163.134:36327 fd=12 name= age=9321 idle=1 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf user=replica-user
id=703 addr=192.168.163.1:64583 fd=9 name= age=115 idle=81 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=auth user=dba

-- Only clients of a given type: normal|master|replica|pubsub
> CLIENT LIST type replica
id=598 addr=192.168.163.134:36327 fd=12 name= age=9341 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf user=replica-user

-- Only clients of a given type: normal|master|replica|pubsub
> CLIENT LIST type normal
id=701 addr=192.168.163.134:52722 fd=16 name=zjy age=1254 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=48 qbuf-free=32720 obl=0 oll=0 omem=0 events=r cmd=client user=default
id=703 addr=192.168.163.1:64583 fd=9 name= age=144 idle=110 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=auth user=dba

The output format is as follows:

1. One line per connected client (lines are separated by LF)
2. Each line consists of a series of property=value fields separated by spaces.
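Since every line is a flat list of property=value pairs, the output is easy to post-process. The sketch below is one possible way to do it, assuming the CLIENT LIST text has already been captured as a string. The helper names are hypothetical, and the sample lines are abridged from the output shown earlier.

```python
def parse_client_list(text):
    """Turn CLIENT LIST output into a list of dicts, one per client."""
    clients = []
    for line in text.splitlines():
        if not line.strip():
            continue
        client = {}
        for field in line.split():
            # Split on the first '=' only; a value such as name may be empty.
            key, _, value = field.partition("=")
            client[key] = value
        clients.append(client)
    return clients


def idle_longer_than(clients, seconds):
    """IDs of the clients whose idle time exceeds `seconds`."""
    return [c["id"] for c in clients if int(c["idle"]) > seconds]


# Abridged sample lines (several fields omitted for brevity).
sample = (
    "id=701 addr=192.168.163.134:52722 fd=16 name=zjy age=1225 idle=0 flags=N db=0 cmd=client user=default\n"
    "id=703 addr=192.168.163.1:64583 fd=9 name= age=115 idle=81 flags=N db=0 cmd=auth user=dba\n"
)
clients = parse_client_list(sample)
print(idle_longer_than(clients, 60))
```

The resulting IDs could then be passed to CLIENT KILL, for example to close connections that have been idle for too long.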

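Meaning of each field:

id: unique client ID
addr: address and port of the client
fd: file descriptor of the socket
name: name set with CLIENT SETNAME
age: duration of the connection, in seconds
idle: idle time of the connection, in seconds
flags: client flags:
    -- O: the client is in MONITOR mode
    -- S: the client is a replica connected in normal mode
    -- M: the client is a master
    -- x: the client is in a MULTI/EXEC transaction
    -- b: the client is waiting on a blocking operation
    -- i: the client is waiting for a VM I/O operation (deprecated)
    -- d: a watched key has been modified, so EXEC will fail
    -- c: the connection will be closed after the reply has been written in full
    -- u: the client is unblocked
    -- U: the client is connected through a Unix domain socket
    -- r: the client is in read-only mode against a cluster node
    -- A: the connection will be closed as soon as possible
    -- N: no specific flag is set
db: ID of the database the client is currently using
sub: number of channel subscriptions
psub: number of pattern subscriptions
multi: number of commands queued in a transaction (-1 when the client is not in a transaction)
qbuf: length of the query buffer (in bytes; 0 means no query buffer is allocated)
qbuf-free: free space in the query buffer (in bytes; 0 means the buffer is full)
obl: length of the output buffer (in bytes; 0 means no output buffer is allocated)
oll: number of objects in the output list (replies are queued here as string objects when the output buffer is full)
omem: total memory used by the output buffer
tot-mem: total memory consumed by the client in its various buffers
events: file descriptor events:
    -- r: the client socket is readable (in the event loop)
    -- w: the client socket is writable (in the event loop)
cmd: last command executed
user: user the connection is authenticated as

Note: tot-mem was added in Redis 6.0 and covers all of the client's buffers, whereas omem covers only the output buffer.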

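CLIENT PAUSE: connection control; suspends all clients for the given number of milliseconds (interactions with replicas are not affected)

-- Pause all clients for 10 seconds
> CLIENT PAUSE 10000
OK

Use case:

When an instance needs to be upgraded, the administrator can proceed as follows:

1. Pause all clients with CLIENT PAUSE
2. Wait a few seconds so that the replicas finish processing all replication commands from the master
3. Promote one of the replicas to master
4. Reconfigure the clients to connect to the new master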
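The upgrade steps above can be sketched as a small script. This is only an outline under stated assumptions: `master` and `replica` are assumed to expose an `execute_command(*args)` method, as redis-py client objects do; promotion is assumed to be done with REPLICAOF NO ONE; and the fixed wait stands in for a proper check that the replica has caught up (for example by comparing replication offsets from INFO replication). The function and class names are hypothetical.

```python
import time


def planned_switchover(master, replica, pause_ms=10000, settle_seconds=3,
                       sleep=time.sleep):
    """Pause clients on the master, let the replica catch up, then promote it."""
    # 1. Suspend all normal clients on the master for pause_ms milliseconds.
    master.execute_command("CLIENT", "PAUSE", pause_ms)
    # 2. Give the replica time to apply what it has already received.
    sleep(settle_seconds)
    # 3. Promote the replica (REPLICAOF is available from Redis 5.0).
    replica.execute_command("REPLICAOF", "NO", "ONE")
    # 4. Reconfiguring application clients is deployment-specific and not shown.


class RecordingConnection:
    """Stand-in connection so the sketch runs without a Redis server."""

    def __init__(self):
        self.calls = []

    def execute_command(self, *args):
        self.calls.append(args)
        return "OK"


master, replica = RecordingConnection(), RecordingConnection()
planned_switchover(master, replica, pause_ms=10000, settle_seconds=0)
print(master.calls, replica.calls)
```

The pause duration has to be long enough to cover the wait and the promotion, otherwise clients resume writing to the old master before the switch is complete.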

CLIENT KILL: closes connections selected by attribute, such as id, user, or addr:port

-- List the current clients
> client list
id=598 addr=192.168.163.134:36327 fd=12 name= age=23517 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=replconf user=replica-user
id=801 addr=192.168.163.134:53810 fd=13 name= age=208 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client user=default
id=803 addr=192.168.163.1:54934 fd=9 name= age=14 idle=14 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=auth user=dba
id=804 addr=192.168.163.1:54935 fd=14 name= age=10 idle=7 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=client user=default

-- Kill by address
> CLIENT KILL addr 192.168.163.1:54935
(integer) 1

-- Kill by user
> CLIENT KILL user dba
(integer) 1

-- Kill by connection type
> CLIENT KILL type normal
(integer) 1

-- SKIPME no also kills the connection that issues the command
> CLIENT KILL type normal SKIPME no
(integer) 1

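Note: because Redis is single-threaded, a connection cannot be killed while it is executing a command. From the client's point of view, a connection is never closed in the middle of a command. The client only notices that the connection has been closed when it sends its next command, which fails with a network error.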

CLIENT UNBLOCK: unblocks a client that is blocked in a blocking operation

-- Connection A (blocking connection):
> CLIENT ID
2934
> BRPOP key1 key2 key3 0
(client is blocked)

... Now we want to add a new key ...

-- Connection B (control connection):
> CLIENT UNBLOCK 2934
1

-- Connection A (blocking connection):
... BRPOP reply with timeout ...
NULL

> BRPOP key1 key2 key3 key4 0
(client is blocked again)

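CLIENT REPLY: controls whether the Redis server replies to the current client

-- Options
ON. The default; the server replies to every command
OFF. The server does not reply to the client's commands
SKIP. The server skips the reply to the command that immediately follows

CLIENT REPLY OFF and CLIENT REPLY SKIP return nothing themselves; CLIENT REPLY ON returns OK.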

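CLIENT TRACKING: enables the server's tracking feature, which is used for server-assisted client-side caching
CLIENT CACHING: when tracking is enabled in OPTIN or OPTOUT mode, controls whether the keys in the next command executed on the connection are tracked
Note: client-side caching is new in Redis 6.0. It uses the memory available on the application servers to improve performance. CLIENT TRACKING and CLIENT CACHING track and manage that cache; see the official documentation for details.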

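HELLO protover [AUTH username password] [SETNAME clientname]: switches the protocol (RESP2 or RESP3)

-- Redis 6 and later support two protocols: the older RESP2 and the newer RESP3. Connections start in RESP2 mode, so RESP2 clients need no changes. A client that wants RESP3 calls HELLO with "3" as the first argument.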
> HELLO 3 auth dba dba setname zjy
1# "server" => "redis"
2# "version" => "6.0.3"
3# "proto" => (integer) 3
4# "id" => (integer) 825
5# "mode" => "cluster"
6# "role" => "master"
7# "modules" => (empty array)

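COMMAND: returns details about all Redis commands as an array
COMMAND COUNT: returns the total number of commands on the Redis server

COMMAND GETKEYS: extracts the keys from a complete Redis command

-- Returns the keys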
> COMMAND GETKEYS MSET a b c d e f
1) "a"
2) "c"
3) "e"
> COMMAND GETKEYS SORT mylist ALPHA STORE outlist
1) "mylist"
2) "outlist"

COMMAND INFO command-name [command-name ...]: returns the same details as COMMAND, but only for the commands named

> COMMAND INFO get set
1) 1) "get"
   2) (integer) 2
   3) 1~ readonly
      2~ fast
   4) (integer) 1
   5) (integer) 1
   6) (integer) 1
   7) 1~ @read
      2~ @string
      3~ @fast
2) 1) "set"
   2) (integer) -3
   3) 1~ write
      2~ denyoom
   4) (integer) 1
   5) (integer) 1
   6) (integer) 1
   7) 1~ @write
      2~ @string
      3~ @slow
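In each COMMAND INFO entry, the three integers after the flags are the position of the first key, the position of the last key, and the step between keys. As a rough sketch of how those positions locate keys in a full command (the function name is hypothetical, and the MSET values of 1, -1, 2 are assumed rather than shown in the output above):

```python
def keys_from_spec(argv, first_key, last_key, step):
    """Pick the key names out of a full command using COMMAND INFO positions."""
    if first_key == 0:
        return []                          # the command takes no keys
    # A negative last_key counts back from the end of the argument list.
    last = last_key if last_key >= 0 else len(argv) + last_key
    return [argv[i] for i in range(first_key, last + 1, step)]


# GET reports first=1, last=1, step=1 in the output above.
print(keys_from_spec(["GET", "b"], 1, 1, 1))
# MSET is assumed to report first=1, last=-1, step=2.
print(keys_from_spec(["MSET", "a", "b", "c", "d", "e", "f"], 1, -1, 2))
```

This simple scheme does not cover commands whose key positions depend on their arguments, such as SORT with STORE; for those, COMMAND GETKEYS is the reliable route.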

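CONFIG RESETSTAT: resets some of the counters reported by INFO
DEBUG SEGFAULT: simulates a failure by deliberately crashing the server
LASTSAVE: returns the Unix timestamp of the last successful save to disk, which can be used to check whether a BGSAVE has completed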
```
> LASTSAVE
(integer) 1594473834 
```
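The value returned by LASTSAVE is a Unix timestamp, so checking how long ago the last save happened is a matter of simple arithmetic. A minimal sketch, with a hypothetical function name and a made-up "now" value one hour after the timestamp shown above:

```python
from datetime import datetime, timezone


def last_save_age(lastsave, now):
    """Return (UTC time of the last save, seconds elapsed since then)."""
    saved_at = datetime.fromtimestamp(lastsave, tz=timezone.utc)
    return saved_at, now - lastsave


# 1594473834 is the LASTSAVE reply above; `now` is an illustrative value.
saved_at, age = last_save_age(1594473834, now=1594477434)
print(saved_at.isoformat(), age)
```

Comparing the value before and after issuing BGSAVE shows whether the background save has finished: the timestamp only changes once a save succeeds.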

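SLOWLOG subcommand [argument]: reads and resets the Redis slow log

The recorded time covers only the actual execution of the command: the stage during which the thread is blocked and cannot serve other requests.

> config get *slow*
1) "slowlog-max-len"  -- maximum number of entries kept in the slow log
2) "128"
3) "slowlog-log-slower-than"  -- slow log threshold, in microseconds
4) "10000"
> SLOWLOG get 3       -- show the 3 most recent slow log entries
(empty array)
> SLOWLOG len         -- number of entries in the slow log
(integer) 0
> SLOWLOG reset       -- clear the slow log
OK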
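When the slow log is not empty, each entry returned by SLOWLOG GET starts with an ID, a Unix timestamp, the execution time in microseconds, and the command arguments. The sketch below summarizes such entries against the slowlog-log-slower-than threshold. The function name is hypothetical and the sample entries are invented for illustration, since the log above was empty.

```python
def describe_slowlog(entries, threshold_us=10000):
    """Summarize raw SLOWLOG GET entries, slowest first."""
    rows = []
    for entry in entries:
        entry_id, started_at, duration_us, args = entry[:4]
        rows.append({
            "id": entry_id,
            "command": " ".join(str(a) for a in args),
            "duration_ms": duration_us / 1000,
            # How many times over the configured threshold the command ran.
            "times_threshold": duration_us / threshold_us,
        })
    return sorted(rows, key=lambda r: r["duration_ms"], reverse=True)


# Invented entries: [id, timestamp, microseconds, args, client addr, client name].
sample = [
    [14, 1594473900, 15000, ["KEYS", "*"], "192.168.163.1:64583", ""],
    [13, 1594473850, 42000, ["HGETALL", "hkey"], "192.168.163.134:52722", "zjy"],
]
rows = describe_slowlog(sample)
for row in rows:
    print(row)
```

Sorting by duration rather than by ID puts the most expensive commands first, which is usually the order in which they are worth investigating.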

Tools

redis-cli --help

# /usr/local/redis6.0/bin/redis-cli -h 192.168.163.134 -p 8379 --help
redis-cli 6.0.3

Usage: redis-cli [OPTIONS] [cmd [arg [arg ...]]]
-- Server host
-h <hostname> Server hostname (default: 127.0.0.1).
-- Server port
-p <port> Server port (default: 6379).
-- Server Unix socket
-s <socket> Server socket (overrides hostname and port).
-- Server password
-a <password> Password to use when connecting to the server.
You can also use the REDISCLI_AUTH environment
variable to pass this password more safely
(if both are used, this argument takes predecence).
-- User defined through ACL
--user <username> Used to send ACL style 'AUTH username pass'. Needs -a.
-- Password of the ACL user
--pass <password> Alias of -a for consistency with the new --user option.
-- Prompt for the password
--askpass Force user to input password with mask from STDIN.
If this argument is used, '-a' and REDISCLI_AUTH
environment variable will be ignored.
-- Server URI
-u <uri> Server URI.
-- Number of times to run the given command
-r <repeat> Execute specified command N times.
-- Interval in seconds between runs of the given command
-i <interval> When -r is used, waits <interval> seconds per command.
It is possible to specify sub-second times like -i 0.1.
-- Database number
-n <db> Database number.
-- Use the RESP3 protocol
-3 Start session in RESP3 protocol mode.
-- Read the last argument from standard input
-x Read last argument from STDIN.
-- Delimiter, \n by default
-d <delimiter> Multi-bulk delimiter in for raw formatting (default: \n).
-- Enable cluster mode
-c Enable cluster mode (follow -ASK and -MOVED redirections).
-- Raw reply format
--raw Use raw formatting for replies (default when STDOUT is
not a tty).
-- Formatted output
--no-raw Force formatted output even when STDOUT is not a tty.
-- CSV output
--csv Output in CSV format.
-- Print rolling statistics about the server
--stat Print rolling stats about server: mem, clients, ...
-- Continuous latency sampling. In an interactive session it runs forever and shows real-time statistics. With --raw or --csv, or when the output is redirected to a non-TTY, it samples latency for 1 second (change the interval with -i), prints a single result, and exits
--latency Enter a special mode continuously sampling latency.
If you use this mode in an interactive session it runs
forever displaying real-time stats. Otherwise if --raw or
--csv is specified, or if you redirect the output to a non
TTY, it samples the latency for 1 second (you can use
-i to change the interval), then produces a single output
and exits.
-- Like --latency, but tracks how latency changes over time. The default interval is 15 seconds; change it with -i.
--latency-history Like --latency but tracking latency changes over time.
Default time interval is 15 sec. Change it using -i.
-- Shows latency as a spectrum; requires an xterm with 256 colors. The default interval is 1 second; change it with -i.
--latency-dist Shows latency as a spectrum, requires xterm 256 colors.
Default time interval is 1 sec. Change it using -i.
-- Simulate a cache workload
--lru-test <keys> Simulate a cache workload with an 80-20 distribution.
-- Simulate a replica and show the commands received from the master
--replica Simulate a replica showing commands received from the master.
-- Dump the RDB of a remote server to a local file
--rdb <filename> Transfer an RDB dump from remote server to local file.
-- Send raw Redis protocol from stdin to the server
--pipe Transfer raw Redis protocol from stdin to server.
-- Timeout in --pipe mode, 30 seconds by default.
--pipe-timeout <n> In --pipe mode, abort with error if after sending all data.
no reply is received within <n> seconds.
Default timeout: 30. Use 0 to wait forever.
-- Find the keys with the most elements for each type
--bigkeys Sample Redis keys looking for keys with many elements (complexity).
-- Find the keys that consume the most memory for each type
--memkeys Sample Redis keys looking for keys consuming a lot of memory.
-- Number of elements to sample per key
--memkeys-samples <n> Sample Redis keys looking for keys consuming a lot of memory.
And define number of key elements to sample
-- Find hot keys; works only when maxmemory-policy is one of the *lfu policies
--hotkeys Sample Redis keys looking for hot keys.
only works when maxmemory-policy is *lfu.
-- List all keys with the SCAN command
--scan List all keys using the SCAN command.
-- Used with --scan to specify the scan pattern
--pattern <pat> Useful with --scan to specify a SCAN pattern.
-- Run a test that measures intrinsic system latency for the given number of seconds.
--intrinsic-latency <sec> Run a test to measure intrinsic system latency.
The test will run for the specified amount of seconds.
-- Send an EVAL command using the Lua script at <file>
--eval <file> Send an EVAL command using the Lua script at <file>.
-- Used with --eval to enable the Redis Lua debugger
--ldb Used with --eval enable the Redis Lua debugger.
-- Like --ldb, but with the synchronous Lua debugger: the server is blocked and script changes are not rolled back from server memory
--ldb-sync-mode Like --ldb but uses the synchronous Lua debugger, in
this mode the server is blocked and script changes are
not rolled back from the server memory.
-- Cluster management
--cluster <command> [args...] [opts...]
Cluster Manager command and arguments (see below).
-- Verbose mode
--verbose Verbose mode.
-- Do not show the warning when a password is given on the command line
--no-auth-warning Don't show warning message when using password on command
line interface.
--help Output this help and exit.
--version Output version and exit.

Cluster Manager Commands:
Use --cluster help to list all available cluster manager commands.

-- Examples
Examples:
cat /etc/passwd | redis-cli -x set mypasswd
redis-cli get mypasswd
redis-cli -r 100 lpush mylist x
redis-cli -r 100 -i 1 info | grep used_memory_human:
redis-cli --eval myscript.lua key1 key2 , arg1 arg2 arg3
redis-cli --scan --pattern '*:12345*'

(Note: when using --eval the comma separates KEYS[] from ARGV[] items)

When no command is given, redis-cli starts in interactive mode.
Type "help" in interactive mode for information on available commands
and settings.

Options worth noting:

1: --pipe
echo -e "*3\r\n\$3\r\nset\r\n\$8\r\nstring_a\r\n\$3\r\ndba\r\n"  | /usr/local/redis6.0/bin/redis-cli -h 192.168.163.134 -p 9379 --pipe
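which is equivalent to sending the same bytes directly with nc (with the host and port set to the target instance):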
echo -e "*3\r\n\$3\r\nset\r\n\$8\r\nstring_a\r\n\$3\r\ndba\r\n" | nc 127.0.0.1 15391
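The string passed to echo above is a command in the raw Redis protocol (RESP): an array header with the argument count, followed by each argument as a length-prefixed bulk string. A short sketch of how such a payload can be generated, with a hypothetical function name:

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Each argument is sent as $<byte length>, CRLF, the bytes, CRLF.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


payload = encode_command("set", "string_a", "dba")
print(payload)
```

Writing many such payloads to a file and feeding the file to redis-cli --pipe is a common way to bulk-load data, since the length prefix counts bytes and therefore also handles values containing spaces or newlines.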

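2: --bigkeys
Collects statistics for each key type using SCAN, so there is little risk of blocking Redis. Only strings are ranked by byte length; list, set, zset, and the other types are ranked by element count, so the biggest key reported is not necessarily the one using the most memory

3: --memkeys
Finds the keys that consume the most memory for each type

4: --hotkeys
Works only when an LFU eviction policy is in use

5: --rdb
Dumps the RDB of a remote server to a local file

6: --scan
Iterates over all keys

7: --raw
Prints replies in raw format

8: --cluster
Cluster management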

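Summary

This article outlined what to watch in day-to-day Redis operations, such as memory, connections, RDB, and keys. Further tips will be added here as they come up.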

posted @ 2020-07-13 15:28 jyzhou
