Redis Cache Eviction Policies: A Comparison and Analysis of LRU and LFU

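I. Redis Memory Limit

Redis is an in-memory key-value database. Because system memory is finite, Redis lets us configure the maximum amount of memory it may use.

1. Through the configuration file

Add the following setting to the redis.conf file in the Redis installation directory:

# Limit Redis to at most 100 MB of memory

maxmemory 100mb

2. Through a command

Redis also supports changing the memory limit at runtime with a command:

// Limit Redis to at most 100 MB of memory

127.0.0.1:6379> config set maxmemory 100mb

// Read back the configured maximum memory

127.0.0.1:6379> config get maxmemory

If no maximum is set, or it is set to 0, memory is unlimited on 64-bit systems and capped at 3 GB on 32-bit systems.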
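The value 100mb above uses the size suffixes documented at the top of redis.conf, where k, m and g are powers of 1000, kb, mb and gb are powers of 1024, and case is ignored. As a small illustration, the sketch below converts such a string to bytes. The function name redisSizeToBytes is made up for this example and is not part of Redis or of any PHP client.

```php
<?php
// Convert a redis.conf-style size string ("100mb", "1k", "2GB") to bytes.
// redisSizeToBytes() is a hypothetical helper written for this article.
function redisSizeToBytes(string $size): int {
    $units = [
        ''  => 1,         'b'  => 1,
        'k' => 1000,      'kb' => 1024,
        'm' => 1000 ** 2, 'mb' => 1024 ** 2,
        'g' => 1000 ** 3, 'gb' => 1024 ** 3,
    ];
    // Split the string into its digits and its (case-insensitive) unit suffix.
    $ok = preg_match('/^(\d+)([a-z]*)$/', strtolower(trim($size)), $m);
    if (!$ok || !isset($units[$m[2]])) {
        throw new InvalidArgumentException("Unrecognized size: $size");
    }
    return (int) $m[1] * $units[$m[2]];
}

echo redisSizeToBytes('100mb'), "\n"; // 104857600
```

Note that 100mb and 100m are different limits: the first is 104857600 bytes and the second is 100000000 bytes.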

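II. Memory Eviction in Redis

Since Redis can be given a memory limit, that limit will eventually be reached. What happens when we keep writing data after the configured memory is used up?

Redis defines several policies for this situation:

noeviction (default): write requests are rejected with an error (DEL and a few other special requests excepted)
allkeys-lru: evict using the LRU algorithm across all keys
volatile-lru: evict using the LRU algorithm among keys that have an expiration set
allkeys-random: evict random keys from all keys
volatile-random: evict random keys among those that have an expiration set
volatile-ttl: among keys that have an expiration set, evict by expiration time; the sooner a key expires, the sooner it is evicted

With volatile-lru, volatile-random, or volatile-ttl, if no key is eligible for eviction, Redis returns an error just as noeviction does.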
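To make the difference between the allkeys-* and volatile-* families concrete, here is a minimal sketch of which keys each policy may consider. It is only a model: evictionCandidates and its input format are invented for this example, and it sorts all keys by expiry for volatile-ttl as a simplification rather than reproducing how Redis itself chooses.

```php
<?php
// $keys maps key name => expiry timestamp, or null when the key has no TTL.
// Returns the keys a policy is allowed to evict (hypothetical helper).
function evictionCandidates(string $policy, array $keys): array {
    if ($policy === 'noeviction') {
        return []; // nothing is ever evicted; writes fail instead
    }
    if (strpos($policy, 'volatile-') === 0) {
        // volatile-* policies only look at keys that have an expiration.
        $keys = array_filter($keys, function ($expireAt) {
            return $expireAt !== null;
        });
    }
    if ($policy === 'volatile-ttl') {
        asort($keys); // soonest expiry first
    }
    return array_keys($keys);
}

$keys = ['session' => 200, 'config' => null, 'token' => 100];
echo implode(',', evictionCandidates('allkeys-lru', $keys)), "\n";  // session,config,token
echo implode(',', evictionCandidates('volatile-ttl', $keys)), "\n"; // token,session
```

An empty result for a volatile-* policy corresponds to the case described above, where Redis has nothing it may evict and the write fails with an error.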

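III. Getting and Setting the Eviction Policy

Get the current eviction policy:

127.0.0.1:6379> config get maxmemory-policy

Set the policy in the configuration file (edit redis.conf):

maxmemory-policy allkeys-lru

Change the policy with a command:

127.0.0.1:6379> config set maxmemory-policy allkeys-lru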

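IV. The LRU Algorithm
What is LRU?

As mentioned above, once Redis has used up its maximum memory it can evict data with the LRU algorithm. So what is LRU?
LRU (Least Recently Used) is a cache replacement algorithm. When memory is used as a cache, the cache size is usually fixed. Once the cache is full, adding more data means evicting some old data to free space for the new data, and this is where LRU comes in. Its core idea is that data which has not been used recently is unlikely to be used in the near future, so it can be evicted.
It is typically implemented with a hash map from key to node plus a doubly linked list, where each node stores its data and the links to the previous and next nodes. A newly inserted key goes to the head of the list; if the cache then exceeds its capacity, the key at the tail is removed. Whenever a key is accessed, whether read or updated, it is moved to the head.

A simple LRU implementation in PHP
Source:

https://github.com/rogeriopvl/php-lrucache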

<?php
namespace LRUCache;
/**
 * Class that implements the concept of an LRU Cache
 * using an associative array as a naive hashmap, and a doubly linked list
 * to control the access and insertion order.
 *
 * @author Rogério Vicente
 * @license MIT (see the LICENSE file for details)
 */
class LRUCache {
    // object Node representing the head of the list
    private $head;
    // object Node representing the tail of the list
    private $tail;
    // int the max number of elements the cache supports
    private $capacity;
    // Array representing a naive hashmap (TODO needs to pass the key through a hash function)
    private $hashmap;
    /**
     * @param int $capacity the max number of elements the cache allows
     */
    public function __construct($capacity) {
        $this->capacity = $capacity;
        $this->hashmap = array();
        $this->head = new Node(null, null);
        $this->tail = new Node(null, null);
        $this->head->setNext($this->tail);
        $this->tail->setPrevious($this->head);
    }
    /**
     * Get an element with the given key
     * @param string $key the key of the element to be retrieved
     * @return mixed the content of the element to be retrieved
     */
    public function get($key) {
        if (!isset($this->hashmap[$key])) { return null; }
        $node = $this->hashmap[$key];
        if (count($this->hashmap) == 1) { return $node->getData(); }
        // refresh the access
        $this->detach($node);
        $this->attach($this->head, $node);
        return $node->getData();
    }
    /**
     * Inserts a new element into the cache
     * @param string $key the key of the new element
     * @param string $data the content of the new element
     * @return boolean true on success, false if cache has zero capacity
     */
    public function put($key, $data) {
        if ($this->capacity <= 0) { return false; }
        if (isset($this->hashmap[$key]) && !empty($this->hashmap[$key])) {
            $node = $this->hashmap[$key];
            // update data
            $this->detach($node);
            $this->attach($this->head, $node);
            $node->setData($data);
        }
        else {
            $node = new Node($key, $data);
            $this->hashmap[$key] = $node;
            $this->attach($this->head, $node);
            // check if cache is full
            if (count($this->hashmap) > $this->capacity) {
                // we're full, remove the tail
                $nodeToRemove = $this->tail->getPrevious();
                $this->detach($nodeToRemove);
                unset($this->hashmap[$nodeToRemove->getKey()]);
            }
        }
        return true;
    }
    /**
     * Removes a key from the cache
     * @param string $key key to remove
     * @return bool true if removed, false if not found
     */
    public function remove($key) {
        if (!isset($this->hashmap[$key])) { return false; }
        $nodeToRemove = $this->hashmap[$key];
        $this->detach($nodeToRemove);
        unset($this->hashmap[$nodeToRemove->getKey()]);
        return true;
    }
    /**
     * Adds a node to the head of the list
     * @param Node $head the node object that represents the head of the list
     * @param Node $node the node to move to the head of the list
     */
    private function attach($head, $node) {
        $node->setPrevious($head);
        $node->setNext($head->getNext());
        $node->getNext()->setPrevious($node);
        $node->getPrevious()->setNext($node);
    }
    /**
     * Removes a node from the list
     * @param Node $node the node to remove from the list
     */
    private function detach($node) {
        $node->getPrevious()->setNext($node->getNext());
        $node->getNext()->setPrevious($node->getPrevious());
    }
}
/**
 * Class that represents a node in a doubly linked list
 */
class Node {
    /**
     * the key of the node, this might seem redundant,
     * but without this duplication, we don't have a fast way
     * to retrieve the key of a node when we want to remove it
     * from the hashmap.
     */
    private $key;
    // the content of the node
    private $data;
    // the next node
    private $next;
    // the previous node
    private $previous;
    /**
     * @param string $key the key of the node
     * @param string $data the content of the node
     */
    public function __construct($key, $data) {
        $this->key = $key;
        $this->data = $data;
    }
    /**
     * Sets a new value for the node data
     * @param string $data the new content of the node
     */
    public function setData($data) {
        $this->data = $data;
    }
    /**
     * Sets a node as the next node
     * @param Node $next the next node
     */
    public function setNext($next) {
        $this->next = $next;
    }
    /**
     * Sets a node as the previous node
     * @param Node $previous the previous node
     */
    public function setPrevious($previous) {
        $this->previous = $previous;
    }
    /**
     * Returns the node key
     * @return string the key of the node
     */
    public function getKey() {
        return $this->key;
    }
    /**
     * Returns the node data
     * @return mixed the content of the node
     */
    public function getData() {
        return $this->data;
    }
    /**
     * Returns the next node
     * @return Node the next node of the node
     */
    public function getNext() {
        return $this->next;
    }
    /**
     * Returns the previous node
     * @return Node the previous node of the node
     */
    public function getPrevious() {
        return $this->previous;
    }
}


Suppose the keys are accessed in the order 1, 5, 1, 3, 5, 2, 4, 1, 2.
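To see what strict LRU does with this access sequence, we can trace it with a few lines of PHP. The sketch below is independent of the class above: it relies on PHP arrays keeping insertion order, so the first key is always the least recently used. The capacity of 3 is an assumption made for the example, and lruTrace is a made-up name.

```php
<?php
// Trace strict LRU over an access sequence.
// The capacity used below (3) is an assumption, not a value from the article.
function lruTrace(array $accesses, int $capacity): array {
    $cache = [];   // insertion-ordered: first key = least recently used
    $evicted = [];
    foreach ($accesses as $key) {
        if (array_key_exists($key, $cache)) {
            unset($cache[$key]);            // hit: remove so it can be re-appended
        } elseif (count($cache) >= $capacity) {
            $lru = array_key_first($cache); // miss on a full cache: evict the oldest
            unset($cache[$lru]);
            $evicted[] = $lru;
        }
        $cache[$key] = true;                // (re)insert at the most recent end
    }
    return ['cache' => array_keys($cache), 'evicted' => $evicted];
}

$result = lruTrace([1, 5, 1, 3, 5, 2, 4, 1, 2], 3);
echo implode(',', $result['cache']), "\n";   // 4,1,2 (least to most recently used)
echo implode(',', $result['evicted']), "\n"; // 1,3,5 (in eviction order)
```

Key 1 is evicted when 2 arrives even though it was read twice earlier, because LRU only looks at how recently a key was touched, not how often.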

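V. LRU in Redis
Approximated LRU

Redis uses an approximated LRU algorithm, which differs from textbook LRU. It evicts by random sampling: each time it samples 5 keys (the default) and evicts the least recently used key among them.

The sample size can be changed with the maxmemory-samples parameter, for example: maxmemory-samples 10. The larger maxmemory-samples is, the closer the result is to strict LRU.

To implement approximated LRU, Redis adds an extra 24-bit field to every key that stores the time the key was last accessed.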
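The sampling idea can be sketched in a few lines. This is an illustration of the technique only, not Redis's C implementation: pickVictimBySampling and its input format (a map from key to last-access timestamp) are assumptions made for this example.

```php
<?php
// $lastAccess maps key => last-access timestamp (larger = more recent).
// Picks up to $samples random keys and returns the least recently used of them.
function pickVictimBySampling(array $lastAccess, int $samples = 5): ?string {
    if ($lastAccess === []) {
        return null;
    }
    $n = max(1, min($samples, count($lastAccess)));
    // array_rand() returns a single key when $n is 1, so normalize to an array.
    $picked = (array) array_rand($lastAccess, $n);
    $victim = null;
    foreach ($picked as $key) {
        if ($victim === null || $lastAccess[$key] < $lastAccess[$victim]) {
            $victim = $key;
        }
    }
    return (string) $victim;
}

$lastAccess = ['a' => 30, 'b' => 10, 'c' => 20];
// Sampling every key always finds the true LRU key.
echo pickVictimBySampling($lastAccess, 10), "\n"; // b
```

With a sample size smaller than the key count the result depends on which keys happen to be drawn, which is why a larger maxmemory-samples gets closer to strict LRU at the cost of more work per eviction.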

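Improvements to approximated LRU in Redis 3.0

Redis 3.0 refined the approximated LRU algorithm. The new algorithm maintains a candidate pool of size 16 whose entries are sorted by access time. Keys from the first round of sampling all go into the pool. After that, a sampled key is admitted only if the pool still has room or the key was last accessed earlier than the most recently accessed key in the pool. When the pool is full and a new key is admitted, the entry with the latest access time (the most recently accessed one) is removed.

When an eviction is needed, Redis simply takes the key with the earliest access time (the one idle the longest) from the pool and evicts it.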
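A simplified model of the candidate pool might look like the following. It re-sorts the whole pool on every merge instead of inserting in place, which is fine for 16 entries but is not how Redis does it. The names mergeIntoPool, popVictim and POOL_SIZE are invented for this sketch.

```php
<?php
const POOL_SIZE = 16; // pool size described above

// $pool and $sampled both map key => last-access timestamp.
// Returns the pool after merging in the sampled keys and trimming the overflow.
function mergeIntoPool(array $pool, array $sampled, int $size = POOL_SIZE): array {
    $pool = $sampled + $pool; // on duplicate keys the freshly sampled timestamp wins
    asort($pool);             // earliest access (best eviction candidate) first
    // Keep the $size oldest entries; the most recently accessed ones fall off the end.
    return array_slice($pool, 0, $size, true);
}

// Removes and returns the key that has been idle the longest, or null if empty.
function popVictim(array &$pool): ?string {
    if ($pool === []) {
        return null;
    }
    $victim = array_key_first($pool);
    unset($pool[$victim]);
    return (string) $victim;
}

$pool = mergeIntoPool([], ['a' => 50, 'b' => 10], 3);
$pool = mergeIntoPool($pool, ['c' => 5, 'd' => 40], 3);
echo implode(',', array_keys($pool)), "\n"; // c,b,d
echo popVictim($pool), "\n";                // c
```

Because good candidates survive across sampling rounds, the pool lets a small sample size behave more like a larger one.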

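Comparing the LRU variants

We can compare the accuracy of the LRU variants with an experiment: first add n keys to Redis so that its available memory is used up, then add n/2 new keys. At this point some data must be evicted; under strict LRU, the n/2 keys added first are the ones that should go.
The comparison figure for the LRU variants uses dots in three colors:

Light gray: data that was evicted
Gray: old data that was not evicted
Green: newly added data

Redis 3.0 with 10 samples produces the figure closest to strict LRU. With the same 5 samples, Redis 3.0 also does better than Redis 2.8.

Redis does not use strict LRU because maintaining a doubly linked list over all keys would cost too much memory.

LRU has an obvious weakness: it treats the most recently accessed data as hot data, which is not reasonable. Hot data, as the name suggests, is data that is accessed frequently, and recency and frequency are different concepts.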

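VI. The LFU Algorithm

LFU is an eviction policy added in Redis 4.0. It stands for Least Frequently Used. Its core idea is to evict keys according to how frequently they have been accessed recently: rarely accessed keys are evicted first, and frequently accessed keys are kept.

LFU reflects how hot a key is better than LRU does. With LRU, a key that has not been accessed for a long time but happens to be accessed once just now is treated as hot and is not evicted, while keys that are likely to be accessed in the future may be evicted instead. LFU avoids this, because a single access does not make a key hot. In principle, LFU ranks keys with a counter: each access increments the key's counter, so a larger counter roughly means more frequent access. Entries with the same count are ordered by time.

LFU comes in two policies:

volatile-lfu: evict using the LFU algorithm among keys that have an expiration set
allkeys-lfu: evict using the LFU algorithm across all keys

They are set the same way as the policies above. Note that both are available only in Redis 4.0 and later; setting them on an earlier version returns an error.

A basic LFU queue works as follows:

New data is inserted at the tail of the queue (its reference count is 1);

After an entry in the queue is accessed, its reference count is incremented and the queue is re-sorted;

When data must be evicted, the entry at the end of the sorted queue is removed.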
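The three steps above can be sketched as a small class. This is a textbook-style LFU for illustration, not what Redis does internally: instead of keeping a sorted queue it scans for the entry with the lowest count when it needs a victim, and it breaks ties by evicting the entry touched least recently. SimpleLFU and its methods are names made up for this example.

```php
<?php
class SimpleLFU {
    private $capacity;
    private $entries = []; // key => ['value' => mixed, 'count' => int, 'tick' => int]
    private $tick = 0;     // logical clock, used to break ties between equal counts

    public function __construct(int $capacity) {
        $this->capacity = $capacity;
    }

    public function get($key) {
        if (!isset($this->entries[$key])) { return null; }
        $this->entries[$key]['count']++;              // every access bumps the count
        $this->entries[$key]['tick'] = ++$this->tick;
        return $this->entries[$key]['value'];
    }

    public function put($key, $value): void {
        if ($this->capacity <= 0) { return; }
        if (isset($this->entries[$key])) {
            $this->entries[$key]['value'] = $value;
            $this->get($key);                         // an update counts as an access
            return;
        }
        if (count($this->entries) >= $this->capacity) {
            unset($this->entries[$this->victim()]);
        }
        // New entries start with a reference count of 1.
        $this->entries[$key] = ['value' => $value, 'count' => 1, 'tick' => ++$this->tick];
    }

    // Lowest count loses; among equal counts, the least recently touched entry.
    private function victim() {
        $victim = null;
        foreach ($this->entries as $key => $e) {
            $v = $victim === null ? null : $this->entries[$victim];
            if ($v === null || $e['count'] < $v['count']
                || ($e['count'] === $v['count'] && $e['tick'] < $v['tick'])) {
                $victim = $key;
            }
        }
        return $victim;
    }

    public function keys(): array {
        return array_keys($this->entries);
    }
}

$cache = new SimpleLFU(2);
$cache->put('hot', 'A');
$cache->get('hot');       // 'hot' now has count 2
$cache->put('once', 'B'); // count 1
$cache->put('new', 'C');  // cache is full: 'once' has the lowest count and is evicted
echo implode(',', $cache->keys()), "\n"; // hot,new
```

Under LRU the same sequence would have evicted 'hot', since 'once' was touched more recently.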

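Hit rate

In general LFU is more efficient than LRU, and it avoids the drop in cache hit rate caused by periodic or occasional operations. However, LFU has to record the access history of the data, and once the access pattern changes it needs longer to adapt to the new pattern. In other words, LFU suffers from "cache pollution", where historical data affects future data.

Complexity

A queue must be maintained to record accesses for all data, and every entry must maintain a reference count.

Cost

Access records must be kept for all data, so memory consumption is high; entries must be sorted by reference count, so the performance cost is high.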

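The LFU algorithm has two problems:

1. With LRU we can maintain a doubly linked list and simply move an accessed node to the head, but that does not work for LFU: nodes must stay strictly ordered by counter, so inserting a node or repositioning one can take O(N) time.
2. Simply incrementing a counter is not ideal either. Access patterns change often: a key that is accessed heavily for a while may be accessed rarely afterwards, and a counter that only grows cannot reflect that trend.

The first problem is easy to solve by borrowing from the LRU implementation: maintain a pool of eviction candidates. The second is solved by recording when each key was last accessed and lowering its counter as time passes.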
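The second fix, lowering the counter as time passes, could be sketched like this. The decay rule here (lose one point for every full decay period of idleness, default 60 seconds) is an assumption chosen to keep the example simple, and decayedCounter and recordAccess are hypothetical names. Redis's real counter handling has more to it than this.

```php
<?php
// Counter value after accounting for idle time.
// Assumed rule: subtract 1 for every full $decayPeriod seconds since the last access.
function decayedCounter(int $counter, int $lastAccess, int $now, int $decayPeriod = 60): int {
    $periods = $decayPeriod > 0 ? intdiv(max(0, $now - $lastAccess), $decayPeriod) : 0;
    return max(0, $counter - $periods); // never drops below zero
}

// Apply the decay first, then count the new access and remember when it happened.
// $entry is ['counter' => int, 'lastAccess' => int (Unix seconds)].
function recordAccess(array $entry, int $now, int $decayPeriod = 60): array {
    $entry['counter'] = decayedCounter($entry['counter'], $entry['lastAccess'], $now, $decayPeriod) + 1;
    $entry['lastAccess'] = $now;
    return $entry;
}

// A key with counter 10 that has been idle for 180 seconds has decayed to 7.
echo decayedCounter(10, 0, 180), "\n"; // 7
```

With a rule like this, a key that was once very hot but is no longer accessed gradually drifts back toward zero and becomes an eviction candidate again.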

For more, see: https://www.zhangshengrong.com/p/zD1yQg6b1r/

Reposted from: https://blog.csdn.net/raoxiaoya/article/details/103141022

posted @ 2022-02-09 11:20 琅琊甲乙木
