Redis LRU Source Code Analysis

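LRU (least recently used) evicts data based on its access history. The core idea is that data accessed recently is more likely to be accessed again soon. A textbook implementation keeps the keys in a doubly linked list, which costs a lot of memory, so Redis uses an approximate LRU instead: on every access it stamps the object with the current time (in seconds), and when it needs to evict data it randomly samples a few keys (5 in this walkthrough, configurable through maxmemory-samples) and deletes the one that has gone longest without being accessed.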

First, here is the definition of the Redis object (based on the Redis 2.8.8 source code):

/* The actual Redis Object */
#define REDIS_LRU_BITS 24
#define REDIS_LRU_CLOCK_MAX ((1<<REDIS_LRU_BITS)-1) /* Max value of obj->lru */
#define REDIS_LRU_CLOCK_RESOLUTION 1 /* LRU clock resolution in seconds */
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
    int refcount;
    void *ptr;
} robj;

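REDIS_LRU_CLOCK_RESOLUTION is the resolution of the LRU clock in seconds per tick, 1 second by default. A smaller value means finer resolution, and the finer the resolution, the sooner the 24-bit lru field wraps around, as detailed below.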

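The lru field records the time of the last access. With only 24 bits it cannot hold a full timestamp, so it stores just the low 24 bits of the unix time. At 1-second resolution a 24-bit counter wraps after 2^24 seconds, about 194 days. Cached data is normally accessed far more often than that, so this is sufficient.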
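The 194-day figure follows directly from the field width and the clock resolution. The sketch below works through that arithmetic; the helper names lru_wrap_seconds and lru_wrap_days are made up for illustration and do not exist in the Redis source.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Redis): seconds until a clock that is
 * `bits` wide wraps around, given `resolution` seconds per tick.
 * `bits` must be smaller than 64. */
uint64_t lru_wrap_seconds(unsigned bits, unsigned resolution) {
    return ((uint64_t)1 << bits) * resolution;
}

/* The same period expressed in days (86400 seconds per day). */
double lru_wrap_days(unsigned bits, unsigned resolution) {
    return (double)lru_wrap_seconds(bits, resolution) / 86400.0;
}
```

With 24 bits and a resolution of 1 second, lru_wrap_seconds returns 16777216, which is a little over 194 days. A coarser resolution of 10 seconds per tick would stretch the same 24 bits to roughly ten times that period.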

Next, the code that creates a Redis object:

robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = REDIS_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;

    /* Set the LRU to the current lruclock (minutes resolution). */
    o->lru = server.lruclock;
    return o;
}

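As shown, createObject simply copies server.lruclock into lru. server.lruclock itself is refreshed by serverCron, the periodic timer task that runs in the main event loop 10 times per second by default (it is not a separate thread). Everything else reads this cached value directly, which avoids repeated unix time lookups and saves resources. The code is as follows: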

void updateLRUClock(void) {
    /* Keep only the low 24 bits of the unix time (in resolution units). */
    server.lruclock = (server.unixtime/REDIS_LRU_CLOCK_RESOLUTION) &
                                                REDIS_LRU_CLOCK_MAX;
}
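To make the truncation concrete, here is a minimal standalone sketch of the same computation. The function name lru_clock_from_unixtime and the LRU_BITS / LRU_CLOCK_MAX macros are illustrative stand-ins for the Redis definitions above, not Redis APIs.

```c
#include <stdint.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)   /* 0xFFFFFF */

/* Hypothetical helper (not part of Redis): truncate a unix timestamp in
 * seconds to a 24-bit LRU clock value. `resolution` is the number of
 * seconds per clock tick and must be non-zero. */
uint32_t lru_clock_from_unixtime(uint64_t unixtime, uint32_t resolution) {
    return (uint32_t)((unixtime / resolution) & LRU_CLOCK_MAX);
}
```

Two timestamps that are exactly 2^24 seconds apart map to the same clock value, which is why the idle-time calculation later has to account for wraparound.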

When the object is accessed again later, its lru field is updated:

/*-----------------------------------------------------------------------------
 * C-level DB API
 *----------------------------------------------------------------------------*/

robj *lookupKey(redisDb *db, robj *key) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (server.rdb_child_pid == -1 && server.aof_child_pid == -1) // skip the update while an RDB or AOF child process is running
            val->lru = server.lruclock;
        return val;
    } else {
        return NULL;
    }
}

Next, how a victim is chosen when the LRU policy is triggered:

 /* volatile-lru and allkeys-lru policy */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            {
                for (k = 0; k < server.maxmemory_samples; k++) { // sample count, set by maxmemory-samples (5 in this walkthrough)
                    sds thiskey;
                    long thisval;
                    robj *o;

                    de = dictGetRandomKey(dict);   // pick a random entry
                    thiskey = dictGetKey(de);
                    /* When policy is volatile-lru we need an additional lookup
                     * to locate the real key, as dict is set to db->expires. */
                    if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                        de = dictFind(db->dict, thiskey);
                    o = dictGetVal(de);
                    thisval = estimateObjectIdleTime(o);  // estimated idle time, derived from the object's lru field

                    /* Higher idle time is better candidate for deletion */
                    if (bestkey == NULL || thisval > bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }
            }

/* Given an object returns the min number of seconds the object was never
 * requested, using an approximated LRU algorithm. */
unsigned long estimateObjectIdleTime(robj *o) {  // time elapsed since the object was last accessed, in seconds
    if (server.lruclock >= o->lru) {
        return (server.lruclock - o->lru) * REDIS_LRU_CLOCK_RESOLUTION;
    } else {  // o->lru > server.lruclock: the 24-bit clock has wrapped around since the last access
        return ((REDIS_LRU_CLOCK_MAX - o->lru) + server.lruclock) *
                    REDIS_LRU_CLOCK_RESOLUTION;
    }
}
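The wraparound branch is easier to follow with concrete numbers. The sketch below mirrors the formula of estimateObjectIdleTime as a standalone function; idle_seconds and its parameters are illustrative names, with the clock values passed in explicitly instead of being read from server.lruclock.

```c
#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)

/* Hypothetical standalone version of the idle-time estimate.
 * `now` and `lru` are 24-bit clock values, `resolution` is seconds per tick. */
unsigned long idle_seconds(unsigned now, unsigned lru, unsigned resolution) {
    if (now >= lru) {
        /* No wraparound observed: plain difference. */
        return (unsigned long)(now - lru) * resolution;
    }
    /* The clock wrapped after the last access: count the ticks up to the
     * maximum, then the ticks since the clock restarted from zero. */
    return (unsigned long)((LRU_CLOCK_MAX - lru) + now) * resolution;
}
```

For example, with now = 100 and lru = 40 the result is 60 seconds, and with now = 5 and lru = LRU_CLOCK_MAX - 10 the wraparound branch yields 15 seconds. Note that the estimate assumes at most one wrap: an object left untouched for longer than a full clock period would appear younger than it really is.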

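From the code above, the approximate LRU algorithm in Redis comes down to three points:

1. serverCron, the periodic timer task, refreshes server.lruclock (the low 24 bits of the unix time) 10 times per second.

2. When an object is first created, and on every later access, the current server.lruclock is copied into the object's lru field.

3. While processing commands, Redis checks memory usage. If usage exceeds the limit and an LRU policy is configured, then:

a) Randomly sample maxmemory_samples keys (5 in this walkthrough).

b) Delete the sampled key that has gone longest without being accessed.

c) Check whether used memory still exceeds the limit. If so, go back to step a; otherwise this round of eviction ends.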
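Steps a to c can be sketched as a small self-contained loop. This is only an illustration under simplifying assumptions, not the Redis implementation: the entry struct, the flat array standing in for the keyspace, the caller-maintained byte counter, and the function evict_until_under_limit are all hypothetical, and rand() stands in for dictGetRandomKey.

```c
#include <stddef.h>
#include <stdlib.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)

/* Hypothetical cache entry: last-access clock, size in bytes, liveness flag. */
typedef struct {
    unsigned lru;
    size_t size;
    int live;
} entry;

/* Idle ticks with the same single-wrap handling as estimateObjectIdleTime. */
static unsigned long idle_ticks(unsigned now, unsigned lru) {
    return now >= lru ? (unsigned long)(now - lru)
                      : (unsigned long)((LRU_CLOCK_MAX - lru) + now);
}

/* Evict entries until *used <= limit or nothing is left.
 * Assumes *used is the sum of the sizes of all live entries.
 * Returns the number of evicted entries. */
size_t evict_until_under_limit(entry *tab, size_t n, size_t *used,
                               size_t limit, unsigned now, int samples) {
    size_t evicted = 0, live = 0;
    if (samples <= 0) return 0;
    for (size_t i = 0; i < n; i++) live += tab[i].live ? 1 : 0;

    while (*used > limit && live > 0) {           /* step c: re-check memory */
        entry *best = NULL;
        unsigned long best_idle = 0;
        for (int k = 0; k < samples; k++) {       /* step a: random sample */
            entry *e = &tab[(size_t)rand() % n];
            if (!e->live) continue;               /* slot already evicted */
            unsigned long idle = idle_ticks(now, e->lru);
            if (best == NULL || idle > best_idle) {
                best = e;
                best_idle = idle;
            }
        }
        if (best == NULL) continue;               /* all samples were dead; retry */
        best->live = 0;                           /* step b: evict the most idle */
        *used -= best->size;
        live--;
        evicted++;
    }
    return evicted;
}
```

Because only a handful of keys are compared per round, the evicted key is the most idle of the sample, not necessarily the most idle key overall. A larger sample size moves the behavior closer to true LRU at the cost of more work per eviction.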

Posted on 2020-03-29 21:36 by xinghebuluo
