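9. Redis Source Code: AOF Persistence

An RDB file saves the in-memory data as it exists at one point in time, whereas the AOF log records every write operation the server receives.

The AOF rewrite function and when it is triggered

The function that implements AOF rewriting is rewriteAppendOnlyFileBackground, defined in aof.c. It calls fork to create an AOF rewrite child process, which performs the actual rewrite. rewriteAppendOnlyFileBackground is called from three functions in total.

The first is bgrewriteaofCommand. It is implemented in aof.c and handles the bgrewriteaof command on the Redis server; in other words, it is how an AOF rewrite is triggered manually.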

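The function checks two conditions before it starts a rewrite.

Condition 1: whether an AOF rewrite child process is already running. If so, bgrewriteaofCommand does not start another rewrite.

Condition 2: whether a child process creating an RDB file is running. If so, bgrewriteaofCommand sets the aof_rewrite_scheduled member of the global server variable to 1. This flag marks the AOF rewrite as pending; the rewrite actually runs later, once the conditions are met.

void bgrewriteaofCommand(client *c) {
    if (server.aof_child_pid != -1) {
        /* Condition 1: an AOF rewrite child is already running. */
        addReplyError(c,"Background append only file rewriting already in progress");
    } else if (server.rdb_child_pid != -1) {
        /* Condition 2: an RDB child is running, so defer the rewrite. */
        server.aof_rewrite_scheduled = 1;
        addReplyStatus(c,"Background append only file rewriting scheduled");
    } else if (rewriteAppendOnlyFileBackground() == C_OK) {
        addReplyStatus(c,"Background append only file rewriting started");
    } else {
        addReply(c,shared.err);
    }
}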
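The branch order of bgrewriteaofCommand can be modeled in isolation. This is a minimal sketch rather than Redis code: rewrite_request_outcome and the REWRITE_* values are hypothetical names, the pids are passed as plain ints instead of being read from the server global, and the fork is assumed to succeed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome codes for a BGREWRITEAOF request (not Redis names). */
typedef enum {
    REWRITE_REJECTED,   /* an AOF rewrite child is already running */
    REWRITE_SCHEDULED,  /* an RDB child is running: defer the rewrite */
    REWRITE_STARTED     /* no child is running: start the rewrite now */
} rewrite_outcome;

/* Follows the same branch order as bgrewriteaofCommand. A pid of -1 means
 * "no such child", as in the Redis source. The scheduled flag is only
 * written when the request is deferred. */
rewrite_outcome rewrite_request_outcome(int aof_child_pid, int rdb_child_pid,
                                        int *aof_rewrite_scheduled) {
    assert(aof_rewrite_scheduled != NULL);
    if (aof_child_pid != -1) return REWRITE_REJECTED;
    if (rdb_child_pid != -1) {
        *aof_rewrite_scheduled = 1;
        return REWRITE_SCHEDULED;
    }
    return REWRITE_STARTED;
}
```

Because the AOF child is checked first, a request that arrives while both kinds of child are running is rejected rather than scheduled.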

The second is startAppendOnly. It is also implemented in aof.c, and is itself called by configSetCommand (in config.c) and restartAOFAfterSYNC (in replication.c).

configSetCommand corresponds to enabling AOF at runtime with the config command:

config set appendonly yes

Once AOF is enabled this way, configSetCommand calls startAppendOnly, which performs an AOF rewrite.

restartAOFAfterSYNC is called during master-replica replication. In short, if the replica has AOF turned on, AOF is turned off while the replica loads and parses the RDB file. Then, whether or not the RDB file was loaded successfully, restartAOFAfterSYNC is called to restore the AOF setting. In doing so it calls startAppendOnly, which in turn calls rewriteAppendOnlyFileBackground to perform an AOF rewrite.

 

/* Called when the user switches from "appendonly no" to "appendonly yes"
 * at runtime using the CONFIG command. */
int startAppendOnly(void) {
    char cwd[MAXPATHLEN]; /* Current working dir path for error messages. */
    int newfd;

    newfd = open(server.aof_filename,O_WRONLY|O_APPEND|O_CREAT,0644);
    serverAssert(server.aof_state == AOF_OFF);
    if (newfd == -1) {
    .....
    }
    if (server.rdb_child_pid != -1) {
        server.aof_rewrite_scheduled = 1;
        serverLog(LL_WARNING,"AOF was enabled but there is already a child process saving an RDB file on disk. An AOF background was scheduled to start when possible.");
    } else {
        /* If there is a pending AOF rewrite, we need to switch it off and
         * start a new one: the old one cannot be reused because it is not
         * accumulating the AOF buffer. */
        if (server.aof_child_pid != -1) {
            serverLog(LL_WARNING,"AOF was enabled but there is already an AOF rewriting in background. Stopping background AOF and starting a rewrite now.");
            killAppendOnlyChild();
        }
        if (rewriteAppendOnlyFileBackground() == C_ERR) {
            close(newfd);
            serverLog(LL_WARNING,"Redis needs to enable the AOF but can't trigger a background AOF rewrite operation. Check the above logs for more info about the error.");
            return C_ERR;
        }
    }
    /* We correctly switched on AOF, now wait for the rewrite to be complete
     * in order to append data on disk. */
    server.aof_state = AOF_WAIT_REWRITE;
    server.aof_last_fsync = server.unixtime;
    server.aof_fd = newfd;
    return C_OK;
}
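Unlike bgrewriteaofCommand, startAppendOnly does not refuse to run when an AOF rewrite child already exists; it kills that child and starts a fresh one. The control flow can be sketched as a small state model. Everything prefixed with sk_ is a hypothetical stand-in, and the fork_ok and new_pid parameters simulate the result of starting the background rewrite; this is not the Redis implementation.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the Redis AOF states. */
typedef enum { SK_AOF_OFF, SK_AOF_ON, SK_AOF_WAIT_REWRITE } sk_aof_state;

typedef struct {
    sk_aof_state aof_state;
    int rdb_child_pid;          /* -1 when no RDB child is running */
    int aof_child_pid;          /* -1 when no AOF rewrite child is running */
    int aof_rewrite_scheduled;
    int killed_children;        /* bookkeeping for this sketch only */
} sk_server;

/* Models the branches of startAppendOnly.
 * Returns 0 on success, -1 if the simulated rewrite could not be started. */
int sk_start_append_only(sk_server *s, int fork_ok, int new_pid) {
    assert(s->aof_state == SK_AOF_OFF);
    if (s->rdb_child_pid != -1) {
        /* An RDB child is saving: defer the rewrite. */
        s->aof_rewrite_scheduled = 1;
    } else {
        if (s->aof_child_pid != -1) {
            /* An old rewrite cannot be reused: stop it first. */
            s->aof_child_pid = -1;
            s->killed_children++;
        }
        if (!fork_ok) return -1;    /* state stays SK_AOF_OFF */
        s->aof_child_pid = new_pid;
    }
    /* AOF is switched on, but appends wait for the rewrite to finish. */
    s->aof_state = SK_AOF_WAIT_REWRITE;
    return 0;
}
```

In both successful branches the state becomes SK_AOF_WAIT_REWRITE, which mirrors the point in the source where server.aof_state is set to AOF_WAIT_REWRITE rather than directly to AOF_ON.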

The third is serverCron. While the Redis server is running, serverCron is executed periodically, and on each run it makes two separate checks to decide whether to perform an AOF rewrite.

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    int j;
    UNUSED(eventLoop);
    UNUSED(id);
    UNUSED(clientData);

    .......

    /* Start a scheduled AOF rewrite if this was requested by the user while
     * a BGSAVE was in progress. */
    // Check 1: if there is no RDB child, no AOF rewrite child, and an AOF
    // rewrite has been scheduled, call rewriteAppendOnlyFileBackground.
    if (server.rdb_child_pid == -1 && server.aof_child_pid == -1 &&
        server.aof_rewrite_scheduled)
    {
        rewriteAppendOnlyFileBackground();
    }

    /* Check if a background saving or AOF rewrite in progress terminated. */
    if (server.rdb_child_pid != -1 || server.aof_child_pid != -1 ||
        ldbPendingChildren())
    {
        int statloc;
        pid_t pid;

        if ((pid = wait3(&statloc,WNOHANG,NULL)) != 0) {
            int exitcode = WEXITSTATUS(statloc);
            int bysignal = 0;

            if (WIFSIGNALED(statloc)) bysignal = WTERMSIG(statloc);

            if (pid == -1) {
                serverLog(LL_WARNING,"wait3() returned an error: %s. "
                    "rdb_child_pid = %d, aof_child_pid = %d",
                    strerror(errno),
                    (int) server.rdb_child_pid,
                    (int) server.aof_child_pid);
            } else if (pid == server.rdb_child_pid) {
                backgroundSaveDoneHandler(exitcode,bysignal);
                if (!bysignal && exitcode == 0) receiveChildInfo();
            } else if (pid == server.aof_child_pid) {
                backgroundRewriteDoneHandler(exitcode,bysignal);
                if (!bysignal && exitcode == 0) receiveChildInfo();
            } else {
                if (!ldbRemoveChild(pid)) {
                    serverLog(LL_WARNING,
                        "Warning, detected child with unmatched pid: %ld",
                        (long)pid);
                }
            }
            updateDictResizePolicy();
            closeChildInfoPipe();
        }
    } else {
        /* If there is not a background saving/rewrite in progress check if
         * we have to save/rewrite now. */
        for (j = 0; j < server.saveparamslen; j++) {
            struct saveparam *sp = server.saveparams+j;

            /* Save if we reached the given amount of changes,
             * the given amount of seconds, and if the latest bgsave was
             * successful or if, in case of an error, at least
             * CONFIG_BGSAVE_RETRY_DELAY seconds already elapsed. */
            if (server.dirty >= sp->changes &&
                server.unixtime-server.lastsave > sp->seconds &&
                (server.unixtime-server.lastbgsave_try >
                 CONFIG_BGSAVE_RETRY_DELAY ||
                 server.lastbgsave_status == C_OK))
            {
                serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                    sp->changes, (int)sp->seconds);
                rdbSaveInfo rsi, *rsiptr;
                rsiptr = rdbPopulateSaveInfo(&rsi);
                rdbSaveBackground(server.rdb_filename,rsiptr);
                break;
            }
        }

        /* Trigger an AOF rewrite if needed. */
        // Check 2: if AOF is enabled, no RDB child or AOF rewrite child is
        // running, a growth-percentage threshold is configured, and the AOF
        // file is larger than the minimum size, then go on to test whether
        // the growth percentage exceeds its threshold.
        if (server.aof_state == AOF_ON &&
            server.rdb_child_pid == -1 &&
            server.aof_child_pid == -1 &&
            server.aof_rewrite_perc &&
            server.aof_current_size > server.aof_rewrite_min_size)
        {
            // Base size to compare against (1 if no base size is recorded).
            long long base = server.aof_rewrite_base_size ?
                server.aof_rewrite_base_size : 1;
            // Percentage by which the current size exceeds the base size.
            // This form can look odd; it is algebraically the same as
            // 100*(server.aof_current_size-base)/base.
            long long growth = (server.aof_current_size*100/base) - 100;
            // Rewrite once the growth reaches the configured threshold.
            if (growth >= server.aof_rewrite_perc) {
                serverLog(LL_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
                rewriteAppendOnlyFileBackground();
            }
        }
    }
    .......
}

The two thresholds used in the second check come from these configuration options:

auto-aof-rewrite-percentage: how far the AOF file size exceeds the base size, as a percentage. The default is 100%, that is, the file has grown by one full base size.

auto-aof-rewrite-min-size: the minimum absolute size of the AOF file. The default is 64MB.
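To make the automatic trigger concrete, here is a small worked sketch of the same integer arithmetic. The helper names aof_growth_percent and should_auto_rewrite are hypothetical, and the sizes are passed in as arguments instead of being read from the server global; the child-process and AOF-enabled checks are deliberately left out.

```c
#include <assert.h>

/* Percentage by which `current` exceeds `base`, using the same integer
 * arithmetic as the growth computation in serverCron. A base of 0 is
 * replaced by 1, as in the source, to avoid dividing by zero. */
long long aof_growth_percent(long long current, long long base) {
    assert(current >= 0 && base >= 0);
    if (base == 0) base = 1;
    return (current * 100 / base) - 100;
}

/* Hypothetical helper covering only the size-related conditions:
 * a non-zero percentage threshold, a file larger than the minimum size,
 * and growth at or above the threshold. Returns 1 to rewrite, else 0. */
int should_auto_rewrite(long long current, long long base,
                        long long rewrite_perc, long long min_size) {
    if (rewrite_perc == 0) return 0;        /* automatic rewrite disabled */
    if (current <= min_size) return 0;      /* file still too small */
    return aof_growth_percent(current, base) >= rewrite_perc;
}
```

With the defaults (100 and 64MB), a file that grew from a base of 80MB to 160MB has a growth of 160*100/80 - 100 = 100, so it qualifies. At 120MB the growth is only 50 and it does not. A file that grew from 10MB to 60MB has a growth of 500 but is still below the 64MB minimum, so it is not rewritten either.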

 

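Summary

At this point we have covered the four occasions on which an AOF rewrite is triggered. Here they are again for review.

Trigger 1: the bgrewriteaof command is executed.

Trigger 2: master-replica replication finishes parsing and loading the RDB file (whether or not it succeeded).

Trigger 3: an AOF rewrite has been marked as scheduled.

Trigger 4: AOF is enabled, the AOF file's growth percentage exceeds its threshold, and the AOF file's absolute size exceeds its threshold.

Also note that in all four cases a rewrite only starts when no RDB child process and no AOF rewrite child process is running. Otherwise the rewrite is rejected or deferred through aof_rewrite_scheduled. The one exception is startAppendOnly, which kills an AOF rewrite child that is already running and starts a new one.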

 

The basic process of an AOF rewrite

int rewriteAppendOnlyFileBackground(void) {
   ...
   if ((childpid = fork()) == 0) {  // create the child process
      ...
      // The child calls rewriteAppendOnlyFile to perform the AOF rewrite
      if (rewriteAppendOnlyFile(tmpfile) == C_OK) {
         size_t private_dirty = zmalloc_get_private_dirty(-1);
         ...
         exitFromChild(0);
      } else {
         exitFromChild(1);
      }
   } else { // logic executed by the parent process
      ...
      server.aof_rewrite_scheduled = 0;
      server.aof_rewrite_time_start = time(NULL);
      server.aof_child_pid = childpid; // record the pid of the rewrite child
      updateDictResizePolicy(); // disable rehashing while the child runs
      ...
   }
   ...
}
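The fork pattern above can be tried outside Redis. The following is a minimal POSIX sketch under stated assumptions: sketch_rewrite is a hypothetical stand-in for rewriteAppendOnlyFile that writes a single line, and the parent waits with a blocking waitpid for simplicity, whereas serverCron polls with wait3 and WNOHANG.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for rewriteAppendOnlyFile: writes one line to
 * `path`. Returns 0 on success, -1 on failure. */
static int sketch_rewrite(const char *path) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    int ok = fputs("rewritten\n", fp) >= 0;
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Forks a child that performs the rewrite and exits with 0 on success
 * or 1 on failure. Returns the child's pid in the parent, or -1 if
 * fork itself failed. */
pid_t sketch_rewrite_background(const char *path) {
    assert(path != NULL);
    pid_t childpid = fork();
    if (childpid == 0) {
        /* Child: do the work, then leave without running the parent's
         * exit handlers or flushing its stdio buffers. */
        _exit(sketch_rewrite(path) == 0 ? 0 : 1);
    }
    /* Parent: the caller records this pid, as Redis does in
     * server.aof_child_pid. */
    return childpid;
}

/* Blocks until the child exits. Returns its exit code, or -1 if the
 * wait failed or the child did not exit normally. */
int sketch_wait_child(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent learns the result only from the child's exit status, which is the same channel the real parent uses when it later handles the finished rewrite child.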

 

posted @ 2022-05-27 22:09  chch213