Performance optimization: presetting the length of the BDB log file
The equivalent code in Postgres:
backend/cdb/cdblogsync.c, createZeroFilledNewFile()
/*
 * Zero-fill the file. We have to do this the hard way to ensure that all
 * the file space has really been allocated --- on platforms that allow
 * "holes" in files, just seeking to the end doesn't allocate intermediate
 * space. This way, we know that we have all the space and (after the
 * fsync below) that all the indirect blocks are down on disk. Therefore,
 * fdatasync(2) or O_DSYNC will be sufficient to sync future writes to the
 * log file.
 */
MemSet(zbuffer, 0, sizeof(zbuffer));
Notes taken while reading the code: log_put.c, __log_write()
/*
 * If we're writing the first block in a log file on a filesystem that
 * guarantees unwritten blocks are zero-filled, we set the size of the
 * file in advance. This increases sync performance on some systems,
 * because they don't need to update metadata on every sync.
 *
 * Ignore any error -- we may have run out of disk space, but that's no
 * reason to quit.
 */
#ifdef HAVE_FILESYSTEM_NOTZERO
if (lp->w_off == 0 && !__os_fs_notzero()) {
#else
if (lp->w_off == 0) {
#endif
	(void)__db_file_extend(env, dblp->lfhp, lp->log_size);
	if (F_ISSET(dblp, DBLOG_ZERO))
		(void)__db_zero_extend(env, dblp->lfhp,
		    0, lp->log_size/lp->buffer_size, lp->buffer_size);
}
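My understanding: when the log is flushed with fdatasync, a change in the log file's length still forces the file metadata to be written. Setting the length in advance keeps later writes inside the existing file size, so each fdatasync can skip the metadata update.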
https://linux.die.net/man/2/fdatasync
fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see stat(2)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.
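The pattern described in these notes can be sketched in plain POSIX C. This is only an illustration of the idea, not the Berkeley DB or Postgres implementation: the helper names prealloc_log() and append_record(), the 8 KB buffer, and the file layout are all assumptions made for the example. The file is zero-filled to its final size and fsync'ed once, so that later writes inside that range never change st_size and fdatasync should not need a size-related metadata flush.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` and zero-fill it to `size` bytes so
 * that the space is really allocated (no holes), then fsync once so the
 * data and the metadata, including the final size, reach the disk.
 * Returns an open file descriptor, or -1 on failure.
 */
static int prealloc_log(const char *path, off_t size)
{
    char zbuf[8192];
    off_t done = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(zbuf, 0, sizeof(zbuf));
    while (done < size) {
        size_t want = sizeof(zbuf);
        ssize_t n;

        if ((off_t)want > size - done)
            want = (size_t)(size - done);
        n = write(fd, zbuf, want);
        if (n <= 0) {            /* sketch: treat any short-circuit as failure */
            close(fd);
            return -1;
        }
        done += n;
    }
    if (fsync(fd) != 0) {        /* one full sync after preallocation */
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: overwrite `len` bytes at offset `off`, which is
 * assumed to lie inside the preallocated range, then sync with fdatasync.
 * The file size does not change here.
 */
static int append_record(int fd, off_t off, const void *buf, size_t len)
{
    if (pwrite(fd, buf, len, off) != (ssize_t)len)
        return -1;
    return fdatasync(fd);
}
```

A caller would run prealloc_log() once when a new log file is created and then call append_record() for each flush. Whether this actually saves I/O depends on the filesystem, which is why the Berkeley DB comment says "on some systems".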