初步认识pg_control文件之一
这个据说是PostgreSQL的control file。
到底如何呢,先看看改名后如何,把pg_control文件改名,然后启动 Postgres,运行时得到信息:
[postgres@pg101 bin]$ postgres: could not find the database system Expected to find it in the directory "/usr/local/pgsql/bin/../data", but could not open file "/usr/local/pgsql/bin/../data/global/pg_control": No Such file or Directory
对应的源代码,在postmater.c的 checkDataDir方法中:
snprintf(path, sizeof(path), "%s/global/pg_control", DataDir); fp = AllocateFile(path, PG_BINARY_R); if (fp == NULL) { write_stderr("%s: could not find the database system\n" "Expected to find it in the directory \"%s\",\n" "but could not open file \"%s\": %s\n", progname, DataDir, path, strerror(errno)); ExitPostmaster(2); } FreeFile(fp);
将 pg_control文件改回原来的名字后,重新启动PostgreSQL数据库,没有问题。
而在main.c中,有如下代码:
从注释中可以看到,数据库中初始化后,会有LC_CTYPE/LC_COLLATE等信息已经写入到pg_control文件中。
/* * Set up locale information from environment. Note that LC_CTYPE and * LC_COLLATE will be overridden later from pg_control if we are in an * already-initialized database. We set them here so that they will be * available to fill pg_control during initdb. LC_MESSAGES will get set * later during GUC option processing, but we set it here to allow startup * error messages to be localized. */ set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("postgres"));
在 src/backend/access/transam/xlog.c 中,有如下代码:
/* * We maintain an image of pg_control in shared memory. */ static ControlFileData *ControlFile = NULL;
可见,与pg_control文件相对应,在内存中保留着一个内存结构。
它长得是这个样子:
/* * Contents of pg_control. * * NOTE: try to keep this under 512 bytes so that it will fit on one physical * sector of typical disk drives. This reduces the odds of corruption due to * power failure midway through a write. */ typedef struct ControlFileData { /* * Unique system identifier --- to ensure we match up xlog files with the * installation that produced them. */ uint64 system_identifier; /* * Version identifier information. Keep these fields at the same offset, * especially pg_control_version; they won't be real useful if they move * around. (For historical reasons they must be 8 bytes into the file * rather than immediately at the front.) * * pg_control_version identifies the format of pg_control itself. * catalog_version_no identifies the format of the system catalogs. * * There are additional version identifiers in individual files; for * example, WAL logs contain per-page magic numbers that can serve as * version cues for the WAL log. */ uint32 pg_control_version; /* PG_CONTROL_VERSION */ uint32 catalog_version_no; /* see catversion.h */ /* * System status data */ DBState state; /* see enum above */ pg_time_t time; /* time stamp of last pg_control update */ XLogRecPtr checkPoint; /* last check point record ptr */ XLogRecPtr prevCheckPoint; /* previous check point record ptr */ CheckPoint checkPointCopy; /* copy of last check point record */ /* * These two values determine the minimum point we must recover up to * before starting up: * * minRecoveryPoint is updated to the latest replayed LSN whenever we * flush a data change during archive recovery. That guards against * starting archive recovery, aborting it, and restarting with an earlier * stop location. If we've already flushed data changes from WAL record X * to disk, we mustn't start up until we reach X again. Zero when not * doing archive recovery. * * backupStartPoint is the redo pointer of the backup start checkpoint, if * we are recovering from an online backup and haven't reached the end of * backup yet. It is reset to zero when the end of backup is reached, and * we mustn't start up before that. A boolean would suffice otherwise, but * we use the redo pointer as a cross-check when we see an end-of-backup * record, to make sure the end-of-backup record corresponds the base * backup we're recovering from. */ XLogRecPtr minRecoveryPoint; XLogRecPtr backupStartPoint; /* * Parameter settings that determine if the WAL can be used for archival * or hot standby. */ int wal_level; int MaxConnections; int max_prepared_xacts; int max_locks_per_xact; /* * This data is used to check for hardware-architecture compatibility of * the database and the backend executable. We need not check endianness * explicitly, since the pg_control version will surely look wrong to a * machine of different endianness, but we do need to worry about MAXALIGN * and floating-point format. (Note: storage layout nominally also * depends on SHORTALIGN and INTALIGN, but in practice these are the same * on all architectures of interest.) * * Testing just one double value is not a very bulletproof test for * floating-point compatibility, but it will catch most cases. */ uint32 maxAlign; /* alignment requirement for tuples */ double floatFormat; /* constant 1234567.0 */ #define FLOATFORMAT_VALUE 1234567.0 /* * This data is used to make sure that configuration of this database is * compatible with the backend executable. */ uint32 blcksz; /* data block size for this DB */ uint32 relseg_size; /* blocks per segment of large relation */ uint32 xlog_blcksz; /* block size within WAL files */ uint32 xlog_seg_size; /* size of each WAL segment */ uint32 nameDataLen; /* catalog name field width */ uint32 indexMaxKeys; /* max number of columns in an index */ uint32 toast_max_chunk_size; /* chunk size in TOAST tables */ /* flag indicating internal format of timestamp, interval, time */ bool enableIntTimes; /* int64 storage enabled? */ /* flags indicating pass-by-value status of various types */ bool float4ByVal; /* float4 pass-by-value? */ bool float8ByVal; /* float8, int8, etc pass-by-value? */ /* CRC of all above ... MUST BE LAST! */ pg_crc32 crc; } ControlFileData;
然后,一个一个地看吧。