Some Details of the Unix DBM Implementation
key-value database
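I had always assumed that "key-value database" was a very new concept, but Wikipedia makes clear that early Unix systems already shipped with a key-value database, dbm (database manager). More surprising still, its author is once again the famous Ken Thompson. It seems that some concepts that look new were in fact proposed long ago; the hardware limits of the time (CPU speed, memory size, disk access speed and so on) kept them from being implemented, or from becoming popular.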
The Unix system provides dbm (database manager), which is a 1979 library originally written by Ken Thompson. It is also ported to Microsoft Windows, provided through programming languages such as Perl for Win32. The dbm manages associative arrays of arbitrary data by use of a single key (a primary key). Modern implementations include sdbm, GNU dbm, and Berkeley DB. Although dbm precedes the concept of a NoSQL and is rarely mentioned in modern discourse, it is used by many pieces of software.
- key-value
///@file: gdbm\src\gdbm.h.in
/* The data and key structure. */
typedef struct
{
  char *dptr;
  int dsize;
} datum;
- Description
The string and stringz are special. Both define a string of bytes, similar to ‘char x[]’ in C. The former defines an array of bytes, the latter - a null-terminated string. This makes a difference, in particular, when the string is the only part of datum. Consider the following two definitions:
define key string      (1)
define key stringz     (2)
Now, suppose we want to store the string "ab" in the key. Using the definition (1), the dptr member of GDBM datum will contain two bytes: ‘a’, and ‘b’. Consequently, the dsize member will have the value 2. Using the definition (2), the dptr member will contain three bytes: ‘a’, ‘b’, and ASCII 0. The dsize member will have the value 3.
The definition (1) is the default for both key and content.
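The difference between the two definitions can be sketched in C. This is only an illustration: the `datum` typedef is redeclared to mirror gdbm.h so the snippet builds on its own, and `make_string_datum` and `make_stringz_datum` are hypothetical helper names, not gdbm functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as the datum in gdbm.h, redeclared so the sketch builds alone. */
typedef struct
{
  char *dptr;
  int dsize;
} datum;

/* "string" semantics: copy only the bytes of S, without a terminator. */
datum
make_string_datum (const char *s)
{
  datum d;

  d.dsize = (int) strlen (s);
  d.dptr = malloc (d.dsize ? (size_t) d.dsize : 1);
  assert (d.dptr != NULL);
  memcpy (d.dptr, s, (size_t) d.dsize);
  return d;
}

/* "stringz" semantics: copy the bytes of S plus the trailing NUL. */
datum
make_stringz_datum (const char *s)
{
  datum d;

  d.dsize = (int) strlen (s) + 1;
  d.dptr = malloc ((size_t) d.dsize);
  assert (d.dptr != NULL);
  memcpy (d.dptr, s, (size_t) d.dsize);
  return d;
}
```

For "ab", the first helper yields a dsize of 2 and the second a dsize of 3, so the two keys are different byte strings and would not match each other in a lookup.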
The second form of the define statement is similar to the C struct statement and allows for defining structural data. In this form, the definition part is a comma-separated list of data types and variables enclosed in curly braces. In contrast to the rest of gdbm commands, this command is inherently multiline and is terminated with the closing curly brace. For example:
define content {
        int status,
        pad 8,
        char id[3],
        string name
}
This defines a structure consisting of three members: an integer status, an array of 3 bytes id, and an array of bytes name. Notice the pad statement: it allows to introduce padding between structure members. Another useful statement is offset: it specifies that the member following it begins at the given offset in the structure. Assuming the size of int is 8 bytes, the above definition can also be written as
define content {
        int status,
        offset 16,
        char id[3],
        string name
}
NOTE: The string type can reasonably be used only if it is the last or the only member of the data structure. That’s because it provides no information about the number of elements in the array, so it is interpreted to contain all bytes up to the end of the datum.
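To see why the pad and offset forms describe the same layout, here is a small C sketch that walks a list of segments and records where each field starts. The `seg_kind` enum, `struct seg`, and `layout_offsets` are hypothetical names made up for this illustration; they only imitate the idea and are not gdbmtool's own types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment kinds, loosely modeled on the field, pad and
   offset statements of "define". */
enum seg_kind { SEG_FIELD, SEG_PAD, SEG_OFFSET };

struct seg
{
  enum seg_kind kind;
  size_t n;          /* field size in bytes, pad width, or absolute offset */
};

/* Walk SEGS and store the start offset of each field in FIELD_OFF
   (at most MAX_FIELDS entries).  Returns the number of fields seen. */
size_t
layout_offsets (const struct seg *segs, size_t nsegs,
                size_t *field_off, size_t max_fields)
{
  size_t off = 0, nfields = 0, i;

  assert (segs != NULL || nsegs == 0);
  for (i = 0; i < nsegs; i++)
    {
      switch (segs[i].kind)
        {
        case SEG_FIELD:
          if (nfields < max_fields)
            field_off[nfields] = off;
          nfields++;
          off += segs[i].n;
          break;

        case SEG_PAD:
          off += segs[i].n;     /* skip N bytes */
          break;

        case SEG_OFFSET:
          off = segs[i].n;      /* jump to absolute offset N */
          break;
        }
    }
  return nfields;
}
```

With an 8-byte int, as the manual assumes, both `{field 8, pad 8, field 3, string}` and `{field 8, offset 16, field 3, string}` place id at offset 16 and name at offset 19.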
static int
datum_scan_tag (datum *dat, struct dsegm *ds, struct kvpair *kvlist)
{
  struct xdatum xd;
  int err = 0;
  struct kvpair *kv;

  /* Check keywords for consistency */
  for (kv = kvlist; kv; kv = kv->next)
    {
      if (!kv->key)
        {
          lerror (&kv->loc,
                  _("mixing tagged and untagged values is not allowed"));
          return 1;
        }
      if (!dsegm_list_find (ds, kv->key))
        {
          lerror (&kv->loc, _("%s: no such field in datum"), kv->key);
          return 1;
        }
    }

  /* Initialize datum */
  memset (&xd, 0, sizeof (xd));

  for (; err == 0 && ds; ds = ds->next)
    {
      switch (ds->type)
        {
        case FDEF_FLD:
          kv = kvlist_find (kvlist, ds->v.field.name);
          if (kv)
            err = dsconv (&xd, ds, kv);
          else
            {
              /* No value given for this field: just reserve its space. */
              size_t sz = ds->v.field.type->size * ds->v.field.dim;
              xd_expand (&xd, xd.off + sz);
              xd.off += sz;
            }
          break;

        case FDEF_OFF:
          xd_expand (&xd, ds->v.n);
          xd.off = ds->v.n;
          break;

        case FDEF_PAD:
          xd_expand (&xd, xd.off + ds->v.n);
          xd.off += ds->v.n;
          break;
        }
    }

  if (err)
    {
      free (xd.dptr);
      return 1;
    }

  dat->dptr = xd.dptr;
  dat->dsize = xd.dsize;

  return 0;
}

int
datum_scan (datum *dat, struct dsegm *ds, struct kvpair *kv)
{
  return (kv->key ? datum_scan_tag : datum_scan_notag) (dat, ds, kv);
}
Extendible Hashing
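For its in-memory storage, dbm uses the scheme described in "Extendible Hashing - A Fast Access Method for Dynamic Files". Conceptually the structure is not complicated, but the Python code on the wiki is quite confusing to read, mainly because the split logic seems to be missing a necessary copy step (of course, it may simply be that I am not familiar enough with Python to see why the code is correct).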
void Directory::grow(void)
{
    for(int i = 0 ; i < 1<<global_depth ; i++ )
        // ...
}
void Directory::insert(int key,string value,bool reinserted)
{
    int bucket_no = hash(key);
    int status = buckets[bucket_no]->insert(key,value);
    // ...
        cout<<"Inserted key "<<key<<" in bucket "<<bucket_id(bucket_no)<<endl;
    // ...
        cout<<"Moved key "<<key<<" to bucket "<<bucket_id(bucket_no)<<endl;
    // ...
    else if(status==0)
    // ...
        cout<<"Key "<<key<<" already exists in bucket "<<bucket_id(bucket_no)<<endl;
}
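To make the copy step explicit, here is a minimal extendible-hashing sketch in C. It is not the wiki's Python code and not gdbm's implementation: all names (`struct bucket`, `struct directory`, `dir_grow`, `dir_split`, `dir_insert`, `dir_contains`) are made up for this illustration, keys are assumed to be non-negative ints indexed by their low bits, and nothing is ever freed.

```c
#include <assert.h>
#include <stdlib.h>

#define BUCKET_CAP 2            /* tiny capacity so that splits happen early */

struct bucket
{
  int local_depth;              /* number of low key bits this bucket owns */
  int count;
  int keys[BUCKET_CAP];
};

struct directory
{
  int global_depth;             /* the directory has 1 << global_depth slots */
  struct bucket **slot;
};

static struct bucket *bucket_new (int depth)
{
  struct bucket *b = calloc (1, sizeof *b);
  assert (b != NULL);
  b->local_depth = depth;
  return b;
}

void dir_init (struct directory *d)
{
  d->global_depth = 0;
  d->slot = malloc (sizeof *d->slot);
  assert (d->slot != NULL);
  d->slot[0] = bucket_new (0);
}

/* Double the directory.  The new upper half is a copy of the lower half,
   so every bucket is now referenced from twice as many slots. */
static void dir_grow (struct directory *d)
{
  size_t old = (size_t) 1 << d->global_depth, i;

  d->slot = realloc (d->slot, 2 * old * sizeof *d->slot);
  assert (d->slot != NULL);
  for (i = 0; i < old; i++)
    d->slot[old + i] = d->slot[i];
  d->global_depth++;
}

/* Split the full bucket B into B and a new bucket, both one bit deeper. */
static void dir_split (struct directory *d, struct bucket *b)
{
  size_t bit, i, n;
  struct bucket *hi;
  int k, cnt = b->count;

  if (b->local_depth == d->global_depth)
    dir_grow (d);
  bit = (size_t) 1 << b->local_depth;
  hi = bucket_new (++b->local_depth);

  /* Redistribute the keys by the newly significant bit. */
  b->count = 0;
  for (k = 0; k < cnt; k++)
    {
      struct bucket *dst = ((size_t) b->keys[k] & bit) ? hi : b;
      dst->keys[dst->count++] = b->keys[k];
    }
  /* Repoint the slots that referenced B and have the new bit set. */
  n = (size_t) 1 << d->global_depth;
  for (i = 0; i < n; i++)
    if (d->slot[i] == b && (i & bit))
      d->slot[i] = hi;
}

static struct bucket *dir_find (const struct directory *d, int key)
{
  return d->slot[(size_t) key & (((size_t) 1 << d->global_depth) - 1)];
}

int dir_contains (const struct directory *d, int key)
{
  const struct bucket *b = dir_find (d, key);
  int k;

  for (k = 0; k < b->count; k++)
    if (b->keys[k] == key)
      return 1;
  return 0;
}

/* Insert a non-negative KEY.  Returns 1 if inserted, 0 if already present. */
int dir_insert (struct directory *d, int key)
{
  struct bucket *b;

  if (dir_contains (d, key))
    return 0;
  while ((b = dir_find (d, key))->count == BUCKET_CAP)
    dir_split (d, b);           /* bucket full: split, then look again */
  b->keys[b->count++] = key;
  return 1;
}
```

Two different copies are involved: `dir_grow` copies only pointers when the directory doubles, while `dir_split` moves the keys themselves between the old bucket and the new one. Inserting 0, 4 and 8 into a fresh directory forces three splits in a row, because those keys agree in their two lowest bits.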
///@file: gdbm\src\bucket.c
/* Split the current bucket. This includes moving all items in the bucket to
a new bucket. This doesn't require any disk reads because all hash values
are stored in the buckets. Splitting the current bucket may require
doubling the size of the hash directory. */
int
_gdbm_split_bucket (GDBM_FILE dbf, int next_insert)
{
  off_t old_adr[GDBM_HASH_BITS];  /* Address of the old directories. */
  int old_size[GDBM_HASH_BITS];   /* Size of the old directories. */
  int old_count;                  /* Number of old directories. */

  int index;                      /* Used in array indexing. */
  int index1;                     /* Used in array indexing. */

  /* No directories are yet old. */
  old_count = 0;

  while (dbf->bucket->count == dbf->header->bucket_elems)
    {
      int new_bits;             /* The number of bits for the new buckets. */
      cache_elem *newcache[2];  /* Location in the cache for the buckets. */
      off_t adr_0;              /* File address of the new bucket 0. */
      off_t adr_1;              /* File address of the new bucket 1. */
      avail_elem old_bucket;    /* Avail Struct for the old bucket. */

      off_t dir_start0;         /* Used in updating the directory. */
      off_t dir_start1;
      off_t dir_end;

      new_bits = dbf->bucket->bucket_bits + 1;

      /*
       * Allocate two new buckets.  They will be populated with the entries
       * from the current bucket (cache_mru->bucket), so make sure that
       * cache_mru remains unchanged until both buckets are fully formed.
       * Newly allocated buckets must be linked right after cache_mru, so
       * that all changed buckets form a contiguous sequence at the beginning
       * of the cache list (this is needed by _gdbm_cache_flush).
       */
      adr_0 = _gdbm_alloc (dbf, dbf->header->bucket_size);
      switch (cache_lookup (dbf, adr_0, dbf->cache_mru, &newcache[0]))
        {
        case cache_new:
          break;

        case cache_found:
          /* should not happen */
          GDBM_DEBUG (GDBM_DEBUG_ERR,
                      "%s: bucket found where it should not",
                      dbf->name);
          return -1;

        case cache_failure:
          return -1;
        }
      _gdbm_new_bucket (dbf, newcache[0]->ca_bucket, new_bits);

      adr_1 = _gdbm_alloc (dbf, dbf->header->bucket_size);
      switch (cache_lookup (dbf, adr_1, newcache[0], &newcache[1]))
        {
        case cache_new:
          break;

        case cache_found:
          /* should not happen */
          GDBM_DEBUG (GDBM_DEBUG_ERR,
                      "%s: bucket found where it should not",
                      dbf->name);
          return -1;

        case cache_failure:
          return -1;
        }
      _gdbm_new_bucket (dbf, newcache[1]->ca_bucket, new_bits);

      /* Double the directory size if necessary. */
      if (dbf->header->dir_bits == dbf->bucket->bucket_bits)
        {
          off_t *new_dir;       /* Pointer to the new directory. */
          int dir_size;         /* Size of the new directory. */
          off_t dir_adr;        /* Address of the new directory. */

          if (dbf->header->dir_size >= GDBM_MAX_DIR_HALF)
            {
              _gdbm_fatal (dbf, _("directory overflow"));
              return -1;
            }
          dir_size = dbf->header->dir_size * 2;
          dir_adr = _gdbm_alloc (dbf, dir_size);
          if (dir_adr == 0)
            return -1;
          new_dir = malloc (dir_size);
          if (new_dir == NULL)
            {
              _gdbm_fatal (dbf, _("malloc error"));
              return -1;
            }

          /* Each old directory entry is copied into two adjacent slots. */
          for (index = 0; index < GDBM_DIR_COUNT (dbf); index++)
            {
              new_dir[2*index] = dbf->dir[index];
              new_dir[2*index+1] = dbf->dir[index];
            }

          /* Update header. */
          old_adr[old_count] = dbf->header->dir;
          dbf->header->dir = dir_adr;
          old_size[old_count] = dbf->header->dir_size;
          dbf->header->dir_size = dir_size;
          dbf->header->dir_bits = new_bits;
          old_count++;

          /* Now update dbf. */
          dbf->header_changed = TRUE;
          dbf->bucket_dir *= 2;
          free (dbf->dir);
          dbf->dir = new_dir;
        }

      /* Copy all elements in dbf->bucket into the new buckets. */
      for (index = 0; index < dbf->header->bucket_elems; index++)
        {
          bucket_element *old_el = &dbf->bucket->h_table[index];
          hash_bucket *bucket;
          int elem_loc;

          if (old_el->hash_value < 0)
            return -1;

          bucket =
            newcache[(old_el->hash_value >> (GDBM_HASH_BITS - new_bits)) & 1]->ca_bucket;

          elem_loc = old_el->hash_value % dbf->header->bucket_elems;
          while (bucket->h_table[elem_loc].hash_value != -1)
            elem_loc = (elem_loc + 1) % dbf->header->bucket_elems;
          bucket->h_table[elem_loc] = *old_el;
          bucket->count++;
        }

      /* Allocate avail space for the newcache[1]->ca_bucket. */
      newcache[1]->ca_bucket->bucket_avail[0].av_adr
        = _gdbm_alloc (dbf, dbf->header->block_size);
      if (newcache[1]->ca_bucket->bucket_avail[0].av_adr == 0)
        return -1;
      newcache[1]->ca_bucket->bucket_avail[0].av_size
        = dbf->header->block_size;
      newcache[1]->ca_bucket->av_count = 1;

      /* Copy the avail elements in dbf->bucket to newcache[0]->ca_bucket. */
      newcache[0]->ca_bucket->av_count = dbf->bucket->av_count;
      index = 0;
      if (newcache[0]->ca_bucket->av_count == BUCKET_AVAIL)
        {
          /* The avail is full, move the first one to newcache[1]->ca_bucket.*/
          _gdbm_put_av_elem (dbf->bucket->bucket_avail[0],
                             newcache[1]->ca_bucket->bucket_avail,
                             &newcache[1]->ca_bucket->av_count,
                             dbf->coalesce_blocks);
          index = 1;
          newcache[0]->ca_bucket->av_count--;
        }

      index1 = 0;
      for (; index < dbf->bucket->av_count; index++)
        {
          newcache[0]->ca_bucket->bucket_avail[index1++]
            = dbf->bucket->bucket_avail[index];
        }

      /* Update the directory.  We have new file addresses for both buckets. */
      dir_start1 = (dbf->bucket_dir >> (dbf->header->dir_bits - new_bits)) | 1;
      dir_end = (dir_start1 + 1) << (dbf->header->dir_bits - new_bits);
      dir_start1 = dir_start1 << (dbf->header->dir_bits - new_bits);
      dir_start0 = dir_start1 - (dir_end - dir_start1);
      for (index = dir_start0; index < dir_start1; index++)
        dbf->dir[index] = adr_0;
      for (index = dir_start1; index < dir_end; index++)
        dbf->dir[index] = adr_1;

      /* Set changed flags. */
      newcache[0]->ca_changed = TRUE;
      newcache[1]->ca_changed = TRUE;
      dbf->directory_changed = TRUE;

      /* Update the cache! */
      dbf->bucket_dir = _gdbm_bucket_dir (dbf, next_insert);

      /* Invalidate old cache entry. */
      old_bucket.av_adr = dbf->cache_mru->ca_adr;
      old_bucket.av_size = dbf->header->bucket_size;
      cache_elem_free (dbf, dbf->cache_mru);

      /* Set dbf->bucket to the proper bucket. */
      if (dbf->dir[dbf->bucket_dir] != adr_0)
        {
          cache_elem *t = newcache[0];
          newcache[0] = newcache[1];
          newcache[1] = t;
        }

      _gdbm_put_av_elem (old_bucket,
                         newcache[1]->ca_bucket->bucket_avail,
                         &newcache[1]->ca_bucket->av_count,
                         dbf->coalesce_blocks);

      lru_unlink_elem (dbf, newcache[0]);
      lru_link_elem (dbf, newcache[0], NULL);
    }

  /* Get rid of old directories. */
  for (index = 0; index < old_count; index++)
    if (_gdbm_free (dbf, old_adr[index], old_size[index]))
      return -1;

  return 0;
}
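The directory update is the densest part of the function, so it may help to pull the arithmetic out on its own. The sketch below copies the four dir_start/dir_end expressions and the bucket-selection expression into two standalone C functions. `split_ranges`, `split_side`, `struct dir_range`, and the `HASH_BITS` constant are names chosen for this illustration; `HASH_BITS` is assumed to be 31, matching what I believe GDBM_HASH_BITS to be.

```c
#include <assert.h>

#define HASH_BITS 31   /* assumption: same value as GDBM_HASH_BITS */

struct dir_range
{
  long start0;         /* slots [start0, start1) point to new bucket 0 */
  long start1;         /* slots [start1, end) point to new bucket 1 */
  long end;
};

/* BUCKET_DIR is any directory index of the bucket being split, DIR_BITS
   the directory depth and NEW_BITS the depth of the two new buckets. */
struct dir_range
split_ranges (long bucket_dir, int dir_bits, int new_bits)
{
  struct dir_range r;
  int shift = dir_bits - new_bits;

  assert (shift >= 0);
  r.start1 = (bucket_dir >> shift) | 1;  /* prefix of the "1" half */
  r.end = (r.start1 + 1) << shift;
  r.start1 = r.start1 << shift;
  r.start0 = r.start1 - (r.end - r.start1);
  return r;
}

/* Which new bucket (0 or 1) an element goes to: the NEW_BITS-th bit of
   its hash value, counted from the top of the HASH_BITS-bit hash. */
int
split_side (int hash_value, int new_bits)
{
  return (hash_value >> (HASH_BITS - new_bits)) & 1;
}
```

For example, with dir_bits 3, new_bits 2 and bucket_dir 5 (binary 101), the old bucket covers slots 4..7; the ranges come out as [4, 6) for bucket 0 and [6, 8) for bucket 1. Note that gdbm indexes the directory by the high bits of the hash, which is why a split divides a contiguous run of slots in half.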
tsecer@harry: tools/gdbmtool
Welcome to the gdbm tool. Type ? for help.
gdbmtool> define key { int pid, int tid}
gdbmtool> define content { string name, int age}
gdbmtool> store {1, 1} { tsecer, 3}
gdbmtool> fetch {1, 1}
name=tsecer\003\000\000\000,age=(not enough data)
NOTE: The string type can reasonably be used only if it is the last or the only member of the data structure. That’s because it provides no information about the number of elements in the array, so it is interpreted to contain all bytes up to the end of the datum.
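The fetch result above can be reproduced by hand. In this sketch, `render_bytes` and `decode_name_age` are hypothetical helpers written for the illustration, not gdbmtool code; the int is assumed to be 4 bytes and little-endian, which is what the four trailing bytes in the session output suggest.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Render LEN bytes of BUF in the style of the session above: printable
   characters as-is, everything else as a three-digit octal escape.
   OUT must have room for 4 * LEN + 1 characters. */
void
render_bytes (const unsigned char *buf, size_t len, char *out)
{
  size_t i;

  assert (out != NULL);
  for (i = 0; i < len; i++)
    {
      if (buf[i] >= 0x20 && buf[i] < 0x7f)
        *out++ = (char) buf[i];
      else
        out += sprintf (out, "\\%03o", (unsigned) buf[i]);
    }
  *out = '\0';
}

/* Decode a datum of DSIZE bytes laid out as { string name, int age }.
   A "string" carries no length, so NAME takes everything from offset 0
   to the end.  Returns 1 if 4 bytes remain for AGE, otherwise 0. */
int
decode_name_age (size_t dsize, size_t *name_len)
{
  size_t off = 0;

  *name_len = dsize - off;      /* string swallows the rest of the datum */
  off += *name_len;
  return dsize - off >= 4;
}
```

The stored content is the 6 bytes of "tsecer" followed by the 4 bytes of the integer 3, 10 bytes in total. On fetch, name consumes all 10 bytes, so it prints as tsecer\003\000\000\000 and nothing is left for age. Declaring the members as { int age, string name } avoids the problem.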