Redis源码分析(七)--- zipmap压缩图
如果有看过之前我分析的ziplist压缩列表的分析的话,理解这个我觉得不是什么特别的难题。ziplist压缩列表和zipmap都采用了动态分配字节的做法表示长度,比如通过固定的字节表示节省了不少的空间。同样带来的问题就是复杂的指针移动,和字符位置移动。但总的来说,一定是利大于弊了,要不然设计者也不会这么做。ziplist保存的使用一个列表,zipmap就保存的则是一个个键值对,通过key:value key:value的形式连着。下面我给出zipmap的结构构成,zipmap其实也就是一个超级长的字符串。
<zmlen><len>"foo"<len><free>"bar"<len>"hello"<len><free>"world"
里面涉及了几个变量zmlen,len,free,下面给出完整的解释:
/* String -> String Map data structure optimized for size.
* This file implements a data structure mapping strings to other strings
* implementing an O(n) lookup data structure designed to be very memory
* efficient.
*
* The Redis Hash type uses this data structure for hashes composed of a small
* number of elements, to switch to a hash table once a given number of
* elements is reached.
*
* Given that many times Redis Hashes are used to represent objects composed
* of few fields, this is a very big win in terms of used memory.
*
* zipmap压缩表和ziplist十分类似,都做到了内存操作效率比较高的
* --------------------------------------------------------------------------
*
* Copyright (c) 2009-2010, Salvatore Sanfilippo <antirez at gmail dot com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Redis nor the names of its contributors may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
/* Memory layout of a zipmap, for the map "foo" => "bar", "hello" => "world":
*
* <zmlen><len>"foo"<len><free>"bar"<len>"hello"<len><free>"world"
*
* <zmlen> is 1 byte length that holds the current size of the zipmap.
* When the zipmap length is greater than or equal to 254, this value
* is not used and the zipmap needs to be traversed to find out the length.
* <zmeln>占有着1个字节,所以他的最多可代表的数量是254,当zipmap中的元素记录超过这个数时,
* 那只能从前往后后遍历算大小了,和ziplist是不一样的。
*
* <len> is the length of the following string (key or value).
* <len> lengths are encoded in a single value or in a 5 bytes value.
* If the first byte value (as an unsigned 8 bit value) is between 0 and
* 252, it's a single-byte length. If it is 253 then a four bytes unsigned
* integer follows (in the host byte ordering). A value of 255 is used to
* signal the end of the hash. The special value 254 is used to mark
* empty space that can be used to add new key/value pairs.
* <len>代表了后面字符串key 或 value的值的长度,长度一般被编码1个字节或5个字节表示,这个和ziplist类似
* 如果后面的字符串长度小于等于252个,可与用单字节表示,其他253,254等长度被用来表示其他作用了,当超过这个数时候
* 则直接按5字节的方式存储长度。
*
* <free> is the number of free unused bytes after the string, resulting
* from modification of values associated to a key. For instance if "foo"
* is set to "bar", and later "foo" will be set to "hi", it will have a
* free byte to use if the value will enlarge again later, or even in
* order to add a key/value pair if it fits.
* <free>一般来表示后面的value长度的空闲值,当key:value=“foo”:"bar",后来被改为“foo”:"hi",空闲长度就为1了
*
* <free> is always an unsigned 8 bit number, because if after an
* update operation there are more than a few free bytes, the zipmap will be
* reallocated to make sure it is as small as possible.
* <free>的数字一般比较小,如果空闲太大,zipmap会进行调整大小使map整体变得尽可能小
*
* The most compact representation of the above two elements hash is actually:
* 这是一个例子:
* <zmlen><len>"foo"<len><free>"bar"<len>"hello"<len><free>"world"
* <总键值对数><第一个key的长度>key字符<第一个value的长度><空闲长度开始都为0>后面同前
* "\x02\x03foo\x03\x00bar\x05hello\x05\x00world\xff"
*
* Note that because keys and values are prefixed length "objects",
* the lookup will take O(N) where N is the number of elements
* in the zipmap and *not* the number of bytes needed to represent the zipmap.
* This lowers the constant times considerably.
*/
说到键值对,里面最最重要的方法当然是根据key ,setValue的方法了,方法如下:
/* Set key to value, creating the key if it does not already exist.
* If 'update' is not NULL, *update is set to 1 if the key was
* already preset, otherwise to 0. */
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update) {
unsigned int zmlen, offset;
unsigned int freelen, reqlen = zipmapRequiredLength(klen,vlen);
unsigned int empty, vempty;
unsigned char *p;
freelen = reqlen;
if (update) *update = 0;
//寻找key的位置
p = zipmapLookupRaw(zm,key,klen,&zmlen);
if (p == NULL) {
/* Key not found: enlarge */
//key的位置没有找到,调整zipmap的大小,准备添加操作
zm = zipmapResize(zm, zmlen+reqlen);
p = zm+zmlen-1;
zmlen = zmlen+reqlen;
/* Increase zipmap length (this is an insert) */
//如果头字节还没有达到最大值,则递增
if (zm[0] < ZIPMAP_BIGLEN) zm[0]++;
} else {
/* Key found. Is there enough space for the new value? */
/* Compute the total length: */
if (update) *update = 1;
//key的位置以及找到,判断是否有空间插入新的值
freelen = zipmapRawEntryLength(p);
if (freelen < reqlen) {
/* Store the offset of this key within the current zipmap, so
* it can be resized. Then, move the tail backwards so this
* pair fits at the current position. */
//如果没有空间插入新的值,则调整大小
offset = p-zm;
zm = zipmapResize(zm, zmlen-freelen+reqlen);
p = zm+offset;
/* The +1 in the number of bytes to be moved is caused by the
* end-of-zipmap byte. Note: the *original* zmlen is used. */
//移动空间以便增加新的值
memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1));
zmlen = zmlen-freelen+reqlen;
freelen = reqlen;
}
}
/* We now have a suitable block where the key/value entry can
* be written. If there is too much free space, move the tail
* of the zipmap a few bytes to the front and shrink the zipmap,
* as we want zipmaps to be very space efficient. */
empty = freelen-reqlen;
if (empty >= ZIPMAP_VALUE_MAX_FREE) {
/* First, move the tail <empty> bytes to the front, then resize
* the zipmap to be <empty> bytes smaller. */
offset = p-zm;
memmove(p+reqlen, p+freelen, zmlen-(offset+freelen+1));
zmlen -= empty;
zm = zipmapResize(zm, zmlen);
p = zm+offset;
vempty = 0;
} else {
vempty = empty;
}
/* Just write the key + value and we are done. */
/* Key: */
//定位到插入的位置,首先写入key值
p += zipmapEncodeLength(p,klen);
memcpy(p,key,klen);
p += klen;
/* Value: */
//key值后面是value值,再次写入
p += zipmapEncodeLength(p,vlen);
*p++ = vempty;
memcpy(p,val,vlen);
return zm;
}
map里返回长度的方法有点特别,就直接定位了就用一个字节存储长度:
/* Return the number of entries inside a zipmap */
/* 返回map的长度 */
unsigned int zipmapLen(unsigned char *zm) {
unsigned int len = 0;
//如果第一个长度小于最大值,则直接返回
if (zm[0] < ZIPMAP_BIGLEN) {
len = zm[0];
} else {
//否则变量计算长度
unsigned char *p = zipmapRewind(zm);
while((p = zipmapNext(p,NULL,NULL,NULL,NULL)) != NULL) len++;
/* Re-store length if small enough */
if (len < ZIPMAP_BIGLEN) zm[0] = len;
}
return len;
}
平常我们在redis客户端执行set key "value"命令的时候,调用的其实就是set方法,如下:
zm = zipmapSet(zm,(unsigned char*) "name",4, (unsigned char*) "foo",3,NULL);
zm = zipmapSet(zm,(unsigned char*) "surname",7, (unsigned char*) "foo",3,NULL);
zm = zipmapSet(zm,(unsigned char*) "age",3, (unsigned char*) "foo",3,NULL);
比ziplist方法简单许多了,最后给出头文件
/* String -> String Map data structure optimized for size.
*
* See zipmap.c for more info.
*
* --------------------------------------------------------------------------
*
* Copyright (c) 2009-2010, Salvatore Sanfilippo <antirez at gmail dot com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Redis nor the names of its contributors may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _ZIPMAP_H
#define _ZIPMAP_H
unsigned char *zipmapNew(void); //创建一个新的压缩图
unsigned char *zipmapSet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char *val, unsigned int vlen, int *update); //设置压缩图中的某个键值对
unsigned char *zipmapDel(unsigned char *zm, unsigned char *key, unsigned int klen, int *deleted); //删除压缩图上的某个键值对
unsigned char *zipmapRewind(unsigned char *zm); //将在zipmapNext中被调用到
unsigned char *zipmapNext(unsigned char *zm, unsigned char **key, unsigned int *klen, unsigned char **value, unsigned int *vlen); //取得此键值对的下一个键值对
int zipmapGet(unsigned char *zm, unsigned char *key, unsigned int klen, unsigned char **value, unsigned int *vlen); //获取某个键值对
int zipmapExists(unsigned char *zm, unsigned char *key, unsigned int klen); //某个key值在zipmap中是否存在
unsigned int zipmapLen(unsigned char *zm); //zipmap压缩图的总键值对数
size_t zipmapBlobLen(unsigned char *zm); //压缩图的序列化到文件中所需大小
void zipmapRepr(unsigned char *p); //输出的压缩图的具体信息,用于测试
#endif
最后,基于本人对redis源代码分析有一段时间了,我把分析好的代码,同步到了我的个人github上了,放上地址大家可以一起学习:
github:https://github.com/linyiqun/Redis-Code