Golang map internals
Key data structures
- hmap: the header structure of a map
- bmap: the bucket structure of a map
- mapextra: an optional extension structure; not every map has one
A Go map is implemented as a hash table. We will first look at how the hash table itself is implemented, and then at how the Go map type is built on top of it.
Bucket
// A bucket for a Go map.
type bmap struct {
// tophash generally contains the top byte of the hash value
// for each key in this bucket. If tophash[0] < minTopHash,
// tophash[0] is a bucket evacuation state instead.
tophash [bucketCnt]uint8
// Followed by bucketCnt keys and then bucketCnt elems.
// NOTE: packing all the keys together and then all the elems together makes the
// code a bit more complicated than alternating key/elem/key/elem/... but it allows
// us to eliminate padding which would be needed for, e.g., map[int64]int8.
// Followed by an overflow pointer.
}
The bucketCnt here is a constant:
const (
// Maximum number of key/elem pairs a bucket can hold.
bucketCntBits = 3
bucketCnt = 1 << bucketCntBits
... ...
)
The topHash is the top 8 bits of the hash value; the tophash array holds bucketCnt == 8 single-byte entries, so it occupies 8 bytes per bucket. Besides caching hash bits, a topHash entry may also hold one of the following states:
const (
emptyRest = 0 // this slot is empty, and so are all higher slots (including any overflow buckets)
emptyOne = 1 // this slot is empty; set when an entry is deleted
evacuatedX = 2 // this slot was occupied; the entry has been evacuated to the first half of the new bucket array
evacuatedY = 3 // this slot was occupied; the entry has been evacuated to the second half of the new bucket array
evacuatedEmpty = 4 // this slot was empty, and the bucket has been evacuated
minTopHash = 5 // minimum tophash of a normal entry, to keep it distinct from the states above
)
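For reference, the runtime derives this value with a small helper; as of Go 1.13 it looks like the following (sys.PtrSize is 8 on 64-bit platforms), though details may vary slightly between versions:
// tophash calculates the tophash value for hash.
func tophash(hash uintptr) uint8 {
    top := uint8(hash >> (sys.PtrSize*8 - 8))
    if top < minTopHash {
        top += minTopHash
    }
    return top
}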
Hmap
// A header for a Go map.
type hmap struct {
// Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
// Make sure this stays in sync with the compiler's definition.
count int // # live cells == size of map. Must be first (used by len() builtin)
flags uint8
B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
hash0 uint32 // hash seed
buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated)
extra *mapextra // optional fields: the next free preallocated overflow bucket, plus the already-allocated overflow buckets
}
const (
// flags
iterator = 1 // there may be an iterator using buckets
oldIterator = 2 // there may be an iterator using oldbuckets
hashWriting = 4 // a goroutine is writing to the map
sameSizeGrow = 8 // the current map growth is to a new map of the same size
)
// mapextra holds fields that are not present on all maps.
type mapextra struct {
// If both key and elem do not contain pointers and are inline, then we mark bucket
// type as containing no pointers. This avoids scanning such maps.
// However, bmap.overflow is a pointer. In order to keep overflow buckets
// alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow.
// overflow and oldoverflow are only used if key and elem do not contain pointers.
// overflow contains overflow buckets for hmap.buckets.
// oldoverflow contains overflow buckets for hmap.oldbuckets.
// The indirection allows to store a pointer to the slice in hiter.
overflow *[]*bmap
oldoverflow *[]*bmap
// nextOverflow holds a pointer to a free overflow bucket.
nextOverflow *bmap
}
Map walkthrough
make map
In Go a map instance is created with make(map[key]value, hint); in the runtime package this is implemented by the following function:
// makemap implements Go map creation for make(map[k]v, hint).
// If the compiler has determined that the map or the first bucket
// can be created on the stack, h and/or bucket may be non-nil.
// If h != nil, the map can be created directly in h.
// If h.buckets != nil, bucket pointed to can be used as the first bucket.
func makemap(t *maptype, hint int, h *hmap) *hmap {
mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size)
if overflow || mem > maxAlloc {
hint = 0
}
// 1. Initialize the hmap
if h == nil {
h = new(hmap)
}
h.hash0 = fastrand() // pick a random value as the hash seed
// Find the size parameter B which will hold the requested # of elements.
// For hint < 0 overLoadFactor returns false since hint < bucketCnt.
B := uint8(0)
for overLoadFactor(hint, B) {
B++
}
h.B = B
// allocate initial hash table
// if B == 0, the buckets field is allocated lazily later (in mapassign)
// If hint is large zeroing this memory could take a while.
// allocate the bucket array
if h.B != 0 {
var nextOverflow *bmap
h.buckets, nextOverflow = makeBucketArray(t, h.B, nil)
if nextOverflow != nil {
h.extra = new(mapextra)
h.extra.nextOverflow = nextOverflow // remember the start of the preallocated overflow region
}
}
return h
}
// makeBucketArray initializes a backing array for map buckets.
// 1<<b is the minimum number of buckets to allocate.
// dirtyalloc should either be nil or a bucket array previously
// allocated by makeBucketArray with the same t and b parameters.
// If dirtyalloc is nil a new backing array will be alloced and
// otherwise dirtyalloc will be cleared and reused as backing array.
func makeBucketArray(t *maptype, b uint8, dirtyalloc unsafe.Pointer) (buckets unsafe.Pointer, nextOverflow *bmap) {
// 1. Compute how much space is needed
base := bucketShift(b) // base = 2^(b'), where b' is b masked to the word size: the low 5 bits of b (max 31) on 32-bit systems, the low 6 bits (max 63) on 64-bit systems
nbuckets := base
// For small b, overflow buckets are unlikely.
// Avoid the overhead of the calculation.
// preallocate overflow buckets
if b >= 4 {
// Add on the estimated number of overflow buckets
// required to insert the median number of elements
// used with this value of b.
nbuckets += bucketShift(b - 4) // nbuckets = base + 2^((b-4)')
sz := t.bucket.size * nbuckets
up := roundupsize(sz) // round up to the allocator's size class so none of the allocated block is wasted
if up != sz {
nbuckets = up / t.bucket.size
}
}
// 2. Reuse memory if possible
if dirtyalloc == nil {
// no dirty memory to reuse: allocate a fresh array
buckets = newarray(t.bucket, int(nbuckets))
} else {
// dirtyalloc was previously generated by
// the above newarray(t.bucket, int(nbuckets))
// but may not be empty.
buckets = dirtyalloc
size := t.bucket.size * nbuckets
if t.bucket.ptrdata != 0 {
memclrHasPointers(buckets, size)
} else {
memclrNoHeapPointers(buckets, size)
}
}
// 3. Set up the overflow pointer
if base != nbuckets {
// We preallocated some overflow buckets.
// To keep the overhead of tracking these overflow buckets to a minimum,
// we use the convention that if a preallocated overflow bucket's overflow
// pointer is nil, then there are more available by bumping the pointer.
// We need a safe non-nil pointer for the last overflow bucket; just use buckets.
nextOverflow = (*bmap)(add(buckets, base*uintptr(t.bucketsize)))
last := (*bmap)(add(buckets, (nbuckets-1)*uintptr(t.bucketsize)))
last.setoverflow(t, (*bmap)(buckets)) // store the buckets pointer at the end of the last overflow bucket as a sentinel
}
return buckets, nextOverflow
}
// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
// i.e. checks whether the current item count exceeds the load factor
func overLoadFactor(count int, B uint8) bool {
return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}
The return value's type is *hmap, and hmap.B is chosen as the smallest B for which the following no longer holds (with loadFactorNum = 13, loadFactorDen = 2, count = hint):
uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
The memory allocated for hmap.buckets is roundupsize(base + overflow), where base = bucket_size*2^B and overflow = bucket_size*2^(B-4) (only when B >= 4), with bucket_size = bitmap_size (the 8-byte tophash array) + 8*key_size + 8*elem_size + ptr_size. Since ptr_size differs between 32-bit and 64-bit systems, bucket_size differs as well.
As the code above shows, the hash table is stored in physical memory as one contiguous array, laid out roughly as follows:
# bucket
|bmap|key1~8|elem1~8|
# buckets
|bucket1~N|overflow|
# overflow
|nextOverflow|...|last|
# last
|bmap|...|ptr_buckets|
bucket1~N is the base region; overflow is a preallocated reserve that reduces the number of repeated allocations.
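To make these formulas concrete, here is a small sketch (bucketSize is a hypothetical helper mirroring the layout above, not a runtime function) for a 64-bit system:
package main

import "fmt"

// bucketSize mirrors the layout described above: an 8-byte tophash array,
// then 8 keys, then 8 elems, then one overflow pointer.
func bucketSize(keySize, elemSize, ptrSize uintptr) uintptr {
    const bucketCnt = 8
    return bucketCnt + bucketCnt*keySize + bucketCnt*elemSize + ptrSize
}

func main() {
    // map[int64]int8 on 64-bit: 8 + 8*8 + 8*1 + 8 = 88 bytes per bucket,
    // with no padding thanks to the keys-together/elems-together layout.
    fmt.Println(bucketSize(8, 1, 8))

    // For B = 5, makeBucketArray allocates base plus preallocated overflow:
    base := uintptr(1) << 5           // 32 base buckets
    overflow := uintptr(1) << (5 - 4) // 2 preallocated overflow buckets
    fmt.Println(base, overflow)       // roundupsize may then bump nbuckets a little further
}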
insert into map
// Like mapaccess, but allocates a slot for the key if it is not present in the map.
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
... ...
// 1. Compute the key's hash
alg := t.key.alg
hash := alg.hash(key, uintptr(h.hash0))
if h.buckets == nil {
h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
}
again:
// 2. Pick a bucket
// use the low B bits of the hash as the bucket number
bucket := hash & bucketMask(h.B)
if h.growing() {
growWork(t, h, bucket)
}
// locate the bucket's address in the array
b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + bucket*uintptr(t.bucketsize)))
// top 8 bits of the hash; values below minTopHash are shifted up by minTopHash
top := tophash(hash)
var inserti *uint8 // address of the tophash slot to write in the bmap
var insertk unsafe.Pointer // address where the key will be written
var elem unsafe.Pointer // address where the value will be written
bucketloop:
for {
for i := uintptr(0); i < bucketCnt; i++ { // look for a usable slot
if b.tophash[i] != top {
if isEmpty(b.tophash[i]) && inserti == nil {
inserti = &b.tophash[i]
insertk = add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
}
if b.tophash[i] == emptyRest { // nothing occupied beyond this point, stop searching
break bucketloop
}
continue
}
// found a matching topHash, compare the keys
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
if t.indirectkey() {
k = *((*unsafe.Pointer)(k))
}
if !alg.equal(key, k) { // different key, keep searching
continue
}
// already have a mapping for key. Update it.
// same key: locate the elem slot and jump to done
if t.needkeyupdate() {
typedmemmove(t.key, k, key)
}
elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
goto done
}
// not found in bucket N, continue with its overflow chain
ovf := b.overflow(t)
if ovf == nil {
break // no more overflow buckets, stop searching
}
b = ovf
}
// Did not find mapping for key. Allocate new cell & add entry.
// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
// check whether a grow should be triggered
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
hashGrow(t, h)
goto again // Growing the table invalidates everything, so try again
}
if inserti == nil { // no free slot found, allocate a new overflow bucket
// all current buckets are full, allocate a new one.
newb := h.newoverflow(t, b)
inserti = &newb.tophash[0]
insertk = add(unsafe.Pointer(newb), dataOffset)
elem = add(insertk, bucketCnt*uintptr(t.keysize))
}
// store new key/elem at insert position
if t.indirectkey() {
kmem := newobject(t.key)
*(*unsafe.Pointer)(insertk) = kmem
insertk = kmem
}
if t.indirectelem() {
vmem := newobject(t.elem)
*(*unsafe.Pointer)(elem) = vmem
}
typedmemmove(t.key, insertk, key)
*inserti = top
h.count++
done:
if h.flags&hashWriting == 0 {
throw("concurrent map writes")
}
h.flags &^= hashWriting
if t.indirectelem() {
elem = *((*unsafe.Pointer)(elem))
}
return elem
}
From the code above, the write path is clear:
- Hash the key and take the low B bits of the hash to determine the bucket number N.
- Scan the slots of bucket N for a free one, and write the topHash (the top 8 bits of the hash) into it.
- If bucket N already contains the same topHash, fetch the corresponding key and compare: if the keys are equal, update the elem; otherwise keep scanning for a free slot. This is how collisions are resolved.
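Note also the hashWriting flag checked at the end of mapassign: maps are not safe for concurrent writers, and the runtime deliberately crashes rather than corrupt data. A minimal program like the following normally dies with "fatal error: concurrent map writes":
package main

func main() {
    m := map[int]int{}
    done := make(chan struct{})
    go func() {
        for i := 0; i < 1000000; i++ {
            m[i] = i // writer 1
        }
        close(done)
    }()
    for i := 0; i < 1000000; i++ {
        m[-i] = i // writer 2: mapassign sees hashWriting already set and throws
    }
    <-done
}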
What if bucket N fills up? The probability is low but it is bound to happen, since one bmap can hold only 8 keys. This is where the preallocated overflow memory mentioned earlier comes in; the allocation logic is as follows:
func (h *hmap) newoverflow(t *maptype, b *bmap) *bmap {
var ovf *bmap
if h.extra != nil && h.extra.nextOverflow != nil { // preallocated overflow buckets are still available
// We have preallocated overflow buckets available.
// See makeBucketArray for more details.
ovf = h.extra.nextOverflow
if ovf.overflow(t) == nil { // ovf is not the last preallocated bucket: bump the nextOverflow pointer
// We're not at the end of the preallocated overflow buckets. Bump the pointer.
h.extra.nextOverflow = (*bmap)(add(unsafe.Pointer(ovf), uintptr(t.bucketsize)))
} else { // this is the last preallocated bucket: clear nextOverflow
// This is the last preallocated overflow bucket.
// Reset the overflow pointer on this bucket,
// which was set to a non-nil sentinel value.
ovf.setoverflow(t, nil) // reset the sentinel pointer at the end
h.extra.nextOverflow = nil
}
} else { // the preallocated overflow region is used up: allocate a fresh bucket
ovf = (*bmap)(newobject(t.bucket))
}
h.incrnoverflow() // bump the overflow-bucket counter
if t.bucket.ptrdata == 0 {
h.createOverflow()
*h.extra.overflow = append(*h.extra.overflow, ovf)
}
b.setoverflow(t, ovf) // link the new overflow bucket at the end of b's chain
return ovf
}
This function shows how a full bucket N is extended: take an unused bucket from the preallocated region and store its pointer at the end of bucket N, making it bucket N's extension:
# extending bucket N
|bmap|key1~8|elem1~8|ptr_extra_bucket|......|extra_bucket|
                    ^---------------------------^
Now the new key/value can simply go into extra_bucket. Combined with the previous section, the in-memory shape of the hash table becomes clear: a contiguous array plus jump pointers. This is why access is so fast; almost everything is pointer arithmetic over contiguous memory.
At this point we have a rough picture of the map implementation, but only part of it. Note that hmap also has an oldbuckets field, along with other members whose meaning has not yet been explained.
The section above solved bucket overflow, but what if a bucket overflows too much (i.e. a single bucket holds too much data)? Consider an extreme case: a map where everything except bucket1 is bucket1's extra_bucket chain. A lookup in bucket1 then walks a very long chain, in the worst case the entire map. How can this be solved?
- Keep the keys sorted. Insertion sort is simple to implement but moves a lot of memory; with binary search, lookup time drops to O(log n). Other sorting algorithms could reduce how often memory must be shifted, but with an array-backed store some shifting is unavoidable.
- Rebuild: repartition the buckets so that no single bucket holds too much data.
The code above already shows that Go uses the second option. So when is a rebuild triggered, and how does it work?
map Grow
The functions tooManyOverflowBuckets and overLoadFactor decide whether the Grow process needs to run:
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
... ...
// Did not find mapping for key. Allocate new cell & add entry.
// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
hashGrow(t, h)
goto again // Growing the table invalidates everything, so try again
}
... ...
}
const (
loadFactorNum = 13
loadFactorDen = 2
)
// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
func overLoadFactor(count int, B uint8) bool {
return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}
// tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets.
// Note that most of these overflow buckets must be in sparse use;
// if use was dense, then we'd have already triggered regular map growth.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
// If the threshold is too low, we do extraneous work.
// If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory.
// "too many" means (approximately) as many overflow buckets as regular buckets.
// See incrnoverflow for more details.
if B > 15 {
B = 15
}
// The compiler doesn't see here that B < 16; mask B to generate shorter shift code.
return noverflow >= uint16(1)<<(B&15)
}
First, overLoadFactor: the Grow process is triggered when key_count+1 > 8 and key_count+1 > 13*2^(B-1). Second, tooManyOverflowBuckets: a Grow is triggered when noverflow >= 2^(B&15), i.e. once the overflow bucket count reaches 2^(B&15). There are two cases:
- B < 16: growth is triggered once there are (approximately) as many overflow buckets as base buckets.
- B >= 16: growth is triggered once the overflow bucket count reaches 2^15.
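To see the resulting thresholds concretely, the load-factor check can be re-implemented standalone (a sketch using the same constants, not the runtime's own code) and probed for each B:
package main

import "fmt"

// overLoadFactor mirrors the runtime's check: bucketCnt = 8, load factor 13/2 = 6.5.
func overLoadFactor(count int, B uint8) bool {
    const bucketCnt = 8
    const loadFactorNum, loadFactorDen = 13, 2
    return count > bucketCnt &&
        uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

func main() {
    for B := uint8(0); B <= 5; B++ {
        n := 0
        for !overLoadFactor(n+1, B) { // largest count whose next insert does not grow
            n++
        }
        fmt.Printf("B=%d: %2d buckets, grow once count exceeds %d\n", B, 1<<B, n)
    }
}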
Note that noverflow is not an exact count; once the map is large enough, it is only an approximation:
// incrnoverflow increments h.noverflow.
// noverflow counts the number of overflow buckets.
// This is used to trigger same-size map growth.
// See also tooManyOverflowBuckets.
// To keep hmap small, noverflow is a uint16.
// When there are few buckets, noverflow is an exact count.
// When there are many buckets, noverflow is an approximate count.
func (h *hmap) incrnoverflow() {
// We trigger same-size map growth if there are
// as many overflow buckets as buckets.
// We need to be able to count to 1<<h.B.
if h.B < 16 {
h.noverflow++
return
}
// Increment with probability 1/(1<<(h.B-15)).
// When we reach 1<<15 - 1, we will have approximately
// as many overflow buckets as buckets.
mask := uint32(1)<<(h.B-15) - 1
// Example: if h.B == 18, then mask == 7,
// and fastrand & 7 == 0 with probability 1/8.
if fastrand()&mask == 0 {
h.noverflow++
}
}
Clearly, once h.B >= 16 the counter is no longer incremented on every allocation. At that point there are at least 2^16 base buckets, holding up to 2^19 = 524288 entries, plus the preallocated overflow. The more buckets there are, the more dispersed the keys, so overflow becomes less likely, and even when overflow does occur, long single-bucket chains become rarer.
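The technique is a classic approximate counter and can be demonstrated standalone. A minimal simulation (not runtime code) that performs as many overflow-bucket allocations as there are buckets for B = 18, then checks that the scaled counter lands near the 2^15 trigger threshold:
package main

import (
    "fmt"
    "math/rand"
)

func main() {
    const B = 18
    mask := uint32(1)<<(B-15) - 1 // same mask as incrnoverflow; here 7
    var noverflow uint16
    actual := 0
    for i := 0; i < 1<<B; i++ { // simulate 2^B overflow-bucket allocations
        actual++
        if rand.Uint32()&mask == 0 { // count with probability 1/8
            noverflow++
        }
    }
    // noverflow now sits near 1<<15, the tooManyOverflowBuckets threshold.
    fmt.Printf("actual=%d approx=%d threshold=%d\n", actual, noverflow, 1<<15)
}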
func hashGrow(t *maptype, h *hmap) {
// If we've hit the load factor, get bigger.
// Otherwise, there are too many overflow buckets,
// so keep the same number of buckets and "grow" laterally.
bigger := uint8(1)
// check whether the load factor is exceeded
if !overLoadFactor(h.count+1, h.B) { // not exceeded (so there are too many overflow buckets): keep the current size
bigger = 0
h.flags |= sameSizeGrow
}
oldbuckets := h.buckets
newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)
// update the flags
flags := h.flags &^ (iterator | oldIterator)
if h.flags&iterator != 0 {
flags |= oldIterator
}
// commit the grow (atomic wrt gc)
h.B += bigger
h.flags = flags
// swap in the new buckets
h.oldbuckets = oldbuckets
h.buckets = newbuckets
h.nevacuate = 0
h.noverflow = 0
if h.extra != nil && h.extra.overflow != nil {
// Promote current overflow buckets to the old generation.
if h.extra.oldoverflow != nil {
throw("oldoverflow is not nil")
}
h.extra.oldoverflow = h.extra.overflow
h.extra.overflow = nil
}
if nextOverflow != nil {
if h.extra == nil {
h.extra = new(mapextra)
}
h.extra.nextOverflow = nextOverflow
}
// the actual copying of the hash table data is done incrementally
// by growWork() and evacuate().
}
The hashGrow function above shows the growth rule: when overLoadFactor is exceeded, h.B = h.B + 1, i.e. the number of base buckets doubles; otherwise the size stays the same. Either way, a new buckets array is created.
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer{
... ...
bucket := hash & bucketMask(h.B)
if h.growing() {
growWork(t, h, bucket)
}
... ...
}
func growWork(t *maptype, h *hmap, bucket uintptr) {
// make sure we evacuate the oldbucket corresponding
// to the bucket we're about to use
evacuate(t, h, bucket&h.oldbucketmask())
// evacuate one more oldbucket to make progress on growing
if h.growing() {
evacuate(t, h, h.nevacuate)
}
}
// evacuate data from an old bucket into the new bucket array
func evacuate(t *maptype, h *hmap, oldbucket uintptr) {
b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
newbit := h.noldbuckets() // number of oldbuckets
if !evacuated(b) {
// TODO: reuse overflow buckets instead of using new ones, if there
// is no iterator using the old buckets. (If !oldIterator.)
// xy contains the x and y (low and high) evacuation destinations.
// 1. Determine the two potential evacuation destinations, X and Y.
// X is the bucket in the new array with the same number as oldbucket.
var xy [2]evacDst
x := &xy[0]
x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize)))
x.k = add(unsafe.Pointer(x.b), dataOffset)
x.e = add(x.k, bucketCnt*uintptr(t.keysize))
if !h.sameSizeGrow() { // only when growing to double the size
// Only calculate y pointers if we're growing bigger.
// Otherwise GC can see bad pointers.
// Y is the bucket numbered oldbucket + noldbuckets in the new array, i.e. in the second half
y := &xy[1]
y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize)))
y.k = add(unsafe.Pointer(y.b), dataOffset)
y.e = add(y.k, bucketCnt*uintptr(t.keysize))
}
// 2. Evacuate the data, including the bucket's overflow chain
for ; b != nil; b = b.overflow(t) {
k := add(unsafe.Pointer(b), dataOffset)
e := add(k, bucketCnt*uintptr(t.keysize))
for i := 0; i < bucketCnt; i, k, e = i+1, add(k, uintptr(t.keysize)), add(e, uintptr(t.elemsize)) {
top := b.tophash[i]
if isEmpty(top) {
b.tophash[i] = evacuatedEmpty
continue
}
if top < minTopHash {
throw("bad map state")
}
k2 := k
if t.indirectkey() {
k2 = *((*unsafe.Pointer)(k2))
}
var useY uint8
// decide whether the entry moves to the first or the second half
if !h.sameSizeGrow() {
// Compute hash to make our evacuation decision (whether we need
// to send this key/elem to bucket x or bucket y).
hash := t.key.alg.hash(k2, uintptr(h.hash0))
if h.flags&iterator != 0 && !t.reflexivekey() && !t.key.alg.equal(k2, k2) {
// If key != key (NaNs), then the hash could be (and probably
// will be) entirely different from the old hash. Moreover,
// it isn't reproducible. Reproducibility is required in the
// presence of iterators, as our evacuation decision must
// match whatever decision the iterator made.
// Fortunately, we have the freedom to send these keys either
// way. Also, tophash is meaningless for these kinds of keys.
// We let the low bit of tophash drive the evacuation decision.
// We recompute a new random tophash for the next level so
// these keys will get evenly distributed across all buckets
// after multiple grows.
useY = top & 1
top = tophash(hash)
} else {
if hash&newbit != 0 {
useY = 1
}
}
}
if evacuatedX+1 != evacuatedY || evacuatedX^1 != evacuatedY {
throw("bad evacuatedN")
}
// set the topHash to evacuatedX or evacuatedY to mark the slot as evacuated
b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY
dst := &xy[useY] // evacuation destination
// copy the entry to its destination
if dst.i == bucketCnt {
dst.b = h.newoverflow(t, dst.b)
dst.i = 0
dst.k = add(unsafe.Pointer(dst.b), dataOffset)
dst.e = add(dst.k, bucketCnt*uintptr(t.keysize))
}
dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check
if t.indirectkey() {
*(*unsafe.Pointer)(dst.k) = k2 // copy pointer
} else {
typedmemmove(t.key, dst.k, k) // copy elem
}
if t.indirectelem() {
*(*unsafe.Pointer)(dst.e) = *(*unsafe.Pointer)(e)
} else {
typedmemmove(t.elem, dst.e, e)
}
dst.i++
// These updates might push these pointers past the end of the
// key or elem arrays. That's ok, as we have the overflow pointer
// at the end of the bucket to protect against pointing past the
// end of the bucket.
dst.k = add(dst.k, uintptr(t.keysize))
dst.e = add(dst.e, uintptr(t.elemsize))
}
}
// Unlink the overflow buckets & clear key/elem to help GC.
// clear the old bucket's data
if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 {
b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))
// Preserve b.tophash because the evacuation
// state is maintained there.
ptr := add(b, dataOffset)
n := uintptr(t.bucketsize) - dataOffset
memclrHasPointers(ptr, n)
}
}
if oldbucket == h.nevacuate {
// advance the evacuation counter; once everything is evacuated, oldbuckets is released (set to nil)
advanceEvacuationMark(h, t, newbit)
}
}
As mentioned above, during Grow the newly allocated buckets array may keep the same size (same_size) or double the size of oldbuckets (double_size). In the double_size case, the new array is divided into two halves, first and second:
# oldbuckets
|bucket1~N|
# newbuckets
|bucket1~N|bucketN+1~2N|
   first      second
Taking double_size as an example, when inserting a new key triggers a Grow, the overall flow is as follows:
- Take the low B bits of hash(key) as the bucket number N, bucket_i = bucket_N. If the map is not in Grow_State, go to the next step; otherwise jump to step 6.
- If a free slot can be found in bucket_i, insert and finish; otherwise go to the next step.
- If bucket_i has an overflow_bucket, set bucket_i = overflow_bucket and go back to step 2; otherwise go to the next step.
- Decide whether a Grow is needed. If so, go to the next step; otherwise go to step 7.
- Create a new buckets array expanded to double_size, enter Grow_State, and go back to step 1.
- Evacuate oldbucket = N & (2^B_old - 1) from oldbuckets to newbuckets; its entries are distributed between the first and second halves (detailed below). Then go back to step 2.
- Allocate an overflow_bucket for bucket_i, insert the key, and finish.
Step 6 deserves a closer look. Suppose N = 12 and B = 3; after growth B = 4, X = N & (2^3 - 1) = 4, and Y = X + 2^3 = 12, so the new key is inserted into the second half. If N = 4 it would go into the first half. Either way, the key lands in a bucket whose data was migrated from the same oldbucket, which preserves hash consistency. This is the rebuild process of the hash map.
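The arithmetic in this example can be checked with a few lines of Go (a sketch using the numbers above, with B_old = 3):
package main

import "fmt"

func main() {
    const oldB = 3               // B before the grow
    const newB = oldB + 1        // B after a double_size grow
    newbit := uintptr(1) << oldB // the bit that decides X vs Y, as in evacuate

    for _, hashLow := range []uintptr{12, 4} { // low bits of two hypothetical key hashes
        oldBucket := hashLow & (1<<oldB - 1) // bucket number before the grow
        newBucket := hashLow & (1<<newB - 1) // bucket number after the grow
        half := "first (X)"
        if hashLow&newbit != 0 {
            half = "second (Y)"
        }
        fmt.Printf("hash low bits %2d: old bucket %d -> new bucket %2d, %s half\n",
            hashLow, oldBucket, newBucket, half)
    }
}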
In the same_size case the process is the same, except that the total bucket count is unchanged, so each oldbucket is simply migrated to the bucket with the same number in new_buckets.
One detail worth noting: slots marked emptyOne in a bmap hold deleted entries, which are simply skipped during evacuation, so the migrated data ends up more compact:
# before evacuation
# oldbucket_4
# emptyRest = 0, emptyOne = 1
|xx|yy|1|ww|zz|0|0|0|key1~8|elem1~8|
# after evacuation
# newbucket_4
|xx|ww|0|0|0|0|0|0|key_xx|key_ww|...|elem_xx|elem_ww|...|
# newbucket_12
|yy|zz|0|0|0|0|0|0|key_yy|key_zz|...|elem_yy|elem_zz|...|
# oldbucket_4
# evacuatedX = 2, evacuatedY = 3, evacuatedEmpty = 4
|2|3|4|2|3|4|4|4|key1~8|elem1~8|
map access
There are three ways to access data in a Go map; take map[int]int as an example:
sets := map[int]int{1: 2, 3: 4, 5: 6}
value := sets[1] // value only
value, isExist := sets[3] // value plus an existence flag
for key, value := range sets { // iteration
}
The first two forms are the same kind of lookup by key; they differ only in their return values.
func mapaccess2(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, bool) {
if raceenabled && h != nil {
callerpc := getcallerpc()
pc := funcPC(mapaccess2)
racereadpc(unsafe.Pointer(h), callerpc, pc)
raceReadObjectPC(t.key, key, callerpc, pc)
}
if msanenabled && h != nil {
msanread(key, t.key.size)
}
if h == nil || h.count == 0 {
if t.hashMightPanic() {
t.key.alg.hash(key, 0) // see issue 23734
}
return unsafe.Pointer(&zeroVal[0]), false
}
if h.flags&hashWriting != 0 {
throw("concurrent map read and map write")
}
// 1. Compute the bucket number and obtain the bucket's address
alg := t.key.alg
hash := alg.hash(key, uintptr(h.hash0))
m := bucketMask(h.B)
b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + (hash&m)*uintptr(t.bucketsize))) // bucket number = low B bits of the hash
if c := h.oldbuckets; c != nil { // old buckets still exist, so some entries may not have been migrated yet
if !h.sameSizeGrow() { // for a double_size grow, the old bucket number is the low B-1 bits of the hash
// There used to be half as many buckets; mask down one more power of two.
m >>= 1
}
oldb := (*bmap)(unsafe.Pointer(uintptr(c) + (hash&m)*uintptr(t.bucketsize)))
if !evacuated(oldb) { // not evacuated yet: read from the old bucket
b = oldb
}
}
top := tophash(hash)
bucketloop:
// 2. Walk the chain looking for the key
for ; b != nil; b = b.overflow(t) {
for i := uintptr(0); i < bucketCnt; i++ {
if b.tophash[i] != top {
if b.tophash[i] == emptyRest {
break bucketloop
}
continue
}
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
if t.indirectkey() {
k = *((*unsafe.Pointer)(k))
}
if alg.equal(key, k) {
e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
if t.indirectelem() {
e = *((*unsafe.Pointer)(e))
}
return e, true
}
}
}
return unsafe.Pointer(&zeroVal[0]), false
}
The whole lookup is fairly simple and should be easy to follow if you have read this far in order:
- Take the low B bits of key_hash as the bucket number N, bucket_i = bucket_N. If old buckets exist, go to step 2; otherwise go to step 4.
- Locate the oldbucket: for a same_size Grow its number is still N; for a double_size Grow it is the low B-1 bits of key_hash. Continue to the next step.
- If the oldbucket has not been evacuated, set bucket_i = old_bucket. Continue to the next step.
- Scan bucket_i and its overflow_buckets for a tophash equal to key_topHash; on a match, fetch the corresponding key and compare with equal. If the keys are equal, return the corresponding elem; otherwise keep scanning until the end.
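A small aside on the return value: for a missing key, mapaccess returns a pointer into zeroVal, which is why the one-result form cannot distinguish an absent key from a stored zero value, while the comma-ok form can:
package main

import "fmt"

func main() {
    m := map[string]int{"a": 0}
    fmt.Println(m["a"], m["b"]) // 0 0: absent key and stored zero look identical
    _, okA := m["a"]
    _, okB := m["b"]
    fmt.Println(okA, okB) // true false
}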
Compared with lookup by key, iteration is more involved:
// mapiterinit initializes the hiter struct used for ranging over maps.
// The hiter struct pointed to by 'it' is allocated on the stack
// by the compilers order pass or on the heap by reflect_mapiterinit.
// Both need to have zeroed hiter since the struct contains pointers.
func mapiterinit(t *maptype, h *hmap, it *hiter) {
if raceenabled && h != nil {
callerpc := getcallerpc()
racereadpc(unsafe.Pointer(h), callerpc, funcPC(mapiterinit))
}
if h == nil || h.count == 0 {
return
}
if unsafe.Sizeof(hiter{})/sys.PtrSize != 12 {
throw("hash_iter size incorrect") // see cmd/compile/internal/gc/reflect.go
}
it.t = t
it.h = h
// grab snapshot of bucket state
// 1. Take a snapshot of the current map state
it.B = h.B
it.buckets = h.buckets
if t.bucket.ptrdata == 0 {
// Allocate the current slice and remember pointers to both current and old.
// This preserves all relevant overflow buckets alive even if
// the table grows and/or overflow buckets are added to the table
// while we are iterating.
h.createOverflow()
it.overflow = h.extra.overflow
it.oldoverflow = h.extra.oldoverflow
}
// decide where to start
// 2. Decide the starting position
// draw a random value
r := uintptr(fastrand())
if h.B > 31-bucketCntBits {
r += uintptr(fastrand()) << 31
}
// pick a random starting bucket
it.startBucket = r & bucketMask(h.B)
// pick a random slot offset within the bucket
it.offset = uint8(r >> h.B & (bucketCnt - 1))
// iterator state
it.bucket = it.startBucket
// Remember we have an iterator.
// Can run concurrently with another mapiterinit().
// set the iterator flag bits
if old := h.flags; old&(iterator|oldIterator) != iterator|oldIterator {
atomic.Or8(&h.flags, iterator|oldIterator)
}
mapiternext(it)
}
func mapiternext(it *hiter) {
h := it.h
if raceenabled {
callerpc := getcallerpc()
racereadpc(unsafe.Pointer(h), callerpc, funcPC(mapiternext))
}
if h.flags&hashWriting != 0 {
throw("concurrent map iteration and map write")
}
t := it.t
bucket := it.bucket
b := it.bptr
i := it.i
checkBucket := it.checkBucket
alg := t.key.alg
next:
if b == nil {
// check whether we have wrapped around and visited every bucket
if bucket == it.startBucket && it.wrapped {
// end of iteration
it.key = nil
it.elem = nil
return
}
if h.growing() && it.B == h.B {
// the map is growing and the iterator was started during this grow (this includes same_size grows)
// Iterator was started in the middle of a grow, and the grow isn't done yet.
// If the bucket we're looking at hasn't been filled in yet (i.e. the old
// bucket hasn't been evacuated) then we need to iterate through the old
// bucket and only return the ones that will be migrated to this bucket.
oldbucket := bucket & it.h.oldbucketmask()
b = (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
if !evacuated(b) { // check whether the old bucket has been evacuated
checkBucket = bucket
} else {
b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize)))
checkBucket = noCheck
}
} else {
// the map is not growing, or the iterator was started before a double_size grow (it.B != h.B)
b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize)))
checkBucket = noCheck
}
bucket++
if bucket == bucketShift(it.B) {
bucket = 0
it.wrapped = true
}
i = 0
}
for ; i < bucketCnt; i++ {
offi := (i + it.offset) & (bucketCnt - 1)
// skip empty slots
if isEmpty(b.tophash[offi]) || b.tophash[offi] == evacuatedEmpty {
// TODO: emptyRest is hard to use here, as we start iterating
// in the middle of a bucket. It's feasible, just tricky.
continue
}
k := add(unsafe.Pointer(b), dataOffset+uintptr(offi)*uintptr(t.keysize))
if t.indirectkey() {
k = *((*unsafe.Pointer)(k))
}
e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+uintptr(offi)*uintptr(t.elemsize))
if checkBucket != noCheck && !h.sameSizeGrow() { // filter out entries that will not migrate to this bucket
// Special case: iterator was started during a grow to a larger size
// and the grow is not done yet. We're working on a bucket whose
// oldbucket has not been evacuated yet. Or at least, it wasn't
// evacuated when we started the bucket. So we're iterating
// through the oldbucket, skipping any keys that will go
// to the other new bucket (each oldbucket expands to two
// buckets during a grow).
if t.reflexivekey() || alg.equal(k, k) {
// If the item in the oldbucket is not destined for
// the current new bucket in the iteration, skip it.
hash := alg.hash(k, uintptr(h.hash0))
if hash&bucketMask(it.B) != checkBucket {
continue
}
} else {
// Hash isn't repeatable if k != k (NaNs). We need a
// repeatable and randomish choice of which direction
// to send NaNs during evacuation. We'll use the low
// bit of tophash to decide which way NaNs go.
// NOTE: this case is why we need two evacuate tophash
// values, evacuatedX and evacuatedY, that differ in
// their low bit.
if checkBucket>>(it.B-1) != uintptr(b.tophash[offi]&1) {
continue
}
}
}
if (b.tophash[offi] != evacuatedX && b.tophash[offi] != evacuatedY) ||
!(t.reflexivekey() || alg.equal(k, k)) { // entry not evacuated: return it directly
// This is the golden data, we can return it.
// OR
// key!=key, so the entry can't be deleted or updated, so we can just return it.
// That's lucky for us because when key!=key we can't look it up successfully.
it.key = k
if t.indirectelem() {
e = *((*unsafe.Pointer)(e))
}
it.elem = e
} else { // the entry has been evacuated; locate it again by key
// The hash table has grown since the iterator was started.
// The golden data for this key is now somewhere else.
// Check the current hash table for the data.
// This code handles the case where the key
// has been deleted, updated, or deleted and reinserted.
// NOTE: we need to regrab the key as it has potentially been
// updated to an equal() but not identical key (e.g. +0.0 vs -0.0).
rk, re := mapaccessK(t, h, k)
if rk == nil {
continue // key has been deleted
}
it.key = rk
it.elem = re
}
it.bucket = bucket
if it.bptr != b { // avoid unnecessary write barrier; see issue 14921
it.bptr = b
}
it.i = i + 1
it.checkBucket = checkBucket
return
}
b = b.overflow(t) // continue with the overflow chain
i = 0
goto next
}
When iterating over a map, there are three possible situations:
- No Grow happens between iterInit and iterNext: just walk it.buckets in order.
- Both iterInit and iterNext happen during a Grow: the code above shows that iterInit takes a snapshot of h.buckets and h.B, so iterNext walks the new buckets. A bucket_N can then be in one of two states, evacuated or not. An evacuated bucket is walked directly; an unevacuated one requires walking the corresponding oldbucket instead, taking care to filter out entries that cannot migrate to bucket_N (in the double_size case the old bucket splits across the two halves).
- iterInit ran before the Grow and iterNext runs during it: this case is trickier. Because iterInit predates the Grow, it.buckets actually corresponds to h.oldbuckets, so iteration proceeds over the old array. Again bucket_N has two states: if not evacuated, return its data directly; if evacuated, re-fetch each entry by key, since it may already have been updated or deleted in the new buckets.
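Because mapiterinit draws a random startBucket and a random in-bucket offset, iteration order is deliberately unspecified; the loop below usually prints a different order on each pass:
package main

import "fmt"

func main() {
    m := map[int]string{1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
    for round := 0; round < 3; round++ {
        for k := range m {
            fmt.Print(k, " ") // start bucket and offset are re-randomized per range loop
        }
        fmt.Println()
    }
}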
map delete
As mentioned above, removal from a map is done by setting the slot's topHash to emptyOne. The deletion logic is much simpler than insertion:
func mapdelete(t *maptype, h *hmap, key unsafe.Pointer) {
if raceenabled && h != nil {
callerpc := getcallerpc()
pc := funcPC(mapdelete)
racewritepc(unsafe.Pointer(h), callerpc, pc)
raceReadObjectPC(t.key, key, callerpc, pc)
}
if msanenabled && h != nil {
msanread(key, t.key.size)
}
if h == nil || h.count == 0 {
if t.hashMightPanic() {
t.key.alg.hash(key, 0) // see issue 23734
}
return
}
if h.flags&hashWriting != 0 {
throw("concurrent map writes")
}
alg := t.key.alg
hash := alg.hash(key, uintptr(h.hash0))
// Set hashWriting after calling alg.hash, since alg.hash may panic,
// in which case we have not actually done a write (delete).
h.flags ^= hashWriting
bucket := hash & bucketMask(h.B)
if h.growing() { // if the map is growing, evacuate the corresponding oldbucket first
growWork(t, h, bucket)
}
b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
bOrig := b // the bucket's slot in the base region; the key itself may sit in an overflow bucket
top := tophash(hash)
search:
for ; b != nil; b = b.overflow(t) {
for i := uintptr(0); i < bucketCnt; i++ {
if b.tophash[i] != top {
if b.tophash[i] == emptyRest { // hit emptyRest: nothing further, stop searching
break search
}
continue
}
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
k2 := k
if t.indirectkey() {
k2 = *((*unsafe.Pointer)(k2))
}
if !alg.equal(key, k2) {
continue
}
// Only clear key if there are pointers in it.
// clear the key's memory
if t.indirectkey() {
*(*unsafe.Pointer)(k) = nil
} else if t.key.ptrdata != 0 {
memclrHasPointers(k, t.key.size)
}
e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
if t.indirectelem() {
*(*unsafe.Pointer)(e) = nil
} else if t.elem.ptrdata != 0 {
memclrHasPointers(e, t.elem.size)
} else {
memclrNoHeapPointers(e, t.elem.size)
}
b.tophash[i] = emptyOne
// If the bucket now ends in a bunch of emptyOne states,
// change those to emptyRest states.
// It would be nice to make this a separate function, but
// for loops are not currently inlineable.
// if this emptyOne is immediately followed by emptyRest, turn the trailing emptyOne run into emptyRest
if i == bucketCnt-1 {
if b.overflow(t) != nil && b.overflow(t).tophash[0] != emptyRest {
goto notLast
}
} else {
if b.tophash[i+1] != emptyRest {
goto notLast
}
}
for {
b.tophash[i] = emptyRest
if i == 0 {
if b == bOrig {
break // beginning of initial bucket, we're done.
}
// Find previous bucket, continue at its last entry.
c := b
for b = bOrig; b.overflow(t) != c; b = b.overflow(t) { // find the bucket preceding b in the chain
}
i = bucketCnt - 1
} else {
i--
}
if b.tophash[i] != emptyOne {
break
}
}
notLast:
h.count--
break search
}
}
if h.flags&hashWriting == 0 {
throw("concurrent map writes")
}
h.flags &^= hashWriting
}
The flow is:
- Find the bucket containing the key, set the slot's topHash to emptyOne, and clear the corresponding key and elem data.
- If the next slot is emptyRest (including the first slot of the following overflow_bucket), change this emptyOne to emptyRest and continue; otherwise finish.
- Step back to the previous slot. If we have reached the start of the base_bucket, finish; otherwise go back to step 2.
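One practical consequence: deletion only marks slots, it never shrinks the bucket array, so h.B never decreases. This can be observed with an unsafe peek at the header (the hmapHeader layout below mirrors Go 1.13's hmap; it is an internal detail, not a stable API):
package main

import (
    "fmt"
    "unsafe"
)

// hmapHeader mirrors the leading fields of runtime.hmap (Go 1.13 layout).
type hmapHeader struct {
    count int
    flags uint8
    B     uint8
}

// mapB extracts h.B; a map value is internally a pointer to the runtime hmap.
func mapB(m map[int]int) uint8 {
    return (*hmapHeader)(*(*unsafe.Pointer)(unsafe.Pointer(&m))).B
}

func main() {
    m := make(map[int]int)
    for i := 0; i < 10000; i++ {
        m[i] = i
    }
    fmt.Println("B after inserts:", mapB(m)) // grown to hold 10000 entries
    for i := 0; i < 10000; i++ {
        delete(m, i)
    }
    // len drops to 0 but B is unchanged: slots were only marked emptyOne/emptyRest.
    fmt.Println("B after deletes:", mapB(m), "len:", len(m))
}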
Summary
- A Go map is implemented as a hash table. Each bucket stores 8 key-elem pairs; keys are assigned to one of 2^B buckets by the low B bits of key_hash, and distinguished within a bucket by topHash (the top 8 bits of key_hash).
- In memory, the hash table is stored as a contiguous array plus jump pointers, where the jump pointers lead to overflow_buckets. Lookups by key therefore operate almost entirely on contiguous memory, which keeps access close to O(1).
- Growing the table drops the space occupied by deleted entries, making the data more compact, and reorders the overflow_buckets: buckets evacuated earlier get overflow_buckets at lower addresses. There are two forms: same_size (the bucket count is unchanged, and so is the key-to-bucket mapping) and double_size (the bucket count doubles; bit B-1 of key_hash decides whether an entry goes to the first or the second half, repartitioning the buckets).
- Deletion sets the slot's topHash to emptyOne; if the slot ends up in the trailing empty run of a bucket (here meaning the logical bucket, i.e. the base_bucket plus its overflow_buckets), the emptyOne marks are converted to emptyRest.