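Ultra-fast modulo, as seen in real source code
Today I was reading the source of a Go ring buffer and ran into a piece of code I couldn't make sense of: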
https://github.com/Workiva/go-datastructures/blob/c466da296827daa1e1efba14c912e2802533fe7f/queue/ring.go#L96
func (rb *RingBuffer) init(size uint64) {
	size = roundUp(size)
	rb.nodes = make(nodes, size)
	for i := uint64(0); i < size; i++ {
		rb.nodes[i] = node{position: i}
	}
	rb.mask = size - 1 // so we don't have to do this with every put/get operation
}

// Put adds the provided item to the queue. If the queue is full, this
// call will block until an item is added to the queue or Dispose is called
// on the queue. An error will be returned if the queue is disposed.
func (rb *RingBuffer) Put(item interface{}) error {
	_, err := rb.put(item, false)
	return err
}

// Offer adds the provided item to the queue if there is space. If the queue
// is full, this call will return false. An error will be returned if the
// queue is disposed.
func (rb *RingBuffer) Offer(item interface{}) (bool, error) {
	return rb.put(item, true)
}

func (rb *RingBuffer) put(item interface{}, offer bool) (bool, error) {
	var n *node
	pos := atomic.LoadUint64(&rb.queue)
L:
	for {
		if atomic.LoadUint64(&rb.disposed) == 1 {
			return false, ErrDisposed
		}

		n = &rb.nodes[pos&rb.mask]
		seq := atomic.LoadUint64(&n.position)
		switch dif := seq - pos; {
		case dif == 0:
			if atomic.CompareAndSwapUint64(&rb.queue, pos, pos+1) {
				break L
			}
		case dif < 0:
			panic(`Ring buffer in a compromised state during a put operation.`)
		default:
			pos = atomic.LoadUint64(&rb.queue)
		}

		if offer {
			return false, nil
		}

		runtime.Gosched() // free up the cpu before the next iteration
	}

	n.data = item
	atomic.StoreUint64(&n.position, pos+1)
	return true, nil
}

rb.mask = size - 1
n = &rb.nodes[pos&rb.mask]
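Semantically, these two lines have to be taking a modulus: it's a ring buffer, so the index must wrap around. But why is the modulo so simple?
Reading the code, size is always a power of two (init passes it through roundUp), and it turns out that pos & (size-1) equals pos % size.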
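For example, take size = 8.
Then mask = 0b111.
With pos = 9, 0b0111 & 0b1001 = 0b0001 = 1, which is exactly 9 % 8.
The reason is that a power of two minus 1 is all ones in binary, so the & simply strips off the higher bits, and what remains is the remainder.
At that point the author's code suddenly looked seriously impressive.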
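The claim is easy to check by brute force. The sketch below is not taken from go-datastructures; maskIndex and isPowerOfTwo are names made up here for illustration. It compares pos & (size-1) with pos % size for a power-of-two size, and also shows the shortcut giving a wrong answer when size is not a power of two.

```go
package main

import "fmt"

// maskIndex is a hypothetical helper (not part of go-datastructures).
// It equals pos % size only when size is a power of two.
func maskIndex(pos, size uint64) uint64 {
	return pos & (size - 1)
}

// isPowerOfTwo reports whether n has exactly one bit set.
func isPowerOfTwo(n uint64) bool {
	return n != 0 && n&(n-1) == 0
}

func main() {
	const size = 8
	for pos := uint64(0); pos < 32; pos++ {
		if maskIndex(pos, size) != pos%size {
			panic("mask and modulo disagree")
		}
	}

	// 0b1001 & 0b0111 = 0b0001, the same as 9 % 8.
	fmt.Println(maskIndex(9, 8), 9%8)

	// With size = 6 the mask is 0b101, which is not all ones,
	// so the two values below differ and the trick breaks.
	fmt.Println(maskIndex(9, 6), 9%6)

	fmt.Println(isPowerOfTwo(8), isPowerOfTwo(6))
}
```

A check like isPowerOfTwo is cheap enough to run once at construction time if the size comes from a caller.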
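That got me thinking: Go's map must need a modulo too, since a map has a bunch of buckets and the hash decides which bucket a key lands in. Does the Go source do the same thing? I opened map.go,
and sure enough, there are plenty of places that take the modulus this way: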
bucket := hash & bucketMask(h.B)
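For context, B in the map header is the base-2 logarithm of the bucket count, so the bucket count is always a power of two and the mask is 2^B - 1. Below is a simplified sketch of that idea, not the runtime's actual code; bucketCountMask and pickBucket are hypothetical names, and the real runtime works with its own types.

```go
package main

import "fmt"

// bucketCountMask is a hypothetical stand-in for the runtime helper:
// for 2^b buckets it returns 2^b - 1, i.e. b low bits set to one.
func bucketCountMask(b uint8) uint64 {
	return uint64(1)<<b - 1
}

// pickBucket maps a hash value to a bucket index in [0, 2^b).
func pickBucket(hash uint64, b uint8) uint64 {
	return hash & bucketCountMask(b)
}

func main() {
	const b = 3 // 2^3 = 8 buckets
	for _, h := range []uint64{0, 7, 8, 9, 0xdeadbeef} {
		// The second and third columns are always equal.
		fmt.Println(h, pickBucket(h, b), h%8)
	}
}
```

Storing the exponent rather than the count means a power-of-two bucket count holds by construction, so the mask is always valid.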
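Summary:
Whenever you need a modulo and the divisor is a power of two (and the value is unsigned or non-negative), pos & (size-1) can replace pos % size as a cheaper operation. It does not work for any other divisor.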
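To use the trick in your own code you first need a power-of-two size, which is what the roundUp call in init is there for. The ring buffer snippet calls roundUp without showing it, so here is one way such a function could be written with math/bits. This is an assumption for illustration, not the library's implementation, and roundUpPow2 is a made-up name.

```go
package main

import (
	"fmt"
	"math/bits"
)

// roundUpPow2 returns the smallest power of two >= v, and 1 for v == 0.
// Hypothetical helper: inputs above 1<<63 overflow and return 0.
func roundUpPow2(v uint64) uint64 {
	if v <= 1 {
		return 1
	}
	// bits.Len64(v-1) is the number of bits needed to represent v-1.
	return uint64(1) << bits.Len64(v-1)
}

func main() {
	size := roundUpPow2(10) // a requested capacity of 10 becomes 16
	mask := size - 1

	// Positions keep growing; the mask wraps them into [0, size).
	for _, pos := range []uint64{0, 15, 16, 17, 33} {
		fmt.Println(pos, pos&mask)
	}
}
```

The cost of this approach is memory: a requested capacity just above a power of two is nearly doubled.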