【LevelDB源码阅读】Random
是什么
用于产生随机数。
C语言中伪随机数生成算法实际上是采用了“线性同余法”,具体计算如下:
seed = (seed * A + C ) % M
其中A,C,M都是常数(一般取质数),当C=0时,叫作乘同余法。
为什么要用
为什么不用系统随机数?
学到什么
- 可以将长的二进制整数分解为多个段来解决问题
- 利用位运算优化算术运算
源码分析
构造函数
因为后面随机数生成采用seed_ = (seed_ * A) % M
,如果seed_为0或M(2^31-1)会导致所有seed_都是0。
// 0x7fffffffu == 2147483647L == 2^31-1 == 01111111 11111111 11111111 11111111
// 表达式s & 0x7fffffffu,确保结果值在[0,2147483647]范围内
explicit Random(uint32_t s) : seed_(s & 0x7fffffffu) {
// Avoid bad seeds.
if (seed_ == 0 || seed_ == 2147483647L) {
seed_ = 1;
}
}
Next方法
在计算produce%M
时使用static_cast<uint32_t>((product >> 31) + (product & M))
来进行优化,因为左移运算和与操作快于模操作。下面证明
produce%M == static_cast<uint32_t>((product >> 31) + (product & M))
product类型是uint64_t,可以将product二进制从左到右分为高33位和低31位,假设高33位为H,低31位为L,则:
product = H << 31 + L
左边 = produce % M = (H << 31 + L) % M = (H * 2^31 + L) % M=(H * M + H + L) % M = H + L
右边 = (product >> 31) + (product & M) = (H * 2^31 + L) >> 31 + L = (H * 2^31 + L) / 2^31 + L = H + L
因为低31位可能等于M,则左边 = H,右边 = H + L,所以需要判断当右边大于M时,减去M。
uint32_t Next() {
static const uint32_t M = 2147483647L; // 2^31-1
static const uint64_t A = 16807; // bits 14, 8, 7, 5, 2, 1, 0
// We are computing
// seed_ = (seed_ * A) % M, where M = 2^31-1
//
// seed_ must not be zero or M, or else all subsequent computed values
// will be zero or M respectively. For all other values, seed_ will end
// up cycling through every number in [1,M-1]
uint64_t product = seed_ * A;
// Compute (product % M) using the fact that ((x << 31) % M) == x.
seed_ = static_cast<uint32_t>((product >> 31) + (product & M));
// The first reduction may overflow by 1 bit, so we may need to
// repeat. mod == M is not possible; using > allows the faster
// sign-bit-based test.
if (seed_ > M) {
seed_ -= M;
}
return seed_;
}
其它接口
// Returns a uniformly distributed value in the range [0..n-1]
// REQUIRES: n > 0
uint32_t Uniform(int n) { return Next() % n; }
// Randomly returns true ~"1/n" of the time, and false otherwise.
// REQUIRES: n > 0
bool OneIn(int n) { return (Next() % n) == 0; }
// Skewed: pick "base" uniformly from range [0,max_log] and then
// return "base" random bits. The effect is to pick a number in the
// range [0,2^max_log-1] with exponential bias towards smaller numbers.
uint32_t Skewed(int max_log) { return Uniform(1 << Uniform(max_log + 1)); }