Guava (Part 1): RateLimiter Design Analysis
RateLimiter is an abstract class.
From the class-level comment on RateLimiter we can learn roughly the following:
1: A rate limiter hands out permits at a configured, fixed rate. Each call to acquire blocks when the bucket does not hold enough permits, until enough permits have been generated.
2: RateLimiter is thread-safe: it limits the aggregate rate across all threads. It does not, however, guarantee fairness.
3: acquire(1) and acquire(100) have the same effect on the request that makes the call: neither delays that first call. What they affect is the requests that come after it.
In terms of class hierarchy, there are two concrete implementations, SmoothWarmingUp and SmoothBursty, both of which are inner classes of SmoothRateLimiter. They correspond to two throttling styles: one with a warmup period and one without.
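Both variants are obtained through RateLimiter's static factory methods rather than constructed directly. A minimal usage sketch (the rate and warmup period below are arbitrary example values):

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class RateLimiterDemo {
    public static void main(String[] args) {
        // SmoothBursty: no warmup period, 5 permits per second
        RateLimiter bursty = RateLimiter.create(5.0);

        // SmoothWarmingUp: same rate, but ramps up over a 3-second warmup period
        RateLimiter warmingUp = RateLimiter.create(5.0, 3, TimeUnit.SECONDS);

        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a permit is available and
            // returns the time (in seconds) spent waiting
            double waited = bursty.acquire();
            System.out.printf("request %d waited %.3fs%n", i, waited);
        }
    }
}
```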
Starting with SmoothRateLimiter and its class-level comment:
How is the RateLimiter designed, and why?

The primary feature of a RateLimiter is its "stable rate", the maximum rate that it should allow in normal conditions. This is enforced by "throttling" incoming requests as needed. For example, we could compute the appropriate throttle time for an incoming request and make the calling thread wait for that time.

The simplest way to maintain a rate of QPS is to keep the timestamp of the last granted request, and ensure that (1/QPS) seconds have elapsed since then. For example, for a rate of QPS=5 (5 tokens per second, i.e. one every 200ms), if we ensure that a request isn't granted earlier than 200ms after the last one, then we achieve the intended rate. If a request comes and the last request was granted only 100ms ago, then we wait for another 100ms. At this rate, serving 15 fresh permits (i.e. an acquire(15) request) naturally takes 3 seconds.

It is important to realize that such a RateLimiter has a very superficial memory of the past: it only remembers the last request. What if the RateLimiter was unused for a long period of time, and then a request arrived and was immediately granted? Such a RateLimiter would immediately forget about that past underutilization. This may result in either underutilization or overflow, depending on the real-world consequences of not using the expected rate.

Past underutilization could mean that excess resources are available. Then, the RateLimiter should speed up for a while, to take advantage of these resources. This is important when the rate is applied to networking (limiting bandwidth), where past underutilization typically translates to "almost empty buffers", which can be filled immediately.

On the other hand, past underutilization could mean that "the server responsible for handling the request has become less ready for future requests", i.e. its caches become stale and requests become more likely to trigger expensive operations (a more extreme case is when a server has just booted and is mostly busy getting itself up to speed).

To deal with such scenarios, we add an extra dimension, that of "past underutilization", modeled by the "storedPermits" variable. This variable is zero when there is no underutilization, and it can grow up to maxStoredPermits for sufficiently large underutilization.
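As a baseline for what follows, here is a minimal sketch of that naive "remember only the last grant" scheme; the class and field names are invented for illustration and this is not Guava's implementation. storedPermits, introduced above, is exactly the dimension this scheme lacks:

```java
import java.util.concurrent.TimeUnit;

/**
 * Naive limiter: remember only the last granted request and make each caller
 * wait until one stable interval (1/QPS) per requested permit has elapsed.
 * Idle time is simply forgotten -- the "superficial memory" problem above.
 */
class NaiveRateLimiter {
    private final double stableIntervalMicros; // 1/QPS, expressed in microseconds
    private double lastGrantedMicros;          // timestamp of the last granted request

    NaiveRateLimiter(double permitsPerSecond) {
        this.stableIntervalMicros = TimeUnit.SECONDS.toMicros(1) / permitsPerSecond;
        this.lastGrantedMicros = nowMicros();
    }

    synchronized void acquire(int permits) throws InterruptedException {
        // the request may only be granted one stable interval per permit
        // after the previously granted request
        double readyAtMicros = lastGrantedMicros + permits * stableIntervalMicros;
        double waitMicros = readyAtMicros - nowMicros();
        if (waitMicros > 0) {
            TimeUnit.MICROSECONDS.sleep((long) waitMicros);
        }
        lastGrantedMicros = Math.max(readyAtMicros, nowMicros());
    }

    private static long nowMicros() {
        return TimeUnit.NANOSECONDS.toMicros(System.nanoTime());
    }
}
```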
So, the permits requested by an invocation of acquire(permits) are served from:

- stored permits (if available)
- fresh permits (for any remaining permits)

How this works is best explained with an example. For a RateLimiter that produces 1 token per second, every second that goes by with the RateLimiter being unused, we increase storedPermits by 1. Say we leave the RateLimiter unused for 10 seconds (i.e., we expected a request at time X, but it actually arrives at X + 10 seconds; this is also related to the point made in the last paragraph). Then storedPermits becomes 10.0 (assuming maxStoredPermits >= 10.0). At that point, a request of acquire(3) arrives. We serve it out of storedPermits, reducing that to 7.0 (how this is translated into throttling time is discussed later). Immediately after, assume an acquire(10) request arrives. We serve it partly from storedPermits, using all of the remaining 7.0, and the remaining 3.0 we serve with fresh permits produced by the rate limiter.

We already know how much time it takes to serve 3 fresh permits: if the rate is "1 token per second", this takes 3 seconds. But what does it mean to serve 7 stored permits? As explained above, there is no unique answer. If we are primarily interested in dealing with underutilization, then we want stored permits to be given out faster than fresh ones, because underutilization means free resources for the taking. If we are primarily interested in dealing with overflow, then stored permits could be given out slower than fresh ones. Thus, we need a function (different in each case) that translates storedPermits into throttling time.

This role is played by storedPermitsToWaitTime(double storedPermits, double permitsToTake). The underlying model is a continuous function mapping storedPermits (from 0.0 to maxStoredPermits) onto the 1/rate (i.e. interval) that is in effect at the given storedPermits. "storedPermits" essentially measures unused time; we spend unused time buying/storing permits. Rate is "permits / time", thus "1 / rate = time / permits", and "1/rate" (time/permits) times "permits" gives time. In other words, integrals of this function (which is what storedPermitsToWaitTime() computes) correspond to the minimum interval between subsequent requests, for the specified number of requested permits.
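Before looking at the integral itself, here is a toy model of just the stored/fresh split, tracing the acquire(3) / acquire(10) example above. The method name mirrors SmoothRateLimiter's storedPermitsToWaitTime, but the body is illustrative only (it uses the SmoothBursty rule, under which stored permits are free; SmoothWarmingUp integrates the trapezoid shown later instead). Note also that in the real implementation the computed wait is charged to the next caller rather than the current one:

```java
import java.util.concurrent.TimeUnit;

/** Toy model of splitting acquire(permits) into stored + fresh permits. */
class StoredPermitsModel {
    double storedPermits = 10.0; // e.g. after 10 idle seconds at 1 permit/second
    final double stableIntervalMicros = TimeUnit.SECONDS.toMicros(1); // 1 QPS

    // SmoothBursty rule: stored permits are handed out for free.
    double storedPermitsToWaitTime(double available, double take) {
        return 0.0;
    }

    double waitMicrosFor(double requested) {
        double fromStored = Math.min(requested, storedPermits);
        double fresh = requested - fromStored;
        // stored permits are charged via the throttling function,
        // fresh permits always cost one stable interval each
        double wait = storedPermitsToWaitTime(storedPermits, fromStored)
                + fresh * stableIntervalMicros;
        storedPermits -= fromStored;
        return wait;
    }

    public static void main(String[] args) {
        StoredPermitsModel m = new StoredPermitsModel();
        System.out.println(m.waitMicrosFor(3));  // 0.0       -> storedPermits drops to 7.0
        System.out.println(m.waitMicrosFor(10)); // 3000000.0 -> 7 stored + 3 fresh permits
    }
}
```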
Here is an example of storedPermitsToWaitTime: if storedPermits == 10.0 and we want 3 permits, we take them from storedPermits, reducing it to 7.0, and compute the throttling for these via a call to storedPermitsToWaitTime(storedPermits = 10.0, permitsToTake = 3.0), which evaluates the integral of the function from 7.0 to 10.0.

Using integrals guarantees that the effect of a single acquire(3) is equivalent to { acquire(1); acquire(1); acquire(1); } or { acquire(2); acquire(1); }, etc., since the integral of the function over [7.0, 10.0] equals the sum of the integrals over [7.0, 8.0], [8.0, 9.0] and [9.0, 10.0], no matter what the function is. This guarantees that we handle requests of varying weight (permits) correctly, whatever the actual function is, so we can tweak the function freely. (The only requirement, obviously, is that we can compute its integrals.)

Note that if we chose a horizontal line for this function, at a height of exactly (1/QPS), then the function has no effect: we serve storedPermits at exactly the same cost as fresh ones (1/QPS is the cost of each). We use this trick later.

If we pick a function that goes below that horizontal line, we reduce the area under the function, and thus the time, so the RateLimiter becomes faster after a period of underutilization. If, on the other hand, we pick a function that goes above that horizontal line, the area (time) is increased, storedPermits become more costly than fresh permits, and the RateLimiter becomes slower after a period of underutilization.

Last, but not least: consider a RateLimiter with a rate of 1 permit per second, currently completely unused, when an expensive acquire(100) request comes in. It would be nonsensical to just wait 100 seconds and only then start the actual task. Why wait without doing anything? A much better approach is to allow the request right away (as if it were an acquire(1) request) and postpone subsequent requests as needed. In this version, we let the task start immediately and postpone future requests by 100 seconds, so work can get done in the meantime instead of waiting idly.

This has an important consequence: the RateLimiter doesn't remember the time of the last request, it remembers the (expected) time of the next request. This also enables us to tell immediately (see tryAcquire(timeout)) whether a particular timeout is enough to get us to the point of the next scheduling time, since we always maintain that.
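This "pre-pay" behavior and the next-request bookkeeping are easy to observe against the public API. A small sketch (the rate and timeout are arbitrary example values):

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class PrepayDemo {
    public static void main(String[] args) {
        RateLimiter limiter = RateLimiter.create(1.0); // 1 permit per second, unused so far

        // The expensive request is granted right away; its cost is deferred.
        double waited = limiter.acquire(100);
        System.out.printf("acquire(100) waited %.3fs%n", waited); // ~0.000s

        // The limiter now knows the next permit can only be granted ~100s from
        // now, so tryAcquire with a 5s timeout can fail immediately, without blocking.
        boolean granted = limiter.tryAcquire(1, 5, TimeUnit.SECONDS);
        System.out.println("tryAcquire within 5s: " + granted); // false

        // A plain acquire() here would block for roughly 100 seconds,
        // paying for the permits handed out above.
    }
}
```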
What we mean by "an unused RateLimiter" is also defined by that notion: when we observe that the "expected arrival time of the next request" is actually in the past, then the difference (now - past) is the amount of time the RateLimiter was formally unused, and it is that amount of time which we translate into storedPermits (we increase storedPermits by the number of permits that would have been produced in that idle time). So, if rate == 1 permit per second, and arrivals come exactly one second after the previous one, then storedPermits is never increased; we only increase it for arrivals later than the expected one second.

Next, the class comment on SmoothWarmingUp. It implements the following function, where coldInterval = coldFactor * stableInterval:

<pre>
          ^ throttling == 1/rate == time/permits
          |
    cold  +                  /
 interval |                 /.
          |                / .
          |               /  .   ← "warmup period" is the area of the trapezoid between
          |              /   .     thresholdPermits and maxPermits
          |             /    .
          |            /     .
          |           /      .
   stable +----------/  WARM .
 interval |          .   UP  .
          |          . PERIOD.
          |          .       .
        0 +----------+-------+--------------→ storedPermits
          0  thresholdPermits  maxPermits
</pre>

Before going into the details of this particular function, let's keep in mind the basics:

1: The state of the RateLimiter (storedPermits) is a vertical line in this figure; the x-axis is the current value of storedPermits.
2: When the RateLimiter is not used, this line moves to the right (up to maxPermits).
3: When the RateLimiter is used, it moves to the left (down to zero), since if we have storedPermits, we serve from those first.
4: When unused, we move right at a constant rate! That rate is chosen as maxPermits / warmupPeriod, which ensures that the time it takes to go from 0 to maxPermits equals warmupPeriod. (Note that refilling to the right is not computed via the integral; only spending permits is.)
5: When used, the time it takes, as explained in the introductory class note, is equal to the integral of the function between X permits and X-K permits, assuming we want to spend K saved permits.

In summary, the time it takes to move to the left (spend K permits) is equal to the area under the function over a width of K.

Assuming we have saturated demand, the time to go from maxPermits to thresholdPermits is equal to warmupPeriod, and the time to go from thresholdPermits to 0 is warmupPeriod/2. (The reason this is warmupPeriod/2 is to maintain the behavior of the original implementation, where coldFactor was hard-coded as 3.)

It remains to calculate thresholdPermits and maxPermits:

- The time to go from thresholdPermits to 0 is equal to the integral of the function between 0 and thresholdPermits, which is thresholdPermits * stableInterval. As stated above, it is also equal to warmupPeriod/2, therefore:

  thresholdPermits = 0.5 * warmupPeriod / stableInterval

- The time to go from maxPermits to thresholdPermits is equal to the integral of the function between thresholdPermits and maxPermits. This is the area of the pictured trapezoid, 0.5 * (stableInterval + coldInterval) * (maxPermits - thresholdPermits), and it is also equal to warmupPeriod, so:

  maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval)
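To make the two formulas concrete, here is a small sketch that simply evaluates them, assuming coldFactor = 3 (the value hard-coded in SmoothWarmingUp) and arbitrary example inputs of 5 permits/second with a 4-second warmup:

```java
import java.util.concurrent.TimeUnit;

public class WarmupMath {
    public static void main(String[] args) {
        double permitsPerSecond = 5.0;
        double warmupPeriodMicros = TimeUnit.SECONDS.toMicros(4); // 4-second warmup
        double coldFactor = 3.0;

        double stableIntervalMicros = TimeUnit.SECONDS.toMicros(1) / permitsPerSecond; // 200_000
        double coldIntervalMicros = stableIntervalMicros * coldFactor;                 // 600_000

        double thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;     // 10.0
        double maxPermits = thresholdPermits
                + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros); // 20.0

        System.out.println("thresholdPermits = " + thresholdPermits);
        System.out.println("maxPermits       = " + maxPermits);
    }
}
```

With these inputs, thresholdPermits comes out to 10 and maxPermits to 20, matching what you get by plugging the same numbers into the formulas above by hand.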
Some concluding notes on this rate limiter:

1: It is a concrete implementation of the token bucket algorithm, so when thinking about how permits are acquired and how permits are added back, it helps to walk through the figure above with the token bucket flow in mind.

2: The figure above describes exactly that process of adding permits to the bucket and consuming them.

2.1: When the limiter is idle (not being used), permits are generated and stored in the bucket at a fixed rate. For SmoothBursty, stored permits are handed out without any throttling: whatever is in the bucket can be taken at once, which can be seen in acquire (see the comparison sketch after this list). Its generation interval is fixed at stableInterval = 1/permitsPerSecond. For SmoothWarmingUp the refill rate is maxPermits / warmupPeriodMicros instead, because it is designed so that going from 0 stored permits to maxPermits takes exactly warmupPeriod.

When permits are consumed, the figure moves to the left; the y-axis is the time it costs to consume one permit, and the x-axis is the number of permits stored in the bucket.

3: It was mentioned above that the time to go from thresholdPermits to 0 is warmupPeriod/2, which the comment explains as preserving the behavior of the original implementation, where coldFactor was hard-coded as 3. I have not fully understood the reasoning; my rough guess is that between maxPermits and thresholdPermits the per-permit cost on the y-axis is on average about twice the stable cost (with coldFactor = 3, the trapezoid's average height is (stableInterval + coldInterval)/2 = 2 * stableInterval), so by convention draining from maxPermits to thresholdPermits takes twice as long as draining from thresholdPermits to 0.

4: With the formula for the trapezoid's area and the relation from (3), knowing warmupPeriod and permitsPerSecond is enough to compute maxPermits and thresholdPermits.
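For point 2.1, here is a hedged comparison sketch (rates, warmup period and idle time are arbitrary example values): after some idle time, SmoothBursty hands out its stored permits without extra waiting, while SmoothWarmingUp charges for them using the trapezoid, so the same burst should take noticeably longer:

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class BurstyVsWarmup {
    public static void main(String[] args) throws InterruptedException {
        RateLimiter bursty = RateLimiter.create(5.0);
        RateLimiter warm = RateLimiter.create(5.0, 2, TimeUnit.SECONDS);

        // Let both limiters sit idle so storedPermits can build up.
        Thread.sleep(3000);

        // SmoothBursty hands out stored permits with no extra wait,
        // so this burst of 5 should complete almost instantly.
        for (int i = 0; i < 5; i++) {
            System.out.printf("bursty waited %.3fs%n", bursty.acquire());
        }

        // SmoothWarmingUp charges stored permits using the trapezoid,
        // so the same burst is spread out while the limiter "warms up".
        for (int i = 0; i < 5; i++) {
            System.out.printf("warmup waited %.3fs%n", warm.acquire());
        }
    }
}
```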
How the limiter actually performs throttling along the lines of this design will be covered in the next post. If anything above is inaccurate, readers are welcome to leave a comment and point it out.