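How Spring Cloud Ribbon Works: Load-Balancing Rules
The two previous articles, "How Spring Cloud Ribbon Works" and "How Spring Cloud Ribbon Works: The Load Balancer", covered how Ribbon hooks into RestTemplate through the load-balancing interceptor, and how the load balancer obtains, filters, and updates the server list.
Because the load balancer ultimately calls the choose method of a load-balancing rule to pick a server, this article looks at Ribbon's load-balancing rules.
Rule classes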
- RandomRule
- RoundRobinRule
- RetryRule
- WeightedResponseTimeRule
- ClientConfigEnabledRoundRobinRule
- BestAvailableRule
- PredicateBasedRule
- AvailabilityFilteringRule
- ZoneAvoidanceRule
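Class inheritance hierarchy
- AbstractLoadBalancerRule is the common base class of all the rules listed above.
- RandomRule, RoundRobinRule, RetryRule, and ClientConfigEnabledRoundRobinRule extend AbstractLoadBalancerRule directly.
- WeightedResponseTimeRule extends RoundRobinRule.
- BestAvailableRule and PredicateBasedRule extend ClientConfigEnabledRoundRobinRule.
- AvailabilityFilteringRule and ZoneAvoidanceRule extend PredicateBasedRule.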
RandomRule
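A rule that picks a server at random.
In the choose method, a random index is drawn from the total number of server instances, and that index is then used to fetch a server from the list of reachable (up) servers.
Because some servers may be offline, the lookup can fail to return a live server. In that case the loop yields the thread and tries again, until it randomly lands on a server that is up. Note that the index is bounded by the size of the full list but applied to the reachable list, so with a standard java.util.List implementation the call upList.get(index) can also throw IndexOutOfBoundsException when the reachable list is shorter than the full list.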
```java
public class RandomRule extends AbstractLoadBalancerRule {

    /**
     * Randomly choose from all living servers
     */
    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            return null;
        }
        Server server = null;

        while (server == null) {
            if (Thread.interrupted()) {
                return null;
            }
            List<Server> upList = lb.getReachableServers();
            List<Server> allList = lb.getAllServers();

            int serverCount = allList.size();
            if (serverCount == 0) {
                /*
                 * No servers. End regardless of pass, because subsequent passes
                 * only get more restrictive.
                 */
                return null;
            }

            int index = chooseRandomInt(serverCount);
            server = upList.get(index);

            if (server == null) {
                /*
                 * The only time this should happen is if the server list were
                 * somehow trimmed. This is a transient condition. Retry after
                 * yielding.
                 */
                Thread.yield();
                continue;
            }

            if (server.isAlive()) {
                return (server);
            }

            // Shouldn't actually happen.. but must be transient or a bug.
            server = null;
            Thread.yield();
        }

        return server;

    }

    protected int chooseRandomInt(int serverCount) {
        return ThreadLocalRandom.current().nextInt(serverCount);
    }

    @Override
    public Server choose(Object key) {
        return choose(getLoadBalancer(), key);
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        // TODO Auto-generated method stub

    }
}
```
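To make the index issue in the listing concrete, here is a minimal sketch that uses plain strings in place of Ribbon's Server objects. The class RandomPickSketch and its methods are hypothetical names invented for this illustration and are not part of Ribbon. It bounds the random index by the size of the list it actually reads from, which is one possible way to avoid the mismatch, not what the library does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical demo class; it is not part of Ribbon.
class RandomPickSketch {

    // Picks a random element from the reachable list only, so the index
    // can never exceed the bounds of the list it is applied to.
    static String pickReachable(List<String> reachable) {
        if (reachable.isEmpty()) {
            return null; // nothing to choose from
        }
        int index = ThreadLocalRandom.current().nextInt(reachable.size());
        return reachable.get(index);
    }

    // Returns true when an index that is valid for allServers could be
    // out of range for upServers.
    static boolean indexCanOverflow(List<String> allServers, List<String> upServers) {
        return allServers.size() > upServers.size();
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("A", "B", "C", "D");
        List<String> up = Arrays.asList("A", "C"); // B and D are assumed offline
        System.out.println("overflow possible: " + indexCanOverflow(all, up));
        System.out.println("picked: " + pickReachable(up));
    }
}
```

The sketch only covers the index arithmetic. The real rule additionally checks isAlive() and retries, as the listing shows.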
RoundRobinRule
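A rule that cycles through the servers in order (round robin).
The choose method uses incrementAndGetModulo to pick servers in round-robin order, and gives up after 10 attempts if it cannot find a server that is alive and ready to serve.
incrementAndGetModulo relies on an atomic member variable, nextServerCyclicCounter, which holds the index of the current server. Each call advances the index by 1, bounded by the total number of servers, and wraps back to 0 once it reaches that count.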
```java
public class RoundRobinRule extends AbstractLoadBalancerRule {

    private AtomicInteger nextServerCyclicCounter;
    private static final boolean AVAILABLE_ONLY_SERVERS = true;
    private static final boolean ALL_SERVERS = false;

    private static Logger log = LoggerFactory.getLogger(RoundRobinRule.class);

    public RoundRobinRule() {
        nextServerCyclicCounter = new AtomicInteger(0);
    }

    public RoundRobinRule(ILoadBalancer lb) {
        this();
        setLoadBalancer(lb);
    }

    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            log.warn("no load balancer");
            return null;
        }

        Server server = null;
        int count = 0;
        while (server == null && count++ < 10) {
            List<Server> reachableServers = lb.getReachableServers();
            List<Server> allServers = lb.getAllServers();
            int upCount = reachableServers.size();
            int serverCount = allServers.size();

            if ((upCount == 0) || (serverCount == 0)) {
                log.warn("No up servers available from load balancer: " + lb);
                return null;
            }

            int nextServerIndex = incrementAndGetModulo(serverCount);
            server = allServers.get(nextServerIndex);

            if (server == null) {
                /* Transient. */
                Thread.yield();
                continue;
            }

            if (server.isAlive() && (server.isReadyToServe())) {
                return (server);
            }

            // Next.
            server = null;
        }

        if (count >= 10) {
            log.warn("No available alive servers after 10 tries from load balancer: "
                    + lb);
        }
        return server;
    }

    /**
     * Inspired by the implementation of {@link AtomicInteger#incrementAndGet()}.
     *
     * @param modulo The modulo to bound the value of the counter.
     * @return The next value.
     */
    private int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = nextServerCyclicCounter.get();
            int next = (current + 1) % modulo;
            if (nextServerCyclicCounter.compareAndSet(current, next))
                return next;
        }
    }

    @Override
    public Server choose(Object key) {
        return choose(getLoadBalancer(), key);
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
    }
}
```
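The counter logic can be tried in isolation. The following sketch copies the compare-and-set loop from incrementAndGetModulo into a small standalone class. CyclicCounterSketch is a hypothetical name used only for this illustration, and the modulo of 3 stands in for an assumed list of three servers.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it is not part of Ribbon.
class CyclicCounterSketch {

    // Starts at 0, like nextServerCyclicCounter in RoundRobinRule.
    private final AtomicInteger counter = new AtomicInteger(0);

    // Advances the counter by one, wrapping at modulo. The compare-and-set
    // retry loop keeps the update atomic without taking a lock.
    int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = counter.get();
            int next = (current + 1) % modulo;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        CyclicCounterSketch c = new CyclicCounterSketch();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 7; i++) {
            sb.append(c.incrementAndGetModulo(3)).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "1 2 0 1 2 0 1"
    }
}
```

Because the method returns the incremented value, the first index handed out is 1 rather than 0. After that the sequence simply cycles.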
WeightedResponseTimeRule
A rule that uses response time as the selection weight: the shorter a server's response time, the more likely it is to be chosen. It extends RoundRobinRule.
```java
public class WeightedResponseTimeRule extends RoundRobinRule {

    public static final IClientConfigKey<Integer> WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY = new IClientConfigKey<Integer>() {
        @Override
        public String key() {
            return "ServerWeightTaskTimerInterval";
        }

        @Override
        public String toString() {
            return key();
        }

        @Override
        public Class<Integer> type() {
            return Integer.class;
        }
    };

    public static final int DEFAULT_TIMER_INTERVAL = 30 * 1000;

    private int serverWeightTaskTimerInterval = DEFAULT_TIMER_INTERVAL;

    private static final Logger logger = LoggerFactory.getLogger(WeightedResponseTimeRule.class);

    // holds the accumulated weight from index 0 to current index
    // for example, element at index 2 holds the sum of weight of servers from 0 to 2
    private volatile List<Double> accumulatedWeights = new ArrayList<Double>();


    private final Random random = new Random();

    protected Timer serverWeightTimer = null;

    protected AtomicBoolean serverWeightAssignmentInProgress = new AtomicBoolean(false);

    String name = "unknown";

    public WeightedResponseTimeRule() {
        super();
    }

    public WeightedResponseTimeRule(ILoadBalancer lb) {
        super(lb);
    }

    @Override
    public void setLoadBalancer(ILoadBalancer lb) {
        super.setLoadBalancer(lb);
        if (lb instanceof BaseLoadBalancer) {
            name = ((BaseLoadBalancer) lb).getName();
        }
        initialize(lb);
    }

    void initialize(ILoadBalancer lb) {
        if (serverWeightTimer != null) {
            serverWeightTimer.cancel();
        }
        serverWeightTimer = new Timer("NFLoadBalancer-serverWeightTimer-"
                + name, true);
        serverWeightTimer.schedule(new DynamicServerWeightTask(), 0,
                serverWeightTaskTimerInterval);
        // do a initial run
        ServerWeight sw = new ServerWeight();
        sw.maintainWeights();

        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                logger
                        .info("Stopping NFLoadBalancer-serverWeightTimer-"
                                + name);
                serverWeightTimer.cancel();
            }
        }));
    }

    public void shutdown() {
        if (serverWeightTimer != null) {
            logger.info("Stopping NFLoadBalancer-serverWeightTimer-" + name);
            serverWeightTimer.cancel();
        }
    }

    List<Double> getAccumulatedWeights() {
        return Collections.unmodifiableList(accumulatedWeights);
    }

    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    @Override
    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            return null;
        }
        Server server = null;

        while (server == null) {
            // get hold of the current reference in case it is changed from the other thread
            List<Double> currentWeights = accumulatedWeights;
            if (Thread.interrupted()) {
                return null;
            }
            List<Server> allList = lb.getAllServers();

            int serverCount = allList.size();

            if (serverCount == 0) {
                return null;
            }

            int serverIndex = 0;

            // last one in the list is the sum of all weights
            double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1);
            // No server has been hit yet and total weight is not initialized
            // fallback to use round robin
            if (maxTotalWeight < 0.001d || serverCount != currentWeights.size()) {
                server = super.choose(getLoadBalancer(), key);
                if(server == null) {
                    return server;
                }
            } else {
                // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
                double randomWeight = random.nextDouble() * maxTotalWeight;
                // pick the server index based on the randomIndex
                int n = 0;
                for (Double d : currentWeights) {
                    if (d >= randomWeight) {
                        serverIndex = n;
                        break;
                    } else {
                        n++;
                    }
                }

                server = allList.get(serverIndex);
            }

            if (server == null) {
                /* Transient. */
                Thread.yield();
                continue;
            }

            if (server.isAlive()) {
                return (server);
            }

            // Next.
            server = null;
        }
        return server;
    }

    class DynamicServerWeightTask extends TimerTask {
        public void run() {
            ServerWeight serverWeight = new ServerWeight();
            try {
                serverWeight.maintainWeights();
            } catch (Exception e) {
                logger.error("Error running DynamicServerWeightTask for {}", name, e);
            }
        }
    }

    class ServerWeight {

        public void maintainWeights() {
            ILoadBalancer lb = getLoadBalancer();
            if (lb == null) {
                return;
            }

            if (!serverWeightAssignmentInProgress.compareAndSet(false, true)) {
                return;
            }

            try {
                logger.info("Weight adjusting job started");
                AbstractLoadBalancer nlb = (AbstractLoadBalancer) lb;
                LoadBalancerStats stats = nlb.getLoadBalancerStats();
                if (stats == null) {
                    // no statistics, nothing to do
                    return;
                }
                double totalResponseTime = 0;
                // find maximal 95% response time
                for (Server server : nlb.getAllServers()) {
                    // this will automatically load the stats if not in cache
                    ServerStats ss = stats.getSingleServerStat(server);
                    totalResponseTime += ss.getResponseTimeAvg();
                }
                // weight for each server is (sum of responseTime of all servers - responseTime)
                // so that the longer the response time, the less the weight and the less likely to be chosen
                Double weightSoFar = 0.0;

                // create new list and hot swap the reference
                List<Double> finalWeights = new ArrayList<Double>();
                for (Server server : nlb.getAllServers()) {
                    ServerStats ss = stats.getSingleServerStat(server);
                    double weight = totalResponseTime - ss.getResponseTimeAvg();
                    weightSoFar += weight;
                    finalWeights.add(weightSoFar);
                }
                setWeights(finalWeights);
            } catch (Exception e) {
                logger.error("Error calculating server weights", e);
            } finally {
                serverWeightAssignmentInProgress.set(false);
            }

        }
    }

    void setWeights(List<Double> weights) {
        this.accumulatedWeights = weights;
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        super.initWithNiwsConfig(clientConfig);
        serverWeightTaskTimerInterval = clientConfig.get(WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY, DEFAULT_TIMER_INTERVAL);
    }

}
```
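Since servers are chosen according to a response-time weight, let's first look at how the weights are computed.
The initialize method starts a timer that periodically (every 30 seconds by default, per DEFAULT_TIMER_INTERVAL) runs DynamicServerWeightTask.run, which in turn computes the server weights through the maintainWeights method of the inner class ServerWeight.
maintainWeights contains two for loops. The first sums the average response times of all servers; the second computes each server's weight and the running total of the weights.
The first for loop.
Suppose there are 4 servers with the following response times (ms):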
A: 200
B: 500
C: 30
D: 1200
Total response time:
200+500+30+1200=1930ms
Next, the second for loop computes the weight of each server.
A server's weight = total response time - the server's own response time:
A: 1930-200=1730
B: 1930-500=1430
C: 1930-30=1900
D: 1930-1200=730
Total weight:
1730+1430+1900+730=5790
As a result, the shorter a server's response time, the larger its weight.
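The arithmetic above can be reproduced with a few lines of Java. This is a minimal sketch that mirrors the two loops of maintainWeights using a plain array of response times instead of Ribbon's ServerStats. WeightSketch and accumulate are hypothetical names introduced only for this illustration. Like the original, it stores the running (accumulated) weight for each server rather than the individual weight.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of Ribbon.
class WeightSketch {

    // First loop: sum all response times.
    // Second loop: weight = total - own response time, stored as a running sum.
    static List<Double> accumulate(double[] responseTimes) {
        double totalResponseTime = 0;
        for (double t : responseTimes) {
            totalResponseTime += t;
        }
        List<Double> accumulated = new ArrayList<>();
        double weightSoFar = 0;
        for (double t : responseTimes) {
            weightSoFar += totalResponseTime - t;
            accumulated.add(weightSoFar);
        }
        return accumulated;
    }

    public static void main(String[] args) {
        // Response times of servers A, B, C, D from the example, in ms.
        double[] responseTimes = {200, 500, 30, 1200};
        System.out.println(accumulate(responseTimes)); // prints [1730.0, 3160.0, 5060.0, 5790.0]
    }
}
```

The last element, 5790, is the total weight from the example, and the gaps between neighbouring elements are the individual weights 1730, 1430, 1900, and 730.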
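Now look at the choose method. The key part is the third if inside the while loop. If the weights have not been computed yet (the total weight is below 0.001) or the number of weights does not match the number of servers, the rule falls back to the parent class RoundRobinRule and picks a server in round-robin order.
Once the servers and their computed weights are available, the method draws a random number bounded by the total weight and checks which interval of the accumulated weights it falls into; the server owning that interval is chosen. With the example values, the accumulated weights are 1730, 3160, 5060, and 5790, so a random number of 4000 falls into the third interval and selects server C.
So the conclusion is: the shorter a server's response time, the larger its weight, and the more likely it is to be chosen.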
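The selection step can be sketched on its own. The code below reuses the accumulated weights from the four-server example and mirrors both the fallback condition and the interval lookup of choose. WeightedPickSketch, shouldFallBack, and pickIndex are hypothetical names made up for this illustration, and the simulation in main is only a rough way to observe the resulting distribution.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; it is not part of Ribbon.
class WeightedPickSketch {

    // Same condition as the third "if" in choose: fall back to round robin when
    // the total weight is effectively zero or the weights are out of sync with the server list.
    static boolean shouldFallBack(List<Double> accumulated, int serverCount) {
        double maxTotalWeight = accumulated.isEmpty() ? 0 : accumulated.get(accumulated.size() - 1);
        return maxTotalWeight < 0.001d || serverCount != accumulated.size();
    }

    // Returns the index of the first accumulated weight that is >= randomWeight.
    static int pickIndex(List<Double> accumulated, double randomWeight) {
        int n = 0;
        for (double d : accumulated) {
            if (d >= randomWeight) {
                return n;
            }
            n++;
        }
        return 0; // mirrors the default serverIndex = 0 in the original loop
    }

    public static void main(String[] args) {
        List<Double> accumulated = Arrays.asList(1730.0, 3160.0, 5060.0, 5790.0);
        double total = accumulated.get(accumulated.size() - 1);
        Random random = new Random();
        int[] hits = new int[accumulated.size()];
        for (int i = 0; i < 100_000; i++) {
            hits[pickIndex(accumulated, random.nextDouble() * total)]++;
        }
        System.out.println(Arrays.toString(hits)); // hit counts for A, B, C, D
    }
}
```

With these weights the expected shares are roughly 29.9% for A, 24.7% for B, 32.8% for C, and 12.6% for D. The slowest server is chosen less often but still receives a noticeable share of the traffic, because the weights are differences from the total rather than inverse response times.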
Ribbon provides other load-balancing rules as well; they are not covered one by one here.