How Spring Cloud Ribbon Works: Load-Balancing Rules

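The two previous posts, "How Spring Cloud Ribbon Works" and "How Spring Cloud Ribbon Works: The Load Balancer", covered how Ribbon hooks into RestTemplate through a load-balancing interceptor, and how the load balancer obtains, filters, and updates the server list.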

The load balancer ultimately selects a server by calling the choose method of a load-balancing rule, so this post walks through Ribbon's rules.

Rule classes

RandomRule
RoundRobinRule
RetryRule
WeightedResponseTimeRule
ClientConfigEnabledRoundRobinRule
BestAvailableRule
PredicateBasedRule
AvailabilityFilteringRule
ZoneAvoidanceRule

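Class hierarchy

AbstractLoadBalancerRule
    RandomRule
    RoundRobinRule
        WeightedResponseTimeRule
    RetryRule
    ClientConfigEnabledRoundRobinRule
        BestAvailableRule
        PredicateBasedRule
            AvailabilityFilteringRule
            ZoneAvoidanceRule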

RandomRule
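Random selection.
The choose method draws a random index bounded by the total number of server instances, then reads the server at that index from the list of reachable (up) servers.
Because some instances may be offline, the lookup against the reachable list can come up empty; in that case the thread yields and tries again, until a live server is drawn. Note that the index is bounded by the size of the full list but applied to the reachable list, so when the reachable list is shorter, a standard List implementation would throw IndexOutOfBoundsException on an out-of-range index rather than return null.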

package com.netflix.loadbalancer;

import com.netflix.client.config.IClientConfig;

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class RandomRule extends AbstractLoadBalancerRule {

    /**
     * Randomly choose from all living servers
     */
    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            return null;
        }
        Server server = null;

        while (server == null) {
            if (Thread.interrupted()) {
                return null;
            }
            List<Server> upList = lb.getReachableServers();
            List<Server> allList = lb.getAllServers();

            int serverCount = allList.size();
            if (serverCount == 0) {
                /*
                 * No servers. End regardless of pass, because subsequent passes
                 * only get more restrictive.
                 */
                return null;
            }

            int index = chooseRandomInt(serverCount);
            server = upList.get(index);

            if (server == null) {
                /*
                 * The only time this should happen is if the server list were
                 * somehow trimmed. This is a transient condition. Retry after
                 * yielding.
                 */
                Thread.yield();
                continue;
            }

            if (server.isAlive()) {
                return (server);
            }

            // Shouldn't actually happen.. but must be transient or a bug.
            server = null;
            Thread.yield();
        }

        return server;

    }

    protected int chooseRandomInt(int serverCount) {
        return ThreadLocalRandom.current().nextInt(serverCount);
    }

    @Override
    public Server choose(Object key) {
        return choose(getLoadBalancer(), key);
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        // TODO Auto-generated method stub

    }
}
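
The draw-and-retry idea in the listing above can be reduced to a small standalone sketch. This is not Ribbon's code: RandomPickSketch, Node, and pickAlive are hypothetical names introduced only for illustration. Unlike the listing, the sketch bounds the random index by the list it actually reads from and gives up after a fixed number of tries, which are assumptions of this example rather than Ribbon behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class RandomPickSketch {

    // Hypothetical stand-in for Ribbon's Server class.
    static final class Node {
        final String id;
        final boolean alive;

        Node(String id, boolean alive) {
            this.id = id;
            this.alive = alive;
        }
    }

    // Draw random candidates from the reachable list until one is alive,
    // giving up (returning null) after maxTries attempts.
    static Node pickAlive(List<Node> reachable, int maxTries) {
        if (reachable == null || reachable.isEmpty()) {
            return null;
        }
        for (int i = 0; i < maxTries; i++) {
            // The index is bounded by the size of the list being read.
            int index = ThreadLocalRandom.current().nextInt(reachable.size());
            Node candidate = reachable.get(index);
            if (candidate != null && candidate.alive) {
                return candidate;
            }
            Thread.yield(); // let other threads run before the next attempt
        }
        return null;
    }

    public static void main(String[] args) {
        List<Node> up = Arrays.asList(new Node("a", true), new Node("b", true));
        Node chosen = pickAlive(up, 10);
        System.out.println(chosen == null ? "none" : chosen.id);
    }
}
```

Bounding the retries trades the listing's "loop until something is alive" behavior for a guaranteed return, which may or may not suit a real client.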

RoundRobinRule
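Linear round-robin selection.
The choose method obtains the next server through incrementAndGetModulo.
The class keeps an atomic member, nextServerCyclicCounter, as the index of the current server. Each call advances the index by 1 modulo the total number of servers, so it wraps back to 0 after the last index. choose makes at most 10 attempts to find a server that is alive and ready to serve, and returns null if none is found.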

package com.netflix.loadbalancer;

import com.netflix.client.config.IClientConfig;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinRule extends AbstractLoadBalancerRule {

    private AtomicInteger nextServerCyclicCounter;
    private static final boolean AVAILABLE_ONLY_SERVERS = true;
    private static final boolean ALL_SERVERS = false;

    private static Logger log = LoggerFactory.getLogger(RoundRobinRule.class);

    public RoundRobinRule() {
        nextServerCyclicCounter = new AtomicInteger(0);
    }

    public RoundRobinRule(ILoadBalancer lb) {
        this();
        setLoadBalancer(lb);
    }

    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            log.warn("no load balancer");
            return null;
        }

        Server server = null;
        int count = 0;
        while (server == null && count++ < 10) {
            List<Server> reachableServers = lb.getReachableServers();
            List<Server> allServers = lb.getAllServers();
            int upCount = reachableServers.size();
            int serverCount = allServers.size();

            if ((upCount == 0) || (serverCount == 0)) {
                log.warn("No up servers available from load balancer: " + lb);
                return null;
            }

            int nextServerIndex = incrementAndGetModulo(serverCount);
            server = allServers.get(nextServerIndex);

            if (server == null) {
                /* Transient. */
                Thread.yield();
                continue;
            }

            if (server.isAlive() && (server.isReadyToServe())) {
                return (server);
            }

            // Next.
            server = null;
        }

        if (count >= 10) {
            log.warn("No available alive servers after 10 tries from load balancer: "
                    + lb);
        }
        return server;
    }

    /**
     * Inspired by the implementation of {@link AtomicInteger#incrementAndGet()}.
     *
     * @param modulo The modulo to bound the value of the counter.
     * @return The next value.
     */
    private int incrementAndGetModulo(int modulo) {
        for (;;) {
            int current = nextServerCyclicCounter.get();
            int next = (current + 1) % modulo;
            if (nextServerCyclicCounter.compareAndSet(current, next))
                return next;
        }
    }

    @Override
    public Server choose(Object key) {
        return choose(getLoadBalancer(), key);
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
    }
}
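
To see what sequence the compare-and-set loop in incrementAndGetModulo produces, here is a minimal standalone sketch. RoundRobinCounterSketch and its next method are hypothetical names; the loop body mirrors the listing above, with the counter starting at 0 as in the RoundRobinRule constructor.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinCounterSketch {

    private final AtomicInteger counter = new AtomicInteger(0);

    // Advance the counter by 1 modulo the server count. The compare-and-set
    // retries if another thread changed the counter in between.
    int next(int modulo) {
        for (;;) {
            int current = counter.get();
            int next = (current + 1) % modulo;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        RoundRobinCounterSketch c = new RoundRobinCounterSketch();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            sb.append(c.next(3)).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "1 2 0 1 2"
    }
}
```

With three servers the first index handed out is 1 rather than 0, because the counter is incremented before it is returned.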

WeightedResponseTimeRule
Uses response time as the selection weight: the shorter a server's average response time, the more likely it is to be chosen. The class extends RoundRobinRule.

package com.netflix.loadbalancer;

import com.netflix.client.config.IClientConfig;
import com.netflix.client.config.IClientConfigKey;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class WeightedResponseTimeRule extends RoundRobinRule {

    public static final IClientConfigKey<Integer> WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY = new IClientConfigKey<Integer>() {
        @Override
        public String key() {
            return "ServerWeightTaskTimerInterval";
        }

        @Override
        public String toString() {
            return key();
        }

        @Override
        public Class<Integer> type() {
            return Integer.class;
        }
    };

    public static final int DEFAULT_TIMER_INTERVAL = 30 * 1000;

    private int serverWeightTaskTimerInterval = DEFAULT_TIMER_INTERVAL;

    private static final Logger logger = LoggerFactory.getLogger(WeightedResponseTimeRule.class);

    // holds the accumulated weight from index 0 to current index
    // for example, element at index 2 holds the sum of weight of servers from 0 to 2
    private volatile List<Double> accumulatedWeights = new ArrayList<Double>();


    private final Random random = new Random();

    protected Timer serverWeightTimer = null;

    protected AtomicBoolean serverWeightAssignmentInProgress = new AtomicBoolean(false);

    String name = "unknown";

    public WeightedResponseTimeRule() {
        super();
    }

    public WeightedResponseTimeRule(ILoadBalancer lb) {
        super(lb);
    }

    @Override
    public void setLoadBalancer(ILoadBalancer lb) {
        super.setLoadBalancer(lb);
        if (lb instanceof BaseLoadBalancer) {
            name = ((BaseLoadBalancer) lb).getName();
        }
        initialize(lb);
    }

    void initialize(ILoadBalancer lb) {
        if (serverWeightTimer != null) {
            serverWeightTimer.cancel();
        }
        serverWeightTimer = new Timer("NFLoadBalancer-serverWeightTimer-"
                + name, true);
        serverWeightTimer.schedule(new DynamicServerWeightTask(), 0,
                serverWeightTaskTimerInterval);
        // do a initial run
        ServerWeight sw = new ServerWeight();
        sw.maintainWeights();

        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                logger
                        .info("Stopping NFLoadBalancer-serverWeightTimer-"
                                + name);
                serverWeightTimer.cancel();
            }
        }));
    }

    public void shutdown() {
        if (serverWeightTimer != null) {
            logger.info("Stopping NFLoadBalancer-serverWeightTimer-" + name);
            serverWeightTimer.cancel();
        }
    }

    List<Double> getAccumulatedWeights() {
        return Collections.unmodifiableList(accumulatedWeights);
    }

    @edu.umd.cs.findbugs.annotations.SuppressWarnings(value = "RCN_REDUNDANT_NULLCHECK_OF_NULL_VALUE")
    @Override
    public Server choose(ILoadBalancer lb, Object key) {
        if (lb == null) {
            return null;
        }
        Server server = null;

        while (server == null) {
            // get hold of the current reference in case it is changed from the other thread
            List<Double> currentWeights = accumulatedWeights;
            if (Thread.interrupted()) {
                return null;
            }
            List<Server> allList = lb.getAllServers();

            int serverCount = allList.size();

            if (serverCount == 0) {
                return null;
            }

            int serverIndex = 0;

            // last one in the list is the sum of all weights
            double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1);
            // No server has been hit yet and total weight is not initialized
            // fallback to use round robin
            if (maxTotalWeight < 0.001d || serverCount != currentWeights.size()) {
                server =  super.choose(getLoadBalancer(), key);
                if(server == null) {
                    return server;
                }
            } else {
                // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
                double randomWeight = random.nextDouble() * maxTotalWeight;
                // pick the server index based on the randomIndex
                int n = 0;
                for (Double d : currentWeights) {
                    if (d >= randomWeight) {
                        serverIndex = n;
                        break;
                    } else {
                        n++;
                    }
                }

                server = allList.get(serverIndex);
            }

            if (server == null) {
                /* Transient. */
                Thread.yield();
                continue;
            }

            if (server.isAlive()) {
                return (server);
            }

            // Next.
            server = null;
        }
        return server;
    }

    class DynamicServerWeightTask extends TimerTask {
        public void run() {
            ServerWeight serverWeight = new ServerWeight();
            try {
                serverWeight.maintainWeights();
            } catch (Exception e) {
                logger.error("Error running DynamicServerWeightTask for {}", name, e);
            }
        }
    }

    class ServerWeight {

        public void maintainWeights() {
            ILoadBalancer lb = getLoadBalancer();
            if (lb == null) {
                return;
            }

            if (!serverWeightAssignmentInProgress.compareAndSet(false,  true))  {
                return;
            }

            try {
                logger.info("Weight adjusting job started");
                AbstractLoadBalancer nlb = (AbstractLoadBalancer) lb;
                LoadBalancerStats stats = nlb.getLoadBalancerStats();
                if (stats == null) {
                    // no statistics, nothing to do
                    return;
                }
                double totalResponseTime = 0;
                // find maximal 95% response time
                for (Server server : nlb.getAllServers()) {
                    // this will automatically load the stats if not in cache
                    ServerStats ss = stats.getSingleServerStat(server);
                    totalResponseTime += ss.getResponseTimeAvg();
                }
                // weight for each server is (sum of responseTime of all servers - responseTime)
                // so that the longer the response time, the less the weight and the less likely to be chosen
                Double weightSoFar = 0.0;

                // create new list and hot swap the reference
                List<Double> finalWeights = new ArrayList<Double>();
                for (Server server : nlb.getAllServers()) {
                    ServerStats ss = stats.getSingleServerStat(server);
                    double weight = totalResponseTime - ss.getResponseTimeAvg();
                    weightSoFar += weight;
                    finalWeights.add(weightSoFar);
                }
                setWeights(finalWeights);
            } catch (Exception e) {
                logger.error("Error calculating server weights", e);
            } finally {
                serverWeightAssignmentInProgress.set(false);
            }

        }
    }

    void setWeights(List<Double> weights) {
        this.accumulatedWeights = weights;
    }

    @Override
    public void initWithNiwsConfig(IClientConfig clientConfig) {
        super.initWithNiwsConfig(clientConfig);
        serverWeightTaskTimerInterval = clientConfig.get(WEIGHT_TASK_TIMER_INTERVAL_CONFIG_KEY, DEFAULT_TIMER_INTERVAL);
    }

}

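Since servers are chosen by response-time weight, start with how the weights are computed.
The initialize method starts a timer that periodically runs DynamicServerWeightTask.run (every 30 seconds by default, per DEFAULT_TIMER_INTERVAL). The calculation itself lives in the maintainWeights method of the inner class ServerWeight.
maintainWeights contains two for loops: the first sums the average response times of all servers, and the second computes each server's weight and accumulates the running total.
First for loop.
Suppose there are 4 servers with the following average response times (ms):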

A: 200
B: 500
C: 30
D: 1200

Total response time:
200+500+30+1200=1930ms
The second for loop then computes each server's weight.
Weight = total response time - the server's own response time:
A: 1930-200=1730
B: 1930-500=1430
C: 1930-30=1900
D: 1930-1200=730

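Total weight:
1730+1430+1900+730=5790
What the rule actually stores is the running sum up to each server (1730, 3160, 5060, 5790), so the last element of the list is the total weight.
The result is that the shorter a server's response time, the larger its weight.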
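
The two loops of maintainWeights can be replayed on the four-server example above without any Ribbon classes. The sketch below is illustrative only: WeightTableSketch and accumulate are hypothetical names, and the response times are passed in as a plain array instead of being read from ServerStats.

```java
import java.util.ArrayList;
import java.util.List;

class WeightTableSketch {

    // Build the accumulated-weight list from average response times.
    static List<Double> accumulate(double[] avgResponseTimesMs) {
        // First loop: total response time across all servers.
        double total = 0;
        for (double t : avgResponseTimesMs) {
            total += t;
        }
        // Second loop: weight = total - own response time, stored as a running sum.
        List<Double> accumulated = new ArrayList<>();
        double weightSoFar = 0;
        for (double t : avgResponseTimesMs) {
            weightSoFar += total - t;
            accumulated.add(weightSoFar);
        }
        return accumulated;
    }

    public static void main(String[] args) {
        // Servers A, B, C, D from the example.
        System.out.println(accumulate(new double[] {200, 500, 30, 1200}));
        // prints [1730.0, 3160.0, 5060.0, 5790.0]
    }
}
```

Storing running sums rather than individual weights means the selection step only needs one random number and a single pass over the list.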

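Now look at the choose method. The key part is the third if inside the while loop. If the weights have not been computed yet (the total is below 0.001) or the weight list does not match the number of servers, the rule falls back to the parent RoundRobinRule and picks a server by linear round robin.
Once the weights are available, it draws a random number in [0, total weight) and selects the first server whose accumulated weight is greater than or equal to that number, that is, the interval the number falls into.
So the conclusion is: the shorter a server's response time, the larger its weight and the more likely it is to be chosen.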
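
The interval lookup can be sketched in isolation using the accumulated weights from the four-server example (1730, 3160, 5060, 5790). WeightedPickSketch and indexFor are hypothetical names; the scan follows the for loop in choose, including its default of index 0 when nothing matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class WeightedPickSketch {

    // Return the index of the first accumulated weight >= randomWeight.
    static int indexFor(List<Double> accumulated, double randomWeight) {
        int n = 0;
        for (Double d : accumulated) {
            if (d >= randomWeight) {
                return n;
            }
            n++;
        }
        return 0; // same default as serverIndex in the listing
    }

    public static void main(String[] args) {
        List<Double> accumulated = Arrays.asList(1730.0, 3160.0, 5060.0, 5790.0);
        double total = accumulated.get(accumulated.size() - 1);
        // Random weight in [0, total), as in choose.
        double randomWeight = new Random().nextDouble() * total;
        System.out.println("server index: " + indexFor(accumulated, randomWeight));
    }
}
```

With these weights, servers A to D are picked with probabilities of roughly 30%, 25%, 33%, and 13%, so the slowest server D still receives some traffic.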

Ribbon has other load-balancing rules as well; they are not covered one by one here.

Posted 2021-01-18 11:17 by 郭慕荣
