The Network Load Balancing (NLB) Algorithm in Windows Server 2008

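I recently ran into clustering questions on a project. I then read that the balancing algorithm in Windows Server 2008's Network Load Balancing (NLB) does not require the cluster hosts to negotiate with one another, so I looked into it more closely and found two MSDN posts online.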

How Does Network Load Balancing Algorithm Works Internally (contradictory)

Network Load Balancing Technical Overview   (the correct one)

 

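After reading the first post I felt something was off: the algorithm it describes would require the hosts to negotiate while handling network requests, which contradicts the claim above. I then found the second post, which is the official description. Its algorithm rests on statistical principles, and it is the correct one. The key passages on the algorithm are quoted below, each followed by my rendering of it: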

 

Load Balancing Algorithm

Network Load Balancing employs a fully distributed filtering algorithm to map incoming clients to the cluster hosts. This algorithm was chosen to enable cluster hosts to make a load-balancing decision independently and quickly for each incoming packet. It was optimized to deliver statistically even load balance for a large client population making numerous, relatively small requests, such as those typically made to Web servers. When the client population is small and/or the client connections produce widely varying loads on the server, Network Load Balancing's load balancing algorithm is less effective. However, the simplicity and speed of its algorithm allows it to deliver very high performance, including both high throughput and low response time, in a wide range of useful client/server applications.

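NLB uses a fully distributed filtering algorithm to map incoming clients to cluster hosts, so each host can make the load-balancing decision for every incoming packet quickly and on its own. The algorithm is tuned to give a statistically even balance when a large client population issues many relatively small requests, as is typical for web servers. With few clients, or with connections whose server load varies widely, it balances less well. Even so, its simplicity and speed give it high throughput and low response time across a wide range of client/server applications.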

Network Load Balancing load-balances incoming client requests by directing a selected percentage of new requests to each cluster host; the load percentage is set in the Network Load Balancing Properties dialog box for each port range to be load-balanced. The algorithm does not respond to changes in the load on each cluster host (such as the CPU load or memory usage). However, the mapping is modified when the cluster membership changes, and load percentages are renormalized accordingly.

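NLB balances load by steering a configured percentage of new requests to each cluster host; the percentage is set per port range in the NLB Properties dialog box. The algorithm does not react to changes in a host's own load (such as CPU load or memory usage). When cluster membership changes, however, the mapping is modified and the load percentages are renormalized.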
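
To make "renormalized" concrete, here is a minimal sketch. It assumes that renormalizing means rescaling the surviving hosts' configured weights proportionally so that they again sum to 100 percent; the quoted text does not spell out the exact rule, and the host names and weights are made up for illustration.

```python
def renormalize(load_weights):
    """Scale the weights of the remaining hosts so they sum to 100 percent.

    Assumption: proportional rescaling; the source only says "renormalized".
    """
    total = sum(load_weights.values())
    if total <= 0:
        raise ValueError("at least one host needs a positive load weight")
    return {host: 100.0 * w / total for host, w in load_weights.items()}

# Hypothetical three-host cluster configured 50/30/20 for one port range.
weights = {"host1": 50, "host2": 30, "host3": 20}

# host3 leaves the cluster; the remaining shares are rescaled.
survivors = {h: w for h, w in weights.items() if h != "host3"}
shares = renormalize(survivors)
print(shares)  # {'host1': 62.5, 'host2': 37.5}
```

The ratio between host1 and host2 stays 5:3; only the scale changes.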

When inspecting an arriving packet, all hosts simultaneously perform a statistical mapping to quickly determine which host should handle the packet. The mapping uses a randomization function that calculates a host priority based on the client's IP address, port, and other state information maintained to optimize load balance. The corresponding host forwards the packet up the network stack to TCP/IP, and the other cluster hosts discard it. The mapping does not vary unless the membership of cluster hosts changes, ensuring that a given client's IP address and port will always map to the same cluster host. However, the particular cluster host to which the client's IP address and port map cannot be predetermined since the randomization function takes into account the current and past cluster's membership to minimize remappings.

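When a packet arrives, all hosts simultaneously perform a statistical mapping to decide quickly which host should handle it. The mapping uses a randomization function that computes a host priority from the client's IP address, port, and other state information kept to optimize load balance. The selected host passes the packet up the network stack to TCP/IP, and the other hosts discard it. The mapping changes only when cluster membership changes, so a given client IP address and port always map to the same host. Which host that will be cannot be determined in advance, because the randomization function takes current and past cluster membership into account to minimize remapping. (Note: this is the key part of the algorithm.)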
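
The actual randomization function is not given in the quoted text. As a stand-in, the sketch below uses rendezvous (highest-random-weight) hashing, a different, publicly known technique that has the same visible properties: every host computes the same answer locally, exactly one host accepts each packet, and removing a host only remaps the clients that host owned. Unlike the real function, it does not look at past membership. All names and addresses are hypothetical.

```python
import hashlib

def priority(host, client_ip, client_port):
    """Pseudo-random but repeatable score for one (host, client) pair."""
    key = f"{host}|{client_ip}|{client_port}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

def owner(hosts, client_ip, client_port):
    """Host with the highest score; every member computes the same answer."""
    return max(hosts, key=lambda h: priority(h, client_ip, client_port))

def accepts(me, hosts, client_ip, client_port):
    """Filter decision made locally by host `me`, with no messages exchanged."""
    return owner(hosts, client_ip, client_port) == me

hosts = ["host1", "host2", "host3"]
clients = [(f"10.0.{i % 250}.{i % 200 + 1}", 1024 + i) for i in range(3000)]

# Exactly one host accepts each packet; the others would discard it.
for ip, port in clients[:100]:
    assert sum(accepts(h, hosts, ip, port) for h in hosts) == 1

# When host3 leaves, only the clients it owned are remapped.
before = {c: owner(hosts, *c) for c in clients}
after = {c: owner(hosts[:2], *c) for c in clients}
moved = [c for c in clients if before[c] != after[c]]
assert all(before[c] == "host3" for c in moved)
print(len(moved), "of", len(clients), "clients remapped")
```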

 

The load-balancing algorithm assumes that client IP addresses and port numbers (when client affinity is not enabled) are statistically independent. This assumption can break down if a server-side firewall is used that proxies client addresses with one IP address and, at the same time, client affinity is enabled. In this case, all client requests will be handled by one cluster host and load balancing is defeated. However, if client affinity is not enabled, the distribution of client ports within the firewall usually provides good load balance.

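The algorithm assumes that client IP addresses and port numbers (the latter when client affinity is not enabled) are statistically independent. The assumption fails when a server-side firewall proxies many client addresses behind a single IP address while client affinity is enabled: every client request is then handled by one cluster host and load balancing is defeated. With client affinity disabled, the spread of client ports behind the firewall usually still yields good load balance.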
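
The proxy case can be reproduced with a small sketch. It assumes a plain hash-modulo mapping as a stand-in for NLB's real mapping (which is not published in the quoted text), a made-up proxy address, and a made-up range of source ports.

```python
import hashlib

def pick_host(hosts, key):
    """Map a key to one host with a deterministic hash (stand-in mapping)."""
    digest = hashlib.sha256(key.encode()).digest()
    return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]

hosts = ["host1", "host2", "host3"]
proxy_ip = "203.0.113.7"        # hypothetical proxy address seen by the cluster
ports = range(20000, 23000)      # hypothetical source ports used by the proxy

# Affinity enabled: only the IP is hashed, so every request lands on one host.
with_affinity = {pick_host(hosts, proxy_ip) for _ in ports}

# Affinity disabled: IP and port are hashed, so the ports spread the load.
no_affinity = {pick_host(hosts, f"{proxy_ip}:{p}") for p in ports}

print(len(with_affinity), "host(s) used with affinity")
print(len(no_affinity), "host(s) used without affinity")
```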

 

In general, the quality of load balance is statistically determined by the number of clients making requests. This behavior is analogous to coin tosses where the two sides of the coin correspond to the number of cluster hosts (thus, in this analogy, two), and the number of tosses corresponds to the number of client requests. The load distribution improves as the number of client requests increases just as the fraction of coin tosses resulting in "heads" approaches 1/2 with an increasing number of tosses. As a rule of thumb, with client affinity set, there must be many more clients than cluster hosts to begin to observe even load balance.

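In general, the quality of load balance is determined statistically by the number of clients making requests. It is like tossing a coin: the two sides of the coin stand for the cluster hosts (two, in this analogy) and the number of tosses stands for the number of client requests. The distribution improves as requests increase, just as the fraction of tosses that come up heads approaches 1/2 as the number of tosses grows. As a rule of thumb, when client affinity is set there must be many more clients than cluster hosts before the load begins to look even.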
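
The coin-toss analogy is easy to simulate. The sketch below idealizes each request as an independent, uniformly random choice of host, which is a modeling assumption and not a description of NLB's mapping; the function name and the seed are arbitrary.

```python
import random

def worst_share_error(num_requests, num_hosts=2, seed=0):
    """Largest gap between any host's observed share and the ideal 1/num_hosts."""
    rng = random.Random(seed)
    counts = [0] * num_hosts
    for _ in range(num_requests):
        counts[rng.randrange(num_hosts)] += 1
    ideal = 1.0 / num_hosts
    return max(abs(c / num_requests - ideal) for c in counts)

# The gap typically shrinks roughly like 1/sqrt(num_requests).
for n in (10, 1_000, 100_000):
    print(n, round(worst_share_error(n), 4))
```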

 

As the statistical nature of the client population fluctuates, the evenness of load balance can be observed to vary slightly over time. It is important to note that achieving precisely identical load balance on each cluster host imposes a performance penalty (throughput and response time) due to the overhead required to measure and react to load changes. This performance penalty must be weighed against the benefit of maximizing the use of cluster resources (principally CPU and memory). In any case, excess cluster resources must be maintained to absorb the client load in case of failover. Network Load Balancing takes the approach of using a very simple but powerful load-balancing algorithm that delivers the highest possible performance and availability.

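Because the statistical makeup of the client population fluctuates, the evenness of load balance varies slightly over time. Note that forcing exactly identical load on every cluster host carries a performance penalty (in throughput and response time) because of the overhead of measuring and reacting to load changes. That penalty must be weighed against the benefit of maximizing the use of cluster resources (mainly CPU and memory). In any case, spare cluster capacity must be kept to absorb the client load on failover. NLB therefore opts for a very simple but powerful algorithm that delivers the highest possible performance and availability.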

 

Network Load Balancing's client affinity settings are implemented by modifying the statistical mapping algorithm's input data. When client affinity is selected in the Network Load Balancing Properties dialog box, the client's port information is not used as part of the mapping. Hence, all requests from the same client always map to the same host within the cluster. Note that this constraint has no timeout value (as is often the case in dispatcher-based implementations) and persists until there is a change in cluster membership. When single affinity is selected, the mapping algorithm uses the client's full IP address. However, when class C affinity is selected, the algorithm uses only the class C portion (the upper 24 bits) of the client's IP address. This ensures that all clients within the same class C address space map to the same cluster host.

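NLB implements client affinity by changing the input data of the statistical mapping algorithm. When client affinity is selected in the NLB Properties dialog box, the client's port is left out of the mapping, so all requests from the same client always map to the same cluster host. This constraint has no timeout (unlike dispatcher-based implementations, which often have one) and lasts until cluster membership changes. With single affinity the mapping uses the client's full IP address; with class C affinity it uses only the class C portion (the upper 24 bits), so all clients in the same class C address space map to the same cluster host.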
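
The three affinity modes differ only in which part of the client address is fed to the mapping. A small sketch of that input selection follows; the function name `mapping_key` and the mode strings are invented for illustration and are not NLB identifiers.

```python
import ipaddress

def mapping_key(client_ip, client_port, affinity="none"):
    """Return the part of the client address that feeds the mapping."""
    ip = ipaddress.IPv4Address(client_ip)
    if affinity == "none":
        return (str(ip), client_port)          # IP and port
    if affinity == "single":
        return (str(ip),)                      # full IP, port ignored
    if affinity == "class_c":
        # Keep only the upper 24 bits of the address.
        network = ipaddress.IPv4Address(int(ip) & 0xFFFFFF00)
        return (str(network),)
    raise ValueError(f"unknown affinity mode: {affinity}")

print(mapping_key("192.168.5.17", 40001, "none"))     # ('192.168.5.17', 40001)
print(mapping_key("192.168.5.17", 40001, "single"))   # ('192.168.5.17',)
print(mapping_key("192.168.5.17", 40001, "class_c"))  # ('192.168.5.0',)
```

Two clients that produce the same key are necessarily mapped to the same host, whatever mapping function consumes the key.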

 

 

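My understanding of the algorithm:

   Suppose the "statistical mapping" function is called StatMapping. Each cluster host is configured with a load percentage, port ranges, and client affinity. The cluster heartbeat keeps every host informed of the membership state, so over a given period the number of members is fixed, the load percentages and other parameters do not change, and every host knows these settings for all the other hosts. Because the hosts share one cluster IP address, every client packet reaches every host. Each host then computes the priority of all cluster hosts (assuming client affinity is not enabled; that setting only changes the input data, not the logic of the algorithm):

       Priority = StatMapping(Client IP, Port, host load percentage, other host parameters)

This gives the priority of every host. If the host with the highest priority happens to be the local host, it handles the packet; otherwise it discards it.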
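
One way this reading could look in code is sketched below. It is a guess at the shape of such a function, not Microsoft's implementation: `stat_mapping` is a hypothetical name, and the scoring rule is weighted rendezvous hashing, chosen here only because it makes each host win in proportion to its load weight. The cluster and its weights are made up, and weights are assumed to be positive.

```python
import hashlib
import math

def stat_mapping(client_ip, client_port, host, load_weight):
    """Hypothetical StatMapping: weighted, repeatable priority for one host."""
    key = f"{host}|{client_ip}|{client_port}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") >> 12  # 52 bits
    u = (h + 0.5) / 2**52              # strictly between 0 and 1
    return -load_weight / math.log(u)  # larger weight -> larger score on average

def should_handle(me, cluster, client_ip, client_port):
    """Run on every host: handle the packet only if `me` has the top priority."""
    def score(host):
        return stat_mapping(client_ip, client_port, host, cluster[host])
    return max(cluster, key=score) == me

cluster = {"host1": 50, "host2": 30, "host3": 20}   # load percentages
handled = {host: 0 for host in cluster}
for i in range(20000):
    ip, port = f"10.1.{i // 250 % 250}.{i % 250 + 1}", 1024 + i
    for host in cluster:                 # each host decides independently
        if should_handle(host, cluster, ip, port):
            handled[host] += 1

shares = {host: n / 20000 for host, n in handled.items()}
print(shares)  # roughly 0.5 / 0.3 / 0.2
```

Every packet is handled by exactly one host, and no host needs to ask another before deciding.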

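In essence this should be a hash algorithm that maps a large space of clients onto a small set of cluster hosts; the more evenly the hash distributes clients in the statistical sense, the better the load balance. As noted earlier, the algorithm ignores each host's current CPU or memory load, so it cannot adjust the load dynamically, but for common client/server applications it is simple and practical.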

posted on 2012-11-11 17:19  omage