bbr-congestion-control-00

Network Path Model

BBR is a model-based congestion control algorithm: its behavior is based on an explicit model of the network path over which a transport
flow travels. BBR's model includes explicit estimates of two parameters:

  1. BBR.BtlBw: the estimated bottleneck bandwidth available to the transport flow, estimated from the maximum delivery rate sample
    from a moving window.

  2. BBR.RTprop: the estimated two-way round-trip propagation delay of the path, estimated from the minimum round-trip delay sample
    from a moving window.

Target Operating Point

BBR uses its model to seek an operating point with high throughput
and low delay. To operate near the optimal operating point, the
point with maximum throughput and minimum delay [K79] [GK81], the
system needs to maintain two conditions:

  1. Rate balance: the bottleneck packet arrival rate equals the
    bottleneck bandwidth available to the transport flow.

  2. Full pipe: the total data in flight along the path is equal to
    the BDP.
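
As a worked illustration of the full-pipe condition, the BDP (bandwidth-delay product) is the product of the two model parameters, BBR.BtlBw and BBR.RTprop. The sketch below uses made-up numbers (a 10 Mbit/s bottleneck, a 40 ms round-trip propagation delay, 1500-byte packets), and the function name is hypothetical.

```python
def bdp_bytes(btl_bw_bits_per_s: float, rtprop_s: float) -> float:
    """Bandwidth-delay product in bytes (bandwidth in bits/s, delay in seconds)."""
    return btl_bw_bits_per_s / 8 * rtprop_s

# Assumed example path: 10 Mbit/s bottleneck, 40 ms RTprop.
bdp = bdp_bytes(10_000_000, 0.040)   # about 50,000 bytes
packets_in_pipe = bdp / 1500         # about 33 packets of 1500 bytes
print(f"BDP ~ {bdp:.0f} bytes, ~ {packets_in_pipe:.1f} packets")
```

With these numbers, keeping roughly 33 packets in flight would satisfy the full-pipe condition without building a standing queue.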

Control Parameters

BBR uses its model to control the connection's sending behavior, to keep it near the target operating point. Rather than using a single
control parameter, like the cwnd parameter that limits the volume of in-flight data in the Reno and CUBIC congestion control algorithms,
BBR uses three distinct control parameters:

  1. pacing rate: the rate at which BBR sends data.
  2. send quantum: the maximum size of any aggregate that the transport sender implementation may need to transmit in order to
    amortize per-packet transmission overheads.
  3. cwnd: the maximum volume of data BBR allows in-flight in the network at any time.

3.5.1. Initialization

Upon transport connection initialization, BBR executes its
initialization steps:

 BBROnConnectionInit():
   BBRInit()

3.5.2. Per-ACK Steps

On every ACK, the BBR algorithm executes the following
BBRUpdateOnACK() steps in order to update its network path model,
update its state machine, and adjust its control parameters to adapt
to the updated model:

 BBRUpdateOnACK():
   BBRUpdateModelAndState()
   BBRUpdateControlParameters()

 BBRUpdateModelAndState():
   BBRUpdateBtlBw()
   BBRCheckCyclePhase()
   BBRCheckFullPipe()
   BBRCheckDrain()
   BBRUpdateRTprop()
   BBRCheckProbeRTT()

 BBRUpdateControlParameters():
   BBRSetPacingRate()
   BBRSetSendQuantum()
   BBRSetCwnd()

3.5.3. Per-Transmit Steps

When transmitting, BBR merely needs to check for the case where the
flow is restarting from idle:

 BBROnTransmit():
   BBRHandleRestartFromIdle()

Tracking Time for the BBR.BtlBw Max Filter

BBR.round_count counts packet-timed round trips by recording state about a sentinel packet, and waiting for an ACK of any data
packet that was sent after that sentinel packet, using the following pseudocode:

Upon connection initialization:

 BBRInitRoundCounting():
       BBR.next_round_delivered = 0
       BBR.round_start = false
       BBR.round_count = 0

Upon sending each packet transmission:

 packet.delivered = BBR.delivered

Upon receiving an ACK for a given data packet:

 BBRUpdateRound():
   BBR.delivered += packet.size
   if (packet.delivered >= BBR.next_round_delivered)
     BBR.next_round_delivered = BBR.delivered
     BBR.round_count++
     BBR.round_start = true
   else
     BBR.round_start = false
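
The round-counting pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a transport implementation: the `Packet` and `RoundCounter` classes and the byte sizes are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    size: int            # bytes carried by this packet
    delivered: int = 0   # snapshot of the connection's delivered count at send time

@dataclass
class RoundCounter:
    delivered: int = 0
    next_round_delivered: int = 0
    round_start: bool = False
    round_count: int = 0

    def on_send(self, packet: Packet) -> None:
        packet.delivered = self.delivered

    def on_ack(self, packet: Packet) -> None:
        self.delivered += packet.size
        # The ACKed packet was sent after the sentinel: a new round begins.
        if packet.delivered >= self.next_round_delivered:
            self.next_round_delivered = self.delivered
            self.round_count += 1
            self.round_start = True
        else:
            self.round_start = False

rc = RoundCounter()
first_flight = [Packet(1000) for _ in range(3)]
for p in first_flight:
    rc.on_send(p)
for p in first_flight:
    rc.on_ack(p)          # only the first ACK starts round 1
rounds_after_first_flight = rc.round_count

q = Packet(1000)
rc.on_send(q)             # sent after the sentinel was ACKed
rc.on_ack(q)              # so its ACK starts round 2
```

Note that three ACKs for packets sent in the same flight advance the round count only once, which is what makes the counter "packet-timed".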

Updating the BBR.BtlBw Max Filter

For every ACK that acknowledges some data packets as delivered, BBR invokes BBRUpdateBtlBw() to update the BBR.BtlBw estimator as follows
(here rs.delivery_rate is the delivery rate sample obtained from the packet that has just been ACKed, and rs.is_app_limited marks a
sample taken while the sender was application-limited, as specified in [draft-cheng-iccrg-delivery-rate-estimation]):

 BBRUpdateBtlBw():
       BBRUpdateRound()
       if (rs.delivery_rate >= BBR.BtlBw || ! rs.is_app_limited)
           BBR.BtlBw = update_windowed_max_filter(
                         filter=BBR.BtlBwFilter,
                         value=rs.delivery_rate,
                         time=BBR.round_count,
                         window_length=BtlBwFilterLen)
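
The pseudocode leaves update_windowed_max_filter() undefined. One possible way to realize it is a monotonic deque keyed by round count, sketched below. The class name, the expiry rule (a sample ages out once it is `window_length` rounds old), and the window of 10 rounds used in the example are all assumptions for illustration, not values taken from this text.

```python
from collections import deque

class WindowedMaxFilter:
    """Tracks the maximum sample seen within the last `window_length` rounds."""

    def __init__(self, window_length: int):
        self.window_length = window_length
        self.samples = deque()  # (round, value) pairs, values strictly decreasing

    def update(self, value: float, time: int) -> float:
        # Drop samples that have aged out of the window.
        while self.samples and time - self.samples[0][0] >= self.window_length:
            self.samples.popleft()
        # Drop older samples that the new sample dominates.
        while self.samples and self.samples[-1][1] <= value:
            self.samples.pop()
        self.samples.append((time, value))
        return self.samples[0][1]

def update_btl_bw(filt, btl_bw, delivery_rate, is_app_limited, round_count):
    """Mirror of the BBRUpdateBtlBw() condition: skip low app-limited samples."""
    if delivery_rate >= btl_bw or not is_app_limited:
        return filt.update(delivery_rate, round_count)
    return btl_bw

f = WindowedMaxFilter(window_length=10)   # assumed window of 10 rounds
bw = 0.0
bw = update_btl_bw(f, bw, 5.0, False, 0)   # 5.0
bw = update_btl_bw(f, bw, 8.0, False, 1)   # 8.0
bw = update_btl_bw(f, bw, 2.0, True, 2)    # app-limited and lower: ignored, stays 8.0
bw_before_expiry = bw
bw = update_btl_bw(f, bw, 4.0, False, 11)  # the 8.0 sample from round 1 has expired
```

The app-limited check matters because a sender that has little to send will measure a low delivery rate that says nothing about the bottleneck.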

Updating the BBR.RTprop Min Filter

Upon transmitting each packet, BBR (or the associated transport protocol) stores in per-packet data the wall-clock transmission time
of the packet (Now() returns the current wall-clock time):

 packet.send_time = Now()

For every ACK that acknowledges some data packets as delivered, BBR (or the associated transport protocol) calculates an RTT sample "rtt"
as follows:

 packet.rtt = Now() - packet.send_time

A BBR implementation MAY use a generic windowed min filter to track BBR.RTprop. However, a significant savings in space can be achieved
by using the same state to track BBR.RTprop and ProbeRTT timing, so this document describes this combined approach. With this approach,
on every ACK that provides an RTT sample BBR updates the BBR.RTprop estimator as follows:

 BBRUpdateRTprop():
       BBR.rtprop_expired =
         Now() > BBR.rtprop_stamp + RTpropFilterLen
       if (packet.rtt >= 0 and
          (packet.rtt <= BBR.RTprop or BBR.rtprop_expired))
         BBR.RTprop = packet.rtt
         BBR.rtprop_stamp = Now()

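RTpropFilterLen, the length of the RTprop min filter window, is set to 10 seconds. This value is a trade-off among several considerations:

  1. RTpropFilterLen is longer than ProbeRTTInterval, so that it covers an entire ProbeRTT cycle (see the "ProbeRTT" section below).
    This helps ensure that the window can contain RTT samples that result from data transmitted with inflight below the flow's
    estimated BDP. Such RTT samples are important for revealing the path's two-way propagation delay, even though the "noise" effects
    mentioned earlier often obscure it.
  2. RTpropFilterLen aims to be long enough to avoid cutting in-flight and throughput often. Measuring the two-way propagation delay
    requires keeping in-flight below the BDP, which carries some risk of underutilization, so BBR uses a filter window long enough to
    make such underutilization events rare.
  3. RTpropFilterLen aims to be long enough that many applications have a natural moment of silence or low utilization that cuts
    in-flight below the BDP and naturally refreshes BBR.RTprop, without BBR having to force an artificial cut in in-flight. This
    applies to many common applications, including Web, RPC, and chunked audio or video traffic.
  4. RTpropFilterLen aims to be short enough to respond in a timely manner to real increases in the path's two-way propagation delay,
    for example due to route changes, which typically happen on time scales of 30 seconds or longer.

Weighing these factors, BBR uses 10 seconds for RTpropFilterLen, balancing delay against throughput while still estimating the
path's two-way propagation delay accurately.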
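
The combined min-filter and expiry logic of BBRUpdateRTprop() can be sketched in Python as below. The class name is hypothetical, and the clock is passed in explicitly (instead of calling Now()) so the behavior is easy to follow; the 10-second window is the RTpropFilterLen value given above.

```python
import math

RTPROP_FILTER_LEN = 10.0  # seconds, the RTpropFilterLen value given in the text

class RTpropEstimator:
    def __init__(self, now: float):
        self.rtprop = math.inf       # no valid RTT sample yet
        self.rtprop_stamp = now      # time of the last RTprop update
        self.rtprop_expired = False

    def update(self, rtt: float, now: float) -> None:
        self.rtprop_expired = now > self.rtprop_stamp + RTPROP_FILTER_LEN
        # Accept a new minimum, or any valid sample once the old one has expired.
        if rtt >= 0 and (rtt <= self.rtprop or self.rtprop_expired):
            self.rtprop = rtt
            self.rtprop_stamp = now

est = RTpropEstimator(now=0.0)
est.update(0.050, now=1.0)    # first sample: RTprop = 50 ms
est.update(0.040, now=2.0)    # lower sample: RTprop = 40 ms
est.update(0.060, now=5.0)    # higher sample inside the window: ignored
rtprop_inside_window = est.rtprop
est.update(0.060, now=12.5)   # more than 10 s since the last update: accepted
```

The last call shows how the estimate can rise after the window expires, which is how a real increase in path delay (for example after a route change) eventually gets picked up.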

BBR Control Parameters

BBR uses three distinct but interrelated control parameters: pacing rate, send quantum, and congestion window (cwnd).

Pacing Rate
The sending host implements pacing by maintaining inter-packet spacing at the time each packet is scheduled for transmission,
calculating the next transmission time for a packet for a given flow (here "next_send_time") as a function of the most recent packet size
and the current pacing rate, as follows:

 next_send_time = Now() + packet.size / pacing_rate
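
As a small worked example of the spacing formula, the sketch below computes the gap for one packet. The function signature and the numbers (1500-byte packets, a pacing rate of 1,250,000 bytes per second, i.e. 10 Mbit/s) are assumptions chosen for illustration.

```python
def next_send_time(now_s: float, packet_size_bytes: int,
                   pacing_rate_bytes_per_s: float) -> float:
    """Earliest time at which the following packet may be sent."""
    return now_s + packet_size_bytes / pacing_rate_bytes_per_s

# A 1500-byte packet at 1,250,000 bytes/s leaves a gap of about 1.2 ms.
gap_s = next_send_time(0.0, 1500, 1_250_000.0)
print(f"inter-packet gap ~ {gap_s * 1000:.2f} ms")
```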

When a BBR flow starts it has no BBR.BtlBw estimate, so it sets an initial pacing rate based on the transport sender
implementation's initial congestion window ("InitialCwnd", e.g. from [RFC6928]), the initial SRTT (smoothed round-trip time) after the
first non-zero RTT sample, and the initial pacing_gain:

BBRInitPacingRate():
       nominal_bandwidth = InitialCwnd / (SRTT ? SRTT : 1ms)
       BBR.pacing_rate =  BBR.pacing_gain * nominal_bandwidth

After initialization, on each data ACK BBR updates its pacing rate to be proportional to BBR.BtlBw, as long as it estimates that it has
filled the pipe (BBR.filled_pipe is true; see the "Startup" section below for details), or doing so increases the pacing rate. Limiting
the pacing rate updates in this way helps the connection probe robustly for bandwidth until it estimates it has reached its full
available bandwidth ("filled the pipe"). In particular, this prevents the pacing rate from being reduced when the connection has
only seen application-limited samples. BBR updates the pacing rate on each ACK by executing the BBRSetPacingRate() step as follows:

 BBRSetPacingRateWithGain(pacing_gain):
   rate = pacing_gain * BBR.BtlBw
   /* update once BBR estimates the pipe is filled (BBR.filled_pipe is true), or when the update raises the rate */
   if (BBR.filled_pipe || rate > BBR.pacing_rate)
     BBR.pacing_rate = rate

 BBRSetPacingRate():
   BBRSetPacingRateWithGain(BBR.pacing_gain)
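
The two pacing-rate steps can be sketched together in Python. This is a rough illustration under stated assumptions: the `PacingState` class is hypothetical, rates are in bytes per second, and the gain of 2.0 and the other numbers are example values rather than constants from this text.

```python
from dataclasses import dataclass

@dataclass
class PacingState:
    pacing_gain: float
    btl_bw: float = 0.0        # bytes per second
    pacing_rate: float = 0.0   # bytes per second
    filled_pipe: bool = False

def init_pacing_rate(s: PacingState, initial_cwnd_bytes: int, srtt_s: float) -> None:
    # Fall back to 1 ms when there is no RTT sample yet, as in the pseudocode.
    nominal_bandwidth = initial_cwnd_bytes / (srtt_s if srtt_s else 0.001)
    s.pacing_rate = s.pacing_gain * nominal_bandwidth

def set_pacing_rate(s: PacingState) -> None:
    rate = s.pacing_gain * s.btl_bw
    # Before the pipe is filled, only ever raise the pacing rate.
    if s.filled_pipe or rate > s.pacing_rate:
        s.pacing_rate = rate

s = PacingState(pacing_gain=2.0)            # assumed example gain
init_pacing_rate(s, initial_cwnd_bytes=14_600, srtt_s=0.125)
initial_rate = s.pacing_rate                # 2.0 * 14600 / 0.125 = 233600.0

s.btl_bw = 100_000.0                        # a low, possibly app-limited estimate
set_pacing_rate(s)
rate_before_full = s.pacing_rate            # unchanged: pipe not yet filled

s.filled_pipe = True
set_pacing_rate(s)
rate_after_full = s.pacing_rate             # now follows the model: 200000.0
```

The middle step is the case the text singles out: a low bandwidth estimate does not drag the pacing rate down until BBR believes it has filled the pipe.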

Send Quantum
In order to amortize per-packet host overheads involved in the sending process, high-performance transport sender implementations
often schedule an aggregate containing multiple packets (multiple MSS) worth of data as a single quantum (using TSO, GSO, or other
offload mechanisms [DC13]). The BBR congestion control algorithm makes this control decision explicitly, dynamically calculating a
BBR.send_quantum control parameter that specifies the maximum size of these transmission aggregates.

On each ACK, BBR runs BBRSetSendQuantum() to update BBR.send_quantum
as follows:

 BBRSetSendQuantum():
   if (BBR.pacing_rate < 1.2 Mbps)
     BBR.send_quantum = 1 * MSS
   else if (BBR.pacing_rate < 24 Mbps)
     BBR.send_quantum  = 2 * MSS
   else
     BBR.send_quantum  = min(BBR.pacing_rate * 1ms, 64KBytes)
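
A Python rendering of BBRSetSendQuantum() might look like the sketch below. It assumes the pacing rate is expressed in bits per second, an MSS of 1460 bytes, and that "64KBytes" means 64 * 1024 bytes; none of these choices is fixed by the pseudocode.

```python
MSS = 1460                  # bytes; assumed for the example
MAX_QUANTUM = 64 * 1024     # "64KBytes", read here as 65536 bytes (assumption)

def send_quantum(pacing_rate_bits_per_s: float, mss: int = MSS) -> int:
    """Maximum size in bytes of one transmission aggregate."""
    if pacing_rate_bits_per_s < 1_200_000:
        return 1 * mss
    if pacing_rate_bits_per_s < 24_000_000:
        return 2 * mss
    # Bytes sent in 1 ms at the pacing rate: bits/s / 8 / 1000.
    one_ms_bytes = int(pacing_rate_bits_per_s / 8000)
    return min(one_ms_bytes, MAX_QUANTUM)

for rate in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    print(rate, send_quantum(rate))
```

At 100 Mbit/s this yields a quantum of 12,500 bytes (about 1 ms of data), while at 1 Gbit/s the 64 KB cap applies.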

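Although a BBR implementation may choose whatever method of selecting BBR.send_quantum suits its circumstances, it should prefer the
smallest feasible quantum. A smaller quantum reduces packet bursts, queuing delay, and packet loss, which improves network
performance and the ability to share bandwidth fairly. For the sake of the network and its other users, a BBR implementation should
therefore choose a BBR.send_quantum value as small as it reasonably can.

Congestion Window

During initialization, BBR uses the transport sender implementation's initial congestion window (e.g., for TCP, the value from
[RFC6928]).
BBR.target_cwnd bounds the maximum volume of data the flow is allowed to have in flight. It is the dominant bound when the flow is
not in loss recovery, does not need to probe BBR.RTprop, and has received enough ACKs to gradually grow the current cwnd. When these
conditions hold, BBR keeps the volume of in-flight data at or below BBR.target_cwnd, to preserve the stability and performance of
the network.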
On each ACK, BBR calculates the BBR.target_cwnd as follows:

 BBRInflight(gain):
   if (BBR.RTprop == Inf)
     return InitialCwnd /* no valid RTT samples yet */
   quanta = 3*BBR.send_quantum
   estimated_bdp = BBR.BtlBw * BBR.RTprop
   return gain * estimated_bdp + quanta

 BBRUpdateTargetCwnd():
   BBR.target_cwnd = BBRInflight(BBR.cwnd_gain)
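
To make the target computation concrete, here is a Python sketch of BBRInflight() with a worked example. All quantities are in bytes and seconds; the parameter names and the example values (a gain of 2, a 1,250,000 bytes/s bottleneck, 40 ms RTprop, a 2920-byte send quantum) are assumptions for illustration.

```python
import math

def bbr_inflight(gain: float, btl_bw: float, rtprop: float,
                 send_quantum: int, initial_cwnd: int) -> float:
    """Target volume of in-flight data, in bytes."""
    if math.isinf(rtprop):
        return initial_cwnd          # no valid RTT samples yet
    quanta = 3 * send_quantum        # headroom of three send quanta
    estimated_bdp = btl_bw * rtprop
    return gain * estimated_bdp + quanta

# Before any RTT sample, the target is simply the initial window.
target_at_start = bbr_inflight(2.0, 0.0, math.inf, 2920, 14_600)

# With a model: 2 * (1,250,000 B/s * 0.040 s) + 3 * 2920 = 108,760 bytes.
target_with_model = bbr_inflight(2.0, 1_250_000.0, 0.040, 2920, 14_600)
```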

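BBR computes BBR.target_cwnd from the estimated BDP and the characteristics of the path. By controlling the volume of data in
flight, BBR can make better use of the available bandwidth while accounting for special conditions in the network and at the receiver.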

Minimum cwnd for Pipelining
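BBR sets a minimum cwnd value, BBRMinPipeCwnd, so that data stays adequately pipelined even at very low BDPs. By keeping enough
packets in flight between sender and receiver, BBR tries to allow at least 2 data packets in flight, plus ACKs for at least 2 data
packets on the path from receiver to sender.

Modulating cwnd in Loss Recovery
On a retransmission timeout, meaning the sender believes all in-flight packets are lost, BBR conservatively reduces cwnd to one
packet (1 MSS) and sends a single packet. BBR then grows cwnd gradually using the core cwnd adjustment mechanism described below.
When a BBR sender detects packet loss but still has packets in flight, on the first round of the loss-repair process BBR
temporarily reduces cwnd to match the current delivery rate. On the second and later rounds of loss repair, it ensures that the
sending rate never exceeds twice the current delivery rate as ACKs arrive.
When BBR exits loss recovery it restores cwnd to the "last known good" value it held before entering recovery. This applies whether
the flow exits loss recovery because it has repaired all losses or because an "undo" event occurred.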

Upon a retransmission timeout (RTO):

 BBR.prior_cwnd = BBRSaveCwnd()
 cwnd = 1

Upon entering Fast Recovery, set cwnd to the number of packets still in flight (allowing at least one for a fast retransmit):

 BBR.prior_cwnd = BBRSaveCwnd()
 cwnd = packets_in_flight + max(packets_delivered, 1)
 BBR.packet_conservation = true

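On every ACK during Fast Recovery, run the following BBRModulateCwndForRecovery() steps, which help ensure packet conservation on
the first round of recovery, and sending at no more than twice the current delivery rate on later rounds of recovery
(given that "packets_delivered" packets were newly marked ACKed or SACKed and "packets_lost" were newly marked lost):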

 BBRModulateCwndForRecovery():
   if (packets_lost > 0)
     cwnd = max(cwnd - packets_lost, 1)
   if (BBR.packet_conservation)
     cwnd = max(cwnd, packets_in_flight + packets_delivered)

After one round-trip in Fast Recovery:

 BBR.packet_conservation = false

Upon exiting loss recovery (RTO recovery or Fast Recovery), either by repairing all losses or undoing recovery, BBR restores the best-known
cwnd value we had upon entering loss recovery:

 BBR.packet_conservation = false
 BBRRestoreCwnd()

The BBRSaveCwnd() and BBRRestoreCwnd() helpers help remember and restore the last-known good cwnd
(the latest cwnd unmodulated by loss recovery or ProbeRTT):

 BBRSaveCwnd():
   if (not InLossRecovery() and BBR.state != ProbeRTT)
     return cwnd
   else
     return max(BBR.prior_cwnd, cwnd)

 BBRRestoreCwnd():
   cwnd = max(cwnd, BBR.prior_cwnd)
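
The Fast Recovery lifecycle (save, modulate on each ACK, restore) can be traced with a small Python sketch. The `RecoveryCwnd` class, its flag names, and the packet counts are hypothetical; cwnd is counted in packets, as in the pseudocode.

```python
class RecoveryCwnd:
    def __init__(self, cwnd: int):
        self.cwnd = cwnd
        self.prior_cwnd = 0
        self.packet_conservation = False
        self.in_loss_recovery = False
        self.in_probe_rtt = False

    def save_cwnd(self) -> int:
        # Outside recovery and ProbeRTT, the current cwnd is the "good" value.
        if not self.in_loss_recovery and not self.in_probe_rtt:
            return self.cwnd
        return max(self.prior_cwnd, self.cwnd)

    def enter_fast_recovery(self, packets_in_flight: int, packets_delivered: int) -> None:
        self.prior_cwnd = self.save_cwnd()
        self.cwnd = packets_in_flight + max(packets_delivered, 1)
        self.packet_conservation = True
        self.in_loss_recovery = True

    def on_ack_in_recovery(self, packets_in_flight: int,
                           packets_delivered: int, packets_lost: int) -> None:
        if packets_lost > 0:
            self.cwnd = max(self.cwnd - packets_lost, 1)
        if self.packet_conservation:
            self.cwnd = max(self.cwnd, packets_in_flight + packets_delivered)

    def exit_recovery(self) -> None:
        self.packet_conservation = False
        self.in_loss_recovery = False
        self.cwnd = max(self.cwnd, self.prior_cwnd)   # restore last good cwnd

r = RecoveryCwnd(cwnd=40)
r.enter_fast_recovery(packets_in_flight=30, packets_delivered=1)
cwnd_on_entry = r.cwnd                      # 30 + 1 = 31
r.on_ack_in_recovery(packets_in_flight=28, packets_delivered=2, packets_lost=1)
cwnd_during = r.cwnd                        # max(31 - 1, 28 + 2) = 30
r.exit_recovery()
cwnd_on_exit = r.cwnd                       # back to 40
```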

Modulating cwnd in ProbeRTT
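When BBR decides to enter the ProbeRTT state, its goal is to quickly reduce the volume of in-flight data
and drain the bottleneck queue, thereby allowing BBR.RTprop to be measured. To implement this mode, BBR caps
the congestion window (cwnd) at the minimum value, BBRMinPipeCwnd: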

 BBRModulateCwndForProbeRTT():
   if (BBR.state == ProbeRTT)
     cwnd = min(cwnd, BBRMinPipeCwnd)

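This step modulates cwnd while in the ProbeRTT state. It sets cwnd to the smaller of BBRMinPipeCwnd and the current cwnd,
so that the window stays at its minimum for the duration of ProbeRTT and the bottleneck queue drains quickly.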

Core cwnd Adjustment Mechanism
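When cwnd is above BBR.target_cwnd, BBR reduces cwnd promptly to avoid overloading the network;
when cwnd is below BBR.target_cwnd, BBR raises cwnd gradually to make full use of the available bandwidth while avoiding excessive congestion.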

 BBRSetCwnd():
       BBRUpdateTargetCwnd()
       BBRModulateCwndForRecovery()
       if (not BBR.packet_conservation) {
         if (BBR.filled_pipe)
           cwnd = min(cwnd + packets_delivered, BBR.target_cwnd)
         else if (cwnd < BBR.target_cwnd || BBR.delivered < InitialCwnd)
           cwnd = cwnd + packets_delivered
         cwnd = max(cwnd, BBRMinPipeCwnd)
       }
       BBRModulateCwndForProbeRTT()

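Specifically, BBRSetCwnd() performs the following steps:

  1. BBRUpdateTargetCwnd(): updates the target congestion window (BBR.target_cwnd), which is computed from BBR's model of the network path.
  2. BBRModulateCwndForRecovery(): adjusts cwnd according to whether the flow is in loss recovery. On the first round of loss recovery
    it reduces cwnd to match the current delivery rate, for packet conservation. On later rounds it ensures the sending rate does not
    exceed twice the current delivery rate.
  3. Adjusts cwnd according to the following conditions. If packet conservation is not in effect (BBR.packet_conservation is false),
    cwnd grows according to whether the pipe is filled (BBR.filled_pipe). If the pipe is filled, cwnd increases by the number of packets
    delivered, but not beyond BBR.target_cwnd. If the pipe is not yet filled, cwnd still increases as long as it is below the target
    or fewer than InitialCwnd packets have been delivered. In either case, cwnd is then kept at or above BBRMinPipeCwnd, so that
    pipelining still works even at small BDPs.
  4. BBRModulateCwndForProbeRTT(): if BBR is in the ProbeRTT state, where it needs to quickly reduce the volume of in-flight data in
    order to measure BBR.RTprop, cwnd is capped at BBRMinPipeCwnd.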
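
The steps above can be sketched in Python as follows. This is an illustration only: cwnd is counted in packets, the target update and recovery modulation (steps 1 and 2) are assumed to have been done by the caller, and the BBRMinPipeCwnd value of 4 packets and InitialCwnd of 10 packets are assumed example values that this text does not state.

```python
from dataclasses import dataclass

BBR_MIN_PIPE_CWND = 4   # packets; assumed value for illustration
INITIAL_CWND = 10       # packets; assumed value for illustration

@dataclass
class CwndState:
    cwnd: int
    target_cwnd: int
    filled_pipe: bool = False
    packet_conservation: bool = False
    in_probe_rtt: bool = False
    delivered: int = 0   # total packets delivered so far

def set_cwnd(s: CwndState, packets_delivered: int) -> int:
    # Step 3: grow toward (or snap down to) the target outside packet conservation.
    if not s.packet_conservation:
        if s.filled_pipe:
            s.cwnd = min(s.cwnd + packets_delivered, s.target_cwnd)
        elif s.cwnd < s.target_cwnd or s.delivered < INITIAL_CWND:
            s.cwnd += packets_delivered
        s.cwnd = max(s.cwnd, BBR_MIN_PIPE_CWND)
    # Step 4: ProbeRTT caps cwnd at the minimum.
    if s.in_probe_rtt:
        s.cwnd = min(s.cwnd, BBR_MIN_PIPE_CWND)
    return s.cwnd

filled = set_cwnd(CwndState(cwnd=20, target_cwnd=24, filled_pipe=True), 10)    # capped at 24
startup = set_cwnd(CwndState(cwnd=20, target_cwnd=24), 10)                     # grows to 30
above = set_cwnd(CwndState(cwnd=50, target_cwnd=24, filled_pipe=True), 2)      # drops to 24
probing = set_cwnd(CwndState(cwnd=50, target_cwnd=24, filled_pipe=True,
                             in_probe_rtt=True), 2)                            # held at 4
```

The `above` case shows the asymmetry described earlier: once the pipe is filled, a cwnd above the target falls to the target in a single step, while growth is limited to the number of packets delivered per ACK.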

posted @ codestacklinuxer