tcp: avoid min-RTT overestimation from delayed ACKs

https://patchwork.ozlabs.org/project/netdev/patch/20180117201101.14137-2-ycheng@google.com/

 

Message ID    20180117201101.14137-2-ycheng@google.com
State    Accepted, archived
Delegated to:    David Miller
Series    tcp: do not use RTT from delayed ACKs for min-RTT
Commit Message
Yuchung Cheng, Jan. 17, 2018, 8:11 p.m. UTC
This patch avoids having the TCP sender or congestion control
overestimate the min RTT by orders of magnitude. This happens when
all the samples in the windowed filter come from one-packet transfers,
such as small requests and health-check-like chit-chat, which is fairly
common for applications using persistent connections. This patch
tries to conservatively label and skip RTT samples obtained from
this type of workload.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
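
The overestimation described in the commit message can be reproduced with a small user-space sketch. It is not the kernel's `minmax_running_min()`: the single-value `struct min_filter`, the helper names, and the sample numbers (a 1 ms path, a 40 ms delayed-ACK timer, one request per second) are assumptions chosen for illustration. The 300 s window in the usage note below is likewise just a value plugged in for the window length that `sysctl_tcp_min_rtt_wlen` controls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's windowed min filter:
 * it remembers one minimum and the time at which it was recorded.
 */
struct min_filter {
	uint32_t min_us;  /* current min-RTT estimate, microseconds */
	uint32_t stamp_s; /* time the minimum was recorded, seconds */
};

/* Feed one sample. A smaller (or equal) sample, or an expired window,
 * replaces the stored minimum.
 */
static void min_filter_update(struct min_filter *f, uint32_t win_s,
			      uint32_t now_s, uint32_t rtt_us)
{
	assert(f != NULL);
	if (rtt_us <= f->min_us || now_s - f->stamp_s > win_s) {
		f->min_us = rtt_us;
		f->stamp_s = now_s;
	}
}

/* Seed the filter with one undelayed sample at t = 0, then feed one
 * delayed-ACK sample per second. Returns the final min-RTT estimate.
 */
static uint32_t simulate(bool skip_delayed, uint32_t win_s,
			 uint32_t duration_s, uint32_t true_rtt_us,
			 uint32_t delack_us)
{
	struct min_filter f = { .min_us = true_rtt_us, .stamp_s = 0 };

	for (uint32_t t = 1; t <= duration_s; t++) {
		uint32_t sample = true_rtt_us + delack_us;

		/* The rule from the patch: drop a suspected delayed-ACK
		 * sample only when it would raise the estimate.
		 */
		if (skip_delayed && sample > f.min_us)
			continue;
		min_filter_update(&f, win_s, t, sample);
	}
	return f.min_us;
}
```

With these assumed numbers, `simulate(false, 300, 400, 1000, 40000)` ends at 41000 us once the window expires, while `simulate(true, 300, 400, 1000, 40000)` stays at 1000 us because every inflated sample is skipped and the filter is never asked to expire the old minimum.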
Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index ff71b18d9682..2c6797134553 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -97,6 +97,7 @@  int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 #define FLAG_SACK_RENEGING 0x2000 /* snd_una advanced to a sacked seq */
 #define FLAG_UPDATE_TS_RECENT  0x4000 /* tcp_replace_ts_recent() */
 #define FLAG_NO_CHALLENGE_ACK  0x8000 /* do not call tcp_send_challenge_ack()  */
+#define FLAG_ACK_MAYBE_DELAYED 0x10000 /* Likely a delayed ACK */
  
 #define FLAG_ACKED     (FLAG_DATA_ACKED|FLAG_SYN_ACKED)
 #define FLAG_NOT_DUP       (FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
@@ -2857,11 +2858,18 @@  static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
    *rexmit = REXMIT_LOST;
 }
  
-static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us)
+static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us, const int flag)
 {
    u32 wlen = sock_net(sk)->ipv4.sysctl_tcp_min_rtt_wlen * HZ;
    struct tcp_sock *tp = tcp_sk(sk);
  
+   if ((flag & FLAG_ACK_MAYBE_DELAYED) && rtt_us > tcp_min_rtt(tp)) {
+       /* If the remote keeps returning delayed ACKs, eventually
+        * the min filter would pick it up and overestimate the
+        * prop. delay when it expires. Skip suspected delayed ACKs.
+        */
+       return;
+   }
    minmax_running_min(&tp->rtt_min, wlen, tcp_jiffies32,
               rtt_us ? : jiffies_to_usecs(1));
 }
@@ -2901,7 +2909,7 @@  static bool tcp_ack_update_rtt(struct sock *sk, const int flag,
     * always taken together with ACK, SACK, or TS-opts. Any negative
     * values will be skipped with the seq_rtt_us < 0 check above.
     */
-   tcp_update_rtt_min(sk, ca_rtt_us);
+   tcp_update_rtt_min(sk, ca_rtt_us, flag);
    tcp_rtt_estimator(sk, seq_rtt_us);
    tcp_set_rto(sk);
  
@@ -3125,6 +3133,17 @@  static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
    if (likely(first_ackt) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
        seq_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, first_ackt);
        ca_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, last_ackt);
+
+       if (pkts_acked == 1 && last_in_flight < tp->mss_cache &&
+           last_in_flight && !prior_sacked && fully_acked &&
+           sack->rate->prior_delivered + 1 == tp->delivered &&
+           !(flag & (FLAG_CA_ALERT | FLAG_SYN_ACKED))) {
+           /* Conservatively mark a delayed ACK. It's typically
+            * from a lone runt packet over the round trip to
+            * a receiver w/o out-of-order or CE events.
+            */
+           flag |= FLAG_ACK_MAYBE_DELAYED;
+       }
    }
    if (sack->first_sackt) {
        sack_rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, sack->first_sackt);
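
The two checks added by the diff can be read as a pair of pure predicates. The sketch below restates them outside the kernel so that they can be exercised in isolation. `struct ack_summary` and both function names are hypothetical, and the `SKETCH_FLAG_*` bit values are placeholders rather than the kernel's definitions; only the shape of the conditions is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define SKETCH_FLAG_SYN_ACKED 0x1
#define SKETCH_FLAG_CA_ALERT  0x2

/* Hypothetical snapshot of the values tcp_clean_rtx_queue() consults. */
struct ack_summary {
	uint32_t pkts_acked;      /* packets newly acked by this ACK */
	uint32_t last_in_flight;  /* in-flight value recorded for the last acked packet */
	uint32_t mss_cache;       /* current MSS */
	uint32_t prior_sacked;    /* SACKed count before this ACK */
	bool     fully_acked;     /* last skb was acked in full */
	uint32_t prior_delivered; /* delivered count when the packet was sent */
	uint32_t delivered;       /* delivered count now */
	int      flag;            /* ACK-processing flags */
};

/* Mirrors the condition that sets FLAG_ACK_MAYBE_DELAYED: exactly one
 * fully acked packet, sent with less than one MSS in flight, with no
 * other delivery in between and no SACK, congestion-alert, or SYN-ACK
 * signal on this ACK.
 */
static bool ack_maybe_delayed(const struct ack_summary *a)
{
	assert(a != NULL);
	return a->pkts_acked == 1 &&
	       a->last_in_flight != 0 &&
	       a->last_in_flight < a->mss_cache &&
	       a->prior_sacked == 0 &&
	       a->fully_acked &&
	       a->prior_delivered + 1 == a->delivered &&
	       !(a->flag & (SKETCH_FLAG_CA_ALERT | SKETCH_FLAG_SYN_ACKED));
}

/* Mirrors the early return in tcp_update_rtt_min(): a suspected
 * delayed-ACK sample is ignored only if it exceeds the current minimum.
 */
static bool use_for_min_rtt(bool maybe_delayed, uint32_t rtt_us,
			    uint32_t cur_min_us)
{
	return !(maybe_delayed && rtt_us > cur_min_us);
}
```

Note the asymmetry in `use_for_min_rtt()`: a sample marked as possibly delayed is still accepted when it is at or below the current minimum, since such a sample cannot inflate the estimate.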

 

Posted by 张同光