Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

tcp: ensure to use the most recently sent skb when filling the rate sample

If an ACK (s)acks multiple skbs, we favor the information
from the most recently sent skb by choosing the skb with
the highest prior_delivered count. But in the interval
between receiving ACKs, we send multiple skbs with the same
prior_delivered, because the tp->delivered only changes
when we receive an ACK.

We used RACK's solution, copying tcp_rack_sent_after() as
tcp_skb_sent_after() helper to determine "which packet was
sent last?". Later, we will use tcp_skb_sent_after() instead
in RACK.

Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/1650422081-22153-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by

Pengcheng Yang and committed by
Jakub Kicinski
b253a068 b107a639

+14 -3
+6
include/net/tcp.h
··· 1042 1042 int losses; /* number of packets marked lost upon ACK */ 1043 1043 u32 acked_sacked; /* number of packets newly (S)ACKed upon ACK */ 1044 1044 u32 prior_in_flight; /* in flight before this ACK */ 1045 + u32 last_end_seq; /* end_seq of most recently ACKed packet */ 1045 1046 bool is_app_limited; /* is sample from packet with bubble in pipe? */ 1046 1047 bool is_retrans; /* is sample from retransmission? */ 1047 1048 bool is_ack_delayed; /* is this (likely) a delayed ACK? */ ··· 1164 1163 void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost, 1165 1164 bool is_sack_reneg, struct rate_sample *rs); 1166 1165 void tcp_rate_check_app_limited(struct sock *sk); 1166 + 1167 + static inline bool tcp_skb_sent_after(u64 t1, u64 t2, u32 seq1, u32 seq2) 1168 + { 1169 + return t1 > t2 || (t1 == t2 && after(seq1, seq2)); 1170 + } 1167 1171 1168 1172 /* These functions determine how the current flow behaves in respect of SACK 1169 1173 * handling. SACK is negotiated with the peer, and therefore it can vary
+8 -3
net/ipv4/tcp_rate.c
··· 74 74 * 75 75 * If an ACK (s)acks multiple skbs (e.g., stretched-acks), this function is 76 76 * called multiple times. We favor the information from the most recently 77 - * sent skb, i.e., the skb with the highest prior_delivered count. 77 + * sent skb, i.e., the skb with the most recently sent time and the highest 78 + * sequence. 78 79 */ 79 80 void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb, 80 81 struct rate_sample *rs) 81 82 { 82 83 struct tcp_sock *tp = tcp_sk(sk); 83 84 struct tcp_skb_cb *scb = TCP_SKB_CB(skb); 85 + u64 tx_tstamp; 84 86 85 87 if (!scb->tx.delivered_mstamp) 86 88 return; 87 89 90 + tx_tstamp = tcp_skb_timestamp_us(skb); 88 91 if (!rs->prior_delivered || 89 - after(scb->tx.delivered, rs->prior_delivered)) { 92 + tcp_skb_sent_after(tx_tstamp, tp->first_tx_mstamp, 93 + scb->end_seq, rs->last_end_seq)) { 90 94 rs->prior_delivered_ce = scb->tx.delivered_ce; 91 95 rs->prior_delivered = scb->tx.delivered; 92 96 rs->prior_mstamp = scb->tx.delivered_mstamp; 93 97 rs->is_app_limited = scb->tx.is_app_limited; 94 98 rs->is_retrans = scb->sacked & TCPCB_RETRANS; 99 + rs->last_end_seq = scb->end_seq; 95 100 96 101 /* Record send time of most recently ACKed packet: */ 97 - tp->first_tx_mstamp = tcp_skb_timestamp_us(skb); 102 + tp->first_tx_mstamp = tx_tstamp; 98 103 /* Find the duration of the "send phase" of this window: */ 99 104 rs->interval_us = tcp_stamp_us_delta(tp->first_tx_mstamp, 100 105 scb->tx.first_tx_mstamp);