Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tcp: limit GSO packets to half cwnd

In the datacenter (DC) world, GSO packets initially cooked by tcp_sendmsg()
are usually big, as sk_pacing_rate is high.

When the network is congested, cwnd can be smaller than the GSO packets
found in socket write queue. tcp_write_xmit() splits GSO packets
using the available cwnd, and we end up sending a single GSO packet,
consuming all available cwnd.

With GRO aggregation on the receiver, we might handle a single GRO
packet, sending back a single ACK.

1) This single ACK might be lost, in which case
   TLP or RTO is forced to attempt a retransmit.
2) This ACK releases a full cwnd, so the sender sends another big GSO
   packet, in a ping-pong mode.

This behavior does not fill the pipes in the best way, because of
scheduling artifacts.

Make sure we always have at least two GSO packets in flight.

This allows us to safely increase GRO efficiency without risking
spurious retransmits.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Eric Dumazet and committed by David S. Miller
d649a7a8 6eba8224

+8 -4
net/ipv4/tcp_output.c
@@ -1562,7 +1562,7 @@
 static inline unsigned int tcp_cwnd_test(const struct tcp_sock *tp,
 					 const struct sk_buff *skb)
 {
-	u32 in_flight, cwnd;
+	u32 in_flight, cwnd, halfcwnd;
 
 	/* Don't be strict about the congestion window for the final FIN.  */
 	if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) &&
@@ -1571,10 +1571,14 @@
 
 	in_flight = tcp_packets_in_flight(tp);
 	cwnd = tp->snd_cwnd;
-	if (in_flight < cwnd)
-		return (cwnd - in_flight);
+	if (in_flight >= cwnd)
+		return 0;
 
-	return 0;
+	/* For better scheduling, ensure we have at least
+	 * 2 GSO packets in flight.
+	 */
+	halfcwnd = max(cwnd >> 1, 1U);
+	return min(halfcwnd, cwnd - in_flight);
 }
 
 /* Initialize TSO state of a skb.
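
The quota arithmetic changed in tcp_cwnd_test() can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: the names cwnd_quota_old(), cwnd_quota_new(), min_u32() and max_u32() are hypothetical stand-ins introduced for illustration, plain integers replace struct tcp_sock, and the FIN special case is omitted.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's min()/max() macros. */
static inline uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static inline uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Quota before the patch: everything left in the congestion window. */
static inline uint32_t cwnd_quota_old(uint32_t cwnd, uint32_t in_flight)
{
	if (in_flight < cwnd)
		return cwnd - in_flight;
	return 0;
}

/* Quota after the patch: never more than half of cwnd (minimum 1),
 * so a full window is sent as at least two GSO packets.
 */
static inline uint32_t cwnd_quota_new(uint32_t cwnd, uint32_t in_flight)
{
	uint32_t halfcwnd;

	if (in_flight >= cwnd)
		return 0;

	halfcwnd = max_u32(cwnd >> 1, 1U);
	return min_u32(halfcwnd, cwnd - in_flight);
}
```

With cwnd = 10 and nothing in flight, the old computation grants 10 segments to a single GSO packet, while the new one grants 5, leaving room for a second packet of 5. With 8 segments in flight the new quota is 2, and with cwnd = 1 the max() keeps the quota at 1 rather than 0.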