Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: gro: change GRO overflow strategy

The GRO layer has a limit of 8 flows held in the GRO list,
for performance reasons.

When a packet arrives for a flow not yet in the list,
and the list is full, we immediately hand it to the upper
stacks, lowering aggregation performance.

With TSO autosizing and the FQ packet scheduler, this situation
happens more often.

This patch changes the strategy to simply evict the oldest flow
in the list. This works better because of the nature of the packet
trains for which GRO is efficient. It also has the effect
of lowering GRO latency when many flows are competing.

Tested:

Used a 40Gbps NIC, with 4 RX queues, and 200 concurrent TCP_STREAM
netperf.

Before patch, aggregate rate is 11Gbps (while a single flow can reach
30Gbps)

After patch, line rate is reached.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Eric Dumazet, committed by David S. Miller
600adc18 e6a76758

+15 -2
net/core/dev.c
@@ -3882,10 +3882,23 @@
 	if (same_flow)
 		goto ok;
 
-	if (NAPI_GRO_CB(skb)->flush || napi->gro_count >= MAX_GRO_SKBS)
+	if (NAPI_GRO_CB(skb)->flush)
 		goto normal;
 
-	napi->gro_count++;
+	if (unlikely(napi->gro_count >= MAX_GRO_SKBS)) {
+		struct sk_buff *nskb = napi->gro_list;
+
+		/* locate the end of the list to select the 'oldest' flow */
+		while (nskb->next) {
+			pp = &nskb->next;
+			nskb = *pp;
+		}
+		*pp = NULL;
+		nskb->next = NULL;
+		napi_gro_complete(nskb);
+	} else {
+		napi->gro_count++;
+	}
 	NAPI_GRO_CB(skb)->count = 1;
 	NAPI_GRO_CB(skb)->age = jiffies;
 	skb_shinfo(skb)->gso_size = skb_gro_len(skb);