Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

virtio_net: Do not pull payload in skb->head

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.

The performance drop occurred because GRO was forced
to chain sk_buffs (using skb_shinfo(skb)->frag_list), which
uses more memory and also causes packet consumers to incur
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy:
it allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to the GRO stack.

This was already suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

The fix is to pull only the Ethernet header in page_to_skb().

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, and could also allow virtio_net
to accept packets with more than 128 bytes of headers.
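
The difference between the old headlen comparison and the new pull-based check can be sketched in userspace. This is a simplified model, not kernel code: struct model_skb, model_old_check() and model_may_pull() are hypothetical names introduced here, and the model ignores the allocation failure that the real pskb_may_pull() can hit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct sk_buff: only the two lengths matter here. */
struct model_skb {
	size_t headlen;   /* bytes already in the linear area (skb->head) */
	size_t data_len;  /* bytes still held in page fragments */
};

/* Old check: the headers must already sit in the linear area. */
static bool model_old_check(const struct model_skb *skb, size_t p_off)
{
	return p_off <= skb->headlen;
}

/* New check, modelled on pskb_may_pull(): succeed if the bytes are already
 * linear, fail if the packet is shorter than p_off, otherwise move the
 * missing bytes from the fragments into the linear area.
 */
static bool model_may_pull(struct model_skb *skb, size_t p_off)
{
	size_t missing;

	if (p_off <= skb->headlen)
		return true;
	if (p_off > skb->headlen + skb->data_len)
		return false;

	missing = p_off - skb->headlen;
	skb->headlen += missing;
	skb->data_len -= missing;
	return true;
}
```

In this model, an skb holding only 14 linear bytes of a 1500-byte packet fails the old check for p_off = 54, while the new check succeeds and leaves 54 bytes linear. The same holds for offsets beyond 128 bytes, as long as the packet is long enough.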

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Eric Dumazet and committed by David S. Miller
0f6925b3 b25b343d

+16 -8
+7 -3
drivers/net/virtio_net.c
···
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN + metasize;
 	skb_put_data(skb, p, copy);
 
 	if (metasize) {
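
The copy-length decision in the hunk above can be tried out in isolation. The following is a minimal userspace sketch, not the driver code: model_copy_len() and MODEL_ETH_HLEN are hypothetical names, and the tailroom is passed in as a plain number rather than read from an skb.

```c
#include <stddef.h>

/* ETH_HLEN is 14 in the kernel; redefined here under a different name so
 * the sketch builds in userspace without kernel headers.
 */
#define MODEL_ETH_HLEN 14u

/* Hypothetical helper (not a kernel function): how many bytes the patched
 * page_to_skb() logic would copy into skb->head.
 *
 * len      - bytes of the frame available in the page
 * tailroom - free bytes in skb->head, i.e. what skb_tailroom(skb) returns
 * metasize - metadata bytes that the diff adds on top of ETH_HLEN
 */
static size_t model_copy_len(size_t len, size_t tailroom, size_t metasize)
{
	if (len <= tailroom)
		return len;                    /* whole frame fits: copy it all */
	return MODEL_ETH_HLEN + metasize;  /* otherwise only the Ethernet header */
}
```

Assuming a tailroom of exactly 128 bytes and no metadata, a 1500-byte frame leaves only 14 bytes in skb->head, while a 60-byte frame is still copied whole.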
+9 -5
include/linux/virtio_net.h
···
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
···
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}
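
The `needed` value computed in the NEEDS_CSUM branch above can be worked through with concrete numbers. This is a userspace sketch of the arithmetic only: model_needed() and MODEL_CSUM_FIELD_LEN are hypothetical names, and the example offsets assume plain Ethernet plus an IPv4 header without options.

```c
#include <stdint.h>

/* sizeof(__sum16) in the kernel: the checksum field is 16 bits wide. */
#define MODEL_CSUM_FIELD_LEN 2u

/* Hypothetical helper (not a kernel function): number of bytes that must be
 * in the linear area before the checksum and transport-header checks run.
 * Mirrors: start + max_t(u32, thlen, off + sizeof(__sum16))
 */
static uint32_t model_needed(uint16_t csum_start, uint16_t csum_offset,
			     uint32_t thlen)
{
	uint32_t start = csum_start;                     /* widened to 32 bits */
	uint32_t tail = (uint32_t)csum_offset + MODEL_CSUM_FIELD_LEN;

	return start + (thlen > tail ? thlen : tail);
}
```

For TCP over IPv4 on Ethernet, csum_start is 14 + 20 = 34 and the TCP checksum field sits at offset 16, so with thlen = 20 the result is 34 + max(20, 18) = 54 bytes. With thlen = 0 the checksum field alone decides, giving 34 + 18 = 52.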