Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: avoid unconditionally touching sk_tsflags on RX

After commit 5d4cc87414c5 ("net: reorganize "struct sock" fields"),
the sk_tsflags field shares the same cacheline with sk_forward_alloc.

The UDP protocol does not acquire the sock lock in the RX path;
forward allocations are protected via the receive queue spinlock.
Additionally, udp_recvmsg() calls sock_recv_cmsgs(), which
unconditionally touches sk_tsflags on each packet reception.

Due to the above, under high packet rate traffic, when the BH and the
user-space process run on different CPUs, UDP packet reception
experiences a cache miss while accessing sk_tsflags.
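
The cacheline sharing described above can be illustrated with a small userspace sketch. It is not the real "struct sock" layout: the struct demo_layout, the field names, and the 64-byte line size are assumptions made for illustration only. It only shows how to check whether two fields fall in the same cache line, which is the condition that makes a write to one field by the BH invalidate the other field for the reader CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real hardware may differ. */
#define DEMO_CACHELINE 64

/* Hypothetical layout, NOT the kernel's struct sock. */
struct demo_layout {
	uint64_t flags;                               /* read-mostly flag word */
	unsigned char pad[DEMO_CACHELINE - sizeof(uint64_t)];
	int32_t forward_alloc;                        /* written on every RX packet */
	uint32_t tsflags;                             /* read on every recvmsg */
};

/* The padding pushes forward_alloc to the start of the second line. */
static_assert(offsetof(struct demo_layout, forward_alloc) == DEMO_CACHELINE,
	      "forward_alloc expected at the start of the second cache line");

/* Two offsets share a cache line when they fall in the same 64-byte block. */
static bool demo_same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / DEMO_CACHELINE == off_b / DEMO_CACHELINE;
}
```

In this layout tsflags sits next to the frequently written forward_alloc, while flags lives in a different line, so a reader that only consults flags is not disturbed by updates to forward_alloc.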

The receive path doesn't strictly need to access the problematic field;
change sock_set_timestamping() to maintain the relevant information
in a newly allocated sk_flags bit, so that sock_recv_cmsgs() can
take decisions accessing the latter field only.
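
The idea of mirroring the relevant information into a flag bit can be sketched in plain userspace C. Everything prefixed with demo_ or DEMO_ below is a hypothetical stand-in, not a kernel API, and the sketch omits the READ_ONCE()/WRITE_ONCE() annotations the kernel uses: the setter keeps a single bit in the flag word in sync with "tsflags & ANY", so the receive-side check reads only the flag word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for SOF_TIMESTAMPING_* bits (assumed values). */
#define DEMO_TS_SOFTWARE     (1u << 4)
#define DEMO_TS_RAW_HARDWARE (1u << 6)
#define DEMO_TS_OPT_ID       (1u << 7)
#define DEMO_TSFLAGS_ANY     (DEMO_TS_SOFTWARE | DEMO_TS_RAW_HARDWARE)

/* Hypothetical flag bits, loosely modelled on enum sock_flags. */
enum demo_flag_bit { DEMO_RCVMARK, DEMO_TIMESTAMPING_ANY };

#define DEMO_RECV_CMSGS ((1UL << DEMO_RCVMARK) | \
			 (1UL << DEMO_TIMESTAMPING_ANY))

struct demo_sock {
	unsigned long flags;  /* consulted on every receive */
	uint32_t tsflags;     /* written only on the configuration path */
};

/* Set or clear one flag bit according to a boolean. */
static void demo_valbool_flag(struct demo_sock *sk, enum demo_flag_bit bit,
			      bool on)
{
	if (on)
		sk->flags |= 1UL << bit;
	else
		sk->flags &= ~(1UL << bit);
}

/* Configuration path: store tsflags and mirror "any timestamping" bit. */
static void demo_set_timestamping(struct demo_sock *sk, uint32_t val)
{
	assert(sk != NULL);
	sk->tsflags = val;
	demo_valbool_flag(sk, DEMO_TIMESTAMPING_ANY,
			  (val & DEMO_TSFLAGS_ANY) != 0);
}

/* Receive path: decide using the flag word only, never tsflags. */
static bool demo_needs_cmsgs(const struct demo_sock *sk)
{
	return (sk->flags & DEMO_RECV_CMSGS) != 0;
}
```

The trade-off in this sketch is a little extra work on the rarely executed configuration path in exchange for a receive-side check that reads a single word.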

With this patch applied, on an AMD EPYC server with i40e NICs, I
measured a 10% improvement in small-packet UDP flood performance
tests; a larger delta could possibly be observed with more recent
H/W.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/dbd18c8a1171549f8249ac5a8b30b1b5ec88a425.1739294057.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Paolo Abeni and committed by Jakub Kicinski
f0e70409 ea80f2d9

+6 -4
+5 -4
include/net/sock.h
···
 	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
 	SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
 	SOCK_RCVPRIORITY, /* Receive SO_PRIORITY ancillary data with packet */
+	SOCK_TIMESTAMPING_ANY, /* Copy of sk_tsflags & TSFLAGS_ANY */
 };
 
 #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
···
 {
 #define FLAGS_RECV_CMSGS ((1UL << SOCK_RXQ_OVFL) | \
 			   (1UL << SOCK_RCVTSTAMP) | \
-			   (1UL << SOCK_RCVMARK) |\
-			   (1UL << SOCK_RCVPRIORITY))
+			   (1UL << SOCK_RCVMARK) | \
+			   (1UL << SOCK_RCVPRIORITY) | \
+			   (1UL << SOCK_TIMESTAMPING_ANY))
 #define TSFLAGS_ANY	  (SOF_TIMESTAMPING_SOFTWARE | \
 			   SOF_TIMESTAMPING_RAW_HARDWARE)
 
-	if (sk->sk_flags & FLAGS_RECV_CMSGS ||
-	    READ_ONCE(sk->sk_tsflags) & TSFLAGS_ANY)
+	if (READ_ONCE(sk->sk_flags) & FLAGS_RECV_CMSGS)
 		__sock_recv_cmsgs(msg, sk, skb);
 	else if (unlikely(sock_flag(sk, SOCK_TIMESTAMP)))
 		sock_write_timestamp(sk, skb->tstamp);
+1
net/core/sock.c
···
 
 	WRITE_ONCE(sk->sk_tsflags, val);
 	sock_valbool_flag(sk, SOCK_TSTAMP_NEW, optname == SO_TIMESTAMPING_NEW);
+	sock_valbool_flag(sk, SOCK_TIMESTAMPING_ANY, !!(val & TSFLAGS_ANY));
 
 	if (val & SOF_TIMESTAMPING_RX_SOFTWARE)
 		sock_enable_timestamp(sk,