Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

xsk: proper AF_XDP socket teardown ordering

The AF_XDP socket struct can exist in three different, implicit
states: setup, bound and released. Setup is before the socket has been
bound to a device. Bound is when the socket is active for receive and
send. Released is when the process/userspace side of the socket is
released, but the sock object is still lingering, e.g. when there is a
reference to the socket in an XSKMAP after process termination.
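
As an illustration only, the three implicit states can be modelled in a
small userspace sketch. The struct, the "user_released" flag and the
enum below are hypothetical names for this example; they are not the
kernel's struct xdp_sock, where the state is never stored explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a net device; not a kernel type. */
struct model_dev {
	int ifindex;
};

/* Hypothetical model of a socket whose state is implicit in its fields. */
struct model_xsk {
	struct model_dev *dev;	/* non-NULL only while bound */
	bool user_released;	/* assumption: userspace side is gone */
};

enum model_state { XSK_SETUP, XSK_BOUND, XSK_RELEASED };

/* Derive the implicit state from the fields instead of storing it. */
static enum model_state model_xsk_state(const struct model_xsk *xs)
{
	if (xs->user_released)
		return XSK_RELEASED;
	return xs->dev ? XSK_BOUND : XSK_SETUP;
}
```

In this model a socket with dev == NULL is either in setup or released,
which is why clearing "dev" at the right moment matters for the fast path.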

The Rx fast-path code uses the "dev" member of struct xdp_sock to
check whether a socket is bound or released, and the Tx code uses the
struct xdp_umem "xsk_list" member in conjunction with "dev" to
determine the state of a socket.

However, the transition from bound to released did not tear the socket
down in the correct order.

On the Rx side, "dev" was cleared after synchronize_net(), making the
synchronization useless. On the Tx side, the internal queues were
destroyed prior to removing them from the "xsk_list".

This commit corrects the cleanup order, and by doing so
xdp_del_sk_umem() can be simplified and one synchronize_net() can be
removed.
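
The corrected order can be sketched as a userspace model that only logs
the sequence of steps. Everything here is a hypothetical stand-in: the
step names mirror the kernel calls touched by this commit, but nothing
below calls or reimplements them.

```c
#include <assert.h>
#include <stddef.h>

/* Each value stands in for one kernel-side teardown action. */
enum step {
	STEP_LIST_DEL,		/* stand-in for xdp_del_sk_umem() */
	STEP_CLEAR_DEV,		/* stand-in for xs->dev = NULL */
	STEP_SYNC,		/* stand-in for synchronize_net() */
	STEP_DEV_PUT,		/* stand-in for dev_put() */
	STEP_DESTROY_QUEUES	/* stand-in for xskq_destroy() on rx/tx */
};

#define MAX_STEPS 8

struct step_log {
	enum step steps[MAX_STEPS];
	size_t n;
};

static void log_step(struct step_log *lg, enum step s)
{
	if (lg->n < MAX_STEPS)
		lg->steps[lg->n++] = s;
}

/* Return the position of the first occurrence of s, or -1 if absent. */
static int step_index(const struct step_log *lg, enum step s)
{
	for (size_t i = 0; i < lg->n; i++)
		if (lg->steps[i] == s)
			return (int)i;
	return -1;
}

/* Hypothetical socket model: dev is non-NULL only while bound. */
struct model_xsk {
	int *dev;
};

/* Release in the corrected order: unpublish, wait, then free. */
static void model_release(struct model_xsk *xs, struct step_log *lg)
{
	if (xs->dev) {
		log_step(lg, STEP_LIST_DEL);
		xs->dev = NULL;
		log_step(lg, STEP_CLEAR_DEV);
		log_step(lg, STEP_SYNC);
		log_step(lg, STEP_DEV_PUT);
	}
	log_step(lg, STEP_DESTROY_QUEUES);
}

/* Check the two ordering rules stated in the commit message. */
static int model_order_ok(const struct step_log *lg)
{
	int clear = step_index(lg, STEP_CLEAR_DEV);
	int sync = step_index(lg, STEP_SYNC);
	int del = step_index(lg, STEP_LIST_DEL);
	int destroy = step_index(lg, STEP_DESTROY_QUEUES);

	if (clear >= 0 && sync >= 0 && clear > sync)
		return 0;	/* "dev" cleared after the synchronization */
	if (del >= 0 && destroy >= 0 && del > destroy)
		return 0;	/* queues destroyed before list removal */
	return 1;
}
```

A log that records the synchronization before "dev" is cleared, as in the
old Rx path, fails model_order_ok(), while the log produced by
model_release() passes.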

Fixes: 965a99098443 ("xsk: add support for bind for Rx")
Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Authored by Björn Töpel and committed by Daniel Borkmann
541d7fdd df1ea77b

+11 -13
+3 -8
net/xdp/xdp_umem.c
···
 {
 	unsigned long flags;

-	if (xs->dev) {
-		spin_lock_irqsave(&umem->xsk_list_lock, flags);
-		list_del_rcu(&xs->list);
-		spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
-
-		if (umem->zc)
-			synchronize_net();
-	}
+	spin_lock_irqsave(&umem->xsk_list_lock, flags);
+	list_del_rcu(&xs->list);
+	spin_unlock_irqrestore(&umem->xsk_list_lock, flags);
 }

 /* The umem is stored both in the _rx struct and the _tx struct as we do
+8 -5
net/xdp/xsk.c
···
 	local_bh_enable();

 	if (xs->dev) {
+		struct net_device *dev = xs->dev;
+
 		/* Wait for driver to stop using the xdp socket. */
-		synchronize_net();
-		dev_put(xs->dev);
+		xdp_del_sk_umem(xs->umem, xs);
 		xs->dev = NULL;
+		synchronize_net();
+		dev_put(dev);
 	}
+
+	xskq_destroy(xs->rx);
+	xskq_destroy(xs->tx);

 	sock_orphan(sk);
 	sock->sk = NULL;
···
 	if (!sock_flag(sk, SOCK_DEAD))
 		return;

-	xskq_destroy(xs->rx);
-	xskq_destroy(xs->tx);
-	xdp_del_sk_umem(xs->umem, xs);
 	xdp_put_umem(xs->umem);

 	sk_refcnt_debug_dec(sk);