Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tcx: Fix splat during dev unregister

During unregister_netdevice_many_notify(), the ordering of the function
calls we are concerned with is as follows:

  unregister_netdevice_many_notify
    dev_shutdown
      qdisc_put
        clsact_destroy
    tcx_uninstall

The syzbot reproducer triggered a case where the qdisc refcnt is not
zero during dev_shutdown().

tcx_uninstall() will then trip WARN_ON_ONCE(tcx_entry(entry)->miniq_active)
because the miniq is still active and the entry must not be freed yet.
That warning assumed qdisc destruction always happens before tcx teardown.

The fix is to avoid tcx_entry_free() in tcx_uninstall() while the miniq
is still alive, and to let clsact_destroy() do the free later, so that
no specific ordering is assumed between the two.

If the miniq is still active, tcx_uninstall() instead clears the entry
while flushing out the prog/link. clsact_destroy() will then notice the
"!tcx_entry_is_active()" condition and eventually perform the
tcx_entry_free().

Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
Reported-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/222255fe07cb58f15ee662e7ee78328af5b438e4.1690549248.git.daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Martin KaFai Lau, committed by Jakub Kicinski (079082c6 df41fa67)

2 changed files: +24 -4

include/linux/bpf_mprog.h (+16 -0)
--- a/include/linux/bpf_mprog.h
+++ b/include/linux/bpf_mprog.h
@@ -256,6 +256,22 @@
 	memcpy(dst->fp_items, src->fp_items, sizeof(src->fp_items));
 }
 
+static inline void bpf_mprog_entry_clear(struct bpf_mprog_entry *dst)
+{
+	memset(dst->fp_items, 0, sizeof(dst->fp_items));
+}
+
+static inline void bpf_mprog_clear_all(struct bpf_mprog_entry *entry,
+				       struct bpf_mprog_entry **entry_new)
+{
+	struct bpf_mprog_entry *peer;
+
+	peer = bpf_mprog_peer(entry);
+	bpf_mprog_entry_clear(peer);
+	peer->parent->count = 0;
+	*entry_new = peer;
+}
+
 static inline void bpf_mprog_entry_grow(struct bpf_mprog_entry *entry, int idx)
 {
 	int total = bpf_mprog_total(entry);
kernel/bpf/tcx.c (+8 -4)
--- a/kernel/bpf/tcx.c
+++ b/kernel/bpf/tcx.c
@@ -94,15 +94,19 @@
 
 void tcx_uninstall(struct net_device *dev, bool ingress)
 {
+	struct bpf_mprog_entry *entry, *entry_new = NULL;
 	struct bpf_tuple tuple = {};
-	struct bpf_mprog_entry *entry;
 	struct bpf_mprog_fp *fp;
 	struct bpf_mprog_cp *cp;
+	bool active;
 
 	entry = tcx_entry_fetch(dev, ingress);
 	if (!entry)
 		return;
-	tcx_entry_update(dev, NULL, ingress);
+	active = tcx_entry(entry)->miniq_active;
+	if (active)
+		bpf_mprog_clear_all(entry, &entry_new);
+	tcx_entry_update(dev, entry_new, ingress);
 	tcx_entry_sync();
 	bpf_mprog_foreach_tuple(entry, fp, cp, tuple) {
 		if (tuple.link)
@@ -111,8 +115,8 @@
 			bpf_prog_put(tuple.prog);
 		tcx_skeys_dec(ingress);
 	}
-	WARN_ON_ONCE(tcx_entry(entry)->miniq_active);
-	tcx_entry_free(entry);
+	if (!active)
+		tcx_entry_free(entry);
 }
 
 int tcx_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr)