Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

netfilter: conntrack: fix rmmod double-free race

nf_conntrack_hash_check_insert() callers free the ct entry directly, via
nf_conntrack_free.

This isn't safe anymore because
nf_conntrack_hash_check_insert() might place the entry into the conntrack
table and then delteted the entry again because it found that a conntrack
extension has been removed at the same time.

In this case, the just-added entry is removed again and an error is
returned to the caller.

Problem is that another cpu might have picked up this entry and
incremented its reference count.

This results in a use-after-free/double-free, once by the other cpu and
once by the caller of nf_conntrack_hash_check_insert().

Fix this by making nf_conntrack_hash_check_insert() not fail anymore
after the insertion, just like before the 'Fixes' commit.

This is safe because a racing nf_ct_iterate() has to wait for us
to release the conntrack hash spinlocks.

While at it, make the function return -EAGAIN in the rmmod (genid
changed) case, this makes nfnetlink replay the command (suggested
by Pablo Neira).

Fixes: c56716c69ce1 ("netfilter: extensions: introduce extension genid count")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

authored by

Florian Westphal and committed by
Pablo Neira Ayuso
e6d57e9f ac489398

+15 -14
-1
net/netfilter/nf_conntrack_bpf.c
··· 381 381 struct nf_conn *nfct = (struct nf_conn *)nfct_i; 382 382 int err; 383 383 384 - nfct->status |= IPS_CONFIRMED; 385 384 err = nf_conntrack_hash_check_insert(nfct); 386 385 if (err < 0) { 387 386 nf_conntrack_free(nfct);
+15 -10
net/netfilter/nf_conntrack_core.c
··· 886 886 887 887 zone = nf_ct_zone(ct); 888 888 889 - if (!nf_ct_ext_valid_pre(ct->ext)) { 890 - NF_CT_STAT_INC_ATOMIC(net, insert_failed); 891 - return -ETIMEDOUT; 892 - } 889 + if (!nf_ct_ext_valid_pre(ct->ext)) 890 + return -EAGAIN; 893 891 894 892 local_bh_disable(); 895 893 do { ··· 922 924 goto chaintoolong; 923 925 } 924 926 927 + /* If genid has changed, we can't insert anymore because ct 928 + * extensions could have stale pointers and nf_ct_iterate_destroy 929 + * might have completed its table scan already. 930 + * 931 + * Increment of the ext genid right after this check is fine: 932 + * nf_ct_iterate_destroy blocks until locks are released. 933 + */ 934 + if (!nf_ct_ext_valid_post(ct->ext)) { 935 + err = -EAGAIN; 936 + goto out; 937 + } 938 + 939 + ct->status |= IPS_CONFIRMED; 925 940 smp_wmb(); 926 941 /* The caller holds a reference to this object */ 927 942 refcount_set(&ct->ct_general.use, 2); ··· 942 931 nf_conntrack_double_unlock(hash, reply_hash); 943 932 NF_CT_STAT_INC(net, insert); 944 933 local_bh_enable(); 945 - 946 - if (!nf_ct_ext_valid_post(ct->ext)) { 947 - nf_ct_kill(ct); 948 - NF_CT_STAT_INC_ATOMIC(net, drop); 949 - return -ETIMEDOUT; 950 - } 951 934 952 935 return 0; 953 936 chaintoolong:
-3
net/netfilter/nf_conntrack_netlink.c
··· 2316 2316 nfct_seqadj_ext_add(ct); 2317 2317 nfct_synproxy_ext_add(ct); 2318 2318 2319 - /* we must add conntrack extensions before confirmation. */ 2320 - ct->status |= IPS_CONFIRMED; 2321 - 2322 2319 if (cda[CTA_STATUS]) { 2323 2320 err = ctnetlink_change_status(ct, cda); 2324 2321 if (err < 0)