Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netfilter: xt_hashlimit: htable_selective_cleanup() optimization

I have seen syzbot reports hinting at xt_hashlimit abuse:

[ 105.783066][ T4331] xt_hashlimit: max too large, truncated to 1048576
[ 105.811405][ T4331] xt_hashlimit: size too large, truncated to 1048576

The reports also show worker threads using up to 1 second per htable_selective_cleanup() invocation:

[ 269.734496][ C1] [<ffffffff81547180>] ? __local_bh_enable_ip+0x1a0/0x1a0
[ 269.734513][ C1] [<ffffffff817d75d0>] ? lockdep_hardirqs_on_prepare+0x740/0x740
[ 269.734533][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310
[ 269.734549][ C1] [<ffffffff817dcd30>] ? __lock_acquire+0x2060/0x2060
[ 269.734567][ C1] [<ffffffff817f058a>] ? do_raw_spin_lock+0x14a/0x370
[ 269.734583][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310
[ 269.734599][ C1] [<ffffffff81547147>] __local_bh_enable_ip+0x167/0x1a0
[ 269.734616][ C1] [<ffffffff81546fe0>] ? _local_bh_enable+0xa0/0xa0
[ 269.734634][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310
[ 269.734651][ C1] [<ffffffff852e71ff>] htable_selective_cleanup+0x25f/0x310
[ 269.734670][ C1] [<ffffffff815b3cc9>] ? process_one_work+0x7a9/0x1170
[ 269.734685][ C1] [<ffffffff852e57db>] htable_gc+0x1b/0xa0
[ 269.734700][ C1] [<ffffffff815b3cc9>] ? process_one_work+0x7a9/0x1170
[ 269.734714][ C1] [<ffffffff815b3dc9>] process_one_work+0x8a9/0x1170
[ 269.734733][ C1] [<ffffffff815b3520>] ? worker_detach_from_pool+0x260/0x260
[ 269.734749][ C1] [<ffffffff810201c7>] ? _raw_spin_lock_irq+0xb7/0xf0
[ 269.734763][ C1] [<ffffffff81020110>] ? _raw_spin_lock_irqsave+0x100/0x100
[ 269.734777][ C1] [<ffffffff8159d3df>] ? wq_worker_sleeping+0x5f/0x270
[ 269.734800][ C1] [<ffffffff815b53c7>] worker_thread+0xa47/0x1200
[ 269.734815][ C1] [<ffffffff81020010>] ? _raw_spin_lock+0x40/0x40
[ 269.734835][ C1] [<ffffffff815c9f2a>] kthread+0x25a/0x2e0
[ 269.734853][ C1] [<ffffffff815b4980>] ? worker_clr_flags+0x190/0x190
[ 269.734866][ C1] [<ffffffff815c9cd0>] ? kthread_blkcg+0xd0/0xd0
[ 269.734885][ C1] [<ffffffff81027b1a>] ret_from_fork+0x3a/0x50

We can skip over empty buckets, avoiding the lockdep penalty
on debug kernels and the atomic operations on non-debug ones.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Authored by Eric Dumazet and committed by Pablo Neira Ayuso
95f1c1e9 178883fd

+5 -1
net/netfilter/xt_hashlimit.c
···
 	unsigned int i;
 
 	for (i = 0; i < ht->cfg.size; i++) {
+		struct hlist_head *head = &ht->hash[i];
 		struct dsthash_ent *dh;
 		struct hlist_node *n;
 
+		if (hlist_empty(head))
+			continue;
+
 		spin_lock_bh(&ht->lock);
-		hlist_for_each_entry_safe(dh, n, &ht->hash[i], node) {
+		hlist_for_each_entry_safe(dh, n, head, node) {
 			if (time_after_eq(jiffies, dh->expires) || select_all)
 				dsthash_free(ht, dh);
 		}
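
The effect of the change can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel code: struct table, struct entry, table_lock(), table_insert() and table_cleanup() are hypothetical names, the lock is modelled as a plain counter so that acquisitions can be observed, and expiry uses a simple comparison instead of the wraparound-safe time_after_eq(). It does not model concurrency at all; it only shows that buckets found empty by an unlocked peek never reach the lock.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 8

/* Hypothetical stand-in for struct dsthash_ent. */
struct entry {
	struct entry *next;
	unsigned long expires;
};

/* Hypothetical stand-in for the hash table; lock_count replaces the spinlock. */
struct table {
	struct entry *bucket[NBUCKETS];
	unsigned int lock_count;	/* number of lock acquisitions so far */
};

/* Model of spin_lock_bh(): only records that the lock was taken. */
static void table_lock(struct table *t)
{
	t->lock_count++;
}

static void table_unlock(struct table *t)
{
	(void)t;
}

/* Push an entry onto the head of bucket i. */
static void table_insert(struct table *t, unsigned int i, struct entry *e)
{
	e->next = t->bucket[i % NBUCKETS];
	t->bucket[i % NBUCKETS] = e;
}

/* Unlink expired entries (or all of them); returns how many were unlinked. */
static unsigned int table_cleanup(struct table *t, unsigned long now,
				  bool select_all)
{
	unsigned int i, freed = 0;

	for (i = 0; i < NBUCKETS; i++) {
		struct entry **pp = &t->bucket[i];

		/* Unlocked peek: an empty bucket has nothing to expire. */
		if (*pp == NULL)
			continue;

		table_lock(t);
		while (*pp) {
			struct entry *e = *pp;

			if (select_all || now >= e->expires) {
				*pp = e->next;	/* unlink, keep pp in place */
				e->next = NULL;
				freed++;
			} else {
				pp = &e->next;	/* keep entry, advance */
			}
		}
		table_unlock(t);
	}
	return freed;
}
```

With two entries placed in a single bucket, one call to table_cleanup() takes the modelled lock once rather than NBUCKETS times, and a call on a fully empty table takes it zero times. This is the saving the commit message describes for a table configured with 1048576 buckets, most of which may be empty.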