Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

neighbour: make proxy_queue.qlen limit per-device

Right now we have a neigh_parms variable, PROXY_QLEN, which specifies the
maximum length of neigh_table->proxy_queue. In practice this limit doesn't
work well, because the check condition looks like:
tbl->proxy_queue.qlen > NEIGH_VAR(p, PROXY_QLEN)

The problem is that p (struct neigh_parms) is a per-device thing,
but tbl (struct neigh_table) is a system-wide global thing.

It seems reasonable to make the proxy_queue limit per-device.

v2:
- nothing changed in this patch
v3:
- rebase to net tree

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@kernel.org>
Cc: Yajun Deng <yajun.deng@linux.dev>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Konstantin Khorenko <khorenko@virtuozzo.com>
Cc: kernel@openvz.org
Cc: devel@openvz.org
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

authored by Alexander Mikhalitsyn and committed by David S. Miller
0ff4eb3d 66ba215c

+23 -3
+1
include/net/neighbour.h
···
 	struct rcu_head rcu_head;
 
 	int	reachable_time;
+	int	qlen;
 	int	data[NEIGH_VAR_DATA_MAX];
 	DECLARE_BITMAP(data_state, NEIGH_VAR_DATA_MAX);
 };
+22 -3
net/core/neighbour.c
···
 	skb = skb_peek(list);
 	while (skb != NULL) {
 		struct sk_buff *skb_next = skb_peek_next(skb, list);
-		if (net == NULL || net_eq(dev_net(skb->dev), net)) {
+		struct net_device *dev = skb->dev;
+		if (net == NULL || net_eq(dev_net(dev), net)) {
+			struct in_device *in_dev;
+
+			rcu_read_lock();
+			in_dev = __in_dev_get_rcu(dev);
+			if (in_dev)
+				in_dev->arp_parms->qlen--;
+			rcu_read_unlock();
 			__skb_unlink(skb, list);
-			dev_put(skb->dev);
+
+			dev_put(dev);
 			kfree_skb(skb);
 		}
 		skb = skb_next;
···
 
 		if (tdif <= 0) {
 			struct net_device *dev = skb->dev;
+			struct in_device *in_dev;
 
+			rcu_read_lock();
+			in_dev = __in_dev_get_rcu(dev);
+			if (in_dev)
+				in_dev->arp_parms->qlen--;
+			rcu_read_unlock();
 			__skb_unlink(skb, &tbl->proxy_queue);
+
 			if (tbl->proxy_redo && netif_running(dev)) {
 				rcu_read_lock();
 				tbl->proxy_redo(skb);
···
 	unsigned long sched_next = jiffies +
 			prandom_u32_max(NEIGH_VAR(p, PROXY_DELAY));
 
-	if (tbl->proxy_queue.qlen > NEIGH_VAR(p, PROXY_QLEN)) {
+	if (p->qlen > NEIGH_VAR(p, PROXY_QLEN)) {
 		kfree_skb(skb);
 		return;
 	}
···
 	skb_dst_drop(skb);
 	dev_hold(skb->dev);
 	__skb_queue_tail(&tbl->proxy_queue, skb);
+	p->qlen++;
 	mod_timer(&tbl->proxy_timer, sched_next);
 	spin_unlock(&tbl->proxy_queue.lock);
 }
···
 		refcount_set(&p->refcnt, 1);
 		p->reachable_time =
 				neigh_rand_reach_time(NEIGH_VAR(p, BASE_REACHABLE_TIME));
+		p->qlen = 0;
 		netdev_hold(dev, &p->dev_tracker, GFP_KERNEL);
 		p->dev = dev;
 		write_pnet(&p->net, net);
···
 	refcount_set(&tbl->parms.refcnt, 1);
 	tbl->parms.reachable_time =
 			  neigh_rand_reach_time(NEIGH_VAR(&tbl->parms, BASE_REACHABLE_TIME));
+	tbl->parms.qlen = 0;
 
 	tbl->stats = alloc_percpu(struct neigh_statistics);
 	if (!tbl->stats)