Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netfilter: xt_NFQUEUE: introduce CPU fanout

Current NFQUEUE target uses a hash, computed over source and
destination address (and other parameters), for steering the packet
to the actual NFQUEUE. This, however, ignores the fact that the
packet is eventually handled by a particular CPU on user request.
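
For illustration, the hash-based steering can be sketched in plain C as
below. This is a minimal userspace sketch, not the kernel code: it assumes
a 32-bit hash over the packet addresses is already available, and
hash_to_queue is a hypothetical name. It only mirrors the
multiply-and-shift scaling that the target applies to hash_v4()/hash_v6().

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): scale a 32-bit hash into
 * the range [0, queues_total) and offset it by the first queue number.
 */
static uint32_t hash_to_queue(uint32_t hash, uint16_t queuenum,
			      uint16_t queues_total)
{
	/* (hash * n) >> 32 spreads the 32-bit hash range over 0..n-1. */
	return (uint32_t)(((uint64_t)hash * queues_total) >> 32) + queuenum;
}
```

The result depends only on the hash value, so the CPU that happens to
process the packet has no influence on which NFQUEUE is chosen.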

If, e.g.,

1) IRQ affinity is used to handle packets on a particular CPU already
(in both the single-queue and multi-queue cases)

and/or

2) RPS is used to steer packets to a specific softirq

the target easily chooses an NFQUEUE which is not handled by a process
pinned to the same CPU.

The idea is therefore to use the CPU index for determining the
NFQUEUE handling the packet.

E.g. on a system with 4 CPUs, 4 MQ queues and 4 NFQUEUEs it
looks like this:

+-----+  +-----+  +-----+  +-----+
|NFQ#0|  |NFQ#1|  |NFQ#2|  |NFQ#3|
+-----+  +-----+  +-----+  +-----+
   ^        ^        ^        ^
   |        |NFQUEUE |        |
   +        +        +        +
+-----+  +-----+  +-----+  +-----+
|rx-0 |  |rx-1 |  |rx-2 |  |rx-3 |
+-----+  +-----+  +-----+  +-----+

The NFQUEUEs do not necessarily have to start at number 0, and setups
with fewer NFQUEUEs than packet-handling CPUs are not a problem either.
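
The CPU-to-queue mapping can be sketched in userspace C as follows. This
is a minimal sketch under stated assumptions: cpu_fanout_queue is a
hypothetical helper name, and the CPU index is passed in as a parameter,
whereas the kernel target obtains it from smp_processor_id().

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): map a CPU index onto one
 * of queues_total NFQUEUEs whose numbering starts at queuenum.
 */
static uint32_t cpu_fanout_queue(uint16_t queuenum, uint16_t queues_total,
				 unsigned int cpu)
{
	/* With at most one queue there is nothing to spread over. */
	if (queues_total <= 1)
		return queuenum;

	/* CPUs with an index >= queues_total wrap around and share a queue. */
	return queuenum + cpu % queues_total;
}
```

With queuenum 8 and queues_total 2, for instance, CPUs 0 and 2 map to
NFQUEUE 8 and CPUs 1 and 3 map to NFQUEUE 9.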

This patch extends the NFQUEUE target to accept a new
NFQ_FLAG_CPU_FANOUT flag. If it is set, the target uses the CPU index
to determine the NFQUEUE being used. I have to introduce rev3 for
this. The 'flags' field takes the place of the _v2 'bypass' field.
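
The revision-dependent flag validation can be sketched as below. This is
a userspace stand-in, not the kernel function: nfq_check_flags is a
hypothetical name, while the flag values are the ones this patch defines.
It assumes, as the shared check function suggests, that the rev3 'flags'
field sits where the rev2 'bypass' field was, so NFQ_FLAG_BYPASS keeps
the old meaning of bypass == 1.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined by the patch for struct xt_NFQ_info_v3. */
#define NFQ_FLAG_BYPASS     0x01 /* for compatibility with v2 */
#define NFQ_FLAG_CPU_FANOUT 0x02 /* use current CPU (no hashing) */
#define NFQ_FLAG_MASK       0x03

/* Hypothetical stand-in for the flag part of the checkentry hook. */
static int nfq_check_flags(unsigned int revision, uint16_t flags)
{
	/* rev2 only knows 'bypass', which must be 0 or 1. */
	if (revision == 2 && flags > 1)
		return -EINVAL;
	/* rev3 rejects any bit outside the known flag mask. */
	if (revision == 3 && (flags & ~NFQ_FLAG_MASK))
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits in rev3 leaves room to assign the remaining flag
bits later without old kernels silently accepting them.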

By changing the way the queue is assigned, I'm able to improve
performance if the processes reading from the NFQUEUEs are pinned
correctly.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

authored by holger@eitzenberger.org and committed by Pablo Neira Ayuso
8746ddcf f0165888

+48 -2
+9
include/uapi/linux/netfilter/xt_NFQUEUE.h
···
 	__u16 bypass;
 };

+struct xt_NFQ_info_v3 {
+	__u16 queuenum;
+	__u16 queues_total;
+	__u16 flags;
+#define NFQ_FLAG_BYPASS     0x01 /* for compatibility with v2 */
+#define NFQ_FLAG_CPU_FANOUT 0x02 /* use current CPU (no hashing) */
+#define NFQ_FLAG_MASK       0x03
+};
+
 #endif /* _XT_NFQ_TARGET_H */
+39 -2
net/netfilter/xt_NFQUEUE.c
···

 static int nfqueue_tg_check(const struct xt_tgchk_param *par)
 {
-	const struct xt_NFQ_info_v2 *info = par->targinfo;
+	const struct xt_NFQ_info_v3 *info = par->targinfo;
 	u32 maxid;

 	if (unlikely(!rnd_inited)) {
···
 		       info->queues_total, maxid);
 		return -ERANGE;
 	}
-	if (par->target->revision == 2 && info->bypass > 1)
+	if (par->target->revision == 2 && info->flags > 1)
 		return -EINVAL;
+	if (par->target->revision == 3 && info->flags & ~NFQ_FLAG_MASK)
+		return -EINVAL;
+
 	return 0;
+}
+
+static unsigned int
+nfqueue_tg_v3(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct xt_NFQ_info_v3 *info = par->targinfo;
+	u32 queue = info->queuenum;
+
+	if (info->queues_total > 1) {
+		if (info->flags & NFQ_FLAG_CPU_FANOUT) {
+			int cpu = smp_processor_id();
+
+			queue = info->queuenum + cpu % info->queues_total;
+		} else {
+			if (par->family == NFPROTO_IPV4)
+				queue = (((u64) hash_v4(skb) * info->queues_total) >>
+					 32) + queue;
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+			else if (par->family == NFPROTO_IPV6)
+				queue = (((u64) hash_v6(skb) * info->queues_total) >>
+					 32) + queue;
+#endif
+		}
+	}
+	return NF_QUEUE_NR(queue);
 }

 static struct xt_target nfqueue_tg_reg[] __read_mostly = {
···
 		.checkentry = nfqueue_tg_check,
 		.target = nfqueue_tg_v2,
 		.targetsize = sizeof(struct xt_NFQ_info_v2),
+		.me = THIS_MODULE,
+	},
+	{
+		.name = "NFQUEUE",
+		.revision = 3,
+		.family = NFPROTO_UNSPEC,
+		.checkentry = nfqueue_tg_check,
+		.target = nfqueue_tg_v3,
+		.targetsize = sizeof(struct xt_NFQ_info_v3),
 		.me = THIS_MODULE,
 	},
 };