Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netfilter: Exclude LEGACY TABLES on PREEMPT_RT.

The seqcount xt_recseq is used to synchronize the replacement of
xt_table::private in xt_replace_table() against all readers such as
ipt_do_table().

To ensure that there is only one writer, the writing side disables
bottom halves. The sequence counter can be acquired recursively. Only the
first invocation modifies the sequence counter (signaling that a writer
is in progress) while the following (recursive) writer does not modify
the counter.
The lack of a proper locking mechanism for the sequence counter can lead
to a livelock on PREEMPT_RT if a high-priority reader preempts the
writer. Additionally, if the per-CPU lock on PREEMPT_RT is removed from
local_bh_disable(), then there is no synchronisation for the per-CPU
sequence counter.

The affected code is "just" the legacy netfilter code which is replaced
by "netfilter tables". That code can be disabled without sacrificing
functionality because everything is provided by the newer
implementation. This only requires the usage of the "-nft" tools
instead of the "-legacy" ones.
The long term plan is to remove the legacy code, so let's accelerate the
progress.

Relax dependencies on iptables legacy: replace select with depends on.
This should cause no harm to existing kernel configs, and users can still
toggle IP{6}_NF_IPTABLES_LEGACY in any case.
Make EBTABLES_LEGACY, IPTABLES_LEGACY and ARPTABLES depend on
NETFILTER_XTABLES_LEGACY. Hide xt_recseq and its users,
xt_register_table() and xt_percpu_counter_alloc() behind
NETFILTER_XTABLES_LEGACY. Let NETFILTER_XTABLES_LEGACY depend on
!PREEMPT_RT.

This will break selftests expecting the legacy options to be enabled;
this will be addressed in a following patch.

Co-developed-by: Florian Westphal <fw@strlen.de>
Co-developed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

+47 -32
+5 -5
net/bridge/netfilter/Kconfig
···
 # old sockopt interface and eval loop
 config BRIDGE_NF_EBTABLES_LEGACY
 	tristate "Legacy EBTABLES support"
-	depends on BRIDGE && NETFILTER_XTABLES
-	default n
+	depends on BRIDGE && NETFILTER_XTABLES_LEGACY
+	default n
 	help
 	  Legacy ebtables packet/frame classifier.
 	  This is not needed if you are using ebtables over nftables
···
 #
 config BRIDGE_EBT_BROUTE
 	tristate "ebt: broute table support"
-	select BRIDGE_NF_EBTABLES_LEGACY
+	depends on BRIDGE_NF_EBTABLES_LEGACY
 	help
 	  The ebtables broute table is used to define rules that decide between
 	  bridging and routing frames, giving Linux the functionality of a
···
 
 config BRIDGE_EBT_T_FILTER
 	tristate "ebt: filter table support"
-	select BRIDGE_NF_EBTABLES_LEGACY
+	depends on BRIDGE_NF_EBTABLES_LEGACY
 	help
 	  The ebtables filter table is used to define frame filtering rules at
 	  local input, forwarding and local output. See the man page for
···
 
 config BRIDGE_EBT_T_NAT
 	tristate "ebt: nat table support"
-	select BRIDGE_NF_EBTABLES_LEGACY
+	depends on BRIDGE_NF_EBTABLES_LEGACY
 	help
 	  The ebtables nat table is used to define rules that alter the MAC
 	  source address (MAC SNAT) or the MAC destination address (MAC DNAT).
+12 -12
net/ipv4/netfilter/Kconfig
···
 # old sockopt interface and eval loop
 config IP_NF_IPTABLES_LEGACY
 	tristate "Legacy IP tables support"
-	default n
-	select NETFILTER_XTABLES
+	depends on NETFILTER_XTABLES_LEGACY
+	default m if NETFILTER_XTABLES_LEGACY
 	help
 	  iptables is a legacy packet classifier.
 	  This is not needed if you are using iptables over nftables
···
 # `filter', generic and specific targets
 config IP_NF_FILTER
 	tristate "Packet filtering"
-	default m if NETFILTER_ADVANCED=n
-	select IP_NF_IPTABLES_LEGACY
+	default m if NETFILTER_ADVANCED=n || IP_NF_IPTABLES_LEGACY
+	depends on IP_NF_IPTABLES_LEGACY
 	help
 	  Packet filtering defines a table `filter', which has a series of
 	  rules for simple packet filtering at local input, forwarding and
···
 config IP_NF_NAT
 	tristate "iptables NAT support"
 	depends on NF_CONNTRACK
+	depends on IP_NF_IPTABLES_LEGACY
 	default m if NETFILTER_ADVANCED=n
 	select NF_NAT
 	select NETFILTER_XT_NAT
-	select IP_NF_IPTABLES_LEGACY
 	help
 	  This enables the `nat' table in iptables. This allows masquerading,
 	  port forwarding and other forms of full Network Address Port
···
 # mangle + specific targets
 config IP_NF_MANGLE
 	tristate "Packet mangling"
-	default m if NETFILTER_ADVANCED=n
-	select IP_NF_IPTABLES_LEGACY
+	default m if NETFILTER_ADVANCED=n || IP_NF_IPTABLES_LEGACY
+	depends on IP_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `mangle' table to iptables: see the man page for
 	  iptables(8). This table is used for various packet alterations
···
 # raw + specific targets
 config IP_NF_RAW
 	tristate 'raw table support (required for NOTRACK/TRACE)'
-	select IP_NF_IPTABLES_LEGACY
+	depends on IP_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `raw' table to iptables. This table is the very
 	  first in the netfilter framework and hooks in at the PREROUTING
···
 	tristate "Security table"
 	depends on SECURITY
 	depends on NETFILTER_ADVANCED
-	select IP_NF_IPTABLES_LEGACY
+	depends on IP_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `security' table to iptables, for use
 	  with Mandatory Access Control (MAC) policy.
···
 # ARP tables
 config IP_NF_ARPTABLES
 	tristate "Legacy ARPTABLES support"
-	depends on NETFILTER_XTABLES
-	default n
+	depends on NETFILTER_XTABLES_LEGACY
+	default n
 	help
 	  arptables is a legacy packet classifier.
 	  This is not needed if you are using arptables over nftables
···
 	tristate "arptables-legacy packet filtering support"
 	select IP_NF_ARPTABLES
 	select NETFILTER_FAMILY_ARP
-	depends on NETFILTER_XTABLES
+	depends on NETFILTER_XTABLES_LEGACY
 	help
 	  ARP packet filtering defines a table `filter', which has a series of
 	  rules for simple ARP packet filtering at local input and
+9 -10
net/ipv6/netfilter/Kconfig
···
 # old sockopt interface and eval loop
 config IP6_NF_IPTABLES_LEGACY
 	tristate "Legacy IP6 tables support"
-	depends on INET && IPV6
-	select NETFILTER_XTABLES
-	default n
+	depends on INET && IPV6 && NETFILTER_XTABLES_LEGACY
+	default m if NETFILTER_XTABLES_LEGACY
 	help
 	  ip6tables is a legacy packet classifier.
 	  This is not needed if you are using iptables over nftables
···
 
 config IP6_NF_FILTER
 	tristate "Packet filtering"
-	default m if NETFILTER_ADVANCED=n
-	select IP6_NF_IPTABLES_LEGACY
+	default m if NETFILTER_ADVANCED=n || IP6_NF_IPTABLES_LEGACY
+	depends on IP6_NF_IPTABLES_LEGACY
 	tristate
 	help
 	  Packet filtering defines a table `filter', which has a series of
···
 
 config IP6_NF_MANGLE
 	tristate "Packet mangling"
-	default m if NETFILTER_ADVANCED=n
-	select IP6_NF_IPTABLES_LEGACY
+	default m if NETFILTER_ADVANCED=n || IP6_NF_IPTABLES_LEGACY
+	depends on IP6_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `mangle' table to iptables: see the man page for
 	  iptables(8). This table is used for various packet alterations
···
 
 config IP6_NF_RAW
 	tristate 'raw table support (required for TRACE)'
-	select IP6_NF_IPTABLES_LEGACY
+	depends on IP6_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `raw' table to ip6tables. This table is the very
 	  first in the netfilter framework and hooks in at the PREROUTING
···
 	tristate "Security table"
 	depends on SECURITY
 	depends on NETFILTER_ADVANCED
-	select IP6_NF_IPTABLES_LEGACY
+	depends on IP6_NF_IPTABLES_LEGACY
 	help
 	  This option adds a `security' table to iptables, for use
 	  with Mandatory Access Control (MAC) policy.
···
 	tristate "ip6tables NAT support"
 	depends on NF_CONNTRACK
 	depends on NETFILTER_ADVANCED
+	depends on IP6_NF_IPTABLES_LEGACY
 	select NF_NAT
-	select IP6_NF_IPTABLES_LEGACY
 	select NETFILTER_XT_NAT
 	help
 	  This enables the `nat' table in ip6tables. This allows masquerading,
+10
net/netfilter/Kconfig
···
 
 	  If unsure, say N.
 
+config NETFILTER_XTABLES_LEGACY
+	bool "Netfilter legacy tables support"
+	depends on !PREEMPT_RT
+	help
+	  Say Y here if you still require support for legacy tables. This is
+	  required by the legacy tools (iptables-legacy) and is not needed if
+	  you use iptables over nftables (iptables-nft).
+	  Legacy support is not limited to IP, it also includes EBTABLES and
+	  ARPTABLES.
+
 comment "Xtables combined modules"
 
 config NETFILTER_XT_MARK
+11 -5
net/netfilter/x_tables.c
···
 EXPORT_SYMBOL_GPL(xt_compat_unlock);
 #endif
 
-DEFINE_PER_CPU(seqcount_t, xt_recseq);
-EXPORT_PER_CPU_SYMBOL_GPL(xt_recseq);
-
 struct static_key xt_tee_enabled __read_mostly;
 EXPORT_SYMBOL_GPL(xt_tee_enabled);
+
+#ifdef CONFIG_NETFILTER_XTABLES_LEGACY
+DEFINE_PER_CPU(seqcount_t, xt_recseq);
+EXPORT_PER_CPU_SYMBOL_GPL(xt_recseq);
 
 static int xt_jumpstack_alloc(struct xt_table_info *i)
 {
···
 	return private;
 }
 EXPORT_SYMBOL_GPL(xt_unregister_table);
+#endif
 
 #ifdef CONFIG_PROC_FS
 static void *xt_table_seq_start(struct seq_file *seq, loff_t *pos)
···
 }
 EXPORT_SYMBOL_GPL(xt_proto_fini);
 
+#ifdef CONFIG_NETFILTER_XTABLES_LEGACY
 /**
  * xt_percpu_counter_alloc - allocate x_tables rule counter
  *
···
 	free_percpu((void __percpu *)pcnt);
 }
 EXPORT_SYMBOL_GPL(xt_percpu_counter_free);
+#endif
 
 static int __net_init xt_net_init(struct net *net)
 {
···
 	unsigned int i;
 	int rv;
 
-	for_each_possible_cpu(i) {
-		seqcount_init(&per_cpu(xt_recseq, i));
+	if (IS_ENABLED(CONFIG_NETFILTER_XTABLES_LEGACY)) {
+		for_each_possible_cpu(i) {
+			seqcount_init(&per_cpu(xt_recseq, i));
+		}
 	}
 
 	xt = kcalloc(NFPROTO_NUMPROTO, sizeof(struct xt_af), GFP_KERNEL);