Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netfilter: nf_conntrack: add direction support for zones

This work adds a direction parameter to netfilter zones, so identity
separation can be performed only in the original or reply direction, or
in both (the default). This opens up the possibility of doing NAT with
conflicting IP address/port tuples from multiple, isolated tenants
on a host (e.g. from a netns) without requiring each tenant to NAT
twice or to use its own dedicated IP address to SNAT to: overlapping
tuples can be made unique with the zone identifier in the original
direction, where the NAT engine will then allocate a unique tuple in
the commonly shared default zone for the reply direction. In some
restricted, local DNAT cases, port redirection could also be used to
make the reply traffic unique without requiring SNAT.

The consensus we reached and discussed at NFWS and since the initial
implementation [1] was to integrate the direction metadata directly
into the existing zones infrastructure, as opposed to the ct->mark
approach we proposed initially.

As we pass the nf_conntrack_zone object around directly, we don't have
to touch all call sites, but only those that contain equality checks
of zones. Thus, based on the current direction (original or reply),
we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID.
CT expectations are direction-agnostic entities when expectations are
being compared among themselves, so we can only compare the zone
identifier in this case.

Note that zone identifiers can no longer be included in the hash mix,
as they don't contain a "stable" value that would be equal for both
directions at all times: e.g. if only zone->id were unconditionally
xor'ed into the table slot hash, then replies would no longer find
the corresponding conntrack entry.

If no particular direction is specified when configuring zones, the
behaviour is exactly as we expect currently (both directions).

Support has been added for the CT netlink interface as well as the
x_tables raw CT target, which both already provide interfaces to
user space for the configuration of zones.

Below is a minimal, simplified collision example (script in [2]) with
netperf sessions:

 +--- tenant-1 ---+   mark := 1
 |    netperf     |--+
 +----------------+  |               CT zone := mark [ORIGINAL]
  [ip,sport] := X    |   +--------------+   +--- gateway ---+
                     +---| mark routing |---|     SNAT      |--- ... +
                     |   +--------------+   +---------------+        |
 +--- tenant-2 ---+  |                                            ~~~|~~~
 |    netperf     |--+                      +-----------+            |
 +----------------+   mark := 2             | netserver |------ ... -+
  [ip,sport] := X                           +-----------+
                                             [ip,port] := Y
On the gateway netns, example:

iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL
iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully

iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark
iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark

conntrack dump from gateway netns:

netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns

tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2

Taking this further, the test script in [2] creates 200 tenants, each
running netperf sessions with colliding original tuples. A conntrack -L
dump in the gateway netns likewise confirms 200 overlapping entries, all
in ESTABLISHED state as expected.

I also ran various other tests with permutations of the script, among
them: SNAT in random/random-fully/persistent mode, no zones (no
overlaps), static zones (original, reply, both directions), etc.

[1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/
[2] https://paste.fedoraproject.org/242835/65657871/

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Authored by Daniel Borkmann, committed by Pablo Neira Ayuso
deedb590 308ac914

+259 -94
+30 -1
include/net/netfilter/nf_conntrack_zones.h
···
 #ifndef _NF_CONNTRACK_ZONES_H
 #define _NF_CONNTRACK_ZONES_H

+#include <linux/netfilter/nf_conntrack_tuple_common.h>
+
 #define NF_CT_DEFAULT_ZONE_ID 0
+
+#define NF_CT_ZONE_DIR_ORIG (1 << IP_CT_DIR_ORIGINAL)
+#define NF_CT_ZONE_DIR_REPL (1 << IP_CT_DIR_REPLY)
+
+#define NF_CT_DEFAULT_ZONE_DIR (NF_CT_ZONE_DIR_ORIG | NF_CT_ZONE_DIR_REPL)

 struct nf_conntrack_zone {
 	u16	id;
+	u16	dir;
 };

 extern const struct nf_conntrack_zone nf_ct_zone_dflt;
···
 	return tmpl ? nf_ct_zone(tmpl) : &nf_ct_zone_dflt;
 }

+static inline bool nf_ct_zone_matches_dir(const struct nf_conntrack_zone *zone,
+					  enum ip_conntrack_dir dir)
+{
+	return zone->dir & (1 << dir);
+}
+
+static inline u16 nf_ct_zone_id(const struct nf_conntrack_zone *zone,
+				enum ip_conntrack_dir dir)
+{
+	return nf_ct_zone_matches_dir(zone, dir) ?
+	       zone->id : NF_CT_DEFAULT_ZONE_ID;
+}
+
 static inline bool nf_ct_zone_equal(const struct nf_conn *a,
-				    const struct nf_conntrack_zone *b)
+				    const struct nf_conntrack_zone *b,
+				    enum ip_conntrack_dir dir)
+{
+	return nf_ct_zone_id(nf_ct_zone(a), dir) ==
+	       nf_ct_zone_id(b, dir);
+}
+
+static inline bool nf_ct_zone_equal_any(const struct nf_conn *a,
+					const struct nf_conntrack_zone *b)
 {
 	return nf_ct_zone(a)->id == b->id;
 }
+5 -1
include/uapi/linux/netfilter/xt_CT.h
···
 enum {
 	XT_CT_NOTRACK		= 1 << 0,
 	XT_CT_NOTRACK_ALIAS	= 1 << 1,
-	XT_CT_MASK		= XT_CT_NOTRACK | XT_CT_NOTRACK_ALIAS,
+	XT_CT_ZONE_DIR_ORIG	= 1 << 2,
+	XT_CT_ZONE_DIR_REPL	= 1 << 3,
+
+	XT_CT_MASK		= XT_CT_NOTRACK | XT_CT_NOTRACK_ALIAS |
+				  XT_CT_ZONE_DIR_ORIG | XT_CT_ZONE_DIR_REPL,
 };

 struct xt_ct_target_info {
+6 -2
net/ipv4/netfilter/nf_defrag_ipv4.c
···
 {
 	u16 zone_id = NF_CT_DEFAULT_ZONE_ID;
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
-	if (skb->nfct)
-		zone_id = nf_ct_zone((struct nf_conn *)skb->nfct)->id;
+	if (skb->nfct) {
+		enum ip_conntrack_info ctinfo;
+		const struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
+
+		zone_id = nf_ct_zone_id(nf_ct_zone(ct), CTINFO2DIR(ctinfo));
+	}
 #endif
 	if (nf_bridge_in_prerouting(skb))
 		return IP_DEFRAG_CONNTRACK_BRIDGE_IN + zone_id;
+6 -2
net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
···
 {
 	u16 zone_id = NF_CT_DEFAULT_ZONE_ID;
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
-	if (skb->nfct)
-		zone_id = nf_ct_zone((struct nf_conn *)skb->nfct)->id;
+	if (skb->nfct) {
+		enum ip_conntrack_info ctinfo;
+		const struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
+
+		zone_id = nf_ct_zone_id(nf_ct_zone(ct), CTINFO2DIR(ctinfo));
+	}
 #endif
 	if (nf_bridge_in_prerouting(skb))
 		return IP6_DEFRAG_CONNTRACK_BRIDGE_IN + zone_id;
+27 -26
net/netfilter/nf_conntrack_core.c
···
 unsigned int nf_conntrack_hash_rnd __read_mostly;
 EXPORT_SYMBOL_GPL(nf_conntrack_hash_rnd);

-static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
-			      const struct nf_conntrack_zone *zone)
+static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple)
 {
 	unsigned int n;

···
 	 * three bytes manually.
 	 */
 	n = (sizeof(tuple->src) + sizeof(tuple->dst.u3)) / sizeof(u32);
-	return jhash2((u32 *)tuple, n, zone->id ^ nf_conntrack_hash_rnd ^
+	return jhash2((u32 *)tuple, n, nf_conntrack_hash_rnd ^
 		      (((__force __u16)tuple->dst.u.all << 16) |
 		       tuple->dst.protonum));
 }
···
 }

 static u_int32_t __hash_conntrack(const struct nf_conntrack_tuple *tuple,
-				  const struct nf_conntrack_zone *zone,
 				  unsigned int size)
 {
-	return __hash_bucket(hash_conntrack_raw(tuple, zone), size);
+	return __hash_bucket(hash_conntrack_raw(tuple), size);
 }

 static inline u_int32_t hash_conntrack(const struct net *net,
-				       const struct nf_conntrack_zone *zone,
 				       const struct nf_conntrack_tuple *tuple)
 {
-	return __hash_conntrack(tuple, zone, net->ct.htable_size);
+	return __hash_conntrack(tuple, net->ct.htable_size);
 }

 bool
···
 	if (!nf_ct_zone)
 		goto out_free;
 	nf_ct_zone->id = zone->id;
+	nf_ct_zone->dir = zone->dir;
 }
 #endif
 	atomic_set(&tmpl->ct_general.use, 0);
···
 static void nf_ct_delete_from_lists(struct nf_conn *ct)
 {
-	const struct nf_conntrack_zone *zone;
 	struct net *net = nf_ct_net(ct);
 	unsigned int hash, reply_hash;
 	unsigned int sequence;

-	zone = nf_ct_zone(ct);
 	nf_ct_helper_destroy(ct);

 	local_bh_disable();
 	do {
 		sequence = read_seqcount_begin(&net->ct.generation);
-		hash = hash_conntrack(net, zone,
+		hash = hash_conntrack(net,
 				      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
-		reply_hash = hash_conntrack(net, zone,
+		reply_hash = hash_conntrack(net,
 					    &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
 	} while (nf_conntrack_double_lock(net, hash, reply_hash, sequence));
···
 	 * so we need to check that the conntrack is confirmed
 	 */
 	return nf_ct_tuple_equal(tuple, &h->tuple) &&
-	       nf_ct_zone_equal(ct, zone) &&
+	       nf_ct_zone_equal(ct, zone, NF_CT_DIRECTION(h)) &&
 	       nf_ct_is_confirmed(ct);
 }
···
 		       const struct nf_conntrack_tuple *tuple)
 {
 	return __nf_conntrack_find_get(net, zone, tuple,
-				       hash_conntrack_raw(tuple, zone));
+				       hash_conntrack_raw(tuple));
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_find_get);
···
 	local_bh_disable();
 	do {
 		sequence = read_seqcount_begin(&net->ct.generation);
-		hash = hash_conntrack(net, zone,
+		hash = hash_conntrack(net,
 				      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
-		reply_hash = hash_conntrack(net, zone,
+		reply_hash = hash_conntrack(net,
 					    &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
 	} while (nf_conntrack_double_lock(net, hash, reply_hash, sequence));

···
 	hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
 		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
 				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone))
+		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
+				     NF_CT_DIRECTION(h)))
 			goto out;
 	hlist_nulls_for_each_entry(h, n, &net->ct.hash[reply_hash], hnnode)
 		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
 				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone))
+		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
+				     NF_CT_DIRECTION(h)))
 			goto out;

 	add_timer(&ct->timeout);
···
 		/* reuse the hash saved before */
 		hash = *(unsigned long *)&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev;
 		hash = hash_bucket(hash, net);
-		reply_hash = hash_conntrack(net, zone,
+		reply_hash = hash_conntrack(net,
 					    &ct->tuplehash[IP_CT_DIR_REPLY].tuple);

 	} while (nf_conntrack_double_lock(net, hash, reply_hash, sequence));
···
 	hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
 		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
 				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone))
+		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
+				     NF_CT_DIRECTION(h)))
 			goto out;
 	hlist_nulls_for_each_entry(h, n, &net->ct.hash[reply_hash], hnnode)
 		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
 				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone))
+		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
+				     NF_CT_DIRECTION(h)))
 			goto out;

 	/* Timer relative to confirmation time, not original
···
 	unsigned int hash;

 	zone = nf_ct_zone(ignored_conntrack);
-	hash = hash_conntrack(net, zone, tuple);
+	hash = hash_conntrack(net, tuple);

 	/* Disable BHs the entire time since we need to disable them at
 	 * least once for the stats anyway.
···
 		ct = nf_ct_tuplehash_to_ctrack(h);
 		if (ct != ignored_conntrack &&
 		    nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone_equal(ct, zone)) {
+		    nf_ct_zone_equal(ct, zone, NF_CT_DIRECTION(h))) {
 			NF_CT_STAT_INC(net, found);
 			rcu_read_unlock_bh();
 			return 1;
···
 	if (unlikely(!nf_conntrack_hash_rnd)) {
 		init_nf_conntrack_hash_rnd();
 		/* recompute the hash as nf_conntrack_hash_rnd is initialized */
-		hash = hash_conntrack_raw(orig, zone);
+		hash = hash_conntrack_raw(orig);
 	}

 	/* We don't want any race condition at early drop stage */
···
 		if (!nf_ct_zone)
 			goto out_free;
 		nf_ct_zone->id = zone->id;
+		nf_ct_zone->dir = zone->dir;
 	}
 #endif
 	/* Because we use RCU lookups, we set ct_general.use to zero before
···
 	/* look for tuple match */
 	zone = nf_ct_zone_tmpl(tmpl);
-	hash = hash_conntrack_raw(&tuple, zone);
+	hash = hash_conntrack_raw(&tuple);
 	h = __nf_conntrack_find_get(net, zone, &tuple, hash);
 	if (!h) {
 		h = init_conntrack(net, tmpl, &tuple, l3proto, l4proto,
···
 /* Built-in default zone used e.g. by modules. */
 const struct nf_conntrack_zone nf_ct_zone_dflt = {
 	.id	= NF_CT_DEFAULT_ZONE_ID,
+	.dir	= NF_CT_DEFAULT_ZONE_DIR,
 };
 EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
···
 				  struct nf_conntrack_tuple_hash, hnnode);
 		ct = nf_ct_tuplehash_to_ctrack(h);
 		hlist_nulls_del_rcu(&h->hnnode);
-		bucket = __hash_conntrack(&h->tuple, nf_ct_zone(ct),
-					  hashsize);
+		bucket = __hash_conntrack(&h->tuple, hashsize);
 		hlist_nulls_add_head_rcu(&h->hnnode, &hash[bucket]);
 	}
 }
+4 -4
net/netfilter/nf_conntrack_expect.c
···
 	h = nf_ct_expect_dst_hash(tuple);
 	hlist_for_each_entry_rcu(i, &net->ct.expect_hash[h], hnode) {
 		if (nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) &&
-		    nf_ct_zone_equal(i->master, zone))
+		    nf_ct_zone_equal_any(i->master, zone))
 			return i;
 	}
 	return NULL;
···
 	hlist_for_each_entry(i, &net->ct.expect_hash[h], hnode) {
 		if (!(i->flags & NF_CT_EXPECT_INACTIVE) &&
 		    nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) &&
-		    nf_ct_zone_equal(i->master, zone)) {
+		    nf_ct_zone_equal_any(i->master, zone)) {
 			exp = i;
 			break;
 		}
···
 	}

 	return nf_ct_tuple_mask_cmp(&a->tuple, &b->tuple, &intersect_mask) &&
-	       nf_ct_zone_equal(a->master, nf_ct_zone(b->master));
+	       nf_ct_zone_equal_any(a->master, nf_ct_zone(b->master));
 }

 static inline int expect_matches(const struct nf_conntrack_expect *a,
···
 	return a->master == b->master && a->class == b->class &&
 	       nf_ct_tuple_equal(&a->tuple, &b->tuple) &&
 	       nf_ct_tuple_mask_equal(&a->mask, &b->mask) &&
-	       nf_ct_zone_equal(a->master, nf_ct_zone(b->master));
+	       nf_ct_zone_equal_any(a->master, nf_ct_zone(b->master));
 }

 /* Generally a bad idea to call this: could have matched already. */
+131 -46
net/netfilter/nf_conntrack_netlink.c
···
 }

 static inline int
+ctnetlink_dump_zone_id(struct sk_buff *skb, int attrtype,
+		       const struct nf_conntrack_zone *zone, int dir)
+{
+	if (zone->id == NF_CT_DEFAULT_ZONE_ID || zone->dir != dir)
+		return 0;
+	if (nla_put_be16(skb, attrtype, htons(zone->id)))
+		goto nla_put_failure;
+	return 0;
+
+nla_put_failure:
+	return -1;
+}
+
+static inline int
 ctnetlink_dump_status(struct sk_buff *skb, const struct nf_conn *ct)
 {
 	if (nla_put_be32(skb, CTA_STATUS, htonl(ct->status)))
···
 	nfmsg->version	= NFNETLINK_V0;
 	nfmsg->res_id	= 0;

+	zone = nf_ct_zone(ct);
+
 	nest_parms = nla_nest_start(skb, CTA_TUPLE_ORIG | NLA_F_NESTED);
 	if (!nest_parms)
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL)) < 0)
+		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_ORIG) < 0)
 		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);
···
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_REPLY)) < 0)
 		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_REPL) < 0)
+		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);

-	zone = nf_ct_zone(ct);
-	if (zone->id != NF_CT_DEFAULT_ZONE_ID &&
-	    nla_put_be16(skb, CTA_ZONE, htons(zone->id)))
+	if (ctnetlink_dump_zone_id(skb, CTA_ZONE, zone,
+				   NF_CT_DEFAULT_ZONE_DIR) < 0)
 		goto nla_put_failure;

 	if (ctnetlink_dump_status(skb, ct) < 0 ||
···
 	       + nla_total_size(sizeof(u_int32_t)) /* CTA_MARK */
 #endif
 #ifdef CONFIG_NF_CONNTRACK_ZONES
-	       + nla_total_size(sizeof(u_int16_t)) /* CTA_ZONE */
+	       + nla_total_size(sizeof(u_int16_t)) /* CTA_ZONE|CTA_TUPLE_ZONE */
 #endif
 	       + ctnetlink_proto_size(ct)
 	       + ctnetlink_label_size(ct)
···
 	nfmsg->res_id	= 0;

 	rcu_read_lock();
+	zone = nf_ct_zone(ct);
+
 	nest_parms = nla_nest_start(skb, CTA_TUPLE_ORIG | NLA_F_NESTED);
 	if (!nest_parms)
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL)) < 0)
+		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_ORIG) < 0)
 		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);
···
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_REPLY)) < 0)
 		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_REPL) < 0)
+		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);

-	zone = nf_ct_zone(ct);
-	if (zone->id != NF_CT_DEFAULT_ZONE_ID &&
-	    nla_put_be16(skb, CTA_ZONE, htons(zone->id)))
+	if (ctnetlink_dump_zone_id(skb, CTA_ZONE, zone,
+				   NF_CT_DEFAULT_ZONE_DIR) < 0)
 		goto nla_put_failure;

 	if (ctnetlink_dump_id(skb, ct) < 0)
···
 	return ret;
 }

+static int
+ctnetlink_parse_zone(const struct nlattr *attr,
+		     struct nf_conntrack_zone *zone)
+{
+	zone->id  = NF_CT_DEFAULT_ZONE_ID;
+	zone->dir = NF_CT_DEFAULT_ZONE_DIR;
+
+#ifdef CONFIG_NF_CONNTRACK_ZONES
+	if (attr)
+		zone->id = ntohs(nla_get_be16(attr));
+#else
+	if (attr)
+		return -EOPNOTSUPP;
+#endif
+	return 0;
+}
+
+static int
+ctnetlink_parse_tuple_zone(struct nlattr *attr, enum ctattr_type type,
+			   struct nf_conntrack_zone *zone)
+{
+	int ret;
+
+	if (zone->id != NF_CT_DEFAULT_ZONE_ID)
+		return -EINVAL;
+
+	ret = ctnetlink_parse_zone(attr, zone);
+	if (ret < 0)
+		return ret;
+
+	if (type == CTA_TUPLE_REPLY)
+		zone->dir = NF_CT_ZONE_DIR_REPL;
+	else
+		zone->dir = NF_CT_ZONE_DIR_ORIG;
+
+	return 0;
+}
+
 static const struct nla_policy tuple_nla_policy[CTA_TUPLE_MAX+1] = {
 	[CTA_TUPLE_IP]		= { .type = NLA_NESTED },
 	[CTA_TUPLE_PROTO]	= { .type = NLA_NESTED },
+	[CTA_TUPLE_ZONE]	= { .type = NLA_U16 },
 };

 static int
 ctnetlink_parse_tuple(const struct nlattr * const cda[],
 		      struct nf_conntrack_tuple *tuple,
-		      enum ctattr_type type, u_int8_t l3num)
+		      enum ctattr_type type, u_int8_t l3num,
+		      struct nf_conntrack_zone *zone)
 {
 	struct nlattr *tb[CTA_TUPLE_MAX+1];
 	int err;
···
 	if (err < 0)
 		return err;

+	if (tb[CTA_TUPLE_ZONE]) {
+		if (!zone)
+			return -EINVAL;
+
+		err = ctnetlink_parse_tuple_zone(tb[CTA_TUPLE_ZONE],
+						 type, zone);
+		if (err < 0)
+			return err;
+	}
+
 	/* orig and expect tuples get DIR_ORIGINAL */
 	if (type == CTA_TUPLE_REPLY)
 		tuple->dst.dir = IP_CT_DIR_REPLY;
 	else
 		tuple->dst.dir = IP_CT_DIR_ORIGINAL;

-	return 0;
-}
-
-static int
-ctnetlink_parse_zone(const struct nlattr *attr,
-		     struct nf_conntrack_zone *zone)
-{
-	zone->id = NF_CT_DEFAULT_ZONE_ID;
-
-#ifdef CONFIG_NF_CONNTRACK_ZONES
-	if (attr)
-		zone->id = ntohs(nla_get_be16(attr));
-#else
-	if (attr)
-		return -EOPNOTSUPP;
-#endif
 	return 0;
 }
···
 		return err;

 	if (cda[CTA_TUPLE_ORIG])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG,
+					    u3, &zone);
 	else if (cda[CTA_TUPLE_REPLY])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_REPLY, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_REPLY,
+					    u3, &zone);
 	else {
 		return ctnetlink_flush_conntrack(net, cda,
 						 NETLINK_CB(skb).portid,
···
 		return err;

 	if (cda[CTA_TUPLE_ORIG])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG,
+					    u3, &zone);
 	else if (cda[CTA_TUPLE_REPLY])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_REPLY, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_REPLY,
+					    u3, &zone);
 	else
 		return -EINVAL;
···
 	struct nf_conntrack_tuple_hash *master_h;
 	struct nf_conn *master_ct;

-	err = ctnetlink_parse_tuple(cda, &master, CTA_TUPLE_MASTER, u3);
+	err = ctnetlink_parse_tuple(cda, &master, CTA_TUPLE_MASTER,
+				    u3, NULL);
 	if (err < 0)
 		goto err2;
···
 		return err;

 	if (cda[CTA_TUPLE_ORIG]) {
-		err = ctnetlink_parse_tuple(cda, &otuple, CTA_TUPLE_ORIG, u3);
+		err = ctnetlink_parse_tuple(cda, &otuple, CTA_TUPLE_ORIG,
+					    u3, &zone);
 		if (err < 0)
 			return err;
 	}

 	if (cda[CTA_TUPLE_REPLY]) {
-		err = ctnetlink_parse_tuple(cda, &rtuple, CTA_TUPLE_REPLY, u3);
+		err = ctnetlink_parse_tuple(cda, &rtuple, CTA_TUPLE_REPLY,
+					    u3, &zone);
 		if (err < 0)
 			return err;
 	}
···
 	       + nla_total_size(sizeof(u_int32_t)) /* CTA_MARK */
 #endif
 #ifdef CONFIG_NF_CONNTRACK_ZONES
-	       + nla_total_size(sizeof(u_int16_t)) /* CTA_ZONE */
+	       + nla_total_size(sizeof(u_int16_t)) /* CTA_ZONE|CTA_TUPLE_ZONE */
 #endif
 	       + ctnetlink_proto_size(ct)
 	       ;
···
 	struct nlattr *nest_parms;

 	rcu_read_lock();
+	zone = nf_ct_zone(ct);
+
 	nest_parms = nla_nest_start(skb, CTA_TUPLE_ORIG | NLA_F_NESTED);
 	if (!nest_parms)
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL)) < 0)
+		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_ORIG) < 0)
 		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);
···
 		goto nla_put_failure;
 	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_REPLY)) < 0)
 		goto nla_put_failure;
+	if (ctnetlink_dump_zone_id(skb, CTA_TUPLE_ZONE, zone,
+				   NF_CT_ZONE_DIR_REPL) < 0)
+		goto nla_put_failure;
 	nla_nest_end(skb, nest_parms);

-	zone = nf_ct_zone(ct);
-	if (zone->id != NF_CT_DEFAULT_ZONE_ID &&
-	    nla_put_be16(skb, CTA_ZONE, htons(zone->id)))
+	if (ctnetlink_dump_zone_id(skb, CTA_ZONE, zone,
+				   NF_CT_DEFAULT_ZONE_DIR) < 0)
 		goto nla_put_failure;

 	if (ctnetlink_dump_id(skb, ct) < 0)
···
 	int err;

 	err = ctnetlink_parse_tuple(cda, tuple, CTA_EXPECT_TUPLE,
-				    nf_ct_l3num(ct));
+				    nf_ct_l3num(ct), NULL);
 	if (err < 0)
 		return err;

 	return ctnetlink_parse_tuple(cda, mask, CTA_EXPECT_MASK,
-				     nf_ct_l3num(ct));
+				     nf_ct_l3num(ct), NULL);
 }

 static int
···
 		.done = ctnetlink_exp_done,
 	};

-	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_MASTER, u3);
+	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_MASTER,
+				    u3, NULL);
 	if (err < 0)
 		return err;
···
 		return err;

 	if (cda[CTA_EXPECT_TUPLE])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE,
+					    u3, NULL);
 	else if (cda[CTA_EXPECT_MASTER])
-		err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_MASTER, u3);
+		err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_MASTER,
+					    u3, NULL);
 	else
 		return -EINVAL;
···
 	if (err < 0)
 		return err;

-	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE, u3);
+	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE,
+				    u3, NULL);
 	if (err < 0)
 		return err;
···
 		return -EINVAL;

 	err = ctnetlink_parse_tuple((const struct nlattr * const *)tb,
-				    &nat_tuple, CTA_EXPECT_NAT_TUPLE, u3);
+				    &nat_tuple, CTA_EXPECT_NAT_TUPLE,
+				    u3, NULL);
 	if (err < 0)
 		return err;
···
 	int err;

 	/* caller guarantees that those three CTA_EXPECT_* exist */
-	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE, u3);
+	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE,
+				    u3, NULL);
 	if (err < 0)
 		return err;
-	err = ctnetlink_parse_tuple(cda, &mask, CTA_EXPECT_MASK, u3);
+	err = ctnetlink_parse_tuple(cda, &mask, CTA_EXPECT_MASK,
+				    u3, NULL);
 	if (err < 0)
 		return err;
-	err = ctnetlink_parse_tuple(cda, &master_tuple, CTA_EXPECT_MASTER, u3);
+	err = ctnetlink_parse_tuple(cda, &master_tuple, CTA_EXPECT_MASTER,
+				    u3, NULL);
 	if (err < 0)
 		return err;
···
 	if (err < 0)
 		return err;

-	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE, u3);
+	err = ctnetlink_parse_tuple(cda, &tuple, CTA_EXPECT_TUPLE,
+				    u3, NULL);
 	if (err < 0)
 		return err;
+26 -4
net/netfilter/nf_conntrack_standalone.c
···
 #endif

 #ifdef CONFIG_NF_CONNTRACK_ZONES
-static void ct_show_zone(struct seq_file *s, const struct nf_conn *ct)
+static void ct_show_zone(struct seq_file *s, const struct nf_conn *ct,
+			 int dir)
 {
-	seq_printf(s, "zone=%u ", nf_ct_zone(ct)->id);
+	const struct nf_conntrack_zone *zone = nf_ct_zone(ct);
+
+	if (zone->dir != dir)
+		return;
+	switch (zone->dir) {
+	case NF_CT_DEFAULT_ZONE_DIR:
+		seq_printf(s, "zone=%u ", zone->id);
+		break;
+	case NF_CT_ZONE_DIR_ORIG:
+		seq_printf(s, "zone-orig=%u ", zone->id);
+		break;
+	case NF_CT_ZONE_DIR_REPL:
+		seq_printf(s, "zone-reply=%u ", zone->id);
+		break;
+	default:
+		break;
+	}
 }
 #else
-static inline void ct_show_zone(struct seq_file *s, const struct nf_conn *ct)
+static inline void ct_show_zone(struct seq_file *s, const struct nf_conn *ct,
+				int dir)
 {
 }
 #endif
···
 	print_tuple(s, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
 		    l3proto, l4proto);

+	ct_show_zone(s, ct, NF_CT_ZONE_DIR_ORIG);
+
 	if (seq_has_overflowed(s))
 		goto release;
···
 	print_tuple(s, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
 		    l3proto, l4proto);
+
+	ct_show_zone(s, ct, NF_CT_ZONE_DIR_REPL);

 	if (seq_print_acct(s, ct, IP_CT_DIR_REPLY))
 		goto release;
···
 #endif

 	ct_show_secctx(s, ct);
-	ct_show_zone(s, ct);
+	ct_show_zone(s, ct, NF_CT_DEFAULT_ZONE_DIR);
 	ct_show_delta_time(s, ct);

 	seq_printf(s, "use=%u\n", atomic_read(&ct->ct_general.use));
+6 -7
net/netfilter/nf_nat_core.c
···

 /* We keep an extra hash for each conntrack, for fast searching. */
 static inline unsigned int
-hash_by_src(const struct net *net,
-	    const struct nf_conntrack_zone *zone,
-	    const struct nf_conntrack_tuple *tuple)
+hash_by_src(const struct net *net, const struct nf_conntrack_tuple *tuple)
 {
 	unsigned int hash;

 	/* Original src, to ensure we map it consistently if poss. */
 	hash = jhash2((u32 *)&tuple->src, sizeof(tuple->src) / sizeof(u32),
-		      tuple->dst.protonum ^ zone->id ^ nf_conntrack_hash_rnd);
+		      tuple->dst.protonum ^ nf_conntrack_hash_rnd);

 	return reciprocal_scale(hash, net->ct.nat_htable_size);
 }
···
 		     struct nf_conntrack_tuple *result,
 		     const struct nf_nat_range *range)
 {
-	unsigned int h = hash_by_src(net, zone, tuple);
+	unsigned int h = hash_by_src(net, tuple);
 	const struct nf_conn_nat *nat;
 	const struct nf_conn *ct;

 	hlist_for_each_entry_rcu(nat, &net->ct.nat_bysource[h], bysource) {
 		ct = nat->ct;
-		if (same_src(ct, tuple) && nf_ct_zone_equal(ct, zone)) {
+		if (same_src(ct, tuple) &&
+		    nf_ct_zone_equal(ct, zone, IP_CT_DIR_ORIGINAL)) {
 			/* Copy source part from reply tuple. */
 			nf_ct_invert_tuplepr(result,
 					     &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
···
 	if (maniptype == NF_NAT_MANIP_SRC) {
 		unsigned int srchash;

-		srchash = hash_by_src(net, nf_ct_zone(ct),
+		srchash = hash_by_src(net,
 				      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
 		spin_lock_bh(&nf_nat_lock);
 		/* nf_conntrack_alter_reply might re-allocate extension aera */
+16 -1
net/netfilter/xt_CT.c
···
 #endif
 }

+static u16 xt_ct_flags_to_dir(const struct xt_ct_target_info_v1 *info)
+{
+	switch (info->flags & (XT_CT_ZONE_DIR_ORIG |
+			       XT_CT_ZONE_DIR_REPL)) {
+	case XT_CT_ZONE_DIR_ORIG:
+		return NF_CT_ZONE_DIR_ORIG;
+	case XT_CT_ZONE_DIR_REPL:
+		return NF_CT_ZONE_DIR_REPL;
+	default:
+		return NF_CT_DEFAULT_ZONE_DIR;
+	}
+}
+
 static int xt_ct_tg_check(const struct xt_tgchk_param *par,
 			  struct xt_ct_target_info_v1 *info)
 {
···
 	}

 #ifndef CONFIG_NF_CONNTRACK_ZONES
-	if (info->zone)
+	if (info->zone || info->flags & (XT_CT_ZONE_DIR_ORIG |
+					 XT_CT_ZONE_DIR_REPL))
 		goto err1;
 #endif
···

 	memset(&zone, 0, sizeof(zone));
 	zone.id = info->zone;
+	zone.dir = xt_ct_flags_to_dir(info);

 	ct = nf_ct_tmpl_alloc(par->net, &zone, GFP_KERNEL);
 	ret = PTR_ERR(ct);
+1
net/sched/act_connmark.c
···
 		goto out;

 	zone.id = ca->zone;
+	zone.dir = NF_CT_DEFAULT_ZONE_DIR;

 	thash = nf_conntrack_find_get(dev_net(skb->dev), &zone, &tuple);
 	if (!thash)