Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following large patchset contains Netfilter updates for your
net-next tree. My initial intention was to send you this in two
batches, but by the time I had looked things over twice the full set
had already piled up on top of me.

Several updates for IPVS from Marco Angaroni:

1) Allow SIP connections originating from real-servers to be load
balanced by the SIP persistence engine as is already implemented
in the other direction.

2) Release connections immediately for One-packet-scheduling (OPS)
in IPVS, instead of deferring the release to a timer and RCU callback.

3) Skip deleting a conntrack for every packet in OPS, and don't call
nf_conntrack_alter_reply() since no reply is expected.

4) Enable drop on exhaustion for OPS + SIP persistence.

Miscellaneous conntrack updates from Florian Westphal, including a fix
for hash resizing:

5) Move conntrack generation counter out of conntrack pernet structure
since this is only used by the init_ns to allow hash resizing.

6) Use get_random_once() from the packet path to initialize the hash
random seed, instead of our own open-coded seeding.

7) Don't disable BH from ____nf_conntrack_find() for statistics,
use NF_CT_STAT_INC_ATOMIC() instead.

8) Fix lookup race during conntrack hash resizing.

9) Introduce clash resolution on conntrack insertion for connectionless
protocols.

Then, Florian's netns rework to get rid of the per-netns conntrack
table, so that one single table is used for all namespaces. There was
consensus on this change during the NFWS 2015 and, on top of that, the
per-netns table has recently been pointed out as a source of multiple
problems with unprivileged netns:

11) Use a single conntrack hashtable for all namespaces. Include netns
in object comparisons and make it part of the hash calculation.
Adapt early_drop() to consider netns.

12) Use single expectation and NAT hashtable for all namespaces.

13) Use a single slab cache for all namespaces for conntrack objects.

14) Skip the full table scan from nf_ct_iterate_cleanup() if the pernet
conntrack counter tells us the table is empty (i.e. it is zero).

Fixes for nf_tables interval set element handling, support for setting
conntrack connlabels, and set names of up to 32 bytes:

15) Parse element flags from the element deletion path and pass them up
to the backend set implementation.

16) Allow adjacent intervals in the rbtree set type for dynamic interval
updates.

17) Add support to set connlabel from nf_tables, from Florian Westphal.

18) Allow set names up to 32 bytes in nf_tables.

Several x_tables fixes and updates:

19) Fix incorrect use of IS_ERR_VALUE() in x_tables, original patch
from Andrzej Hajda.

And finally, miscellaneous netfilter updates such as:

20) Disable automatic helper assignment by default. Note this proc knob
was introduced by a9006892643a ("netfilter: nf_ct_helper: allow to
disable automatic helper assignment") 4 years ago to start moving
towards explicit conntrack helper configuration via iptables CT
target.

21) Get rid of obsolete and inconsistent debugging instrumentation
in x_tables.

22) Remove unnecessary check for null after ip6_route_output().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

+918 -932
include/linux/netfilter/x_tables.h | +3 -3

···
  * allows us to return 0 for single core systems without forcing
  * callers to deal with SMP vs. NONSMP issues.
  */
-static inline u64 xt_percpu_counter_alloc(void)
+static inline unsigned long xt_percpu_counter_alloc(void)
 {
 	if (nr_cpu_ids > 1) {
 		void __percpu *res = __alloc_percpu(sizeof(struct xt_counters),
 						    sizeof(struct xt_counters));

 		if (res == NULL)
-			return (u64) -ENOMEM;
+			return -ENOMEM;

-		return (u64) (__force unsigned long) res;
+		return (__force unsigned long) res;
 	}

 	return 0;
include/net/ip_vs.h | +17

···
 	u32 (*hashkey_raw)(const struct ip_vs_conn_param *p, u32 initval,
 			   bool inverse);
 	int (*show_pe_data)(const struct ip_vs_conn *cp, char *buf);
+	/* create connections for real-server outgoing packets */
+	struct ip_vs_conn* (*conn_out)(struct ip_vs_service *svc,
+				       struct ip_vs_dest *dest,
+				       struct sk_buff *skb,
+				       const struct ip_vs_iphdr *iph,
+				       __be16 dport, __be16 cport);
 };

 /* The application module object (a.k.a. app incarnation) */
···
 	/* Service counters */
 	atomic_t		ftpsvc_counter;
 	atomic_t		nullsvc_counter;
+	atomic_t		conn_out_counter;

 #ifdef CONFIG_SYSCTL
 	/* 1/rate drop and drop-entry variables */
···
  */
 const char *ip_vs_proto_name(unsigned int proto);
 void ip_vs_init_hash_table(struct list_head *table, int rows);
+struct ip_vs_conn *ip_vs_new_conn_out(struct ip_vs_service *svc,
+				      struct ip_vs_dest *dest,
+				      struct sk_buff *skb,
+				      const struct ip_vs_iphdr *iph,
+				      __be16 dport,
+				      __be16 cport);
 #define IP_VS_INIT_HASH_TABLE(t) ip_vs_init_hash_table((t), ARRAY_SIZE((t)))

 #define IP_VS_APP_TYPE_FTP	1
···

 bool ip_vs_has_real_service(struct netns_ipvs *ipvs, int af, __u16 protocol,
 			    const union nf_inet_addr *daddr, __be16 dport);
+
+struct ip_vs_dest *
+ip_vs_find_real_service(struct netns_ipvs *ipvs, int af, __u16 protocol,
+			const union nf_inet_addr *daddr, __be16 dport);

 int ip_vs_use_count_inc(void);
 void ip_vs_use_count_dec(void);
include/net/netfilter/nf_conntrack.h | -2

···
 int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp);
 extern unsigned int nf_conntrack_htable_size;
 extern unsigned int nf_conntrack_max;
-extern unsigned int nf_conntrack_hash_rnd;
-void init_nf_conntrack_hash_rnd(void);

 struct nf_conn *nf_ct_tmpl_alloc(struct net *net,
 				 const struct nf_conntrack_zone *zone,
include/net/netfilter/nf_conntrack_core.h | +1

···

 #define CONNTRACK_LOCKS 1024

+extern struct hlist_nulls_head *nf_conntrack_hash;
 extern spinlock_t nf_conntrack_locks[CONNTRACK_LOCKS];
 void nf_conntrack_lock(spinlock_t *lock);

include/net/netfilter/nf_conntrack_expect.h | +1

···

 extern unsigned int nf_ct_expect_hsize;
 extern unsigned int nf_ct_expect_max;
+extern struct hlist_head *nf_ct_expect_hash;

 struct nf_conntrack_expect {
 	/* Conntrack expectation list member */
include/net/netfilter/nf_conntrack_l4proto.h | +3

···
 	/* L4 Protocol number. */
 	u_int8_t l4proto;

+	/* Resolve clashes on insertion races. */
+	bool allow_clash;
+
 	/* Try to fill in the third arg: dataoff is offset past network protocol
 	   hdr. Return true if possible. */
 	bool (*pkt_to_tuple)(const struct sk_buff *skb, unsigned int dataoff,
include/net/netfilter/nf_tables.h | +1 -1

···
 struct nft_set {
 	struct list_head		list;
 	struct list_head		bindings;
-	char				name[IFNAMSIZ];
+	char				name[NFT_SET_MAXNAMELEN];
 	u32				ktype;
 	u32				dtype;
 	u32				size;
include/net/netns/conntrack.h | -10

···
 	struct ctl_table_header	*event_sysctl_header;
 	struct ctl_table_header	*helper_sysctl_header;
 #endif
-	char			*slabname;
 	unsigned int		sysctl_log_invalid; /* Log invalid packets */
 	int			sysctl_events;
 	int			sysctl_acct;
···
 	int			sysctl_tstamp;
 	int			sysctl_checksum;

-	unsigned int		htable_size;
-	seqcount_t		generation;
-	struct kmem_cache	*nf_conntrack_cachep;
-	struct hlist_nulls_head	*hash;
-	struct hlist_head	*expect_hash;
 	struct ct_pcpu __percpu *pcpu_lists;
 	struct ip_conntrack_stat __percpu *stat;
 	struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb;
···
 #if defined(CONFIG_NF_CONNTRACK_LABELS)
 	unsigned int		labels_used;
 	u8			label_words;
-#endif
-#ifdef CONFIG_NF_NAT_NEEDED
-	struct hlist_head	*nat_bysource;
-	unsigned int		nat_htable_size;
 #endif
 };
 #endif
include/uapi/linux/netfilter/nf_tables.h | +1

···

 #define NFT_TABLE_MAXNAMELEN	32
 #define NFT_CHAIN_MAXNAMELEN	32
+#define NFT_SET_MAXNAMELEN	32
 #define NFT_USERDATA_MAXLEN	256

 /**
net/ipv4/netfilter/arp_tables.c | +40 -183

···
 MODULE_AUTHOR("David S. Miller <davem@redhat.com>");
 MODULE_DESCRIPTION("arptables core");

-/*#define DEBUG_ARP_TABLES*/
-/*#define DEBUG_ARP_TABLES_USER*/
-
-#ifdef DEBUG_ARP_TABLES
-#define dprintf(format, args...) pr_debug(format, ## args)
-#else
-#define dprintf(format, args...)
-#endif
-
-#ifdef DEBUG_ARP_TABLES_USER
-#define duprintf(format, args...) pr_debug(format, ## args)
-#else
-#define duprintf(format, args...)
-#endif
-
-#ifdef CONFIG_NETFILTER_DEBUG
-#define ARP_NF_ASSERT(x)	WARN_ON(!(x))
-#else
-#define ARP_NF_ASSERT(x)
-#endif
-
 void *arpt_alloc_initial_table(const struct xt_table *info)
 {
 	return xt_alloc_initial_table(arpt, ARPT);
···
 #define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg)))

 	if (FWINV((arphdr->ar_op & arpinfo->arpop_mask) != arpinfo->arpop,
-		  ARPT_INV_ARPOP)) {
-		dprintf("ARP operation field mismatch.\n");
-		dprintf("ar_op: %04x info->arpop: %04x info->arpop_mask: %04x\n",
-			arphdr->ar_op, arpinfo->arpop, arpinfo->arpop_mask);
+		  ARPT_INV_ARPOP))
 		return 0;
-	}

 	if (FWINV((arphdr->ar_hrd & arpinfo->arhrd_mask) != arpinfo->arhrd,
-		  ARPT_INV_ARPHRD)) {
-		dprintf("ARP hardware address format mismatch.\n");
-		dprintf("ar_hrd: %04x info->arhrd: %04x info->arhrd_mask: %04x\n",
-			arphdr->ar_hrd, arpinfo->arhrd, arpinfo->arhrd_mask);
+		  ARPT_INV_ARPHRD))
 		return 0;
-	}

 	if (FWINV((arphdr->ar_pro & arpinfo->arpro_mask) != arpinfo->arpro,
-		  ARPT_INV_ARPPRO)) {
-		dprintf("ARP protocol address format mismatch.\n");
-		dprintf("ar_pro: %04x info->arpro: %04x info->arpro_mask: %04x\n",
-			arphdr->ar_pro, arpinfo->arpro, arpinfo->arpro_mask);
+		  ARPT_INV_ARPPRO))
 		return 0;
-	}

 	if (FWINV((arphdr->ar_hln & arpinfo->arhln_mask) != arpinfo->arhln,
-		  ARPT_INV_ARPHLN)) {
-		dprintf("ARP hardware address length mismatch.\n");
-		dprintf("ar_hln: %02x info->arhln: %02x info->arhln_mask: %02x\n",
-			arphdr->ar_hln, arpinfo->arhln, arpinfo->arhln_mask);
+		  ARPT_INV_ARPHLN))
 		return 0;
-	}

 	src_devaddr = arpptr;
 	arpptr += dev->addr_len;
···
 	if (FWINV(arp_devaddr_compare(&arpinfo->src_devaddr, src_devaddr, dev->addr_len),
 		  ARPT_INV_SRCDEVADDR) ||
 	    FWINV(arp_devaddr_compare(&arpinfo->tgt_devaddr, tgt_devaddr, dev->addr_len),
-		  ARPT_INV_TGTDEVADDR)) {
-		dprintf("Source or target device address mismatch.\n");
-
+		  ARPT_INV_TGTDEVADDR))
 		return 0;
-	}

 	if (FWINV((src_ipaddr & arpinfo->smsk.s_addr) != arpinfo->src.s_addr,
 		  ARPT_INV_SRCIP) ||
 	    FWINV(((tgt_ipaddr & arpinfo->tmsk.s_addr) != arpinfo->tgt.s_addr),
-		  ARPT_INV_TGTIP)) {
-		dprintf("Source or target IP address mismatch.\n");
-
-		dprintf("SRC: %pI4. Mask: %pI4. Target: %pI4.%s\n",
-			&src_ipaddr,
-			&arpinfo->smsk.s_addr,
-			&arpinfo->src.s_addr,
-			arpinfo->invflags & ARPT_INV_SRCIP ? " (INV)" : "");
-		dprintf("TGT: %pI4 Mask: %pI4 Target: %pI4.%s\n",
-			&tgt_ipaddr,
-			&arpinfo->tmsk.s_addr,
-			&arpinfo->tgt.s_addr,
-			arpinfo->invflags & ARPT_INV_TGTIP ? " (INV)" : "");
+		  ARPT_INV_TGTIP))
 		return 0;
-	}

 	/* Look for ifname matches.  */
 	ret = ifname_compare(indev, arpinfo->iniface, arpinfo->iniface_mask);

-	if (FWINV(ret != 0, ARPT_INV_VIA_IN)) {
-		dprintf("VIA in mismatch (%s vs %s).%s\n",
-			indev, arpinfo->iniface,
-			arpinfo->invflags & ARPT_INV_VIA_IN ? " (INV)" : "");
+	if (FWINV(ret != 0, ARPT_INV_VIA_IN))
 		return 0;
-	}

 	ret = ifname_compare(outdev, arpinfo->outiface, arpinfo->outiface_mask);

-	if (FWINV(ret != 0, ARPT_INV_VIA_OUT)) {
-		dprintf("VIA out mismatch (%s vs %s).%s\n",
-			outdev, arpinfo->outiface,
-			arpinfo->invflags & ARPT_INV_VIA_OUT ? " (INV)" : "");
+	if (FWINV(ret != 0, ARPT_INV_VIA_OUT))
 		return 0;
-	}

 	return 1;
 #undef FWINV
···

 static inline int arp_checkentry(const struct arpt_arp *arp)
 {
-	if (arp->flags & ~ARPT_F_MASK) {
-		duprintf("Unknown flag bits set: %08X\n",
-			 arp->flags & ~ARPT_F_MASK);
+	if (arp->flags & ~ARPT_F_MASK)
 		return 0;
-	}
-	if (arp->invflags & ~ARPT_INV_MASK) {
-		duprintf("Unknown invflag bits set: %08X\n",
-			 arp->invflags & ~ARPT_INV_MASK);
+	if (arp->invflags & ~ARPT_INV_MASK)
 		return 0;
-	}

 	return 1;
 }
···
 		= (void *)arpt_get_target_c(e);
 		int visited = e->comefrom & (1 << hook);

-		if (e->comefrom & (1 << NF_ARP_NUMHOOKS)) {
-			pr_notice("arptables: loop hook %u pos %u %08X.\n",
-				  hook, pos, e->comefrom);
+		if (e->comefrom & (1 << NF_ARP_NUMHOOKS))
 			return 0;
-		}
+
 		e->comefrom
 			|= ((1 << hook) | (1 << NF_ARP_NUMHOOKS));
···

 			if ((strcmp(t->target.u.user.name,
 				    XT_STANDARD_TARGET) == 0) &&
-			    t->verdict < -NF_MAX_VERDICT - 1) {
-				duprintf("mark_source_chains: bad "
-					"negative verdict (%i)\n",
-					t->verdict);
+			    t->verdict < -NF_MAX_VERDICT - 1)
 				return 0;
-			}

 			/* Return: backtrack through the last
 			 * big jump.
···
 				   XT_STANDARD_TARGET) == 0 &&
 				    newpos >= 0) {
 					/* This a jump; chase it. */
-					duprintf("Jump rule %u -> %u\n",
-						 pos, newpos);
 					e = (struct arpt_entry *)
 						(entry0 + newpos);
 					if (!find_jump_target(newinfo, e))
···
 				pos = newpos;
 			}
 		}
-next:
-		duprintf("Finished chain %u\n", hook);
+next:		;
 	}
 	return 1;
 }
···
 static inline int check_target(struct arpt_entry *e, const char *name)
 {
 	struct xt_entry_target *t = arpt_get_target(e);
-	int ret;
 	struct xt_tgchk_param par = {
 		.table     = name,
 		.entryinfo = e,
···
 		.family    = NFPROTO_ARP,
 	};

-	ret = xt_check_target(&par, t->u.target_size - sizeof(*t), 0, false);
-	if (ret < 0) {
-		duprintf("arp_tables: check failed for `%s'.\n",
-			 t->u.kernel.target->name);
-		return ret;
-	}
-	return 0;
+	return xt_check_target(&par, t->u.target_size - sizeof(*t), 0, false);
 }

 static inline int
···
 {
 	struct xt_entry_target *t;
 	struct xt_target *target;
+	unsigned long pcnt;
 	int ret;

-	e->counters.pcnt = xt_percpu_counter_alloc();
-	if (IS_ERR_VALUE(e->counters.pcnt))
+	pcnt = xt_percpu_counter_alloc();
+	if (IS_ERR_VALUE(pcnt))
 		return -ENOMEM;
+	e->counters.pcnt = pcnt;

 	t = arpt_get_target(e);
 	target = xt_request_find_target(NFPROTO_ARP, t->u.user.name,
 					t->u.user.revision);
 	if (IS_ERR(target)) {
-		duprintf("find_check_entry: `%s' not found\n", t->u.user.name);
 		ret = PTR_ERR(target);
 		goto out;
 	}
···

 	if ((unsigned long)e % __alignof__(struct arpt_entry) != 0 ||
 	    (unsigned char *)e + sizeof(struct arpt_entry) >= limit ||
-	    (unsigned char *)e + e->next_offset > limit) {
-		duprintf("Bad offset %p\n", e);
+	    (unsigned char *)e + e->next_offset > limit)
 		return -EINVAL;
-	}

 	if (e->next_offset
-	    < sizeof(struct arpt_entry) + sizeof(struct xt_entry_target)) {
-		duprintf("checking: element %p size %u\n",
-			 e, e->next_offset);
+	    < sizeof(struct arpt_entry) + sizeof(struct xt_entry_target))
 		return -EINVAL;
-	}

 	if (!arp_checkentry(&e->arp))
 		return -EINVAL;
···
 		if ((unsigned char *)e - base == hook_entries[h])
 			newinfo->hook_entry[h] = hook_entries[h];
 		if ((unsigned char *)e - base == underflows[h]) {
-			if (!check_underflow(e)) {
-				pr_debug("Underflows must be unconditional and "
-					 "use the STANDARD target with "
-					 "ACCEPT/DROP\n");
+			if (!check_underflow(e))
 				return -EINVAL;
-			}
+
 			newinfo->underflow[h] = underflows[h];
 		}
 	}
···
 		newinfo->underflow[i] = 0xFFFFFFFF;
 	}

-	duprintf("translate_table: size %u\n", newinfo->size);
 	i = 0;

 	/* Walk through entries, checking offsets. */
···
 			   XT_ERROR_TARGET) == 0)
 			++newinfo->stacksize;
 	}
-	duprintf("translate_table: ARPT_ENTRY_ITERATE gives %d\n", ret);
 	if (ret != 0)
 		return ret;

-	if (i != repl->num_entries) {
-		duprintf("translate_table: %u not %u entries\n",
-			 i, repl->num_entries);
+	if (i != repl->num_entries)
 		return -EINVAL;
-	}

 	/* Check hooks all assigned */
 	for (i = 0; i < NF_ARP_NUMHOOKS; i++) {
 		/* Only hooks which are valid */
 		if (!(repl->valid_hooks & (1 << i)))
 			continue;
-		if (newinfo->hook_entry[i] == 0xFFFFFFFF) {
-			duprintf("Invalid hook entry %u %u\n",
-				 i, repl->hook_entry[i]);
+		if (newinfo->hook_entry[i] == 0xFFFFFFFF)
 			return -EINVAL;
-		}
-		if (newinfo->underflow[i] == 0xFFFFFFFF) {
-			duprintf("Invalid underflow %u %u\n",
-				 i, repl->underflow[i]);
+		if (newinfo->underflow[i] == 0xFFFFFFFF)
 			return -EINVAL;
-		}
 	}

 	if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
···
 	struct xt_table *t;
 	int ret;

-	if (*len != sizeof(struct arpt_getinfo)) {
-		duprintf("length %u != %Zu\n", *len,
-			 sizeof(struct arpt_getinfo));
+	if (*len != sizeof(struct arpt_getinfo))
 		return -EINVAL;
-	}

 	if (copy_from_user(name, user, sizeof(name)) != 0)
 		return -EFAULT;
···
 	struct arpt_get_entries get;
 	struct xt_table *t;

-	if (*len < sizeof(get)) {
-		duprintf("get_entries: %u < %Zu\n", *len, sizeof(get));
+	if (*len < sizeof(get))
 		return -EINVAL;
-	}
 	if (copy_from_user(&get, uptr, sizeof(get)) != 0)
 		return -EFAULT;
-	if (*len != sizeof(struct arpt_get_entries) + get.size) {
-		duprintf("get_entries: %u != %Zu\n", *len,
-			 sizeof(struct arpt_get_entries) + get.size);
+	if (*len != sizeof(struct arpt_get_entries) + get.size)
 		return -EINVAL;
-	}
+
 	get.name[sizeof(get.name) - 1] = '\0';

 	t = xt_find_table_lock(net, NFPROTO_ARP, get.name);
 	if (!IS_ERR_OR_NULL(t)) {
 		const struct xt_table_info *private = t->private;

-		duprintf("t->private->number = %u\n",
-			 private->number);
 		if (get.size == private->size)
 			ret = copy_entries_to_user(private->size,
 						   t, uptr->entrytable);
-		else {
-			duprintf("get_entries: I've got %u not %u!\n",
-				 private->size, get.size);
+		else
 			ret = -EAGAIN;
-		}
+
 		module_put(t->me);
 		xt_table_unlock(t);
 	} else
···

 	/* You lied! */
 	if (valid_hooks != t->valid_hooks) {
-		duprintf("Valid hook crap: %08X vs %08X\n",
-			 valid_hooks, t->valid_hooks);
 		ret = -EINVAL;
 		goto put_module;
 	}
···
 		goto put_module;

 	/* Update module usage count based on number of rules */
-	duprintf("do_replace: oldnum=%u, initnum=%u, newnum=%u\n",
-		 oldinfo->number, oldinfo->initial_entries, newinfo->number);
 	if ((oldinfo->number > oldinfo->initial_entries) ||
 	    (newinfo->number <= oldinfo->initial_entries))
 		module_put(t->me);
···
 	ret = translate_table(newinfo, loc_cpu_entry, &tmp);
 	if (ret != 0)
 		goto free_newinfo;
-
-	duprintf("arp_tables: Translated table\n");

 	ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
 			   tmp.num_counters, tmp.counters);
···
 	unsigned int entry_offset;
 	int ret, off;

-	duprintf("check_compat_entry_size_and_hooks %p\n", e);
 	if ((unsigned long)e % __alignof__(struct compat_arpt_entry) != 0 ||
 	    (unsigned char *)e + sizeof(struct compat_arpt_entry) >= limit ||
-	    (unsigned char *)e + e->next_offset > limit) {
-		duprintf("Bad offset %p, limit = %p\n", e, limit);
+	    (unsigned char *)e + e->next_offset > limit)
 		return -EINVAL;
-	}

 	if (e->next_offset < sizeof(struct compat_arpt_entry) +
-			     sizeof(struct compat_xt_entry_target)) {
-		duprintf("checking: element %p size %u\n",
-			 e, e->next_offset);
+			     sizeof(struct compat_xt_entry_target))
 		return -EINVAL;
-	}

 	if (!arp_checkentry(&e->arp))
 		return -EINVAL;
···
 	target = xt_request_find_target(NFPROTO_ARP, t->u.user.name,
 					t->u.user.revision);
 	if (IS_ERR(target)) {
-		duprintf("check_compat_entry_size_and_hooks: `%s' not found\n",
-			 t->u.user.name);
 		ret = PTR_ERR(target);
 		goto out;
 	}
···
 	size = compatr->size;
 	info->number = compatr->num_entries;

-	duprintf("translate_compat_table: size %u\n", info->size);
 	j = 0;
 	xt_compat_lock(NFPROTO_ARP);
 	xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries);
···
 	}

 	ret = -EINVAL;
-	if (j != compatr->num_entries) {
-		duprintf("translate_compat_table: %u not %u entries\n",
-			 j, compatr->num_entries);
+	if (j != compatr->num_entries)
 		goto out_unlock;
-	}

 	ret = -ENOMEM;
 	newinfo = xt_alloc_table_info(size);
···
 	if (ret != 0)
 		goto free_newinfo;

-	duprintf("compat_do_replace: Translated table\n");
-
 	ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
 			   tmp.num_counters, compat_ptr(tmp.counters));
 	if (ret)
···
 		break;

 	default:
-		duprintf("do_arpt_set_ctl: unknown request %i\n", cmd);
 		ret = -EINVAL;
 	}

···
 	struct compat_arpt_get_entries get;
 	struct xt_table *t;

-	if (*len < sizeof(get)) {
-		duprintf("compat_get_entries: %u < %zu\n", *len, sizeof(get));
+	if (*len < sizeof(get))
 		return -EINVAL;
-	}
 	if (copy_from_user(&get, uptr, sizeof(get)) != 0)
 		return -EFAULT;
-	if (*len != sizeof(struct compat_arpt_get_entries) + get.size) {
-		duprintf("compat_get_entries: %u != %zu\n",
-			 *len, sizeof(get) + get.size);
+	if (*len != sizeof(struct compat_arpt_get_entries) + get.size)
 		return -EINVAL;
-	}
+
 	get.name[sizeof(get.name) - 1] = '\0';

 	xt_compat_lock(NFPROTO_ARP);
···
 		const struct xt_table_info *private = t->private;
 		struct xt_table_info info;

-		duprintf("t->private->number = %u\n", private->number);
 		ret = compat_table_info(private, &info);
 		if (!ret && get.size == info.size) {
 			ret = compat_copy_entries_to_user(private->size,
 							  t, uptr->entrytable);
-		} else if (!ret) {
-			duprintf("compat_get_entries: I've got %u not %u!\n",
-				 private->size, get.size);
+		} else if (!ret)
 			ret = -EAGAIN;
-		}
+
 		xt_compat_flush_offsets(NFPROTO_ARP);
 		module_put(t->me);
 		xt_table_unlock(t);
···
 		break;

 	default:
-		duprintf("do_arpt_set_ctl: unknown request %i\n", cmd);
 		ret = -EINVAL;
 	}

···
 	}

 	default:
-		duprintf("do_arpt_get_ctl: unknown request %i\n", cmd);
 		ret = -EINVAL;
 	}

···
 	memcpy(loc_cpu_entry, repl->entries, repl->size);

 	ret = translate_table(newinfo, loc_cpu_entry, repl);
-	duprintf("arpt_register_table: translate table gives %d\n", ret);
 	if (ret != 0)
 		goto out_free;

net/ipv4/netfilter/ip_tables.c | +44 -206

···
 MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>");
 MODULE_DESCRIPTION("IPv4 packet filter");

-/*#define DEBUG_IP_FIREWALL*/
-/*#define DEBUG_ALLOW_ALL*/ /* Useful for remote debugging */
-/*#define DEBUG_IP_FIREWALL_USER*/
-
-#ifdef DEBUG_IP_FIREWALL
-#define dprintf(format, args...) pr_info(format , ## args)
-#else
-#define dprintf(format, args...)
-#endif
-
-#ifdef DEBUG_IP_FIREWALL_USER
-#define duprintf(format, args...) pr_info(format , ## args)
-#else
-#define duprintf(format, args...)
-#endif
-
 #ifdef CONFIG_NETFILTER_DEBUG
 #define IP_NF_ASSERT(x)	WARN_ON(!(x))
 #else
 #define IP_NF_ASSERT(x)
 #endif
-
-#if 0
-/* All the better to debug you with... */
-#define static
-#define inline
-#endif

 void *ipt_alloc_initial_table(const struct xt_table *info)
···
 	if (FWINV((ip->saddr&ipinfo->smsk.s_addr) != ipinfo->src.s_addr,
 		  IPT_INV_SRCIP) ||
 	    FWINV((ip->daddr&ipinfo->dmsk.s_addr) != ipinfo->dst.s_addr,
-		  IPT_INV_DSTIP)) {
-		dprintf("Source or dest mismatch.\n");
-
-		dprintf("SRC: %pI4. Mask: %pI4. Target: %pI4.%s\n",
-			&ip->saddr, &ipinfo->smsk.s_addr, &ipinfo->src.s_addr,
-			ipinfo->invflags & IPT_INV_SRCIP ? " (INV)" : "");
-		dprintf("DST: %pI4 Mask: %pI4 Target: %pI4.%s\n",
-			&ip->daddr, &ipinfo->dmsk.s_addr, &ipinfo->dst.s_addr,
-			ipinfo->invflags & IPT_INV_DSTIP ? " (INV)" : "");
+		  IPT_INV_DSTIP))
 		return false;
-	}

 	ret = ifname_compare_aligned(indev, ipinfo->iniface, ipinfo->iniface_mask);

-	if (FWINV(ret != 0, IPT_INV_VIA_IN)) {
-		dprintf("VIA in mismatch (%s vs %s).%s\n",
-			indev, ipinfo->iniface,
-			ipinfo->invflags & IPT_INV_VIA_IN ? " (INV)" : "");
+	if (FWINV(ret != 0, IPT_INV_VIA_IN))
 		return false;
-	}

 	ret = ifname_compare_aligned(outdev, ipinfo->outiface, ipinfo->outiface_mask);

-	if (FWINV(ret != 0, IPT_INV_VIA_OUT)) {
-		dprintf("VIA out mismatch (%s vs %s).%s\n",
-			outdev, ipinfo->outiface,
-			ipinfo->invflags & IPT_INV_VIA_OUT ? " (INV)" : "");
+	if (FWINV(ret != 0, IPT_INV_VIA_OUT))
 		return false;
-	}

 	/* Check specific protocol */
 	if (ipinfo->proto &&
-	    FWINV(ip->protocol != ipinfo->proto, IPT_INV_PROTO)) {
-		dprintf("Packet protocol %hi does not match %hi.%s\n",
-			ip->protocol, ipinfo->proto,
-			ipinfo->invflags & IPT_INV_PROTO ? " (INV)" : "");
+	    FWINV(ip->protocol != ipinfo->proto, IPT_INV_PROTO))
 		return false;
-	}

 	/* If we have a fragment rule but the packet is not a fragment
 	 * then we return zero */
-	if (FWINV((ipinfo->flags&IPT_F_FRAG) && !isfrag, IPT_INV_FRAG)) {
-		dprintf("Fragment rule but not fragment.%s\n",
-			ipinfo->invflags & IPT_INV_FRAG ? " (INV)" : "");
+	if (FWINV((ipinfo->flags&IPT_F_FRAG) && !isfrag, IPT_INV_FRAG))
 		return false;
-	}

 	return true;
 }
···
 static bool
 ip_checkentry(const struct ipt_ip *ip)
 {
-	if (ip->flags & ~IPT_F_MASK) {
-		duprintf("Unknown flag bits set: %08X\n",
-			 ip->flags & ~IPT_F_MASK);
+	if (ip->flags & ~IPT_F_MASK)
 		return false;
-	}
-	if (ip->invflags & ~IPT_INV_MASK) {
-		duprintf("Unknown invflag bits set: %08X\n",
-			 ip->invflags & ~IPT_INV_MASK);
+	if (ip->invflags & ~IPT_INV_MASK)
 		return false;
-	}
 	return true;
 }
···

 	e = get_entry(table_base, private->hook_entry[hook]);

-	pr_debug("Entering %s(hook %u), UF %p\n",
-		 table->name, hook,
-		 get_entry(table_base, private->underflow[hook]));
-
 	do {
 		const struct xt_entry_target *t;
 		const struct xt_entry_match *ematch;
···
 			if (stackidx == 0) {
 				e = get_entry(table_base,
 					      private->underflow[hook]);
-				pr_debug("Underflow (this is normal) "
-					 "to %p\n", e);
 			} else {
 				e = jumpstack[--stackidx];
-				pr_debug("Pulled %p out from pos %u\n",
-					 e, stackidx);
 				e = ipt_next_entry(e);
 			}
 			continue;
 		}
 		if (table_base + v != ipt_next_entry(e) &&
-		    !(e->ip.flags & IPT_F_GOTO)) {
+		    !(e->ip.flags & IPT_F_GOTO))
 			jumpstack[stackidx++] = e;
-			pr_debug("Pushed %p into pos %u\n",
-				 e, stackidx - 1);
-		}

 		e = get_entry(table_base, v);
 		continue;
···
 		/* Verdict */
 		break;
 	} while (!acpar.hotdrop);
-	pr_debug("Exiting %s; sp at %u\n", __func__, stackidx);

 	xt_write_recseq_end(addend);
 	local_bh_enable();

-#ifdef DEBUG_ALLOW_ALL
-	return NF_ACCEPT;
-#else
 	if (acpar.hotdrop)
 		return NF_DROP;
 	else return verdict;
-#endif
 }

 static bool find_jump_target(const struct xt_table_info *t,
···
 		= (void *)ipt_get_target_c(e);
 		int visited = e->comefrom & (1 << hook);

-		if (e->comefrom & (1 << NF_INET_NUMHOOKS)) {
-			pr_err("iptables: loop hook %u pos %u %08X.\n",
-			       hook, pos, e->comefrom);
+		if (e->comefrom & (1 << NF_INET_NUMHOOKS))
 			return 0;
-		}
+
 		e->comefrom |= ((1 << hook) | (1 << NF_INET_NUMHOOKS));

 		/* Unconditional return/END. */
···
 			if ((strcmp(t->target.u.user.name,
 				    XT_STANDARD_TARGET) == 0) &&
-			    t->verdict < -NF_MAX_VERDICT - 1) {
-				duprintf("mark_source_chains: bad "
-					"negative verdict (%i)\n",
-					t->verdict);
+			    t->verdict < -NF_MAX_VERDICT - 1)
 				return 0;
-			}

 			/* Return: backtrack through the last
 			   big jump. */
 			do {
 				e->comefrom ^= (1<<NF_INET_NUMHOOKS);
-#ifdef DEBUG_IP_FIREWALL_USER
-				if (e->comefrom
-				    & (1 << NF_INET_NUMHOOKS)) {
-					duprintf("Back unset "
-						 "on hook %u "
-						 "rule %u\n",
-						 hook, pos);
-				}
-#endif
 				oldpos = pos;
 				pos = e->counters.pcnt;
 				e->counters.pcnt = 0;
···
 				   XT_STANDARD_TARGET) == 0 &&
 				    newpos >= 0) {
 					/* This a jump; chase it. */
-					duprintf("Jump rule %u -> %u\n",
-						 pos, newpos);
 					e = (struct ipt_entry *)
 						(entry0 + newpos);
 					if (!find_jump_target(newinfo, e))
···
 				pos = newpos;
 			}
 		}
-next:
-		duprintf("Finished chain %u\n", hook);
+next:		;
 	}
 	return 1;
 }
···
 check_match(struct xt_entry_match *m, struct xt_mtchk_param *par)
 {
 	const struct ipt_ip *ip = par->entryinfo;
-	int ret;

 	par->match     = m->u.kernel.match;
 	par->matchinfo = m->data;

-	ret = xt_check_match(par, m->u.match_size - sizeof(*m),
-	      ip->proto, ip->invflags & IPT_INV_PROTO);
-	if (ret < 0) {
-		duprintf("check failed for `%s'.\n", par->match->name);
-		return ret;
-	}
-	return 0;
+	return xt_check_match(par, m->u.match_size - sizeof(*m),
+			      ip->proto, ip->invflags & IPT_INV_PROTO);
 }

 static int
···
 	match = xt_request_find_match(NFPROTO_IPV4, m->u.user.name,
 				      m->u.user.revision);
-	if (IS_ERR(match)) {
-		duprintf("find_check_match: `%s' not found\n", m->u.user.name);
+	if (IS_ERR(match))
 		return PTR_ERR(match);
-	}
 	m->u.kernel.match = match;

 	ret = check_match(m, par);
···
 		.hook_mask = e->comefrom,
 		.family    = NFPROTO_IPV4,
 	};
-	int ret;

-	ret = xt_check_target(&par, t->u.target_size - sizeof(*t),
-	      e->ip.proto, e->ip.invflags & IPT_INV_PROTO);
-	if (ret < 0) {
-		duprintf("check failed for `%s'.\n",
-			 t->u.kernel.target->name);
-		return ret;
-	}
-	return 0;
+	return xt_check_target(&par, t->u.target_size - sizeof(*t),
+			       e->ip.proto, e->ip.invflags & IPT_INV_PROTO);
 }

 static int
···
 	unsigned int j;
 	struct xt_mtchk_param mtpar;
 	struct xt_entry_match *ematch;
+	unsigned long pcnt;

-	e->counters.pcnt = xt_percpu_counter_alloc();
-	if (IS_ERR_VALUE(e->counters.pcnt))
+	pcnt = xt_percpu_counter_alloc();
+	if (IS_ERR_VALUE(pcnt))
 		return -ENOMEM;
+	e->counters.pcnt = pcnt;

 	j = 0;
 	mtpar.net	= net;
···
 	target = xt_request_find_target(NFPROTO_IPV4, t->u.user.name,
 					t->u.user.revision);
 	if (IS_ERR(target)) {
-		duprintf("find_check_entry: `%s' not found\n", t->u.user.name);
 		ret = PTR_ERR(target);
 		goto cleanup_matches;
 	}
···

 	if ((unsigned long)e % __alignof__(struct ipt_entry) != 0 ||
 	    (unsigned char *)e + sizeof(struct ipt_entry) >= limit ||
-	    (unsigned char *)e + e->next_offset > limit) {
-		duprintf("Bad offset %p\n", e);
+	    (unsigned char *)e + e->next_offset > limit)
 		return -EINVAL;
-	}

 	if (e->next_offset
-	    < sizeof(struct ipt_entry) + sizeof(struct xt_entry_target)) {
-		duprintf("checking: element %p size %u\n",
-			 e, e->next_offset);
+	    < sizeof(struct ipt_entry) + sizeof(struct xt_entry_target))
 		return -EINVAL;
-	}

 	if (!ip_checkentry(&e->ip))
 		return -EINVAL;
···
 		if ((unsigned char *)e - base == hook_entries[h])
 			newinfo->hook_entry[h] = hook_entries[h];
 		if ((unsigned char *)e - base == underflows[h]) {
-			if (!check_underflow(e)) {
-				pr_debug("Underflows must be unconditional and "
-					 "use the STANDARD target with "
-					 "ACCEPT/DROP\n");
+			if (!check_underflow(e))
 				return -EINVAL;
-			}
+
 			newinfo->underflow[h] = underflows[h];
 		}
 	}
···
 		newinfo->underflow[i] = 0xFFFFFFFF;
 	}

-	duprintf("translate_table: size %u\n", newinfo->size);
 	i = 0;
 	/* Walk through entries, checking offsets. */
 	xt_entry_foreach(iter, entry0, newinfo->size) {
···
 			++newinfo->stacksize;
 	}

-	if (i != repl->num_entries) {
-		duprintf("translate_table: %u not %u entries\n",
-			 i, repl->num_entries);
+	if (i != repl->num_entries)
 		return -EINVAL;
-	}

 	/* Check hooks all assigned */
 	for (i = 0; i < NF_INET_NUMHOOKS; i++) {
 		/* Only hooks which are valid */
 		if (!(repl->valid_hooks & (1 << i)))
 			continue;
-		if (newinfo->hook_entry[i] == 0xFFFFFFFF) {
-			duprintf("Invalid hook entry %u %u\n",
-				 i, repl->hook_entry[i]);
+		if (newinfo->hook_entry[i] == 0xFFFFFFFF)
 			return -EINVAL;
-		}
-		if (newinfo->underflow[i] == 0xFFFFFFFF) {
-			duprintf("Invalid underflow %u %u\n",
-				 i, repl->underflow[i]);
+		if (newinfo->underflow[i] == 0xFFFFFFFF)
 			return -EINVAL;
-		}
 	}

 	if (!mark_source_chains(newinfo, repl->valid_hooks, entry0))
···
 	struct xt_table *t;
 	int ret;

-	if (*len != sizeof(struct ipt_getinfo)) {
-		duprintf("length %u != %zu\n", *len,
-			 sizeof(struct ipt_getinfo));
+	if (*len != sizeof(struct ipt_getinfo))
 		return -EINVAL;
-	}

 	if (copy_from_user(name, user, sizeof(name)) != 0)
 		return -EFAULT;
···
 	struct ipt_get_entries get;
 	struct xt_table *t;

-	if (*len < sizeof(get)) {
-		duprintf("get_entries: %u < %zu\n", *len, sizeof(get));
+	if (*len < sizeof(get))
 		return -EINVAL;
-	}
 	if (copy_from_user(&get, uptr, sizeof(get)) != 0)
 		return -EFAULT;
-	if (*len != sizeof(struct ipt_get_entries) + get.size) {
-		duprintf("get_entries: %u != %zu\n",
-			 *len, sizeof(get) + get.size);
+	if (*len != sizeof(struct ipt_get_entries) + get.size)
 		return -EINVAL;
-	}
 	get.name[sizeof(get.name) - 1] = '\0';

 	t =
xt_find_table_lock(net, AF_INET, get.name); 1039 1155 if (!IS_ERR_OR_NULL(t)) { 1040 1156 const struct xt_table_info *private = t->private; 1041 - duprintf("t->private->number = %u\n", private->number); 1042 1157 if (get.size == private->size) 1043 1158 ret = copy_entries_to_user(private->size, 1044 1159 t, uptr->entrytable); 1045 - else { 1046 - duprintf("get_entries: I've got %u not %u!\n", 1047 - private->size, get.size); 1160 + else 1048 1161 ret = -EAGAIN; 1049 - } 1162 + 1050 1163 module_put(t->me); 1051 1164 xt_table_unlock(t); 1052 1165 } else ··· 1074 1203 1075 1204 /* You lied! */ 1076 1205 if (valid_hooks != t->valid_hooks) { 1077 - duprintf("Valid hook crap: %08X vs %08X\n", 1078 - valid_hooks, t->valid_hooks); 1079 1206 ret = -EINVAL; 1080 1207 goto put_module; 1081 1208 } ··· 1083 1214 goto put_module; 1084 1215 1085 1216 /* Update module usage count based on number of rules */ 1086 - duprintf("do_replace: oldnum=%u, initnum=%u, newnum=%u\n", 1087 - oldinfo->number, oldinfo->initial_entries, newinfo->number); 1088 1217 if ((oldinfo->number > oldinfo->initial_entries) || 1089 1218 (newinfo->number <= oldinfo->initial_entries)) 1090 1219 module_put(t->me); ··· 1150 1283 ret = translate_table(net, newinfo, loc_cpu_entry, &tmp); 1151 1284 if (ret != 0) 1152 1285 goto free_newinfo; 1153 - 1154 - duprintf("Translated table\n"); 1155 1286 1156 1287 ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, 1157 1288 tmp.num_counters, tmp.counters); ··· 1276 1411 1277 1412 match = xt_request_find_match(NFPROTO_IPV4, m->u.user.name, 1278 1413 m->u.user.revision); 1279 - if (IS_ERR(match)) { 1280 - duprintf("compat_check_calc_match: `%s' not found\n", 1281 - m->u.user.name); 1414 + if (IS_ERR(match)) 1282 1415 return PTR_ERR(match); 1283 - } 1416 + 1284 1417 m->u.kernel.match = match; 1285 1418 *size += xt_compat_match_offset(match); 1286 1419 return 0; ··· 1310 1447 unsigned int j; 1311 1448 int ret, off; 1312 1449 1313 - 
duprintf("check_compat_entry_size_and_hooks %p\n", e); 1314 1450 if ((unsigned long)e % __alignof__(struct compat_ipt_entry) != 0 || 1315 1451 (unsigned char *)e + sizeof(struct compat_ipt_entry) >= limit || 1316 - (unsigned char *)e + e->next_offset > limit) { 1317 - duprintf("Bad offset %p, limit = %p\n", e, limit); 1452 + (unsigned char *)e + e->next_offset > limit) 1318 1453 return -EINVAL; 1319 - } 1320 1454 1321 1455 if (e->next_offset < sizeof(struct compat_ipt_entry) + 1322 - sizeof(struct compat_xt_entry_target)) { 1323 - duprintf("checking: element %p size %u\n", 1324 - e, e->next_offset); 1456 + sizeof(struct compat_xt_entry_target)) 1325 1457 return -EINVAL; 1326 - } 1327 1458 1328 1459 if (!ip_checkentry(&e->ip)) 1329 1460 return -EINVAL; ··· 1341 1484 target = xt_request_find_target(NFPROTO_IPV4, t->u.user.name, 1342 1485 t->u.user.revision); 1343 1486 if (IS_ERR(target)) { 1344 - duprintf("check_compat_entry_size_and_hooks: `%s' not found\n", 1345 - t->u.user.name); 1346 1487 ret = PTR_ERR(target); 1347 1488 goto release_matches; 1348 1489 } ··· 1422 1567 size = compatr->size; 1423 1568 info->number = compatr->num_entries; 1424 1569 1425 - duprintf("translate_compat_table: size %u\n", info->size); 1426 1570 j = 0; 1427 1571 xt_compat_lock(AF_INET); 1428 1572 xt_compat_init_offsets(AF_INET, compatr->num_entries); ··· 1436 1582 } 1437 1583 1438 1584 ret = -EINVAL; 1439 - if (j != compatr->num_entries) { 1440 - duprintf("translate_compat_table: %u not %u entries\n", 1441 - j, compatr->num_entries); 1585 + if (j != compatr->num_entries) 1442 1586 goto out_unlock; 1443 - } 1444 1587 1445 1588 ret = -ENOMEM; 1446 1589 newinfo = xt_alloc_table_info(size); ··· 1534 1683 if (ret != 0) 1535 1684 goto free_newinfo; 1536 1685 1537 - duprintf("compat_do_replace: Translated table\n"); 1538 - 1539 1686 ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, 1540 1687 tmp.num_counters, compat_ptr(tmp.counters)); 1541 1688 if (ret) ··· 1567 1718 break; 1568 1719 
1569 1720 default: 1570 - duprintf("do_ipt_set_ctl: unknown request %i\n", cmd); 1571 1721 ret = -EINVAL; 1572 1722 } 1573 1723 ··· 1616 1768 struct compat_ipt_get_entries get; 1617 1769 struct xt_table *t; 1618 1770 1619 - if (*len < sizeof(get)) { 1620 - duprintf("compat_get_entries: %u < %zu\n", *len, sizeof(get)); 1771 + if (*len < sizeof(get)) 1621 1772 return -EINVAL; 1622 - } 1623 1773 1624 1774 if (copy_from_user(&get, uptr, sizeof(get)) != 0) 1625 1775 return -EFAULT; 1626 1776 1627 - if (*len != sizeof(struct compat_ipt_get_entries) + get.size) { 1628 - duprintf("compat_get_entries: %u != %zu\n", 1629 - *len, sizeof(get) + get.size); 1777 + if (*len != sizeof(struct compat_ipt_get_entries) + get.size) 1630 1778 return -EINVAL; 1631 - } 1779 + 1632 1780 get.name[sizeof(get.name) - 1] = '\0'; 1633 1781 1634 1782 xt_compat_lock(AF_INET); ··· 1632 1788 if (!IS_ERR_OR_NULL(t)) { 1633 1789 const struct xt_table_info *private = t->private; 1634 1790 struct xt_table_info info; 1635 - duprintf("t->private->number = %u\n", private->number); 1636 1791 ret = compat_table_info(private, &info); 1637 - if (!ret && get.size == info.size) { 1792 + if (!ret && get.size == info.size) 1638 1793 ret = compat_copy_entries_to_user(private->size, 1639 1794 t, uptr->entrytable); 1640 - } else if (!ret) { 1641 - duprintf("compat_get_entries: I've got %u not %u!\n", 1642 - private->size, get.size); 1795 + else if (!ret) 1643 1796 ret = -EAGAIN; 1644 - } 1797 + 1645 1798 xt_compat_flush_offsets(AF_INET); 1646 1799 module_put(t->me); 1647 1800 xt_table_unlock(t); ··· 1691 1850 break; 1692 1851 1693 1852 default: 1694 - duprintf("do_ipt_set_ctl: unknown request %i\n", cmd); 1695 1853 ret = -EINVAL; 1696 1854 } 1697 1855 ··· 1742 1902 } 1743 1903 1744 1904 default: 1745 - duprintf("do_ipt_get_ctl: unknown request %i\n", cmd); 1746 1905 ret = -EINVAL; 1747 1906 } 1748 1907 ··· 1843 2004 /* We've been asked to examine this packet, and we 1844 2005 * can't. Hence, no choice but to drop. 
1845 2006 */ 1846 - duprintf("Dropping evil ICMP tinygram.\n"); 1847 2007 par->hotdrop = true; 1848 2008 return false; 1849 2009 }
+1 -1
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
··· 360 360 361 361 in->ctl_table[0].data = &nf_conntrack_max; 362 362 in->ctl_table[1].data = &net->ct.count; 363 - in->ctl_table[2].data = &net->ct.htable_size; 363 + in->ctl_table[2].data = &nf_conntrack_htable_size; 364 364 in->ctl_table[3].data = &net->ct.sysctl_checksum; 365 365 in->ctl_table[4].data = &net->ct.sysctl_log_invalid; 366 366 #endif
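The one-line sysctl change above follows from the netns rework described in the merge message: with a single conntrack table shared by all namespaces, the table size is the global `nf_conntrack_htable_size` and the namespace becomes part of the hash and of every object comparison. A minimal sketch of that shape, with made-up types (`struct net`, `struct entry`, `hash_bucket` are illustrative only):

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 16

struct net { int id; };

struct entry {
	struct net *net;
	unsigned int key;
	struct entry *next;
};

/* One shared table for all namespaces. */
static struct entry *table[NBUCKETS];

/* Mix the namespace into the hash so entries from different
 * namespaces spread across the shared buckets. */
static unsigned int hash_bucket(const struct net *net, unsigned int key)
{
	return (key ^ (unsigned int)(size_t)net) % NBUCKETS;
}

static void insert(struct entry *e)
{
	unsigned int b = hash_bucket(e->net, e->key);

	e->next = table[b];
	table[b] = e;
}

/* Lookup must compare the namespace too: the same tuple in another
 * netns is a different object once the table is shared. */
static struct entry *lookup(const struct net *net, unsigned int key)
{
	struct entry *e;

	for (e = table[hash_bucket(net, key)]; e; e = e->next)
		if (e->key == key && e->net == net)
			return e;
	return NULL;
}
```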
+32 -15
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.c
··· 31 31 32 32 static struct hlist_nulls_node *ct_get_first(struct seq_file *seq) 33 33 { 34 - struct net *net = seq_file_net(seq); 35 34 struct ct_iter_state *st = seq->private; 36 35 struct hlist_nulls_node *n; 37 36 38 37 for (st->bucket = 0; 39 - st->bucket < net->ct.htable_size; 38 + st->bucket < nf_conntrack_htable_size; 40 39 st->bucket++) { 41 40 n = rcu_dereference( 42 - hlist_nulls_first_rcu(&net->ct.hash[st->bucket])); 41 + hlist_nulls_first_rcu(&nf_conntrack_hash[st->bucket])); 43 42 if (!is_a_nulls(n)) 44 43 return n; 45 44 } ··· 48 49 static struct hlist_nulls_node *ct_get_next(struct seq_file *seq, 49 50 struct hlist_nulls_node *head) 50 51 { 51 - struct net *net = seq_file_net(seq); 52 52 struct ct_iter_state *st = seq->private; 53 53 54 54 head = rcu_dereference(hlist_nulls_next_rcu(head)); 55 55 while (is_a_nulls(head)) { 56 56 if (likely(get_nulls_value(head) == st->bucket)) { 57 - if (++st->bucket >= net->ct.htable_size) 57 + if (++st->bucket >= nf_conntrack_htable_size) 58 58 return NULL; 59 59 } 60 60 head = rcu_dereference( 61 - hlist_nulls_first_rcu(&net->ct.hash[st->bucket])); 61 + hlist_nulls_first_rcu(&nf_conntrack_hash[st->bucket])); 62 62 } 63 63 return head; 64 64 } ··· 112 114 } 113 115 #endif 114 116 117 + static bool ct_seq_should_skip(const struct nf_conn *ct, 118 + const struct net *net, 119 + const struct nf_conntrack_tuple_hash *hash) 120 + { 121 + /* we only want to print DIR_ORIGINAL */ 122 + if (NF_CT_DIRECTION(hash)) 123 + return true; 124 + 125 + if (nf_ct_l3num(ct) != AF_INET) 126 + return true; 127 + 128 + if (!net_eq(nf_ct_net(ct), net)) 129 + return true; 130 + 131 + return false; 132 + } 133 + 115 134 static int ct_seq_show(struct seq_file *s, void *v) 116 135 { 117 136 struct nf_conntrack_tuple_hash *hash = v; ··· 138 123 int ret = 0; 139 124 140 125 NF_CT_ASSERT(ct); 126 + if (ct_seq_should_skip(ct, seq_file_net(s), hash)) 127 + return 0; 128 + 141 129 if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use))) 142 130 
return 0; 143 131 144 - 145 - /* we only want to print DIR_ORIGINAL */ 146 - if (NF_CT_DIRECTION(hash)) 147 - goto release; 148 - if (nf_ct_l3num(ct) != AF_INET) 132 + /* check if we raced w. object reuse */ 133 + if (!nf_ct_is_confirmed(ct) || 134 + ct_seq_should_skip(ct, seq_file_net(s), hash)) 149 135 goto release; 150 136 151 137 l3proto = __nf_ct_l3proto_find(nf_ct_l3num(ct)); ··· 236 220 237 221 static struct hlist_node *ct_expect_get_first(struct seq_file *seq) 238 222 { 239 - struct net *net = seq_file_net(seq); 240 223 struct ct_expect_iter_state *st = seq->private; 241 224 struct hlist_node *n; 242 225 243 226 for (st->bucket = 0; st->bucket < nf_ct_expect_hsize; st->bucket++) { 244 227 n = rcu_dereference( 245 - hlist_first_rcu(&net->ct.expect_hash[st->bucket])); 228 + hlist_first_rcu(&nf_ct_expect_hash[st->bucket])); 246 229 if (n) 247 230 return n; 248 231 } ··· 251 236 static struct hlist_node *ct_expect_get_next(struct seq_file *seq, 252 237 struct hlist_node *head) 253 238 { 254 - struct net *net = seq_file_net(seq); 255 239 struct ct_expect_iter_state *st = seq->private; 256 240 257 241 head = rcu_dereference(hlist_next_rcu(head)); ··· 258 244 if (++st->bucket >= nf_ct_expect_hsize) 259 245 return NULL; 260 246 head = rcu_dereference( 261 - hlist_first_rcu(&net->ct.expect_hash[st->bucket])); 247 + hlist_first_rcu(&nf_ct_expect_hash[st->bucket])); 262 248 } 263 249 return head; 264 250 } ··· 298 284 const struct hlist_node *n = v; 299 285 300 286 exp = hlist_entry(n, struct nf_conntrack_expect, hnode); 287 + 288 + if (!net_eq(nf_ct_net(exp->master), seq_file_net(s))) 289 + return 0; 301 290 302 291 if (exp->tuple.src.l3num != AF_INET) 303 292 return 0;
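The `ct_seq_show()` hunk above evaluates the skip conditions twice: once before `atomic_inc_not_zero()` and once after, because under RCU the entry can be freed and reused between the first check and the reference grab ("check if we raced w. object reuse"). A simplified, single-threaded sketch of that grab-then-revalidate pattern (all names here are stand-ins, not the conntrack API):

```c
#include <assert.h>
#include <stdbool.h>

struct obj {
	int use;	/* stand-in for an atomic refcount */
	bool confirmed;
	int netns_id;
};

static bool inc_not_zero(struct obj *o)
{
	if (o->use == 0)
		return false;	/* already being freed */
	o->use++;
	return true;
}

static bool should_skip(const struct obj *o, int my_netns)
{
	return o->netns_id != my_netns;
}

/* Returns true iff the caller now holds a reference on a valid object.
 * The post-grab re-check mirrors ct_seq_should_skip() being called
 * again after atomic_inc_not_zero() succeeds. */
static bool get_valid_ref(struct obj *o, int my_netns)
{
	if (should_skip(o, my_netns))
		return false;
	if (!inc_not_zero(o))
		return false;
	/* re-check: object could have been recycled meanwhile */
	if (!o->confirmed || should_skip(o, my_netns)) {
		o->use--;	/* drop the reference we just took */
		return false;
	}
	return true;
}
```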
+45 -190
net/ipv6/netfilter/ip6_tables.c
··· 39 39 MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>"); 40 40 MODULE_DESCRIPTION("IPv6 packet filter"); 41 41 42 - /*#define DEBUG_IP_FIREWALL*/ 43 - /*#define DEBUG_ALLOW_ALL*/ /* Useful for remote debugging */ 44 - /*#define DEBUG_IP_FIREWALL_USER*/ 45 - 46 - #ifdef DEBUG_IP_FIREWALL 47 - #define dprintf(format, args...) pr_info(format , ## args) 48 - #else 49 - #define dprintf(format, args...) 50 - #endif 51 - 52 - #ifdef DEBUG_IP_FIREWALL_USER 53 - #define duprintf(format, args...) pr_info(format , ## args) 54 - #else 55 - #define duprintf(format, args...) 56 - #endif 57 - 58 42 #ifdef CONFIG_NETFILTER_DEBUG 59 43 #define IP_NF_ASSERT(x) WARN_ON(!(x)) 60 44 #else 61 45 #define IP_NF_ASSERT(x) 62 - #endif 63 - 64 - #if 0 65 - /* All the better to debug you with... */ 66 - #define static 67 - #define inline 68 46 #endif 69 47 70 48 void *ip6t_alloc_initial_table(const struct xt_table *info) ··· 78 100 if (FWINV(ipv6_masked_addr_cmp(&ipv6->saddr, &ip6info->smsk, 79 101 &ip6info->src), IP6T_INV_SRCIP) || 80 102 FWINV(ipv6_masked_addr_cmp(&ipv6->daddr, &ip6info->dmsk, 81 - &ip6info->dst), IP6T_INV_DSTIP)) { 82 - dprintf("Source or dest mismatch.\n"); 83 - /* 84 - dprintf("SRC: %u. Mask: %u. Target: %u.%s\n", ip->saddr, 85 - ipinfo->smsk.s_addr, ipinfo->src.s_addr, 86 - ipinfo->invflags & IP6T_INV_SRCIP ? " (INV)" : ""); 87 - dprintf("DST: %u. Mask: %u. Target: %u.%s\n", ip->daddr, 88 - ipinfo->dmsk.s_addr, ipinfo->dst.s_addr, 89 - ipinfo->invflags & IP6T_INV_DSTIP ? " (INV)" : "");*/ 103 + &ip6info->dst), IP6T_INV_DSTIP)) 90 104 return false; 91 - } 92 105 93 106 ret = ifname_compare_aligned(indev, ip6info->iniface, ip6info->iniface_mask); 94 107 95 - if (FWINV(ret != 0, IP6T_INV_VIA_IN)) { 96 - dprintf("VIA in mismatch (%s vs %s).%s\n", 97 - indev, ip6info->iniface, 98 - ip6info->invflags & IP6T_INV_VIA_IN ? 
" (INV)" : ""); 108 + if (FWINV(ret != 0, IP6T_INV_VIA_IN)) 99 109 return false; 100 - } 101 110 102 111 ret = ifname_compare_aligned(outdev, ip6info->outiface, ip6info->outiface_mask); 103 112 104 - if (FWINV(ret != 0, IP6T_INV_VIA_OUT)) { 105 - dprintf("VIA out mismatch (%s vs %s).%s\n", 106 - outdev, ip6info->outiface, 107 - ip6info->invflags & IP6T_INV_VIA_OUT ? " (INV)" : ""); 113 + if (FWINV(ret != 0, IP6T_INV_VIA_OUT)) 108 114 return false; 109 - } 110 115 111 116 /* ... might want to do something with class and flowlabel here ... */ 112 117 ··· 105 144 return false; 106 145 } 107 146 *fragoff = _frag_off; 108 - 109 - dprintf("Packet protocol %hi ?= %s%hi.\n", 110 - protohdr, 111 - ip6info->invflags & IP6T_INV_PROTO ? "!":"", 112 - ip6info->proto); 113 147 114 148 if (ip6info->proto == protohdr) { 115 149 if (ip6info->invflags & IP6T_INV_PROTO) ··· 125 169 static bool 126 170 ip6_checkentry(const struct ip6t_ip6 *ipv6) 127 171 { 128 - if (ipv6->flags & ~IP6T_F_MASK) { 129 - duprintf("Unknown flag bits set: %08X\n", 130 - ipv6->flags & ~IP6T_F_MASK); 172 + if (ipv6->flags & ~IP6T_F_MASK) 131 173 return false; 132 - } 133 - if (ipv6->invflags & ~IP6T_INV_MASK) { 134 - duprintf("Unknown invflag bits set: %08X\n", 135 - ipv6->invflags & ~IP6T_INV_MASK); 174 + if (ipv6->invflags & ~IP6T_INV_MASK) 136 175 return false; 137 - } 176 + 138 177 return true; 139 178 } 140 179 ··· 397 446 xt_write_recseq_end(addend); 398 447 local_bh_enable(); 399 448 400 - #ifdef DEBUG_ALLOW_ALL 401 - return NF_ACCEPT; 402 - #else 403 449 if (acpar.hotdrop) 404 450 return NF_DROP; 405 451 else return verdict; 406 - #endif 407 452 } 408 453 409 454 static bool find_jump_target(const struct xt_table_info *t, ··· 439 492 = (void *)ip6t_get_target_c(e); 440 493 int visited = e->comefrom & (1 << hook); 441 494 442 - if (e->comefrom & (1 << NF_INET_NUMHOOKS)) { 443 - pr_err("iptables: loop hook %u pos %u %08X.\n", 444 - hook, pos, e->comefrom); 495 + if (e->comefrom & (1 << 
NF_INET_NUMHOOKS)) 445 496 return 0; 446 - } 497 + 447 498 e->comefrom |= ((1 << hook) | (1 << NF_INET_NUMHOOKS)); 448 499 449 500 /* Unconditional return/END. */ ··· 453 508 454 509 if ((strcmp(t->target.u.user.name, 455 510 XT_STANDARD_TARGET) == 0) && 456 - t->verdict < -NF_MAX_VERDICT - 1) { 457 - duprintf("mark_source_chains: bad " 458 - "negative verdict (%i)\n", 459 - t->verdict); 511 + t->verdict < -NF_MAX_VERDICT - 1) 460 512 return 0; 461 - } 462 513 463 514 /* Return: backtrack through the last 464 515 big jump. */ 465 516 do { 466 517 e->comefrom ^= (1<<NF_INET_NUMHOOKS); 467 - #ifdef DEBUG_IP_FIREWALL_USER 468 - if (e->comefrom 469 - & (1 << NF_INET_NUMHOOKS)) { 470 - duprintf("Back unset " 471 - "on hook %u " 472 - "rule %u\n", 473 - hook, pos); 474 - } 475 - #endif 476 518 oldpos = pos; 477 519 pos = e->counters.pcnt; 478 520 e->counters.pcnt = 0; ··· 487 555 XT_STANDARD_TARGET) == 0 && 488 556 newpos >= 0) { 489 557 /* This a jump; chase it. */ 490 - duprintf("Jump rule %u -> %u\n", 491 - pos, newpos); 492 558 e = (struct ip6t_entry *) 493 559 (entry0 + newpos); 494 560 if (!find_jump_target(newinfo, e)) ··· 503 573 pos = newpos; 504 574 } 505 575 } 506 - next: 507 - duprintf("Finished chain %u\n", hook); 576 + next: ; 508 577 } 509 578 return 1; 510 579 } ··· 524 595 static int check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) 525 596 { 526 597 const struct ip6t_ip6 *ipv6 = par->entryinfo; 527 - int ret; 528 598 529 599 par->match = m->u.kernel.match; 530 600 par->matchinfo = m->data; 531 601 532 - ret = xt_check_match(par, m->u.match_size - sizeof(*m), 533 - ipv6->proto, ipv6->invflags & IP6T_INV_PROTO); 534 - if (ret < 0) { 535 - duprintf("ip_tables: check failed for `%s'.\n", 536 - par.match->name); 537 - return ret; 538 - } 539 - return 0; 602 + return xt_check_match(par, m->u.match_size - sizeof(*m), 603 + ipv6->proto, ipv6->invflags & IP6T_INV_PROTO); 540 604 } 541 605 542 606 static int ··· 540 618 541 619 match = 
xt_request_find_match(NFPROTO_IPV6, m->u.user.name, 542 620 m->u.user.revision); 543 - if (IS_ERR(match)) { 544 - duprintf("find_check_match: `%s' not found\n", m->u.user.name); 621 + if (IS_ERR(match)) 545 622 return PTR_ERR(match); 546 - } 623 + 547 624 m->u.kernel.match = match; 548 625 549 626 ret = check_match(m, par); ··· 567 646 .hook_mask = e->comefrom, 568 647 .family = NFPROTO_IPV6, 569 648 }; 570 - int ret; 571 649 572 650 t = ip6t_get_target(e); 573 - ret = xt_check_target(&par, t->u.target_size - sizeof(*t), 574 - e->ipv6.proto, e->ipv6.invflags & IP6T_INV_PROTO); 575 - if (ret < 0) { 576 - duprintf("ip_tables: check failed for `%s'.\n", 577 - t->u.kernel.target->name); 578 - return ret; 579 - } 580 - return 0; 651 + return xt_check_target(&par, t->u.target_size - sizeof(*t), 652 + e->ipv6.proto, 653 + e->ipv6.invflags & IP6T_INV_PROTO); 581 654 } 582 655 583 656 static int ··· 584 669 unsigned int j; 585 670 struct xt_mtchk_param mtpar; 586 671 struct xt_entry_match *ematch; 672 + unsigned long pcnt; 587 673 588 - e->counters.pcnt = xt_percpu_counter_alloc(); 589 - if (IS_ERR_VALUE(e->counters.pcnt)) 674 + pcnt = xt_percpu_counter_alloc(); 675 + if (IS_ERR_VALUE(pcnt)) 590 676 return -ENOMEM; 677 + e->counters.pcnt = pcnt; 591 678 592 679 j = 0; 593 680 mtpar.net = net; ··· 608 691 target = xt_request_find_target(NFPROTO_IPV6, t->u.user.name, 609 692 t->u.user.revision); 610 693 if (IS_ERR(target)) { 611 - duprintf("find_check_entry: `%s' not found\n", t->u.user.name); 612 694 ret = PTR_ERR(target); 613 695 goto cleanup_matches; 614 696 } ··· 660 744 661 745 if ((unsigned long)e % __alignof__(struct ip6t_entry) != 0 || 662 746 (unsigned char *)e + sizeof(struct ip6t_entry) >= limit || 663 - (unsigned char *)e + e->next_offset > limit) { 664 - duprintf("Bad offset %p\n", e); 747 + (unsigned char *)e + e->next_offset > limit) 665 748 return -EINVAL; 666 - } 667 749 668 750 if (e->next_offset 669 - < sizeof(struct ip6t_entry) + sizeof(struct 
xt_entry_target)) { 670 - duprintf("checking: element %p size %u\n", 671 - e, e->next_offset); 751 + < sizeof(struct ip6t_entry) + sizeof(struct xt_entry_target)) 672 752 return -EINVAL; 673 - } 674 753 675 754 if (!ip6_checkentry(&e->ipv6)) 676 755 return -EINVAL; ··· 682 771 if ((unsigned char *)e - base == hook_entries[h]) 683 772 newinfo->hook_entry[h] = hook_entries[h]; 684 773 if ((unsigned char *)e - base == underflows[h]) { 685 - if (!check_underflow(e)) { 686 - pr_debug("Underflows must be unconditional and " 687 - "use the STANDARD target with " 688 - "ACCEPT/DROP\n"); 774 + if (!check_underflow(e)) 689 775 return -EINVAL; 690 - } 776 + 691 777 newinfo->underflow[h] = underflows[h]; 692 778 } 693 779 } ··· 736 828 newinfo->underflow[i] = 0xFFFFFFFF; 737 829 } 738 830 739 - duprintf("translate_table: size %u\n", newinfo->size); 740 831 i = 0; 741 832 /* Walk through entries, checking offsets. */ 742 833 xt_entry_foreach(iter, entry0, newinfo->size) { ··· 752 845 ++newinfo->stacksize; 753 846 } 754 847 755 - if (i != repl->num_entries) { 756 - duprintf("translate_table: %u not %u entries\n", 757 - i, repl->num_entries); 848 + if (i != repl->num_entries) 758 849 return -EINVAL; 759 - } 760 850 761 851 /* Check hooks all assigned */ 762 852 for (i = 0; i < NF_INET_NUMHOOKS; i++) { 763 853 /* Only hooks which are valid */ 764 854 if (!(repl->valid_hooks & (1 << i))) 765 855 continue; 766 - if (newinfo->hook_entry[i] == 0xFFFFFFFF) { 767 - duprintf("Invalid hook entry %u %u\n", 768 - i, repl->hook_entry[i]); 856 + if (newinfo->hook_entry[i] == 0xFFFFFFFF) 769 857 return -EINVAL; 770 - } 771 - if (newinfo->underflow[i] == 0xFFFFFFFF) { 772 - duprintf("Invalid underflow %u %u\n", 773 - i, repl->underflow[i]); 858 + if (newinfo->underflow[i] == 0xFFFFFFFF) 774 859 return -EINVAL; 775 - } 776 860 } 777 861 778 862 if (!mark_source_chains(newinfo, repl->valid_hooks, entry0)) ··· 991 1093 struct xt_table *t; 992 1094 int ret; 993 1095 994 - if (*len != sizeof(struct 
ip6t_getinfo)) { 995 - duprintf("length %u != %zu\n", *len, 996 - sizeof(struct ip6t_getinfo)); 1096 + if (*len != sizeof(struct ip6t_getinfo)) 997 1097 return -EINVAL; 998 - } 999 1098 1000 1099 if (copy_from_user(name, user, sizeof(name)) != 0) 1001 1100 return -EFAULT; ··· 1050 1155 struct ip6t_get_entries get; 1051 1156 struct xt_table *t; 1052 1157 1053 - if (*len < sizeof(get)) { 1054 - duprintf("get_entries: %u < %zu\n", *len, sizeof(get)); 1158 + if (*len < sizeof(get)) 1055 1159 return -EINVAL; 1056 - } 1057 1160 if (copy_from_user(&get, uptr, sizeof(get)) != 0) 1058 1161 return -EFAULT; 1059 - if (*len != sizeof(struct ip6t_get_entries) + get.size) { 1060 - duprintf("get_entries: %u != %zu\n", 1061 - *len, sizeof(get) + get.size); 1162 + if (*len != sizeof(struct ip6t_get_entries) + get.size) 1062 1163 return -EINVAL; 1063 - } 1164 + 1064 1165 get.name[sizeof(get.name) - 1] = '\0'; 1065 1166 1066 1167 t = xt_find_table_lock(net, AF_INET6, get.name); 1067 1168 if (!IS_ERR_OR_NULL(t)) { 1068 1169 struct xt_table_info *private = t->private; 1069 - duprintf("t->private->number = %u\n", private->number); 1070 1170 if (get.size == private->size) 1071 1171 ret = copy_entries_to_user(private->size, 1072 1172 t, uptr->entrytable); 1073 - else { 1074 - duprintf("get_entries: I've got %u not %u!\n", 1075 - private->size, get.size); 1173 + else 1076 1174 ret = -EAGAIN; 1077 - } 1175 + 1078 1176 module_put(t->me); 1079 1177 xt_table_unlock(t); 1080 1178 } else ··· 1103 1215 1104 1216 /* You lied! 
*/ 1105 1217 if (valid_hooks != t->valid_hooks) { 1106 - duprintf("Valid hook crap: %08X vs %08X\n", 1107 - valid_hooks, t->valid_hooks); 1108 1218 ret = -EINVAL; 1109 1219 goto put_module; 1110 1220 } ··· 1112 1226 goto put_module; 1113 1227 1114 1228 /* Update module usage count based on number of rules */ 1115 - duprintf("do_replace: oldnum=%u, initnum=%u, newnum=%u\n", 1116 - oldinfo->number, oldinfo->initial_entries, newinfo->number); 1117 1229 if ((oldinfo->number > oldinfo->initial_entries) || 1118 1230 (newinfo->number <= oldinfo->initial_entries)) 1119 1231 module_put(t->me); ··· 1179 1295 ret = translate_table(net, newinfo, loc_cpu_entry, &tmp); 1180 1296 if (ret != 0) 1181 1297 goto free_newinfo; 1182 - 1183 - duprintf("ip_tables: Translated table\n"); 1184 1298 1185 1299 ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, 1186 1300 tmp.num_counters, tmp.counters); ··· 1304 1422 1305 1423 match = xt_request_find_match(NFPROTO_IPV6, m->u.user.name, 1306 1424 m->u.user.revision); 1307 - if (IS_ERR(match)) { 1308 - duprintf("compat_check_calc_match: `%s' not found\n", 1309 - m->u.user.name); 1425 + if (IS_ERR(match)) 1310 1426 return PTR_ERR(match); 1311 - } 1427 + 1312 1428 m->u.kernel.match = match; 1313 1429 *size += xt_compat_match_offset(match); 1314 1430 return 0; ··· 1338 1458 unsigned int j; 1339 1459 int ret, off; 1340 1460 1341 - duprintf("check_compat_entry_size_and_hooks %p\n", e); 1342 1461 if ((unsigned long)e % __alignof__(struct compat_ip6t_entry) != 0 || 1343 1462 (unsigned char *)e + sizeof(struct compat_ip6t_entry) >= limit || 1344 - (unsigned char *)e + e->next_offset > limit) { 1345 - duprintf("Bad offset %p, limit = %p\n", e, limit); 1463 + (unsigned char *)e + e->next_offset > limit) 1346 1464 return -EINVAL; 1347 - } 1348 1465 1349 1466 if (e->next_offset < sizeof(struct compat_ip6t_entry) + 1350 - sizeof(struct compat_xt_entry_target)) { 1351 - duprintf("checking: element %p size %u\n", 1352 - e, e->next_offset); 1467 + 
sizeof(struct compat_xt_entry_target)) 1353 1468 return -EINVAL; 1354 - } 1355 1469 1356 1470 if (!ip6_checkentry(&e->ipv6)) 1357 1471 return -EINVAL; ··· 1369 1495 target = xt_request_find_target(NFPROTO_IPV6, t->u.user.name, 1370 1496 t->u.user.revision); 1371 1497 if (IS_ERR(target)) { 1372 - duprintf("check_compat_entry_size_and_hooks: `%s' not found\n", 1373 - t->u.user.name); 1374 1498 ret = PTR_ERR(target); 1375 1499 goto release_matches; 1376 1500 } ··· 1447 1575 size = compatr->size; 1448 1576 info->number = compatr->num_entries; 1449 1577 1450 - duprintf("translate_compat_table: size %u\n", info->size); 1451 1578 j = 0; 1452 1579 xt_compat_lock(AF_INET6); 1453 1580 xt_compat_init_offsets(AF_INET6, compatr->num_entries); ··· 1461 1590 } 1462 1591 1463 1592 ret = -EINVAL; 1464 - if (j != compatr->num_entries) { 1465 - duprintf("translate_compat_table: %u not %u entries\n", 1466 - j, compatr->num_entries); 1593 + if (j != compatr->num_entries) 1467 1594 goto out_unlock; 1468 - } 1469 1595 1470 1596 ret = -ENOMEM; 1471 1597 newinfo = xt_alloc_table_info(size); ··· 1553 1685 if (ret != 0) 1554 1686 goto free_newinfo; 1555 1687 1556 - duprintf("compat_do_replace: Translated table\n"); 1557 - 1558 1688 ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo, 1559 1689 tmp.num_counters, compat_ptr(tmp.counters)); 1560 1690 if (ret) ··· 1586 1720 break; 1587 1721 1588 1722 default: 1589 - duprintf("do_ip6t_set_ctl: unknown request %i\n", cmd); 1590 1723 ret = -EINVAL; 1591 1724 } 1592 1725 ··· 1635 1770 struct compat_ip6t_get_entries get; 1636 1771 struct xt_table *t; 1637 1772 1638 - if (*len < sizeof(get)) { 1639 - duprintf("compat_get_entries: %u < %zu\n", *len, sizeof(get)); 1773 + if (*len < sizeof(get)) 1640 1774 return -EINVAL; 1641 - } 1642 1775 1643 1776 if (copy_from_user(&get, uptr, sizeof(get)) != 0) 1644 1777 return -EFAULT; 1645 1778 1646 - if (*len != sizeof(struct compat_ip6t_get_entries) + get.size) { 1647 - duprintf("compat_get_entries: %u != 
%zu\n", 1648 - *len, sizeof(get) + get.size); 1779 + if (*len != sizeof(struct compat_ip6t_get_entries) + get.size) 1649 1780 return -EINVAL; 1650 - } 1781 + 1651 1782 get.name[sizeof(get.name) - 1] = '\0'; 1652 1783 1653 1784 xt_compat_lock(AF_INET6); ··· 1651 1790 if (!IS_ERR_OR_NULL(t)) { 1652 1791 const struct xt_table_info *private = t->private; 1653 1792 struct xt_table_info info; 1654 - duprintf("t->private->number = %u\n", private->number); 1655 1793 ret = compat_table_info(private, &info); 1656 - if (!ret && get.size == info.size) { 1794 + if (!ret && get.size == info.size) 1657 1795 ret = compat_copy_entries_to_user(private->size, 1658 1796 t, uptr->entrytable); 1659 - } else if (!ret) { 1660 - duprintf("compat_get_entries: I've got %u not %u!\n", 1661 - private->size, get.size); 1797 + else if (!ret) 1662 1798 ret = -EAGAIN; 1663 - } 1799 + 1664 1800 xt_compat_flush_offsets(AF_INET6); 1665 1801 module_put(t->me); 1666 1802 xt_table_unlock(t); ··· 1710 1852 break; 1711 1853 1712 1854 default: 1713 - duprintf("do_ip6t_set_ctl: unknown request %i\n", cmd); 1714 1855 ret = -EINVAL; 1715 1856 } 1716 1857 ··· 1761 1904 } 1762 1905 1763 1906 default: 1764 - duprintf("do_ip6t_get_ctl: unknown request %i\n", cmd); 1765 1907 ret = -EINVAL; 1766 1908 } 1767 1909 ··· 1862 2006 /* We've been asked to examine this packet, and we 1863 2007 * can't. Hence, no choice but to drop. 1864 2008 */ 1865 - duprintf("Dropping evil ICMP tinygram.\n"); 1866 2009 par->hotdrop = true; 1867 2010 return false; 1868 2011 }
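A small C detail hides in the `mark_source_chains()` hunks for both ip_tables.c and ip6_tables.c: `next: duprintf(...);` becomes `next: ;`. Once the debug call after the label is deleted, the label would directly precede the loop's closing brace, and a C label must prefix a statement (prior to C23), so an explicit null statement is required. Minimal illustration:

```c
#include <assert.h>

static int count_even(const int *v, int n)
{
	int i, even = 0;

	for (i = 0; i < n; i++) {
		if (v[i] % 2)
			goto next;
		even++;
next:		;	/* null statement keeps the label legal */
	}
	return even;
}
```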
+1 -1
net/ipv6/netfilter/ip6t_SYNPROXY.c
··· 60 60 fl6.fl6_dport = nth->dest; 61 61 security_skb_classify_flow((struct sk_buff *)skb, flowi6_to_flowi(&fl6)); 62 62 dst = ip6_route_output(net, NULL, &fl6); 63 - if (dst == NULL || dst->error) { 63 + if (dst->error) { 64 64 dst_release(dst); 65 65 goto free_nskb; 66 66 }
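The SYNPROXY hunk drops the `dst == NULL` test because `ip6_route_output()` by contract always returns a dst entry, reporting failure through `dst->error` (contrast IPv4, where route lookup can return NULL). A userspace analog of consuming such an always-non-NULL API (names invented for illustration):

```c
#include <assert.h>

struct dst { int error; };

static struct dst err_dst = { .error = -113 /* -EHOSTUNREACH */ };
static struct dst ok_dst  = { .error = 0 };

/* Always returns a valid object; failure lives in ->error. */
static struct dst *route_output(int reachable)
{
	return reachable ? &ok_dst : &err_dst;
}

static int send_packet(int reachable)
{
	struct dst *dst = route_output(reachable);

	if (dst->error)	/* no NULL check needed by contract */
		return dst->error;
	return 0;
}
```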
+44 -7
net/netfilter/ipvs/ip_vs_conn.c
···
 	spin_unlock_bh(&__ip_vs_conntbl_lock_array[key&CT_LOCKARRAY_MASK].l);
 }
 
+static void ip_vs_conn_expire(unsigned long data);
 
 /*
  *	Returns hash value for IPVS connection entry
···
 }
 EXPORT_SYMBOL_GPL(ip_vs_conn_out_get_proto);
 
+static void __ip_vs_conn_put_notimer(struct ip_vs_conn *cp)
+{
+	__ip_vs_conn_put(cp);
+	ip_vs_conn_expire((unsigned long)cp);
+}
+
 /*
  *	Put back the conn and restart its timer with its timeout
  */
-void ip_vs_conn_put(struct ip_vs_conn *cp)
+static void __ip_vs_conn_put_timer(struct ip_vs_conn *cp)
 {
 	unsigned long t = (cp->flags & IP_VS_CONN_F_ONE_PACKET) ?
 		0 : cp->timeout;
···
 	__ip_vs_conn_put(cp);
 }
 
+void ip_vs_conn_put(struct ip_vs_conn *cp)
+{
+	if ((cp->flags & IP_VS_CONN_F_ONE_PACKET) &&
+	    (atomic_read(&cp->refcnt) == 1) &&
+	    !timer_pending(&cp->timer))
+		/* expire connection immediately */
+		__ip_vs_conn_put_notimer(cp);
+	else
+		__ip_vs_conn_put_timer(cp);
+}
 
 /*
  *	Fill a no_client_port connection with a client port number
···
 	if (cp->control)
 		ip_vs_control_del(cp);
 
-	if (cp->flags & IP_VS_CONN_F_NFCT) {
+	if ((cp->flags & IP_VS_CONN_F_NFCT) &&
+	    !(cp->flags & IP_VS_CONN_F_ONE_PACKET)) {
 		/* Do not access conntracks during subsys cleanup
 		 * because nf_conntrack_find_get can not be used after
 		 * conntrack cleanup for the net.
···
 	ip_vs_unbind_dest(cp);
 	if (cp->flags & IP_VS_CONN_F_NO_CPORT)
 		atomic_dec(&ip_vs_conn_no_cport_cnt);
-	call_rcu(&cp->rcu_head, ip_vs_conn_rcu_free);
+	if (cp->flags & IP_VS_CONN_F_ONE_PACKET)
+		ip_vs_conn_rcu_free(&cp->rcu_head);
+	else
+		call_rcu(&cp->rcu_head, ip_vs_conn_rcu_free);
 	atomic_dec(&ipvs->conn_count);
 	return;
 }
···
 	if (ipvs->sync_state & IP_VS_STATE_MASTER)
 		ip_vs_sync_conn(ipvs, cp, sysctl_sync_threshold(ipvs));
 
-	ip_vs_conn_put(cp);
+	__ip_vs_conn_put_timer(cp);
 }
 
 /* Modify timer, so that it expires as soon as possible.
···
 	return 1;
 }
 
+static inline bool ip_vs_conn_ops_mode(struct ip_vs_conn *cp)
+{
+	struct ip_vs_service *svc;
+
+	if (!cp->dest)
+		return false;
+	svc = rcu_dereference(cp->dest->svc);
+	return svc && (svc->flags & IP_VS_SVC_F_ONEPACKET);
+}
+
 /* Called from keventd and must protect itself from softirqs */
 void ip_vs_random_dropentry(struct netns_ipvs *ipvs)
 {
···
 		unsigned int hash = prandom_u32() & ip_vs_conn_tab_mask;
 
 		hlist_for_each_entry_rcu(cp, &ip_vs_conn_tab[hash], c_list) {
-			if (cp->flags & IP_VS_CONN_F_TEMPLATE)
-				/* connection template */
-				continue;
 			if (cp->ipvs != ipvs)
 				continue;
+			if (cp->flags & IP_VS_CONN_F_TEMPLATE) {
+				if (atomic_read(&cp->n_control) ||
+				    !ip_vs_conn_ops_mode(cp))
+					continue;
+				else
+					/* connection template of OPS */
+					goto try_drop;
+			}
 			if (cp->protocol == IPPROTO_TCP) {
 				switch(cp->state) {
 				case IP_VS_TCP_S_SYN_RECV:
···
 				continue;
 			}
 		} else {
+try_drop:
 			if (!todrop_entry(cp))
 				continue;
 		}
net/netfilter/ipvs/ip_vs_core.c  +161 -1
···
 #ifdef CONFIG_IP_VS_DEBUG
 EXPORT_SYMBOL(ip_vs_get_debug_level);
 #endif
+EXPORT_SYMBOL(ip_vs_new_conn_out);
 
 static int ip_vs_net_id __read_mostly;
 /* netns cnt used for uniqueness */
···
 	ret = cp->packet_xmit(skb, cp, pd->pp, iph);
 	/* do not touch skb anymore */
 
-	atomic_inc(&cp->in_pkts);
+	if ((cp->flags & IP_VS_CONN_F_ONE_PACKET) && cp->control)
+		atomic_inc(&cp->control->in_pkts);
+	else
+		atomic_inc(&cp->in_pkts);
 	ip_vs_conn_put(cp);
 	return ret;
 }
···
 	}
 }
 
+/* Generic function to create new connections for outgoing RS packets
+ *
+ * Pre-requisites for successful connection creation:
+ * 1) Virtual Service is NOT fwmark based:
+ *    In fwmark-VS actual vaddr and vport are unknown to IPVS
+ * 2) Real Server and Virtual Service were NOT configured without port:
+ *    This is to allow match of different VS to the same RS ip-addr
+ */
+struct ip_vs_conn *ip_vs_new_conn_out(struct ip_vs_service *svc,
+				      struct ip_vs_dest *dest,
+				      struct sk_buff *skb,
+				      const struct ip_vs_iphdr *iph,
+				      __be16 dport,
+				      __be16 cport)
+{
+	struct ip_vs_conn_param param;
+	struct ip_vs_conn *ct = NULL, *cp = NULL;
+	const union nf_inet_addr *vaddr, *daddr, *caddr;
+	union nf_inet_addr snet;
+	__be16 vport;
+	unsigned int flags;
+
+	EnterFunction(12);
+	vaddr = &svc->addr;
+	vport = svc->port;
+	daddr = &iph->saddr;
+	caddr = &iph->daddr;
+
+	/* check pre-requisites are satisfied */
+	if (svc->fwmark)
+		return NULL;
+	if (!vport || !dport)
+		return NULL;
+
+	/* for persistent service first create connection template */
+	if (svc->flags & IP_VS_SVC_F_PERSISTENT) {
+		/* apply netmask the same way ingress-side does */
+#ifdef CONFIG_IP_VS_IPV6
+		if (svc->af == AF_INET6)
+			ipv6_addr_prefix(&snet.in6, &caddr->in6,
+					 (__force __u32)svc->netmask);
+		else
+#endif
+			snet.ip = caddr->ip & svc->netmask;
+		/* fill params and create template if not existent */
+		if (ip_vs_conn_fill_param_persist(svc, skb, iph->protocol,
+						  &snet, 0, vaddr,
+						  vport, &param) < 0)
+			return NULL;
+		ct = ip_vs_ct_in_get(&param);
+		if (!ct) {
+			ct = ip_vs_conn_new(&param, dest->af, daddr, dport,
+					    IP_VS_CONN_F_TEMPLATE, dest, 0);
+			if (!ct) {
+				kfree(param.pe_data);
+				return NULL;
+			}
+			ct->timeout = svc->timeout;
+		} else {
+			kfree(param.pe_data);
+		}
+	}
+
+	/* connection flags */
+	flags = ((svc->flags & IP_VS_SVC_F_ONEPACKET) &&
+		 iph->protocol == IPPROTO_UDP) ? IP_VS_CONN_F_ONE_PACKET : 0;
+	/* create connection */
+	ip_vs_conn_fill_param(svc->ipvs, svc->af, iph->protocol,
+			      caddr, cport, vaddr, vport, &param);
+	cp = ip_vs_conn_new(&param, dest->af, daddr, dport, flags, dest, 0);
+	if (!cp) {
+		if (ct)
+			ip_vs_conn_put(ct);
+		return NULL;
+	}
+	if (ct) {
+		ip_vs_control_add(cp, ct);
+		ip_vs_conn_put(ct);
+	}
+	ip_vs_conn_stats(cp, svc);
+
+	/* return connection (will be used to handle outgoing packet) */
+	IP_VS_DBG_BUF(6, "New connection RS-initiated:%c c:%s:%u v:%s:%u "
+		      "d:%s:%u conn->flags:%X conn->refcnt:%d\n",
+		      ip_vs_fwd_tag(cp),
+		      IP_VS_DBG_ADDR(cp->af, &cp->caddr), ntohs(cp->cport),
+		      IP_VS_DBG_ADDR(cp->af, &cp->vaddr), ntohs(cp->vport),
+		      IP_VS_DBG_ADDR(cp->af, &cp->daddr), ntohs(cp->dport),
+		      cp->flags, atomic_read(&cp->refcnt));
+	LeaveFunction(12);
+	return cp;
+}
+
+/* Handle outgoing packets which are considered requests initiated by
+ * real servers, so that subsequent responses from external client can be
+ * routed to the right real server.
+ * Used also for outgoing responses in OPS mode.
+ *
+ * Connection management is handled by persistent-engine specific callback.
+ */
+static struct ip_vs_conn *__ip_vs_rs_conn_out(unsigned int hooknum,
+					      struct netns_ipvs *ipvs,
+					      int af, struct sk_buff *skb,
+					      const struct ip_vs_iphdr *iph)
+{
+	struct ip_vs_dest *dest;
+	struct ip_vs_conn *cp = NULL;
+	__be16 _ports[2], *pptr;
+
+	if (hooknum == NF_INET_LOCAL_IN)
+		return NULL;
+
+	pptr = frag_safe_skb_hp(skb, iph->len,
+				sizeof(_ports), _ports, iph);
+	if (!pptr)
+		return NULL;
+
+	rcu_read_lock();
+	dest = ip_vs_find_real_service(ipvs, af, iph->protocol,
+				       &iph->saddr, pptr[0]);
+	if (dest) {
+		struct ip_vs_service *svc;
+		struct ip_vs_pe *pe;
+
+		svc = rcu_dereference(dest->svc);
+		if (svc) {
+			pe = rcu_dereference(svc->pe);
+			if (pe && pe->conn_out)
+				cp = pe->conn_out(svc, dest, skb, iph,
+						  pptr[0], pptr[1]);
+		}
+	}
+	rcu_read_unlock();
+
+	return cp;
+}
+
 /* Handle response packets: rewrite addresses and send away...
  */
 static unsigned int
···
 
 	if (likely(cp))
 		return handle_response(af, skb, pd, cp, &iph, hooknum);
+
+	/* Check for real-server-started requests */
+	if (atomic_read(&ipvs->conn_out_counter)) {
+		/* Currently only for UDP:
+		 * connection oriented protocols typically use
+		 * ephemeral ports for outgoing connections, so
+		 * related incoming responses would not match any VS
+		 */
+		if (pp->protocol == IPPROTO_UDP) {
+			cp = __ip_vs_rs_conn_out(hooknum, ipvs, af, skb, &iph);
+			if (likely(cp))
+				return handle_response(af, skb, pd, cp, &iph,
+						       hooknum);
+		}
+	}
+
 	if (sysctl_nat_icmp_send(ipvs) &&
 	    (pp->protocol == IPPROTO_TCP ||
 	     pp->protocol == IPPROTO_UDP ||
···
 
 	if (ipvs->sync_state & IP_VS_STATE_MASTER)
 		ip_vs_sync_conn(ipvs, cp, pkts);
+	else if ((cp->flags & IP_VS_CONN_F_ONE_PACKET) && cp->control)
+		/* increment is done inside ip_vs_sync_conn too */
+		atomic_inc(&cp->control->in_pkts);
 
 	ip_vs_conn_put(cp);
 	return ret;
net/netfilter/ipvs/ip_vs_ctl.c  +45 -1
···
 	return false;
 }
 
+/* Find real service record by <proto,addr,port>.
+ * In case of multiple records with the same <proto,addr,port>, only
+ * the first found record is returned.
+ *
+ * To be called under RCU lock.
+ */
+struct ip_vs_dest *ip_vs_find_real_service(struct netns_ipvs *ipvs, int af,
+					   __u16 protocol,
+					   const union nf_inet_addr *daddr,
+					   __be16 dport)
+{
+	unsigned int hash;
+	struct ip_vs_dest *dest;
+
+	/* Check for "full" addressed entries */
+	hash = ip_vs_rs_hashkey(af, daddr, dport);
+
+	hlist_for_each_entry_rcu(dest, &ipvs->rs_table[hash], d_list) {
+		if (dest->port == dport &&
+		    dest->af == af &&
+		    ip_vs_addr_equal(af, &dest->addr, daddr) &&
+		    (dest->protocol == protocol || dest->vfwmark)) {
+			/* HIT */
+			return dest;
+		}
+	}
+
+	return NULL;
+}
+
 /* Lookup destination by {addr,port} in the given service
  * Called under RCU lock.
  */
···
 		atomic_inc(&ipvs->ftpsvc_counter);
 	else if (svc->port == 0)
 		atomic_inc(&ipvs->nullsvc_counter);
+	if (svc->pe && svc->pe->conn_out)
+		atomic_inc(&ipvs->conn_out_counter);
 
 	ip_vs_start_estimator(ipvs, &svc->stats);
 
···
 	struct ip_vs_scheduler *sched = NULL, *old_sched;
 	struct ip_vs_pe *pe = NULL, *old_pe = NULL;
 	int ret = 0;
+	bool new_pe_conn_out, old_pe_conn_out;
 
 	/*
 	 * Lookup the scheduler, by 'u->sched_name'
···
 	svc->netmask = u->netmask;
 
 	old_pe = rcu_dereference_protected(svc->pe, 1);
-	if (pe != old_pe)
+	if (pe != old_pe) {
 		rcu_assign_pointer(svc->pe, pe);
+		/* check for optional methods in new pe */
+		new_pe_conn_out = (pe && pe->conn_out) ? true : false;
+		old_pe_conn_out = (old_pe && old_pe->conn_out) ? true : false;
+		if (new_pe_conn_out && !old_pe_conn_out)
+			atomic_inc(&svc->ipvs->conn_out_counter);
+		if (old_pe_conn_out && !new_pe_conn_out)
+			atomic_dec(&svc->ipvs->conn_out_counter);
+	}
 
 out:
 	ip_vs_scheduler_put(old_sched);
···
 
 	/* Unbind persistence engine, keep svc->pe */
 	old_pe = rcu_dereference_protected(svc->pe, 1);
+	if (old_pe && old_pe->conn_out)
+		atomic_dec(&ipvs->conn_out_counter);
 	ip_vs_pe_put(old_pe);
 
 	/*
···
 		    (unsigned long) ipvs);
 	atomic_set(&ipvs->ftpsvc_counter, 0);
 	atomic_set(&ipvs->nullsvc_counter, 0);
+	atomic_set(&ipvs->conn_out_counter, 0);
 
 	/* procfs stats */
 	ipvs->tot_stats.cpustats = alloc_percpu(struct ip_vs_cpu_stats);
net/netfilter/ipvs/ip_vs_nfct.c  +4
···
 	if (IP_VS_FWD_METHOD(cp) != IP_VS_CONN_F_MASQ)
 		return;
 
+	/* Never alter conntrack for OPS conns (no reply is expected) */
+	if (cp->flags & IP_VS_CONN_F_ONE_PACKET)
+		return;
+
 	/* Alter reply only in original direction */
 	if (CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL)
 		return;
net/netfilter/ipvs/ip_vs_pe_sip.c  +15
···
 	return cp->pe_data_len;
 }
 
+static struct ip_vs_conn *
+ip_vs_sip_conn_out(struct ip_vs_service *svc,
+		   struct ip_vs_dest *dest,
+		   struct sk_buff *skb,
+		   const struct ip_vs_iphdr *iph,
+		   __be16 dport,
+		   __be16 cport)
+{
+	if (likely(iph->protocol == IPPROTO_UDP))
+		return ip_vs_new_conn_out(svc, dest, skb, iph, dport, cport);
+	/* currently no need to handle other than UDP */
+	return NULL;
+}
+
 static struct ip_vs_pe ip_vs_sip_pe =
 {
 	.name =			"sip",
···
 	.ct_match =		ip_vs_sip_ct_match,
 	.hashkey_raw =		ip_vs_sip_hashkey_raw,
 	.show_pe_data =		ip_vs_sip_show_pe_data,
+	.conn_out =		ip_vs_sip_conn_out,
 };
 
 static int __init ip_vs_sip_init(void)
net/netfilter/nf_conntrack_core.c  +222 -193
···
 #include <net/netfilter/nf_nat.h>
 #include <net/netfilter/nf_nat_core.h>
 #include <net/netfilter/nf_nat_helper.h>
+#include <net/netns/hash.h>
 
 #define NF_CONNTRACK_VERSION	"0.5.0"
···
 __cacheline_aligned_in_smp DEFINE_SPINLOCK(nf_conntrack_expect_lock);
 EXPORT_SYMBOL_GPL(nf_conntrack_expect_lock);
 
+struct hlist_nulls_head *nf_conntrack_hash __read_mostly;
+EXPORT_SYMBOL_GPL(nf_conntrack_hash);
+
+static __read_mostly struct kmem_cache *nf_conntrack_cachep;
 static __read_mostly spinlock_t nf_conntrack_locks_all_lock;
+static __read_mostly seqcount_t nf_conntrack_generation;
 static __read_mostly bool nf_conntrack_locks_all;
 
 void nf_conntrack_lock(spinlock_t *lock) __acquires(lock)
···
 		spin_lock_nested(&nf_conntrack_locks[h1],
 				 SINGLE_DEPTH_NESTING);
 	}
-	if (read_seqcount_retry(&net->ct.generation, sequence)) {
+	if (read_seqcount_retry(&nf_conntrack_generation, sequence)) {
 		nf_conntrack_double_unlock(h1, h2);
 		return true;
 	}
···
 DEFINE_PER_CPU(struct nf_conn, nf_conntrack_untracked);
 EXPORT_PER_CPU_SYMBOL(nf_conntrack_untracked);
 
-unsigned int nf_conntrack_hash_rnd __read_mostly;
-EXPORT_SYMBOL_GPL(nf_conntrack_hash_rnd);
+static unsigned int nf_conntrack_hash_rnd __read_mostly;
 
-static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple)
+static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
+			      const struct net *net)
 {
 	unsigned int n;
+	u32 seed;
+
+	get_random_once(&nf_conntrack_hash_rnd, sizeof(nf_conntrack_hash_rnd));
 
 	/* The direction must be ignored, so we hash everything up to the
 	 * destination ports (which is a multiple of 4) and treat the last
 	 * three bytes manually.
 	 */
+	seed = nf_conntrack_hash_rnd ^ net_hash_mix(net);
 	n = (sizeof(tuple->src) + sizeof(tuple->dst.u3)) / sizeof(u32);
-	return jhash2((u32 *)tuple, n, nf_conntrack_hash_rnd ^
+	return jhash2((u32 *)tuple, n, seed ^
 		      (((__force __u16)tuple->dst.u.all << 16) |
 		      tuple->dst.protonum));
 }
 
-static u32 __hash_bucket(u32 hash, unsigned int size)
+static u32 scale_hash(u32 hash)
 {
-	return reciprocal_scale(hash, size);
+	return reciprocal_scale(hash, nf_conntrack_htable_size);
 }
 
-static u32 hash_bucket(u32 hash, const struct net *net)
+static u32 __hash_conntrack(const struct net *net,
+			    const struct nf_conntrack_tuple *tuple,
+			    unsigned int size)
 {
-	return __hash_bucket(hash, net->ct.htable_size);
+	return reciprocal_scale(hash_conntrack_raw(tuple, net), size);
 }
 
-static u_int32_t __hash_conntrack(const struct nf_conntrack_tuple *tuple,
-				  unsigned int size)
+static u32 hash_conntrack(const struct net *net,
+			  const struct nf_conntrack_tuple *tuple)
 {
-	return __hash_bucket(hash_conntrack_raw(tuple), size);
-}
-
-static inline u_int32_t hash_conntrack(const struct net *net,
-				       const struct nf_conntrack_tuple *tuple)
-{
-	return __hash_conntrack(tuple, net->ct.htable_size);
+	return scale_hash(hash_conntrack_raw(tuple, net));
 }
 
 bool
···
 	}
 	rcu_read_lock();
 	l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
-	if (l4proto && l4proto->destroy)
+	if (l4proto->destroy)
 		l4proto->destroy(ct);
 
 	rcu_read_unlock();
···
 
 	local_bh_disable();
 	do {
-		sequence = read_seqcount_begin(&net->ct.generation);
+		sequence = read_seqcount_begin(&nf_conntrack_generation);
 		hash = hash_conntrack(net,
 				      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
 		reply_hash = hash_conntrack(net,
···
 static inline bool
 nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
 		const struct nf_conntrack_tuple *tuple,
-		const struct nf_conntrack_zone *zone)
+		const struct nf_conntrack_zone *zone,
+		const struct net *net)
 {
 	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
···
 	 */
 	return nf_ct_tuple_equal(tuple, &h->tuple) &&
 	       nf_ct_zone_equal(ct, zone, NF_CT_DIRECTION(h)) &&
-	       nf_ct_is_confirmed(ct);
+	       nf_ct_is_confirmed(ct) &&
+	       net_eq(net, nf_ct_net(ct));
 }
 
 /*
···
 		    const struct nf_conntrack_tuple *tuple, u32 hash)
 {
 	struct nf_conntrack_tuple_hash *h;
+	struct hlist_nulls_head *ct_hash;
 	struct hlist_nulls_node *n;
-	unsigned int bucket = hash_bucket(hash, net);
+	unsigned int bucket, sequence;
 
-	/* Disable BHs the entire time since we normally need to disable them
-	 * at least once for the stats anyway.
-	 */
-	local_bh_disable();
 begin:
-	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_key_equal(h, tuple, zone)) {
-			NF_CT_STAT_INC(net, found);
-			local_bh_enable();
+	do {
+		sequence = read_seqcount_begin(&nf_conntrack_generation);
+		bucket = scale_hash(hash);
+		ct_hash = nf_conntrack_hash;
+	} while (read_seqcount_retry(&nf_conntrack_generation, sequence));
+
+	hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[bucket], hnnode) {
+		if (nf_ct_key_equal(h, tuple, zone, net)) {
+			NF_CT_STAT_INC_ATOMIC(net, found);
 			return h;
 		}
-		NF_CT_STAT_INC(net, searched);
+		NF_CT_STAT_INC_ATOMIC(net, searched);
 	}
 	/*
 	 * if the nulls value we got at the end of this lookup is
···
 	 * We probably met an item that was moved to another chain.
 	 */
 	if (get_nulls_value(n) != bucket) {
-		NF_CT_STAT_INC(net, search_restart);
+		NF_CT_STAT_INC_ATOMIC(net, search_restart);
 		goto begin;
 	}
-	local_bh_enable();
 
 	return NULL;
 }
···
 		     !atomic_inc_not_zero(&ct->ct_general.use)))
 		h = NULL;
 	else {
-		if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
+		if (unlikely(!nf_ct_key_equal(h, tuple, zone, net))) {
 			nf_ct_put(ct);
 			goto begin;
 		}
···
 		      const struct nf_conntrack_tuple *tuple)
 {
 	return __nf_conntrack_find_get(net, zone, tuple,
-				       hash_conntrack_raw(tuple));
+				       hash_conntrack_raw(tuple, net));
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_find_get);
 
···
 			   unsigned int hash,
 			   unsigned int reply_hash)
 {
-	struct net *net = nf_ct_net(ct);
-
 	hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
-			   &net->ct.hash[hash]);
+			   &nf_conntrack_hash[hash]);
 	hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode,
-			   &net->ct.hash[reply_hash]);
+			   &nf_conntrack_hash[reply_hash]);
 }
 
 int
···
 
 	local_bh_disable();
 	do {
-		sequence = read_seqcount_begin(&net->ct.generation);
+		sequence = read_seqcount_begin(&nf_conntrack_generation);
 		hash = hash_conntrack(net,
 				      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
 		reply_hash = hash_conntrack(net,
···
 	} while (nf_conntrack_double_lock(net, hash, reply_hash, sequence));
 
 	/* See if there's one in the list already, including reverse */
-	hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
-		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
-				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
-				     NF_CT_DIRECTION(h)))
+	hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[hash], hnnode)
+		if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
+				    zone, net))
 			goto out;
-	hlist_nulls_for_each_entry(h, n, &net->ct.hash[reply_hash], hnnode)
-		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
-				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
-				     NF_CT_DIRECTION(h)))
+
+	hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[reply_hash], hnnode)
+		if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
+				    zone, net))
 			goto out;
 
 	add_timer(&ct->timeout);
···
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_hash_check_insert);
 
+static inline void nf_ct_acct_update(struct nf_conn *ct,
+				     enum ip_conntrack_info ctinfo,
+				     unsigned int len)
+{
+	struct nf_conn_acct *acct;
+
+	acct = nf_conn_acct_find(ct);
+	if (acct) {
+		struct nf_conn_counter *counter = acct->counter;
+
+		atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
+		atomic64_add(len, &counter[CTINFO2DIR(ctinfo)].bytes);
+	}
+}
+
+static void nf_ct_acct_merge(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
+			     const struct nf_conn *loser_ct)
+{
+	struct nf_conn_acct *acct;
+
+	acct = nf_conn_acct_find(loser_ct);
+	if (acct) {
+		struct nf_conn_counter *counter = acct->counter;
+		enum ip_conntrack_info ctinfo;
+		unsigned int bytes;
+
+		/* u32 should be fine since we must have seen one packet. */
+		bytes = atomic64_read(&counter[CTINFO2DIR(ctinfo)].bytes);
+		nf_ct_acct_update(ct, ctinfo, bytes);
+	}
+}
+
+/* Resolve race on insertion if this protocol allows this. */
+static int nf_ct_resolve_clash(struct net *net, struct sk_buff *skb,
+			       enum ip_conntrack_info ctinfo,
+			       struct nf_conntrack_tuple_hash *h)
+{
+	/* This is the conntrack entry already in hashes that won race. */
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+	struct nf_conntrack_l4proto *l4proto;
+
+	l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
+	if (l4proto->allow_clash &&
+	    !nf_ct_is_dying(ct) &&
+	    atomic_inc_not_zero(&ct->ct_general.use)) {
+		nf_ct_acct_merge(ct, ctinfo, (struct nf_conn *)skb->nfct);
+		nf_conntrack_put(skb->nfct);
+		/* Assign conntrack already in hashes to this skbuff. Don't
+		 * modify skb->nfctinfo to ensure consistent stateful filtering.
+		 */
+		skb->nfct = &ct->ct_general;
+		return NF_ACCEPT;
+	}
+	NF_CT_STAT_INC(net, drop);
+	return NF_DROP;
+}
+
 /* Confirm a connection given skb; places it in hash table */
 int
 __nf_conntrack_confirm(struct sk_buff *skb)
···
 	enum ip_conntrack_info ctinfo;
 	struct net *net;
 	unsigned int sequence;
+	int ret = NF_DROP;
 
 	ct = nf_ct_get(skb, &ctinfo);
 	net = nf_ct_net(ct);
···
 	local_bh_disable();
 
 	do {
-		sequence = read_seqcount_begin(&net->ct.generation);
+		sequence = read_seqcount_begin(&nf_conntrack_generation);
 		/* reuse the hash saved before */
 		hash = *(unsigned long *)&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev;
-		hash = hash_bucket(hash, net);
+		hash = scale_hash(hash);
 		reply_hash = hash_conntrack(net,
 					   &ct->tuplehash[IP_CT_DIR_REPLY].tuple);
 
···
 	 */
 	nf_ct_del_from_dying_or_unconfirmed_list(ct);
 
-	if (unlikely(nf_ct_is_dying(ct)))
-		goto out;
+	if (unlikely(nf_ct_is_dying(ct))) {
+		nf_ct_add_to_dying_list(ct);
+		goto dying;
+	}
 
 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
 	   not in the hash.  If there is, we lost race. */
-	hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
-		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
-				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
-				     NF_CT_DIRECTION(h)))
+	hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[hash], hnnode)
+		if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
+				    zone, net))
 			goto out;
-	hlist_nulls_for_each_entry(h, n, &net->ct.hash[reply_hash], hnnode)
-		if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
-				      &h->tuple) &&
-		    nf_ct_zone_equal(nf_ct_tuplehash_to_ctrack(h), zone,
-				     NF_CT_DIRECTION(h)))
+
+	hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[reply_hash], hnnode)
+		if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
+				    zone, net))
 			goto out;
 
 	/* Timer relative to confirmation time, not original
···
 
 out:
 	nf_ct_add_to_dying_list(ct);
+	ret = nf_ct_resolve_clash(net, skb, ctinfo, h);
+dying:
 	nf_conntrack_double_unlock(hash, reply_hash);
 	NF_CT_STAT_INC(net, insert_failed);
 	local_bh_enable();
-	return NF_DROP;
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__nf_conntrack_confirm);
 
···
 	struct net *net = nf_ct_net(ignored_conntrack);
 	const struct nf_conntrack_zone *zone;
 	struct nf_conntrack_tuple_hash *h;
+	struct hlist_nulls_head *ct_hash;
+	unsigned int hash, sequence;
 	struct hlist_nulls_node *n;
 	struct nf_conn *ct;
-	unsigned int hash;
 
 	zone = nf_ct_zone(ignored_conntrack);
-	hash = hash_conntrack(net, tuple);
 
-	/* Disable BHs the entire time since we need to disable them at
-	 * least once for the stats anyway.
-	 */
-	rcu_read_lock_bh();
-	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
+	rcu_read_lock();
+	do {
+		sequence = read_seqcount_begin(&nf_conntrack_generation);
+		hash = hash_conntrack(net, tuple);
+		ct_hash = nf_conntrack_hash;
+	} while (read_seqcount_retry(&nf_conntrack_generation, sequence));
+
+	hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[hash], hnnode) {
 		ct = nf_ct_tuplehash_to_ctrack(h);
 		if (ct != ignored_conntrack &&
-		    nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone_equal(ct, zone, NF_CT_DIRECTION(h))) {
-			NF_CT_STAT_INC(net, found);
-			rcu_read_unlock_bh();
+		    nf_ct_key_equal(h, tuple, zone, net)) {
+			NF_CT_STAT_INC_ATOMIC(net, found);
+			rcu_read_unlock();
 			return 1;
 		}
-		NF_CT_STAT_INC(net, searched);
+		NF_CT_STAT_INC_ATOMIC(net, searched);
 	}
-	rcu_read_unlock_bh();
+	rcu_read_unlock();
 
 	return 0;
 }
···
 {
 	/* Use oldest entry, which is roughly LRU */
 	struct nf_conntrack_tuple_hash *h;
-	struct nf_conn *ct = NULL, *tmp;
+	struct nf_conn *tmp;
 	struct hlist_nulls_node *n;
-	unsigned int i = 0, cnt = 0;
-	int dropped = 0;
-	unsigned int hash, sequence;
+	unsigned int i, hash, sequence;
+	struct nf_conn *ct = NULL;
 	spinlock_t *lockp;
+	bool ret = false;
+
+	i = 0;
 
 	local_bh_disable();
 restart:
-	sequence = read_seqcount_begin(&net->ct.generation);
-	hash = hash_bucket(_hash, net);
-	for (; i < net->ct.htable_size; i++) {
+	sequence = read_seqcount_begin(&nf_conntrack_generation);
+	for (; i < NF_CT_EVICTION_RANGE; i++) {
+		hash = scale_hash(_hash++);
 		lockp = &nf_conntrack_locks[hash % CONNTRACK_LOCKS];
 		nf_conntrack_lock(lockp);
-		if (read_seqcount_retry(&net->ct.generation, sequence)) {
+		if (read_seqcount_retry(&nf_conntrack_generation, sequence)) {
 			spin_unlock(lockp);
 			goto restart;
 		}
-		hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash],
-					       hnnode) {
+		hlist_nulls_for_each_entry_rcu(h, n, &nf_conntrack_hash[hash],
+					       hnnode) {
 			tmp = nf_ct_tuplehash_to_ctrack(h);
-			if (!test_bit(IPS_ASSURED_BIT, &tmp->status) &&
-			    !nf_ct_is_dying(tmp) &&
-			    atomic_inc_not_zero(&tmp->ct_general.use)) {
+
+			if (test_bit(IPS_ASSURED_BIT, &tmp->status) ||
+			    !net_eq(nf_ct_net(tmp), net) ||
+			    nf_ct_is_dying(tmp))
+				continue;
+
+			if (atomic_inc_not_zero(&tmp->ct_general.use)) {
 				ct = tmp;
 				break;
 			}
-			cnt++;
 		}
 
-		hash = (hash + 1) % net->ct.htable_size;
 		spin_unlock(lockp);
-
-		if (ct || cnt >= NF_CT_EVICTION_RANGE)
+		if (ct)
 			break;
-
 	}
+
 	local_bh_enable();
 
 	if (!ct)
-		return dropped;
+		return false;
 
-	if (del_timer(&ct->timeout)) {
+	/* kill only if in same netns -- might have moved due to
+	 * SLAB_DESTROY_BY_RCU rules
+	 */
+	if (net_eq(nf_ct_net(ct), net) && del_timer(&ct->timeout)) {
 		if (nf_ct_delete(ct, 0, 0)) {
-			dropped = 1;
 			NF_CT_STAT_INC_ATOMIC(net, early_drop);
+			ret = true;
 		}
 	}
+
 	nf_ct_put(ct);
-	return dropped;
-}
-
-void init_nf_conntrack_hash_rnd(void)
-{
-	unsigned int rand;
-
-	/*
-	 * Why not initialize nf_conntrack_rnd in a "init()" function ?
-	 * Because there isn't enough entropy when system initializing,
-	 * and we initialize it as late as possible.
-	 */
-	do {
-		get_random_bytes(&rand, sizeof(rand));
-	} while (!rand);
-	cmpxchg(&nf_conntrack_hash_rnd, 0, rand);
+	return ret;
 }
 
 static struct nf_conn *
···
 		     gfp_t gfp, u32 hash)
 {
 	struct nf_conn *ct;
-
-	if (unlikely(!nf_conntrack_hash_rnd)) {
-		init_nf_conntrack_hash_rnd();
-		/* recompute the hash as nf_conntrack_hash_rnd is initialized */
-		hash = hash_conntrack_raw(orig);
-	}
 
 	/* We don't want any race condition at early drop stage */
 	atomic_inc(&net->ct.count);
···
 	 * Do not use kmem_cache_zalloc(), as this cache uses
 	 * SLAB_DESTROY_BY_RCU.
 	 */
-	ct = kmem_cache_alloc(net->ct.nf_conntrack_cachep, gfp);
+	ct = kmem_cache_alloc(nf_conntrack_cachep, gfp);
 	if (ct == NULL)
 		goto out;
 
···
 	atomic_set(&ct->ct_general.use, 0);
 	return ct;
 out_free:
-	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
+	kmem_cache_free(nf_conntrack_cachep, ct);
 out:
 	atomic_dec(&net->ct.count);
 	return ERR_PTR(-ENOMEM);
···
 
 	nf_ct_ext_destroy(ct);
 	nf_ct_ext_free(ct);
-	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
+	kmem_cache_free(nf_conntrack_cachep, ct);
 	smp_mb__before_atomic();
 	atomic_dec(&net->ct.count);
 }
···
 
 	/* look for tuple match */
 	zone = nf_ct_zone_tmpl(tmpl, skb, &tmp);
-	hash = hash_conntrack_raw(&tuple);
+	hash = hash_conntrack_raw(&tuple, net);
 	h = __nf_conntrack_find_get(net, zone, &tuple, hash);
 	if (!h) {
 		h = init_conntrack(net, tmpl, &tuple, l3proto, l4proto,
···
 	}
 
 acct:
-	if (do_acct) {
-		struct nf_conn_acct *acct;
-
-		acct = nf_conn_acct_find(ct);
-		if (acct) {
-			struct nf_conn_counter *counter = acct->counter;
-
-			atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
-			atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes);
-		}
-	}
+	if (do_acct)
+		nf_ct_acct_update(ct, ctinfo, skb->len);
 }
 EXPORT_SYMBOL_GPL(__nf_ct_refresh_acct);
 
···
 		       const struct sk_buff *skb,
 		       int do_acct)
 {
-	if (do_acct) {
-		struct nf_conn_acct *acct;
-
-		acct = nf_conn_acct_find(ct);
-		if (acct) {
-			struct nf_conn_counter *counter = acct->counter;
-
-			atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
-			atomic64_add(skb->len - skb_network_offset(skb),
-				     &counter[CTINFO2DIR(ctinfo)].bytes);
-		}
-	}
+	if (do_acct)
+		nf_ct_acct_update(ct, ctinfo, skb->len);
 
 	if (del_timer(&ct->timeout)) {
 		ct->timeout.function((unsigned long)ct);
···
 	int cpu;
 	spinlock_t *lockp;
 
-	for (; *bucket < net->ct.htable_size; (*bucket)++) {
+	for (; *bucket < nf_conntrack_htable_size; (*bucket)++) {
 		lockp = &nf_conntrack_locks[*bucket % CONNTRACK_LOCKS];
 		local_bh_disable();
 		nf_conntrack_lock(lockp);
-		if (*bucket < net->ct.htable_size) {
-			hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket], hnnode) {
+		if (*bucket < nf_conntrack_htable_size) {
+			hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[*bucket], hnnode) {
 				if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
 					continue;
 				ct = nf_ct_tuplehash_to_ctrack(h);
-				if (iter(ct, data))
+				if (net_eq(nf_ct_net(ct), net) &&
+				    iter(ct, data))
 					goto found;
 			}
 		}
···
 	unsigned int bucket = 0;
 
 	might_sleep();
+
+	if (atomic_read(&net->ct.count) == 0)
+		return;
 
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
···
 	while (untrack_refs() > 0)
 		schedule();
 
+	nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size);
+
 #ifdef CONFIG_NF_CONNTRACK_ZONES
 	nf_ct_extend_unregister(&nf_ct_zone_extend);
 #endif
···
 	}
 
 	list_for_each_entry(net, net_exit_list, exit_list) {
-		nf_ct_free_hashtable(net->ct.hash, net->ct.htable_size);
 		nf_conntrack_proto_pernet_fini(net);
 		nf_conntrack_helper_pernet_fini(net);
 		nf_conntrack_ecache_pernet_fini(net);
 		nf_conntrack_tstamp_pernet_fini(net);
 		nf_conntrack_acct_pernet_fini(net);
 		nf_conntrack_expect_pernet_fini(net);
-		kmem_cache_destroy(net->ct.nf_conntrack_cachep);
-		kfree(net->ct.slabname);
 		free_percpu(net->ct.stat);
 		free_percpu(net->ct.pcpu_lists);
 	}
···
 
 	local_bh_disable();
 	nf_conntrack_all_lock();
-	write_seqcount_begin(&init_net.ct.generation);
+	write_seqcount_begin(&nf_conntrack_generation);
 
 	/* Lookups in the old hash might happen in parallel, which means we
 	 * might get false negatives during connection lookup. New connections
 	 * created because of a false negative won't make it into the hash
 	 * though since that required taking the locks.
 	 */
 
-	for (i = 0; i < init_net.ct.htable_size; i++) {
-		while (!hlist_nulls_empty(&init_net.ct.hash[i])) {
-			h = hlist_nulls_entry(init_net.ct.hash[i].first,
-					struct nf_conntrack_tuple_hash, hnnode);
+	for (i = 0; i < nf_conntrack_htable_size; i++) {
+		while (!hlist_nulls_empty(&nf_conntrack_hash[i])) {
+			h = hlist_nulls_entry(nf_conntrack_hash[i].first,
+					      struct nf_conntrack_tuple_hash, hnnode);
 			ct = nf_ct_tuplehash_to_ctrack(h);
 			hlist_nulls_del_rcu(&h->hnnode);
-			bucket = __hash_conntrack(&h->tuple, hashsize);
+			bucket = __hash_conntrack(nf_ct_net(ct),
+						  &h->tuple, hashsize);
 			hlist_nulls_add_head_rcu(&h->hnnode, &hash[bucket]);
 		}
 	}
-	old_size = init_net.ct.htable_size;
-	old_hash = init_net.ct.hash;
+	old_size = nf_conntrack_htable_size;
+	old_hash = nf_conntrack_hash;
 
-	init_net.ct.htable_size = nf_conntrack_htable_size = hashsize;
-	init_net.ct.hash = hash;
+	nf_conntrack_hash = hash;
+	nf_conntrack_htable_size = hashsize;
 
-	write_seqcount_end(&init_net.ct.generation);
+	write_seqcount_end(&nf_conntrack_generation);
 	nf_conntrack_all_unlock();
 	local_bh_enable();
 
+	synchronize_net();
 	nf_ct_free_hashtable(old_hash, old_size);
 	return 0;
 }
···
 int nf_conntrack_init_start(void)
 {
 	int max_factor = 8;
-	int i, ret, cpu;
+	int ret = -ENOMEM;
+	int i, cpu;
+
+	seqcount_init(&nf_conntrack_generation);
 
 	for (i = 0; i < CONNTRACK_LOCKS; i++)
 		spin_lock_init(&nf_conntrack_locks[i]);
···
 	 * entries.
*/ 1723 1683 max_factor = 4; 1724 1684 } 1685 + 1686 + nf_conntrack_hash = nf_ct_alloc_hashtable(&nf_conntrack_htable_size, 1); 1687 + if (!nf_conntrack_hash) 1688 + return -ENOMEM; 1689 + 1725 1690 nf_conntrack_max = max_factor * nf_conntrack_htable_size; 1691 + 1692 + nf_conntrack_cachep = kmem_cache_create("nf_conntrack", 1693 + sizeof(struct nf_conn), 0, 1694 + SLAB_DESTROY_BY_RCU, NULL); 1695 + if (!nf_conntrack_cachep) 1696 + goto err_cachep; 1726 1697 1727 1698 printk(KERN_INFO "nf_conntrack version %s (%u buckets, %d max)\n", 1728 1699 NF_CONNTRACK_VERSION, nf_conntrack_htable_size, ··· 1811 1760 err_acct: 1812 1761 nf_conntrack_expect_fini(); 1813 1762 err_expect: 1763 + kmem_cache_destroy(nf_conntrack_cachep); 1764 + err_cachep: 1765 + nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size); 1814 1766 return ret; 1815 1767 } 1816 1768 ··· 1837 1783 int cpu; 1838 1784 1839 1785 atomic_set(&net->ct.count, 0); 1840 - seqcount_init(&net->ct.generation); 1841 1786 1842 1787 net->ct.pcpu_lists = alloc_percpu(struct ct_pcpu); 1843 1788 if (!net->ct.pcpu_lists) ··· 1854 1801 if (!net->ct.stat) 1855 1802 goto err_pcpu_lists; 1856 1803 1857 - net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%p", net); 1858 - if (!net->ct.slabname) 1859 - goto err_slabname; 1860 - 1861 - net->ct.nf_conntrack_cachep = kmem_cache_create(net->ct.slabname, 1862 - sizeof(struct nf_conn), 0, 1863 - SLAB_DESTROY_BY_RCU, NULL); 1864 - if (!net->ct.nf_conntrack_cachep) { 1865 - printk(KERN_ERR "Unable to create nf_conn slab cache\n"); 1866 - goto err_cache; 1867 - } 1868 - 1869 - net->ct.htable_size = nf_conntrack_htable_size; 1870 - net->ct.hash = nf_ct_alloc_hashtable(&net->ct.htable_size, 1); 1871 - if (!net->ct.hash) { 1872 - printk(KERN_ERR "Unable to create nf_conntrack_hash\n"); 1873 - goto err_hash; 1874 - } 1875 1804 ret = nf_conntrack_expect_pernet_init(net); 1876 1805 if (ret < 0) 1877 1806 goto err_expect; ··· 1885 1850 err_acct: 1886 1851 
nf_conntrack_expect_pernet_fini(net); 1887 1852 err_expect: 1888 - nf_ct_free_hashtable(net->ct.hash, net->ct.htable_size); 1889 - err_hash: 1890 - kmem_cache_destroy(net->ct.nf_conntrack_cachep); 1891 - err_cache: 1892 - kfree(net->ct.slabname); 1893 - err_slabname: 1894 1853 free_percpu(net->ct.stat); 1895 1854 err_pcpu_lists: 1896 1855 free_percpu(net->ct.pcpu_lists);
net/netfilter/nf_conntrack_expect.c (+45, -38)
```diff
@@
 #include <linux/moduleparam.h>
 #include <linux/export.h>
 #include <net/net_namespace.h>
+#include <net/netns/hash.h>
 
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_core.h>
@@
 unsigned int nf_ct_expect_hsize __read_mostly;
 EXPORT_SYMBOL_GPL(nf_ct_expect_hsize);
 
+struct hlist_head *nf_ct_expect_hash __read_mostly;
+EXPORT_SYMBOL_GPL(nf_ct_expect_hash);
+
 unsigned int nf_ct_expect_max __read_mostly;
 
 static struct kmem_cache *nf_ct_expect_cachep __read_mostly;
+static unsigned int nf_ct_expect_hashrnd __read_mostly;
 
 /* nf_conntrack_expect helper functions */
 void nf_ct_unlink_expect_report(struct nf_conntrack_expect *exp,
@@
 	nf_ct_expect_put(exp);
 }
 
-static unsigned int nf_ct_expect_dst_hash(const struct nf_conntrack_tuple *tuple)
+static unsigned int nf_ct_expect_dst_hash(const struct net *n, const struct nf_conntrack_tuple *tuple)
 {
-	unsigned int hash;
+	unsigned int hash, seed;
 
-	if (unlikely(!nf_conntrack_hash_rnd)) {
-		init_nf_conntrack_hash_rnd();
-	}
+	get_random_once(&nf_ct_expect_hashrnd, sizeof(nf_ct_expect_hashrnd));
+
+	seed = nf_ct_expect_hashrnd ^ net_hash_mix(n);
 
 	hash = jhash2(tuple->dst.u3.all, ARRAY_SIZE(tuple->dst.u3.all),
 		      (((tuple->dst.protonum ^ tuple->src.l3num) << 16) |
-		      (__force __u16)tuple->dst.u.all) ^ nf_conntrack_hash_rnd);
+		      (__force __u16)tuple->dst.u.all) ^ seed);
 
 	return reciprocal_scale(hash, nf_ct_expect_hsize);
+}
+
+static bool
+nf_ct_exp_equal(const struct nf_conntrack_tuple *tuple,
+		const struct nf_conntrack_expect *i,
+		const struct nf_conntrack_zone *zone,
+		const struct net *net)
+{
+	return nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) &&
+	       net_eq(net, nf_ct_net(i->master)) &&
+	       nf_ct_zone_equal_any(i->master, zone);
 }
 
 struct nf_conntrack_expect *
@@
 	if (!net->ct.expect_count)
 		return NULL;
 
-	h = nf_ct_expect_dst_hash(tuple);
-	hlist_for_each_entry_rcu(i, &net->ct.expect_hash[h], hnode) {
-		if (nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) &&
-		    nf_ct_zone_equal_any(i->master, zone))
+	h = nf_ct_expect_dst_hash(net, tuple);
+	hlist_for_each_entry_rcu(i, &nf_ct_expect_hash[h], hnode) {
+		if (nf_ct_exp_equal(tuple, i, zone, net))
 			return i;
 	}
 	return NULL;
@@
 	if (!net->ct.expect_count)
 		return NULL;
 
-	h = nf_ct_expect_dst_hash(tuple);
-	hlist_for_each_entry(i, &net->ct.expect_hash[h], hnode) {
+	h = nf_ct_expect_dst_hash(net, tuple);
+	hlist_for_each_entry(i, &nf_ct_expect_hash[h], hnode) {
 		if (!(i->flags & NF_CT_EXPECT_INACTIVE) &&
-		    nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) &&
-		    nf_ct_zone_equal_any(i->master, zone)) {
+		    nf_ct_exp_equal(tuple, i, zone, net)) {
 			exp = i;
 			break;
 		}
@@
 	}
 
 	return nf_ct_tuple_mask_cmp(&a->tuple, &b->tuple, &intersect_mask) &&
+	       net_eq(nf_ct_net(a->master), nf_ct_net(b->master)) &&
 	       nf_ct_zone_equal_any(a->master, nf_ct_zone(b->master));
 }
@@
 	return a->master == b->master && a->class == b->class &&
 	       nf_ct_tuple_equal(&a->tuple, &b->tuple) &&
 	       nf_ct_tuple_mask_equal(&a->mask, &b->mask) &&
+	       net_eq(nf_ct_net(a->master), nf_ct_net(b->master)) &&
 	       nf_ct_zone_equal_any(a->master, nf_ct_zone(b->master));
 }
@@
 	struct nf_conn_help *master_help = nfct_help(exp->master);
 	struct nf_conntrack_helper *helper;
 	struct net *net = nf_ct_exp_net(exp);
-	unsigned int h = nf_ct_expect_dst_hash(&exp->tuple);
+	unsigned int h = nf_ct_expect_dst_hash(net, &exp->tuple);
 
 	/* two references : one for hash insert, one for the timer */
 	atomic_add(2, &exp->use);
@@
 	hlist_add_head(&exp->lnode, &master_help->expectations);
 	master_help->expecting[exp->class]++;
 
-	hlist_add_head_rcu(&exp->hnode, &net->ct.expect_hash[h]);
+	hlist_add_head_rcu(&exp->hnode, &nf_ct_expect_hash[h]);
 	net->ct.expect_count++;
 
 	setup_timer(&exp->timeout, nf_ct_expectation_timed_out,
@@
 		ret = -ESHUTDOWN;
 		goto out;
 	}
-	h = nf_ct_expect_dst_hash(&expect->tuple);
-	hlist_for_each_entry_safe(i, next, &net->ct.expect_hash[h], hnode) {
+	h = nf_ct_expect_dst_hash(net, &expect->tuple);
+	hlist_for_each_entry_safe(i, next, &nf_ct_expect_hash[h], hnode) {
 		if (expect_matches(i, expect)) {
 			if (del_timer(&i->timeout)) {
 				nf_ct_unlink_expect(i);
@@
 static struct hlist_node *ct_expect_get_first(struct seq_file *seq)
 {
-	struct net *net = seq_file_net(seq);
 	struct ct_expect_iter_state *st = seq->private;
 	struct hlist_node *n;
 
 	for (st->bucket = 0; st->bucket < nf_ct_expect_hsize; st->bucket++) {
-		n = rcu_dereference(hlist_first_rcu(&net->ct.expect_hash[st->bucket]));
+		n = rcu_dereference(hlist_first_rcu(&nf_ct_expect_hash[st->bucket]));
 		if (n)
 			return n;
 	}
@@
 static struct hlist_node *ct_expect_get_next(struct seq_file *seq,
 					     struct hlist_node *head)
 {
-	struct net *net = seq_file_net(seq);
 	struct ct_expect_iter_state *st = seq->private;
 
 	head = rcu_dereference(hlist_next_rcu(head));
 	while (head == NULL) {
 		if (++st->bucket >= nf_ct_expect_hsize)
 			return NULL;
-		head = rcu_dereference(hlist_first_rcu(&net->ct.expect_hash[st->bucket]));
+		head = rcu_dereference(hlist_first_rcu(&nf_ct_expect_hash[st->bucket]));
 	}
 	return head;
 }
@@
 int nf_conntrack_expect_pernet_init(struct net *net)
 {
-	int err = -ENOMEM;
-
 	net->ct.expect_count = 0;
-	net->ct.expect_hash = nf_ct_alloc_hashtable(&nf_ct_expect_hsize, 0);
-	if (net->ct.expect_hash == NULL)
-		goto err1;
-
-	err = exp_proc_init(net);
-	if (err < 0)
-		goto err2;
-
-	return 0;
-err2:
-	nf_ct_free_hashtable(net->ct.expect_hash, nf_ct_expect_hsize);
-err1:
-	return err;
+	return exp_proc_init(net);
 }
 
 void nf_conntrack_expect_pernet_fini(struct net *net)
 {
 	exp_proc_remove(net);
-	nf_ct_free_hashtable(net->ct.expect_hash, nf_ct_expect_hsize);
 }
 
 int nf_conntrack_expect_init(void)
@@
 				0, 0, NULL);
 	if (!nf_ct_expect_cachep)
 		return -ENOMEM;
+
+	nf_ct_expect_hash = nf_ct_alloc_hashtable(&nf_ct_expect_hsize, 0);
+	if (!nf_ct_expect_hash) {
+		kmem_cache_destroy(nf_ct_expect_cachep);
+		return -ENOMEM;
+	}
+
 	return 0;
 }
@@
 {
 	rcu_barrier(); /* Wait for call_rcu() before destroy */
 	kmem_cache_destroy(nf_ct_expect_cachep);
+	nf_ct_free_hashtable(nf_ct_expect_hash, nf_ct_expect_hsize);
 }
```
net/netfilter/nf_conntrack_helper.c (+6, -6)
```diff
@@
 EXPORT_SYMBOL_GPL(nf_ct_helper_hsize);
 static unsigned int nf_ct_helper_count __read_mostly;
 
-static bool nf_ct_auto_assign_helper __read_mostly = true;
+static bool nf_ct_auto_assign_helper __read_mostly = false;
 module_param_named(nf_conntrack_helper, nf_ct_auto_assign_helper, bool, 0644);
 MODULE_PARM_DESC(nf_conntrack_helper,
-		 "Enable automatic conntrack helper assignment (default 1)");
+		 "Enable automatic conntrack helper assignment (default 0)");
 
 #ifdef CONFIG_SYSCTL
 static struct ctl_table helper_sysctl_table[] = {
@@
 	spin_lock_bh(&nf_conntrack_expect_lock);
 	for (i = 0; i < nf_ct_expect_hsize; i++) {
 		hlist_for_each_entry_safe(exp, next,
-					  &net->ct.expect_hash[i], hnode) {
+					  &nf_ct_expect_hash[i], hnode) {
 			struct nf_conn_help *help = nfct_help(exp->master);
 			if ((rcu_dereference_protected(
 					help->helper,
@@
 		spin_unlock_bh(&pcpu->lock);
 	}
 	local_bh_disable();
-	for (i = 0; i < net->ct.htable_size; i++) {
+	for (i = 0; i < nf_conntrack_htable_size; i++) {
 		nf_conntrack_lock(&nf_conntrack_locks[i % CONNTRACK_LOCKS]);
-		if (i < net->ct.htable_size) {
-			hlist_nulls_for_each_entry(h, nn, &net->ct.hash[i], hnnode)
+		if (i < nf_conntrack_htable_size) {
+			hlist_nulls_for_each_entry(h, nn, &nf_conntrack_hash[i], hnnode)
 				unhelp(h, me);
 		}
 		spin_unlock(&nf_conntrack_locks[i % CONNTRACK_LOCKS]);
```
net/netfilter/nf_conntrack_netlink.c (+22, -7)
```diff
@@
 	last = (struct nf_conn *)cb->args[1];
 
 	local_bh_disable();
-	for (; cb->args[0] < net->ct.htable_size; cb->args[0]++) {
+	for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
 restart:
 		lockp = &nf_conntrack_locks[cb->args[0] % CONNTRACK_LOCKS];
 		nf_conntrack_lock(lockp);
-		if (cb->args[0] >= net->ct.htable_size) {
+		if (cb->args[0] >= nf_conntrack_htable_size) {
 			spin_unlock(lockp);
 			goto out;
 		}
-		hlist_nulls_for_each_entry(h, n, &net->ct.hash[cb->args[0]],
-					 hnnode) {
+		hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[cb->args[0]],
+					   hnnode) {
 			if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
 				continue;
 			ct = nf_ct_tuplehash_to_ctrack(h);
+			if (!net_eq(net, nf_ct_net(ct)))
+				continue;
+
 			/* Dump entries of a given L3 protocol number.
 			 * If it is not specified, ie. l3proto == 0,
 			 * then dump everything. */
@@
 	last = (struct nf_conntrack_expect *)cb->args[1];
 	for (; cb->args[0] < nf_ct_expect_hsize; cb->args[0]++) {
 restart:
-		hlist_for_each_entry(exp, &net->ct.expect_hash[cb->args[0]],
+		hlist_for_each_entry(exp, &nf_ct_expect_hash[cb->args[0]],
 				     hnode) {
 			if (l3proto && exp->tuple.src.l3num != l3proto)
 				continue;
+
+			if (!net_eq(nf_ct_net(exp->master), net))
+				continue;
+
 			if (cb->args[1]) {
 				if (exp != last)
 					continue;
@@
 	spin_lock_bh(&nf_conntrack_expect_lock);
 	for (i = 0; i < nf_ct_expect_hsize; i++) {
 		hlist_for_each_entry_safe(exp, next,
-					  &net->ct.expect_hash[i],
+					  &nf_ct_expect_hash[i],
 					  hnode) {
+
+			if (!net_eq(nf_ct_exp_net(exp), net))
+				continue;
+
 			m_help = nfct_help(exp->master);
 			if (!strcmp(m_help->helper->name, name) &&
 			    del_timer(&exp->timeout)) {
@@
 	spin_lock_bh(&nf_conntrack_expect_lock);
 	for (i = 0; i < nf_ct_expect_hsize; i++) {
 		hlist_for_each_entry_safe(exp, next,
-					  &net->ct.expect_hash[i],
+					  &nf_ct_expect_hash[i],
 					  hnode) {
+
+			if (!net_eq(nf_ct_exp_net(exp), net))
+				continue;
+
 			if (del_timer(&exp->timeout)) {
 				nf_ct_unlink_expect_report(exp,
 						NETLINK_CB(skb).portid,
```
net/netfilter/nf_conntrack_proto_udp.c (+2)
```diff
@@
 	.l3proto		= PF_INET,
 	.l4proto		= IPPROTO_UDP,
 	.name			= "udp",
+	.allow_clash		= true,
 	.pkt_to_tuple		= udp_pkt_to_tuple,
 	.invert_tuple		= udp_invert_tuple,
 	.print_tuple		= udp_print_tuple,
@@
 	.l3proto		= PF_INET6,
 	.l4proto		= IPPROTO_UDP,
 	.name			= "udp",
+	.allow_clash		= true,
 	.pkt_to_tuple		= udp_pkt_to_tuple,
 	.invert_tuple		= udp_invert_tuple,
 	.print_tuple		= udp_print_tuple,
```
net/netfilter/nf_conntrack_proto_udplite.c (+2)
```diff
@@
 	.l3proto		= PF_INET,
 	.l4proto		= IPPROTO_UDPLITE,
 	.name			= "udplite",
+	.allow_clash		= true,
 	.pkt_to_tuple		= udplite_pkt_to_tuple,
 	.invert_tuple		= udplite_invert_tuple,
 	.print_tuple		= udplite_print_tuple,
@@
 	.l3proto		= PF_INET6,
 	.l4proto		= IPPROTO_UDPLITE,
 	.name			= "udplite",
+	.allow_clash		= true,
 	.pkt_to_tuple		= udplite_pkt_to_tuple,
 	.invert_tuple		= udplite_invert_tuple,
 	.print_tuple		= udplite_print_tuple,
```
net/netfilter/nf_conntrack_standalone.c (+5, -8)
```diff
@@
 static struct hlist_nulls_node *ct_get_first(struct seq_file *seq)
 {
-	struct net *net = seq_file_net(seq);
 	struct ct_iter_state *st = seq->private;
 	struct hlist_nulls_node *n;
 
 	for (st->bucket = 0;
-	     st->bucket < net->ct.htable_size;
+	     st->bucket < nf_conntrack_htable_size;
 	     st->bucket++) {
-		n = rcu_dereference(hlist_nulls_first_rcu(&net->ct.hash[st->bucket]));
+		n = rcu_dereference(hlist_nulls_first_rcu(&nf_conntrack_hash[st->bucket]));
 		if (!is_a_nulls(n))
 			return n;
 	}
@@
 static struct hlist_nulls_node *ct_get_next(struct seq_file *seq,
 					    struct hlist_nulls_node *head)
 {
-	struct net *net = seq_file_net(seq);
 	struct ct_iter_state *st = seq->private;
 
 	head = rcu_dereference(hlist_nulls_next_rcu(head));
 	while (is_a_nulls(head)) {
 		if (likely(get_nulls_value(head) == st->bucket)) {
-			if (++st->bucket >= net->ct.htable_size)
+			if (++st->bucket >= nf_conntrack_htable_size)
 				return NULL;
 		}
 		head = rcu_dereference(
 				hlist_nulls_first_rcu(
-					&net->ct.hash[st->bucket]));
+					&nf_conntrack_hash[st->bucket]));
 	}
 	return head;
 }
@@
 	},
 	{
 		.procname	= "nf_conntrack_buckets",
-		.data		= &init_net.ct.htable_size,
+		.data		= &nf_conntrack_htable_size,
 		.maxlen		= sizeof(unsigned int),
 		.mode		= 0444,
 		.proc_handler	= proc_dointvec,
@@
 		goto out_kmemdup;
 
 	table[1].data = &net->ct.count;
-	table[2].data = &net->ct.htable_size;
 	table[3].data = &net->ct.sysctl_checksum;
 	table[4].data = &net->ct.sysctl_log_invalid;
```
net/netfilter/nf_nat_core.c (+21, -18)
```diff
@@
 static const struct nf_nat_l4proto __rcu **nf_nat_l4protos[NFPROTO_NUMPROTO]
 						__read_mostly;
 
+static struct hlist_head *nf_nat_bysource __read_mostly;
+static unsigned int nf_nat_htable_size __read_mostly;
+static unsigned int nf_nat_hash_rnd __read_mostly;
 
 inline const struct nf_nat_l3proto *
 __nf_nat_l3proto_find(u8 family)
@@
 /* We keep an extra hash for each conntrack, for fast searching. */
 static inline unsigned int
-hash_by_src(const struct net *net, const struct nf_conntrack_tuple *tuple)
+hash_by_src(const struct net *n, const struct nf_conntrack_tuple *tuple)
 {
 	unsigned int hash;
 
+	get_random_once(&nf_nat_hash_rnd, sizeof(nf_nat_hash_rnd));
+
 	/* Original src, to ensure we map it consistently if poss. */
 	hash = jhash2((u32 *)&tuple->src, sizeof(tuple->src) / sizeof(u32),
-		      tuple->dst.protonum ^ nf_conntrack_hash_rnd);
+		      tuple->dst.protonum ^ nf_nat_hash_rnd ^ net_hash_mix(n));
 
-	return reciprocal_scale(hash, net->ct.nat_htable_size);
+	return reciprocal_scale(hash, nf_nat_htable_size);
 }
 
 /* Is this tuple already taken? (not by us) */
@@
 	const struct nf_conn_nat *nat;
 	const struct nf_conn *ct;
 
-	hlist_for_each_entry_rcu(nat, &net->ct.nat_bysource[h], bysource) {
+	hlist_for_each_entry_rcu(nat, &nf_nat_bysource[h], bysource) {
 		ct = nat->ct;
 		if (same_src(ct, tuple) &&
+		    net_eq(net, nf_ct_net(ct)) &&
 		    nf_ct_zone_equal(ct, zone, IP_CT_DIR_ORIGINAL)) {
 			/* Copy source part from reply tuple. */
 			nf_ct_invert_tuplepr(result,
@@
 	nat = nfct_nat(ct);
 	nat->ct = ct;
 	hlist_add_head_rcu(&nat->bysource,
-			   &net->ct.nat_bysource[srchash]);
+			   &nf_nat_bysource[srchash]);
 	spin_unlock_bh(&nf_nat_lock);
 }
@@
 }
 #endif
 
-static int __net_init nf_nat_net_init(struct net *net)
-{
-	/* Leave them the same for the moment. */
-	net->ct.nat_htable_size = net->ct.htable_size;
-	net->ct.nat_bysource = nf_ct_alloc_hashtable(&net->ct.nat_htable_size, 0);
-	if (!net->ct.nat_bysource)
-		return -ENOMEM;
-	return 0;
-}
-
 static void __net_exit nf_nat_net_exit(struct net *net)
 {
 	struct nf_nat_proto_clean clean = {};
 
 	nf_ct_iterate_cleanup(net, nf_nat_proto_clean, &clean, 0, 0);
-	synchronize_rcu();
-	nf_ct_free_hashtable(net->ct.nat_bysource, net->ct.nat_htable_size);
 }
 
 static struct pernet_operations nf_nat_net_ops = {
-	.init = nf_nat_net_init,
 	.exit = nf_nat_net_exit,
 };
@@
 {
 	int ret;
 
+	/* Leave them the same for the moment. */
+	nf_nat_htable_size = nf_conntrack_htable_size;
+
+	nf_nat_bysource = nf_ct_alloc_hashtable(&nf_nat_htable_size, 0);
+	if (!nf_nat_bysource)
+		return -ENOMEM;
+
 	ret = nf_ct_extend_register(&nat_extend);
 	if (ret < 0) {
+		nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
 		printk(KERN_ERR "nf_nat_core: Unable to register extension\n");
 		return ret;
 	}
@@
 	return 0;
 
 cleanup_extend:
+	nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
 	nf_ct_extend_unregister(&nat_extend);
 	return ret;
 }
@@
 	for (i = 0; i < NFPROTO_NUMPROTO; i++)
 		kfree(nf_nat_l4protos[i]);
 	synchronize_net();
+	nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
 }
 
 MODULE_LICENSE("GPL");
```
net/netfilter/nf_tables_api.c (+60, -22)
```diff
@@
 static const struct nla_policy nft_set_policy[NFTA_SET_MAX + 1] = {
 	[NFTA_SET_TABLE]		= { .type = NLA_STRING },
 	[NFTA_SET_NAME]			= { .type = NLA_STRING,
-					    .len = IFNAMSIZ - 1 },
+					    .len = NFT_SET_MAXNAMELEN - 1 },
 	[NFTA_SET_FLAGS]		= { .type = NLA_U32 },
 	[NFTA_SET_KEY_TYPE]		= { .type = NLA_U32 },
 	[NFTA_SET_KEY_LEN]		= { .type = NLA_U32 },
@@
 	unsigned long *inuse;
 	unsigned int n = 0, min = 0;
 
-	p = strnchr(name, IFNAMSIZ, '%');
+	p = strnchr(name, NFT_SET_MAXNAMELEN, '%');
 	if (p != NULL) {
 		if (p[1] != 'd' || strchr(p + 2, '%'))
 			return -EINVAL;
@@
 	struct nft_table *table;
 	struct nft_set *set;
 	struct nft_ctx ctx;
-	char name[IFNAMSIZ];
+	char name[NFT_SET_MAXNAMELEN];
 	unsigned int size;
 	bool create;
 	u64 timeout;
@@
 }
 EXPORT_SYMBOL_GPL(nft_set_elem_destroy);
 
+static int nft_setelem_parse_flags(const struct nft_set *set,
+				   const struct nlattr *attr, u32 *flags)
+{
+	if (attr == NULL)
+		return 0;
+
+	*flags = ntohl(nla_get_be32(attr));
+	if (*flags & ~NFT_SET_ELEM_INTERVAL_END)
+		return -EINVAL;
+	if (!(set->flags & NFT_SET_INTERVAL) &&
+	    *flags & NFT_SET_ELEM_INTERVAL_END)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			    const struct nlattr *attr)
 {
@@
 	struct nft_data data;
 	enum nft_registers dreg;
 	struct nft_trans *trans;
+	u32 flags = 0;
 	u64 timeout;
-	u32 flags;
 	u8 ulen;
 	int err;
@@
 
 	nft_set_ext_prepare(&tmpl);
 
-	flags = 0;
-	if (nla[NFTA_SET_ELEM_FLAGS] != NULL) {
-		flags = ntohl(nla_get_be32(nla[NFTA_SET_ELEM_FLAGS]));
-		if (flags & ~NFT_SET_ELEM_INTERVAL_END)
-			return -EINVAL;
-		if (!(set->flags & NFT_SET_INTERVAL) &&
-		    flags & NFT_SET_ELEM_INTERVAL_END)
-			return -EINVAL;
-		if (flags != 0)
-			nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
-	}
+	err = nft_setelem_parse_flags(set, nla[NFTA_SET_ELEM_FLAGS], &flags);
+	if (err < 0)
+		return err;
+	if (flags != 0)
+		nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
 
 	if (set->flags & NFT_SET_MAP) {
 		if (nla[NFTA_SET_ELEM_DATA] == NULL &&
@@
 			   const struct nlattr *attr)
 {
 	struct nlattr *nla[NFTA_SET_ELEM_MAX + 1];
+	struct nft_set_ext_tmpl tmpl;
 	struct nft_data_desc desc;
 	struct nft_set_elem elem;
+	struct nft_set_ext *ext;
 	struct nft_trans *trans;
+	u32 flags = 0;
+	void *priv;
 	int err;
 
 	err = nla_parse_nested(nla, NFTA_SET_ELEM_MAX, attr,
@@
 	if (nla[NFTA_SET_ELEM_KEY] == NULL)
 		goto err1;
 
+	nft_set_ext_prepare(&tmpl);
+
+	err = nft_setelem_parse_flags(set, nla[NFTA_SET_ELEM_FLAGS], &flags);
+	if (err < 0)
+		return err;
+	if (flags != 0)
+		nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
+
 	err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc,
 			    nla[NFTA_SET_ELEM_KEY]);
 	if (err < 0)
@@
 	if (desc.type != NFT_DATA_VALUE || desc.len != set->klen)
 		goto err2;
 
+	nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len);
+
+	err = -ENOMEM;
+	elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0,
+				      GFP_KERNEL);
+	if (elem.priv == NULL)
+		goto err2;
+
+	ext = nft_set_elem_ext(set, elem.priv);
+	if (flags)
+		*nft_set_ext_flags(ext) = flags;
+
 	trans = nft_trans_elem_alloc(ctx, NFT_MSG_DELSETELEM, set);
 	if (trans == NULL) {
 		err = -ENOMEM;
-		goto err2;
-	}
-
-	elem.priv = set->ops->deactivate(set, &elem);
-	if (elem.priv == NULL) {
-		err = -ENOENT;
 		goto err3;
 	}
+
+	priv = set->ops->deactivate(set, &elem);
+	if (priv == NULL) {
+		err = -ENOENT;
+		goto err4;
+	}
+	kfree(elem.priv);
+	elem.priv = priv;
 
 	nft_trans_elem(trans) = elem;
 	list_add_tail(&trans->list, &ctx->net->nft.commit_list);
 	return 0;
 
-err3:
+err4:
 	kfree(trans);
+err3:
+	kfree(elem.priv);
 err2:
 	nft_data_uninit(&elem.key.val, desc.type);
 err1:
```
net/netfilter/nft_ct.c (+30)
```diff
@@
 		}
 		break;
 #endif
+#ifdef CONFIG_NF_CONNTRACK_LABELS
+	case NFT_CT_LABELS:
+		nf_connlabels_replace(ct,
+				      &regs->data[priv->sreg],
+				      &regs->data[priv->sreg],
+				      NF_CT_LABELS_MAX_SIZE / sizeof(u32));
+		break;
+#endif
 	default:
 		break;
 	}
@@
 		len = FIELD_SIZEOF(struct nf_conn, mark);
 		break;
 #endif
+#ifdef CONFIG_NF_CONNTRACK_LABELS
+	case NFT_CT_LABELS:
+		if (tb[NFTA_CT_DIRECTION])
+			return -EINVAL;
+		len = NF_CT_LABELS_MAX_SIZE;
+		err = nf_connlabels_get(ctx->net, (len * BITS_PER_BYTE) - 1);
+		if (err)
+			return err;
+		break;
+#endif
 	default:
 		return -EOPNOTSUPP;
 	}
@@
 static void nft_ct_destroy(const struct nft_ctx *ctx,
 			   const struct nft_expr *expr)
 {
+	struct nft_ct *priv = nft_expr_priv(expr);
+
+	switch (priv->key) {
+#ifdef CONFIG_NF_CONNTRACK_LABELS
+	case NFT_CT_LABELS:
+		nf_connlabels_put(ctx->net);
+		break;
+#endif
+	default:
+		break;
+	}
+
 	nft_ct_l3proto_module_put(ctx->afi->family);
 }
```
net/netfilter/nft_rbtree.c (+41, -8)
```diff
@@
 	struct nft_set_ext	ext;
 };
 
+static bool nft_rbtree_interval_end(const struct nft_rbtree_elem *rbe)
+{
+	return nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) &&
+	       (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END);
+}
+
+static bool nft_rbtree_equal(const struct nft_set *set, const void *this,
+			     const struct nft_rbtree_elem *interval)
+{
+	return memcmp(this, nft_set_ext_key(&interval->ext), set->klen) == 0;
+}
 
 static bool nft_rbtree_lookup(const struct nft_set *set, const u32 *key,
 			      const struct nft_set_ext **ext)
@@
 	const struct nft_rbtree_elem *rbe, *interval = NULL;
 	const struct rb_node *parent;
 	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
+	const void *this;
 	int d;
 
 	spin_lock_bh(&nft_rbtree_lock);
@@
 	while (parent != NULL) {
 		rbe = rb_entry(parent, struct nft_rbtree_elem, node);
 
-		d = memcmp(nft_set_ext_key(&rbe->ext), key, set->klen);
+		this = nft_set_ext_key(&rbe->ext);
+		d = memcmp(this, key, set->klen);
 		if (d < 0) {
 			parent = parent->rb_left;
+			/* In case of adjacent ranges, we always see the high
+			 * part of the range in first place, before the low one.
+			 * So don't update interval if the keys are equal.
+			 */
+			if (interval && nft_rbtree_equal(set, this, interval))
+				continue;
 			interval = rbe;
 		} else if (d > 0)
 			parent = parent->rb_right;
@@
 			parent = parent->rb_left;
 			continue;
 		}
-		if (nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) &&
-		    *nft_set_ext_flags(&rbe->ext) &
-		    NFT_SET_ELEM_INTERVAL_END)
+		if (nft_rbtree_interval_end(rbe))
 			goto out;
 		spin_unlock_bh(&nft_rbtree_lock);
@@
 		else if (d > 0)
 			p = &parent->rb_right;
 		else {
-			if (nft_set_elem_active(&rbe->ext, genmask))
-				return -EEXIST;
-			p = &parent->rb_left;
+			if (nft_set_elem_active(&rbe->ext, genmask)) {
+				if (nft_rbtree_interval_end(rbe) &&
+				    !nft_rbtree_interval_end(new))
+					p = &parent->rb_left;
+				else if (!nft_rbtree_interval_end(rbe) &&
+					 nft_rbtree_interval_end(new))
+					p = &parent->rb_right;
+				else
+					return -EEXIST;
+			}
 		}
 	}
 	rb_link_node(&new->node, parent, p);
@@
 {
 	const struct nft_rbtree *priv = nft_set_priv(set);
 	const struct rb_node *parent = priv->root.rb_node;
-	struct nft_rbtree_elem *rbe;
+	struct nft_rbtree_elem *rbe, *this = elem->priv;
 	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
 	int d;
@@
 		else {
 			if (!nft_set_elem_active(&rbe->ext, genmask)) {
 				parent = parent->rb_left;
 				continue;
 			}
+			if (nft_rbtree_interval_end(rbe) &&
+			    !nft_rbtree_interval_end(this)) {
+				parent = parent->rb_left;
+				continue;
+			} else if (!nft_rbtree_interval_end(rbe) &&
+				   nft_rbtree_interval_end(this)) {
+				parent = parent->rb_right;
+				continue;
+			}
 			nft_set_elem_change_active(set, &rbe->ext);
```
net/openvswitch/conntrack.c (-8)
```diff
@@
 	u8 protonum;
 
 	l3proto = __nf_ct_l3proto_find(l3num);
-	if (!l3proto) {
-		pr_debug("ovs_ct_find_existing: Can't get l3proto\n");
-		return NULL;
-	}
 	if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff,
 				 &protonum) <= 0) {
 		pr_debug("ovs_ct_find_existing: Can't get protonum\n");
 		return NULL;
 	}
 	l4proto = __nf_ct_l4proto_find(l3num, protonum);
-	if (!l4proto) {
-		pr_debug("ovs_ct_find_existing: Can't get l4proto\n");
-		return NULL;
-	}
 	if (!nf_ct_get_tuple(skb, skb_network_offset(skb), dataoff, l3num,
 			     protonum, net, &tuple, l3proto, l4proto)) {
 		pr_debug("ovs_ct_find_existing: Can't get tuple\n");
```