Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

icmp: fix icmp_ndo_send address translation for reply direction

The icmp_ndo_send function was originally introduced to ensure proper
rate limiting when icmp_send is called by a network device driver,
where the packet's source address may have already been transformed
by SNAT.

However, the original implementation only considered the
IP_CT_DIR_ORIGINAL direction for SNAT and always replaced the packet's
source address with that of the original-direction tuple. This caused
two problems:

1. For SNAT:
Reply-direction packets were incorrectly translated using the source
address of the CT original direction, even though no translation is
required.

2. For DNAT:
Reply-direction packets were not handled at all. In DNAT, the original
direction's destination is translated. Therefore, in the reply
direction the source address must be set to the reply-direction
source, so rate limiting works as intended.

Fix this by using the connection direction to select the correct tuple
for source address translation, and adjust the pre-checks to handle
reply-direction packets in case of DNAT.
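
The tuple selection described above can be sketched as a small userspace model. This is an illustration only, not the kernel code: `struct ct_entry`, `pre_nat_saddr_old()`, `pre_nat_saddr_new()`, the `IP4()` macro and the example addresses are all hypothetical stand-ins, and the direction is passed in directly rather than derived from `ctinfo`.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the conntrack types (hypothetical names). */
enum ct_dir { CT_DIR_ORIGINAL = 0, CT_DIR_REPLY = 1, CT_DIR_MAX = 2 };

struct ct_tuple { uint32_t src; uint32_t dst; };
struct ct_entry { struct ct_tuple tuple[CT_DIR_MAX]; };

/* Build a host-order IPv4 address from four octets (illustration only). */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* SNAT: client 10.0.0.2 talks to 192.0.2.1 and is rewritten to 198.51.100.1. */
static const struct ct_entry snat_example = {
    .tuple = {
        [CT_DIR_ORIGINAL] = { IP4(10, 0, 0, 2),  IP4(192, 0, 2, 1) },
        [CT_DIR_REPLY]    = { IP4(192, 0, 2, 1), IP4(198, 51, 100, 1) },
    },
};

/* DNAT: client 203.0.113.5 targets 198.51.100.1, redirected to 10.0.0.9. */
static const struct ct_entry dnat_example = {
    .tuple = {
        [CT_DIR_ORIGINAL] = { IP4(203, 0, 113, 5), IP4(198, 51, 100, 1) },
        [CT_DIR_REPLY]    = { IP4(10, 0, 0, 9),    IP4(203, 0, 113, 5) },
    },
};

/* Behaviour before the fix: always take the original-direction source. */
static uint32_t pre_nat_saddr_old(const struct ct_entry *ct, enum ct_dir dir)
{
    (void)dir; /* direction was ignored */
    return ct->tuple[CT_DIR_ORIGINAL].src;
}

/* Behaviour after the fix: take the source of the packet's own direction. */
static uint32_t pre_nat_saddr_new(const struct ct_entry *ct, enum ct_dir dir)
{
    assert(dir < CT_DIR_MAX);
    return ct->tuple[dir].src;
}
```

In this model, a reply-direction packet on the SNAT entry keeps the server address 192.0.2.1 with the new selection, whereas the old selection would have substituted the client address 10.0.0.2. On the DNAT entry, the reply-direction packet resolves to the real backend address 10.0.0.9.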

Additionally, wrap the `ct->status` access in READ_ONCE(). This avoids
possible KCSAN reports about concurrent updates to `ct->status`.
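
As a rough userspace sketch of the adjusted pre-check: the `SKETCH_*` names, `struct sketch_conn` and both helper functions below are hypothetical. The bit positions are assumed to mirror the kernel's `IPS_SRC_NAT` and `IPS_DST_NAT` flags, and the macro only approximates `READ_ONCE()` with a volatile read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror the kernel's status bits; renamed for this sketch. */
#define SKETCH_IPS_SRC_NAT  (1UL << 4)
#define SKETCH_IPS_DST_NAT  (1UL << 5)
#define SKETCH_IPS_NAT_MASK (SKETCH_IPS_SRC_NAT | SKETCH_IPS_DST_NAT)

/* Approximation of READ_ONCE(): force a single volatile load of the field. */
#define SKETCH_READ_ONCE(x) (*(const volatile unsigned long *)&(x))

struct sketch_conn { unsigned long status; };

/* Old pre-check: skip address translation unless SNAT is set. */
static bool old_check_skips(const struct sketch_conn *ct)
{
    return !ct || !(ct->status & SKETCH_IPS_SRC_NAT);
}

/* New pre-check: skip only when neither SNAT nor DNAT is set. */
static bool new_check_skips(const struct sketch_conn *ct)
{
    return !ct || !(SKETCH_READ_ONCE(ct->status) & SKETCH_IPS_NAT_MASK);
}
```

With a connection that has only the DNAT bit set, the old check takes the early-return path and the new check does not, which is what lets reply-direction DNAT packets reach the address restoration step.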

Fixes: 0b41713b6066 ("icmp: introduce helper for nat'd source address in network device context")
Signed-off-by: Fabian Bläse <fabian@blaese.de>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Fabian Bläse and committed by Jakub Kicinski
c6dd1aa2 7000f4fa

+8 -4
+4 -2
net/ipv4/icmp.c
···
 	struct sk_buff *cloned_skb = NULL;
 	struct ip_options opts = { 0 };
 	enum ip_conntrack_info ctinfo;
+	enum ip_conntrack_dir dir;
 	struct nf_conn *ct;
 	__be32 orig_ip;
 
 	ct = nf_ct_get(skb_in, &ctinfo);
-	if (!ct || !(ct->status & IPS_SRC_NAT)) {
+	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
 		__icmp_send(skb_in, type, code, info, &opts);
 		return;
 	}
···
 		goto out;
 
 	orig_ip = ip_hdr(skb_in)->saddr;
-	ip_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.ip;
+	dir = CTINFO2DIR(ctinfo);
+	ip_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.ip;
 	__icmp_send(skb_in, type, code, info, &opts);
 	ip_hdr(skb_in)->saddr = orig_ip;
 out:
+4 -2
net/ipv6/ip6_icmp.c
···
 	struct inet6_skb_parm parm = { 0 };
 	struct sk_buff *cloned_skb = NULL;
 	enum ip_conntrack_info ctinfo;
+	enum ip_conntrack_dir dir;
 	struct in6_addr orig_ip;
 	struct nf_conn *ct;
 
 	ct = nf_ct_get(skb_in, &ctinfo);
-	if (!ct || !(ct->status & IPS_SRC_NAT)) {
+	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
 		__icmpv6_send(skb_in, type, code, info, &parm);
 		return;
 	}
···
 		goto out;
 
 	orig_ip = ipv6_hdr(skb_in)->saddr;
-	ipv6_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.in6;
+	dir = CTINFO2DIR(ctinfo);
+	ipv6_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.in6;
 	__icmpv6_send(skb_in, type, code, info, &parm);
 	ipv6_hdr(skb_in)->saddr = orig_ip;
 out: