Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

net: Fix checksum update for ILA adj-transport

During ILA address translations, the L4 checksums can be handled in
different ways. One of them, adj-transport, consist in parsing the
transport layer and updating any found checksum. This logic relies on
inet_proto_csum_replace_by_diff and produces an incorrect skb->csum when
in state CHECKSUM_COMPLETE.

This bug can be reproduced with a simple ILA to SIR mapping, assuming
packets are received with CHECKSUM_COMPLETE:

$ ip a show dev eth0
14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 62:ae:35:9e:0f:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 3333:0:0:1::c078/64 scope global
valid_lft forever preferred_lft forever
inet6 fd00:10:244:1::c078/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::60ae:35ff:fe9e:f8d/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
$ ip ila add loc_match fd00:10:244:1 loc 3333:0:0:1 \
csum-mode adj-transport ident-type luid dev eth0

Then I hit [fd00:10:244:1::c078]:8000 with a server listening only on
[3333:0:0:1::c078]:8000. With the bug, the SYN packet is dropped with
SKB_DROP_REASON_TCP_CSUM after inet_proto_csum_replace_by_diff changed
skb->csum. The translation and drop are visible on pwru [1] traces:

IFACE TUPLE FUNC
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ipv6_rcv
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ip6_rcv_core
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) nf_hook_slow
eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) inet_proto_csum_replace_by_diff
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_early_demux
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_route_input
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input_finish
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_protocol_deliver_rcu
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) raw6_local_deliver
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ipv6_raw_deliver
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_rcv
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) __skb_checksum_complete
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skb_reason(SKB_DROP_REASON_TCP_CSUM)
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_head_state
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_data
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_free_head
eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skbmem

This is happening because inet_proto_csum_replace_by_diff is updating
skb->csum when it shouldn't. The L4 checksum is updated such that it
"cancels" the IPv6 address change in terms of checksum computation, so
the impact on skb->csum is null.

Note this would be different for an IPv4 packet since three fields
would be updated: the IPv4 address, the IP checksum, and the L4
checksum. Two would cancel each other and skb->csum would still need
to be updated to take the L4 checksum change into account.

This patch fixes it by passing an ipv6 flag to
inet_proto_csum_replace_by_diff, to skip the skb->csum update if we're
in the IPv6 case. Note the behavior of the only other user of
inet_proto_csum_replace_by_diff, the BPF subsystem, is left as is in
this patch and fixed in the subsequent patch.

With the fix, using the reproduction from above, I can confirm
skb->csum is not touched by inet_proto_csum_replace_by_diff and the TCP
SYN proceeds to the application after the ILA translation.

Link: https://github.com/cilium/pwru [1]
Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/b5539869e3550d46068504feb02d37653d939c0b.1748509484.git.paul.chaignon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by

Paul Chaignon and committed by
Jakub Kicinski
6043b794 44abca1f

+7 -7
+1 -1
include/net/checksum.h
··· 152 152 const __be32 *from, const __be32 *to, 153 153 bool pseudohdr); 154 154 void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb, 155 - __wsum diff, bool pseudohdr); 155 + __wsum diff, bool pseudohdr, bool ipv6); 156 156 157 157 static __always_inline 158 158 void inet_proto_csum_replace2(__sum16 *sum, struct sk_buff *skb,
+1 -1
net/core/filter.c
··· 1987 1987 if (unlikely(from != 0)) 1988 1988 return -EINVAL; 1989 1989 1990 - inet_proto_csum_replace_by_diff(ptr, skb, to, is_pseudo); 1990 + inet_proto_csum_replace_by_diff(ptr, skb, to, is_pseudo, false); 1991 1991 break; 1992 1992 case 2: 1993 1993 inet_proto_csum_replace2(ptr, skb, from, to, is_pseudo);
+2 -2
net/core/utils.c
··· 473 473 EXPORT_SYMBOL(inet_proto_csum_replace16); 474 474 475 475 void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb, 476 - __wsum diff, bool pseudohdr) 476 + __wsum diff, bool pseudohdr, bool ipv6) 477 477 { 478 478 if (skb->ip_summed != CHECKSUM_PARTIAL) { 479 479 csum_replace_by_diff(sum, diff); 480 - if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) 480 + if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr && !ipv6) 481 481 skb->csum = ~csum_sub(diff, skb->csum); 482 482 } else if (pseudohdr) { 483 483 *sum = ~csum_fold(csum_add(diff, csum_unfold(*sum)));
+3 -3
net/ipv6/ila/ila_common.c
··· 86 86 87 87 diff = get_csum_diff(ip6h, p); 88 88 inet_proto_csum_replace_by_diff(&th->check, skb, 89 - diff, true); 89 + diff, true, true); 90 90 } 91 91 break; 92 92 case NEXTHDR_UDP: ··· 97 97 if (uh->check || skb->ip_summed == CHECKSUM_PARTIAL) { 98 98 diff = get_csum_diff(ip6h, p); 99 99 inet_proto_csum_replace_by_diff(&uh->check, skb, 100 - diff, true); 100 + diff, true, true); 101 101 if (!uh->check) 102 102 uh->check = CSUM_MANGLED_0; 103 103 } ··· 111 111 112 112 diff = get_csum_diff(ip6h, p); 113 113 inet_proto_csum_replace_by_diff(&ih->icmp6_cksum, skb, 114 - diff, true); 114 + diff, true, true); 115 115 } 116 116 break; 117 117 }