Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

Merge branch 'fix-isolation-of-broadcast-traffic-and-unmatched-unicast-traffic-with-macsec-offload'

Rahul Rameshbabu says:

====================
Fix isolation of broadcast traffic and unmatched unicast traffic with MACsec offload

Some device drivers support devices that enable them to annotate whether a
Rx skb refers to a packet that was processed by the MACsec offloading
functionality of the device. Logic in the Rx handling for MACsec offload
does not utilize this information to preemptively avoid forwarding to the
macsec netdev currently. Because of this, things like multicast messages or
unicast messages with an unmatched destination address such as ARP requests
are forwarded to the macsec netdev whether the message received was MACsec
encrypted or not. The goal of this patch series is to improve the Rx
handling for MACsec offload for devices capable of annotating skbs received
that were decrypted by the NIC offload for MACsec.

Here is a summary of the issue that occurs with the existing logic today.

* The current design of the MACsec offload handling path tries to use
"best guess" mechanisms for determining whether a packet associated
with the currently handled skb in the datapath was processed via HW
offload
* The best guess mechanism uses the following heuristic logic (in order of
precedence)
- Check if header destination MAC address matches MACsec netdev MAC
address -> forward to MACsec port
- Check if packet is multicast traffic -> forward to MACsec port
- MACsec security channel was able to be looked up from skb offload
context (mlx5 only) -> forward to MACsec port
* Problem: plaintext traffic can potentially solicit a MACsec encrypted
response from the offload device
- Core aspect of MACsec is that it identifies unauthorized LAN connections
and excludes them from communication
+ This behavior can be seen when not enabling offload for MACsec
- The offload behavior violates this principle in MACsec

I believe this behavior is a security bug since applications utilizing
MACsec could be exploited using this behavior, and the correct way to
resolve this is by having the hardware correctly indicate whether MACsec
offload occurred for the packet or not. In the patches in this series, I
leave a warning for when the problematic path occurs because I cannot
figure out a secure way to fix the security issue that applies to the core
MACsec offload handling in the Rx path without breaking MACsec offload for
other vendors.

Shown at the bottom is an example use case where plaintext traffic sent to
a physical port of a NIC configured for MACsec offload is unable to be
handled correctly by the software stack when the NIC provides awareness to
the kernel about whether the received packet is MACsec traffic or not. In
this specific example, plaintext ARP requests are being responded with
MACsec encrypted ARP replies (which leads to routing information being
unable to be built for the requester).

Side 1

ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.1/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 1 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:66
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip macsec add macsec0 rx sci 2 on
ip macsec add macsec0 rx sci 2 sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip address flush macsec0
ip address add 2.2.2.1/24 dev macsec0
ip link set dev macsec0 up

# macsec0 enters promiscuous mode.
# This enables all traffic received on macsec_vlan to be processed by
# the macsec offload rx datapath. This however means that traffic
# meant to be received by mlx5_1 will be incorrectly steered to
# macsec0 as well.

ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:88
ip address flush macsec_vlan
ip address add 3.3.3.1/24 dev macsec_vlan
ip link set dev macsec_vlan up

Side 2

ip link del macsec0
ip address flush mlx5_1
ip address add 1.1.1.2/24 dev mlx5_1
ip link set dev mlx5_1 up
ip link add link mlx5_1 macsec0 type macsec sci 2 encrypt on
ip link set dev macsec0 address 00:11:22:33:44:77
ip macsec offload macsec0 mac
ip macsec add macsec0 tx sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5
ip macsec add macsec0 rx sci 1 on
ip macsec add macsec0 rx sci 1 sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16
ip address flush macsec0
ip address add 2.2.2.2/24 dev macsec0
ip link set dev macsec0 up

# macsec0 enters promiscuous mode.
# This enables all traffic received on macsec_vlan to be processed by
# the macsec offload rx datapath. This however means that traffic
# meant to be received by mlx5_1 will be incorrectly steered to
# macsec0 as well.

ip link add link macsec0 name macsec_vlan type vlan id 1
ip link set dev macsec_vlan address 00:11:22:33:44:99
ip address flush macsec_vlan
ip address add 3.3.3.2/24 dev macsec_vlan
ip link set dev macsec_vlan up

Side 1

ping -I mlx5_1 1.1.1.2
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 mlx5_1: 56(84) bytes of data.
From 1.1.1.1 icmp_seq=1 Destination Host Unreachable
ping: sendmsg: No route to host
From 1.1.1.1 icmp_seq=2 Destination Host Unreachable
From 1.1.1.1 icmp_seq=3 Destination Host Unreachable

Changes:

v2->v3:
* Made dev paramater const for eth_skb_pkt_type helper as suggested by Sabrina
Dubroca <sd@queasysnail.net>
v1->v2:
* Fixed series subject to detail the issue being fixed
* Removed strange characters from cover letter
* Added comment in example that illustrates the impact involving
promiscuous mode
* Added patch for generalizing packet type detection
* Added Fixes: tags and targeting net
* Removed pointless warning in the heuristic Rx path for macsec offload
* Applied small refactor in Rx path offload to minimize scope of rx_sc
local variable

Link: https://github.com/Binary-Eater/macsec-rx-offload/blob/trunk/MACsec_violation_in_core_stack_offload_rx_handling.pdf
Link: https://lore.kernel.org/netdev/20240419213033.400467-5-rrameshbabu@nvidia.com/
Link: https://lore.kernel.org/netdev/20240419011740.333714-1-rrameshbabu@nvidia.com/
Link: https://lore.kernel.org/netdev/87r0l25y1c.fsf@nvidia.com/
Link: https://lore.kernel.org/netdev/20231116182900.46052-1-rrameshbabu@nvidia.com/
====================

Link: https://lore.kernel.org/r/20240423181319.115860-1-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+65 -21
+1
drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
··· 1640 1640 .mdo_add_secy = mlx5e_macsec_add_secy, 1641 1641 .mdo_upd_secy = mlx5e_macsec_upd_secy, 1642 1642 .mdo_del_secy = mlx5e_macsec_del_secy, 1643 + .rx_uses_md_dst = true, 1643 1644 }; 1644 1645 1645 1646 bool mlx5e_macsec_handle_tx_skb(struct mlx5e_macsec *macsec, struct sk_buff *skb)
+36 -10
drivers/net/macsec.c
··· 999 999 struct metadata_dst *md_dst; 1000 1000 struct macsec_rxh_data *rxd; 1001 1001 struct macsec_dev *macsec; 1002 + bool is_macsec_md_dst; 1002 1003 1003 1004 rcu_read_lock(); 1004 1005 rxd = macsec_data_rcu(skb->dev); 1005 1006 md_dst = skb_metadata_dst(skb); 1007 + is_macsec_md_dst = md_dst && md_dst->type == METADATA_MACSEC; 1006 1008 1007 1009 list_for_each_entry_rcu(macsec, &rxd->secys, secys) { 1008 1010 struct sk_buff *nskb; ··· 1015 1013 * the SecTAG, so we have to deduce which port to deliver to. 1016 1014 */ 1017 1015 if (macsec_is_offloaded(macsec) && netif_running(ndev)) { 1018 - struct macsec_rx_sc *rx_sc = NULL; 1016 + const struct macsec_ops *ops; 1019 1017 1020 - if (md_dst && md_dst->type == METADATA_MACSEC) 1021 - rx_sc = find_rx_sc(&macsec->secy, md_dst->u.macsec_info.sci); 1018 + ops = macsec_get_ops(macsec, NULL); 1022 1019 1023 - if (md_dst && md_dst->type == METADATA_MACSEC && !rx_sc) 1020 + if (ops->rx_uses_md_dst && !is_macsec_md_dst) 1024 1021 continue; 1025 1022 1023 + if (is_macsec_md_dst) { 1024 + struct macsec_rx_sc *rx_sc; 1025 + 1026 + /* All drivers that implement MACsec offload 1027 + * support using skb metadata destinations must 1028 + * indicate that they do so. 1029 + */ 1030 + DEBUG_NET_WARN_ON_ONCE(!ops->rx_uses_md_dst); 1031 + rx_sc = find_rx_sc(&macsec->secy, 1032 + md_dst->u.macsec_info.sci); 1033 + if (!rx_sc) 1034 + continue; 1035 + /* device indicated macsec offload occurred */ 1036 + skb->dev = ndev; 1037 + skb->pkt_type = PACKET_HOST; 1038 + eth_skb_pkt_type(skb, ndev); 1039 + ret = RX_HANDLER_ANOTHER; 1040 + goto out; 1041 + } 1042 + 1043 + /* This datapath is insecure because it is unable to 1044 + * enforce isolation of broadcast/multicast traffic and 1045 + * unicast traffic with promiscuous mode on the macsec 1046 + * netdev. Since the core stack has no mechanism to 1047 + * check that the hardware did indeed receive MACsec 1048 + * traffic, it is possible that the response handling 1049 + * done by the MACsec port was to a plaintext packet. 1050 + * This violates the MACsec protocol standard. 1051 + */ 1026 1052 if (ether_addr_equal_64bits(hdr->h_dest, 1027 1053 ndev->dev_addr)) { 1028 1054 /* exact match, divert skb to this port */ ··· 1066 1036 break; 1067 1037 1068 1038 nskb->dev = ndev; 1069 - if (ether_addr_equal_64bits(hdr->h_dest, 1070 - ndev->broadcast)) 1071 - nskb->pkt_type = PACKET_BROADCAST; 1072 - else 1073 - nskb->pkt_type = PACKET_MULTICAST; 1039 + eth_skb_pkt_type(nskb, ndev); 1074 1040 1075 1041 __netif_rx(nskb); 1076 - } else if (rx_sc || ndev->flags & IFF_PROMISC) { 1042 + } else if (ndev->flags & IFF_PROMISC) { 1077 1043 skb->dev = ndev; 1078 1044 skb->pkt_type = PACKET_HOST; 1079 1045 ret = RX_HANDLER_ANOTHER;
+25
include/linux/etherdevice.h
··· 608 608 } 609 609 610 610 /** 611 + * eth_skb_pkt_type - Assign packet type if destination address does not match 612 + * @skb: Assigned a packet type if address does not match @dev address 613 + * @dev: Network device used to compare packet address against 614 + * 615 + * If the destination MAC address of the packet does not match the network 616 + * device address, assign an appropriate packet type. 617 + */ 618 + static inline void eth_skb_pkt_type(struct sk_buff *skb, 619 + const struct net_device *dev) 620 + { 621 + const struct ethhdr *eth = eth_hdr(skb); 622 + 623 + if (unlikely(!ether_addr_equal_64bits(eth->h_dest, dev->dev_addr))) { 624 + if (unlikely(is_multicast_ether_addr_64bits(eth->h_dest))) { 625 + if (ether_addr_equal_64bits(eth->h_dest, dev->broadcast)) 626 + skb->pkt_type = PACKET_BROADCAST; 627 + else 628 + skb->pkt_type = PACKET_MULTICAST; 629 + } else { 630 + skb->pkt_type = PACKET_OTHERHOST; 631 + } 632 + } 633 + } 634 + 635 + /** 611 636 * eth_skb_pad - Pad buffer to mininum number of octets for Ethernet frame 612 637 * @skb: Buffer to pad 613 638 *
+2
include/net/macsec.h
··· 321 321 * for the TX tag 322 322 * @needed_tailroom: number of bytes reserved at the end of the sk_buff for the 323 323 * TX tag 324 + * @rx_uses_md_dst: whether MACsec device offload supports sk_buff md_dst 324 325 */ 325 326 struct macsec_ops { 326 327 /* Device wide */ ··· 353 352 struct sk_buff *skb); 354 353 unsigned int needed_headroom; 355 354 unsigned int needed_tailroom; 355 + bool rx_uses_md_dst; 356 356 }; 357 357 358 358 void macsec_pn_wrapped(struct macsec_secy *secy, struct macsec_tx_sa *tx_sa);
+1 -11
net/ethernet/eth.c
··· 164 164 eth = (struct ethhdr *)skb->data; 165 165 skb_pull_inline(skb, ETH_HLEN); 166 166 167 - if (unlikely(!ether_addr_equal_64bits(eth->h_dest, 168 - dev->dev_addr))) { 169 - if (unlikely(is_multicast_ether_addr_64bits(eth->h_dest))) { 170 - if (ether_addr_equal_64bits(eth->h_dest, dev->broadcast)) 171 - skb->pkt_type = PACKET_BROADCAST; 172 - else 173 - skb->pkt_type = PACKET_MULTICAST; 174 - } else { 175 - skb->pkt_type = PACKET_OTHERHOST; 176 - } 177 - } 167 + eth_skb_pkt_type(skb, dev); 178 168 179 169 /* 180 170 * Some variants of DSA tagging don't have an ethertype field