Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mlxsw: Add VXLAN bridge ports to same hardware domain as physical bridge ports

When hardware floods packets to bridge ports but flooding to a VXLAN bridge
port fails during encapsulation to one of the remote VTEPs, the packets are
trapped to the CPU. In such a case, the packets are marked with
skb->offload_fwd_mark, which means that the packet was L2-forwarded in
hardware. The software data path repeats the flooding, but packets that are
marked with skb->offload_fwd_mark will not be flooded by the bridge to
bridge ports that are in the same hardware domain as the ingress port.
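
The egress rule described above can be modeled in a few lines of userspace
C. This is a simplified sketch, not the kernel's implementation: the names
model_skb, model_port and model_allowed_egress are hypothetical, and the
sketch assumes that hardware domain 0 means "no hardware domain".

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's skb and bridge port. */
struct model_skb {
	bool offload_fwd_mark;	/* set when hardware already L2-forwarded the packet */
	int src_hwdom;		/* hardware domain of the ingress port */
};

struct model_port {
	int hwdom;		/* hardware domain of this port; 0 assumed to mean none */
};

/*
 * Returns true if the software data path may forward the packet to this
 * port. A marked packet is skipped only for ports that share the ingress
 * port's hardware domain.
 */
static bool model_allowed_egress(const struct model_port *p,
				 const struct model_skb *skb)
{
	return !skb->offload_fwd_mark || skb->src_hwdom != p->hwdom;
}
```

In this model, a marked packet from hardware domain 1 is blocked for a port
in domain 1 but still allowed for a port in domain 0.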

Currently, mlxsw does not add VXLAN bridge ports to the same hardware
domain as physical bridge ports despite the fact that the device is able
to forward packets to and from VXLAN tunnels in hardware. In some scenarios
(as mentioned above) this can result in remote VTEPs receiving duplicate
packets. The packets are first flooded by hardware and then, after an
encapsulation failure, flooded again to all remote VTEPs by software.

Solve this by adding VXLAN bridge ports to the same hardware domain as
physical bridge ports, so that nbp_switchdev_allowed_egress() also returns
false for VXLAN bridge ports and packets are not sent twice from the VXLAN
device.
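
The effect of the change on software flooding can be illustrated with a
small userspace C model. It is only a sketch under stated assumptions: the
port names (swp1, swp2, vxlan10), the domain numbers and the helper
model_sw_flood_count are hypothetical, hardware domain 0 is assumed to mean
"no hardware domain", and every other bridge check (such as skipping the
ingress port) is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bridge port; hwdom 0 is assumed to mean "none". */
struct model_port {
	const char *name;
	int hwdom;
};

/*
 * Counts the ports that the software data path would flood to for a packet
 * that arrived from hardware domain src_hwdom. When the packet carries the
 * forwarding mark, ports in the same hardware domain are skipped.
 */
static size_t model_sw_flood_count(const struct model_port *ports, size_t n,
				   int src_hwdom, bool offload_fwd_mark)
{
	size_t copies = 0;

	for (size_t i = 0; i < n; i++) {
		if (offload_fwd_mark && ports[i].hwdom == src_hwdom)
			continue;	/* hardware already handled this port */
		copies++;
	}
	return copies;
}

/* Before the change: the VXLAN port is outside the physical ports' domain. */
static const struct model_port ports_before[] = {
	{ "swp1", 1 }, { "swp2", 1 }, { "vxlan10", 0 },
};

/* After the change: the VXLAN port shares the physical ports' domain. */
static const struct model_port ports_after[] = {
	{ "swp1", 1 }, { "swp2", 1 }, { "vxlan10", 1 },
};
```

With a marked packet from domain 1, the model floods one software copy (to
vxlan10) for ports_before and none for ports_after, which corresponds to
the duplicate transmission that this change removes.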

switchdev_bridge_port_offload() takes vxlan_dev as a non-const pointer, so
the const qualifier is dropped from the affected prototypes. Call the
switchdev API from mlxsw_sp_bridge_vxlan_{join,leave}(), which handle the
offload configuration.

Reported-by: Vladimir Oltean <olteanv@gmail.com>
Closes: https://lore.kernel.org/all/20250210152246.4ajumdchwhvbarik@skbuf/
Reported-by: Vladyslav Mykhaliuk <vmykhaliuk@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/7279056843140fae3a72c2d204c7886b79d03899.1742224300.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by Amit Cohen and committed by Jakub Kicinski
139ae877 630e7e20

+26 -6
+2 -2
drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -661,10 +661,10 @@
 				      const struct net_device *br_dev);
 int mlxsw_sp_bridge_vxlan_join(struct mlxsw_sp *mlxsw_sp,
 			       const struct net_device *br_dev,
-			       const struct net_device *vxlan_dev, u16 vid,
+			       struct net_device *vxlan_dev, u16 vid,
 			       struct netlink_ext_ack *extack);
 void mlxsw_sp_bridge_vxlan_leave(struct mlxsw_sp *mlxsw_sp,
-				 const struct net_device *vxlan_dev);
+				 struct net_device *vxlan_dev);
 extern struct notifier_block mlxsw_sp_switchdev_notifier;
 
 /* spectrum.c */
+24 -4
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2950,22 +2950,42 @@
 
 int mlxsw_sp_bridge_vxlan_join(struct mlxsw_sp *mlxsw_sp,
 			       const struct net_device *br_dev,
-			       const struct net_device *vxlan_dev, u16 vid,
+			       struct net_device *vxlan_dev, u16 vid,
 			       struct netlink_ext_ack *extack)
 {
 	struct mlxsw_sp_bridge_device *bridge_device;
+	struct mlxsw_sp_port *mlxsw_sp_port;
+	int err;
 
 	bridge_device = mlxsw_sp_bridge_device_find(mlxsw_sp->bridge, br_dev);
 	if (WARN_ON(!bridge_device))
 		return -EINVAL;
 
-	return bridge_device->ops->vxlan_join(bridge_device, vxlan_dev, vid,
-					      extack);
+	mlxsw_sp_port = mlxsw_sp_port_dev_lower_find(bridge_device->dev);
+	if (!mlxsw_sp_port)
+		return -EINVAL;
+
+	err = bridge_device->ops->vxlan_join(bridge_device, vxlan_dev, vid,
+					     extack);
+	if (err)
+		return err;
+
+	err = switchdev_bridge_port_offload(vxlan_dev, mlxsw_sp_port->dev,
+					    NULL, NULL, NULL, false, extack);
+	if (err)
+		goto err_bridge_port_offload;
+
+	return 0;
+
+err_bridge_port_offload:
+	__mlxsw_sp_bridge_vxlan_leave(mlxsw_sp, vxlan_dev);
+	return err;
 }
 
 void mlxsw_sp_bridge_vxlan_leave(struct mlxsw_sp *mlxsw_sp,
-				 const struct net_device *vxlan_dev)
+				 struct net_device *vxlan_dev)
 {
+	switchdev_bridge_port_unoffload(vxlan_dev, NULL, NULL, NULL);
 	__mlxsw_sp_bridge_vxlan_leave(mlxsw_sp, vxlan_dev);
 }
 