Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge

With the introduction of explicit offloading API in switchdev in commit
2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform which bridge
ports are offloaded"), we started having Ethernet switch drivers calling
directly into a function exported by net/bridge/br_switchdev.c, which is
a function exported by the bridge driver.

This means that drivers that did not have an explicit dependency on the
bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
possible to call a symbol exported by a driver that can be built as
module unless you are a module too.

There was an attempt to solve the dependency issue in the form of commit
b0e81817629a ("net: build all switchdev drivers as modules when the
bridge is a module"). Grygorii Strashko, however, says about it:

| In my opinion, the problem is a bit bigger here than just fixing the
| build :(
|
| In case, of ^cpsw the switchdev mode is kinda optional and in many
| cases (especially for testing purposes, NFS) the multi-mac mode is
| still preferable mode.
|
| There were no such tight dependency between switchdev drivers and
| bridge core before and switchdev serviced as independent, notification
| based layer between them, so ^cpsw still can be "Y" and bridge can be
| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
| will need to be set as "Y", or we will have to update drivers to
| support build with BRIDGE=n and maintain separate builds for
| networking vs non-networking testing. But is this enough? Wouldn't
| it cause 'chain reaction' required to add more and more "Y" options
| (like CONFIG_VLAN_8021Q)?
|
| PS. Just to be sure we on the same page - ARM builds will be forced
| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
| automation testing will just fail with omap2plus_defconfig.

In the light of this, it would be desirable for some configurations to
avoid dependencies between switchdev drivers and the bridge, and have
the switchdev mode as completely optional within the driver.

Arnd Bergmann also tried to write a patch which better expressed the
build time dependency for Ethernet switch drivers where the switchdev
support is optional, like cpsw/am65-cpsw, and this made the drivers
follow the bridge (compile as module if the bridge is a module) only if
the optional switchdev support in the driver was enabled in the first
place:
https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/

but this still did not solve the fact that cpsw and am65-cpsw now must
be built as modules when the bridge is a module - it just expressed
correctly that optional dependency. But the new behavior is an apparent
regression from Grygorii's perspective.

So to support the use case where the Ethernet driver is built-in,
NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
need a framework that can handle the possible absence of the bridge from
the running system, i.e. runtime bloatware as opposed to build-time
bloatware.

Luckily we already have this framework, since switchdev has been using
it extensively. Events from the bridge side are transmitted to the
driver side using notifier chains - this was originally done so that
unrelated drivers could snoop for events emitted by the bridge towards
ports that are implemented by other drivers (think of a switch driver
with LAG offload that listens for switchdev events on a bonding/team
interface that it offloads).

There are also events which are transmitted from the driver side to the
bridge side, which again are modeled using notifiers.
SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
notifying the bridge that a MAC address has been dynamically learned.
So there is a precedent we can use for modeling the new framework.

The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
that the bridge needs to do when a port becomes offloaded is blocking in
its nature: replay VLANs, MDBs etc. The calling context is indeed
blocking (we are under rtnl_mutex), but the existing switchdev
notification chain that the bridge is subscribed to is only the atomic
one. So we need to subscribe the bridge to the blocking switchdev
notification chain too.

This patch:
- keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
unchanged
- moves the implementation of switchdev_bridge_port_{,un}offload from
the bridge module into the switchdev module.
- makes everybody that is subscribed to the switchdev blocking notifier
chain "hear" offload & unoffload events
- makes the bridge driver subscribe and handle those events
- moves the bridge driver's handling of those events into 2 new
functions called br_switchdev_port_{,un}offload. These functions
contain in fact the core of the logic that was previously in
switchdev_bridge_port_{,un}offload, just that now we go through an
extra indirection layer to reach them.

Unlike all the other switchdev notification structures, the structure
used to carry the bridge port information, struct
switchdev_notifier_brport_info, does not contain a "bool handled".
This is because in the current usage pattern, we always know that a
switchdev bridge port offloading event will be handled by the bridge,
because the switchdev_bridge_port_offload() call was initiated by a
NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
couldn't have happened.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

authored by

Vladimir Oltean and committed by
David S. Miller
957e2235 9c0532f9

+184 -65
+1 -1
drivers/net/ethernet/ti/am65-cpsw-nuss.c
··· 7 7 8 8 #include <linux/clk.h> 9 9 #include <linux/etherdevice.h> 10 - #include <linux/if_bridge.h> 11 10 #include <linux/if_vlan.h> 12 11 #include <linux/interrupt.h> 13 12 #include <linux/kernel.h> ··· 27 28 #include <linux/sys_soc.h> 28 29 #include <linux/dma/ti-cppi5.h> 29 30 #include <linux/dma/k3-udma-glue.h> 31 + #include <net/switchdev.h> 30 32 31 33 #include "cpsw_ale.h" 32 34 #include "cpsw_sl.h"
+1 -1
drivers/net/ethernet/ti/cpsw_new.c
··· 11 11 #include <linux/module.h> 12 12 #include <linux/irqreturn.h> 13 13 #include <linux/interrupt.h> 14 - #include <linux/if_bridge.h> 15 14 #include <linux/if_ether.h> 16 15 #include <linux/etherdevice.h> 17 16 #include <linux/net_tstamp.h> ··· 28 29 #include <linux/kmemleak.h> 29 30 #include <linux/sys_soc.h> 30 31 32 + #include <net/switchdev.h> 31 33 #include <net/page_pool.h> 32 34 #include <net/pkt_cls.h> 33 35 #include <net/devlink.h>
-35
include/linux/if_bridge.h
··· 190 190 } 191 191 #endif 192 192 193 - #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_NET_SWITCHDEV) 194 - 195 - int switchdev_bridge_port_offload(struct net_device *brport_dev, 196 - struct net_device *dev, const void *ctx, 197 - struct notifier_block *atomic_nb, 198 - struct notifier_block *blocking_nb, 199 - bool tx_fwd_offload, 200 - struct netlink_ext_ack *extack); 201 - void switchdev_bridge_port_unoffload(struct net_device *brport_dev, 202 - const void *ctx, 203 - struct notifier_block *atomic_nb, 204 - struct notifier_block *blocking_nb); 205 - 206 - #else 207 - 208 - static inline int 209 - switchdev_bridge_port_offload(struct net_device *brport_dev, 210 - struct net_device *dev, const void *ctx, 211 - struct notifier_block *atomic_nb, 212 - struct notifier_block *blocking_nb, 213 - bool tx_fwd_offload, 214 - struct netlink_ext_ack *extack) 215 - { 216 - return -EINVAL; 217 - } 218 - 219 - static inline void 220 - switchdev_bridge_port_unoffload(struct net_device *brport_dev, 221 - const void *ctx, 222 - struct notifier_block *atomic_nb, 223 - struct notifier_block *blocking_nb) 224 - { 225 - } 226 - #endif 227 - 228 193 #endif
+46
include/net/switchdev.h
··· 180 180 181 181 typedef int switchdev_obj_dump_cb_t(struct switchdev_obj *obj); 182 182 183 + struct switchdev_brport { 184 + struct net_device *dev; 185 + const void *ctx; 186 + struct notifier_block *atomic_nb; 187 + struct notifier_block *blocking_nb; 188 + bool tx_fwd_offload; 189 + }; 190 + 183 191 enum switchdev_notifier_type { 184 192 SWITCHDEV_FDB_ADD_TO_BRIDGE = 1, 185 193 SWITCHDEV_FDB_DEL_TO_BRIDGE, ··· 205 197 SWITCHDEV_VXLAN_FDB_ADD_TO_DEVICE, 206 198 SWITCHDEV_VXLAN_FDB_DEL_TO_DEVICE, 207 199 SWITCHDEV_VXLAN_FDB_OFFLOADED, 200 + 201 + SWITCHDEV_BRPORT_OFFLOADED, 202 + SWITCHDEV_BRPORT_UNOFFLOADED, 208 203 }; 209 204 210 205 struct switchdev_notifier_info { ··· 237 226 bool handled; 238 227 }; 239 228 229 + struct switchdev_notifier_brport_info { 230 + struct switchdev_notifier_info info; /* must be first */ 231 + const struct switchdev_brport brport; 232 + }; 233 + 240 234 static inline struct net_device * 241 235 switchdev_notifier_info_to_dev(const struct switchdev_notifier_info *info) 242 236 { ··· 261 245 } 262 246 263 247 #ifdef CONFIG_NET_SWITCHDEV 248 + 249 + int switchdev_bridge_port_offload(struct net_device *brport_dev, 250 + struct net_device *dev, const void *ctx, 251 + struct notifier_block *atomic_nb, 252 + struct notifier_block *blocking_nb, 253 + bool tx_fwd_offload, 254 + struct netlink_ext_ack *extack); 255 + void switchdev_bridge_port_unoffload(struct net_device *brport_dev, 256 + const void *ctx, 257 + struct notifier_block *atomic_nb, 258 + struct notifier_block *blocking_nb); 264 259 265 260 void switchdev_deferred_process(void); 266 261 int switchdev_port_attr_set(struct net_device *dev, ··· 342 315 const struct switchdev_attr *attr, 343 316 struct netlink_ext_ack *extack)); 344 317 #else 318 + 319 + static inline int 320 + switchdev_bridge_port_offload(struct net_device *brport_dev, 321 + struct net_device *dev, const void *ctx, 322 + struct notifier_block *atomic_nb, 323 + struct notifier_block *blocking_nb, 324 + bool tx_fwd_offload, 325 + struct netlink_ext_ack *extack) 326 + { 327 + return -EOPNOTSUPP; 328 + } 329 + 330 + static inline void 331 + switchdev_bridge_port_unoffload(struct net_device *brport_dev, 332 + const void *ctx, 333 + struct notifier_block *atomic_nb, 334 + struct notifier_block *blocking_nb) 335 + { 336 + } 345 337 346 338 static inline void switchdev_deferred_process(void) 347 339 {
+50 -1
net/bridge/br.c
··· 201 201 .notifier_call = br_switchdev_event, 202 202 }; 203 203 204 + /* called under rtnl_mutex */ 205 + static int br_switchdev_blocking_event(struct notifier_block *nb, 206 + unsigned long event, void *ptr) 207 + { 208 + struct netlink_ext_ack *extack = netdev_notifier_info_to_extack(ptr); 209 + struct net_device *dev = switchdev_notifier_info_to_dev(ptr); 210 + struct switchdev_notifier_brport_info *brport_info; 211 + const struct switchdev_brport *b; 212 + struct net_bridge_port *p; 213 + int err = NOTIFY_DONE; 214 + 215 + p = br_port_get_rtnl(dev); 216 + if (!p) 217 + goto out; 218 + 219 + switch (event) { 220 + case SWITCHDEV_BRPORT_OFFLOADED: 221 + brport_info = ptr; 222 + b = &brport_info->brport; 223 + 224 + err = br_switchdev_port_offload(p, b->dev, b->ctx, 225 + b->atomic_nb, b->blocking_nb, 226 + b->tx_fwd_offload, extack); 227 + err = notifier_from_errno(err); 228 + break; 229 + case SWITCHDEV_BRPORT_UNOFFLOADED: 230 + brport_info = ptr; 231 + b = &brport_info->brport; 232 + 233 + br_switchdev_port_unoffload(p, b->ctx, b->atomic_nb, 234 + b->blocking_nb); 235 + break; 236 + } 237 + 238 + out: 239 + return err; 240 + } 241 + 242 + static struct notifier_block br_switchdev_blocking_notifier = { 243 + .notifier_call = br_switchdev_blocking_event, 244 + }; 245 + 204 246 /* br_boolopt_toggle - change user-controlled boolean option 205 247 * 206 248 * @br: bridge device ··· 397 355 if (err) 398 356 goto err_out4; 399 357 400 - err = br_netlink_init(); 358 + err = register_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); 401 359 if (err) 402 360 goto err_out5; 361 + 362 + err = br_netlink_init(); 363 + if (err) 364 + goto err_out6; 403 365 404 366 brioctl_set(br_ioctl_stub); 405 367 ··· 419 373 420 374 return 0; 421 375 376 + err_out6: 377 + unregister_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); 422 378 err_out5: 423 379 unregister_switchdev_notifier(&br_switchdev_notifier); 424 380 err_out4: ··· 440 392 { 441 393 stp_proto_unregister(&br_stp_proto); 442 394 br_netlink_fini(); 395 + unregister_switchdev_blocking_notifier(&br_switchdev_blocking_notifier); 443 396 unregister_switchdev_notifier(&br_switchdev_notifier); 444 397 unregister_netdevice_notifier(&br_device_notifier); 445 398 brioctl_set(NULL);
+29
net/bridge/br_private.h
··· 1880 1880 1881 1881 /* br_switchdev.c */ 1882 1882 #ifdef CONFIG_NET_SWITCHDEV 1883 + int br_switchdev_port_offload(struct net_bridge_port *p, 1884 + struct net_device *dev, const void *ctx, 1885 + struct notifier_block *atomic_nb, 1886 + struct notifier_block *blocking_nb, 1887 + bool tx_fwd_offload, 1888 + struct netlink_ext_ack *extack); 1889 + 1890 + void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx, 1891 + struct notifier_block *atomic_nb, 1892 + struct notifier_block *blocking_nb); 1893 + 1883 1894 bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb); 1884 1895 1885 1896 void br_switchdev_frame_set_offload_fwd_mark(struct sk_buff *skb); ··· 1919 1908 skb->offload_fwd_mark = 0; 1920 1909 } 1921 1910 #else 1911 + static inline int 1912 + br_switchdev_port_offload(struct net_bridge_port *p, 1913 + struct net_device *dev, const void *ctx, 1914 + struct notifier_block *atomic_nb, 1915 + struct notifier_block *blocking_nb, 1916 + bool tx_fwd_offload, 1917 + struct netlink_ext_ack *extack) 1918 + { 1919 + return -EOPNOTSUPP; 1920 + } 1921 + 1922 + static inline void 1923 + br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx, 1924 + struct notifier_block *atomic_nb, 1925 + struct notifier_block *blocking_nb) 1926 + { 1927 + } 1928 + 1922 1929 static inline bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb) 1923 1930 { 1924 1931 return false;
+9 -27
net/bridge/br_switchdev.c
··· 312 312 /* Let the bridge know that this port is offloaded, so that it can assign a 313 313 * switchdev hardware domain to it. 314 314 */ 315 - int switchdev_bridge_port_offload(struct net_device *brport_dev, 316 - struct net_device *dev, const void *ctx, 317 - struct notifier_block *atomic_nb, 318 - struct notifier_block *blocking_nb, 319 - bool tx_fwd_offload, 320 - struct netlink_ext_ack *extack) 315 + int br_switchdev_port_offload(struct net_bridge_port *p, 316 + struct net_device *dev, const void *ctx, 317 + struct notifier_block *atomic_nb, 318 + struct notifier_block *blocking_nb, 319 + bool tx_fwd_offload, 320 + struct netlink_ext_ack *extack) 321 321 { 322 322 struct netdev_phys_item_id ppid; 323 - struct net_bridge_port *p; 324 323 int err; 325 - 326 - ASSERT_RTNL(); 327 - 328 - p = br_port_get_rtnl(brport_dev); 329 - if (!p) 330 - return -ENODEV; 331 324 332 325 err = dev_get_port_parent_id(dev, &ppid, false); 333 326 if (err) ··· 341 348 342 349 return err; 343 350 } 344 - EXPORT_SYMBOL_GPL(switchdev_bridge_port_offload); 345 351 346 - void switchdev_bridge_port_unoffload(struct net_device *brport_dev, 347 - const void *ctx, 348 - struct notifier_block *atomic_nb, 349 - struct notifier_block *blocking_nb) 352 + void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx, 353 + struct notifier_block *atomic_nb, 354 + struct notifier_block *blocking_nb) 350 355 { 351 - struct net_bridge_port *p; 352 - 353 - ASSERT_RTNL(); 354 - 355 - p = br_port_get_rtnl(brport_dev); 356 - if (!p) 357 - return; 358 - 359 356 nbp_switchdev_unsync_objs(p, ctx, atomic_nb, blocking_nb); 360 357 361 358 nbp_switchdev_del(p); 362 359 } 363 - EXPORT_SYMBOL_GPL(switchdev_bridge_port_unoffload);
+48
net/switchdev/switchdev.c
··· 809 809 return err; 810 810 } 811 811 EXPORT_SYMBOL_GPL(switchdev_handle_port_attr_set); 812 + 813 + int switchdev_bridge_port_offload(struct net_device *brport_dev, 814 + struct net_device *dev, const void *ctx, 815 + struct notifier_block *atomic_nb, 816 + struct notifier_block *blocking_nb, 817 + bool tx_fwd_offload, 818 + struct netlink_ext_ack *extack) 819 + { 820 + struct switchdev_notifier_brport_info brport_info = { 821 + .brport = { 822 + .dev = dev, 823 + .ctx = ctx, 824 + .atomic_nb = atomic_nb, 825 + .blocking_nb = blocking_nb, 826 + .tx_fwd_offload = tx_fwd_offload, 827 + }, 828 + }; 829 + int err; 830 + 831 + ASSERT_RTNL(); 832 + 833 + err = call_switchdev_blocking_notifiers(SWITCHDEV_BRPORT_OFFLOADED, 834 + brport_dev, &brport_info.info, 835 + extack); 836 + return notifier_to_errno(err); 837 + } 838 + EXPORT_SYMBOL_GPL(switchdev_bridge_port_offload); 839 + 840 + void switchdev_bridge_port_unoffload(struct net_device *brport_dev, 841 + const void *ctx, 842 + struct notifier_block *atomic_nb, 843 + struct notifier_block *blocking_nb) 844 + { 845 + struct switchdev_notifier_brport_info brport_info = { 846 + .brport = { 847 + .ctx = ctx, 848 + .atomic_nb = atomic_nb, 849 + .blocking_nb = blocking_nb, 850 + }, 851 + }; 852 + 853 + ASSERT_RTNL(); 854 + 855 + call_switchdev_blocking_notifiers(SWITCHDEV_BRPORT_UNOFFLOADED, 856 + brport_dev, &brport_info.info, 857 + NULL); 858 + } 859 + EXPORT_SYMBOL_GPL(switchdev_bridge_port_unoffload);