Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

IB/ipoib: Ignore L3 master device

Currently, all master upper netdevices (e.g., bond, VRF) are treated
equally.

When a VRF netdevice is used over an IPoIB netdevice, the expected
netdev resolution is on the lower IPoIB device which has the IP address
assigned to it and not the VRF device.

The rdma_cm module (CMA) tries to match incoming requests to a
particular netdevice. When successful, it also validates that the return
path points to the same device by performing a routing table lookup.
Currently, the former would resolve to the VRF netdevice, while the
latter to the correct lower IPoIB netdevice, leading to failure in
rdma_cm.

Fix this by ignoring the VRF master netdevice, if it exists, and
returning the lower IPoIB device instead.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20250916111103.84069-5-edwards@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>

Authored by Vlad Dumitrescu and committed by Leon Romanovsky
42f993d3 c31e4038

+11 -10
drivers/infiniband/ulp/ipoib/ipoib_main.c
···
 }
 
 /*
- * Find the master net_device on top of the given net_device.
+ * Find the L2 master net_device on top of the given net_device.
  * @dev: base IPoIB net_device
  *
- * Returns the master net_device with a reference held, or the same net_device
- * if no master exists.
+ * Returns the L2 master net_device with reference held if the L2 master
+ * exists (such as bond netdevice), or returns same netdev with reference
+ * held when master does not exist or when L3 master (such as VRF netdev).
  */
 static struct net_device *ipoib_get_master_net_dev(struct net_device *dev)
 {
 	struct net_device *master;
 
 	rcu_read_lock();
+
 	master = netdev_master_upper_dev_get_rcu(dev);
+	if (!master || netif_is_l3_master(master))
+		master = dev;
+
 	dev_hold(master);
 	rcu_read_unlock();
 
-	if (master)
-		return master;
-
-	dev_hold(dev);
-	return dev;
+	return master;
 }
 
 struct ipoib_walk_data {
···
 	if (ret)
 		return NULL;
 
-	/* See if we can find a unique device matching the L2 parameters */
+	/* See if we can find a unique device matching the pkey and GID */
 	matches = __ipoib_get_net_dev_by_params(dev_list, port, pkey_index,
 						gid, NULL, &net_dev);
 
···
 
 	dev_put(net_dev);
 
-	/* Couldn't find a unique device with L2 parameters only. Use L3
+	/* Couldn't find a unique device with pkey and GID only. Use L3
 	 * address to uniquely match the net device */
 	matches = __ipoib_get_net_dev_by_params(dev_list, port, pkey_index,
 						gid, addr, &net_dev);
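
The effect of the new return rule can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct mock_netdev`, `get_master_old()` and `get_master_new()` are hypothetical stand-ins for `struct net_device` and for the behavior of `ipoib_get_master_net_dev()` before and after this change, and the `refcnt` field only models the effect of `dev_hold()`. RCU locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device (assumption, not the kernel type). */
struct mock_netdev {
	const char *name;
	struct mock_netdev *master;	/* master upper device, or NULL */
	bool is_l3_master;		/* true for a VRF-like device */
	int refcnt;			/* models the reference taken by dev_hold() */
};

/* Rule before the change: any master wins; otherwise the device itself. */
struct mock_netdev *get_master_old(struct mock_netdev *dev)
{
	struct mock_netdev *res;

	assert(dev != NULL);
	res = dev->master ? dev->master : dev;
	res->refcnt++;
	return res;
}

/*
 * Rule after the change: only an L2 master (such as a bond) is returned.
 * A missing master or an L3 master (such as a VRF) yields the device itself.
 */
struct mock_netdev *get_master_new(struct mock_netdev *dev)
{
	struct mock_netdev *master;

	assert(dev != NULL);
	master = dev->master;
	if (!master || master->is_l3_master)
		master = dev;

	master->refcnt++;
	return master;
}
```

With a mock IPoIB device enslaved to a VRF-like device, `get_master_old()` returns the VRF while `get_master_new()` returns the IPoIB device itself, which is the device a routing lookup would also resolve to in the scenario described in the commit message. A device under a bond-like master still resolves to the bond in both versions.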