
Merge tag 'mlx5-updates-2022-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-11-29

Misc updates for the mlx5 driver

1) Various trivial cleanups

2) From Maor Dickman, add support for trap offload with additional actions

3) From Tariq, UMR (device memory registration) cleanups;
a UMR WQE must be aligned to 64B per the device spec (not a bug fix).
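The 64B alignment rule above boils down to rounding the WQE's entry list up to the next 64-byte boundary and zero-filling the tail. A minimal sketch in plain C; the helper names here are made up (the driver itself uses MLX5_UMR_FLEX_ALIGNMENT with the kernel ALIGN() macro):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: the device spec requires the entry list in a UMR WQE
 * to occupy a multiple of 64 bytes. */
#define UMR_FLEX_ALIGN 64u

/* Round a byte count up to the next 64B boundary (power-of-two trick). */
static size_t umr_aligned_len(size_t used_bytes)
{
	return (used_bytes + UMR_FLEX_ALIGN - 1) & ~(size_t)(UMR_FLEX_ALIGN - 1);
}

/* Zero-pad the tail of the entry buffer so the device never consumes stale
 * entries past the ones actually written. Returns the padded length. */
static size_t umr_pad_to_alignment(unsigned char *buf, size_t used_bytes)
{
	size_t total = umr_aligned_len(used_bytes);

	memset(buf + used_bytes, 0, total - used_bytes);
	return total;
}
```

The zero-fill matters as much as the rounding: the hardware walks the whole aligned region, so stale bytes after the last real entry would be interpreted as entries.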

* tag 'mlx5-updates-2022-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Support devlink reload of IPsec core
net/mlx5e: TC, Add offload support for trap with additional actions
net/mlx5e: Do early return when setup vports dests for slow path flow
net/mlx5: Remove redundant check
net/mlx5e: Delete always true DMA check
net/mlx5e: Don't access directly DMA device pointer
net/mlx5e: Don't use termination table when redundant
net/mlx5: Fix orthography errors in documentation
net/mlx5: Use generic definition for UMR KLM alignment
net/mlx5: Generalize name of UMR alignment definition
net/mlx5: Remove unused UMR MTT definitions
net/mlx5e: Add padding when needed in UMR WQEs
net/mlx5: Remove unused ctx variables
net/mlx5e: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
net/mlx5e: Remove unneeded io-mapping.h #include
====================

Link: https://lore.kernel.org/r/20221130051152.479480-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+153 -135
+41 -41
Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
··· 25 25 | at build time via kernel Kconfig flags. 26 26 | Basic features, ethernet net device rx/tx offloads and XDP, are available with the most basic flags 27 27 | CONFIG_MLX5_CORE=y/m and CONFIG_MLX5_CORE_EN=y. 28 - | For the list of advanced features please see below. 28 + | For the list of advanced features, please see below. 29 29 30 30 **CONFIG_MLX5_CORE=(y/m/n)** (module mlx5_core.ko) 31 31 ··· 89 89 90 90 **CONFIG_MLX5_EN_IPSEC=(y/n)** 91 91 92 - | Enables `IPSec XFRM cryptography-offload accelaration <http://www.mellanox.com/related-docs/prod_software/Mellanox_Innova_IPsec_Ethernet_Adapter_Card_User_Manual.pdf>`_. 92 + | Enables `IPSec XFRM cryptography-offload acceleration <http://www.mellanox.com/related-docs/prod_software/Mellanox_Innova_IPsec_Ethernet_Adapter_Card_User_Manual.pdf>`_. 93 93 94 94 **CONFIG_MLX5_EN_TLS=(y/n)** 95 95 96 - | TLS cryptography-offload accelaration. 96 + | TLS cryptography-offload acceleration. 97 97 98 98 99 99 **CONFIG_MLX5_INFINIBAND=(y/n/m)** (module mlx5_ib.ko) ··· 139 139 The flow steering mode parameter controls the flow steering mode of the driver. 140 140 Two modes are supported: 141 141 1. 'dmfs' - Device managed flow steering. 142 - 2. 'smfs - Software/Driver managed flow steering. 142 + 2. 'smfs' - Software/Driver managed flow steering. 143 143 144 144 In DMFS mode, the HW steering entities are created and managed through the 145 145 Firmware. 146 146 In SMFS mode, the HW steering entities are created and managed though by 147 - the driver directly into Hardware without firmware intervention. 147 + the driver directly into hardware without firmware intervention. 148 148 149 - SMFS mode is faster and provides better rule inserstion rate compared to default DMFS mode. 149 + SMFS mode is faster and provides better rule insertion rate compared to default DMFS mode. 
150 150 151 151 User command examples: 152 152 ··· 165 165 enable_roce: RoCE enablement state 166 166 ---------------------------------- 167 167 RoCE enablement state controls driver support for RoCE traffic. 168 - When RoCE is disabled, there is no gid table, only raw ethernet QPs are supported and traffic on the well known UDP RoCE port is handled as raw ethernet traffic. 168 + When RoCE is disabled, there is no gid table, only raw ethernet QPs are supported and traffic on the well-known UDP RoCE port is handled as raw ethernet traffic. 169 169 170 - To change RoCE enablement state a user must change the driverinit cmode value and run devlink reload. 170 + To change RoCE enablement state, a user must change the driverinit cmode value and run devlink reload. 171 171 172 172 User command examples: 173 173 ··· 186 186 187 187 esw_port_metadata: Eswitch port metadata state 188 188 ---------------------------------------------- 189 - When applicable, disabling Eswitch metadata can increase packet rate 189 + When applicable, disabling eswitch metadata can increase packet rate 190 190 up to 20% depending on the use case and packet sizes. 191 191 192 192 Eswitch port metadata state controls whether to internally tag packets with ··· 253 253 ================ 254 254 mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface. 255 255 256 - A Subfunction has its own function capabilities and its own resources. This 256 + A subfunction has its own function capabilities and its own resources. This 257 257 means a subfunction has its own dedicated queues (txq, rxq, cq, eq). These 258 258 queues are neither shared nor stolen from the parent PCI function. 259 259 260 - When a subfunction is RDMA capable, it has its own QP1, GID table and rdma 260 + When a subfunction is RDMA capable, it has its own QP1, GID table, and RDMA 261 261 resources neither shared nor stolen from the parent PCI function. 
262 262 263 263 A subfunction has a dedicated window in PCI BAR space that is not shared 264 - with ther other subfunctions or the parent PCI function. This ensures that all 265 - devices (netdev, rdma, vdpa etc.) of the subfunction accesses only assigned 264 + with the other subfunctions or the parent PCI function. This ensures that all 265 + devices (netdev, rdma, vdpa, etc.) of the subfunction accesses only assigned 266 266 PCI BAR space. 267 267 268 - A Subfunction supports eswitch representation through which it supports tc 268 + A subfunction supports eswitch representation through which it supports tc 269 269 offloads. The user configures eswitch to send/receive packets from/to 270 270 the subfunction port. 271 271 272 272 Subfunctions share PCI level resources such as PCI MSI-X IRQs with 273 273 other subfunctions and/or with its parent PCI function. 274 274 275 - Example mlx5 software, system and device view:: 275 + Example mlx5 software, system, and device view:: 276 276 277 277 _______ 278 278 | admin | ··· 310 310 | (device add/del) 311 311 _____|____ ____|________ 312 312 | | | subfunction | 313 - | PCI NIC |---- activate/deactive events---->| host driver | 313 + | PCI NIC |--- activate/deactivate events--->| host driver | 314 314 |__________| | (mlx5_core) | 315 315 |_____________| 316 316 ··· 320 320 321 321 $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev 322 322 323 - - Add a devlink port of subfunction flaovur:: 323 + - Add a devlink port of subfunction flavour:: 324 324 325 325 $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 326 326 pci/0000:06:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false ··· 379 379 function: 380 380 hw_addr 00:00:00:00:00:00 381 381 382 - - Set the MAC address of the VF identified by its unique devlink port index:: 382 + - Set the MAC address of the SF identified by its unique devlink port index:: 383 383 384 384 $ devlink port function 
set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 385 385 386 386 $ devlink port show pci/0000:06:00.0/32768 387 - pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcivf pfnum 0 sfnum 88 387 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88 388 388 function: 389 389 hw_addr 00:00:00:00:88:88 390 390 391 391 SF state setup 392 392 -------------- 393 - To use the SF, the user must active the SF using the SF function state 393 + To use the SF, the user must activate the SF using the SF function state 394 394 attribute. 395 395 396 396 - Get the state of the SF identified by its unique devlink port index:: ··· 447 447 448 448 Additionally, the SF port also gets the event when the driver attaches to the 449 449 auxiliary device of the subfunction. This results in changing the operational 450 - state of the function. This provides visiblity to the user to decide when is it 450 + state of the function. This provides visibility to the user to decide when is it 451 451 safe to delete the SF port for graceful termination of the subfunction. 452 452 453 453 - Show the SF port operational state:: ··· 464 464 ----------- 465 465 The tx reporter is responsible for reporting and recovering of the following two error scenarios: 466 466 467 - - TX timeout 467 + - tx timeout 468 468 Report on kernel tx timeout detection. 469 469 Recover by searching lost interrupts. 470 - - TX error completion 470 + - tx error completion 471 471 Report on error tx completion. 472 - Recover by flushing the TX queue and reset it. 472 + Recover by flushing the tx queue and reset it. 473 473 474 - TX reporter also support on demand diagnose callback, on which it provides 474 + tx reporter also support on demand diagnose callback, on which it provides 475 475 real time information of its send queues status. 
476 476 477 477 User commands examples: ··· 491 491 ----------- 492 492 The rx reporter is responsible for reporting and recovering of the following two error scenarios: 493 493 494 - - RX queues initialization (population) timeout 495 - RX queues descriptors population on ring initialization is done in 496 - napi context via triggering an irq, in case of a failure to get 497 - the minimum amount of descriptors, a timeout would occur and it 498 - could be recoverable by polling the EQ (Event Queue). 499 - - RX completions with errors (reported by HW on interrupt context) 494 + - rx queues' initialization (population) timeout 495 + Population of rx queues' descriptors on ring initialization is done 496 + in napi context via triggering an irq. In case of a failure to get 497 + the minimum amount of descriptors, a timeout would occur, and 498 + descriptors could be recovered by polling the EQ (Event Queue). 499 + - rx completions with errors (reported by HW on interrupt context) 500 500 Report on rx completion error. 501 501 Recover (if needed) by flushing the related queue and reset it. 502 502 503 - RX reporter also supports on demand diagnose callback, on which it 504 - provides real time information of its receive queues status. 503 + rx reporter also supports on demand diagnose callback, on which it 504 + provides real time information of its receive queues' status. 505 505 506 - - Diagnose rx queues status, and corresponding completion queue:: 506 + - Diagnose rx queues' status and corresponding completion queue:: 507 507 508 508 $ devlink health diagnose pci/0000:82:00.0 reporter rx 509 509 510 - NOTE: This command has valid output only when interface is up, otherwise the command has empty output. 510 + NOTE: This command has valid output only when interface is up. Otherwise, the command has empty output. 
511 511 512 512 - Show number of rx errors indicated, number of recover flows ended successfully, 513 - is autorecover enabled and graceful period from last recover:: 513 + is autorecover enabled, and graceful period from last recover:: 514 514 515 515 $ devlink health show pci/0000:82:00.0 reporter rx 516 516 517 517 fw reporter 518 518 ----------- 519 - The fw reporter implements diagnose and dump callbacks. 519 + The fw reporter implements `diagnose` and `dump` callbacks. 520 520 It follows symptoms of fw error such as fw syndrome by triggering 521 521 fw core dump and storing it into the dump buffer. 522 522 The fw reporter diagnose command can be triggered any time by the user to check ··· 537 537 538 538 fw fatal reporter 539 539 ----------------- 540 - The fw fatal reporter implements dump and recover callbacks. 540 + The fw fatal reporter implements `dump` and `recover` callbacks. 541 541 It follows fatal errors indications by CR-space dump and recover flow. 542 542 The CR-space dump uses vsc interface which is valid even if the FW command 543 543 interface is not functional, which is the case in most FW fatal errors. ··· 552 552 553 553 $ devlink health recover pci/0000:82:00.0 reporter fw_fatal 554 554 555 - - Read FW CR-space dump if already strored or trigger new one:: 555 + - Read FW CR-space dump if already stored or trigger new one:: 556 556 557 557 $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal 558 558 ··· 561 561 mlx5 tracepoints 562 562 ================ 563 563 564 - mlx5 driver provides internal trace points for tracking and debugging using 564 + mlx5 driver provides internal tracepoints for tracking and debugging using 565 565 kernel tracepoints interfaces (refer to Documentation/trace/ftrace.rst). 566 566 567 - For the list of support mlx5 events check /sys/kernel/debug/tracing/events/mlx5/ 567 + For the list of support mlx5 events, check `/sys/kernel/debug/tracing/events/mlx5/`. 
568 568 569 569 tc and eswitch offloads tracepoints: 570 570
+1 -2
drivers/infiniband/hw/mlx5/odp.c
··· 230 230 struct ib_umem_odp *umem_odp = 231 231 container_of(mni, struct ib_umem_odp, notifier); 232 232 struct mlx5_ib_mr *mr; 233 - const u64 umr_block_mask = (MLX5_UMR_MTT_ALIGNMENT / 234 - sizeof(struct mlx5_mtt)) - 1; 233 + const u64 umr_block_mask = MLX5_UMR_MTT_NUM_ENTRIES_ALIGNMENT - 1; 235 234 u64 idx = 0, blk_start_idx = 0; 236 235 u64 invalidations = 0; 237 236 unsigned long start;
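The `umr_block_mask` pattern in the hunk above (alignment minus one, used as a bit mask) only works because the entry-count alignment is a power of two. A hedged sketch of the round-down it performs; the constant's value of 8 is an assumption here (64B flex alignment divided by an 8-byte MTT entry):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration: 64B UMR alignment / 8B per MTT entry = 8
 * entries per block. The name mirrors the kernel define but is local here. */
#define UMR_MTT_NUM_ENTRIES_ALIGNMENT 8u

/* Round an MTT index down to the start of its UMR-aligned block, the same
 * arithmetic odp.c performs with umr_block_mask. */
static uint64_t umr_blk_start(uint64_t idx)
{
	const uint64_t umr_block_mask = UMR_MTT_NUM_ENTRIES_ALIGNMENT - 1;

	return idx & ~umr_block_mask;
}
```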
+7 -7
drivers/infiniband/hw/mlx5/umr.c
··· 418 418 } 419 419 420 420 #define MLX5_MAX_UMR_CHUNK \ 421 - ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - MLX5_UMR_MTT_ALIGNMENT) 421 + ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - MLX5_UMR_FLEX_ALIGNMENT) 422 422 #define MLX5_SPARE_UMR_CHUNK 0x10000 423 423 424 424 /* ··· 428 428 */ 429 429 static void *mlx5r_umr_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask) 430 430 { 431 - const size_t xlt_chunk_align = MLX5_UMR_MTT_ALIGNMENT / ent_size; 431 + const size_t xlt_chunk_align = MLX5_UMR_FLEX_ALIGNMENT / ent_size; 432 432 size_t size; 433 433 void *res = NULL; 434 434 435 - static_assert(PAGE_SIZE % MLX5_UMR_MTT_ALIGNMENT == 0); 435 + static_assert(PAGE_SIZE % MLX5_UMR_FLEX_ALIGNMENT == 0); 436 436 437 437 /* 438 438 * MLX5_IB_UPD_XLT_ATOMIC doesn't signal an atomic context just that the ··· 666 666 } 667 667 668 668 final_size = (void *)cur_mtt - (void *)mtt; 669 - sg.length = ALIGN(final_size, MLX5_UMR_MTT_ALIGNMENT); 669 + sg.length = ALIGN(final_size, MLX5_UMR_FLEX_ALIGNMENT); 670 670 memset(cur_mtt, 0, sg.length - final_size); 671 671 mlx5r_umr_final_update_xlt(dev, &wqe, mr, &sg, flags); 672 672 ··· 690 690 int desc_size = (flags & MLX5_IB_UPD_XLT_INDIRECT) 691 691 ? 
sizeof(struct mlx5_klm) 692 692 : sizeof(struct mlx5_mtt); 693 - const int page_align = MLX5_UMR_MTT_ALIGNMENT / desc_size; 693 + const int page_align = MLX5_UMR_FLEX_ALIGNMENT / desc_size; 694 694 struct mlx5_ib_dev *dev = mr_to_mdev(mr); 695 695 struct device *ddev = &dev->mdev->pdev->dev; 696 696 const int page_mask = page_align - 1; ··· 711 711 if (WARN_ON(!mr->umem->is_odp)) 712 712 return -EINVAL; 713 713 714 - /* UMR copies MTTs in units of MLX5_UMR_MTT_ALIGNMENT bytes, 714 + /* UMR copies MTTs in units of MLX5_UMR_FLEX_ALIGNMENT bytes, 715 715 * so we need to align the offset and length accordingly 716 716 */ 717 717 if (idx & page_mask) { ··· 748 748 mlx5_odp_populate_xlt(xlt, idx, npages, mr, flags); 749 749 dma_sync_single_for_device(ddev, sg.addr, sg.length, 750 750 DMA_TO_DEVICE); 751 - sg.length = ALIGN(size_to_map, MLX5_UMR_MTT_ALIGNMENT); 751 + sg.length = ALIGN(size_to_map, MLX5_UMR_FLEX_ALIGNMENT); 752 752 753 753 if (pages_mapped + pages_iter >= pages_to_map) 754 754 mlx5r_umr_final_update_xlt(dev, &wqe, mr, &sg, flags);
-1
drivers/net/ethernet/mellanox/mlx5/core/cmd.c
··· 37 37 #include <linux/slab.h> 38 38 #include <linux/delay.h> 39 39 #include <linux/random.h> 40 - #include <linux/io-mapping.h> 41 40 #include <linux/mlx5/driver.h> 42 41 #include <linux/mlx5/eq.h> 43 42 #include <linux/debugfs.h>
+4 -4
drivers/net/ethernet/mellanox/mlx5/core/en.h
··· 103 103 * size actually used at runtime, but it's not a problem when calculating static 104 104 * array sizes. 105 105 */ 106 - #define MLX5_UMR_MAX_MTT_SPACE \ 106 + #define MLX5_UMR_MAX_FLEX_SPACE \ 107 107 (ALIGN_DOWN(MLX5_SEND_WQE_MAX_SIZE - sizeof(struct mlx5e_umr_wqe), \ 108 - MLX5_UMR_MTT_ALIGNMENT)) 108 + MLX5_UMR_FLEX_ALIGNMENT)) 109 109 #define MLX5_MPWRQ_MAX_PAGES_PER_WQE \ 110 - rounddown_pow_of_two(MLX5_UMR_MAX_MTT_SPACE / sizeof(struct mlx5_mtt)) 110 + rounddown_pow_of_two(MLX5_UMR_MAX_FLEX_SPACE / sizeof(struct mlx5_mtt)) 111 111 112 112 #define MLX5E_MAX_RQ_NUM_MTTS \ 113 113 (ALIGN_DOWN(U16_MAX, 4) * 2) /* Fits into u16 and aligned by WQEBB. */ ··· 160 160 (((wqe_size) - sizeof(struct mlx5e_umr_wqe)) / sizeof(struct mlx5_klm)) 161 161 162 162 #define MLX5E_KLM_ENTRIES_PER_WQE(wqe_size)\ 163 - ALIGN_DOWN(MLX5E_KLM_MAX_ENTRIES_PER_WQE(wqe_size), MLX5_UMR_KLM_ALIGNMENT) 163 + ALIGN_DOWN(MLX5E_KLM_MAX_ENTRIES_PER_WQE(wqe_size), MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT) 164 164 165 165 #define MLX5E_MAX_KLM_PER_WQE(mdev) \ 166 166 MLX5E_KLM_ENTRIES_PER_WQE(MLX5_SEND_WQE_BB * mlx5e_get_max_sq_aligned_wqebbs(mdev))
+2 -2
drivers/net/ethernet/mellanox/mlx5/core/en/params.c
··· 107 107 /* Keep in sync with MLX5_MPWRQ_MAX_PAGES_PER_WQE. */ 108 108 max_wqe_size = mlx5e_get_max_sq_aligned_wqebbs(mdev) * MLX5_SEND_WQE_BB; 109 109 max_pages_per_wqe = ALIGN_DOWN(max_wqe_size - sizeof(struct mlx5e_umr_wqe), 110 - MLX5_UMR_MTT_ALIGNMENT) / umr_entry_size; 110 + MLX5_UMR_FLEX_ALIGNMENT) / umr_entry_size; 111 111 max_log_mpwqe_size = ilog2(max_pages_per_wqe) + page_shift; 112 112 113 113 WARN_ON_ONCE(max_log_mpwqe_size < MLX5E_ORDER2_MAX_PACKET_MTU); ··· 146 146 u16 umr_wqe_sz; 147 147 148 148 umr_wqe_sz = sizeof(struct mlx5e_umr_wqe) + 149 - ALIGN(pages_per_wqe * umr_entry_size, MLX5_UMR_MTT_ALIGNMENT); 149 + ALIGN(pages_per_wqe * umr_entry_size, MLX5_UMR_FLEX_ALIGNMENT); 150 150 151 151 WARN_ON_ONCE(DIV_ROUND_UP(umr_wqe_sz, MLX5_SEND_WQE_DS) > MLX5_WQE_CTRL_DS_MASK); 152 152
+2 -8
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/trap.c
··· 3 3 4 4 #include "act.h" 5 5 #include "en/tc_priv.h" 6 + #include "eswitch.h" 6 7 7 8 static bool 8 9 tc_act_can_offload_trap(struct mlx5e_tc_act_parse_state *parse_state, ··· 11 10 int act_index, 12 11 struct mlx5_flow_attr *attr) 13 12 { 14 - struct netlink_ext_ack *extack = parse_state->extack; 15 - 16 - if (parse_state->flow_action->num_entries != 1) { 17 - NL_SET_ERR_MSG_MOD(extack, "action trap is supported as a sole action only"); 18 - return false; 19 - } 20 - 21 13 return true; 22 14 } 23 15 ··· 21 27 struct mlx5_flow_attr *attr) 22 28 { 23 29 attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; 24 - attr->flags |= MLX5_ATTR_FLAG_SLOW_PATH; 30 + attr->dest_ft = mlx5_eswitch_get_slow_fdb(priv->mdev->priv.eswitch); 25 31 26 32 return 0; 27 33 }
+8 -9
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
··· 345 345 kfree(sa_entry); 346 346 } 347 347 348 - int mlx5e_ipsec_init(struct mlx5e_priv *priv) 348 + void mlx5e_ipsec_init(struct mlx5e_priv *priv) 349 349 { 350 350 struct mlx5e_ipsec *ipsec; 351 - int ret; 351 + int ret = -ENOMEM; 352 352 353 353 if (!mlx5_ipsec_device_caps(priv->mdev)) { 354 354 netdev_dbg(priv->netdev, "Not an IPSec offload device\n"); 355 - return 0; 355 + return; 356 356 } 357 357 358 358 ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL); 359 359 if (!ipsec) 360 - return -ENOMEM; 360 + return; 361 361 362 362 hash_init(ipsec->sadb_rx); 363 363 spin_lock_init(&ipsec->sadb_rx_lock); 364 364 ipsec->mdev = priv->mdev; 365 365 ipsec->wq = alloc_ordered_workqueue("mlx5e_ipsec: %s", 0, 366 366 priv->netdev->name); 367 - if (!ipsec->wq) { 368 - ret = -ENOMEM; 367 + if (!ipsec->wq) 369 368 goto err_wq; 370 - } 371 369 372 370 ret = mlx5e_accel_ipsec_fs_init(ipsec); 373 371 if (ret) ··· 373 375 374 376 priv->ipsec = ipsec; 375 377 netdev_dbg(priv->netdev, "IPSec attached to netdevice\n"); 376 - return 0; 378 + return; 377 379 378 380 err_fs_init: 379 381 destroy_workqueue(ipsec->wq); 380 382 err_wq: 381 383 kfree(ipsec); 382 - return (ret != -EOPNOTSUPP) ? ret : 0; 384 + mlx5_core_err(priv->mdev, "IPSec initialization failed, %d\n", ret); 385 + return; 383 386 } 384 387 385 388 void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv)
+2 -3
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h
··· 146 146 struct mlx5e_ipsec_modify_state_work modify_work; 147 147 }; 148 148 149 - int mlx5e_ipsec_init(struct mlx5e_priv *priv); 149 + void mlx5e_ipsec_init(struct mlx5e_priv *priv); 150 150 void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv); 151 151 void mlx5e_ipsec_build_netdev(struct mlx5e_priv *priv); 152 152 ··· 174 174 return sa_entry->ipsec->mdev; 175 175 } 176 176 #else 177 - static inline int mlx5e_ipsec_init(struct mlx5e_priv *priv) 177 + static inline void mlx5e_ipsec_init(struct mlx5e_priv *priv) 178 178 { 179 - return 0; 180 179 } 181 180 182 181 static inline void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv)
+6 -6
drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c
··· 186 186 return err; 187 187 } 188 188 189 - dma_device = &mdev->pdev->dev; 189 + dma_device = mlx5_core_dma_dev(mdev); 190 190 dma_addr = dma_map_single(dma_device, umr->ctx, sizeof(umr->ctx), DMA_BIDIRECTIONAL); 191 191 err = dma_mapping_error(dma_device, dma_addr); 192 192 if (err) { ··· 1299 1299 struct mlx5_wqe_aso_ctrl_seg *aso_ctrl, 1300 1300 struct mlx5_aso_ctrl_param *param) 1301 1301 { 1302 + struct mlx5e_macsec_umr *umr = macsec_aso->umr; 1303 + 1302 1304 memset(aso_ctrl, 0, sizeof(*aso_ctrl)); 1303 - if (macsec_aso->umr->dma_addr) { 1304 - aso_ctrl->va_l = cpu_to_be32(macsec_aso->umr->dma_addr | ASO_CTRL_READ_EN); 1305 - aso_ctrl->va_h = cpu_to_be32((u64)macsec_aso->umr->dma_addr >> 32); 1306 - aso_ctrl->l_key = cpu_to_be32(macsec_aso->umr->mkey); 1307 - } 1305 + aso_ctrl->va_l = cpu_to_be32(umr->dma_addr | ASO_CTRL_READ_EN); 1306 + aso_ctrl->va_h = cpu_to_be32((u64)umr->dma_addr >> 32); 1307 + aso_ctrl->l_key = cpu_to_be32(umr->mkey); 1308 1308 1309 1309 if (!param) 1310 1310 return;
+3 -6
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
··· 208 208 u8 umr_entry_size = mlx5e_mpwrq_umr_entry_size(umr_mode); 209 209 u32 sz; 210 210 211 - sz = ALIGN(entries * umr_entry_size, MLX5_UMR_MTT_ALIGNMENT); 211 + sz = ALIGN(entries * umr_entry_size, MLX5_UMR_FLEX_ALIGNMENT); 212 212 213 213 return sz / MLX5_OCTWORD; 214 214 } ··· 5238 5238 } 5239 5239 priv->fs = fs; 5240 5240 5241 - err = mlx5e_ipsec_init(priv); 5242 - if (err) 5243 - mlx5_core_err(mdev, "IPSec initialization failed, %d\n", err); 5244 - 5245 5241 err = mlx5e_ktls_init(priv); 5246 5242 if (err) 5247 5243 mlx5_core_err(mdev, "TLS initialization failed, %d\n", err); ··· 5250 5254 { 5251 5255 mlx5e_health_destroy_reporters(priv); 5252 5256 mlx5e_ktls_cleanup(priv); 5253 - mlx5e_ipsec_cleanup(priv); 5254 5257 mlx5e_fs_cleanup(priv->fs); 5255 5258 } 5256 5259 ··· 5378 5383 int err; 5379 5384 5380 5385 mlx5e_fs_init_l2_addr(priv->fs, netdev); 5386 + mlx5e_ipsec_init(priv); 5381 5387 5382 5388 err = mlx5e_macsec_init(priv); 5383 5389 if (err) ··· 5442 5446 mlx5_lag_remove_netdev(mdev, priv->netdev); 5443 5447 mlx5_vxlan_reset_to_default(mdev->vxlan); 5444 5448 mlx5e_macsec_cleanup(priv); 5449 + mlx5e_ipsec_cleanup(priv); 5445 5450 } 5446 5451 5447 5452 int mlx5e_update_nic_rx(struct mlx5e_priv *priv)
+4 -6
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
··· 751 751 struct net_device *netdev) 752 752 { 753 753 struct mlx5e_priv *priv = netdev_priv(netdev); 754 - int err; 755 754 756 755 priv->fs = mlx5e_fs_init(priv->profile, mdev, 757 756 !test_bit(MLX5E_STATE_DESTROYING, &priv->state)); ··· 758 759 netdev_err(priv->netdev, "FS allocation failed\n"); 759 760 return -ENOMEM; 760 761 } 761 - 762 - err = mlx5e_ipsec_init(priv); 763 - if (err) 764 - mlx5_core_err(mdev, "Uplink rep IPsec initialization failed, %d\n", err); 765 762 766 763 mlx5e_vxlan_set_netdev_info(priv); 767 764 mlx5e_build_rep_params(netdev); ··· 768 773 static void mlx5e_cleanup_rep(struct mlx5e_priv *priv) 769 774 { 770 775 mlx5e_fs_cleanup(priv->fs); 771 - mlx5e_ipsec_cleanup(priv); 772 776 } 773 777 774 778 static int mlx5e_create_rep_ttc_table(struct mlx5e_priv *priv) ··· 1106 1112 struct mlx5_core_dev *mdev = priv->mdev; 1107 1113 u16 max_mtu; 1108 1114 1115 + mlx5e_ipsec_init(priv); 1116 + 1109 1117 netdev->min_mtu = ETH_MIN_MTU; 1110 1118 mlx5_query_port_max_mtu(priv->mdev, &max_mtu, 1); 1111 1119 netdev->max_mtu = MLX5E_HW2SW_MTU(&priv->channels.params, max_mtu); ··· 1154 1158 mlx5e_rep_tc_disable(priv); 1155 1159 mlx5_lag_remove_netdev(mdev, priv->netdev); 1156 1160 mlx5_vxlan_reset_to_default(mdev->vxlan); 1161 + 1162 + mlx5e_ipsec_cleanup(priv); 1157 1163 } 1158 1164 1159 1165 static MLX5E_DEFINE_STATS_GRP(sw_rep, 0);
+16 -5
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
··· 593 593 int headroom, i; 594 594 595 595 headroom = rq->buff.headroom; 596 - new_entries = klm_entries - (shampo->pi & (MLX5_UMR_KLM_ALIGNMENT - 1)); 597 - entries = ALIGN(klm_entries, MLX5_UMR_KLM_ALIGNMENT); 596 + new_entries = klm_entries - (shampo->pi & (MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT - 1)); 597 + entries = ALIGN(klm_entries, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT); 598 598 wqe_bbs = MLX5E_KLM_UMR_WQEBBS(entries); 599 599 pi = mlx5e_icosq_get_next_pi(sq, wqe_bbs); 600 600 umr_wqe = mlx5_wq_cyc_get_wqe(&sq->wq, pi); ··· 603 603 for (i = 0; i < entries; i++, index++) { 604 604 dma_info = &shampo->info[index]; 605 605 if (i >= klm_entries || (index < shampo->pi && shampo->pi - index < 606 - MLX5_UMR_KLM_ALIGNMENT)) 606 + MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT)) 607 607 goto update_klm; 608 608 header_offset = (index & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1)) << 609 609 MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE; ··· 668 668 if (!klm_entries) 669 669 return 0; 670 670 671 - klm_entries += (shampo->pi & (MLX5_UMR_KLM_ALIGNMENT - 1)); 672 - index = ALIGN_DOWN(shampo->pi, MLX5_UMR_KLM_ALIGNMENT); 671 + klm_entries += (shampo->pi & (MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT - 1)); 672 + index = ALIGN_DOWN(shampo->pi, MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT); 673 673 entries_before = shampo->hd_per_wq - index; 674 674 675 675 if (unlikely(entries_before < klm_entries)) ··· 725 725 umr_wqe->inline_mtts[i] = (struct mlx5_mtt) { 726 726 .ptag = cpu_to_be64(addr | MLX5_EN_WR), 727 727 }; 728 + } 729 + 730 + /* Pad if needed, in case the value set to ucseg->xlt_octowords 731 + * in mlx5e_build_umr_wqe() needed alignment. 
732 + */ 733 + if (rq->mpwqe.pages_per_wqe & (MLX5_UMR_MTT_NUM_ENTRIES_ALIGNMENT - 1)) { 734 + int pad = ALIGN(rq->mpwqe.pages_per_wqe, MLX5_UMR_MTT_NUM_ENTRIES_ALIGNMENT) - 735 + rq->mpwqe.pages_per_wqe; 736 + 737 + memset(&umr_wqe->inline_mtts[rq->mpwqe.pages_per_wqe], 0, 738 + sizeof(*umr_wqe->inline_mtts) * pad); 728 739 } 729 740 730 741 bitmap_zero(wi->xdp_xmit_bitmap, rq->mpwqe.pages_per_wqe);
+2 -2
drivers/net/ethernet/mellanox/mlx5/core/en_tc.h
··· 97 97 } lag; 98 98 /* keep this union last */ 99 99 union { 100 - struct mlx5_esw_flow_attr esw_attr[0]; 101 - struct mlx5_nic_flow_attr nic_attr[0]; 100 + DECLARE_FLEX_ARRAY(struct mlx5_esw_flow_attr, esw_attr); 101 + DECLARE_FLEX_ARRAY(struct mlx5_nic_flow_attr, nic_attr); 102 102 }; 103 103 }; 104 104
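The zero-length-array replacement in the hunk above is motivated by ISO C: a flexible array member is only legal as the last member of a struct, never directly inside a union, so the kernel's DECLARE_FLEX_ARRAY() wraps it in an anonymous struct. A minimal userspace rendition, assuming GNU extensions (the surrounding struct and payload types are made up; only the macro shape mirrors the kernel's linux/stddef.h):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace imitation of the kernel's DECLARE_FLEX_ARRAY(): the empty
 * struct keeps the flexible array from being the sole member, which lets
 * the anonymous struct live inside a union. */
#define DECLARE_FLEX_ARRAY(TYPE, NAME)		\
	struct {				\
		struct { } __empty_##NAME;	\
		TYPE NAME[];			\
	}

/* Made-up payload types standing in for the esw/nic flow attributes. */
struct esw_attr { int vport; };
struct nic_attr { int prio; };

/* A header followed by one of two possible trailing payloads, selected at
 * runtime — the same shape as mlx5_flow_attr. */
struct flow_attr {
	int action;
	/* keep this union last */
	union {
		DECLARE_FLEX_ARRAY(struct esw_attr, esw);
		DECLARE_FLEX_ARRAY(struct nic_attr, nic);
	};
};
```

In the kernel, this conversion is part of the tree-wide removal of zero-length arrays, which lets compilers and sanitizers reason about array bounds instead of treating the trailing member as unbounded.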
+5
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
··· 744 744 return 0; 745 745 } 746 746 747 + static inline struct mlx5_flow_table * 748 + mlx5_eswitch_get_slow_fdb(struct mlx5_eswitch *esw) 749 + { 750 + return esw->fdb_table.offloads.slow_fdb; 751 + } 747 752 #else /* CONFIG_MLX5_ESWITCH */ 748 753 /* eswitch API stubs */ 749 754 static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; }
+19 -16
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
··· 248 248 if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level)) 249 249 flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL; 250 250 dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; 251 - dest[i].ft = esw->fdb_table.offloads.slow_fdb; 251 + dest[i].ft = mlx5_eswitch_get_slow_fdb(esw); 252 252 } 253 253 254 254 static int ··· 479 479 esw_src_port_rewrite_supported(esw)) 480 480 attr->flags |= MLX5_ATTR_FLAG_SRC_REWRITE; 481 481 482 - if (attr->flags & MLX5_ATTR_FLAG_SAMPLE && 483 - !(attr->flags & MLX5_ATTR_FLAG_SLOW_PATH)) { 484 - esw_setup_sampler_dest(dest, flow_act, attr->sample_attr.sampler_id, *i); 485 - (*i)++; 486 - } else if (attr->flags & MLX5_ATTR_FLAG_SLOW_PATH) { 482 + if (attr->flags & MLX5_ATTR_FLAG_SLOW_PATH) { 487 483 esw_setup_slow_path_dest(dest, flow_act, esw, *i); 484 + (*i)++; 485 + goto out; 486 + } 487 + 488 + if (attr->flags & MLX5_ATTR_FLAG_SAMPLE) { 489 + esw_setup_sampler_dest(dest, flow_act, attr->sample_attr.sampler_id, *i); 488 490 (*i)++; 489 491 } else if (attr->flags & MLX5_ATTR_FLAG_ACCEPT) { 490 492 esw_setup_accept_dest(dest, flow_act, chains, *i); ··· 508 506 } 509 507 } 510 508 509 + out: 511 510 return err; 512 511 } 513 512 ··· 1049 1046 if (rep->vport == MLX5_VPORT_UPLINK) 1050 1047 spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; 1051 1048 1052 - flow_rule = mlx5_add_flow_rules(on_esw->fdb_table.offloads.slow_fdb, 1049 + flow_rule = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(on_esw), 1053 1050 spec, &flow_act, &dest, 1); 1054 1051 if (IS_ERR(flow_rule)) 1055 1052 esw_warn(on_esw->dev, "FDB: Failed to add send to vport rule err %ld\n", ··· 1098 1095 mlx5_eswitch_get_vport_metadata_for_match(esw, vport_num)); 1099 1096 dest.vport.num = vport_num; 1100 1097 1101 - flow_rule = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1098 + flow_rule = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1102 1099 spec, &flow_act, &dest, 1); 1103 1100 if (IS_ERR(flow_rule)) 1104 1101 
esw_warn(esw->dev, "FDB: Failed to add send to vport meta rule vport %d, err %ld\n", ··· 1251 1248 esw_set_peer_miss_rule_source_port(esw, peer_dev->priv.eswitch, 1252 1249 spec, MLX5_VPORT_PF); 1253 1250 1254 - flow = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1251 + flow = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1255 1252 spec, &flow_act, &dest, 1); 1256 1253 if (IS_ERR(flow)) { 1257 1254 err = PTR_ERR(flow); ··· 1263 1260 if (mlx5_ecpf_vport_exists(esw->dev)) { 1264 1261 vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_ECPF); 1265 1262 MLX5_SET(fte_match_set_misc, misc, source_port, MLX5_VPORT_ECPF); 1266 - flow = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1263 + flow = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1267 1264 spec, &flow_act, &dest, 1); 1268 1265 if (IS_ERR(flow)) { 1269 1266 err = PTR_ERR(flow); ··· 1277 1274 peer_dev->priv.eswitch, 1278 1275 spec, vport->vport); 1279 1276 1280 - flow = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1277 + flow = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1281 1278 spec, &flow_act, &dest, 1); 1282 1279 if (IS_ERR(flow)) { 1283 1280 err = PTR_ERR(flow); ··· 1366 1363 dest.vport.num = esw->manager_vport; 1367 1364 flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; 1368 1365 1369 - flow_rule = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1366 + flow_rule = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1370 1367 spec, &flow_act, &dest, 1); 1371 1368 if (IS_ERR(flow_rule)) { 1372 1369 err = PTR_ERR(flow_rule); ··· 1381 1378 dmac_v = MLX5_ADDR_OF(fte_match_param, headers_v, 1382 1379 outer_headers.dmac_47_16); 1383 1380 dmac_v[0] = 0x01; 1384 - flow_rule = mlx5_add_flow_rules(esw->fdb_table.offloads.slow_fdb, 1381 + flow_rule = mlx5_add_flow_rules(mlx5_eswitch_get_slow_fdb(esw), 1385 1382 spec, &flow_act, &dest, 1); 1386 1383 if (IS_ERR(flow_rule)) { 1387 1384 err = PTR_ERR(flow_rule); ··· 1930 1927 fdb_chains_err: 1931 1928 
mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table); 1932 1929 tc_miss_table_err: 1933 - mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb); 1930 + mlx5_destroy_flow_table(mlx5_eswitch_get_slow_fdb(esw)); 1934 1931 slow_fdb_err: 1935 1932 /* Holds true only as long as DMFS is the default */ 1936 1933 mlx5_flow_namespace_set_mode(root_ns, MLX5_FLOW_STEERING_MODE_DMFS); ··· 1941 1938 1942 1939 static void esw_destroy_offloads_fdb_tables(struct mlx5_eswitch *esw) 1943 1940 { 1944 - if (!esw->fdb_table.offloads.slow_fdb) 1941 + if (!mlx5_eswitch_get_slow_fdb(esw)) 1945 1942 return; 1946 1943 1947 1944 esw_debug(esw->dev, "Destroy offloads FDB Tables\n"); ··· 1957 1954 esw_chains_destroy(esw, esw_chains(esw)); 1958 1955 1959 1956 mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table); 1960 - mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb); 1957 + mlx5_destroy_flow_table(mlx5_eswitch_get_slow_fdb(esw)); 1961 1958 /* Holds true only as long as DMFS is the default */ 1962 1959 mlx5_flow_namespace_set_mode(esw->fdb_table.offloads.ns, 1963 1960 MLX5_FLOW_STEERING_MODE_DMFS);
+28 -4
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c
···
210 210 	return (port_mask & port_value) == MLX5_VPORT_UPLINK;
211 211 }
212 212
213 + static bool
214 + mlx5_eswitch_is_push_vlan_no_cap(struct mlx5_eswitch *esw,
215 + 				 struct mlx5_flow_act *flow_act)
216 + {
217 + 	if (flow_act->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH &&
218 + 	    !(mlx5_fs_get_capabilities(esw->dev, MLX5_FLOW_NAMESPACE_FDB) &
219 + 	      MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX))
220 + 		return true;
221 +
222 + 	return false;
223 + }
224 +
213 225 bool
214 226 mlx5_eswitch_termtbl_required(struct mlx5_eswitch *esw,
215 227 			      struct mlx5_flow_attr *attr,
···
237 225 	    (!mlx5_eswitch_offload_is_uplink_port(esw, spec) && !esw_attr->int_port))
238 226 		return false;
239 227
240 - 	/* push vlan on RX */
241 - 	if (flow_act->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH &&
242 - 	    !(mlx5_fs_get_capabilities(esw->dev, MLX5_FLOW_NAMESPACE_FDB) &
243 - 	      MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX))
228 + 	if (mlx5_eswitch_is_push_vlan_no_cap(esw, flow_act))
244 229 		return true;
245 230
246 231 	/* hairpin */
···
261 252 	struct mlx5_flow_act term_tbl_act = {};
262 253 	struct mlx5_flow_handle *rule = NULL;
263 254 	bool term_table_created = false;
255 + 	bool is_push_vlan_on_rx;
264 256 	int num_vport_dests = 0;
265 257 	int i, curr_dest;
266 258
259 + 	is_push_vlan_on_rx = mlx5_eswitch_is_push_vlan_no_cap(esw, flow_act);
267 260 	mlx5_eswitch_termtbl_actions_move(flow_act, &term_tbl_act);
268 261 	term_tbl_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
269 262
270 263 	for (i = 0; i < num_dest; i++) {
271 264 		struct mlx5_termtbl_handle *tt;
265 + 		bool hairpin = false;
272 266
273 267 		/* only vport destinations can be terminated */
274 268 		if (dest[i].type != MLX5_FLOW_DESTINATION_TYPE_VPORT)
275 269 			continue;
270 +
271 + 		if (attr->dests[num_vport_dests].rep &&
272 + 		    attr->dests[num_vport_dests].rep->vport == MLX5_VPORT_UPLINK)
273 + 			hairpin = true;
274 +
275 + 		if (!is_push_vlan_on_rx && !hairpin) {
276 + 			num_vport_dests++;
277 + 			continue;
278 + 		}
276 279
277 280 		if (attr->dests[num_vport_dests].flags & MLX5_ESW_DEST_ENCAP) {
278 281 			term_tbl_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
···
332 311
333 312 	for (curr_dest = 0; curr_dest < num_vport_dests; curr_dest++) {
334 313 		struct mlx5_termtbl_handle *tt = attr->dests[curr_dest].termtbl;
314 +
315 + 		if (!tt)
316 + 			continue;
335 317
336 318 		attr->dests[curr_dest].termtbl = NULL;
337 319
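The predicate factored out above asks a single question: is a VLAN push requested while the FDB namespace lacks the push-on-RX capability? A self-contained sketch of that check, with made-up flag values standing in for the kernel's `MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH` and `MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX` bits:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real positions come from the mlx5
 * headers and differ from these. */
#define ACTION_VLAN_PUSH	(1u << 0)
#define CAP_VLAN_PUSH_ON_RX	(1u << 0)

/* A termination table is needed for a pushed VLAN only when the device
 * cannot push the VLAN on RX by itself. */
static bool is_push_vlan_no_cap(unsigned int actions, unsigned int fdb_caps)
{
	return (actions & ACTION_VLAN_PUSH) &&
	       !(fdb_caps & CAP_VLAN_PUSH_ON_RX);
}
```

Computing this once before the destination loop, as the patch does, lets non-hairpin destinations skip termination-table creation entirely when the capability is present.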
-3
drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c
···
334 334
335 335 void mlx5_aso_destroy(struct mlx5_aso *aso)
336 336 {
337 - 	if (IS_ERR_OR_NULL(aso))
338 - 		return;
339 -
340 337 	mlx5_aso_destroy_sq(aso);
341 338 	mlx5_aso_destroy_cq(&aso->cq);
342 339 	kfree(aso);
-3
drivers/net/ethernet/mellanox/mlx5/core/main.c
···
37 37 #include <linux/pci.h>
38 38 #include <linux/dma-mapping.h>
39 39 #include <linux/slab.h>
40 - #include <linux/io-mapping.h>
41 40 #include <linux/interrupt.h>
42 41 #include <linux/delay.h>
43 42 #include <linux/mlx5/driver.h>
···
1603 1604 	int err;
1604 1605
1605 1606 	memcpy(&dev->profile, &profile[profile_idx], sizeof(dev->profile));
1606 - 	INIT_LIST_HEAD(&priv->ctx_list);
1607 - 	spin_lock_init(&priv->ctx_lock);
1608 1607 	lockdep_register_key(&dev->lock_key);
1609 1608 	mutex_init(&dev->intf_state_mutex);
1610 1609 	lockdep_set_class(&dev->intf_state_mutex, &dev->lock_key);
-1
drivers/net/ethernet/mellanox/mlx5/core/uar.c
···
31 31  */
32 32
33 33 #include <linux/kernel.h>
34 - #include <linux/io-mapping.h>
35 34 #include <linux/mlx5/driver.h>
36 35 #include "mlx5_core.h"
37 36
+3 -4
include/linux/mlx5/device.h
···
290 290 	MLX5_UMR_INLINE		= (1 << 7),
291 291 };
292 292
293 - #define MLX5_UMR_KLM_ALIGNMENT 4
294 - #define MLX5_UMR_MTT_ALIGNMENT 0x40
295 - #define MLX5_UMR_MTT_MASK (MLX5_UMR_MTT_ALIGNMENT - 1)
296 - #define MLX5_UMR_MTT_MIN_CHUNK_SIZE MLX5_UMR_MTT_ALIGNMENT
293 + #define MLX5_UMR_FLEX_ALIGNMENT 0x40
294 + #define MLX5_UMR_MTT_NUM_ENTRIES_ALIGNMENT (MLX5_UMR_FLEX_ALIGNMENT / sizeof(struct mlx5_mtt))
295 + #define MLX5_UMR_KLM_NUM_ENTRIES_ALIGNMENT (MLX5_UMR_FLEX_ALIGNMENT / sizeof(struct mlx5_klm))
297 296
298 297 #define MLX5_USER_INDEX_LEN (MLX5_FLD_SZ_BYTES(qpc, user_index) * 8)
299 298
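The new definitions express the 64-byte UMR requirement in units of translation entries instead of separate byte constants. Assuming the usual entry sizes (an MTT entry is 8 bytes and a KLM entry is 16 bytes, which matches the old `MLX5_UMR_KLM_ALIGNMENT` of 4), the arithmetic and the round-up it enables can be sketched as:

```c
#include <assert.h>

/* Assumed sizes for illustration: in the real headers these come from
 * sizeof(struct mlx5_mtt) and sizeof(struct mlx5_klm). */
#define UMR_FLEX_ALIGNMENT 0x40	/* UMR WQE payload must be 64B-aligned */
#define MTT_ENTRY_SIZE 8
#define KLM_ENTRY_SIZE 16
#define MTT_NUM_ENTRIES_ALIGNMENT (UMR_FLEX_ALIGNMENT / MTT_ENTRY_SIZE)
#define KLM_NUM_ENTRIES_ALIGNMENT (UMR_FLEX_ALIGNMENT / KLM_ENTRY_SIZE)

/* Round an entry count up to the alignment, as the kernel's ALIGN()
 * macro would; the trailing entries then serve as WQE padding. */
static unsigned int umr_pad_entries(unsigned int n, unsigned int align)
{
	return (n + align - 1) / align * align;
}
```

Dividing one byte alignment by the entry size keeps a single source of truth for the 64-byte rule, which is what the "Add padding when needed in UMR WQEs" patch in this series relies on.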
-2
include/linux/mlx5/driver.h
···
606 606 	struct list_head	pgdir_list;
607 607 	/* end: alloc staff */
608 608
609 - 	struct list_head	ctx_list;
610 - 	spinlock_t		ctx_lock;
611 609 	struct mlx5_adev	**adev;
612 610 	int			adev_idx;
613 611 	int			sw_vhca_id;