Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'mlx5-updates-2019-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-06-13

Mlx5 devlink health fw reporters and sw reset support

This series provides mlx5 firmware reset support and firmware devlink health
reporters.

1) Add initial mlx5 kernel documentation, including the devlink health reporters

2) Add CR-Space access and FW Crdump snapshot support via devlink region_snapshot

3) Issue software reset upon FW asserts

4) Add fw and fw_fatal devlink health reporters that track fw error indications
through dump and recover procedures, and allow the user to trigger this
functionality.

4.1) fw reporter:
The fw reporter implements diagnose and dump callbacks.
It tracks symptoms of fw errors, such as the fw syndrome, by triggering
a fw core dump and storing it, along with any other fw traces, in the dump buffer.
The fw reporter diagnose command can be triggered by the user at any time to
check the current fw status.
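
The documentation added later in this series shows the corresponding devlink
invocations; the PCI address below is illustrative:

```shell
# Check current fw health status via the diagnose callback
$ devlink health diagnose pci/0000:82:00.0 reporter fw

# Read the stored FW core dump, or trigger a new one
$ devlink health dump show pci/0000:82:00.0 reporter fw
```

Note that the dump command only works on the PF that owns the fw tracer;
running it on another PF or any VF returns "Operation not permitted".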

4.2) fw_fatal reporter:
The fw_fatal reporter implements dump and recover callbacks.
It tracks fatal error indications through a CR-space dump and a recover flow.
The CR-space dump uses the vsc interface, which remains usable even when the FW
command interface is not functional, as is the case in most fatal FW errors. The
CR-space dump is stored as a memory region snapshot to ease reading by address.
The recover function runs the recover flow, which reloads the driver and
triggers a fw reset if needed.
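
As documented later in this series, these callbacks map to the following
devlink commands (PCI address illustrative):

```shell
# Run the fw recover flow manually
$ devlink health recover pci/0000:82:00.0 reporter fw_fatal

# Read the stored FW CR-space dump, or trigger a new one
$ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
```

The dump command can only run on a PF.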
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

+1517 -145
+1
Documentation/networking/device_drivers/index.rst
··· 21 21 intel/i40e 22 22 intel/iavf 23 23 intel/ice 24 + mellanox/mlx5 24 25 25 26 .. only:: subproject 26 27
+173
Documentation/networking/device_drivers/mellanox/mlx5.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 2 + 3 + ================================================= 4 + Mellanox ConnectX(R) mlx5 core VPI Network Driver 5 + ================================================= 6 + 7 + Copyright (c) 2019, Mellanox Technologies LTD. 8 + 9 + Contents 10 + ======== 11 + 12 + - `Enabling the driver and kconfig options`_ 13 + - `Devlink health reporters`_ 14 + 15 + Enabling the driver and kconfig options 16 + ================================================ 17 + 18 + | mlx5 core is modular and most of the major mlx5 core driver features can be selected (compiled in/out) 19 + | at build time via kernel Kconfig flags. 20 + | Basic features, ethernet net device rx/tx offloads and XDP, are available with the most basic flags 21 + | CONFIG_MLX5_CORE=y/m and CONFIG_MLX5_CORE_EN=y. 22 + | For the list of advanced features please see below. 23 + 24 + **CONFIG_MLX5_CORE=(y/m/n)** (module mlx5_core.ko) 25 + 26 + | The driver can be enabled by choosing CONFIG_MLX5_CORE=y/m in kernel config. 27 + | This will provide mlx5 core driver for mlx5 ulps to interface with (mlx5e, mlx5_ib). 28 + 29 + 30 + **CONFIG_MLX5_CORE_EN=(y/n)** 31 + 32 + | Choosing this option will allow basic ethernet netdevice support with all of the standard rx/tx offloads. 33 + | mlx5e is the mlx5 ulp driver which provides netdevice kernel interface, when chosen, mlx5e will be 34 + | built-in into mlx5_core.ko. 35 + 36 + 37 + **CONFIG_MLX5_EN_ARFS=(y/n)** 38 + 39 + | Enables Hardware-accelerated receive flow steering (arfs) support, and ntuple filtering. 40 + | https://community.mellanox.com/s/article/howto-configure-arfs-on-connectx-4 41 + 42 + 43 + **CONFIG_MLX5_EN_RXNFC=(y/n)** 44 + 45 + | Enables ethtool receive network flow classification, which allows user defined 46 + | flow rules to direct traffic into arbitrary rx queue via ethtool set/get_rxnfc API. 
47 + 48 + 49 + **CONFIG_MLX5_CORE_EN_DCB=(y/n)**: 50 + 51 + | Enables `Data Center Bridging (DCB) Support <https://community.mellanox.com/s/article/howto-auto-config-pfc-and-ets-on-connectx-4-via-lldp-dcbx>`_. 52 + 53 + 54 + **CONFIG_MLX5_MPFS=(y/n)** 55 + 56 + | Ethernet Multi-Physical Function Switch (MPFS) support in ConnectX NIC. 57 + | MPFs is required for when `Multi-Host <http://www.mellanox.com/page/multihost>`_ configuration is enabled to allow passing 58 + | user configured unicast MAC addresses to the requesting PF. 59 + 60 + 61 + **CONFIG_MLX5_ESWITCH=(y/n)** 62 + 63 + | Ethernet SRIOV E-Switch support in ConnectX NIC. E-Switch provides internal SRIOV packet steering 64 + | and switching for the enabled VFs and PF in two available modes: 65 + | 1) `Legacy SRIOV mode (L2 mac vlan steering based) <https://community.mellanox.com/s/article/howto-configure-sr-iov-for-connectx-4-connectx-5-with-kvm--ethernet-x>`_. 66 + | 2) `Switchdev mode (eswitch offloads) <https://www.mellanox.com/related-docs/prod_software/ASAP2_Hardware_Offloading_for_vSwitches_User_Manual_v4.4.pdf>`_. 67 + 68 + 69 + **CONFIG_MLX5_CORE_IPOIB=(y/n)** 70 + 71 + | IPoIB offloads & acceleration support. 72 + | Requires CONFIG_MLX5_CORE_EN to provide an accelerated interface for the rdma 73 + | IPoIB ulp netdevice. 74 + 75 + 76 + **CONFIG_MLX5_FPGA=(y/n)** 77 + 78 + | Build support for the Innova family of network cards by Mellanox Technologies. 79 + | Innova network cards are comprised of a ConnectX chip and an FPGA chip on one board. 80 + | If you select this option, the mlx5_core driver will include the Innova FPGA core and allow 81 + | building sandbox-specific client drivers. 82 + 83 + 84 + **CONFIG_MLX5_EN_IPSEC=(y/n)** 85 + 86 + | Enables `IPSec XFRM cryptography-offload accelaration <http://www.mellanox.com/related-docs/prod_software/Mellanox_Innova_IPsec_Ethernet_Adapter_Card_User_Manual.pdf>`_. 87 + 88 + **CONFIG_MLX5_EN_TLS=(y/n)** 89 + 90 + | TLS cryptography-offload accelaration. 
91 + 92 + 93 + **CONFIG_MLX5_INFINIBAND=(y/n/m)** (module mlx5_ib.ko) 94 + 95 + | Provides low-level InfiniBand/RDMA and `RoCE <https://community.mellanox.com/s/article/recommended-network-configuration-examples-for-roce-deployment>`_ support. 96 + 97 + 98 + **External options** ( Choose if the corresponding mlx5 feature is required ) 99 + 100 + - CONFIG_PTP_1588_CLOCK: When chosen, mlx5 ptp support will be enabled 101 + - CONFIG_VXLAN: When chosen, mlx5 vxaln support will be enabled. 102 + - CONFIG_MLXFW: When chosen, mlx5 firmware flashing support will be enabled (via devlink and ethtool). 103 + 104 + 105 + Devlink health reporters 106 + ======================== 107 + 108 + tx reporter 109 + ----------- 110 + The tx reporter is responsible of two error scenarios: 111 + 112 + - TX timeout 113 + Report on kernel tx timeout detection. 114 + Recover by searching lost interrupts. 115 + - TX error completion 116 + Report on error tx completion. 117 + Recover by flushing the TX queue and reset it. 118 + 119 + TX reporter also support Diagnose callback, on which it provides 120 + real time information of its send queues status. 121 + 122 + User commands examples: 123 + 124 + - Diagnose send queues status:: 125 + 126 + $ devlink health diagnose pci/0000:82:00.0 reporter tx 127 + 128 + - Show number of tx errors indicated, number of recover flows ended successfully, 129 + is autorecover enabled and graceful period from last recover:: 130 + 131 + $ devlink health show pci/0000:82:00.0 reporter tx 132 + 133 + fw reporter 134 + ----------- 135 + The fw reporter implements diagnose and dump callbacks. 136 + It follows symptoms of fw error such as fw syndrome by triggering 137 + fw core dump and storing it into the dump buffer. 138 + The fw reporter diagnose command can be triggered any time by the user to check 139 + current fw status. 
140 + 141 + User commands examples: 142 + 143 + - Check fw heath status:: 144 + 145 + $ devlink health diagnose pci/0000:82:00.0 reporter fw 146 + 147 + - Read FW core dump if already stored or trigger new one:: 148 + 149 + $ devlink health dump show pci/0000:82:00.0 reporter fw 150 + 151 + NOTE: This command can run only on the PF which has fw tracer ownership, 152 + running it on other PF or any VF will return "Operation not permitted". 153 + 154 + fw fatal reporter 155 + ----------------- 156 + The fw fatal reporter implements dump and recover callbacks. 157 + It follows fatal errors indications by CR-space dump and recover flow. 158 + The CR-space dump uses vsc interface which is valid even if the FW command 159 + interface is not functional, which is the case in most FW fatal errors. 160 + The recover function runs recover flow which reloads the driver and triggers fw 161 + reset if needed. 162 + 163 + User commands examples: 164 + 165 + - Run fw recover flow manually:: 166 + 167 + $ devlink health recover pci/0000:82:00.0 reporter fw_fatal 168 + 169 + - Read FW CR-space dump if already strored or trigger new one:: 170 + 171 + $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal 172 + 173 + NOTE: This command can run only on PF.
+1
MAINTAINERS
··· 10108 10108 S: Supported 10109 10109 F: drivers/net/ethernet/mellanox/mlx5/core/ 10110 10110 F: include/linux/mlx5/ 10111 + F: Documentation/networking/device_drivers/mellanox/ 10111 10112 10112 10113 MELLANOX MLX5 IB driver 10113 10114 M: Leon Romanovsky <leonro@mellanox.com>
+2 -1
drivers/net/ethernet/mellanox/mlx5/core/Makefile
··· 15 15 health.o mcg.o cq.o alloc.o qp.o port.o mr.o pd.o \ 16 16 transobj.o vport.o sriov.o fs_cmd.o fs_core.o \ 17 17 fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \ 18 - lib/devcom.o diag/fs_tracepoint.o diag/fw_tracer.o 18 + lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \ 19 + diag/fw_tracer.o diag/crdump.o devlink.o 19 20 20 21 # 21 22 # Netdev basic
+58
drivers/net/ethernet/mellanox/mlx5/core/devlink.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 2 + /* Copyright (c) 2019 Mellanox Technologies */ 3 + 4 + #include <devlink.h> 5 + 6 + #include "mlx5_core.h" 7 + #include "eswitch.h" 8 + 9 + static int mlx5_devlink_flash_update(struct devlink *devlink, 10 + const char *file_name, 11 + const char *component, 12 + struct netlink_ext_ack *extack) 13 + { 14 + struct mlx5_core_dev *dev = devlink_priv(devlink); 15 + const struct firmware *fw; 16 + int err; 17 + 18 + if (component) 19 + return -EOPNOTSUPP; 20 + 21 + err = request_firmware_direct(&fw, file_name, &dev->pdev->dev); 22 + if (err) 23 + return err; 24 + 25 + return mlx5_firmware_flash(dev, fw, extack); 26 + } 27 + 28 + static const struct devlink_ops mlx5_devlink_ops = { 29 + #ifdef CONFIG_MLX5_ESWITCH 30 + .eswitch_mode_set = mlx5_devlink_eswitch_mode_set, 31 + .eswitch_mode_get = mlx5_devlink_eswitch_mode_get, 32 + .eswitch_inline_mode_set = mlx5_devlink_eswitch_inline_mode_set, 33 + .eswitch_inline_mode_get = mlx5_devlink_eswitch_inline_mode_get, 34 + .eswitch_encap_mode_set = mlx5_devlink_eswitch_encap_mode_set, 35 + .eswitch_encap_mode_get = mlx5_devlink_eswitch_encap_mode_get, 36 + #endif 37 + .flash_update = mlx5_devlink_flash_update, 38 + }; 39 + 40 + struct devlink *mlx5_devlink_alloc() 41 + { 42 + return devlink_alloc(&mlx5_devlink_ops, sizeof(struct mlx5_core_dev)); 43 + } 44 + 45 + void mlx5_devlink_free(struct devlink *devlink) 46 + { 47 + devlink_free(devlink); 48 + } 49 + 50 + int mlx5_devlink_register(struct devlink *devlink, struct device *dev) 51 + { 52 + return devlink_register(devlink, dev); 53 + } 54 + 55 + void mlx5_devlink_unregister(struct devlink *devlink) 56 + { 57 + devlink_unregister(devlink); 58 + }
+14
drivers/net/ethernet/mellanox/mlx5/core/devlink.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 2 + /* Copyright (c) 2019, Mellanox Technologies */ 3 + 4 + #ifndef __MLX5_DEVLINK_H__ 5 + #define __MLX5_DEVLINK_H__ 6 + 7 + #include <net/devlink.h> 8 + 9 + struct devlink *mlx5_devlink_alloc(void); 10 + void mlx5_devlink_free(struct devlink *devlink); 11 + int mlx5_devlink_register(struct devlink *devlink, struct device *dev); 12 + void mlx5_devlink_unregister(struct devlink *devlink); 13 + 14 + #endif /* __MLX5_DEVLINK_H__ */
+115
drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 2 + /* Copyright (c) 2019 Mellanox Technologies */ 3 + 4 + #include <linux/mlx5/driver.h> 5 + #include "mlx5_core.h" 6 + #include "lib/pci_vsc.h" 7 + #include "lib/mlx5.h" 8 + 9 + #define BAD_ACCESS 0xBADACCE5 10 + #define MLX5_PROTECTED_CR_SCAN_CRSPACE 0x7 11 + 12 + static bool mlx5_crdump_enabled(struct mlx5_core_dev *dev) 13 + { 14 + return !!dev->priv.health.crdump_size; 15 + } 16 + 17 + static int mlx5_crdump_fill(struct mlx5_core_dev *dev, u32 *cr_data) 18 + { 19 + u32 crdump_size = dev->priv.health.crdump_size; 20 + int i, ret; 21 + 22 + for (i = 0; i < (crdump_size / 4); i++) 23 + cr_data[i] = BAD_ACCESS; 24 + 25 + ret = mlx5_vsc_gw_read_block_fast(dev, cr_data, crdump_size); 26 + if (ret <= 0) { 27 + if (ret == 0) 28 + return -EIO; 29 + return ret; 30 + } 31 + 32 + if (crdump_size != ret) { 33 + mlx5_core_warn(dev, "failed to read full dump, read %d out of %u\n", 34 + ret, crdump_size); 35 + return -EINVAL; 36 + } 37 + 38 + return 0; 39 + } 40 + 41 + int mlx5_crdump_collect(struct mlx5_core_dev *dev, u32 *cr_data) 42 + { 43 + int ret; 44 + 45 + if (!mlx5_crdump_enabled(dev)) 46 + return -ENODEV; 47 + 48 + ret = mlx5_vsc_gw_lock(dev); 49 + if (ret) { 50 + mlx5_core_warn(dev, "crdump: failed to lock vsc gw err %d\n", 51 + ret); 52 + return ret; 53 + } 54 + /* Verify no other PF is running cr-dump or sw reset */ 55 + ret = mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET, 56 + MLX5_VSC_LOCK); 57 + if (ret) { 58 + mlx5_core_warn(dev, "Failed to lock SW reset semaphore\n"); 59 + goto unlock_gw; 60 + } 61 + 62 + ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE, NULL); 63 + if (ret) 64 + goto unlock_sem; 65 + 66 + ret = mlx5_crdump_fill(dev, cr_data); 67 + 68 + unlock_sem: 69 + mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET, MLX5_VSC_UNLOCK); 70 + unlock_gw: 71 + mlx5_vsc_gw_unlock(dev); 72 + return ret; 73 + } 74 + 75 + int mlx5_crdump_enable(struct mlx5_core_dev *dev) 76 + { 77 + 
struct mlx5_priv *priv = &dev->priv; 78 + u32 space_size; 79 + int ret; 80 + 81 + if (!mlx5_core_is_pf(dev) || !mlx5_vsc_accessible(dev) || 82 + mlx5_crdump_enabled(dev)) 83 + return 0; 84 + 85 + ret = mlx5_vsc_gw_lock(dev); 86 + if (ret) 87 + return ret; 88 + 89 + /* Check if space is supported and get space size */ 90 + ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE, 91 + &space_size); 92 + if (ret) { 93 + /* Unlock and mask error since space is not supported */ 94 + mlx5_vsc_gw_unlock(dev); 95 + return 0; 96 + } 97 + 98 + if (!space_size) { 99 + mlx5_core_warn(dev, "Invalid Crspace size, zero\n"); 100 + mlx5_vsc_gw_unlock(dev); 101 + return -EINVAL; 102 + } 103 + 104 + ret = mlx5_vsc_gw_unlock(dev); 105 + if (ret) 106 + return ret; 107 + 108 + priv->health.crdump_size = space_size; 109 + return 0; 110 + } 111 + 112 + void mlx5_crdump_disable(struct mlx5_core_dev *dev) 113 + { 114 + dev->priv.health.crdump_size = 0; 115 + }
+139
drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
··· 243 243 return -ENOMEM; 244 244 } 245 245 246 + static void 247 + mlx5_fw_tracer_init_saved_traces_array(struct mlx5_fw_tracer *tracer) 248 + { 249 + tracer->st_arr.saved_traces_index = 0; 250 + mutex_init(&tracer->st_arr.lock); 251 + } 252 + 253 + static void 254 + mlx5_fw_tracer_clean_saved_traces_array(struct mlx5_fw_tracer *tracer) 255 + { 256 + mutex_destroy(&tracer->st_arr.lock); 257 + } 258 + 246 259 static void mlx5_tracer_read_strings_db(struct work_struct *work) 247 260 { 248 261 struct mlx5_fw_tracer *tracer = container_of(work, struct mlx5_fw_tracer, ··· 535 522 list_del(&str_frmt->list); 536 523 } 537 524 525 + static void mlx5_fw_tracer_save_trace(struct mlx5_fw_tracer *tracer, 526 + u64 timestamp, bool lost, 527 + u8 event_id, char *msg) 528 + { 529 + struct mlx5_fw_trace_data *trace_data; 530 + 531 + mutex_lock(&tracer->st_arr.lock); 532 + trace_data = &tracer->st_arr.straces[tracer->st_arr.saved_traces_index]; 533 + trace_data->timestamp = timestamp; 534 + trace_data->lost = lost; 535 + trace_data->event_id = event_id; 536 + strncpy(trace_data->msg, msg, TRACE_STR_MSG); 537 + 538 + tracer->st_arr.saved_traces_index = 539 + (tracer->st_arr.saved_traces_index + 1) & (SAVED_TRACES_NUM - 1); 540 + mutex_unlock(&tracer->st_arr.lock); 541 + } 542 + 538 543 static void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt, 539 544 struct mlx5_core_dev *dev, 540 545 u64 trace_timestamp) ··· 570 539 571 540 trace_mlx5_fw(dev->tracer, trace_timestamp, str_frmt->lost, 572 541 str_frmt->event_id, tmp); 542 + 543 + mlx5_fw_tracer_save_trace(dev->tracer, trace_timestamp, 544 + str_frmt->lost, str_frmt->event_id, tmp); 573 545 574 546 /* remove it from hash */ 575 547 mlx5_tracer_clean_message(str_frmt); ··· 820 786 mlx5_fw_tracer_start(tracer); 821 787 } 822 788 789 + static int mlx5_fw_tracer_set_core_dump_reg(struct mlx5_core_dev *dev, 790 + u32 *in, int size_in) 791 + { 792 + u32 out[MLX5_ST_SZ_DW(core_dump_reg)] = {}; 793 + 794 + if 
(!MLX5_CAP_DEBUG(dev, core_dump_general) && 795 + !MLX5_CAP_DEBUG(dev, core_dump_qp)) 796 + return -EOPNOTSUPP; 797 + 798 + return mlx5_core_access_reg(dev, in, size_in, out, sizeof(out), 799 + MLX5_REG_CORE_DUMP, 0, 1); 800 + } 801 + 802 + int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev) 803 + { 804 + struct mlx5_fw_tracer *tracer = dev->tracer; 805 + u32 in[MLX5_ST_SZ_DW(core_dump_reg)] = {}; 806 + int err; 807 + 808 + if (!MLX5_CAP_DEBUG(dev, core_dump_general) || !tracer) 809 + return -EOPNOTSUPP; 810 + if (!tracer->owner) 811 + return -EPERM; 812 + 813 + MLX5_SET(core_dump_reg, in, core_dump_type, 0x0); 814 + 815 + err = mlx5_fw_tracer_set_core_dump_reg(dev, in, sizeof(in)); 816 + if (err) 817 + return err; 818 + queue_work(tracer->work_queue, &tracer->handle_traces_work); 819 + flush_workqueue(tracer->work_queue); 820 + return 0; 821 + } 822 + 823 + static int 824 + mlx5_devlink_fmsg_fill_trace(struct devlink_fmsg *fmsg, 825 + struct mlx5_fw_trace_data *trace_data) 826 + { 827 + int err; 828 + 829 + err = devlink_fmsg_obj_nest_start(fmsg); 830 + if (err) 831 + return err; 832 + 833 + err = devlink_fmsg_u64_pair_put(fmsg, "timestamp", trace_data->timestamp); 834 + if (err) 835 + return err; 836 + 837 + err = devlink_fmsg_bool_pair_put(fmsg, "lost", trace_data->lost); 838 + if (err) 839 + return err; 840 + 841 + err = devlink_fmsg_u8_pair_put(fmsg, "event_id", trace_data->event_id); 842 + if (err) 843 + return err; 844 + 845 + err = devlink_fmsg_string_pair_put(fmsg, "msg", trace_data->msg); 846 + if (err) 847 + return err; 848 + 849 + err = devlink_fmsg_obj_nest_end(fmsg); 850 + if (err) 851 + return err; 852 + return 0; 853 + } 854 + 855 + int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer, 856 + struct devlink_fmsg *fmsg) 857 + { 858 + struct mlx5_fw_trace_data *straces = tracer->st_arr.straces; 859 + u32 index, start_index, end_index; 860 + u32 saved_traces_index; 861 + int err; 862 + 863 + if 
(!straces[0].timestamp) 864 + return -ENOMSG; 865 + 866 + mutex_lock(&tracer->st_arr.lock); 867 + saved_traces_index = tracer->st_arr.saved_traces_index; 868 + if (straces[saved_traces_index].timestamp) 869 + start_index = saved_traces_index; 870 + else 871 + start_index = 0; 872 + end_index = (saved_traces_index - 1) & (SAVED_TRACES_NUM - 1); 873 + 874 + err = devlink_fmsg_arr_pair_nest_start(fmsg, "dump fw traces"); 875 + if (err) 876 + goto unlock; 877 + index = start_index; 878 + while (index != end_index) { 879 + err = mlx5_devlink_fmsg_fill_trace(fmsg, &straces[index]); 880 + if (err) 881 + goto unlock; 882 + 883 + index = (index + 1) & (SAVED_TRACES_NUM - 1); 884 + } 885 + 886 + err = devlink_fmsg_arr_pair_nest_end(fmsg); 887 + unlock: 888 + mutex_unlock(&tracer->st_arr.lock); 889 + return err; 890 + } 891 + 823 892 /* Create software resources (Buffers, etc ..) */ 824 893 struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev) 825 894 { ··· 970 833 goto free_log_buf; 971 834 } 972 835 836 + mlx5_fw_tracer_init_saved_traces_array(tracer); 973 837 mlx5_core_dbg(dev, "FWTracer: Tracer created\n"); 974 838 975 839 return tracer; ··· 1055 917 cancel_work_sync(&tracer->read_fw_strings_work); 1056 918 mlx5_fw_tracer_clean_ready_list(tracer); 1057 919 mlx5_fw_tracer_clean_print_hash(tracer); 920 + mlx5_fw_tracer_clean_saved_traces_array(tracer); 1058 921 mlx5_fw_tracer_free_strings_db(tracer); 1059 922 mlx5_fw_tracer_destroy_log_buf(tracer); 1060 923 flush_workqueue(tracer->work_queue);
+20
drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
··· 46 46 #define TRACER_BLOCK_SIZE_BYTE 256 47 47 #define TRACES_PER_BLOCK 32 48 48 49 + #define TRACE_STR_MSG 256 50 + #define SAVED_TRACES_NUM 8192 51 + 49 52 #define TRACER_MAX_PARAMS 7 50 53 #define MESSAGE_HASH_BITS 6 51 54 #define MESSAGE_HASH_SIZE BIT(MESSAGE_HASH_BITS) 52 55 53 56 #define MASK_52_7 (0x1FFFFFFFFFFF80) 54 57 #define MASK_6_0 (0x7F) 58 + 59 + struct mlx5_fw_trace_data { 60 + u64 timestamp; 61 + bool lost; 62 + u8 event_id; 63 + char msg[TRACE_STR_MSG]; 64 + }; 55 65 56 66 struct mlx5_fw_tracer { 57 67 struct mlx5_core_dev *dev; ··· 92 82 struct mlx5_core_mkey mkey; 93 83 u32 consumer_index; 94 84 } buff; 85 + 86 + /* Saved Traces Array */ 87 + struct { 88 + struct mlx5_fw_trace_data straces[SAVED_TRACES_NUM]; 89 + u32 saved_traces_index; 90 + struct mutex lock; /* Protect st_arr access */ 91 + } st_arr; 95 92 96 93 u64 last_timestamp; 97 94 struct work_struct handle_traces_work; ··· 188 171 int mlx5_fw_tracer_init(struct mlx5_fw_tracer *tracer); 189 172 void mlx5_fw_tracer_cleanup(struct mlx5_fw_tracer *tracer); 190 173 void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer); 174 + int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev); 175 + int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer, 176 + struct devlink_fmsg *fmsg); 191 177 192 178 #endif
+1 -1
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
··· 64 64 { 65 65 struct mlx5_core_health *health = &priv->mdev->priv.health; 66 66 67 - return health->sick ? 1 : 0; 67 + return health->fatal_error ? 1 : 0; 68 68 } 69 69 70 70 static int mlx5e_test_link_state(struct mlx5e_priv *priv)
+496 -73
drivers/net/ethernet/mellanox/mlx5/core/health.c
··· 40 40 #include "mlx5_core.h" 41 41 #include "lib/eq.h" 42 42 #include "lib/mlx5.h" 43 + #include "lib/pci_vsc.h" 44 + #include "diag/fw_tracer.h" 43 45 44 46 enum { 45 47 MLX5_HEALTH_POLL_INTERVAL = 2 * HZ, ··· 64 62 65 63 enum { 66 64 MLX5_DROP_NEW_HEALTH_WORK, 67 - MLX5_DROP_NEW_RECOVERY_WORK, 65 + }; 66 + 67 + enum { 68 + MLX5_SENSOR_NO_ERR = 0, 69 + MLX5_SENSOR_PCI_COMM_ERR = 1, 70 + MLX5_SENSOR_PCI_ERR = 2, 71 + MLX5_SENSOR_NIC_DISABLED = 3, 72 + MLX5_SENSOR_NIC_SW_RESET = 4, 73 + MLX5_SENSOR_FW_SYND_RFR = 5, 68 74 }; 69 75 70 76 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev) 71 77 { 72 - return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 3; 78 + return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 7; 73 79 } 74 80 75 81 void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state) ··· 90 80 &dev->iseg->cmdq_addr_l_sz); 91 81 } 92 82 93 - static int in_fatal(struct mlx5_core_dev *dev) 83 + static bool sensor_pci_not_working(struct mlx5_core_dev *dev) 94 84 { 95 85 struct mlx5_core_health *health = &dev->priv.health; 96 86 struct health_buffer __iomem *h = health->health; 97 87 88 + /* Offline PCI reads return 0xffffffff */ 89 + return (ioread32be(&h->fw_ver) == 0xffffffff); 90 + } 91 + 92 + static bool sensor_fw_synd_rfr(struct mlx5_core_dev *dev) 93 + { 94 + struct mlx5_core_health *health = &dev->priv.health; 95 + struct health_buffer __iomem *h = health->health; 96 + u32 rfr = ioread32be(&h->rfr) >> MLX5_RFR_OFFSET; 97 + u8 synd = ioread8(&h->synd); 98 + 99 + if (rfr && synd) 100 + mlx5_core_dbg(dev, "FW requests reset, synd: %d\n", synd); 101 + return rfr && synd; 102 + } 103 + 104 + static u32 check_fatal_sensors(struct mlx5_core_dev *dev) 105 + { 106 + if (sensor_pci_not_working(dev)) 107 + return MLX5_SENSOR_PCI_COMM_ERR; 108 + if (pci_channel_offline(dev->pdev)) 109 + return MLX5_SENSOR_PCI_ERR; 98 110 if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED) 99 - return 1; 111 + return MLX5_SENSOR_NIC_DISABLED; 112 + if (mlx5_get_nic_state(dev) 
== MLX5_NIC_IFC_SW_RESET) 113 + return MLX5_SENSOR_NIC_SW_RESET; 114 + if (sensor_fw_synd_rfr(dev)) 115 + return MLX5_SENSOR_FW_SYND_RFR; 100 116 101 - if (ioread32be(&h->fw_ver) == 0xffffffff) 102 - return 1; 117 + return MLX5_SENSOR_NO_ERR; 118 + } 103 119 104 - return 0; 120 + static int lock_sem_sw_reset(struct mlx5_core_dev *dev, bool lock) 121 + { 122 + enum mlx5_vsc_state state; 123 + int ret; 124 + 125 + if (!mlx5_core_is_pf(dev)) 126 + return -EBUSY; 127 + 128 + /* Try to lock GW access, this stage doesn't return 129 + * EBUSY because locked GW does not mean that other PF 130 + * already started the reset. 131 + */ 132 + ret = mlx5_vsc_gw_lock(dev); 133 + if (ret == -EBUSY) 134 + return -EINVAL; 135 + if (ret) 136 + return ret; 137 + 138 + state = lock ? MLX5_VSC_LOCK : MLX5_VSC_UNLOCK; 139 + /* At this stage, if the return status == EBUSY, then we know 140 + * for sure that another PF started the reset, so don't allow 141 + * another reset. 142 + */ 143 + ret = mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET, state); 144 + if (ret) 145 + mlx5_core_warn(dev, "Failed to lock SW reset semaphore\n"); 146 + 147 + /* Unlock GW access */ 148 + mlx5_vsc_gw_unlock(dev); 149 + 150 + return ret; 151 + } 152 + 153 + static bool reset_fw_if_needed(struct mlx5_core_dev *dev) 154 + { 155 + bool supported = (ioread32be(&dev->iseg->initializing) >> 156 + MLX5_FW_RESET_SUPPORTED_OFFSET) & 1; 157 + u32 fatal_error; 158 + 159 + if (!supported) 160 + return false; 161 + 162 + /* The reset only needs to be issued by one PF. The health buffer is 163 + * shared between all functions, and will be cleared during a reset. 164 + * Check again to avoid a redundant 2nd reset. If the fatal erros was 165 + * PCI related a reset won't help. 
166 + */ 167 + fatal_error = check_fatal_sensors(dev); 168 + if (fatal_error == MLX5_SENSOR_PCI_COMM_ERR || 169 + fatal_error == MLX5_SENSOR_NIC_DISABLED || 170 + fatal_error == MLX5_SENSOR_NIC_SW_RESET) { 171 + mlx5_core_warn(dev, "Not issuing FW reset. Either it's already done or won't help."); 172 + return false; 173 + } 174 + 175 + mlx5_core_warn(dev, "Issuing FW Reset\n"); 176 + /* Write the NIC interface field to initiate the reset, the command 177 + * interface address also resides here, don't overwrite it. 178 + */ 179 + mlx5_set_nic_state(dev, MLX5_NIC_IFC_SW_RESET); 180 + 181 + return true; 105 182 } 106 183 107 184 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force) ··· 196 99 mutex_lock(&dev->intf_state_mutex); 197 100 if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) 198 101 goto unlock; 102 + if (dev->state == MLX5_DEVICE_STATE_UNINITIALIZED) { 103 + dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR; 104 + goto unlock; 105 + } 199 106 200 - mlx5_core_err(dev, "start\n"); 201 - if (pci_channel_offline(dev->pdev) || in_fatal(dev) || force) { 107 + if (check_fatal_sensors(dev) || force) { 202 108 dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR; 203 109 mlx5_cmd_flush(dev); 204 110 } 205 111 206 112 mlx5_notifier_call_chain(dev->priv.events, MLX5_DEV_EVENT_SYS_ERROR, (void *)1); 113 + unlock: 114 + mutex_unlock(&dev->intf_state_mutex); 115 + } 116 + 117 + #define MLX5_CRDUMP_WAIT_MS 60000 118 + #define MLX5_FW_RESET_WAIT_MS 1000 119 + void mlx5_error_sw_reset(struct mlx5_core_dev *dev) 120 + { 121 + unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS; 122 + int lock = -EBUSY; 123 + 124 + mutex_lock(&dev->intf_state_mutex); 125 + if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR) 126 + goto unlock; 127 + 128 + mlx5_core_err(dev, "start\n"); 129 + 130 + if (check_fatal_sensors(dev) == MLX5_SENSOR_FW_SYND_RFR) { 131 + /* Get cr-dump and reset FW semaphore */ 132 + lock = lock_sem_sw_reset(dev, true); 133 + 134 + if (lock == -EBUSY) { 135 + 
delay_ms = MLX5_CRDUMP_WAIT_MS; 136 + goto recover_from_sw_reset; 137 + } 138 + /* Execute SW reset */ 139 + reset_fw_if_needed(dev); 140 + } 141 + 142 + recover_from_sw_reset: 143 + /* Recover from SW reset */ 144 + end = jiffies + msecs_to_jiffies(delay_ms); 145 + do { 146 + if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED) 147 + break; 148 + 149 + cond_resched(); 150 + } while (!time_after(jiffies, end)); 151 + 152 + if (mlx5_get_nic_state(dev) != MLX5_NIC_IFC_DISABLED) { 153 + dev_err(&dev->pdev->dev, "NIC IFC still %d after %lums.\n", 154 + mlx5_get_nic_state(dev), delay_ms); 155 + } 156 + 157 + /* Release FW semaphore if you are the lock owner */ 158 + if (!lock) 159 + lock_sem_sw_reset(dev, false); 160 + 207 161 mlx5_core_err(dev, "end\n"); 208 162 209 163 unlock: ··· 277 129 case MLX5_NIC_IFC_NO_DRAM_NIC: 278 130 mlx5_core_warn(dev, "Expected to see disabled NIC but it is no dram nic\n"); 279 131 break; 132 + 133 + case MLX5_NIC_IFC_SW_RESET: 134 + /* The IFC mode field is 3 bits, so it will read 0x7 in 2 cases: 135 + * 1. PCI has been disabled (ie. PCI-AER, PF driver unloaded 136 + * and this is a VF), this is not recoverable by SW reset. 137 + * Logging of this is handled elsewhere. 138 + * 2. FW reset has been issued by another function, driver can 139 + * be reloaded to recover after the mode switches to 140 + * MLX5_NIC_IFC_DISABLED. 
141 + */ 142 + if (dev->priv.health.fatal_error != MLX5_SENSOR_PCI_COMM_ERR) 143 + mlx5_core_warn(dev, "NIC SW reset in progress\n"); 144 + break; 145 + 280 146 default: 281 147 mlx5_core_warn(dev, "Expected to see disabled NIC but it is has invalid value %d\n", 282 148 nic_interface); ··· 299 137 mlx5_disable_device(dev); 300 138 } 301 139 302 - static void health_recover(struct work_struct *work) 140 + /* How much time to wait until health resetting the driver (in msecs) */ 141 + #define MLX5_RECOVERY_WAIT_MSECS 60000 142 + static int mlx5_health_try_recover(struct mlx5_core_dev *dev) 303 143 { 304 - struct mlx5_core_health *health; 305 - struct delayed_work *dwork; 306 - struct mlx5_core_dev *dev; 307 - struct mlx5_priv *priv; 308 - u8 nic_state; 144 + unsigned long end; 309 145 310 - dwork = container_of(work, struct delayed_work, work); 311 - health = container_of(dwork, struct mlx5_core_health, recover_work); 312 - priv = container_of(health, struct mlx5_priv, health); 313 - dev = container_of(priv, struct mlx5_core_dev, priv); 314 - 315 - nic_state = mlx5_get_nic_state(dev); 316 - if (nic_state == MLX5_NIC_IFC_INVALID) { 317 - mlx5_core_err(dev, "health recovery flow aborted since the nic state is invalid\n"); 318 - return; 146 + mlx5_core_warn(dev, "handling bad device here\n"); 147 + mlx5_handle_bad_state(dev); 148 + end = jiffies + msecs_to_jiffies(MLX5_RECOVERY_WAIT_MSECS); 149 + while (sensor_pci_not_working(dev)) { 150 + if (time_after(jiffies, end)) { 151 + mlx5_core_err(dev, 152 + "health recovery flow aborted, PCI reads still not working\n"); 153 + return -EIO; 154 + } 155 + msleep(100); 319 156 } 320 157 321 158 mlx5_core_err(dev, "starting health recovery flow\n"); 322 159 mlx5_recover_device(dev); 323 - } 324 - 325 - /* How much time to wait until health resetting the driver (in msecs) */ 326 - #define MLX5_RECOVERY_DELAY_MSECS 60000 327 - static void health_care(struct work_struct *work) 328 - { 329 - unsigned long recover_delay = 
msecs_to_jiffies(MLX5_RECOVERY_DELAY_MSECS); 330 - struct mlx5_core_health *health; 331 - struct mlx5_core_dev *dev; 332 - struct mlx5_priv *priv; 333 - unsigned long flags; 334 - 335 - health = container_of(work, struct mlx5_core_health, work); 336 - priv = container_of(health, struct mlx5_priv, health); 337 - dev = container_of(priv, struct mlx5_core_dev, priv); 338 - mlx5_core_warn(dev, "handling bad device here\n"); 339 - mlx5_handle_bad_state(dev); 340 - 341 - spin_lock_irqsave(&health->wq_lock, flags); 342 - if (!test_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags)) 343 - schedule_delayed_work(&health->recover_work, recover_delay); 344 - else 345 - mlx5_core_err(dev, 346 - "new health works are not permitted at this stage\n"); 347 - spin_unlock_irqrestore(&health->wq_lock, flags); 160 + if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state) || 161 + check_fatal_sensors(dev)) { 162 + mlx5_core_err(dev, "health recovery failed\n"); 163 + return -EIO; 164 + } 165 + return 0; 348 166 } 349 167 350 168 static const char *hsynd_str(u8 synd) ··· 388 246 mlx5_core_err(dev, "raw fw_ver 0x%08x\n", fw); 389 247 } 390 248 249 + static int 250 + mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter, 251 + struct devlink_fmsg *fmsg) 252 + { 253 + struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter); 254 + struct mlx5_core_health *health = &dev->priv.health; 255 + struct health_buffer __iomem *h = health->health; 256 + u8 synd; 257 + int err; 258 + 259 + synd = ioread8(&h->synd); 260 + err = devlink_fmsg_u8_pair_put(fmsg, "Syndrome", synd); 261 + if (err || !synd) 262 + return err; 263 + return devlink_fmsg_string_pair_put(fmsg, "Description", hsynd_str(synd)); 264 + } 265 + 266 + struct mlx5_fw_reporter_ctx { 267 + u8 err_synd; 268 + int miss_counter; 269 + }; 270 + 271 + static int 272 + mlx5_fw_reporter_ctx_pairs_put(struct devlink_fmsg *fmsg, 273 + struct mlx5_fw_reporter_ctx *fw_reporter_ctx) 274 + { 275 + int err; 276 + 277 + err = 
devlink_fmsg_u8_pair_put(fmsg, "syndrome", 278 + fw_reporter_ctx->err_synd); 279 + if (err) 280 + return err; 281 + err = devlink_fmsg_u32_pair_put(fmsg, "fw_miss_counter", 282 + fw_reporter_ctx->miss_counter); 283 + if (err) 284 + return err; 285 + return 0; 286 + } 287 + 288 + static int 289 + mlx5_fw_reporter_heath_buffer_data_put(struct mlx5_core_dev *dev, 290 + struct devlink_fmsg *fmsg) 291 + { 292 + struct mlx5_core_health *health = &dev->priv.health; 293 + struct health_buffer __iomem *h = health->health; 294 + int err; 295 + int i; 296 + 297 + if (!ioread8(&h->synd)) 298 + return 0; 299 + 300 + err = devlink_fmsg_pair_nest_start(fmsg, "health buffer"); 301 + if (err) 302 + return err; 303 + err = devlink_fmsg_obj_nest_start(fmsg); 304 + if (err) 305 + return err; 306 + err = devlink_fmsg_arr_pair_nest_start(fmsg, "assert_var"); 307 + if (err) 308 + return err; 309 + 310 + for (i = 0; i < ARRAY_SIZE(h->assert_var); i++) { 311 + err = devlink_fmsg_u32_put(fmsg, ioread32be(h->assert_var + i)); 312 + if (err) 313 + return err; 314 + } 315 + err = devlink_fmsg_arr_pair_nest_end(fmsg); 316 + if (err) 317 + return err; 318 + err = devlink_fmsg_u32_pair_put(fmsg, "assert_exit_ptr", 319 + ioread32be(&h->assert_exit_ptr)); 320 + if (err) 321 + return err; 322 + err = devlink_fmsg_u32_pair_put(fmsg, "assert_callra", 323 + ioread32be(&h->assert_callra)); 324 + if (err) 325 + return err; 326 + err = devlink_fmsg_u32_pair_put(fmsg, "hw_id", ioread32be(&h->hw_id)); 327 + if (err) 328 + return err; 329 + err = devlink_fmsg_u8_pair_put(fmsg, "irisc_index", 330 + ioread8(&h->irisc_index)); 331 + if (err) 332 + return err; 333 + err = devlink_fmsg_u8_pair_put(fmsg, "synd", ioread8(&h->synd)); 334 + if (err) 335 + return err; 336 + err = devlink_fmsg_u32_pair_put(fmsg, "ext_synd", 337 + ioread16be(&h->ext_synd)); 338 + if (err) 339 + return err; 340 + err = devlink_fmsg_u32_pair_put(fmsg, "raw_fw_ver", 341 + ioread32be(&h->fw_ver)); 342 + if (err) 343 + return err; 344 + err 
= devlink_fmsg_obj_nest_end(fmsg);
345 + if (err)
346 + return err;
347 + return devlink_fmsg_pair_nest_end(fmsg);
348 + }
349 +
350 + static int
351 + mlx5_fw_reporter_dump(struct devlink_health_reporter *reporter,
352 + struct devlink_fmsg *fmsg, void *priv_ctx)
353 + {
354 + struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
355 + int err;
356 +
357 + err = mlx5_fw_tracer_trigger_core_dump_general(dev);
358 + if (err)
359 + return err;
360 +
361 + if (priv_ctx) {
362 + struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
363 +
364 + err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
365 + if (err)
366 + return err;
367 + }
368 +
369 + err = mlx5_fw_reporter_heath_buffer_data_put(dev, fmsg);
370 + if (err)
371 + return err;
372 + return mlx5_fw_tracer_get_saved_traces_objects(dev->tracer, fmsg);
373 + }
374 +
375 + static void mlx5_fw_reporter_err_work(struct work_struct *work)
376 + {
377 + struct mlx5_fw_reporter_ctx fw_reporter_ctx;
378 + struct mlx5_core_health *health;
379 +
380 + health = container_of(work, struct mlx5_core_health, report_work);
381 +
382 + if (IS_ERR_OR_NULL(health->fw_reporter))
383 + return;
384 +
385 + fw_reporter_ctx.err_synd = health->synd;
386 + fw_reporter_ctx.miss_counter = health->miss_counter;
387 + if (fw_reporter_ctx.err_synd) {
388 + devlink_health_report(health->fw_reporter,
389 + "FW syndrome reported", &fw_reporter_ctx);
390 + return;
391 + }
392 + if (fw_reporter_ctx.miss_counter)
393 + devlink_health_report(health->fw_reporter,
394 + "FW miss counter reported",
395 + &fw_reporter_ctx);
396 + }
397 +
398 + static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
399 + .name = "fw",
400 + .diagnose = mlx5_fw_reporter_diagnose,
401 + .dump = mlx5_fw_reporter_dump,
402 + };
403 +
404 + static int
405 + mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
406 + void *priv_ctx)
407 + {
408 + struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
409
+
410 + return mlx5_health_try_recover(dev);
411 + }
412 +
413 + #define MLX5_CR_DUMP_CHUNK_SIZE 256
414 + static int
415 + mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
416 + struct devlink_fmsg *fmsg, void *priv_ctx)
417 + {
418 + struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
419 + u32 crdump_size = dev->priv.health.crdump_size;
420 + u32 *cr_data;
421 + u32 data_size;
422 + u32 offset;
423 + int err;
424 +
425 + if (!mlx5_core_is_pf(dev))
426 + return -EPERM;
427 +
428 + cr_data = kvmalloc(crdump_size, GFP_KERNEL);
429 + if (!cr_data)
430 + return -ENOMEM;
431 + err = mlx5_crdump_collect(dev, cr_data);
432 + if (err)
433 + goto free_data;
434 +
435 + if (priv_ctx) {
436 + struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
437 +
438 + err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
439 + if (err)
440 + goto free_data;
441 + }
442 +
443 + err = devlink_fmsg_arr_pair_nest_start(fmsg, "crdump_data");
444 + if (err)
445 + goto free_data;
446 + for (offset = 0; offset < crdump_size; offset += data_size) {
447 + if (crdump_size - offset < MLX5_CR_DUMP_CHUNK_SIZE)
448 + data_size = crdump_size - offset;
449 + else
450 + data_size = MLX5_CR_DUMP_CHUNK_SIZE;
451 + err = devlink_fmsg_binary_put(fmsg, (char *)cr_data + offset, data_size);
452 + if (err)
453 + goto free_data;
454 + }
455 + err = devlink_fmsg_arr_pair_nest_end(fmsg);
456 +
457 + free_data:
458 + kvfree(cr_data);
459 + return err;
460 + }
461 +
462 + static void mlx5_fw_fatal_reporter_err_work(struct work_struct *work)
463 + {
464 + struct mlx5_fw_reporter_ctx fw_reporter_ctx;
465 + struct mlx5_core_health *health;
466 + struct mlx5_core_dev *dev;
467 + struct mlx5_priv *priv;
468 +
469 + health = container_of(work, struct mlx5_core_health, fatal_report_work);
470 + priv = container_of(health, struct mlx5_priv, health);
471 + dev = container_of(priv, struct mlx5_core_dev, priv);
472 +
473 + mlx5_enter_error_state(dev, false);
474 + if
(IS_ERR_OR_NULL(health->fw_fatal_reporter)) { 475 + if (mlx5_health_try_recover(dev)) 476 + mlx5_core_err(dev, "health recovery failed\n"); 477 + return; 478 + } 479 + fw_reporter_ctx.err_synd = health->synd; 480 + fw_reporter_ctx.miss_counter = health->miss_counter; 481 + devlink_health_report(health->fw_fatal_reporter, 482 + "FW fatal error reported", &fw_reporter_ctx); 483 + } 484 + 485 + static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = { 486 + .name = "fw_fatal", 487 + .recover = mlx5_fw_fatal_reporter_recover, 488 + .dump = mlx5_fw_fatal_reporter_dump, 489 + }; 490 + 491 + #define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000 492 + static void mlx5_fw_reporters_create(struct mlx5_core_dev *dev) 493 + { 494 + struct mlx5_core_health *health = &dev->priv.health; 495 + struct devlink *devlink = priv_to_devlink(dev); 496 + 497 + health->fw_reporter = 498 + devlink_health_reporter_create(devlink, &mlx5_fw_reporter_ops, 499 + 0, false, dev); 500 + if (IS_ERR(health->fw_reporter)) 501 + mlx5_core_warn(dev, "Failed to create fw reporter, err = %ld\n", 502 + PTR_ERR(health->fw_reporter)); 503 + 504 + health->fw_fatal_reporter = 505 + devlink_health_reporter_create(devlink, 506 + &mlx5_fw_fatal_reporter_ops, 507 + MLX5_REPORTER_FW_GRACEFUL_PERIOD, 508 + true, dev); 509 + if (IS_ERR(health->fw_fatal_reporter)) 510 + mlx5_core_warn(dev, "Failed to create fw fatal reporter, err = %ld\n", 511 + PTR_ERR(health->fw_fatal_reporter)); 512 + } 513 + 514 + static void mlx5_fw_reporters_destroy(struct mlx5_core_dev *dev) 515 + { 516 + struct mlx5_core_health *health = &dev->priv.health; 517 + 518 + if (!IS_ERR_OR_NULL(health->fw_reporter)) 519 + devlink_health_reporter_destroy(health->fw_reporter); 520 + 521 + if (!IS_ERR_OR_NULL(health->fw_fatal_reporter)) 522 + devlink_health_reporter_destroy(health->fw_fatal_reporter); 523 + } 524 + 391 525 static unsigned long get_next_poll_jiffies(void) 392 526 { 393 527 unsigned long next; ··· 682 264 683 265 
spin_lock_irqsave(&health->wq_lock, flags); 684 266 if (!test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags)) 685 - queue_work(health->wq, &health->work); 267 + queue_work(health->wq, &health->fatal_report_work); 686 268 else 687 269 mlx5_core_err(dev, "new health works are not permitted at this stage\n"); 688 270 spin_unlock_irqrestore(&health->wq_lock, flags); ··· 692 274 { 693 275 struct mlx5_core_dev *dev = from_timer(dev, t, priv.health.timer); 694 276 struct mlx5_core_health *health = &dev->priv.health; 277 + struct health_buffer __iomem *h = health->health; 278 + u32 fatal_error; 279 + u8 prev_synd; 695 280 u32 count; 696 281 697 282 if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) ··· 710 289 if (health->miss_counter == MAX_MISSES) { 711 290 mlx5_core_err(dev, "device's health compromised - reached miss count\n"); 712 291 print_health_info(dev); 292 + queue_work(health->wq, &health->report_work); 713 293 } 714 294 715 - if (in_fatal(dev) && !health->sick) { 716 - health->sick = true; 295 + prev_synd = health->synd; 296 + health->synd = ioread8(&h->synd); 297 + if (health->synd && health->synd != prev_synd) 298 + queue_work(health->wq, &health->report_work); 299 + 300 + fatal_error = check_fatal_sensors(dev); 301 + 302 + if (fatal_error && !health->fatal_error) { 303 + mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error); 304 + dev->priv.health.fatal_error = fatal_error; 717 305 print_health_info(dev); 718 306 mlx5_trigger_health_work(dev); 719 307 } ··· 736 306 struct mlx5_core_health *health = &dev->priv.health; 737 307 738 308 timer_setup(&health->timer, poll_health, 0); 739 - health->sick = 0; 309 + health->fatal_error = MLX5_SENSOR_NO_ERR; 740 310 clear_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags); 741 - clear_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); 742 311 health->health = &dev->iseg->health; 743 312 health->health_counter = &dev->iseg->health_counter; 744 313 ··· 753 324 if (disable_health) { 754 325 
spin_lock_irqsave(&health->wq_lock, flags); 755 326 set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags); 756 - set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); 757 327 spin_unlock_irqrestore(&health->wq_lock, flags); 758 328 } 759 329 ··· 766 338 767 339 spin_lock_irqsave(&health->wq_lock, flags); 768 340 set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags); 769 - set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); 770 341 spin_unlock_irqrestore(&health->wq_lock, flags); 771 - cancel_delayed_work_sync(&health->recover_work); 772 - cancel_work_sync(&health->work); 773 - } 774 - 775 - void mlx5_drain_health_recovery(struct mlx5_core_dev *dev) 776 - { 777 - struct mlx5_core_health *health = &dev->priv.health; 778 - unsigned long flags; 779 - 780 - spin_lock_irqsave(&health->wq_lock, flags); 781 - set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); 782 - spin_unlock_irqrestore(&health->wq_lock, flags); 783 - cancel_delayed_work_sync(&dev->priv.health.recover_work); 342 + cancel_work_sync(&health->report_work); 343 + cancel_work_sync(&health->fatal_report_work); 784 344 } 785 345 786 346 void mlx5_health_flush(struct mlx5_core_dev *dev) ··· 783 367 struct mlx5_core_health *health = &dev->priv.health; 784 368 785 369 destroy_workqueue(health->wq); 370 + mlx5_fw_reporters_destroy(dev); 786 371 } 787 372 788 373 int mlx5_health_init(struct mlx5_core_dev *dev) ··· 791 374 struct mlx5_core_health *health; 792 375 char *name; 793 376 377 + mlx5_fw_reporters_create(dev); 378 + 794 379 health = &dev->priv.health; 795 380 name = kmalloc(64, GFP_KERNEL); 796 381 if (!name) 797 - return -ENOMEM; 382 + goto out_err; 798 383 799 384 strcpy(name, "mlx5_health"); 800 385 strcat(name, dev_name(dev->device)); 801 386 health->wq = create_singlethread_workqueue(name); 802 387 kfree(name); 803 388 if (!health->wq) 804 - return -ENOMEM; 389 + goto out_err; 805 390 spin_lock_init(&health->wq_lock); 806 - INIT_WORK(&health->work, health_care); 807 - 
INIT_DELAYED_WORK(&health->recover_work, health_recover); 391 + INIT_WORK(&health->fatal_report_work, mlx5_fw_fatal_reporter_err_work); 392 + INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work); 808 393 809 394 return 0; 395 + 396 + out_err: 397 + mlx5_fw_reporters_destroy(dev); 398 + return -ENOMEM; 810 399 }
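The health.c changes above replace the delayed recover work with mlx5_health_try_recover(), a bounded poll: keep sampling the PCI sensor until it clears or a 60-second deadline passes. A minimal userspace sketch of that deadline-poll shape (the function name and the explicit clock parameters are ours; the kernel uses jiffies/time_after() and msleep(100)):

```c
#include <assert.h>

/* Deadline-poll sketch in the shape of mlx5_health_try_recover(). The
 * sensor is modeled as a countdown: it reads "not working" while
 * *polls_until_healthy > 0. The clock is an explicit parameter so the
 * logic can run anywhere; all names are illustrative, not the driver's. */
static int try_recover_poll(int *polls_until_healthy, long now, long deadline,
			    long step)
{
	while (*polls_until_healthy > 0) {	/* sensor_pci_not_working() */
		if (now > deadline)
			return -1;		/* give up: reads still broken */
		now += step;			/* stands in for msleep(100) */
		(*polls_until_healthy)--;
	}
	return 0;				/* proceed with recovery */
}
```

The real function then calls mlx5_recover_device() and re-checks the fatal sensors before declaring the recovery successful.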
+3
drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
··· 41 41 void mlx5_core_unreserve_gids(struct mlx5_core_dev *dev, unsigned int count); 42 42 int mlx5_core_reserved_gid_alloc(struct mlx5_core_dev *dev, int *gid_index); 43 43 void mlx5_core_reserved_gid_free(struct mlx5_core_dev *dev, int gid_index); 44 + int mlx5_crdump_enable(struct mlx5_core_dev *dev); 45 + void mlx5_crdump_disable(struct mlx5_core_dev *dev); 46 + int mlx5_crdump_collect(struct mlx5_core_dev *dev, u32 *cr_data); 44 47 45 48 /* TODO move to lib/events.h */ 46 49
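mlx5_crdump_collect(), declared here, fills a buffer that the fatal reporter later streams into the devlink fmsg in 256-byte pieces. The chunking arithmetic from that loop, pulled out as a standalone sketch (the function name and the sizes_out[] recording array are our additions):

```c
#include <assert.h>

/* Chunking arithmetic from mlx5_fw_fatal_reporter_dump(): a crdump buffer
 * of crdump_size bytes is emitted in fixed-size pieces plus a short tail.
 * sizes_out[] records each piece so the split is visible. */
#define CR_DUMP_CHUNK_SIZE 256

static int crdump_chunk_sizes(unsigned int crdump_size, unsigned int *sizes_out)
{
	unsigned int offset, data_size;
	int n = 0;

	for (offset = 0; offset < crdump_size; offset += data_size) {
		if (crdump_size - offset < CR_DUMP_CHUNK_SIZE)
			data_size = crdump_size - offset;
		else
			data_size = CR_DUMP_CHUNK_SIZE;
		sizes_out[n++] = data_size; /* kernel: devlink_fmsg_binary_put() */
	}
	return n;
}
```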
+316
drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 2 + /* Copyright (c) 2019 Mellanox Technologies */ 3 + 4 + #include <linux/pci.h> 5 + #include "mlx5_core.h" 6 + #include "pci_vsc.h" 7 + 8 + #define MLX5_EXTRACT_C(source, offset, size) \ 9 + ((((u32)(source)) >> (offset)) & MLX5_ONES32(size)) 10 + #define MLX5_EXTRACT(src, start, len) \ 11 + (((len) == 32) ? (src) : MLX5_EXTRACT_C(src, start, len)) 12 + #define MLX5_ONES32(size) \ 13 + ((size) ? (0xffffffff >> (32 - (size))) : 0) 14 + #define MLX5_MASK32(offset, size) \ 15 + (MLX5_ONES32(size) << (offset)) 16 + #define MLX5_MERGE_C(rsrc1, rsrc2, start, len) \ 17 + ((((rsrc2) << (start)) & (MLX5_MASK32((start), (len)))) | \ 18 + ((rsrc1) & (~MLX5_MASK32((start), (len))))) 19 + #define MLX5_MERGE(rsrc1, rsrc2, start, len) \ 20 + (((len) == 32) ? (rsrc2) : MLX5_MERGE_C(rsrc1, rsrc2, start, len)) 21 + #define vsc_read(dev, offset, val) \ 22 + pci_read_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val)) 23 + #define vsc_write(dev, offset, val) \ 24 + pci_write_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val)) 25 + #define VSC_MAX_RETRIES 2048 26 + 27 + enum { 28 + VSC_CTRL_OFFSET = 0x4, 29 + VSC_COUNTER_OFFSET = 0x8, 30 + VSC_SEMAPHORE_OFFSET = 0xc, 31 + VSC_ADDR_OFFSET = 0x10, 32 + VSC_DATA_OFFSET = 0x14, 33 + 34 + VSC_FLAG_BIT_OFFS = 31, 35 + VSC_FLAG_BIT_LEN = 1, 36 + 37 + VSC_SYND_BIT_OFFS = 30, 38 + VSC_SYND_BIT_LEN = 1, 39 + 40 + VSC_ADDR_BIT_OFFS = 0, 41 + VSC_ADDR_BIT_LEN = 30, 42 + 43 + VSC_SPACE_BIT_OFFS = 0, 44 + VSC_SPACE_BIT_LEN = 16, 45 + 46 + VSC_SIZE_VLD_BIT_OFFS = 28, 47 + VSC_SIZE_VLD_BIT_LEN = 1, 48 + 49 + VSC_STATUS_BIT_OFFS = 29, 50 + VSC_STATUS_BIT_LEN = 3, 51 + }; 52 + 53 + void mlx5_pci_vsc_init(struct mlx5_core_dev *dev) 54 + { 55 + if (!mlx5_core_is_pf(dev)) 56 + return; 57 + 58 + dev->vsc_addr = pci_find_capability(dev->pdev, 59 + PCI_CAP_ID_VNDR); 60 + if (!dev->vsc_addr) 61 + mlx5_core_warn(dev, "Failed to get valid vendor specific ID\n"); 62 + } 63 + 64 + int 
mlx5_vsc_gw_lock(struct mlx5_core_dev *dev) 65 + { 66 + u32 counter = 0; 67 + int retries = 0; 68 + u32 lock_val; 69 + int ret; 70 + 71 + pci_cfg_access_lock(dev->pdev); 72 + do { 73 + if (retries > VSC_MAX_RETRIES) { 74 + ret = -EBUSY; 75 + goto pci_unlock; 76 + } 77 + 78 + /* Check if semaphore is already locked */ 79 + ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val); 80 + if (ret) 81 + goto pci_unlock; 82 + 83 + if (lock_val) { 84 + retries++; 85 + usleep_range(1000, 2000); 86 + continue; 87 + } 88 + 89 + /* Read and write counter value, if written value is 90 + * the same, semaphore was acquired successfully. 91 + */ 92 + ret = vsc_read(dev, VSC_COUNTER_OFFSET, &counter); 93 + if (ret) 94 + goto pci_unlock; 95 + 96 + ret = vsc_write(dev, VSC_SEMAPHORE_OFFSET, counter); 97 + if (ret) 98 + goto pci_unlock; 99 + 100 + ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val); 101 + if (ret) 102 + goto pci_unlock; 103 + 104 + retries++; 105 + } while (counter != lock_val); 106 + 107 + return 0; 108 + 109 + pci_unlock: 110 + pci_cfg_access_unlock(dev->pdev); 111 + return ret; 112 + } 113 + 114 + int mlx5_vsc_gw_unlock(struct mlx5_core_dev *dev) 115 + { 116 + int ret; 117 + 118 + ret = vsc_write(dev, VSC_SEMAPHORE_OFFSET, MLX5_VSC_UNLOCK); 119 + pci_cfg_access_unlock(dev->pdev); 120 + return ret; 121 + } 122 + 123 + int mlx5_vsc_gw_set_space(struct mlx5_core_dev *dev, u16 space, 124 + u32 *ret_space_size) 125 + { 126 + int ret; 127 + u32 val = 0; 128 + 129 + if (!mlx5_vsc_accessible(dev)) 130 + return -EINVAL; 131 + 132 + if (ret_space_size) 133 + *ret_space_size = 0; 134 + 135 + /* Get a unique val */ 136 + ret = vsc_read(dev, VSC_CTRL_OFFSET, &val); 137 + if (ret) 138 + goto out; 139 + 140 + /* Try to modify the lock */ 141 + val = MLX5_MERGE(val, space, VSC_SPACE_BIT_OFFS, VSC_SPACE_BIT_LEN); 142 + ret = vsc_write(dev, VSC_CTRL_OFFSET, val); 143 + if (ret) 144 + goto out; 145 + 146 + /* Verify lock was modified */ 147 + ret = vsc_read(dev, VSC_CTRL_OFFSET, &val); 
148 + if (ret) 149 + goto out; 150 + 151 + if (MLX5_EXTRACT(val, VSC_STATUS_BIT_OFFS, VSC_STATUS_BIT_LEN) == 0) 152 + return -EINVAL; 153 + 154 + /* Get space max address if indicated by size valid bit */ 155 + if (ret_space_size && 156 + MLX5_EXTRACT(val, VSC_SIZE_VLD_BIT_OFFS, VSC_SIZE_VLD_BIT_LEN)) { 157 + ret = vsc_read(dev, VSC_ADDR_OFFSET, &val); 158 + if (ret) { 159 + mlx5_core_warn(dev, "Failed to get max space size\n"); 160 + goto out; 161 + } 162 + *ret_space_size = MLX5_EXTRACT(val, VSC_ADDR_BIT_OFFS, 163 + VSC_ADDR_BIT_LEN); 164 + } 165 + return 0; 166 + 167 + out: 168 + return ret; 169 + } 170 + 171 + static int mlx5_vsc_wait_on_flag(struct mlx5_core_dev *dev, u8 expected_val) 172 + { 173 + int retries = 0; 174 + u32 flag; 175 + int ret; 176 + 177 + do { 178 + if (retries > VSC_MAX_RETRIES) 179 + return -EBUSY; 180 + 181 + ret = vsc_read(dev, VSC_ADDR_OFFSET, &flag); 182 + if (ret) 183 + return ret; 184 + flag = MLX5_EXTRACT(flag, VSC_FLAG_BIT_OFFS, VSC_FLAG_BIT_LEN); 185 + retries++; 186 + 187 + if ((retries & 0xf) == 0) 188 + usleep_range(1000, 2000); 189 + 190 + } while (flag != expected_val); 191 + 192 + return 0; 193 + } 194 + 195 + static int mlx5_vsc_gw_write(struct mlx5_core_dev *dev, unsigned int address, 196 + u32 data) 197 + { 198 + int ret; 199 + 200 + if (MLX5_EXTRACT(address, VSC_SYND_BIT_OFFS, 201 + VSC_FLAG_BIT_LEN + VSC_SYND_BIT_LEN)) 202 + return -EINVAL; 203 + 204 + /* Set flag to 0x1 */ 205 + address = MLX5_MERGE(address, 1, VSC_FLAG_BIT_OFFS, 1); 206 + ret = vsc_write(dev, VSC_DATA_OFFSET, data); 207 + if (ret) 208 + goto out; 209 + 210 + ret = vsc_write(dev, VSC_ADDR_OFFSET, address); 211 + if (ret) 212 + goto out; 213 + 214 + /* Wait for the flag to be cleared */ 215 + ret = mlx5_vsc_wait_on_flag(dev, 0); 216 + 217 + out: 218 + return ret; 219 + } 220 + 221 + static int mlx5_vsc_gw_read(struct mlx5_core_dev *dev, unsigned int address, 222 + u32 *data) 223 + { 224 + int ret; 225 + 226 + if (MLX5_EXTRACT(address, VSC_SYND_BIT_OFFS, 
227 + VSC_FLAG_BIT_LEN + VSC_SYND_BIT_LEN)) 228 + return -EINVAL; 229 + 230 + ret = vsc_write(dev, VSC_ADDR_OFFSET, address); 231 + if (ret) 232 + goto out; 233 + 234 + ret = mlx5_vsc_wait_on_flag(dev, 1); 235 + if (ret) 236 + goto out; 237 + 238 + ret = vsc_read(dev, VSC_DATA_OFFSET, data); 239 + out: 240 + return ret; 241 + } 242 + 243 + static int mlx5_vsc_gw_read_fast(struct mlx5_core_dev *dev, 244 + unsigned int read_addr, 245 + unsigned int *next_read_addr, 246 + u32 *data) 247 + { 248 + int ret; 249 + 250 + ret = mlx5_vsc_gw_read(dev, read_addr, data); 251 + if (ret) 252 + goto out; 253 + 254 + ret = vsc_read(dev, VSC_ADDR_OFFSET, next_read_addr); 255 + if (ret) 256 + goto out; 257 + 258 + *next_read_addr = MLX5_EXTRACT(*next_read_addr, VSC_ADDR_BIT_OFFS, 259 + VSC_ADDR_BIT_LEN); 260 + 261 + if (*next_read_addr <= read_addr) 262 + ret = -EINVAL; 263 + out: 264 + return ret; 265 + } 266 + 267 + int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data, 268 + int length) 269 + { 270 + unsigned int next_read_addr = 0; 271 + unsigned int read_addr = 0; 272 + 273 + while (read_addr < length) { 274 + if (mlx5_vsc_gw_read_fast(dev, read_addr, &next_read_addr, 275 + &data[(read_addr >> 2)])) 276 + return read_addr; 277 + 278 + read_addr = next_read_addr; 279 + } 280 + return length; 281 + } 282 + 283 + int mlx5_vsc_sem_set_space(struct mlx5_core_dev *dev, u16 space, 284 + enum mlx5_vsc_state state) 285 + { 286 + u32 data, id = 0; 287 + int ret; 288 + 289 + ret = mlx5_vsc_gw_set_space(dev, MLX5_SEMAPHORE_SPACE_DOMAIN, NULL); 290 + if (ret) { 291 + mlx5_core_warn(dev, "Failed to set gw space %d\n", ret); 292 + return ret; 293 + } 294 + 295 + if (state == MLX5_VSC_LOCK) { 296 + /* Get a unique ID based on the counter */ 297 + ret = vsc_read(dev, VSC_COUNTER_OFFSET, &id); 298 + if (ret) 299 + return ret; 300 + } 301 + 302 + /* Try to modify lock */ 303 + ret = mlx5_vsc_gw_write(dev, space, id); 304 + if (ret) 305 + return ret; 306 + 307 + /* Verify lock was 
modified */ 308 + ret = mlx5_vsc_gw_read(dev, space, &data); 309 + if (ret) 310 + return -EINVAL; 311 + 312 + if (data != id) 313 + return -EBUSY; 314 + 315 + return 0; 316 + }
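Everything pci_vsc.c does with registers goes through the MLX5_EXTRACT/MLX5_MERGE bit-field helpers defined at the top of the file. A userspace restatement of those macros (renamed without the MLX5_ prefix, otherwise term-for-term the same):

```c
#include <assert.h>
#include <stdint.h>

/* EXTRACT pulls a "len"-bit field starting at bit "start" out of a 32-bit
 * word; MERGE writes one back without disturbing neighbouring bits. The
 * len == 32 special case avoids the undefined full-width shift. */
#define ONES32(size)	((size) ? (0xffffffffu >> (32 - (size))) : 0)
#define MASK32(off, size)	(ONES32(size) << (off))
#define EXTRACT(src, start, len) \
	(((len) == 32) ? (src) : (((uint32_t)(src) >> (start)) & ONES32(len)))
#define MERGE(dst, field, start, len) \
	(((len) == 32) ? (field) : \
	 ((((uint32_t)(field) << (start)) & MASK32(start, len)) | \
	  ((uint32_t)(dst) & ~MASK32(start, len))))
```

The `(size) ? ... : 0` guard in ONES32 exists because shifting a 32-bit value by 32 bits is undefined behaviour in C; the driver's MLX5_ONES32 carries the same guard.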
+32
drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 2 + /* Copyright (c) 2019 Mellanox Technologies */ 3 + 4 + #ifndef __MLX5_PCI_VSC_H__ 5 + #define __MLX5_PCI_VSC_H__ 6 + 7 + enum mlx5_vsc_state { 8 + MLX5_VSC_UNLOCK, 9 + MLX5_VSC_LOCK, 10 + }; 11 + 12 + enum { 13 + MLX5_VSC_SPACE_SCAN_CRSPACE = 0x7, 14 + }; 15 + 16 + void mlx5_pci_vsc_init(struct mlx5_core_dev *dev); 17 + int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev); 18 + int mlx5_vsc_gw_unlock(struct mlx5_core_dev *dev); 19 + int mlx5_vsc_gw_set_space(struct mlx5_core_dev *dev, u16 space, 20 + u32 *ret_space_size); 21 + int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data, 22 + int length); 23 + 24 + static inline bool mlx5_vsc_accessible(struct mlx5_core_dev *dev) 25 + { 26 + return !!dev->vsc_addr; 27 + } 28 + 29 + int mlx5_vsc_sem_set_space(struct mlx5_core_dev *dev, u16 space, 30 + enum mlx5_vsc_state state); 31 + 32 + #endif /* __MLX5_PCI_VSC_H__ */
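The gateway lock behind mlx5_vsc_gw_lock() is a counter-ticket handshake: the semaphore reads 0 when free, you write the free-running counter value into it, and a matching readback means you own it. A toy single-agent model of that handshake (struct and function names are ours; the real driver goes through PCI config space, and the readback exists precisely because a concurrent agent's write could win):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the VSC semaphore registers as plain variables. */
struct vsc_regs {
	uint32_t semaphore;	/* 0 (MLX5_VSC_UNLOCK) means free */
	uint32_t counter;	/* free-running ticket source */
};

static int vsc_try_lock(struct vsc_regs *r)
{
	uint32_t token;

	if (r->semaphore)	/* already held by someone */
		return -1;
	token = ++r->counter;	/* unique ticket for this attempt */
	r->semaphore = token;	/* attempt: write our ticket */
	/* In this single-agent model the readback always matches; on real
	 * hardware a racing writer could have stored its ticket instead. */
	return r->semaphore == token ? 0 : -1;
}

static void vsc_unlock(struct vsc_regs *r)
{
	r->semaphore = 0;	/* write MLX5_VSC_UNLOCK */
}
```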
+21 -44
drivers/net/ethernet/mellanox/mlx5/core/main.c
··· 56 56 #include "fs_core.h" 57 57 #include "lib/mpfs.h" 58 58 #include "eswitch.h" 59 + #include "devlink.h" 59 60 #include "lib/mlx5.h" 60 61 #include "fpga/core.h" 61 62 #include "fpga/ipsec.h" ··· 66 65 #include "lib/vxlan.h" 67 66 #include "lib/geneve.h" 68 67 #include "lib/devcom.h" 68 + #include "lib/pci_vsc.h" 69 69 #include "diag/fw_tracer.h" 70 70 #include "ecpf.h" 71 71 ··· 764 762 goto err_clr_master; 765 763 } 766 764 765 + mlx5_pci_vsc_init(dev); 766 + 767 767 return 0; 768 768 769 769 err_clr_master: ··· 1191 1187 int err = 0; 1192 1188 1193 1189 if (cleanup) 1194 - mlx5_drain_health_recovery(dev); 1190 + mlx5_drain_health_wq(dev); 1195 1191 1196 1192 mutex_lock(&dev->intf_state_mutex); 1197 1193 if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) { ··· 1217 1213 mutex_unlock(&dev->intf_state_mutex); 1218 1214 return err; 1219 1215 } 1220 - 1221 - static int mlx5_devlink_flash_update(struct devlink *devlink, 1222 - const char *file_name, 1223 - const char *component, 1224 - struct netlink_ext_ack *extack) 1225 - { 1226 - struct mlx5_core_dev *dev = devlink_priv(devlink); 1227 - const struct firmware *fw; 1228 - int err; 1229 - 1230 - if (component) 1231 - return -EOPNOTSUPP; 1232 - 1233 - err = request_firmware_direct(&fw, file_name, &dev->pdev->dev); 1234 - if (err) 1235 - return err; 1236 - 1237 - return mlx5_firmware_flash(dev, fw, extack); 1238 - } 1239 - 1240 - static const struct devlink_ops mlx5_devlink_ops = { 1241 - #ifdef CONFIG_MLX5_ESWITCH 1242 - .eswitch_mode_set = mlx5_devlink_eswitch_mode_set, 1243 - .eswitch_mode_get = mlx5_devlink_eswitch_mode_get, 1244 - .eswitch_inline_mode_set = mlx5_devlink_eswitch_inline_mode_set, 1245 - .eswitch_inline_mode_get = mlx5_devlink_eswitch_inline_mode_get, 1246 - .eswitch_encap_mode_set = mlx5_devlink_eswitch_encap_mode_set, 1247 - .eswitch_encap_mode_get = mlx5_devlink_eswitch_encap_mode_get, 1248 - #endif 1249 - .flash_update = mlx5_devlink_flash_update, 1250 - }; 1251 1216 1252 1217 static 
int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) 1253 1218 { ··· 1279 1306 struct devlink *devlink; 1280 1307 int err; 1281 1308 1282 - devlink = devlink_alloc(&mlx5_devlink_ops, sizeof(*dev)); 1309 + devlink = mlx5_devlink_alloc(); 1283 1310 if (!devlink) { 1284 - dev_err(&pdev->dev, "kzalloc failed\n"); 1311 + dev_err(&pdev->dev, "devlink alloc failed\n"); 1285 1312 return -ENOMEM; 1286 1313 } 1287 1314 ··· 1309 1336 1310 1337 request_module_nowait(MLX5_IB_MOD); 1311 1338 1312 - err = devlink_register(devlink, &pdev->dev); 1339 + err = mlx5_devlink_register(devlink, &pdev->dev); 1313 1340 if (err) 1314 1341 goto clean_load; 1342 + 1343 + err = mlx5_crdump_enable(dev); 1344 + if (err) 1345 + dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err); 1315 1346 1316 1347 pci_save_state(pdev); 1317 1348 return 0; ··· 1328 1351 pci_init_err: 1329 1352 mlx5_mdev_uninit(dev); 1330 1353 mdev_init_err: 1331 - devlink_free(devlink); 1354 + mlx5_devlink_free(devlink); 1332 1355 1333 1356 return err; 1334 1357 } ··· 1338 1361 struct mlx5_core_dev *dev = pci_get_drvdata(pdev); 1339 1362 struct devlink *devlink = priv_to_devlink(dev); 1340 1363 1341 - devlink_unregister(devlink); 1364 + mlx5_crdump_disable(dev); 1365 + mlx5_devlink_unregister(devlink); 1342 1366 mlx5_unregister_device(dev); 1343 1367 1344 1368 if (mlx5_unload_one(dev, true)) { ··· 1350 1372 1351 1373 mlx5_pci_close(dev); 1352 1374 mlx5_mdev_uninit(dev); 1353 - devlink_free(devlink); 1375 + mlx5_devlink_free(devlink); 1354 1376 } 1355 1377 1356 1378 static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev, ··· 1361 1383 mlx5_core_info(dev, "%s was called\n", __func__); 1362 1384 1363 1385 mlx5_enter_error_state(dev, false); 1386 + mlx5_error_sw_reset(dev); 1364 1387 mlx5_unload_one(dev, false); 1365 - /* In case of kernel call drain the health wq */ 1366 - if (state) { 1367 - mlx5_drain_health_wq(dev); 1368 - mlx5_pci_disable_device(dev); 1369 - } 1388 + 
mlx5_drain_health_wq(dev); 1389 + mlx5_pci_disable_device(dev); 1370 1390 1371 1391 return state == pci_channel_io_perm_failure ? 1372 1392 PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET; ··· 1532 1556 1533 1557 void mlx5_disable_device(struct mlx5_core_dev *dev) 1534 1558 { 1535 - mlx5_pci_err_detected(dev->pdev, 0); 1559 + mlx5_error_sw_reset(dev); 1560 + mlx5_unload_one(dev, false); 1536 1561 } 1537 1562 1538 1563 void mlx5_recover_device(struct mlx5_core_dev *dev)
+7 -1
drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
··· 111 111 MLX5_DRIVER_SYND = 0xbadd00de, 112 112 }; 113 113 114 + enum mlx5_semaphore_space_address { 115 + MLX5_SEMAPHORE_SPACE_DOMAIN = 0xA, 116 + MLX5_SEMAPHORE_SW_RESET = 0x20, 117 + }; 118 + 114 119 int mlx5_query_hca_caps(struct mlx5_core_dev *dev); 115 120 int mlx5_query_board_id(struct mlx5_core_dev *dev); 116 121 int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, uint32_t *sw_owner_id); ··· 123 118 int mlx5_cmd_force_teardown_hca(struct mlx5_core_dev *dev); 124 119 int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev); 125 120 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force); 121 + void mlx5_error_sw_reset(struct mlx5_core_dev *dev); 126 122 void mlx5_disable_device(struct mlx5_core_dev *dev); 127 123 void mlx5_recover_device(struct mlx5_core_dev *dev); 128 124 int mlx5_sriov_init(struct mlx5_core_dev *dev); ··· 220 214 MLX5_NIC_IFC_FULL = 0, 221 215 MLX5_NIC_IFC_DISABLED = 1, 222 216 MLX5_NIC_IFC_NO_DRAM_NIC = 2, 223 - MLX5_NIC_IFC_INVALID = 3 217 + MLX5_NIC_IFC_SW_RESET = 7 224 218 }; 225 219 226 220 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev);
+9 -1
include/linux/mlx5/device.h
··· 510 510 u8 status_own; 511 511 }; 512 512 513 + enum mlx5_fatal_assert_bit_offsets { 514 + MLX5_RFR_OFFSET = 31, 515 + }; 516 + 513 517 struct health_buffer { 514 518 __be32 assert_var[5]; 515 519 __be32 rsvd0[3]; ··· 522 518 __be32 rsvd1[2]; 523 519 __be32 fw_ver; 524 520 __be32 hw_id; 525 - __be32 rsvd2; 521 + __be32 rfr; 526 522 u8 irisc_index; 527 523 u8 synd; 528 524 __be16 ext_synd; 525 + }; 526 + 527 + enum mlx5_initializing_bit_offsets { 528 + MLX5_FW_RESET_SUPPORTED_OFFSET = 30, 529 529 }; 530 530 531 531 enum mlx5_cmd_addr_l_sz_offset {
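The health buffer fields, including the new rfr word, are big-endian on the wire, and MLX5_RFR_OFFSET names bit 31 ("recovery flow required") inside that word. A hypothetical helper showing the byte-order handling (load_be32 stands in for the driver's ioread32be(); both function names are ours):

```c
#include <assert.h>
#include <stdint.h>

#define RFR_OFFSET 31	/* MLX5_RFR_OFFSET in the patch */

/* Assemble a host-order u32 from four big-endian bytes, regardless of
 * host endianness; stands in for ioread32be() on the health buffer. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Does this rfr word request a recovery flow? */
static int rfr_bit_set(const uint8_t rfr[4])
{
	return (load_be32(rfr) >> RFR_OFFSET) & 1;
}
```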
+10 -3
include/linux/mlx5/driver.h
··· 53 53 #include <linux/mlx5/eq.h> 54 54 #include <linux/timecounter.h> 55 55 #include <linux/ptp_clock_kernel.h> 56 + #include <net/devlink.h> 56 57 57 58 enum { 58 59 MLX5_BOARD_ID_LEN = 64, ··· 435 434 struct timer_list timer; 436 435 u32 prev; 437 436 int miss_counter; 438 - bool sick; 437 + u8 synd; 438 + u32 fatal_error; 439 + u32 crdump_size; 439 440 /* wq spinlock to synchronize draining */ 440 441 spinlock_t wq_lock; 441 442 struct workqueue_struct *wq; 442 443 unsigned long flags; 443 - struct work_struct work; 444 + struct work_struct fatal_report_work; 445 + struct work_struct report_work; 444 446 struct delayed_work recover_work; 447 + struct devlink_health_reporter *fw_reporter; 448 + struct devlink_health_reporter *fw_fatal_reporter; 445 449 }; 446 450 447 451 struct mlx5_qp_table { ··· 587 581 }; 588 582 589 583 enum mlx5_device_state { 584 + MLX5_DEVICE_STATE_UNINITIALIZED, 590 585 MLX5_DEVICE_STATE_UP, 591 586 MLX5_DEVICE_STATE_INTERNAL_ERROR, 592 587 }; ··· 700 693 struct mlx5_clock clock; 701 694 struct mlx5_ib_clock_info *clock_info; 702 695 struct mlx5_fw_tracer *tracer; 696 + u32 vsc_addr; 703 697 }; 704 698 705 699 struct mlx5_db { ··· 912 904 void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health); 913 905 void mlx5_drain_health_wq(struct mlx5_core_dev *dev); 914 906 void mlx5_trigger_health_work(struct mlx5_core_dev *dev); 915 - void mlx5_drain_health_recovery(struct mlx5_core_dev *dev); 916 907 int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size, 917 908 struct mlx5_frag_buf *buf, int node); 918 909 int mlx5_buf_alloc(struct mlx5_core_dev *dev,
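The report_work and fatal_report_work members added to mlx5_core_health are how the handlers in health.c find their device: container_of() walks from the embedded work_struct back out to mlx5_core_dev. A userspace restatement of that walk, with the structs pared down to just the members the walk touches:

```c
#include <assert.h>
#include <stddef.h>

/* container_of() recovers the enclosing struct from a pointer to one of
 * its members by subtracting the member's offset. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Pared-down stand-ins for the kernel structs (illustrative only). */
struct work_struct { int pending; };
struct core_health { struct work_struct fatal_report_work; };
struct core_priv   { struct core_health health; };
struct core_dev    { int id; struct core_priv priv; };

/* The same chain mlx5_fw_fatal_reporter_err_work() performs:
 * work_struct -> core_health -> core_priv -> core_dev. */
static struct core_dev *dev_from_work(struct work_struct *work)
{
	struct core_health *health =
		container_of(work, struct core_health, fatal_report_work);
	struct core_priv *priv = container_of(health, struct core_priv, health);

	return container_of(priv, struct core_dev, priv);
}
```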
+99 -21
net/core/devlink.c
··· 4518 4518 return err; 4519 4519 } 4520 4520 4521 + static int devlink_fmsg_dumpit(struct devlink_fmsg *fmsg, struct sk_buff *skb, 4522 + struct netlink_callback *cb, 4523 + enum devlink_command cmd) 4524 + { 4525 + int index = cb->args[0]; 4526 + int tmp_index = index; 4527 + void *hdr; 4528 + int err; 4529 + 4530 + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, 4531 + &devlink_nl_family, NLM_F_ACK | NLM_F_MULTI, cmd); 4532 + if (!hdr) { 4533 + err = -EMSGSIZE; 4534 + goto nla_put_failure; 4535 + } 4536 + 4537 + err = devlink_fmsg_prepare_skb(fmsg, skb, &index); 4538 + if ((err && err != -EMSGSIZE) || tmp_index == index) 4539 + goto nla_put_failure; 4540 + 4541 + cb->args[0] = index; 4542 + genlmsg_end(skb, hdr); 4543 + return skb->len; 4544 + 4545 + nla_put_failure: 4546 + genlmsg_cancel(skb, hdr); 4547 + return err; 4548 + } 4549 + 4521 4550 struct devlink_health_reporter { 4522 4551 struct list_head list; 4523 4552 void *priv; ··· 4779 4750 EXPORT_SYMBOL_GPL(devlink_health_report); 4780 4751 4781 4752 static struct devlink_health_reporter * 4782 - devlink_health_reporter_get_from_info(struct devlink *devlink, 4783 - struct genl_info *info) 4753 + devlink_health_reporter_get_from_attrs(struct devlink *devlink, 4754 + struct nlattr **attrs) 4784 4755 { 4785 4756 struct devlink_health_reporter *reporter; 4786 4757 char *reporter_name; 4787 4758 4788 - if (!info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME]) 4759 + if (!attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME]) 4789 4760 return NULL; 4790 4761 4791 - reporter_name = 4792 - nla_data(info->attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME]); 4762 + reporter_name = nla_data(attrs[DEVLINK_ATTR_HEALTH_REPORTER_NAME]); 4793 4763 mutex_lock(&devlink->reporters_lock); 4794 4764 reporter = devlink_health_reporter_find_by_name(devlink, reporter_name); 4795 4765 if (reporter) 4796 4766 refcount_inc(&reporter->refcount); 4797 4767 mutex_unlock(&devlink->reporters_lock); 4798 4768 return reporter; 4769 + } 4770 + 
4771 + static struct devlink_health_reporter * 4772 + devlink_health_reporter_get_from_info(struct devlink *devlink, 4773 + struct genl_info *info) 4774 + { 4775 + return devlink_health_reporter_get_from_attrs(devlink, info->attrs); 4776 + } 4777 + 4778 + static struct devlink_health_reporter * 4779 + devlink_health_reporter_get_from_cb(struct netlink_callback *cb) 4780 + { 4781 + struct devlink_health_reporter *reporter; 4782 + struct devlink *devlink; 4783 + struct nlattr **attrs; 4784 + int err; 4785 + 4786 + attrs = kmalloc_array(DEVLINK_ATTR_MAX + 1, sizeof(*attrs), GFP_KERNEL); 4787 + if (!attrs) 4788 + return NULL; 4789 + 4790 + err = nlmsg_parse_deprecated(cb->nlh, 4791 + GENL_HDRLEN + devlink_nl_family.hdrsize, 4792 + attrs, DEVLINK_ATTR_MAX, 4793 + devlink_nl_family.policy, cb->extack); 4794 + if (err) 4795 + goto free; 4796 + 4797 + mutex_lock(&devlink_mutex); 4798 + devlink = devlink_get_from_attrs(sock_net(cb->skb->sk), attrs); 4799 + if (IS_ERR(devlink)) 4800 + goto unlock; 4801 + 4802 + reporter = devlink_health_reporter_get_from_attrs(devlink, attrs); 4803 + mutex_unlock(&devlink_mutex); 4804 + kfree(attrs); 4805 + return reporter; 4806 + unlock: 4807 + mutex_unlock(&devlink_mutex); 4808 + free: 4809 + kfree(attrs); 4810 + return NULL; 4799 4811 } 4800 4812 4801 4813 static void ··· 5074 5004 return err; 5075 5005 } 5076 5006 5077 - static int devlink_nl_cmd_health_reporter_dump_get_doit(struct sk_buff *skb, 5078 - struct genl_info *info) 5007 + static int 5008 + devlink_nl_cmd_health_reporter_dump_get_dumpit(struct sk_buff *skb, 5009 + struct netlink_callback *cb) 5079 5010 { 5080 - struct devlink *devlink = info->user_ptr[0]; 5081 5011 struct devlink_health_reporter *reporter; 5012 + u64 start = cb->args[0]; 5082 5013 int err; 5083 5014 5084 - reporter = devlink_health_reporter_get_from_info(devlink, info); 5015 + reporter = devlink_health_reporter_get_from_cb(cb); 5085 5016 if (!reporter) 5086 5017 return -EINVAL; 5087 5018 5088 5019 if 
(!reporter->ops->dump) { 5089 - devlink_health_reporter_put(reporter); 5090 - return -EOPNOTSUPP; 5020 + err = -EOPNOTSUPP; 5021 + goto out; 5022 + } 5023 + mutex_lock(&reporter->dump_lock); 5024 + if (!start) { 5025 + err = devlink_health_do_dump(reporter, NULL); 5026 + if (err) 5027 + goto unlock; 5028 + cb->args[1] = reporter->dump_ts; 5029 + } 5030 + if (!reporter->dump_fmsg || cb->args[1] != reporter->dump_ts) { 5031 + NL_SET_ERR_MSG_MOD(cb->extack, "Dump trampled, please retry"); 5032 + err = -EAGAIN; 5033 + goto unlock; 5091 5034 } 5092 5035 5093 - mutex_lock(&reporter->dump_lock); 5094 - err = devlink_health_do_dump(reporter, NULL); 5095 - if (err) 5096 - goto out; 5097 - 5098 - err = devlink_fmsg_snd(reporter->dump_fmsg, info, 5099 - DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET, 0); 5100 - 5101 - out: 5036 + err = devlink_fmsg_dumpit(reporter->dump_fmsg, skb, cb, 5037 + DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET); 5038 + unlock: 5102 5039 mutex_unlock(&reporter->dump_lock); 5040 + out: 5103 5041 devlink_health_reporter_put(reporter); 5104 5042 return err; 5105 5043 } ··· 5444 5366 { 5445 5367 .cmd = DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET, 5446 5368 .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, 5447 - .doit = devlink_nl_cmd_health_reporter_dump_get_doit, 5369 + .dumpit = devlink_nl_cmd_health_reporter_dump_get_dumpit, 5448 5370 .flags = GENL_ADMIN_PERM, 5449 5371 .internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK | 5450 5372 DEVLINK_NL_FLAG_NO_LOCK,
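The switch from a doit to a dumpit handler matters because a crdump can exceed a single netlink message: cb->args[0] keeps a resume index across invocations, and cb->args[1] pins the dump timestamp so a regenerated dump is not silently mixed into an in-flight read. The resume-cursor half of that, as a small sketch (item counts and per-message capacity are made-up parameters):

```c
#include <assert.h>

/* Resume-cursor sketch of the dumpit flow: state->cursor plays the role
 * of cb->args[0], persisting how far the previous message got. */
struct dump_state {
	int cursor;
};

/* Emit at most "room" of "total" items per call; returns the number
 * emitted, with 0 signalling that the dump is complete. */
static int dump_chunk(struct dump_state *st, int total, int room)
{
	int emitted = 0;

	while (st->cursor < total && emitted < room) {
		/* kernel: devlink_fmsg_prepare_skb() serialises one item */
		st->cursor++;
		emitted++;
	}
	return emitted;
}
```

Netlink keeps calling the dumpit until it returns 0, exactly as a caller of this sketch would loop until dump_chunk() reports nothing left.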