Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull fwctl subsystem from Jason Gunthorpe:
"fwctl is a new subsystem intended to bring some common rules and order
to the growing pattern of exposing a secure FW interface directly to
userspace.

Unlike existing places like RDMA/DRM/VFIO/uacce that expose a device
for datapath operations, fwctl is focused on debugging, configuration
and provisioning of the device. It will not have the necessary
features, like interrupt delivery, to support a datapath.

This concept is similar to the long standing practice in the "HW" RAID
space of having a device specific misc device to manage the RAID
controller FW. fwctl generalizes this notion of a companion debug and
management interface that goes along with a dataplane implemented in
an appropriate subsystem.

There have been three LWN articles written discussing various aspects
of this:

https://lwn.net/Articles/955001/
https://lwn.net/Articles/969383/
https://lwn.net/Articles/990802/

This includes three drivers to launch the subsystem:

- CXL provides a vendor scheme for executing commands and a way to
learn the 'command effects' (ie the security properties) of such
commands. The fwctl driver allows access to these mechanisms within
the fwctl security model

- mlx5 is a family of networking products; the driver supports all
current Mellanox HW still receiving FW feature updates. This
includes RDMA multiprotocol NICs like ConnectX and the Bluefield
family of Smart NICs.

- AMD/Pensando Distributed Services card is a multi protocol Smart
NIC with a multi PCI function design. fwctl works on the management
PCI function following a 'command effects' model similar to CXL"

* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (30 commits)
pds_fwctl: add Documentation entries
pds_fwctl: add rpc and query support
pds_fwctl: initial driver framework
pds_core: add new fwctl auxiliary_device
pds_core: specify auxiliary_device to be created
pds_core: make pdsc_auxbus_dev_del() void
cxl: Fixup kdoc issues for include/cxl/features.h
fwctl/cxl: Add documentation to FWCTL CXL
cxl/test: Add Set Feature support to cxl_test
cxl/test: Add Get Feature support to cxl_test
cxl: Add support to handle user feature commands for set feature
cxl: Add support to handle user feature commands for get feature
cxl: Add support for fwctl RPC command to enable CXL feature commands
cxl: Move cxl feature command structs to user header
cxl: Add FWCTL support to CXL
mlx5: Create an auxiliary device for fwctl_mlx5
fwctl/mlx5: Support for communicating with mlx5 fw
fwctl: Add documentation
fwctl: FWCTL_RPC to execute a Remote Procedure Call to device firmware
taint: Add TAINT_FWCTL
...

+4054 -132
+5
Documentation/admin-guide/tainted-kernels.rst
···
   16  _/X   65536  auxiliary taint, defined for and used by distros
   17  _/T  131072  kernel was built with the struct randomization plugin
   18  _/N  262144  an in-kernel test has been run
+  19  _/J  524288  userspace used a mutating debug operation in fwctl
  === ===  ====== ========================================================

 Note: The character ``_`` is representing a blank in this table to make reading
···
     build time.

 18) ``N`` if an in-kernel test, such as a KUnit test, has been run.
+
+ 19) ``J`` if userspace opened /dev/fwctl/* and performed a FWCTL_RPC_DEBUG_WRITE
+     to use the device's debugging features. Device debugging features could
+     cause the device to malfunction in undefined ways.
+142
Documentation/userspace-api/fwctl/fwctl-cxl.rst
.. SPDX-License-Identifier: GPL-2.0

================
fwctl cxl driver
================

:Author: Dave Jiang

Overview
========

The CXL spec defines a set of commands that can be issued to the mailbox of a
CXL device or switch. It also leaves room for vendor-specific commands to be
issued to the mailbox. fwctl provides a path for user space to issue a set of
allowed mailbox commands to the device, moderated by the kernel driver.

The following 3 commands are used to support CXL Features:

  CXL spec r3.1 8.2.9.6.1 Get Supported Features (Opcode 0500h)
  CXL spec r3.1 8.2.9.6.2 Get Feature (Opcode 0501h)
  CXL spec r3.1 8.2.9.6.3 Set Feature (Opcode 0502h)

The "Get Supported Features" return data may be filtered by the kernel driver
to drop any features that are forbidden by the kernel or used exclusively by
the kernel. The driver sets the "Set Feature Size" of the "Get Supported
Features Supported Feature Entry" to 0 to indicate that the Feature cannot be
modified. The "Get Supported Features" and "Get Feature" commands fall under
the fwctl policy of FWCTL_RPC_CONFIGURATION.

For the "Set Feature" command, the access policy is currently broken down into
two categories depending on the Set Feature effects reported by the device. If
the Set Feature will cause an immediate change to the device, the fwctl access
policy must be FWCTL_RPC_DEBUG_WRITE_FULL. The effects for this level are
"immediate config change", "immediate data change", "immediate policy change",
or "immediate log change" in the set effects mask. If the effects are "config
change with cold reset" or "config change with conventional reset", then the
fwctl access policy must be FWCTL_RPC_DEBUG_WRITE or higher.

fwctl cxl User API
==================

.. kernel-doc:: include/uapi/fwctl/cxl.h

1. Driver info query
--------------------

The first step for the app is to issue ioctl(FWCTL_CMD_INFO). A ``struct
fwctl_info`` needs to be filled out with ``fwctl_info.out_device_type`` set to
``FWCTL_DEVICE_TYPE_CXL``. The return data should be a ``struct
fwctl_info_cxl`` that contains a reserved 32-bit field of all zeros.
Successful invocation of the ioctl implies the Features capability is
operational.

2. Send hardware commands
-------------------------

The next step is to send the "Get Supported Features" command to the driver
from user space via ioctl(FWCTL_RPC). A ``struct fwctl_rpc_cxl`` is pointed to
by ``fwctl_rpc.in``. ``struct fwctl_rpc_cxl.in_payload`` points to the
hardware input structure that is defined by the CXL spec. ``fwctl_rpc.out``
points to the buffer that contains a ``struct fwctl_rpc_cxl_out`` that
includes the hardware output data inlined as ``fwctl_rpc_cxl_out.payload``.
This command is called twice: first to retrieve the number of features
supported, and a second time to retrieve the specific feature details as the
output data.

After getting the specific feature details, a Get/Set Feature command can be
appropriately programmed and sent. For a "Set Feature" command, the retrieved
feature info contains an effects field that details the effects the resulting
"Set Feature" command will trigger. That informs the user whether the system
is configured to allow the "Set Feature" command or not.

Code example of a Get Feature
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

    static int cxl_fwctl_rpc_get_test_feature(int fd, struct test_feature *feat_ctx,
                                              const uint32_t expected_data)
    {
            struct cxl_mbox_get_feat_in *feat_in;
            struct fwctl_rpc_cxl_out *out;
            struct fwctl_rpc rpc = {0};
            struct fwctl_rpc_cxl *in;
            size_t out_size, in_size;
            uint32_t val;
            void *data;
            int rc;

            in_size = sizeof(*in) + sizeof(*feat_in);
            rc = posix_memalign((void **)&in, 16, in_size);
            if (rc)
                    return -ENOMEM;
            memset(in, 0, in_size);
            feat_in = &in->get_feat_in;

            uuid_copy(feat_in->uuid, feat_ctx->uuid);
            feat_in->count = feat_ctx->get_size;

            out_size = sizeof(*out) + feat_ctx->get_size;
            rc = posix_memalign((void **)&out, 16, out_size);
            if (rc) {
                    rc = -ENOMEM;
                    goto free_in;
            }
            memset(out, 0, out_size);

            in->opcode = CXL_MBOX_OPCODE_GET_FEATURE;
            in->op_size = sizeof(*feat_in);

            rpc.size = sizeof(rpc);
            rpc.scope = FWCTL_RPC_CONFIGURATION;
            rpc.in_len = in_size;
            rpc.out_len = out_size;
            rpc.in = (uint64_t)(uintptr_t)in;
            rpc.out = (uint64_t)(uintptr_t)out;

            rc = send_command(fd, &rpc, out);
            if (rc)
                    goto free_all;

            data = out->payload;
            val = le32toh(*(__le32 *)data);
            if (memcmp(&val, &expected_data, sizeof(val)) != 0) {
                    rc = -ENXIO;
                    goto free_all;
            }

    free_all:
            free(out);
    free_in:
            free(in);
            return rc;
    }

See the CXL CLI test code at
<https://github.com/pmem/ndctl/tree/main/test/fwctl.c> for detailed examples
of user code that exercises this path.


fwctl cxl Kernel API
====================

.. kernel-doc:: drivers/cxl/core/features.c
   :export:
.. kernel-doc:: include/cxl/features.h
+286
Documentation/userspace-api/fwctl/fwctl.rst
.. SPDX-License-Identifier: GPL-2.0

===============
fwctl subsystem
===============

:Author: Jason Gunthorpe

Overview
========

Modern devices contain extensive amounts of FW, and in many cases, are largely
software-defined pieces of hardware. The evolution of this approach is largely
a reaction to Moore's Law, where a chip tape-out is now highly expensive and
the chip design is extremely large. Replacing fixed HW logic with a flexible
and tightly coupled FW/HW combination is an effective risk mitigation against
chip respin. Problems in the HW design can be counteracted in device FW. This
is especially true for devices which present a stable and backwards-compatible
interface to the operating system driver (such as NVMe).

The FW layer in devices has grown to an incredible size, and devices
frequently integrate clusters of fast processors to run it. For example, mlx5
devices have over 30MB of FW code, and big configurations operate with over
1GB of FW-managed runtime state.

The availability of such a flexible layer has created quite a variety in the
industry, where single pieces of silicon are now configurable software-defined
devices and can operate in substantially different ways depending on the need.
Further, we often see cases where specific sites wish to operate devices in
ways that are highly specialized and require applications that have been
tailored to their unique configuration.

Devices have also become multi-functional and integrated to the point that
they no longer fit neatly into the kernel's division of subsystems. Modern
multi-functional devices have drivers, such as bnxt/ice/mlx5/pds, that span
many subsystems while sharing the underlying hardware using the auxiliary
device system.

All together this creates a challenge for the operating system, where devices
have an expansive FW environment that needs robust device-specific debugging
support, and FW-driven functionality that is not well suited to "generic"
interfaces. fwctl seeks to allow access to the full device functionality from
user space in the areas of debuggability, management, and first-boot/nth-boot
provisioning.

fwctl is aimed at the common device design pattern where the OS and FW
communicate via an RPC message layer constructed with a queue or mailbox
scheme. In this case the driver will typically have some layer to deliver RPC
messages and collect RPC responses from device FW. The in-kernel subsystem
drivers that operate the device for its primary purposes will use these RPCs
to build their drivers, but devices also usually have a set of ancillary RPCs
that don't really fit into any specific subsystem. For example, a HW RAID
controller is primarily operated by the block layer but also comes with a set
of RPCs to administer the construction of drives within the HW RAID.

In the past, when devices were more single-function, individual subsystems
would grow different approaches to solving some of these common problems. For
instance, monitoring device health, manipulating its FLASH, debugging the FW,
and provisioning all have various unique interfaces across the kernel.

fwctl's purpose is to define a common set of limited rules, described below,
that allow user space to securely construct and execute RPCs inside device FW.
The rules serve as an agreement between the operating system and FW on how to
correctly design the RPC interface. As a uAPI the subsystem provides a thin
layer of discovery and a generic uAPI to deliver the RPCs and collect the
response. It supports a system of user space libraries and tools which will
use this interface to control the device using the device-native protocols.

Scope of Action
---------------

fwctl drivers are strictly restricted to being a way to operate the device FW.
fwctl is not an avenue to access random kernel internals, or other operating
system SW state.

fwctl instances must operate on a well-defined device function, and the device
should have a well-defined security model for what scope within the physical
device the function is permitted to access. For instance, the most complex
PCIe device today may broadly have several function-level scopes:

1. A privileged function with full access to the on-device global state and
   configuration

2. Multiple hypervisor functions with control over themselves and child
   functions used with VMs

3. Multiple VM functions tightly scoped within the VM

The device may create a logical parent/child relationship between these
scopes. For instance a child VM's FW may be within the scope of the hypervisor
FW. It is quite common in the VFIO world that the hypervisor environment has a
complex provisioning/profiling/configuration responsibility for the function
VFIO assigns to the VM.

Further, within the function, devices often have RPC commands that fall within
some general scopes of action (see enum fwctl_rpc_scope):

1. Access to function & child configuration, FLASH, etc. that becomes live at
   a function reset. Access to function & child runtime configuration that is
   transparent or non-disruptive to any driver or VM.

2. Read-only access to function debug information that may report on FW
   objects in the function & child, including FW objects owned by other kernel
   subsystems.

3. Write access to function & child debug information strictly compatible with
   the principles of kernel lockdown and kernel integrity protection. Triggers
   a kernel Taint.

4. Full debug device access. Triggers a kernel Taint, requires CAP_SYS_RAWIO.

User space will provide a scope label on each RPC and the kernel must enforce
the above CAPs and taints based on that scope. A combination of kernel and FW
can enforce that RPCs are placed in the correct scope by user space.

Denied behavior
---------------

There are many things this interface must not allow user space to do (without
a Taint or CAP), broadly derived from the principles of kernel lockdown. Some
examples:

1. DMA to/from arbitrary memory, hang the system, compromise FW integrity with
   untrusted code, or otherwise compromise device or system security and
   integrity.

2. Provide an abnormal "back door" to kernel drivers. No manipulation of
   kernel objects owned by kernel drivers.

3. Directly configure or otherwise control kernel drivers. A subsystem kernel
   driver can react to the device configuration at function reset/driver load
   time, but otherwise must not be coupled to fwctl.

4. Operate the HW in a way that overlaps with the core purpose of another
   primary kernel subsystem, such as read/write to LBAs, send/receive of
   network packets, or operating an accelerator's data plane.

fwctl is not a replacement for device direct access subsystems like uacce or
VFIO.

Operations exposed through fwctl's non-tainting interfaces should be fully
sharable with other users of the device. For instance, exposing an RPC through
fwctl should never prevent a kernel subsystem from also concurrently using
that same RPC or hardware unit down the road. In such cases fwctl will be less
important than the proper kernel subsystems that eventually emerge. Mistakes
in this area resulting in clashes will be resolved in favour of a kernel
implementation.

fwctl User API
==============

.. kernel-doc:: include/uapi/fwctl/fwctl.h
.. kernel-doc:: include/uapi/fwctl/mlx5.h
.. kernel-doc:: include/uapi/fwctl/pds.h

sysfs Class
-----------

fwctl has a sysfs class (/sys/class/fwctl/fwctlNN/) and character devices
(/dev/fwctl/fwctlNN) with a simple numbered scheme. The character device
operates the ioctl uAPI described above.

fwctl devices can be related to driver components in other subsystems through
sysfs::

  $ ls /sys/class/fwctl/fwctl0/device/infiniband/
  ibp0s10f0

  $ ls /sys/class/infiniband/ibp0s10f0/device/fwctl/
  fwctl0/

  $ ls /sys/devices/pci0000:00/0000:00:0a.0/fwctl/fwctl0
  dev  device  power  subsystem  uevent

User space Community
--------------------

Drawing inspiration from nvme-cli, participating on the kernel side must come
with a user space in a common TBD git tree, at a minimum to usefully operate
the kernel driver. Providing such an implementation is a pre-condition to
merging a kernel driver.

The goal is to build a user space community around some of the shared problems
we all have, and ideally develop some common user space programs with some
starting themes of:

- Device in-field debugging

- HW provisioning

- VFIO child device profiling before VM boot

- Confidential Compute topics (attestation, secure provisioning)

that stretch across all subsystems in the kernel. fwupd is a great example of
how an excellent user space experience can emerge out of kernel-side
diversity.

fwctl Kernel API
================

.. kernel-doc:: drivers/fwctl/main.c
   :export:
.. kernel-doc:: include/linux/fwctl.h

fwctl Driver design
-------------------

In many cases a fwctl driver is going to be part of a larger cross-subsystem
device, possibly using the auxiliary_device mechanism. In that case several
subsystems are going to be sharing the same device and FW interface layer, so
the device design must already provide for isolation and cooperation between
kernel subsystems. fwctl should fit into that same model.

Part of the driver should include a description of how its scope restrictions
and security model work. The driver and FW together must ensure that RPCs
provided by user space are mapped to the appropriate scope. If the validation
is done in the driver, then the validation can read a 'command effects' report
from the device, or hardwire the enforcement. If the validation is done in the
FW, then the driver should pass the fwctl_rpc_scope to the FW along with the
command.

The driver and FW must cooperate to ensure that either fwctl cannot allocate
any FW resources, or that any resources it does allocate are freed on FD
closure. A driver primarily constructed around FW RPCs may find that its core
PCI function and RPC layer belong under fwctl, with auxiliary devices
connecting to other subsystems.

Each device type must be mindful of Linux's philosophy for stable ABI. The FW
RPC interface does not have to meet a strictly stable ABI, but it does need to
meet an expectation that user space tools that are deployed and in significant
use don't needlessly break. FW upgrade and kernel upgrade should keep widely
deployed tooling working.

Development- and debugging-focused RPCs under more permissive scopes can have
less stability if the tools using them are only run under exceptional
circumstances and not for everyday use of the device. Debugging tools may even
require exact version matching, as they may need something similar to DWARF
debug information from the FW binary.

Security Response
=================

The kernel remains the gatekeeper for this interface. If violations of the
scopes, security or isolation principles are found, we have options to let
devices fix them with a FW update, push a kernel patch to parse and block RPC
commands, or push a kernel patch to block entire firmware versions/devices.

While the kernel can always directly parse and restrict RPCs, it is expected
that the existing kernel pattern of allowing drivers to delegate validation to
FW will be a useful design.

Existing Similar Examples
=========================

The approach described in this document is not a new idea. Direct, or near
direct, device access has been offered by the kernel in different areas for
decades. With more devices wanting to follow this design pattern it is
becoming clear that it is not entirely well understood and, more importantly,
the security considerations are not well defined or agreed upon.

Some examples:

- HW RAID controllers. This includes RPCs to do things like compose drives
  into a RAID volume, configure RAID parameters, monitor the HW, and more.

- Baseboard managers. RPCs for configuring settings in the device and more.

- NVMe vendor command capsules. nvme-cli provides access to some monitoring
  functions that different products have defined, but more exist.

- CXL also has an NVMe-like vendor command system.

- DRM allows user space drivers to send commands to the device via kernel
  mediation.

- RDMA allows user space drivers to directly push commands to the device
  without kernel involvement.

- Various "raw" APIs: raw HID (SDL2), raw USB, NVMe Generic Interface, etc.

The first four are examples of areas that fwctl intends to cover. The latter
three are examples of denied behavior, as they fully overlap with the primary
purpose of a kernel subsystem.

Some key lessons learned from these past efforts are the importance of having
a common user space project to use as a pre-condition for obtaining a kernel
driver. Developing a good community around useful software in user space is
key to getting companies to fund participation to enable their products.
+14
Documentation/userspace-api/fwctl/index.rst
.. SPDX-License-Identifier: GPL-2.0

Firmware Control (FWCTL) Userspace API
======================================

A framework that defines a common set of limited rules that allows user space
to securely construct and execute RPCs inside device firmware.

.. toctree::
   :maxdepth: 1

   fwctl
   fwctl-cxl
   pds_fwctl
+46
Documentation/userspace-api/fwctl/pds_fwctl.rst
.. SPDX-License-Identifier: GPL-2.0

================
fwctl pds driver
================

:Author: Shannon Nelson

Overview
========

The PDS Core device makes a fwctl service available through an
auxiliary_device named pds_core.fwctl.N. The pds_fwctl driver binds to
this device and registers itself with the fwctl subsystem. The resulting
userspace interface is used by an application that is part of the
AMD Pensando software package for the Distributed Services Card (DSC).

The pds_fwctl driver has little knowledge of the firmware's internals.
It only knows how to send commands through pds_core's message queue to the
firmware for fwctl requests. The set of fwctl operations available
depends on the firmware in the DSC, and the userspace application
version must match the firmware so that they can talk to each other.

When a connection is created, the pds_fwctl driver requests from the
firmware a list of firmware object endpoints, and for each endpoint the
driver requests a list of operations for that endpoint.

Each operation description includes a firmware-defined command attribute
that maps to the FWCTL scope levels. The driver translates those firmware
values into the FWCTL scope values, which can then be used for filtering
scoped user requests.

pds_fwctl User API
==================

Each RPC request includes the target endpoint, the operation id, and the
in and out buffer lengths and pointers. The driver verifies the existence
of the requested endpoint and operation, then checks the request scope
against the required scope of the operation. The request is then put
together with the request data and sent through pds_core's message queue
to the firmware, and the results are returned to the caller.

The RPC endpoints, operations, and buffer contents are defined by the
particular firmware package in the device, which varies across the
available product configurations. The details are available in the
specific product SDK documentation.
+1
Documentation/userspace-api/index.rst
···
  accelerators/ocxl
  dma-buf-heaps
  dma-buf-alloc-exchange
+ fwctl/index
  gpio/index
  iommufd
  media/index
+1
Documentation/userspace-api/ioctl/ioctl-number.rst
···
  0x97  00-7F  fs/ceph/ioctl.h             Ceph file system
  0x99  00-0F                              537-Addinboard driver
                                           <mailto:buk@buks.ipn.de>
+ 0x9A  00-0F  include/uapi/fwctl/fwctl.h
  0xA0  all    linux/sdp/sdp.h             Industrial Device Project
                                           <mailto:kenji@bitgate.com>
  0xA1  0      linux/vtpm_proxy.h          TPM Emulator Proxy Driver
+26
MAINTAINERS
···
  L:	linux-cxl@vger.kernel.org
  S:	Maintained
  F:	Documentation/driver-api/cxl
+ F:	Documentation/userspace-api/fwctl/fwctl-cxl.rst
  F:	drivers/cxl/
  F:	include/cxl/
  F:	include/uapi/linux/cxl_mem.h
···
  F:	kernel/futex/*
  F:	tools/perf/bench/futex*
  F:	tools/testing/selftests/futex/
+
+ FWCTL SUBSYSTEM
+ M:	Dave Jiang <dave.jiang@intel.com>
+ M:	Jason Gunthorpe <jgg@nvidia.com>
+ M:	Saeed Mahameed <saeedm@nvidia.com>
+ R:	Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ S:	Maintained
+ F:	Documentation/userspace-api/fwctl/
+ F:	drivers/fwctl/
+ F:	include/linux/fwctl.h
+ F:	include/uapi/fwctl/
+
+ FWCTL MLX5 DRIVER
+ M:	Saeed Mahameed <saeedm@nvidia.com>
+ R:	Itay Avraham <itayavr@nvidia.com>
+ L:	linux-kernel@vger.kernel.org
+ S:	Maintained
+ F:	drivers/fwctl/mlx5/
+
+ FWCTL PDS DRIVER
+ M:	Brett Creeley <brett.creeley@amd.com>
+ R:	Shannon Nelson <shannon.nelson@amd.com>
+ L:	linux-kernel@vger.kernel.org
+ S:	Maintained
+ F:	drivers/fwctl/pds/

  GALAXYCORE GC0308 CAMERA SENSOR DRIVER
  M:	Sebastian Reichel <sre@kernel.org>
+2
drivers/Kconfig
···
  source "drivers/firmware/Kconfig"

+ source "drivers/fwctl/Kconfig"
+
  source "drivers/gnss/Kconfig"

  source "drivers/mtd/Kconfig"
+1
drivers/Makefile
···
  obj-$(CONFIG_MEMSTICK)		+= memstick/
  obj-$(CONFIG_INFINIBAND)	+= infiniband/
  obj-y				+= firmware/
+ obj-$(CONFIG_FWCTL)		+= fwctl/
  obj-$(CONFIG_CRYPTO)		+= crypto/
  obj-$(CONFIG_SUPERH)		+= sh/
  obj-y				+= clocksource/
+12
drivers/cxl/Kconfig
···
  	select PCI_DOE
  	select FIRMWARE_TABLE
  	select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS
+ 	select FWCTL if CXL_FEATURES
  	help
  	  CXL is a bus that is electrically compatible with PCI Express, but
  	  layers three protocols on that signalling (CXL.io, CXL.cache, and
···
  	  specification for a detailed description of HDM.

  	  If unsure say 'm'.
+
+ config CXL_FEATURES
+ 	bool "CXL: Features"
+ 	depends on CXL_PCI
+ 	help
+ 	  Enable support for CXL Features. A CXL device that includes a mailbox
+ 	  supports commands that allow listing, getting, and setting of
+ 	  optionally defined features such as memory sparing or post package
+ 	  sparing. Vendors may define custom features for the device.
+
+ 	  If unsure say 'n'.

  config CXL_PORT
  	default CXL_BUS
+1
drivers/cxl/core/Makefile
···
  cxl_core-y += cdat.o
  cxl_core-$(CONFIG_TRACING) += trace.o
  cxl_core-$(CONFIG_CXL_REGION) += region.o
+ cxl_core-$(CONFIG_CXL_FEATURES) += features.o
+15 -2
drivers/cxl/core/core.h
···
  #ifndef __CXL_CORE_H__
  #define __CXL_CORE_H__

+ #include <cxl/mailbox.h>
+
  extern const struct device_type cxl_nvdimm_bridge_type;
  extern const struct device_type cxl_nvdimm_type;
  extern const struct device_type cxl_pmu_type;
···
  struct cxl_send_command;
  struct cxl_mem_query_commands;
- int cxl_query_cmd(struct cxl_memdev *cxlmd,
+ int cxl_query_cmd(struct cxl_mailbox *cxl_mbox,
  		  struct cxl_mem_query_commands __user *q);
- int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
+ int cxl_send_cmd(struct cxl_mailbox *cxl_mbox, struct cxl_send_command __user *s);
  void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
  				   resource_size_t length);
···
  bool cxl_need_node_perf_attrs_update(int nid);
  int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
  					struct access_coordinate *c);
+
+ #ifdef CONFIG_CXL_FEATURES
+ size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
+ 		       enum cxl_get_feat_selection selection,
+ 		       void *feat_out, size_t feat_out_size, u16 offset,
+ 		       u16 *return_code);
+ int cxl_set_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
+ 		    u8 feat_version, const void *feat_data,
+ 		    size_t feat_data_size, u32 feat_flag, u16 offset,
+ 		    u16 *return_code);
+ #endif

  #endif /* __CXL_CORE_H__ */
+708
drivers/cxl/core/features.c
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright(c) 2024-2025 Intel Corporation. All rights reserved. */
#include <linux/fwctl.h>
#include <linux/device.h>
#include <cxl/mailbox.h>
#include <cxl/features.h>
#include <uapi/fwctl/cxl.h>
#include "cxl.h"
#include "core.h"
#include "cxlmem.h"

/* All the features below are exclusive to the kernel */
static const uuid_t cxl_exclusive_feats[] = {
	CXL_FEAT_PATROL_SCRUB_UUID,
	CXL_FEAT_ECS_UUID,
	CXL_FEAT_SPPR_UUID,
	CXL_FEAT_HPPR_UUID,
	CXL_FEAT_CACHELINE_SPARING_UUID,
	CXL_FEAT_ROW_SPARING_UUID,
	CXL_FEAT_BANK_SPARING_UUID,
	CXL_FEAT_RANK_SPARING_UUID,
};

static bool is_cxl_feature_exclusive_by_uuid(const uuid_t *uuid)
{
	for (int i = 0; i < ARRAY_SIZE(cxl_exclusive_feats); i++) {
		if (uuid_equal(uuid, &cxl_exclusive_feats[i]))
			return true;
	}

	return false;
}

static bool is_cxl_feature_exclusive(struct cxl_feat_entry *entry)
{
	return is_cxl_feature_exclusive_by_uuid(&entry->uuid);
}

inline struct cxl_features_state *to_cxlfs(struct cxl_dev_state *cxlds)
{
	return cxlds->cxlfs;
}
EXPORT_SYMBOL_NS_GPL(to_cxlfs, "CXL");

static int cxl_get_supported_features_count(struct cxl_mailbox *cxl_mbox)
{
	struct cxl_mbox_get_sup_feats_out mbox_out;
	struct cxl_mbox_get_sup_feats_in mbox_in;
	struct cxl_mbox_cmd mbox_cmd;
	int rc;

	memset(&mbox_in, 0, sizeof(mbox_in));
	mbox_in.count = cpu_to_le32(sizeof(mbox_out));
	memset(&mbox_out, 0, sizeof(mbox_out));
	mbox_cmd = (struct cxl_mbox_cmd) {
		.opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES,
		.size_in = sizeof(mbox_in),
		.payload_in = &mbox_in,
		.size_out = sizeof(mbox_out),
		.payload_out = &mbox_out,
		.min_out = sizeof(mbox_out),
	};
	rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
	if (rc < 0)
		return rc;

	return le16_to_cpu(mbox_out.supported_feats);
}

static struct cxl_feat_entries *
get_supported_features(struct cxl_features_state *cxlfs)
{
	int remain_feats, max_size, max_feats, start, rc, hdr_size;
	struct cxl_mailbox *cxl_mbox = &cxlfs->cxlds->cxl_mbox;
	int feat_size = sizeof(struct cxl_feat_entry);
	struct cxl_mbox_get_sup_feats_in mbox_in;
	struct cxl_feat_entry *entry;
	struct cxl_mbox_cmd mbox_cmd;
	int user_feats = 0;
	int count;

	count = cxl_get_supported_features_count(cxl_mbox);
	if (count <= 0)
		return NULL;

	struct cxl_feat_entries *entries __free(kvfree) =
		kvmalloc(struct_size(entries, ent, count), GFP_KERNEL);
	if (!entries)
		return NULL;

	struct cxl_mbox_get_sup_feats_out *mbox_out __free(kvfree) =
		kvmalloc(cxl_mbox->payload_size, GFP_KERNEL);
	if (!mbox_out)
		return NULL;

	hdr_size = struct_size(mbox_out, ents, 0);
	max_size = cxl_mbox->payload_size - hdr_size;
	/* max feat entries that can fit in mailbox max payload size */
	max_feats = max_size / feat_size;
	entry = entries->ent;

	start = 0;
	remain_feats = count;
	do {
		int retrieved, alloc_size, copy_feats;
		int num_entries;

		if (remain_feats > max_feats) {
			alloc_size = struct_size(mbox_out, ents, max_feats);
			remain_feats = remain_feats - max_feats;
			copy_feats = max_feats;
		} else {
			alloc_size = struct_size(mbox_out, ents, remain_feats);
			copy_feats = remain_feats;
			remain_feats = 0;
		}

		memset(&mbox_in, 0, sizeof(mbox_in));
		mbox_in.count = cpu_to_le32(alloc_size);
		mbox_in.start_idx = cpu_to_le16(start);
		memset(mbox_out, 0, alloc_size);
		mbox_cmd = (struct cxl_mbox_cmd) {
			.opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES,
			.size_in = sizeof(mbox_in),
			.payload_in = &mbox_in,
			.size_out = alloc_size,
			.payload_out = mbox_out,
			.min_out = hdr_size,
		};
		rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
		if (rc < 0)
			return NULL;

		if (mbox_cmd.size_out <= hdr_size)
			return NULL;

		/*
		 * Make sure retrieved out buffer is multiple of feature
		 * entries.
		 */
		retrieved = mbox_cmd.size_out - hdr_size;
		if (retrieved % feat_size)
			return NULL;

		num_entries = le16_to_cpu(mbox_out->num_entries);
		/*
		 * If the reported output entries * defined entry size !=
		 * retrieved output bytes, then the output package is incorrect.
		 */
		if (num_entries * feat_size != retrieved)
			return NULL;

		memcpy(entry, mbox_out->ents, retrieved);
		for (int i = 0; i < num_entries; i++) {
			if (!is_cxl_feature_exclusive(entry + i))
				user_feats++;
		}
		entry += num_entries;
		/*
		 * If the number of output entries is less than expected, add
		 * the remaining entries to the next batch.
		 */
		remain_feats += copy_feats - num_entries;
		start += num_entries;
	} while (remain_feats);

	entries->num_features = count;
	entries->num_user_features = user_feats;

	return no_free_ptr(entries);
}

static void free_cxlfs(void *_cxlfs)
{
	struct cxl_features_state *cxlfs = _cxlfs;
	struct cxl_dev_state *cxlds = cxlfs->cxlds;

	cxlds->cxlfs = NULL;
	kvfree(cxlfs->entries);
	kfree(cxlfs);
}

/**
 * devm_cxl_setup_features() - Allocate and initialize features context
 * @cxlds: CXL device context
 *
 * Return 0 on success or -errno on failure.
188 + */ 189 + int devm_cxl_setup_features(struct cxl_dev_state *cxlds) 190 + { 191 + struct cxl_mailbox *cxl_mbox = &cxlds->cxl_mbox; 192 + 193 + if (cxl_mbox->feat_cap < CXL_FEATURES_RO) 194 + return -ENODEV; 195 + 196 + struct cxl_features_state *cxlfs __free(kfree) = 197 + kzalloc(sizeof(*cxlfs), GFP_KERNEL); 198 + if (!cxlfs) 199 + return -ENOMEM; 200 + 201 + cxlfs->cxlds = cxlds; 202 + 203 + cxlfs->entries = get_supported_features(cxlfs); 204 + if (!cxlfs->entries) 205 + return -ENOMEM; 206 + 207 + cxlds->cxlfs = cxlfs; 208 + 209 + return devm_add_action_or_reset(cxlds->dev, free_cxlfs, no_free_ptr(cxlfs)); 210 + } 211 + EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_features, "CXL"); 212 + 213 + size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid, 214 + enum cxl_get_feat_selection selection, 215 + void *feat_out, size_t feat_out_size, u16 offset, 216 + u16 *return_code) 217 + { 218 + size_t data_to_rd_size, size_out; 219 + struct cxl_mbox_get_feat_in pi; 220 + struct cxl_mbox_cmd mbox_cmd; 221 + size_t data_rcvd_size = 0; 222 + int rc; 223 + 224 + if (return_code) 225 + *return_code = CXL_MBOX_CMD_RC_INPUT; 226 + 227 + if (!feat_out || !feat_out_size) 228 + return 0; 229 + 230 + size_out = min(feat_out_size, cxl_mbox->payload_size); 231 + uuid_copy(&pi.uuid, feat_uuid); 232 + pi.selection = selection; 233 + do { 234 + data_to_rd_size = min(feat_out_size - data_rcvd_size, 235 + cxl_mbox->payload_size); 236 + pi.offset = cpu_to_le16(offset + data_rcvd_size); 237 + pi.count = cpu_to_le16(data_to_rd_size); 238 + 239 + mbox_cmd = (struct cxl_mbox_cmd) { 240 + .opcode = CXL_MBOX_OP_GET_FEATURE, 241 + .size_in = sizeof(pi), 242 + .payload_in = &pi, 243 + .size_out = size_out, 244 + .payload_out = feat_out + data_rcvd_size, 245 + .min_out = data_to_rd_size, 246 + }; 247 + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); 248 + if (rc < 0 || !mbox_cmd.size_out) { 249 + if (return_code) 250 + *return_code = mbox_cmd.return_code; 251 + return 0; 252 + } 
253 + data_rcvd_size += mbox_cmd.size_out; 254 + } while (data_rcvd_size < feat_out_size); 255 + 256 + if (return_code) 257 + *return_code = CXL_MBOX_CMD_RC_SUCCESS; 258 + 259 + return data_rcvd_size; 260 + } 261 + 262 + /* 263 + * FEAT_DATA_MIN_PAYLOAD_SIZE - minimum number of extra bytes that must be 264 + * available in the mailbox to hold the actual feature data so that 265 + * the feature data transfer works as expected. 266 + */ 267 + #define FEAT_DATA_MIN_PAYLOAD_SIZE 10 268 + int cxl_set_feature(struct cxl_mailbox *cxl_mbox, 269 + const uuid_t *feat_uuid, u8 feat_version, 270 + const void *feat_data, size_t feat_data_size, 271 + u32 feat_flag, u16 offset, u16 *return_code) 272 + { 273 + size_t data_in_size, data_sent_size = 0; 274 + struct cxl_mbox_cmd mbox_cmd; 275 + size_t hdr_size; 276 + 277 + if (return_code) 278 + *return_code = CXL_MBOX_CMD_RC_INPUT; 279 + 280 + struct cxl_mbox_set_feat_in *pi __free(kfree) = 281 + kzalloc(cxl_mbox->payload_size, GFP_KERNEL); 282 + if (!pi) 283 + return -ENOMEM; 284 + 285 + uuid_copy(&pi->uuid, feat_uuid); 286 + pi->version = feat_version; 287 + feat_flag &= ~CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK; 288 + feat_flag |= CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET; 289 + hdr_size = sizeof(pi->hdr); 290 + /* 291 + * Check that the minimum mbox payload size is available for 292 + * the feature data transfer. 
293 + */ 294 + if (hdr_size + FEAT_DATA_MIN_PAYLOAD_SIZE > cxl_mbox->payload_size) 295 + return -ENOMEM; 296 + 297 + if (hdr_size + feat_data_size <= cxl_mbox->payload_size) { 298 + pi->flags = cpu_to_le32(feat_flag | 299 + CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER); 300 + data_in_size = feat_data_size; 301 + } else { 302 + pi->flags = cpu_to_le32(feat_flag | 303 + CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER); 304 + data_in_size = cxl_mbox->payload_size - hdr_size; 305 + } 306 + 307 + do { 308 + int rc; 309 + 310 + pi->offset = cpu_to_le16(offset + data_sent_size); 311 + memcpy(pi->feat_data, feat_data + data_sent_size, data_in_size); 312 + mbox_cmd = (struct cxl_mbox_cmd) { 313 + .opcode = CXL_MBOX_OP_SET_FEATURE, 314 + .size_in = hdr_size + data_in_size, 315 + .payload_in = pi, 316 + }; 317 + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); 318 + if (rc < 0) { 319 + if (return_code) 320 + *return_code = mbox_cmd.return_code; 321 + return rc; 322 + } 323 + 324 + data_sent_size += data_in_size; 325 + if (data_sent_size >= feat_data_size) { 326 + if (return_code) 327 + *return_code = CXL_MBOX_CMD_RC_SUCCESS; 328 + return 0; 329 + } 330 + 331 + if ((feat_data_size - data_sent_size) <= (cxl_mbox->payload_size - hdr_size)) { 332 + data_in_size = feat_data_size - data_sent_size; 333 + pi->flags = cpu_to_le32(feat_flag | 334 + CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER); 335 + } else { 336 + pi->flags = cpu_to_le32(feat_flag | 337 + CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER); 338 + } 339 + } while (true); 340 + } 341 + 342 + /* FWCTL support */ 343 + 344 + static inline struct cxl_memdev *fwctl_to_memdev(struct fwctl_device *fwctl_dev) 345 + { 346 + return to_cxl_memdev(fwctl_dev->dev.parent); 347 + } 348 + 349 + static int cxlctl_open_uctx(struct fwctl_uctx *uctx) 350 + { 351 + return 0; 352 + } 353 + 354 + static void cxlctl_close_uctx(struct fwctl_uctx *uctx) 355 + { 356 + } 357 + 358 + static struct cxl_feat_entry * 359 + get_support_feature_info(struct cxl_features_state *cxlfs, 
360 + const struct fwctl_rpc_cxl *rpc_in) 361 + { 362 + struct cxl_feat_entry *feat; 363 + const uuid_t *uuid; 364 + 365 + if (rpc_in->op_size < sizeof(uuid)) 366 + return ERR_PTR(-EINVAL); 367 + 368 + uuid = &rpc_in->set_feat_in.uuid; 369 + 370 + for (int i = 0; i < cxlfs->entries->num_features; i++) { 371 + feat = &cxlfs->entries->ent[i]; 372 + if (uuid_equal(uuid, &feat->uuid)) 373 + return feat; 374 + } 375 + 376 + return ERR_PTR(-EINVAL); 377 + } 378 + 379 + static void *cxlctl_get_supported_features(struct cxl_features_state *cxlfs, 380 + const struct fwctl_rpc_cxl *rpc_in, 381 + size_t *out_len) 382 + { 383 + const struct cxl_mbox_get_sup_feats_in *feat_in; 384 + struct cxl_mbox_get_sup_feats_out *feat_out; 385 + struct cxl_feat_entry *pos; 386 + size_t out_size; 387 + int requested; 388 + u32 count; 389 + u16 start; 390 + int i; 391 + 392 + if (rpc_in->op_size != sizeof(*feat_in)) 393 + return ERR_PTR(-EINVAL); 394 + 395 + feat_in = &rpc_in->get_sup_feats_in; 396 + count = le32_to_cpu(feat_in->count); 397 + start = le16_to_cpu(feat_in->start_idx); 398 + requested = count / sizeof(*pos); 399 + 400 + /* 401 + * Make sure that the total requested number of entries is not greater 402 + * than the total number of supported features allowed for userspace. 
403 + */ 404 + if (start >= cxlfs->entries->num_features) 405 + return ERR_PTR(-EINVAL); 406 + 407 + requested = min_t(int, requested, cxlfs->entries->num_features - start); 408 + 409 + out_size = sizeof(struct fwctl_rpc_cxl_out) + 410 + struct_size(feat_out, ents, requested); 411 + 412 + struct fwctl_rpc_cxl_out *rpc_out __free(kvfree) = 413 + kvzalloc(out_size, GFP_KERNEL); 414 + if (!rpc_out) 415 + return ERR_PTR(-ENOMEM); 416 + 417 + rpc_out->size = struct_size(feat_out, ents, requested); 418 + feat_out = &rpc_out->get_sup_feats_out; 419 + if (requested == 0) { 420 + feat_out->num_entries = cpu_to_le16(requested); 421 + feat_out->supported_feats = 422 + cpu_to_le16(cxlfs->entries->num_features); 423 + rpc_out->retval = CXL_MBOX_CMD_RC_SUCCESS; 424 + *out_len = out_size; 425 + return no_free_ptr(rpc_out); 426 + } 427 + 428 + for (i = start, pos = &feat_out->ents[0]; 429 + i < cxlfs->entries->num_features; i++, pos++) { 430 + if (i - start == requested) 431 + break; 432 + 433 + memcpy(pos, &cxlfs->entries->ent[i], sizeof(*pos)); 434 + /* 435 + * If the feature is exclusive, set the set_feat_size to 0 to 436 + * indicate that the feature is not changeable. 
437 + */ 438 + if (is_cxl_feature_exclusive(pos)) { 439 + u32 flags; 440 + 441 + pos->set_feat_size = 0; 442 + flags = le32_to_cpu(pos->flags); 443 + flags &= ~CXL_FEATURE_F_CHANGEABLE; 444 + pos->flags = cpu_to_le32(flags); 445 + } 446 + } 447 + 448 + feat_out->num_entries = cpu_to_le16(requested); 449 + feat_out->supported_feats = cpu_to_le16(cxlfs->entries->num_features); 450 + rpc_out->retval = CXL_MBOX_CMD_RC_SUCCESS; 451 + *out_len = out_size; 452 + 453 + return no_free_ptr(rpc_out); 454 + } 455 + 456 + static void *cxlctl_get_feature(struct cxl_features_state *cxlfs, 457 + const struct fwctl_rpc_cxl *rpc_in, 458 + size_t *out_len) 459 + { 460 + struct cxl_mailbox *cxl_mbox = &cxlfs->cxlds->cxl_mbox; 461 + const struct cxl_mbox_get_feat_in *feat_in; 462 + u16 offset, count, return_code; 463 + size_t out_size = *out_len; 464 + 465 + if (rpc_in->op_size != sizeof(*feat_in)) 466 + return ERR_PTR(-EINVAL); 467 + 468 + feat_in = &rpc_in->get_feat_in; 469 + offset = le16_to_cpu(feat_in->offset); 470 + count = le16_to_cpu(feat_in->count); 471 + 472 + if (!count) 473 + return ERR_PTR(-EINVAL); 474 + 475 + struct fwctl_rpc_cxl_out *rpc_out __free(kvfree) = 476 + kvzalloc(out_size, GFP_KERNEL); 477 + if (!rpc_out) 478 + return ERR_PTR(-ENOMEM); 479 + 480 + out_size = cxl_get_feature(cxl_mbox, &feat_in->uuid, 481 + feat_in->selection, rpc_out->payload, 482 + count, offset, &return_code); 483 + *out_len = sizeof(struct fwctl_rpc_cxl_out); 484 + if (!out_size) { 485 + rpc_out->size = 0; 486 + rpc_out->retval = return_code; 487 + return no_free_ptr(rpc_out); 488 + } 489 + 490 + rpc_out->size = out_size; 491 + rpc_out->retval = CXL_MBOX_CMD_RC_SUCCESS; 492 + *out_len += out_size; 493 + 494 + return no_free_ptr(rpc_out); 495 + } 496 + 497 + static void *cxlctl_set_feature(struct cxl_features_state *cxlfs, 498 + const struct fwctl_rpc_cxl *rpc_in, 499 + size_t *out_len) 500 + { 501 + struct cxl_mailbox *cxl_mbox = &cxlfs->cxlds->cxl_mbox; 502 + const struct 
cxl_mbox_set_feat_in *feat_in; 503 + size_t out_size, data_size; 504 + u16 offset, return_code; 505 + u32 flags; 506 + int rc; 507 + 508 + if (rpc_in->op_size <= sizeof(feat_in->hdr)) 509 + return ERR_PTR(-EINVAL); 510 + 511 + feat_in = &rpc_in->set_feat_in; 512 + 513 + if (is_cxl_feature_exclusive_by_uuid(&feat_in->uuid)) 514 + return ERR_PTR(-EPERM); 515 + 516 + offset = le16_to_cpu(feat_in->offset); 517 + flags = le32_to_cpu(feat_in->flags); 518 + out_size = *out_len; 519 + 520 + struct fwctl_rpc_cxl_out *rpc_out __free(kvfree) = 521 + kvzalloc(out_size, GFP_KERNEL); 522 + if (!rpc_out) 523 + return ERR_PTR(-ENOMEM); 524 + 525 + rpc_out->size = 0; 526 + 527 + data_size = rpc_in->op_size - sizeof(feat_in->hdr); 528 + rc = cxl_set_feature(cxl_mbox, &feat_in->uuid, 529 + feat_in->version, feat_in->feat_data, 530 + data_size, flags, offset, &return_code); 531 + if (rc) { 532 + rpc_out->retval = return_code; 533 + return no_free_ptr(rpc_out); 534 + } 535 + 536 + rpc_out->retval = CXL_MBOX_CMD_RC_SUCCESS; 537 + *out_len = sizeof(*rpc_out); 538 + 539 + return no_free_ptr(rpc_out); 540 + } 541 + 542 + static bool cxlctl_validate_set_features(struct cxl_features_state *cxlfs, 543 + const struct fwctl_rpc_cxl *rpc_in, 544 + enum fwctl_rpc_scope scope) 545 + { 546 + u16 effects, imm_mask, reset_mask; 547 + struct cxl_feat_entry *feat; 548 + u32 flags; 549 + 550 + feat = get_support_feature_info(cxlfs, rpc_in); 551 + if (IS_ERR(feat)) 552 + return false; 553 + 554 + /* Ensure that the attribute is changeable */ 555 + flags = le32_to_cpu(feat->flags); 556 + if (!(flags & CXL_FEATURE_F_CHANGEABLE)) 557 + return false; 558 + 559 + effects = le16_to_cpu(feat->effects); 560 + 561 + /* 562 + * Reject if reserved bits are set, since the effects are not 563 + * comprehended by the driver. 
564 + */ 565 + if (effects & CXL_CMD_EFFECTS_RESERVED) { 566 + dev_warn_once(cxlfs->cxlds->dev, 567 + "Reserved bits set in the Feature effects field!\n"); 568 + return false; 569 + } 570 + 571 + /* Currently no user background command support */ 572 + if (effects & CXL_CMD_BACKGROUND) 573 + return false; 574 + 575 + /* Effects cause immediate change, highest security scope is needed */ 576 + imm_mask = CXL_CMD_CONFIG_CHANGE_IMMEDIATE | 577 + CXL_CMD_DATA_CHANGE_IMMEDIATE | 578 + CXL_CMD_POLICY_CHANGE_IMMEDIATE | 579 + CXL_CMD_LOG_CHANGE_IMMEDIATE; 580 + 581 + reset_mask = CXL_CMD_CONFIG_CHANGE_COLD_RESET | 582 + CXL_CMD_CONFIG_CHANGE_CONV_RESET | 583 + CXL_CMD_CONFIG_CHANGE_CXL_RESET; 584 + 585 + /* If no immediate or reset effect is set, the hardware has a bug */ 586 + if (!(effects & imm_mask) && !(effects & reset_mask)) 587 + return false; 588 + 589 + /* 590 + * If the Feature setting causes immediate configuration change 591 + * then we need the full write permission policy. 592 + */ 593 + if (effects & imm_mask && scope >= FWCTL_RPC_DEBUG_WRITE_FULL) 594 + return true; 595 + 596 + /* 597 + * If the Feature setting only causes configuration change 598 + * after a reset, then the lesser level of write permission 599 + * policy is ok. 
600 + */ 601 + if (!(effects & imm_mask) && scope >= FWCTL_RPC_DEBUG_WRITE) 602 + return true; 603 + 604 + return false; 605 + } 606 + 607 + static bool cxlctl_validate_hw_command(struct cxl_features_state *cxlfs, 608 + const struct fwctl_rpc_cxl *rpc_in, 609 + enum fwctl_rpc_scope scope, 610 + u16 opcode) 611 + { 612 + struct cxl_mailbox *cxl_mbox = &cxlfs->cxlds->cxl_mbox; 613 + 614 + switch (opcode) { 615 + case CXL_MBOX_OP_GET_SUPPORTED_FEATURES: 616 + case CXL_MBOX_OP_GET_FEATURE: 617 + if (cxl_mbox->feat_cap < CXL_FEATURES_RO) 618 + return false; 619 + if (scope >= FWCTL_RPC_CONFIGURATION) 620 + return true; 621 + return false; 622 + case CXL_MBOX_OP_SET_FEATURE: 623 + if (cxl_mbox->feat_cap < CXL_FEATURES_RW) 624 + return false; 625 + return cxlctl_validate_set_features(cxlfs, rpc_in, scope); 626 + default: 627 + return false; 628 + } 629 + } 630 + 631 + static void *cxlctl_handle_commands(struct cxl_features_state *cxlfs, 632 + const struct fwctl_rpc_cxl *rpc_in, 633 + size_t *out_len, u16 opcode) 634 + { 635 + switch (opcode) { 636 + case CXL_MBOX_OP_GET_SUPPORTED_FEATURES: 637 + return cxlctl_get_supported_features(cxlfs, rpc_in, out_len); 638 + case CXL_MBOX_OP_GET_FEATURE: 639 + return cxlctl_get_feature(cxlfs, rpc_in, out_len); 640 + case CXL_MBOX_OP_SET_FEATURE: 641 + return cxlctl_set_feature(cxlfs, rpc_in, out_len); 642 + default: 643 + return ERR_PTR(-EOPNOTSUPP); 644 + } 645 + } 646 + 647 + static void *cxlctl_fw_rpc(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope, 648 + void *in, size_t in_len, size_t *out_len) 649 + { 650 + struct fwctl_device *fwctl_dev = uctx->fwctl; 651 + struct cxl_memdev *cxlmd = fwctl_to_memdev(fwctl_dev); 652 + struct cxl_features_state *cxlfs = to_cxlfs(cxlmd->cxlds); 653 + const struct fwctl_rpc_cxl *rpc_in = in; 654 + u16 opcode = rpc_in->opcode; 655 + 656 + if (!cxlctl_validate_hw_command(cxlfs, rpc_in, scope, opcode)) 657 + return ERR_PTR(-EINVAL); 658 + 659 + return cxlctl_handle_commands(cxlfs, rpc_in, out_len, 
opcode); 660 + } 661 + 662 + static const struct fwctl_ops cxlctl_ops = { 663 + .device_type = FWCTL_DEVICE_TYPE_CXL, 664 + .uctx_size = sizeof(struct fwctl_uctx), 665 + .open_uctx = cxlctl_open_uctx, 666 + .close_uctx = cxlctl_close_uctx, 667 + .fw_rpc = cxlctl_fw_rpc, 668 + }; 669 + 670 + DEFINE_FREE(free_fwctl_dev, struct fwctl_device *, if (_T) fwctl_put(_T)) 671 + 672 + static void free_memdev_fwctl(void *_fwctl_dev) 673 + { 674 + struct fwctl_device *fwctl_dev = _fwctl_dev; 675 + 676 + fwctl_unregister(fwctl_dev); 677 + fwctl_put(fwctl_dev); 678 + } 679 + 680 + int devm_cxl_setup_fwctl(struct cxl_memdev *cxlmd) 681 + { 682 + struct cxl_dev_state *cxlds = cxlmd->cxlds; 683 + struct cxl_features_state *cxlfs; 684 + int rc; 685 + 686 + cxlfs = to_cxlfs(cxlds); 687 + if (!cxlfs) 688 + return -ENODEV; 689 + 690 + /* No need to setup FWCTL if there are no user allowed features found */ 691 + if (!cxlfs->entries->num_user_features) 692 + return -ENODEV; 693 + 694 + struct fwctl_device *fwctl_dev __free(free_fwctl_dev) = 695 + _fwctl_alloc_device(&cxlmd->dev, &cxlctl_ops, sizeof(*fwctl_dev)); 696 + if (!fwctl_dev) 697 + return -ENOMEM; 698 + 699 + rc = fwctl_register(fwctl_dev); 700 + if (rc) 701 + return rc; 702 + 703 + return devm_add_action_or_reset(&cxlmd->dev, free_memdev_fwctl, 704 + no_free_ptr(fwctl_dev)); 705 + } 706 + EXPORT_SYMBOL_NS_GPL(devm_cxl_setup_fwctl, "CXL"); 707 + 708 + MODULE_IMPORT_NS("FWCTL");
+77 -47
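The multi-command transfer loop in cxl_set_feature() above selects one of four data-transfer flags per mailbox command: a full transfer when everything fits in one payload, otherwise an initiate/continue/finish sequence. A minimal sketch of that chunking decision (Python, with illustrative names — FULL/INITIATE/CONTINUE/FINISH stand in for the kernel's CXL_SET_FEAT_FLAG_* values, and plan_set_feature_transfer() is a made-up helper, not a driver function):

```python
# Illustrative sketch of cxl_set_feature()'s chunking, not kernel code.
FULL, INITIATE, CONTINUE, FINISH = "full", "initiate", "continue", "finish"

def plan_set_feature_transfer(feat_data_size, payload_size, hdr_size):
    """Return the (flag, chunk_size) sequence one Set Feature call would use."""
    max_chunk = payload_size - hdr_size  # room left for data after the header
    if max_chunk <= 0:
        raise ValueError("mailbox payload too small for the feature header")
    if feat_data_size <= max_chunk:
        # Header plus data fit in one mailbox command.
        return [(FULL, feat_data_size)]
    plan = [(INITIATE, max_chunk)]
    sent = max_chunk
    # Keep sending full-sized chunks while more than one chunk remains.
    while feat_data_size - sent > max_chunk:
        plan.append((CONTINUE, max_chunk))
        sent += max_chunk
    plan.append((FINISH, feat_data_size - sent))  # final partial chunk
    return plan
```

Each chunk also carries a running offset (pi->offset in the code above), which this sketch omits.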
drivers/cxl/core/mbox.c
··· 349 349 return true; 350 350 } 351 351 352 - static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox, 353 - struct cxl_memdev_state *mds, u16 opcode, 352 + static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox_cmd, 353 + struct cxl_mailbox *cxl_mbox, u16 opcode, 354 354 size_t in_size, size_t out_size, u64 in_payload) 355 355 { 356 - struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 357 - *mbox = (struct cxl_mbox_cmd) { 356 + *mbox_cmd = (struct cxl_mbox_cmd) { 358 357 .opcode = opcode, 359 358 .size_in = in_size, 360 359 }; 361 360 362 361 if (in_size) { 363 - mbox->payload_in = vmemdup_user(u64_to_user_ptr(in_payload), 364 - in_size); 365 - if (IS_ERR(mbox->payload_in)) 366 - return PTR_ERR(mbox->payload_in); 362 + mbox_cmd->payload_in = vmemdup_user(u64_to_user_ptr(in_payload), 363 + in_size); 364 + if (IS_ERR(mbox_cmd->payload_in)) 365 + return PTR_ERR(mbox_cmd->payload_in); 367 366 368 - if (!cxl_payload_from_user_allowed(opcode, mbox->payload_in)) { 369 - dev_dbg(mds->cxlds.dev, "%s: input payload not allowed\n", 367 + if (!cxl_payload_from_user_allowed(opcode, mbox_cmd->payload_in)) { 368 + dev_dbg(cxl_mbox->host, "%s: input payload not allowed\n", 370 369 cxl_mem_opcode_to_name(opcode)); 371 - kvfree(mbox->payload_in); 370 + kvfree(mbox_cmd->payload_in); 372 371 return -EBUSY; 373 372 } 374 373 } 375 374 376 375 /* Prepare to handle a full payload for variable sized output */ 377 376 if (out_size == CXL_VARIABLE_PAYLOAD) 378 - mbox->size_out = cxl_mbox->payload_size; 377 + mbox_cmd->size_out = cxl_mbox->payload_size; 379 378 else 380 - mbox->size_out = out_size; 379 + mbox_cmd->size_out = out_size; 381 380 382 - if (mbox->size_out) { 383 - mbox->payload_out = kvzalloc(mbox->size_out, GFP_KERNEL); 384 - if (!mbox->payload_out) { 385 - kvfree(mbox->payload_in); 381 + if (mbox_cmd->size_out) { 382 + mbox_cmd->payload_out = kvzalloc(mbox_cmd->size_out, GFP_KERNEL); 383 + if (!mbox_cmd->payload_out) { 384 + kvfree(mbox_cmd->payload_in); 386 385 return 
-ENOMEM; 387 386 } 388 387 } ··· 396 397 397 398 static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd, 398 399 const struct cxl_send_command *send_cmd, 399 - struct cxl_memdev_state *mds) 400 + struct cxl_mailbox *cxl_mbox) 400 401 { 401 - struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 402 - 403 402 if (send_cmd->raw.rsvd) 404 403 return -EINVAL; 405 404 ··· 412 415 if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode)) 413 416 return -EPERM; 414 417 415 - dev_WARN_ONCE(mds->cxlds.dev, true, "raw command path used\n"); 418 + dev_WARN_ONCE(cxl_mbox->host, true, "raw command path used\n"); 416 419 417 420 *mem_cmd = (struct cxl_mem_command) { 418 421 .info = { ··· 428 431 429 432 static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd, 430 433 const struct cxl_send_command *send_cmd, 431 - struct cxl_memdev_state *mds) 434 + struct cxl_mailbox *cxl_mbox) 432 435 { 433 436 struct cxl_mem_command *c = &cxl_mem_commands[send_cmd->id]; 434 437 const struct cxl_command_info *info = &c->info; ··· 443 446 return -EINVAL; 444 447 445 448 /* Check that the command is enabled for hardware */ 446 - if (!test_bit(info->id, mds->enabled_cmds)) 449 + if (!test_bit(info->id, cxl_mbox->enabled_cmds)) 447 450 return -ENOTTY; 448 451 449 452 /* Check that the command is not claimed for exclusive kernel use */ 450 - if (test_bit(info->id, mds->exclusive_cmds)) 453 + if (test_bit(info->id, cxl_mbox->exclusive_cmds)) 451 454 return -EBUSY; 452 455 453 456 /* Check the input buffer is the expected size */ ··· 476 479 /** 477 480 * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND. 478 481 * @mbox_cmd: Sanitized and populated &struct cxl_mbox_cmd. 479 - * @mds: The driver data for the operation 482 + * @cxl_mbox: CXL mailbox context 480 483 * @send_cmd: &struct cxl_send_command copied in from userspace. 481 484 * 482 485 * Return: ··· 491 494 * safe to send to the hardware. 
492 495 */ 493 496 static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd, 494 - struct cxl_memdev_state *mds, 497 + struct cxl_mailbox *cxl_mbox, 495 498 const struct cxl_send_command *send_cmd) 496 499 { 497 - struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 498 500 struct cxl_mem_command mem_cmd; 499 501 int rc; 500 502 ··· 510 514 511 515 /* Sanitize and construct a cxl_mem_command */ 512 516 if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) 513 - rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, mds); 517 + rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, cxl_mbox); 514 518 else 515 - rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, mds); 519 + rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, cxl_mbox); 516 520 517 521 if (rc) 518 522 return rc; 519 523 520 524 /* Sanitize and construct a cxl_mbox_cmd */ 521 - return cxl_mbox_cmd_ctor(mbox_cmd, mds, mem_cmd.opcode, 525 + return cxl_mbox_cmd_ctor(mbox_cmd, cxl_mbox, mem_cmd.opcode, 522 526 mem_cmd.info.size_in, mem_cmd.info.size_out, 523 527 send_cmd->in.payload); 524 528 } 525 529 526 - int cxl_query_cmd(struct cxl_memdev *cxlmd, 530 + int cxl_query_cmd(struct cxl_mailbox *cxl_mbox, 527 531 struct cxl_mem_query_commands __user *q) 528 532 { 529 - struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds); 530 - struct device *dev = &cxlmd->dev; 533 + struct device *dev = cxl_mbox->host; 531 534 struct cxl_mem_command *cmd; 532 535 u32 n_commands; 533 536 int j = 0; ··· 547 552 cxl_for_each_cmd(cmd) { 548 553 struct cxl_command_info info = cmd->info; 549 554 550 - if (test_bit(info.id, mds->enabled_cmds)) 555 + if (test_bit(info.id, cxl_mbox->enabled_cmds)) 551 556 info.flags |= CXL_MEM_COMMAND_FLAG_ENABLED; 552 - if (test_bit(info.id, mds->exclusive_cmds)) 557 + if (test_bit(info.id, cxl_mbox->exclusive_cmds)) 553 558 info.flags |= CXL_MEM_COMMAND_FLAG_EXCLUSIVE; 554 559 555 560 if (copy_to_user(&q->commands[j++], &info, sizeof(info))) ··· 564 569 565 570 /** 566 571 * handle_mailbox_cmd_from_user() - Dispatch a mailbox 
command for userspace. 567 - * @mds: The driver data for the operation 572 + * @cxl_mbox: The mailbox context for the operation. 568 573 * @mbox_cmd: The validated mailbox command. 569 574 * @out_payload: Pointer to userspace's output payload. 570 575 * @size_out: (Input) Max payload size to copy out. ··· 585 590 * 586 591 * See cxl_send_cmd(). 587 592 */ 588 - static int handle_mailbox_cmd_from_user(struct cxl_memdev_state *mds, 593 + static int handle_mailbox_cmd_from_user(struct cxl_mailbox *cxl_mbox, 589 594 struct cxl_mbox_cmd *mbox_cmd, 590 595 u64 out_payload, s32 *size_out, 591 596 u32 *retval) 592 597 { 593 - struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 594 - struct device *dev = mds->cxlds.dev; 598 + struct device *dev = cxl_mbox->host; 595 599 int rc; 596 600 597 601 dev_dbg(dev, ··· 627 633 return rc; 628 634 } 629 635 630 - int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s) 636 + int cxl_send_cmd(struct cxl_mailbox *cxl_mbox, struct cxl_send_command __user *s) 631 637 { 632 - struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds); 633 - struct device *dev = &cxlmd->dev; 638 + struct device *dev = cxl_mbox->host; 634 639 struct cxl_send_command send; 635 640 struct cxl_mbox_cmd mbox_cmd; 636 641 int rc; ··· 639 646 if (copy_from_user(&send, s, sizeof(send))) 640 647 return -EFAULT; 641 648 642 - rc = cxl_validate_cmd_from_user(&mbox_cmd, mds, &send); 649 + rc = cxl_validate_cmd_from_user(&mbox_cmd, cxl_mbox, &send); 643 650 if (rc) 644 651 return rc; 645 652 646 - rc = handle_mailbox_cmd_from_user(mds, &mbox_cmd, send.out.payload, 653 + rc = handle_mailbox_cmd_from_user(cxl_mbox, &mbox_cmd, send.out.payload, 647 654 &send.out.size, &send.retval); 648 655 if (rc) 649 656 return rc; ··· 706 713 return 0; 707 714 } 708 715 716 + static int check_features_opcodes(u16 opcode, int *ro_cmds, int *wr_cmds) 717 + { 718 + switch (opcode) { 719 + case CXL_MBOX_OP_GET_SUPPORTED_FEATURES: 720 + case 
CXL_MBOX_OP_GET_FEATURE: 721 + (*ro_cmds)++; 722 + return 1; 723 + case CXL_MBOX_OP_SET_FEATURE: 724 + (*wr_cmds)++; 725 + return 1; 726 + default: 727 + return 0; 728 + } 729 + } 730 + 731 + /* 'Get Supported Features' and 'Get Feature' */ 732 + #define MAX_FEATURES_READ_CMDS 2 733 + static void set_features_cap(struct cxl_mailbox *cxl_mbox, 734 + int ro_cmds, int wr_cmds) 735 + { 736 + /* Setting up Features capability while walking the CEL */ 737 + if (ro_cmds == MAX_FEATURES_READ_CMDS) { 738 + if (wr_cmds) 739 + cxl_mbox->feat_cap = CXL_FEATURES_RW; 740 + else 741 + cxl_mbox->feat_cap = CXL_FEATURES_RO; 742 + } 743 + } 744 + 709 745 /** 710 746 * cxl_walk_cel() - Walk through the Command Effects Log. 711 747 * @mds: The driver data for the operation ··· 746 724 */ 747 725 static void cxl_walk_cel(struct cxl_memdev_state *mds, size_t size, u8 *cel) 748 726 { 727 + struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 749 728 struct cxl_cel_entry *cel_entry; 750 729 const int cel_entries = size / sizeof(*cel_entry); 751 730 struct device *dev = mds->cxlds.dev; 752 - int i; 731 + int i, ro_cmds = 0, wr_cmds = 0; 753 732 754 733 cel_entry = (struct cxl_cel_entry *) cel; 755 734 ··· 760 737 int enabled = 0; 761 738 762 739 if (cmd) { 763 - set_bit(cmd->info.id, mds->enabled_cmds); 740 + set_bit(cmd->info.id, cxl_mbox->enabled_cmds); 764 741 enabled++; 765 742 } 743 + 744 + enabled += check_features_opcodes(opcode, &ro_cmds, 745 + &wr_cmds); 766 746 767 747 if (cxl_is_poison_command(opcode)) { 768 748 cxl_set_poison_cmd_enabled(&mds->poison, opcode); ··· 780 754 dev_dbg(dev, "Opcode 0x%04x %s\n", opcode, 781 755 enabled ? 
"enabled" : "unsupported by driver"); 782 756 } 757 + 758 + set_features_cap(cxl_mbox, ro_cmds, wr_cmds); 783 759 } 784 760 785 761 static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_memdev_state *mds) ··· 835 807 */ 836 808 int cxl_enumerate_cmds(struct cxl_memdev_state *mds) 837 809 { 810 + struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; 838 811 struct cxl_mbox_get_supported_logs *gsl; 839 812 struct device *dev = mds->cxlds.dev; 840 813 struct cxl_mem_command *cmd; ··· 874 845 /* In case CEL was bogus, enable some default commands. */ 875 846 cxl_for_each_cmd(cmd) 876 847 if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE) 877 - set_bit(cmd->info.id, mds->enabled_cmds); 848 + set_bit(cmd->info.id, cxl_mbox->enabled_cmds); 878 849 879 850 /* Found the required CEL */ 880 851 rc = 0; ··· 1477 1448 mutex_init(&mds->event.log_lock); 1478 1449 mds->cxlds.dev = dev; 1479 1450 mds->cxlds.reg_map.host = dev; 1451 + mds->cxlds.cxl_mbox.host = dev; 1480 1452 mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE; 1481 1453 mds->cxlds.type = CXL_DEVTYPE_CLASSMEM; 1482 1454 mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
+15 -7
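The check_features_opcodes()/set_features_cap() pair above derives the mailbox's Features capability while walking the Command Effects Log: both read commands must be present for RO support, and Set Feature upgrades that to RW. A small sketch of that decision (the opcode values follow the CXL spec's Get Supported Features/Get Feature/Set Feature assignments; the "NONE"/"RO"/"RW" strings stand in for the kernel's CXL_FEATURES_* enum):

```python
# Sketch of the Features-capability derivation done during the CEL walk.
GET_SUPPORTED_FEATURES = 0x0500
GET_FEATURE = 0x0501
SET_FEATURE = 0x0502

def features_capability(cel_opcodes):
    # Count the distinct read-side feature commands advertised by the device.
    ro_cmds = sum(op in (GET_SUPPORTED_FEATURES, GET_FEATURE)
                  for op in set(cel_opcodes))
    wr_cmds = SET_FEATURE in cel_opcodes
    if ro_cmds == 2:                 # both read commands present in the CEL
        return "RW" if wr_cmds else "RO"
    return "NONE"                    # features cannot be safely enumerated
```

This mirrors why devm_cxl_setup_features() bails out with -ENODEV when feat_cap is below CXL_FEATURES_RO, and why cxlctl_validate_hw_command() demands CXL_FEATURES_RW before allowing Set Feature.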
drivers/cxl/core/memdev.c
···
 void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				unsigned long *cmds)
 {
+	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
+
 	down_write(&cxl_memdev_rwsem);
-	bitmap_or(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
-		  CXL_MEM_COMMAND_ID_MAX);
+	bitmap_or(cxl_mbox->exclusive_cmds, cxl_mbox->exclusive_cmds,
+		  cmds, CXL_MEM_COMMAND_ID_MAX);
 	up_write(&cxl_memdev_rwsem);
 }
 EXPORT_SYMBOL_NS_GPL(set_exclusive_cxl_commands, "CXL");
···
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds)
 {
+	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
+
 	down_write(&cxl_memdev_rwsem);
-	bitmap_andnot(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
-		      CXL_MEM_COMMAND_ID_MAX);
+	bitmap_andnot(cxl_mbox->exclusive_cmds, cxl_mbox->exclusive_cmds,
+		      cmds, CXL_MEM_COMMAND_ID_MAX);
 	up_write(&cxl_memdev_rwsem);
 }
 EXPORT_SYMBOL_NS_GPL(clear_exclusive_cxl_commands, "CXL");
···
 static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
 			       unsigned long arg)
 {
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
+	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
+
 	switch (cmd) {
 	case CXL_MEM_QUERY_COMMANDS:
-		return cxl_query_cmd(cxlmd, (void __user *)arg);
+		return cxl_query_cmd(cxl_mbox, (void __user *)arg);
 	case CXL_MEM_SEND_COMMAND:
-		return cxl_send_cmd(cxlmd, (void __user *)arg);
+		return cxl_send_cmd(cxl_mbox, (void __user *)arg);
 	default:
 		return -ENOTTY;
 	}
···
 int devm_cxl_setup_fw_upload(struct device *host, struct cxl_memdev_state *mds)
 {
 	struct cxl_dev_state *cxlds = &mds->cxlds;
+	struct cxl_mailbox *cxl_mbox = &cxlds->cxl_mbox;
 	struct device *dev = &cxlds->cxlmd->dev;
 	struct fw_upload *fwl;
 
-	if (!test_bit(CXL_MEM_COMMAND_ID_GET_FW_INFO, mds->enabled_cmds))
+	if (!test_bit(CXL_MEM_COMMAND_ID_GET_FW_INFO, cxl_mbox->enabled_cmds))
 		return 0;
 
 	fwl = firmware_upload_register(THIS_MODULE, dev, dev_name(dev),
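The hunks above move the `enabled_cmds`/`exclusive_cmds` bitmaps from `cxl_memdev_state` into the shared `cxl_mailbox`, but the mask/unmask logic itself is unchanged: exclusive commands are OR-ed in and AND-NOT-ed out under a writer lock. A minimal userspace model of that logic (a single `unsigned long` stands in for the kernel's `DECLARE_BITMAP`, and the lock is omitted; names here are illustrative, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the cxl_mailbox command bitmap. */
struct mbox_model {
	unsigned long exclusive_cmds;	/* commands reserved for kernel use */
};

/* Models set_exclusive_cxl_commands(): OR the new mask in. */
static void set_exclusive(struct mbox_model *m, unsigned long cmds)
{
	m->exclusive_cmds |= cmds;
}

/* Models clear_exclusive_cxl_commands(): AND-NOT the mask out. */
static void clear_exclusive(struct mbox_model *m, unsigned long cmds)
{
	m->exclusive_cmds &= ~cmds;
}

/* Models test_bit() on the bitmap. */
static bool is_exclusive(const struct mbox_model *m, int cmd)
{
	return (m->exclusive_cmds >> cmd) & 1;
}
```

In the kernel the same three operations are `bitmap_or()`, `bitmap_andnot()`, and `test_bit()` over `CXL_MEM_COMMAND_ID_MAX` bits, serialized by `cxl_memdev_rwsem`.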
+7 -40
drivers/cxl/cxlmem.h
···
 	return xa_load(&port->endpoints, (unsigned long)&cxlmd->dev);
 }
 
-/**
- * struct cxl_mbox_cmd - A command to be submitted to hardware.
- * @opcode: (input) The command set and command submitted to hardware.
- * @payload_in: (input) Pointer to the input payload.
- * @payload_out: (output) Pointer to the output payload. Must be allocated by
- *		 the caller.
- * @size_in: (input) Number of bytes to load from @payload_in.
- * @size_out: (input) Max number of bytes loaded into @payload_out.
- *	      (output) Number of bytes generated by the device. For fixed size
- *	      outputs commands this is always expected to be deterministic. For
- *	      variable sized output commands, it tells the exact number of bytes
- *	      written.
- * @min_out: (input) internal command output payload size validation
- * @poll_count: (input) Number of timeouts to attempt.
- * @poll_interval_ms: (input) Time between mailbox background command polling
- *		      interval timeouts.
- * @return_code: (output) Error code returned from hardware.
- *
- * This is the primary mechanism used to send commands to the hardware.
- * All the fields except @payload_* correspond exactly to the fields described in
- * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and
- * @payload_out are written to, and read from the Command Payload Registers
- * defined in CXL 2.0 8.2.8.4.8.
- */
-struct cxl_mbox_cmd {
-	u16 opcode;
-	void *payload_in;
-	void *payload_out;
-	size_t size_in;
-	size_t size_out;
-	size_t min_out;
-	int poll_count;
-	int poll_interval_ms;
-	u16 return_code;
-};
-
 /*
  * Per CXL 3.0 Section 8.2.8.4.5.1
  */
···
  * @serial: PCIe Device Serial Number
  * @type: Generic Memory Class device or Vendor Specific Memory device
  * @cxl_mbox: CXL mailbox context
+ * @cxlfs: CXL features context
  */
 struct cxl_dev_state {
 	struct device *dev;
···
 	u64 serial;
 	enum cxl_devtype type;
 	struct cxl_mailbox cxl_mbox;
+#ifdef CONFIG_CXL_FEATURES
+	struct cxl_features_state *cxlfs;
+#endif
 };
 
 static inline struct cxl_dev_state *mbox_to_cxlds(struct cxl_mailbox *cxl_mbox)
···
  * @lsa_size: Size of Label Storage Area
  *            (CXL 2.0 8.2.9.5.1.1 Identify Memory Device)
  * @firmware_version: Firmware version for the memory device.
- * @enabled_cmds: Hardware commands found enabled in CEL.
- * @exclusive_cmds: Commands that are kernel-internal only
  * @total_bytes: sum of all possible capacities
  * @volatile_only_bytes: hard volatile capacity
  * @persistent_only_bytes: hard persistent capacity
···
 	struct cxl_dev_state cxlds;
 	size_t lsa_size;
 	char firmware_version[0x10];
-	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
-	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 	u64 total_bytes;
 	u64 volatile_only_bytes;
 	u64 persistent_only_bytes;
···
 	CXL_MBOX_OP_GET_LOG_CAPS	= 0x0402,
 	CXL_MBOX_OP_CLEAR_LOG		= 0x0403,
 	CXL_MBOX_OP_GET_SUP_LOG_SUBLIST	= 0x0405,
+	CXL_MBOX_OP_GET_SUPPORTED_FEATURES	= 0x0500,
+	CXL_MBOX_OP_GET_FEATURE		= 0x0501,
+	CXL_MBOX_OP_SET_FEATURE		= 0x0502,
 	CXL_MBOX_OP_IDENTIFY		= 0x4000,
 	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
 	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
+8
drivers/cxl/pci.c
···
 	if (rc)
 		return rc;
 
+	rc = devm_cxl_setup_features(cxlds);
+	if (rc)
+		dev_dbg(&pdev->dev, "No CXL Features discovered\n");
+
 	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
···
 	rc = devm_cxl_sanitize_setup_notifier(&pdev->dev, cxlmd);
 	if (rc)
 		return rc;
+
+	rc = devm_cxl_setup_fwctl(cxlmd);
+	if (rc)
+		dev_dbg(&pdev->dev, "No CXL FWCTL setup\n");
 
 	pmu_count = cxl_count_regblock(pdev, CXL_REGLOC_RBI_PMU);
 	if (pmu_count < 0)
+33
drivers/fwctl/Kconfig
···
+# SPDX-License-Identifier: GPL-2.0-only
+menuconfig FWCTL
+	tristate "fwctl device firmware access framework"
+	help
+	  fwctl provides a userspace API for restricted access to communicate
+	  with on-device firmware. The communication channel is intended to
+	  support a wide range of lockdown compatible device behaviors including
+	  manipulating device FLASH, debugging, and other activities that don't
+	  fit neatly into an existing subsystem.
+
+if FWCTL
+config FWCTL_MLX5
+	tristate "mlx5 ConnectX control fwctl driver"
+	depends on MLX5_CORE
+	help
+	  MLX5 provides interface for the user process to access the debug and
+	  configuration registers of the ConnectX hardware family
+	  (NICs, PCI switches and SmartNIC SoCs).
+	  This will allow configuration and debug tools to work out of the box on
+	  mainstream kernel.
+
+	  If you don't know what to do here, say N.
+
+config FWCTL_PDS
+	tristate "AMD/Pensando pds fwctl driver"
+	depends on PDS_CORE
+	help
+	  The pds_fwctl driver provides an fwctl interface for a user process
+	  to access the debug and configuration information of the AMD/Pensando
+	  DSC hardware family.
+
+	  If you don't know what to do here, say N.
+endif
+6
drivers/fwctl/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_FWCTL) += fwctl.o
+obj-$(CONFIG_FWCTL_MLX5) += mlx5/
+obj-$(CONFIG_FWCTL_PDS) += pds/
+
+fwctl-y += main.o
+421
drivers/fwctl/main.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES
+ */
+#define pr_fmt(fmt) "fwctl: " fmt
+#include <linux/fwctl.h>
+
+#include <linux/container_of.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+
+#include <uapi/fwctl/fwctl.h>
+
+enum {
+	FWCTL_MAX_DEVICES = 4096,
+	MAX_RPC_LEN = SZ_2M,
+};
+static_assert(FWCTL_MAX_DEVICES < (1U << MINORBITS));
+
+static dev_t fwctl_dev;
+static DEFINE_IDA(fwctl_ida);
+static unsigned long fwctl_tainted;
+
+struct fwctl_ucmd {
+	struct fwctl_uctx *uctx;
+	void __user *ubuffer;
+	void *cmd;
+	u32 user_size;
+};
+
+static int ucmd_respond(struct fwctl_ucmd *ucmd, size_t cmd_len)
+{
+	if (copy_to_user(ucmd->ubuffer, ucmd->cmd,
+			 min_t(size_t, ucmd->user_size, cmd_len)))
+		return -EFAULT;
+	return 0;
+}
+
+static int copy_to_user_zero_pad(void __user *to, const void *from,
+				 size_t from_len, size_t user_len)
+{
+	size_t copy_len;
+
+	copy_len = min(from_len, user_len);
+	if (copy_to_user(to, from, copy_len))
+		return -EFAULT;
+	if (copy_len < user_len) {
+		if (clear_user(to + copy_len, user_len - copy_len))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+static int fwctl_cmd_info(struct fwctl_ucmd *ucmd)
+{
+	struct fwctl_device *fwctl = ucmd->uctx->fwctl;
+	struct fwctl_info *cmd = ucmd->cmd;
+	size_t driver_info_len = 0;
+
+	if (cmd->flags)
+		return -EOPNOTSUPP;
+
+	if (!fwctl->ops->info && cmd->device_data_len) {
+		if (clear_user(u64_to_user_ptr(cmd->out_device_data),
+			       cmd->device_data_len))
+			return -EFAULT;
+	} else if (cmd->device_data_len) {
+		void *driver_info __free(kfree) =
+			fwctl->ops->info(ucmd->uctx, &driver_info_len);
+		if (IS_ERR(driver_info))
+			return PTR_ERR(driver_info);
+
+		if (copy_to_user_zero_pad(u64_to_user_ptr(cmd->out_device_data),
+					  driver_info, driver_info_len,
+					  cmd->device_data_len))
+			return -EFAULT;
+	}
+
+	cmd->out_device_type = fwctl->ops->device_type;
+	cmd->device_data_len = driver_info_len;
+	return ucmd_respond(ucmd, sizeof(*cmd));
+}
+
+static int fwctl_cmd_rpc(struct fwctl_ucmd *ucmd)
+{
+	struct fwctl_device *fwctl = ucmd->uctx->fwctl;
+	struct fwctl_rpc *cmd = ucmd->cmd;
+	size_t out_len;
+
+	if (cmd->in_len > MAX_RPC_LEN || cmd->out_len > MAX_RPC_LEN)
+		return -EMSGSIZE;
+
+	switch (cmd->scope) {
+	case FWCTL_RPC_CONFIGURATION:
+	case FWCTL_RPC_DEBUG_READ_ONLY:
+		break;
+
+	case FWCTL_RPC_DEBUG_WRITE_FULL:
+		if (!capable(CAP_SYS_RAWIO))
+			return -EPERM;
+		fallthrough;
+	case FWCTL_RPC_DEBUG_WRITE:
+		if (!test_and_set_bit(0, &fwctl_tainted)) {
+			dev_warn(
+				&fwctl->dev,
+				"%s(%d): has requested full access to the physical device device",
+				current->comm, task_pid_nr(current));
+			add_taint(TAINT_FWCTL, LOCKDEP_STILL_OK);
+		}
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	void *inbuf __free(kvfree) = kvzalloc(cmd->in_len, GFP_KERNEL_ACCOUNT);
+	if (!inbuf)
+		return -ENOMEM;
+	if (copy_from_user(inbuf, u64_to_user_ptr(cmd->in), cmd->in_len))
+		return -EFAULT;
+
+	out_len = cmd->out_len;
+	void *outbuf __free(kvfree) = fwctl->ops->fw_rpc(
+		ucmd->uctx, cmd->scope, inbuf, cmd->in_len, &out_len);
+	if (IS_ERR(outbuf))
+		return PTR_ERR(outbuf);
+	if (outbuf == inbuf) {
+		/* The driver can re-use inbuf as outbuf */
+		inbuf = NULL;
+	}
+
+	if (copy_to_user(u64_to_user_ptr(cmd->out), outbuf,
+			 min(cmd->out_len, out_len)))
+		return -EFAULT;
+
+	cmd->out_len = out_len;
+	return ucmd_respond(ucmd, sizeof(*cmd));
+}
+
+/* On stack memory for the ioctl structs */
+union fwctl_ucmd_buffer {
+	struct fwctl_info info;
+	struct fwctl_rpc rpc;
+};
+
+struct fwctl_ioctl_op {
+	unsigned int size;
+	unsigned int min_size;
+	unsigned int ioctl_num;
+	int (*execute)(struct fwctl_ucmd *ucmd);
+};
+
+#define IOCTL_OP(_ioctl, _fn, _struct, _last)                              \
+	[_IOC_NR(_ioctl) - FWCTL_CMD_BASE] = {                             \
+		.size = sizeof(_struct) +                                  \
+			BUILD_BUG_ON_ZERO(sizeof(union fwctl_ucmd_buffer) < \
+					  sizeof(_struct)),                \
+		.min_size = offsetofend(_struct, _last),                   \
+		.ioctl_num = _ioctl,                                       \
+		.execute = _fn,                                            \
+	}
+static const struct fwctl_ioctl_op fwctl_ioctl_ops[] = {
+	IOCTL_OP(FWCTL_INFO, fwctl_cmd_info, struct fwctl_info, out_device_data),
+	IOCTL_OP(FWCTL_RPC, fwctl_cmd_rpc, struct fwctl_rpc, out),
+};
+
+static long fwctl_fops_ioctl(struct file *filp, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct fwctl_uctx *uctx = filp->private_data;
+	const struct fwctl_ioctl_op *op;
+	struct fwctl_ucmd ucmd = {};
+	union fwctl_ucmd_buffer buf;
+	unsigned int nr;
+	int ret;
+
+	nr = _IOC_NR(cmd);
+	if ((nr - FWCTL_CMD_BASE) >= ARRAY_SIZE(fwctl_ioctl_ops))
+		return -ENOIOCTLCMD;
+
+	op = &fwctl_ioctl_ops[nr - FWCTL_CMD_BASE];
+	if (op->ioctl_num != cmd)
+		return -ENOIOCTLCMD;
+
+	ucmd.uctx = uctx;
+	ucmd.cmd = &buf;
+	ucmd.ubuffer = (void __user *)arg;
+	ret = get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer);
+	if (ret)
+		return ret;
+
+	if (ucmd.user_size < op->min_size)
+		return -EINVAL;
+
+	ret = copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer,
+				    ucmd.user_size);
+	if (ret)
+		return ret;
+
+	guard(rwsem_read)(&uctx->fwctl->registration_lock);
+	if (!uctx->fwctl->ops)
+		return -ENODEV;
+	return op->execute(&ucmd);
+}
+
+static int fwctl_fops_open(struct inode *inode, struct file *filp)
+{
+	struct fwctl_device *fwctl =
+		container_of(inode->i_cdev, struct fwctl_device, cdev);
+	int ret;
+
+	guard(rwsem_read)(&fwctl->registration_lock);
+	if (!fwctl->ops)
+		return -ENODEV;
+
+	struct fwctl_uctx *uctx __free(kfree) =
+		kzalloc(fwctl->ops->uctx_size, GFP_KERNEL_ACCOUNT);
+	if (!uctx)
+		return -ENOMEM;
+
+	uctx->fwctl = fwctl;
+	ret = fwctl->ops->open_uctx(uctx);
+	if (ret)
+		return ret;
+
+	scoped_guard(mutex, &fwctl->uctx_list_lock) {
+		list_add_tail(&uctx->uctx_list_entry, &fwctl->uctx_list);
+	}
+
+	get_device(&fwctl->dev);
+	filp->private_data = no_free_ptr(uctx);
+	return 0;
+}
+
+static void fwctl_destroy_uctx(struct fwctl_uctx *uctx)
+{
+	lockdep_assert_held(&uctx->fwctl->uctx_list_lock);
+	list_del(&uctx->uctx_list_entry);
+	uctx->fwctl->ops->close_uctx(uctx);
+}
+
+static int fwctl_fops_release(struct inode *inode, struct file *filp)
+{
+	struct fwctl_uctx *uctx = filp->private_data;
+	struct fwctl_device *fwctl = uctx->fwctl;
+
+	scoped_guard(rwsem_read, &fwctl->registration_lock) {
+		/*
+		 * NULL ops means fwctl_unregister() has already removed the
+		 * driver and destroyed the uctx.
+		 */
+		if (fwctl->ops) {
+			guard(mutex)(&fwctl->uctx_list_lock);
+			fwctl_destroy_uctx(uctx);
+		}
+	}
+
+	kfree(uctx);
+	fwctl_put(fwctl);
+	return 0;
+}
+
+static const struct file_operations fwctl_fops = {
+	.owner = THIS_MODULE,
+	.open = fwctl_fops_open,
+	.release = fwctl_fops_release,
+	.unlocked_ioctl = fwctl_fops_ioctl,
+};
+
+static void fwctl_device_release(struct device *device)
+{
+	struct fwctl_device *fwctl =
+		container_of(device, struct fwctl_device, dev);
+
+	ida_free(&fwctl_ida, fwctl->dev.devt - fwctl_dev);
+	mutex_destroy(&fwctl->uctx_list_lock);
+	kfree(fwctl);
+}
+
+static char *fwctl_devnode(const struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "fwctl/%s", dev_name(dev));
+}
+
+static struct class fwctl_class = {
+	.name = "fwctl",
+	.dev_release = fwctl_device_release,
+	.devnode = fwctl_devnode,
+};
+
+static struct fwctl_device *
+_alloc_device(struct device *parent, const struct fwctl_ops *ops, size_t size)
+{
+	struct fwctl_device *fwctl __free(kfree) = kzalloc(size, GFP_KERNEL);
+	int devnum;
+
+	if (!fwctl)
+		return NULL;
+
+	devnum = ida_alloc_max(&fwctl_ida, FWCTL_MAX_DEVICES - 1, GFP_KERNEL);
+	if (devnum < 0)
+		return NULL;
+
+	fwctl->dev.devt = fwctl_dev + devnum;
+	fwctl->dev.class = &fwctl_class;
+	fwctl->dev.parent = parent;
+
+	init_rwsem(&fwctl->registration_lock);
+	mutex_init(&fwctl->uctx_list_lock);
+	INIT_LIST_HEAD(&fwctl->uctx_list);
+
+	device_initialize(&fwctl->dev);
+	return_ptr(fwctl);
+}
+
+/* Drivers use the fwctl_alloc_device() wrapper */
+struct fwctl_device *_fwctl_alloc_device(struct device *parent,
+					 const struct fwctl_ops *ops,
+					 size_t size)
+{
+	struct fwctl_device *fwctl __free(fwctl) =
+		_alloc_device(parent, ops, size);
+
+	if (!fwctl)
+		return NULL;
+
+	cdev_init(&fwctl->cdev, &fwctl_fops);
+	/*
+	 * The driver module is protected by fwctl_register/unregister(),
+	 * unregister won't complete until we are done with the driver's module.
+	 */
+	fwctl->cdev.owner = THIS_MODULE;
+
+	if (dev_set_name(&fwctl->dev, "fwctl%d", fwctl->dev.devt - fwctl_dev))
+		return NULL;
+
+	fwctl->ops = ops;
+	return_ptr(fwctl);
+}
+EXPORT_SYMBOL_NS_GPL(_fwctl_alloc_device, "FWCTL");
+
+/**
+ * fwctl_register - Register a new device to the subsystem
+ * @fwctl: Previously allocated fwctl_device
+ *
+ * On return the device is visible through sysfs and /dev, driver ops may be
+ * called.
+ */
+int fwctl_register(struct fwctl_device *fwctl)
+{
+	return cdev_device_add(&fwctl->cdev, &fwctl->dev);
+}
+EXPORT_SYMBOL_NS_GPL(fwctl_register, "FWCTL");
+
+/**
+ * fwctl_unregister - Unregister a device from the subsystem
+ * @fwctl: Previously allocated and registered fwctl_device
+ *
+ * Undoes fwctl_register(). On return no driver ops will be called. The
+ * caller must still call fwctl_put() to free the fwctl.
+ *
+ * Unregister will return even if userspace still has file descriptors open.
+ * This will call ops->close_uctx() on any open FDs and after return no driver
+ * op will be called. The FDs remain open but all fops will return -ENODEV.
+ *
+ * The design of fwctl allows this sort of disassociation of the driver from the
+ * subsystem primarily by keeping memory allocations owned by the core subsytem.
+ * The fwctl_device and fwctl_uctx can both be freed without requiring a driver
+ * callback. This allows the module to remain unlocked while FDs are open.
+ */
+void fwctl_unregister(struct fwctl_device *fwctl)
+{
+	struct fwctl_uctx *uctx;
+
+	cdev_device_del(&fwctl->cdev, &fwctl->dev);
+
+	/* Disable and free the driver's resources for any still open FDs. */
+	guard(rwsem_write)(&fwctl->registration_lock);
+	guard(mutex)(&fwctl->uctx_list_lock);
+	while ((uctx = list_first_entry_or_null(&fwctl->uctx_list,
+						struct fwctl_uctx,
+						uctx_list_entry)))
+		fwctl_destroy_uctx(uctx);
+
+	/*
+	 * The driver module may unload after this returns, the op pointer will
+	 * not be valid.
+	 */
+	fwctl->ops = NULL;
+}
+EXPORT_SYMBOL_NS_GPL(fwctl_unregister, "FWCTL");
+
+static int __init fwctl_init(void)
+{
+	int ret;
+
+	ret = alloc_chrdev_region(&fwctl_dev, 0, FWCTL_MAX_DEVICES, "fwctl");
+	if (ret)
+		return ret;
+
+	ret = class_register(&fwctl_class);
+	if (ret)
+		goto err_chrdev;
+	return 0;
+
+err_chrdev:
+	unregister_chrdev_region(fwctl_dev, FWCTL_MAX_DEVICES);
+	return ret;
+}
+
+static void __exit fwctl_exit(void)
+{
+	class_unregister(&fwctl_class);
+	unregister_chrdev_region(fwctl_dev, FWCTL_MAX_DEVICES);
+}
+
+module_init(fwctl_init);
+module_exit(fwctl_exit);
+MODULE_DESCRIPTION("fwctl device firmware access framework");
+MODULE_LICENSE("GPL");
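fwctl's `copy_to_user_zero_pad()` lets a driver hand back fewer bytes than userspace asked for, zero-filling the tail so old userspace sees a stable ABI against newer kernels. A userspace sketch of the same contract, with `memcpy`/`memset` standing in for `copy_to_user`/`clear_user` (the function name here is illustrative, not the kernel API):

```c
#include <assert.h>
#include <string.h>

/*
 * Copy up to from_len bytes into a user_len-sized destination and
 * zero-fill whatever remains, mirroring copy_to_user_zero_pad().
 * Returns 0 on success (the kernel version returns -EFAULT on a
 * faulting user pointer; plain memory can't fault here).
 */
static int zero_pad_copy(void *to, const void *from,
			 size_t from_len, size_t user_len)
{
	size_t copy_len = from_len < user_len ? from_len : user_len;

	memcpy(to, from, copy_len);
	if (copy_len < user_len)
		memset((char *)to + copy_len, 0, user_len - copy_len);
	return 0;
}
```

The inverse direction uses `copy_struct_from_user()`, which additionally rejects non-zero trailing bytes so a newer userspace can't pass flags an older kernel would silently ignore.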
+4
drivers/fwctl/mlx5/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_FWCTL_MLX5) += mlx5_fwctl.o
+
+mlx5_fwctl-y += main.o
+411
drivers/fwctl/mlx5/main.c
···
+// SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
+/*
+ * Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES
+ */
+#include <linux/fwctl.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/mlx5/device.h>
+#include <linux/mlx5/driver.h>
+#include <uapi/fwctl/mlx5.h>
+
+#define mlx5ctl_err(mcdev, format, ...) \
+	dev_err(&mcdev->fwctl.dev, format, ##__VA_ARGS__)
+
+#define mlx5ctl_dbg(mcdev, format, ...) \
+	dev_dbg(&mcdev->fwctl.dev, "PID %u: " format, current->pid, \
+		##__VA_ARGS__)
+
+struct mlx5ctl_uctx {
+	struct fwctl_uctx uctx;
+	u32 uctx_caps;
+	u32 uctx_uid;
+};
+
+struct mlx5ctl_dev {
+	struct fwctl_device fwctl;
+	struct mlx5_core_dev *mdev;
+};
+DEFINE_FREE(mlx5ctl, struct mlx5ctl_dev *, if (_T) fwctl_put(&_T->fwctl));
+
+struct mlx5_ifc_mbox_in_hdr_bits {
+	u8 opcode[0x10];
+	u8 uid[0x10];
+
+	u8 reserved_at_20[0x10];
+	u8 op_mod[0x10];
+
+	u8 reserved_at_40[0x40];
+};
+
+struct mlx5_ifc_mbox_out_hdr_bits {
+	u8 status[0x8];
+	u8 reserved_at_8[0x18];
+
+	u8 syndrome[0x20];
+
+	u8 reserved_at_40[0x40];
+};
+
+enum {
+	MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES = 0x4,
+};
+
+enum {
+	MLX5_CMD_OP_QUERY_DRIVER_VERSION = 0x10c,
+	MLX5_CMD_OP_QUERY_OTHER_HCA_CAP = 0x10e,
+	MLX5_CMD_OP_QUERY_RDB = 0x512,
+	MLX5_CMD_OP_QUERY_PSV = 0x602,
+	MLX5_CMD_OP_QUERY_DC_CNAK_TRACE = 0x716,
+	MLX5_CMD_OP_QUERY_NVMF_BACKEND_CONTROLLER = 0x722,
+	MLX5_CMD_OP_QUERY_NVMF_NAMESPACE_CONTEXT = 0x728,
+	MLX5_CMD_OP_QUERY_BURST_SIZE = 0x813,
+	MLX5_CMD_OP_QUERY_DIAGNOSTIC_PARAMS = 0x819,
+	MLX5_CMD_OP_SET_DIAGNOSTIC_PARAMS = 0x820,
+	MLX5_CMD_OP_QUERY_DIAGNOSTIC_COUNTERS = 0x821,
+	MLX5_CMD_OP_QUERY_DELAY_DROP_PARAMS = 0x911,
+	MLX5_CMD_OP_QUERY_AFU = 0x971,
+	MLX5_CMD_OP_QUERY_CAPI_PEC = 0x981,
+	MLX5_CMD_OP_QUERY_UCTX = 0xa05,
+	MLX5_CMD_OP_QUERY_UMEM = 0xa09,
+	MLX5_CMD_OP_QUERY_NVMF_CC_RESPONSE = 0xb02,
+	MLX5_CMD_OP_QUERY_EMULATED_FUNCTIONS_INFO = 0xb03,
+	MLX5_CMD_OP_QUERY_REGEXP_PARAMS = 0xb05,
+	MLX5_CMD_OP_QUERY_REGEXP_REGISTER = 0xb07,
+	MLX5_CMD_OP_USER_QUERY_XRQ_DC_PARAMS_ENTRY = 0xb08,
+	MLX5_CMD_OP_USER_QUERY_XRQ_ERROR_PARAMS = 0xb0a,
+	MLX5_CMD_OP_ACCESS_REGISTER_USER = 0xb0c,
+	MLX5_CMD_OP_QUERY_EMULATION_DEVICE_EQ_MSIX_MAPPING = 0xb0f,
+	MLX5_CMD_OP_QUERY_MATCH_SAMPLE_INFO = 0xb13,
+	MLX5_CMD_OP_QUERY_CRYPTO_STATE = 0xb14,
+	MLX5_CMD_OP_QUERY_VUID = 0xb22,
+	MLX5_CMD_OP_QUERY_DPA_PARTITION = 0xb28,
+	MLX5_CMD_OP_QUERY_DPA_PARTITIONS = 0xb2a,
+	MLX5_CMD_OP_POSTPONE_CONNECTED_QP_TIMEOUT = 0xb2e,
+	MLX5_CMD_OP_QUERY_EMULATED_RESOURCES_INFO = 0xb2f,
+	MLX5_CMD_OP_QUERY_RSV_RESOURCES = 0x8000,
+	MLX5_CMD_OP_QUERY_MTT = 0x8001,
+	MLX5_CMD_OP_QUERY_SCHED_QUEUE = 0x8006,
+};
+
+static int mlx5ctl_alloc_uid(struct mlx5ctl_dev *mcdev, u32 cap)
+{
+	u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(create_uctx_in)] = {};
+	void *uctx;
+	int ret;
+	u16 uid;
+
+	uctx = MLX5_ADDR_OF(create_uctx_in, in, uctx);
+
+	mlx5ctl_dbg(mcdev, "%s: caps 0x%x\n", __func__, cap);
+	MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX);
+	MLX5_SET(uctx, uctx, cap, cap);
+
+	ret = mlx5_cmd_exec(mcdev->mdev, in, sizeof(in), out, sizeof(out));
+	if (ret)
+		return ret;
+
+	uid = MLX5_GET(create_uctx_out, out, uid);
+	mlx5ctl_dbg(mcdev, "allocated uid %u with caps 0x%x\n", uid, cap);
+	return uid;
+}
+
+static void mlx5ctl_release_uid(struct mlx5ctl_dev *mcdev, u16 uid)
+{
+	u32 in[MLX5_ST_SZ_DW(destroy_uctx_in)] = {};
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	int ret;
+
+	MLX5_SET(destroy_uctx_in, in, opcode, MLX5_CMD_OP_DESTROY_UCTX);
+	MLX5_SET(destroy_uctx_in, in, uid, uid);
+
+	ret = mlx5_cmd_exec_in(mdev, destroy_uctx, in);
+	mlx5ctl_dbg(mcdev, "released uid %u %pe\n", uid, ERR_PTR(ret));
+}
+
+static int mlx5ctl_open_uctx(struct fwctl_uctx *uctx)
+{
+	struct mlx5ctl_uctx *mfd =
+		container_of(uctx, struct mlx5ctl_uctx, uctx);
+	struct mlx5ctl_dev *mcdev =
+		container_of(uctx->fwctl, struct mlx5ctl_dev, fwctl);
+	int uid;
+
+	/*
+	 * New FW supports the TOOLS_RESOURCES uid security label
+	 * which allows commands to manipulate the global device state.
+	 * Otherwise only basic existing RDMA devx privilege are allowed.
+	 */
+	if (MLX5_CAP_GEN(mcdev->mdev, uctx_cap) &
+	    MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES)
+		mfd->uctx_caps |= MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES;
+
+	uid = mlx5ctl_alloc_uid(mcdev, mfd->uctx_caps);
+	if (uid < 0)
+		return uid;
+
+	mfd->uctx_uid = uid;
+	return 0;
+}
+
+static void mlx5ctl_close_uctx(struct fwctl_uctx *uctx)
+{
+	struct mlx5ctl_dev *mcdev =
+		container_of(uctx->fwctl, struct mlx5ctl_dev, fwctl);
+	struct mlx5ctl_uctx *mfd =
+		container_of(uctx, struct mlx5ctl_uctx, uctx);
+
+	mlx5ctl_release_uid(mcdev, mfd->uctx_uid);
+}
+
+static void *mlx5ctl_info(struct fwctl_uctx *uctx, size_t *length)
+{
+	struct mlx5ctl_uctx *mfd =
+		container_of(uctx, struct mlx5ctl_uctx, uctx);
+	struct fwctl_info_mlx5 *info;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->uid = mfd->uctx_uid;
+	info->uctx_caps = mfd->uctx_caps;
+	*length = sizeof(*info);
+	return info;
+}
+
+static bool mlx5ctl_validate_rpc(const void *in, enum fwctl_rpc_scope scope)
+{
+	u16 opcode = MLX5_GET(mbox_in_hdr, in, opcode);
+	u16 op_mod = MLX5_GET(mbox_in_hdr, in, op_mod);
+
+	/*
+	 * Currently the driver can't keep track of commands that allocate
+	 * objects in the FW, these commands are safe from a security
+	 * perspective but nothing will free the memory when the FD is closed.
+	 * For now permit only query commands and set commands that don't alter
+	 * objects. Also the caps for the scope have not been defined yet,
+	 * filter commands manually for now.
+	 */
+	switch (opcode) {
+	case MLX5_CMD_OP_POSTPONE_CONNECTED_QP_TIMEOUT:
+	case MLX5_CMD_OP_QUERY_ADAPTER:
+	case MLX5_CMD_OP_QUERY_ESW_FUNCTIONS:
+	case MLX5_CMD_OP_QUERY_HCA_CAP:
+	case MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT:
+	case MLX5_CMD_OP_QUERY_OTHER_HCA_CAP:
+	case MLX5_CMD_OP_QUERY_ROCE_ADDRESS:
+	case MLX5_CMD_OPCODE_QUERY_VUID:
+	/*
+	 * FW limits SET_HCA_CAP on the tools UID to only the other function
+	 * mode which is used for function pre-configuration
+	 */
+	case MLX5_CMD_OP_SET_HCA_CAP:
+		return true; /* scope >= FWCTL_RPC_CONFIGURATION; */
+
+	case MLX5_CMD_OP_FPGA_QUERY_QP_COUNTERS:
+	case MLX5_CMD_OP_FPGA_QUERY_QP:
+	case MLX5_CMD_OP_NOP:
+	case MLX5_CMD_OP_QUERY_AFU:
+	case MLX5_CMD_OP_QUERY_BURST_SIZE:
+	case MLX5_CMD_OP_QUERY_CAPI_PEC:
+	case MLX5_CMD_OP_QUERY_CONG_PARAMS:
+	case MLX5_CMD_OP_QUERY_CONG_STATISTICS:
+	case MLX5_CMD_OP_QUERY_CONG_STATUS:
+	case MLX5_CMD_OP_QUERY_CQ:
+	case MLX5_CMD_OP_QUERY_CRYPTO_STATE:
+	case MLX5_CMD_OP_QUERY_DC_CNAK_TRACE:
+	case MLX5_CMD_OP_QUERY_DCT:
+	case MLX5_CMD_OP_QUERY_DELAY_DROP_PARAMS:
+	case MLX5_CMD_OP_QUERY_DIAGNOSTIC_COUNTERS:
+	case MLX5_CMD_OP_QUERY_DIAGNOSTIC_PARAMS:
+	case MLX5_CMD_OP_QUERY_DPA_PARTITION:
+	case MLX5_CMD_OP_QUERY_DPA_PARTITIONS:
+	case MLX5_CMD_OP_QUERY_DRIVER_VERSION:
+	case MLX5_CMD_OP_QUERY_EMULATED_FUNCTIONS_INFO:
+	case MLX5_CMD_OP_QUERY_EMULATED_RESOURCES_INFO:
+	case MLX5_CMD_OP_QUERY_EMULATION_DEVICE_EQ_MSIX_MAPPING:
+	case MLX5_CMD_OP_QUERY_EQ:
+	case MLX5_CMD_OP_QUERY_ESW_VPORT_CONTEXT:
+	case MLX5_CMD_OP_QUERY_FLOW_COUNTER:
+	case MLX5_CMD_OP_QUERY_FLOW_GROUP:
+	case MLX5_CMD_OP_QUERY_FLOW_TABLE_ENTRY:
+	case MLX5_CMD_OP_QUERY_FLOW_TABLE:
+	case MLX5_CMD_OP_QUERY_GENERAL_OBJECT:
+	case MLX5_CMD_OP_QUERY_HCA_VPORT_GID:
+	case MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY:
+	case MLX5_CMD_OP_QUERY_ISSI:
+	case MLX5_CMD_OP_QUERY_L2_TABLE_ENTRY:
+	case MLX5_CMD_OP_QUERY_LAG:
+	case MLX5_CMD_OP_QUERY_MAD_DEMUX:
+	case MLX5_CMD_OP_QUERY_MATCH_SAMPLE_INFO:
+	case MLX5_CMD_OP_QUERY_MKEY:
+	case MLX5_CMD_OP_QUERY_MODIFY_HEADER_CONTEXT:
+	case MLX5_CMD_OP_QUERY_MTT:
+	case MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT:
+	case MLX5_CMD_OP_QUERY_NVMF_BACKEND_CONTROLLER:
+	case MLX5_CMD_OP_QUERY_NVMF_CC_RESPONSE:
+	case MLX5_CMD_OP_QUERY_NVMF_NAMESPACE_CONTEXT:
+	case MLX5_CMD_OP_QUERY_PACKET_REFORMAT_CONTEXT:
+	case MLX5_CMD_OP_QUERY_PAGES:
+	case MLX5_CMD_OP_QUERY_PSV:
+	case MLX5_CMD_OP_QUERY_Q_COUNTER:
+	case MLX5_CMD_OP_QUERY_QP:
+	case MLX5_CMD_OP_QUERY_RATE_LIMIT:
+	case MLX5_CMD_OP_QUERY_RDB:
+	case MLX5_CMD_OP_QUERY_REGEXP_PARAMS:
+	case MLX5_CMD_OP_QUERY_REGEXP_REGISTER:
+	case MLX5_CMD_OP_QUERY_RMP:
+	case MLX5_CMD_OP_QUERY_RQ:
+	case MLX5_CMD_OP_QUERY_RQT:
+	case MLX5_CMD_OP_QUERY_RSV_RESOURCES:
+	case MLX5_CMD_OP_QUERY_SCHED_QUEUE:
+	case MLX5_CMD_OP_QUERY_SCHEDULING_ELEMENT:
+	case MLX5_CMD_OP_QUERY_SF_PARTITION:
+	case MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS:
+	case MLX5_CMD_OP_QUERY_SQ:
+	case MLX5_CMD_OP_QUERY_SRQ:
+	case MLX5_CMD_OP_QUERY_TIR:
+	case MLX5_CMD_OP_QUERY_TIS:
+	case MLX5_CMD_OP_QUERY_UCTX:
+	case MLX5_CMD_OP_QUERY_UMEM:
+	case MLX5_CMD_OP_QUERY_VHCA_MIGRATION_STATE:
+	case MLX5_CMD_OP_QUERY_VHCA_STATE:
+	case MLX5_CMD_OP_QUERY_VNIC_ENV:
+	case MLX5_CMD_OP_QUERY_VPORT_COUNTER:
+	case MLX5_CMD_OP_QUERY_VPORT_STATE:
+	case MLX5_CMD_OP_QUERY_WOL_ROL:
+	case MLX5_CMD_OP_QUERY_XRC_SRQ:
+	case MLX5_CMD_OP_QUERY_XRQ_DC_PARAMS_ENTRY:
+	case MLX5_CMD_OP_QUERY_XRQ_ERROR_PARAMS:
+	case MLX5_CMD_OP_QUERY_XRQ:
+	case MLX5_CMD_OP_USER_QUERY_XRQ_DC_PARAMS_ENTRY:
+	case MLX5_CMD_OP_USER_QUERY_XRQ_ERROR_PARAMS:
+		return scope >= FWCTL_RPC_DEBUG_READ_ONLY;
+
+	case MLX5_CMD_OP_SET_DIAGNOSTIC_PARAMS:
+		return scope >= FWCTL_RPC_DEBUG_WRITE;
+
+	case MLX5_CMD_OP_ACCESS_REG:
+	case MLX5_CMD_OP_ACCESS_REGISTER_USER:
+		if (op_mod == 0) /* write */
+			return true; /* scope >= FWCTL_RPC_CONFIGURATION; */
+		return scope >= FWCTL_RPC_DEBUG_READ_ONLY;
+	default:
+		return false;
+	}
+}
+
+static void *mlx5ctl_fw_rpc(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope,
+			    void *rpc_in, size_t in_len, size_t *out_len)
+{
+	struct mlx5ctl_dev *mcdev =
+		container_of(uctx->fwctl, struct mlx5ctl_dev, fwctl);
+	struct mlx5ctl_uctx *mfd =
+		container_of(uctx, struct mlx5ctl_uctx, uctx);
+	void *rpc_out;
+	int ret;
+
+	if (in_len < MLX5_ST_SZ_BYTES(mbox_in_hdr) ||
+	    *out_len < MLX5_ST_SZ_BYTES(mbox_out_hdr))
+		return ERR_PTR(-EMSGSIZE);
+
+	mlx5ctl_dbg(mcdev, "[UID %d] cmdif: opcode 0x%x inlen %zu outlen %zu\n",
+		    mfd->uctx_uid, MLX5_GET(mbox_in_hdr, rpc_in, opcode),
+		    in_len, *out_len);
+
+	if (!mlx5ctl_validate_rpc(rpc_in, scope))
+		return ERR_PTR(-EBADMSG);
+
+	/*
+	 * mlx5_cmd_do() copies the input message to its own buffer before
+	 * executing it, so we can reuse the allocation for the output.
+	 */
+	if (*out_len <= in_len) {
+		rpc_out = rpc_in;
+	} else {
+		rpc_out = kvzalloc(*out_len, GFP_KERNEL);
+		if (!rpc_out)
+			return ERR_PTR(-ENOMEM);
+	}
+
+	/* Enforce the user context for the command */
+	MLX5_SET(mbox_in_hdr, rpc_in, uid, mfd->uctx_uid);
+	ret = mlx5_cmd_do(mcdev->mdev, rpc_in, in_len, rpc_out, *out_len);
+
+	mlx5ctl_dbg(mcdev,
+		    "[UID %d] cmdif: opcode 0x%x status 0x%x retval %pe\n",
+		    mfd->uctx_uid, MLX5_GET(mbox_in_hdr, rpc_in, opcode),
+		    MLX5_GET(mbox_out_hdr, rpc_out, status), ERR_PTR(ret));
+
+	/*
+	 * -EREMOTEIO means execution succeeded and the out is valid,
+	 * but an error code was returned inside out. Everything else
+	 * means the RPC did not make it to the device.
+	 */
+	if (ret && ret != -EREMOTEIO) {
+		if (rpc_out != rpc_in)
+			kfree(rpc_out);
+		return ERR_PTR(ret);
+	}
+	return rpc_out;
+}
+
+static const struct fwctl_ops mlx5ctl_ops = {
+	.device_type = FWCTL_DEVICE_TYPE_MLX5,
+	.uctx_size = sizeof(struct mlx5ctl_uctx),
+	.open_uctx = mlx5ctl_open_uctx,
+	.close_uctx = mlx5ctl_close_uctx,
+	.info = mlx5ctl_info,
+	.fw_rpc = mlx5ctl_fw_rpc,
+};
+
+static int mlx5ctl_probe(struct auxiliary_device *adev,
+			 const struct auxiliary_device_id *id)
+
+{
+	struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
+	struct mlx5_core_dev *mdev = madev->mdev;
+	struct mlx5ctl_dev *mcdev __free(mlx5ctl) = fwctl_alloc_device(
+		&mdev->pdev->dev, &mlx5ctl_ops, struct mlx5ctl_dev, fwctl);
+	int ret;
+
+	if (!mcdev)
+		return -ENOMEM;
+
+	mcdev->mdev = mdev;
+
+	ret = fwctl_register(&mcdev->fwctl);
+	if (ret)
+		return ret;
+	auxiliary_set_drvdata(adev, no_free_ptr(mcdev));
+	return 0;
+}
+
+static void mlx5ctl_remove(struct auxiliary_device *adev)
+{
+	struct mlx5ctl_dev *mcdev = auxiliary_get_drvdata(adev);
+
+	fwctl_unregister(&mcdev->fwctl);
+	fwctl_put(&mcdev->fwctl);
+}
+
+static const struct auxiliary_device_id mlx5ctl_id_table[] = {
+	{.name = MLX5_ADEV_NAME ".fwctl",},
+	{}
+};
+MODULE_DEVICE_TABLE(auxiliary, mlx5ctl_id_table);
+
+static struct auxiliary_driver mlx5ctl_driver = {
+	.name = "mlx5_fwctl",
+	.probe = mlx5ctl_probe,
+	.remove = mlx5ctl_remove,
+	.id_table = mlx5ctl_id_table,
+};
+
+module_auxiliary_driver(mlx5ctl_driver);
+
+MODULE_IMPORT_NS("FWCTL");
+MODULE_DESCRIPTION("mlx5 ConnectX fwctl driver");
+MODULE_AUTHOR("Saeed Mahameed <saeedm@nvidia.com>");
+MODULE_LICENSE("Dual BSD/GPL");
+4
drivers/fwctl/pds/Makefile
···
1 + # SPDX-License-Identifier: GPL-2.0
2 + obj-$(CONFIG_FWCTL_PDS) += pds_fwctl.o
3 +
4 + pds_fwctl-y += main.o

+536
drivers/fwctl/pds/main.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright(c) Advanced Micro Devices, Inc */ 3 + 4 + #include <linux/module.h> 5 + #include <linux/auxiliary_bus.h> 6 + #include <linux/pci.h> 7 + #include <linux/vmalloc.h> 8 + #include <linux/bitfield.h> 9 + 10 + #include <uapi/fwctl/fwctl.h> 11 + #include <uapi/fwctl/pds.h> 12 + #include <linux/fwctl.h> 13 + 14 + #include <linux/pds/pds_common.h> 15 + #include <linux/pds/pds_core_if.h> 16 + #include <linux/pds/pds_adminq.h> 17 + #include <linux/pds/pds_auxbus.h> 18 + 19 + struct pdsfc_uctx { 20 + struct fwctl_uctx uctx; 21 + u32 uctx_caps; 22 + }; 23 + 24 + struct pdsfc_rpc_endpoint_info { 25 + u32 endpoint; 26 + dma_addr_t operations_pa; 27 + struct pds_fwctl_query_data *operations; 28 + struct mutex lock; /* lock for endpoint info management */ 29 + }; 30 + 31 + struct pdsfc_dev { 32 + struct fwctl_device fwctl; 33 + struct pds_auxiliary_dev *padev; 34 + u32 caps; 35 + struct pds_fwctl_ident ident; 36 + dma_addr_t endpoints_pa; 37 + struct pds_fwctl_query_data *endpoints; 38 + struct pdsfc_rpc_endpoint_info *endpoint_info; 39 + }; 40 + 41 + static int pdsfc_open_uctx(struct fwctl_uctx *uctx) 42 + { 43 + struct pdsfc_dev *pdsfc = container_of(uctx->fwctl, struct pdsfc_dev, fwctl); 44 + struct pdsfc_uctx *pdsfc_uctx = container_of(uctx, struct pdsfc_uctx, uctx); 45 + 46 + pdsfc_uctx->uctx_caps = pdsfc->caps; 47 + 48 + return 0; 49 + } 50 + 51 + static void pdsfc_close_uctx(struct fwctl_uctx *uctx) 52 + { 53 + } 54 + 55 + static void *pdsfc_info(struct fwctl_uctx *uctx, size_t *length) 56 + { 57 + struct pdsfc_uctx *pdsfc_uctx = container_of(uctx, struct pdsfc_uctx, uctx); 58 + struct fwctl_info_pds *info; 59 + 60 + info = kzalloc(sizeof(*info), GFP_KERNEL); 61 + if (!info) 62 + return ERR_PTR(-ENOMEM); 63 + 64 + info->uctx_caps = pdsfc_uctx->uctx_caps; 65 + 66 + return info; 67 + } 68 + 69 + static int pdsfc_identify(struct pdsfc_dev *pdsfc) 70 + { 71 + struct device *dev = &pdsfc->fwctl.dev; 72 + union 
pds_core_adminq_comp comp = {0}; 73 + union pds_core_adminq_cmd cmd; 74 + struct pds_fwctl_ident *ident; 75 + dma_addr_t ident_pa; 76 + int err; 77 + 78 + ident = dma_alloc_coherent(dev->parent, sizeof(*ident), &ident_pa, GFP_KERNEL); 79 + if (!ident) { 80 + dev_err(dev, "Failed to map ident buffer\n"); 81 + return -ENOMEM; 82 + } 83 + 84 + cmd = (union pds_core_adminq_cmd) { 85 + .fwctl_ident = { 86 + .opcode = PDS_FWCTL_CMD_IDENT, 87 + .version = 0, 88 + .len = cpu_to_le32(sizeof(*ident)), 89 + .ident_pa = cpu_to_le64(ident_pa), 90 + } 91 + }; 92 + 93 + err = pds_client_adminq_cmd(pdsfc->padev, &cmd, sizeof(cmd), &comp, 0); 94 + if (err) 95 + dev_err(dev, "Failed to send adminq cmd opcode: %u err: %d\n", 96 + cmd.fwctl_ident.opcode, err); 97 + else 98 + pdsfc->ident = *ident; 99 + 100 + dma_free_coherent(dev->parent, sizeof(*ident), ident, ident_pa); 101 + 102 + return err; 103 + } 104 + 105 + static void pdsfc_free_endpoints(struct pdsfc_dev *pdsfc) 106 + { 107 + struct device *dev = &pdsfc->fwctl.dev; 108 + int i; 109 + 110 + if (!pdsfc->endpoints) 111 + return; 112 + 113 + for (i = 0; pdsfc->endpoint_info && i < pdsfc->endpoints->num_entries; i++) 114 + mutex_destroy(&pdsfc->endpoint_info[i].lock); 115 + vfree(pdsfc->endpoint_info); 116 + pdsfc->endpoint_info = NULL; 117 + dma_free_coherent(dev->parent, PAGE_SIZE, 118 + pdsfc->endpoints, pdsfc->endpoints_pa); 119 + pdsfc->endpoints = NULL; 120 + pdsfc->endpoints_pa = DMA_MAPPING_ERROR; 121 + } 122 + 123 + static void pdsfc_free_operations(struct pdsfc_dev *pdsfc) 124 + { 125 + struct device *dev = &pdsfc->fwctl.dev; 126 + u32 num_endpoints; 127 + int i; 128 + 129 + num_endpoints = le32_to_cpu(pdsfc->endpoints->num_entries); 130 + for (i = 0; i < num_endpoints; i++) { 131 + struct pdsfc_rpc_endpoint_info *ei = &pdsfc->endpoint_info[i]; 132 + 133 + if (ei->operations) { 134 + dma_free_coherent(dev->parent, PAGE_SIZE, 135 + ei->operations, ei->operations_pa); 136 + ei->operations = NULL; 137 + ei->operations_pa = 
DMA_MAPPING_ERROR; 138 + } 139 + } 140 + } 141 + 142 + static struct pds_fwctl_query_data *pdsfc_get_endpoints(struct pdsfc_dev *pdsfc, 143 + dma_addr_t *pa) 144 + { 145 + struct device *dev = &pdsfc->fwctl.dev; 146 + union pds_core_adminq_comp comp = {0}; 147 + struct pds_fwctl_query_data *data; 148 + union pds_core_adminq_cmd cmd; 149 + dma_addr_t data_pa; 150 + int err; 151 + 152 + data = dma_alloc_coherent(dev->parent, PAGE_SIZE, &data_pa, GFP_KERNEL); 153 + if (!data) { 154 + dev_err(dev, "Failed to map endpoint list\n"); 155 + return ERR_PTR(-ENOMEM); 156 + } 157 + 158 + cmd = (union pds_core_adminq_cmd) { 159 + .fwctl_query = { 160 + .opcode = PDS_FWCTL_CMD_QUERY, 161 + .entity = PDS_FWCTL_RPC_ROOT, 162 + .version = 0, 163 + .query_data_buf_len = cpu_to_le32(PAGE_SIZE), 164 + .query_data_buf_pa = cpu_to_le64(data_pa), 165 + } 166 + }; 167 + 168 + err = pds_client_adminq_cmd(pdsfc->padev, &cmd, sizeof(cmd), &comp, 0); 169 + if (err) { 170 + dev_err(dev, "Failed to send adminq cmd opcode: %u entity: %u err: %d\n", 171 + cmd.fwctl_query.opcode, cmd.fwctl_query.entity, err); 172 + dma_free_coherent(dev->parent, PAGE_SIZE, data, data_pa); 173 + return ERR_PTR(err); 174 + } 175 + 176 + *pa = data_pa; 177 + 178 + return data; 179 + } 180 + 181 + static int pdsfc_init_endpoints(struct pdsfc_dev *pdsfc) 182 + { 183 + struct pds_fwctl_query_data_endpoint *ep_entry; 184 + u32 num_endpoints; 185 + int i; 186 + 187 + pdsfc->endpoints = pdsfc_get_endpoints(pdsfc, &pdsfc->endpoints_pa); 188 + if (IS_ERR(pdsfc->endpoints)) 189 + return PTR_ERR(pdsfc->endpoints); 190 + 191 + num_endpoints = le32_to_cpu(pdsfc->endpoints->num_entries); 192 + pdsfc->endpoint_info = vcalloc(num_endpoints, 193 + sizeof(*pdsfc->endpoint_info)); 194 + if (!pdsfc->endpoint_info) { 195 + pdsfc_free_endpoints(pdsfc); 196 + return -ENOMEM; 197 + } 198 + 199 + ep_entry = (struct pds_fwctl_query_data_endpoint *)pdsfc->endpoints->entries; 200 + for (i = 0; i < num_endpoints; i++) { 201 + 
mutex_init(&pdsfc->endpoint_info[i].lock); 202 + pdsfc->endpoint_info[i].endpoint = ep_entry[i].id; 203 + } 204 + 205 + return 0; 206 + } 207 + 208 + static struct pds_fwctl_query_data *pdsfc_get_operations(struct pdsfc_dev *pdsfc, 209 + dma_addr_t *pa, u32 ep) 210 + { 211 + struct pds_fwctl_query_data_operation *entries; 212 + struct device *dev = &pdsfc->fwctl.dev; 213 + union pds_core_adminq_comp comp = {0}; 214 + struct pds_fwctl_query_data *data; 215 + union pds_core_adminq_cmd cmd; 216 + dma_addr_t data_pa; 217 + int err; 218 + int i; 219 + 220 + /* Query the operations list for the given endpoint */ 221 + data = dma_alloc_coherent(dev->parent, PAGE_SIZE, &data_pa, GFP_KERNEL); 222 + if (!data) { 223 + dev_err(dev, "Failed to map operations list\n"); 224 + return ERR_PTR(-ENOMEM); 225 + } 226 + 227 + cmd = (union pds_core_adminq_cmd) { 228 + .fwctl_query = { 229 + .opcode = PDS_FWCTL_CMD_QUERY, 230 + .entity = PDS_FWCTL_RPC_ENDPOINT, 231 + .version = 0, 232 + .query_data_buf_len = cpu_to_le32(PAGE_SIZE), 233 + .query_data_buf_pa = cpu_to_le64(data_pa), 234 + .ep = cpu_to_le32(ep), 235 + } 236 + }; 237 + 238 + err = pds_client_adminq_cmd(pdsfc->padev, &cmd, sizeof(cmd), &comp, 0); 239 + if (err) { 240 + dev_err(dev, "Failed to send adminq cmd opcode: %u entity: %u err: %d\n", 241 + cmd.fwctl_query.opcode, cmd.fwctl_query.entity, err); 242 + dma_free_coherent(dev->parent, PAGE_SIZE, data, data_pa); 243 + return ERR_PTR(err); 244 + } 245 + 246 + *pa = data_pa; 247 + 248 + entries = (struct pds_fwctl_query_data_operation *)data->entries; 249 + dev_dbg(dev, "num_entries %d\n", data->num_entries); 250 + for (i = 0; i < data->num_entries; i++) { 251 + 252 + /* Translate FW command attribute to fwctl scope */ 253 + switch (entries[i].scope) { 254 + case PDSFC_FW_CMD_ATTR_READ: 255 + case PDSFC_FW_CMD_ATTR_WRITE: 256 + case PDSFC_FW_CMD_ATTR_SYNC: 257 + entries[i].scope = FWCTL_RPC_CONFIGURATION; 258 + break; 259 + case PDSFC_FW_CMD_ATTR_DEBUG_READ: 260 + 
entries[i].scope = FWCTL_RPC_DEBUG_READ_ONLY; 261 + break; 262 + case PDSFC_FW_CMD_ATTR_DEBUG_WRITE: 263 + entries[i].scope = FWCTL_RPC_DEBUG_WRITE; 264 + break; 265 + default: 266 + entries[i].scope = FWCTL_RPC_DEBUG_WRITE_FULL; 267 + break; 268 + } 269 + dev_dbg(dev, "endpoint %d operation: id %x scope %d\n", 270 + ep, entries[i].id, entries[i].scope); 271 + } 272 + 273 + return data; 274 + } 275 + 276 + static int pdsfc_validate_rpc(struct pdsfc_dev *pdsfc, 277 + struct fwctl_rpc_pds *rpc, 278 + enum fwctl_rpc_scope scope) 279 + { 280 + struct pds_fwctl_query_data_operation *op_entry; 281 + struct pdsfc_rpc_endpoint_info *ep_info = NULL; 282 + struct device *dev = &pdsfc->fwctl.dev; 283 + int i; 284 + 285 + /* validate rpc in_len & out_len based 286 + * on ident.max_req_sz & max_resp_sz 287 + */ 288 + if (rpc->in.len > pdsfc->ident.max_req_sz) { 289 + dev_dbg(dev, "Invalid request size %u, max %u\n", 290 + rpc->in.len, pdsfc->ident.max_req_sz); 291 + return -EINVAL; 292 + } 293 + 294 + if (rpc->out.len > pdsfc->ident.max_resp_sz) { 295 + dev_dbg(dev, "Invalid response size %u, max %u\n", 296 + rpc->out.len, pdsfc->ident.max_resp_sz); 297 + return -EINVAL; 298 + } 299 + 300 + for (i = 0; i < pdsfc->endpoints->num_entries; i++) { 301 + if (pdsfc->endpoint_info[i].endpoint == rpc->in.ep) { 302 + ep_info = &pdsfc->endpoint_info[i]; 303 + break; 304 + } 305 + } 306 + if (!ep_info) { 307 + dev_dbg(dev, "Invalid endpoint %d\n", rpc->in.ep); 308 + return -EINVAL; 309 + } 310 + 311 + /* query and cache this endpoint's operations */ 312 + mutex_lock(&ep_info->lock); 313 + if (!ep_info->operations) { 314 + struct pds_fwctl_query_data *operations; 315 + 316 + operations = pdsfc_get_operations(pdsfc, 317 + &ep_info->operations_pa, 318 + rpc->in.ep); 319 + if (IS_ERR(operations)) { 320 + mutex_unlock(&ep_info->lock); 321 + return -ENOMEM; 322 + } 323 + ep_info->operations = operations; 324 + } 325 + mutex_unlock(&ep_info->lock); 326 + 327 + /* reject unsupported and/or out of 
scope commands */ 328 + op_entry = (struct pds_fwctl_query_data_operation *)ep_info->operations->entries; 329 + for (i = 0; i < ep_info->operations->num_entries; i++) { 330 + if (PDS_FWCTL_RPC_OPCODE_CMP(rpc->in.op, op_entry[i].id)) { 331 + if (scope < op_entry[i].scope) 332 + return -EPERM; 333 + return 0; 334 + } 335 + } 336 + 337 + dev_dbg(dev, "Invalid operation %d for endpoint %d\n", rpc->in.op, rpc->in.ep); 338 + 339 + return -EINVAL; 340 + } 341 + 342 + static void *pdsfc_fw_rpc(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope, 343 + void *in, size_t in_len, size_t *out_len) 344 + { 345 + struct pdsfc_dev *pdsfc = container_of(uctx->fwctl, struct pdsfc_dev, fwctl); 346 + struct device *dev = &uctx->fwctl->dev; 347 + union pds_core_adminq_comp comp = {0}; 348 + dma_addr_t out_payload_dma_addr = 0; 349 + dma_addr_t in_payload_dma_addr = 0; 350 + struct fwctl_rpc_pds *rpc = in; 351 + union pds_core_adminq_cmd cmd; 352 + void *out_payload = NULL; 353 + void *in_payload = NULL; 354 + void *out = NULL; 355 + int err; 356 + 357 + err = pdsfc_validate_rpc(pdsfc, rpc, scope); 358 + if (err) 359 + return ERR_PTR(err); 360 + 361 + if (rpc->in.len > 0) { 362 + in_payload = kzalloc(rpc->in.len, GFP_KERNEL); 363 + if (!in_payload) { 364 + dev_err(dev, "Failed to allocate in_payload\n"); 365 + err = -ENOMEM; 366 + goto err_out; 367 + } 368 + 369 + if (copy_from_user(in_payload, u64_to_user_ptr(rpc->in.payload), 370 + rpc->in.len)) { 371 + dev_dbg(dev, "Failed to copy in_payload from user\n"); 372 + err = -EFAULT; 373 + goto err_in_payload; 374 + } 375 + 376 + in_payload_dma_addr = dma_map_single(dev->parent, in_payload, 377 + rpc->in.len, DMA_TO_DEVICE); 378 + err = dma_mapping_error(dev->parent, in_payload_dma_addr); 379 + if (err) { 380 + dev_dbg(dev, "Failed to map in_payload\n"); 381 + goto err_in_payload; 382 + } 383 + } 384 + 385 + if (rpc->out.len > 0) { 386 + out_payload = kzalloc(rpc->out.len, GFP_KERNEL); 387 + if (!out_payload) { 388 + dev_dbg(dev, "Failed to 
allocate out_payload\n"); 389 + err = -ENOMEM; 390 + goto err_out_payload; 391 + } 392 + 393 + out_payload_dma_addr = dma_map_single(dev->parent, out_payload, 394 + rpc->out.len, DMA_FROM_DEVICE); 395 + err = dma_mapping_error(dev->parent, out_payload_dma_addr); 396 + if (err) { 397 + dev_dbg(dev, "Failed to map out_payload\n"); 398 + goto err_out_payload; 399 + } 400 + } 401 + 402 + cmd = (union pds_core_adminq_cmd) { 403 + .fwctl_rpc = { 404 + .opcode = PDS_FWCTL_CMD_RPC, 405 + .flags = PDS_FWCTL_RPC_IND_REQ | PDS_FWCTL_RPC_IND_RESP, 406 + .ep = cpu_to_le32(rpc->in.ep), 407 + .op = cpu_to_le32(rpc->in.op), 408 + .req_pa = cpu_to_le64(in_payload_dma_addr), 409 + .req_sz = cpu_to_le32(rpc->in.len), 410 + .resp_pa = cpu_to_le64(out_payload_dma_addr), 411 + .resp_sz = cpu_to_le32(rpc->out.len), 412 + } 413 + }; 414 + 415 + err = pds_client_adminq_cmd(pdsfc->padev, &cmd, sizeof(cmd), &comp, 0); 416 + if (err) { 417 + dev_dbg(dev, "%s: ep %d op %x req_pa %llx req_sz %d req_sg %d resp_pa %llx resp_sz %d resp_sg %d err %d\n", 418 + __func__, rpc->in.ep, rpc->in.op, 419 + cmd.fwctl_rpc.req_pa, cmd.fwctl_rpc.req_sz, cmd.fwctl_rpc.req_sg_elems, 420 + cmd.fwctl_rpc.resp_pa, cmd.fwctl_rpc.resp_sz, cmd.fwctl_rpc.resp_sg_elems, 421 + err); 422 + goto done; 423 + } 424 + 425 + dynamic_hex_dump("out ", DUMP_PREFIX_OFFSET, 16, 1, out_payload, rpc->out.len, true); 426 + 427 + if (copy_to_user(u64_to_user_ptr(rpc->out.payload), out_payload, rpc->out.len)) { 428 + dev_dbg(dev, "Failed to copy out_payload to user\n"); 429 + out = ERR_PTR(-EFAULT); 430 + goto done; 431 + } 432 + 433 + rpc->out.retval = le32_to_cpu(comp.fwctl_rpc.err); 434 + *out_len = in_len; 435 + out = in; 436 + 437 + done: 438 + if (out_payload_dma_addr) 439 + dma_unmap_single(dev->parent, out_payload_dma_addr, 440 + rpc->out.len, DMA_FROM_DEVICE); 441 + err_out_payload: 442 + kfree(out_payload); 443 + 444 + if (in_payload_dma_addr) 445 + dma_unmap_single(dev->parent, in_payload_dma_addr, 446 + rpc->in.len, 
DMA_TO_DEVICE); 447 + err_in_payload: 448 + kfree(in_payload); 449 + err_out: 450 + if (err) 451 + return ERR_PTR(err); 452 + 453 + return out; 454 + } 455 + 456 + static const struct fwctl_ops pdsfc_ops = { 457 + .device_type = FWCTL_DEVICE_TYPE_PDS, 458 + .uctx_size = sizeof(struct pdsfc_uctx), 459 + .open_uctx = pdsfc_open_uctx, 460 + .close_uctx = pdsfc_close_uctx, 461 + .info = pdsfc_info, 462 + .fw_rpc = pdsfc_fw_rpc, 463 + }; 464 + 465 + static int pdsfc_probe(struct auxiliary_device *adev, 466 + const struct auxiliary_device_id *id) 467 + { 468 + struct pds_auxiliary_dev *padev = 469 + container_of(adev, struct pds_auxiliary_dev, aux_dev); 470 + struct device *dev = &adev->dev; 471 + struct pdsfc_dev *pdsfc; 472 + int err; 473 + 474 + pdsfc = fwctl_alloc_device(&padev->vf_pdev->dev, &pdsfc_ops, 475 + struct pdsfc_dev, fwctl); 476 + if (!pdsfc) 477 + return dev_err_probe(dev, -ENOMEM, "Failed to allocate fwctl device struct\n"); 478 + pdsfc->padev = padev; 479 + 480 + err = pdsfc_identify(pdsfc); 481 + if (err) { 482 + fwctl_put(&pdsfc->fwctl); 483 + return dev_err_probe(dev, err, "Failed to identify device\n"); 484 + } 485 + 486 + err = pdsfc_init_endpoints(pdsfc); 487 + if (err) { 488 + fwctl_put(&pdsfc->fwctl); 489 + return dev_err_probe(dev, err, "Failed to init endpoints\n"); 490 + } 491 + 492 + pdsfc->caps = PDS_FWCTL_QUERY_CAP | PDS_FWCTL_SEND_CAP; 493 + 494 + err = fwctl_register(&pdsfc->fwctl); 495 + if (err) { 496 + pdsfc_free_endpoints(pdsfc); 497 + fwctl_put(&pdsfc->fwctl); 498 + return dev_err_probe(dev, err, "Failed to register device\n"); 499 + } 500 + 501 + auxiliary_set_drvdata(adev, pdsfc); 502 + 503 + return 0; 504 + } 505 + 506 + static void pdsfc_remove(struct auxiliary_device *adev) 507 + { 508 + struct pdsfc_dev *pdsfc = auxiliary_get_drvdata(adev); 509 + 510 + fwctl_unregister(&pdsfc->fwctl); 511 + pdsfc_free_operations(pdsfc); 512 + pdsfc_free_endpoints(pdsfc); 513 + 514 + fwctl_put(&pdsfc->fwctl); 515 + } 516 + 517 + static const 
struct auxiliary_device_id pdsfc_id_table[] = { 518 + {.name = PDS_CORE_DRV_NAME "." PDS_DEV_TYPE_FWCTL_STR }, 519 + {} 520 + }; 521 + MODULE_DEVICE_TABLE(auxiliary, pdsfc_id_table); 522 + 523 + static struct auxiliary_driver pdsfc_driver = { 524 + .name = "pds_fwctl", 525 + .probe = pdsfc_probe, 526 + .remove = pdsfc_remove, 527 + .id_table = pdsfc_id_table, 528 + }; 529 + 530 + module_auxiliary_driver(pdsfc_driver); 531 + 532 + MODULE_IMPORT_NS("FWCTL"); 533 + MODULE_DESCRIPTION("pds fwctl driver"); 534 + MODULE_AUTHOR("Shannon Nelson <shannon.nelson@amd.com>"); 535 + MODULE_AUTHOR("Brett Creeley <brett.creeley@amd.com>"); 536 + MODULE_LICENSE("GPL");
+19 -25
drivers/net/ethernet/amd/pds_core/auxbus.c
··· 175 175 return padev; 176 176 } 177 177 178 - int pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf) 178 + void pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf, 179 + struct pds_auxiliary_dev **pd_ptr) 179 180 { 180 181 struct pds_auxiliary_dev *padev; 181 - int err = 0; 182 182 183 - if (!cf) 184 - return -ENODEV; 183 + if (!*pd_ptr) 184 + return; 185 185 186 186 mutex_lock(&pf->config_lock); 187 187 188 - padev = pf->vfs[cf->vf_id].padev; 189 - if (padev) { 190 - pds_client_unregister(pf, padev->client_id); 191 - auxiliary_device_delete(&padev->aux_dev); 192 - auxiliary_device_uninit(&padev->aux_dev); 193 - padev->client_id = 0; 194 - } 195 - pf->vfs[cf->vf_id].padev = NULL; 188 + padev = *pd_ptr; 189 + pds_client_unregister(pf, padev->client_id); 190 + auxiliary_device_delete(&padev->aux_dev); 191 + auxiliary_device_uninit(&padev->aux_dev); 192 + padev->client_id = 0; 193 + *pd_ptr = NULL; 196 194 197 195 mutex_unlock(&pf->config_lock); 198 - return err; 199 196 } 200 197 201 - int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf) 198 + int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf, 199 + enum pds_core_vif_types vt, 200 + struct pds_auxiliary_dev **pd_ptr) 202 201 { 203 202 struct pds_auxiliary_dev *padev; 204 203 char devname[PDS_DEVNAME_LEN]; 205 - enum pds_core_vif_types vt; 206 204 unsigned long mask; 207 205 u16 vt_support; 208 206 int client_id; ··· 208 210 209 211 if (!cf) 210 212 return -ENODEV; 213 + 214 + if (vt >= PDS_DEV_TYPE_MAX) 215 + return -EINVAL; 211 216 212 217 mutex_lock(&pf->config_lock); 213 218 ··· 223 222 goto out_unlock; 224 223 } 225 224 226 - /* We only support vDPA so far, so it is the only one to 227 - * be verified that it is available in the Core device and 228 - * enabled in the devlink param. In the future this might 229 - * become a loop for several VIF types. 230 - */ 231 - 232 225 /* Verify that the type is supported and enabled. 
It is not 233 - * an error if there is no auxbus device support for this 234 - * VF, it just means something else needs to happen with it. 226 + * an error if the firmware doesn't support the feature, the 227 + * driver just won't set up an auxiliary_device for it. 235 228 */ 236 - vt = PDS_DEV_TYPE_VDPA; 237 229 vt_support = !!le16_to_cpu(pf->dev_ident.vif_types[vt]); 238 230 if (!(vt_support && 239 231 pf->viftype_status[vt].supported && ··· 252 258 err = PTR_ERR(padev); 253 259 goto out_unlock; 254 260 } 255 - pf->vfs[cf->vf_id].padev = padev; 261 + *pd_ptr = padev; 256 262 257 263 out_unlock: 258 264 mutex_unlock(&pf->config_lock);
+7
drivers/net/ethernet/amd/pds_core/core.c
···
402 402 }
403 403
404 404 static struct pdsc_viftype pdsc_viftype_defaults[] = {
405 + [PDS_DEV_TYPE_FWCTL] = { .name = PDS_DEV_TYPE_FWCTL_STR,
406 + .vif_id = PDS_DEV_TYPE_FWCTL,
407 + .dl_id = -1 },
405 408 [PDS_DEV_TYPE_VDPA] = { .name = PDS_DEV_TYPE_VDPA_STR,
406 409 .vif_id = PDS_DEV_TYPE_VDPA,
407 410 .dl_id = DEVLINK_PARAM_GENERIC_ID_ENABLE_VNET },
···
431 428
432 429 /* See what the Core device has for support */
433 430 vt_support = !!le16_to_cpu(pdsc->dev_ident.vif_types[vt]);
431 +
432 + if (vt == PDS_DEV_TYPE_FWCTL)
433 + pdsc->viftype_status[vt].enabled = true;
434 +
434 435 dev_dbg(pdsc->dev, "VIF %s is %ssupported\n",
435 436 pdsc->viftype_status[vt].name,
436 437 vt_support ? "" : "not ");
+6 -2
drivers/net/ethernet/amd/pds_core/core.h
···
156 156 struct dentry *dentry;
157 157 struct device *dev;
158 158 struct pdsc_dev_bar bars[PDS_CORE_BARS_MAX];
159 + struct pds_auxiliary_dev *padev;
159 160 struct pdsc_vf *vfs;
160 161 int num_vfs;
161 162 int vf_id;
···
304 303 int pdsc_register_notify(struct notifier_block *nb);
305 304 void pdsc_unregister_notify(struct notifier_block *nb);
306 305 void pdsc_notify(unsigned long event, void *data);
307 - int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf);
308 - int pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf);
306 + int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf,
307 + enum pds_core_vif_types vt,
308 + struct pds_auxiliary_dev **pd_ptr);
309 + void pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf,
310 + struct pds_auxiliary_dev **pd_ptr);
309 311
310 312 void pdsc_process_adminq(struct pdsc_qcq *qcq);
311 313 void pdsc_work_thread(struct work_struct *work);
+5 -2
drivers/net/ethernet/amd/pds_core/devlink.c
···
56 56 for (vf_id = 0; vf_id < pdsc->num_vfs; vf_id++) {
57 57 struct pdsc *vf = pdsc->vfs[vf_id].vf;
58 58
59 - err = ctx->val.vbool ? pdsc_auxbus_dev_add(vf, pdsc) :
60 - pdsc_auxbus_dev_del(vf, pdsc);
59 + if (ctx->val.vbool)
60 + err = pdsc_auxbus_dev_add(vf, pdsc, vt_entry->vif_id,
61 + &pdsc->vfs[vf_id].padev);
62 + else
63 + pdsc_auxbus_dev_del(vf, pdsc, &pdsc->vfs[vf_id].padev);
61 64 }
62 65
63 66 return err;
+20 -5
drivers/net/ethernet/amd/pds_core/main.c
··· 190 190 devl_unlock(dl); 191 191 192 192 pf->vfs[vf->vf_id].vf = vf; 193 - err = pdsc_auxbus_dev_add(vf, pf); 193 + err = pdsc_auxbus_dev_add(vf, pf, PDS_DEV_TYPE_VDPA, 194 + &pf->vfs[vf->vf_id].padev); 194 195 if (err) { 195 196 devl_lock(dl); 196 197 devl_unregister(dl); ··· 265 264 266 265 mutex_unlock(&pdsc->config_lock); 267 266 267 + err = pdsc_auxbus_dev_add(pdsc, pdsc, PDS_DEV_TYPE_FWCTL, &pdsc->padev); 268 + if (err) 269 + goto err_out_stop; 270 + 268 271 dl = priv_to_devlink(pdsc); 269 272 devl_lock(dl); 270 273 err = devl_params_register(dl, pdsc_dl_params, ··· 277 272 devl_unlock(dl); 278 273 dev_warn(pdsc->dev, "Failed to register devlink params: %pe\n", 279 274 ERR_PTR(err)); 280 - goto err_out_stop; 275 + goto err_out_del_dev; 281 276 } 282 277 283 278 hr = devl_health_reporter_create(dl, &pdsc_fw_reporter_ops, 0, pdsc); ··· 300 295 err_out_unreg_params: 301 296 devlink_params_unregister(dl, pdsc_dl_params, 302 297 ARRAY_SIZE(pdsc_dl_params)); 298 + err_out_del_dev: 299 + pdsc_auxbus_dev_del(pdsc, pdsc, &pdsc->padev); 303 300 err_out_stop: 304 301 pdsc_stop(pdsc); 305 302 err_out_teardown: ··· 424 417 425 418 pf = pdsc_get_pf_struct(pdsc->pdev); 426 419 if (!IS_ERR(pf)) { 427 - pdsc_auxbus_dev_del(pdsc, pf); 420 + pdsc_auxbus_dev_del(pdsc, pf, &pf->vfs[pdsc->vf_id].padev); 428 421 pf->vfs[pdsc->vf_id].vf = NULL; 429 422 } 430 423 } else { ··· 433 426 * shut themselves down. 
434 427 */ 435 428 pdsc_sriov_configure(pdev, 0); 429 + pdsc_auxbus_dev_del(pdsc, pdsc, &pdsc->padev); 436 430 437 431 timer_shutdown_sync(&pdsc->wdtimer); 438 432 if (pdsc->wq) ··· 490 482 491 483 pf = pdsc_get_pf_struct(pdsc->pdev); 492 484 if (!IS_ERR(pf)) 493 - pdsc_auxbus_dev_del(pdsc, pf); 485 + pdsc_auxbus_dev_del(pdsc, pf, 486 + &pf->vfs[pdsc->vf_id].padev); 487 + } else { 488 + pdsc_auxbus_dev_del(pdsc, pdsc, &pdsc->padev); 494 489 } 495 490 496 491 pdsc_unmap_bars(pdsc); ··· 538 527 539 528 pf = pdsc_get_pf_struct(pdsc->pdev); 540 529 if (!IS_ERR(pf)) 541 - pdsc_auxbus_dev_add(pdsc, pf); 530 + pdsc_auxbus_dev_add(pdsc, pf, PDS_DEV_TYPE_VDPA, 531 + &pf->vfs[pdsc->vf_id].padev); 532 + } else { 533 + pdsc_auxbus_dev_add(pdsc, pdsc, PDS_DEV_TYPE_FWCTL, 534 + &pdsc->padev); 542 535 } 543 536 } 544 537
+9
drivers/net/ethernet/mellanox/mlx5/core/dev.c
···
228 228 MLX5_INTERFACE_PROTOCOL_VNET,
229 229
230 230 MLX5_INTERFACE_PROTOCOL_DPLL,
231 + MLX5_INTERFACE_PROTOCOL_FWCTL,
231 232 };
233 +
234 + static bool is_fwctl_supported(struct mlx5_core_dev *dev)
235 + {
236 + /* fwctl is most useful on PFs, prevent fwctl on SFs for now */
237 + return MLX5_CAP_GEN(dev, uctx_cap) && !mlx5_core_is_sf(dev);
238 + }
232 239
233 240 static const struct mlx5_adev_device {
234 241 const char *suffix;
···
259 252 .is_supported = &is_mp_supported },
260 253 [MLX5_INTERFACE_PROTOCOL_DPLL] = { .suffix = "dpll",
261 254 .is_supported = &is_dpll_supported },
255 + [MLX5_INTERFACE_PROTOCOL_FWCTL] = { .suffix = "fwctl",
256 + .is_supported = &is_fwctl_supported },
262 257 };
263 258
264 259 int mlx5_adev_idx_alloc(void)
+87
include/cxl/features.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only */ 2 + /* Copyright(c) 2024-2025 Intel Corporation. */ 3 + #ifndef __CXL_FEATURES_H__ 4 + #define __CXL_FEATURES_H__ 5 + 6 + #include <linux/uuid.h> 7 + #include <linux/fwctl.h> 8 + #include <uapi/cxl/features.h> 9 + 10 + /* Feature UUIDs used by the kernel */ 11 + #define CXL_FEAT_PATROL_SCRUB_UUID \ 12 + UUID_INIT(0x96dad7d6, 0xfde8, 0x482b, 0xa7, 0x33, 0x75, 0x77, 0x4e, \ 13 + 0x06, 0xdb, 0x8a) 14 + 15 + #define CXL_FEAT_ECS_UUID \ 16 + UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e, \ 17 + 0x89, 0x33, 0x86) 18 + 19 + #define CXL_FEAT_SPPR_UUID \ 20 + UUID_INIT(0x892ba475, 0xfad8, 0x474e, 0x9d, 0x3e, 0x69, 0x2c, 0x91, \ 21 + 0x75, 0x68, 0xbb) 22 + 23 + #define CXL_FEAT_HPPR_UUID \ 24 + UUID_INIT(0x80ea4521, 0x786f, 0x4127, 0xaf, 0xb1, 0xec, 0x74, 0x59, \ 25 + 0xfb, 0x0e, 0x24) 26 + 27 + #define CXL_FEAT_CACHELINE_SPARING_UUID \ 28 + UUID_INIT(0x96C33386, 0x91dd, 0x44c7, 0x9e, 0xcb, 0xfd, 0xaf, 0x65, \ 29 + 0x03, 0xba, 0xc4) 30 + 31 + #define CXL_FEAT_ROW_SPARING_UUID \ 32 + UUID_INIT(0x450ebf67, 0xb135, 0x4f97, 0xa4, 0x98, 0xc2, 0xd5, 0x7f, \ 33 + 0x27, 0x9b, 0xed) 34 + 35 + #define CXL_FEAT_BANK_SPARING_UUID \ 36 + UUID_INIT(0x78b79636, 0x90ac, 0x4b64, 0xa4, 0xef, 0xfa, 0xac, 0x5d, \ 37 + 0x18, 0xa8, 0x63) 38 + 39 + #define CXL_FEAT_RANK_SPARING_UUID \ 40 + UUID_INIT(0x34dbaff5, 0x0552, 0x4281, 0x8f, 0x76, 0xda, 0x0b, 0x5e, \ 41 + 0x7a, 0x76, 0xa7) 42 + 43 + /* Feature commands capability supported by a device */ 44 + enum cxl_features_capability { 45 + CXL_FEATURES_NONE = 0, 46 + CXL_FEATURES_RO, 47 + CXL_FEATURES_RW, 48 + }; 49 + 50 + /** 51 + * struct cxl_features_state - The Features state for the device 52 + * @cxlds: Pointer to CXL device state 53 + * @entries: CXl feature entry context 54 + */ 55 + struct cxl_features_state { 56 + struct cxl_dev_state *cxlds; 57 + struct cxl_feat_entries { 58 + int num_features; 59 + int num_user_features; 60 + struct cxl_feat_entry ent[] 
__counted_by(num_features); 61 + } *entries; 62 + }; 63 + 64 + struct cxl_mailbox; 65 + struct cxl_memdev; 66 + #ifdef CONFIG_CXL_FEATURES 67 + inline struct cxl_features_state *to_cxlfs(struct cxl_dev_state *cxlds); 68 + int devm_cxl_setup_features(struct cxl_dev_state *cxlds); 69 + int devm_cxl_setup_fwctl(struct cxl_memdev *cxlmd); 70 + #else 71 + static inline struct cxl_features_state *to_cxlfs(struct cxl_dev_state *cxlds) 72 + { 73 + return NULL; 74 + } 75 + 76 + static inline int devm_cxl_setup_features(struct cxl_dev_state *cxlds) 77 + { 78 + return -EOPNOTSUPP; 79 + } 80 + 81 + static inline int devm_cxl_setup_fwctl(struct cxl_memdev *cxlmd) 82 + { 83 + return -EOPNOTSUPP; 84 + } 85 + #endif 86 + 87 + #endif
+43 -1
include/cxl/mailbox.h
··· 3 3 #ifndef __CXL_MBOX_H__ 4 4 #define __CXL_MBOX_H__ 5 5 #include <linux/rcuwait.h> 6 + #include <cxl/features.h> 7 + #include <uapi/linux/cxl_mem.h> 6 8 7 - struct cxl_mbox_cmd; 9 + /** 10 + * struct cxl_mbox_cmd - A command to be submitted to hardware. 11 + * @opcode: (input) The command set and command submitted to hardware. 12 + * @payload_in: (input) Pointer to the input payload. 13 + * @payload_out: (output) Pointer to the output payload. Must be allocated by 14 + * the caller. 15 + * @size_in: (input) Number of bytes to load from @payload_in. 16 + * @size_out: (input) Max number of bytes loaded into @payload_out. 17 + * (output) Number of bytes generated by the device. For fixed size 18 + * outputs commands this is always expected to be deterministic. For 19 + * variable sized output commands, it tells the exact number of bytes 20 + * written. 21 + * @min_out: (input) internal command output payload size validation 22 + * @poll_count: (input) Number of timeouts to attempt. 23 + * @poll_interval_ms: (input) Time between mailbox background command polling 24 + * interval timeouts. 25 + * @return_code: (output) Error code returned from hardware. 26 + * 27 + * This is the primary mechanism used to send commands to the hardware. 28 + * All the fields except @payload_* correspond exactly to the fields described in 29 + * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and 30 + * @payload_out are written to, and read from the Command Payload Registers 31 + * defined in CXL 2.0 8.2.8.4.8. 
32 + */ 33 + struct cxl_mbox_cmd { 34 + u16 opcode; 35 + void *payload_in; 36 + void *payload_out; 37 + size_t size_in; 38 + size_t size_out; 39 + size_t min_out; 40 + int poll_count; 41 + int poll_interval_ms; 42 + u16 return_code; 43 + }; 8 44 9 45 /** 10 46 * struct cxl_mailbox - context for CXL mailbox operations 11 47 * @host: device that hosts the mailbox 48 + * @enabled_cmds: mailbox commands that are enabled by the driver 49 + * @exclusive_cmds: mailbox commands that are exclusive to the kernel 12 50 * @payload_size: Size of space for payload 13 51 * (CXL 3.1 8.2.8.4.3 Mailbox Capabilities Register) 14 52 * @mbox_mutex: mutex protects device mailbox and firmware 15 53 * @mbox_wait: rcuwait for mailbox 16 54 * @mbox_send: @dev specific transport for transmitting mailbox commands 55 + * @feat_cap: Features capability 17 56 */ 18 57 struct cxl_mailbox { 19 58 struct device *host; 59 + DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX); 60 + DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX); 20 61 size_t payload_size; 21 62 struct mutex mbox_mutex; /* lock to protect mailbox context */ 22 63 struct rcuwait mbox_wait; 23 64 int (*mbox_send)(struct cxl_mailbox *cxl_mbox, struct cxl_mbox_cmd *cmd); 65 + enum cxl_features_capability feat_cap; 24 66 }; 25 67 26 68 int cxl_mailbox_init(struct cxl_mailbox *cxl_mbox, struct device *host);
+135
include/linux/fwctl.h
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES
+ */
+#ifndef __LINUX_FWCTL_H
+#define __LINUX_FWCTL_H
+#include <linux/device.h>
+#include <linux/cdev.h>
+#include <linux/cleanup.h>
+#include <uapi/fwctl/fwctl.h>
+
+struct fwctl_device;
+struct fwctl_uctx;
+
+/**
+ * struct fwctl_ops - Driver provided operations
+ *
+ * fwctl_unregister() will wait until all executing ops are completed before it
+ * returns. Drivers should be mindful to not let their ops run for too long as
+ * it will block device hot unplug and module unloading.
+ */
+struct fwctl_ops {
+	/**
+	 * @device_type: The driver's assigned device_type number. This is uABI.
+	 */
+	enum fwctl_device_type device_type;
+	/**
+	 * @uctx_size: The size of the fwctl_uctx struct to allocate. The first
+	 * bytes of this memory will be a fwctl_uctx. The driver can use the
+	 * remaining bytes as its private memory.
+	 */
+	size_t uctx_size;
+	/**
+	 * @open_uctx: Called when a file descriptor is opened before the uctx
+	 * is ever used.
+	 */
+	int (*open_uctx)(struct fwctl_uctx *uctx);
+	/**
+	 * @close_uctx: Called when the uctx is destroyed, usually when the FD
+	 * is closed.
+	 */
+	void (*close_uctx)(struct fwctl_uctx *uctx);
+	/**
+	 * @info: Implement FWCTL_INFO. Return a kmalloc() memory that is copied
+	 * to out_device_data. On input, length indicates the size of the user
+	 * buffer; on output it indicates the size of the memory. The driver can
+	 * ignore length on input, the core code will handle everything.
+	 */
+	void *(*info)(struct fwctl_uctx *uctx, size_t *length);
+	/**
+	 * @fw_rpc: Implement FWCTL_RPC. Deliver rpc_in/in_len to the FW and
+	 * return the response and set out_len. rpc_in can be returned as the
+	 * response pointer. Otherwise the returned pointer is freed with
+	 * kvfree().
+	 */
+	void *(*fw_rpc)(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope,
+			void *rpc_in, size_t in_len, size_t *out_len);
+};
+
+/**
+ * struct fwctl_device - Per-driver registration struct
+ * @dev: The sysfs (class/fwctl/fwctlXX) device
+ *
+ * Each driver instance will have one of these structs with the driver private
+ * data following immediately after. This struct is refcounted, it is freed by
+ * calling fwctl_put().
+ */
+struct fwctl_device {
+	struct device dev;
+	/* private: */
+	struct cdev cdev;
+
+	/* Protect uctx_list */
+	struct mutex uctx_list_lock;
+	struct list_head uctx_list;
+	/*
+	 * Protect ops, held for write when ops becomes NULL during unregister,
+	 * held for read whenever ops is loaded or an ops function is running.
+	 */
+	struct rw_semaphore registration_lock;
+	const struct fwctl_ops *ops;
+};
+
+struct fwctl_device *_fwctl_alloc_device(struct device *parent,
+					 const struct fwctl_ops *ops,
+					 size_t size);
+/**
+ * fwctl_alloc_device - Allocate a fwctl
+ * @parent: Physical device that provides the FW interface
+ * @ops: Driver ops to register
+ * @drv_struct: 'struct driver_fwctl' that holds the struct fwctl_device
+ * @member: Name of the struct fwctl_device in @drv_struct
+ *
+ * This allocates and initializes the fwctl_device embedded in the drv_struct.
+ * Upon success the pointer must be freed via fwctl_put(). Returns a 'drv_struct
+ * \*' on success, NULL on error.
+ */
+#define fwctl_alloc_device(parent, ops, drv_struct, member)               \
+	({                                                                \
+		static_assert(__same_type(struct fwctl_device,            \
+					  ((drv_struct *)NULL)->member)); \
+		static_assert(offsetof(drv_struct, member) == 0);         \
+		(drv_struct *)_fwctl_alloc_device(parent, ops,            \
+						  sizeof(drv_struct));    \
+	})
+
+static inline struct fwctl_device *fwctl_get(struct fwctl_device *fwctl)
+{
+	get_device(&fwctl->dev);
+	return fwctl;
+}
+static inline void fwctl_put(struct fwctl_device *fwctl)
+{
+	put_device(&fwctl->dev);
+}
+DEFINE_FREE(fwctl, struct fwctl_device *, if (_T) fwctl_put(_T));
+
+int fwctl_register(struct fwctl_device *fwctl);
+void fwctl_unregister(struct fwctl_device *fwctl);
+
+/**
+ * struct fwctl_uctx - Per user FD context
+ * @fwctl: fwctl instance that owns the context
+ *
+ * Every FD opened by userspace will get a unique context allocation. Any driver
+ * private data will follow immediately after.
+ */
+struct fwctl_uctx {
+	struct fwctl_device *fwctl;
+	/* private: */
+	/* Head at fwctl_device::uctx_list */
+	struct list_head uctx_list_entry;
+};
+
+#endif
+2 -1
include/linux/panic.h
···
 #define TAINT_AUX			16
 #define TAINT_RANDSTRUCT		17
 #define TAINT_TEST			18
-#define TAINT_FLAGS_COUNT		19
+#define TAINT_FWCTL			19
+#define TAINT_FLAGS_COUNT		20
 #define TAINT_FLAGS_MAX			((1UL << TAINT_FLAGS_COUNT) - 1)
 
 struct taint_flag {
+277
include/linux/pds/pds_adminq.h
···
 	u8     status;
 };
 
+enum pds_fwctl_cmd_opcode {
+	PDS_FWCTL_CMD_IDENT	= 70,
+	PDS_FWCTL_CMD_RPC	= 71,
+	PDS_FWCTL_CMD_QUERY	= 72,
+};
+
+/**
+ * struct pds_fwctl_cmd - Firmware control command structure
+ * @opcode: Opcode
+ * @rsvd: Reserved
+ * @ep: Endpoint identifier
+ * @op: Operation identifier
+ */
+struct pds_fwctl_cmd {
+	u8     opcode;
+	u8     rsvd[3];
+	__le32 ep;
+	__le32 op;
+} __packed;
+
+/**
+ * struct pds_fwctl_comp - Firmware control completion structure
+ * @status: Status of the firmware control operation
+ * @rsvd: Reserved
+ * @comp_index: Completion index in little-endian format
+ * @rsvd2: Reserved
+ * @color: Color bit indicating the state of the completion
+ */
+struct pds_fwctl_comp {
+	u8     status;
+	u8     rsvd;
+	__le16 comp_index;
+	u8     rsvd2[11];
+	u8     color;
+} __packed;
+
+/**
+ * struct pds_fwctl_ident_cmd - Firmware control identification command structure
+ * @opcode: Operation code for the command
+ * @rsvd: Reserved
+ * @version: Interface version
+ * @rsvd2: Reserved
+ * @len: Length of the identification data
+ * @ident_pa: Physical address of the identification data
+ */
+struct pds_fwctl_ident_cmd {
+	u8     opcode;
+	u8     rsvd;
+	u8     version;
+	u8     rsvd2;
+	__le32 len;
+	__le64 ident_pa;
+} __packed;
+
+/* future feature bits here
+ * enum pds_fwctl_features {
+ * };
+ * (compilers don't like empty enums)
+ */
+
+/**
+ * struct pds_fwctl_ident - Firmware control identification structure
+ * @features: Supported features (enum pds_fwctl_features)
+ * @version: Interface version
+ * @rsvd: Reserved
+ * @max_req_sz: Maximum request size
+ * @max_resp_sz: Maximum response size
+ * @max_req_sg_elems: Maximum number of request SGs
+ * @max_resp_sg_elems: Maximum number of response SGs
+ */
+struct pds_fwctl_ident {
+	__le64 features;
+	u8     version;
+	u8     rsvd[3];
+	__le32 max_req_sz;
+	__le32 max_resp_sz;
+	u8     max_req_sg_elems;
+	u8     max_resp_sg_elems;
+} __packed;
+
+enum pds_fwctl_query_entity {
+	PDS_FWCTL_RPC_ROOT	= 0,
+	PDS_FWCTL_RPC_ENDPOINT	= 1,
+	PDS_FWCTL_RPC_OPERATION	= 2,
+};
+
+#define PDS_FWCTL_RPC_OPCODE_CMD_SHIFT	0
+#define PDS_FWCTL_RPC_OPCODE_CMD_MASK	GENMASK(15, PDS_FWCTL_RPC_OPCODE_CMD_SHIFT)
+#define PDS_FWCTL_RPC_OPCODE_VER_SHIFT	16
+#define PDS_FWCTL_RPC_OPCODE_VER_MASK	GENMASK(23, PDS_FWCTL_RPC_OPCODE_VER_SHIFT)
+
+#define PDS_FWCTL_RPC_OPCODE_GET_CMD(op)	FIELD_GET(PDS_FWCTL_RPC_OPCODE_CMD_MASK, op)
+#define PDS_FWCTL_RPC_OPCODE_GET_VER(op)	FIELD_GET(PDS_FWCTL_RPC_OPCODE_VER_MASK, op)
+
+#define PDS_FWCTL_RPC_OPCODE_CMP(op1, op2)						\
+	(PDS_FWCTL_RPC_OPCODE_GET_CMD(op1) == PDS_FWCTL_RPC_OPCODE_GET_CMD(op2) &&	\
+	 PDS_FWCTL_RPC_OPCODE_GET_VER(op1) <= PDS_FWCTL_RPC_OPCODE_GET_VER(op2))
+
+/*
+ * FW command attributes that map to the FWCTL scope values
+ */
+#define PDSFC_FW_CMD_ATTR_READ		0x00
+#define PDSFC_FW_CMD_ATTR_DEBUG_READ	0x02
+#define PDSFC_FW_CMD_ATTR_WRITE		0x04
+#define PDSFC_FW_CMD_ATTR_DEBUG_WRITE	0x08
+#define PDSFC_FW_CMD_ATTR_SYNC		0x10
+
+/**
+ * struct pds_fwctl_query_cmd - Firmware control query command structure
+ * @opcode: Operation code for the command
+ * @entity: Entity type to query (enum pds_fwctl_query_entity)
+ * @version: Version of the query data structure supported by the driver
+ * @rsvd: Reserved
+ * @query_data_buf_len: Length of the query data buffer
+ * @query_data_buf_pa: Physical address of the query data buffer
+ * @ep: Endpoint identifier to query (when entity is PDS_FWCTL_RPC_ENDPOINT)
+ * @op: Operation identifier to query (when entity is PDS_FWCTL_RPC_OPERATION)
+ *
+ * This structure is used to send a query command to the firmware control
+ * interface. The structure is packed to ensure there is no padding between
+ * the fields.
+ */
+struct pds_fwctl_query_cmd {
+	u8     opcode;
+	u8     entity;
+	u8     version;
+	u8     rsvd;
+	__le32 query_data_buf_len;
+	__le64 query_data_buf_pa;
+	union {
+		__le32 ep;
+		__le32 op;
+	};
+} __packed;
+
+/**
+ * struct pds_fwctl_query_comp - Firmware control query completion structure
+ * @status: Status of the query command
+ * @rsvd: Reserved
+ * @comp_index: Completion index in little-endian format
+ * @version: Version of the query data structure returned by firmware. This
+ *	     should be less than or equal to the version supported by the driver
+ * @rsvd2: Reserved
+ * @color: Color bit indicating the state of the completion
+ */
+struct pds_fwctl_query_comp {
+	u8     status;
+	u8     rsvd;
+	__le16 comp_index;
+	u8     version;
+	u8     rsvd2[2];
+	u8     color;
+} __packed;
+
+/**
+ * struct pds_fwctl_query_data_endpoint - query data for entity PDS_FWCTL_RPC_ROOT
+ * @id: The identifier for the data endpoint
+ */
+struct pds_fwctl_query_data_endpoint {
+	__le32 id;
+} __packed;
+
+/**
+ * struct pds_fwctl_query_data_operation - query data for entity PDS_FWCTL_RPC_ENDPOINT
+ * @id: Operation identifier
+ * @scope: Scope of the operation (enum fwctl_rpc_scope)
+ * @rsvd: Reserved
+ */
+struct pds_fwctl_query_data_operation {
+	__le32 id;
+	u8     scope;
+	u8     rsvd[3];
+} __packed;
+
+/**
+ * struct pds_fwctl_query_data - query data structure
+ * @version: Version of the query data structure
+ * @rsvd: Reserved
+ * @num_entries: Number of entries in the union
+ * @entries: Array of query data entries, depending on the entity type
+ */
+struct pds_fwctl_query_data {
+	u8     version;
+	u8     rsvd[3];
+	__le32 num_entries;
+	u8     entries[] __counted_by_le(num_entries);
+} __packed;
+
+/**
+ * struct pds_fwctl_rpc_cmd - Firmware control RPC command
+ * @opcode: opcode PDS_FWCTL_CMD_RPC
+ * @rsvd: Reserved
+ * @flags: Indicates indirect request and/or response handling
+ * @ep: Endpoint identifier
+ * @op: Operation identifier
+ * @inline_req0: Buffer for inline request
+ * @inline_req1: Buffer for inline request
+ * @req_pa: Physical address of request data
+ * @req_sz: Size of the request
+ * @req_sg_elems: Number of request SGs
+ * @req_rsvd: Reserved
+ * @inline_req2: Buffer for inline request
+ * @resp_pa: Physical address of response data
+ * @resp_sz: Size of the response
+ * @resp_sg_elems: Number of response SGs
+ * @resp_rsvd: Reserved
+ */
+struct pds_fwctl_rpc_cmd {
+	u8     opcode;
+	u8     rsvd;
+	__le16 flags;
+#define PDS_FWCTL_RPC_IND_REQ	0x1
+#define PDS_FWCTL_RPC_IND_RESP	0x2
+	__le32 ep;
+	__le32 op;
+	u8     inline_req0[16];
+	union {
+		u8 inline_req1[16];
+		struct {
+			__le64 req_pa;
+			__le32 req_sz;
+			u8     req_sg_elems;
+			u8     req_rsvd[3];
+		};
+	};
+	union {
+		u8 inline_req2[16];
+		struct {
+			__le64 resp_pa;
+			__le32 resp_sz;
+			u8     resp_sg_elems;
+			u8     resp_rsvd[3];
+		};
+	};
+} __packed;
+
+/**
+ * struct pds_sg_elem - Transmit scatter-gather (SG) descriptor element
+ * @addr: DMA address of SG element data buffer
+ * @len: Length of SG element data buffer, in bytes
+ * @rsvd: Reserved
+ */
+struct pds_sg_elem {
+	__le64 addr;
+	__le32 len;
+	u8     rsvd[4];
+} __packed;
+
+/**
+ * struct pds_fwctl_rpc_comp - Completion of a firmware control RPC
+ * @status: Status of the command
+ * @rsvd: Reserved
+ * @comp_index: Completion index of the command
+ * @err: Error code, if any, from the RPC
+ * @resp_sz: Size of the response
+ * @rsvd2: Reserved
+ * @color: Color bit indicating the state of the completion
+ */
+struct pds_fwctl_rpc_comp {
+	u8     status;
+	u8     rsvd;
+	__le16 comp_index;
+	__le32 err;
+	__le32 resp_sz;
+	u8     rsvd2[3];
+	u8     color;
+} __packed;
+
 union pds_core_adminq_cmd {
 	u8     opcode;
 	u8     bytes[64];
···
 	struct pds_lm_dirty_enable_cmd	  lm_dirty_enable;
 	struct pds_lm_dirty_disable_cmd	  lm_dirty_disable;
 	struct pds_lm_dirty_seq_ack_cmd	  lm_dirty_seq_ack;
+
+	struct pds_fwctl_cmd		  fwctl;
+	struct pds_fwctl_ident_cmd	  fwctl_ident;
+	struct pds_fwctl_rpc_cmd	  fwctl_rpc;
+	struct pds_fwctl_query_cmd	  fwctl_query;
 };
 
 union pds_core_adminq_comp {
···
 	struct pds_lm_state_size_comp	  lm_state_size;
 	struct pds_lm_dirty_status_comp	  lm_dirty_status;
+
+	struct pds_fwctl_comp		  fwctl;
+	struct pds_fwctl_rpc_comp	  fwctl_rpc;
+	struct pds_fwctl_query_comp	  fwctl_query;
 };
 
 #ifndef __CHECKER__
+2
include/linux/pds/pds_common.h
···
 	PDS_DEV_TYPE_ETH	= 3,
 	PDS_DEV_TYPE_RDMA	= 4,
 	PDS_DEV_TYPE_LM		= 5,
+	PDS_DEV_TYPE_FWCTL	= 6,
 
 	/* new ones added before this line */
 	PDS_DEV_TYPE_MAX	= 16   /* don't change - used in struct size */
···
 #define PDS_DEV_TYPE_ETH_STR	"Eth"
 #define PDS_DEV_TYPE_RDMA_STR	"RDMA"
 #define PDS_DEV_TYPE_LM_STR	"LM"
+#define PDS_DEV_TYPE_FWCTL_STR	"fwctl"
 
 #define PDS_VDPA_DEV_NAME	PDS_CORE_DRV_NAME "." PDS_DEV_TYPE_VDPA_STR
 #define PDS_VFIO_LM_DEV_NAME	PDS_CORE_DRV_NAME "." PDS_DEV_TYPE_LM_STR "." PDS_DEV_TYPE_VFIO_STR
+170
include/uapi/cxl/features.h
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2024,2025, Intel Corporation
+ *
+ * These are definitions for the mailbox command interface of CXL subsystem.
+ */
+#ifndef _UAPI_CXL_FEATURES_H_
+#define _UAPI_CXL_FEATURES_H_
+
+#include <linux/types.h>
+#ifndef __KERNEL__
+#include <uuid/uuid.h>
+#else
+#include <linux/uuid.h>
+#endif
+
+/*
+ * struct cxl_mbox_get_sup_feats_in - Get Supported Features input
+ *
+ * @count: bytes of Feature data to return in output
+ * @start_idx: index of first requested Supported Feature Entry, 0 based.
+ * @reserved: reserved field, must be 0s.
+ *
+ * Get Supported Features (0x500) CXL r3.2 8.2.9.6.1 command.
+ * Input block for Get Supported Features.
+ */
+struct cxl_mbox_get_sup_feats_in {
+	__le32 count;
+	__le16 start_idx;
+	__u8 reserved[2];
+} __attribute__ ((__packed__));
+
+/* CXL spec r3.2 Table 8-87 command effects */
+#define CXL_CMD_CONFIG_CHANGE_COLD_RESET	BIT(0)
+#define CXL_CMD_CONFIG_CHANGE_IMMEDIATE		BIT(1)
+#define CXL_CMD_DATA_CHANGE_IMMEDIATE		BIT(2)
+#define CXL_CMD_POLICY_CHANGE_IMMEDIATE		BIT(3)
+#define CXL_CMD_LOG_CHANGE_IMMEDIATE		BIT(4)
+#define CXL_CMD_SECURITY_STATE_CHANGE		BIT(5)
+#define CXL_CMD_BACKGROUND			BIT(6)
+#define CXL_CMD_BGCMD_ABORT_SUPPORTED		BIT(7)
+#define CXL_CMD_EFFECTS_VALID			BIT(9)
+#define CXL_CMD_CONFIG_CHANGE_CONV_RESET	BIT(10)
+#define CXL_CMD_CONFIG_CHANGE_CXL_RESET		BIT(11)
+#define CXL_CMD_EFFECTS_RESERVED		GENMASK(15, 12)
+
+/*
+ * struct cxl_feat_entry - Supported Feature Entry
+ * @uuid: UUID of the Feature
+ * @id: id to identify the feature. 0 based
+ * @get_feat_size: max bytes required for Get Feature command for this Feature
+ * @set_feat_size: max bytes required for Set Feature command for this Feature
+ * @flags: attribute flags
+ * @get_feat_ver: Get Feature version
+ * @set_feat_ver: Set Feature version
+ * @effects: Set Feature command effects
+ * @reserved: reserved, must be 0
+ *
+ * CXL spec r3.2 Table 8-109
+ * Get Supported Features Supported Feature Entry
+ */
+struct cxl_feat_entry {
+	uuid_t uuid;
+	__le16 id;
+	__le16 get_feat_size;
+	__le16 set_feat_size;
+	__le32 flags;
+	__u8 get_feat_ver;
+	__u8 set_feat_ver;
+	__le16 effects;
+	__u8 reserved[18];
+} __attribute__ ((__packed__));
+
+/* @flags field for 'struct cxl_feat_entry' */
+#define CXL_FEATURE_F_CHANGEABLE		BIT(0)
+#define CXL_FEATURE_F_PERSIST_FW_UPDATE		BIT(4)
+#define CXL_FEATURE_F_DEFAULT_SEL		BIT(5)
+#define CXL_FEATURE_F_SAVED_SEL			BIT(6)
+
+/*
+ * struct cxl_mbox_get_sup_feats_out - Get Supported Features output
+ * @num_entries: number of Supported Feature Entries returned
+ * @supported_feats: number of supported Features
+ * @reserved: reserved, must be 0s.
+ * @ents: Supported Feature Entries array
+ *
+ * CXL spec r3.2 Table 8-108
+ * Get Supported Features Output Payload
+ */
+struct cxl_mbox_get_sup_feats_out {
+	__struct_group(cxl_mbox_get_sup_feats_out_hdr, hdr, /* no attrs */,
+		__le16 num_entries;
+		__le16 supported_feats;
+		__u8 reserved[4];
+	);
+	struct cxl_feat_entry ents[] __counted_by_le(num_entries);
+} __attribute__ ((__packed__));
+
+/*
+ * Get Feature CXL spec r3.2 8.2.9.6.2
+ */
+
+/*
+ * struct cxl_mbox_get_feat_in - Get Feature input
+ * @uuid: UUID for Feature
+ * @offset: offset of the first byte in Feature data for output payload
+ * @count: count in bytes of Feature data returned
+ * @selection: 0 current value, 1 default value, 2 saved value
+ *
+ * CXL spec r3.2 section 8.2.9.6.2 Table 8-99
+ */
+struct cxl_mbox_get_feat_in {
+	uuid_t uuid;
+	__le16 offset;
+	__le16 count;
+	__u8 selection;
+} __attribute__ ((__packed__));
+
+/*
+ * enum cxl_get_feat_selection - selection field of Get Feature input
+ */
+enum cxl_get_feat_selection {
+	CXL_GET_FEAT_SEL_CURRENT_VALUE,
+	CXL_GET_FEAT_SEL_DEFAULT_VALUE,
+	CXL_GET_FEAT_SEL_SAVED_VALUE,
+	CXL_GET_FEAT_SEL_MAX
+};
+
+/*
+ * Set Feature CXL spec r3.2 8.2.9.6.3
+ */
+
+/*
+ * struct cxl_mbox_set_feat_in - Set Features input
+ * @uuid: UUID for Feature
+ * @flags: set feature flags
+ * @offset: byte offset of Feature data to update
+ * @version: Feature version of the data in Feature Data
+ * @rsvd: reserved, must be 0s.
+ * @feat_data: raw byte stream of Features data to update
+ *
+ * CXL spec r3.2 section 8.2.9.6.3 Table 8-101
+ */
+struct cxl_mbox_set_feat_in {
+	__struct_group(cxl_mbox_set_feat_hdr, hdr, /* no attrs */,
+		uuid_t uuid;
+		__le32 flags;
+		__le16 offset;
+		__u8 version;
+		__u8 rsvd[9];
+	);
+	__u8 feat_data[];
+} __packed;
+
+/*
+ * enum cxl_set_feat_flag_data_transfer - Set Feature flags field
+ */
+enum cxl_set_feat_flag_data_transfer {
+	CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER = 0,
+	CXL_SET_FEAT_FLAG_INITIATE_DATA_TRANSFER,
+	CXL_SET_FEAT_FLAG_CONTINUE_DATA_TRANSFER,
+	CXL_SET_FEAT_FLAG_FINISH_DATA_TRANSFER,
+	CXL_SET_FEAT_FLAG_ABORT_DATA_TRANSFER,
+	CXL_SET_FEAT_FLAG_DATA_TRANSFER_MAX
+};
+
+#define CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK		GENMASK(2, 0)
+#define CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET	BIT(3)
+
+#endif
+56
include/uapi/fwctl/cxl.h
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2024-2025 Intel Corporation
+ *
+ * These are definitions for the mailbox command interface of CXL subsystem.
+ */
+#ifndef _UAPI_FWCTL_CXL_H_
+#define _UAPI_FWCTL_CXL_H_
+
+#include <linux/types.h>
+#include <linux/stddef.h>
+#include <cxl/features.h>
+
+/**
+ * struct fwctl_rpc_cxl - ioctl(FWCTL_RPC) input for CXL
+ * @opcode: CXL mailbox command opcode
+ * @flags: Flags for the command (input).
+ * @op_size: Size of input payload.
+ * @reserved1: Reserved. Must be 0s.
+ * @get_sup_feats_in: Get Supported Features input
+ * @get_feat_in: Get Feature input
+ * @set_feat_in: Set Feature input
+ */
+struct fwctl_rpc_cxl {
+	__struct_group(fwctl_rpc_cxl_hdr, hdr, /* no attrs */,
+		__u32 opcode;
+		__u32 flags;
+		__u32 op_size;
+		__u32 reserved1;
+	);
+	union {
+		struct cxl_mbox_get_sup_feats_in get_sup_feats_in;
+		struct cxl_mbox_get_feat_in get_feat_in;
+		struct cxl_mbox_set_feat_in set_feat_in;
+	};
+};
+
+/**
+ * struct fwctl_rpc_cxl_out - ioctl(FWCTL_RPC) output for CXL
+ * @size: Size of the output payload
+ * @retval: Return value from device
+ * @get_sup_feats_out: Get Supported Features output
+ * @payload: raw byte stream of payload
+ */
+struct fwctl_rpc_cxl_out {
+	__struct_group(fwctl_rpc_cxl_out_hdr, hdr, /* no attrs */,
+		__u32 size;
+		__u32 retval;
+	);
+	union {
+		struct cxl_mbox_get_sup_feats_out get_sup_feats_out;
+		__DECLARE_FLEX_ARRAY(__u8, payload);
+	};
+};
+
+#endif
+141
include/uapi/fwctl/fwctl.h
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES.
+ */
+#ifndef _UAPI_FWCTL_H
+#define _UAPI_FWCTL_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#define FWCTL_TYPE 0x9A
+
+/**
+ * DOC: General ioctl format
+ *
+ * The ioctl interface follows a general format to allow for extensibility. Each
+ * ioctl is passed a structure pointer as the argument providing the size of
+ * the structure in the first u32. The kernel checks that any structure space
+ * beyond what it understands is 0. This allows userspace to use the backward
+ * compatible portion while consistently using the newer, larger, structures.
+ *
+ * ioctls use a standard meaning for common errnos:
+ *
+ *  - ENOTTY: The IOCTL number itself is not supported at all
+ *  - E2BIG: The IOCTL number is supported, but the provided structure has
+ *    non-zero in a part the kernel does not understand.
+ *  - EOPNOTSUPP: The IOCTL number is supported, and the structure is
+ *    understood, however a known field has a value the kernel does not
+ *    understand or support.
+ *  - EINVAL: Everything about the IOCTL was understood, but a field is not
+ *    correct.
+ *  - ENOMEM: Out of memory.
+ *  - ENODEV: The underlying device has been hot-unplugged and the FD is
+ *    orphaned.
+ *
+ * Individual ioctls may return additional errnos beyond these.
+ */
+enum {
+	FWCTL_CMD_BASE = 0,
+	FWCTL_CMD_INFO = 0,
+	FWCTL_CMD_RPC = 1,
+};
+
+enum fwctl_device_type {
+	FWCTL_DEVICE_TYPE_ERROR = 0,
+	FWCTL_DEVICE_TYPE_MLX5 = 1,
+	FWCTL_DEVICE_TYPE_CXL = 2,
+	FWCTL_DEVICE_TYPE_PDS = 4,
+};
+
+/**
+ * struct fwctl_info - ioctl(FWCTL_INFO)
+ * @size: sizeof(struct fwctl_info)
+ * @flags: Must be 0
+ * @out_device_type: Returns the type of the device from enum fwctl_device_type
+ * @device_data_len: On input the length of the out_device_data memory. On
+ *	output the size of the kernel's device_data which may be larger or
+ *	smaller than the input. May be 0 on input.
+ * @out_device_data: Pointer to a memory of device_data_len bytes. Kernel will
+ *	fill the entire memory, zeroing as required.
+ *
+ * Returns basic information about this fwctl instance, particularly what driver
+ * is being used to define the device_data format.
+ */
+struct fwctl_info {
+	__u32 size;
+	__u32 flags;
+	__u32 out_device_type;
+	__u32 device_data_len;
+	__aligned_u64 out_device_data;
+};
+#define FWCTL_INFO _IO(FWCTL_TYPE, FWCTL_CMD_INFO)
+
+/**
+ * enum fwctl_rpc_scope - Scope of access for the RPC
+ *
+ * Refer to fwctl.rst for a more detailed discussion of these scopes.
+ */
+enum fwctl_rpc_scope {
+	/**
+	 * @FWCTL_RPC_CONFIGURATION: Device configuration access scope
+	 *
+	 * Read/write access to device configuration. When configuration
+	 * is written to the device it remains in a fully supported state.
+	 */
+	FWCTL_RPC_CONFIGURATION = 0,
+	/**
+	 * @FWCTL_RPC_DEBUG_READ_ONLY: Read only access to debug information
+	 *
+	 * Readable debug information. Debug information is compatible with
+	 * kernel lockdown, and does not disclose any sensitive information. For
+	 * instance exposing any encryption secrets from this information is
+	 * forbidden.
+	 */
+	FWCTL_RPC_DEBUG_READ_ONLY = 1,
+	/**
+	 * @FWCTL_RPC_DEBUG_WRITE: Writable access to lockdown compatible debug information
+	 *
+	 * Allows write access to data in the device which may leave a fully
+	 * supported state. This is intended to permit intensive and possibly
+	 * invasive debugging. This scope will taint the kernel.
+	 */
+	FWCTL_RPC_DEBUG_WRITE = 2,
+	/**
+	 * @FWCTL_RPC_DEBUG_WRITE_FULL: Write access to all debug information
+	 *
+	 * Allows read/write access to everything. Requires CAP_SYS_RAW_IO, so
+	 * it is not required to follow lockdown principles. If in doubt
+	 * debugging should be placed in this scope. This scope will taint the
+	 * kernel.
+	 */
+	FWCTL_RPC_DEBUG_WRITE_FULL = 3,
+};
+
+/**
+ * struct fwctl_rpc - ioctl(FWCTL_RPC)
+ * @size: sizeof(struct fwctl_rpc)
+ * @scope: One of enum fwctl_rpc_scope, required scope for the RPC
+ * @in_len: Length of the in memory
+ * @out_len: Length of the out memory
+ * @in: Request message in device specific format
+ * @out: Response message in device specific format
+ *
+ * Deliver a Remote Procedure Call to the device FW and return the response. The
+ * call's parameters and return are marshaled into linear buffers of memory. Any
+ * errno indicates that delivery of the RPC to the device failed. Return status
+ * originating in the device during a successful delivery must be encoded into
+ * out.
+ *
+ * The format of the buffers matches the out_device_type from FWCTL_INFO.
+ */
+struct fwctl_rpc {
+	__u32 size;
+	__u32 scope;
+	__u32 in_len;
+	__u32 out_len;
+	__aligned_u64 in;
+	__aligned_u64 out;
+};
+#define FWCTL_RPC _IO(FWCTL_TYPE, FWCTL_CMD_RPC)
+
+#endif
+36
include/uapi/fwctl/mlx5.h
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES
+ *
+ * These are definitions for the command interface for mlx5 HW. mlx5 FW has a
+ * User Context mechanism which allows the FW to understand a security scope.
+ * FWCTL binds each FD to a FW user context and then places the User Context ID
+ * (UID) in each command header. The created User Context has a capability set
+ * that is appropriate for FWCTL's security model.
+ *
+ * Command formation should use a copy of the structs in mlx5_ifc.h following
+ * the Programmers Reference Manual. An open release is available here:
+ *
+ *  https://network.nvidia.com/files/doc-2020/ethernet-adapters-programming-manual.pdf
+ *
+ * The device_type for this file is FWCTL_DEVICE_TYPE_MLX5.
+ */
+#ifndef _UAPI_FWCTL_MLX5_H
+#define _UAPI_FWCTL_MLX5_H
+
+#include <linux/types.h>
+
+/**
+ * struct fwctl_info_mlx5 - ioctl(FWCTL_INFO) out_device_data
+ * @uid: The FW UID this FD is bound to. Each command header will force
+ *	this value.
+ * @uctx_caps: The FW capabilities that are enabled for the uid.
+ *
+ * Return basic information about the FW interface available.
+ */
+struct fwctl_info_mlx5 {
+	__u32 uid;
+	__u32 uctx_caps;
+};
+
+#endif
+62
include/uapi/fwctl/pds.h
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright(c) Advanced Micro Devices, Inc */
+
+/*
+ * fwctl interface info for pds_fwctl
+ */
+
+#ifndef _UAPI_FWCTL_PDS_H_
+#define _UAPI_FWCTL_PDS_H_
+
+#include <linux/types.h>
+
+/**
+ * struct fwctl_info_pds
+ * @uctx_caps: bitmap of firmware capabilities
+ *
+ * Return basic information about the FW interface available.
+ */
+struct fwctl_info_pds {
+	__u32 uctx_caps;
+};
+
+/**
+ * enum pds_fwctl_capabilities
+ * @PDS_FWCTL_QUERY_CAP: firmware can be queried for information
+ * @PDS_FWCTL_SEND_CAP: firmware can be sent commands
+ */
+enum pds_fwctl_capabilities {
+	PDS_FWCTL_QUERY_CAP = 0,
+	PDS_FWCTL_SEND_CAP,
+};
+
+/**
+ * struct fwctl_rpc_pds
+ * @in.op: requested operation code
+ * @in.ep: firmware endpoint to operate on
+ * @in.rsvd: reserved
+ * @in.len: length of payload data
+ * @in.payload: address of payload buffer
+ * @in: rpc in parameters
+ * @out.retval: operation result value
+ * @out.rsvd: reserved
+ * @out.len: length of result data buffer
+ * @out.payload: address of payload data buffer
+ * @out: rpc out parameters
+ */
+struct fwctl_rpc_pds {
+	struct {
+		__u32 op;
+		__u32 ep;
+		__u32 rsvd;
+		__u32 len;
+		__aligned_u64 payload;
+	} in;
+	struct {
+		__u32 retval;
+		__u32 rsvd[2];
+		__u32 len;
+		__aligned_u64 payload;
+	} out;
+};
+#endif /* _UAPI_FWCTL_PDS_H_ */
+1
kernel/panic.c
···
 	TAINT_FLAG(AUX,				'X', ' ', true),
 	TAINT_FLAG(RANDSTRUCT,			'T', ' ', true),
 	TAINT_FLAG(TEST,			'N', ' ', true),
+	TAINT_FLAG(FWCTL,			'J', ' ', true),
 };
 
 #undef TAINT_FLAG
+8
tools/debugging/kernel-chktaint
···
 	echo " * an in-kernel test (such as a KUnit test) has been run (#18)"
 fi
 
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+	addout " "
+else
+	addout "J"
+	echo " * fwctl's mutating debug interface was used (#19)"
+fi
+
 echo "For a more detailed explanation of the various taint flags see"
 echo " Documentation/admin-guide/tainted-kernels.rst in the Linux kernel sources"
 echo " or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html"
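The chktaint hunk above walks the taint bitmask one bit at a time by repeatedly halving the value. The same check for bit 19 (TAINT_FWCTL) can be done directly with shell arithmetic; the sample value here is made up for illustration:

```shell
#!/bin/sh
# Hypothetical taint value with bits 18 (TEST) and 19 (FWCTL) set
T=$(( (1 << 19) | (1 << 18) ))

# Test bit 19 directly instead of walking bits by repeated division
if [ $(( (T >> 19) & 1 )) -eq 1 ]; then
    echo "J"
fi
```

Running it prints the single-letter flag `J`, matching the character the kernel reports for TAINT_FWCTL in /proc/sys/kernel/tainted decoding.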
+1
tools/testing/cxl/Kbuild
···
 cxl_core-y += $(CXL_CORE_SRC)/cdat.o
 cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
 cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
+cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
 cxl_core-y += config_check.o
 cxl_core-y += cxl_core_test.o
 cxl_core-y += cxl_core_exports.o
+185
tools/testing/cxl/test/mem.c
···
 		.effect = CXL_CMD_EFFECT_NONE,
 	},
 	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_SUPPORTED_FEATURES),
+		.effect = CXL_CMD_EFFECT_NONE,
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_FEATURE),
+		.effect = CXL_CMD_EFFECT_NONE,
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_SET_FEATURE),
+		.effect = cpu_to_le16(EFFECT(CONF_CHANGE_IMMEDIATE)),
+	},
+	{
 		.opcode = cpu_to_le16(CXL_MBOX_OP_IDENTIFY),
 		.effect = CXL_CMD_EFFECT_NONE,
 	},
···
 	u32 ev_status;
 };
 
+struct vendor_test_feat {
+	__le32 data;
+} __packed;
+
 struct cxl_mockmem_data {
 	void *lsa;
 	void *fw;
···
 	u8 event_buf[SZ_4K];
 	u64 timestamp;
 	unsigned long sanitize_timeout;
+	struct vendor_test_feat test_feat;
 };
 
 static struct mock_event_log *event_find_log(struct device *dev, int log_type)
···
 	return -EINVAL;
 }
 
+#define CXL_VENDOR_FEATURE_TEST							\
+	UUID_INIT(0xffffffff, 0xffff, 0xffff, 0xff, 0xff, 0xff, 0xff, 0xff,	\
+		  0xff, 0xff, 0xff)
+
+static void fill_feature_vendor_test(struct cxl_feat_entry *feat)
+{
+	feat->uuid = CXL_VENDOR_FEATURE_TEST;
+	feat->id = 0;
+	feat->get_feat_size = cpu_to_le16(0x4);
+	feat->set_feat_size = cpu_to_le16(0x4);
+	feat->flags = cpu_to_le32(CXL_FEATURE_F_CHANGEABLE |
+				  CXL_FEATURE_F_DEFAULT_SEL |
+				  CXL_FEATURE_F_SAVED_SEL);
+	feat->get_feat_ver = 1;
+	feat->set_feat_ver = 1;
+	feat->effects = cpu_to_le16(CXL_CMD_CONFIG_CHANGE_COLD_RESET |
+				    CXL_CMD_EFFECTS_VALID);
+}
+
+#define MAX_CXL_TEST_FEATS 1
+
+static int mock_get_test_feature(struct cxl_mockmem_data *mdata,
+				 struct cxl_mbox_cmd *cmd)
+{
+	struct vendor_test_feat *output = cmd->payload_out;
+	struct cxl_mbox_get_feat_in *input = cmd->payload_in;
+	u16 offset = le16_to_cpu(input->offset);
+	u16 count = le16_to_cpu(input->count);
+	u8 *ptr;
+
+	if (offset > sizeof(*output)) {
+		cmd->return_code = CXL_MBOX_CMD_RC_INPUT;
+		return -EINVAL;
+	}
+
+	if (offset + count > sizeof(*output)) {
+		cmd->return_code = CXL_MBOX_CMD_RC_INPUT;
+		return -EINVAL;
+	}
+
+	ptr = (u8 *)&mdata->test_feat + offset;
+	memcpy((u8 *)output + offset, ptr, count);
+
+	return 0;
+}
+
+static int mock_get_feature(struct cxl_mockmem_data *mdata,
+			    struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_get_feat_in *input = cmd->payload_in;
+
+	if (uuid_equal(&input->uuid, &CXL_VENDOR_FEATURE_TEST))
+		return mock_get_test_feature(mdata, cmd);
+
+	cmd->return_code = CXL_MBOX_CMD_RC_UNSUPPORTED;
+
+	return -EOPNOTSUPP;
+}
+
+static int mock_set_test_feature(struct cxl_mockmem_data *mdata,
+				 struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_set_feat_in *input = cmd->payload_in;
+	struct vendor_test_feat *test =
+		(struct vendor_test_feat *)input->feat_data;
+	u32 action;
+
+	action = FIELD_GET(CXL_SET_FEAT_FLAG_DATA_TRANSFER_MASK,
+			   le32_to_cpu(input->hdr.flags));
+	/*
+	 * While it is spec compliant to support other set actions, it is not
+	 * necessary to add the complication in the emulation currently. Reject
+	 * anything besides full xfer.
+	 */
+	if (action != CXL_SET_FEAT_FLAG_FULL_DATA_TRANSFER) {
+		cmd->return_code = CXL_MBOX_CMD_RC_INPUT;
+		return -EINVAL;
+	}
+
+	/* Offset should be reserved when doing full transfer */
+	if (input->hdr.offset) {
+		cmd->return_code = CXL_MBOX_CMD_RC_INPUT;
+		return -EINVAL;
+	}
+
+	memcpy(&mdata->test_feat.data, &test->data, sizeof(u32));
+
+	return 0;
+}
+
+static int mock_set_feature(struct cxl_mockmem_data *mdata,
+			    struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_set_feat_in *input = cmd->payload_in;
+
+	if (uuid_equal(&input->hdr.uuid, &CXL_VENDOR_FEATURE_TEST))
+		return mock_set_test_feature(mdata, cmd);
+
+	cmd->return_code = CXL_MBOX_CMD_RC_UNSUPPORTED;
+
+	return -EOPNOTSUPP;
+}
+
+static int mock_get_supported_features(struct cxl_mockmem_data *mdata,
+				       struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_get_sup_feats_in *in = cmd->payload_in;
+	struct cxl_mbox_get_sup_feats_out *out = cmd->payload_out;
+	struct cxl_feat_entry *feat;
+	u16 start_idx, count;
+
+	if (cmd->size_out < sizeof(*out)) {
+		cmd->return_code = CXL_MBOX_CMD_RC_PAYLOADLEN;
+		return -EINVAL;
+	}
+
+	/*
+	 * Current emulation only supports 1 feature
+	 */
+	start_idx = le16_to_cpu(in->start_idx);
+	if (start_idx != 0) {
+		cmd->return_code = CXL_MBOX_CMD_RC_INPUT;
+		return -EINVAL;
+	}
+
+	count = le16_to_cpu(in->count);
+	if (count < struct_size(out, ents, 0)) {
+		cmd->return_code = CXL_MBOX_CMD_RC_PAYLOADLEN;
+		return -EINVAL;
+	}
+
+	out->supported_feats = cpu_to_le16(MAX_CXL_TEST_FEATS);
+	cmd->return_code = 0;
+	if (count < struct_size(out, ents, MAX_CXL_TEST_FEATS)) {
+		out->num_entries = 0;
+		return 0;
+	}
+
+	out->num_entries =
cpu_to_le16(MAX_CXL_TEST_FEATS); 1496 + feat = out->ents; 1497 + fill_feature_vendor_test(feat); 1498 + 1499 + return 0; 1500 + } 1501 + 1374 1502 static int cxl_mock_mbox_send(struct cxl_mailbox *cxl_mbox, 1375 1503 struct cxl_mbox_cmd *cmd) 1376 1504 { ··· 1601 1439 case CXL_MBOX_OP_ACTIVATE_FW: 1602 1440 rc = mock_activate_fw(mdata, cmd); 1603 1441 break; 1442 + case CXL_MBOX_OP_GET_SUPPORTED_FEATURES: 1443 + rc = mock_get_supported_features(mdata, cmd); 1444 + break; 1445 + case CXL_MBOX_OP_GET_FEATURE: 1446 + rc = mock_get_feature(mdata, cmd); 1447 + break; 1448 + case CXL_MBOX_OP_SET_FEATURE: 1449 + rc = mock_set_feature(mdata, cmd); 1450 + break; 1604 1451 default: 1605 1452 break; 1606 1453 } ··· 1655 1484 return rc; 1656 1485 1657 1486 return 0; 1487 + } 1488 + 1489 + static void cxl_mock_test_feat_init(struct cxl_mockmem_data *mdata) 1490 + { 1491 + mdata->test_feat.data = cpu_to_le32(0xdeadbeef); 1658 1492 } 1659 1493 1660 1494 static int cxl_mock_mem_probe(struct platform_device *pdev) ··· 1734 1558 if (rc) 1735 1559 return rc; 1736 1560 1561 + rc = devm_cxl_setup_features(cxlds); 1562 + if (rc) 1563 + dev_dbg(dev, "No CXL Features discovered\n"); 1564 + 1737 1565 cxl_mock_add_event_logs(&mdata->mes); 1738 1566 1739 1567 cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlds); ··· 1752 1572 if (rc) 1753 1573 return rc; 1754 1574 1575 + rc = devm_cxl_setup_fwctl(cxlmd); 1576 + if (rc) 1577 + dev_dbg(dev, "No CXL FWCTL setup\n"); 1578 + 1755 1579 cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL); 1580 + cxl_mock_test_feat_init(mdata); 1756 1581 1757 1582 return 0; 1758 1583 }