
Merge tag 'drm-misc-next-2023-04-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.4-rc1:

UAPI Changes:

Cross-subsystem Changes:
- Document port and rotation dt bindings better.
- For panel timing DT bindings, document that vsync and hsync come
  first, rather than last, in the image.
- Fix video/aperture typos.

Core Changes:
- Reject prime DMA-Buf attachment if get_sg_table is missing.
(For self-importing dma-buf only.)
- Add prime import/export to vram-helper.
- Fix oops in drm/vblank when init is not called.
- Fixup xres/yres_virtual and other fixes in fb helper.
- Improve SCDC debugs.
- Skip setting deadline on modesets.
- Assorted TTM fixes.

Driver Changes:
- Add lima usage stats.
- Assorted fixes to bridge/lt8912b, tc358767, ivpu,
  bridge/ti-sn65dsi83, ps8640.
- Use pci aperture helpers in drm/ast, lynxfb, radeonfb.
- Revert some lima patches, as they required a commit that has been
reverted upstream.
- Add AUO NE135FBM-N41 v8.1 eDP panel.
- Add QAIC accel driver.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/64bb9696-a76a-89d9-1866-bcdf7c69c284@linux.intel.com

+7004 -231
+1
Documentation/accel/index.rst
··· 8 8 :maxdepth: 1 9 9 10 10 introduction 11 + qaic/index 11 12 12 13 .. only:: subproject and html 13 14
+510
Documentation/accel/qaic/aic100.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0-only 2 + 3 + =============================== 4 + Qualcomm Cloud AI 100 (AIC100) 5 + =============================== 6 + 7 + Overview 8 + ======== 9 + 10 + The Qualcomm Cloud AI 100/AIC100 family of products (including SA9000P - part of 11 + Snapdragon Ride) are PCIe adapter cards which contain a dedicated SoC ASIC for 12 + the purpose of efficiently running Artificial Intelligence (AI) Deep Learning 13 + inference workloads. They are AI accelerators. 14 + 15 + The PCIe interface of AIC100 is capable of PCIe Gen4 speeds over eight lanes 16 + (x8). An individual SoC on a card can have up to 16 NSPs for running workloads. 17 + Each SoC has an A53 management CPU. On card, there can be up to 32 GB of DDR. 18 + 19 + Multiple AIC100 cards can be hosted in a single system to scale overall 20 + performance. AIC100 cards are multi-user capable and able to execute workloads 21 + from multiple users in a concurrent manner. 22 + 23 + Hardware Description 24 + ==================== 25 + 26 + An AIC100 card consists of an AIC100 SoC, on-card DDR, and a set of misc 27 + peripherals (PMICs, etc). 28 + 29 + An AIC100 card can either be a PCIe HHHL form factor (a traditional PCIe card), 30 + or a Dual M.2 card. Both use PCIe to connect to the host system. 31 + 32 + As a PCIe endpoint/adapter, AIC100 uses the standard VendorID(VID)/ 33 + DeviceID(DID) combination to uniquely identify itself to the host. AIC100 34 + uses the standard Qualcomm VID (0x17cb). All AIC100 SKUs use the same 35 + AIC100 DID (0xa100). 36 + 37 + AIC100 does not implement FLR (function level reset). 38 + 39 + AIC100 implements MSI but does not implement MSI-X. AIC100 requires 17 MSIs to 40 + operate (1 for MHI, 16 for the DMA Bridge). 41 + 42 + As a PCIe device, AIC100 utilizes BARs to provide host interfaces to the device 43 + hardware. AIC100 provides 3, 64-bit BARs. 44 + 45 + * The first BAR is 4K in size, and exposes the MHI interface to the host. 
46 + 47 + * The second BAR is 2M in size, and exposes the DMA Bridge interface to the 48 + host. 49 + 50 + * The third BAR is variable in size based on an individual AIC100's 51 + configuration, but defaults to 64K. This BAR currently has no purpose. 52 + 53 + From the host perspective, AIC100 has several key hardware components - 54 + 55 + * MHI (Modem Host Interface) 56 + * QSM (QAIC Service Manager) 57 + * NSPs (Neural Signal Processor) 58 + * DMA Bridge 59 + * DDR 60 + 61 + MHI 62 + --- 63 + 64 + AIC100 has one MHI interface over PCIe. MHI itself is documented at 65 + Documentation/mhi/index.rst. MHI is the mechanism the host uses to communicate 66 + with the QSM. Except for workload data via the DMA Bridge, all interaction with 67 + the device occurs via MHI. 68 + 69 + QSM 70 + --- 71 + 72 + QAIC Service Manager. This is an ARM A53 CPU that runs the primary 73 + firmware of the card and performs on-card management tasks. It also 74 + communicates with the host via MHI. Each AIC100 has one of 75 + these. 76 + 77 + NSP 78 + --- 79 + 80 + Neural Signal Processor. Each AIC100 has up to 16 of these. These are 81 + the processors that run the workloads on AIC100. Each NSP is a Qualcomm Hexagon 82 + (Q6) DSP with HVX and HMX. Each NSP can only run one workload at a time, but 83 + multiple NSPs may be assigned to a single workload. Since each NSP can only run 84 + one workload, AIC100 is limited to 16 concurrent workloads. Workload 85 + "scheduling" is under the purview of the host. AIC100 does not automatically 86 + timeslice. 87 + 88 + DMA Bridge 89 + ---------- 90 + 91 + The DMA Bridge is a custom DMA engine that manages the flow of data 92 + in and out of workloads. AIC100 has one of these. The DMA Bridge has 16 93 + channels, each consisting of a set of request/response FIFOs. Each active 94 + workload is assigned a single DMA Bridge channel.
The DMA Bridge exposes 95 + hardware registers to manage the FIFOs (head/tail pointers), but requires host 96 + memory to store the FIFOs. 97 + 98 + DDR 99 + --- 100 + 101 + AIC100 has on-card DDR. In total, an AIC100 can have up to 32 GB of DDR. 102 + This DDR is used to store workloads, data for the workloads, and is used by the 103 + QSM for managing the device. NSPs are granted access to sections of the DDR by 104 + the QSM. The host does not have direct access to the DDR, and must make 105 + requests to the QSM to transfer data to the DDR. 106 + 107 + High-level Use Flow 108 + =================== 109 + 110 + AIC100 is a multi-user, programmable accelerator typically used for running 111 + neural networks in inferencing mode to efficiently perform AI operations. 112 + AIC100 is not intended for training neural networks. AIC100 can be utilized 113 + for generic compute workloads. 114 + 115 + Assuming a user wants to utilize AIC100, they would follow these steps: 116 + 117 + 1. Compile the workload into an ELF targeting the NSP(s) 118 + 2. Make requests to the QSM to load the workload and related artifacts into the 119 + device DDR 120 + 3. Make a request to the QSM to activate the workload onto a set of idle NSPs 121 + 4. Make requests to the DMA Bridge to send input data to the workload to be 122 + processed, and other requests to receive processed output data from the 123 + workload. 124 + 5. Once the workload is no longer required, make a request to the QSM to 125 + deactivate the workload, thus putting the NSPs back into an idle state. 126 + 6. Once the workload and related artifacts are no longer needed for future 127 + sessions, make requests to the QSM to unload the data from DDR. This frees 128 + the DDR to be used by other users. 129 + 130 + 131 + Boot Flow 132 + ========= 133 + 134 + AIC100 uses a flashless boot flow, derived from Qualcomm MSMs. 135 + 136 + When AIC100 is first powered on, it begins executing PBL (Primary Bootloader) 137 + from ROM. 
PBL enumerates the PCIe link, and initializes the BHI (Boot Host 138 + Interface) component of MHI. 139 + 140 + Using BHI, the host points PBL to the location of the SBL (Secondary Bootloader) 141 + image. The PBL pulls the image from the host, validates it, and begins 142 + execution of SBL. 143 + 144 + SBL initializes MHI, and uses MHI to notify the host that the device has entered 145 + the SBL stage. SBL performs a number of operations: 146 + 147 + * SBL initializes the majority of hardware (anything PBL left uninitialized), 148 + including DDR. 149 + * SBL offloads the bootlog to the host. 150 + * SBL synchronizes timestamps with the host for future logging. 151 + * SBL uses the Sahara protocol to obtain the runtime firmware images from the 152 + host. 153 + 154 + Once SBL has obtained and validated the runtime firmware, it brings the NSPs out 155 + of reset, and jumps into the QSM. 156 + 157 + The QSM uses MHI to notify the host that the device has entered the QSM stage 158 + (AMSS in MHI terms). At this point, the AIC100 device is fully functional, and 159 + ready to process workloads. 160 + 161 + Userspace components 162 + ==================== 163 + 164 + Compiler 165 + -------- 166 + 167 + An open compiler for AIC100 based on upstream LLVM can be found at: 168 + https://github.com/quic/software-kit-for-qualcomm-cloud-ai-100-cc 169 + 170 + Usermode Driver (UMD) 171 + --------------------- 172 + 173 + An open UMD that interfaces with the qaic kernel driver can be found at: 174 + https://github.com/quic/software-kit-for-qualcomm-cloud-ai-100 175 + 176 + Sahara loader 177 + ------------- 178 + 179 + An open implementation of the Sahara protocol called kickstart can be found at: 180 + https://github.com/andersson/qdl 181 + 182 + MHI Channels 183 + ============ 184 + 185 + AIC100 defines a number of MHI channels for different purposes. This is a list 186 + of the defined channels, and their uses. 
187 + 188 + +----------------+---------+----------+----------------------------------------+ 189 + | Channel name | IDs | EEs | Purpose | 190 + +================+=========+==========+========================================+ 191 + | QAIC_LOOPBACK | 0 & 1 | AMSS | Any data sent to the device on this | 192 + | | | | channel is sent back to the host. | 193 + +----------------+---------+----------+----------------------------------------+ 194 + | QAIC_SAHARA | 2 & 3 | SBL | Used by SBL to obtain the runtime | 195 + | | | | firmware from the host. | 196 + +----------------+---------+----------+----------------------------------------+ 197 + | QAIC_DIAG | 4 & 5 | AMSS | Used to communicate with QSM via the | 198 + | | | | DIAG protocol. | 199 + +----------------+---------+----------+----------------------------------------+ 200 + | QAIC_SSR | 6 & 7 | AMSS | Used to notify the host of subsystem | 201 + | | | | restart events, and to offload SSR | 202 + | | | | crashdumps. | 203 + +----------------+---------+----------+----------------------------------------+ 204 + | QAIC_QDSS | 8 & 9 | AMSS | Used for the Qualcomm Debug Subsystem. | 205 + +----------------+---------+----------+----------------------------------------+ 206 + | QAIC_CONTROL | 10 & 11 | AMSS | Used for the Neural Network Control | 207 + | | | | (NNC) protocol. This is the primary | 208 + | | | | channel between host and QSM for | 209 + | | | | managing workloads. | 210 + +----------------+---------+----------+----------------------------------------+ 211 + | QAIC_LOGGING | 12 & 13 | SBL | Used by the SBL to send the bootlog to | 212 + | | | | the host. | 213 + +----------------+---------+----------+----------------------------------------+ 214 + | QAIC_STATUS | 14 & 15 | AMSS | Used to notify the host of Reliability,| 215 + | | | | Accessibility, Serviceability (RAS) | 216 + | | | | events. 
| 217 + +----------------+---------+----------+----------------------------------------+ 218 + | QAIC_TELEMETRY | 16 & 17 | AMSS | Used to get/set power/thermal/etc | 219 + | | | | attributes. | 220 + +----------------+---------+----------+----------------------------------------+ 221 + | QAIC_DEBUG | 18 & 19 | AMSS | Not used. | 222 + +----------------+---------+----------+----------------------------------------+ 223 + | QAIC_TIMESYNC | 20 & 21 | SBL/AMSS | Used to synchronize timestamps in the | 224 + | | | | device side logs with the host time | 225 + | | | | source. | 226 + +----------------+---------+----------+----------------------------------------+ 227 + 228 + DMA Bridge 229 + ========== 230 + 231 + Overview 232 + -------- 233 + 234 + The DMA Bridge is one of the main interfaces to the host from the device 235 + (the other being MHI). As part of activating a workload to run on NSPs, the QSM 236 + assigns that network a DMA Bridge channel. A workload's DMA Bridge channel 237 + (DBC for short) is solely for the use of that workload and is not shared with 238 + other workloads. 239 + 240 + Each DBC is a pair of FIFOs that manage data in and out of the workload. One 241 + FIFO is the request FIFO. The other FIFO is the response FIFO. 242 + 243 + Each DBC contains 4 registers in hardware: 244 + 245 + * Request FIFO head pointer (offset 0x0). Read only by the host. Indicates the 246 + latest item in the FIFO the device has consumed. 247 + * Request FIFO tail pointer (offset 0x4). Read/write by the host. Host 248 + increments this register to add new items to the FIFO. 249 + * Response FIFO head pointer (offset 0x8). Read/write by the host. Indicates 250 + the latest item in the FIFO the host has consumed. 251 + * Response FIFO tail pointer (offset 0xc). Read only by the host. Device 252 + increments this register to add new items to the FIFO. 253 + 254 + The values in each register are indexes in the FIFO. 
To get the location of the 255 + FIFO element pointed to by the register: FIFO base address + register * element 256 + size. 257 + 258 + DBC registers are exposed to the host via the second BAR. Each DBC consumes 259 + 4KB of space in the BAR. 260 + 261 + The actual FIFOs are backed by host memory. When sending a request to the QSM 262 + to activate a network, the host must donate memory to be used for the FIFOs. 263 + Due to internal mapping limitations of the device, a single contiguous chunk of 264 + memory must be provided per DBC, which hosts both FIFOs. The request FIFO will 265 + consume the beginning of the memory chunk, and the response FIFO will consume 266 + the end of the memory chunk. 267 + 268 + Request FIFO 269 + ------------ 270 + 271 + A request FIFO element has the following structure: 272 + 273 + .. code-block:: c 274 + 275 + struct request_elem { 276 + u16 req_id; 277 + u8 seq_id; 278 + u8 pcie_dma_cmd; 279 + u32 reserved1; 280 + u64 pcie_dma_source_addr; 281 + u64 pcie_dma_dest_addr; 282 + u32 pcie_dma_len; 283 + u32 reserved2; 284 + u64 doorbell_addr; 285 + u8 doorbell_attr; 286 + u8 reserved3; 287 + u16 reserved4; 288 + u32 doorbell_data; 289 + u32 sem_cmd0; 290 + u32 sem_cmd1; 291 + u32 sem_cmd2; 292 + u32 sem_cmd3; 293 + }; 294 + 295 + Request field descriptions: 296 + 297 + req_id 298 + request ID. A request FIFO element and a response FIFO element with 299 + the same request ID refer to the same command. 300 + 301 + seq_id 302 + sequence ID within a request. Ignored by the DMA Bridge. 303 + 304 + pcie_dma_cmd 305 + describes the DMA element of this request. 306 + 307 + * Bit(7) is the force msi flag, which overrides the DMA Bridge MSI logic 308 + and generates an MSI when this request is complete, provided QSM has 309 + configured the DMA Bridge to look at this bit. 310 + * Bits(6:5) are reserved.
311 + * Bit(4) is the completion code flag, and indicates that the DMA Bridge 312 + shall generate a response FIFO element when this request is 313 + complete. 314 + * Bit(3) indicates if this request is a linked list transfer(0) or a bulk 315 + transfer(1). 316 + * Bit(2) is reserved. 317 + * Bits(1:0) indicate the type of transfer. No transfer(0), to device(1), 318 + from device(2). Value 3 is illegal. 319 + 320 + pcie_dma_source_addr 321 + source address for a bulk transfer, or the address of the linked list. 322 + 323 + pcie_dma_dest_addr 324 + destination address for a bulk transfer. 325 + 326 + pcie_dma_len 327 + length of the bulk transfer. Note that the size of this field 328 + limits transfers to 4G in size. 329 + 330 + doorbell_addr 331 + address of the doorbell to ring when this request is complete. 332 + 333 + doorbell_attr 334 + doorbell attributes. 335 + 336 + * Bit(7) indicates if a write to a doorbell is to occur. 337 + * Bits(6:2) are reserved. 338 + * Bits(1:0) contain the encoding of the doorbell length. 0 is 32-bit, 339 + 1 is 16-bit, 2 is 8-bit, 3 is reserved. The doorbell address 340 + must be naturally aligned to the specified length. 341 + 342 + doorbell_data 343 + data to write to the doorbell. Only the bits corresponding to 344 + the doorbell length are valid. 345 + 346 + sem_cmdN 347 + semaphore command. 348 + 349 + * Bit(31) indicates this semaphore command is enabled. 350 + * Bit(30) is the to-device DMA fence. Block this request until all 351 + to-device DMA transfers are complete. 352 + * Bit(29) is the from-device DMA fence. Block this request until all 353 + from-device DMA transfers are complete. 354 + * Bits(28:27) are reserved. 355 + * Bits(26:24) are the semaphore command. 0 is NOP. 1 is init with the 356 + specified value. 2 is increment. 3 is decrement. 4 is wait 357 + until the semaphore is equal to the specified value. 5 is wait 358 + until the semaphore is greater or equal to the specified value. 
359 + 6 is "P", wait until semaphore is greater than 0, then 360 + decrement by 1. 7 is reserved. 361 + * Bit(23) is reserved. 362 + * Bit(22) is the semaphore sync. 0 is post sync, which means that the 363 + semaphore operation is done after the DMA transfer. 1 is 364 + presync, which gates the DMA transfer. Only one presync is 365 + allowed per request. 366 + * Bit(21) is reserved. 367 + * Bits(20:16) are the index of the semaphore to operate on. 368 + * Bits(15:12) are reserved. 369 + * Bits(11:0) are the semaphore value to use in operations. 370 + 371 + Overall, a request is processed in 4 steps: 372 + 373 + 1. If specified, the presync semaphore condition must be true 374 + 2. If enabled, the DMA transfer occurs 375 + 3. If specified, the postsync semaphore conditions must be true 376 + 4. If enabled, the doorbell is written 377 + 378 + By using the semaphores in conjunction with the workload running on the NSPs, 379 + the data pipeline can be synchronized such that the host can queue multiple 380 + requests of data for the workload to process, but the DMA Bridge will only copy 381 + the data into the memory of the workload when the workload is ready to process 382 + the next input. 383 + 384 + Response FIFO 385 + ------------- 386 + 387 + Once a request is fully processed, a response FIFO element is generated if 388 + specified in pcie_dma_cmd. The structure of a response FIFO element: 389 + 390 + .. code-block:: c 391 + 392 + struct response_elem { 393 + u16 req_id; 394 + u16 completion_code; 395 + }; 396 + 397 + req_id 398 + matches the req_id of the request that generated this element. 399 + 400 + completion_code 401 + status of this request. 0 is success. Non-zero is an error. 402 + 403 + The DMA Bridge will generate an MSI to the host as a reaction to activity in the 404 + response FIFO of a DBC.
The DMA Bridge hardware has an IRQ storm mitigation 405 + algorithm, where it will only generate an MSI when the response FIFO transitions 406 + from empty to non-empty (unless force MSI is enabled and triggered). In 407 + response to this MSI, the host is expected to drain the response FIFO, and must 408 + take care to handle any race conditions between draining the FIFO, and the 409 + device inserting elements into the FIFO. 410 + 411 + Neural Network Control (NNC) Protocol 412 + ===================================== 413 + 414 + The NNC protocol is how the host makes requests to the QSM to manage workloads. 415 + It uses the QAIC_CONTROL MHI channel. 416 + 417 + Each NNC request is packaged into a message. Each message is a series of 418 + transactions. A passthrough type transaction can contain elements known as 419 + commands. 420 + 421 + QSM requires that NNC messages be little-endian encoded and the fields be naturally 422 + aligned. Since there are 64-bit elements in some NNC messages, 64-bit alignment 423 + must be maintained. 424 + 425 + A message contains a header and then a series of transactions. A message may be 426 + at most 4K in size from QSM to the host. From the host to the QSM, a message 427 + can be at most 64K (maximum size of a single MHI packet), but there is a 428 + continuation feature where message N+1 can be marked as a continuation of 429 + message N. This is used for exceedingly large DMA xfer transactions. 430 + 431 + Transaction descriptions 432 + ------------------------ 433 + 434 + passthrough 435 + Allows userspace to send an opaque payload directly to the QSM. 436 + This is used for NNC commands. Userspace is responsible for managing 437 + the QSM message requirements in the payload. 438 + 439 + dma_xfer 440 + DMA transfer. Describes an object that the QSM should DMA into the 441 + device via address and size tuples. 442 + 443 + activate 444 + Activate a workload onto NSPs. The host must provide memory to be 445 + used by the DBC.
446 + 447 + deactivate 448 + Deactivate an active workload and return the NSPs to idle. 449 + 450 + status 451 + Query the QSM about its NNC implementation. Returns the NNC version, 452 + and whether CRCs are used. 453 + 454 + terminate 455 + Release a user's resources. 456 + 457 + dma_xfer_cont 458 + Continuation of a previous DMA transfer. If a DMA transfer 459 + cannot be specified in a single message (highly fragmented), this 460 + transaction can be used to specify more ranges. 461 + 462 + validate_partition 463 + Query the QSM to determine if a partition identifier is valid. 464 + 465 + Each message is tagged with a user id, and a partition id. The user id allows 466 + QSM to track resources, and release them when the user goes away (e.g. the process 467 + crashes). A partition id identifies the resource partition that QSM manages, 468 + which this message applies to. 469 + 470 + Messages may have CRCs. Messages should have CRCs applied until the QSM 471 + reports via the status transaction that CRCs are not needed. The QSM on the 472 + SA9000P requires CRCs for black channel safing. 473 + 474 + Subsystem Restart (SSR) 475 + ======================= 476 + 477 + SSR is the concept of limiting the impact of an error. An AIC100 device may 478 + have multiple users, each with their own workload running. If the workload of 479 + one user crashes, the fallout of that should be limited to that workload and not 480 + impact other workloads. SSR accomplishes this. 481 + 482 + If a particular workload crashes, QSM notifies the host via the QAIC_SSR MHI 483 + channel. This notification identifies the workload by its assigned DBC. A 484 + multi-stage recovery process is then used to clean up both sides, and get the 485 + DBC/NSPs into a working state. 486 + 487 + When SSR occurs, any state in the workload is lost. Any inputs that were in 488 + process, or queued but not yet serviced, are lost.
The loaded artifacts will 489 + remain in on-card DDR, but the host will need to re-activate the workload if 490 + it desires to recover the workload. 491 + 492 + Reliability, Accessibility, Serviceability (RAS) 493 + ================================================ 494 + 495 + AIC100 is expected to be deployed in server systems where RAS ideology is 496 + applied. Simply put, RAS is the concept of detecting, classifying, and 497 + reporting errors. While PCIe has AER (Advanced Error Reporting) which factors 498 + into RAS, AER does not allow for a device to report details about internal 499 + errors. Therefore, AIC100 implements a custom RAS mechanism. When a RAS event 500 + occurs, QSM will report the event with appropriate details via the QAIC_STATUS 501 + MHI channel. A sysadmin may determine that a particular device needs 502 + additional service based on RAS reports. 503 + 504 + Telemetry 505 + ========= 506 + 507 + QSM has the ability to report various physical attributes of the device, and in 508 + some cases, to allow the host to control them. Examples include thermal limits, 509 + thermal readings, and power readings. These items are communicated via the 510 + QAIC_TELEMETRY MHI channel.
+13
Documentation/accel/qaic/index.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0-only 2 + 3 + ===================================== 4 + accel/qaic Qualcomm Cloud AI driver 5 + ===================================== 6 + 7 + The accel/qaic driver supports the Qualcomm Cloud AI machine learning 8 + accelerator cards. 9 + 10 + .. toctree:: 11 + 12 + qaic 13 + aic100
+170
Documentation/accel/qaic/qaic.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0-only 2 + 3 + ============= 4 + QAIC driver 5 + ============= 6 + 7 + The QAIC driver is the Kernel Mode Driver (KMD) for the AIC100 family of AI 8 + accelerator products. 9 + 10 + Interrupts 11 + ========== 12 + 13 + While the AIC100 DMA Bridge hardware implements an IRQ storm mitigation 14 + mechanism, it is still possible for an IRQ storm to occur. A storm can happen 15 + if the workload is particularly quick, and the host is responsive. If the host 16 + can drain the response FIFO as quickly as the device can insert elements into 17 + it, then the device will frequently transition the response FIFO from empty to 18 + non-empty and generate MSIs at a rate equivalent to the speed of the 19 + workload's ability to process inputs. The lprnet (license plate reader network) 20 + workload is known to trigger this condition, and can generate in excess of 100k 21 + MSIs per second. It has been observed that most systems cannot tolerate this 22 + for long, and will crash under some form of watchdog due to the overhead of 23 + the interrupt controller interrupting the host CPU. 24 + 25 + To mitigate this issue, the QAIC driver implements specific IRQ handling. When 26 + QAIC receives an IRQ, it disables that line. This prevents the interrupt 27 + controller from interrupting the CPU. Then QAIC drains the FIFO. Once the FIFO 28 + is drained, QAIC implements a "last chance" polling algorithm where QAIC will 29 + sleep for a time to see if the workload will generate more activity. The IRQ 30 + line remains disabled during this time. If no activity is detected, QAIC exits 31 + polling mode and re-enables the IRQ line. 32 + 33 + This mitigation in QAIC is very effective. The same lprnet use case that 34 + generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64 35 + IRQs over 5 minutes while keeping the host system stable, and having the same 36 + workload throughput performance (within run to run noise variation).
37 + 38 + 39 + Neural Network Control (NNC) Protocol 40 + ===================================== 41 + 42 + The implementation of NNC is split between the KMD (QAIC) and UMD. In general 43 + QAIC understands how to encode/decode NNC wire protocol, and elements of the 44 + protocol which require kernel space knowledge to process (for example, mapping 45 + host memory to device IOVAs). QAIC understands the structure of a message, and 46 + all of the transactions. QAIC does not understand commands (the payload of a 47 + passthrough transaction). 48 + 49 + QAIC handles and enforces the required little endianness and 64-bit alignment, 50 + to the degree that it can. Since QAIC does not know the contents of a 51 + passthrough transaction, it relies on the UMD to satisfy the requirements. 52 + 53 + The terminate transaction is of particular use to QAIC. QAIC is not aware of 54 + the resources that are loaded onto a device since the majority of that activity 55 + occurs within NNC commands. As a result, QAIC does not have the means to 56 + roll back userspace activity. To ensure that a userspace client's resources 57 + are fully released in the case of a process crash, or a bug, QAIC uses the 58 + terminate command to let QSM know when a user has gone away, and the resources 59 + can be released. 60 + 61 + QSM can report a version number of the NNC protocol it supports. This is in the 62 + form of a Major number and a Minor number. 63 + 64 + Major number updates indicate changes to the NNC protocol which impact the 65 + message format, or transactions (impacts QAIC). 66 + 67 + Minor number updates indicate changes to the NNC protocol which impact the 68 + commands (does not impact QAIC). 69 + 70 + uAPI 71 + ==== 72 + 73 + QAIC defines a number of driver specific IOCTLs as part of the userspace API. 74 + This section describes those APIs. 75 + 76 + DRM_IOCTL_QAIC_MANAGE 77 + This IOCTL allows userspace to send a NNC request to the QSM. 
The call will 78 + block until a response is received, or the request has timed out. 79 + 80 + DRM_IOCTL_QAIC_CREATE_BO 81 + This IOCTL allows userspace to allocate a buffer object (BO) which can send 82 + or receive data from a workload. The call will return a GEM handle that 83 + represents the allocated buffer. The BO is not usable until it has been 84 + sliced (see DRM_IOCTL_QAIC_ATTACH_SLICE_BO). 85 + 86 + DRM_IOCTL_QAIC_MMAP_BO 87 + This IOCTL allows userspace to prepare an allocated BO to be mmap'd into the 88 + userspace process. 89 + 90 + DRM_IOCTL_QAIC_ATTACH_SLICE_BO 91 + This IOCTL allows userspace to slice a BO in preparation for sending the BO 92 + to the device. Slicing is the operation of describing what portions of a BO 93 + get sent where to a workload. This requires a set of DMA transfers for the 94 + DMA Bridge, and as such, locks the BO to a specific DBC. 95 + 96 + DRM_IOCTL_QAIC_EXECUTE_BO 97 + This IOCTL allows userspace to submit a set of sliced BOs to the device. The 98 + call is non-blocking. Success only indicates that the BOs have been queued 99 + to the device, but does not guarantee they have been executed. 100 + 101 + DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO 102 + This IOCTL operates like DRM_IOCTL_QAIC_EXECUTE_BO, but it allows userspace 103 + to shrink the BOs sent to the device for this specific call. If a BO 104 + typically has N inputs, but only a subset of those is available, this IOCTL 105 + allows userspace to indicate that only the first M bytes of the BO should be 106 + sent to the device to minimize data transfer overhead. This IOCTL dynamically 107 + recomputes the slicing, and therefore has some processing overhead before the 108 + BOs can be queued to the device. 109 + 110 + DRM_IOCTL_QAIC_WAIT_BO 111 + This IOCTL allows userspace to determine when a particular BO has been 112 + processed by the device. The call will block until either the BO has been 113 + processed and can be re-queued to the device, or a timeout occurs. 
114 + 115 + DRM_IOCTL_QAIC_PERF_STATS_BO 116 + This IOCTL allows userspace to collect performance statistics on the most 117 + recent execution of a BO. This allows userspace to construct an end-to-end 118 + timeline of the BO processing for a performance analysis. 119 + 120 + DRM_IOCTL_QAIC_PART_DEV 121 + This IOCTL allows userspace to request a duplicate "shadow device". This extra 122 + accelN device is associated with a specific partition of resources on the 123 + AIC100 device and can be used for limiting a process to some subset of 124 + resources. 125 + 126 + Userspace Client Isolation 127 + ========================== 128 + 129 + AIC100 supports multiple clients. Multiple DBCs can be consumed by a single 130 + client, and multiple clients can each consume one or more DBCs. Workloads 131 + may contain sensitive information; therefore, only the client that owns the 132 + workload should be allowed to interface with the DBC. 133 + 134 + Clients are identified by the instance associated with their open(). A client 135 + may only use memory it allocates, and DBCs that are assigned to its 136 + workloads. Attempts to access resources assigned to other clients will be 137 + rejected. 138 + 139 + Module parameters 140 + ================= 141 + 142 + QAIC supports the following module parameters: 143 + 144 + **datapath_polling (bool)** 145 + 146 + Configures QAIC to use a polling thread for datapath events instead of relying 147 + on the device interrupts. Useful for platforms with broken multiMSI. Must be 148 + set at QAIC driver initialization. Default is 0 (off). 149 + 150 + **mhi_timeout_ms (unsigned int)** 151 + 152 + Sets the timeout value for MHI operations in milliseconds (ms). Must be set 153 + at the time the driver detects a device. Default is 2000 (2 seconds). 154 + 155 + **control_resp_timeout_s (unsigned int)** 156 + 157 + Sets the timeout value for QSM responses to NNC messages in seconds (s).
Must 158 + be set at the time the driver is sending a request to QSM. Default is 60 (one 159 + minute). 160 + 161 + **wait_exec_default_timeout_ms (unsigned int)** 162 + 163 + Sets the default timeout for the wait_exec ioctl in milliseconds (ms). Must be 164 + set prior to the wait_exec ioctl call. A value specified in the ioctl call 165 + overrides this for that call. Default is 5000 (5 seconds). 166 + 167 + **datapath_poll_interval_us (unsigned int)** 168 + 169 + Sets the polling interval in microseconds (us) when datapath polling is active. 170 + Takes effect at the next polling interval. Default is 100 (100 us).
+9
Documentation/devicetree/bindings/display/panel/elida,kd35t133.yaml
··· 17 17 const: elida,kd35t133 18 18 reg: true 19 19 backlight: true 20 + port: true 20 21 reset-gpios: true 22 + rotation: true 21 23 iovcc-supply: 22 24 description: regulator that supplies the iovcc voltage 23 25 vdd-supply: ··· 29 27 - compatible 30 28 - reg 31 29 - backlight 30 + - port 32 31 - iovcc-supply 33 32 - vdd-supply 34 33 ··· 46 43 backlight = <&backlight>; 47 44 iovcc-supply = <&vcc_1v8>; 48 45 vdd-supply = <&vcc3v3_lcd>; 46 + 47 + port { 48 + mipi_in_panel: endpoint { 49 + remote-endpoint = <&mipi_out_panel>; 50 + }; 51 + }; 49 52 }; 50 53 }; 51 54
+8
Documentation/devicetree/bindings/display/panel/feiyang,fy07024di26a30d.yaml
··· 26 26 dvdd-supply: 27 27 description: 3v3 digital regulator 28 28 29 + port: true 29 30 reset-gpios: true 30 31 31 32 backlight: true ··· 36 35 - reg 37 36 - avdd-supply 38 37 - dvdd-supply 38 + - port 39 39 40 40 additionalProperties: false 41 41 ··· 55 53 dvdd-supply = <&reg_dldo2>; 56 54 reset-gpios = <&pio 3 24 GPIO_ACTIVE_HIGH>; /* LCD-RST: PD24 */ 57 55 backlight = <&backlight>; 56 + 57 + port { 58 + mipi_in_panel: endpoint { 59 + remote-endpoint = <&mipi_out_panel>; 60 + }; 61 + }; 58 62 }; 59 63 };
+23 -23
Documentation/devicetree/bindings/display/panel/panel-timing.yaml
··· 17 17 18 18 The parameters are defined as seen in the following illustration. 19 19 20 - +----------+-------------------------------------+----------+-------+ 21 - | | ^ | | | 22 - | | |vback_porch | | | 23 - | | v | | | 24 - +----------#######################################----------+-------+ 25 - | # ^ # | | 26 - | # | # | | 27 - | hback # | # hfront | hsync | 28 - | porch # | hactive # porch | len | 29 - |<-------->#<-------+--------------------------->#<-------->|<----->| 30 - | # | # | | 31 - | # |vactive # | | 32 - | # | # | | 33 - | # v # | | 34 - +----------#######################################----------+-------+ 35 - | | ^ | | | 36 - | | |vfront_porch | | | 37 - | | v | | | 38 - +----------+-------------------------------------+----------+-------+ 39 - | | ^ | | | 40 - | | |vsync_len | | | 41 - | | v | | | 42 - +----------+-------------------------------------+----------+-------+ 20 + +-------+----------+-------------------------------------+----------+ 21 + | | | ^ | | 22 + | | | |vsync_len | | 23 + | | | v | | 24 + +-------+----------+-------------------------------------+----------+ 25 + | | | ^ | | 26 + | | | |vback_porch | | 27 + | | | v | | 28 + +-------+----------#######################################----------+ 29 + | | # ^ # | 30 + | | # | # | 31 + | hsync | hback # | # hfront | 32 + | len | porch # | hactive # porch | 33 + |<----->|<-------->#<-------+--------------------------->#<-------->| 34 + | | # | # | 35 + | | # |vactive # | 36 + | | # | # | 37 + | | # v # | 38 + +-------+----------#######################################----------+ 39 + | | | ^ | | 40 + | | | |vfront_porch | | 41 + | | | v | | 42 + +-------+----------+-------------------------------------+----------+ 43 43 44 44 45 45 The following is the panel timings shown with time on the x-axis.
+9
Documentation/devicetree/bindings/display/panel/sitronix,st7701.yaml
··· 42 42 IOVCC-supply: 43 43 description: I/O system regulator 44 44 45 + port: true 45 46 reset-gpios: true 47 + rotation: true 46 48 47 49 backlight: true 48 50 ··· 53 51 - reg 54 52 - VCC-supply 55 53 - IOVCC-supply 54 + - port 56 55 - reset-gpios 57 56 58 57 additionalProperties: false ··· 73 70 IOVCC-supply = <&reg_dldo2>; 74 71 reset-gpios = <&pio 3 24 GPIO_ACTIVE_HIGH>; /* LCD-RST: PD24 */ 75 72 backlight = <&backlight>; 73 + 74 + port { 75 + mipi_in_panel: endpoint { 76 + remote-endpoint = <&mipi_out_panel>; 77 + }; 78 + }; 76 79 }; 77 80 };
+4
Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml
··· 26 26 spi-cpha: true 27 27 spi-cpol: true 28 28 29 + dc-gpios: 30 + maxItems: 1 31 + description: DCX pin, Display data/command selection pin in parallel interface 32 + 29 33 required: 30 34 - compatible 31 35 - reg
+8
Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml
··· 17 17 const: xinpeng,xpp055c272 18 18 reg: true 19 19 backlight: true 20 + port: true 20 21 reset-gpios: true 21 22 iovcc-supply: 22 23 description: regulator that supplies the iovcc voltage ··· 28 27 - compatible 29 28 - reg 30 29 - backlight 30 + - port 31 31 - iovcc-supply 32 32 - vci-supply 33 33 ··· 46 44 backlight = <&backlight>; 47 45 iovcc-supply = <&vcc_1v8>; 48 46 vci-supply = <&vcc3v3_lcd>; 47 + 48 + port { 49 + mipi_in_panel: endpoint { 50 + remote-endpoint = <&mipi_out_panel>; 51 + }; 52 + }; 49 53 }; 50 54 }; 51 55
+10
MAINTAINERS
··· 17265 17265 F: drivers/clk/qcom/ 17266 17266 F: include/dt-bindings/clock/qcom,* 17267 17267 17268 + QUALCOMM CLOUD AI (QAIC) DRIVER 17269 + M: Jeffrey Hugo <quic_jhugo@quicinc.com> 17270 + L: linux-arm-msm@vger.kernel.org 17271 + L: dri-devel@lists.freedesktop.org 17272 + S: Supported 17273 + T: git git://anongit.freedesktop.org/drm/drm-misc 17274 + F: Documentation/accel/qaic/ 17275 + F: drivers/accel/qaic/ 17276 + F: include/uapi/drm/qaic_accel.h 17277 + 17268 17278 QUALCOMM CORE POWER REDUCTION (CPR) AVS DRIVER 17269 17279 M: Bjorn Andersson <andersson@kernel.org> 17270 17280 M: Konrad Dybcio <konrad.dybcio@linaro.org>
+1
drivers/accel/Kconfig
··· 26 26 27 27 source "drivers/accel/habanalabs/Kconfig" 28 28 source "drivers/accel/ivpu/Kconfig" 29 + source "drivers/accel/qaic/Kconfig" 29 30 30 31 endif
+1
drivers/accel/Makefile
··· 2 2 3 3 obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ 4 4 obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ 5 + obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
+4
drivers/accel/ivpu/ivpu_drv.c
··· 433 433 /* Clear any pending errors */ 434 434 pcie_capability_clear_word(pdev, PCI_EXP_DEVSTA, 0x3f); 435 435 436 + /* VPU MTL does not require the PCI spec 10 ms D3hot delay */ 437 + if (ivpu_is_mtl(vdev)) 438 + pdev->d3hot_delay = 0; 439 + 436 440 ret = pcim_enable_device(pdev); 437 441 if (ret) { 438 442 ivpu_err(vdev, "Failed to enable PCI device: %d\n", ret);
+23
drivers/accel/qaic/Kconfig
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + # 3 + # Qualcomm Cloud AI accelerators driver 4 + # 5 + 6 + config DRM_ACCEL_QAIC 7 + tristate "Qualcomm Cloud AI accelerators" 8 + depends on DRM_ACCEL 9 + depends on PCI && HAS_IOMEM 10 + depends on MHI_BUS 11 + depends on MMU 12 + select CRC32 13 + help 14 + Enables driver for Qualcomm's Cloud AI accelerator PCIe cards that are 15 + designed to accelerate Deep Learning inference workloads. 16 + 17 + The driver manages the PCIe devices and provides an IOCTL interface 18 + for users to submit workloads to the devices. 19 + 20 + If unsure, say N. 21 + 22 + To compile this driver as a module, choose M here: the 23 + module will be called qaic.
+13
drivers/accel/qaic/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + # 3 + # Makefile for Qualcomm Cloud AI accelerators driver 4 + # 5 + 6 + obj-$(CONFIG_DRM_ACCEL_QAIC) := qaic.o 7 + 8 + qaic-y := \ 9 + mhi_controller.o \ 10 + mhi_qaic_ctrl.o \ 11 + qaic_control.o \ 12 + qaic_data.o \ 13 + qaic_drv.o
+563
drivers/accel/qaic/mhi_controller.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + 3 + /* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ 4 + /* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 5 + 6 + #include <linux/delay.h> 7 + #include <linux/err.h> 8 + #include <linux/memblock.h> 9 + #include <linux/mhi.h> 10 + #include <linux/moduleparam.h> 11 + #include <linux/pci.h> 12 + #include <linux/sizes.h> 13 + 14 + #include "mhi_controller.h" 15 + #include "qaic.h" 16 + 17 + #define MAX_RESET_TIME_SEC 25 18 + 19 + static unsigned int mhi_timeout_ms = 2000; /* 2 sec default */ 20 + module_param(mhi_timeout_ms, uint, 0600); 21 + MODULE_PARM_DESC(mhi_timeout_ms, "MHI controller timeout value"); 22 + 23 + static struct mhi_channel_config aic100_channels[] = { 24 + { 25 + .name = "QAIC_LOOPBACK", 26 + .num = 0, 27 + .num_elements = 32, 28 + .local_elements = 0, 29 + .event_ring = 0, 30 + .dir = DMA_TO_DEVICE, 31 + .ee_mask = MHI_CH_EE_AMSS, 32 + .pollcfg = 0, 33 + .doorbell = MHI_DB_BRST_DISABLE, 34 + .lpm_notify = false, 35 + .offload_channel = false, 36 + .doorbell_mode_switch = false, 37 + .auto_queue = false, 38 + .wake_capable = false, 39 + }, 40 + { 41 + .name = "QAIC_LOOPBACK", 42 + .num = 1, 43 + .num_elements = 32, 44 + .local_elements = 0, 45 + .event_ring = 0, 46 + .dir = DMA_FROM_DEVICE, 47 + .ee_mask = MHI_CH_EE_AMSS, 48 + .pollcfg = 0, 49 + .doorbell = MHI_DB_BRST_DISABLE, 50 + .lpm_notify = false, 51 + .offload_channel = false, 52 + .doorbell_mode_switch = false, 53 + .auto_queue = false, 54 + .wake_capable = false, 55 + }, 56 + { 57 + .name = "QAIC_SAHARA", 58 + .num = 2, 59 + .num_elements = 32, 60 + .local_elements = 0, 61 + .event_ring = 0, 62 + .dir = DMA_TO_DEVICE, 63 + .ee_mask = MHI_CH_EE_SBL, 64 + .pollcfg = 0, 65 + .doorbell = MHI_DB_BRST_DISABLE, 66 + .lpm_notify = false, 67 + .offload_channel = false, 68 + .doorbell_mode_switch = false, 69 + .auto_queue = false, 70 + .wake_capable = false, 71 + }, 72 + { 73 + .name = 
"QAIC_SAHARA", 74 + .num = 3, 75 + .num_elements = 32, 76 + .local_elements = 0, 77 + .event_ring = 0, 78 + .dir = DMA_FROM_DEVICE, 79 + .ee_mask = MHI_CH_EE_SBL, 80 + .pollcfg = 0, 81 + .doorbell = MHI_DB_BRST_DISABLE, 82 + .lpm_notify = false, 83 + .offload_channel = false, 84 + .doorbell_mode_switch = false, 85 + .auto_queue = false, 86 + .wake_capable = false, 87 + }, 88 + { 89 + .name = "QAIC_DIAG", 90 + .num = 4, 91 + .num_elements = 32, 92 + .local_elements = 0, 93 + .event_ring = 0, 94 + .dir = DMA_TO_DEVICE, 95 + .ee_mask = MHI_CH_EE_AMSS, 96 + .pollcfg = 0, 97 + .doorbell = MHI_DB_BRST_DISABLE, 98 + .lpm_notify = false, 99 + .offload_channel = false, 100 + .doorbell_mode_switch = false, 101 + .auto_queue = false, 102 + .wake_capable = false, 103 + }, 104 + { 105 + .name = "QAIC_DIAG", 106 + .num = 5, 107 + .num_elements = 32, 108 + .local_elements = 0, 109 + .event_ring = 0, 110 + .dir = DMA_FROM_DEVICE, 111 + .ee_mask = MHI_CH_EE_AMSS, 112 + .pollcfg = 0, 113 + .doorbell = MHI_DB_BRST_DISABLE, 114 + .lpm_notify = false, 115 + .offload_channel = false, 116 + .doorbell_mode_switch = false, 117 + .auto_queue = false, 118 + .wake_capable = false, 119 + }, 120 + { 121 + .name = "QAIC_SSR", 122 + .num = 6, 123 + .num_elements = 32, 124 + .local_elements = 0, 125 + .event_ring = 0, 126 + .dir = DMA_TO_DEVICE, 127 + .ee_mask = MHI_CH_EE_AMSS, 128 + .pollcfg = 0, 129 + .doorbell = MHI_DB_BRST_DISABLE, 130 + .lpm_notify = false, 131 + .offload_channel = false, 132 + .doorbell_mode_switch = false, 133 + .auto_queue = false, 134 + .wake_capable = false, 135 + }, 136 + { 137 + .name = "QAIC_SSR", 138 + .num = 7, 139 + .num_elements = 32, 140 + .local_elements = 0, 141 + .event_ring = 0, 142 + .dir = DMA_FROM_DEVICE, 143 + .ee_mask = MHI_CH_EE_AMSS, 144 + .pollcfg = 0, 145 + .doorbell = MHI_DB_BRST_DISABLE, 146 + .lpm_notify = false, 147 + .offload_channel = false, 148 + .doorbell_mode_switch = false, 149 + .auto_queue = false, 150 + .wake_capable = false, 151 + }, 
152 + { 153 + .name = "QAIC_QDSS", 154 + .num = 8, 155 + .num_elements = 32, 156 + .local_elements = 0, 157 + .event_ring = 0, 158 + .dir = DMA_TO_DEVICE, 159 + .ee_mask = MHI_CH_EE_AMSS, 160 + .pollcfg = 0, 161 + .doorbell = MHI_DB_BRST_DISABLE, 162 + .lpm_notify = false, 163 + .offload_channel = false, 164 + .doorbell_mode_switch = false, 165 + .auto_queue = false, 166 + .wake_capable = false, 167 + }, 168 + { 169 + .name = "QAIC_QDSS", 170 + .num = 9, 171 + .num_elements = 32, 172 + .local_elements = 0, 173 + .event_ring = 0, 174 + .dir = DMA_FROM_DEVICE, 175 + .ee_mask = MHI_CH_EE_AMSS, 176 + .pollcfg = 0, 177 + .doorbell = MHI_DB_BRST_DISABLE, 178 + .lpm_notify = false, 179 + .offload_channel = false, 180 + .doorbell_mode_switch = false, 181 + .auto_queue = false, 182 + .wake_capable = false, 183 + }, 184 + { 185 + .name = "QAIC_CONTROL", 186 + .num = 10, 187 + .num_elements = 128, 188 + .local_elements = 0, 189 + .event_ring = 0, 190 + .dir = DMA_TO_DEVICE, 191 + .ee_mask = MHI_CH_EE_AMSS, 192 + .pollcfg = 0, 193 + .doorbell = MHI_DB_BRST_DISABLE, 194 + .lpm_notify = false, 195 + .offload_channel = false, 196 + .doorbell_mode_switch = false, 197 + .auto_queue = false, 198 + .wake_capable = false, 199 + }, 200 + { 201 + .name = "QAIC_CONTROL", 202 + .num = 11, 203 + .num_elements = 128, 204 + .local_elements = 0, 205 + .event_ring = 0, 206 + .dir = DMA_FROM_DEVICE, 207 + .ee_mask = MHI_CH_EE_AMSS, 208 + .pollcfg = 0, 209 + .doorbell = MHI_DB_BRST_DISABLE, 210 + .lpm_notify = false, 211 + .offload_channel = false, 212 + .doorbell_mode_switch = false, 213 + .auto_queue = false, 214 + .wake_capable = false, 215 + }, 216 + { 217 + .name = "QAIC_LOGGING", 218 + .num = 12, 219 + .num_elements = 32, 220 + .local_elements = 0, 221 + .event_ring = 0, 222 + .dir = DMA_TO_DEVICE, 223 + .ee_mask = MHI_CH_EE_SBL, 224 + .pollcfg = 0, 225 + .doorbell = MHI_DB_BRST_DISABLE, 226 + .lpm_notify = false, 227 + .offload_channel = false, 228 + .doorbell_mode_switch = false, 229 + 
.auto_queue = false, 230 + .wake_capable = false, 231 + }, 232 + { 233 + .name = "QAIC_LOGGING", 234 + .num = 13, 235 + .num_elements = 32, 236 + .local_elements = 0, 237 + .event_ring = 0, 238 + .dir = DMA_FROM_DEVICE, 239 + .ee_mask = MHI_CH_EE_SBL, 240 + .pollcfg = 0, 241 + .doorbell = MHI_DB_BRST_DISABLE, 242 + .lpm_notify = false, 243 + .offload_channel = false, 244 + .doorbell_mode_switch = false, 245 + .auto_queue = false, 246 + .wake_capable = false, 247 + }, 248 + { 249 + .name = "QAIC_STATUS", 250 + .num = 14, 251 + .num_elements = 32, 252 + .local_elements = 0, 253 + .event_ring = 0, 254 + .dir = DMA_TO_DEVICE, 255 + .ee_mask = MHI_CH_EE_AMSS, 256 + .pollcfg = 0, 257 + .doorbell = MHI_DB_BRST_DISABLE, 258 + .lpm_notify = false, 259 + .offload_channel = false, 260 + .doorbell_mode_switch = false, 261 + .auto_queue = false, 262 + .wake_capable = false, 263 + }, 264 + { 265 + .name = "QAIC_STATUS", 266 + .num = 15, 267 + .num_elements = 32, 268 + .local_elements = 0, 269 + .event_ring = 0, 270 + .dir = DMA_FROM_DEVICE, 271 + .ee_mask = MHI_CH_EE_AMSS, 272 + .pollcfg = 0, 273 + .doorbell = MHI_DB_BRST_DISABLE, 274 + .lpm_notify = false, 275 + .offload_channel = false, 276 + .doorbell_mode_switch = false, 277 + .auto_queue = false, 278 + .wake_capable = false, 279 + }, 280 + { 281 + .name = "QAIC_TELEMETRY", 282 + .num = 16, 283 + .num_elements = 32, 284 + .local_elements = 0, 285 + .event_ring = 0, 286 + .dir = DMA_TO_DEVICE, 287 + .ee_mask = MHI_CH_EE_AMSS, 288 + .pollcfg = 0, 289 + .doorbell = MHI_DB_BRST_DISABLE, 290 + .lpm_notify = false, 291 + .offload_channel = false, 292 + .doorbell_mode_switch = false, 293 + .auto_queue = false, 294 + .wake_capable = false, 295 + }, 296 + { 297 + .name = "QAIC_TELEMETRY", 298 + .num = 17, 299 + .num_elements = 32, 300 + .local_elements = 0, 301 + .event_ring = 0, 302 + .dir = DMA_FROM_DEVICE, 303 + .ee_mask = MHI_CH_EE_AMSS, 304 + .pollcfg = 0, 305 + .doorbell = MHI_DB_BRST_DISABLE, 306 + .lpm_notify = false, 307 + 
.offload_channel = false, 308 + .doorbell_mode_switch = false, 309 + .auto_queue = false, 310 + .wake_capable = false, 311 + }, 312 + { 313 + .name = "QAIC_DEBUG", 314 + .num = 18, 315 + .num_elements = 32, 316 + .local_elements = 0, 317 + .event_ring = 0, 318 + .dir = DMA_TO_DEVICE, 319 + .ee_mask = MHI_CH_EE_AMSS, 320 + .pollcfg = 0, 321 + .doorbell = MHI_DB_BRST_DISABLE, 322 + .lpm_notify = false, 323 + .offload_channel = false, 324 + .doorbell_mode_switch = false, 325 + .auto_queue = false, 326 + .wake_capable = false, 327 + }, 328 + { 329 + .name = "QAIC_DEBUG", 330 + .num = 19, 331 + .num_elements = 32, 332 + .local_elements = 0, 333 + .event_ring = 0, 334 + .dir = DMA_FROM_DEVICE, 335 + .ee_mask = MHI_CH_EE_AMSS, 336 + .pollcfg = 0, 337 + .doorbell = MHI_DB_BRST_DISABLE, 338 + .lpm_notify = false, 339 + .offload_channel = false, 340 + .doorbell_mode_switch = false, 341 + .auto_queue = false, 342 + .wake_capable = false, 343 + }, 344 + { 345 + .name = "QAIC_TIMESYNC", 346 + .num = 20, 347 + .num_elements = 32, 348 + .local_elements = 0, 349 + .event_ring = 0, 350 + .dir = DMA_TO_DEVICE, 351 + .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, 352 + .pollcfg = 0, 353 + .doorbell = MHI_DB_BRST_DISABLE, 354 + .lpm_notify = false, 355 + .offload_channel = false, 356 + .doorbell_mode_switch = false, 357 + .auto_queue = false, 358 + .wake_capable = false, 359 + }, 360 + { 361 + .num = 21, 362 + .name = "QAIC_TIMESYNC", 363 + .num_elements = 32, 364 + .local_elements = 0, 365 + .event_ring = 0, 366 + .dir = DMA_FROM_DEVICE, 367 + .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, 368 + .pollcfg = 0, 369 + .doorbell = MHI_DB_BRST_DISABLE, 370 + .lpm_notify = false, 371 + .offload_channel = false, 372 + .doorbell_mode_switch = false, 373 + .auto_queue = false, 374 + .wake_capable = false, 375 + }, 376 + }; 377 + 378 + static struct mhi_event_config aic100_events[] = { 379 + { 380 + .num_elements = 32, 381 + .irq_moderation_ms = 0, 382 + .irq = 0, 383 + .channel = U32_MAX, 384 + 
.priority = 1, 385 + .mode = MHI_DB_BRST_DISABLE, 386 + .data_type = MHI_ER_CTRL, 387 + .hardware_event = false, 388 + .client_managed = false, 389 + .offload_channel = false, 390 + }, 391 + }; 392 + 393 + static struct mhi_controller_config aic100_config = { 394 + .max_channels = 128, 395 + .timeout_ms = 0, /* controlled by mhi_timeout */ 396 + .buf_len = 0, 397 + .num_channels = ARRAY_SIZE(aic100_channels), 398 + .ch_cfg = aic100_channels, 399 + .num_events = ARRAY_SIZE(aic100_events), 400 + .event_cfg = aic100_events, 401 + .use_bounce_buf = false, 402 + .m2_no_db = false, 403 + }; 404 + 405 + static int mhi_read_reg(struct mhi_controller *mhi_cntrl, void __iomem *addr, u32 *out) 406 + { 407 + u32 tmp = readl_relaxed(addr); 408 + 409 + if (tmp == U32_MAX) 410 + return -EIO; 411 + 412 + *out = tmp; 413 + 414 + return 0; 415 + } 416 + 417 + static void mhi_write_reg(struct mhi_controller *mhi_cntrl, void __iomem *addr, u32 val) 418 + { 419 + writel_relaxed(val, addr); 420 + } 421 + 422 + static int mhi_runtime_get(struct mhi_controller *mhi_cntrl) 423 + { 424 + return 0; 425 + } 426 + 427 + static void mhi_runtime_put(struct mhi_controller *mhi_cntrl) 428 + { 429 + } 430 + 431 + static void mhi_status_cb(struct mhi_controller *mhi_cntrl, enum mhi_callback reason) 432 + { 433 + struct qaic_device *qdev = pci_get_drvdata(to_pci_dev(mhi_cntrl->cntrl_dev)); 434 + 435 + /* this event occurs in atomic context */ 436 + if (reason == MHI_CB_FATAL_ERROR) 437 + pci_err(qdev->pdev, "Fatal error received from device. 
Attempting to recover\n"); 438 + /* this event occurs in non-atomic context */ 439 + if (reason == MHI_CB_SYS_ERROR) 440 + qaic_dev_reset_clean_local_state(qdev, true); 441 + } 442 + 443 + static int mhi_reset_and_async_power_up(struct mhi_controller *mhi_cntrl) 444 + { 445 + u8 time_sec = 1; 446 + int current_ee; 447 + int ret; 448 + 449 + /* Reset the device to bring the device in PBL EE */ 450 + mhi_soc_reset(mhi_cntrl); 451 + 452 + /* 453 + * Keep checking the execution environment(EE) after every 1 second 454 + * interval. 455 + */ 456 + do { 457 + msleep(1000); 458 + current_ee = mhi_get_exec_env(mhi_cntrl); 459 + } while (current_ee != MHI_EE_PBL && time_sec++ <= MAX_RESET_TIME_SEC); 460 + 461 + /* If the device is in PBL EE retry power up */ 462 + if (current_ee == MHI_EE_PBL) 463 + ret = mhi_async_power_up(mhi_cntrl); 464 + else 465 + ret = -EIO; 466 + 467 + return ret; 468 + } 469 + 470 + struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, 471 + int mhi_irq) 472 + { 473 + struct mhi_controller *mhi_cntrl; 474 + int ret; 475 + 476 + mhi_cntrl = devm_kzalloc(&pci_dev->dev, sizeof(*mhi_cntrl), GFP_KERNEL); 477 + if (!mhi_cntrl) 478 + return ERR_PTR(-ENOMEM); 479 + 480 + mhi_cntrl->cntrl_dev = &pci_dev->dev; 481 + 482 + /* 483 + * Covers the entire possible physical ram region. Remote side is 484 + * going to calculate a size of this range, so subtract 1 to prevent 485 + * rollover. 
486 + */ 487 + mhi_cntrl->iova_start = 0; 488 + mhi_cntrl->iova_stop = PHYS_ADDR_MAX - 1; 489 + mhi_cntrl->status_cb = mhi_status_cb; 490 + mhi_cntrl->runtime_get = mhi_runtime_get; 491 + mhi_cntrl->runtime_put = mhi_runtime_put; 492 + mhi_cntrl->read_reg = mhi_read_reg; 493 + mhi_cntrl->write_reg = mhi_write_reg; 494 + mhi_cntrl->regs = mhi_bar; 495 + mhi_cntrl->reg_len = SZ_4K; 496 + mhi_cntrl->nr_irqs = 1; 497 + mhi_cntrl->irq = devm_kmalloc(&pci_dev->dev, sizeof(*mhi_cntrl->irq), GFP_KERNEL); 498 + 499 + if (!mhi_cntrl->irq) 500 + return ERR_PTR(-ENOMEM); 501 + 502 + mhi_cntrl->irq[0] = mhi_irq; 503 + mhi_cntrl->fw_image = "qcom/aic100/sbl.bin"; 504 + 505 + /* use latest configured timeout */ 506 + aic100_config.timeout_ms = mhi_timeout_ms; 507 + ret = mhi_register_controller(mhi_cntrl, &aic100_config); 508 + if (ret) { 509 + pci_err(pci_dev, "mhi_register_controller failed %d\n", ret); 510 + return ERR_PTR(ret); 511 + } 512 + 513 + ret = mhi_prepare_for_power_up(mhi_cntrl); 514 + if (ret) { 515 + pci_err(pci_dev, "mhi_prepare_for_power_up failed %d\n", ret); 516 + goto prepare_power_up_fail; 517 + } 518 + 519 + ret = mhi_async_power_up(mhi_cntrl); 520 + /* 521 + * If EIO is returned it is possible that device is in SBL EE, which is 522 + * undesired. SOC reset the device and try to power up again. 523 + */ 524 + if (ret == -EIO && MHI_EE_SBL == mhi_get_exec_env(mhi_cntrl)) { 525 + pci_err(pci_dev, "Found device in SBL at MHI init. 
Attempting a reset.\n"); 526 + ret = mhi_reset_and_async_power_up(mhi_cntrl); 527 + } 528 + 529 + if (ret) { 530 + pci_err(pci_dev, "mhi_async_power_up failed %d\n", ret); 531 + goto power_up_fail; 532 + } 533 + 534 + return mhi_cntrl; 535 + 536 + power_up_fail: 537 + mhi_unprepare_after_power_down(mhi_cntrl); 538 + prepare_power_up_fail: 539 + mhi_unregister_controller(mhi_cntrl); 540 + return ERR_PTR(ret); 541 + } 542 + 543 + void qaic_mhi_free_controller(struct mhi_controller *mhi_cntrl, bool link_up) 544 + { 545 + mhi_power_down(mhi_cntrl, link_up); 546 + mhi_unprepare_after_power_down(mhi_cntrl); 547 + mhi_unregister_controller(mhi_cntrl); 548 + } 549 + 550 + void qaic_mhi_start_reset(struct mhi_controller *mhi_cntrl) 551 + { 552 + mhi_power_down(mhi_cntrl, true); 553 + } 554 + 555 + void qaic_mhi_reset_done(struct mhi_controller *mhi_cntrl) 556 + { 557 + struct pci_dev *pci_dev = container_of(mhi_cntrl->cntrl_dev, struct pci_dev, dev); 558 + int ret; 559 + 560 + ret = mhi_async_power_up(mhi_cntrl); 561 + if (ret) 562 + pci_err(pci_dev, "mhi_async_power_up failed after reset %d\n", ret); 563 + }
+16
drivers/accel/qaic/mhi_controller.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only 2 + * 3 + * Copyright (c) 2019-2020, The Linux Foundation. All rights reserved. 4 + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved. 5 + */ 6 + 7 + #ifndef MHICONTROLLERQAIC_H_ 8 + #define MHICONTROLLERQAIC_H_ 9 + 10 + struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, 11 + int mhi_irq); 12 + void qaic_mhi_free_controller(struct mhi_controller *mhi_cntrl, bool link_up); 13 + void qaic_mhi_start_reset(struct mhi_controller *mhi_cntrl); 14 + void qaic_mhi_reset_done(struct mhi_controller *mhi_cntrl); 15 + 16 + #endif /* MHICONTROLLERQAIC_H_ */
+569
drivers/accel/qaic/mhi_qaic_ctrl.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 3 + 4 + #include <linux/kernel.h> 5 + #include <linux/mhi.h> 6 + #include <linux/mod_devicetable.h> 7 + #include <linux/module.h> 8 + #include <linux/poll.h> 9 + #include <linux/xarray.h> 10 + #include <uapi/linux/eventpoll.h> 11 + 12 + #include "mhi_qaic_ctrl.h" 13 + #include "qaic.h" 14 + 15 + #define MHI_QAIC_CTRL_DRIVER_NAME "mhi_qaic_ctrl" 16 + #define MHI_QAIC_CTRL_MAX_MINORS 128 17 + #define MHI_MAX_MTU 0xffff 18 + static DEFINE_XARRAY_ALLOC(mqc_xa); 19 + static struct class *mqc_dev_class; 20 + static int mqc_dev_major; 21 + 22 + /** 23 + * struct mqc_buf - Buffer structure used to receive data from device 24 + * @data: Address of data to read from 25 + * @odata: Original address returned from *alloc() API. Used to free this buf. 26 + * @len: Length of data in byte 27 + * @node: This buffer will be part of list managed in struct mqc_dev 28 + */ 29 + struct mqc_buf { 30 + void *data; 31 + void *odata; 32 + size_t len; 33 + struct list_head node; 34 + }; 35 + 36 + /** 37 + * struct mqc_dev - MHI QAIC Control Device 38 + * @minor: MQC device node minor number 39 + * @mhi_dev: Associated mhi device object 40 + * @mtu: Max TRE buffer length 41 + * @enabled: Flag to track the state of the MQC device 42 + * @lock: Mutex lock to serialize access to open_count 43 + * @read_lock: Mutex lock to serialize readers 44 + * @write_lock: Mutex lock to serialize writers 45 + * @ul_wq: Wait queue for writers 46 + * @dl_wq: Wait queue for readers 47 + * @dl_queue_lock: Spin lock to serialize access to download queue 48 + * @dl_queue: Queue of downloaded buffers 49 + * @open_count: Track open counts 50 + * @ref_count: Reference count for this structure 51 + */ 52 + struct mqc_dev { 53 + u32 minor; 54 + struct mhi_device *mhi_dev; 55 + size_t mtu; 56 + bool enabled; 57 + struct mutex lock; 58 + struct mutex read_lock; 59 + struct mutex 
write_lock; 60 + wait_queue_head_t ul_wq; 61 + wait_queue_head_t dl_wq; 62 + spinlock_t dl_queue_lock; 63 + struct list_head dl_queue; 64 + unsigned int open_count; 65 + struct kref ref_count; 66 + }; 67 + 68 + static void mqc_dev_release(struct kref *ref) 69 + { 70 + struct mqc_dev *mqcdev = container_of(ref, struct mqc_dev, ref_count); 71 + 72 + mutex_destroy(&mqcdev->read_lock); 73 + mutex_destroy(&mqcdev->write_lock); 74 + mutex_destroy(&mqcdev->lock); 75 + kfree(mqcdev); 76 + } 77 + 78 + static int mhi_qaic_ctrl_fill_dl_queue(struct mqc_dev *mqcdev) 79 + { 80 + struct mhi_device *mhi_dev = mqcdev->mhi_dev; 81 + struct mqc_buf *ctrlbuf; 82 + int rx_budget; 83 + int ret = 0; 84 + void *data; 85 + 86 + rx_budget = mhi_get_free_desc_count(mhi_dev, DMA_FROM_DEVICE); 87 + if (rx_budget < 0) 88 + return -EIO; 89 + 90 + while (rx_budget--) { 91 + data = kzalloc(mqcdev->mtu + sizeof(*ctrlbuf), GFP_KERNEL); 92 + if (!data) 93 + return -ENOMEM; 94 + 95 + ctrlbuf = data + mqcdev->mtu; 96 + ctrlbuf->odata = data; 97 + 98 + ret = mhi_queue_buf(mhi_dev, DMA_FROM_DEVICE, data, mqcdev->mtu, MHI_EOT); 99 + if (ret) { 100 + kfree(data); 101 + dev_err(&mhi_dev->dev, "Failed to queue buffer\n"); 102 + return ret; 103 + } 104 + } 105 + 106 + return ret; 107 + } 108 + 109 + static int mhi_qaic_ctrl_dev_start_chan(struct mqc_dev *mqcdev) 110 + { 111 + struct device *dev = &mqcdev->mhi_dev->dev; 112 + int ret = 0; 113 + 114 + ret = mutex_lock_interruptible(&mqcdev->lock); 115 + if (ret) 116 + return ret; 117 + if (!mqcdev->enabled) { 118 + ret = -ENODEV; 119 + goto release_dev_lock; 120 + } 121 + if (!mqcdev->open_count) { 122 + ret = mhi_prepare_for_transfer(mqcdev->mhi_dev); 123 + if (ret) { 124 + dev_err(dev, "Error starting transfer channels\n"); 125 + goto release_dev_lock; 126 + } 127 + 128 + ret = mhi_qaic_ctrl_fill_dl_queue(mqcdev); 129 + if (ret) { 130 + dev_err(dev, "Error filling download queue.\n"); 131 + goto mhi_unprepare; 132 + } 133 + } 134 + mqcdev->open_count++; 135 
+ mutex_unlock(&mqcdev->lock); 136 + 137 + return 0; 138 + 139 + mhi_unprepare: 140 + mhi_unprepare_from_transfer(mqcdev->mhi_dev); 141 + release_dev_lock: 142 + mutex_unlock(&mqcdev->lock); 143 + return ret; 144 + } 145 + 146 + static struct mqc_dev *mqc_dev_get_by_minor(unsigned int minor) 147 + { 148 + struct mqc_dev *mqcdev; 149 + 150 + xa_lock(&mqc_xa); 151 + mqcdev = xa_load(&mqc_xa, minor); 152 + if (mqcdev) 153 + kref_get(&mqcdev->ref_count); 154 + xa_unlock(&mqc_xa); 155 + 156 + return mqcdev; 157 + } 158 + 159 + static int mhi_qaic_ctrl_open(struct inode *inode, struct file *filp) 160 + { 161 + struct mqc_dev *mqcdev; 162 + int ret; 163 + 164 + mqcdev = mqc_dev_get_by_minor(iminor(inode)); 165 + if (!mqcdev) { 166 + pr_debug("mqc: minor %d not found\n", iminor(inode)); 167 + return -EINVAL; 168 + } 169 + 170 + ret = mhi_qaic_ctrl_dev_start_chan(mqcdev); 171 + if (ret) { 172 + kref_put(&mqcdev->ref_count, mqc_dev_release); 173 + return ret; 174 + } 175 + 176 + filp->private_data = mqcdev; 177 + 178 + return 0; 179 + } 180 + 181 + static void mhi_qaic_ctrl_buf_free(struct mqc_buf *ctrlbuf) 182 + { 183 + list_del(&ctrlbuf->node); 184 + kfree(ctrlbuf->odata); 185 + } 186 + 187 + static void __mhi_qaic_ctrl_release(struct mqc_dev *mqcdev) 188 + { 189 + struct mqc_buf *ctrlbuf, *tmp; 190 + 191 + mhi_unprepare_from_transfer(mqcdev->mhi_dev); 192 + wake_up_interruptible(&mqcdev->ul_wq); 193 + wake_up_interruptible(&mqcdev->dl_wq); 194 + /* 195 + * Free the dl_queue. As we have already unprepared mhi transfers, we 196 + * do not expect any callback functions that update dl_queue hence no need 197 + * to grab dl_queue lock. 
198 + */ 199 + mutex_lock(&mqcdev->read_lock); 200 + list_for_each_entry_safe(ctrlbuf, tmp, &mqcdev->dl_queue, node) 201 + mhi_qaic_ctrl_buf_free(ctrlbuf); 202 + mutex_unlock(&mqcdev->read_lock); 203 + } 204 + 205 + static int mhi_qaic_ctrl_release(struct inode *inode, struct file *file) 206 + { 207 + struct mqc_dev *mqcdev = file->private_data; 208 + 209 + mutex_lock(&mqcdev->lock); 210 + mqcdev->open_count--; 211 + if (!mqcdev->open_count && mqcdev->enabled) 212 + __mhi_qaic_ctrl_release(mqcdev); 213 + mutex_unlock(&mqcdev->lock); 214 + 215 + kref_put(&mqcdev->ref_count, mqc_dev_release); 216 + 217 + return 0; 218 + } 219 + 220 + static __poll_t mhi_qaic_ctrl_poll(struct file *file, poll_table *wait) 221 + { 222 + struct mqc_dev *mqcdev = file->private_data; 223 + struct mhi_device *mhi_dev; 224 + __poll_t mask = 0; 225 + 226 + mhi_dev = mqcdev->mhi_dev; 227 + 228 + poll_wait(file, &mqcdev->ul_wq, wait); 229 + poll_wait(file, &mqcdev->dl_wq, wait); 230 + 231 + mutex_lock(&mqcdev->lock); 232 + if (!mqcdev->enabled) { 233 + mutex_unlock(&mqcdev->lock); 234 + return EPOLLERR; 235 + } 236 + 237 + spin_lock_bh(&mqcdev->dl_queue_lock); 238 + if (!list_empty(&mqcdev->dl_queue)) 239 + mask |= EPOLLIN | EPOLLRDNORM; 240 + spin_unlock_bh(&mqcdev->dl_queue_lock); 241 + 242 + if (mutex_lock_interruptible(&mqcdev->write_lock)) { 243 + mutex_unlock(&mqcdev->lock); 244 + return EPOLLERR; 245 + } 246 + if (mhi_get_free_desc_count(mhi_dev, DMA_TO_DEVICE) > 0) 247 + mask |= EPOLLOUT | EPOLLWRNORM; 248 + mutex_unlock(&mqcdev->write_lock); 249 + mutex_unlock(&mqcdev->lock); 250 + 251 + dev_dbg(&mhi_dev->dev, "Client attempted to poll, returning mask 0x%x\n", mask); 252 + 253 + return mask; 254 + } 255 + 256 + static int mhi_qaic_ctrl_tx(struct mqc_dev *mqcdev) 257 + { 258 + int ret; 259 + 260 + ret = wait_event_interruptible(mqcdev->ul_wq, !mqcdev->enabled || 261 + mhi_get_free_desc_count(mqcdev->mhi_dev, DMA_TO_DEVICE) > 0); 262 + 263 + if (!mqcdev->enabled) 264 + return -ENODEV; 
265 + 266 + return ret; 267 + } 268 + 269 + static ssize_t mhi_qaic_ctrl_write(struct file *file, const char __user *buf, size_t count, 270 + loff_t *offp) 271 + { 272 + struct mqc_dev *mqcdev = file->private_data; 273 + struct mhi_device *mhi_dev; 274 + size_t bytes_xfered = 0; 275 + struct device *dev; 276 + int ret, nr_desc; 277 + 278 + mhi_dev = mqcdev->mhi_dev; 279 + dev = &mhi_dev->dev; 280 + 281 + if (!mhi_dev->ul_chan) 282 + return -EOPNOTSUPP; 283 + 284 + if (!buf || !count) 285 + return -EINVAL; 286 + 287 + dev_dbg(dev, "Request to transfer %zu bytes\n", count); 288 + 289 + ret = mhi_qaic_ctrl_tx(mqcdev); 290 + if (ret) 291 + return ret; 292 + 293 + if (mutex_lock_interruptible(&mqcdev->write_lock)) 294 + return -EINTR; 295 + 296 + nr_desc = mhi_get_free_desc_count(mhi_dev, DMA_TO_DEVICE); 297 + if (nr_desc * mqcdev->mtu < count) { 298 + ret = -EMSGSIZE; 299 + dev_dbg(dev, "Buffer too big to transfer\n"); 300 + goto unlock_mutex; 301 + } 302 + 303 + while (count != bytes_xfered) { 304 + enum mhi_flags flags; 305 + size_t to_copy; 306 + void *kbuf; 307 + 308 + to_copy = min_t(size_t, count - bytes_xfered, mqcdev->mtu); 309 + kbuf = kmalloc(to_copy, GFP_KERNEL); 310 + if (!kbuf) { 311 + ret = -ENOMEM; 312 + goto unlock_mutex; 313 + } 314 + 315 + ret = copy_from_user(kbuf, buf + bytes_xfered, to_copy); 316 + if (ret) { 317 + kfree(kbuf); 318 + ret = -EFAULT; 319 + goto unlock_mutex; 320 + } 321 + 322 + if (bytes_xfered + to_copy == count) 323 + flags = MHI_EOT; 324 + else 325 + flags = MHI_CHAIN; 326 + 327 + ret = mhi_queue_buf(mhi_dev, DMA_TO_DEVICE, kbuf, to_copy, flags); 328 + if (ret) { 329 + kfree(kbuf); 330 + dev_err(dev, "Failed to queue buf of size %zu\n", to_copy); 331 + goto unlock_mutex; 332 + } 333 + 334 + bytes_xfered += to_copy; 335 + } 336 + 337 + mutex_unlock(&mqcdev->write_lock); 338 + dev_dbg(dev, "bytes xferred: %zu\n", bytes_xfered); 339 + 340 + return bytes_xfered; 341 + 342 + unlock_mutex: 343 + mutex_unlock(&mqcdev->write_lock); 344 + 
return ret; 345 + } 346 + 347 + static int mhi_qaic_ctrl_rx(struct mqc_dev *mqcdev) 348 + { 349 + int ret; 350 + 351 + ret = wait_event_interruptible(mqcdev->dl_wq, 352 + !mqcdev->enabled || !list_empty(&mqcdev->dl_queue)); 353 + 354 + if (!mqcdev->enabled) 355 + return -ENODEV; 356 + 357 + return ret; 358 + } 359 + 360 + static ssize_t mhi_qaic_ctrl_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) 361 + { 362 + struct mqc_dev *mqcdev = file->private_data; 363 + struct mqc_buf *ctrlbuf; 364 + size_t to_copy; 365 + int ret; 366 + 367 + if (!mqcdev->mhi_dev->dl_chan) 368 + return -EOPNOTSUPP; 369 + 370 + ret = mhi_qaic_ctrl_rx(mqcdev); 371 + if (ret) 372 + return ret; 373 + 374 + if (mutex_lock_interruptible(&mqcdev->read_lock)) 375 + return -EINTR; 376 + 377 + ctrlbuf = list_first_entry_or_null(&mqcdev->dl_queue, struct mqc_buf, node); 378 + if (!ctrlbuf) { 379 + mutex_unlock(&mqcdev->read_lock); 380 + ret = -ENODEV; 381 + goto error_out; 382 + } 383 + 384 + to_copy = min_t(size_t, count, ctrlbuf->len); 385 + if (copy_to_user(buf, ctrlbuf->data, to_copy)) { 386 + mutex_unlock(&mqcdev->read_lock); 387 + dev_dbg(&mqcdev->mhi_dev->dev, "Failed to copy data to user buffer\n"); 388 + ret = -EFAULT; 389 + goto error_out; 390 + } 391 + 392 + ctrlbuf->len -= to_copy; 393 + ctrlbuf->data += to_copy; 394 + 395 + if (!ctrlbuf->len) { 396 + spin_lock_bh(&mqcdev->dl_queue_lock); 397 + mhi_qaic_ctrl_buf_free(ctrlbuf); 398 + spin_unlock_bh(&mqcdev->dl_queue_lock); 399 + mhi_qaic_ctrl_fill_dl_queue(mqcdev); 400 + dev_dbg(&mqcdev->mhi_dev->dev, "Read buf freed\n"); 401 + } 402 + 403 + mutex_unlock(&mqcdev->read_lock); 404 + return to_copy; 405 + 406 + error_out: 407 + mutex_unlock(&mqcdev->read_lock); 408 + return ret; 409 + } 410 + 411 + static const struct file_operations mhidev_fops = { 412 + .owner = THIS_MODULE, 413 + .open = mhi_qaic_ctrl_open, 414 + .release = mhi_qaic_ctrl_release, 415 + .read = mhi_qaic_ctrl_read, 416 + .write = mhi_qaic_ctrl_write, 417 
+ .poll = mhi_qaic_ctrl_poll, 418 + }; 419 + 420 + static void mhi_qaic_ctrl_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 421 + { 422 + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); 423 + 424 + dev_dbg(&mhi_dev->dev, "%s: status: %d xfer_len: %zu\n", __func__, 425 + mhi_result->transaction_status, mhi_result->bytes_xferd); 426 + 427 + kfree(mhi_result->buf_addr); 428 + 429 + if (!mhi_result->transaction_status) 430 + wake_up_interruptible(&mqcdev->ul_wq); 431 + } 432 + 433 + static void mhi_qaic_ctrl_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 434 + { 435 + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); 436 + struct mqc_buf *ctrlbuf; 437 + 438 + dev_dbg(&mhi_dev->dev, "%s: status: %d receive_len: %zu\n", __func__, 439 + mhi_result->transaction_status, mhi_result->bytes_xferd); 440 + 441 + if (mhi_result->transaction_status && 442 + mhi_result->transaction_status != -EOVERFLOW) { 443 + kfree(mhi_result->buf_addr); 444 + return; 445 + } 446 + 447 + ctrlbuf = mhi_result->buf_addr + mqcdev->mtu; 448 + ctrlbuf->data = mhi_result->buf_addr; 449 + ctrlbuf->len = mhi_result->bytes_xferd; 450 + spin_lock_bh(&mqcdev->dl_queue_lock); 451 + list_add_tail(&ctrlbuf->node, &mqcdev->dl_queue); 452 + spin_unlock_bh(&mqcdev->dl_queue_lock); 453 + 454 + wake_up_interruptible(&mqcdev->dl_wq); 455 + } 456 + 457 + static int mhi_qaic_ctrl_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) 458 + { 459 + struct mqc_dev *mqcdev; 460 + struct device *dev; 461 + int ret; 462 + 463 + mqcdev = kzalloc(sizeof(*mqcdev), GFP_KERNEL); 464 + if (!mqcdev) 465 + return -ENOMEM; 466 + 467 + kref_init(&mqcdev->ref_count); 468 + mutex_init(&mqcdev->lock); 469 + mqcdev->mhi_dev = mhi_dev; 470 + 471 + ret = xa_alloc(&mqc_xa, &mqcdev->minor, mqcdev, XA_LIMIT(0, MHI_QAIC_CTRL_MAX_MINORS), 472 + GFP_KERNEL); 473 + if (ret) { 474 + kfree(mqcdev); 475 + return ret; 476 + } 477 + 478 + init_waitqueue_head(&mqcdev->ul_wq); 479 
+ init_waitqueue_head(&mqcdev->dl_wq); 480 + mutex_init(&mqcdev->read_lock); 481 + mutex_init(&mqcdev->write_lock); 482 + spin_lock_init(&mqcdev->dl_queue_lock); 483 + INIT_LIST_HEAD(&mqcdev->dl_queue); 484 + mqcdev->mtu = min_t(size_t, id->driver_data, MHI_MAX_MTU); 485 + mqcdev->enabled = true; 486 + mqcdev->open_count = 0; 487 + dev_set_drvdata(&mhi_dev->dev, mqcdev); 488 + 489 + dev = device_create(mqc_dev_class, &mhi_dev->dev, MKDEV(mqc_dev_major, mqcdev->minor), 490 + mqcdev, "%s", dev_name(&mhi_dev->dev)); 491 + if (IS_ERR(dev)) { 492 + xa_erase(&mqc_xa, mqcdev->minor); 493 + dev_set_drvdata(&mhi_dev->dev, NULL); 494 + kfree(mqcdev); 495 + return PTR_ERR(dev); 496 + } 497 + 498 + return 0; 499 + }; 500 + 501 + static void mhi_qaic_ctrl_remove(struct mhi_device *mhi_dev) 502 + { 503 + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); 504 + 505 + device_destroy(mqc_dev_class, MKDEV(mqc_dev_major, mqcdev->minor)); 506 + 507 + mutex_lock(&mqcdev->lock); 508 + mqcdev->enabled = false; 509 + if (mqcdev->open_count) 510 + __mhi_qaic_ctrl_release(mqcdev); 511 + mutex_unlock(&mqcdev->lock); 512 + 513 + xa_erase(&mqc_xa, mqcdev->minor); 514 + kref_put(&mqcdev->ref_count, mqc_dev_release); 515 + } 516 + 517 + /* .driver_data stores max mtu */ 518 + static const struct mhi_device_id mhi_qaic_ctrl_match_table[] = { 519 + { .chan = "QAIC_SAHARA", .driver_data = SZ_32K}, 520 + {}, 521 + }; 522 + MODULE_DEVICE_TABLE(mhi, mhi_qaic_ctrl_match_table); 523 + 524 + static struct mhi_driver mhi_qaic_ctrl_driver = { 525 + .id_table = mhi_qaic_ctrl_match_table, 526 + .remove = mhi_qaic_ctrl_remove, 527 + .probe = mhi_qaic_ctrl_probe, 528 + .ul_xfer_cb = mhi_qaic_ctrl_ul_xfer_cb, 529 + .dl_xfer_cb = mhi_qaic_ctrl_dl_xfer_cb, 530 + .driver = { 531 + .name = MHI_QAIC_CTRL_DRIVER_NAME, 532 + }, 533 + }; 534 + 535 + int mhi_qaic_ctrl_init(void) 536 + { 537 + int ret; 538 + 539 + ret = register_chrdev(0, MHI_QAIC_CTRL_DRIVER_NAME, &mhidev_fops); 540 + if (ret < 0) 541 + return 
ret; 542 + 543 + mqc_dev_major = ret; 544 + mqc_dev_class = class_create(THIS_MODULE, MHI_QAIC_CTRL_DRIVER_NAME); 545 + if (IS_ERR(mqc_dev_class)) { 546 + ret = PTR_ERR(mqc_dev_class); 547 + goto unregister_chrdev; 548 + } 549 + 550 + ret = mhi_driver_register(&mhi_qaic_ctrl_driver); 551 + if (ret) 552 + goto destroy_class; 553 + 554 + return 0; 555 + 556 + destroy_class: 557 + class_destroy(mqc_dev_class); 558 + unregister_chrdev: 559 + unregister_chrdev(mqc_dev_major, MHI_QAIC_CTRL_DRIVER_NAME); 560 + return ret; 561 + } 562 + 563 + void mhi_qaic_ctrl_deinit(void) 564 + { 565 + mhi_driver_unregister(&mhi_qaic_ctrl_driver); 566 + class_destroy(mqc_dev_class); 567 + unregister_chrdev(mqc_dev_major, MHI_QAIC_CTRL_DRIVER_NAME); 568 + xa_destroy(&mqc_xa); 569 + }
+12
drivers/accel/qaic/mhi_qaic_ctrl.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only 2 + * 3 + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. 4 + */ 5 + 6 + #ifndef __MHI_QAIC_CTRL_H__ 7 + #define __MHI_QAIC_CTRL_H__ 8 + 9 + int mhi_qaic_ctrl_init(void); 10 + void mhi_qaic_ctrl_deinit(void); 11 + 12 + #endif /* __MHI_QAIC_CTRL_H__ */
+282
drivers/accel/qaic/qaic.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only 2 + * 3 + * Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. 4 + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. 5 + */ 6 + 7 + #ifndef _QAIC_H_ 8 + #define _QAIC_H_ 9 + 10 + #include <linux/interrupt.h> 11 + #include <linux/kref.h> 12 + #include <linux/mhi.h> 13 + #include <linux/mutex.h> 14 + #include <linux/pci.h> 15 + #include <linux/spinlock.h> 16 + #include <linux/srcu.h> 17 + #include <linux/wait.h> 18 + #include <linux/workqueue.h> 19 + #include <drm/drm_device.h> 20 + #include <drm/drm_gem.h> 21 + 22 + #define QAIC_DBC_BASE SZ_128K 23 + #define QAIC_DBC_SIZE SZ_4K 24 + 25 + #define QAIC_NO_PARTITION -1 26 + 27 + #define QAIC_DBC_OFF(i) ((i) * QAIC_DBC_SIZE + QAIC_DBC_BASE) 28 + 29 + #define to_qaic_bo(obj) container_of(obj, struct qaic_bo, base) 30 + 31 + extern bool datapath_polling; 32 + 33 + struct qaic_user { 34 + /* Uniquely identifies this user for the device */ 35 + int handle; 36 + struct kref ref_count; 37 + /* Char device opened by this user */ 38 + struct qaic_drm_device *qddev; 39 + /* Node in list of users that opened this drm device */ 40 + struct list_head node; 41 + /* SRCU used to synchronize this user during cleanup */ 42 + struct srcu_struct qddev_lock; 43 + atomic_t chunk_id; 44 + }; 45 + 46 + struct dma_bridge_chan { 47 + /* Pointer to device struct maintained by driver */ 48 + struct qaic_device *qdev; 49 + /* ID of this DMA bridge channel (DBC) */ 50 + unsigned int id; 51 + /* Synchronizes access to xfer_list */ 52 + spinlock_t xfer_lock; 53 + /* Base address of request queue */ 54 + void *req_q_base; 55 + /* Base address of response queue */ 56 + void *rsp_q_base; 57 + /* 58 + * Base bus address of request queue. 
Response queue bus address can be 59 + * calculated by adding request queue size to this variable 60 + */ 61 + dma_addr_t dma_addr; 62 + /* Total size of request and response queue in bytes */ 63 + u32 total_size; 64 + /* Capacity of request/response queue */ 65 + u32 nelem; 66 + /* The user that opened this DBC */ 67 + struct qaic_user *usr; 68 + /* 69 + * Request ID of next memory handle that goes in request queue. One 70 + * memory handle can enqueue more than one request element; all 71 + * requests that belong to the same memory handle share the same request ID 72 + */ 73 + u16 next_req_id; 74 + /* true: DBC is in use; false: DBC not in use */ 75 + bool in_use; 76 + /* 77 + * Base address of device registers. Used to read/write request and 78 + * response queue's head and tail pointer of this DBC. 79 + */ 80 + void __iomem *dbc_base; 81 + /* Head of list where each node is a memory handle queued in request queue */ 82 + struct list_head xfer_list; 83 + /* Synchronizes DBC readers during cleanup */ 84 + struct srcu_struct ch_lock; 85 + /* 86 + * When this DBC is released, any thread waiting on this wait queue is 87 + * woken up 88 + */ 89 + wait_queue_head_t dbc_release; 90 + /* Head of list where each node is a bo associated with this DBC */ 91 + struct list_head bo_lists; 92 + /* The irq line for this DBC. Used for polling */ 93 + unsigned int irq; 94 + /* Polling work item to simulate interrupts */ 95 + struct work_struct poll_work; 96 + }; 97 + 98 + struct qaic_device { 99 + /* Pointer to base PCI device struct of our physical device */ 100 + struct pci_dev *pdev; 101 + /* Req. 
ID of request that will be queued next in MHI control device */ 102 + u32 next_seq_num; 103 + /* Base address of bar 0 */ 104 + void __iomem *bar_0; 105 + /* Base address of bar 2 */ 106 + void __iomem *bar_2; 107 + /* Controller structure for MHI devices */ 108 + struct mhi_controller *mhi_cntrl; 109 + /* MHI control channel device */ 110 + struct mhi_device *cntl_ch; 111 + /* List of requests queued in MHI control device */ 112 + struct list_head cntl_xfer_list; 113 + /* Synchronizes MHI control device transactions and its xfer list */ 114 + struct mutex cntl_mutex; 115 + /* Array of DBC struct of this device */ 116 + struct dma_bridge_chan *dbc; 117 + /* Work queue for tasks related to MHI control device */ 118 + struct workqueue_struct *cntl_wq; 119 + /* Synchronizes all the users of device during cleanup */ 120 + struct srcu_struct dev_lock; 121 + /* true: Device under reset; false: Device not under reset */ 122 + bool in_reset; 123 + /* 124 + * true: A tx MHI transaction has failed and a rx buffer is still queued 125 + * in control device. Such a buffer is considered a lost rx buffer 126 + * false: No rx buffer is lost in control device 127 + */ 128 + bool cntl_lost_buf; 129 + /* Maximum number of DBCs supported by this device */ 130 + u32 num_dbc; 131 + /* Reference to the drm_device for this device when it is created */ 132 + struct qaic_drm_device *qddev; 133 + /* Generate the CRC of a control message */ 134 + u32 (*gen_crc)(void *msg); 135 + /* Validate the CRC of a control message */ 136 + bool (*valid_crc)(void *msg); 137 + }; 138 + 139 + struct qaic_drm_device { 140 + /* Pointer to the root device struct driven by this driver */ 141 + struct qaic_device *qdev; 142 + /* 143 + * The physical device can be partitioned into a number of logical devices. 144 + * Each logical device is given a partition id. This member stores 145 + * that id. 
QAIC_NO_PARTITION is a sentinel used to mark that this drm 146 + * device is the actual physical device 147 + */ 148 + s32 partition_id; 149 + /* Pointer to the drm device struct of this drm device */ 150 + struct drm_device *ddev; 151 + /* Head in list of users who have opened this drm device */ 152 + struct list_head users; 153 + /* Synchronizes access to users list */ 154 + struct mutex users_mutex; 155 + }; 156 + 157 + struct qaic_bo { 158 + struct drm_gem_object base; 159 + /* Scatter/gather table for allocated/imported BO */ 160 + struct sg_table *sgt; 161 + /* BO size requested by user. GEM object might be bigger in size. */ 162 + u64 size; 163 + /* Head in list of slices of this BO */ 164 + struct list_head slices; 165 + /* Total nents, for all slices of this BO */ 166 + int total_slice_nents; 167 + /* 168 + * Direction of transfer. It can assume only two values: DMA_TO_DEVICE and 169 + * DMA_FROM_DEVICE. 170 + */ 171 + int dir; 172 + /* Pointer to the DBC which operates on this BO */ 173 + struct dma_bridge_chan *dbc; 174 + /* Number of slices that belong to this buffer */ 175 + u32 nr_slice; 176 + /* Number of slices that have been transferred by DMA engine */ 177 + u32 nr_slice_xfer_done; 178 + /* true = BO is queued for execution, false = BO is not queued */ 179 + bool queued; 180 + /* 181 + * If true then user has attached slicing information to this BO by 182 + * calling DRM_IOCTL_QAIC_ATTACH_SLICE_BO ioctl. 183 + */ 184 + bool sliced; 185 + /* Request ID of this BO if it is queued for execution */ 186 + u16 req_id; 187 + /* Handle assigned to this BO */ 188 + u32 handle; 189 + /* Wait on this for completion of DMA transfer of this BO */ 190 + struct completion xfer_done; 191 + /* 192 + * Node in linked list where head is dbc->xfer_list. 193 + * This linked list contains BOs that are queued for DMA transfer. 194 + */ 195 + struct list_head xfer_list; 196 + /* 197 + * Node in linked list where head is dbc->bo_lists. 
198 + * This linked list contains BOs that are associated with the DBC it is 199 + * linked to. 200 + */ 201 + struct list_head bo_list; 202 + struct { 203 + /* 204 + * Latest timestamp (ns) at which kernel received a request to 205 + * execute this BO 206 + */ 207 + u64 req_received_ts; 208 + /* 209 + * Latest timestamp (ns) at which kernel enqueued requests of 210 + * this BO for execution in DMA queue 211 + */ 212 + u64 req_submit_ts; 213 + /* 214 + * Latest timestamp (ns) at which kernel received a completion 215 + * interrupt for requests of this BO 216 + */ 217 + u64 req_processed_ts; 218 + /* 219 + * Number of elements already enqueued in DMA queue before 220 + * enqueuing requests of this BO 221 + */ 222 + u32 queue_level_before; 223 + } perf_stats; 224 + 225 + }; 226 + 227 + struct bo_slice { 228 + /* Mapped pages */ 229 + struct sg_table *sgt; 230 + /* Number of requests required to queue in DMA queue */ 231 + int nents; 232 + /* See enum dma_data_direction */ 233 + int dir; 234 + /* Actual requests that will be copied in DMA queue */ 235 + struct dbc_req *reqs; 236 + struct kref ref_count; 237 + /* true: No DMA transfer required */ 238 + bool no_xfer; 239 + /* Pointer to the parent BO handle */ 240 + struct qaic_bo *bo; 241 + /* Node in list of slices maintained by parent BO */ 242 + struct list_head slice; 243 + /* Size of this slice in bytes */ 244 + u64 size; 245 + /* Offset of this slice in buffer */ 246 + u64 offset; 247 + }; 248 + 249 + int get_dbc_req_elem_size(void); 250 + int get_dbc_rsp_elem_size(void); 251 + int get_cntl_version(struct qaic_device *qdev, struct qaic_user *usr, u16 *major, u16 *minor); 252 + int qaic_manage_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 253 + void qaic_mhi_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result); 254 + 255 + void qaic_mhi_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result); 256 + 257 + int qaic_control_open(struct qaic_device *qdev); 258 + void 
qaic_control_close(struct qaic_device *qdev); 259 + void qaic_release_usr(struct qaic_device *qdev, struct qaic_user *usr); 260 + 261 + irqreturn_t dbc_irq_threaded_fn(int irq, void *data); 262 + irqreturn_t dbc_irq_handler(int irq, void *data); 263 + int disable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr); 264 + void enable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr); 265 + void wakeup_dbc(struct qaic_device *qdev, u32 dbc_id); 266 + void release_dbc(struct qaic_device *qdev, u32 dbc_id); 267 + 268 + void wake_all_cntl(struct qaic_device *qdev); 269 + void qaic_dev_reset_clean_local_state(struct qaic_device *qdev, bool exit_reset); 270 + 271 + struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf); 272 + 273 + int qaic_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 274 + int qaic_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 275 + int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 276 + int qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 277 + int qaic_partial_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 278 + int qaic_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 279 + int qaic_perf_stats_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); 280 + void irq_polling_work(struct work_struct *work); 281 + 282 + #endif /* _QAIC_H_ */
+1526
drivers/accel/qaic/qaic_control.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + 3 + /* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ 4 + /* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 5 + 6 + #include <asm/byteorder.h> 7 + #include <linux/completion.h> 8 + #include <linux/crc32.h> 9 + #include <linux/delay.h> 10 + #include <linux/dma-mapping.h> 11 + #include <linux/kref.h> 12 + #include <linux/list.h> 13 + #include <linux/mhi.h> 14 + #include <linux/mm.h> 15 + #include <linux/moduleparam.h> 16 + #include <linux/mutex.h> 17 + #include <linux/pci.h> 18 + #include <linux/scatterlist.h> 19 + #include <linux/types.h> 20 + #include <linux/uaccess.h> 21 + #include <linux/workqueue.h> 22 + #include <linux/wait.h> 23 + #include <drm/drm_device.h> 24 + #include <drm/drm_file.h> 25 + #include <uapi/drm/qaic_accel.h> 26 + 27 + #include "qaic.h" 28 + 29 + #define MANAGE_MAGIC_NUMBER ((__force __le32)0x43494151) /* "QAIC" in little endian */ 30 + #define QAIC_DBC_Q_GAP SZ_256 31 + #define QAIC_DBC_Q_BUF_ALIGN SZ_4K 32 + #define QAIC_MANAGE_EXT_MSG_LENGTH SZ_64K /* Max DMA message length */ 33 + #define QAIC_WRAPPER_MAX_SIZE SZ_4K 34 + #define QAIC_MHI_RETRY_WAIT_MS 100 35 + #define QAIC_MHI_RETRY_MAX 20 36 + 37 + static unsigned int control_resp_timeout_s = 60; /* 60 sec default */ 38 + module_param(control_resp_timeout_s, uint, 0600); 39 + MODULE_PARM_DESC(control_resp_timeout_s, "Timeout for NNC responses from QSM"); 40 + 41 + struct manage_msg { 42 + u32 len; 43 + u32 count; 44 + u8 data[]; 45 + }; 46 + 47 + /* 48 + * wire encoding structures for the manage protocol. 
49 + * All fields are little endian on the wire 50 + */ 51 + struct wire_msg_hdr { 52 + __le32 crc32; /* crc of everything following this field in the message */ 53 + __le32 magic_number; 54 + __le32 sequence_number; 55 + __le32 len; /* length of this message */ 56 + __le32 count; /* number of transactions in this message */ 57 + __le32 handle; /* unique id to track the resources consumed */ 58 + __le32 partition_id; /* partition id for the request (signed) */ 59 + __le32 padding; /* must be 0 */ 60 + } __packed; 61 + 62 + struct wire_msg { 63 + struct wire_msg_hdr hdr; 64 + u8 data[]; 65 + } __packed; 66 + 67 + struct wire_trans_hdr { 68 + __le32 type; 69 + __le32 len; 70 + } __packed; 71 + 72 + /* Each message sent from driver to device is organized in a list of wrapper_msg */ 73 + struct wrapper_msg { 74 + struct list_head list; 75 + struct kref ref_count; 76 + u32 len; /* length of data to transfer */ 77 + struct wrapper_list *head; 78 + union { 79 + struct wire_msg msg; 80 + struct wire_trans_hdr trans; 81 + }; 82 + }; 83 + 84 + struct wrapper_list { 85 + struct list_head list; 86 + spinlock_t lock; /* Protects the list state during additions and removals */ 87 + }; 88 + 89 + struct wire_trans_passthrough { 90 + struct wire_trans_hdr hdr; 91 + u8 data[]; 92 + } __packed; 93 + 94 + struct wire_addr_size_pair { 95 + __le64 addr; 96 + __le64 size; 97 + } __packed; 98 + 99 + struct wire_trans_dma_xfer { 100 + struct wire_trans_hdr hdr; 101 + __le32 tag; 102 + __le32 count; 103 + __le32 dma_chunk_id; 104 + __le32 padding; 105 + struct wire_addr_size_pair data[]; 106 + } __packed; 107 + 108 + /* Initiated by device to continue the DMA xfer of a large piece of data */ 109 + struct wire_trans_dma_xfer_cont { 110 + struct wire_trans_hdr hdr; 111 + __le32 dma_chunk_id; 112 + __le32 padding; 113 + __le64 xferred_size; 114 + } __packed; 115 + 116 + struct wire_trans_activate_to_dev { 117 + struct wire_trans_hdr hdr; 118 + __le64 req_q_addr; 119 + __le64 rsp_q_addr; 120 + 
__le32 req_q_size; 121 + __le32 rsp_q_size; 122 + __le32 buf_len; 123 + __le32 options; /* unused, but BIT(16) has meaning to the device */ 124 + } __packed; 125 + 126 + struct wire_trans_activate_from_dev { 127 + struct wire_trans_hdr hdr; 128 + __le32 status; 129 + __le32 dbc_id; 130 + __le64 options; /* unused */ 131 + } __packed; 132 + 133 + struct wire_trans_deactivate_from_dev { 134 + struct wire_trans_hdr hdr; 135 + __le32 status; 136 + __le32 dbc_id; 137 + } __packed; 138 + 139 + struct wire_trans_terminate_to_dev { 140 + struct wire_trans_hdr hdr; 141 + __le32 handle; 142 + __le32 padding; 143 + } __packed; 144 + 145 + struct wire_trans_terminate_from_dev { 146 + struct wire_trans_hdr hdr; 147 + __le32 status; 148 + __le32 padding; 149 + } __packed; 150 + 151 + struct wire_trans_status_to_dev { 152 + struct wire_trans_hdr hdr; 153 + } __packed; 154 + 155 + struct wire_trans_status_from_dev { 156 + struct wire_trans_hdr hdr; 157 + __le16 major; 158 + __le16 minor; 159 + __le32 status; 160 + __le64 status_flags; 161 + } __packed; 162 + 163 + struct wire_trans_validate_part_to_dev { 164 + struct wire_trans_hdr hdr; 165 + __le32 part_id; 166 + __le32 padding; 167 + } __packed; 168 + 169 + struct wire_trans_validate_part_from_dev { 170 + struct wire_trans_hdr hdr; 171 + __le32 status; 172 + __le32 padding; 173 + } __packed; 174 + 175 + struct xfer_queue_elem { 176 + /* 177 + * Node in list of ongoing transfer request on control channel. 178 + * Maintained by root device struct. 
179 + */ 180 + struct list_head list; 181 + /* Sequence number of this transfer request */ 182 + u32 seq_num; 183 + /* This is used to wait on until completion of transfer request */ 184 + struct completion xfer_done; 185 + /* Received data from device */ 186 + void *buf; 187 + }; 188 + 189 + struct dma_xfer { 190 + /* Node in list of DMA transfers which is used for cleanup */ 191 + struct list_head list; 192 + /* SG table of memory used for DMA */ 193 + struct sg_table *sgt; 194 + /* Array of pages used for DMA */ 195 + struct page **page_list; 196 + /* Number of pages used for DMA */ 197 + unsigned long nr_pages; 198 + }; 199 + 200 + struct ioctl_resources { 201 + /* List of all DMA transfers which is used later for cleanup */ 202 + struct list_head dma_xfers; 203 + /* Base address of request queue which belongs to a DBC */ 204 + void *buf; 205 + /* 206 + * Base bus address of request queue which belongs to a DBC. Response 207 + * queue base bus address can be calculated by adding size of request 208 + * queue to base bus address of request queue. 209 + */ 210 + dma_addr_t dma_addr; 211 + /* Total size of request queue and response queue in bytes */ 212 + u32 total_size; 213 + /* Total number of elements that can be queued in each of request and response queue */ 214 + u32 nelem; 215 + /* Base address of response queue which belongs to a DBC */ 216 + void *rsp_q_base; 217 + /* Status of the NNC message received */ 218 + u32 status; 219 + /* DBC id of the DBC received from device */ 220 + u32 dbc_id; 221 + /* 222 + * DMA transfer request messages can be big in size and it may not be 223 + * possible to send them in one shot. In such cases the messages are 224 + * broken into chunks; this field stores the ID of such chunks. 225 + */ 226 + u32 dma_chunk_id; 227 + /* Total number of bytes transferred for a DMA xfer request */ 228 + u64 xferred_dma_size; 229 + /* Header of transaction message received from user. Used during DMA xfer request. 
*/ 230 + void *trans_hdr; 231 + }; 232 + 233 + struct resp_work { 234 + struct work_struct work; 235 + struct qaic_device *qdev; 236 + void *buf; 237 + }; 238 + 239 + /* 240 + * Since we're working with little endian messages, it's useful to be able to 241 + * increment without filling a whole line with conversions back and forth just 242 + * to add one (1) to a message count. 243 + */ 244 + static __le32 incr_le32(__le32 val) 245 + { 246 + return cpu_to_le32(le32_to_cpu(val) + 1); 247 + } 248 + 249 + static u32 gen_crc(void *msg) 250 + { 251 + struct wrapper_list *wrappers = msg; 252 + struct wrapper_msg *w; 253 + u32 crc = ~0; 254 + 255 + list_for_each_entry(w, &wrappers->list, list) 256 + crc = crc32(crc, &w->msg, w->len); 257 + 258 + return crc ^ ~0; 259 + } 260 + 261 + static u32 gen_crc_stub(void *msg) 262 + { 263 + return 0; 264 + } 265 + 266 + static bool valid_crc(void *msg) 267 + { 268 + struct wire_msg_hdr *hdr = msg; 269 + bool ret; 270 + u32 crc; 271 + 272 + /* 273 + * The output of this algorithm is always converted to the native 274 + * endianness. 
275 + */ 276 + crc = le32_to_cpu(hdr->crc32); 277 + hdr->crc32 = 0; 278 + ret = (crc32(~0, msg, le32_to_cpu(hdr->len)) ^ ~0) == crc; 279 + hdr->crc32 = cpu_to_le32(crc); 280 + return ret; 281 + } 282 + 283 + static bool valid_crc_stub(void *msg) 284 + { 285 + return true; 286 + } 287 + 288 + static void free_wrapper(struct kref *ref) 289 + { 290 + struct wrapper_msg *wrapper = container_of(ref, struct wrapper_msg, ref_count); 291 + 292 + list_del(&wrapper->list); 293 + kfree(wrapper); 294 + } 295 + 296 + static void save_dbc_buf(struct qaic_device *qdev, struct ioctl_resources *resources, 297 + struct qaic_user *usr) 298 + { 299 + u32 dbc_id = resources->dbc_id; 300 + 301 + if (resources->buf) { 302 + wait_event_interruptible(qdev->dbc[dbc_id].dbc_release, !qdev->dbc[dbc_id].in_use); 303 + qdev->dbc[dbc_id].req_q_base = resources->buf; 304 + qdev->dbc[dbc_id].rsp_q_base = resources->rsp_q_base; 305 + qdev->dbc[dbc_id].dma_addr = resources->dma_addr; 306 + qdev->dbc[dbc_id].total_size = resources->total_size; 307 + qdev->dbc[dbc_id].nelem = resources->nelem; 308 + enable_dbc(qdev, dbc_id, usr); 309 + qdev->dbc[dbc_id].in_use = true; 310 + resources->buf = NULL; 311 + } 312 + } 313 + 314 + static void free_dbc_buf(struct qaic_device *qdev, struct ioctl_resources *resources) 315 + { 316 + if (resources->buf) 317 + dma_free_coherent(&qdev->pdev->dev, resources->total_size, resources->buf, 318 + resources->dma_addr); 319 + resources->buf = NULL; 320 + } 321 + 322 + static void free_dma_xfers(struct qaic_device *qdev, struct ioctl_resources *resources) 323 + { 324 + struct dma_xfer *xfer; 325 + struct dma_xfer *x; 326 + int i; 327 + 328 + list_for_each_entry_safe(xfer, x, &resources->dma_xfers, list) { 329 + dma_unmap_sgtable(&qdev->pdev->dev, xfer->sgt, DMA_TO_DEVICE, 0); 330 + sg_free_table(xfer->sgt); 331 + kfree(xfer->sgt); 332 + for (i = 0; i < xfer->nr_pages; ++i) 333 + put_page(xfer->page_list[i]); 334 + kfree(xfer->page_list); 335 + list_del(&xfer->list); 336 + 
kfree(xfer); 337 + } 338 + } 339 + 340 + static struct wrapper_msg *add_wrapper(struct wrapper_list *wrappers, u32 size) 341 + { 342 + struct wrapper_msg *w = kzalloc(size, GFP_KERNEL); 343 + 344 + if (!w) 345 + return NULL; 346 + list_add_tail(&w->list, &wrappers->list); 347 + kref_init(&w->ref_count); 348 + w->head = wrappers; 349 + return w; 350 + } 351 + 352 + static int encode_passthrough(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, 353 + u32 *user_len) 354 + { 355 + struct qaic_manage_trans_passthrough *in_trans = trans; 356 + struct wire_trans_passthrough *out_trans; 357 + struct wrapper_msg *trans_wrapper; 358 + struct wrapper_msg *wrapper; 359 + struct wire_msg *msg; 360 + u32 msg_hdr_len; 361 + 362 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 363 + msg = &wrapper->msg; 364 + msg_hdr_len = le32_to_cpu(msg->hdr.len); 365 + 366 + if (in_trans->hdr.len % 8 != 0) 367 + return -EINVAL; 368 + 369 + if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_EXT_MSG_LENGTH) 370 + return -ENOSPC; 371 + 372 + trans_wrapper = add_wrapper(wrappers, 373 + offsetof(struct wrapper_msg, trans) + in_trans->hdr.len); 374 + if (!trans_wrapper) 375 + return -ENOMEM; 376 + trans_wrapper->len = in_trans->hdr.len; 377 + out_trans = (struct wire_trans_passthrough *)&trans_wrapper->trans; 378 + 379 + memcpy(out_trans->data, in_trans->data, in_trans->hdr.len - sizeof(in_trans->hdr)); 380 + msg->hdr.len = cpu_to_le32(msg_hdr_len + in_trans->hdr.len); 381 + msg->hdr.count = incr_le32(msg->hdr.count); 382 + *user_len += in_trans->hdr.len; 383 + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_PASSTHROUGH_TO_DEV); 384 + out_trans->hdr.len = cpu_to_le32(in_trans->hdr.len); 385 + 386 + return 0; 387 + } 388 + 389 + /* returns error code for failure, 0 if enough pages alloc'd, 1 if dma_cont is needed */ 390 + static int find_and_map_user_pages(struct qaic_device *qdev, 391 + struct qaic_manage_trans_dma_xfer *in_trans, 392 + struct ioctl_resources 
*resources, struct dma_xfer *xfer) 393 + { 394 + unsigned long need_pages; 395 + struct page **page_list; 396 + unsigned long nr_pages; 397 + struct sg_table *sgt; 398 + u64 xfer_start_addr; 399 + int ret; 400 + int i; 401 + 402 + xfer_start_addr = in_trans->addr + resources->xferred_dma_size; 403 + 404 + need_pages = DIV_ROUND_UP(in_trans->size + offset_in_page(xfer_start_addr) - 405 + resources->xferred_dma_size, PAGE_SIZE); 406 + 407 + nr_pages = need_pages; 408 + 409 + while (1) { 410 + page_list = kmalloc_array(nr_pages, sizeof(*page_list), GFP_KERNEL | __GFP_NOWARN); 411 + if (!page_list) { 412 + nr_pages = nr_pages / 2; 413 + if (!nr_pages) 414 + return -ENOMEM; 415 + } else { 416 + break; 417 + } 418 + } 419 + 420 + ret = get_user_pages_fast(xfer_start_addr, nr_pages, 0, page_list); 421 + if (ret < 0 || ret != nr_pages) { 422 + ret = -EFAULT; 423 + goto free_page_list; 424 + } 425 + 426 + sgt = kmalloc(sizeof(*sgt), GFP_KERNEL); 427 + if (!sgt) { 428 + ret = -ENOMEM; 429 + goto put_pages; 430 + } 431 + 432 + ret = sg_alloc_table_from_pages(sgt, page_list, nr_pages, 433 + offset_in_page(xfer_start_addr), 434 + in_trans->size - resources->xferred_dma_size, GFP_KERNEL); 435 + if (ret) { 436 + ret = -ENOMEM; 437 + goto free_sgt; 438 + } 439 + 440 + ret = dma_map_sgtable(&qdev->pdev->dev, sgt, DMA_TO_DEVICE, 0); 441 + if (ret) 442 + goto free_table; 443 + 444 + xfer->sgt = sgt; 445 + xfer->page_list = page_list; 446 + xfer->nr_pages = nr_pages; 447 + 448 + return need_pages > nr_pages ? 
1 : 0; 449 + 450 + free_table: 451 + sg_free_table(sgt); 452 + free_sgt: 453 + kfree(sgt); 454 + put_pages: 455 + for (i = 0; i < nr_pages; ++i) 456 + put_page(page_list[i]); 457 + free_page_list: 458 + kfree(page_list); 459 + return ret; 460 + } 461 + 462 + /* returns error code for failure, 0 if everything was encoded, 1 if dma_cont is needed */ 463 + static int encode_addr_size_pairs(struct dma_xfer *xfer, struct wrapper_list *wrappers, 464 + struct ioctl_resources *resources, u32 msg_hdr_len, u32 *size, 465 + struct wire_trans_dma_xfer **out_trans) 466 + { 467 + struct wrapper_msg *trans_wrapper; 468 + struct sg_table *sgt = xfer->sgt; 469 + struct wire_addr_size_pair *asp; 470 + struct scatterlist *sg; 471 + struct wrapper_msg *w; 472 + unsigned int dma_len; 473 + u64 dma_chunk_len; 474 + void *boundary; 475 + int nents_dma; 476 + int nents; 477 + int i; 478 + 479 + nents = sgt->nents; 480 + nents_dma = nents; 481 + *size = QAIC_MANAGE_EXT_MSG_LENGTH - msg_hdr_len - sizeof(**out_trans); 482 + for_each_sgtable_sg(sgt, sg, i) { 483 + *size -= sizeof(*asp); 484 + /* Save 1K for possible follow-up transactions. 
*/ 485 + if (*size < SZ_1K) { 486 + nents_dma = i; 487 + break; 488 + } 489 + } 490 + 491 + trans_wrapper = add_wrapper(wrappers, QAIC_WRAPPER_MAX_SIZE); 492 + if (!trans_wrapper) 493 + return -ENOMEM; 494 + *out_trans = (struct wire_trans_dma_xfer *)&trans_wrapper->trans; 495 + 496 + asp = (*out_trans)->data; 497 + boundary = (void *)trans_wrapper + QAIC_WRAPPER_MAX_SIZE; 498 + *size = 0; 499 + 500 + dma_len = 0; 501 + w = trans_wrapper; 502 + dma_chunk_len = 0; 503 + for_each_sg(sgt->sgl, sg, nents_dma, i) { 504 + asp->size = cpu_to_le64(dma_len); 505 + dma_chunk_len += dma_len; 506 + if (dma_len) { 507 + asp++; 508 + if ((void *)asp + sizeof(*asp) > boundary) { 509 + w->len = (void *)asp - (void *)&w->msg; 510 + *size += w->len; 511 + w = add_wrapper(wrappers, QAIC_WRAPPER_MAX_SIZE); 512 + if (!w) 513 + return -ENOMEM; 514 + boundary = (void *)w + QAIC_WRAPPER_MAX_SIZE; 515 + asp = (struct wire_addr_size_pair *)&w->msg; 516 + } 517 + } 518 + asp->addr = cpu_to_le64(sg_dma_address(sg)); 519 + dma_len = sg_dma_len(sg); 520 + } 521 + /* finalize the last segment */ 522 + asp->size = cpu_to_le64(dma_len); 523 + w->len = (void *)asp + sizeof(*asp) - (void *)&w->msg; 524 + *size += w->len; 525 + dma_chunk_len += dma_len; 526 + resources->xferred_dma_size += dma_chunk_len; 527 + 528 + return nents_dma < nents ? 
1 : 0; 529 + } 530 + 531 + static void cleanup_xfer(struct qaic_device *qdev, struct dma_xfer *xfer) 532 + { 533 + int i; 534 + 535 + dma_unmap_sgtable(&qdev->pdev->dev, xfer->sgt, DMA_TO_DEVICE, 0); 536 + sg_free_table(xfer->sgt); 537 + kfree(xfer->sgt); 538 + for (i = 0; i < xfer->nr_pages; ++i) 539 + put_page(xfer->page_list[i]); 540 + kfree(xfer->page_list); 541 + } 542 + 543 + static int encode_dma(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, 544 + u32 *user_len, struct ioctl_resources *resources, struct qaic_user *usr) 545 + { 546 + struct qaic_manage_trans_dma_xfer *in_trans = trans; 547 + struct wire_trans_dma_xfer *out_trans; 548 + struct wrapper_msg *wrapper; 549 + struct dma_xfer *xfer; 550 + struct wire_msg *msg; 551 + bool need_cont_dma; 552 + u32 msg_hdr_len; 553 + u32 size; 554 + int ret; 555 + 556 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 557 + msg = &wrapper->msg; 558 + msg_hdr_len = le32_to_cpu(msg->hdr.len); 559 + 560 + if (msg_hdr_len > (UINT_MAX - QAIC_MANAGE_EXT_MSG_LENGTH)) 561 + return -EINVAL; 562 + 563 + /* There should be enough space to hold at least one ASP entry. 
*/ 564 + if (msg_hdr_len + sizeof(*out_trans) + sizeof(struct wire_addr_size_pair) > 565 + QAIC_MANAGE_EXT_MSG_LENGTH) 566 + return -ENOMEM; 567 + 568 + if (in_trans->addr + in_trans->size < in_trans->addr || !in_trans->size) 569 + return -EINVAL; 570 + 571 + xfer = kmalloc(sizeof(*xfer), GFP_KERNEL); 572 + if (!xfer) 573 + return -ENOMEM; 574 + 575 + ret = find_and_map_user_pages(qdev, in_trans, resources, xfer); 576 + if (ret < 0) 577 + goto free_xfer; 578 + 579 + need_cont_dma = (bool)ret; 580 + 581 + ret = encode_addr_size_pairs(xfer, wrappers, resources, msg_hdr_len, &size, &out_trans); 582 + if (ret < 0) 583 + goto cleanup_xfer; 584 + 585 + need_cont_dma = need_cont_dma || (bool)ret; 586 + 587 + msg->hdr.len = cpu_to_le32(msg_hdr_len + size); 588 + msg->hdr.count = incr_le32(msg->hdr.count); 589 + 590 + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_DMA_XFER_TO_DEV); 591 + out_trans->hdr.len = cpu_to_le32(size); 592 + out_trans->tag = cpu_to_le32(in_trans->tag); 593 + out_trans->count = cpu_to_le32((size - sizeof(*out_trans)) / 594 + sizeof(struct wire_addr_size_pair)); 595 + 596 + *user_len += in_trans->hdr.len; 597 + 598 + if (resources->dma_chunk_id) { 599 + out_trans->dma_chunk_id = cpu_to_le32(resources->dma_chunk_id); 600 + } else if (need_cont_dma) { 601 + while (resources->dma_chunk_id == 0) 602 + resources->dma_chunk_id = atomic_inc_return(&usr->chunk_id); 603 + 604 + out_trans->dma_chunk_id = cpu_to_le32(resources->dma_chunk_id); 605 + } 606 + resources->trans_hdr = trans; 607 + 608 + list_add(&xfer->list, &resources->dma_xfers); 609 + return 0; 610 + 611 + cleanup_xfer: 612 + cleanup_xfer(qdev, xfer); 613 + free_xfer: 614 + kfree(xfer); 615 + return ret; 616 + } 617 + 618 + static int encode_activate(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, 619 + u32 *user_len, struct ioctl_resources *resources) 620 + { 621 + struct qaic_manage_trans_activate_to_dev *in_trans = trans; 622 + struct wire_trans_activate_to_dev *out_trans; 
623 + struct wrapper_msg *trans_wrapper; 624 + struct wrapper_msg *wrapper; 625 + struct wire_msg *msg; 626 + dma_addr_t dma_addr; 627 + u32 msg_hdr_len; 628 + void *buf; 629 + u32 nelem; 630 + u32 size; 631 + int ret; 632 + 633 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 634 + msg = &wrapper->msg; 635 + msg_hdr_len = le32_to_cpu(msg->hdr.len); 636 + 637 + if (msg_hdr_len + sizeof(*out_trans) > QAIC_MANAGE_MAX_MSG_LENGTH) 638 + return -ENOSPC; 639 + 640 + if (!in_trans->queue_size) 641 + return -EINVAL; 642 + 643 + if (in_trans->pad) 644 + return -EINVAL; 645 + 646 + nelem = in_trans->queue_size; 647 + size = (get_dbc_req_elem_size() + get_dbc_rsp_elem_size()) * nelem; 648 + if (size / nelem != get_dbc_req_elem_size() + get_dbc_rsp_elem_size()) 649 + return -EINVAL; 650 + 651 + if (size + QAIC_DBC_Q_GAP + QAIC_DBC_Q_BUF_ALIGN < size) 652 + return -EINVAL; 653 + 654 + size = ALIGN((size + QAIC_DBC_Q_GAP), QAIC_DBC_Q_BUF_ALIGN); 655 + 656 + buf = dma_alloc_coherent(&qdev->pdev->dev, size, &dma_addr, GFP_KERNEL); 657 + if (!buf) 658 + return -ENOMEM; 659 + 660 + trans_wrapper = add_wrapper(wrappers, 661 + offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); 662 + if (!trans_wrapper) { 663 + ret = -ENOMEM; 664 + goto free_dma; 665 + } 666 + trans_wrapper->len = sizeof(*out_trans); 667 + out_trans = (struct wire_trans_activate_to_dev *)&trans_wrapper->trans; 668 + 669 + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_ACTIVATE_TO_DEV); 670 + out_trans->hdr.len = cpu_to_le32(sizeof(*out_trans)); 671 + out_trans->buf_len = cpu_to_le32(size); 672 + out_trans->req_q_addr = cpu_to_le64(dma_addr); 673 + out_trans->req_q_size = cpu_to_le32(nelem); 674 + out_trans->rsp_q_addr = cpu_to_le64(dma_addr + size - nelem * get_dbc_rsp_elem_size()); 675 + out_trans->rsp_q_size = cpu_to_le32(nelem); 676 + out_trans->options = cpu_to_le32(in_trans->options); 677 + 678 + *user_len += in_trans->hdr.len; 679 + msg->hdr.len = cpu_to_le32(msg_hdr_len + 
sizeof(*out_trans)); 680 + msg->hdr.count = incr_le32(msg->hdr.count); 681 + 682 + resources->buf = buf; 683 + resources->dma_addr = dma_addr; 684 + resources->total_size = size; 685 + resources->nelem = nelem; 686 + resources->rsp_q_base = buf + size - nelem * get_dbc_rsp_elem_size(); 687 + return 0; 688 + 689 + free_dma: 690 + dma_free_coherent(&qdev->pdev->dev, size, buf, dma_addr); 691 + return ret; 692 + } 693 + 694 + static int encode_deactivate(struct qaic_device *qdev, void *trans, 695 + u32 *user_len, struct qaic_user *usr) 696 + { 697 + struct qaic_manage_trans_deactivate *in_trans = trans; 698 + 699 + if (in_trans->dbc_id >= qdev->num_dbc || in_trans->pad) 700 + return -EINVAL; 701 + 702 + *user_len += in_trans->hdr.len; 703 + 704 + return disable_dbc(qdev, in_trans->dbc_id, usr); 705 + } 706 + 707 + static int encode_status(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, 708 + u32 *user_len) 709 + { 710 + struct qaic_manage_trans_status_to_dev *in_trans = trans; 711 + struct wire_trans_status_to_dev *out_trans; 712 + struct wrapper_msg *trans_wrapper; 713 + struct wrapper_msg *wrapper; 714 + struct wire_msg *msg; 715 + u32 msg_hdr_len; 716 + 717 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 718 + msg = &wrapper->msg; 719 + msg_hdr_len = le32_to_cpu(msg->hdr.len); 720 + 721 + if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_MAX_MSG_LENGTH) 722 + return -ENOSPC; 723 + 724 + trans_wrapper = add_wrapper(wrappers, sizeof(*trans_wrapper)); 725 + if (!trans_wrapper) 726 + return -ENOMEM; 727 + 728 + trans_wrapper->len = sizeof(*out_trans); 729 + out_trans = (struct wire_trans_status_to_dev *)&trans_wrapper->trans; 730 + 731 + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_STATUS_TO_DEV); 732 + out_trans->hdr.len = cpu_to_le32(in_trans->hdr.len); 733 + msg->hdr.len = cpu_to_le32(msg_hdr_len + in_trans->hdr.len); 734 + msg->hdr.count = incr_le32(msg->hdr.count); 735 + *user_len += in_trans->hdr.len; 736 + 737 + 
return 0; 738 + } 739 + 740 + static int encode_message(struct qaic_device *qdev, struct manage_msg *user_msg, 741 + struct wrapper_list *wrappers, struct ioctl_resources *resources, 742 + struct qaic_user *usr) 743 + { 744 + struct qaic_manage_trans_hdr *trans_hdr; 745 + struct wrapper_msg *wrapper; 746 + struct wire_msg *msg; 747 + u32 user_len = 0; 748 + int ret; 749 + int i; 750 + 751 + if (!user_msg->count) { 752 + ret = -EINVAL; 753 + goto out; 754 + } 755 + 756 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 757 + msg = &wrapper->msg; 758 + 759 + msg->hdr.len = cpu_to_le32(sizeof(msg->hdr)); 760 + 761 + if (resources->dma_chunk_id) { 762 + ret = encode_dma(qdev, resources->trans_hdr, wrappers, &user_len, resources, usr); 763 + msg->hdr.count = cpu_to_le32(1); 764 + goto out; 765 + } 766 + 767 + for (i = 0; i < user_msg->count; ++i) { 768 + if (user_len >= user_msg->len) { 769 + ret = -EINVAL; 770 + break; 771 + } 772 + trans_hdr = (struct qaic_manage_trans_hdr *)(user_msg->data + user_len); 773 + if (user_len + trans_hdr->len > user_msg->len) { 774 + ret = -EINVAL; 775 + break; 776 + } 777 + 778 + switch (trans_hdr->type) { 779 + case QAIC_TRANS_PASSTHROUGH_FROM_USR: 780 + ret = encode_passthrough(qdev, trans_hdr, wrappers, &user_len); 781 + break; 782 + case QAIC_TRANS_DMA_XFER_FROM_USR: 783 + ret = encode_dma(qdev, trans_hdr, wrappers, &user_len, resources, usr); 784 + break; 785 + case QAIC_TRANS_ACTIVATE_FROM_USR: 786 + ret = encode_activate(qdev, trans_hdr, wrappers, &user_len, resources); 787 + break; 788 + case QAIC_TRANS_DEACTIVATE_FROM_USR: 789 + ret = encode_deactivate(qdev, trans_hdr, &user_len, usr); 790 + break; 791 + case QAIC_TRANS_STATUS_FROM_USR: 792 + ret = encode_status(qdev, trans_hdr, wrappers, &user_len); 793 + break; 794 + default: 795 + ret = -EINVAL; 796 + break; 797 + } 798 + 799 + if (ret) 800 + break; 801 + } 802 + 803 + if (user_len != user_msg->len) 804 + ret = -EINVAL; 805 + out: 806 + if (ret) { 807 + 
free_dma_xfers(qdev, resources); 808 + free_dbc_buf(qdev, resources); 809 + return ret; 810 + } 811 + 812 + return 0; 813 + } 814 + 815 + static int decode_passthrough(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, 816 + u32 *msg_len) 817 + { 818 + struct qaic_manage_trans_passthrough *out_trans; 819 + struct wire_trans_passthrough *in_trans = trans; 820 + u32 len; 821 + 822 + out_trans = (void *)user_msg->data + user_msg->len; 823 + 824 + len = le32_to_cpu(in_trans->hdr.len); 825 + if (len % 8 != 0) 826 + return -EINVAL; 827 + 828 + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) 829 + return -ENOSPC; 830 + 831 + memcpy(out_trans->data, in_trans->data, len - sizeof(in_trans->hdr)); 832 + user_msg->len += len; 833 + *msg_len += len; 834 + out_trans->hdr.type = le32_to_cpu(in_trans->hdr.type); 835 + out_trans->hdr.len = len; 836 + 837 + return 0; 838 + } 839 + 840 + static int decode_activate(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, 841 + u32 *msg_len, struct ioctl_resources *resources, struct qaic_user *usr) 842 + { 843 + struct qaic_manage_trans_activate_from_dev *out_trans; 844 + struct wire_trans_activate_from_dev *in_trans = trans; 845 + u32 len; 846 + 847 + out_trans = (void *)user_msg->data + user_msg->len; 848 + 849 + len = le32_to_cpu(in_trans->hdr.len); 850 + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) 851 + return -ENOSPC; 852 + 853 + user_msg->len += len; 854 + *msg_len += len; 855 + out_trans->hdr.type = le32_to_cpu(in_trans->hdr.type); 856 + out_trans->hdr.len = len; 857 + out_trans->status = le32_to_cpu(in_trans->status); 858 + out_trans->dbc_id = le32_to_cpu(in_trans->dbc_id); 859 + out_trans->options = le64_to_cpu(in_trans->options); 860 + 861 + if (!resources->buf) 862 + /* how did we get an activate response without a request? 
*/ 863 + return -EINVAL; 864 + 865 + if (out_trans->dbc_id >= qdev->num_dbc) 866 + /* 867 + * The device assigned an invalid resource, which should never 868 + * happen. Return an error so the user can try to recover. 869 + */ 870 + return -ENODEV; 871 + 872 + if (out_trans->status) 873 + /* 874 + * Allocating resources failed on the device side. This is not 875 + * expected behaviour; the user is expected to handle it. 876 + */ 877 + return -ECANCELED; 878 + 879 + resources->status = out_trans->status; 880 + resources->dbc_id = out_trans->dbc_id; 881 + save_dbc_buf(qdev, resources, usr); 882 + 883 + return 0; 884 + } 885 + 886 + static int decode_deactivate(struct qaic_device *qdev, void *trans, u32 *msg_len, 887 + struct qaic_user *usr) 888 + { 889 + struct wire_trans_deactivate_from_dev *in_trans = trans; 890 + u32 dbc_id = le32_to_cpu(in_trans->dbc_id); 891 + u32 status = le32_to_cpu(in_trans->status); 892 + 893 + if (dbc_id >= qdev->num_dbc) 894 + /* 895 + * The device assigned an invalid resource, which should never 896 + * happen. Return an error so the user can try to recover. 897 + */ 898 + return -ENODEV; 899 + 900 + if (status) { 901 + /* 902 + * Releasing resources failed on the device side, which puts 903 + * us in a bind since they may still be in use, so enable the 904 + * dbc. The user is expected to retry deactivation. 
905 + */ 906 + enable_dbc(qdev, dbc_id, usr); 907 + return -ECANCELED; 908 + } 909 + 910 + release_dbc(qdev, dbc_id); 911 + *msg_len += sizeof(*in_trans); 912 + 913 + return 0; 914 + } 915 + 916 + static int decode_status(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, 917 + u32 *user_len, struct wire_msg *msg) 918 + { 919 + struct qaic_manage_trans_status_from_dev *out_trans; 920 + struct wire_trans_status_from_dev *in_trans = trans; 921 + u32 len; 922 + 923 + out_trans = (void *)user_msg->data + user_msg->len; 924 + 925 + len = le32_to_cpu(in_trans->hdr.len); 926 + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) 927 + return -ENOSPC; 928 + 929 + out_trans->hdr.type = QAIC_TRANS_STATUS_FROM_DEV; 930 + out_trans->hdr.len = len; 931 + out_trans->major = le16_to_cpu(in_trans->major); 932 + out_trans->minor = le16_to_cpu(in_trans->minor); 933 + out_trans->status_flags = le64_to_cpu(in_trans->status_flags); 934 + out_trans->status = le32_to_cpu(in_trans->status); 935 + *user_len += le32_to_cpu(in_trans->hdr.len); 936 + user_msg->len += len; 937 + 938 + if (out_trans->status) 939 + return -ECANCELED; 940 + if (out_trans->status_flags & BIT(0) && !valid_crc(msg)) 941 + return -EPIPE; 942 + 943 + return 0; 944 + } 945 + 946 + static int decode_message(struct qaic_device *qdev, struct manage_msg *user_msg, 947 + struct wire_msg *msg, struct ioctl_resources *resources, 948 + struct qaic_user *usr) 949 + { 950 + u32 msg_hdr_len = le32_to_cpu(msg->hdr.len); 951 + struct wire_trans_hdr *trans_hdr; 952 + u32 msg_len = 0; 953 + int ret; 954 + int i; 955 + 956 + if (msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH) 957 + return -EINVAL; 958 + 959 + user_msg->len = 0; 960 + user_msg->count = le32_to_cpu(msg->hdr.count); 961 + 962 + for (i = 0; i < user_msg->count; ++i) { 963 + trans_hdr = (struct wire_trans_hdr *)(msg->data + msg_len); 964 + if (msg_len + le32_to_cpu(trans_hdr->len) > msg_hdr_len) 965 + return -EINVAL; 966 + 967 + switch 
(le32_to_cpu(trans_hdr->type)) { 968 + case QAIC_TRANS_PASSTHROUGH_FROM_DEV: 969 + ret = decode_passthrough(qdev, trans_hdr, user_msg, &msg_len); 970 + break; 971 + case QAIC_TRANS_ACTIVATE_FROM_DEV: 972 + ret = decode_activate(qdev, trans_hdr, user_msg, &msg_len, resources, usr); 973 + break; 974 + case QAIC_TRANS_DEACTIVATE_FROM_DEV: 975 + ret = decode_deactivate(qdev, trans_hdr, &msg_len, usr); 976 + break; 977 + case QAIC_TRANS_STATUS_FROM_DEV: 978 + ret = decode_status(qdev, trans_hdr, user_msg, &msg_len, msg); 979 + break; 980 + default: 981 + return -EINVAL; 982 + } 983 + 984 + if (ret) 985 + return ret; 986 + } 987 + 988 + if (msg_len != (msg_hdr_len - sizeof(msg->hdr))) 989 + return -EINVAL; 990 + 991 + return 0; 992 + } 993 + 994 + static void *msg_xfer(struct qaic_device *qdev, struct wrapper_list *wrappers, u32 seq_num, 995 + bool ignore_signal) 996 + { 997 + struct xfer_queue_elem elem; 998 + struct wire_msg *out_buf; 999 + struct wrapper_msg *w; 1000 + int retry_count; 1001 + long ret; 1002 + 1003 + if (qdev->in_reset) { 1004 + mutex_unlock(&qdev->cntl_mutex); 1005 + return ERR_PTR(-ENODEV); 1006 + } 1007 + 1008 + elem.seq_num = seq_num; 1009 + elem.buf = NULL; 1010 + init_completion(&elem.xfer_done); 1011 + if (likely(!qdev->cntl_lost_buf)) { 1012 + /* 1013 + * The max size of request to device is QAIC_MANAGE_EXT_MSG_LENGTH. 1014 + * The max size of response from device is QAIC_MANAGE_MAX_MSG_LENGTH. 1015 + */ 1016 + out_buf = kmalloc(QAIC_MANAGE_MAX_MSG_LENGTH, GFP_KERNEL); 1017 + if (!out_buf) { 1018 + mutex_unlock(&qdev->cntl_mutex); 1019 + return ERR_PTR(-ENOMEM); 1020 + } 1021 + 1022 + ret = mhi_queue_buf(qdev->cntl_ch, DMA_FROM_DEVICE, out_buf, 1023 + QAIC_MANAGE_MAX_MSG_LENGTH, MHI_EOT); 1024 + if (ret) { 1025 + mutex_unlock(&qdev->cntl_mutex); 1026 + return ERR_PTR(ret); 1027 + } 1028 + } else { 1029 + /* 1030 + * we lost a buffer because we queued a recv buf, but then 1031 + * queuing the corresponding tx buf failed. 
To try to avoid 1032 + * a memory leak, let's reclaim it and use it for this 1033 + * transaction. 1034 + */ 1035 + qdev->cntl_lost_buf = false; 1036 + } 1037 + 1038 + list_for_each_entry(w, &wrappers->list, list) { 1039 + kref_get(&w->ref_count); 1040 + retry_count = 0; 1041 + retry: 1042 + ret = mhi_queue_buf(qdev->cntl_ch, DMA_TO_DEVICE, &w->msg, w->len, 1043 + list_is_last(&w->list, &wrappers->list) ? MHI_EOT : MHI_CHAIN); 1044 + if (ret) { 1045 + if (ret == -EAGAIN && retry_count++ < QAIC_MHI_RETRY_MAX) { 1046 + msleep_interruptible(QAIC_MHI_RETRY_WAIT_MS); 1047 + if (!signal_pending(current)) 1048 + goto retry; 1049 + } 1050 + 1051 + qdev->cntl_lost_buf = true; 1052 + kref_put(&w->ref_count, free_wrapper); 1053 + mutex_unlock(&qdev->cntl_mutex); 1054 + return ERR_PTR(ret); 1055 + } 1056 + } 1057 + 1058 + list_add_tail(&elem.list, &qdev->cntl_xfer_list); 1059 + mutex_unlock(&qdev->cntl_mutex); 1060 + 1061 + if (ignore_signal) 1062 + ret = wait_for_completion_timeout(&elem.xfer_done, control_resp_timeout_s * HZ); 1063 + else 1064 + ret = wait_for_completion_interruptible_timeout(&elem.xfer_done, 1065 + control_resp_timeout_s * HZ); 1066 + /* 1067 + * not using _interruptible because we have to clean up or we'll 1068 + * likely cause memory corruption 1069 + */ 1070 + mutex_lock(&qdev->cntl_mutex); 1071 + if (!list_empty(&elem.list)) 1072 + list_del(&elem.list); 1073 + if (!ret && !elem.buf) 1074 + ret = -ETIMEDOUT; 1075 + else if (ret > 0 && !elem.buf) 1076 + ret = -EIO; 1077 + mutex_unlock(&qdev->cntl_mutex); 1078 + 1079 + if (ret < 0) { 1080 + kfree(elem.buf); 1081 + return ERR_PTR(ret); 1082 + } else if (!qdev->valid_crc(elem.buf)) { 1083 + kfree(elem.buf); 1084 + return ERR_PTR(-EPIPE); 1085 + } 1086 + 1087 + return elem.buf; 1088 + } 1089 + 1090 + /* Add a transaction to abort the outstanding DMA continuation */ 1091 + static int abort_dma_cont(struct qaic_device *qdev, struct wrapper_list *wrappers, u32 dma_chunk_id) 1092 + { 1093 + struct 
wire_trans_dma_xfer *out_trans; 1094 + u32 size = sizeof(*out_trans); 1095 + struct wrapper_msg *wrapper; 1096 + struct wrapper_msg *w; 1097 + struct wire_msg *msg; 1098 + 1099 + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); 1100 + msg = &wrapper->msg; 1101 + 1102 + /* Remove all but the first wrapper which has the msg header */ 1103 + list_for_each_entry_safe(wrapper, w, &wrappers->list, list) 1104 + if (!list_is_first(&wrapper->list, &wrappers->list)) 1105 + kref_put(&wrapper->ref_count, free_wrapper); 1106 + 1107 + wrapper = add_wrapper(wrappers, offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); 1108 + 1109 + if (!wrapper) 1110 + return -ENOMEM; 1111 + 1112 + out_trans = (struct wire_trans_dma_xfer *)&wrapper->trans; 1113 + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_DMA_XFER_TO_DEV); 1114 + out_trans->hdr.len = cpu_to_le32(size); 1115 + out_trans->tag = cpu_to_le32(0); 1116 + out_trans->count = cpu_to_le32(0); 1117 + out_trans->dma_chunk_id = cpu_to_le32(dma_chunk_id); 1118 + 1119 + msg->hdr.len = cpu_to_le32(size + sizeof(*msg)); 1120 + msg->hdr.count = cpu_to_le32(1); 1121 + wrapper->len = size; 1122 + 1123 + return 0; 1124 + } 1125 + 1126 + static struct wrapper_list *alloc_wrapper_list(void) 1127 + { 1128 + struct wrapper_list *wrappers; 1129 + 1130 + wrappers = kmalloc(sizeof(*wrappers), GFP_KERNEL); 1131 + if (!wrappers) 1132 + return NULL; 1133 + INIT_LIST_HEAD(&wrappers->list); 1134 + spin_lock_init(&wrappers->lock); 1135 + 1136 + return wrappers; 1137 + } 1138 + 1139 + static int qaic_manage_msg_xfer(struct qaic_device *qdev, struct qaic_user *usr, 1140 + struct manage_msg *user_msg, struct ioctl_resources *resources, 1141 + struct wire_msg **rsp) 1142 + { 1143 + struct wrapper_list *wrappers; 1144 + struct wrapper_msg *wrapper; 1145 + struct wrapper_msg *w; 1146 + bool all_done = false; 1147 + struct wire_msg *msg; 1148 + int ret; 1149 + 1150 + wrappers = alloc_wrapper_list(); 1151 + if (!wrappers) 1152 + return 
-ENOMEM; 1153 + 1154 + wrapper = add_wrapper(wrappers, sizeof(*wrapper)); 1155 + if (!wrapper) { 1156 + kfree(wrappers); 1157 + return -ENOMEM; 1158 + } 1159 + 1160 + msg = &wrapper->msg; 1161 + wrapper->len = sizeof(*msg); 1162 + 1163 + ret = encode_message(qdev, user_msg, wrappers, resources, usr); 1164 + if (ret && resources->dma_chunk_id) 1165 + ret = abort_dma_cont(qdev, wrappers, resources->dma_chunk_id); 1166 + if (ret) 1167 + goto encode_failed; 1168 + 1169 + ret = mutex_lock_interruptible(&qdev->cntl_mutex); 1170 + if (ret) 1171 + goto lock_failed; 1172 + 1173 + msg->hdr.magic_number = MANAGE_MAGIC_NUMBER; 1174 + msg->hdr.sequence_number = cpu_to_le32(qdev->next_seq_num++); 1175 + 1176 + if (usr) { 1177 + msg->hdr.handle = cpu_to_le32(usr->handle); 1178 + msg->hdr.partition_id = cpu_to_le32(usr->qddev->partition_id); 1179 + } else { 1180 + msg->hdr.handle = 0; 1181 + msg->hdr.partition_id = cpu_to_le32(QAIC_NO_PARTITION); 1182 + } 1183 + 1184 + msg->hdr.padding = cpu_to_le32(0); 1185 + msg->hdr.crc32 = cpu_to_le32(qdev->gen_crc(wrappers)); 1186 + 1187 + /* msg_xfer releases the mutex */ 1188 + *rsp = msg_xfer(qdev, wrappers, qdev->next_seq_num - 1, false); 1189 + if (IS_ERR(*rsp)) 1190 + ret = PTR_ERR(*rsp); 1191 + 1192 + lock_failed: 1193 + free_dma_xfers(qdev, resources); 1194 + encode_failed: 1195 + spin_lock(&wrappers->lock); 1196 + list_for_each_entry_safe(wrapper, w, &wrappers->list, list) 1197 + kref_put(&wrapper->ref_count, free_wrapper); 1198 + all_done = list_empty(&wrappers->list); 1199 + spin_unlock(&wrappers->lock); 1200 + if (all_done) 1201 + kfree(wrappers); 1202 + 1203 + return ret; 1204 + } 1205 + 1206 + static int qaic_manage(struct qaic_device *qdev, struct qaic_user *usr, struct manage_msg *user_msg) 1207 + { 1208 + struct wire_trans_dma_xfer_cont *dma_cont = NULL; 1209 + struct ioctl_resources resources; 1210 + struct wire_msg *rsp = NULL; 1211 + int ret; 1212 + 1213 + memset(&resources, 0, sizeof(struct ioctl_resources)); 1214 + 1215 
+ INIT_LIST_HEAD(&resources.dma_xfers); 1216 + 1217 + if (user_msg->len > QAIC_MANAGE_MAX_MSG_LENGTH || 1218 + user_msg->count > QAIC_MANAGE_MAX_MSG_LENGTH / sizeof(struct qaic_manage_trans_hdr)) 1219 + return -EINVAL; 1220 + 1221 + dma_xfer_continue: 1222 + ret = qaic_manage_msg_xfer(qdev, usr, user_msg, &resources, &rsp); 1223 + if (ret) 1224 + return ret; 1225 + /* dma_cont should be the only transaction if present */ 1226 + if (le32_to_cpu(rsp->hdr.count) == 1) { 1227 + dma_cont = (struct wire_trans_dma_xfer_cont *)rsp->data; 1228 + if (le32_to_cpu(dma_cont->hdr.type) != QAIC_TRANS_DMA_XFER_CONT) 1229 + dma_cont = NULL; 1230 + } 1231 + if (dma_cont) { 1232 + if (le32_to_cpu(dma_cont->dma_chunk_id) == resources.dma_chunk_id && 1233 + le64_to_cpu(dma_cont->xferred_size) == resources.xferred_dma_size) { 1234 + kfree(rsp); 1235 + goto dma_xfer_continue; 1236 + } 1237 + 1238 + ret = -EINVAL; 1239 + goto dma_cont_failed; 1240 + } 1241 + 1242 + ret = decode_message(qdev, user_msg, rsp, &resources, usr); 1243 + 1244 + dma_cont_failed: 1245 + free_dbc_buf(qdev, &resources); 1246 + kfree(rsp); 1247 + return ret; 1248 + } 1249 + 1250 + int qaic_manage_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 1251 + { 1252 + struct qaic_manage_msg *user_msg; 1253 + struct qaic_device *qdev; 1254 + struct manage_msg *msg; 1255 + struct qaic_user *usr; 1256 + u8 __user *user_data; 1257 + int qdev_rcu_id; 1258 + int usr_rcu_id; 1259 + int ret; 1260 + 1261 + usr = file_priv->driver_priv; 1262 + 1263 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 1264 + if (!usr->qddev) { 1265 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1266 + return -ENODEV; 1267 + } 1268 + 1269 + qdev = usr->qddev->qdev; 1270 + 1271 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 1272 + if (qdev->in_reset) { 1273 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1274 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1275 + return -ENODEV; 1276 + } 1277 + 1278 + user_msg = data; 1279 + 
1280 + if (user_msg->len > QAIC_MANAGE_MAX_MSG_LENGTH) { 1281 + ret = -EINVAL; 1282 + goto out; 1283 + } 1284 + 1285 + msg = kzalloc(QAIC_MANAGE_MAX_MSG_LENGTH + sizeof(*msg), GFP_KERNEL); 1286 + if (!msg) { 1287 + ret = -ENOMEM; 1288 + goto out; 1289 + } 1290 + 1291 + msg->len = user_msg->len; 1292 + msg->count = user_msg->count; 1293 + 1294 + user_data = u64_to_user_ptr(user_msg->data); 1295 + 1296 + if (copy_from_user(msg->data, user_data, user_msg->len)) { 1297 + ret = -EFAULT; 1298 + goto free_msg; 1299 + } 1300 + 1301 + ret = qaic_manage(qdev, usr, msg); 1302 + 1303 + /* 1304 + * If qaic_manage() is successful then we copy the message back to 1305 + * userspace memory, with one exception: -ECANCELED. 1306 + * -ECANCELED means that the device has NACKed the message with a 1307 + * status error code which userspace would like to know. 1308 + */ 1309 + if (ret == -ECANCELED || !ret) { 1310 + if (copy_to_user(user_data, msg->data, msg->len)) { 1311 + ret = -EFAULT; 1312 + } else { 1313 + user_msg->len = msg->len; 1314 + user_msg->count = msg->count; 1315 + } 1316 + } 1317 + 1318 + free_msg: 1319 + kfree(msg); 1320 + out: 1321 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1322 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1323 + return ret; 1324 + } 1325 + 1326 + int get_cntl_version(struct qaic_device *qdev, struct qaic_user *usr, u16 *major, u16 *minor) 1327 + { 1328 + struct qaic_manage_trans_status_from_dev *status_result; 1329 + struct qaic_manage_trans_status_to_dev *status_query; 1330 + struct manage_msg *user_msg; 1331 + int ret; 1332 + 1333 + user_msg = kmalloc(sizeof(*user_msg) + sizeof(*status_result), GFP_KERNEL); 1334 + if (!user_msg) { 1335 + ret = -ENOMEM; 1336 + goto out; 1337 + } 1338 + user_msg->len = sizeof(*status_query); 1339 + user_msg->count = 1; 1340 + 1341 + status_query = (struct qaic_manage_trans_status_to_dev *)user_msg->data; 1342 + status_query->hdr.type = QAIC_TRANS_STATUS_FROM_USR; 1343 + status_query->hdr.len = 
sizeof(status_query->hdr); 1344 + 1345 + ret = qaic_manage(qdev, usr, user_msg); 1346 + if (ret) 1347 + goto kfree_user_msg; 1348 + status_result = (struct qaic_manage_trans_status_from_dev *)user_msg->data; 1349 + *major = status_result->major; 1350 + *minor = status_result->minor; 1351 + 1352 + if (status_result->status_flags & BIT(0)) { /* device is using CRC */ 1353 + /* By default qdev->gen_crc is programmed to generate CRC */ 1354 + qdev->valid_crc = valid_crc; 1355 + } else { 1356 + /* By default qdev->valid_crc is programmed to bypass CRC */ 1357 + qdev->gen_crc = gen_crc_stub; 1358 + } 1359 + 1360 + kfree_user_msg: 1361 + kfree(user_msg); 1362 + out: 1363 + return ret; 1364 + } 1365 + 1366 + static void resp_worker(struct work_struct *work) 1367 + { 1368 + struct resp_work *resp = container_of(work, struct resp_work, work); 1369 + struct qaic_device *qdev = resp->qdev; 1370 + struct wire_msg *msg = resp->buf; 1371 + struct xfer_queue_elem *elem; 1372 + struct xfer_queue_elem *i; 1373 + bool found = false; 1374 + 1375 + mutex_lock(&qdev->cntl_mutex); 1376 + list_for_each_entry_safe(elem, i, &qdev->cntl_xfer_list, list) { 1377 + if (elem->seq_num == le32_to_cpu(msg->hdr.sequence_number)) { 1378 + found = true; 1379 + list_del_init(&elem->list); 1380 + elem->buf = msg; 1381 + complete_all(&elem->xfer_done); 1382 + break; 1383 + } 1384 + } 1385 + mutex_unlock(&qdev->cntl_mutex); 1386 + 1387 + if (!found) 1388 + /* request must have timed out, drop packet */ 1389 + kfree(msg); 1390 + 1391 + kfree(resp); 1392 + } 1393 + 1394 + static void free_wrapper_from_list(struct wrapper_list *wrappers, struct wrapper_msg *wrapper) 1395 + { 1396 + bool all_done = false; 1397 + 1398 + spin_lock(&wrappers->lock); 1399 + kref_put(&wrapper->ref_count, free_wrapper); 1400 + all_done = list_empty(&wrappers->list); 1401 + spin_unlock(&wrappers->lock); 1402 + 1403 + if (all_done) 1404 + kfree(wrappers); 1405 + } 1406 + 1407 + void qaic_mhi_ul_xfer_cb(struct mhi_device *mhi_dev, 
struct mhi_result *mhi_result) 1408 + { 1409 + struct wire_msg *msg = mhi_result->buf_addr; 1410 + struct wrapper_msg *wrapper = container_of(msg, struct wrapper_msg, msg); 1411 + 1412 + free_wrapper_from_list(wrapper->head, wrapper); 1413 + } 1414 + 1415 + void qaic_mhi_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) 1416 + { 1417 + struct qaic_device *qdev = dev_get_drvdata(&mhi_dev->dev); 1418 + struct wire_msg *msg = mhi_result->buf_addr; 1419 + struct resp_work *resp; 1420 + 1421 + if (mhi_result->transaction_status || msg->hdr.magic_number != MANAGE_MAGIC_NUMBER) { 1422 + kfree(msg); 1423 + return; 1424 + } 1425 + 1426 + resp = kmalloc(sizeof(*resp), GFP_ATOMIC); 1427 + if (!resp) { 1428 + kfree(msg); 1429 + return; 1430 + } 1431 + 1432 + INIT_WORK(&resp->work, resp_worker); 1433 + resp->qdev = qdev; 1434 + resp->buf = msg; 1435 + queue_work(qdev->cntl_wq, &resp->work); 1436 + } 1437 + 1438 + int qaic_control_open(struct qaic_device *qdev) 1439 + { 1440 + if (!qdev->cntl_ch) 1441 + return -ENODEV; 1442 + 1443 + qdev->cntl_lost_buf = false; 1444 + /* 1445 + * By default qaic should assume that the device has CRC enabled. 1446 + * Qaic learns whether the device has CRC enabled or disabled during the 1447 + * device status transaction, which is the first transaction performed 1448 + * on the control channel. 1449 + * 1450 + * So CRC validation of the first device status transaction response is 1451 + * ignored (by calling valid_crc_stub) and is done later during decoding 1452 + * if the device has CRC enabled. 1453 + * Once qaic knows whether the device has CRC enabled, it acts 1454 + * accordingly. 
1455 + */ 1456 + qdev->gen_crc = gen_crc; 1457 + qdev->valid_crc = valid_crc_stub; 1458 + 1459 + return mhi_prepare_for_transfer(qdev->cntl_ch); 1460 + } 1461 + 1462 + void qaic_control_close(struct qaic_device *qdev) 1463 + { 1464 + mhi_unprepare_from_transfer(qdev->cntl_ch); 1465 + } 1466 + 1467 + void qaic_release_usr(struct qaic_device *qdev, struct qaic_user *usr) 1468 + { 1469 + struct wire_trans_terminate_to_dev *trans; 1470 + struct wrapper_list *wrappers; 1471 + struct wrapper_msg *wrapper; 1472 + struct wire_msg *msg; 1473 + struct wire_msg *rsp; 1474 + 1475 + wrappers = alloc_wrapper_list(); 1476 + if (!wrappers) 1477 + return; 1478 + 1479 + wrapper = add_wrapper(wrappers, sizeof(*wrapper) + sizeof(*msg) + sizeof(*trans)); 1480 + if (!wrapper) 1481 + return; 1482 + 1483 + msg = &wrapper->msg; 1484 + 1485 + trans = (struct wire_trans_terminate_to_dev *)msg->data; 1486 + 1487 + trans->hdr.type = cpu_to_le32(QAIC_TRANS_TERMINATE_TO_DEV); 1488 + trans->hdr.len = cpu_to_le32(sizeof(*trans)); 1489 + trans->handle = cpu_to_le32(usr->handle); 1490 + 1491 + mutex_lock(&qdev->cntl_mutex); 1492 + wrapper->len = sizeof(msg->hdr) + sizeof(*trans); 1493 + msg->hdr.magic_number = MANAGE_MAGIC_NUMBER; 1494 + msg->hdr.sequence_number = cpu_to_le32(qdev->next_seq_num++); 1495 + msg->hdr.len = cpu_to_le32(wrapper->len); 1496 + msg->hdr.count = cpu_to_le32(1); 1497 + msg->hdr.handle = cpu_to_le32(usr->handle); 1498 + msg->hdr.padding = cpu_to_le32(0); 1499 + msg->hdr.crc32 = cpu_to_le32(qdev->gen_crc(wrappers)); 1500 + 1501 + /* 1502 + * msg_xfer releases the mutex. 1503 + * We don't care about the return value of msg_xfer since we will not 1504 + * do anything different based on what happens. 1505 + * We ignore pending signals since one will be set if the user is 1506 + * killed, and we need to give the device a chance to clean up, 1507 + * otherwise DMA may still be in progress when we return. 
1508 + */ 1509 + rsp = msg_xfer(qdev, wrappers, qdev->next_seq_num - 1, true); 1510 + if (!IS_ERR(rsp)) 1511 + kfree(rsp); 1512 + free_wrapper_from_list(wrappers, wrapper); 1513 + } 1514 + 1515 + void wake_all_cntl(struct qaic_device *qdev) 1516 + { 1517 + struct xfer_queue_elem *elem; 1518 + struct xfer_queue_elem *i; 1519 + 1520 + mutex_lock(&qdev->cntl_mutex); 1521 + list_for_each_entry_safe(elem, i, &qdev->cntl_xfer_list, list) { 1522 + list_del_init(&elem->list); 1523 + complete_all(&elem->xfer_done); 1524 + } 1525 + mutex_unlock(&qdev->cntl_mutex); 1526 + }
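The control path above defers CRC checking: `qaic_control_open()` installs `valid_crc_stub` so the very first transaction (device status) is accepted regardless, and only once that response reveals whether the device generates CRCs does the driver validate for real. A minimal Python sketch of this bootstrap pattern — the class and message layout here are illustrative, not the driver's actual `wire_msg` format:

```python
import zlib

class ControlChannel:
    """Illustrative model of the qaic_control_open() CRC bootstrap."""

    def __init__(self):
        # Before the device-status transaction completes we cannot know
        # whether the device emits CRCs, so the validator is a stub that
        # accepts every response (the valid_crc_stub role).
        self.valid_crc = lambda payload, crc: True

    def device_status_done(self, device_has_crc):
        # Once the capability is known, swap in a real CRC32 check;
        # otherwise keep accepting everything.
        if device_has_crc:
            self.valid_crc = lambda payload, crc: zlib.crc32(payload) == crc

ch = ControlChannel()
assert ch.valid_crc(b"device-status response", 0)   # stub: anything passes
ch.device_status_done(device_has_crc=True)
good = zlib.crc32(b"msg")
assert ch.valid_crc(b"msg", good)
assert not ch.valid_crc(b"msg", (good + 1) & 0xFFFFFFFF)
```

The same swap happens for generation: the driver always computes a CRC32 over the wrappers, and the device simply ignores it when CRC is disabled.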
+1902
drivers/accel/qaic/qaic_data.c
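The data path in this file feeds `dbc_req` elements into a circular DMA Bridge queue tracked by head/tail pointers (see `copy_exec_reqs()` and `copy_partial_exec_reqs()` below). The free-space rule both functions apply — wrap the subtraction around the ring and hold one element back so that head == tail always unambiguously means an empty queue — can be sketched as a hypothetical helper (not driver code):

```python
def avail_slots(head: int, tail: int, nelem: int) -> int:
    # The producer writes at tail, the device consumes at head.
    # The subtraction wraps around the ring, and one slot is reserved
    # so a full queue never looks identical to an empty one.
    avail = head - tail
    if head <= tail:
        avail += nelem
    return avail - 1

assert avail_slots(0, 0, 8) == 7   # empty ring: all but the reserved slot
assert avail_slots(5, 3, 8) == 1   # nearly full
assert avail_slots(3, 5, 8) == 5   # tail has wrapped past head
```

A slice is only copied into the queue when `avail_slots()` covers all of its `nents` requests; otherwise the ioctl bails out with -EAGAIN rather than partially enqueue a slice.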
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + 3 + /* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ 4 + /* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 5 + 6 + #include <linux/bitfield.h> 7 + #include <linux/bits.h> 8 + #include <linux/completion.h> 9 + #include <linux/delay.h> 10 + #include <linux/dma-buf.h> 11 + #include <linux/dma-mapping.h> 12 + #include <linux/interrupt.h> 13 + #include <linux/kref.h> 14 + #include <linux/list.h> 15 + #include <linux/math64.h> 16 + #include <linux/mm.h> 17 + #include <linux/moduleparam.h> 18 + #include <linux/scatterlist.h> 19 + #include <linux/spinlock.h> 20 + #include <linux/srcu.h> 21 + #include <linux/types.h> 22 + #include <linux/uaccess.h> 23 + #include <linux/wait.h> 24 + #include <drm/drm_file.h> 25 + #include <drm/drm_gem.h> 26 + #include <drm/drm_print.h> 27 + #include <uapi/drm/qaic_accel.h> 28 + 29 + #include "qaic.h" 30 + 31 + #define SEM_VAL_MASK GENMASK_ULL(11, 0) 32 + #define SEM_INDEX_MASK GENMASK_ULL(4, 0) 33 + #define BULK_XFER BIT(3) 34 + #define GEN_COMPLETION BIT(4) 35 + #define INBOUND_XFER 1 36 + #define OUTBOUND_XFER 2 37 + #define REQHP_OFF 0x0 /* we read this */ 38 + #define REQTP_OFF 0x4 /* we write this */ 39 + #define RSPHP_OFF 0x8 /* we write this */ 40 + #define RSPTP_OFF 0xc /* we read this */ 41 + 42 + #define ENCODE_SEM(val, index, sync, cmd, flags) \ 43 + ({ \ 44 + FIELD_PREP(GENMASK(11, 0), (val)) | \ 45 + FIELD_PREP(GENMASK(20, 16), (index)) | \ 46 + FIELD_PREP(BIT(22), (sync)) | \ 47 + FIELD_PREP(GENMASK(26, 24), (cmd)) | \ 48 + FIELD_PREP(GENMASK(30, 29), (flags)) | \ 49 + FIELD_PREP(BIT(31), (cmd) ? 
1 : 0); \ 50 + }) 51 + #define NUM_EVENTS 128 52 + #define NUM_DELAYS 10 53 + 54 + static unsigned int wait_exec_default_timeout_ms = 5000; /* 5 sec default */ 55 + module_param(wait_exec_default_timeout_ms, uint, 0600); 56 + MODULE_PARM_DESC(wait_exec_default_timeout_ms, "Default timeout for DRM_IOCTL_QAIC_WAIT_BO"); 57 + 58 + static unsigned int datapath_poll_interval_us = 100; /* 100 usec default */ 59 + module_param(datapath_poll_interval_us, uint, 0600); 60 + MODULE_PARM_DESC(datapath_poll_interval_us, 61 + "Amount of time to sleep between activity when datapath polling is enabled"); 62 + 63 + struct dbc_req { 64 + /* 65 + * A request ID is assigned to each memory handle going in DMA queue. 66 + * As a single memory handle can enqueue multiple elements in DMA queue 67 + * all of them will have the same request ID. 68 + */ 69 + __le16 req_id; 70 + /* Future use */ 71 + __u8 seq_id; 72 + /* 73 + * Special encoded variable 74 + * 7 0 - Do not force to generate MSI after DMA is completed 75 + * 1 - Force to generate MSI after DMA is completed 76 + * 6:5 Reserved 77 + * 4 1 - Generate completion element in the response queue 78 + * 0 - No Completion Code 79 + * 3 0 - DMA request is a Link list transfer 80 + * 1 - DMA request is a Bulk transfer 81 + * 2 Reserved 82 + * 1:0 00 - No DMA transfer involved 83 + * 01 - DMA transfer is part of inbound transfer 84 + * 10 - DMA transfer has outbound transfer 85 + * 11 - NA 86 + */ 87 + __u8 cmd; 88 + __le32 resv; 89 + /* Source address for the transfer */ 90 + __le64 src_addr; 91 + /* Destination address for the transfer */ 92 + __le64 dest_addr; 93 + /* Length of transfer request */ 94 + __le32 len; 95 + __le32 resv2; 96 + /* Doorbell address */ 97 + __le64 db_addr; 98 + /* 99 + * Special encoded variable 100 + * 7 1 - Doorbell(db) write 101 + * 0 - No doorbell write 102 + * 6:2 Reserved 103 + * 1:0 00 - 32 bit access, db address must be aligned to 32bit-boundary 104 + * 01 - 16 bit access, db address must be aligned to 
16bit-boundary 105 + * 10 - 8 bit access, db address must be aligned to 8bit-boundary 106 + * 11 - Reserved 107 + */ 108 + __u8 db_len; 109 + __u8 resv3; 110 + __le16 resv4; 111 + /* 32 bit data written to doorbell address */ 112 + __le32 db_data; 113 + /* 114 + * Special encoded variable 115 + * All the fields of sem_cmdX are passed from user and all are ORed 116 + * together to form sem_cmd. 117 + * 0:11 Semaphore value 118 + * 15:12 Reserved 119 + * 20:16 Semaphore index 120 + * 21 Reserved 121 + * 22 Semaphore Sync 122 + * 23 Reserved 123 + * 26:24 Semaphore command 124 + * 28:27 Reserved 125 + * 29 Semaphore DMA out bound sync fence 126 + * 30 Semaphore DMA in bound sync fence 127 + * 31 Enable semaphore command 128 + */ 129 + __le32 sem_cmd0; 130 + __le32 sem_cmd1; 131 + __le32 sem_cmd2; 132 + __le32 sem_cmd3; 133 + } __packed; 134 + 135 + struct dbc_rsp { 136 + /* Request ID of the memory handle whose DMA transaction is completed */ 137 + __le16 req_id; 138 + /* Status of the DMA transaction. 
0 : Success otherwise failure */ 139 + __le16 status; 140 + } __packed; 141 + 142 + inline int get_dbc_req_elem_size(void) 143 + { 144 + return sizeof(struct dbc_req); 145 + } 146 + 147 + inline int get_dbc_rsp_elem_size(void) 148 + { 149 + return sizeof(struct dbc_rsp); 150 + } 151 + 152 + static void free_slice(struct kref *kref) 153 + { 154 + struct bo_slice *slice = container_of(kref, struct bo_slice, ref_count); 155 + 156 + list_del(&slice->slice); 157 + drm_gem_object_put(&slice->bo->base); 158 + sg_free_table(slice->sgt); 159 + kfree(slice->sgt); 160 + kfree(slice->reqs); 161 + kfree(slice); 162 + } 163 + 164 + static int clone_range_of_sgt_for_slice(struct qaic_device *qdev, struct sg_table **sgt_out, 165 + struct sg_table *sgt_in, u64 size, u64 offset) 166 + { 167 + int total_len, len, nents, offf = 0, offl = 0; 168 + struct scatterlist *sg, *sgn, *sgf, *sgl; 169 + struct sg_table *sgt; 170 + int ret, j; 171 + 172 + /* find out number of relevant nents needed for this mem */ 173 + total_len = 0; 174 + sgf = NULL; 175 + sgl = NULL; 176 + nents = 0; 177 + 178 + size = size ? 
size : PAGE_SIZE; 179 + for (sg = sgt_in->sgl; sg; sg = sg_next(sg)) { 180 + len = sg_dma_len(sg); 181 + 182 + if (!len) 183 + continue; 184 + if (offset >= total_len && offset < total_len + len) { 185 + sgf = sg; 186 + offf = offset - total_len; 187 + } 188 + if (sgf) 189 + nents++; 190 + if (offset + size >= total_len && 191 + offset + size <= total_len + len) { 192 + sgl = sg; 193 + offl = offset + size - total_len; 194 + break; 195 + } 196 + total_len += len; 197 + } 198 + 199 + if (!sgf || !sgl) { 200 + ret = -EINVAL; 201 + goto out; 202 + } 203 + 204 + sgt = kzalloc(sizeof(*sgt), GFP_KERNEL); 205 + if (!sgt) { 206 + ret = -ENOMEM; 207 + goto out; 208 + } 209 + 210 + ret = sg_alloc_table(sgt, nents, GFP_KERNEL); 211 + if (ret) 212 + goto free_sgt; 213 + 214 + /* copy relevant sg node and fix page and length */ 215 + sgn = sgf; 216 + for_each_sgtable_sg(sgt, sg, j) { 217 + memcpy(sg, sgn, sizeof(*sg)); 218 + if (sgn == sgf) { 219 + sg_dma_address(sg) += offf; 220 + sg_dma_len(sg) -= offf; 221 + sg_set_page(sg, sg_page(sgn), sg_dma_len(sg), offf); 222 + } else { 223 + offf = 0; 224 + } 225 + if (sgn == sgl) { 226 + sg_dma_len(sg) = offl - offf; 227 + sg_set_page(sg, sg_page(sgn), offl - offf, offf); 228 + sg_mark_end(sg); 229 + break; 230 + } 231 + sgn = sg_next(sgn); 232 + } 233 + 234 + *sgt_out = sgt; 235 + return ret; 236 + 237 + free_sgt: 238 + kfree(sgt); 239 + out: 240 + *sgt_out = NULL; 241 + return ret; 242 + } 243 + 244 + static int encode_reqs(struct qaic_device *qdev, struct bo_slice *slice, 245 + struct qaic_attach_slice_entry *req) 246 + { 247 + __le64 db_addr = cpu_to_le64(req->db_addr); 248 + __le32 db_data = cpu_to_le32(req->db_data); 249 + struct scatterlist *sg; 250 + __u8 cmd = BULK_XFER; 251 + int presync_sem; 252 + u64 dev_addr; 253 + __u8 db_len; 254 + int i; 255 + 256 + if (!slice->no_xfer) 257 + cmd |= (slice->dir == DMA_TO_DEVICE ? 
INBOUND_XFER : OUTBOUND_XFER); 258 + 259 + if (req->db_len && !IS_ALIGNED(req->db_addr, req->db_len / 8)) 260 + return -EINVAL; 261 + 262 + presync_sem = req->sem0.presync + req->sem1.presync + req->sem2.presync + req->sem3.presync; 263 + if (presync_sem > 1) 264 + return -EINVAL; 265 + 266 + presync_sem = req->sem0.presync << 0 | req->sem1.presync << 1 | 267 + req->sem2.presync << 2 | req->sem3.presync << 3; 268 + 269 + switch (req->db_len) { 270 + case 32: 271 + db_len = BIT(7); 272 + break; 273 + case 16: 274 + db_len = BIT(7) | 1; 275 + break; 276 + case 8: 277 + db_len = BIT(7) | 2; 278 + break; 279 + case 0: 280 + db_len = 0; /* doorbell is not active for this command */ 281 + break; 282 + default: 283 + return -EINVAL; /* should never hit this */ 284 + } 285 + 286 + /* 287 + * When we end up splitting up a single request (ie a buf slice) into 288 + * multiple DMA requests, we have to manage the sync data carefully. 289 + * There can only be one presync sem. That needs to be on every xfer 290 + * so that the DMA engine doesn't transfer data before the receiver is 291 + * ready. We only do the doorbell and postsync sems after the xfer. 292 + * To guarantee previous xfers for the request are complete, we use a 293 + * fence. 294 + */ 295 + dev_addr = req->dev_addr; 296 + for_each_sgtable_sg(slice->sgt, sg, i) { 297 + slice->reqs[i].cmd = cmd; 298 + slice->reqs[i].src_addr = cpu_to_le64(slice->dir == DMA_TO_DEVICE ? 299 + sg_dma_address(sg) : dev_addr); 300 + slice->reqs[i].dest_addr = cpu_to_le64(slice->dir == DMA_TO_DEVICE ? 301 + dev_addr : sg_dma_address(sg)); 302 + /* 303 + * sg_dma_len(sg) returns size of a DMA segment, maximum DMA 304 + * segment size is set to UINT_MAX by qaic and hence return 305 + * values of sg_dma_len(sg) can never exceed u32 range. So, 306 + * by down sizing we are not corrupting the value. 
307 + */ 308 + slice->reqs[i].len = cpu_to_le32((u32)sg_dma_len(sg)); 309 + switch (presync_sem) { 310 + case BIT(0): 311 + slice->reqs[i].sem_cmd0 = cpu_to_le32(ENCODE_SEM(req->sem0.val, 312 + req->sem0.index, 313 + req->sem0.presync, 314 + req->sem0.cmd, 315 + req->sem0.flags)); 316 + break; 317 + case BIT(1): 318 + slice->reqs[i].sem_cmd1 = cpu_to_le32(ENCODE_SEM(req->sem1.val, 319 + req->sem1.index, 320 + req->sem1.presync, 321 + req->sem1.cmd, 322 + req->sem1.flags)); 323 + break; 324 + case BIT(2): 325 + slice->reqs[i].sem_cmd2 = cpu_to_le32(ENCODE_SEM(req->sem2.val, 326 + req->sem2.index, 327 + req->sem2.presync, 328 + req->sem2.cmd, 329 + req->sem2.flags)); 330 + break; 331 + case BIT(3): 332 + slice->reqs[i].sem_cmd3 = cpu_to_le32(ENCODE_SEM(req->sem3.val, 333 + req->sem3.index, 334 + req->sem3.presync, 335 + req->sem3.cmd, 336 + req->sem3.flags)); 337 + break; 338 + } 339 + dev_addr += sg_dma_len(sg); 340 + } 341 + /* add post transfer stuff to last segment */ 342 + i--; 343 + slice->reqs[i].cmd |= GEN_COMPLETION; 344 + slice->reqs[i].db_addr = db_addr; 345 + slice->reqs[i].db_len = db_len; 346 + slice->reqs[i].db_data = db_data; 347 + /* 348 + * Add a fence if we have more than one request going to the hardware 349 + * representing the entirety of the user request, and the user request 350 + * has no presync condition. 351 + * Fences are expensive, so we try to avoid them. We rely on the 352 + * hardware behavior to avoid needing one when there is a presync 353 + * condition. When a presync exists, all requests for that same 354 + * presync will be queued into a fifo. Thus, since we queue the 355 + * post xfer activity only on the last request we queue, the hardware 356 + * will ensure that the last queued request is processed last, thus 357 + * making sure the post xfer activity happens at the right time without 358 + * a fence. 359 + */ 360 + if (i && !presync_sem) 361 + req->sem0.flags |= (slice->dir == DMA_TO_DEVICE ? 
362 + QAIC_SEM_INSYNCFENCE : QAIC_SEM_OUTSYNCFENCE); 363 + slice->reqs[i].sem_cmd0 = cpu_to_le32(ENCODE_SEM(req->sem0.val, req->sem0.index, 364 + req->sem0.presync, req->sem0.cmd, 365 + req->sem0.flags)); 366 + slice->reqs[i].sem_cmd1 = cpu_to_le32(ENCODE_SEM(req->sem1.val, req->sem1.index, 367 + req->sem1.presync, req->sem1.cmd, 368 + req->sem1.flags)); 369 + slice->reqs[i].sem_cmd2 = cpu_to_le32(ENCODE_SEM(req->sem2.val, req->sem2.index, 370 + req->sem2.presync, req->sem2.cmd, 371 + req->sem2.flags)); 372 + slice->reqs[i].sem_cmd3 = cpu_to_le32(ENCODE_SEM(req->sem3.val, req->sem3.index, 373 + req->sem3.presync, req->sem3.cmd, 374 + req->sem3.flags)); 375 + 376 + return 0; 377 + } 378 + 379 + static int qaic_map_one_slice(struct qaic_device *qdev, struct qaic_bo *bo, 380 + struct qaic_attach_slice_entry *slice_ent) 381 + { 382 + struct sg_table *sgt = NULL; 383 + struct bo_slice *slice; 384 + int ret; 385 + 386 + ret = clone_range_of_sgt_for_slice(qdev, &sgt, bo->sgt, slice_ent->size, slice_ent->offset); 387 + if (ret) 388 + goto out; 389 + 390 + slice = kmalloc(sizeof(*slice), GFP_KERNEL); 391 + if (!slice) { 392 + ret = -ENOMEM; 393 + goto free_sgt; 394 + } 395 + 396 + slice->reqs = kcalloc(sgt->nents, sizeof(*slice->reqs), GFP_KERNEL); 397 + if (!slice->reqs) { 398 + ret = -ENOMEM; 399 + goto free_slice; 400 + } 401 + 402 + slice->no_xfer = !slice_ent->size; 403 + slice->sgt = sgt; 404 + slice->nents = sgt->nents; 405 + slice->dir = bo->dir; 406 + slice->bo = bo; 407 + slice->size = slice_ent->size; 408 + slice->offset = slice_ent->offset; 409 + 410 + ret = encode_reqs(qdev, slice, slice_ent); 411 + if (ret) 412 + goto free_req; 413 + 414 + bo->total_slice_nents += sgt->nents; 415 + kref_init(&slice->ref_count); 416 + drm_gem_object_get(&bo->base); 417 + list_add_tail(&slice->slice, &bo->slices); 418 + 419 + return 0; 420 + 421 + free_req: 422 + kfree(slice->reqs); 423 + free_slice: 424 + kfree(slice); 425 + free_sgt: 426 + sg_free_table(sgt); 427 + kfree(sgt); 
428 + out: 429 + return ret; 430 + } 431 + 432 + static int create_sgt(struct qaic_device *qdev, struct sg_table **sgt_out, u64 size) 433 + { 434 + struct scatterlist *sg; 435 + struct sg_table *sgt; 436 + struct page **pages; 437 + int *pages_order; 438 + int buf_extra; 439 + int max_order; 440 + int nr_pages; 441 + int ret = 0; 442 + int i, j, k; 443 + int order; 444 + 445 + if (size) { 446 + nr_pages = DIV_ROUND_UP(size, PAGE_SIZE); 447 + /* 448 + * calculate how much extra we are going to allocate, to remove 449 + * later 450 + */ 451 + buf_extra = (PAGE_SIZE - size % PAGE_SIZE) % PAGE_SIZE; 452 + max_order = min(MAX_ORDER - 1, get_order(size)); 453 + } else { 454 + /* allocate a single page for book keeping */ 455 + nr_pages = 1; 456 + buf_extra = 0; 457 + max_order = 0; 458 + } 459 + 460 + pages = kvmalloc_array(nr_pages, sizeof(*pages) + sizeof(*pages_order), GFP_KERNEL); 461 + if (!pages) { 462 + ret = -ENOMEM; 463 + goto out; 464 + } 465 + pages_order = (void *)pages + sizeof(*pages) * nr_pages; 466 + 467 + /* 468 + * Allocate requested memory using alloc_pages. It is possible to allocate 469 + * the requested memory in multiple chunks by calling alloc_pages 470 + * multiple times. Use SG table to handle multiple allocated pages. 471 + */ 472 + i = 0; 473 + while (nr_pages > 0) { 474 + order = min(get_order(nr_pages * PAGE_SIZE), max_order); 475 + while (1) { 476 + pages[i] = alloc_pages(GFP_KERNEL | GFP_HIGHUSER | 477 + __GFP_NOWARN | __GFP_ZERO | 478 + (order ? 
__GFP_NORETRY : __GFP_RETRY_MAYFAIL), 479 + order); 480 + if (pages[i]) 481 + break; 482 + if (!order--) { 483 + ret = -ENOMEM; 484 + goto free_partial_alloc; 485 + } 486 + } 487 + 488 + max_order = order; 489 + pages_order[i] = order; 490 + 491 + nr_pages -= 1 << order; 492 + if (nr_pages <= 0) 493 + /* account for over allocation */ 494 + buf_extra += abs(nr_pages) * PAGE_SIZE; 495 + i++; 496 + } 497 + 498 + sgt = kmalloc(sizeof(*sgt), GFP_KERNEL); 499 + if (!sgt) { 500 + ret = -ENOMEM; 501 + goto free_partial_alloc; 502 + } 503 + 504 + if (sg_alloc_table(sgt, i, GFP_KERNEL)) { 505 + ret = -ENOMEM; 506 + goto free_sgt; 507 + } 508 + 509 + /* Populate the SG table with the allocated memory pages */ 510 + sg = sgt->sgl; 511 + for (k = 0; k < i; k++, sg = sg_next(sg)) { 512 + /* Last entry requires special handling */ 513 + if (k < i - 1) { 514 + sg_set_page(sg, pages[k], PAGE_SIZE << pages_order[k], 0); 515 + } else { 516 + sg_set_page(sg, pages[k], (PAGE_SIZE << pages_order[k]) - buf_extra, 0); 517 + sg_mark_end(sg); 518 + } 519 + } 520 + 521 + kvfree(pages); 522 + *sgt_out = sgt; 523 + return ret; 524 + 525 + free_sgt: 526 + kfree(sgt); 527 + free_partial_alloc: 528 + for (j = 0; j < i; j++) 529 + __free_pages(pages[j], pages_order[j]); 530 + kvfree(pages); 531 + out: 532 + *sgt_out = NULL; 533 + return ret; 534 + } 535 + 536 + static bool invalid_sem(struct qaic_sem *sem) 537 + { 538 + if (sem->val & ~SEM_VAL_MASK || sem->index & ~SEM_INDEX_MASK || 539 + !(sem->presync == 0 || sem->presync == 1) || sem->pad || 540 + sem->flags & ~(QAIC_SEM_INSYNCFENCE | QAIC_SEM_OUTSYNCFENCE) || 541 + sem->cmd > QAIC_SEM_WAIT_GT_0) 542 + return true; 543 + return false; 544 + } 545 + 546 + static int qaic_validate_req(struct qaic_device *qdev, struct qaic_attach_slice_entry *slice_ent, 547 + u32 count, u64 total_size) 548 + { 549 + int i; 550 + 551 + for (i = 0; i < count; i++) { 552 + if (!(slice_ent[i].db_len == 32 || slice_ent[i].db_len == 16 || 553 + slice_ent[i].db_len == 8 
|| slice_ent[i].db_len == 0) || 554 + invalid_sem(&slice_ent[i].sem0) || invalid_sem(&slice_ent[i].sem1) || 555 + invalid_sem(&slice_ent[i].sem2) || invalid_sem(&slice_ent[i].sem3)) 556 + return -EINVAL; 557 + 558 + if (slice_ent[i].offset + slice_ent[i].size > total_size) 559 + return -EINVAL; 560 + } 561 + 562 + return 0; 563 + } 564 + 565 + static void qaic_free_sgt(struct sg_table *sgt) 566 + { 567 + struct scatterlist *sg; 568 + 569 + for (sg = sgt->sgl; sg; sg = sg_next(sg)) 570 + if (sg_page(sg)) 571 + __free_pages(sg_page(sg), get_order(sg->length)); 572 + sg_free_table(sgt); 573 + kfree(sgt); 574 + } 575 + 576 + static void qaic_gem_print_info(struct drm_printer *p, unsigned int indent, 577 + const struct drm_gem_object *obj) 578 + { 579 + struct qaic_bo *bo = to_qaic_bo(obj); 580 + 581 + drm_printf_indent(p, indent, "user requested size=%llu\n", bo->size); 582 + } 583 + 584 + static const struct vm_operations_struct drm_vm_ops = { 585 + .open = drm_gem_vm_open, 586 + .close = drm_gem_vm_close, 587 + }; 588 + 589 + static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) 590 + { 591 + struct qaic_bo *bo = to_qaic_bo(obj); 592 + unsigned long offset = 0; 593 + struct scatterlist *sg; 594 + int ret; 595 + 596 + if (obj->import_attach) 597 + return -EINVAL; 598 + 599 + for (sg = bo->sgt->sgl; sg; sg = sg_next(sg)) { 600 + if (sg_page(sg)) { 601 + ret = remap_pfn_range(vma, vma->vm_start + offset, page_to_pfn(sg_page(sg)), 602 + sg->length, vma->vm_page_prot); 603 + if (ret) 604 + goto out; 605 + offset += sg->length; 606 + } 607 + } 608 + 609 + out: 610 + return ret; 611 + } 612 + 613 + static void qaic_free_object(struct drm_gem_object *obj) 614 + { 615 + struct qaic_bo *bo = to_qaic_bo(obj); 616 + 617 + if (obj->import_attach) { 618 + /* DMABUF/PRIME Path */ 619 + dma_buf_detach(obj->import_attach->dmabuf, obj->import_attach); 620 + dma_buf_put(obj->import_attach->dmabuf); 621 + } else { 622 + /* Private buffer allocation path 
*/ 623 + qaic_free_sgt(bo->sgt); 624 + } 625 + 626 + drm_gem_object_release(obj); 627 + kfree(bo); 628 + } 629 + 630 + static const struct drm_gem_object_funcs qaic_gem_funcs = { 631 + .free = qaic_free_object, 632 + .print_info = qaic_gem_print_info, 633 + .mmap = qaic_gem_object_mmap, 634 + .vm_ops = &drm_vm_ops, 635 + }; 636 + 637 + static struct qaic_bo *qaic_alloc_init_bo(void) 638 + { 639 + struct qaic_bo *bo; 640 + 641 + bo = kzalloc(sizeof(*bo), GFP_KERNEL); 642 + if (!bo) 643 + return ERR_PTR(-ENOMEM); 644 + 645 + INIT_LIST_HEAD(&bo->slices); 646 + init_completion(&bo->xfer_done); 647 + complete_all(&bo->xfer_done); 648 + 649 + return bo; 650 + } 651 + 652 + int qaic_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 653 + { 654 + struct qaic_create_bo *args = data; 655 + int usr_rcu_id, qdev_rcu_id; 656 + struct drm_gem_object *obj; 657 + struct qaic_device *qdev; 658 + struct qaic_user *usr; 659 + struct qaic_bo *bo; 660 + size_t size; 661 + int ret; 662 + 663 + if (args->pad) 664 + return -EINVAL; 665 + 666 + usr = file_priv->driver_priv; 667 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 668 + if (!usr->qddev) { 669 + ret = -ENODEV; 670 + goto unlock_usr_srcu; 671 + } 672 + 673 + qdev = usr->qddev->qdev; 674 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 675 + if (qdev->in_reset) { 676 + ret = -ENODEV; 677 + goto unlock_dev_srcu; 678 + } 679 + 680 + size = PAGE_ALIGN(args->size); 681 + if (size == 0) { 682 + ret = -EINVAL; 683 + goto unlock_dev_srcu; 684 + } 685 + 686 + bo = qaic_alloc_init_bo(); 687 + if (IS_ERR(bo)) { 688 + ret = PTR_ERR(bo); 689 + goto unlock_dev_srcu; 690 + } 691 + obj = &bo->base; 692 + 693 + drm_gem_private_object_init(dev, obj, size); 694 + 695 + obj->funcs = &qaic_gem_funcs; 696 + ret = create_sgt(qdev, &bo->sgt, size); 697 + if (ret) 698 + goto free_bo; 699 + 700 + bo->size = args->size; 701 + 702 + ret = drm_gem_handle_create(file_priv, obj, &args->handle); 703 + if (ret) 704 + goto free_sgt; 
705 + 706 + bo->handle = args->handle; 707 + drm_gem_object_put(obj); 708 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 709 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 710 + 711 + return 0; 712 + 713 + free_sgt: 714 + qaic_free_sgt(bo->sgt); 715 + free_bo: 716 + kfree(bo); 717 + unlock_dev_srcu: 718 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 719 + unlock_usr_srcu: 720 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 721 + return ret; 722 + } 723 + 724 + int qaic_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 725 + { 726 + struct qaic_mmap_bo *args = data; 727 + int usr_rcu_id, qdev_rcu_id; 728 + struct drm_gem_object *obj; 729 + struct qaic_device *qdev; 730 + struct qaic_user *usr; 731 + int ret; 732 + 733 + usr = file_priv->driver_priv; 734 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 735 + if (!usr->qddev) { 736 + ret = -ENODEV; 737 + goto unlock_usr_srcu; 738 + } 739 + 740 + qdev = usr->qddev->qdev; 741 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 742 + if (qdev->in_reset) { 743 + ret = -ENODEV; 744 + goto unlock_dev_srcu; 745 + } 746 + 747 + obj = drm_gem_object_lookup(file_priv, args->handle); 748 + if (!obj) { 749 + ret = -ENOENT; 750 + goto unlock_dev_srcu; 751 + } 752 + 753 + ret = drm_gem_create_mmap_offset(obj); 754 + if (ret == 0) 755 + args->offset = drm_vma_node_offset_addr(&obj->vma_node); 756 + 757 + drm_gem_object_put(obj); 758 + 759 + unlock_dev_srcu: 760 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 761 + unlock_usr_srcu: 762 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 763 + return ret; 764 + } 765 + 766 + struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) 767 + { 768 + struct dma_buf_attachment *attach; 769 + struct drm_gem_object *obj; 770 + struct qaic_bo *bo; 771 + size_t size; 772 + int ret; 773 + 774 + bo = qaic_alloc_init_bo(); 775 + if (IS_ERR(bo)) { 776 + ret = PTR_ERR(bo); 777 + goto out; 778 + } 779 + 780 + obj = &bo->base; 781 
+ get_dma_buf(dma_buf); 782 + 783 + attach = dma_buf_attach(dma_buf, dev->dev); 784 + if (IS_ERR(attach)) { 785 + ret = PTR_ERR(attach); 786 + goto attach_fail; 787 + } 788 + 789 + size = PAGE_ALIGN(attach->dmabuf->size); 790 + if (size == 0) { 791 + ret = -EINVAL; 792 + goto size_align_fail; 793 + } 794 + 795 + drm_gem_private_object_init(dev, obj, size); 796 + /* 797 + * skipping dma_buf_map_attachment() as we do not know the direction 798 + * just yet. Once the direction is known in the subsequent IOCTL to 799 + * attach slicing, we can do it then. 800 + */ 801 + 802 + obj->funcs = &qaic_gem_funcs; 803 + obj->import_attach = attach; 804 + obj->resv = dma_buf->resv; 805 + 806 + return obj; 807 + 808 + size_align_fail: 809 + dma_buf_detach(dma_buf, attach); 810 + attach_fail: 811 + dma_buf_put(dma_buf); 812 + kfree(bo); 813 + out: 814 + return ERR_PTR(ret); 815 + } 816 + 817 + static int qaic_prepare_import_bo(struct qaic_bo *bo, struct qaic_attach_slice_hdr *hdr) 818 + { 819 + struct drm_gem_object *obj = &bo->base; 820 + struct sg_table *sgt; 821 + int ret; 822 + 823 + if (obj->import_attach->dmabuf->size < hdr->size) 824 + return -EINVAL; 825 + 826 + sgt = dma_buf_map_attachment(obj->import_attach, hdr->dir); 827 + if (IS_ERR(sgt)) { 828 + ret = PTR_ERR(sgt); 829 + return ret; 830 + } 831 + 832 + bo->sgt = sgt; 833 + bo->size = hdr->size; 834 + 835 + return 0; 836 + } 837 + 838 + static int qaic_prepare_export_bo(struct qaic_device *qdev, struct qaic_bo *bo, 839 + struct qaic_attach_slice_hdr *hdr) 840 + { 841 + int ret; 842 + 843 + if (bo->size != hdr->size) 844 + return -EINVAL; 845 + 846 + ret = dma_map_sgtable(&qdev->pdev->dev, bo->sgt, hdr->dir, 0); 847 + if (ret) 848 + return -EFAULT; 849 + 850 + return 0; 851 + } 852 + 853 + static int qaic_prepare_bo(struct qaic_device *qdev, struct qaic_bo *bo, 854 + struct qaic_attach_slice_hdr *hdr) 855 + { 856 + int ret; 857 + 858 + if (bo->base.import_attach) 859 + ret = qaic_prepare_import_bo(bo, hdr); 860 + else 
861 + ret = qaic_prepare_export_bo(qdev, bo, hdr); 862 + 863 + if (ret == 0) 864 + bo->dir = hdr->dir; 865 + 866 + return ret; 867 + } 868 + 869 + static void qaic_unprepare_import_bo(struct qaic_bo *bo) 870 + { 871 + dma_buf_unmap_attachment(bo->base.import_attach, bo->sgt, bo->dir); 872 + bo->sgt = NULL; 873 + bo->size = 0; 874 + } 875 + 876 + static void qaic_unprepare_export_bo(struct qaic_device *qdev, struct qaic_bo *bo) 877 + { 878 + dma_unmap_sgtable(&qdev->pdev->dev, bo->sgt, bo->dir, 0); 879 + } 880 + 881 + static void qaic_unprepare_bo(struct qaic_device *qdev, struct qaic_bo *bo) 882 + { 883 + if (bo->base.import_attach) 884 + qaic_unprepare_import_bo(bo); 885 + else 886 + qaic_unprepare_export_bo(qdev, bo); 887 + 888 + bo->dir = 0; 889 + } 890 + 891 + static void qaic_free_slices_bo(struct qaic_bo *bo) 892 + { 893 + struct bo_slice *slice, *temp; 894 + 895 + list_for_each_entry_safe(slice, temp, &bo->slices, slice) 896 + kref_put(&slice->ref_count, free_slice); 897 + } 898 + 899 + static int qaic_attach_slicing_bo(struct qaic_device *qdev, struct qaic_bo *bo, 900 + struct qaic_attach_slice_hdr *hdr, 901 + struct qaic_attach_slice_entry *slice_ent) 902 + { 903 + int ret, i; 904 + 905 + for (i = 0; i < hdr->count; i++) { 906 + ret = qaic_map_one_slice(qdev, bo, &slice_ent[i]); 907 + if (ret) { 908 + qaic_free_slices_bo(bo); 909 + return ret; 910 + } 911 + } 912 + 913 + if (bo->total_slice_nents > qdev->dbc[hdr->dbc_id].nelem) { 914 + qaic_free_slices_bo(bo); 915 + return -ENOSPC; 916 + } 917 + 918 + bo->sliced = true; 919 + bo->nr_slice = hdr->count; 920 + list_add_tail(&bo->bo_list, &qdev->dbc[hdr->dbc_id].bo_lists); 921 + 922 + return 0; 923 + } 924 + 925 + int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 926 + { 927 + struct qaic_attach_slice_entry *slice_ent; 928 + struct qaic_attach_slice *args = data; 929 + struct dma_bridge_chan *dbc; 930 + int usr_rcu_id, qdev_rcu_id; 931 + struct drm_gem_object *obj; 
932 + struct qaic_device *qdev; 933 + unsigned long arg_size; 934 + struct qaic_user *usr; 935 + u8 __user *user_data; 936 + struct qaic_bo *bo; 937 + int ret; 938 + 939 + usr = file_priv->driver_priv; 940 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 941 + if (!usr->qddev) { 942 + ret = -ENODEV; 943 + goto unlock_usr_srcu; 944 + } 945 + 946 + qdev = usr->qddev->qdev; 947 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 948 + if (qdev->in_reset) { 949 + ret = -ENODEV; 950 + goto unlock_dev_srcu; 951 + } 952 + 953 + if (args->hdr.count == 0) { 954 + ret = -EINVAL; 955 + goto unlock_dev_srcu; 956 + } 957 + 958 + arg_size = args->hdr.count * sizeof(*slice_ent); 959 + if (arg_size / args->hdr.count != sizeof(*slice_ent)) { 960 + ret = -EINVAL; 961 + goto unlock_dev_srcu; 962 + } 963 + 964 + if (args->hdr.dbc_id >= qdev->num_dbc) { 965 + ret = -EINVAL; 966 + goto unlock_dev_srcu; 967 + } 968 + 969 + if (args->hdr.size == 0) { 970 + ret = -EINVAL; 971 + goto unlock_dev_srcu; 972 + } 973 + 974 + if (!(args->hdr.dir == DMA_TO_DEVICE || args->hdr.dir == DMA_FROM_DEVICE)) { 975 + ret = -EINVAL; 976 + goto unlock_dev_srcu; 977 + } 978 + 979 + dbc = &qdev->dbc[args->hdr.dbc_id]; 980 + if (dbc->usr != usr) { 981 + ret = -EINVAL; 982 + goto unlock_dev_srcu; 983 + } 984 + 985 + if (args->data == 0) { 986 + ret = -EINVAL; 987 + goto unlock_dev_srcu; 988 + } 989 + 990 + user_data = u64_to_user_ptr(args->data); 991 + 992 + slice_ent = kzalloc(arg_size, GFP_KERNEL); 993 + if (!slice_ent) { 994 + ret = -EINVAL; 995 + goto unlock_dev_srcu; 996 + } 997 + 998 + ret = copy_from_user(slice_ent, user_data, arg_size); 999 + if (ret) { 1000 + ret = -EFAULT; 1001 + goto free_slice_ent; 1002 + } 1003 + 1004 + ret = qaic_validate_req(qdev, slice_ent, args->hdr.count, args->hdr.size); 1005 + if (ret) 1006 + goto free_slice_ent; 1007 + 1008 + obj = drm_gem_object_lookup(file_priv, args->hdr.handle); 1009 + if (!obj) { 1010 + ret = -ENOENT; 1011 + goto free_slice_ent; 1012 + } 1013 + 1014 + bo = 
to_qaic_bo(obj); 1015 + 1016 + ret = qaic_prepare_bo(qdev, bo, &args->hdr); 1017 + if (ret) 1018 + goto put_bo; 1019 + 1020 + ret = qaic_attach_slicing_bo(qdev, bo, &args->hdr, slice_ent); 1021 + if (ret) 1022 + goto unprepare_bo; 1023 + 1024 + if (args->hdr.dir == DMA_TO_DEVICE) 1025 + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, args->hdr.dir); 1026 + 1027 + bo->dbc = dbc; 1028 + drm_gem_object_put(obj); 1029 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1030 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1031 + 1032 + return 0; 1033 + 1034 + unprepare_bo: 1035 + qaic_unprepare_bo(qdev, bo); 1036 + put_bo: 1037 + drm_gem_object_put(obj); 1038 + free_slice_ent: 1039 + kfree(slice_ent); 1040 + unlock_dev_srcu: 1041 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1042 + unlock_usr_srcu: 1043 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1044 + return ret; 1045 + } 1046 + 1047 + static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, u32 dbc_id, 1048 + u32 head, u32 *ptail) 1049 + { 1050 + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; 1051 + struct dbc_req *reqs = slice->reqs; 1052 + u32 tail = *ptail; 1053 + u32 avail; 1054 + 1055 + avail = head - tail; 1056 + if (head <= tail) 1057 + avail += dbc->nelem; 1058 + 1059 + --avail; 1060 + 1061 + if (avail < slice->nents) 1062 + return -EAGAIN; 1063 + 1064 + if (tail + slice->nents > dbc->nelem) { 1065 + avail = dbc->nelem - tail; 1066 + avail = min_t(u32, avail, slice->nents); 1067 + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1068 + sizeof(*reqs) * avail); 1069 + reqs += avail; 1070 + avail = slice->nents - avail; 1071 + if (avail) 1072 + memcpy(dbc->req_q_base, reqs, sizeof(*reqs) * avail); 1073 + } else { 1074 + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1075 + sizeof(*reqs) * slice->nents); 1076 + } 1077 + 1078 + *ptail = (tail + slice->nents) % dbc->nelem; 1079 + 1080 + return 0; 1081 + } 1082 + 1083 + /* 1084 + * Based on the 
value of resize we may only need to transmit first_n 1085 + * entries and the last entry, with last_bytes to send from the last entry. 1086 + * Note that first_n could be 0. 1087 + */ 1088 + static inline int copy_partial_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, 1089 + u64 resize, u32 dbc_id, u32 head, u32 *ptail) 1090 + { 1091 + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; 1092 + struct dbc_req *reqs = slice->reqs; 1093 + struct dbc_req *last_req; 1094 + u32 tail = *ptail; 1095 + u64 total_bytes; 1096 + u64 last_bytes; 1097 + u32 first_n; 1098 + u32 avail; 1099 + int ret; 1100 + int i; 1101 + 1102 + avail = head - tail; 1103 + if (head <= tail) 1104 + avail += dbc->nelem; 1105 + 1106 + --avail; 1107 + 1108 + total_bytes = 0; 1109 + for (i = 0; i < slice->nents; i++) { 1110 + total_bytes += le32_to_cpu(reqs[i].len); 1111 + if (total_bytes >= resize) 1112 + break; 1113 + } 1114 + 1115 + if (total_bytes < resize) { 1116 + /* User space should have used the full buffer path. */ 1117 + ret = -EINVAL; 1118 + return ret; 1119 + } 1120 + 1121 + first_n = i; 1122 + last_bytes = i ? resize + le32_to_cpu(reqs[i].len) - total_bytes : resize; 1123 + 1124 + if (avail < (first_n + 1)) 1125 + return -EAGAIN; 1126 + 1127 + if (first_n) { 1128 + if (tail + first_n > dbc->nelem) { 1129 + avail = dbc->nelem - tail; 1130 + avail = min_t(u32, avail, first_n); 1131 + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1132 + sizeof(*reqs) * avail); 1133 + last_req = reqs + avail; 1134 + avail = first_n - avail; 1135 + if (avail) 1136 + memcpy(dbc->req_q_base, last_req, sizeof(*reqs) * avail); 1137 + } else { 1138 + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, 1139 + sizeof(*reqs) * first_n); 1140 + } 1141 + } 1142 + 1143 + /* Copy over the last entry. Here we need to adjust len to the left over 1144 + * size, and set src and dst to the entry it is copied to. 
1145 + */ 1146 + last_req = dbc->req_q_base + (tail + first_n) % dbc->nelem * get_dbc_req_elem_size(); 1147 + memcpy(last_req, reqs + slice->nents - 1, sizeof(*reqs)); 1148 + 1149 + /* 1150 + * last_bytes holds size of a DMA segment, maximum DMA segment size is 1151 + * set to UINT_MAX by qaic and hence last_bytes can never exceed u32 1152 + * range. So, by down sizing we are not corrupting the value. 1153 + */ 1154 + last_req->len = cpu_to_le32((u32)last_bytes); 1155 + last_req->src_addr = reqs[first_n].src_addr; 1156 + last_req->dest_addr = reqs[first_n].dest_addr; 1157 + 1158 + *ptail = (tail + first_n + 1) % dbc->nelem; 1159 + 1160 + return 0; 1161 + } 1162 + 1163 + static int send_bo_list_to_device(struct qaic_device *qdev, struct drm_file *file_priv, 1164 + struct qaic_execute_entry *exec, unsigned int count, 1165 + bool is_partial, struct dma_bridge_chan *dbc, u32 head, 1166 + u32 *tail) 1167 + { 1168 + struct qaic_partial_execute_entry *pexec = (struct qaic_partial_execute_entry *)exec; 1169 + struct drm_gem_object *obj; 1170 + struct bo_slice *slice; 1171 + unsigned long flags; 1172 + struct qaic_bo *bo; 1173 + bool queued; 1174 + int i, j; 1175 + int ret; 1176 + 1177 + for (i = 0; i < count; i++) { 1178 + /* 1179 + * ref count will be decremented when the transfer of this 1180 + * buffer is complete. It is inside dbc_irq_threaded_fn(). 1181 + */ 1182 + obj = drm_gem_object_lookup(file_priv, 1183 + is_partial ? 
pexec[i].handle : exec[i].handle); 1184 + if (!obj) { 1185 + ret = -ENOENT; 1186 + goto failed_to_send_bo; 1187 + } 1188 + 1189 + bo = to_qaic_bo(obj); 1190 + 1191 + if (!bo->sliced) { 1192 + ret = -EINVAL; 1193 + goto failed_to_send_bo; 1194 + } 1195 + 1196 + if (is_partial && pexec[i].resize > bo->size) { 1197 + ret = -EINVAL; 1198 + goto failed_to_send_bo; 1199 + } 1200 + 1201 + spin_lock_irqsave(&dbc->xfer_lock, flags); 1202 + queued = bo->queued; 1203 + bo->queued = true; 1204 + if (queued) { 1205 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1206 + ret = -EINVAL; 1207 + goto failed_to_send_bo; 1208 + } 1209 + 1210 + bo->req_id = dbc->next_req_id++; 1211 + 1212 + list_for_each_entry(slice, &bo->slices, slice) { 1213 + /* 1214 + * If this slice does not fall under the given 1215 + * resize then skip this slice and continue the loop 1216 + */ 1217 + if (is_partial && pexec[i].resize && pexec[i].resize <= slice->offset) 1218 + continue; 1219 + 1220 + for (j = 0; j < slice->nents; j++) 1221 + slice->reqs[j].req_id = cpu_to_le16(bo->req_id); 1222 + 1223 + /* 1224 + * If it is a partial execute ioctl call then check if 1225 + * resize has cut this slice short then do a partial copy 1226 + * else do complete copy 1227 + */ 1228 + if (is_partial && pexec[i].resize && 1229 + pexec[i].resize < slice->offset + slice->size) 1230 + ret = copy_partial_exec_reqs(qdev, slice, 1231 + pexec[i].resize - slice->offset, 1232 + dbc->id, head, tail); 1233 + else 1234 + ret = copy_exec_reqs(qdev, slice, dbc->id, head, tail); 1235 + if (ret) { 1236 + bo->queued = false; 1237 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1238 + goto failed_to_send_bo; 1239 + } 1240 + } 1241 + reinit_completion(&bo->xfer_done); 1242 + list_add_tail(&bo->xfer_list, &dbc->xfer_list); 1243 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1244 + dma_sync_sgtable_for_device(&qdev->pdev->dev, bo->sgt, bo->dir); 1245 + } 1246 + 1247 + return 0; 1248 + 1249 + failed_to_send_bo: 1250 + if (likely(obj)) 
1251 + drm_gem_object_put(obj); 1252 + for (j = 0; j < i; j++) { 1253 + spin_lock_irqsave(&dbc->xfer_lock, flags); 1254 + bo = list_last_entry(&dbc->xfer_list, struct qaic_bo, xfer_list); 1255 + obj = &bo->base; 1256 + bo->queued = false; 1257 + list_del(&bo->xfer_list); 1258 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1259 + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); 1260 + drm_gem_object_put(obj); 1261 + } 1262 + return ret; 1263 + } 1264 + 1265 + static void update_profiling_data(struct drm_file *file_priv, 1266 + struct qaic_execute_entry *exec, unsigned int count, 1267 + bool is_partial, u64 received_ts, u64 submit_ts, u32 queue_level) 1268 + { 1269 + struct qaic_partial_execute_entry *pexec = (struct qaic_partial_execute_entry *)exec; 1270 + struct drm_gem_object *obj; 1271 + struct qaic_bo *bo; 1272 + int i; 1273 + 1274 + for (i = 0; i < count; i++) { 1275 + /* 1276 + * Since we already committed the BO to hardware, the only way 1277 + * this should fail is a pending signal. We can't cancel the 1278 + * submit to hardware, so we have to just skip the profiling 1279 + * data. In case the signal is not fatal to the process, we 1280 + * return success so that the user doesn't try to resubmit. 1281 + */ 1282 + obj = drm_gem_object_lookup(file_priv, 1283 + is_partial ? 
pexec[i].handle : exec[i].handle); 1284 + if (!obj) 1285 + break; 1286 + bo = to_qaic_bo(obj); 1287 + bo->perf_stats.req_received_ts = received_ts; 1288 + bo->perf_stats.req_submit_ts = submit_ts; 1289 + bo->perf_stats.queue_level_before = queue_level; 1290 + queue_level += bo->total_slice_nents; 1291 + drm_gem_object_put(obj); 1292 + } 1293 + } 1294 + 1295 + static int __qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv, 1296 + bool is_partial) 1297 + { 1298 + struct qaic_partial_execute_entry *pexec; 1299 + struct qaic_execute *args = data; 1300 + struct qaic_execute_entry *exec; 1301 + struct dma_bridge_chan *dbc; 1302 + int usr_rcu_id, qdev_rcu_id; 1303 + struct qaic_device *qdev; 1304 + struct qaic_user *usr; 1305 + u8 __user *user_data; 1306 + unsigned long n; 1307 + u64 received_ts; 1308 + u32 queue_level; 1309 + u64 submit_ts; 1310 + int rcu_id; 1311 + u32 head; 1312 + u32 tail; 1313 + u64 size; 1314 + int ret; 1315 + 1316 + received_ts = ktime_get_ns(); 1317 + 1318 + size = is_partial ? 
sizeof(*pexec) : sizeof(*exec); 1319 + 1320 + n = (unsigned long)size * args->hdr.count; 1321 + if (args->hdr.count == 0 || n / args->hdr.count != size) 1322 + return -EINVAL; 1323 + 1324 + user_data = u64_to_user_ptr(args->data); 1325 + 1326 + exec = kcalloc(args->hdr.count, size, GFP_KERNEL); 1327 + pexec = (struct qaic_partial_execute_entry *)exec; 1328 + if (!exec) 1329 + return -ENOMEM; 1330 + 1331 + if (copy_from_user(exec, user_data, n)) { 1332 + ret = -EFAULT; 1333 + goto free_exec; 1334 + } 1335 + 1336 + usr = file_priv->driver_priv; 1337 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 1338 + if (!usr->qddev) { 1339 + ret = -ENODEV; 1340 + goto unlock_usr_srcu; 1341 + } 1342 + 1343 + qdev = usr->qddev->qdev; 1344 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 1345 + if (qdev->in_reset) { 1346 + ret = -ENODEV; 1347 + goto unlock_dev_srcu; 1348 + } 1349 + 1350 + if (args->hdr.dbc_id >= qdev->num_dbc) { 1351 + ret = -EINVAL; 1352 + goto unlock_dev_srcu; 1353 + } 1354 + 1355 + dbc = &qdev->dbc[args->hdr.dbc_id]; 1356 + 1357 + rcu_id = srcu_read_lock(&dbc->ch_lock); 1358 + if (!dbc->usr || dbc->usr->handle != usr->handle) { 1359 + ret = -EPERM; 1360 + goto release_ch_rcu; 1361 + } 1362 + 1363 + head = readl(dbc->dbc_base + REQHP_OFF); 1364 + tail = readl(dbc->dbc_base + REQTP_OFF); 1365 + 1366 + if (head == U32_MAX || tail == U32_MAX) { 1367 + /* PCI link error */ 1368 + ret = -ENODEV; 1369 + goto release_ch_rcu; 1370 + } 1371 + 1372 + queue_level = head <= tail ? 
tail - head : dbc->nelem - (head - tail); 1373 + 1374 + ret = send_bo_list_to_device(qdev, file_priv, exec, args->hdr.count, is_partial, dbc, 1375 + head, &tail); 1376 + if (ret) 1377 + goto release_ch_rcu; 1378 + 1379 + /* Finalize commit to hardware */ 1380 + submit_ts = ktime_get_ns(); 1381 + writel(tail, dbc->dbc_base + REQTP_OFF); 1382 + 1383 + update_profiling_data(file_priv, exec, args->hdr.count, is_partial, received_ts, 1384 + submit_ts, queue_level); 1385 + 1386 + if (datapath_polling) 1387 + schedule_work(&dbc->poll_work); 1388 + 1389 + release_ch_rcu: 1390 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1391 + unlock_dev_srcu: 1392 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1393 + unlock_usr_srcu: 1394 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1395 + free_exec: 1396 + kfree(exec); 1397 + return ret; 1398 + } 1399 + 1400 + int qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 1401 + { 1402 + return __qaic_execute_bo_ioctl(dev, data, file_priv, false); 1403 + } 1404 + 1405 + int qaic_partial_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 1406 + { 1407 + return __qaic_execute_bo_ioctl(dev, data, file_priv, true); 1408 + } 1409 + 1410 + /* 1411 + * Our interrupt handling is a bit more complicated than a simple ideal, but 1412 + * sadly necessary. 1413 + * 1414 + * Each dbc has a completion queue. Entries in the queue correspond to DMA 1415 + * requests which the device has processed. The hardware already has a built 1416 + * in irq mitigation. When the device puts an entry into the queue, it will 1417 + * only trigger an interrupt if the queue was empty. Therefore, when adding 1418 + * the Nth event to a non-empty queue, the hardware doesn't trigger an 1419 + * interrupt. This means the host doesn't get additional interrupts signaling 1420 + * the same thing - the queue has something to process. 1421 + * This behavior can be overridden in the DMA request. 
1422 + * This means that when the host receives an interrupt, it is required to 1423 + * drain the queue. 1424 + * 1425 + * This behavior is what NAPI attempts to accomplish, although we can't use 1426 + * NAPI as we don't have a netdev. We use threaded irqs instead. 1427 + * 1428 + * However, there is a situation where the host drains the queue fast enough 1429 + * that every event causes an interrupt. Typically this is not a problem as 1430 + * the rate of events would be low. However, that is not the case with 1431 + * lprnet for example. On an Intel Xeon D-2191 where we run 8 instances of 1432 + * lprnet, the host receives roughly 80k interrupts per second from the device 1433 + * (per /proc/interrupts). While NAPI documentation indicates the host should 1434 + * just chug along, sadly that behavior causes instability in some hosts. 1435 + * 1436 + * Therefore, we implement an interrupt disable scheme similar to NAPI. The 1437 + * key difference is that we will delay after draining the queue for a small 1438 + * time to allow additional events to come in via polling. Using the above 1439 + * lprnet workload, this reduces the number of interrupts processed from 1440 + * ~80k/sec to about 64 in 5 minutes and appears to solve the system 1441 + * instability. 
1442 + */ 1443 + irqreturn_t dbc_irq_handler(int irq, void *data) 1444 + { 1445 + struct dma_bridge_chan *dbc = data; 1446 + int rcu_id; 1447 + u32 head; 1448 + u32 tail; 1449 + 1450 + rcu_id = srcu_read_lock(&dbc->ch_lock); 1451 + 1452 + if (!dbc->usr) { 1453 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1454 + return IRQ_HANDLED; 1455 + } 1456 + 1457 + head = readl(dbc->dbc_base + RSPHP_OFF); 1458 + if (head == U32_MAX) { /* PCI link error */ 1459 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1460 + return IRQ_NONE; 1461 + } 1462 + 1463 + tail = readl(dbc->dbc_base + RSPTP_OFF); 1464 + if (tail == U32_MAX) { /* PCI link error */ 1465 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1466 + return IRQ_NONE; 1467 + } 1468 + 1469 + if (head == tail) { /* queue empty */ 1470 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1471 + return IRQ_NONE; 1472 + } 1473 + 1474 + disable_irq_nosync(irq); 1475 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1476 + return IRQ_WAKE_THREAD; 1477 + } 1478 + 1479 + void irq_polling_work(struct work_struct *work) 1480 + { 1481 + struct dma_bridge_chan *dbc = container_of(work, struct dma_bridge_chan, poll_work); 1482 + unsigned long flags; 1483 + int rcu_id; 1484 + u32 head; 1485 + u32 tail; 1486 + 1487 + rcu_id = srcu_read_lock(&dbc->ch_lock); 1488 + 1489 + while (1) { 1490 + if (dbc->qdev->in_reset) { 1491 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1492 + return; 1493 + } 1494 + if (!dbc->usr) { 1495 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1496 + return; 1497 + } 1498 + spin_lock_irqsave(&dbc->xfer_lock, flags); 1499 + if (list_empty(&dbc->xfer_list)) { 1500 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1501 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1502 + return; 1503 + } 1504 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1505 + 1506 + head = readl(dbc->dbc_base + RSPHP_OFF); 1507 + if (head == U32_MAX) { /* PCI link error */ 1508 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1509 + return; 1510 + } 1511 + 1512 + tail = readl(dbc->dbc_base + 
RSPTP_OFF); 1513 + if (tail == U32_MAX) { /* PCI link error */ 1514 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1515 + return; 1516 + } 1517 + 1518 + if (head != tail) { 1519 + irq_wake_thread(dbc->irq, dbc); 1520 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1521 + return; 1522 + } 1523 + 1524 + cond_resched(); 1525 + usleep_range(datapath_poll_interval_us, 2 * datapath_poll_interval_us); 1526 + } 1527 + } 1528 + 1529 + irqreturn_t dbc_irq_threaded_fn(int irq, void *data) 1530 + { 1531 + struct dma_bridge_chan *dbc = data; 1532 + int event_count = NUM_EVENTS; 1533 + int delay_count = NUM_DELAYS; 1534 + struct qaic_device *qdev; 1535 + struct qaic_bo *bo, *i; 1536 + struct dbc_rsp *rsp; 1537 + unsigned long flags; 1538 + int rcu_id; 1539 + u16 status; 1540 + u16 req_id; 1541 + u32 head; 1542 + u32 tail; 1543 + 1544 + rcu_id = srcu_read_lock(&dbc->ch_lock); 1545 + 1546 + head = readl(dbc->dbc_base + RSPHP_OFF); 1547 + if (head == U32_MAX) /* PCI link error */ 1548 + goto error_out; 1549 + 1550 + qdev = dbc->qdev; 1551 + read_fifo: 1552 + 1553 + if (!event_count) { 1554 + event_count = NUM_EVENTS; 1555 + cond_resched(); 1556 + } 1557 + 1558 + /* 1559 + * if this channel isn't assigned or gets unassigned during processing 1560 + * we have nothing further to do 1561 + */ 1562 + if (!dbc->usr) 1563 + goto error_out; 1564 + 1565 + tail = readl(dbc->dbc_base + RSPTP_OFF); 1566 + if (tail == U32_MAX) /* PCI link error */ 1567 + goto error_out; 1568 + 1569 + if (head == tail) { /* queue empty */ 1570 + if (delay_count) { 1571 + --delay_count; 1572 + usleep_range(100, 200); 1573 + goto read_fifo; /* check for a new event */ 1574 + } 1575 + goto normal_out; 1576 + } 1577 + 1578 + delay_count = NUM_DELAYS; 1579 + while (head != tail) { 1580 + if (!event_count) 1581 + break; 1582 + --event_count; 1583 + rsp = dbc->rsp_q_base + head * sizeof(*rsp); 1584 + req_id = le16_to_cpu(rsp->req_id); 1585 + status = le16_to_cpu(rsp->status); 1586 + if (status) 1587 + pci_dbg(qdev->pdev, 
"req_id %d failed with status %d\n", req_id, status); 1588 + spin_lock_irqsave(&dbc->xfer_lock, flags); 1589 + /* 1590 + * A BO can receive multiple interrupts, since a BO can be 1591 + * divided into multiple slices and a buffer receives as many 1592 + * interrupts as slices. So until it receives interrupts for 1593 + * all the slices we cannot mark that buffer complete. 1594 + */ 1595 + list_for_each_entry_safe(bo, i, &dbc->xfer_list, xfer_list) { 1596 + if (bo->req_id == req_id) 1597 + bo->nr_slice_xfer_done++; 1598 + else 1599 + continue; 1600 + 1601 + if (bo->nr_slice_xfer_done < bo->nr_slice) 1602 + break; 1603 + 1604 + /* 1605 + * At this point we have received all the interrupts for 1606 + * BO, which means BO execution is complete. 1607 + */ 1608 + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); 1609 + bo->nr_slice_xfer_done = 0; 1610 + bo->queued = false; 1611 + list_del(&bo->xfer_list); 1612 + bo->perf_stats.req_processed_ts = ktime_get_ns(); 1613 + complete_all(&bo->xfer_done); 1614 + drm_gem_object_put(&bo->base); 1615 + break; 1616 + } 1617 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1618 + head = (head + 1) % dbc->nelem; 1619 + } 1620 + 1621 + /* 1622 + * Update the head pointer of response queue and let the device know 1623 + * that we have consumed elements from the queue. 
1624 + */ 1625 + writel(head, dbc->dbc_base + RSPHP_OFF); 1626 + 1627 + /* elements might have been put in the queue while we were processing */ 1628 + goto read_fifo; 1629 + 1630 + normal_out: 1631 + if (likely(!datapath_polling)) 1632 + enable_irq(irq); 1633 + else 1634 + schedule_work(&dbc->poll_work); 1635 + /* checking the fifo and enabling irqs is a race, missed event check */ 1636 + tail = readl(dbc->dbc_base + RSPTP_OFF); 1637 + if (tail != U32_MAX && head != tail) { 1638 + if (likely(!datapath_polling)) 1639 + disable_irq_nosync(irq); 1640 + goto read_fifo; 1641 + } 1642 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1643 + return IRQ_HANDLED; 1644 + 1645 + error_out: 1646 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1647 + if (likely(!datapath_polling)) 1648 + enable_irq(irq); 1649 + else 1650 + schedule_work(&dbc->poll_work); 1651 + 1652 + return IRQ_HANDLED; 1653 + } 1654 + 1655 + int qaic_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 1656 + { 1657 + struct qaic_wait *args = data; 1658 + int usr_rcu_id, qdev_rcu_id; 1659 + struct dma_bridge_chan *dbc; 1660 + struct drm_gem_object *obj; 1661 + struct qaic_device *qdev; 1662 + unsigned long timeout; 1663 + struct qaic_user *usr; 1664 + struct qaic_bo *bo; 1665 + int rcu_id; 1666 + int ret; 1667 + 1668 + usr = file_priv->driver_priv; 1669 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 1670 + if (!usr->qddev) { 1671 + ret = -ENODEV; 1672 + goto unlock_usr_srcu; 1673 + } 1674 + 1675 + qdev = usr->qddev->qdev; 1676 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 1677 + if (qdev->in_reset) { 1678 + ret = -ENODEV; 1679 + goto unlock_dev_srcu; 1680 + } 1681 + 1682 + if (args->pad != 0) { 1683 + ret = -EINVAL; 1684 + goto unlock_dev_srcu; 1685 + } 1686 + 1687 + if (args->dbc_id >= qdev->num_dbc) { 1688 + ret = -EINVAL; 1689 + goto unlock_dev_srcu; 1690 + } 1691 + 1692 + dbc = &qdev->dbc[args->dbc_id]; 1693 + 1694 + rcu_id = srcu_read_lock(&dbc->ch_lock); 1695 + if (dbc->usr != usr) { 
1696 + ret = -EPERM; 1697 + goto unlock_ch_srcu; 1698 + } 1699 + 1700 + obj = drm_gem_object_lookup(file_priv, args->handle); 1701 + if (!obj) { 1702 + ret = -ENOENT; 1703 + goto unlock_ch_srcu; 1704 + } 1705 + 1706 + bo = to_qaic_bo(obj); 1707 + timeout = args->timeout ? args->timeout : wait_exec_default_timeout_ms; 1708 + timeout = msecs_to_jiffies(timeout); 1709 + ret = wait_for_completion_interruptible_timeout(&bo->xfer_done, timeout); 1710 + if (!ret) { 1711 + ret = -ETIMEDOUT; 1712 + goto put_obj; 1713 + } 1714 + if (ret > 0) 1715 + ret = 0; 1716 + 1717 + if (!dbc->usr) 1718 + ret = -EPERM; 1719 + 1720 + put_obj: 1721 + drm_gem_object_put(obj); 1722 + unlock_ch_srcu: 1723 + srcu_read_unlock(&dbc->ch_lock, rcu_id); 1724 + unlock_dev_srcu: 1725 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1726 + unlock_usr_srcu: 1727 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1728 + return ret; 1729 + } 1730 + 1731 + int qaic_perf_stats_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) 1732 + { 1733 + struct qaic_perf_stats_entry *ent = NULL; 1734 + struct qaic_perf_stats *args = data; 1735 + int usr_rcu_id, qdev_rcu_id; 1736 + struct drm_gem_object *obj; 1737 + struct qaic_device *qdev; 1738 + struct qaic_user *usr; 1739 + struct qaic_bo *bo; 1740 + int ret, i; 1741 + 1742 + usr = file_priv->driver_priv; 1743 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 1744 + if (!usr->qddev) { 1745 + ret = -ENODEV; 1746 + goto unlock_usr_srcu; 1747 + } 1748 + 1749 + qdev = usr->qddev->qdev; 1750 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 1751 + if (qdev->in_reset) { 1752 + ret = -ENODEV; 1753 + goto unlock_dev_srcu; 1754 + } 1755 + 1756 + if (args->hdr.dbc_id >= qdev->num_dbc) { 1757 + ret = -EINVAL; 1758 + goto unlock_dev_srcu; 1759 + } 1760 + 1761 + ent = kcalloc(args->hdr.count, sizeof(*ent), GFP_KERNEL); 1762 + if (!ent) { 1763 + ret = -ENOMEM; 1764 + goto unlock_dev_srcu; 1765 + } 1766 + 1767 + ret = copy_from_user(ent,
u64_to_user_ptr(args->data), args->hdr.count * sizeof(*ent)); 1768 + if (ret) { 1769 + ret = -EFAULT; 1770 + goto free_ent; 1771 + } 1772 + 1773 + for (i = 0; i < args->hdr.count; i++) { 1774 + obj = drm_gem_object_lookup(file_priv, ent[i].handle); 1775 + if (!obj) { 1776 + ret = -ENOENT; 1777 + goto free_ent; 1778 + } 1779 + bo = to_qaic_bo(obj); 1780 + /* 1781 + * If the perf stats ioctl is called before the wait ioctl has 1782 + * completed, the latency information is invalid. 1783 + */ 1784 + if (bo->perf_stats.req_processed_ts < bo->perf_stats.req_submit_ts) { 1785 + ent[i].device_latency_us = 0; 1786 + } else { 1787 + ent[i].device_latency_us = div_u64((bo->perf_stats.req_processed_ts - 1788 + bo->perf_stats.req_submit_ts), 1000); 1789 + } 1790 + ent[i].submit_latency_us = div_u64((bo->perf_stats.req_submit_ts - 1791 + bo->perf_stats.req_received_ts), 1000); 1792 + ent[i].queue_level_before = bo->perf_stats.queue_level_before; 1793 + ent[i].num_queue_element = bo->total_slice_nents; 1794 + drm_gem_object_put(obj); 1795 + } 1796 + 1797 + if (copy_to_user(u64_to_user_ptr(args->data), ent, args->hdr.count * sizeof(*ent))) 1798 + ret = -EFAULT; 1799 + 1800 + free_ent: 1801 + kfree(ent); 1802 + unlock_dev_srcu: 1803 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 1804 + unlock_usr_srcu: 1805 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 1806 + return ret; 1807 + } 1808 + 1809 + static void empty_xfer_list(struct qaic_device *qdev, struct dma_bridge_chan *dbc) 1810 + { 1811 + unsigned long flags; 1812 + struct qaic_bo *bo; 1813 + 1814 + spin_lock_irqsave(&dbc->xfer_lock, flags); 1815 + while (!list_empty(&dbc->xfer_list)) { 1816 + bo = list_first_entry(&dbc->xfer_list, typeof(*bo), xfer_list); 1817 + bo->queued = false; 1818 + list_del(&bo->xfer_list); 1819 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1820 + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); 1821 + complete_all(&bo->xfer_done); 1822 + drm_gem_object_put(&bo->base); 1823 +
spin_lock_irqsave(&dbc->xfer_lock, flags); 1824 + } 1825 + spin_unlock_irqrestore(&dbc->xfer_lock, flags); 1826 + } 1827 + 1828 + int disable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr) 1829 + { 1830 + if (!qdev->dbc[dbc_id].usr || qdev->dbc[dbc_id].usr->handle != usr->handle) 1831 + return -EPERM; 1832 + 1833 + qdev->dbc[dbc_id].usr = NULL; 1834 + synchronize_srcu(&qdev->dbc[dbc_id].ch_lock); 1835 + return 0; 1836 + } 1837 + 1838 + /** 1839 + * enable_dbc - Enable the DBC. DBCs are disabled by removing the user 1840 + * context. Adding the user context back to the DBC enables it. This function 1841 + * trusts the DBC ID passed and expects the DBC to be disabled. 1842 + * @qdev: qaic device handle 1843 + * @dbc_id: ID of the DBC 1844 + * @usr: User context 1845 + */ 1846 + void enable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr) 1847 + { 1848 + qdev->dbc[dbc_id].usr = usr; 1849 + } 1850 + 1851 + void wakeup_dbc(struct qaic_device *qdev, u32 dbc_id) 1852 + { 1853 + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; 1854 + 1855 + dbc->usr = NULL; 1856 + empty_xfer_list(qdev, dbc); 1857 + synchronize_srcu(&dbc->ch_lock); 1858 + } 1859 + 1860 + void release_dbc(struct qaic_device *qdev, u32 dbc_id) 1861 + { 1862 + struct bo_slice *slice, *slice_temp; 1863 + struct qaic_bo *bo, *bo_temp; 1864 + struct dma_bridge_chan *dbc; 1865 + 1866 + dbc = &qdev->dbc[dbc_id]; 1867 + if (!dbc->in_use) 1868 + return; 1869 + 1870 + wakeup_dbc(qdev, dbc_id); 1871 + 1872 + dma_free_coherent(&qdev->pdev->dev, dbc->total_size, dbc->req_q_base, dbc->dma_addr); 1873 + dbc->total_size = 0; 1874 + dbc->req_q_base = NULL; 1875 + dbc->dma_addr = 0; 1876 + dbc->nelem = 0; 1877 + dbc->usr = NULL; 1878 + 1879 + list_for_each_entry_safe(bo, bo_temp, &dbc->bo_lists, bo_list) { 1880 + list_for_each_entry_safe(slice, slice_temp, &bo->slices, slice) 1881 + kref_put(&slice->ref_count, free_slice); 1882 + bo->sliced = false; 1883 + INIT_LIST_HEAD(&bo->slices); 1884 +
bo->total_slice_nents = 0; 1885 + bo->dir = 0; 1886 + bo->dbc = NULL; 1887 + bo->nr_slice = 0; 1888 + bo->nr_slice_xfer_done = 0; 1889 + bo->queued = false; 1890 + bo->req_id = 0; 1891 + init_completion(&bo->xfer_done); 1892 + complete_all(&bo->xfer_done); 1893 + list_del(&bo->bo_list); 1894 + bo->perf_stats.req_received_ts = 0; 1895 + bo->perf_stats.req_submit_ts = 0; 1896 + bo->perf_stats.req_processed_ts = 0; 1897 + bo->perf_stats.queue_level_before = 0; 1898 + } 1899 + 1900 + dbc->in_use = false; 1901 + wake_up(&dbc->dbc_release); 1902 + }
+647
drivers/accel/qaic/qaic_drv.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + 3 + /* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ 4 + /* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ 5 + 6 + #include <linux/delay.h> 7 + #include <linux/dma-mapping.h> 8 + #include <linux/idr.h> 9 + #include <linux/interrupt.h> 10 + #include <linux/list.h> 11 + #include <linux/kref.h> 12 + #include <linux/mhi.h> 13 + #include <linux/module.h> 14 + #include <linux/msi.h> 15 + #include <linux/mutex.h> 16 + #include <linux/pci.h> 17 + #include <linux/spinlock.h> 18 + #include <linux/workqueue.h> 19 + #include <linux/wait.h> 20 + #include <drm/drm_accel.h> 21 + #include <drm/drm_drv.h> 22 + #include <drm/drm_file.h> 23 + #include <drm/drm_gem.h> 24 + #include <drm/drm_ioctl.h> 25 + #include <uapi/drm/qaic_accel.h> 26 + 27 + #include "mhi_controller.h" 28 + #include "mhi_qaic_ctrl.h" 29 + #include "qaic.h" 30 + 31 + MODULE_IMPORT_NS(DMA_BUF); 32 + 33 + #define PCI_DEV_AIC100 0xa100 34 + #define QAIC_NAME "qaic" 35 + #define QAIC_DESC "Qualcomm Cloud AI Accelerators" 36 + #define CNTL_MAJOR 5 37 + #define CNTL_MINOR 0 38 + 39 + bool datapath_polling; 40 + module_param(datapath_polling, bool, 0400); 41 + MODULE_PARM_DESC(datapath_polling, "Operate the datapath in polling mode"); 42 + static bool link_up; 43 + static DEFINE_IDA(qaic_usrs); 44 + 45 + static int qaic_create_drm_device(struct qaic_device *qdev, s32 partition_id); 46 + static void qaic_destroy_drm_device(struct qaic_device *qdev, s32 partition_id); 47 + 48 + static void free_usr(struct kref *kref) 49 + { 50 + struct qaic_user *usr = container_of(kref, struct qaic_user, ref_count); 51 + 52 + cleanup_srcu_struct(&usr->qddev_lock); 53 + ida_free(&qaic_usrs, usr->handle); 54 + kfree(usr); 55 + } 56 + 57 + static int qaic_open(struct drm_device *dev, struct drm_file *file) 58 + { 59 + struct qaic_drm_device *qddev = dev->dev_private; 60 + struct qaic_device *qdev = qddev->qdev; 61 + struct 
qaic_user *usr; 62 + int rcu_id; 63 + int ret; 64 + 65 + rcu_id = srcu_read_lock(&qdev->dev_lock); 66 + if (qdev->in_reset) { 67 + ret = -ENODEV; 68 + goto dev_unlock; 69 + } 70 + 71 + usr = kmalloc(sizeof(*usr), GFP_KERNEL); 72 + if (!usr) { 73 + ret = -ENOMEM; 74 + goto dev_unlock; 75 + } 76 + 77 + usr->handle = ida_alloc(&qaic_usrs, GFP_KERNEL); 78 + if (usr->handle < 0) { 79 + ret = usr->handle; 80 + goto free_usr; 81 + } 82 + usr->qddev = qddev; 83 + atomic_set(&usr->chunk_id, 0); 84 + init_srcu_struct(&usr->qddev_lock); 85 + kref_init(&usr->ref_count); 86 + 87 + ret = mutex_lock_interruptible(&qddev->users_mutex); 88 + if (ret) 89 + goto cleanup_usr; 90 + 91 + list_add(&usr->node, &qddev->users); 92 + mutex_unlock(&qddev->users_mutex); 93 + 94 + file->driver_priv = usr; 95 + 96 + srcu_read_unlock(&qdev->dev_lock, rcu_id); 97 + return 0; 98 + 99 + cleanup_usr: 100 + cleanup_srcu_struct(&usr->qddev_lock); 101 + free_usr: 102 + kfree(usr); 103 + dev_unlock: 104 + srcu_read_unlock(&qdev->dev_lock, rcu_id); 105 + return ret; 106 + } 107 + 108 + static void qaic_postclose(struct drm_device *dev, struct drm_file *file) 109 + { 110 + struct qaic_user *usr = file->driver_priv; 111 + struct qaic_drm_device *qddev; 112 + struct qaic_device *qdev; 113 + int qdev_rcu_id; 114 + int usr_rcu_id; 115 + int i; 116 + 117 + qddev = usr->qddev; 118 + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); 119 + if (qddev) { 120 + qdev = qddev->qdev; 121 + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); 122 + if (!qdev->in_reset) { 123 + qaic_release_usr(qdev, usr); 124 + for (i = 0; i < qdev->num_dbc; ++i) 125 + if (qdev->dbc[i].usr && qdev->dbc[i].usr->handle == usr->handle) 126 + release_dbc(qdev, i); 127 + } 128 + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); 129 + 130 + mutex_lock(&qddev->users_mutex); 131 + if (!list_empty(&usr->node)) 132 + list_del_init(&usr->node); 133 + mutex_unlock(&qddev->users_mutex); 134 + } 135 + 136 + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); 137 + 
kref_put(&usr->ref_count, free_usr); 138 + 139 + file->driver_priv = NULL; 140 + } 141 + 142 + DEFINE_DRM_ACCEL_FOPS(qaic_accel_fops); 143 + 144 + static const struct drm_ioctl_desc qaic_drm_ioctls[] = { 145 + DRM_IOCTL_DEF_DRV(QAIC_MANAGE, qaic_manage_ioctl, 0), 146 + DRM_IOCTL_DEF_DRV(QAIC_CREATE_BO, qaic_create_bo_ioctl, 0), 147 + DRM_IOCTL_DEF_DRV(QAIC_MMAP_BO, qaic_mmap_bo_ioctl, 0), 148 + DRM_IOCTL_DEF_DRV(QAIC_ATTACH_SLICE_BO, qaic_attach_slice_bo_ioctl, 0), 149 + DRM_IOCTL_DEF_DRV(QAIC_EXECUTE_BO, qaic_execute_bo_ioctl, 0), 150 + DRM_IOCTL_DEF_DRV(QAIC_PARTIAL_EXECUTE_BO, qaic_partial_execute_bo_ioctl, 0), 151 + DRM_IOCTL_DEF_DRV(QAIC_WAIT_BO, qaic_wait_bo_ioctl, 0), 152 + DRM_IOCTL_DEF_DRV(QAIC_PERF_STATS_BO, qaic_perf_stats_bo_ioctl, 0), 153 + }; 154 + 155 + static const struct drm_driver qaic_accel_driver = { 156 + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL, 157 + 158 + .name = QAIC_NAME, 159 + .desc = QAIC_DESC, 160 + .date = "20190618", 161 + 162 + .fops = &qaic_accel_fops, 163 + .open = qaic_open, 164 + .postclose = qaic_postclose, 165 + 166 + .ioctls = qaic_drm_ioctls, 167 + .num_ioctls = ARRAY_SIZE(qaic_drm_ioctls), 168 + .prime_fd_to_handle = drm_gem_prime_fd_to_handle, 169 + .gem_prime_import = qaic_gem_prime_import, 170 + }; 171 + 172 + static int qaic_create_drm_device(struct qaic_device *qdev, s32 partition_id) 173 + { 174 + struct qaic_drm_device *qddev; 175 + struct drm_device *ddev; 176 + struct device *pdev; 177 + int ret; 178 + 179 + /* Hold off implementing partitions until the uapi is determined */ 180 + if (partition_id != QAIC_NO_PARTITION) 181 + return -EINVAL; 182 + 183 + pdev = &qdev->pdev->dev; 184 + 185 + qddev = kzalloc(sizeof(*qddev), GFP_KERNEL); 186 + if (!qddev) 187 + return -ENOMEM; 188 + 189 + ddev = drm_dev_alloc(&qaic_accel_driver, pdev); 190 + if (IS_ERR(ddev)) { 191 + ret = PTR_ERR(ddev); 192 + goto ddev_fail; 193 + } 194 + 195 + ddev->dev_private = qddev; 196 + qddev->ddev = ddev; 197 + 198 + qddev->qdev = 
qdev; 199 + qddev->partition_id = partition_id; 200 + INIT_LIST_HEAD(&qddev->users); 201 + mutex_init(&qddev->users_mutex); 202 + 203 + qdev->qddev = qddev; 204 + 205 + ret = drm_dev_register(ddev, 0); 206 + if (ret) { 207 + pci_dbg(qdev->pdev, "%s: drm_dev_register failed %d\n", __func__, ret); 208 + goto drm_reg_fail; 209 + } 210 + 211 + return 0; 212 + 213 + drm_reg_fail: 214 + mutex_destroy(&qddev->users_mutex); 215 + qdev->qddev = NULL; 216 + drm_dev_put(ddev); 217 + ddev_fail: 218 + kfree(qddev); 219 + return ret; 220 + } 221 + 222 + static void qaic_destroy_drm_device(struct qaic_device *qdev, s32 partition_id) 223 + { 224 + struct qaic_drm_device *qddev; 225 + struct qaic_user *usr; 226 + 227 + qddev = qdev->qddev; 228 + 229 + /* 230 + * Existing users get unresolvable errors until they close their FDs. 231 + * Need to sync carefully with users calling close(). The 232 + * list of users can be modified elsewhere when the lock isn't 233 + * held here, but synchronizing the srcu with the mutex held 234 + * could deadlock. Grab the mutex so that the list will be 235 + * unmodified. The user we get will exist as long as the 236 + * lock is held. Signal that the qddev is going away, and 237 + * grab a reference to the user so they don't go away for 238 + * synchronize_srcu(). Then release the mutex to avoid 239 + * deadlock and make sure the user has observed the signal. 240 + * With the lock released, we cannot maintain any state of the 241 + * user list.
242 + */ 243 + mutex_lock(&qddev->users_mutex); 244 + while (!list_empty(&qddev->users)) { 245 + usr = list_first_entry(&qddev->users, struct qaic_user, node); 246 + list_del_init(&usr->node); 247 + kref_get(&usr->ref_count); 248 + usr->qddev = NULL; 249 + mutex_unlock(&qddev->users_mutex); 250 + synchronize_srcu(&usr->qddev_lock); 251 + kref_put(&usr->ref_count, free_usr); 252 + mutex_lock(&qddev->users_mutex); 253 + } 254 + mutex_unlock(&qddev->users_mutex); 255 + 256 + if (qddev->ddev) { 257 + drm_dev_unregister(qddev->ddev); 258 + drm_dev_put(qddev->ddev); 259 + } 260 + 261 + kfree(qddev); 262 + } 263 + 264 + static int qaic_mhi_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) 265 + { 266 + struct qaic_device *qdev; 267 + u16 major, minor; 268 + int ret; 269 + 270 + /* 271 + * Invoking this function indicates that the control channel to the 272 + * device is available. We use that as a signal to indicate that 273 + * the device side firmware has booted. The device side firmware 274 + * manages the device resources, so we need to communicate with it 275 + * via the control channel in order to utilize the device. Therefore 276 + * we wait until this signal to create the drm dev that userspace will 277 + * use to control the device, because without the device side firmware, 278 + * userspace can't do anything useful. 279 + */ 280 + 281 + qdev = pci_get_drvdata(to_pci_dev(mhi_dev->mhi_cntrl->cntrl_dev)); 282 + 283 + qdev->in_reset = false; 284 + 285 + dev_set_drvdata(&mhi_dev->dev, qdev); 286 + qdev->cntl_ch = mhi_dev; 287 + 288 + ret = qaic_control_open(qdev); 289 + if (ret) { 290 + pci_dbg(qdev->pdev, "%s: control_open failed %d\n", __func__, ret); 291 + return ret; 292 + } 293 + 294 + ret = get_cntl_version(qdev, NULL, &major, &minor); 295 + if (ret || major != CNTL_MAJOR || minor > CNTL_MINOR) { 296 + pci_err(qdev->pdev, "%s: Control protocol version (%d.%d) not supported. Supported version is (%d.%d). 
Ret: %d\n", 297 + __func__, major, minor, CNTL_MAJOR, CNTL_MINOR, ret); 298 + ret = -EINVAL; 299 + goto close_control; 300 + } 301 + 302 + ret = qaic_create_drm_device(qdev, QAIC_NO_PARTITION); 303 + 304 + return ret; 305 + 306 + close_control: 307 + qaic_control_close(qdev); 308 + return ret; 309 + } 310 + 311 + static void qaic_mhi_remove(struct mhi_device *mhi_dev) 312 + { 313 + /* This is redundant since we have already observed the device crash */ 314 + } 315 + 316 + static void qaic_notify_reset(struct qaic_device *qdev) 317 + { 318 + int i; 319 + 320 + qdev->in_reset = true; 321 + /* wake up any waiters to avoid waiting for timeouts at sync */ 322 + wake_all_cntl(qdev); 323 + for (i = 0; i < qdev->num_dbc; ++i) 324 + wakeup_dbc(qdev, i); 325 + synchronize_srcu(&qdev->dev_lock); 326 + } 327 + 328 + void qaic_dev_reset_clean_local_state(struct qaic_device *qdev, bool exit_reset) 329 + { 330 + int i; 331 + 332 + qaic_notify_reset(qdev); 333 + 334 + /* remove drmdevs to prevent new users from coming in */ 335 + qaic_destroy_drm_device(qdev, QAIC_NO_PARTITION); 336 + 337 + /* start tearing things down */ 338 + for (i = 0; i < qdev->num_dbc; ++i) 339 + release_dbc(qdev, i); 340 + 341 + if (exit_reset) 342 + qdev->in_reset = false; 343 + } 344 + 345 + static struct qaic_device *create_qdev(struct pci_dev *pdev, const struct pci_device_id *id) 346 + { 347 + struct qaic_device *qdev; 348 + int i; 349 + 350 + qdev = devm_kzalloc(&pdev->dev, sizeof(*qdev), GFP_KERNEL); 351 + if (!qdev) 352 + return NULL; 353 + 354 + if (id->device == PCI_DEV_AIC100) { 355 + qdev->num_dbc = 16; 356 + qdev->dbc = devm_kcalloc(&pdev->dev, qdev->num_dbc, sizeof(*qdev->dbc), GFP_KERNEL); 357 + if (!qdev->dbc) 358 + return NULL; 359 + } 360 + 361 + qdev->cntl_wq = alloc_workqueue("qaic_cntl", WQ_UNBOUND, 0); 362 + if (!qdev->cntl_wq) 363 + return NULL; 364 + 365 + pci_set_drvdata(pdev, qdev); 366 + qdev->pdev = pdev; 367 + 368 + mutex_init(&qdev->cntl_mutex); 369 + 
INIT_LIST_HEAD(&qdev->cntl_xfer_list); 370 + init_srcu_struct(&qdev->dev_lock); 371 + 372 + for (i = 0; i < qdev->num_dbc; ++i) { 373 + spin_lock_init(&qdev->dbc[i].xfer_lock); 374 + qdev->dbc[i].qdev = qdev; 375 + qdev->dbc[i].id = i; 376 + INIT_LIST_HEAD(&qdev->dbc[i].xfer_list); 377 + init_srcu_struct(&qdev->dbc[i].ch_lock); 378 + init_waitqueue_head(&qdev->dbc[i].dbc_release); 379 + INIT_LIST_HEAD(&qdev->dbc[i].bo_lists); 380 + } 381 + 382 + return qdev; 383 + } 384 + 385 + static void cleanup_qdev(struct qaic_device *qdev) 386 + { 387 + int i; 388 + 389 + for (i = 0; i < qdev->num_dbc; ++i) 390 + cleanup_srcu_struct(&qdev->dbc[i].ch_lock); 391 + cleanup_srcu_struct(&qdev->dev_lock); 392 + pci_set_drvdata(qdev->pdev, NULL); 393 + destroy_workqueue(qdev->cntl_wq); 394 + } 395 + 396 + static int init_pci(struct qaic_device *qdev, struct pci_dev *pdev) 397 + { 398 + int bars; 399 + int ret; 400 + 401 + bars = pci_select_bars(pdev, IORESOURCE_MEM); 402 + 403 + /* make sure the device has the expected BARs */ 404 + if (bars != (BIT(0) | BIT(2) | BIT(4))) { 405 + pci_dbg(pdev, "%s: expected BARs 0, 2, and 4 not found in device. 
Found 0x%x\n", 406 + __func__, bars); 407 + return -EINVAL; 408 + } 409 + 410 + ret = pcim_enable_device(pdev); 411 + if (ret) 412 + return ret; 413 + 414 + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); 415 + if (ret) 416 + return ret; 417 + ret = dma_set_max_seg_size(&pdev->dev, UINT_MAX); 418 + if (ret) 419 + return ret; 420 + 421 + qdev->bar_0 = devm_ioremap_resource(&pdev->dev, &pdev->resource[0]); 422 + if (IS_ERR(qdev->bar_0)) 423 + return PTR_ERR(qdev->bar_0); 424 + 425 + qdev->bar_2 = devm_ioremap_resource(&pdev->dev, &pdev->resource[2]); 426 + if (IS_ERR(qdev->bar_2)) 427 + return PTR_ERR(qdev->bar_2); 428 + 429 + /* Managed release since we use pcim_enable_device above */ 430 + pci_set_master(pdev); 431 + 432 + return 0; 433 + } 434 + 435 + static int init_msi(struct qaic_device *qdev, struct pci_dev *pdev) 436 + { 437 + int mhi_irq; 438 + int ret; 439 + int i; 440 + 441 + /* Managed release since we use pcim_enable_device */ 442 + ret = pci_alloc_irq_vectors(pdev, 1, 32, PCI_IRQ_MSI); 443 + if (ret < 0) 444 + return ret; 445 + 446 + if (ret < 32) { 447 + pci_err(pdev, "%s: Requested 32 MSIs. 
Obtained %d MSIs which is less than the 32 required.\n", 448 + __func__, ret); 449 + return -ENODEV; 450 + } 451 + 452 + mhi_irq = pci_irq_vector(pdev, 0); 453 + if (mhi_irq < 0) 454 + return mhi_irq; 455 + 456 + for (i = 0; i < qdev->num_dbc; ++i) { 457 + ret = devm_request_threaded_irq(&pdev->dev, pci_irq_vector(pdev, i + 1), 458 + dbc_irq_handler, dbc_irq_threaded_fn, IRQF_SHARED, 459 + "qaic_dbc", &qdev->dbc[i]); 460 + if (ret) 461 + return ret; 462 + 463 + if (datapath_polling) { 464 + qdev->dbc[i].irq = pci_irq_vector(pdev, i + 1); 465 + disable_irq_nosync(qdev->dbc[i].irq); 466 + INIT_WORK(&qdev->dbc[i].poll_work, irq_polling_work); 467 + } 468 + } 469 + 470 + return mhi_irq; 471 + } 472 + 473 + static int qaic_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) 474 + { 475 + struct qaic_device *qdev; 476 + int mhi_irq; 477 + int ret; 478 + int i; 479 + 480 + qdev = create_qdev(pdev, id); 481 + if (!qdev) 482 + return -ENOMEM; 483 + 484 + ret = init_pci(qdev, pdev); 485 + if (ret) 486 + goto cleanup_qdev; 487 + 488 + for (i = 0; i < qdev->num_dbc; ++i) 489 + qdev->dbc[i].dbc_base = qdev->bar_2 + QAIC_DBC_OFF(i); 490 + 491 + mhi_irq = init_msi(qdev, pdev); 492 + if (mhi_irq < 0) { 493 + ret = mhi_irq; 494 + goto cleanup_qdev; 495 + } 496 + 497 + qdev->mhi_cntrl = qaic_mhi_register_controller(pdev, qdev->bar_0, mhi_irq); 498 + if (IS_ERR(qdev->mhi_cntrl)) { 499 + ret = PTR_ERR(qdev->mhi_cntrl); 500 + goto cleanup_qdev; 501 + } 502 + 503 + return 0; 504 + 505 + cleanup_qdev: 506 + cleanup_qdev(qdev); 507 + return ret; 508 + } 509 + 510 + static void qaic_pci_remove(struct pci_dev *pdev) 511 + { 512 + struct qaic_device *qdev = pci_get_drvdata(pdev); 513 + 514 + if (!qdev) 515 + return; 516 + 517 + qaic_dev_reset_clean_local_state(qdev, false); 518 + qaic_mhi_free_controller(qdev->mhi_cntrl, link_up); 519 + cleanup_qdev(qdev); 520 + } 521 + 522 + static void qaic_pci_shutdown(struct pci_dev *pdev) 523 + { 524 + /* see qaic_exit for what link_up is 
doing */ 525 + link_up = true; 526 + qaic_pci_remove(pdev); 527 + } 528 + 529 + static pci_ers_result_t qaic_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t error) 530 + { 531 + return PCI_ERS_RESULT_NEED_RESET; 532 + } 533 + 534 + static void qaic_pci_reset_prepare(struct pci_dev *pdev) 535 + { 536 + struct qaic_device *qdev = pci_get_drvdata(pdev); 537 + 538 + qaic_notify_reset(qdev); 539 + qaic_mhi_start_reset(qdev->mhi_cntrl); 540 + qaic_dev_reset_clean_local_state(qdev, false); 541 + } 542 + 543 + static void qaic_pci_reset_done(struct pci_dev *pdev) 544 + { 545 + struct qaic_device *qdev = pci_get_drvdata(pdev); 546 + 547 + qdev->in_reset = false; 548 + qaic_mhi_reset_done(qdev->mhi_cntrl); 549 + } 550 + 551 + static const struct mhi_device_id qaic_mhi_match_table[] = { 552 + { .chan = "QAIC_CONTROL", }, 553 + {}, 554 + }; 555 + 556 + static struct mhi_driver qaic_mhi_driver = { 557 + .id_table = qaic_mhi_match_table, 558 + .remove = qaic_mhi_remove, 559 + .probe = qaic_mhi_probe, 560 + .ul_xfer_cb = qaic_mhi_ul_xfer_cb, 561 + .dl_xfer_cb = qaic_mhi_dl_xfer_cb, 562 + .driver = { 563 + .name = "qaic_mhi", 564 + }, 565 + }; 566 + 567 + static const struct pci_device_id qaic_ids[] = { 568 + { PCI_DEVICE(PCI_VENDOR_ID_QCOM, PCI_DEV_AIC100), }, 569 + { } 570 + }; 571 + MODULE_DEVICE_TABLE(pci, qaic_ids); 572 + 573 + static const struct pci_error_handlers qaic_pci_err_handler = { 574 + .error_detected = qaic_pci_error_detected, 575 + .reset_prepare = qaic_pci_reset_prepare, 576 + .reset_done = qaic_pci_reset_done, 577 + }; 578 + 579 + static struct pci_driver qaic_pci_driver = { 580 + .name = QAIC_NAME, 581 + .id_table = qaic_ids, 582 + .probe = qaic_pci_probe, 583 + .remove = qaic_pci_remove, 584 + .shutdown = qaic_pci_shutdown, 585 + .err_handler = &qaic_pci_err_handler, 586 + }; 587 + 588 + static int __init qaic_init(void) 589 + { 590 + int ret; 591 + 592 + ret = mhi_driver_register(&qaic_mhi_driver); 593 + if (ret) { 594 + pr_debug("qaic: 
mhi_driver_register failed %d\n", ret); 595 + return ret; 596 + } 597 + 598 + ret = pci_register_driver(&qaic_pci_driver); 599 + if (ret) { 600 + pr_debug("qaic: pci_register_driver failed %d\n", ret); 601 + goto free_mhi; 602 + } 603 + 604 + ret = mhi_qaic_ctrl_init(); 605 + if (ret) { 606 + pr_debug("qaic: mhi_qaic_ctrl_init failed %d\n", ret); 607 + goto free_pci; 608 + } 609 + 610 + return 0; 611 + 612 + free_pci: 613 + pci_unregister_driver(&qaic_pci_driver); 614 + free_mhi: 615 + mhi_driver_unregister(&qaic_mhi_driver); 616 + return ret; 617 + } 618 + 619 + static void __exit qaic_exit(void) 620 + { 621 + /* 622 + * We assume that qaic_pci_remove() is called due to a hotplug event 623 + * which would mean that the link is down, and thus 624 + * qaic_mhi_free_controller() should not try to access the device during 625 + * cleanup. 626 + * We call pci_unregister_driver() below, which also triggers 627 + * qaic_pci_remove(), but since this is module exit, we expect the link 628 + * to the device to be up, in which case qaic_mhi_free_controller() 629 + * should try to access the device during cleanup to put the device in 630 + * a sane state. 631 + * For that reason, we set link_up here to let qaic_mhi_free_controller 632 + * know the expected link state. Since the module is going to be 633 + * removed at the end of this, we don't need to worry about 634 + * reinitializing the link_up state after the cleanup is done. 635 + */ 636 + link_up = true; 637 + mhi_qaic_ctrl_deinit(); 638 + pci_unregister_driver(&qaic_pci_driver); 639 + mhi_driver_unregister(&qaic_mhi_driver); 640 + } 641 + 642 + module_init(qaic_init); 643 + module_exit(qaic_exit); 644 + 645 + MODULE_AUTHOR(QAIC_DESC " Kernel Driver Team"); 646 + MODULE_DESCRIPTION(QAIC_DESC " Accel Driver"); 647 + MODULE_LICENSE("GPL");
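qaic_init() above registers three layers in order (MHI driver, PCI driver, control interface) and unwinds them in reverse when a later step fails. A rough userspace sketch of that register-then-unwind shape, with hypothetical stand-in names rather than QAIC symbols:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three registration layers. */
static bool mhi_registered, pci_registered;

static int fake_mhi_register(void)    { mhi_registered = true;  return 0; }
static void fake_mhi_unregister(void) { mhi_registered = false; }
static int fake_pci_register(void)    { pci_registered = true;  return 0; }
static void fake_pci_unregister(void) { pci_registered = false; }

/* Failure injection for the third step (the mhi_qaic_ctrl_init analogue). */
static int ctrl_init_ret;
static int fake_ctrl_init(void) { return ctrl_init_ret; }

/* Same shape as qaic_init(): register in order, unwind in reverse on error. */
static int demo_init(void)
{
	int ret;

	ret = fake_mhi_register();
	if (ret)
		return ret;

	ret = fake_pci_register();
	if (ret)
		goto free_mhi;

	ret = fake_ctrl_init();
	if (ret)
		goto free_pci;

	return 0;

free_pci:
	fake_pci_unregister();
free_mhi:
	fake_mhi_unregister();
	return ret;
}
```

The unwind labels mirror the registration order in reverse, which is what keeps the partial-failure states consistent.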
+1 -15
drivers/gpu/drm/ast/ast_drv.c
··· 89 89 90 90 MODULE_DEVICE_TABLE(pci, ast_pciidlist); 91 91 92 - static int ast_remove_conflicting_framebuffers(struct pci_dev *pdev) 93 - { 94 - bool primary = false; 95 - resource_size_t base, size; 96 - 97 - base = pci_resource_start(pdev, 0); 98 - size = pci_resource_len(pdev, 0); 99 - #ifdef CONFIG_X86 100 - primary = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW; 101 - #endif 102 - 103 - return drm_aperture_remove_conflicting_framebuffers(base, size, primary, &ast_driver); 104 - } 105 - 106 92 static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) 107 93 { 108 94 struct ast_device *ast; 109 95 struct drm_device *dev; 110 96 int ret; 111 97 112 - ret = ast_remove_conflicting_framebuffers(pdev); 98 + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &ast_driver); 113 99 if (ret) 114 100 return ret; 115 101
+60 -43
drivers/gpu/drm/bridge/fsl-ldb.c
··· 84 84 struct drm_bridge *panel_bridge; 85 85 struct clk *clk; 86 86 struct regmap *regmap; 87 - bool lvds_dual_link; 88 87 const struct fsl_ldb_devdata *devdata; 88 + bool ch0_enabled; 89 + bool ch1_enabled; 89 90 }; 91 + 92 + static bool fsl_ldb_is_dual(const struct fsl_ldb *fsl_ldb) 93 + { 94 + return (fsl_ldb->ch0_enabled && fsl_ldb->ch1_enabled); 95 + } 90 96 91 97 static inline struct fsl_ldb *to_fsl_ldb(struct drm_bridge *bridge) 92 98 { ··· 101 95 102 96 static unsigned long fsl_ldb_link_frequency(struct fsl_ldb *fsl_ldb, int clock) 103 97 { 104 - if (fsl_ldb->lvds_dual_link) 98 + if (fsl_ldb_is_dual(fsl_ldb)) 105 99 return clock * 3500; 106 100 else 107 101 return clock * 7000; ··· 176 170 177 171 configured_link_freq = clk_get_rate(fsl_ldb->clk); 178 172 if (configured_link_freq != requested_link_freq) 179 - dev_warn(fsl_ldb->dev, "Configured LDB clock (%lu Hz) does not match requested LVDS clock: %lu Hz", 173 + dev_warn(fsl_ldb->dev, "Configured LDB clock (%lu Hz) does not match requested LVDS clock: %lu Hz\n", 180 174 configured_link_freq, 181 175 requested_link_freq); 182 176 183 177 clk_prepare_enable(fsl_ldb->clk); 184 178 185 179 /* Program LDB_CTRL */ 186 - reg = LDB_CTRL_CH0_ENABLE; 180 + reg = (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_ENABLE : 0) | 181 + (fsl_ldb->ch1_enabled ? LDB_CTRL_CH1_ENABLE : 0) | 182 + (fsl_ldb_is_dual(fsl_ldb) ? LDB_CTRL_SPLIT_MODE : 0); 187 183 188 - if (fsl_ldb->lvds_dual_link) 189 - reg |= LDB_CTRL_CH1_ENABLE | LDB_CTRL_SPLIT_MODE; 184 + if (lvds_format_24bpp) 185 + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_DATA_WIDTH : 0) | 186 + (fsl_ldb->ch1_enabled ? LDB_CTRL_CH1_DATA_WIDTH : 0); 190 187 191 - if (lvds_format_24bpp) { 192 - reg |= LDB_CTRL_CH0_DATA_WIDTH; 193 - if (fsl_ldb->lvds_dual_link) 194 - reg |= LDB_CTRL_CH1_DATA_WIDTH; 195 - } 188 + if (lvds_format_jeida) 189 + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_BIT_MAPPING : 0) | 190 + (fsl_ldb->ch1_enabled ? 
LDB_CTRL_CH1_BIT_MAPPING : 0); 196 191 197 - if (lvds_format_jeida) { 198 - reg |= LDB_CTRL_CH0_BIT_MAPPING; 199 - if (fsl_ldb->lvds_dual_link) 200 - reg |= LDB_CTRL_CH1_BIT_MAPPING; 201 - } 202 - 203 - if (mode->flags & DRM_MODE_FLAG_PVSYNC) { 204 - reg |= LDB_CTRL_DI0_VSYNC_POLARITY; 205 - if (fsl_ldb->lvds_dual_link) 206 - reg |= LDB_CTRL_DI1_VSYNC_POLARITY; 207 - } 192 + if (mode->flags & DRM_MODE_FLAG_PVSYNC) 193 + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_DI0_VSYNC_POLARITY : 0) | 194 + (fsl_ldb->ch1_enabled ? LDB_CTRL_DI1_VSYNC_POLARITY : 0); 208 195 209 196 regmap_write(fsl_ldb->regmap, fsl_ldb->devdata->ldb_ctrl, reg); 210 197 ··· 209 210 /* Wait for VBG to stabilize. */ 210 211 usleep_range(15, 20); 211 212 212 - reg |= LVDS_CTRL_CH0_EN; 213 - if (fsl_ldb->lvds_dual_link) 214 - reg |= LVDS_CTRL_CH1_EN; 213 + reg |= (fsl_ldb->ch0_enabled ? LVDS_CTRL_CH0_EN : 0) | 214 + (fsl_ldb->ch1_enabled ? LVDS_CTRL_CH1_EN : 0); 215 215 216 216 regmap_write(fsl_ldb->regmap, fsl_ldb->devdata->lvds_ctrl, reg); 217 217 } ··· 263 265 { 264 266 struct fsl_ldb *fsl_ldb = to_fsl_ldb(bridge); 265 267 266 - if (mode->clock > (fsl_ldb->lvds_dual_link ? 160000 : 80000)) 268 + if (mode->clock > (fsl_ldb_is_dual(fsl_ldb) ? 160000 : 80000)) 267 269 return MODE_CLOCK_HIGH; 268 270 269 271 return MODE_OK; ··· 284 286 { 285 287 struct device *dev = &pdev->dev; 286 288 struct device_node *panel_node; 287 - struct device_node *port1, *port2; 289 + struct device_node *remote1, *remote2; 288 290 struct drm_panel *panel; 289 291 struct fsl_ldb *fsl_ldb; 290 292 int dual_link; ··· 309 311 if (IS_ERR(fsl_ldb->regmap)) 310 312 return PTR_ERR(fsl_ldb->regmap); 311 313 312 - /* Locate the panel DT node. 
*/ 313 - panel_node = of_graph_get_remote_node(dev->of_node, 1, 0); 314 - if (!panel_node) 315 - return -ENXIO; 314 + /* Locate the remote ports and the panel node */ 315 + remote1 = of_graph_get_remote_node(dev->of_node, 1, 0); 316 + remote2 = of_graph_get_remote_node(dev->of_node, 2, 0); 317 + fsl_ldb->ch0_enabled = (remote1 != NULL); 318 + fsl_ldb->ch1_enabled = (remote2 != NULL); 319 + panel_node = of_node_get(remote1 ? remote1 : remote2); 320 + of_node_put(remote1); 321 + of_node_put(remote2); 322 + 323 + if (!fsl_ldb->ch0_enabled && !fsl_ldb->ch1_enabled) { 324 + of_node_put(panel_node); 325 + return dev_err_probe(dev, -ENXIO, "No panel node found"); 326 + } 327 + 328 + dev_dbg(dev, "Using %s\n", 329 + fsl_ldb_is_dual(fsl_ldb) ? "dual-link mode" : 330 + fsl_ldb->ch0_enabled ? "channel 0" : "channel 1"); 316 331 317 332 panel = of_drm_find_panel(panel_node); 318 333 of_node_put(panel_node); ··· 336 325 if (IS_ERR(fsl_ldb->panel_bridge)) 337 326 return PTR_ERR(fsl_ldb->panel_bridge); 338 327 339 - /* Determine whether this is dual-link configuration */ 340 - port1 = of_graph_get_port_by_id(dev->of_node, 1); 341 - port2 = of_graph_get_port_by_id(dev->of_node, 2); 342 - dual_link = drm_of_lvds_get_dual_link_pixel_order(port1, port2); 343 - of_node_put(port1); 344 - of_node_put(port2); 345 328 346 - if (dual_link == DRM_LVDS_DUAL_LINK_EVEN_ODD_PIXELS) { 347 - dev_err(dev, "LVDS channel pixel swap not supported.\n"); 348 - return -EINVAL; 329 + if (fsl_ldb_is_dual(fsl_ldb)) { 330 + struct device_node *port1, *port2; 331 + 332 + port1 = of_graph_get_port_by_id(dev->of_node, 1); 333 + port2 = of_graph_get_port_by_id(dev->of_node, 2); 334 + dual_link = drm_of_lvds_get_dual_link_pixel_order(port1, port2); 335 + of_node_put(port1); 336 + of_node_put(port2); 337 + 338 + if (dual_link < 0) 339 + return dev_err_probe(dev, dual_link, 340 + "Error getting dual link configuration\n"); 341 + 342 + /* Only DRM_LVDS_DUAL_LINK_ODD_EVEN_PIXELS is supported */ 343 + if (dual_link 
== DRM_LVDS_DUAL_LINK_EVEN_ODD_PIXELS) { 344 + dev_err(dev, "LVDS channel pixel swap not supported.\n"); 345 + return -EINVAL; 346 + } 349 347 } 350 - 351 - if (dual_link == DRM_LVDS_DUAL_LINK_ODD_EVEN_PIXELS) 352 - fsl_ldb->lvds_dual_link = true; 353 348 354 349 platform_set_drvdata(pdev, fsl_ldb); 355 350
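The reworked fsl_ldb_atomic_enable() above composes LDB_CTRL from per-channel conditional ORs instead of nested ifs. A minimal userspace sketch of that style, using placeholder bit values rather than the real i.MX register layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative placeholder bits -- not the actual LDB_CTRL layout. */
#define CH0_ENABLE  (1u << 0)
#define CH1_ENABLE  (1u << 1)
#define SPLIT_MODE  (1u << 2)
#define CH0_24BPP   (1u << 3)
#define CH1_24BPP   (1u << 4)

struct ldb { bool ch0_enabled, ch1_enabled; };

static bool is_dual(const struct ldb *l)
{
	return l->ch0_enabled && l->ch1_enabled;
}

/* Mirrors the conditional-OR composition used in fsl_ldb_atomic_enable(). */
static uint32_t ldb_ctrl(const struct ldb *l, bool fmt_24bpp)
{
	uint32_t reg;

	reg = (l->ch0_enabled ? CH0_ENABLE : 0) |
	      (l->ch1_enabled ? CH1_ENABLE : 0) |
	      (is_dual(l) ? SPLIT_MODE : 0);

	if (fmt_24bpp)
		reg |= (l->ch0_enabled ? CH0_24BPP : 0) |
		       (l->ch1_enabled ? CH1_24BPP : 0);

	return reg;
}
```

Each channel's bits are gated on its own enable flag, so single-channel (ch0-only or ch1-only) and dual-link configurations all fall out of the same expressions.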
-1
drivers/gpu/drm/bridge/lontium-lt8912b.c
··· 504 504 dsi->format = MIPI_DSI_FMT_RGB888; 505 505 506 506 dsi->mode_flags = MIPI_DSI_MODE_VIDEO | 507 - MIPI_DSI_MODE_VIDEO_BURST | 508 507 MIPI_DSI_MODE_LPM | 509 508 MIPI_DSI_MODE_NO_EOT_PACKET; 510 509
+1 -1
drivers/gpu/drm/bridge/parade-ps8640.c
··· 184 184 * actually connected to GPIO9). 185 185 */ 186 186 ret = regmap_read_poll_timeout(map, PAGE2_GPIO_H, status, 187 - status & PS_GPIO9, wait_us / 10, wait_us); 187 + status & PS_GPIO9, 20000, wait_us); 188 188 189 189 /* 190 190 * The first time we see HPD go high after a reset we delay an extra
+4 -4
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
··· 1426 1426 /* Control for TMDS Bit Period/TMDS Clock-Period Ratio */ 1427 1427 if (dw_hdmi_support_scdc(hdmi, display)) { 1428 1428 if (mtmdsclock > HDMI14_MAX_TMDSCLK) 1429 - drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 1); 1429 + drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 1); 1430 1430 else 1431 - drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 0); 1431 + drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 0); 1432 1432 } 1433 1433 } 1434 1434 EXPORT_SYMBOL_GPL(dw_hdmi_set_high_tmds_clock_ratio); ··· 2116 2116 min_t(u8, bytes, SCDC_MIN_SOURCE_VERSION)); 2117 2117 2118 2118 /* Enable scrambling in the sink */ 2119 - drm_scdc_set_scrambling(hdmi->ddc, 1); 2119 + drm_scdc_set_scrambling(&hdmi->connector, 1); 2120 2120 2121 2121 /* 2122 2122 * To activate the scrambler feature, you must ensure ··· 2132 2132 hdmi_writeb(hdmi, 0, HDMI_FC_SCRAMBLER_CTRL); 2133 2133 hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ, 2134 2134 HDMI_MC_SWRSTZ); 2135 - drm_scdc_set_scrambling(hdmi->ddc, 0); 2135 + drm_scdc_set_scrambling(&hdmi->connector, 0); 2136 2136 } 2137 2137 2138 2138
+2 -2
drivers/gpu/drm/bridge/tc358767.c
··· 1896 1896 "failed to create dsi device\n"); 1897 1897 1898 1898 tc->dsi = dsi; 1899 - 1900 1899 dsi->lanes = dsi_lanes; 1901 1900 dsi->format = MIPI_DSI_FMT_RGB888; 1902 - dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE; 1901 + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | 1902 + MIPI_DSI_MODE_LPM | MIPI_DSI_CLOCK_NON_CONTINUOUS; 1903 1903 1904 1904 ret = mipi_dsi_attach(dsi); 1905 1905 if (ret < 0) {
+6 -2
drivers/gpu/drm/bridge/ti-sn65dsi83.c
··· 642 642 643 643 dsi->lanes = dsi_lanes; 644 644 dsi->format = MIPI_DSI_FMT_RGB888; 645 - dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST; 645 + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | 646 + MIPI_DSI_MODE_VIDEO_NO_HFP | MIPI_DSI_MODE_VIDEO_NO_HBP | 647 + MIPI_DSI_MODE_VIDEO_NO_HSA | MIPI_DSI_MODE_NO_EOT_PACKET; 646 648 647 649 ret = devm_mipi_dsi_attach(dev, dsi); 648 650 if (ret < 0) { ··· 700 698 drm_bridge_add(&ctx->bridge); 701 699 702 700 ret = sn65dsi83_host_attach(ctx); 703 - if (ret) 701 + if (ret) { 702 + dev_err_probe(dev, ret, "failed to attach DSI host\n"); 704 703 goto err_remove_bridge; 704 + } 705 705 706 706 return 0; 707 707
+2 -2
drivers/gpu/drm/bridge/ti-sn65dsi86.c
··· 363 363 /* td2: min 100 us after regulators before enabling the GPIO */ 364 364 usleep_range(100, 110); 365 365 366 - gpiod_set_value(pdata->enable_gpio, 1); 366 + gpiod_set_value_cansleep(pdata->enable_gpio, 1); 367 367 368 368 /* 369 369 * If we have a reference clock we can enable communication w/ the ··· 386 386 if (pdata->refclk) 387 387 ti_sn65dsi86_disable_comms(pdata); 388 388 389 - gpiod_set_value(pdata->enable_gpio, 0); 389 + gpiod_set_value_cansleep(pdata->enable_gpio, 0); 390 390 391 391 ret = regulator_bulk_disable(SN_REGULATOR_SUPPLY_NUM, pdata->supplies); 392 392 if (ret)
+30 -16
drivers/gpu/drm/display/drm_scdc_helper.c
··· 26 26 #include <linux/delay.h> 27 27 28 28 #include <drm/display/drm_scdc_helper.h> 29 + #include <drm/drm_connector.h> 30 + #include <drm/drm_device.h> 29 31 #include <drm/drm_print.h> 30 32 31 33 /** ··· 142 140 143 141 /** 144 142 * drm_scdc_get_scrambling_status - what is status of scrambling? 145 - * @adapter: I2C adapter for DDC channel 143 + * @connector: connector 146 144 * 147 145 * Reads the scrambler status over SCDC, and checks the 148 146 * scrambling status. ··· 150 148 * Returns: 151 149 * True if the scrambling is enabled, false otherwise. 152 150 */ 153 - bool drm_scdc_get_scrambling_status(struct i2c_adapter *adapter) 151 + bool drm_scdc_get_scrambling_status(struct drm_connector *connector) 154 152 { 155 153 u8 status; 156 154 int ret; 157 155 158 - ret = drm_scdc_readb(adapter, SCDC_SCRAMBLER_STATUS, &status); 156 + ret = drm_scdc_readb(connector->ddc, SCDC_SCRAMBLER_STATUS, &status); 159 157 if (ret < 0) { 160 - DRM_DEBUG_KMS("Failed to read scrambling status: %d\n", ret); 158 + drm_dbg_kms(connector->dev, 159 + "[CONNECTOR:%d:%s] Failed to read scrambling status: %d\n", 160 + connector->base.id, connector->name, ret); 161 161 return false; 162 162 } 163 163 ··· 169 165 170 166 /** 171 167 * drm_scdc_set_scrambling - enable scrambling 172 - * @adapter: I2C adapter for DDC channel 168 + * @connector: connector 173 169 * @enable: bool to indicate if scrambling is to be enabled/disabled 174 170 * 175 171 * Writes the TMDS config register over SCDC channel, and: ··· 179 175 * Returns: 180 176 * True if scrambling is set/reset successfully, false otherwise. 
181 177 */ 182 - bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable) 178 + bool drm_scdc_set_scrambling(struct drm_connector *connector, 179 + bool enable) 183 180 { 184 181 u8 config; 185 182 int ret; 186 183 187 - ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); 184 + ret = drm_scdc_readb(connector->ddc, SCDC_TMDS_CONFIG, &config); 188 185 if (ret < 0) { 189 - DRM_DEBUG_KMS("Failed to read TMDS config: %d\n", ret); 186 + drm_dbg_kms(connector->dev, 187 + "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", 188 + connector->base.id, connector->name, ret); 190 189 return false; 191 190 } 192 191 ··· 198 191 else 199 192 config &= ~SCDC_SCRAMBLING_ENABLE; 200 193 201 - ret = drm_scdc_writeb(adapter, SCDC_TMDS_CONFIG, config); 194 + ret = drm_scdc_writeb(connector->ddc, SCDC_TMDS_CONFIG, config); 202 195 if (ret < 0) { 203 - DRM_DEBUG_KMS("Failed to enable scrambling: %d\n", ret); 196 + drm_dbg_kms(connector->dev, 197 + "[CONNECTOR:%d:%s] Failed to enable scrambling: %d\n", 198 + connector->base.id, connector->name, ret); 204 199 return false; 205 200 } 206 201 ··· 212 203 213 204 /** 214 205 * drm_scdc_set_high_tmds_clock_ratio - set TMDS clock ratio 215 - * @adapter: I2C adapter for DDC channel 206 + * @connector: connector 216 207 * @set: ret or reset the high clock ratio 217 208 * 218 209 * ··· 239 230 * Returns: 240 231 * True if write is successful, false otherwise. 
241 232 */ 242 - bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set) 233 + bool drm_scdc_set_high_tmds_clock_ratio(struct drm_connector *connector, 234 + bool set) 243 235 { 244 236 u8 config; 245 237 int ret; 246 238 247 - ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); 239 + ret = drm_scdc_readb(connector->ddc, SCDC_TMDS_CONFIG, &config); 248 240 if (ret < 0) { 249 - DRM_DEBUG_KMS("Failed to read TMDS config: %d\n", ret); 241 + drm_dbg_kms(connector->dev, 242 + "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", 243 + connector->base.id, connector->name, ret); 250 244 return false; 251 245 } 252 246 ··· 258 246 else 259 247 config &= ~SCDC_TMDS_BIT_CLOCK_RATIO_BY_40; 260 248 261 - ret = drm_scdc_writeb(adapter, SCDC_TMDS_CONFIG, config); 249 + ret = drm_scdc_writeb(connector->ddc, SCDC_TMDS_CONFIG, config); 262 250 if (ret < 0) { 263 - DRM_DEBUG_KMS("Failed to set TMDS clock ratio: %d\n", ret); 251 + drm_dbg_kms(connector->dev, 252 + "[CONNECTOR:%d:%s] Failed to set TMDS clock ratio: %d\n", 253 + connector->base.id, connector->name, ret); 264 254 return false; 265 255 } 266 256
+6
drivers/gpu/drm/drm_atomic_helper.c
··· 1528 1528 for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) { 1529 1529 ktime_t v; 1530 1530 1531 + if (drm_atomic_crtc_needs_modeset(new_crtc_state)) 1532 + continue; 1533 + 1534 + if (!new_crtc_state->active) 1535 + continue; 1536 + 1531 1537 if (drm_crtc_next_vblank_start(crtc, &v)) 1532 1538 continue; 1533 1539
+39 -14
drivers/gpu/drm/drm_fb_helper.c
··· 1537 1537 } 1538 1538 } 1539 1539 1540 + static void __fill_var(struct fb_var_screeninfo *var, 1541 + struct drm_framebuffer *fb) 1542 + { 1543 + int i; 1544 + 1545 + var->xres_virtual = fb->width; 1546 + var->yres_virtual = fb->height; 1547 + var->accel_flags = FB_ACCELF_TEXT; 1548 + var->bits_per_pixel = drm_format_info_bpp(fb->format, 0); 1549 + 1550 + var->height = var->width = 0; 1551 + var->left_margin = var->right_margin = 0; 1552 + var->upper_margin = var->lower_margin = 0; 1553 + var->hsync_len = var->vsync_len = 0; 1554 + var->sync = var->vmode = 0; 1555 + var->rotate = 0; 1556 + var->colorspace = 0; 1557 + for (i = 0; i < 4; i++) 1558 + var->reserved[i] = 0; 1559 + } 1560 + 1540 1561 /** 1541 1562 * drm_fb_helper_check_var - implementation for &fb_ops.fb_check_var 1542 1563 * @var: screeninfo to check ··· 1610 1589 return -EINVAL; 1611 1590 } 1612 1591 1592 + __fill_var(var, fb); 1593 + 1594 + /* 1595 + * fb_pan_display() validates this, but fb_set_par() doesn't and just 1596 + * falls over. Note that __fill_var above adjusts y/res_virtual. 1597 + */ 1598 + if (var->yoffset > var->yres_virtual - var->yres || 1599 + var->xoffset > var->xres_virtual - var->xres) 1600 + return -EINVAL; 1601 + 1602 + /* We neither support grayscale nor FOURCC (also stored in here). */ 1603 + if (var->grayscale > 0) 1604 + return -EINVAL; 1605 + 1606 + if (var->nonstd) 1607 + return -EINVAL; 1608 + 1613 1609 /* 1614 1610 * Workaround for SDL 1.2, which is known to be setting all pixel format 1615 1611 * fields values to zero in some cases. We treat this situation as a ··· 1640 1602 !var->blue.msb_right && !var->transp.msb_right) { 1641 1603 drm_fb_helper_fill_pixel_fmt(var, format); 1642 1604 } 1643 - 1644 - /* 1645 - * Likewise, bits_per_pixel should be rounded up to a supported value. 
1646 - */ 1647 - var->bits_per_pixel = bpp; 1648 1605 1649 1606 /* 1650 1607 * drm fbdev emulation doesn't support changing the pixel format at all, ··· 1670 1637 1671 1638 if (oops_in_progress) 1672 1639 return -EBUSY; 1673 - 1674 - if (var->pixclock != 0) { 1675 - drm_err(fb_helper->dev, "PIXEL CLOCK SET\n"); 1676 - return -EINVAL; 1677 - } 1678 1640 1679 1641 /* 1680 1642 * Normally we want to make sure that a kms master takes precedence over ··· 2064 2036 } 2065 2037 2066 2038 info->pseudo_palette = fb_helper->pseudo_palette; 2067 - info->var.xres_virtual = fb->width; 2068 - info->var.yres_virtual = fb->height; 2069 - info->var.bits_per_pixel = drm_format_info_bpp(format, 0); 2070 - info->var.accel_flags = FB_ACCELF_TEXT; 2071 2039 info->var.xoffset = 0; 2072 2040 info->var.yoffset = 0; 2041 + __fill_var(&info->var, fb); 2073 2042 info->var.activate = FB_ACTIVATE_NOW; 2074 2043 2075 2044 drm_fb_helper_fill_pixel_fmt(&info->var, format);
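The fb helper change above validates pan offsets against the virtual resolution that __fill_var() has just normalized. A cut-down sketch of that bounds check, with a hypothetical struct standing in for fb_var_screeninfo:

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for fb_var_screeninfo, just the fields the check uses. */
struct var {
	uint32_t xres, yres;
	uint32_t xres_virtual, yres_virtual;
	uint32_t xoffset, yoffset;
};

/*
 * Mirrors the new drm_fb_helper_check_var() test: the visible window
 * placed at (xoffset, yoffset) must fit inside the virtual framebuffer,
 * since fb_set_par() does not re-validate what fb_pan_display() would.
 */
static int check_pan(const struct var *v)
{
	if (v->yoffset > v->yres_virtual - v->yres ||
	    v->xoffset > v->xres_virtual - v->xres)
		return -1;
	return 0;
}
```

With a 640x480 visible window in a 640x960 virtual buffer, a yoffset of 480 still fits (double buffering), while 481 does not.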
+5 -1
drivers/gpu/drm/drm_prime.c
··· 544 544 * Optional pinning of buffers is handled at dma-buf attach and detach time in 545 545 * drm_gem_map_attach() and drm_gem_map_detach(). Backing storage itself is 546 546 * handled by drm_gem_map_dma_buf() and drm_gem_unmap_dma_buf(), which relies on 547 - * &drm_gem_object_funcs.get_sg_table. 547 + * &drm_gem_object_funcs.get_sg_table. If &drm_gem_object_funcs.get_sg_table is 548 + * unimplemented, exports into another device are rejected. 548 549 * 549 550 * For kernel-internal access there's drm_gem_dmabuf_vmap() and 550 551 * drm_gem_dmabuf_vunmap(). Userspace mmap support is provided by ··· 583 582 struct dma_buf_attachment *attach) 584 583 { 585 584 struct drm_gem_object *obj = dma_buf->priv; 585 + 586 + if (!obj->funcs->get_sg_table) 587 + return -ENOSYS; 586 588 587 589 return drm_gem_pin(obj); 588 590 }
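The drm_prime.c hunk above makes the attach path refuse attachment when the exporter lacks a get_sg_table hook. A stand-in sketch of that capability check (the types and names here are illustrative, not the DRM API):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sg_table;

/* Illustrative stand-ins for the GEM object-function table and object. */
struct obj_funcs {
	struct sg_table *(*get_sg_table)(void *obj);
};

struct obj {
	const struct obj_funcs *funcs;
};

static struct sg_table *dummy_get_sg(void *obj)
{
	(void)obj;
	return NULL; /* never called in this sketch */
}

static const struct obj_funcs with_sg    = { .get_sg_table = dummy_get_sg };
static const struct obj_funcs without_sg = { .get_sg_table = NULL };

/*
 * Mirrors the drm_gem_map_attach() change: if the exporter cannot hand
 * out its backing pages, fail attach early rather than oops later in
 * the map_dma_buf path.
 */
static int map_attach(struct obj *o)
{
	if (!o->funcs->get_sg_table)
		return -ENOSYS;
	return 0; /* the real code would pin the object here */
}
```

Failing at attach time moves the error to where the importer can still handle it cleanly.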
+8 -2
drivers/gpu/drm/drm_vblank.c
··· 996 996 int drm_crtc_next_vblank_start(struct drm_crtc *crtc, ktime_t *vblanktime) 997 997 { 998 998 unsigned int pipe = drm_crtc_index(crtc); 999 - struct drm_vblank_crtc *vblank = &crtc->dev->vblank[pipe]; 1000 - struct drm_display_mode *mode = &vblank->hwmode; 999 + struct drm_vblank_crtc *vblank; 1000 + struct drm_display_mode *mode; 1001 1001 u64 vblank_start; 1002 + 1003 + if (!drm_dev_has_vblank(crtc->dev)) 1004 + return -EINVAL; 1005 + 1006 + vblank = &crtc->dev->vblank[pipe]; 1007 + mode = &vblank->hwmode; 1002 1008 1003 1009 if (!vblank->framedur_ns || !vblank->linedur_ns) 1004 1010 return -EINVAL;
+2 -2
drivers/gpu/drm/i915/display/intel_ddi.c
··· 3988 3988 3989 3989 ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); 3990 3990 if (ret < 0) { 3991 - drm_err(&dev_priv->drm, "Failed to read TMDS config: %d\n", 3992 - ret); 3991 + drm_err(&dev_priv->drm, "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", 3992 + connector->base.base.id, connector->base.name, ret); 3993 3993 return 0; 3994 3994 } 3995 3995
+2 -6
drivers/gpu/drm/i915/display/intel_hdmi.c
··· 2646 2646 bool scrambling) 2647 2647 { 2648 2648 struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); 2649 - struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder); 2650 2649 struct drm_scrambling *sink_scrambling = 2651 2650 &connector->display_info.hdmi.scdc.scrambling; 2652 - struct i2c_adapter *adapter = 2653 - intel_gmbus_get_adapter(dev_priv, intel_hdmi->ddc_bus); 2654 2651 2655 2652 if (!sink_scrambling->supported) 2656 2653 return true; ··· 2658 2661 str_yes_no(scrambling), high_tmds_clock_ratio ? 40 : 10); 2659 2662 2660 2663 /* Set TMDS bit clock ratio to 1/40 or 1/10, and enable/disable scrambling */ 2661 - return drm_scdc_set_high_tmds_clock_ratio(adapter, 2662 - high_tmds_clock_ratio) && 2663 - drm_scdc_set_scrambling(adapter, scrambling); 2664 + return drm_scdc_set_high_tmds_clock_ratio(connector, high_tmds_clock_ratio) && 2665 + drm_scdc_set_scrambling(connector, scrambling); 2664 2666 } 2665 2667 2666 2668 static u8 chv_port_to_ddc_pin(struct drm_i915_private *dev_priv, enum port port)
+4 -2
drivers/gpu/drm/lima/lima_drv.c
··· 392 392 393 393 /* Allocate and initialize the DRM device. */ 394 394 ddev = drm_dev_alloc(&lima_drm_driver, &pdev->dev); 395 - if (IS_ERR(ddev)) 396 - return PTR_ERR(ddev); 395 + if (IS_ERR(ddev)) { 396 + err = PTR_ERR(ddev); 397 + goto err_out0; 398 + } 397 399 398 400 ddev->dev_private = ldev; 399 401 ldev->ddev = ddev;
+1
drivers/gpu/drm/panel/panel-edp.c
··· 1879 1879 EDP_PANEL_ENTRY('B', 'O', 'E', 0x07d1, &boe_nv133fhm_n61.delay, "NV133FHM-N61"), 1880 1880 EDP_PANEL_ENTRY('B', 'O', 'E', 0x082d, &boe_nv133fhm_n61.delay, "NV133FHM-N62"), 1881 1881 EDP_PANEL_ENTRY('B', 'O', 'E', 0x094b, &delay_200_500_e50, "NT116WHM-N21"), 1882 + EDP_PANEL_ENTRY('B', 'O', 'E', 0x095f, &delay_200_500_e50, "NE135FBM-N41 v8.1"), 1882 1883 EDP_PANEL_ENTRY('B', 'O', 'E', 0x098d, &boe_nv110wtm_n61.delay, "NV110WTM-N61"), 1883 1884 EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, &delay_200_500_e50, "NT116WHM-N21"), 1884 1885 EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, &delay_200_500_e50, "NV116WHM-N45"),
+5 -10
drivers/gpu/drm/tegra/sor.c
··· 2140 2140 2141 2141 static void tegra_sor_hdmi_scdc_disable(struct tegra_sor *sor) 2142 2142 { 2143 - struct i2c_adapter *ddc = sor->output.ddc; 2144 - 2145 - drm_scdc_set_high_tmds_clock_ratio(ddc, false); 2146 - drm_scdc_set_scrambling(ddc, false); 2143 + drm_scdc_set_high_tmds_clock_ratio(&sor->output.connector, false); 2144 + drm_scdc_set_scrambling(&sor->output.connector, false); 2147 2145 2148 2146 tegra_sor_hdmi_disable_scrambling(sor); 2149 2147 } ··· 2166 2168 2167 2169 static void tegra_sor_hdmi_scdc_enable(struct tegra_sor *sor) 2168 2170 { 2169 - struct i2c_adapter *ddc = sor->output.ddc; 2170 - 2171 - drm_scdc_set_high_tmds_clock_ratio(ddc, true); 2172 - drm_scdc_set_scrambling(ddc, true); 2171 + drm_scdc_set_high_tmds_clock_ratio(&sor->output.connector, true); 2172 + drm_scdc_set_scrambling(&sor->output.connector, true); 2173 2173 2174 2174 tegra_sor_hdmi_enable_scrambling(sor); 2175 2175 } ··· 2175 2179 static void tegra_sor_hdmi_scdc_work(struct work_struct *work) 2176 2180 { 2177 2181 struct tegra_sor *sor = container_of(work, struct tegra_sor, scdc.work); 2178 - struct i2c_adapter *ddc = sor->output.ddc; 2179 2182 2180 - if (!drm_scdc_get_scrambling_status(ddc)) { 2183 + if (!drm_scdc_get_scrambling_status(&sor->output.connector)) { 2181 2184 DRM_DEBUG_KMS("SCDC not scrambled\n"); 2182 2185 tegra_sor_hdmi_scdc_enable(sor); 2183 2186 }
+10 -3
drivers/gpu/drm/ttm/ttm_bo_vm.c
··· 218 218 prot = ttm_io_prot(bo, bo->resource, prot); 219 219 if (!bo->resource->bus.is_iomem) { 220 220 struct ttm_operation_ctx ctx = { 221 - .interruptible = false, 221 + .interruptible = true, 222 222 .no_wait_gpu = false, 223 223 .force_alloc = true 224 224 }; 225 225 226 226 ttm = bo->ttm; 227 - if (ttm_tt_populate(bdev, bo->ttm, &ctx)) 228 - return VM_FAULT_OOM; 227 + err = ttm_tt_populate(bdev, bo->ttm, &ctx); 228 + if (err) { 229 + if (err == -EINTR || err == -ERESTARTSYS || 230 + err == -EAGAIN) 231 + return VM_FAULT_NOPAGE; 232 + 233 + pr_debug("TTM fault hit %pe.\n", ERR_PTR(err)); 234 + return VM_FAULT_SIGBUS; 235 + } 229 236 } else { 230 237 /* Iomem should not be marked encrypted */ 231 238 prot = pgprot_decrypted(prot);
+70 -41
drivers/gpu/drm/ttm/ttm_pool.c
··· 47 47 48 48 #include "ttm_module.h" 49 49 50 + #define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT) 51 + #define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1) 52 + /* Some architectures have a weird PMD_SHIFT */ 53 + #define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ? __TTM_DIM_ORDER : MAX_ORDER) 54 + 50 55 /** 51 56 * struct ttm_pool_dma - Helper object for coherent DMA mappings 52 57 * ··· 70 65 71 66 static atomic_long_t allocated_pages; 72 67 73 - static struct ttm_pool_type global_write_combined[MAX_ORDER]; 74 - static struct ttm_pool_type global_uncached[MAX_ORDER]; 68 + static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER]; 69 + static struct ttm_pool_type global_uncached[TTM_DIM_ORDER]; 75 70 76 - static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER]; 77 - static struct ttm_pool_type global_dma32_uncached[MAX_ORDER]; 71 + static struct ttm_pool_type global_dma32_write_combined[TTM_DIM_ORDER]; 72 + static struct ttm_pool_type global_dma32_uncached[TTM_DIM_ORDER]; 78 73 79 74 static spinlock_t shrinker_lock; 80 75 static struct list_head shrinker_list; ··· 373 368 } 374 369 375 370 /** 371 + * ttm_pool_free_range() - Free a range of TTM pages 372 + * @pool: The pool used for allocating. 373 + * @tt: The struct ttm_tt holding the page pointers. 374 + * @caching: The page caching mode used by the range. 375 + * @start_page: index for first page to free. 376 + * @end_page: index for last page to free + 1. 377 + * 378 + * During allocation the ttm_tt page-vector may be populated with ranges of 379 + * pages with different attributes if allocation hit an error without being 380 + * able to completely fulfill the allocation. This function can be used 381 + * to free these individual ranges. 
382 + */ 383 + static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt, 384 + enum ttm_caching caching, 385 + pgoff_t start_page, pgoff_t end_page) 386 + { 387 + struct page **pages = tt->pages; 388 + unsigned int order; 389 + pgoff_t i, nr; 390 + 391 + for (i = start_page; i < end_page; i += nr, pages += nr) { 392 + struct ttm_pool_type *pt = NULL; 393 + 394 + order = ttm_pool_page_order(pool, *pages); 395 + nr = (1UL << order); 396 + if (tt->dma_address) 397 + ttm_pool_unmap(pool, tt->dma_address[i], nr); 398 + 399 + pt = ttm_pool_select_type(pool, caching, order); 400 + if (pt) 401 + ttm_pool_type_give(pt, *pages); 402 + else 403 + ttm_pool_free_page(pool, caching, order, *pages); 404 + } 405 + } 406 + 407 + /** 376 408 * ttm_pool_alloc - Fill a ttm_tt object 377 409 * 378 410 * @pool: ttm_pool to use ··· 424 382 int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, 425 383 struct ttm_operation_ctx *ctx) 426 384 { 427 - unsigned long num_pages = tt->num_pages; 385 + pgoff_t num_pages = tt->num_pages; 428 386 dma_addr_t *dma_addr = tt->dma_address; 429 387 struct page **caching = tt->pages; 430 388 struct page **pages = tt->pages; 389 + enum ttm_caching page_caching; 431 390 gfp_t gfp_flags = GFP_USER; 432 - unsigned int i, order; 391 + pgoff_t caching_divide; 392 + unsigned int order; 433 393 struct page *p; 434 394 int r; 435 395 ··· 449 405 else 450 406 gfp_flags |= GFP_HIGHUSER; 451 407 452 - for (order = min_t(unsigned int, MAX_ORDER - 1, __fls(num_pages)); 408 + for (order = min_t(unsigned int, TTM_MAX_ORDER, __fls(num_pages)); 453 409 num_pages; 454 410 order = min_t(unsigned int, order, __fls(num_pages))) { 455 411 struct ttm_pool_type *pt; 456 412 413 + page_caching = tt->caching; 457 414 pt = ttm_pool_select_type(pool, tt->caching, order); 458 415 p = pt ? 
ttm_pool_type_take(pt) : NULL; 459 416 if (p) { ··· 463 418 if (r) 464 419 goto error_free_page; 465 420 421 + caching = pages; 466 422 do { 467 423 r = ttm_pool_page_allocated(pool, order, p, 468 424 &dma_addr, ··· 472 426 if (r) 473 427 goto error_free_page; 474 428 429 + caching = pages; 475 430 if (num_pages < (1 << order)) 476 431 break; 477 432 478 433 p = ttm_pool_type_take(pt); 479 434 } while (p); 480 - caching = pages; 481 435 } 482 436 437 + page_caching = ttm_cached; 483 438 while (num_pages >= (1 << order) && 484 439 (p = ttm_pool_alloc_page(pool, gfp_flags, order))) { 485 440 ··· 489 442 tt->caching); 490 443 if (r) 491 444 goto error_free_page; 445 + caching = pages; 492 446 } 493 447 r = ttm_pool_page_allocated(pool, order, p, &dma_addr, 494 448 &num_pages, &pages); ··· 516 468 return 0; 517 469 518 470 error_free_page: 519 - ttm_pool_free_page(pool, tt->caching, order, p); 471 + ttm_pool_free_page(pool, page_caching, order, p); 520 472 521 473 error_free_all: 522 474 num_pages = tt->num_pages - num_pages; 523 - for (i = 0; i < num_pages; ) { 524 - order = ttm_pool_page_order(pool, tt->pages[i]); 525 - ttm_pool_free_page(pool, tt->caching, order, tt->pages[i]); 526 - i += 1 << order; 527 - } 475 + caching_divide = caching - tt->pages; 476 + ttm_pool_free_range(pool, tt, tt->caching, 0, caching_divide); 477 + ttm_pool_free_range(pool, tt, ttm_cached, caching_divide, num_pages); 528 478 529 479 return r; 530 480 } ··· 538 492 */ 539 493 void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt) 540 494 { 541 - unsigned int i; 542 - 543 - for (i = 0; i < tt->num_pages; ) { 544 - struct page *p = tt->pages[i]; 545 - unsigned int order, num_pages; 546 - struct ttm_pool_type *pt; 547 - 548 - order = ttm_pool_page_order(pool, p); 549 - num_pages = 1ULL << order; 550 - if (tt->dma_address) 551 - ttm_pool_unmap(pool, tt->dma_address[i], num_pages); 552 - 553 - pt = ttm_pool_select_type(pool, tt->caching, order); 554 - if (pt) 555 - ttm_pool_type_give(pt, 
tt->pages[i]); 556 - else 557 - ttm_pool_free_page(pool, tt->caching, order, 558 - tt->pages[i]); 559 - 560 - i += num_pages; 561 - } 495 + ttm_pool_free_range(pool, tt, tt->caching, 0, tt->num_pages); 562 496 563 497 while (atomic_long_read(&allocated_pages) > page_pool_size) 564 498 ttm_pool_shrink(); ··· 568 542 569 543 if (use_dma_alloc) { 570 544 for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) 571 - for (j = 0; j < MAX_ORDER; ++j) 545 + for (j = 0; j < TTM_DIM_ORDER; ++j) 572 546 ttm_pool_type_init(&pool->caching[i].orders[j], 573 547 pool, i, j); 574 548 } ··· 588 562 589 563 if (pool->use_dma_alloc) { 590 564 for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) 591 - for (j = 0; j < MAX_ORDER; ++j) 565 + for (j = 0; j < TTM_DIM_ORDER; ++j) 592 566 ttm_pool_type_fini(&pool->caching[i].orders[j]); 593 567 } 594 568 ··· 642 616 unsigned int i; 643 617 644 618 seq_puts(m, "\t "); 645 - for (i = 0; i < MAX_ORDER; ++i) 619 + for (i = 0; i < TTM_DIM_ORDER; ++i) 646 620 seq_printf(m, " ---%2u---", i); 647 621 seq_puts(m, "\n"); 648 622 } ··· 653 627 { 654 628 unsigned int i; 655 629 656 - for (i = 0; i < MAX_ORDER; ++i) 630 + for (i = 0; i < TTM_DIM_ORDER; ++i) 657 631 seq_printf(m, " %8u", ttm_pool_type_count(&pt[i])); 658 632 seq_puts(m, "\n"); 659 633 } ··· 756 730 { 757 731 unsigned int i; 758 732 733 + BUILD_BUG_ON(TTM_DIM_ORDER > MAX_ORDER); 734 + BUILD_BUG_ON(TTM_DIM_ORDER < 1); 735 + 759 736 if (!page_pool_size) 760 737 page_pool_size = num_pages; 761 738 762 739 spin_lock_init(&shrinker_lock); 763 740 INIT_LIST_HEAD(&shrinker_list); 764 741 765 - for (i = 0; i < MAX_ORDER; ++i) { 742 + for (i = 0; i < TTM_DIM_ORDER; ++i) { 766 743 ttm_pool_type_init(&global_write_combined[i], NULL, 767 744 ttm_write_combined, i); 768 745 ttm_pool_type_init(&global_uncached[i], NULL, ttm_uncached, i); ··· 798 769 { 799 770 unsigned int i; 800 771 801 - for (i = 0; i < MAX_ORDER; ++i) { 772 + for (i = 0; i < TTM_DIM_ORDER; ++i) { 802 773 ttm_pool_type_fini(&global_write_combined[i]); 803 
774 ttm_pool_type_fini(&global_uncached[i]); 804 775
+12 -9
drivers/gpu/drm/vc4/vc4_hdmi.c
··· 885 885 static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder) 886 886 { 887 887 struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); 888 - struct drm_device *drm = vc4_hdmi->connector.dev; 888 + struct drm_connector *connector = &vc4_hdmi->connector; 889 + struct drm_device *drm = connector->dev; 889 890 const struct drm_display_mode *mode = &vc4_hdmi->saved_adjusted_mode; 890 891 unsigned long flags; 891 892 int idx; ··· 904 903 if (!drm_dev_enter(drm, &idx)) 905 904 return; 906 905 907 - drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true); 908 - drm_scdc_set_scrambling(vc4_hdmi->ddc, true); 906 + drm_scdc_set_high_tmds_clock_ratio(connector, true); 907 + drm_scdc_set_scrambling(connector, true); 909 908 910 909 spin_lock_irqsave(&vc4_hdmi->hw_lock, flags); 911 910 HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) | ··· 923 922 static void vc4_hdmi_disable_scrambling(struct drm_encoder *encoder) 924 923 { 925 924 struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); 926 - struct drm_device *drm = vc4_hdmi->connector.dev; 925 + struct drm_connector *connector = &vc4_hdmi->connector; 926 + struct drm_device *drm = connector->dev; 927 927 unsigned long flags; 928 928 int idx; 929 929 ··· 946 944 ~VC5_HDMI_SCRAMBLER_CTL_ENABLE); 947 945 spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags); 948 946 949 - drm_scdc_set_scrambling(vc4_hdmi->ddc, false); 950 - drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, false); 947 + drm_scdc_set_scrambling(connector, false); 948 + drm_scdc_set_high_tmds_clock_ratio(connector, false); 951 949 952 950 drm_dev_exit(idx); 953 951 } ··· 957 955 struct vc4_hdmi *vc4_hdmi = container_of(to_delayed_work(work), 958 956 struct vc4_hdmi, 959 957 scrambling_work); 958 + struct drm_connector *connector = &vc4_hdmi->connector; 960 959 961 - if (drm_scdc_get_scrambling_status(vc4_hdmi->ddc)) 960 + if (drm_scdc_get_scrambling_status(connector)) 962 961 return; 963 962 964 - 
drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true); 965 - drm_scdc_set_scrambling(vc4_hdmi->ddc, true); 963 + drm_scdc_set_high_tmds_clock_ratio(connector, true); 964 + drm_scdc_set_scrambling(connector, true); 966 965 967 966 queue_delayed_work(system_wq, &vc4_hdmi->scrambling_work, 968 967 msecs_to_jiffies(SCRAMBLING_POLLING_DELAY_MS));
+1 -15
drivers/staging/sm750fb/sm750.c
··· 989 989 return err; 990 990 } 991 991 992 - static int lynxfb_kick_out_firmware_fb(struct pci_dev *pdev) 993 - { 994 - resource_size_t base = pci_resource_start(pdev, 0); 995 - resource_size_t size = pci_resource_len(pdev, 0); 996 - bool primary = false; 997 - 998 - #ifdef CONFIG_X86 999 - primary = pdev->resource[PCI_ROM_RESOURCE].flags & 1000 - IORESOURCE_ROM_SHADOW; 1001 - #endif 1002 - 1003 - return aperture_remove_conflicting_devices(base, size, primary, "sm750_fb1"); 1004 - } 1005 - 1006 992 static int lynxfb_pci_probe(struct pci_dev *pdev, 1007 993 const struct pci_device_id *ent) 1008 994 { ··· 997 1011 int fbidx; 998 1012 int err; 999 1013 1000 - err = lynxfb_kick_out_firmware_fb(pdev); 1014 + err = aperture_remove_conflicting_pci_devices(pdev, "sm750_fb1"); 1001 1015 if (err) 1002 1016 return err; 1003 1017
+4 -4
drivers/video/aperture.c
··· 20 20 * driver can be active at any given time. Many systems load a generic 21 21 * graphics drivers, such as EFI-GOP or VESA, early during the boot process. 22 22 * During later boot stages, they replace the generic driver with a dedicated, 23 - * hardware-specific driver. To take over the device the dedicated driver 23 + * hardware-specific driver. To take over the device, the dedicated driver 24 24 * first has to remove the generic driver. Aperture functions manage 25 25 * ownership of framebuffer memory and hand-over between drivers. 26 26 * ··· 76 76 * generic EFI or VESA drivers, have to register themselves as owners of their 77 77 * framebuffer apertures. Ownership of the framebuffer memory is achieved 78 78 * by calling devm_aperture_acquire_for_platform_device(). If successful, the 79 - * driveris the owner of the framebuffer range. The function fails if the 79 + * driver is the owner of the framebuffer range. The function fails if the 80 80 * framebuffer is already owned by another driver. See below for an example. 81 81 * 82 82 * .. code-block:: c ··· 126 126 * et al for the registered framebuffer range, the aperture helpers call 127 127 * platform_device_unregister() and the generic driver unloads itself. The 128 128 * generic driver also has to provide a remove function to make this work. 129 - * Once hot unplugged fro mhardware, it may not access the device's 129 + * Once hot unplugged from hardware, it may not access the device's 130 130 * registers, framebuffer memory, ROM, etc afterwards. 131 131 */ 132 132 ··· 203 203 204 204 /* 205 205 * Remove the device from the device hierarchy. This is the right thing 206 - * to do for firmware-based DRM drivers, such as EFI, VESA or VGA. After 206 + * to do for firmware-based fb drivers, such as EFI, VESA or VGA. After 207 207 * the new driver takes over the hardware, the firmware device's state 208 208 * will be lost. 209 209 *
+1 -9
drivers/video/fbdev/aty/radeon_base.c
··· 2238 2238 .read = radeon_show_edid2, 2239 2239 }; 2240 2240 2241 - static int radeon_kick_out_firmware_fb(struct pci_dev *pdev) 2242 - { 2243 - resource_size_t base = pci_resource_start(pdev, 0); 2244 - resource_size_t size = pci_resource_len(pdev, 0); 2245 - 2246 - return aperture_remove_conflicting_devices(base, size, false, KBUILD_MODNAME); 2247 - } 2248 - 2249 2241 static int radeonfb_pci_register(struct pci_dev *pdev, 2250 2242 const struct pci_device_id *ent) 2251 2243 { ··· 2288 2296 rinfo->fb_base_phys = pci_resource_start (pdev, 0); 2289 2297 rinfo->mmio_base_phys = pci_resource_start (pdev, 2); 2290 2298 2291 - ret = radeon_kick_out_firmware_fb(pdev); 2299 + ret = aperture_remove_conflicting_pci_devices(pdev, KBUILD_MODNAME); 2292 2300 if (ret) 2293 2301 goto err_release_fb; 2294 2302
+4 -3
include/drm/display/drm_scdc_helper.h
··· 28 28 29 29 #include <drm/display/drm_scdc.h> 30 30 31 + struct drm_connector; 31 32 struct i2c_adapter; 32 33 33 34 ssize_t drm_scdc_read(struct i2c_adapter *adapter, u8 offset, void *buffer, ··· 72 71 return drm_scdc_write(adapter, offset, &value, sizeof(value)); 73 72 } 74 73 75 - bool drm_scdc_get_scrambling_status(struct i2c_adapter *adapter); 74 + bool drm_scdc_get_scrambling_status(struct drm_connector *connector); 76 75 77 - bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable); 78 - bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set); 76 + bool drm_scdc_set_scrambling(struct drm_connector *connector, bool enable); 77 + bool drm_scdc_set_high_tmds_clock_ratio(struct drm_connector *connector, bool set); 79 78 80 79 #endif
+3 -1
include/drm/drm_gem_vram_helper.h
··· 160 160 .debugfs_init = drm_vram_mm_debugfs_init, \ 161 161 .dumb_create = drm_gem_vram_driver_dumb_create, \ 162 162 .dumb_map_offset = drm_gem_ttm_dumb_map_offset, \ 163 - .gem_prime_mmap = drm_gem_prime_mmap 163 + .gem_prime_mmap = drm_gem_prime_mmap, \ 164 + .prime_handle_to_fd = drm_gem_prime_handle_to_fd, \ 165 + .prime_fd_to_handle = drm_gem_prime_fd_to_handle 164 166 165 167 /* 166 168 * VRAM memory manager
+397
include/uapi/drm/qaic_accel.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note 2 + * 3 + * Copyright (c) 2019-2020, The Linux Foundation. All rights reserved. 4 + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. 5 + */ 6 + 7 + #ifndef QAIC_ACCEL_H_ 8 + #define QAIC_ACCEL_H_ 9 + 10 + #include "drm.h" 11 + 12 + #if defined(__cplusplus) 13 + extern "C" { 14 + #endif 15 + 16 + /* The length(4K) includes len and count fields of qaic_manage_msg */ 17 + #define QAIC_MANAGE_MAX_MSG_LENGTH SZ_4K 18 + 19 + /* semaphore flags */ 20 + #define QAIC_SEM_INSYNCFENCE 2 21 + #define QAIC_SEM_OUTSYNCFENCE 1 22 + 23 + /* Semaphore commands */ 24 + #define QAIC_SEM_NOP 0 25 + #define QAIC_SEM_INIT 1 26 + #define QAIC_SEM_INC 2 27 + #define QAIC_SEM_DEC 3 28 + #define QAIC_SEM_WAIT_EQUAL 4 29 + #define QAIC_SEM_WAIT_GT_EQ 5 /* Greater than or equal */ 30 + #define QAIC_SEM_WAIT_GT_0 6 /* Greater than 0 */ 31 + 32 + #define QAIC_TRANS_UNDEFINED 0 33 + #define QAIC_TRANS_PASSTHROUGH_FROM_USR 1 34 + #define QAIC_TRANS_PASSTHROUGH_TO_USR 2 35 + #define QAIC_TRANS_PASSTHROUGH_FROM_DEV 3 36 + #define QAIC_TRANS_PASSTHROUGH_TO_DEV 4 37 + #define QAIC_TRANS_DMA_XFER_FROM_USR 5 38 + #define QAIC_TRANS_DMA_XFER_TO_DEV 6 39 + #define QAIC_TRANS_ACTIVATE_FROM_USR 7 40 + #define QAIC_TRANS_ACTIVATE_FROM_DEV 8 41 + #define QAIC_TRANS_ACTIVATE_TO_DEV 9 42 + #define QAIC_TRANS_DEACTIVATE_FROM_USR 10 43 + #define QAIC_TRANS_DEACTIVATE_FROM_DEV 11 44 + #define QAIC_TRANS_STATUS_FROM_USR 12 45 + #define QAIC_TRANS_STATUS_TO_USR 13 46 + #define QAIC_TRANS_STATUS_FROM_DEV 14 47 + #define QAIC_TRANS_STATUS_TO_DEV 15 48 + #define QAIC_TRANS_TERMINATE_FROM_DEV 16 49 + #define QAIC_TRANS_TERMINATE_TO_DEV 17 50 + #define QAIC_TRANS_DMA_XFER_CONT 18 51 + #define QAIC_TRANS_VALIDATE_PARTITION_FROM_DEV 19 52 + #define QAIC_TRANS_VALIDATE_PARTITION_TO_DEV 20 53 + 54 + /** 55 + * struct qaic_manage_trans_hdr - Header for a transaction in a manage message. 56 + * @type: In. 
Identifies this transaction. See QAIC_TRANS_* defines. 57 + * @len: In. Length of this transaction, including this header. 58 + */ 59 + struct qaic_manage_trans_hdr { 60 + __u32 type; 61 + __u32 len; 62 + }; 63 + 64 + /** 65 + * struct qaic_manage_trans_passthrough - Defines a passthrough transaction. 66 + * @hdr: In. Header to identify this transaction. 67 + * @data: In. Payload of this transaction. Opaque to the driver. Userspace must 68 + * encode in little endian and align/pad to 64-bit. 69 + */ 70 + struct qaic_manage_trans_passthrough { 71 + struct qaic_manage_trans_hdr hdr; 72 + __u8 data[]; 73 + }; 74 + 75 + /** 76 + * struct qaic_manage_trans_dma_xfer - Defines a DMA transfer transaction. 77 + * @hdr: In. Header to identify this transaction. 78 + * @tag: In. Identifies this transfer in other transactions. Opaque to the 79 + * driver. 80 + * @pad: Structure padding. 81 + * @addr: In. Address of the data to DMA to the device. 82 + * @size: In. Length of the data to DMA to the device. 83 + */ 84 + struct qaic_manage_trans_dma_xfer { 85 + struct qaic_manage_trans_hdr hdr; 86 + __u32 tag; 87 + __u32 pad; 88 + __u64 addr; 89 + __u64 size; 90 + }; 91 + 92 + /** 93 + * struct qaic_manage_trans_activate_to_dev - Defines an activate request. 94 + * @hdr: In. Header to identify this transaction. 95 + * @queue_size: In. Number of elements for DBC request and response queues. 96 + * @eventfd: Unused. 97 + * @options: In. Device specific options for this activate. 98 + * @pad: Structure padding. Must be 0. 99 + */ 100 + struct qaic_manage_trans_activate_to_dev { 101 + struct qaic_manage_trans_hdr hdr; 102 + __u32 queue_size; 103 + __u32 eventfd; 104 + __u32 options; 105 + __u32 pad; 106 + }; 107 + 108 + /** 109 + * struct qaic_manage_trans_activate_from_dev - Defines an activate response. 110 + * @hdr: Out. Header to identify this transaction. 111 + * @status: Out. Return code of the request from the device. 112 + * @dbc_id: Out.
Id of the assigned DBC for successful request. 113 + * @options: Out. Device specific options for this activate. 114 + */ 115 + struct qaic_manage_trans_activate_from_dev { 116 + struct qaic_manage_trans_hdr hdr; 117 + __u32 status; 118 + __u32 dbc_id; 119 + __u64 options; 120 + }; 121 + 122 + /** 123 + * struct qaic_manage_trans_deactivate - Defines a deactivate request. 124 + * @hdr: In. Header to identify this transaction. 125 + * @dbc_id: In. Id of assigned DBC. 126 + * @pad: Structure padding. Must be 0. 127 + */ 128 + struct qaic_manage_trans_deactivate { 129 + struct qaic_manage_trans_hdr hdr; 130 + __u32 dbc_id; 131 + __u32 pad; 132 + }; 133 + 134 + /** 135 + * struct qaic_manage_trans_status_to_dev - Defines a status request. 136 + * @hdr: In. Header to identify this transaction. 137 + */ 138 + struct qaic_manage_trans_status_to_dev { 139 + struct qaic_manage_trans_hdr hdr; 140 + }; 141 + 142 + /** 143 + * struct qaic_manage_trans_status_from_dev - Defines a status response. 144 + * @hdr: Out. Header to identify this transaction. 145 + * @major: Out. NNC protocol version major number. 146 + * @minor: Out. NNC protocol version minor number. 147 + * @status: Out. Return code from device. 148 + * @status_flags: Out. Flags from device. Bit 0 indicates if CRCs are required. 149 + */ 150 + struct qaic_manage_trans_status_from_dev { 151 + struct qaic_manage_trans_hdr hdr; 152 + __u16 major; 153 + __u16 minor; 154 + __u32 status; 155 + __u64 status_flags; 156 + }; 157 + 158 + /** 159 + * struct qaic_manage_msg - Defines a message to the device. 160 + * @len: In. Length of all the transactions contained within this message. 161 + * @count: In. Number of transactions in this message. 162 + * @data: In. Address to an array where the transactions can be found. 163 + */ 164 + struct qaic_manage_msg { 165 + __u32 len; 166 + __u32 count; 167 + __u64 data; 168 + }; 169 + 170 + /** 171 + * struct qaic_create_bo - Defines a request to create a buffer object. 
172 + * @size: In. Size of the buffer in bytes. 173 + * @handle: Out. GEM handle for the BO. 174 + * @pad: Structure padding. Must be 0. 175 + */ 176 + struct qaic_create_bo { 177 + __u64 size; 178 + __u32 handle; 179 + __u32 pad; 180 + }; 181 + 182 + /** 183 + * struct qaic_mmap_bo - Defines a request to prepare a BO for mmap(). 184 + * @handle: In. Handle of the GEM BO to prepare for mmap(). 185 + * @pad: Structure padding. Must be 0. 186 + * @offset: Out. Offset value to provide to mmap(). 187 + */ 188 + struct qaic_mmap_bo { 189 + __u32 handle; 190 + __u32 pad; 191 + __u64 offset; 192 + }; 193 + 194 + /** 195 + * struct qaic_sem - Defines a semaphore command for a BO slice. 196 + * @val: In. Only lower 12 bits are valid. 197 + * @index: In. Only lower 5 bits are valid. 198 + * @presync: In. 1 if presync operation, 0 if postsync. 199 + * @cmd: In. One of QAIC_SEM_*. 200 + * @flags: In. Bitfield. See QAIC_SEM_INSYNCFENCE and QAIC_SEM_OUTSYNCFENCE. 201 + * @pad: Structure padding. Must be 0. 202 + */ 203 + struct qaic_sem { 204 + __u16 val; 205 + __u8 index; 206 + __u8 presync; 207 + __u8 cmd; 208 + __u8 flags; 209 + __u16 pad; 210 + }; 211 + 212 + /** 213 + * struct qaic_attach_slice_entry - Defines a single BO slice. 214 + * @size: In. Size of this slice in bytes. 215 + * @sem0: In. Semaphore command 0. Must be 0 if not valid. 216 + * @sem1: In. Semaphore command 1. Must be 0 if not valid. 217 + * @sem2: In. Semaphore command 2. Must be 0 if not valid. 218 + * @sem3: In. Semaphore command 3. Must be 0 if not valid. 219 + * @dev_addr: In. Device address this slice pushes to or pulls from. 220 + * @db_addr: In. Address of the doorbell to ring. 221 + * @db_data: In. Data to write to the doorbell. 222 + * @db_len: In. Size of the doorbell data in bits - 32, 16, or 8. 0 is for 223 + * inactive doorbells. 224 + * @offset: In. Start of this slice as an offset from the start of the BO.
225 + */ 226 + struct qaic_attach_slice_entry { 227 + __u64 size; 228 + struct qaic_sem sem0; 229 + struct qaic_sem sem1; 230 + struct qaic_sem sem2; 231 + struct qaic_sem sem3; 232 + __u64 dev_addr; 233 + __u64 db_addr; 234 + __u32 db_data; 235 + __u32 db_len; 236 + __u64 offset; 237 + }; 238 + 239 + /** 240 + * struct qaic_attach_slice_hdr - Defines metadata for a set of BO slices. 241 + * @count: In. Number of slices for this BO. 242 + * @dbc_id: In. Associate the sliced BO with this DBC. 243 + * @handle: In. GEM handle of the BO to slice. 244 + * @dir: In. Direction of data flow. 1 = DMA_TO_DEVICE, 2 = DMA_FROM_DEVICE. 245 + * @size: In. Total length of the BO. 246 + * If BO is imported (DMABUF/PRIME) then this size 247 + * should not exceed the size of DMABUF provided. 248 + * If BO is allocated using DRM_IOCTL_QAIC_CREATE_BO 249 + * then this size should be exactly the same as the size 250 + * provided during DRM_IOCTL_QAIC_CREATE_BO. 251 + * @dev_addr: In. Device address this slice pushes to or pulls from. 252 + * @db_addr: In. Address of the doorbell to ring. 253 + * @db_data: In. Data to write to the doorbell. 254 + * @db_len: In. Size of the doorbell data in bits - 32, 16, or 8. 0 is for 255 + * inactive doorbells. 256 + * @offset: In. Start of this slice as an offset from the start of the BO. 257 + */ 258 + struct qaic_attach_slice_hdr { 259 + __u32 count; 260 + __u32 dbc_id; 261 + __u32 handle; 262 + __u32 dir; 263 + __u64 size; 264 + };
1 = to device, 2 = from device. 280 + */ 281 + struct qaic_execute_entry { 282 + __u32 handle; 283 + __u32 dir; 284 + }; 285 + 286 + /** 287 + * struct qaic_partial_execute_entry - Defines a BO to resize and submit. 288 + * @handle: In. GEM handle of the BO to commit to the device. 289 + * @dir: In. Direction of data. 1 = to device, 2 = from device. 290 + * @resize: In. New size of the BO. Must be <= the original BO size. 0 is 291 + * short for no resize. 292 + */ 293 + struct qaic_partial_execute_entry { 294 + __u32 handle; 295 + __u32 dir; 296 + __u64 resize; 297 + }; 298 + 299 + /** 300 + * struct qaic_execute_hdr - Defines metadata for BO submission. 301 + * @count: In. Number of BOs to submit. 302 + * @dbc_id: In. DBC to submit the BOs on. 303 + */ 304 + struct qaic_execute_hdr { 305 + __u32 count; 306 + __u32 dbc_id; 307 + }; 308 + 309 + /** 310 + * struct qaic_execute - Defines a list of BOs to submit to the device. 311 + * @hdr: In. BO list metadata. 312 + * @data: In. Pointer to an array of BOs to submit. 313 + */ 314 + struct qaic_execute { 315 + struct qaic_execute_hdr hdr; 316 + __u64 data; 317 + }; 318 + 319 + /** 320 + * struct qaic_wait - Defines a blocking wait for BO execution. 321 + * @handle: In. GEM handle of the BO to wait on. 322 + * @timeout: In. Maximum time in ms to wait for the BO. 323 + * @dbc_id: In. DBC the BO is submitted to. 324 + * @pad: Structure padding. Must be 0. 325 + */ 326 + struct qaic_wait { 327 + __u32 handle; 328 + __u32 timeout; 329 + __u32 dbc_id; 330 + __u32 pad; 331 + }; 332 + 333 + /** 334 + * struct qaic_perf_stats_hdr - Defines metadata for getting BO perf info. 335 + * @count: In. Number of BOs requested. 336 + * @pad: Structure padding. Must be 0. 337 + * @dbc_id: In. DBC the BOs are associated with. 338 + */ 339 + struct qaic_perf_stats_hdr { 340 + __u16 count; 341 + __u16 pad; 342 + __u32 dbc_id; 343 + }; 344 + 345 + /** 346 + * struct qaic_perf_stats - Defines a request for getting BO perf info.
347 + * @hdr: In. Request metadata. 348 + * @data: In. Pointer to array of stats structures that will receive the data. 349 + */ 350 + struct qaic_perf_stats { 351 + struct qaic_perf_stats_hdr hdr; 352 + __u64 data; 353 + }; 354 + 355 + /** 356 + * struct qaic_perf_stats_entry - Defines perf info for a BO. 357 + * @handle: In. GEM handle of the BO to get perf stats for. 358 + * @queue_level_before: Out. Number of elements in the queue before this BO 359 + * was submitted. 360 + * @num_queue_element: Out. Number of elements added to the queue to submit 361 + * this BO. 362 + * @submit_latency_us: Out. Time taken by the driver to submit this BO. 363 + * @device_latency_us: Out. Time taken by the device to execute this BO. 364 + * @pad: Structure padding. Must be 0. 365 + */ 366 + struct qaic_perf_stats_entry { 367 + __u32 handle; 368 + __u32 queue_level_before; 369 + __u32 num_queue_element; 370 + __u32 submit_latency_us; 371 + __u32 device_latency_us; 372 + __u32 pad; 373 + }; 374 + 375 + #define DRM_QAIC_MANAGE 0x00 376 + #define DRM_QAIC_CREATE_BO 0x01 377 + #define DRM_QAIC_MMAP_BO 0x02 378 + #define DRM_QAIC_ATTACH_SLICE_BO 0x03 379 + #define DRM_QAIC_EXECUTE_BO 0x04 380 + #define DRM_QAIC_PARTIAL_EXECUTE_BO 0x05 381 + #define DRM_QAIC_WAIT_BO 0x06 382 + #define DRM_QAIC_PERF_STATS_BO 0x07 383 + 384 + #define DRM_IOCTL_QAIC_MANAGE DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_MANAGE, struct qaic_manage_msg) 385 + #define DRM_IOCTL_QAIC_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_CREATE_BO, struct qaic_create_bo) 386 + #define DRM_IOCTL_QAIC_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_MMAP_BO, struct qaic_mmap_bo) 387 + #define DRM_IOCTL_QAIC_ATTACH_SLICE_BO DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_ATTACH_SLICE_BO, struct qaic_attach_slice) 388 + #define DRM_IOCTL_QAIC_EXECUTE_BO DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_EXECUTE_BO, struct qaic_execute) 389 + #define DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_PARTIAL_EXECUTE_BO, struct qaic_execute) 390 +
#define DRM_IOCTL_QAIC_WAIT_BO DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_WAIT_BO, struct qaic_wait) 391 + #define DRM_IOCTL_QAIC_PERF_STATS_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_PERF_STATS_BO, struct qaic_perf_stats) 392 + 393 + #if defined(__cplusplus) 394 + } 395 + #endif 396 + 397 + #endif /* QAIC_ACCEL_H_ */