cxl: doc/linux/access-coordinates Update access coordinates calculation methods

Add documentation on how to calculate the access coordinates for a given
CXL region in detail.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250515000923.2590820-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Dave Jiang fc785615 1ce91b37

+85 -2
Documentation/driver-api/cxl/linux/access-coordinates.rst
···
 CXL Access Coordinates Computation
 ==================================

+Latency and Bandwidth Calculation
+=================================
+A memory region's performance coordinates (latency and bandwidth) are typically
+provided via the ACPI tables :doc:`SRAT <../platform/acpi/srat>` and
+:doc:`HMAT <../platform/acpi/hmat>`. However, the platform firmware (BIOS) is
+not able to annotate those for CXL devices that are hot-plugged since they do
+not exist during platform firmware initialization. The CXL driver can compute
+the performance coordinates by retrieving data from several components.
+
+The :doc:`SRAT <../platform/acpi/srat>` provides a Generic Port Affinity
+subtable that ties a proximity domain to a device handle, which in this case
+would be the CXL hostbridge. Using this association, the performance
+coordinates for the Generic Port can be retrieved from the
+:doc:`HMAT <../platform/acpi/hmat>` subtable. This piece represents the
+performance coordinates between a CPU and a Generic Port (CXL hostbridge).
+
+The :doc:`CDAT <../platform/cdat>` provides the performance coordinates for
+the CXL device itself, that is, the bandwidth and latency to access that
+device's memory region. The DSMAS subtable provides a DSMADHandle that is tied
+to a Device Physical Address (DPA) range. The DSLBIS subtable provides the
+performance coordinates that are tied to a DSMADHandle, and this ties the two
+table entries together to provide the performance coordinates for each DPA
+region. For example, if a device exports a DRAM region and a PMEM region,
+then there would be different performance characteristics for each of those
+regions.
+
+If there's a CXL switch in the topology, then the performance coordinates for
+the switch are provided by the SSLBIS subtable. This provides the bandwidth and
+latency for traversing the switch between the switch upstream port and the
+switch downstream port that points to the endpoint device.
+
+Simple topology example::
+
+ GP0/HB0/ACPI0016-0
+        RP0
+         |
+         | L0
+         |
+     SW 0 / USP0
+     SW 0 / DSP0
+         |
+         | L1
+         |
+        EP0
+
+In this example, there is a CXL switch between an endpoint and a root port.
+Latency in this example is calculated as such:
+L(EP0) - Latency from EP0 CDAT DSMAS+DSLBIS
+L(L1) - Link latency between EP0 and SW0DSP0
+L(SW0) - Latency for the switch from SW0 CDAT SSLBIS.
+L(L0) - Link latency between SW0 and RP0
+L(RP0) - Latency from root port to CPU via SRAT and HMAT (Generic Port).
+Total read and write latencies are the sum of all these parts.
+
+Bandwidth in this example is calculated as such:
+B(EP0) - Bandwidth from EP0 CDAT DSMAS+DSLBIS
+B(L1) - Link bandwidth between EP0 and SW0DSP0
+B(SW0) - Bandwidth for the switch from SW0 CDAT SSLBIS.
+B(L0) - Link bandwidth between SW0 and RP0
+B(RP0) - Bandwidth from root port to CPU via SRAT and HMAT (Generic Port).
+The total read and write bandwidth is the min() of all these parts.
+
+To calculate the link bandwidth:
+LinkOperatingFrequency (GT/s) is the current negotiated link speed.
+DataRatePerLink (MB/s) = LinkOperatingFrequency / 8
+Bandwidth (MB/s) = PCIeCurrentLinkWidth * DataRatePerLink
+Where PCIeCurrentLinkWidth is the number of lanes in the link.
+
+To calculate the link latency:
+LinkLatency (picoseconds) = FlitSize / LinkBandwidth (MB/s)
+
+See `CXL Memory Device SW Guide r1.0 <https://www.intel.com/content/www/us/en/content-details/643805/cxl-memory-device-software-guide.html>`_,
+sections 2.11.3 and 2.11.4 for details.
+
+In the end, the access coordinates for a constructed memory region are
+calculated from one or more memory partitions from each of the CXL device(s).
+
 Shared Upstream Link Calculation
 ================================
 For certain CXL region construction with endpoints behind CXL switches (SW) or
···
 bandwidth from all the members of the last xarray is updated for the
 access coordinates residing in the cxl region (cxlr) context.

-.. kernel-doc:: drivers/cxl/acpi.c
-   :identifiers: cxl_acpi_evaluate_qtg_dsm
+QTG ID
+======
+Each CFMWS entry in the :doc:`CEDT <../platform/acpi/cedt>` has a QTG ID field.
+This field provides the ID of the QoS Throttling Group (QTG) associated with
+the CFMWS window. Once the access coordinates are calculated, an ACPI Device
+Specific Method can be issued to the ACPI0016 device to retrieve the QTG ID
+based on the access coordinates provided. The QTG ID for the device can be used
+as guidance to match it to a CFMWS and set up the best Linux root decoder for
+the device performance.
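
The latency and bandwidth rules added by this patch can be illustrated with a short Python sketch for the EP0 / L1 / SW0 / L0 / RP0 path. This is not the kernel implementation: the `Coord` class and all helper names are hypothetical, every number is made up for illustration, and two unit assumptions are made so that the formulas produce the stated units (the link speed is passed in MT/s so that dividing by 8 gives MB/s, and the flit-size quotient is scaled by 10**6 to obtain picoseconds). A 68-byte flit is used only as an example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coord:
    """Hypothetical pair: latency in picoseconds, bandwidth in MB/s."""
    latency_ps: int
    bandwidth: int


def link_bandwidth(speed_mtps: int, lanes: int) -> int:
    """Link bandwidth in MB/s.

    Assumption: the negotiated speed is given in MT/s (32 GT/s == 32000)
    so that dividing by 8 bits per byte yields MB/s per lane.
    """
    data_rate_per_lane = speed_mtps // 8
    return lanes * data_rate_per_lane


def link_latency_ps(flit_bytes: int, bandwidth: int) -> int:
    """Link latency in picoseconds, FlitSize / LinkBandwidth.

    Assumption: bytes / (MB/s) is microseconds, so scale by 10**6.
    """
    return flit_bytes * 1_000_000 // bandwidth


def make_link(speed_mtps: int, lanes: int, flit_bytes: int) -> Coord:
    """Build the coordinates of one link from its negotiated parameters."""
    bw = link_bandwidth(speed_mtps, lanes)
    return Coord(link_latency_ps(flit_bytes, bw), bw)


def combine(parts) -> Coord:
    """Sum the latencies and take the minimum bandwidth over all parts."""
    return Coord(
        latency_ps=sum(p.latency_ps for p in parts),
        bandwidth=min(p.bandwidth for p in parts),
    )


# Made-up values; real ones come from CDAT (EP0, SW0) and SRAT/HMAT (RP0).
ep0 = Coord(100_000, 30_000)    # DSMAS + DSLBIS
l1 = make_link(32_000, 16, 68)  # link between EP0 and SW0 DSP0
sw0 = Coord(20_000, 50_000)     # SSLBIS
l0 = make_link(32_000, 16, 68)  # link between SW0 and RP0
rp0 = Coord(50_000, 60_000)     # Generic Port

total = combine([ep0, l1, sw0, l0, rp0])
print(total)
```

The sketch keeps a single value per component for brevity; the same sum and min() steps would be applied separately to the read and write coordinates.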