Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cxl: docs/allocation/reclaim

Document a bit about how reclaim interacts with various CXL
configurations.

Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-16-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Authored by Gregory Price and committed by Dave Jiang
f109e77d 419dc40b

+52
+51
Documentation/driver-api/cxl/allocation/reclaim.rst
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+Reclaim
+=======
+Another way CXL memory can be utilized *indirectly* is via the reclaim system
+in :code:`mm/vmscan.c`. Reclaim is engaged when memory capacity on the system
+becomes pressured based on global and cgroup-local `watermark` settings.
+
+In this section we won't discuss the `watermark` configurations, just how CXL
+memory can be consumed by various pieces of reclaim system.
+
+Demotion
+========
+By default, the reclaim system will prefer swap (or zswap) when reclaiming
+memory. Enabling :code:`kernel/mm/numa/demotion_enabled` will cause vmscan
+to opportunistically prefer distant NUMA nodes to swap or zswap, if capacity
+is available.
+
+Demotion engages the :code:`mm/memory_tier.c` component to determine the
+next demotion node. The next demotion node is based on the :code:`HMAT`
+or :code:`CDAT` performance data.
+
+cpusets.mems_allowed quirk
+--------------------------
+In Linux v6.15 and below, demotion does not respect :code:`cpusets.mems_allowed`
+when migrating pages. As a result, if demotion is enabled, vmscan cannot
+guarantee isolation of a container's memory from nodes not set in mems_allowed.
+
+In Linux v6.XX and up, demotion does attempt to respect
+:code:`cpusets.mems_allowed`; however, certain classes of shared memory
+originally instantiated by another cgroup (such as common libraries - e.g.
+libc) may still be demoted. As a result, the mems_allowed interface still
+cannot provide perfect isolation from the remote nodes.
+
+ZSwap and Node Preference
+=========================
+In Linux v6.15 and below, ZSwap allocates memory from the local node of the
+processor for the new pages being compressed. Since pages being compressed
+are typically cold, the result is a cold page becomes promoted - only to
+be later demoted as it ages off the LRU.
+
+In Linux v6.XX, ZSwap tries to prefer the node of the page being compressed
+as the allocation target for the compression page. This helps prevent
+thrashing.
+
+Demotion with ZSwap
+===================
+When enabling both Demotion and ZSwap, you create a situation where ZSwap
+will prefer the slowest form of CXL memory by default until that tier of
+memory is exhausted.
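
Reader's note on the Demotion section of the patch: the following is a minimal Python sketch, not part of the commit, of how one might check whether demotion is currently enabled. It assumes the knob named in the patch lives under ``/sys`` (that is, ``/sys/kernel/mm/numa/demotion_enabled``) and that it reads back as a boolean-like string such as ``true`` or ``false``. The helper names ``parse_demotion_flag`` and ``read_demotion_enabled`` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumption: the knob from the patch text, resolved relative to /sys.
DEMOTION_KNOB = Path("/sys/kernel/mm/numa/demotion_enabled")


def parse_demotion_flag(text: str) -> bool:
    """Interpret the knob's contents as a boolean.

    Assumption: the kernel reports "true"/"false"; a few other common
    boolean spellings are accepted here for robustness.
    """
    return text.strip().lower() in {"1", "true", "y", "yes", "on"}


def read_demotion_enabled(path: Path = DEMOTION_KNOB) -> Optional[bool]:
    """Return True/False for the knob state, or None if it cannot be read
    (for example on a kernel built without NUMA demotion support)."""
    try:
        return parse_demotion_flag(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    state = read_demotion_enabled()
    if state is None:
        print("demotion knob not available on this system")
    else:
        print("demotion enabled" if state else "demotion disabled")
```

Enabling demotion would typically mean writing a true value to the same file as root. The sketch only reads the knob so that it is safe to run on any machine.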
+1
Documentation/driver-api/cxl/index.rst
···
 
    allocation/dax
    allocation/page-allocator
+   allocation/reclaim
 
 .. only:: subproject and html