Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Documentation: update numastat explanation

During recent patch discussion [1] it became apparent that the "other_node"
definition in the numastat documentation has always been different from actual
implementation. It was also noted that the stats can be inaccurate on systems
with memoryless nodes.

This patch corrects the other_node definition (with minor tweaks to two more
definitions), adds a note about memoryless nodes and also two introductory
paragraphs to the numastat documentation.
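The counter semantics described above can be sketched as a small model (illustrative Python only, not kernel code; the `record_alloc` helper and node ids are invented for the example):

```python
from collections import Counter

def record_alloc(stats, preferred, local, got):
    """Update per-node counters for one allocation, following the
    semantics described in the text (a model, not the kernel code).

    stats     -- dict mapping node id -> Counter of numastat fields
    preferred -- node the process wanted memory from
    local     -- node whose CPU the process was running on
    got       -- node the allocation actually came from
    """
    if got == preferred:
        stats[preferred]["numa_hit"] += 1
    else:
        # numa_foreign on the preferred node, numa_miss where it succeeded
        stats[preferred]["numa_foreign"] += 1
        stats[got]["numa_miss"] += 1
    if got == local:
        stats[got]["local_node"] += 1
    else:
        # other_node is incremented on the node that supplied the memory
        stats[got]["other_node"] += 1

stats = {0: Counter(), 1: Counter()}
record_alloc(stats, preferred=0, local=0, got=0)  # numa_hit + local_node on node 0
record_alloc(stats, preferred=0, local=0, got=1)  # numa_foreign on 0; numa_miss + other_node on 1
print(dict(stats[0]), dict(stats[1]))
```

Note how, as the text says, there is no counter analogous to numa_foreign in the local_node/other_node pair: both are incremented only on the node where the allocation succeeded.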

[1] https://lore.kernel.org/linux-mm/20200504070304.127361-1-sandipan@linux.ibm.com/T/#u

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lore.kernel.org/r/20200507120217.12313-1-vbabka@suse.cz
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Authored by Vlastimil Babka; committed by Jonathan Corbet
77691ee9 ea8fdf1a

+28 -3
Documentation/admin-guide/numastat.rst
···
 
 All units are pages. Hugepages have separate counters.
 
+The numa_hit, numa_miss and numa_foreign counters reflect how well processes
+are able to allocate memory from nodes they prefer. If they succeed, numa_hit
+is incremented on the preferred node, otherwise numa_foreign is incremented on
+the preferred node and numa_miss on the node where allocation succeeded.
+
+Usually preferred node is the one local to the CPU where the process executes,
+but restrictions such as mempolicies can change that, so there are also two
+counters based on CPU local node. local_node is similar to numa_hit and is
+incremented on allocation from a node by CPU on the same node. other_node is
+similar to numa_miss and is incremented on the node where allocation succeeds
+from a CPU from a different node. Note there is no counter analogical to
+numa_foreign.
+
+In more detail:
+
 =============== ============================================================
 numa_hit        A process wanted to allocate memory from this node,
                 and succeeded.
···
                 but ended up with memory from this node.
 
 numa_foreign    A process wanted to allocate on this node,
-                but ended up with memory from another one.
+                but ended up with memory from another node.
 
-local_node      A process ran on this node and got memory from it.
+local_node      A process ran on this node's CPU,
+                and got memory from this node.
 
-other_node      A process ran on this node and got memory from another node.
+other_node      A process ran on a different node's CPU
+                and got memory from this node.
 
 interleave_hit  Interleaving wanted to allocate from this node
                 and succeeded.
···
 (http://oss.sgi.com/projects/libnuma/). Note that it only works
 well right now on machines with a small number of CPUs.
+
+Note that on systems with memoryless nodes (where a node has CPUs but no
+memory) the numa_hit, numa_miss and numa_foreign statistics can be skewed
+heavily. In the current kernel implementation, if a process prefers a
+memoryless node (i.e. because it is running on one of its local CPU), the
+implementation actually treats one of the nearest nodes with memory as the
+preferred node. As a result, such allocation will not increase the numa_foreign
+counter on the memoryless node, and will skew the numa_hit, numa_miss and
+numa_foreign statistics of the nearest node.
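In practice these counters are read from sysfs, one file per node, as `name value` pairs. A minimal parsing sketch, assuming the usual /sys/devices/system/node/nodeN/numastat layout (the sample contents below are made up for illustration):

```python
def parse_numastat(text):
    """Parse the 'name value' pairs of a per-node numastat file."""
    stats = {}
    for line in text.splitlines():
        name, value = line.split()
        stats[name] = int(value)
    return stats

# Sample contents in the format of /sys/devices/system/node/node0/numastat
sample = """\
numa_hit 512540
numa_miss 1024
numa_foreign 2048
interleave_hit 77
local_node 512000
other_node 1564
"""

stats = parse_numastat(sample)
total = stats["numa_hit"] + stats["numa_miss"]
print("hit ratio: %.3f" % (stats["numa_hit"] / total))  # prints: hit ratio: 0.998
```

On a real system one would read each node's file, e.g. `open("/sys/devices/system/node/node0/numastat").read()`, and remember that all values are in pages, with hugepages counted separately.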