Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Documentation: core-api: entry: Add comments about nesting

The topic of nesting and reentrancy in the context of early entry code
hasn't been addressed so far. So do it.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220110105044.94423-2-nsaenzju@redhat.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Authored by Nicolas Saenz Julienne and committed by Jonathan Corbet
e3aa43e9 bf026e2e

+18
Documentation/core-api/entry.rst
···
 ensure that enter_from_user_mode() is called first on entry and
 exit_to_user_mode() is called last on exit.
 
+Do not nest syscalls. Nested systcalls will cause RCU and/or context tracking
+to print a warning.
 
 KVM
 ---
···
 Task work handling is done separately for guest at the boundary of the
 vcpu_run() loop via xfer_to_guest_mode_handle_work() which is a subset of
 the work handled on return to user space.
+
+Do not nest KVM entry/exit transitions because doing so is nonsensical.
 
 Interrupts and regular exceptions
 ---------------------------------
···
 before it handles soft interrupts, whose handlers must run in BH context rather
 than irq-disabled context. In addition, irqentry_exit() might schedule, which
 also requires that HARDIRQ_OFFSET has been removed from the preemption count.
+
+Even though interrupt handlers are expected to run with local interrupts
+disabled, interrupt nesting is common from an entry/exit perspective. For
+example, softirq handling happens within an irqentry_{enter,exit}() block with
+local interrupts enabled. Also, although uncommon, nothing prevents an
+interrupt handler from re-enabling interrupts.
+
+Interrupt entry/exit code doesn't strictly need to handle reentrancy, since it
+runs with local interrupts disabled. But NMIs can happen anytime, and a lot of
+the entry code is shared between the two.
 
 NMI and NMI-like exceptions
 ---------------------------
···
 
 There is no combined irqentry_nmi_if_kernel() function available as the
 above cannot be handled in an exception-agnostic way.
+
+NMIs can happen in any context. For example, an NMI-like exception triggered
+while handling an NMI. So NMI entry code has to be reentrant and state updates
+need to handle nesting.
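
The last hunk's requirement that "state updates need to handle nesting" can be illustrated with a small userspace model. This is only a sketch and not kernel code: every name in it (`model_cpu_state`, `model_entry_state`, `model_nmi_enter()`, `model_nmi_exit()`) is hypothetical, and the `watching` flag merely stands in for the kind of tracking state that entry code updates. The idea it shows is that each entry saves the state it found and the matching exit restores exactly that state, so a nested entry/exit pair leaves the outer section undisturbed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: tracking state for one CPU. */
struct model_cpu_state {
	int nesting;    /* number of entry sections currently open */
	bool watching;  /* stand-in for tracking state enabled while inside */
};

/* Saved by each entry so that the matching exit can restore it. */
struct model_entry_state {
	bool was_watching;
};

/* Enter a (possibly nested) section: remember the old state, then update. */
static struct model_entry_state model_nmi_enter(struct model_cpu_state *cpu)
{
	struct model_entry_state st = { .was_watching = cpu->watching };

	cpu->nesting++;
	cpu->watching = true;
	return st;
}

/* Leave a section: restore what the matching enter saw, not a fixed value. */
static void model_nmi_exit(struct model_cpu_state *cpu,
			   struct model_entry_state st)
{
	assert(cpu->nesting > 0);  /* an exit without an enter is a bug */
	cpu->nesting--;
	cpu->watching = st.was_watching;
}
```

If the exit path unconditionally cleared `watching` instead of restoring the saved value, an inner exit would switch tracking off while the outer section was still running, which is the kind of breakage a non-reentrant state update causes.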