Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/refcount: Document interaction with PID_MAX_LIMIT

Document the circumstances under which refcount_t's saturation mechanism
works deterministically.

Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200303105427.260620-1-jannh@google.com

Authored by Jann Horn and committed by Ingo Molnar
a13f58a0 d22cc7f6

+18 -5
include/linux/refcount.h
@@ -38,11 +38,24 @@
  * atomic operations, then the count will continue to edge closer to 0. If it
  * reaches a value of 1 before /any/ of the threads reset it to the saturated
  * value, then a concurrent refcount_dec_and_test() may erroneously free the
- * underlying object. Given the precise timing details involved with the
- * round-robin scheduling of each thread manipulating the refcount and the need
- * to hit the race multiple times in succession, there doesn't appear to be a
- * practical avenue of attack even if using refcount_add() operations with
- * larger increments.
+ * underlying object.
+ * Linux limits the maximum number of tasks to PID_MAX_LIMIT, which is currently
+ * 0x400000 (and can't easily be raised in the future beyond FUTEX_TID_MASK).
+ * With the current PID limit, if no batched refcounting operations are used and
+ * the attacker can't repeatedly trigger kernel oopses in the middle of refcount
+ * operations, this makes it impossible for a saturated refcount to leave the
+ * saturation range, even if it is possible for multiple uses of the same
+ * refcount to nest in the context of a single task:
+ *
+ * (UINT_MAX+1-REFCOUNT_SATURATED) / PID_MAX_LIMIT =
+ * 0x40000000 / 0x400000 = 0x100 = 256
+ *
+ * If hundreds of references are added/removed with a single refcounting
+ * operation, it may potentially be possible to leave the saturation range; but
+ * given the precise timing details involved with the round-robin scheduling of
+ * each thread manipulating the refcount and the need to hit the race multiple
+ * times in succession, there doesn't appear to be a practical avenue of attack
+ * even if using refcount_add() operations with larger increments.
  *
  * Memory ordering
  * ===============
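
The arithmetic in the new comment can be checked with a small standalone sketch. This is userspace C, not kernel code. It assumes REFCOUNT_SATURATED is INT_MIN / 2 (0xC0000000 when viewed as an unsigned 32-bit value, which matches the 0x40000000 headroom quoted in the comment) and takes PID_MAX_LIMIT as the 0x400000 stated above. The macro and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The calculation below assumes a 32-bit int, as on common Linux targets. */
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

/* Assumed values restated for illustration; not taken from kernel headers. */
#define SKETCH_REFCOUNT_SATURATED ((uint32_t)(INT_MIN / 2)) /* 0xC0000000 */
#define SKETCH_PID_MAX_LIMIT      0x400000u                 /* per the comment */

/* Distance from the saturation value up to the unsigned wrap point (2^32). */
static uint64_t saturation_headroom(void)
{
    return ((uint64_t)UINT_MAX + 1) - SKETCH_REFCOUNT_SATURATED;
}

/* Headroom divided evenly across the maximum number of tasks. */
static uint64_t per_task_budget(void)
{
    return saturation_headroom() / SKETCH_PID_MAX_LIMIT;
}
```

Under these assumptions saturation_headroom() evaluates to 0x40000000 and per_task_budget() to 256, matching the figures in the comment: every one of the PID_MAX_LIMIT possible tasks would need to have about 256 refcount operations in flight at once before a saturated counter could be pushed to the wrap point.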