
list: Introduce CONFIG_LIST_HARDENED

Numerous production kernel configs (see [1, 2]) choose to enable
CONFIG_DEBUG_LIST, which is also recommended by KSPP for hardened
configs [3]. The motivation is that the option can double as a
security hardening feature: CVE-2019-2215 and CVE-2019-2025, for
example, are mitigated by it [4].

The feature, however, was never designed with performance in mind, yet
common list manipulation happens in hot paths all over the kernel.

Introduce CONFIG_LIST_HARDENED, which performs list pointer checking
inline, and only upon list corruption calls the reporting slow path.

To generate optimal machine code with CONFIG_LIST_HARDENED (a
condensed sketch of the resulting pattern follows this list):

1. Elide checking for pointer values which upon dereference would
result in an immediate access fault (i.e. minimal hardening
checks). The trade-off is lower-quality error reports.

2. Use the __preserve_most function attribute (available with Clang,
but not yet with GCC) to minimize the code footprint of calls to the
reporting slow path. Because the callee preserves most registers,
callers no longer need to save their registers before the rarely
taken call, which reduces caller function size.

Note that all TUs in lib/Makefile already disable function tracing,
including list_debug.c, and __preserve_most's implied notrace has
no effect in this case.

3. Because the inline checks are a subset of the full set of checks in
__list_*_valid_or_report(), always return false if the inline
checks failed. This avoids a redundant compare and conditional
branch right after the return from the slow path.

As a side-effect of the checks being inline, if the compiler can prove
some condition to always be true, it can completely elide some checks.
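
The resulting shape of the fast path and slow path can be seen in the
include/linux/list.h hunk below; the following is only a condensed,
self-contained sketch of the same pattern (identifiers simplified, and
__preserve_most approximated with a Clang-only fallback, since GCC
lacks the attribute):

  #include <stdbool.h>

  /* Approximation of the kernel's attribute plumbing: preserve_most
   * is Clang-only for now, so it degrades to nothing elsewhere. */
  #if defined(__clang__)
  # define __preserve_most __attribute__((preserve_most))
  #else
  # define __preserve_most
  #endif

  struct list_head {
          struct list_head *next, *prev;
  };

  /* Out-of-line reporting slow path: "cold" keeps it off the hot
   * path, and preserve_most spares callers from saving caller-saved
   * registers around this rarely taken call. It always returns false,
   * since it is only reached after the inline checks failed (3). */
  static __attribute__((cold, noinline)) __preserve_most
  bool list_corruption_report(struct list_head *new,
                              struct list_head *prev,
                              struct list_head *next)
  {
          /* ... report which precondition failed ... */
          return false;
  }

  /* Inline fast path: only the checks whose failure would not already
   * cause an immediate access fault on the dereferences below (1);
   * e.g. NULL next/prev are simply left to fault. */
  static inline bool list_add_valid(struct list_head *new,
                                    struct list_head *prev,
                                    struct list_head *next)
  {
          if (__builtin_expect(next->prev == prev && prev->next == next &&
                               new != prev && new != next, 1))
                  return true;
          return list_corruption_report(new, prev, next);
  }

With Clang, the preserve_most calling convention makes the slow-path
call nearly free for the caller, which is what makes inlining the
checks into every list_add()/list_del() site affordable.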

Since DEBUG_LIST is functionally a superset of LIST_HARDENED, the
Kconfig variables are changed to reflect that: DEBUG_LIST selects
LIST_HARDENED, whereas LIST_HARDENED itself has no dependency on
DEBUG_LIST.
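
Reduced to its essentials (the full hunks are in the lib/Kconfig.debug
and security/Kconfig.hardening diffs below), the Kconfig relationship
becomes:

  config LIST_HARDENED
  	bool "Check integrity of linked list manipulation"

  config DEBUG_LIST
  	bool "Debug linked list manipulation"
  	depends on DEBUG_KERNEL || BUG_ON_DATA_CORRUPTION
  	select LIST_HARDENED

So enabling DEBUG_LIST always pulls in LIST_HARDENED's reporting
infrastructure, while LIST_HARDENED can be enabled on its own for
production configs.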

Running netperf with CONFIG_LIST_HARDENED (using a Clang compiler with
"preserve_most") shows throughput improvements, in my case ~7% on
average (up to 20-30% in some test cases).

Link: https://r.android.com/1266735 [1]
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2]
Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3]
Link: https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html [4]
Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-3-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>

8 files changed: +88 -13
+1 -1
arch/arm64/kvm/hyp/nvhe/Makefile
···
 	 cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o stacktrace.o ffa.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
-hyp-obj-$(CONFIG_DEBUG_LIST) += list_debug.o
+hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
 hyp-obj-y += $(lib-objs)
 
 ##
+2
arch/arm64/kvm/hyp/nvhe/list_debug.c
···
 
 /* The predicates checked here are taken from lib/list_debug.c. */
 
+__list_valid_slowpath
 bool __list_add_valid_or_report(struct list_head *new, struct list_head *prev,
 				struct list_head *next)
 {
···
 	return true;
 }
 
+__list_valid_slowpath
 bool __list_del_entry_valid_or_report(struct list_head *entry)
 {
 	struct list_head *prev, *next;
+2 -2
drivers/misc/lkdtm/bugs.c
···
 		pr_err("Overwrite did not happen, but no BUG?!\n");
 	else {
 		pr_err("list_add() corruption not detected!\n");
-		pr_expected_config(CONFIG_DEBUG_LIST);
+		pr_expected_config(CONFIG_LIST_HARDENED);
 	}
 }
 
···
 		pr_err("Overwrite did not happen, but no BUG?!\n");
 	else {
 		pr_err("list_del() corruption not detected!\n");
-		pr_expected_config(CONFIG_DEBUG_LIST);
+		pr_expected_config(CONFIG_LIST_HARDENED);
 	}
 }
+58 -6
include/linux/list.h
···
 	WRITE_ONCE(list->prev, list);
 }
 
+#ifdef CONFIG_LIST_HARDENED
+
 #ifdef CONFIG_DEBUG_LIST
+# define __list_valid_slowpath
+#else
+# define __list_valid_slowpath __cold __preserve_most
+#endif
+
 /*
  * Performs the full set of list corruption checks before __list_add().
  * On list corruption reports a warning, and returns false.
  */
-extern bool __list_add_valid_or_report(struct list_head *new,
-				       struct list_head *prev,
-				       struct list_head *next);
+extern bool __list_valid_slowpath __list_add_valid_or_report(struct list_head *new,
+							     struct list_head *prev,
+							     struct list_head *next);
 
 /*
  * Performs list corruption checks before __list_add(). Returns false if a
  * corruption is detected, true otherwise.
+ *
+ * With CONFIG_LIST_HARDENED only, performs minimal list integrity checking
+ * inline to catch non-faulting corruptions, and only if a corruption is
+ * detected calls the reporting function __list_add_valid_or_report().
  */
 static __always_inline bool __list_add_valid(struct list_head *new,
 					     struct list_head *prev,
 					     struct list_head *next)
 {
-	return __list_add_valid_or_report(new, prev, next);
+	bool ret = true;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_LIST)) {
+		/*
+		 * With the hardening version, elide checking if next and prev
+		 * are NULL, since the immediate dereference of them below would
+		 * result in a fault if NULL.
+		 *
+		 * With the reduced set of checks, we can afford to inline the
+		 * checks, which also gives the compiler a chance to elide some
+		 * of them completely if they can be proven at compile-time. If
+		 * one of the pre-conditions does not hold, the slow-path will
+		 * show a report which pre-condition failed.
+		 */
+		if (likely(next->prev == prev && prev->next == next && new != prev && new != next))
+			return true;
+		ret = false;
+	}
+
+	ret &= __list_add_valid_or_report(new, prev, next);
+	return ret;
 }
 
 /*
  * Performs the full set of list corruption checks before __list_del_entry().
  * On list corruption reports a warning, and returns false.
  */
-extern bool __list_del_entry_valid_or_report(struct list_head *entry);
+extern bool __list_valid_slowpath __list_del_entry_valid_or_report(struct list_head *entry);
 
 /*
  * Performs list corruption checks before __list_del_entry(). Returns false if a
  * corruption is detected, true otherwise.
+ *
+ * With CONFIG_LIST_HARDENED only, performs minimal list integrity checking
+ * inline to catch non-faulting corruptions, and only if a corruption is
+ * detected calls the reporting function __list_del_entry_valid_or_report().
  */
 static __always_inline bool __list_del_entry_valid(struct list_head *entry)
 {
-	return __list_del_entry_valid_or_report(entry);
+	bool ret = true;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_LIST)) {
+		struct list_head *prev = entry->prev;
+		struct list_head *next = entry->next;
+
+		/*
+		 * With the hardening version, elide checking if next and prev
+		 * are NULL, LIST_POISON1 or LIST_POISON2, since the immediate
+		 * dereference of them below would result in a fault.
+		 */
+		if (likely(prev->next == entry && next->prev == entry))
+			return true;
+		ret = false;
+	}
+
+	ret &= __list_del_entry_valid_or_report(entry);
+	return ret;
 }
 #else
 static inline bool __list_add_valid(struct list_head *new,
+7 -2
lib/Kconfig.debug
···
 config DEBUG_LIST
 	bool "Debug linked list manipulation"
 	depends on DEBUG_KERNEL || BUG_ON_DATA_CORRUPTION
+	select LIST_HARDENED
 	help
-	  Enable this to turn on extended checks in the linked-list
-	  walking routines.
+	  Enable this to turn on extended checks in the linked-list walking
+	  routines.
+
+	  This option trades better quality error reports for performance, and
+	  is more suitable for kernel debugging. If you care about performance,
+	  you should only enable CONFIG_LIST_HARDENED instead.
 
 	  If unsure, say N.
+1 -1
lib/Makefile
···
 obj-$(CONFIG_INTERVAL_TREE) += interval_tree.o
 obj-$(CONFIG_ASSOCIATIVE_ARRAY) += assoc_array.o
 obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
-obj-$(CONFIG_DEBUG_LIST) += list_debug.o
+obj-$(CONFIG_LIST_HARDENED) += list_debug.o
 obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o
 
 obj-$(CONFIG_BITREVERSE) += bitrev.o
+4 -1
lib/list_debug.c
···
  * Copyright 2006, Red Hat, Inc., Dave Jones
  * Released under the General Public License (GPL).
  *
- * This file contains the linked list validation for DEBUG_LIST.
+ * This file contains the linked list validation and error reporting for
+ * LIST_HARDENED and DEBUG_LIST.
  */
 
 #include <linux/export.h>
···
  * attempt).
  */
 
+__list_valid_slowpath
 bool __list_add_valid_or_report(struct list_head *new, struct list_head *prev,
 				struct list_head *next)
 {
···
 }
 EXPORT_SYMBOL(__list_add_valid_or_report);
 
+__list_valid_slowpath
 bool __list_del_entry_valid_or_report(struct list_head *entry)
 {
 	struct list_head *prev, *next;
+13
security/Kconfig.hardening
···
 
 endmenu
 
+menu "Hardening of kernel data structures"
+
+config LIST_HARDENED
+	bool "Check integrity of linked list manipulation"
+	help
+	  Minimal integrity checking in the linked-list manipulation routines
+	  to catch memory corruptions that are not guaranteed to result in an
+	  immediate access fault.
+
+	  If unsure, say N.
+
+endmenu
+
 config CC_HAS_RANDSTRUCT
 	def_bool $(cc-option,-frandomize-layout-seed-file=/dev/null)
 	# Randstruct was first added in Clang 15, but it isn't safe to use until