Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

kasan: allow sampling page_alloc allocations for HW_TAGS

As Hardware Tag-Based KASAN is intended to be used in production, its
performance impact is crucial. As page_alloc allocations tend to be big,
tagging and checking all such allocations can introduce a significant
slowdown.

Add two new boot parameters that allow alleviating that slowdown:

- kasan.page_alloc.sample, which makes Hardware Tag-Based KASAN tag only
every Nth page_alloc allocation with the order configured by the second
added parameter (default: tag every such allocation).

- kasan.page_alloc.sample.order, which makes the sampling enabled by the
first parameter only affect page_alloc allocations with an order equal to
or greater than the specified value (default: 3, see below).
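For example, the two parameters combine on the kernel command line as follows (the interval of 10 here is illustrative; it matches the saturation point in the measurements below, and other KASAN parameters are omitted):

```
kasan.page_alloc.sample=10 kasan.page_alloc.sample.order=3
```

With these values, only every 10th page_alloc allocation of order 3 or higher is tagged; smaller allocations are all still tagged.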

The exact performance improvement caused by using the new parameters
depends on their values and the applied workload.

The chosen default value for kasan.page_alloc.sample.order is 3, which
matches both PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER. This is
done for two reasons:

1. PAGE_ALLOC_COSTLY_ORDER is "the order at which allocations are deemed
costly to service", which corresponds to the idea that only large and
thus costly allocations are supposed to be sampled.

2. One of the workloads targeted by this patch is a benchmark that sends
a large amount of data over a local loopback connection. Most multi-page
data allocations in the networking subsystem have the order of
SKB_FRAG_PAGE_ORDER (or PAGE_ALLOC_COSTLY_ORDER).

When running a local loopback test on a testing MTE-enabled device in sync
mode, enabling Hardware Tag-Based KASAN introduces a ~50% slowdown.
Applying this patch and setting kasan.page_alloc.sample to a value higher
than 1 lowers the slowdown. The performance improvement saturates around
a sampling interval value of 10 with the default sampling page order of 3.
This lowers the slowdown to ~20%. The slowdown in real scenarios involving
the network will likely be smaller.

Enabling page_alloc sampling has a downside: KASAN misses bad accesses to
a page_alloc allocation that has not been tagged. This lowers the value
of KASAN as a security mitigation.

However, based on measuring the number of page_alloc allocations of
different orders during boot in a test build, sampling with the default
kasan.page_alloc.sample.order value affects only ~7% of allocations. The
remaining ~93% of allocations are still checked deterministically.
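The sampling decision itself is cheap. A userspace sketch of the logic follows (this is not the kernel code itself: the function and variable names are illustrative, and a plain static counter stands in for the kernel's per-CPU kasan_page_alloc_skip):

```c
#include <stdbool.h>

/* Illustrative stand-ins for the boot parameters. */
static unsigned long sample_interval = 10; /* kasan.page_alloc.sample */
static unsigned int sample_min_order = 3;  /* kasan.page_alloc.sample.order */

/* In the kernel this counter is per-CPU. */
static long skip;

bool sample_page_alloc(unsigned int order)
{
	/* Fast path: sampling disabled, tag every allocation. */
	if (sample_interval == 1)
		return true;

	/* Allocations below the minimum order are always tagged. */
	if (order < sample_min_order)
		return true;

	/* Tag one allocation out of every sample_interval ones. */
	if (--skip < 0) {
		skip = sample_interval - 1;
		return true;
	}
	return false;
}
```

With the defaults modeled above, an order-2 allocation is always tagged, while only one in ten allocations of order 3 or larger is.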

Link: https://lkml.kernel.org/r/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Brand <markbrand@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Andrey Konovalov, committed by Andrew Morton
44383cef cbc2bd98

149 insertions(+), 21 deletions(-)
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -140,6 +140,23 @@
 - ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
   allocations (default: ``on``).
 
+- ``kasan.page_alloc.sample=<sampling interval>`` makes KASAN tag only every
+  Nth page_alloc allocation with the order equal or greater than
+  ``kasan.page_alloc.sample.order``, where N is the value of the ``sample``
+  parameter (default: ``1``, or tag every such allocation).
+  This parameter is intended to mitigate the performance overhead introduced
+  by KASAN.
+  Note that enabling this parameter makes Hardware Tag-Based KASAN skip checks
+  of allocations chosen by sampling and thus miss bad accesses to these
+  allocations. Use the default value for accurate bug detection.
+
+- ``kasan.page_alloc.sample.order=<minimum page order>`` specifies the minimum
+  order of allocations that are affected by sampling (default: ``3``).
+  Only applies when ``kasan.page_alloc.sample`` is set to a value greater
+  than ``1``.
+  This parameter is intended to allow sampling only large page_alloc
+  allocations, which is the biggest source of the performance overhead.
+
 Error reports
 ~~~~~~~~~~~~~
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -120,12 +120,13 @@
 		__kasan_poison_pages(page, order, init);
 }
 
-void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init);
-static __always_inline void kasan_unpoison_pages(struct page *page,
+bool __kasan_unpoison_pages(struct page *page, unsigned int order, bool init);
+static __always_inline bool kasan_unpoison_pages(struct page *page,
 						 unsigned int order, bool init)
 {
 	if (kasan_enabled())
-		__kasan_unpoison_pages(page, order, init);
+		return __kasan_unpoison_pages(page, order, init);
+	return false;
 }
 
 void __kasan_cache_create_kmalloc(struct kmem_cache *cache);
@@ -250,8 +251,11 @@
 static inline void kasan_unpoison_range(const void *address, size_t size) {}
 static inline void kasan_poison_pages(struct page *page, unsigned int order,
 				      bool init) {}
-static inline void kasan_unpoison_pages(struct page *page, unsigned int order,
-					bool init) {}
+static inline bool kasan_unpoison_pages(struct page *page, unsigned int order,
+					bool init)
+{
+	return false;
+}
 static inline void kasan_cache_create_kmalloc(struct kmem_cache *cache) {}
 static inline void kasan_poison_slab(struct slab *slab) {}
 static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -95,19 +95,24 @@
 }
 #endif /* CONFIG_KASAN_STACK */
 
-void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init)
+bool __kasan_unpoison_pages(struct page *page, unsigned int order, bool init)
 {
 	u8 tag;
 	unsigned long i;
 
 	if (unlikely(PageHighMem(page)))
-		return;
+		return false;
+
+	if (!kasan_sample_page_alloc(order))
+		return false;
 
 	tag = kasan_random_tag();
 	kasan_unpoison(set_tag(page_address(page), tag),
 		       PAGE_SIZE << order, init);
 	for (i = 0; i < (1 << order); i++)
 		page_kasan_tag_set(page + i, tag);
+
+	return true;
 }
 
 void __kasan_poison_pages(struct page *page, unsigned int order, bool init)
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -59,6 +59,24 @@
 /* Whether to enable vmalloc tagging. */
 DEFINE_STATIC_KEY_TRUE(kasan_flag_vmalloc);
 
+#define PAGE_ALLOC_SAMPLE_DEFAULT	1
+#define PAGE_ALLOC_SAMPLE_ORDER_DEFAULT	3
+
+/*
+ * Sampling interval of page_alloc allocation (un)poisoning.
+ * Defaults to no sampling.
+ */
+unsigned long kasan_page_alloc_sample = PAGE_ALLOC_SAMPLE_DEFAULT;
+
+/*
+ * Minimum order of page_alloc allocations to be affected by sampling.
+ * The default value is chosen to match both
+ * PAGE_ALLOC_COSTLY_ORDER and SKB_FRAG_PAGE_ORDER.
+ */
+unsigned int kasan_page_alloc_sample_order = PAGE_ALLOC_SAMPLE_ORDER_DEFAULT;
+
+DEFINE_PER_CPU(long, kasan_page_alloc_skip);
+
 /* kasan=off/on */
 static int __init early_kasan_flag(char *arg)
 {
@@ -121,6 +139,49 @@
 	else
 		return "sync";
 }
+
+/* kasan.page_alloc.sample=<sampling interval> */
+static int __init early_kasan_flag_page_alloc_sample(char *arg)
+{
+	int rv;
+
+	if (!arg)
+		return -EINVAL;
+
+	rv = kstrtoul(arg, 0, &kasan_page_alloc_sample);
+	if (rv)
+		return rv;
+
+	if (!kasan_page_alloc_sample || kasan_page_alloc_sample > LONG_MAX) {
+		kasan_page_alloc_sample = PAGE_ALLOC_SAMPLE_DEFAULT;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+early_param("kasan.page_alloc.sample", early_kasan_flag_page_alloc_sample);
+
+/* kasan.page_alloc.sample.order=<minimum page order> */
+static int __init early_kasan_flag_page_alloc_sample_order(char *arg)
+{
+	int rv;
+
+	if (!arg)
+		return -EINVAL;
+
+	rv = kstrtouint(arg, 0, &kasan_page_alloc_sample_order);
+	if (rv)
+		return rv;
+
+	if (kasan_page_alloc_sample_order > INT_MAX) {
+		kasan_page_alloc_sample_order = PAGE_ALLOC_SAMPLE_ORDER_DEFAULT;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+early_param("kasan.page_alloc.sample.order",
+	    early_kasan_flag_page_alloc_sample_order);
 
 /*
  * kasan_init_hw_tags_cpu() is called for each CPU.
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -42,6 +42,10 @@
 
 extern enum kasan_mode kasan_mode __ro_after_init;
 
+extern unsigned long kasan_page_alloc_sample;
+extern unsigned int kasan_page_alloc_sample_order;
+DECLARE_PER_CPU(long, kasan_page_alloc_skip);
+
 static inline bool kasan_vmalloc_enabled(void)
 {
 	return static_branch_likely(&kasan_flag_vmalloc);
@@ -57,6 +61,24 @@
 	return kasan_mode == KASAN_MODE_SYNC || kasan_mode == KASAN_MODE_ASYMM;
 }
 
+static inline bool kasan_sample_page_alloc(unsigned int order)
+{
+	/* Fast-path for when sampling is disabled. */
+	if (kasan_page_alloc_sample == 1)
+		return true;
+
+	if (order < kasan_page_alloc_sample_order)
+		return true;
+
+	if (this_cpu_dec_return(kasan_page_alloc_skip) < 0) {
+		this_cpu_write(kasan_page_alloc_skip,
+			       kasan_page_alloc_sample - 1);
+		return true;
+	}
+
+	return false;
+}
+
 #else /* CONFIG_KASAN_HW_TAGS */
 
 static inline bool kasan_async_fault_possible(void)
@@ -66,6 +88,11 @@
 }
 
 static inline bool kasan_sync_fault_possible(void)
+{
+	return true;
+}
+
+static inline bool kasan_sample_page_alloc(unsigned int order)
 {
 	return true;
 }
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1356,6 +1356,8 @@
  *    see the comment next to it.
  * 3. Skipping poisoning is requested via __GFP_SKIP_KASAN_POISON,
  *    see the comment next to it.
+ * 4. The allocation is excluded from being checked due to sampling,
+ *    see the call to kasan_unpoison_pages.
  *
  * Poisoning pages during deferred memory init will greatly lengthen the
  * process and cause problem in large memory systems as the deferred pages
@@ -2468,7 +2470,8 @@
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
 			!should_skip_init(gfp_flags);
-	bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+	bool reset_tags = !zero_tags;
 	int i;
 
 	set_page_private(page, 0);
@@ -2491,30 +2494,42 @@
 	 */
 
 	/*
-	 * If memory tags should be zeroed (which happens only when memory
-	 * should be initialized as well).
+	 * If memory tags should be zeroed
+	 * (which happens only when memory should be initialized as well).
 	 */
-	if (init_tags) {
+	if (zero_tags) {
 		/* Initialize both memory and tags. */
 		for (i = 0; i != 1 << order; ++i)
 			tag_clear_highpage(page + i);
 
-		/* Note that memory is already initialized by the loop above. */
+		/* Take note that memory was initialized by the loop above. */
 		init = false;
 	}
 	if (!should_skip_kasan_unpoison(gfp_flags)) {
-		/* Unpoison shadow memory or set memory tags. */
-		kasan_unpoison_pages(page, order, init);
-
-		/* Note that memory is already initialized by KASAN. */
-		if (kasan_has_integrated_init())
-			init = false;
-	} else {
-		/* Ensure page_address() dereferencing does not fault. */
+		/* Try unpoisoning (or setting tags) and initializing memory. */
+		if (kasan_unpoison_pages(page, order, init)) {
+			/* Take note that memory was initialized by KASAN. */
+			if (kasan_has_integrated_init())
+				init = false;
+			/* Take note that memory tags were set by KASAN. */
+			reset_tags = false;
+		} else {
+			/*
+			 * KASAN decided to exclude this allocation from being
+			 * poisoned due to sampling. Skip poisoning as well.
+			 */
+			SetPageSkipKASanPoison(page);
+		}
+	}
+	/*
+	 * If memory tags have not been set, reset the page tags to ensure
+	 * page_address() dereferencing does not fault.
+	 */
+	if (reset_tags) {
 		for (i = 0; i != 1 << order; ++i)
 			page_kasan_tag_reset(page + i);
 	}
-	/* If memory is still not initialized, do it now. */
+	/* If memory is still not initialized, initialize it now. */
 	if (init)
 		kernel_init_pages(page, 1 << order);
 	/* Propagate __GFP_SKIP_KASAN_POISON to page flags. */