Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

- The "common kmalloc v4" series [1] by Hyeonggon Yoo.

While the plan after LPC is to try again if it's possible to get rid
of SLOB and SLAB (and if any critical aspect of those is not possible
to achieve with SLUB today, modify it accordingly), it will take a
while even in case there are no objections.

Meanwhile this is a nice cleanup and some parts (e.g. to the
tracepoints) will be useful even if we end up with a single slab
implementation in the future:

- Improves the mm/slab_common.c wrappers to allow deleting
duplicated code between SLAB and SLUB.

- Large kmalloc() allocations in SLAB are passed to the page allocator,
as in SLUB, reducing the number of kmalloc caches.

- Removes the {kmem_cache_alloc,kmalloc}_node variants of
tracepoints, node id parameter added to non-_node variants.

- Addition of kmalloc_size_roundup()

The first two patches from a series by Kees Cook [2] that introduce
kmalloc_size_roundup(). This will allow merging of per-subsystem
patches using the new function and ultimately stop (ab)using ksize()
in a way that causes ongoing trouble for debugging functionality and
static checkers.

- Wasted kmalloc() memory tracking in debugfs alloc_traces

A patch from Feng Tang that enhances the existing debugfs
alloc_traces file for kmalloc caches with information about how much
space is wasted by allocations that need less space than the
particular kmalloc cache provides.

- My series [3] to fix validation races for caches with enabled
debugging:

- By decoupling the debug cache operation more from non-debug
fastpaths, extra locking simplifications were possible and thus
done afterwards.

- Additional cleanup of PREEMPT_RT specific code on top, by Thomas
Gleixner.

- A late fix for slab page leaks caused by the series, by Feng
Tang.

- Smaller fixes and cleanups:

- Unneeded variable removals, by ye xingchen

- A cleanup removing a BUG_ON() in create_unique_id(), by Chao Yu

Link: https://lore.kernel.org/all/20220817101826.236819-1-42.hyeyoo@gmail.com/ [1]
Link: https://lore.kernel.org/all/20220923202822.2667581-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/all/20220823170400.26546-1-vbabka@suse.cz/ [3]

* tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (30 commits)
mm/slub: fix a slab missed to be freed problem
slab: Introduce kmalloc_size_roundup()
slab: Remove __malloc attribute from realloc functions
mm/slub: clean up create_unique_id()
mm/slub: enable debugging memory wasting of kmalloc
slub: Make PREEMPT_RT support less convoluted
mm/slub: simplify __cmpxchg_double_slab() and slab_[un]lock()
mm/slub: convert object_map_lock to non-raw spinlock
mm/slub: remove slab_lock() usage for debug operations
mm/slub: restrict sysfs validation to debug caches and make it safe
mm/sl[au]b: check if large object is valid in __ksize()
mm/slab_common: move declaration of __ksize() to mm/slab.h
mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
mm/slab_common: unify NUMA and UMA version of tracepoints
mm/sl[au]b: cleanup kmem_cache_alloc[_node]_trace()
mm/sl[au]b: generalize kmalloc subsystem
mm/slub: move free_debug_processing() further
mm/sl[au]b: introduce common alloc/free functions without tracepoint
mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
mm/slab_common: cleanup kmalloc_large()
...

+879 -946
+21 -12
Documentation/mm/slub.rst
···
     allocated objects. The output is sorted by frequency of each trace.

     Information in the output:
-    Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
-    pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
+    Number of objects, allocating function, possible memory wastage of
+    kmalloc objects(total/per-object), minimal/average/maximal jiffies
+    since alloc, pid range of the allocating processes, cpu mask of
+    allocating cpus, numa node mask of origins of memory, and stack trace.

     Example:::

-    1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
-        __slab_alloc+0x6d/0x90
-        kmem_cache_alloc_trace+0x2eb/0x300
-        populate_error_injection_list+0x97/0x110
-        init_error_injection+0x1b/0x71
-        do_one_initcall+0x5f/0x2d0
-        kernel_init_freeable+0x26f/0x2d7
-        kernel_init+0xe/0x118
-        ret_from_fork+0x22/0x30
-
+    338 pci_alloc_dev+0x2c/0xa0 waste=521872/1544 age=290837/291891/293509 pid=1 cpus=106 nodes=0-1
+        __kmem_cache_alloc_node+0x11f/0x4e0
+        kmalloc_trace+0x26/0xa0
+        pci_alloc_dev+0x2c/0xa0
+        pci_scan_single_device+0xd2/0x150
+        pci_scan_slot+0xf7/0x2d0
+        pci_scan_child_bus_extend+0x4e/0x360
+        acpi_pci_root_create+0x32e/0x3b0
+        pci_acpi_scan_root+0x2b9/0x2d0
+        acpi_pci_root_add.cold.11+0x110/0xb0a
+        acpi_bus_attach+0x262/0x3f0
+        device_for_each_child+0xb7/0x110
+        acpi_dev_for_each_child+0x77/0xa0
+        acpi_bus_attach+0x108/0x3f0
+        device_for_each_child+0xb7/0x110
+        acpi_dev_for_each_child+0x77/0xa0
+        acpi_bus_attach+0x108/0x3f0

 2. free_traces::

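The waste= field in the new example output can be checked by hand: 338 objects at 1544 wasted bytes each gives 338 * 1544 = 521872, which matches the total. The sketch below models that bookkeeping in userspace. It assumes that per-object waste is the kmalloc bucket size minus the requested size; the helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes left unused by one object in its bucket. */
static size_t sketch_waste_per_object(size_t bucket_size, size_t requested)
{
	return bucket_size > requested ? bucket_size - requested : 0;
}

/* Hypothetical helper: total waste for all objects sharing one trace. */
static size_t sketch_waste_total(size_t nr_objects, size_t per_object)
{
	return nr_objects * per_object;
}
```

With the numbers from the example, sketch_waste_total(338, 1544) reproduces the first half of waste=521872/1544.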
+2 -1
include/linux/compiler_attributes.h
···

 /*
  * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
- * available and includes other attributes.
+ * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined
+ * in compiler-gcc.h, due to misbehaviors.
  *
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
+5 -3
include/linux/compiler_types.h
···

 /*
  * Any place that could be marked with the "alloc_size" attribute is also
- * a place to be marked with the "malloc" attribute. Do this as part of the
- * __alloc_size macro to avoid redundant attributes and to avoid missing a
- * __malloc marking.
+ * a place to be marked with the "malloc" attribute, except those that may
+ * be performing a _reallocation_, as that may alias the existing pointer.
+ * For these, use __realloc_size().
  */
 #ifdef __alloc_size__
 # define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc
+# define __realloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__)
 #else
 # define __alloc_size(x, ...) __malloc
+# define __realloc_size(x, ...)
 #endif

 #ifndef asm_volatile_goto
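The split between __alloc_size() and __realloc_size() can be illustrated outside the kernel. The sketch below is a userspace approximation, assuming a GCC- or Clang-compatible compiler: the sketch_* macros and wrappers are hypothetical stand-ins, with malloc() and realloc() playing the roles of kmalloc() and krealloc(). The allocating wrapper carries both alloc_size and malloc; the reallocating wrapper carries alloc_size only, because its return value may alias the pointer passed in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel macros in the hunk above. */
#define sketch_alloc_size(x)   __attribute__((__alloc_size__(x))) __attribute__((__malloc__))
#define sketch_realloc_size(x) __attribute__((__alloc_size__(x)))

/* Fresh allocation: the result cannot alias any existing pointer. */
static void *sketch_alloc(size_t size) sketch_alloc_size(1);

/* Reallocation: the result may be the old pointer, so no malloc attribute. */
static void *sketch_realloc(void *p, size_t new_size) sketch_realloc_size(2);

static void *sketch_alloc(size_t size)
{
	return malloc(size);
}

static void *sketch_realloc(void *p, size_t new_size)
{
	return realloc(p, new_size);
}
```

Marking sketch_realloc() with the malloc attribute would let the compiler assume the returned pointer is distinct from p, which is not guaranteed when the block is resized in place.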
+89 -99
include/linux/slab.h
···
 #define SLAB_RED_ZONE ((slab_flags_t __force)0x00000400U)
 /* DEBUG: Poison objects */
 #define SLAB_POISON ((slab_flags_t __force)0x00000800U)
+/* Indicate a kmalloc slab */
+#define SLAB_KMALLOC ((slab_flags_t __force)0x00001000U)
 /* Align objs on cache lines */
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
 /* Use GFP_DMA memory */
···
 /*
  * Common kmalloc functions provided by all allocators
  */
-void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2);
+void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __realloc_size(2);
 void kfree(const void *objp);
 void kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
+
+/**
+ * ksize - Report actual allocation size of associated object
+ *
+ * @objp: Pointer returned from a prior kmalloc()-family allocation.
+ *
+ * This should not be used for writing beyond the originally requested
+ * allocation size. Either use krealloc() or round up the allocation size
+ * with kmalloc_size_roundup() prior to allocation. If this is used to
+ * access beyond the originally requested allocation size, UBSAN_BOUNDS
+ * and/or FORTIFY_SOURCE may trip, since they only know about the
+ * originally allocated size via the __alloc_size attribute.
+ */
 size_t ksize(const void *objp);
+
 #ifdef CONFIG_PRINTK
 bool kmem_valid_obj(void *object);
 void kmem_dump_obj(void *object);
···

 #ifdef CONFIG_SLAB
 /*
- * The largest kmalloc size supported by the SLAB allocators is
- * 32 megabyte (2^25) or the maximum allocatable page order if that is
- * less than 32 MB.
- *
- * WARNING: Its not easy to increase this value since the allocators have
- * to do various tricks to work around compiler limitations in order to
- * ensure proper constant folding.
+ * SLAB and SLUB directly allocates requests fitting in to an order-1 page
+ * (PAGE_SIZE*2). Larger requests are passed to the page allocator.
  */
-#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
-				(MAX_ORDER + PAGE_SHIFT - 1) : 25)
-#define KMALLOC_SHIFT_MAX KMALLOC_SHIFT_HIGH
+#define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
+#define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
 #ifndef KMALLOC_SHIFT_LOW
 #define KMALLOC_SHIFT_LOW 5
 #endif
 #endif

 #ifdef CONFIG_SLUB
-/*
- * SLUB directly allocates requests fitting in to an order-1 page
- * (PAGE_SIZE*2). Larger requests are passed to the page allocator.
- */
 #define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
 #define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
 #ifndef KMALLOC_SHIFT_LOW
···
 	if (size <= 512 * 1024) return 19;
 	if (size <= 1024 * 1024) return 20;
 	if (size <= 2 * 1024 * 1024) return 21;
-	if (size <= 4 * 1024 * 1024) return 22;
-	if (size <= 8 * 1024 * 1024) return 23;
-	if (size <= 16 * 1024 * 1024) return 24;
-	if (size <= 32 * 1024 * 1024) return 25;

 	if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant)
 		BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()");
···
 	/* Will never be reached. Needed because the compiler may complain */
 	return -1;
 }
+static_assert(PAGE_SHIFT <= 20);
 #define kmalloc_index(s) __kmalloc_index(s, true)
 #endif /* !CONFIG_SLOB */

···
 	kmem_cache_free_bulk(NULL, size, p);
 }

-#ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment
 							 __alloc_size(1);
 void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment
 									 __malloc;
-#else
-static __always_inline __alloc_size(1) void *__kmalloc_node(size_t size, gfp_t flags, int node)
-{
-	return __kmalloc(size, flags);
-}
-
-static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node)
-{
-	return kmem_cache_alloc(s, flags);
-}
-#endif

 #ifdef CONFIG_TRACING
-extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
-				    __assume_slab_alignment __alloc_size(3);
+void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
+		    __assume_kmalloc_alignment __alloc_size(3);

-#ifdef CONFIG_NUMA
-extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
-					 int node, size_t size) __assume_slab_alignment
-								__alloc_size(4);
-#else
-static __always_inline __alloc_size(4) void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
-						 gfp_t gfpflags, int node, size_t size)
-{
-	return kmem_cache_alloc_trace(s, gfpflags, size);
-}
-#endif /* CONFIG_NUMA */
-
+void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+			 int node, size_t size) __assume_kmalloc_alignment
+						__alloc_size(4);
 #else /* CONFIG_TRACING */
-static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_cache *s,
-								    gfp_t flags, size_t size)
+/* Save a function call when CONFIG_TRACING=n */
+static __always_inline __alloc_size(3)
+void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
 {
 	void *ret = kmem_cache_alloc(s, flags);

···
 	return ret;
 }

-static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
-							 int node, size_t size)
+static __always_inline __alloc_size(4)
+void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+			 int node, size_t size)
 {
 	void *ret = kmem_cache_alloc_node(s, gfpflags, node);

···
 }
 #endif /* CONFIG_TRACING */

-extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment
-									 __alloc_size(1);
+void *kmalloc_large(size_t size, gfp_t flags) __assume_page_alignment
+					      __alloc_size(1);

-#ifdef CONFIG_TRACING
-extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
-				__assume_page_alignment __alloc_size(1);
-#else
-static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags,
-								 unsigned int order)
-{
-	return kmalloc_order(size, flags, order);
-}
-#endif
-
-static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags)
-{
-	unsigned int order = get_order(size);
-	return kmalloc_order_trace(size, flags, order);
-}
+void *kmalloc_large_node(size_t size, gfp_t flags, int node) __assume_page_alignment
+							     __alloc_size(1);

 /**
  * kmalloc - allocate memory
···
 		if (!index)
 			return ZERO_SIZE_PTR;

-		return kmem_cache_alloc_trace(
+		return kmalloc_trace(
 				kmalloc_caches[kmalloc_type(flags)][index],
 				flags, size);
 #endif
···
 	return __kmalloc(size, flags);
 }

+#ifndef CONFIG_SLOB
 static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
-#ifndef CONFIG_SLOB
-	if (__builtin_constant_p(size) &&
-		size <= KMALLOC_MAX_CACHE_SIZE) {
-		unsigned int i = kmalloc_index(size);
+	if (__builtin_constant_p(size)) {
+		unsigned int index;

-		if (!i)
+		if (size > KMALLOC_MAX_CACHE_SIZE)
+			return kmalloc_large_node(size, flags, node);
+
+		index = kmalloc_index(size);
+
+		if (!index)
 			return ZERO_SIZE_PTR;

-		return kmem_cache_alloc_node_trace(
-				kmalloc_caches[kmalloc_type(flags)][i],
-						flags, node, size);
+		return kmalloc_node_trace(
+				kmalloc_caches[kmalloc_type(flags)][index],
+				flags, node, size);
 	}
-#endif
 	return __kmalloc_node(size, flags, node);
 }
+#else
+static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
+{
+	if (__builtin_constant_p(size) && size > KMALLOC_MAX_CACHE_SIZE)
+		return kmalloc_large_node(size, flags, node);
+
+	return __kmalloc_node(size, flags, node);
+}
+#endif

 /**
  * kmalloc_array - allocate memory for an array.
···
  * @new_size: new size of a single member of the array
  * @flags: the type of memory to allocate (see kmalloc)
  */
-static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p,
-								     size_t new_n,
-								     size_t new_size,
-								     gfp_t flags)
+static inline __realloc_size(2, 3) void * __must_check krealloc_array(void *p,
+								       size_t new_n,
+								       size_t new_size,
+								       gfp_t flags)
 {
 	size_t bytes;

···
 	return kmalloc_array(n, size, flags | __GFP_ZERO);
 }

+void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
+				  unsigned long caller) __alloc_size(1);
+#define kmalloc_node_track_caller(size, flags, node) \
+	__kmalloc_node_track_caller(size, flags, node, \
+				    _RET_IP_)
+
 /*
  * kmalloc_track_caller is a special version of kmalloc that records the
  * calling function of the routine calling it for slab leak tracking instead
···
  * allocator where we care about the real place the memory allocation
  * request comes from.
  */
-extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller);
 #define kmalloc_track_caller(size, flags) \
-	__kmalloc_track_caller(size, flags, _RET_IP_)
+	__kmalloc_node_track_caller(size, flags, \
+				    NUMA_NO_NODE, _RET_IP_)

 static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
 							  int node)
···
 {
 	return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
 }
-
-
-#ifdef CONFIG_NUMA
-extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
-					 unsigned long caller) __alloc_size(1);
-#define kmalloc_node_track_caller(size, flags, node) \
-	__kmalloc_node_track_caller(size, flags, node, \
-			_RET_IP_)
-
-#else /* CONFIG_NUMA */
-
-#define kmalloc_node_track_caller(size, flags, node) \
-	kmalloc_track_caller(size, flags)
-
-#endif /* CONFIG_NUMA */

 /*
  * Shortcuts
···
 }

 extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
-		      __alloc_size(3);
+		      __realloc_size(3);
 extern void kvfree(const void *addr);
 extern void kvfree_sensitive(const void *addr, size_t len);

 unsigned int kmem_cache_size(struct kmem_cache *s);
+
+/**
+ * kmalloc_size_roundup - Report allocation bucket size for the given size
+ *
+ * @size: Number of bytes to round up from.
+ *
+ * This returns the number of bytes that would be available in a kmalloc()
+ * allocation of @size bytes. For example, a 126 byte request would be
+ * rounded up to the next sized kmalloc bucket, 128 bytes. (This is strictly
+ * for the general-purpose kmalloc()-based allocations, and is not for the
+ * pre-sized kmem_cache_alloc()-based allocations.)
+ *
+ * Use this to kmalloc() the full bucket size ahead of time instead of using
+ * ksize() to query the size after an allocation.
+ */
+size_t kmalloc_size_roundup(size_t size);
+
 void __init kmem_cache_init_late(void);

 #if defined(CONFIG_SMP) && defined(CONFIG_SLAB)
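The rounding described in the kmalloc_size_roundup() comment can be modelled in userspace. The following is a sketch under stated assumptions, not the kernel implementation: it assumes 4 KiB pages, a smallest bucket of 8 bytes, power-of-two buckets plus the 96 and 192 byte buckets, and that requests above the order-1 limit (PAGE_SIZE*2, as in the hunk above) are rounded to a power-of-two number of pages. All sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      ((size_t)4096)           /* assumption: 4 KiB pages */
#define SKETCH_MAX_CACHE_SIZE (2 * SKETCH_PAGE_SIZE)   /* order-1 page limit */

/* Hypothetical model of "which bucket would a request of this size land in". */
static size_t sketch_size_roundup(size_t size)
{
	size_t bucket;

	if (size == 0)
		return 0;
	if (size > SIZE_MAX / 2)	/* avoid overflow in the shifts below */
		return SIZE_MAX;

	if (size > SKETCH_MAX_CACHE_SIZE) {
		/* Large request: modelled as a power-of-two number of pages. */
		bucket = SKETCH_PAGE_SIZE;
		while (bucket < size)
			bucket <<= 1;
		return bucket;
	}

	/* Assumed non-power-of-two buckets. */
	if (size > 64 && size <= 96)
		return 96;
	if (size > 128 && size <= 192)
		return 192;

	/* Otherwise round up to the next power of two, starting at 8. */
	bucket = 8;
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}
```

The intended usage pattern, following the comment above, is to compute the rounded size first and pass it to the allocator, so that the whole bucket is part of the requested size and ksize() is not needed afterwards.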
+42 -78
include/trace/events/kmem.h
···
 #include <linux/tracepoint.h>
 #include <trace/events/mmflags.h>

-DECLARE_EVENT_CLASS(kmem_alloc,
+TRACE_EVENT(kmem_cache_alloc,

 	TP_PROTO(unsigned long call_site,
 		 const void *ptr,
 		 struct kmem_cache *s,
-		 size_t bytes_req,
-		 size_t bytes_alloc,
-		 gfp_t gfp_flags),
-
-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags),
-
-	TP_STRUCT__entry(
-		__field( unsigned long, call_site )
-		__field( const void *, ptr )
-		__field( size_t, bytes_req )
-		__field( size_t, bytes_alloc )
-		__field( unsigned long, gfp_flags )
-		__field( bool, accounted )
-	),
-
-	TP_fast_assign(
-		__entry->call_site = call_site;
-		__entry->ptr = ptr;
-		__entry->bytes_req = bytes_req;
-		__entry->bytes_alloc = bytes_alloc;
-		__entry->gfp_flags = (__force unsigned long)gfp_flags;
-		__entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
-				     ((gfp_flags & __GFP_ACCOUNT) ||
-				      (s && s->flags & SLAB_ACCOUNT)) : false;
-	),
-
-	TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s accounted=%s",
-		(void *)__entry->call_site,
-		__entry->ptr,
-		__entry->bytes_req,
-		__entry->bytes_alloc,
-		show_gfp_flags(__entry->gfp_flags),
-		__entry->accounted ? "true" : "false")
-);
-
-DEFINE_EVENT(kmem_alloc, kmalloc,
-
-	TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
-		 size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
-
-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
-);
-
-DEFINE_EVENT(kmem_alloc, kmem_cache_alloc,
-
-	TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
-		 size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
-
-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
-);
-
-DECLARE_EVENT_CLASS(kmem_alloc_node,
-
-	TP_PROTO(unsigned long call_site,
-		 const void *ptr,
-		 struct kmem_cache *s,
-		 size_t bytes_req,
-		 size_t bytes_alloc,
 		 gfp_t gfp_flags,
 		 int node),

-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node),
+	TP_ARGS(call_site, ptr, s, gfp_flags, node),

 	TP_STRUCT__entry(
 		__field( unsigned long, call_site )
···
 	TP_fast_assign(
 		__entry->call_site = call_site;
 		__entry->ptr = ptr;
-		__entry->bytes_req = bytes_req;
-		__entry->bytes_alloc = bytes_alloc;
+		__entry->bytes_req = s->object_size;
+		__entry->bytes_alloc = s->size;
 		__entry->gfp_flags = (__force unsigned long)gfp_flags;
 		__entry->node = node;
 		__entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
 				     ((gfp_flags & __GFP_ACCOUNT) ||
-				      (s && s->flags & SLAB_ACCOUNT)) : false;
+				      (s->flags & SLAB_ACCOUNT)) : false;
 	),

 	TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
···
 		__entry->accounted ? "true" : "false")
 );

-DEFINE_EVENT(kmem_alloc_node, kmalloc_node,
+TRACE_EVENT(kmalloc,

-	TP_PROTO(unsigned long call_site, const void *ptr,
-		 struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc,
-		 gfp_t gfp_flags, int node),
+	TP_PROTO(unsigned long call_site,
+		 const void *ptr,
+		 size_t bytes_req,
+		 size_t bytes_alloc,
+		 gfp_t gfp_flags,
+		 int node),

-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node)
-);
+	TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node),

-DEFINE_EVENT(kmem_alloc_node, kmem_cache_alloc_node,
+	TP_STRUCT__entry(
+		__field( unsigned long, call_site )
+		__field( const void *, ptr )
+		__field( size_t, bytes_req )
+		__field( size_t, bytes_alloc )
+		__field( unsigned long, gfp_flags )
+		__field( int, node )
+	),

-	TP_PROTO(unsigned long call_site, const void *ptr,
-		 struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc,
-		 gfp_t gfp_flags, int node),
+	TP_fast_assign(
+		__entry->call_site = call_site;
+		__entry->ptr = ptr;
+		__entry->bytes_req = bytes_req;
+		__entry->bytes_alloc = bytes_alloc;
+		__entry->gfp_flags = (__force unsigned long)gfp_flags;
+		__entry->node = node;
+	),

-	TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node)
+	TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
+		(void *)__entry->call_site,
+		__entry->ptr,
+		__entry->bytes_req,
+		__entry->bytes_alloc,
+		show_gfp_flags(__entry->gfp_flags),
+		__entry->node,
+		(IS_ENABLED(CONFIG_MEMCG_KMEM) &&
+		 (__entry->gfp_flags & (__force unsigned long)__GFP_ACCOUNT)) ? "true" : "false")
 );

 TRACE_EVENT(kfree,
···

 TRACE_EVENT(kmem_cache_free,

-	TP_PROTO(unsigned long call_site, const void *ptr, const char *name),
+	TP_PROTO(unsigned long call_site, const void *ptr, const struct kmem_cache *s),

-	TP_ARGS(call_site, ptr, name),
+	TP_ARGS(call_site, ptr, s),

 	TP_STRUCT__entry(
 		__field( unsigned long, call_site )
 		__field( const void *, ptr )
-		__string( name, name )
+		__string( name, s->name )
 	),

 	TP_fast_assign(
 		__entry->call_site = call_site;
 		__entry->ptr = ptr;
-		__assign_str(name, name);
+		__assign_str(name, s->name);
 	),

 	TP_printk("call_site=%pS ptr=%p name=%s",
+1
mm/kfence/report.c
···
 		/* Also the *_bulk() variants by only checking prefixes. */
 		if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfree") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_free") ||
+		    str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmem_cache_free") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmalloc") ||
 		    str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_alloc"))
 			goto found;
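The extra entry is needed because prefix matching is anchored at the start of the symbol: "__kmem_cache_free" begins with two underscores, so the existing "kmem_cache_free" prefix does not match it. The userspace sketch below illustrates this. It assumes an empty ARCH_FUNC_PREFIX and uses hypothetical sketch_* helpers; sketch_has_prefix() is meant to mirror the behaviour of the kernel's str_has_prefix() (prefix length on match, 0 otherwise) as I understand it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of prefix if str starts with it, 0 otherwise. */
static size_t sketch_has_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return strncmp(str, prefix, len) == 0 ? len : 0;
}

/* Hypothetical model of the frame check in the hunk above. */
static int sketch_is_allocator_frame(const char *sym)
{
	static const char *const prefixes[] = {
		"kfree",
		"kmem_cache_free",
		"__kmem_cache_free",	/* the entry added by this merge */
		"__kmalloc",
		"kmem_cache_alloc",
	};
	size_t i;

	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++)
		if (sketch_has_prefix(sym, prefixes[i]))
			return 1;
	return 0;
}
```

Matching on prefixes also covers the *_bulk() variants, as the comment in the hunk notes: "kfree_bulk" matches through the "kfree" entry.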
+70 -235
mm/slab.c
··· 3181 3181 } 3182 3182 3183 3183 static __always_inline void * 3184 - slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid, size_t orig_size, 3185 - unsigned long caller) 3184 + __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags, int nodeid) 3186 3185 { 3187 - unsigned long save_flags; 3188 - void *ptr; 3186 + void *objp = NULL; 3189 3187 int slab_node = numa_mem_id(); 3190 - struct obj_cgroup *objcg = NULL; 3191 - bool init = false; 3192 3188 3193 - flags &= gfp_allowed_mask; 3194 - cachep = slab_pre_alloc_hook(cachep, NULL, &objcg, 1, flags); 3195 - if (unlikely(!cachep)) 3196 - return NULL; 3197 - 3198 - ptr = kfence_alloc(cachep, orig_size, flags); 3199 - if (unlikely(ptr)) 3200 - goto out_hooks; 3201 - 3202 - local_irq_save(save_flags); 3203 - 3204 - if (nodeid == NUMA_NO_NODE) 3205 - nodeid = slab_node; 3206 - 3207 - if (unlikely(!get_node(cachep, nodeid))) { 3208 - /* Node not bootstrapped yet */ 3209 - ptr = fallback_alloc(cachep, flags); 3210 - goto out; 3211 - } 3212 - 3213 - if (nodeid == slab_node) { 3189 + if (nodeid == NUMA_NO_NODE) { 3190 + if (current->mempolicy || cpuset_do_slab_mem_spread()) { 3191 + objp = alternate_node_alloc(cachep, flags); 3192 + if (objp) 3193 + goto out; 3194 + } 3214 3195 /* 3215 3196 * Use the locally cached objects if possible. 3216 3197 * However ____cache_alloc does not allow fallback 3217 3198 * to other nodes. It may fail while we still have 3218 3199 * objects on other nodes available. 
3219 3200 */ 3220 - ptr = ____cache_alloc(cachep, flags); 3221 - if (ptr) 3222 - goto out; 3201 + objp = ____cache_alloc(cachep, flags); 3202 + nodeid = slab_node; 3203 + } else if (nodeid == slab_node) { 3204 + objp = ____cache_alloc(cachep, flags); 3205 + } else if (!get_node(cachep, nodeid)) { 3206 + /* Node not bootstrapped yet */ 3207 + objp = fallback_alloc(cachep, flags); 3208 + goto out; 3223 3209 } 3224 - /* ___cache_alloc_node can fall back to other nodes */ 3225 - ptr = ____cache_alloc_node(cachep, flags, nodeid); 3226 - out: 3227 - local_irq_restore(save_flags); 3228 - ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller); 3229 - init = slab_want_init_on_alloc(flags, cachep); 3230 - 3231 - out_hooks: 3232 - slab_post_alloc_hook(cachep, objcg, flags, 1, &ptr, init); 3233 - return ptr; 3234 - } 3235 - 3236 - static __always_inline void * 3237 - __do_cache_alloc(struct kmem_cache *cache, gfp_t flags) 3238 - { 3239 - void *objp; 3240 - 3241 - if (current->mempolicy || cpuset_do_slab_mem_spread()) { 3242 - objp = alternate_node_alloc(cache, flags); 3243 - if (objp) 3244 - goto out; 3245 - } 3246 - objp = ____cache_alloc(cache, flags); 3247 3210 3248 3211 /* 3249 3212 * We may just have run out of memory on the local node. 
3250 3213 * ____cache_alloc_node() knows how to locate memory on other nodes 3251 3214 */ 3252 3215 if (!objp) 3253 - objp = ____cache_alloc_node(cache, flags, numa_mem_id()); 3254 - 3216 + objp = ____cache_alloc_node(cachep, flags, nodeid); 3255 3217 out: 3256 3218 return objp; 3257 3219 } 3258 3220 #else 3259 3221 3260 3222 static __always_inline void * 3261 - __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags) 3223 + __do_cache_alloc(struct kmem_cache *cachep, gfp_t flags, int nodeid __maybe_unused) 3262 3224 { 3263 3225 return ____cache_alloc(cachep, flags); 3264 3226 } ··· 3228 3266 #endif /* CONFIG_NUMA */ 3229 3267 3230 3268 static __always_inline void * 3231 - slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags, 3232 - size_t orig_size, unsigned long caller) 3269 + slab_alloc_node(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags, 3270 + int nodeid, size_t orig_size, unsigned long caller) 3233 3271 { 3234 3272 unsigned long save_flags; 3235 3273 void *objp; ··· 3246 3284 goto out; 3247 3285 3248 3286 local_irq_save(save_flags); 3249 - objp = __do_cache_alloc(cachep, flags); 3287 + objp = __do_cache_alloc(cachep, flags, nodeid); 3250 3288 local_irq_restore(save_flags); 3251 3289 objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller); 3252 3290 prefetchw(objp); ··· 3255 3293 out: 3256 3294 slab_post_alloc_hook(cachep, objcg, flags, 1, &objp, init); 3257 3295 return objp; 3296 + } 3297 + 3298 + static __always_inline void * 3299 + slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags, 3300 + size_t orig_size, unsigned long caller) 3301 + { 3302 + return slab_alloc_node(cachep, lru, flags, NUMA_NO_NODE, orig_size, 3303 + caller); 3258 3304 } 3259 3305 3260 3306 /* ··· 3440 3470 { 3441 3471 void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_); 3442 3472 3443 - trace_kmem_cache_alloc(_RET_IP_, ret, cachep, 3444 - cachep->object_size, cachep->size, flags); 3473 + 
trace_kmem_cache_alloc(_RET_IP_, ret, cachep, flags, NUMA_NO_NODE); 3445 3474 3446 3475 return ret; 3447 3476 } ··· 3490 3521 3491 3522 local_irq_disable(); 3492 3523 for (i = 0; i < size; i++) { 3493 - void *objp = kfence_alloc(s, s->object_size, flags) ?: __do_cache_alloc(s, flags); 3524 + void *objp = kfence_alloc(s, s->object_size, flags) ?: 3525 + __do_cache_alloc(s, flags, NUMA_NO_NODE); 3494 3526 3495 3527 if (unlikely(!objp)) 3496 3528 goto error; ··· 3518 3548 } 3519 3549 EXPORT_SYMBOL(kmem_cache_alloc_bulk); 3520 3550 3521 - #ifdef CONFIG_TRACING 3522 - void * 3523 - kmem_cache_alloc_trace(struct kmem_cache *cachep, gfp_t flags, size_t size) 3524 - { 3525 - void *ret; 3526 - 3527 - ret = slab_alloc(cachep, NULL, flags, size, _RET_IP_); 3528 - 3529 - ret = kasan_kmalloc(cachep, ret, size, flags); 3530 - trace_kmalloc(_RET_IP_, ret, cachep, 3531 - size, cachep->size, flags); 3532 - return ret; 3533 - } 3534 - EXPORT_SYMBOL(kmem_cache_alloc_trace); 3535 - #endif 3536 - 3537 - #ifdef CONFIG_NUMA 3538 3551 /** 3539 3552 * kmem_cache_alloc_node - Allocate an object on the specified node 3540 3553 * @cachep: The cache to allocate from. 
···
  */
 void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 {
-	void *ret = slab_alloc_node(cachep, flags, nodeid, cachep->object_size, _RET_IP_);
+	void *ret = slab_alloc_node(cachep, NULL, flags, nodeid, cachep->object_size, _RET_IP_);

-	trace_kmem_cache_alloc_node(_RET_IP_, ret, cachep,
-				    cachep->object_size, cachep->size,
-				    flags, nodeid);
+	trace_kmem_cache_alloc(_RET_IP_, ret, cachep, flags, nodeid);

 	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);

-#ifdef CONFIG_TRACING
-void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
-				  gfp_t flags,
-				  int nodeid,
-				  size_t size)
+void *__kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
+			     int nodeid, size_t orig_size,
+			     unsigned long caller)
 {
-	void *ret;
-
-	ret = slab_alloc_node(cachep, flags, nodeid, size, _RET_IP_);
-
-	ret = kasan_kmalloc(cachep, ret, size, flags);
-	trace_kmalloc_node(_RET_IP_, ret, cachep,
-			   size, cachep->size,
-			   flags, nodeid);
-	return ret;
+	return slab_alloc_node(cachep, NULL, flags, nodeid,
+			       orig_size, caller);
 }
-EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
-#endif
-
-static __always_inline void *
-__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
-{
-	struct kmem_cache *cachep;
-	void *ret;
-
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
-		return NULL;
-	cachep = kmalloc_slab(size, flags);
-	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
-		return cachep;
-	ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
-	ret = kasan_kmalloc(cachep, ret, size, flags);
-
-	return ret;
-}
-
-void *__kmalloc_node(size_t size, gfp_t flags, int node)
-{
-	return __do_kmalloc_node(size, flags, node, _RET_IP_);
-}
-EXPORT_SYMBOL(__kmalloc_node);
-
-void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
-		int node, unsigned long caller)
-{
-	return __do_kmalloc_node(size, flags, node, caller);
-}
-EXPORT_SYMBOL(__kmalloc_node_track_caller);
-#endif /* CONFIG_NUMA */

 #ifdef CONFIG_PRINTK
 void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
···
 }
 #endif

-/**
- * __do_kmalloc - allocate memory
- * @size: how many bytes of memory are required.
- * @flags: the type of memory to allocate (see kmalloc).
- * @caller: function caller for debug tracking of the caller
- *
- * Return: pointer to the allocated memory or %NULL in case of error
- */
-static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
-					  unsigned long caller)
+static __always_inline
+void __do_kmem_cache_free(struct kmem_cache *cachep, void *objp,
+			  unsigned long caller)
 {
-	struct kmem_cache *cachep;
-	void *ret;
+	unsigned long flags;

-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
-		return NULL;
-	cachep = kmalloc_slab(size, flags);
-	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
-		return cachep;
-	ret = slab_alloc(cachep, NULL, flags, size, caller);
-
-	ret = kasan_kmalloc(cachep, ret, size, flags);
-	trace_kmalloc(caller, ret, cachep,
-		      size, cachep->size, flags);
-
-	return ret;
+	local_irq_save(flags);
+	debug_check_no_locks_freed(objp, cachep->object_size);
+	if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
+		debug_check_no_obj_freed(objp, cachep->object_size);
+	__cache_free(cachep, objp, caller);
+	local_irq_restore(flags);
 }

-void *__kmalloc(size_t size, gfp_t flags)
+void __kmem_cache_free(struct kmem_cache *cachep, void *objp,
+		       unsigned long caller)
 {
-	return __do_kmalloc(size, flags, _RET_IP_);
+	__do_kmem_cache_free(cachep, objp, caller);
 }
-EXPORT_SYMBOL(__kmalloc);
-
-void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller)
-{
-	return __do_kmalloc(size, flags, caller);
-}
-EXPORT_SYMBOL(__kmalloc_track_caller);

 /**
  * kmem_cache_free - Deallocate an object
···
  */
 void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 {
-	unsigned long flags;
 	cachep = cache_from_obj(cachep, objp);
 	if (!cachep)
 		return;

-	trace_kmem_cache_free(_RET_IP_, objp, cachep->name);
-	local_irq_save(flags);
-	debug_check_no_locks_freed(objp, cachep->object_size);
-	if (!(cachep->flags & SLAB_DEBUG_OBJECTS))
-		debug_check_no_obj_freed(objp, cachep->object_size);
-	__cache_free(cachep, objp, _RET_IP_);
-	local_irq_restore(flags);
+	trace_kmem_cache_free(_RET_IP_, objp, cachep);
+	__do_kmem_cache_free(cachep, objp, _RET_IP_);
 }
 EXPORT_SYMBOL(kmem_cache_free);

 void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
 {
-	struct kmem_cache *s;
-	size_t i;

 	local_irq_disable();
-	for (i = 0; i < size; i++) {
+	for (int i = 0; i < size; i++) {
 		void *objp = p[i];
+		struct kmem_cache *s;

-		if (!orig_s) /* called via kfree_bulk */
-			s = virt_to_cache(objp);
-		else
+		if (!orig_s) {
+			struct folio *folio = virt_to_folio(objp);
+
+			/* called via kfree_bulk */
+			if (!folio_test_slab(folio)) {
+				local_irq_enable();
+				free_large_kmalloc(folio, objp);
+				local_irq_disable();
+				continue;
+			}
+			s = folio_slab(folio)->slab_cache;
+		} else {
 			s = cache_from_obj(orig_s, objp);
+		}
+
 		if (!s)
 			continue;

···
 	/* FIXME: add tracing */
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);
-
-/**
- * kfree - free previously allocated memory
- * @objp: pointer returned by kmalloc.
- *
- * If @objp is NULL, no operation is performed.
- *
- * Don't free memory not originally allocated by kmalloc()
- * or you will run into trouble.
- */
-void kfree(const void *objp)
-{
-	struct kmem_cache *c;
-	unsigned long flags;
-
-	trace_kfree(_RET_IP_, objp);
-
-	if (unlikely(ZERO_OR_NULL_PTR(objp)))
-		return;
-	local_irq_save(flags);
-	kfree_debugcheck(objp);
-	c = virt_to_cache(objp);
-	if (!c) {
-		local_irq_restore(flags);
-		return;
-	}
-	debug_check_no_locks_freed(objp, c->object_size);
-
-	debug_check_no_obj_freed(objp, c->object_size);
-	__cache_free(c, (void *)objp, _RET_IP_);
-	local_irq_restore(flags);
-}
-EXPORT_SYMBOL(kfree);

 /*
  * This initializes kmem_cache_node or resizes various caches for all nodes.
···
 	usercopy_abort("SLAB object", cachep->name, to_user, offset, n);
 }
 #endif /* CONFIG_HARDENED_USERCOPY */
-
-/**
- * __ksize -- Uninstrumented ksize.
- * @objp: pointer to the object
- *
- * Unlike ksize(), __ksize() is uninstrumented, and does not provide the same
- * safety checks as ksize() with KASAN instrumentation enabled.
- *
- * Return: size of the actual memory used by @objp in bytes
- */
-size_t __ksize(const void *objp)
-{
-	struct kmem_cache *c;
-	size_t size;
-
-	BUG_ON(!objp);
-	if (unlikely(objp == ZERO_SIZE_PTR))
-		return 0;
-
-	c = virt_to_cache(objp);
-	size = c ? c->object_size : 0;
-
-	return size;
-}
-EXPORT_SYMBOL(__ksize);
+10
mm/slab.h
···

 /* Find the kmalloc slab corresponding for a certain size */
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
+
+void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
+			      int node, size_t orig_size,
+			      unsigned long caller);
+void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller);
 #endif

 gfp_t kmalloc_fix_flags(gfp_t flags);
···
 	print_tracking(cachep, x);
 	return cachep;
 }
+
+void free_large_kmalloc(struct folio *folio, void *object);
+
 #endif /* CONFIG_SLOB */
+
+size_t __ksize(const void *objp);

 static inline size_t slab_ksize(const struct kmem_cache *s)
 {
+207 -34
mm/slab_common.c
···
  */
 int kmem_cache_shrink(struct kmem_cache *cachep)
 {
-	int ret;
-
-
 	kasan_cache_shrink(cachep);
-	ret = __kmem_cache_shrink(cachep);

-	return ret;
+	return __kmem_cache_shrink(cachep);
 }
 EXPORT_SYMBOL(kmem_cache_shrink);

···
 	if (!s)
 		panic("Out of memory when creating slab %s\n", name);

-	create_boot_cache(s, name, size, flags, useroffset, usersize);
+	create_boot_cache(s, name, size, flags | SLAB_KMALLOC, useroffset,
+								usersize);
 	kasan_cache_create_kmalloc(s);
 	list_add(&s->list, &slab_caches);
 	s->refcount = 1;
···
 	return kmalloc_caches[kmalloc_type(flags)][index];
 }

+size_t kmalloc_size_roundup(size_t size)
+{
+	struct kmem_cache *c;
+
+	/* Short-circuit the 0 size case. */
+	if (unlikely(size == 0))
+		return 0;
+	/* Short-circuit saturated "too-large" case. */
+	if (unlikely(size == SIZE_MAX))
+		return SIZE_MAX;
+	/* Above the smaller buckets, size is a multiple of page size. */
+	if (size > KMALLOC_MAX_CACHE_SIZE)
+		return PAGE_SIZE << get_order(size);
+
+	/* The flags don't matter since size_index is common to all. */
+	c = kmalloc_slab(size, GFP_KERNEL);
+	return c ? c->object_size : 0;
+}
+EXPORT_SYMBOL(kmalloc_size_roundup);
+
 #ifdef CONFIG_ZONE_DMA
 #define KMALLOC_DMA_NAME(sz)	.name[KMALLOC_DMA] = "dma-kmalloc-" #sz,
 #else
···

 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
- * kmalloc_index() supports up to 2^25=32MB, so the final entry of the table is
- * kmalloc-32M.
+ * kmalloc_index() supports up to 2^21=2MB, so the final entry of the table is
+ * kmalloc-2M.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
 	INIT_KMALLOC_INFO(0, 0),
···
 	INIT_KMALLOC_INFO(262144, 256k),
 	INIT_KMALLOC_INFO(524288, 512k),
 	INIT_KMALLOC_INFO(1048576, 1M),
-	INIT_KMALLOC_INFO(2097152, 2M),
-	INIT_KMALLOC_INFO(4194304, 4M),
-	INIT_KMALLOC_INFO(8388608, 8M),
-	INIT_KMALLOC_INFO(16777216, 16M),
-	INIT_KMALLOC_INFO(33554432, 32M)
+	INIT_KMALLOC_INFO(2097152, 2M)
 };

 /*
···
 	/* Kmalloc array is now usable */
 	slab_state = UP;
 }
+
+void free_large_kmalloc(struct folio *folio, void *object)
+{
+	unsigned int order = folio_order(folio);
+
+	if (WARN_ON_ONCE(order == 0))
+		pr_warn_once("object pointer: 0x%p\n", object);
+
+	kmemleak_free(object);
+	kasan_kfree_large(object);
+
+	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
+			      -(PAGE_SIZE << order));
+	__free_pages(folio_page(folio, 0), order);
+}
+
+static void *__kmalloc_large_node(size_t size, gfp_t flags, int node);
+static __always_inline
+void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
+{
+	struct kmem_cache *s;
+	void *ret;
+
+	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
+		ret = __kmalloc_large_node(size, flags, node);
+		trace_kmalloc(_RET_IP_, ret, size,
+			      PAGE_SIZE << get_order(size), flags, node);
+		return ret;
+	}
+
+	s = kmalloc_slab(size, flags);
+
+	if (unlikely(ZERO_OR_NULL_PTR(s)))
+		return s;
+
+	ret = __kmem_cache_alloc_node(s, flags, node, size, caller);
+	ret = kasan_kmalloc(s, ret, size, flags);
+	trace_kmalloc(_RET_IP_, ret, size, s->size, flags, node);
+	return ret;
+}
+
+void *__kmalloc_node(size_t size, gfp_t flags, int node)
+{
+	return __do_kmalloc_node(size, flags, node, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_node);
+
+void *__kmalloc(size_t size, gfp_t flags)
+{
+	return __do_kmalloc_node(size, flags, NUMA_NO_NODE, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc);
+
+void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
+				  int node, unsigned long caller)
+{
+	return __do_kmalloc_node(size, flags, node, caller);
+}
+EXPORT_SYMBOL(__kmalloc_node_track_caller);
+
+/**
+ * kfree - free previously allocated memory
+ * @object: pointer returned by kmalloc.
+ *
+ * If @object is NULL, no operation is performed.
+ *
+ * Don't free memory not originally allocated by kmalloc()
+ * or you will run into trouble.
+ */
+void kfree(const void *object)
+{
+	struct folio *folio;
+	struct slab *slab;
+	struct kmem_cache *s;
+
+	trace_kfree(_RET_IP_, object);
+
+	if (unlikely(ZERO_OR_NULL_PTR(object)))
+		return;
+
+	folio = virt_to_folio(object);
+	if (unlikely(!folio_test_slab(folio))) {
+		free_large_kmalloc(folio, (void *)object);
+		return;
+	}
+
+	slab = folio_slab(folio);
+	s = slab->slab_cache;
+	__kmem_cache_free(s, (void *)object, _RET_IP_);
+}
+EXPORT_SYMBOL(kfree);
+
+/**
+ * __ksize -- Report full size of underlying allocation
+ * @objp: pointer to the object
+ *
+ * This should only be used internally to query the true size of allocations.
+ * It is not meant to be a way to discover the usable size of an allocation
+ * after the fact. Instead, use kmalloc_size_roundup(). Using memory beyond
+ * the originally requested allocation size may trigger KASAN, UBSAN_BOUNDS,
+ * and/or FORTIFY_SOURCE.
+ *
+ * Return: size of the actual memory used by @objp in bytes
+ */
+size_t __ksize(const void *object)
+{
+	struct folio *folio;
+
+	if (unlikely(object == ZERO_SIZE_PTR))
+		return 0;
+
+	folio = virt_to_folio(object);
+
+	if (unlikely(!folio_test_slab(folio))) {
+		if (WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE))
+			return 0;
+		if (WARN_ON(object != folio_address(folio)))
+			return 0;
+		return folio_size(folio);
+	}
+
+	return slab_ksize(folio_slab(folio)->slab_cache);
+}
+
+#ifdef CONFIG_TRACING
+void *kmalloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
+{
+	void *ret = __kmem_cache_alloc_node(s, gfpflags, NUMA_NO_NODE,
+					    size, _RET_IP_);
+
+	trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
+
+	ret = kasan_kmalloc(s, ret, size, gfpflags);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_trace);
+
+void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+			 int node, size_t size)
+{
+	void *ret = __kmem_cache_alloc_node(s, gfpflags, node, size, _RET_IP_);
+
+	trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, node);
+
+	ret = kasan_kmalloc(s, ret, size, gfpflags);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_node_trace);
+#endif /* !CONFIG_TRACING */
 #endif /* !CONFIG_SLOB */

 gfp_t kmalloc_fix_flags(gfp_t flags)
···
  * directly to the page allocator. We use __GFP_COMP, because we will need to
  * know the allocation order to free the pages properly in kfree.
  */
-void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
+
+static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-	void *ret = NULL;
 	struct page *page;
+	void *ptr = NULL;
+	unsigned int order = get_order(size);

 	if (unlikely(flags & GFP_SLAB_BUG_MASK))
 		flags = kmalloc_fix_flags(flags);

 	flags |= __GFP_COMP;
-	page = alloc_pages(flags, order);
-	if (likely(page)) {
-		ret = page_address(page);
+	page = alloc_pages_node(node, flags, order);
+	if (page) {
+		ptr = page_address(page);
 		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
 				      PAGE_SIZE << order);
 	}
-	ret = kasan_kmalloc_large(ret, size, flags);
-	/* As ret might get tagged, call kmemleak hook after KASAN. */
-	kmemleak_alloc(ret, size, 1, flags);
-	return ret;
-}
-EXPORT_SYMBOL(kmalloc_order);

-#ifdef CONFIG_TRACING
-void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+	ptr = kasan_kmalloc_large(ptr, size, flags);
+	/* As ptr might get tagged, call kmemleak hook after KASAN. */
+	kmemleak_alloc(ptr, size, 1, flags);
+
+	return ptr;
+}
+
+void *kmalloc_large(size_t size, gfp_t flags)
 {
-	void *ret = kmalloc_order(size, flags, order);
-	trace_kmalloc(_RET_IP_, ret, NULL, size, PAGE_SIZE << order, flags);
+	void *ret = __kmalloc_large_node(size, flags, NUMA_NO_NODE);
+
+	trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+		      flags, NUMA_NO_NODE);
 	return ret;
 }
-EXPORT_SYMBOL(kmalloc_order_trace);
-#endif
+EXPORT_SYMBOL(kmalloc_large);
+
+void *kmalloc_large_node(size_t size, gfp_t flags, int node)
+{
+	void *ret = __kmalloc_large_node(size, flags, node);
+
+	trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+		      flags, node);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_large_node);

 #ifdef CONFIG_SLAB_FREELIST_RANDOM
 /* Randomize a generic freelist */
···

 #endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */

-static __always_inline void *__do_krealloc(const void *p, size_t new_size,
-					   gfp_t flags)
+static __always_inline __realloc_size(2) void *
+__do_krealloc(const void *p, size_t new_size, gfp_t flags)
 {
 	void *ret;
 	size_t ks;
···
 /* Tracepoints definitions. */
 EXPORT_TRACEPOINT_SYMBOL(kmalloc);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
-EXPORT_TRACEPOINT_SYMBOL(kmalloc_node);
-EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc_node);
 EXPORT_TRACEPOINT_SYMBOL(kfree);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);

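The rounding rule that kmalloc_size_roundup() applies in the mm/slab_common.c hunks above can be sketched in userspace. This is only an illustration under stated assumptions, not the kernel code: it assumes 4 KiB pages, a largest kmalloc cache of two pages, and purely power-of-two caches starting at 8 bytes (real kernels also have 96- and 192-byte caches and an architecture-dependent minimum). The names sketch_size_roundup, round_up_pow2 and the SKETCH_* constants are hypothetical, and the guard against shift overflow exists only in the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      ((size_t)4096)            /* assumption: 4 KiB pages */
#define SKETCH_MAX_CACHE_SIZE (2 * SKETCH_PAGE_SIZE)    /* assumption: largest cache */
#define SKETCH_MIN_SIZE       ((size_t)8)               /* assumption: smallest cache */

/* Smallest value of the form floor * 2^k that is >= size. */
static size_t round_up_pow2(size_t size, size_t floor)
{
	size_t bucket = floor;

	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Hypothetical userspace stand-in for kmalloc_size_roundup(). */
size_t sketch_size_roundup(size_t size)
{
	/* Short-circuit the 0 size case, as the kernel function does. */
	if (size == 0)
		return 0;
	/* Saturated input stays saturated; the second test is a sketch-only
	 * guard so that round_up_pow2() can never shift past the top bit. */
	if (size == SIZE_MAX || size > (SIZE_MAX >> 1) + 1)
		return SIZE_MAX;
	/* Large requests go to the page allocator: a power-of-two number
	 * of pages, mirroring PAGE_SIZE << get_order(size). */
	if (size > SKETCH_MAX_CACHE_SIZE)
		return round_up_pow2(size, SKETCH_PAGE_SIZE);
	/* Otherwise the request lands in the smallest cache that fits. */
	return round_up_pow2(size, SKETCH_MIN_SIZE);
}
```

A caller that wants to use the full bucket would, under these assumptions, compute the rounded size first and pass that value to the allocator, rather than allocating and then querying ksize() afterwards.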
+23 -22
mm/slob.c
···
 		*m = size;
 		ret = (void *)m + minalign;

-		trace_kmalloc_node(caller, ret, NULL,
-				   size, size + minalign, gfp, node);
+		trace_kmalloc(caller, ret, size, size + minalign, gfp, node);
 	} else {
 		unsigned int order = get_order(size);

···
 			gfp |= __GFP_COMP;
 		ret = slob_new_pages(gfp, order, node);

-		trace_kmalloc_node(caller, ret, NULL,
-				   size, PAGE_SIZE << order, gfp, node);
+		trace_kmalloc(caller, ret, size, PAGE_SIZE << order, gfp, node);
 	}

 	kmemleak_alloc(ret, size, 1, gfp);
···
 }
 EXPORT_SYMBOL(__kmalloc);

-void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller)
-{
-	return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller);
-}
-EXPORT_SYMBOL(__kmalloc_track_caller);
-
-#ifdef CONFIG_NUMA
 void *__kmalloc_node_track_caller(size_t size, gfp_t gfp,
 					int node, unsigned long caller)
 {
 	return __do_kmalloc_node(size, gfp, node, caller);
 }
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
-#endif

 void kfree(const void *block)
 {
···
 }
 EXPORT_SYMBOL(kfree);

+size_t kmalloc_size_roundup(size_t size)
+{
+	/* Short-circuit the 0 size case. */
+	if (unlikely(size == 0))
+		return 0;
+	/* Short-circuit saturated "too-large" case. */
+	if (unlikely(size == SIZE_MAX))
+		return SIZE_MAX;
+
+	return ALIGN(size, ARCH_KMALLOC_MINALIGN);
+}
+
+EXPORT_SYMBOL(kmalloc_size_roundup);
+
 /* can't use ksize for kmem_cache_alloc memory, only kmalloc */
 size_t __ksize(const void *block)
 {
···
 	m = (unsigned int *)(block - align);
 	return SLOB_UNITS(*m) * SLOB_UNIT;
 }
-EXPORT_SYMBOL(__ksize);

 int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
 {
···
 		/* leave room for rcu footer at the end of object */
 		c->size += sizeof(struct slob_rcu);
 	}
+
+	/* Actual size allocated */
+	c->size = SLOB_UNITS(c->size) * SLOB_UNIT;
 	c->flags = flags;
 	return 0;
 }
···

 	if (c->size < PAGE_SIZE) {
 		b = slob_alloc(c->size, flags, c->align, node, 0);
-		trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size,
-					    SLOB_UNITS(c->size) * SLOB_UNIT,
-					    flags, node);
+		trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
 	} else {
 		b = slob_new_pages(flags, get_order(c->size), node);
-		trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size,
-					    PAGE_SIZE << get_order(c->size),
-					    flags, node);
+		trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
 	}

 	if (b && c->ctor) {
···
 	return slob_alloc_node(cachep, flags, NUMA_NO_NODE);
 }
 EXPORT_SYMBOL(kmem_cache_alloc_lru);
-#ifdef CONFIG_NUMA
+
 void *__kmalloc_node(size_t size, gfp_t gfp, int node)
 {
 	return __do_kmalloc_node(size, gfp, node, _RET_IP_);
···
 	return slob_alloc_node(cachep, gfp, node);
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
-#endif

 static void __kmem_cache_free(void *b, int size)
 {
···
 void kmem_cache_free(struct kmem_cache *c, void *b)
 {
 	kmemleak_free_recursive(b, c->flags);
-	trace_kmem_cache_free(_RET_IP_, b, c->name);
+	trace_kmem_cache_free(_RET_IP_, b, c);
 	if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) {
 		struct slob_rcu *slob_rcu;
 		slob_rcu = b + (c->size - sizeof(struct slob_rcu));
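For comparison, the SLOB variant of kmalloc_size_roundup() added in the mm/slob.c hunks above has no size buckets and only aligns the request. A minimal userspace sketch of that rule follows; the minimum alignment of 8 bytes is an assumption (the real ARCH_KMALLOC_MINALIGN is architecture-dependent), and SKETCH_ALIGN, SKETCH_MINALIGN and sketch_slob_size_roundup are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MINALIGN ((size_t)8)   /* assumption: stand-in for ARCH_KMALLOC_MINALIGN */

/* Round x up to the next multiple of a; a must be a power of two. */
#define SKETCH_ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Hypothetical userspace stand-in for the SLOB kmalloc_size_roundup(). */
size_t sketch_slob_size_roundup(size_t size)
{
	/* Short-circuit the 0 size case. */
	if (size == 0)
		return 0;
	/* Short-circuit the saturated "too-large" case. */
	if (size == SIZE_MAX)
		return SIZE_MAX;

	return SKETCH_ALIGN(size, SKETCH_MINALIGN);
}
```

Like the alignment expression it imitates, the sketch wraps around for sizes within SKETCH_MINALIGN - 1 of SIZE_MAX; only the exact SIZE_MAX value is special-cased.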
+409 -462
mm/slub.c
···
  *   1. slab_mutex (Global Mutex)
  *   2. node->list_lock (Spinlock)
  *   3. kmem_cache->cpu_slab->lock (Local lock)
- *   4. slab_lock(slab) (Only on some arches or for debugging)
+ *   4. slab_lock(slab) (Only on some arches)
  *   5. object_map_lock (Only for debugging)
  *
  *   slab_mutex
···
  *   The slab_lock is a wrapper around the page lock, thus it is a bit
  *   spinlock.
  *
- *   The slab_lock is only used for debugging and on arches that do not
- *   have the ability to do a cmpxchg_double. It only protects:
+ *   The slab_lock is only used on arches that do not have the ability
+ *   to do a cmpxchg_double. It only protects:
+ *
  *	A. slab->freelist	-> List of free objects in a slab
  *	B. slab->inuse		-> Number of objects in use
  *	C. slab->objects	-> Number of objects in slab
···
  *   allocating a long series of objects that fill up slabs does not require
  *   the list lock.
  *
+ *   For debug caches, all allocations are forced to go through a list_lock
+ *   protected region to serialize against concurrent validation.
+ *
  *   cpu_slab->lock local lock
  *
  *   This locks protect slowpath manipulation of all kmem_cache_cpu fields
  *   except the stat counters. This is a percpu structure manipulated only by
  *   the local cpu, so the lock protects against being preempted or interrupted
  *   by an irq. Fast path operations rely on lockless operations instead.
- *   On PREEMPT_RT, the local lock does not actually disable irqs (and thus
- *   prevent the lockless operations), so fastpath operations also need to take
- *   the lock and are no longer lockless.
+ *
+ *   On PREEMPT_RT, the local lock neither disables interrupts nor preemption
+ *   which means the lockless fastpath cannot be used as it might interfere with
+ *   an in-progress slow path operations. In this case the local lock is always
+ *   taken but it still utilizes the freelist for the common operations.
  *
  *   lockless fastpaths
  *
···
  * function call even on !PREEMPT_RT, use inline preempt_disable() there.
  */
 #ifndef CONFIG_PREEMPT_RT
-#define slub_get_cpu_ptr(var)	get_cpu_ptr(var)
-#define slub_put_cpu_ptr(var)	put_cpu_ptr(var)
+#define slub_get_cpu_ptr(var)		get_cpu_ptr(var)
+#define slub_put_cpu_ptr(var)		put_cpu_ptr(var)
+#define USE_LOCKLESS_FAST_PATH()	(true)
 #else
 #define slub_get_cpu_ptr(var)		\
 ({					\
···
 	(void)(var);			\
 	migrate_enable();		\
 } while (0)
+#define USE_LOCKLESS_FAST_PATH()	(false)
 #endif

 #ifdef CONFIG_SLUB_DEBUG
···
 #endif
 #endif		/* CONFIG_SLUB_DEBUG */

+/* Structure holding parameters for get_partial() call chain */
+struct partial_context {
+	struct slab **slab;
+	gfp_t flags;
+	unsigned int orig_size;
+};
+
 static inline bool kmem_cache_debug(struct kmem_cache *s)
 {
 	return kmem_cache_debug_flags(s, SLAB_DEBUG_FLAGS);
+}
+
+static inline bool slub_debug_orig_size(struct kmem_cache *s)
+{
+	return (kmem_cache_debug_flags(s, SLAB_STORE_USER) &&
+			(s->flags & SLAB_KMALLOC));
 }

 void *fixup_red_left(struct kmem_cache *s, void *p)
···
 /*
  * Per slab locking using the pagelock
  */
-static __always_inline void __slab_lock(struct slab *slab)
+static __always_inline void slab_lock(struct slab *slab)
 {
 	struct page *page = slab_page(slab);

···
 	bit_spin_lock(PG_locked, &page->flags);
 }

-static __always_inline void __slab_unlock(struct slab *slab)
+static __always_inline void slab_unlock(struct slab *slab)
 {
 	struct page *page = slab_page(slab);

···
 	__bit_spin_unlock(PG_locked, &page->flags);
 }

-static __always_inline void slab_lock(struct slab *slab, unsigned long *flags)
-{
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_irq_save(*flags);
-	__slab_lock(slab);
-}
-
-static __always_inline void slab_unlock(struct slab *slab, unsigned long *flags)
-{
-	__slab_unlock(slab);
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_irq_restore(*flags);
-}
-
 /*
  * Interrupts must be disabled (for the fallback code to work right), typically
- * by an _irqsave() lock variant. Except on PREEMPT_RT where locks are different
- * so we disable interrupts as part of slab_[un]lock().
+ * by an _irqsave() lock variant. On PREEMPT_RT the preempt_disable(), which is
+ * part of bit_spin_lock(), is sufficient because the policy is not to allow any
+ * allocation/ free operation in hardirq context. Therefore nothing can
+ * interrupt the operation.
  */
 static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
 		void *freelist_old, unsigned long counters_old,
 		void *freelist_new, unsigned long counters_new,
 		const char *n)
 {
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+	if (USE_LOCKLESS_FAST_PATH())
 		lockdep_assert_irqs_disabled();
 #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
     defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
···
 	} else
 #endif
 	{
-		/* init to 0 to prevent spurious warnings */
-		unsigned long flags = 0;
-
-		slab_lock(slab, &flags);
+		slab_lock(slab);
 		if (slab->freelist == freelist_old &&
 					slab->counters == counters_old) {
 			slab->freelist = freelist_new;
 			slab->counters = counters_new;
-			slab_unlock(slab, &flags);
+			slab_unlock(slab);
 			return true;
 		}
-		slab_unlock(slab, &flags);
+		slab_unlock(slab);
 	}

 	cpu_relax();
···
 		unsigned long flags;

 		local_irq_save(flags);
-		__slab_lock(slab);
+		slab_lock(slab);
 		if (slab->freelist == freelist_old &&
 					slab->counters == counters_old) {
 			slab->freelist = freelist_new;
 			slab->counters = counters_new;
-			__slab_unlock(slab);
+			slab_unlock(slab);
 			local_irq_restore(flags);
 			return true;
 		}
-		__slab_unlock(slab);
+		slab_unlock(slab);
 		local_irq_restore(flags);
 	}

···

 #ifdef CONFIG_SLUB_DEBUG
 static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
-static DEFINE_RAW_SPINLOCK(object_map_lock);
+static DEFINE_SPINLOCK(object_map_lock);

 static void __fill_map(unsigned long *obj_map, struct kmem_cache *s,
 		       struct slab *slab)
···
 #else
 static inline bool slab_add_kunit_errors(void) { return false; }
 #endif
-
-/*
- * Determine a map of objects in use in a slab.
- *
- * Node listlock must be held to guarantee that the slab does
- * not vanish from under us.
- */
-static unsigned long *get_map(struct kmem_cache *s, struct slab *slab)
-	__acquires(&object_map_lock)
-{
-	VM_BUG_ON(!irqs_disabled());
-
-	raw_spin_lock(&object_map_lock);
-
-	__fill_map(object_map, s, slab);
-
-	return object_map;
-}
-
-static void put_map(unsigned long *map) __releases(&object_map_lock)
-{
-	VM_BUG_ON(map != object_map);
-	raw_spin_unlock(&object_map_lock);
-}

 static inline unsigned int size_from_object(struct kmem_cache *s)
 {
···
 	       folio_flags(folio, 0));
 }

+/*
+ * kmalloc caches has fixed sizes (mostly power of 2), and kmalloc() API
+ * family will round up the real request size to these fixed ones, so
+ * there could be an extra area than what is requested. Save the original
+ * request size in the meta data area, for better debug and sanity check.
+ */
+static inline void set_orig_size(struct kmem_cache *s,
+					void *object, unsigned int orig_size)
+{
+	void *p = kasan_reset_tag(object);
+
+	if (!slub_debug_orig_size(s))
+		return;
+
+	p += get_info_end(s);
+	p += sizeof(struct track) * 2;
+
+	*(unsigned int *)p = orig_size;
+}
+
+static inline unsigned int get_orig_size(struct kmem_cache *s, void *object)
+{
+	void *p = kasan_reset_tag(object);
+
+	if (!slub_debug_orig_size(s))
+		return s->object_size;
+
+	p += get_info_end(s);
+	p += sizeof(struct track) * 2;
+
+	return *(unsigned int *)p;
+}
+
 static void slab_bug(struct kmem_cache *s, char *fmt, ...)
 {
 	struct va_format vaf;
···

 	if (s->flags & SLAB_STORE_USER)
 		off += 2 * sizeof(struct track);
+
+	if (slub_debug_orig_size(s))
+		off += sizeof(unsigned int);

 	off += kasan_metadata_size(s);

···
  *
  *	A. Free pointer (if we cannot overwrite object on free)
  *	B. Tracking data for SLAB_STORE_USER
- *	C. Padding to reach required alignment boundary or at minimum
+ *	C. Original request size for kmalloc object (SLAB_STORE_USER enabled)
+ *	D. Padding to reach required alignment boundary or at minimum
  *		one word if debugging is on to be able to detect writes
  *		before the word boundary.
  *
···
 {
 	unsigned long off = get_info_end(s);	/* The end of info */

-	if (s->flags & SLAB_STORE_USER)
+	if (s->flags & SLAB_STORE_USER) {
 		/* We also have user information there */
 		off += 2 * sizeof(struct track);
+
+		if (s->flags & SLAB_KMALLOC)
+			off += sizeof(unsigned int);
+	}

 	off += kasan_metadata_size(s);

···
 }

 static noinline int alloc_debug_processing(struct kmem_cache *s,
-					struct slab *slab,
-					void *object, unsigned long addr)
+			struct slab *slab, void *object, int orig_size)
 {
 	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
 		if (!alloc_consistency_checks(s, slab, object))
 			goto bad;
 	}

-	/* Success perform special debug activities for allocs */
-	if (s->flags & SLAB_STORE_USER)
-		set_track(s, object, TRACK_ALLOC, addr);
+	/* Success. Perform special debug activities for allocs */
 	trace(s, slab, object, 1);
+	set_orig_size(s, object, orig_size);
 	init_object(s, object, SLUB_RED_ACTIVE);
 	return 1;

···
 		return 0;
 	}
 	return 1;
-}
-
-/* Supports checking bulk free of a constructed freelist */
-static noinline int free_debug_processing(
-	struct kmem_cache *s, struct slab *slab,
-	void *head, void *tail, int bulk_cnt,
-	unsigned long addr)
-{
-	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
-	void *object = head;
-	int cnt = 0;
-	unsigned long flags, flags2;
-	int ret = 0;
-	depot_stack_handle_t handle = 0;
-
-	if (s->flags & SLAB_STORE_USER)
-		handle = set_track_prepare();
-
-	spin_lock_irqsave(&n->list_lock, flags);
-	slab_lock(slab, &flags2);
-
-	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
-		if (!check_slab(s, slab))
-			goto out;
-	}
-
-next_object:
-	cnt++;
-
-	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
-		if (!free_consistency_checks(s, slab, object, addr))
-			goto out;
-	}
-
-	if (s->flags & SLAB_STORE_USER)
-		set_track_update(s, object, TRACK_FREE, addr, handle);
-	trace(s, slab, object, 0);
-	/* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
-	init_object(s, object, SLUB_RED_INACTIVE);
-
-	/* Reached end of constructed freelist yet? */
-	if (object != tail) {
-		object = get_freepointer(s, object);
-		goto next_object;
-	}
-	ret = 1;
-
-out:
-	if (cnt != bulk_cnt)
-		slab_err(s, slab, "Bulk freelist count(%d) invalid(%d)\n",
-			 bulk_cnt, cnt);
-
-	slab_unlock(slab, &flags2);
-	spin_unlock_irqrestore(&n->list_lock, flags);
-	if (!ret)
-		slab_fix(s, "Object at 0x%p not freed", object);
-	return ret;
 }

 /*
···
 void setup_slab_debug(struct kmem_cache *s, struct slab *slab, void *addr) {}

 static inline int alloc_debug_processing(struct kmem_cache *s,
-	struct slab *slab, void *object, unsigned long addr) { return 0; }
+	struct slab *slab, void *object, int orig_size) { return 0; }

-static inline int free_debug_processing(
+static inline void free_debug_processing(
 	struct kmem_cache *s, struct slab *slab,
 	void *head, void *tail, int bulk_cnt,
-	unsigned long addr) { return 0; }
+	unsigned long addr) {}

 static inline void slab_pad_check(struct kmem_cache *s, struct slab *slab) {}
 static inline int check_object(struct kmem_cache *s, struct slab *slab,
 			void *object, u8 val) { return 1; }
+static inline void set_track(struct kmem_cache *s, void *object,
+			     enum track_item alloc, unsigned long addr) {}
 static inline void add_full(struct kmem_cache *s, struct kmem_cache_node *n,
 					struct slab *slab) {}
 static inline void remove_full(struct kmem_cache *s, struct kmem_cache_node *n,
···
  * Hooks for other subsystems that check memory allocations. In a typical
  * production configuration these hooks all should produce no code at all.
  */
-static inline void *kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
-{
-	ptr = kasan_kmalloc_large(ptr, size, flags);
-	/* As ptr might get tagged, call kmemleak hook after KASAN. */
-	kmemleak_alloc(ptr, size, 1, flags);
-	return ptr;
-}
-
-static __always_inline void kfree_hook(void *x)
-{
-	kmemleak_free(x);
-	kasan_kfree_large(x);
-}
-
 static __always_inline bool slab_free_hook(struct kmem_cache *s,
 						void *x, bool init)
 {
···
 		 */
 		slab = alloc_slab_page(alloc_gfp, node, oo);
 		if (unlikely(!slab))
-			goto out;
+			return NULL;
 		stat(s, ORDER_FALLBACK);
 	}

 	slab->objects = oo_objects(oo);
+	slab->inuse = 0;
+	slab->frozen = 0;

 	account_slab(slab, oo_order(oo), s, flags);

···
 		}
 		set_freepointer(s, p, NULL);
 	}
-
-	slab->inuse = slab->objects;
-	slab->frozen = 1;
-
-out:
-	if (!slab)
-		return NULL;
-
-	inc_slabs_node(s, slab_nid(slab), slab->objects);

 	return slab;
 }
···
 }

 /*
+ * Called only for kmem_cache_debug() caches instead of acquire_slab(), with a
+ * slab from the n->partial list. Remove only a single object from the slab, do
+ * the alloc_debug_processing() checks and leave the slab on the list, or move
+ * it to full list if it was the last free object.
+ */
+static void *alloc_single_from_partial(struct kmem_cache *s,
+		struct kmem_cache_node *n, struct slab *slab, int orig_size)
+{
+	void *object;
+
+	lockdep_assert_held(&n->list_lock);
+
+	object = slab->freelist;
+	slab->freelist = get_freepointer(s, object);
+	slab->inuse++;
+
+	if (!alloc_debug_processing(s, slab, object, orig_size)) {
+		remove_partial(n, slab);
+		return NULL;
+	}
+
+	if (slab->inuse == slab->objects) {
+		remove_partial(n, slab);
+		add_full(s, n, slab);
+	}
+
+	return object;
+}
+
+/*
+ * Called only for kmem_cache_debug() caches to allocate from a freshly
+ * allocated slab. Allocate a single object instead of whole freelist
+ * and put the slab to the partial (or full) list.
+ */
+static void *alloc_single_from_new_slab(struct kmem_cache *s,
+					struct slab *slab, int orig_size)
+{
+	int nid = slab_nid(slab);
+	struct kmem_cache_node *n = get_node(s, nid);
+	unsigned long flags;
+	void *object;
+
+
+	object = slab->freelist;
+	slab->freelist = get_freepointer(s, object);
+	slab->inuse = 1;
+
+	if (!alloc_debug_processing(s, slab, object, orig_size))
+		/*
+		 * It's not really expected that this would fail on a
+		 * freshly allocated slab, but a concurrent memory
+		 * corruption in theory could cause that.
+		 */
+		return NULL;
+
+	spin_lock_irqsave(&n->list_lock, flags);
+
+	if (slab->inuse == slab->objects)
+		add_full(s, n, slab);
+	else
+		add_partial(n, slab, DEACTIVATE_TO_HEAD);
+
+	inc_slabs_node(s, nid, slab->objects);
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	return object;
+}
+
+/*
  * Remove slab from the partial list, freeze it and
  * return the pointer to the freelist.
  *
···
  * Try to allocate a partial slab from a specific node.
  */
 static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
-			      struct slab **ret_slab, gfp_t gfpflags)
+			      struct partial_context *pc)
 {
 	struct slab *slab, *slab2;
 	void *object = NULL;
···
 	list_for_each_entry_safe(slab, slab2, &n->partial, slab_list) {
 		void *t;

-		if (!pfmemalloc_match(slab, gfpflags))
+		if (!pfmemalloc_match(slab, pc->flags))
 			continue;
+
+		if (kmem_cache_debug(s)) {
+			object = alloc_single_from_partial(s, n, slab,
+							pc->orig_size);
+			if (object)
+				break;
+			continue;
+		}

 		t = acquire_slab(s, n, slab, object == NULL);
 		if (!t)
 			break;

 		if (!object) {
-			*ret_slab = slab;
+			*pc->slab = slab;
 			stat(s, ALLOC_FROM_PARTIAL);
 			object = t;
 		} else {
···
 /*
  * Get a slab from somewhere. Search in increasing NUMA distances.
  */
-static void *get_any_partial(struct kmem_cache *s, gfp_t flags,
-					     struct slab **ret_slab)
+static void *get_any_partial(struct kmem_cache *s, struct partial_context *pc)
 {
 #ifdef CONFIG_NUMA
 	struct zonelist *zonelist;
 	struct zoneref *z;
 	struct zone *zone;
-	enum zone_type highest_zoneidx = gfp_zone(flags);
+	enum zone_type highest_zoneidx = gfp_zone(pc->flags);
 	void *object;
 	unsigned int cpuset_mems_cookie;

···

 	do {
 		cpuset_mems_cookie = read_mems_allowed_begin();
-		zonelist = node_zonelist(mempolicy_slab_node(), flags);
+		zonelist = node_zonelist(mempolicy_slab_node(), pc->flags);
 		for_each_zone_zonelist(zone, z, zonelist, highest_zoneidx) {
 			struct kmem_cache_node *n;

 			n = get_node(s, zone_to_nid(zone));

-			if (n && cpuset_zone_allowed(zone, flags) &&
+			if (n && cpuset_zone_allowed(zone, pc->flags) &&
 					n->nr_partial > s->min_partial) {
-				object = get_partial_node(s, n, ret_slab, flags);
+				object = get_partial_node(s, n, pc);
 				if (object) {
 					/*
 					 * Don't check read_mems_allowed_retry()
···
 /*
  * Get a partial slab, lock it and return it.
  */
-static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
-			 struct slab **ret_slab)
+static void *get_partial(struct kmem_cache *s, int node, struct partial_context *pc)
 {
 	void *object;
 	int searchnode = node;
···
 	if (node == NUMA_NO_NODE)
 		searchnode = numa_mem_id();

-	object = get_partial_node(s, get_node(s, searchnode), ret_slab, flags);
+	object = get_partial_node(s, get_node(s, searchnode), pc);
 	if (object || node != NUMA_NO_NODE)
 		return object;

-	return get_any_partial(s, flags, ret_slab);
+	return get_any_partial(s, pc);
 }

 #ifdef CONFIG_PREEMPTION
···
 {
 	return atomic_long_read(&n->total_objects);
 }
+
+/* Supports checking bulk free of a constructed freelist */
+static noinline void free_debug_processing(
+	struct kmem_cache *s, struct slab *slab,
+	void *head, void *tail, int bulk_cnt,
+	unsigned long addr)
+{
+	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
+	struct slab *slab_free = NULL;
+	void *object = head;
+	int cnt = 0;
+	unsigned long flags;
+	bool checks_ok = false;
+	depot_stack_handle_t handle = 0;
+
+	if (s->flags & SLAB_STORE_USER)
+		handle = set_track_prepare();
+
+	spin_lock_irqsave(&n->list_lock, flags);
+
+	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
+		if (!check_slab(s, slab))
+			goto out;
+	}
+
+	if (slab->inuse < bulk_cnt) {
+		slab_err(s, slab, "Slab has %d allocated objects but %d are to be freed\n",
+			 slab->inuse, bulk_cnt);
+		goto out;
+	}
+
+next_object:
+
+	if (++cnt > bulk_cnt)
+		goto out_cnt;
+
+	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
+		if (!free_consistency_checks(s, slab, object, addr))
+			goto out;
+	}
+
+	if (s->flags & SLAB_STORE_USER)
+		set_track_update(s, object, TRACK_FREE, addr, handle);
+	trace(s, slab, object, 0);
+	/* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
+	init_object(s, object, SLUB_RED_INACTIVE);
+
+	/* Reached end of constructed freelist yet? */
+	if (object != tail) {
+		object = get_freepointer(s, object);
+		goto next_object;
+	}
+	checks_ok = true;
+
+out_cnt:
+	if (cnt != bulk_cnt)
+		slab_err(s, slab, "Bulk free expected %d objects but found %d\n",
+			 bulk_cnt, cnt);
+
+out:
+	if (checks_ok) {
+		void *prior = slab->freelist;
+
+		/* Perform the actual freeing while we still hold the locks */
+		slab->inuse -= cnt;
+		set_freepointer(s, tail, prior);
+		slab->freelist = head;
+
+		/*
+		 * If the slab is empty, and node's partial list is full,
+		 * it should be discarded anyway no matter it's on full or
+		 * partial list.
+		 */
+		if (slab->inuse == 0 && n->nr_partial >= s->min_partial)
+			slab_free = slab;
+
+		if (!prior) {
+			/* was on full list */
+			remove_full(s, n, slab);
+			if (!slab_free) {
+				add_partial(n, slab, DEACTIVATE_TO_TAIL);
+				stat(s, FREE_ADD_PARTIAL);
+			}
+		} else if (slab_free) {
+			remove_partial(n, slab);
+			stat(s, FREE_REMOVE_PARTIAL);
+		}
+	}
+
+	if (slab_free) {
+		/*
+		 * Update the counters while still holding n->list_lock to
+		 * prevent spurious validation warnings
+		 */
+		dec_slabs_node(s, slab_nid(slab_free), slab_free->objects);
+	}
+
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	if (!checks_ok)
+		slab_fix(s, "Object at 0x%p not freed", object);
+
+	if (slab_free) {
+		stat(s, FREE_SLAB);
+		free_slab(s, slab_free);
+	}
+}
 #endif /* CONFIG_SLUB_DEBUG */

 #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
···
  * already disabled (which is the case for bulk allocation).
  */
 static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
-			  unsigned long addr, struct kmem_cache_cpu *c)
+			  unsigned long addr, struct kmem_cache_cpu *c, unsigned int orig_size)
 {
 	void *freelist;
 	struct slab *slab;
 	unsigned long flags;
+	struct partial_context pc;

 	stat(s, ALLOC_SLOWPATH);

···

 new_objects:

-	freelist = get_partial(s, gfpflags, node, &slab);
+	pc.flags = gfpflags;
+	pc.slab = &slab;
+	pc.orig_size = orig_size;
+	freelist = get_partial(s, node, &pc);
 	if (freelist)
 		goto check_new_slab;

···
 		return NULL;
 	}

+	stat(s, ALLOC_SLAB);
+
+	if (kmem_cache_debug(s)) {
+		freelist = alloc_single_from_new_slab(s, slab, orig_size);
+
+		if (unlikely(!freelist))
+			goto new_objects;
+
+		if (s->flags & SLAB_STORE_USER)
+			set_track(s, freelist, TRACK_ALLOC, addr);
+
+		return freelist;
+	}
+
 	/*
 	 * No other reference to the slab yet so we can
 	 * muck around with it freely without cmpxchg
 	 */
 	freelist = slab->freelist;
 	slab->freelist = NULL;
+	slab->inuse = slab->objects;
+	slab->frozen = 1;

-	stat(s, ALLOC_SLAB);
+	inc_slabs_node(s, slab_nid(slab), slab->objects);

 check_new_slab:

 	if (kmem_cache_debug(s)) {
-		if (!alloc_debug_processing(s, slab, freelist, addr)) {
-			/* Slab failed checks. Next slab needed */
-			goto new_slab;
-		} else {
-			/*
-			 * For debug case, we don't load freelist so that all
-			 * allocations go through alloc_debug_processing()
-			 */
-			goto return_single;
-		}
+		/*
+		 * For debug caches here we had to go through
+		 * alloc_single_from_partial() so just store the tracking info
+		 * and return the object
+		 */
+		if (s->flags & SLAB_STORE_USER)
+			set_track(s, freelist, TRACK_ALLOC, addr);
+
+		return freelist;
 	}

-	if (unlikely(!pfmemalloc_match(slab, gfpflags)))
+	if (unlikely(!pfmemalloc_match(slab, gfpflags))) {
 		/*
 		 * For !pfmemalloc_match() case we don't load freelist so that
 		 * we don't make further mismatched allocations easier.
 		 */
-		goto return_single;
+		deactivate_slab(s, slab, get_freepointer(s, freelist));
+		return freelist;
+	}

 retry_load_slab:

···
 	c->slab = slab;

 	goto load_freelist;
-
-return_single:
-
-	deactivate_slab(s, slab, get_freepointer(s, freelist));
-	return freelist;
 }

 /*
···
  * pointer.
  */
 static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
-			  unsigned long addr, struct kmem_cache_cpu *c)
+			  unsigned long addr, struct kmem_cache_cpu *c, unsigned int orig_size)
 {
 	void *p;

···
 	c = slub_get_cpu_ptr(s->cpu_slab);
 #endif

-	p = ___slab_alloc(s, gfpflags, node, addr, c);
+	p = ___slab_alloc(s, gfpflags, node, addr, c, orig_size);
 #ifdef CONFIG_PREEMPT_COUNT
 	slub_put_cpu_ptr(s->cpu_slab);
 #endif
···

 	object = c->freelist;
 	slab = c->slab;
-	/*
-	 * We cannot use the lockless fastpath on PREEMPT_RT because if a
-	 * slowpath has taken the local_lock_irqsave(), it is not protected
-	 * against a fast path operation in an irq handler. So we need to take
-	 * the slow path which uses local_lock. It is still relatively fast if
-	 * there is a suitable cpu freelist.
-	 */
-	if (IS_ENABLED(CONFIG_PREEMPT_RT) ||
+
+	if (!USE_LOCKLESS_FAST_PATH() ||
 	    unlikely(!object || !slab || !node_match(slab, node))) {
-		object = __slab_alloc(s, gfpflags, node, addr, c);
+		object = __slab_alloc(s, gfpflags, node, addr, c, orig_size);
 	} else {
 		void *next_object = get_freepointer_safe(s, object);

···
 {
 	void *ret = slab_alloc(s, lru, gfpflags, _RET_IP_, s->object_size);

-	trace_kmem_cache_alloc(_RET_IP_, ret, s, s->object_size,
-				s->size, gfpflags);
+	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, NUMA_NO_NODE);

 	return ret;
 }
···
 }
 EXPORT_SYMBOL(kmem_cache_alloc_lru);

-#ifdef CONFIG_TRACING
-void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
+void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
+			      int node, size_t orig_size,
+			      unsigned long caller)
 {
-	void *ret = slab_alloc(s, NULL, gfpflags, _RET_IP_, size);
-	trace_kmalloc(_RET_IP_, ret, s, size, s->size, gfpflags);
-	ret = kasan_kmalloc(s, ret, size, gfpflags);
-	return ret;
+	return slab_alloc_node(s, NULL, gfpflags, node,
+			       caller, orig_size);
 }
-EXPORT_SYMBOL(kmem_cache_alloc_trace);
-#endif

-#ifdef CONFIG_NUMA
 void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
 {
 	void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, s->object_size);

-	trace_kmem_cache_alloc_node(_RET_IP_, ret, s,
-				    s->object_size, s->size, gfpflags, node);
+	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, node);

 	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
-
-#ifdef CONFIG_TRACING
-void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
-				    gfp_t gfpflags,
-				    int node, size_t size)
-{
-	void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, size);
-
-	trace_kmalloc_node(_RET_IP_, ret, s,
-			   size, s->size, gfpflags, node);
-
-	ret = kasan_kmalloc(s, ret, size, gfpflags);
-	return ret;
-}
-EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
-#endif
-#endif	/* CONFIG_NUMA */

 /*
  * Slow path handling. This may still be called frequently since objects
···
 	if (kfence_free(head))
 		return;

-	if (kmem_cache_debug(s) &&
-	    !free_debug_processing(s, slab, head, tail, cnt, addr))
+	if (kmem_cache_debug(s)) {
+		free_debug_processing(s, slab, head, tail, cnt, addr);
 		return;
+	}

 	do {
 		if (unlikely(n)) {
···
 	void *tail_obj = tail ? : head;
 	struct kmem_cache_cpu *c;
 	unsigned long tid;
+	void **freelist;

 redo:
 	/*
···
 	/* Same with comment on barrier() in slab_alloc_node() */
 	barrier();

-	if (likely(slab == c->slab)) {
-#ifndef CONFIG_PREEMPT_RT
-		void **freelist = READ_ONCE(c->freelist);
+	if (unlikely(slab != c->slab)) {
+		__slab_free(s, slab, head, tail_obj, cnt, addr);
+		return;
+	}
+
+	if (USE_LOCKLESS_FAST_PATH()) {
+		freelist = READ_ONCE(c->freelist);

 		set_freepointer(s, tail_obj, freelist);

···
 			note_cmpxchg_failure("slab_free", s, tid);
 			goto redo;
 		}
-#else /* CONFIG_PREEMPT_RT */
-		/*
-		 * We cannot use the lockless fastpath on PREEMPT_RT because if
-		 * a slowpath has taken the local_lock_irqsave(), it is not
-		 * protected against a fast path operation in an irq handler. So
-		 * we need to take the local_lock. We shouldn't simply defer to
-		 * __slab_free() as that wouldn't use the cpu freelist at all.
-		 */
-		void **freelist;
-
+	} else {
+		/* Update the free list under the local lock */
 		local_lock(&s->cpu_slab->lock);
 		c = this_cpu_ptr(s->cpu_slab);
 		if (unlikely(slab != c->slab)) {
···
 		c->tid = next_tid(tid);

 		local_unlock(&s->cpu_slab->lock);
-#endif
-		stat(s, FREE_FASTPATH);
-	} else
-		__slab_free(s, slab, head, tail_obj, cnt, addr);
-
+	}
+	stat(s, FREE_FASTPATH);
 }

 static __always_inline void slab_free(struct kmem_cache *s, struct slab *slab,
···
 }
 #endif

+void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller)
+{
+	slab_free(s, virt_to_slab(x), x, NULL, &x, 1, caller);
+}
+
 void kmem_cache_free(struct kmem_cache *s, void *x)
 {
 	s = cache_from_obj(s, x);
 	if (!s)
 		return;
-	trace_kmem_cache_free(_RET_IP_, x, s->name);
+	trace_kmem_cache_free(_RET_IP_, x, s);
 	slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
 }
 EXPORT_SYMBOL(kmem_cache_free);
···
 	int cnt;
 	struct kmem_cache *s;
 };
-
-static inline void free_large_kmalloc(struct folio *folio, void *object)
-{
-	unsigned int order = folio_order(folio);
-
-	if (WARN_ON_ONCE(order == 0))
-		pr_warn_once("object pointer: 0x%p\n", object);
-
-	kfree_hook(object);
-	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
-			      -(PAGE_SIZE << order));
-	__free_pages(folio_page(folio, 0), order);
-}

 /*
  * This function progressively scans the array with free objects (with
···
 			 * of re-populating per CPU c->freelist
 			 */
 			p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE,
-					    _RET_IP_, c);
+					    _RET_IP_, c, s->object_size);
 			if (unlikely(!p[i]))
 				goto error;

···
 	slab = new_slab(kmem_cache_node, GFP_NOWAIT, node);

 	BUG_ON(!slab);
+	inc_slabs_node(kmem_cache_node, slab_nid(slab), slab->objects);
 	if (slab_nid(slab) != node) {
 		pr_err("SLUB: Unable to allocate memory from node %d\n", node);
 		pr_err("SLUB: Allocating a useless per node structure in order to be able to continue\n");
···
 	n = kasan_slab_alloc(kmem_cache_node, n, GFP_KERNEL, false);
 	slab->freelist = get_freepointer(kmem_cache_node, n);
 	slab->inuse = 1;
-	slab->frozen = 0;
 	kmem_cache_node->node[node] = n;
 	init_kmem_cache_node(n);
 	inc_slabs_node(kmem_cache_node, node, slab->objects);
···
 	}

 #ifdef CONFIG_SLUB_DEBUG
-	if (flags & SLAB_STORE_USER)
+	if (flags & SLAB_STORE_USER) {
 		/*
 		 * Need to store information about allocs and frees after
 		 * the object.
 		 */
 		size += 2 * sizeof(struct track);
+
+		/* Save the original kmalloc request size */
+		if (flags & SLAB_KMALLOC)
+			size += sizeof(unsigned int);
+	}
 #endif

 	kasan_cache_create(s, &size, &s->flags);
···
 {
 #ifdef CONFIG_SLUB_DEBUG
 	void *addr = slab_address(slab);
-	unsigned long flags;
-	unsigned long *map;
 	void *p;

 	slab_err(s, slab, text, s->name);
-	slab_lock(slab, &flags);

-	map = get_map(s, slab);
+	spin_lock(&object_map_lock);
+	__fill_map(object_map, s, slab);
+
 	for_each_object(p, s, addr, slab->objects) {

-		if (!test_bit(__obj_to_index(s, addr, p), map)) {
+		if (!test_bit(__obj_to_index(s, addr, p), object_map)) {
 			pr_err("Object 0x%p @offset=%tu\n", p, p - addr);
 			print_tracking(s, p);
 		}
 	}
-	put_map(map);
-	slab_unlock(slab, &flags);
+	spin_unlock(&object_map_lock);
 #endif
 }

···

 __setup("slub_min_objects=", setup_slub_min_objects);

-void *__kmalloc(size_t size, gfp_t flags)
-{
-	struct kmem_cache *s;
-	void *ret;
-
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
-		return kmalloc_large(size, flags);
-
-	s = kmalloc_slab(size, flags);
-
-	if (unlikely(ZERO_OR_NULL_PTR(s)))
-		return s;
-
-	ret = slab_alloc(s, NULL, flags, _RET_IP_, size);
-
-	trace_kmalloc(_RET_IP_, ret, s, size, s->size, flags);
-
-	ret = kasan_kmalloc(s, ret, size, flags);
-
-	return ret;
-}
-EXPORT_SYMBOL(__kmalloc);
-
-#ifdef CONFIG_NUMA
-static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
-{
-	struct page *page;
-	void *ptr = NULL;
-	unsigned int order = get_order(size);
-
-	flags |= __GFP_COMP;
-	page = alloc_pages_node(node, flags, order);
-	if (page) {
-		ptr = page_address(page);
-		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
-				      PAGE_SIZE << order);
-	}
-
-	return kmalloc_large_node_hook(ptr, size, flags);
-}
-
-void *__kmalloc_node(size_t size, gfp_t flags, int node)
-{
-	struct kmem_cache *s;
-	void *ret;
-
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
-		ret = kmalloc_large_node(size, flags, node);
-
-		trace_kmalloc_node(_RET_IP_, ret, NULL,
-				   size, PAGE_SIZE << get_order(size),
-				   flags, node);
-
-		return ret;
-	}
-
-	s = kmalloc_slab(size, flags);
-
-	if (unlikely(ZERO_OR_NULL_PTR(s)))
-		return s;
-
-	ret = slab_alloc_node(s, NULL, flags, node, _RET_IP_, size);
-
-	trace_kmalloc_node(_RET_IP_, ret, s, size, s->size, flags, node);
-
-	ret = kasan_kmalloc(s, ret, size, flags);
-
-	return ret;
-}
-EXPORT_SYMBOL(__kmalloc_node);
-#endif	/* CONFIG_NUMA */
-
 #ifdef CONFIG_HARDENED_USERCOPY
 /*
  * Rejects incorrectly sized objects and objects that are to be copied
···
 }
 #endif /* CONFIG_HARDENED_USERCOPY */

-size_t __ksize(const void *object)
-{
-	struct folio *folio;
-
-	if (unlikely(object == ZERO_SIZE_PTR))
-		return 0;
-
-	folio = virt_to_folio(object);
-
-	if (unlikely(!folio_test_slab(folio)))
-		return folio_size(folio);
-
-	return slab_ksize(folio_slab(folio)->slab_cache);
-}
-EXPORT_SYMBOL(__ksize);
-
-void kfree(const void *x)
-{
-	struct folio *folio;
-	struct slab *slab;
-	void *object = (void *)x;
-
-	trace_kfree(_RET_IP_, x);
-
-	if (unlikely(ZERO_OR_NULL_PTR(x)))
-		return;
-
-	folio = virt_to_folio(x);
-	if (unlikely(!folio_test_slab(folio))) {
-		free_large_kmalloc(folio, object);
-		return;
-	}
-	slab = folio_slab(folio);
-	slab_free(slab->slab_cache, slab, object, NULL, &object, 1, _RET_IP_);
-}
-EXPORT_SYMBOL(kfree);
-
 #define SHRINK_PROMOTE_MAX 32

 /*
···
 			if (free == slab->objects) {
 				list_move(&slab->slab_list, &discard);
 				n->nr_partial--;
+				dec_slabs_node(s, node, slab->objects);
 			} else if (free <= SHRINK_PROMOTE_MAX)
 				list_move(&slab->slab_list, promote + free - 1);
 		}
···

 		/* Release empty slabs */
 		list_for_each_entry_safe(slab, t, &discard, slab_list)
-			discard_slab(s, slab);
+			free_slab(s, slab);

 		if (slabs_node(s, node))
 			ret = 1;
···
 	return 0;
 }

-void *__kmalloc_track_caller(size_t size, gfp_t gfpflags, unsigned long caller)
-{
-	struct kmem_cache *s;
-	void *ret;
-
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
-		return kmalloc_large(size, gfpflags);
-
-	s = kmalloc_slab(size, gfpflags);
-
-	if (unlikely(ZERO_OR_NULL_PTR(s)))
-		return s;
-
-	ret = slab_alloc(s, NULL, gfpflags, caller, size);
-
-	/* Honor the call site pointer we received. */
-	trace_kmalloc(caller, ret, s, size, s->size, gfpflags);
-
-	ret = kasan_kmalloc(s, ret, size, gfpflags);
-
-	return ret;
-}
-EXPORT_SYMBOL(__kmalloc_track_caller);
-
-#ifdef CONFIG_NUMA
-void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
-					int node, unsigned long caller)
-{
-	struct kmem_cache *s;
-	void *ret;
-
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
-		ret = kmalloc_large_node(size, gfpflags, node);
-
-		trace_kmalloc_node(caller, ret, NULL,
-				   size, PAGE_SIZE << get_order(size),
-				   gfpflags, node);
-
-		return ret;
-	}
-
-	s = kmalloc_slab(size, gfpflags);
-
-	if (unlikely(ZERO_OR_NULL_PTR(s)))
-		return s;
-
-	ret = slab_alloc_node(s, NULL, gfpflags, node, caller, size);
-
-	/* Honor the call site pointer we received. */
-	trace_kmalloc_node(caller, ret, s, size, s->size, gfpflags, node);
-
-	ret = kasan_kmalloc(s, ret, size, gfpflags);
-
-	return ret;
-}
-EXPORT_SYMBOL(__kmalloc_node_track_caller);
-#endif
-
 #ifdef CONFIG_SYSFS
 static int count_inuse(struct slab *slab)
 {
···
 {
 	void *p;
 	void *addr = slab_address(slab);
-	unsigned long flags;
-
-	slab_lock(slab, &flags);

 	if (!check_slab(s, slab) || !on_freelist(s, slab, NULL))
-		goto unlock;
+		return;

 	/* Now we know that a valid freelist exists */
 	__fill_map(obj_map, s, slab);
···
 		if (!check_object(s, slab, p, val))
 			break;
 	}
-unlock:
-	slab_unlock(slab, &flags);
 }

 static int validate_slab_node(struct kmem_cache *s,
···
 	depot_stack_handle_t handle;
 	unsigned long count;
 	unsigned long addr;
+	unsigned long waste;
 	long long sum_time;
 	long min_time;
 	long max_time;
···
 }

 static int add_location(struct loc_track *t, struct kmem_cache *s,
-				const struct track *track)
+				const struct track *track,
+				unsigned int orig_size)
 {
 	long start, end, pos;
 	struct location *l;
-	unsigned long caddr, chandle;
+	unsigned long caddr, chandle, cwaste;
 	unsigned long age = jiffies - track->when;
 	depot_stack_handle_t handle = 0;
+	unsigned int waste = s->object_size - orig_size;

 #ifdef CONFIG_STACKDEPOT
 	handle = READ_ONCE(track->handle);
···
 		if (pos == end)
 			break;

-		caddr = t->loc[pos].addr;
-		chandle = t->loc[pos].handle;
-		if ((track->addr == caddr) && (handle == chandle)) {
+		l = &t->loc[pos];
+		caddr = l->addr;
+		chandle = l->handle;
+		cwaste = l->waste;
+		if ((track->addr == caddr) && (handle == chandle) &&
+			(waste == cwaste)) {

-			l = &t->loc[pos];
 			l->count++;
 			if (track->when) {
 				l->sum_time += age;
···
 			end = pos;
 		else if (track->addr == caddr && handle < chandle)
 			end = pos;
+		else if (track->addr == caddr && handle == chandle &&
+				waste < cwaste)
+			end = pos;
 		else
 			start = pos;
 	}
···
 	l->min_pid = track->pid;
 	l->max_pid = track->pid;
 	l->handle = handle;
+	l->waste = waste;
 	cpumask_clear(to_cpumask(l->cpus));
 	cpumask_set_cpu(track->cpu, to_cpumask(l->cpus));
 	nodes_clear(l->nodes);
···
 		unsigned long *obj_map)
 {
 	void *addr = slab_address(slab);
+	bool is_alloc = (alloc == TRACK_ALLOC);
 	void *p;

 	__fill_map(obj_map, s, slab);

 	for_each_object(p, s, addr, slab->objects)
 		if (!test_bit(__obj_to_index(s, addr, p), obj_map))
-			add_location(t, s, get_track(s, p, alloc));
+			add_location(t, s, get_track(s, p, alloc),
+				     is_alloc ? get_orig_size(s, p) :
+						s->object_size);
 }
 #endif	/* CONFIG_DEBUG_FS */
 #endif	/* CONFIG_SLUB_DEBUG */
···
 {
 	int ret = -EINVAL;

-	if (buf[0] == '1') {
+	if (buf[0] == '1' && kmem_cache_debug(s)) {
 		ret = validate_slab_cache(s);
 		if (ret >= 0)
 			ret = length;
···
 {
 	struct slab_attribute *attribute;
 	struct kmem_cache *s;
-	int err;

 	attribute = to_slab_attr(attr);
 	s = to_slab(kobj);
···
 	if (!attribute->show)
 		return -EIO;

-	err = attribute->show(s, buf);
-
-	return err;
+	return attribute->show(s, buf);
 }

 static ssize_t slab_attr_store(struct kobject *kobj,
···
 {
 	struct slab_attribute *attribute;
 	struct kmem_cache *s;
-	int err;

 	attribute = to_slab_attr(attr);
 	s = to_slab(kobj);
···
 	if (!attribute->store)
 		return -EIO;

-	err = attribute->store(s, buf, len);
-	return err;
+	return attribute->store(s, buf, len);
 }

 static void kmem_cache_release(struct kobject *k)
···
 	return slab_kset;
 }

-#define ID_STR_LENGTH 64
+#define ID_STR_LENGTH 32

 /* Create a unique string id for a slab cache:
  *
···
 		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
-	p += sprintf(p, "%07u", s->size);
+	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);

-	BUG_ON(p > name + ID_STR_LENGTH - 1);
+	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+		kfree(name);
+		return ERR_PTR(-EINVAL);
+	}
 	return name;
 }

···
 			seq_printf(seq, "%pS", (void *)l->addr);
 		else
 			seq_puts(seq, "<not-available>");
+
+		if (l->waste)
+			seq_printf(seq, " waste=%lu/%lu",
+				l->count * l->waste, l->waste);

 		if (l->sum_time != l->min_time) {
 			seq_printf(seq, " age=%ld/%llu/%ld",