Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cma: factor out minimum alignment requirement

Patch series "mm: enforce pageblock_order < MAX_ORDER".

Having pageblock_order >= MAX_ORDER can happen in corner cases, and some
parts of the kernel are not prepared for it.

For example, Aneesh has shown [1] that such kernels can be compiled on
ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will
run into a WARN_ON_ONCE(order >= MAX_ORDER) in compaction code during
boot.

We can get pageblock_order >= MAX_ORDER when the default hugetlb size is
bigger than the maximum allocation granularity of the buddy, in which
case we are no longer talking about huge pages but instead gigantic
pages.

Having pageblock_order >= MAX_ORDER can only make alloc_contig_range()
of such gigantic pages more likely to succeed.

Reliable use of gigantic pages requires either boot-time allocation or
CMA; there is no need to overcomplicate some places in the kernel to
optimize for corner cases that are broken in other areas of the kernel.

This patch (of 2):

Let's enforce pageblock_order < MAX_ORDER and simplify.

Especially patch #1 can be regarded as a cleanup before:
[PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range
alignment. [2]

[1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com
[2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com

Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: John Garry via iommu <iommu@lists.linux-foundation.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by David Hildenbrand and committed by Linus Torvalds
e16faf26 56651377

+19 -30
-5
arch/powerpc/include/asm/fadump-internal.h
@@ -19,11 +19,6 @@
 
 #define memblock_num_regions(memblock_type)	(memblock.memblock_type.cnt)
 
-/* Alignment per CMA requirement. */
-#define FADUMP_CMA_ALIGNMENT	(PAGE_SIZE <<				\
-				 max_t(unsigned long, MAX_ORDER - 1,	\
-				 pageblock_order))
-
 /* FAD commands */
 #define FADUMP_REGISTER		1
 #define FADUMP_UNREGISTER	2
+1 -1
arch/powerpc/kernel/fadump.c
@@ -544,7 +544,7 @@
 	if (!fw_dump.nocma) {
 		fw_dump.boot_memory_size =
 			ALIGN(fw_dump.boot_memory_size,
-			      FADUMP_CMA_ALIGNMENT);
+			      CMA_MIN_ALIGNMENT_BYTES);
 	}
 #endif
 
+3 -6
drivers/of/of_reserved_mem.c
@@ -22,6 +22,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/kmemleak.h>
+#include <linux/cma.h>
 
 #include "of_private.h"
 
@@ -116,12 +117,8 @@
 	if (IS_ENABLED(CONFIG_CMA)
 	    && of_flat_dt_is_compatible(node, "shared-dma-pool")
 	    && of_get_flat_dt_prop(node, "reusable", NULL)
-	    && !nomap) {
-		unsigned long order =
-			max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
-
-		align = max(align, (phys_addr_t)PAGE_SIZE << order);
-	}
+	    && !nomap)
+		align = max_t(phys_addr_t, align, CMA_MIN_ALIGNMENT_BYTES);
 
 	prop = of_get_flat_dt_prop(node, "alloc-ranges", &len);
 	if (prop) {
+9
include/linux/cma.h
@@ -20,6 +20,15 @@
 
 #define CMA_MAX_NAME 64
 
+/*
+ * TODO: once the buddy -- especially pageblock merging and alloc_contig_range()
+ * -- can deal with only some pageblocks of a higher-order page being
+ * MIGRATE_CMA, we can use pageblock_nr_pages.
+ */
+#define CMA_MIN_ALIGNMENT_PAGES max_t(phys_addr_t, MAX_ORDER_NR_PAGES, \
+				      pageblock_nr_pages)
+#define CMA_MIN_ALIGNMENT_BYTES (PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES)
+
 struct cma;
 
 extern unsigned long totalcma_pages;
+1 -3
kernel/dma/contiguous.c
@@ -399,8 +399,6 @@
 
 static int __init rmem_cma_setup(struct reserved_mem *rmem)
 {
-	phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
-	phys_addr_t mask = align - 1;
 	unsigned long node = rmem->fdt_node;
 	bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL);
 	struct cma *cma;
@@ -416,7 +414,7 @@
 	    of_get_flat_dt_prop(node, "no-map", NULL))
 		return -EINVAL;
 
-	if ((rmem->base & mask) || (rmem->size & mask)) {
+	if (!IS_ALIGNED(rmem->base | rmem->size, CMA_MIN_ALIGNMENT_BYTES)) {
 		pr_err("Reserved memory: incorrect alignment of CMA region\n");
 		return -EINVAL;
 	}
+5 -15
mm/cma.c
@@ -168,7 +168,6 @@
 				 struct cma **res_cma)
 {
 	struct cma *cma;
-	phys_addr_t alignment;
 
 	/* Sanity checks */
 	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
@@ -179,15 +178,12 @@
 	if (!size || !memblock_is_region_reserved(base, size))
 		return -EINVAL;
 
-	/* ensure minimal alignment required by mm core */
-	alignment = PAGE_SIZE <<
-		max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
-
 	/* alignment should be aligned with order_per_bit */
-	if (!IS_ALIGNED(alignment >> PAGE_SHIFT, 1 << order_per_bit))
+	if (!IS_ALIGNED(CMA_MIN_ALIGNMENT_PAGES, 1 << order_per_bit))
 		return -EINVAL;
 
-	if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
+	/* ensure minimal alignment required by mm core */
+	if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
 		return -EINVAL;
 
 	/*
@@ -262,14 +258,8 @@
 	if (alignment && !is_power_of_2(alignment))
 		return -EINVAL;
 
-	/*
-	 * Sanitise input arguments.
-	 * Pages both ends in CMA area could be merged into adjacent unmovable
-	 * migratetype page by page allocator's buddy algorithm. In the case,
-	 * you couldn't get a contiguous memory, which is not what we want.
-	 */
-	alignment = max(alignment, (phys_addr_t)PAGE_SIZE <<
-			max_t(unsigned long, MAX_ORDER - 1, pageblock_order));
+	/* Sanitise input arguments. */
+	alignment = max_t(phys_addr_t, alignment, CMA_MIN_ALIGNMENT_BYTES);
 	if (fixed && base & (alignment - 1)) {
 		ret = -EINVAL;
 		pr_err("Region at %pa must be aligned to %pa bytes\n",