Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: remove flush_kernel_dcache_page

flush_kernel_dcache_page is a rather confusing interface that implements a
subset of flush_dcache_page, since it is unable to properly handle page
cache mapped pages.

The only callers left are in the exec code; all other previous callers
were incorrect, since they could have been dealing with page cache pages.
Replace the calls to flush_kernel_dcache_page with calls to
flush_dcache_page, which on all architectures either does exactly the
same thing or additionally contains one or more of the following:

1) an optimization to defer the cache flush for page cache pages not
   mapped into userspace
2) additional flushing for mapped page cache pages if cache aliases
   are possible

Link: https://lkml.kernel.org/r/20210712060928.4161649-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Christoph Hellwig, committed by Linus Torvalds (f358afc5, 0e84f5db)

+46 -150
+32 -44
Documentation/core-api/cachetlb.rst
···
 
 ``void flush_dcache_page(struct page *page)``
 
-	Any time the kernel writes to a page cache page, _OR_
-	the kernel is about to read from a page cache page and
-	user space shared/writable mappings of this page potentially
-	exist, this routine is called.
+	This routines must be called when:
+
+	  a) the kernel did write to a page that is in the page cache page
+	     and / or in high memory
+	  b) the kernel is about to read from a page cache page and user space
+	     shared/writable mappings of this page potentially exist.  Note
+	     that {get,pin}_user_pages{_fast} already call flush_dcache_page
+	     on any page found in the user address space and thus driver
+	     code rarely needs to take this into account.
 
 .. note::
 
···
	handling vfs symlinks in the page cache need not call
	this interface at all.
 
-	The phrase "kernel writes to a page cache page" means,
-	specifically, that the kernel executes store instructions
-	that dirty data in that page at the page->virtual mapping
-	of that page.  It is important to flush here to handle
-	D-cache aliasing, to make sure these kernel stores are
-	visible to user space mappings of that page.
+	The phrase "kernel writes to a page cache page" means, specifically,
+	that the kernel executes store instructions that dirty data in that
+	page at the page->virtual mapping of that page.  It is important to
+	flush here to handle D-cache aliasing, to make sure these kernel stores
+	are visible to user space mappings of that page.
 
-	The corollary case is just as important, if there are users
-	which have shared+writable mappings of this file, we must make
-	sure that kernel reads of these pages will see the most recent
-	stores done by the user.
+	The corollary case is just as important, if there are users which have
+	shared+writable mappings of this file, we must make sure that kernel
+	reads of these pages will see the most recent stores done by the user.
 
-	If D-cache aliasing is not an issue, this routine may
-	simply be defined as a nop on that architecture.
+	If D-cache aliasing is not an issue, this routine may simply be defined
+	as a nop on that architecture.
 
-	There is a bit set aside in page->flags (PG_arch_1) as
-	"architecture private".  The kernel guarantees that,
-	for pagecache pages, it will clear this bit when such
-	a page first enters the pagecache.
+	There is a bit set aside in page->flags (PG_arch_1) as "architecture
+	private".  The kernel guarantees that, for pagecache pages, it will
+	clear this bit when such a page first enters the pagecache.
 
-	This allows these interfaces to be implemented much more
-	efficiently.  It allows one to "defer" (perhaps indefinitely)
-	the actual flush if there are currently no user processes
-	mapping this page.  See sparc64's flush_dcache_page and
-	update_mmu_cache implementations for an example of how to go
-	about doing this.
+	This allows these interfaces to be implemented much more efficiently.
+	It allows one to "defer" (perhaps indefinitely) the actual flush if
+	there are currently no user processes mapping this page.  See sparc64's
+	flush_dcache_page and update_mmu_cache implementations for an example
+	of how to go about doing this.
 
-	The idea is, first at flush_dcache_page() time, if
-	page->mapping->i_mmap is an empty tree, just mark the architecture
-	private page flag bit.  Later, in update_mmu_cache(), a check is
-	made of this flag bit, and if set the flush is done and the flag
-	bit is cleared.
+	The idea is, first at flush_dcache_page() time, if page_file_mapping()
+	returns a mapping, and mapping_mapped on that mapping returns %false,
+	just mark the architecture private page flag bit.  Later, in
+	update_mmu_cache(), a check is made of this flag bit, and if set the
+	flush is done and the flag bit is cleared.
 
 .. important::
 
···
	implementation is a nop (and should remain so for all coherent
	architectures).  For incoherent architectures, it should flush
	the cache of the page at vmaddr.
-
-  ``void flush_kernel_dcache_page(struct page *page)``
-
-	When the kernel needs to modify a user page is has obtained
-	with kmap, it calls this function after all modifications are
-	complete (but before kunmapping it) to bring the underlying
-	page up to date.  It is assumed here that the user has no
-	incoherent cached copies (i.e. the original page was obtained
-	from a mechanism like get_user_pages()).  The default
-	implementation is a nop and should remain so on all coherent
-	architectures.  On incoherent architectures, this should flush
-	the kernel cache for page (using page_address(page)).
-
 
 ``void flush_icache_range(unsigned long start, unsigned long end)``
 
-9
Documentation/translations/zh_CN/core-api/cachetlb.rst
···
 用。默认的实现是nop(对于所有相干的架构应该保持这样)。对于不一致性
 的架构,它应该刷新vmaddr处的页面缓存。
 
-``void flush_kernel_dcache_page(struct page *page)``
-
-	当内核需要修改一个用kmap获得的用户页时,它会在所有修改完成后(但在
-	kunmapping之前)调用这个函数,以使底层页面达到最新状态。这里假定用
-	户没有不一致性的缓存副本(即原始页面是从类似get_user_pages()的机制
-	中获得的)。默认的实现是一个nop,在所有相干的架构上都应该如此。在不
-	一致性的架构上,这应该刷新内核缓存中的页面(使用page_address(page))。
-
-
 ``void flush_icache_range(unsigned long start, unsigned long end)``
 
 当内核存储到它将执行的地址中时(例如在加载模块时),这个函数被调用。
+1 -3
arch/arm/include/asm/cacheflush.h
···
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	if ((cache_is_vivt() || cache_is_vipt_aliasing()))
···
 	if (PageAnon(page))
 		__flush_anon_page(vma, page, vmaddr);
 }
-
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-extern void flush_kernel_dcache_page(struct page *);
 
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
-33
arch/arm/mm/flush.c
···
 EXPORT_SYMBOL(flush_dcache_page);
 
 /*
- * Ensure cache coherency for the kernel mapping of this page. We can
- * assume that the page is pinned via kmap.
- *
- * If the page only exists in the page cache and there are no user
- * space mappings, this is a no-op since the page was already marked
- * dirty at creation.  Otherwise, we need to flush the dirty kernel
- * cache lines directly.
- */
-void flush_kernel_dcache_page(struct page *page)
-{
-	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
-		struct address_space *mapping;
-
-		mapping = page_mapping_file(page);
-
-		if (!mapping || mapping_mapped(mapping)) {
-			void *addr;
-
-			addr = page_address(page);
-			/*
-			 * kmap_atomic() doesn't set the page virtual
-			 * address for highmem pages, and
-			 * kunmap_atomic() takes care of cache
-			 * flushing already.
-			 */
-			if (!IS_ENABLED(CONFIG_HIGHMEM) || addr)
-				__cpuc_flush_dcache_area(addr, PAGE_SIZE);
-		}
-	}
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
-/*
  * Flush an anonymous page so that users of get_user_pages()
  * can safely access the data. The expected sequence is:
  *
-6
arch/arm/mm/nommu.c
···
 }
 EXPORT_SYMBOL(flush_dcache_page);
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 		       unsigned long uaddr, void *dst, const void *src,
 		       unsigned long len)
-11
arch/csky/abiv1/cacheflush.c
···
 	}
 }
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	struct address_space *mapping;
-
-	mapping = page_mapping_file(page);
-
-	if (!mapping || mapping_mapped(mapping))
-		dcache_wbinv_all();
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
 		       unsigned long end)
 {
+1 -3
arch/csky/abiv1/inc/abi/cacheflush.h
···
 #define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
 #define flush_cache_dup_mm(mm)			cache_wbinv_all()
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-extern void flush_kernel_dcache_page(struct page *);
-
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	dcache_wbinv_all();
+1 -7
arch/mips/include/asm/cacheflush.h
···
 	kunmap_coherent();
 }
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
-	flush_dcache_page(page);
-}
-
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 /*
  * For now flush_kernel_vmap_range and invalidate_kernel_vmap_range both do a
  * cache writeback and invalidate operation.
+1 -2
arch/nds32/include/asm/cacheflush.h
···
 void flush_anon_page(struct vm_area_struct *vma,
 		     struct page *page, unsigned long vaddr);
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-void flush_kernel_dcache_page(struct page *page);
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 void flush_kernel_vmap_range(void *addr, int size);
 void invalidate_kernel_vmap_range(void *addr, int size);
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&(mapping)->i_pages)
-9
arch/nds32/mm/cacheflush.c
···
 	local_irq_restore(flags);
 }
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	unsigned long flags;
-	local_irq_save(flags);
-	cpu_dcache_wbinval_page((unsigned long)page_address(page));
-	local_irq_restore(flags);
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void flush_kernel_vmap_range(void *addr, int size)
 {
 	unsigned long flags;
+2 -6
arch/parisc/include/asm/cacheflush.h
···
 void flush_cache_all(void);
 void flush_cache_mm(struct mm_struct *mm);
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
 void flush_kernel_dcache_page_addr(void *addr);
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	flush_kernel_dcache_page_addr(page_address(page));
-}
 
 #define flush_kernel_dcache_range(start,size) \
 	flush_kernel_dcache_range_asm((start), (start)+(size));
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 void flush_kernel_vmap_range(void *vaddr, int size);
 void invalidate_kernel_vmap_range(void *vaddr, int size);
···
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
 
 #define flush_icache_page(vma,page)	do {		\
-	flush_kernel_dcache_page(page);			\
+	flush_kernel_dcache_page_addr(page_address(page)); \
 	flush_kernel_icache_page(page_address(page));	\
 } while (0)
 
+1 -2
arch/parisc/kernel/cache.c
···
 		return;
 	}
 
-	flush_kernel_dcache_page(page);
+	flush_kernel_dcache_page_addr(page_address(page));
 
 	if (!mapping)
 		return;
···
 
 /* Defined in arch/parisc/kernel/pacache.S */
 EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
-EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
+2 -6
arch/sh/include/asm/cacheflush.h
···
 	if (boot_cpu_data.dcache.n_aliases && PageAnon(page))
 		__flush_anon_page(page, vmaddr);
 }
+
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	__flush_wback_region(addr, size);
···
 static inline void invalidate_kernel_vmap_range(void *addr, int size)
 {
 	__flush_invalidate_region(addr, size);
-}
-
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	flush_dcache_page(page);
 }
 
 extern void copy_to_user_page(struct vm_area_struct *vma,
+1 -1
block/blk-map.c
···
 
 static void bio_invalidate_vmalloc_pages(struct bio *bio)
 {
-#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
+#ifdef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
 	if (bio->bi_private && !op_is_write(bio_op(bio))) {
 		unsigned long i, len = 0;
 
+3 -3
fs/exec.c
···
 	}
 
 	if (kmapped_page) {
-		flush_kernel_dcache_page(kmapped_page);
+		flush_dcache_page(kmapped_page);
 		kunmap(kmapped_page);
 		put_arg_page(kmapped_page);
 	}
···
 	ret = 0;
 out:
 	if (kmapped_page) {
-		flush_kernel_dcache_page(kmapped_page);
+		flush_dcache_page(kmapped_page);
 		kunmap(kmapped_page);
 		put_arg_page(kmapped_page);
 	}
···
 		kaddr = kmap_atomic(page);
 		flush_arg_page(bprm, pos & PAGE_MASK, page);
 		memcpy(kaddr + offset_in_page(pos), arg, bytes_to_copy);
-		flush_kernel_dcache_page(page);
+		flush_dcache_page(page);
 		kunmap_atomic(kaddr);
 		put_arg_page(page);
 	}
+1 -4
include/linux/highmem.h
···
 }
 #endif
 
-#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-}
+#ifndef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
 static inline void flush_kernel_vmap_range(void *vaddr, int size)
 {
 }
-1
tools/testing/scatterlist/linux/mm.h
···
 #define kmemleak_free(a)
 
 #define PageSlab(p) (0)
-#define flush_kernel_dcache_page(p)
 
 #define MAX_ERRNO 4095
 