Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: Add vm_insert_pfn_prot()

The x86 vvar vma contains pages with differing cacheability
flags. x86 currently implements this by manually inserting all
the ptes using (io_)remap_pfn_range when the vma is set up.

x86 wants to move to using .fault with VM_FAULT_NOPAGE to set up
the mappings as needed. The correct API for inserting a pfn from
.fault is vm_insert_pfn(), but vm_insert_pfn() can't override the
vma's cache mode, and the HPET page in particular needs to be
uncached even though the rest of the VMA is cached.

Add vm_insert_pfn_prot() to support varying cacheability within
the same non-COW VMA in a more sane manner.
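The intended usage pattern is a .fault handler that chooses a pgprot per
page before inserting the pfn. A rough kernel-style sketch of such a
handler follows; it is illustrative only and not part of this patch, and
the helpers pfn_for_offset() and offset_is_hpet() are hypothetical
stand-ins for whatever lookup the driver actually does:

	/*
	 * Illustrative sketch (not from this patch): map most pages with
	 * the vma's default protection, but force one special pfn (e.g.
	 * the HPET page) to be uncached.
	 */
	static int example_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
	{
		unsigned long pfn = pfn_for_offset(vmf->pgoff);	/* hypothetical */
		pgprot_t prot = vma->vm_page_prot;
		int ret;

		if (offset_is_hpet(vmf->pgoff))			/* hypothetical */
			prot = pgprot_noncached(prot);

		ret = vm_insert_pfn_prot(vma, (unsigned long)vmf->virtual_address,
					 pfn, prot);
		if (ret == 0 || ret == -EBUSY)
			return VM_FAULT_NOPAGE;
		return VM_FAULT_SIGBUS;
	}

Returning VM_FAULT_NOPAGE tells the fault path that the pte was
installed directly, so no struct page is handed back; -EBUSY (a racing
fault already installed the pte) is treated the same way.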

x86 could alternatively use multiple VMAs, but that's messy,
would break CRIU, and would create unnecessary VMAs that would
waste memory.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d2938d1eb37be7a5e4f86182db646551f11e45aa.1451446564.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Authored by Andy Lutomirski, committed by Ingo Molnar
1745cbc5 f872f540

2 files changed: +25 -2
include/linux/mm.h (+2)
@@ -2080,6 +2080,8 @@
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
+int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
+			unsigned long pfn, pgprot_t pgprot);
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
mm/memory.c (+23 -2)
@@ -1564,8 +1564,29 @@
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn)
 {
+	return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vm_insert_pfn);
+
+/**
+ * vm_insert_pfn_prot - insert single pfn into user vma with specified pgprot
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pfn: source kernel pfn
+ * @pgprot: pgprot flags for the inserted page
+ *
+ * This is exactly like vm_insert_pfn, except that it allows drivers
+ * to override pgprot on a per-page basis.
+ *
+ * This only makes sense for IO mappings, and it makes no sense for
+ * cow mappings.  In general, using multiple vmas is preferable;
+ * vm_insert_pfn_prot should only be used if using multiple VMAs is
+ * impractical.
+ */
+int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
+			unsigned long pfn, pgprot_t pgprot)
+{
 	int ret;
-	pgprot_t pgprot = vma->vm_page_prot;
 	/*
 	 * Technically, architectures with pte_special can avoid all these
 	 * restrictions (same for remap_pfn_range). However we would like
@@ -1587,7 +1608,7 @@
 
 	return ret;
 }
-EXPORT_SYMBOL(vm_insert_pfn);
+EXPORT_SYMBOL(vm_insert_pfn_prot);
 
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn)
··· 1564 1564 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, 1565 1565 unsigned long pfn) 1566 1566 { 1567 + return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); 1568 + } 1569 + EXPORT_SYMBOL(vm_insert_pfn); 1570 + 1571 + /** 1572 + * vm_insert_pfn_prot - insert single pfn into user vma with specified pgprot 1573 + * @vma: user vma to map to 1574 + * @addr: target user address of this page 1575 + * @pfn: source kernel pfn 1576 + * @pgprot: pgprot flags for the inserted page 1577 + * 1578 + * This is exactly like vm_insert_pfn, except that it allows drivers to 1579 + * to override pgprot on a per-page basis. 1580 + * 1581 + * This only makes sense for IO mappings, and it makes no sense for 1582 + * cow mappings. In general, using multiple vmas is preferable; 1583 + * vm_insert_pfn_prot should only be used if using multiple VMAs is 1584 + * impractical. 1585 + */ 1586 + int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, 1587 + unsigned long pfn, pgprot_t pgprot) 1588 + { 1567 1589 int ret; 1568 - pgprot_t pgprot = vma->vm_page_prot; 1569 1590 /* 1570 1591 * Technically, architectures with pte_special can avoid all these 1571 1592 * restrictions (same for remap_pfn_range). However we would like ··· 1608 1587 1609 1588 return ret; 1610 1589 } 1611 - EXPORT_SYMBOL(vm_insert_pfn); 1590 + EXPORT_SYMBOL(vm_insert_pfn_prot); 1612 1591 1613 1592 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, 1614 1593 unsigned long pfn)