
mm: add ability to take further action in vm_area_desc

Some drivers/filesystems need to perform additional tasks after the VMA is
set up. This is typically in the form of pre-population.

The forms of pre-population most likely to be performed are a PFN remap
or the insertion of normal folios and PFNs into a mixed map.

We start by implementing the PFN remap functionality, ensuring that we
perform the appropriate actions at the appropriate time - that is, setting
flags at the point of .mmap_prepare, and performing the actual remap at the
point at which the VMA is fully established.
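The two phases above can be modelled in plain C. This is a hypothetical userspace sketch (the struct and helper names loosely mirror the patch, and the actual remap_pfn_range() call is replaced by a stand-in), showing that .mmap_prepare merely records the requested action while the core executes it once the VMA exists:

```c
#include <assert.h>

/* Userspace model of the two-phase action; names mirror the patch. */
enum mmap_action_type { MMAP_NOTHING, MMAP_REMAP_PFN };

struct mmap_action {
	enum mmap_action_type type;
	struct {
		unsigned long start;
		unsigned long start_pfn;
		unsigned long size;
	} remap;
};

struct vm_area_desc {
	unsigned long start, end;	/* [start, end) of the mapping */
	struct mmap_action action;
};

/* Phase 1: the driver's .mmap_prepare only *requests* a remap. */
static void mmap_action_remap_full(struct vm_area_desc *desc,
				   unsigned long start_pfn)
{
	desc->action.type = MMAP_REMAP_PFN;
	desc->action.remap.start = desc->start;
	desc->action.remap.start_pfn = start_pfn;
	desc->action.remap.size = desc->end - desc->start;
}

/* Phase 2: the core executes the request once the "VMA" exists. */
static int mmap_action_complete(const struct mmap_action *action,
				unsigned long *remapped_pfn)
{
	switch (action->type) {
	case MMAP_NOTHING:
		return 0;
	case MMAP_REMAP_PFN:
		/* Stand-in for remap_pfn_range_complete(). */
		*remapped_pfn = action->remap.start_pfn;
		return 0;
	}
	return -1;
}
```

The point of the split is that the driver never touches the VMA directly: it only fills in the descriptor, and the core decides when (and whether) to act on it.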

This prevents the driver from doing anything too crazy with a VMA at any
stage, and we retain complete control over how the mm functionality is
applied.

Unfortunately, callers often still require some kind of custom action, so
we add optional success/error hooks to allow the caller to do something
after the action has succeeded or failed.

This is done at the point when the VMA has already been established, so
the harm that can be done is limited.

The error hook can be used to filter errors if necessary.
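A hedged userspace sketch of that completion tail (mmap_action_finish() here is a model; the actual unmapping of the VMA is reduced to a comment), illustrating that the error hook may translate an error but must never clear it:

```c
#include <assert.h>

/* Model of the completion tail: run hooks, never let a hook clear an error. */
struct mmap_action {
	int (*success_hook)(void);	/* invoked only on success */
	int (*error_hook)(int err);	/* may translate, must not clear, err */
};

static int mmap_action_finish(const struct mmap_action *action, int err)
{
	if (err) {
		/* Real code would unmap the freshly created VMA here. */
		if (action->error_hook) {
			err = action->error_hook(err);
			assert(err != 0);	/* clearing the error is invalid */
		}
		return err;
	}
	return action->success_hook ? action->success_hook() : 0;
}

/* Example hook: translate any low-level failure to -22 (i.e. -EINVAL). */
static int filter_to_einval(int err)
{
	(void)err;
	return -22;
}
```

So a caller can narrow the range of errors its users see, but the operation's failure itself is not negotiable: the VMA is already gone by the time the hook runs.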

There may be cases in which the caller absolutely must hold the file rmap
lock until the operation is entirely complete. It is an edge case, but
certainly the hugetlbfs mmap hook requires it.

To accommodate this, we add the hide_from_rmap_until_complete flag to the
mmap_action type. In this case, if a new VMA is allocated, we will hold the
file rmap lock until the operation is entirely completed (including any
success/error hooks).

Note that we do not need to update __compat_vma_mmap() to accommodate this
flag, as this function will be invoked from an .mmap handler whose VMA is
not yet visible, so we implicitly hide it from the rmap.

If any error arises on these final actions, we simply unmap the VMA
altogether.

Also update the stacked filesystem compatibility layer to utilise the
action behaviour, and update the VMA tests accordingly.

While we're here, rename __compat_vma_mmap_prepare() to __compat_vma_mmap(),
as we now perform the actions requested by the .mmap_prepare hook in
addition to invoking the hook itself.

Link: https://lkml.kernel.org/r/2601199a7b2eaeadfcd8ab6e199c6d1706650c94.1760959442.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lorenzo Stoakes, committed by Andrew Morton
ac0a3fc9 db91b783

+441 -49
+3 -3
include/linux/fs.h
···
 	return true;
 }

-int __compat_vma_mmap_prepare(const struct file_operations *f_op,
+int __compat_vma_mmap(const struct file_operations *f_op,
 		struct file *file, struct vm_area_struct *vma);
-int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
+int compat_vma_mmap(struct file *file, struct vm_area_struct *vma);

 static inline int vfs_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	if (file->f_op->mmap_prepare)
-		return compat_vma_mmap_prepare(file, vma);
+		return compat_vma_mmap(file, vma);

 	return file->f_op->mmap(file, vma);
 }
+74
include/linux/mm.h
···
 	return vma_desc_size(desc) >> PAGE_SHIFT;
 }

+/**
+ * mmap_action_remap - helper for mmap_prepare hook to specify that a pure PFN
+ * remap is required.
+ * @desc: The VMA descriptor for the VMA requiring remap.
+ * @start: The virtual address to start the remap from, must be within the VMA.
+ * @start_pfn: The first PFN in the range to remap.
+ * @size: The size of the range to remap, in bytes, at most spanning to the end
+ * of the VMA.
+ */
+static inline void mmap_action_remap(struct vm_area_desc *desc,
+				     unsigned long start,
+				     unsigned long start_pfn,
+				     unsigned long size)
+{
+	struct mmap_action *action = &desc->action;
+
+	/* [start, start + size) must be within the VMA. */
+	WARN_ON_ONCE(start < desc->start || start >= desc->end);
+	WARN_ON_ONCE(start + size > desc->end);
+
+	action->type = MMAP_REMAP_PFN;
+	action->remap.start = start;
+	action->remap.start_pfn = start_pfn;
+	action->remap.size = size;
+	action->remap.pgprot = desc->page_prot;
+}
+
+/**
+ * mmap_action_remap_full - helper for mmap_prepare hook to specify that the
+ * entirety of a VMA should be PFN remapped.
+ * @desc: The VMA descriptor for the VMA requiring remap.
+ * @start_pfn: The first PFN in the range to remap.
+ */
+static inline void mmap_action_remap_full(struct vm_area_desc *desc,
+					  unsigned long start_pfn)
+{
+	mmap_action_remap(desc, desc->start, start_pfn, vma_desc_size(desc));
+}
+
+/**
+ * mmap_action_ioremap - helper for mmap_prepare hook to specify that a pure PFN
+ * I/O remap is required.
+ * @desc: The VMA descriptor for the VMA requiring remap.
+ * @start: The virtual address to start the remap from, must be within the VMA.
+ * @start_pfn: The first PFN in the range to remap.
+ * @size: The size of the range to remap, in bytes, at most spanning to the end
+ * of the VMA.
+ */
+static inline void mmap_action_ioremap(struct vm_area_desc *desc,
+				       unsigned long start,
+				       unsigned long start_pfn,
+				       unsigned long size)
+{
+	mmap_action_remap(desc, start, start_pfn, size);
+	desc->action.type = MMAP_IO_REMAP_PFN;
+}
+
+/**
+ * mmap_action_ioremap_full - helper for mmap_prepare hook to specify that the
+ * entirety of a VMA should be PFN I/O remapped.
+ * @desc: The VMA descriptor for the VMA requiring remap.
+ * @start_pfn: The first PFN in the range to remap.
+ */
+static inline void mmap_action_ioremap_full(struct vm_area_desc *desc,
+					    unsigned long start_pfn)
+{
+	mmap_action_ioremap(desc, desc->start, start_pfn, vma_desc_size(desc));
+}
+
+void mmap_action_prepare(struct mmap_action *action,
+			 struct vm_area_desc *desc);
+int mmap_action_complete(struct mmap_action *action,
+			 struct vm_area_struct *vma);
+
 /* Look up the first VMA which exactly match the interval vm_start ... vm_end */
 static inline struct vm_area_struct *find_exact_vma(struct mm_struct *mm,
 		unsigned long vm_start, unsigned long vm_end)
+53
include/linux/mm_types.h
···
 };
 #endif

+/* What action should be taken after an .mmap_prepare call is complete? */
+enum mmap_action_type {
+	MMAP_NOTHING,		/* Mapping is complete, no further action. */
+	MMAP_REMAP_PFN,		/* Remap PFN range. */
+	MMAP_IO_REMAP_PFN,	/* I/O remap PFN range. */
+};
+
+/*
+ * Describes an action an mmap_prepare hook can instruct to be taken to complete
+ * the mapping of a VMA. Specified in vm_area_desc.
+ */
+struct mmap_action {
+	union {
+		/* Remap range. */
+		struct {
+			unsigned long start;
+			unsigned long start_pfn;
+			unsigned long size;
+			pgprot_t pgprot;
+		} remap;
+	};
+	enum mmap_action_type type;
+
+	/*
+	 * If specified, this hook is invoked after the selected action has been
+	 * successfully completed. Note that the VMA write lock is still held.
+	 *
+	 * The absolute minimum ought to be done here.
+	 *
+	 * Returns 0 on success, or an error code.
+	 */
+	int (*success_hook)(const struct vm_area_struct *vma);
+
+	/*
+	 * If specified, this hook is invoked when an error occurred when
+	 * attempting the selected action.
+	 *
+	 * The hook can return an error code in order to filter the error, but
+	 * it is not valid to clear the error here.
+	 */
+	int (*error_hook)(int err);
+
+	/*
+	 * This should be set in rare instances where the operation requires
+	 * that the rmap not be able to access the VMA until it is
+	 * completely set up.
+	 */
+	bool hide_from_rmap_until_complete:1;
+};
+
 /*
  * Describes a VMA that is about to be mmap()'ed. Drivers may choose to
  * manipulate mutable fields which will cause those fields to be updated in the
···
 	/* Write-only fields. */
 	const struct vm_operations_struct *vm_ops;
 	void *private_data;
+
+	/* Take further action? */
+	struct mmap_action action;
 };

 /*
+135 -11
mm/util.c
···
 #endif

 /**
- * __compat_vma_mmap_prepare() - See description for compat_vma_mmap_prepare()
+ * __compat_vma_mmap() - See description for compat_vma_mmap()
  * for details. This is the same operation, only with a specific file operations
  * struct which may or may not be the same as vma->vm_file->f_op.
  * @f_op: The file operations whose .mmap_prepare() hook is specified.
···
  * @vma: The VMA to apply the .mmap_prepare() hook to.
  * Returns: 0 on success or error.
  */
-int __compat_vma_mmap_prepare(const struct file_operations *f_op,
+int __compat_vma_mmap(const struct file_operations *f_op,
 		struct file *file, struct vm_area_struct *vma)
 {
 	struct vm_area_desc desc = {
···
 		.vm_file = vma->vm_file,
 		.vm_flags = vma->vm_flags,
 		.page_prot = vma->vm_page_prot,
+
+		.action.type = MMAP_NOTHING, /* Default */
 	};
 	int err;

 	err = f_op->mmap_prepare(&desc);
 	if (err)
 		return err;
-	set_vma_from_desc(vma, &desc);

-	return 0;
+	mmap_action_prepare(&desc.action, &desc);
+	set_vma_from_desc(vma, &desc);
+	return mmap_action_complete(&desc.action, vma);
 }
-EXPORT_SYMBOL(__compat_vma_mmap_prepare);
+EXPORT_SYMBOL(__compat_vma_mmap);

 /**
- * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
- * existing VMA.
+ * compat_vma_mmap() - Apply the file's .mmap_prepare() hook to an
+ * existing VMA and execute any requested actions.
  * @file: The file which possesses an f_op->mmap_prepare() hook.
  * @vma: The VMA to apply the .mmap_prepare() hook to.
  *
···
  * .mmap_prepare() hook, as we are in a different context when we invoke the
  * .mmap() hook, already having a VMA to deal with.
  *
- * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
+ * compat_vma_mmap() is a compatibility function that takes VMA state,
  * establishes a struct vm_area_desc descriptor, passes to the underlying
  * .mmap_prepare() hook and applies any changes performed by it.
···
  *
  * Returns: 0 on success or error.
  */
-int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
+int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	return __compat_vma_mmap_prepare(file->f_op, file, vma);
+	return __compat_vma_mmap(file->f_op, file, vma);
 }
-EXPORT_SYMBOL(compat_vma_mmap_prepare);
+EXPORT_SYMBOL(compat_vma_mmap);

 static void set_ps_flags(struct page_snapshot *ps, const struct folio *folio,
 			 const struct page *page)
···
 		ps->idx = 0;
 	}
 }
+
+static int mmap_action_finish(struct mmap_action *action,
+			      const struct vm_area_struct *vma, int err)
+{
+	/*
+	 * If an error occurs, unmap the VMA altogether and return an error. We
+	 * only clear the newly allocated VMA, since this function is only
+	 * invoked if we do NOT merge, so we only clean up the VMA we created.
+	 */
+	if (err) {
+		const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+		do_munmap(current->mm, vma->vm_start, len, NULL);
+
+		if (action->error_hook) {
+			/* We may want to filter the error. */
+			err = action->error_hook(err);
+
+			/* The caller should not clear the error. */
+			VM_WARN_ON_ONCE(!err);
+		}
+		return err;
+	}
+
+	if (action->success_hook)
+		return action->success_hook(vma);
+
+	return 0;
+}
+
+#ifdef CONFIG_MMU
+/**
+ * mmap_action_prepare - Perform preparatory setup for a VMA descriptor
+ * action which needs to be performed.
+ * @action: The action to perform.
+ * @desc: The VMA descriptor to prepare for @action.
+ */
+void mmap_action_prepare(struct mmap_action *action,
+			 struct vm_area_desc *desc)
+{
+	switch (action->type) {
+	case MMAP_NOTHING:
+		break;
+	case MMAP_REMAP_PFN:
+		remap_pfn_range_prepare(desc, action->remap.start_pfn);
+		break;
+	case MMAP_IO_REMAP_PFN:
+		io_remap_pfn_range_prepare(desc, action->remap.start_pfn,
+					   action->remap.size);
+		break;
+	}
+}
+EXPORT_SYMBOL(mmap_action_prepare);
+
+/**
+ * mmap_action_complete - Execute VMA descriptor action.
+ * @action: The action to perform.
+ * @vma: The VMA to perform the action upon.
+ *
+ * Similar to mmap_action_prepare().
+ *
+ * Return: 0 on success, or error, at which point the VMA will be unmapped.
+ */
+int mmap_action_complete(struct mmap_action *action,
+			 struct vm_area_struct *vma)
+{
+	int err = 0;
+
+	switch (action->type) {
+	case MMAP_NOTHING:
+		break;
+	case MMAP_REMAP_PFN:
+		err = remap_pfn_range_complete(vma, action->remap.start,
+				action->remap.start_pfn, action->remap.size,
+				action->remap.pgprot);
+		break;
+	case MMAP_IO_REMAP_PFN:
+		err = io_remap_pfn_range_complete(vma, action->remap.start,
+				action->remap.start_pfn, action->remap.size,
+				action->remap.pgprot);
+		break;
+	}
+
+	return mmap_action_finish(action, vma, err);
+}
+EXPORT_SYMBOL(mmap_action_complete);
+#else
+void mmap_action_prepare(struct mmap_action *action,
+			 struct vm_area_desc *desc)
+{
+	switch (action->type) {
+	case MMAP_NOTHING:
+		break;
+	case MMAP_REMAP_PFN:
+	case MMAP_IO_REMAP_PFN:
+		WARN_ON_ONCE(1); /* nommu cannot handle these. */
+		break;
+	}
+}
+EXPORT_SYMBOL(mmap_action_prepare);
+
+int mmap_action_complete(struct mmap_action *action,
+			 struct vm_area_struct *vma)
+{
+	int err = 0;
+
+	switch (action->type) {
+	case MMAP_NOTHING:
+		break;
+	case MMAP_REMAP_PFN:
+	case MMAP_IO_REMAP_PFN:
+		WARN_ON_ONCE(1); /* nommu cannot handle this. */
+
+		err = -EINVAL;
+		break;
+	}
+
+	return mmap_action_finish(action, vma, err);
+}
+EXPORT_SYMBOL(mmap_action_complete);
+#endif

 #ifdef CONFIG_MMU
 /**
+85 -28
mm/vma.c
···
 	struct maple_tree mt_detach;

 	/* Determine if we can check KSM flags early in mmap() logic. */
-	bool check_ksm_early;
+	bool check_ksm_early:1;
+	/* If we map a new VMA, hold the file rmap lock on mapping. */
+	bool hold_file_rmap_lock:1;
 };

 #define MMAP_STATE(name, mm_, vmi_, addr_, len_, pgoff_, vm_flags_, file_) \
···
 	unlink_file_vma_batch_process(vb);
 }

-static void vma_link_file(struct vm_area_struct *vma)
+static void vma_link_file(struct vm_area_struct *vma, bool hold_rmap_lock)
 {
 	struct file *file = vma->vm_file;
 	struct address_space *mapping;
···
 		mapping = file->f_mapping;
 		i_mmap_lock_write(mapping);
 		__vma_link_file(vma, mapping);
-		i_mmap_unlock_write(mapping);
+		if (!hold_rmap_lock)
+			i_mmap_unlock_write(mapping);
 	}
 }
···
 	vma_start_write(vma);
 	vma_iter_store_new(&vmi, vma);
-	vma_link_file(vma);
+	vma_link_file(vma, /* hold_rmap_lock= */false);
 	mm->map_count++;
 	validate_mm(mm);
 	return 0;
···
 	map->vm_flags = ksm_vma_flags(map->mm, map->file, map->vm_flags);
 }

+static void set_desc_from_map(struct vm_area_desc *desc,
+			      const struct mmap_state *map)
+{
+	desc->start = map->addr;
+	desc->end = map->end;
+
+	desc->pgoff = map->pgoff;
+	desc->vm_file = map->file;
+	desc->vm_flags = map->vm_flags;
+	desc->page_prot = map->page_prot;
+}
+
 /*
  * __mmap_setup() - Prepare to gather any overlapping VMAs that need to be
  * unmapped once the map operation is completed, check limits, account mapping
  * and clean up any pre-existing VMAs.
  *
+ * As a result it sets up the @map and @desc objects.
+ *
  * @map: Mapping state.
+ * @desc: VMA descriptor.
  * @uf: Userfaultfd context list.
  *
  * Returns: 0 on success, error code otherwise.
  */
-static int __mmap_setup(struct mmap_state *map, struct list_head *uf)
+static int __mmap_setup(struct mmap_state *map, struct vm_area_desc *desc,
+			struct list_head *uf)
 {
 	int error;
 	struct vma_iterator *vmi = map->vmi;
···
 	 */
 	vms_clean_up_area(vms, &map->mas_detach);

+	set_desc_from_map(desc, map);
 	return 0;
 }
···
 	vma_start_write(vma);
 	vma_iter_store_new(vmi, vma);
 	map->mm->map_count++;
-	vma_link_file(vma);
+	vma_link_file(vma, map->hold_file_rmap_lock);

 	/*
 	 * vma_merge_new_range() calls khugepaged_enter_vma() too, the below
···
 	vma_set_page_prot(vma);
 }

+static void call_action_prepare(struct mmap_state *map,
+				struct vm_area_desc *desc)
+{
+	struct mmap_action *action = &desc->action;
+
+	mmap_action_prepare(action, desc);
+
+	if (action->hide_from_rmap_until_complete)
+		map->hold_file_rmap_lock = true;
+}
+
 /*
  * Invoke the f_op->mmap_prepare() callback for a file-backed mapping that
  * specifies it.
···
  *
  * Returns 0 on success, or an error code otherwise.
  */
-static int call_mmap_prepare(struct mmap_state *map)
+static int call_mmap_prepare(struct mmap_state *map,
+			     struct vm_area_desc *desc)
 {
 	int err;
-	struct vm_area_desc desc = {
-		.mm = map->mm,
-		.file = map->file,
-		.start = map->addr,
-		.end = map->end,
-
-		.pgoff = map->pgoff,
-		.vm_file = map->file,
-		.vm_flags = map->vm_flags,
-		.page_prot = map->page_prot,
-	};

 	/* Invoke the hook. */
-	err = vfs_mmap_prepare(map->file, &desc);
+	err = vfs_mmap_prepare(map->file, desc);
 	if (err)
 		return err;

+	call_action_prepare(map, desc);
+
 	/* Update fields permitted to be changed. */
-	map->pgoff = desc.pgoff;
-	map->file = desc.vm_file;
-	map->vm_flags = desc.vm_flags;
-	map->page_prot = desc.page_prot;
+	map->pgoff = desc->pgoff;
+	map->file = desc->vm_file;
+	map->vm_flags = desc->vm_flags;
+	map->page_prot = desc->page_prot;
 	/* User-defined fields. */
-	map->vm_ops = desc.vm_ops;
-	map->vm_private_data = desc.private_data;
+	map->vm_ops = desc->vm_ops;
+	map->vm_private_data = desc->private_data;

 	return 0;
 }
···
 	return false;
 }

+static int call_action_complete(struct mmap_state *map,
+				struct vm_area_desc *desc,
+				struct vm_area_struct *vma)
+{
+	struct mmap_action *action = &desc->action;
+	int ret;
+
+	ret = mmap_action_complete(action, vma);
+
+	/* If we held the file rmap lock, we need to release it. */
+	if (map->hold_file_rmap_lock) {
+		struct file *file = vma->vm_file;
+
+		i_mmap_unlock_write(file->f_mapping);
+	}
+	return ret;
+}
+
 static unsigned long __mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
 		struct list_head *uf)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
-	int error;
 	bool have_mmap_prepare = file && file->f_op->mmap_prepare;
 	VMA_ITERATOR(vmi, mm, addr);
 	MMAP_STATE(map, mm, &vmi, addr, len, pgoff, vm_flags, file);
+	struct vm_area_desc desc = {
+		.mm = mm,
+		.file = file,
+		.action = {
+			.type = MMAP_NOTHING, /* Default to no further action. */
+		},
+	};
+	bool allocated_new = false;
+	int error;

 	map.check_ksm_early = can_set_ksm_flags_early(&map);

-	error = __mmap_setup(&map, uf);
+	error = __mmap_setup(&map, &desc, uf);
 	if (!error && have_mmap_prepare)
-		error = call_mmap_prepare(&map);
+		error = call_mmap_prepare(&map, &desc);
 	if (error)
 		goto abort_munmap;
···
 		error = __mmap_new_vma(&map, &vma);
 		if (error)
 			goto unacct_error;
+		allocated_new = true;
 	}

 	if (have_mmap_prepare)
 		set_vma_user_defined_fields(vma, &map);

 	__mmap_complete(&map, vma);
+
+	if (have_mmap_prepare && allocated_new) {
+		error = call_action_complete(&map, &desc, vma);
+
+		if (error)
+			return error;
+	}

 	return addr;
+91 -7
tools/testing/vma/vma_internal.h
···
 struct vm_area_struct;

+/* What action should be taken after an .mmap_prepare call is complete? */
+enum mmap_action_type {
+	MMAP_NOTHING,		/* Mapping is complete, no further action. */
+	MMAP_REMAP_PFN,		/* Remap PFN range. */
+	MMAP_IO_REMAP_PFN,	/* I/O remap PFN range. */
+};
+
+/*
+ * Describes an action an mmap_prepare hook can instruct to be taken to complete
+ * the mapping of a VMA. Specified in vm_area_desc.
+ */
+struct mmap_action {
+	union {
+		/* Remap range. */
+		struct {
+			unsigned long start;
+			unsigned long start_pfn;
+			unsigned long size;
+			pgprot_t pgprot;
+		} remap;
+	};
+	enum mmap_action_type type;
+
+	/*
+	 * If specified, this hook is invoked after the selected action has been
+	 * successfully completed. Note that the VMA write lock is still held.
+	 *
+	 * The absolute minimum ought to be done here.
+	 *
+	 * Returns 0 on success, or an error code.
+	 */
+	int (*success_hook)(const struct vm_area_struct *vma);
+
+	/*
+	 * If specified, this hook is invoked when an error occurred when
+	 * attempting the selected action.
+	 *
+	 * The hook can return an error code in order to filter the error, but
+	 * it is not valid to clear the error here.
+	 */
+	int (*error_hook)(int err);
+
+	/*
+	 * This should be set in rare instances where the operation requires
+	 * that the rmap not be able to access the VMA until it is
+	 * completely set up.
+	 */
+	bool hide_from_rmap_until_complete:1;
+};
+
 /*
  * Describes a VMA that is about to be mmap()'ed. Drivers may choose to
  * manipulate mutable fields which will cause those fields to be updated in the
···
 	/* Write-only fields. */
 	const struct vm_operations_struct *vm_ops;
 	void *private_data;
+
+	/* Take further action? */
+	struct mmap_action action;
 };

 struct file_operations {
···
 static inline void set_vma_from_desc(struct vm_area_struct *vma,
 		struct vm_area_desc *desc);

-static inline int __compat_vma_mmap_prepare(const struct file_operations *f_op,
+static inline void mmap_action_prepare(struct mmap_action *action,
+				       struct vm_area_desc *desc)
+{
+}
+
+static inline int mmap_action_complete(struct mmap_action *action,
+				       struct vm_area_struct *vma)
+{
+	return 0;
+}
+
+static inline int __compat_vma_mmap(const struct file_operations *f_op,
 		struct file *file, struct vm_area_struct *vma)
 {
 	struct vm_area_desc desc = {
 		.mm = vma->vm_mm,
-		.file = vma->vm_file,
+		.file = file,
 		.start = vma->vm_start,
 		.end = vma->vm_end,
···
 		.vm_file = vma->vm_file,
 		.vm_flags = vma->vm_flags,
 		.page_prot = vma->vm_page_prot,
+
+		.action.type = MMAP_NOTHING, /* Default */
 	};
 	int err;

 	err = f_op->mmap_prepare(&desc);
 	if (err)
 		return err;
-	set_vma_from_desc(vma, &desc);

-	return 0;
+	mmap_action_prepare(&desc.action, &desc);
+	set_vma_from_desc(vma, &desc);
+	return mmap_action_complete(&desc.action, vma);
 }

-static inline int compat_vma_mmap_prepare(struct file *file,
+static inline int compat_vma_mmap(struct file *file,
 		struct vm_area_struct *vma)
 {
-	return __compat_vma_mmap_prepare(file->f_op, file, vma);
+	return __compat_vma_mmap(file->f_op, file, vma);
 }

 /* Did the driver provide valid mmap hook configuration? */
···
 static inline int vfs_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	if (file->f_op->mmap_prepare)
-		return compat_vma_mmap_prepare(file, vma);
+		return compat_vma_mmap(file, vma);

 	return file->f_op->mmap(file, vma);
 }
···
 		const struct file *file, vm_flags_t vm_flags)
 {
 	return vm_flags;
+}
+
+static inline void remap_pfn_range_prepare(struct vm_area_desc *desc, unsigned long pfn)
+{
+}
+
+static inline int remap_pfn_range_complete(struct vm_area_struct *vma, unsigned long addr,
+		unsigned long pfn, unsigned long size, pgprot_t pgprot)
+{
+	return 0;
+}
+
+static inline int do_munmap(struct mm_struct *, unsigned long, size_t,
+		struct list_head *uf)
+{
+	return 0;
 }

 #endif /* __MM_VMA_INTERNAL_H */