Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/migrate_device.c: add migrate_device_range()

Device drivers can use the migrate_vma family of functions to migrate
existing private anonymous mappings to device private pages. These pages
are backed by memory on the device with drivers being responsible for
copying data to and from device memory.

Device private pages are freed via the pgmap->page_free() callback when
they are unmapped and their refcount drops to zero. Alternatively they
may be freed indirectly via migration back to CPU memory in response to a
pgmap->migrate_to_ram() callback called whenever the CPU accesses an
address mapped to a device private page.

In other words, drivers cannot control the lifetime of data allocated on
the device and must wait until these pages are freed from userspace.
This causes issues when memory needs to be reclaimed on the device, either
because the device is going away due to a ->release() callback or because
another user needs to use the memory.

Drivers could use the existing migrate_vma functions to migrate data off
the device. However this would require them to track the mappings of each
page which is both complicated and not always possible. Instead drivers
need to be able to migrate device pages directly so they can free up
device memory.

To allow that, this patch introduces the migrate_device family of functions,
which are functionally similar to migrate_vma but skip the initial lookup
based on mapping.
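
The calling sequence described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every
model_* name is hypothetical, the flag values are assumed to mirror the
MIGRATE_PFN_* definitions in include/linux/migrate.h, and a real driver
would call migrate_device_range(), migrate_device_pages() and
migrate_device_finalize() on struct page rather than on this toy structure.

```c
/*
 * Userspace model of the migrate_device_*() calling sequence.
 * Every model_* name is hypothetical; none of this is kernel code.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PFN_VALID   (1UL << 0)  /* assumed to mirror MIGRATE_PFN_VALID */
#define MODEL_PFN_MIGRATE (1UL << 1)  /* assumed to mirror MIGRATE_PFN_MIGRATE */
#define MODEL_PFN_SHIFT   6           /* assumed to mirror MIGRATE_PFN_SHIFT */
#define MODEL_PAGE_SIZE   16

struct model_page {
	int refcount;               /* 0 means the page is already free */
	int locked;                 /* 1 while someone holds the page lock */
	char data[MODEL_PAGE_SIZE];
};

/* Step 1: collect pages by pfn, with no lookup through any mapping. */
static void model_device_range(struct model_page *dev, unsigned long *src_pfns,
			       unsigned long start, unsigned long npages)
{
	unsigned long i, pfn;

	for (pfn = start, i = 0; i < npages; pfn++, i++) {
		struct model_page *page = &dev[pfn];

		/* Free or contended pages are skipped by storing 0. */
		if (page->refcount == 0 || page->locked) {
			src_pfns[i] = 0;
			continue;
		}
		page->refcount++;
		page->locked = 1;
		src_pfns[i] = (pfn << MODEL_PFN_SHIFT) | MODEL_PFN_VALID |
			      MODEL_PFN_MIGRATE;
	}
}

/*
 * Steps 2 and 3: copy each collected page to system memory, then unlock the
 * device page and mark it free. Returns the number of pages moved.
 */
static unsigned long model_evict(struct model_page *dev, struct model_page *sys,
				 unsigned long *src_pfns, unsigned long start,
				 unsigned long npages)
{
	unsigned long i, moved = 0;

	assert(dev && sys && src_pfns);
	model_device_range(dev, src_pfns, start, npages);

	for (i = 0; i < npages; i++) {
		struct model_page *page;

		if (!(src_pfns[i] & MODEL_PFN_MIGRATE))
			continue;

		page = &dev[src_pfns[i] >> MODEL_PFN_SHIFT];
		memcpy(sys[i].data, page->data, MODEL_PAGE_SIZE);
		sys[i].refcount = 1;

		page->locked = 0;
		page->refcount = 0; /* the device page can now be reused */
		moved++;
	}
	return moved;
}
```

In this model a page that is already free or that cannot be locked is
left alone, matching the src_pfns[i] = 0 cases in migrate_device_range().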

Link: https://lkml.kernel.org/r/868116aab70b0c8ee467d62498bb2cf0ef907295.1664366292.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Alistair Popple and committed by Andrew Morton
e778406b 241f6885

+89 -7
+7
include/linux/migrate.h
@@ -210,6 +210,13 @@
 int migrate_vma_setup(struct migrate_vma *args);
 void migrate_vma_pages(struct migrate_vma *migrate);
 void migrate_vma_finalize(struct migrate_vma *migrate);
+int migrate_device_range(unsigned long *src_pfns, unsigned long start,
+			unsigned long npages);
+void migrate_device_pages(unsigned long *src_pfns, unsigned long *dst_pfns,
+			unsigned long npages);
+void migrate_device_finalize(unsigned long *src_pfns,
+			unsigned long *dst_pfns, unsigned long npages);
+
 #endif /* CONFIG_MIGRATION */
 
 #endif /* _LINUX_MIGRATE_H */
+82 -7
mm/migrate_device.c
@@ -693,7 +693,7 @@
 	*src &= ~MIGRATE_PFN_MIGRATE;
 }
 
-static void migrate_device_pages(unsigned long *src_pfns,
+static void __migrate_device_pages(unsigned long *src_pfns,
 				unsigned long *dst_pfns, unsigned long npages,
 				struct migrate_vma *migrate)
 {
@@ -715,6 +715,9 @@
 		if (!page) {
 			unsigned long addr;
 
+			if (!(src_pfns[i] & MIGRATE_PFN_MIGRATE))
+				continue;
+
 			/*
 			 * The only time there is no vma is when called from
 			 * migrate_device_coherent_page(). However this isn't
@@ -722,8 +725,6 @@
 			 */
 			VM_BUG_ON(!migrate);
 			addr = migrate->start + i*PAGE_SIZE;
-			if (!(src_pfns[i] & MIGRATE_PFN_MIGRATE))
-				continue;
 			if (!notified) {
 				notified = true;
 
@@ -779,6 +780,22 @@
 }
 
 /**
+ * migrate_device_pages() - migrate meta-data from src page to dst page
+ * @src_pfns: src_pfns returned from migrate_device_range()
+ * @dst_pfns: array of pfns allocated by the driver to migrate memory to
+ * @npages: number of pages in the range
+ *
+ * Equivalent to migrate_vma_pages(). This is called to migrate struct page
+ * meta-data from source struct page to destination.
+ */
+void migrate_device_pages(unsigned long *src_pfns, unsigned long *dst_pfns,
+			unsigned long npages)
+{
+	__migrate_device_pages(src_pfns, dst_pfns, npages, NULL);
+}
+EXPORT_SYMBOL(migrate_device_pages);
+
+/**
  * migrate_vma_pages() - migrate meta-data from src page to dst page
  * @migrate: migrate struct containing all migration information
  *
@@ -788,12 +805,22 @@
  */
 void migrate_vma_pages(struct migrate_vma *migrate)
 {
-	migrate_device_pages(migrate->src, migrate->dst, migrate->npages, migrate);
+	__migrate_device_pages(migrate->src, migrate->dst, migrate->npages, migrate);
 }
 EXPORT_SYMBOL(migrate_vma_pages);
 
-static void migrate_device_finalize(unsigned long *src_pfns,
-			unsigned long *dst_pfns, unsigned long npages)
+/*
+ * migrate_device_finalize() - complete page migration
+ * @src_pfns: src_pfns returned from migrate_device_range()
+ * @dst_pfns: array of pfns allocated by the driver to migrate memory to
+ * @npages: number of pages in the range
+ *
+ * Completes migration of the page by removing special migration entries.
+ * Drivers must ensure copying of page data is complete and visible to the CPU
+ * before calling this.
+ */
+void migrate_device_finalize(unsigned long *src_pfns,
+			unsigned long *dst_pfns, unsigned long npages)
 {
 	unsigned long i;
 
@@ -837,6 +864,7 @@
 		}
 	}
 }
+EXPORT_SYMBOL(migrate_device_finalize);
 
 /**
  * migrate_vma_finalize() - restore CPU page table entry
@@ -854,6 +882,53 @@
 	migrate_device_finalize(migrate->src, migrate->dst, migrate->npages);
 }
 EXPORT_SYMBOL(migrate_vma_finalize);
+
+/**
+ * migrate_device_range() - migrate device private pfns to normal memory.
+ * @src_pfns: array large enough to hold migrating source device private pfns.
+ * @start: starting pfn in the range to migrate.
+ * @npages: number of pages to migrate.
+ *
+ * migrate_device_range() is similar in concept to migrate_vma_setup() except
+ * that instead of looking up pages based on virtual address mappings a range of
+ * device pfns that should be migrated to system memory is used instead.
+ *
+ * This is useful when a driver needs to free device memory but doesn't know the
+ * virtual mappings of every page that may be in device memory. For example this
+ * is often the case when a driver is being unloaded or unbound from a device.
+ *
+ * Like migrate_vma_setup() this function will take a reference and lock any
+ * migrating pages that aren't free before unmapping them. Drivers may then
+ * allocate destination pages and start copying data from the device to CPU
+ * memory before calling migrate_device_pages().
+ */
+int migrate_device_range(unsigned long *src_pfns, unsigned long start,
+			unsigned long npages)
+{
+	unsigned long i, pfn;
+
+	for (pfn = start, i = 0; i < npages; pfn++, i++) {
+		struct page *page = pfn_to_page(pfn);
+
+		if (!get_page_unless_zero(page)) {
+			src_pfns[i] = 0;
+			continue;
+		}
+
+		if (!trylock_page(page)) {
+			src_pfns[i] = 0;
+			put_page(page);
+			continue;
+		}
+
+		src_pfns[i] = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
+	}
+
+	migrate_device_unmap(src_pfns, npages, NULL);
+
+	return 0;
+}
+EXPORT_SYMBOL(migrate_device_range);
 
 /*
  * Migrate a device coherent page back to normal memory. The caller should have
@@ -885,7 +960,7 @@
 		dst_pfn = migrate_pfn(page_to_pfn(dpage));
 	}
 
-	migrate_device_pages(&src_pfn, &dst_pfn, 1, NULL);
+	migrate_device_pages(&src_pfn, &dst_pfn, 1);
 	if (src_pfn & MIGRATE_PFN_MIGRATE)
 		copy_highpage(dpage, page);
 	migrate_device_finalize(&src_pfn, &dst_pfn, 1);
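
For readers unfamiliar with the src_pfns[] encoding, the following
userspace sketch shows how an entry such as
migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE might be built and read back. The
model_* helpers are hypothetical, and the flag values are assumed to mirror
the MIGRATE_PFN_* definitions in include/linux/migrate.h; check that header
for the authoritative values.

```c
/*
 * Userspace model of a src_pfns[] entry. All model_* names are hypothetical
 * stand-ins; the constants are assumed to match include/linux/migrate.h.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_PFN_VALID   (1UL << 0)
#define MODEL_PFN_MIGRATE (1UL << 1)
#define MODEL_PFN_SHIFT   6

/* Pack a pfn into an entry, in the style of migrate_pfn(). */
static unsigned long model_migrate_pfn(unsigned long pfn)
{
	return (pfn << MODEL_PFN_SHIFT) | MODEL_PFN_VALID;
}

/* An entry is still being migrated only if both flags are set. */
static bool model_entry_is_migrating(unsigned long entry)
{
	return (entry & MODEL_PFN_VALID) && (entry & MODEL_PFN_MIGRATE);
}

/* Recover the pfn from a valid entry. */
static unsigned long model_entry_to_pfn(unsigned long entry)
{
	assert(entry & MODEL_PFN_VALID);
	return entry >> MODEL_PFN_SHIFT;
}
```

An entry of 0, as stored for pages that were free or could not be locked,
carries neither flag, so a driver loop over src_pfns[] can skip it with a
single flag test.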