
shmem: replace_page must flush_dcache and others

Commit bde05d1ccd51 ("shmem: replace page if mapping excludes its zone")
is not at all likely to break for anyone, but it was an earlier version
from before review feedback was incorporated. Fix that up now.

* shmem_replace_page must flush_dcache_page after copy_highpage [akpm]
* Expand comment on why shmem_unuse_inode needs page_swapcount [akpm]
* Remove excess of VM_BUG_ONs from shmem_replace_page [wangcong]
* Check page_private matches swap before calling shmem_replace_page [hughd]
* shmem_replace_page allow for unexpected race in radix_tree lookup [hughd]

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephane Marchesin <marcheu@chromium.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Hugh Dickins, committed by Linus Torvalds
Commit 0142ef6c, parent 71fae7e7
+37 -20
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -683,10 +683,21 @@
 	mutex_lock(&shmem_swaplist_mutex);
 	/*
 	 * We needed to drop mutex to make that restrictive page
-	 * allocation; but the inode might already be freed by now,
-	 * and we cannot refer to inode or mapping or info to check.
-	 * However, we do hold page lock on the PageSwapCache page,
-	 * so can check if that still has our reference remaining.
+	 * allocation, but the inode might have been freed while we
+	 * dropped it: although a racing shmem_evict_inode() cannot
+	 * complete without emptying the radix_tree, our page lock
+	 * on this swapcache page is not enough to prevent that -
+	 * free_swap_and_cache() of our swap entry will only
+	 * trylock_page(), removing swap from radix_tree whatever.
+	 *
+	 * We must not proceed to shmem_add_to_page_cache() if the
+	 * inode has been freed, but of course we cannot rely on
+	 * inode or mapping or info to check that.  However, we can
+	 * safely check if our swap entry is still in use (and here
+	 * it can't have got reused for another page): if it's still
+	 * in use, then the inode cannot have been freed yet, and we
+	 * can safely proceed (if it's no longer in use, that tells
+	 * nothing about the inode, but we don't need to unuse swap).
 	 */
 	if (!page_swapcount(*pagep))
 		error = -ENOENT;
@@ -730,9 +741,9 @@
 
 	/*
 	 * There's a faint possibility that swap page was replaced before
-	 * caller locked it: it will come back later with the right page.
+	 * caller locked it: caller will come back later with the right page.
 	 */
-	if (unlikely(!PageSwapCache(page)))
+	if (unlikely(!PageSwapCache(page) || page_private(page) != swap.val))
 		goto out;
 
 	/*
@@ -995,21 +1006,15 @@
 	newpage = shmem_alloc_page(gfp, info, index);
 	if (!newpage)
 		return -ENOMEM;
-	VM_BUG_ON(shmem_should_replace_page(newpage, gfp));
 
-	*pagep = newpage;
 	page_cache_get(newpage);
 	copy_highpage(newpage, oldpage);
+	flush_dcache_page(newpage);
 
-	VM_BUG_ON(!PageLocked(oldpage));
 	__set_page_locked(newpage);
-	VM_BUG_ON(!PageUptodate(oldpage));
 	SetPageUptodate(newpage);
-	VM_BUG_ON(!PageSwapBacked(oldpage));
 	SetPageSwapBacked(newpage);
-	VM_BUG_ON(!swap_index);
 	set_page_private(newpage, swap_index);
-	VM_BUG_ON(!PageSwapCache(oldpage));
 	SetPageSwapCache(newpage);
 
 	/*
@@ -1019,13 +1024,24 @@
 	spin_lock_irq(&swap_mapping->tree_lock);
 	error = shmem_radix_tree_replace(swap_mapping, swap_index, oldpage,
 								   newpage);
-	__inc_zone_page_state(newpage, NR_FILE_PAGES);
-	__dec_zone_page_state(oldpage, NR_FILE_PAGES);
+	if (!error) {
+		__inc_zone_page_state(newpage, NR_FILE_PAGES);
+		__dec_zone_page_state(oldpage, NR_FILE_PAGES);
+	}
 	spin_unlock_irq(&swap_mapping->tree_lock);
-	BUG_ON(error);
 
-	mem_cgroup_replace_page_cache(oldpage, newpage);
-	lru_cache_add_anon(newpage);
+	if (unlikely(error)) {
+		/*
+		 * Is this possible?  I think not, now that our callers check
+		 * both PageSwapCache and page_private after getting page lock;
+		 * but be defensive.  Reverse old to newpage for clear and free.
+		 */
+		oldpage = newpage;
+	} else {
+		mem_cgroup_replace_page_cache(oldpage, newpage);
+		lru_cache_add_anon(newpage);
+		*pagep = newpage;
+	}
 
 	ClearPageSwapCache(oldpage);
 	set_page_private(oldpage, 0);
@@ -1033,7 +1049,7 @@
 	unlock_page(oldpage);
 	page_cache_release(oldpage);
 	page_cache_release(oldpage);
-	return 0;
+	return error;
 }
 
 /*
@@ -1107,7 +1123,8 @@
 
 	/* We have to do this with page locked to prevent races */
 	lock_page(page);
-	if (!PageSwapCache(page) || page->mapping) {
+	if (!PageSwapCache(page) || page_private(page) != swap.val ||
+	    page->mapping) {
 		error = -EEXIST;	/* try again */
 		goto failed;
 	}