Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/thp: fix NR_FILE_MAPPED accounting in page_*_file_rmap()

NR_FILE_MAPPED accounting in mm/rmap.c (for /proc/meminfo "Mapped" and
/proc/vmstat "nr_mapped" and the memcg's memory.stat "mapped_file") is
slightly flawed for file or shmem huge pages.

It is well thought out, and looks convincing, but there's a racy case when
the careful counting in page_remove_file_rmap() (without page lock) gets
discarded, so that in a workload like two "make -j20" kernel builds under
memory pressure, with cc1 on hugepage text, "Mapped" can easily grow by a
spurious 5MB or more on each iteration, ending up implausibly bigger than
most other numbers in /proc/meminfo. And, hypothetically, might grow to
the point of seriously interfering in mm/vmscan.c's heuristics, which do
take NR_FILE_MAPPED into some consideration.

Fixed by moving the __mod_lruvec_page_state() down to where it will not be
missed before return (and I've grown a bit tired of that oft-repeated
but-not-everywhere comment on the __ness: it gets lost in the move here).

Does page_add_file_rmap() need the same change? I suspect not, because
page lock is held in all relevant cases, and its skipping case looks safe;
but it's much easier to be sure if we do make the same change.

Link: https://lkml.kernel.org/r/e02e52a1-8550-a57c-ed29-f51191ea2375@google.com
Fixes: dd78fedde4b9 ("rmap: support file thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Hugh Dickins, committed by Linus Torvalds
5d543f13 85207ad8
+14 -17
mm/rmap.c
···
 void page_add_file_rmap(struct page *page,
 		struct vm_area_struct *vma, bool compound)
 {
-	int i, nr = 1;
+	int i, nr = 0;

 	VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
 	lock_page_memcg(page);
 	if (compound && PageTransHuge(page)) {
 		int nr_pages = thp_nr_pages(page);

-		for (i = 0, nr = 0; i < nr_pages; i++) {
+		for (i = 0; i < nr_pages; i++) {
 			if (atomic_inc_and_test(&page[i]._mapcount))
 				nr++;
 		}
···
 			VM_WARN_ON_ONCE(!PageLocked(page));
 			SetPageDoubleMap(compound_head(page));
 		}
-		if (!atomic_inc_and_test(&page->_mapcount))
-			goto out;
+		if (atomic_inc_and_test(&page->_mapcount))
+			nr++;
 	}
-	__mod_lruvec_page_state(page, NR_FILE_MAPPED, nr);
 out:
+	if (nr)
+		__mod_lruvec_page_state(page, NR_FILE_MAPPED, nr);
 	unlock_page_memcg(page);

 	mlock_vma_page(page, vma, compound);
···
 static void page_remove_file_rmap(struct page *page, bool compound)
 {
-	int i, nr = 1;
+	int i, nr = 0;

 	VM_BUG_ON_PAGE(compound && !PageHead(page), page);
···
 	if (compound && PageTransHuge(page)) {
 		int nr_pages = thp_nr_pages(page);

-		for (i = 0, nr = 0; i < nr_pages; i++) {
+		for (i = 0; i < nr_pages; i++) {
 			if (atomic_add_negative(-1, &page[i]._mapcount))
 				nr++;
 		}
 		if (!atomic_add_negative(-1, compound_mapcount_ptr(page)))
-			return;
+			goto out;
 		if (PageSwapBacked(page))
 			__mod_lruvec_page_state(page, NR_SHMEM_PMDMAPPED,
 						-nr_pages);
···
 			__mod_lruvec_page_state(page, NR_FILE_PMDMAPPED,
 						-nr_pages);
 	} else {
-		if (!atomic_add_negative(-1, &page->_mapcount))
-			return;
+		if (atomic_add_negative(-1, &page->_mapcount))
+			nr++;
 	}
-
-	/*
-	 * We use the irq-unsafe __{inc|mod}_lruvec_page_state because
-	 * these counters are not modified in interrupt context, and
-	 * pte lock(a spinlock) is held, which implies preemption disabled.
-	 */
-	__mod_lruvec_page_state(page, NR_FILE_MAPPED, -nr);
+out:
+	if (nr)
+		__mod_lruvec_page_state(page, NR_FILE_MAPPED, -nr);
 }

 static void page_remove_anon_compound_rmap(struct page *page)
···