Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: rename _count, field of the struct page, to _refcount

Many developers already know that the reference count field of struct page
is _count, of atomic type. They may try to handle it directly, and this
would defeat the purpose of the page reference count tracepoints. To
prevent direct modification of _count, this patch renames it to _refcount
and adds a warning comment to the code. After that, developers who need to
handle the reference count will see that the field should not be accessed
directly.

[akpm@linux-foundation.org: fix comments, per Vlastimil]
[akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
[sfr@canb.auug.org.au: sync ethernet driver changes]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Manish Chopra <manish.chopra@qlogic.com>
Cc: Yuval Mintz <yuval.mintz@qlogic.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Joonsoo Kim, committed by Linus Torvalds
commit 0139aa7b, parent 6d061f9f

17 files changed, +58 -54
+5 -5
Documentation/vm/transhuge.txt
@@ -394,9 +394,9 @@
 Refcounting on THP is mostly consistent with refcounting on other compound
 pages:
 
- - get_page()/put_page() and GUP operate in head page's ->_count.
+ - get_page()/put_page() and GUP operate in head page's ->_refcount.
 
- - ->_count in tail pages is always zero: get_page_unless_zero() never
+ - ->_refcount in tail pages is always zero: get_page_unless_zero() never
   succeed on tail pages.
 
 - map/unmap of the pages with PTE entry increment/decrement ->_mapcount
@@ -426,15 +426,15 @@
 sum of mapcount of all sub-pages plus one (split_huge_page caller must
 have reference for head page).
 
-split_huge_page uses migration entries to stabilize page->_count and
+split_huge_page uses migration entries to stabilize page->_refcount and
 page->_mapcount.
 
 We safe against physical memory scanners too: the only legitimate way
 scanner can get reference to a page is get_page_unless_zero().
 
-All tail pages has zero ->_count until atomic_add(). It prevent scanner
+All tail pages has zero ->_refcount until atomic_add(). It prevent scanner
 from geting reference to tail page up to the point. After the atomic_add()
-we don't care about ->_count value. We already known how many references
+we don't care about ->_refcount value. We already known how many references
 with should uncharge from head page.
 
 For head page get_page_unless_zero() will succeed and we don't mind. It's
+1 -1
arch/tile/mm/init.c
@@ -679,7 +679,7 @@
                         * Hacky direct set to avoid unnecessary
                         * lock take/release for EVERY page here.
                         */
-                       p->_count.counter = 0;
+                       p->_refcount.counter = 0;
                        p->_mapcount.counter = -1;
                }
                init_page_count(page);
+1 -1
drivers/block/aoe/aoecmd.c
@@ -861,7 +861,7 @@
  * discussion.
  *
  * We cannot use get_page in the workaround, because it insists on a
- * positive page count as a precondition. So we use _count directly.
+ * positive page count as a precondition. So we use _refcount directly.
  */
 static void
 bio_pageinc(struct bio *bio)
+1 -1
drivers/hwtracing/intel_th/msu.c
@@ -1164,7 +1164,7 @@
        if (!atomic_dec_and_mutex_lock(&msc->mmap_count, &msc->buf_mutex))
                return;
 
-       /* drop page _counts */
+       /* drop page _refcounts */
        for (pg = 0; pg < msc->nr_pages; pg++) {
                struct page *page = msc_buffer_get_page(msc, pg);
 
+10 -10
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -433,8 +433,8 @@
        for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
                if (unlikely(mlx5e_alloc_and_map_page(rq, wi, i)))
                        goto err_unmap;
-               atomic_add(mlx5e_mpwqe_strides_per_page(rq),
-                          &wi->umr.dma_info[i].page->_count);
+               page_ref_add(wi->umr.dma_info[i].page,
+                            mlx5e_mpwqe_strides_per_page(rq));
                wi->skbs_frags[i] = 0;
        }
 
@@ -452,8 +452,8 @@
        while (--i >= 0) {
                dma_unmap_page(rq->pdev, wi->umr.dma_info[i].addr, PAGE_SIZE,
                               PCI_DMA_FROMDEVICE);
-               atomic_sub(mlx5e_mpwqe_strides_per_page(rq),
-                          &wi->umr.dma_info[i].page->_count);
+               page_ref_sub(wi->umr.dma_info[i].page,
+                            mlx5e_mpwqe_strides_per_page(rq));
                put_page(wi->umr.dma_info[i].page);
        }
        dma_unmap_single(rq->pdev, wi->umr.mtt_addr, mtt_sz, PCI_DMA_TODEVICE);
@@ -477,8 +477,8 @@
        for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
                dma_unmap_page(rq->pdev, wi->umr.dma_info[i].addr, PAGE_SIZE,
                               PCI_DMA_FROMDEVICE);
-               atomic_sub(mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i],
-                          &wi->umr.dma_info[i].page->_count);
+               page_ref_sub(wi->umr.dma_info[i].page,
+                            mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i]);
                put_page(wi->umr.dma_info[i].page);
        }
        dma_unmap_single(rq->pdev, wi->umr.mtt_addr, mtt_sz, PCI_DMA_TODEVICE);
@@ -527,8 +527,8 @@
         */
        split_page(wi->dma_info.page, MLX5_MPWRQ_WQE_PAGE_ORDER);
        for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
-               atomic_add(mlx5e_mpwqe_strides_per_page(rq),
-                          &wi->dma_info.page[i]._count);
+               page_ref_add(&wi->dma_info.page[i],
+                            mlx5e_mpwqe_strides_per_page(rq));
                wi->skbs_frags[i] = 0;
        }
 
@@ -551,8 +551,8 @@
        dma_unmap_page(rq->pdev, wi->dma_info.addr, rq->wqe_sz,
                       PCI_DMA_FROMDEVICE);
        for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
-               atomic_sub(mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i],
-                          &wi->dma_info.page[i]._count);
+               page_ref_sub(&wi->dma_info.page[i],
+                            mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i]);
                put_page(&wi->dma_info.page[i]);
        }
 }
+2 -2
drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1036,7 +1036,7 @@
                /* Incr page ref count to reuse on allocation failure
                 * so that it doesn't get freed while freeing SKB.
                 */
-               atomic_inc(&current_bd->data->_count);
+               page_ref_inc(current_bd->data);
                goto out;
        }
 
@@ -1487,7 +1487,7 @@
                                 * freeing SKB.
                                 */
 
-                               atomic_inc(&sw_rx_data->data->_count);
+                               page_ref_inc(sw_rx_data->data);
                                rxq->rx_alloc_errors++;
                                qede_recycle_rx_bd_ring(rxq, edev,
                                                        fp_cqe->bd_num);
+1 -1
fs/proc/page.c
@@ -142,7 +142,7 @@
 
 
        /*
-        * Caveats on high order pages: page->_count will only be set
+        * Caveats on high order pages: page->_refcount will only be set
         * -1 on the head page; SLUB/SLQB do the same for PG_slab;
         * SLOB won't set PG_slab at all on compound pages.
         */
+1 -1
include/linux/mm.h
@@ -734,7 +734,7 @@
        page = compound_head(page);
        /*
         * Getting a normal page or the head of a compound page
-        * requires to already have an elevated page->_count.
+        * requires to already have an elevated page->_refcount.
         */
        VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
        page_ref_inc(page);
+9 -5
include/linux/mm_types.h
@@ -73,9 +73,9 @@
                        unsigned long counters;
 #else
                        /*
-                        * Keep _count separate from slub cmpxchg_double data.
-                        * As the rest of the double word is protected by
-                        * slab_lock but _count is not.
+                        * Keep _refcount separate from slub cmpxchg_double
+                        * data. As the rest of the double word is protected by
+                        * slab_lock but _refcount is not.
                         */
                        unsigned counters;
 #endif
@@ -97,7 +97,11 @@
                        };
                        int units;      /* SLOB */
                };
-               atomic_t _count;                /* Usage count, see below. */
+               /*
+                * Usage count, *USE WRAPPER FUNCTION*
+                * when manual accounting. See page_ref.h
+                */
+               atomic_t _refcount;
        };
        unsigned int active;    /* SLAB */
 };
@@ -252,7 +248,7 @@
        __u32 offset;
 #endif
        /* we maintain a pagecount bias, so that we dont dirty cache line
-        * containing page->_count every time we allocate a fragment.
+        * containing page->_refcount every time we allocate a fragment.
         */
        unsigned int pagecnt_bias;
        bool pfmemalloc;
+13 -13
include/linux/page_ref.h
@@ -63,17 +63,17 @@
 
 static inline int page_ref_count(struct page *page)
 {
-       return atomic_read(&page->_count);
+       return atomic_read(&page->_refcount);
 }
 
 static inline int page_count(struct page *page)
 {
-       return atomic_read(&compound_head(page)->_count);
+       return atomic_read(&compound_head(page)->_refcount);
 }
 
 static inline void set_page_count(struct page *page, int v)
 {
-       atomic_set(&page->_count, v);
+       atomic_set(&page->_refcount, v);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_set))
                __page_ref_set(page, v);
 }
@@ -89,35 +89,35 @@
 
 static inline void page_ref_add(struct page *page, int nr)
 {
-       atomic_add(nr, &page->_count);
+       atomic_add(nr, &page->_refcount);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
                __page_ref_mod(page, nr);
 }
 
 static inline void page_ref_sub(struct page *page, int nr)
 {
-       atomic_sub(nr, &page->_count);
+       atomic_sub(nr, &page->_refcount);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
                __page_ref_mod(page, -nr);
 }
 
 static inline void page_ref_inc(struct page *page)
 {
-       atomic_inc(&page->_count);
+       atomic_inc(&page->_refcount);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
                __page_ref_mod(page, 1);
 }
 
 static inline void page_ref_dec(struct page *page)
 {
-       atomic_dec(&page->_count);
+       atomic_dec(&page->_refcount);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
                __page_ref_mod(page, -1);
 }
 
 static inline int page_ref_sub_and_test(struct page *page, int nr)
 {
-       int ret = atomic_sub_and_test(nr, &page->_count);
+       int ret = atomic_sub_and_test(nr, &page->_refcount);
 
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_test))
                __page_ref_mod_and_test(page, -nr, ret);
@@ -126,7 +126,7 @@
 
 static inline int page_ref_dec_and_test(struct page *page)
 {
-       int ret = atomic_dec_and_test(&page->_count);
+       int ret = atomic_dec_and_test(&page->_refcount);
 
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_test))
                __page_ref_mod_and_test(page, -1, ret);
@@ -135,7 +135,7 @@
 
 static inline int page_ref_dec_return(struct page *page)
 {
-       int ret = atomic_dec_return(&page->_count);
+       int ret = atomic_dec_return(&page->_refcount);
 
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_return))
                __page_ref_mod_and_return(page, -1, ret);
@@ -144,7 +144,7 @@
 
 static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
-       int ret = atomic_add_unless(&page->_count, nr, u);
+       int ret = atomic_add_unless(&page->_refcount, nr, u);
 
        if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_unless))
                __page_ref_mod_unless(page, nr, ret);
@@ -153,7 +153,7 @@
 
 static inline int page_ref_freeze(struct page *page, int count)
 {
-       int ret = likely(atomic_cmpxchg(&page->_count, count, 0) == count);
+       int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
 
        if (page_ref_tracepoint_active(__tracepoint_page_ref_freeze))
                __page_ref_freeze(page, count, ret);
@@ -165,7 +165,7 @@
        VM_BUG_ON_PAGE(page_count(page) != 0, page);
        VM_BUG_ON(count == 0);
 
-       atomic_set(&page->_count, count);
+       atomic_set(&page->_refcount, count);
        if (page_ref_tracepoint_active(__tracepoint_page_ref_unfreeze))
                __page_ref_unfreeze(page, count);
 }
+4 -4
include/linux/pagemap.h
@@ -90,12 +90,12 @@
 
 /*
  * speculatively take a reference to a page.
- * If the page is free (_count == 0), then _count is untouched, and 0
- * is returned. Otherwise, _count is incremented by 1 and 1 is returned.
+ * If the page is free (_refcount == 0), then _refcount is untouched, and 0
+ * is returned. Otherwise, _refcount is incremented by 1 and 1 is returned.
  *
  * This function must be called inside the same rcu_read_lock() section as has
  * been used to lookup the page in the pagecache radix-tree (or page table):
- * this allows allocators to use a synchronize_rcu() to stabilize _count.
+ * this allows allocators to use a synchronize_rcu() to stabilize _refcount.
  *
  * Unless an RCU grace period has passed, the count of all pages coming out
  * of the allocator must be considered unstable. page_count may return higher
@@ -111,7 +111,7 @@
  * 2. conditionally increment refcount
  * 3. check the page is still in pagecache (if no, goto 1)
  *
- * Remove-side that cares about stability of _count (eg. reclaim) has the
+ * Remove-side that cares about stability of _refcount (eg. reclaim) has the
  * following (with tree_lock held for write):
  * A. atomically check refcount is correct and set it to 0 (atomic_cmpxchg)
  * B. remove page from pagecache
+1 -1
kernel/kexec_core.c
@@ -1410,7 +1410,7 @@
        VMCOREINFO_STRUCT_SIZE(list_head);
        VMCOREINFO_SIZE(nodemask_t);
        VMCOREINFO_OFFSET(page, flags);
-       VMCOREINFO_OFFSET(page, _count);
+       VMCOREINFO_OFFSET(page, _refcount);
        VMCOREINFO_OFFSET(page, mapping);
        VMCOREINFO_OFFSET(page, lru);
        VMCOREINFO_OFFSET(page, _mapcount);
+2 -2
mm/huge_memory.c
@@ -3113,7 +3113,7 @@
        VM_BUG_ON_PAGE(page_ref_count(page_tail) != 0, page_tail);
 
        /*
-        * tail_page->_count is zero and not changing from under us. But
+        * tail_page->_refcount is zero and not changing from under us. But
         * get_page_unless_zero() may be running from under us on the
         * tail_page. If we used atomic_set() below instead of atomic_inc(), we
         * would then run atomic_set() concurrently with
@@ -3340,7 +3340,7 @@
        if (mlocked)
                lru_add_drain();
 
-       /* Prevent deferred_split_scan() touching ->_count */
+       /* Prevent deferred_split_scan() touching ->_refcount */
        spin_lock_irqsave(&pgdata->split_queue_lock, flags);
        count = page_count(head);
        mapcount = total_mapcount(head);
+1 -1
mm/internal.h
@@ -58,7 +58,7 @@
 }
 
 /*
- * Turn a non-refcounted page (->_count == 0) into refcounted with
+ * Turn a non-refcounted page (->_refcount == 0) into refcounted with
  * a count of one.
  */
 static inline void set_page_refcounted(struct page *page)
+2 -2
mm/page_alloc.c
@@ -794,7 +794,7 @@
        if (unlikely(page->mapping != NULL))
                bad_reason = "non-NULL mapping";
        if (unlikely(page_ref_count(page) != 0))
-               bad_reason = "nonzero _count";
+               bad_reason = "nonzero _refcount";
        if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_FREE)) {
                bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
                bad_flags = PAGE_FLAGS_CHECK_AT_FREE;
@@ -6864,7 +6864,7 @@
                 * We can't use page_count without pin a page
                 * because another CPU can free compound page.
                 * This check already skips compound tails of THP
-                * because their page->_count is zero at all time.
+                * because their page->_refcount is zero at all time.
                 */
                if (!page_ref_count(page)) {
                        if (PageBuddy(page))
+2 -2
mm/slub.c
@@ -329,8 +329,8 @@
        tmp.counters = counters_new;
        /*
         * page->counters can cover frozen/inuse/objects as well
-        * as page->_count. If we assign to ->counters directly
-        * we run the risk of losing updates to page->_count, so
+        * as page->_refcount. If we assign to ->counters directly
+        * we run the risk of losing updates to page->_refcount, so
         * be careful and only assign to the fields we need.
         */
        page->frozen = tmp.frozen;
+2 -2
mm/vmscan.c
@@ -633,7 +633,7 @@
         *
         * Reversing the order of the tests ensures such a situation cannot
         * escape unnoticed. The smp_rmb is needed to ensure the page->flags
-        * load is not satisfied before that of page->_count.
+        * load is not satisfied before that of page->_refcount.
         *
         * Note that if SetPageDirty is always performed via set_page_dirty,
         * and thus under tree_lock, then this ordering is not required.
@@ -1720,7 +1720,7 @@
 * It is safe to rely on PG_active against the non-LRU pages in here because
 * nobody will play with that bit on a non-LRU page.
 *
- * The downside is that we have to touch page->_count against each page.
+ * The downside is that we have to touch page->_refcount against each page.
 * But we had to alter page->flags anyway.
 */
 