Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'docs-net-page_pool-sync-dev-and-kdoc'

Jakub Kicinski says:

====================
docs: net: page_pool: sync dev and kdoc

Document PP_FLAG_DMA_SYNC_DEV based on recent conversation.
Use kdoc to document structs and functions, to avoid duplication.

Olek, this will conflict with your work, but I think that trying
to make progress in parallel is the best course of action...
Retargeting at net-next to make it a little less bad.
====================

Link: https://lore.kernel.org/r/20230802161821.3621985-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+181 -94
+47 -63
Documentation/networking/page_pool.rst
···
 The protection doesn't strictly have to be NAPI, any guarantee that allocating
 a page will cause no race conditions is enough.

-* page_pool_create(): Create a pool.
-    * flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV
-    * order: 2^order pages on allocation
-    * pool_size: size of the ptr_ring
-    * nid: preferred NUMA node for allocation
-    * dev: struct device. Used on DMA operations
-    * dma_dir: DMA direction
-    * max_len: max DMA sync memory size
-    * offset: DMA address offset
+.. kernel-doc:: net/core/page_pool.c
+   :identifiers: page_pool_create

-* page_pool_put_page(): The outcome of this depends on the page refcnt. If the
-  driver bumps the refcnt > 1 this will unmap the page. If the page refcnt is 1
-  the allocator owns the page and will try to recycle it in one of the pool
-  caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
-  using dma_sync_single_range_for_device().
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: struct page_pool_params

-* page_pool_put_full_page(): Similar to page_pool_put_page(), but will DMA sync
-  for the entire memory area configured in area pool->max_len.
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: page_pool_put_page page_pool_put_full_page
+                 page_pool_recycle_direct page_pool_dev_alloc_pages
+                 page_pool_get_dma_addr page_pool_get_dma_dir

-* page_pool_recycle_direct(): Similar to page_pool_put_full_page() but caller
-  must guarantee safe context (e.g NAPI), since it will recycle the page
-  directly into the pool fast cache.
+.. kernel-doc:: net/core/page_pool.c
+   :identifiers: page_pool_put_page_bulk page_pool_get_stats

-* page_pool_dev_alloc_pages(): Get a page from the page allocator or page_pool
-  caches.
+DMA sync
+--------
+Driver is always responsible for syncing the pages for the CPU.
+Drivers may choose to take care of syncing for the device as well
+or set the ``PP_FLAG_DMA_SYNC_DEV`` flag to request that pages
+allocated from the page pool are already synced for the device.

-* page_pool_get_dma_addr(): Retrieve the stored DMA address.
+If ``PP_FLAG_DMA_SYNC_DEV`` is set, the driver must inform the core what portion
+of the buffer has to be synced. This allows the core to avoid syncing the entire
+page when the drivers knows that the device only accessed a portion of the page.

-* page_pool_get_dma_dir(): Retrieve the stored DMA direction.
+Most drivers will reserve headroom in front of the frame. This part
+of the buffer is not touched by the device, so to avoid syncing
+it drivers can set the ``offset`` field in struct page_pool_params
+appropriately.

-* page_pool_put_page_bulk(): Tries to refill a number of pages into the
-  ptr_ring cache holding ptr_ring producer lock. If the ptr_ring is full,
-  page_pool_put_page_bulk() will release leftover pages to the page allocator.
-  page_pool_put_page_bulk() is suitable to be run inside the driver NAPI tx
-  completion loop for the XDP_REDIRECT use case.
-  Please note the caller must not use data area after running
-  page_pool_put_page_bulk(), as this function overwrites it.
+For pages recycled on the XDP xmit and skb paths the page pool will
+use the ``max_len`` member of struct page_pool_params to decide how
+much of the page needs to be synced (starting at ``offset``).
+When directly freeing pages in the driver (page_pool_put_page())
+the ``dma_sync_size`` argument specifies how much of the buffer needs
+to be synced.

-* page_pool_get_stats(): Retrieve statistics about the page_pool. This API
-  is only available if the kernel has been configured with
-  ``CONFIG_PAGE_POOL_STATS=y``. A pointer to a caller allocated ``struct
-  page_pool_stats`` structure is passed to this API which is filled in. The
-  caller can then report those stats to the user (perhaps via ethtool,
-  debugfs, etc.). See below for an example usage of this API.
+If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
+pass -1 as ``dma_sync_size``. That combination of arguments is always
+correct.
+
+Note that the syncing parameters are for the entire page.
+This is important to remember when using fragments (``PP_FLAG_PAGE_FRAG``),
+where allocated buffers may be smaller than a full page.
+Unless the driver author really understands page pool internals
+it's recommended to always use ``offset = 0``, ``max_len = PAGE_SIZE``
+with fragmented page pools.

 Stats API and structures
 ------------------------
 If the kernel is configured with ``CONFIG_PAGE_POOL_STATS=y``, the API
-``page_pool_get_stats()`` and structures described below are available. It
-takes a pointer to a ``struct page_pool`` and a pointer to a ``struct
-page_pool_stats`` allocated by the caller.
+page_pool_get_stats() and structures described below are available.
+It takes a pointer to a ``struct page_pool`` and a pointer to a struct
+page_pool_stats allocated by the caller.

-The API will fill in the provided ``struct page_pool_stats`` with
+The API will fill in the provided struct page_pool_stats with
 statistics about the page_pool.

-The stats structure has the following fields::
-
-    struct page_pool_stats {
-        struct page_pool_alloc_stats alloc_stats;
-        struct page_pool_recycle_stats recycle_stats;
-    };
-
-
-The ``struct page_pool_alloc_stats`` has the following fields:
-  * ``fast``: successful fast path allocations
-  * ``slow``: slow path order-0 allocations
-  * ``slow_high_order``: slow path high order allocations
-  * ``empty``: ptr ring is empty, so a slow path allocation was forced.
-  * ``refill``: an allocation which triggered a refill of the cache
-  * ``waive``: pages obtained from the ptr ring that cannot be added to
-    the cache due to a NUMA mismatch.
-
-The ``struct page_pool_recycle_stats`` has the following fields:
-  * ``cached``: recycling placed page in the page pool cache
-  * ``cache_full``: page pool cache was full
-  * ``ring``: page placed into the ptr ring
-  * ``ring_full``: page released from page pool because the ptr ring was full
-  * ``released_refcnt``: page released (and not recycled) because refcnt > 1
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: struct page_pool_recycle_stats
+                 struct page_pool_alloc_stats
+                 struct page_pool_stats

 Coding examples
 ===============
+104 -30
include/net/page_pool.h
···
         struct page *cache[PP_ALLOC_CACHE_SIZE];
 };

+/**
+ * struct page_pool_params - page pool parameters
+ * @flags:	PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV, PP_FLAG_PAGE_FRAG
+ * @order:	2^order pages on allocation
+ * @pool_size:	size of the ptr_ring
+ * @nid:	NUMA node id to allocate from pages from
+ * @dev:	device, for DMA pre-mapping purposes
+ * @napi:	NAPI which is the sole consumer of pages, otherwise NULL
+ * @dma_dir:	DMA mapping direction
+ * @max_len:	max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
+ * @offset:	DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
+ */
 struct page_pool_params {
         unsigned int flags;
         unsigned int order;
         unsigned int pool_size;
-        int nid;  /* Numa node id to allocate from pages from */
-        struct device *dev; /* device, for DMA pre-mapping purposes */
-        struct napi_struct *napi; /* Sole consumer of pages, otherwise NULL */
-        enum dma_data_direction dma_dir; /* DMA mapping direction */
-        unsigned int max_len; /* max DMA sync memory size */
-        unsigned int offset;  /* DMA addr offset */
+        int nid;
+        struct device *dev;
+        struct napi_struct *napi;
+        enum dma_data_direction dma_dir;
+        unsigned int max_len;
+        unsigned int offset;
+/* private: used by test code only */
         void (*init_callback)(struct page *page, void *arg);
         void *init_arg;
 };

 #ifdef CONFIG_PAGE_POOL_STATS
+/**
+ * struct page_pool_alloc_stats - allocation statistics
+ * @fast:	successful fast path allocations
+ * @slow:	slow path order-0 allocations
+ * @slow_high_order: slow path high order allocations
+ * @empty:	ptr ring is empty, so a slow path allocation was forced
+ * @refill:	an allocation which triggered a refill of the cache
+ * @waive:	pages obtained from the ptr ring that cannot be added to
+ *		the cache due to a NUMA mismatch
+ */
 struct page_pool_alloc_stats {
-        u64 fast; /* fast path allocations */
-        u64 slow; /* slow-path order 0 allocations */
-        u64 slow_high_order; /* slow-path high order allocations */
-        u64 empty; /* failed refills due to empty ptr ring, forcing
-                    * slow path allocation
-                    */
-        u64 refill; /* allocations via successful refill */
-        u64 waive; /* failed refills due to numa zone mismatch */
+        u64 fast;
+        u64 slow;
+        u64 slow_high_order;
+        u64 empty;
+        u64 refill;
+        u64 waive;
 };

+/**
+ * struct page_pool_recycle_stats - recycling (freeing) statistics
+ * @cached:	recycling placed page in the page pool cache
+ * @cache_full:	page pool cache was full
+ * @ring:	page placed into the ptr ring
+ * @ring_full:	page released from page pool because the ptr ring was full
+ * @released_refcnt:	page released (and not recycled) because refcnt > 1
+ */
 struct page_pool_recycle_stats {
-        u64 cached;     /* recycling placed page in the cache. */
-        u64 cache_full; /* cache was full */
-        u64 ring;       /* recycling placed page back into ptr ring */
-        u64 ring_full;  /* page was released from page-pool because
-                         * PTR ring was full.
-                         */
-        u64 released_refcnt; /* page released because of elevated
-                              * refcnt
-                              */
+        u64 cached;
+        u64 cache_full;
+        u64 ring;
+        u64 ring_full;
+        u64 released_refcnt;
 };

-/* This struct wraps the above stats structs so users of the
- * page_pool_get_stats API can pass a single argument when requesting the
- * stats for the page pool.
+/**
+ * struct page_pool_stats - combined page pool use statistics
+ * @alloc_stats:	see struct page_pool_alloc_stats
+ * @recycle_stats:	see struct page_pool_recycle_stats
+ *
+ * Wrapper struct for combining page pool stats with different storage
+ * requirements.
  */
 struct page_pool_stats {
         struct page_pool_alloc_stats alloc_stats;
···

 struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp);

+/**
+ * page_pool_dev_alloc_pages() - allocate a page.
+ * @pool:	pool from which to allocate
+ *
+ * Get a page from the page allocator or page_pool caches.
+ */
 static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool)
 {
         gfp_t gfp = (GFP_ATOMIC | __GFP_NOWARN);
···
         return page_pool_alloc_frag(pool, offset, size, gfp);
 }

-/* get the stored dma direction. A driver might decide to treat this locally and
- * avoid the extra cache line from page_pool to determine the direction
+/**
+ * page_pool_get_dma_dir() - Retrieve the stored DMA direction.
+ * @pool:	pool from which page was allocated
+ *
+ * Get the stored dma direction. A driver might decide to store this locally
+ * and avoid the extra cache line from page_pool to determine the direction.
  */
 static
 inline enum dma_data_direction page_pool_get_dma_dir(struct page_pool *pool)
···
                 (page_pool_defrag_page(page, 1) == 0);
 }

+/**
+ * page_pool_put_page() - release a reference to a page pool page
+ * @pool:	pool from which page was allocated
+ * @page:	page to release a reference on
+ * @dma_sync_size: how much of the page may have been touched by the device
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * The outcome of this depends on the page refcnt. If the driver bumps
+ * the refcnt > 1 this will unmap the page. If the page refcnt is 1
+ * the allocator owns the page and will try to recycle it in one of the pool
+ * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
+ * using dma_sync_single_range_for_device().
+ */
 static inline void page_pool_put_page(struct page_pool *pool,
                                       struct page *page,
                                       unsigned int dma_sync_size,
···
 #endif
 }

-/* Same as above but will try to sync the entire area pool->max_len */
+/**
+ * page_pool_put_full_page() - release a reference on a page pool page
+ * @pool:	pool from which page was allocated
+ * @page:	page to release a reference on
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * Similar to page_pool_put_page(), but will DMA sync the entire memory area
+ * as configured in &page_pool_params.max_len.
+ */
 static inline void page_pool_put_full_page(struct page_pool *pool,
                                            struct page *page, bool allow_direct)
 {
         page_pool_put_page(pool, page, -1, allow_direct);
 }

-/* Same as above but the caller must guarantee safe context. e.g NAPI */
+/**
+ * page_pool_recycle_direct() - release a reference on a page pool page
+ * @pool:	pool from which page was allocated
+ * @page:	page to release a reference on
+ *
+ * Similar to page_pool_put_full_page() but caller must guarantee safe context
+ * (e.g NAPI), since it will recycle the page directly into the pool fast cache.
+ */
 static inline void page_pool_recycle_direct(struct page_pool *pool,
                                             struct page *page)
 {
···
 #define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \
                 (sizeof(dma_addr_t) > sizeof(unsigned long))

+/**
+ * page_pool_get_dma_addr() - Retrieve the stored DMA address.
+ * @page:	page allocated from a page pool
+ *
+ * Fetch the DMA address of the page. The page pool to which the page belongs
+ * must had been created with PP_FLAG_DMA_MAP.
+ */
 static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 {
         dma_addr_t ret = page->dma_addr;
+30 -1
net/core/page_pool.c
···
         "rx_pp_recycle_released_ref",
 };

+/**
+ * page_pool_get_stats() - fetch page pool stats
+ * @pool:	pool from which page was allocated
+ * @stats:	struct page_pool_stats to fill in
+ *
+ * Retrieve statistics about the page_pool. This API is only available
+ * if the kernel has been configured with ``CONFIG_PAGE_POOL_STATS=y``.
+ * A pointer to a caller allocated struct page_pool_stats structure
+ * is passed to this API which is filled in. The caller can then report
+ * those stats to the user (perhaps via ethtool, debugfs, etc.).
+ */
 bool page_pool_get_stats(struct page_pool *pool,
                          struct page_pool_stats *stats)
 {
···
         return 0;
 }

+/**
+ * page_pool_create() - create a page pool.
+ * @params:	parameters, see struct page_pool_params
+ */
 struct page_pool *page_pool_create(const struct page_pool_params *params)
 {
         struct page_pool *pool;
···
 }
 EXPORT_SYMBOL(page_pool_put_defragged_page);

-/* Caller must not use data area after call, as this function overwrites it */
+/**
+ * page_pool_put_page_bulk() - release references on multiple pages
+ * @pool:	pool from which pages were allocated
+ * @data:	array holding page pointers
+ * @count:	number of pages in @data
+ *
+ * Tries to refill a number of pages into the ptr_ring cache holding ptr_ring
+ * producer lock. If the ptr_ring is full, page_pool_put_page_bulk()
+ * will release leftover pages to the page allocator.
+ * page_pool_put_page_bulk() is suitable to be run inside the driver NAPI tx
+ * completion loop for the XDP_REDIRECT use case.
+ *
+ * Please note the caller must not use data area after running
+ * page_pool_put_page_bulk(), as this function overwrites it.
+ */
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
                              int count)
 {