Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

zsmalloc: zsmalloc documentation

Create zsmalloc doc which explains design concept and stat information.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Gunho Lee <gunho.lee@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Minchan Kim and committed by Linus Torvalds
d02be50d 248ca1b0

+71 -29
+70
Documentation/vm/zsmalloc.txt
···
+zsmalloc
+--------
+
+This allocator is designed for use with zram. Thus, the allocator is
+supposed to work well under low memory conditions. In particular, it
+never attempts higher order page allocation which is very likely to
+fail under memory pressure. On the other hand, if we just use single
+(0-order) pages, it would suffer from very high fragmentation --
+any object of size PAGE_SIZE/2 or larger would occupy an entire page.
+This was one of the major issues with its predecessor (xvmalloc).
+
+To overcome these issues, zsmalloc allocates a bunch of 0-order pages
+and links them together using various 'struct page' fields. These linked
+pages act as a single higher-order page i.e. an object can span 0-order
+page boundaries. The code refers to these linked pages as a single entity
+called zspage.
+
+For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
+since this satisfies the requirements of all its current users (in the
+worst case, page is incompressible and is thus stored "as-is" i.e. in
+uncompressed form). For allocation requests larger than this size, failure
+is returned (see zs_malloc).
+
+Additionally, zs_malloc() does not return a dereferenceable pointer.
+Instead, it returns an opaque handle (unsigned long) which encodes actual
+location of the allocated object. The reason for this indirection is that
+zsmalloc does not keep zspages permanently mapped since that would cause
+issues on 32-bit systems where the VA region for kernel space mappings
+is very small. So, before using the allocating memory, the object has to
+be mapped using zs_map_object() to get a usable pointer and subsequently
+unmapped using zs_unmap_object().
+
+stat
+----
+
+With CONFIG_ZSMALLOC_STAT, we could see zsmalloc internal information via
+/sys/kernel/debug/zsmalloc/<user name>. Here is a sample of stat output:
+
+# cat /sys/kernel/debug/zsmalloc/zram0/classes
+
+ class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
+    ..
+    ..
+     9   176           0            1           186        129          8                4
+    10   192           1            0          2880       2872        135                3
+    11   208           0            1           819        795         42                2
+    12   224           0            1           219        159         12                4
+    ..
+    ..
+
+
+class: index
+size: object size zspage stores
+almost_empty: the number of ZS_ALMOST_EMPTY zspages(see below)
+almost_full: the number of ZS_ALMOST_FULL zspages(see below)
+obj_allocated: the number of objects allocated
+obj_used: the number of objects allocated to the user
+pages_used: the number of pages allocated for the class
+pages_per_zspage: the number of 0-order pages to make a zspage
+
+We assign a zspage to ZS_ALMOST_EMPTY fullness group when:
+      n <= N / f, where
+n = number of allocated objects
+N = total number of objects zspage can store
+f = fullness_threshold_frac(ie, 4 at the moment)
+
+Similarly, we assign zspage to:
+      ZS_ALMOST_FULL  when n > N / f
+      ZS_EMPTY        when n == 0
+      ZS_FULL         when n == N
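The fullness-group rule at the end of zsmalloc.txt can be illustrated with a small userspace sketch. This is not the kernel's code: classify_zspage() is a hypothetical helper, the order of the checks (empty and full first, then the threshold) is an assumption, since the text lists the four conditions without saying how overlaps are resolved, and N / f is assumed to be integer division.

```c
#include <assert.h>

/* Fullness groups named in zsmalloc.txt. */
enum fullness_group {
	ZS_EMPTY,
	ZS_ALMOST_EMPTY,
	ZS_ALMOST_FULL,
	ZS_FULL,
};

/* "f" in the rule; the document gives 4 as the current value. */
static const unsigned int fullness_threshold_frac = 4;

/*
 * Hypothetical helper, not a kernel function: classify a zspage that
 * holds in_use (n) objects out of max_objects (N).
 */
static enum fullness_group classify_zspage(unsigned int in_use,
					   unsigned int max_objects)
{
	assert(max_objects > 0 && in_use <= max_objects);

	/* Assumption: the exact cases win over the threshold test. */
	if (in_use == 0)
		return ZS_EMPTY;
	if (in_use == max_objects)
		return ZS_FULL;

	/* n <= N / f, using integer division (an assumption). */
	if (in_use <= max_objects / fullness_threshold_frac)
		return ZS_ALMOST_EMPTY;

	return ZS_ALMOST_FULL;
}
```

Under these assumptions, a zspage that can store 16 objects is ZS_ALMOST_EMPTY with 4 objects in use and ZS_ALMOST_FULL with 5.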
+1
MAINTAINERS
···
 S: Maintained
 F: mm/zsmalloc.c
 F: include/linux/zsmalloc.h
+F: Documentation/vm/zsmalloc.txt
 
 ZSWAP COMPRESSED SWAP CACHING
 M: Seth Jennings <sjennings@variantweb.net>
-29
mm/zsmalloc.c
···
  */
 
 /*
- * This allocator is designed for use with zram. Thus, the allocator is
- * supposed to work well under low memory conditions. In particular, it
- * never attempts higher order page allocation which is very likely to
- * fail under memory pressure. On the other hand, if we just use single
- * (0-order) pages, it would suffer from very high fragmentation --
- * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
- * This was one of the major issues with its predecessor (xvmalloc).
- *
- * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
- * and links them together using various 'struct page' fields. These linked
- * pages act as a single higher-order page i.e. an object can span 0-order
- * page boundaries. The code refers to these linked pages as a single entity
- * called zspage.
- *
- * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
- * since this satisfies the requirements of all its current users (in the
- * worst case, page is incompressible and is thus stored "as-is" i.e. in
- * uncompressed form). For allocation requests larger than this size, failure
- * is returned (see zs_malloc).
- *
- * Additionally, zs_malloc() does not return a dereferenceable pointer.
- * Instead, it returns an opaque handle (unsigned long) which encodes actual
- * location of the allocated object. The reason for this indirection is that
- * zsmalloc does not keep zspages permanently mapped since that would cause
- * issues on 32-bit systems where the VA region for kernel space mappings
- * is very small. So, before using the allocating memory, the object has to
- * be mapped using zs_map_object() to get a usable pointer and subsequently
- * unmapped using zs_unmap_object().
- *
  * Following is how we use various fields and flags of underlying
  * struct page(s) to form a zspage.
  *