SLUB: Support for performance statistics

The statistics provided here allow the monitoring of allocator behavior at
the cost of some (minimal) loss of performance. The counters are placed in
SLUB's per cpu data structure. Adding the statistics may grow the per cpu
structure beyond one cacheline, which increases the cache footprint of SLUB.
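
Concretely, the counters live in an array in struct kmem_cache_cpu, indexed
by an enum of allocator events and only compiled in when CONFIG_SLUB_STATS
is set (condensed from the include/linux/slub_def.h part of this patch):

enum stat_item {
	ALLOC_FASTPATH,		/* Allocation from cpu slab */
	ALLOC_SLOWPATH,		/* Allocation by getting a new cpu slab */
	FREE_FASTPATH,		/* Free to cpu slab */
	FREE_SLOWPATH,		/* Freeing not to cpu slab */
	/* ... partial list, slab alloc/free, refill, flush and
	   deactivate events ... */
	NR_SLUB_STAT_ITEMS };

struct kmem_cache_cpu {
	void **freelist;	/* Pointer to first free per cpu object */
	struct page *page;	/* The slab from which we are allocating */
	int node;		/* The node of the page (or -1 for debug) */
	unsigned int offset;	/* Freepointer offset (in word units) */
	unsigned int objsize;	/* Size of an object (from kmem_cache) */
#ifdef CONFIG_SLUB_STATS
	unsigned stat[NR_SLUB_STAT_ITEMS];
#endif
};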

The compile option controlling the inclusion of the runtime statistics is
off by default.
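
The option is CONFIG_SLUB_STATS in lib/Kconfig.debug (abridged):

config SLUB_STATS
	default n
	bool "Enable SLUB performance statistics"
	depends on SLUB
	help
	  SLUB statistics are useful to debug SLUB's allocation behavior in
	  order to find ways to optimize the allocator. This should never be
	  enabled for production use since keeping statistics slows down
	  the allocator by a few percentage points.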

The slabinfo tool is enhanced to support these statistics:

-D	Switches the line of information displayed for a slab from size
	mode to activity mode.

-A	Sorts the slabs displayed by activity. This allows the display of
	the slabs most important to the performance of a certain load.

-r	The report option now includes the detailed statistics for a slab
	(implied when a cache name is given).

Example (tbench load):

slabinfo -AD		-> shows the most active slabs

Name                   Objects      Alloc       Free  %Fast
skbuff_fclone_cache         33  111953835  111953835  99  99
:0000192                  2666    5283688    5281047  99  99
:0001024                   849    5247230    5246389  83  83
vm_area_struct            1349     119642     118355  91  22
:0004096                    15      66753      66751  98  98
:0000064                  2067      25297      23383  98  78
dentry                   10259      28635      18464  91  45
:0000080                 11004      18950       8089  98  98
:0000096                  1703      12358      10784  99  98
:0000128                   762      10582       9875  94  18
:0000512                   184       9807       9647  95  81
:0002048                   479       9669       9195  83  65
anon_vma                   777       9461       9002  99  71
kmalloc-8                 6492       9981       5624  99  97
:0000768                   258       7174       6931  58  15
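
The two %Fast columns are the percentage of allocations and of frees that
took the fastpath. slabinfo derives them from the exported counters (excerpt
from the activity line format in slabinfo.c):

	total_alloc = s->alloc_fastpath + s->alloc_slowpath;
	total_free = s->free_fastpath + s->free_slowpath;

	printf("%-21s %8ld %8ld %8ld %3ld %3ld \n",
		s->name, s->objects,
		total_alloc, total_free,
		total_alloc ? (s->alloc_fastpath * 100 / total_alloc) : 0,
		total_free ? (s->free_fastpath * 100 / total_free) : 0);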

So the skbuff_fclone_cache is of highest importance for the tbench load.
There is also a pretty high load on the 192-byte slab. Look up the aliases:

slabinfo -a | grep 000192
:0000192 <- xfs_btree_cur filp kmalloc-192 uid_cache tw_sock_TCP
request_sock_TCPv6 tw_sock_TCPv6 skbuff_head_cache xfs_ili

Likely skbuff_head_cache.


Looking into the statistics of the skbuff_fclone_cache is possible through:

slabinfo skbuff_fclone_cache	-> -r option implied if a cache name is given


.... Usual output ...

Slab Perf Counter        Alloc      Free %Al %Fr
--------------------------------------------------
Fastpath             111953360 111946981  99  99
Slowpath                  1044      7423   0   0
Page Alloc                 272       264   0   0
Add partial                 25       325   0   0
Remove partial              86       264   0   0
RemoteObj/SlabFrozen       350      4832   0   0
Total                111954404 111954404

Flushes 49 Refill 0
Deactivate Full=325(92%) Empty=0(0%) ToHead=24(6%) ToTail=1(0%)

Looks good because the fastpath is overwhelmingly taken.


skbuff_head_cache:

Slab Perf Counter        Alloc      Free %Al %Fr
--------------------------------------------------
Fastpath               5297262   5259882  99  99
Slowpath                  4477     39586   0   0
Page Alloc                 937       824   0   0
Add partial                  0      2515   0   0
Remove partial            1691       824   0   0
RemoteObj/SlabFrozen      2621      9684   0   0
Total                  5301739   5299468

Deactivate Full=2620(100%) Empty=0(0%) ToHead=0(0%) ToTail=0(0%)
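
In these tables %Al and %Fr express each counter as a percentage of the
total allocations and of the total frees. Note that the alloc side of the
"Add partial" row is derived from the deactivate-to-head/tail counters;
slab_stats() in slabinfo.c prints that row as (excerpt):

	printf("Add partial          %8lu %8lu %3lu %3lu\n",
		s->deactivate_to_head + s->deactivate_to_tail,
		s->free_add_partial,
		(s->deactivate_to_head + s->deactivate_to_tail) * 100 / total_alloc,
		s->free_add_partial * 100 / total_free);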


Descriptions of the output:

Total:		The total number of allocations and frees that occurred for a
		slab.

Fastpath:	The number of allocations/frees that used the fastpath (each
		event is counted with the stat() helper sketched below).

Slowpath:	Allocations/frees that could not use the fastpath.

Page Alloc:	Number of calls to the page allocator as a result of slowpath
		processing.

Add Partial:	Number of slabs added to the partial list through free or
		alloc (occurs during cpuslab flushes).

Remove Partial:	Number of slabs removed from the partial list as a result of
		allocations retrieving a partial slab or by a free freeing
		the last object of a slab.

RemoteObj/Froz:	RemoteObj: how many times a remotely freed object was
		encountered when a slab was about to be deactivated.
		Frozen: how many times a free could skip list processing
		because the slab was in use as the cpuslab of another
		processor.

Flushes:	Number of times the cpuslab was flushed on request
		(kmem_cache_shrink, may result from races in __slab_alloc).

Refill:		Number of times we were able to refill the cpuslab from
		remotely freed objects for the same slab.

Deactivate:	Statistics on how slabs were deactivated and, where
		applicable, how they were put onto the partial list. The
		percentages are relative to the total number of
		deactivations.
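
Each counter above corresponds to one enum stat_item event. On the kernel
side the events are counted via a small helper that compiles away when
CONFIG_SLUB_STATS is off; the hot paths simply call it, e.g.
stat(c, ALLOC_FASTPATH) in slab_alloc() and stat(c, FREE_SLOWPATH) in
__slab_free() (from the mm/slub.c part of this patch):

static inline void stat(struct kmem_cache_cpu *c, enum stat_item si)
{
#ifdef CONFIG_SLUB_STATS
	c->stat[si]++;
#endif
}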

In general a dominant fastpath is very good. Slowpath processing that avoids
the partial list is also desirable. Any touching of the partial list takes
node-specific locks and may therefore cause list_lock contention.
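
The counters are exported through sysfs as one read-only file per event
under the cache's directory in /sys/kernel/slab/ (the STAT_ATTR attributes
added to mm/slub.c); slabinfo simply reads these files. Each file reports
the sum followed by the nonzero per-cpu contributions, for example (values
illustrative):

cat /sys/kernel/slab/skbuff_head_cache/alloc_fastpath
5297262 c0=1329312 c1=1324017 c2=1323003 c3=1320930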

Signed-off-by: Christoph Lameter <clameter@sgi.com>


 Documentation/vm/slabinfo.c  | +138  -11
 include/linux/slub_def.h     |  +23
 lib/Kconfig.debug            |  +13
 mm/slub.c                    | +119   -8
 4 files changed, 293 insertions(+), 19 deletions(-)