Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cgroup: Avoid false cacheline sharing of read mostly rstat_cpu

The rstat_cpu and rstat_css_list fields of the cgroup structure are
read-mostly. However, they may share a cacheline with the subsequent
rstat_flush_next and *bstat fields, which can be updated frequently.
That slows down cgroup_rstat_cpu(), which is called frequently in the
rstat code. Add a CACHELINE_PADDING() line between them to avoid false
cacheline sharing.

A parallel kernel build on a 2-socket x86-64 server is used as the
benchmarking tool for measuring the lock hold time. Below is the lock
hold time frequency distribution before and after the patch:

  Run time    Before patch    After patch
  --------    ------------    -----------
  0-01 us        9,928,562      9,820,428
  01-05 us         110,151         50,935
  05-10 us             270             93
  10-15 us             273            146
  15-20 us             135             76
  20-25 us               0              2
  25-30 us               1              0

It can be seen that the patch further pushes the lock hold time towards
the lower end.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

Authored by Waiman Long and committed by Tejun Heo
77070eeb d499fd41

 include/linux/cgroup-defs.h | 7 +++++++
 1 file changed, 7 insertions(+)
@@ -497,6 +497,13 @@
 	struct list_head rstat_css_list;
 
 	/*
+	 * Add padding to separate the read mostly rstat_cpu and
+	 * rstat_css_list into a different cacheline from the following
+	 * rstat_flush_next and *bstat fields which can have frequent updates.
+	 */
+	CACHELINE_PADDING(_pad_);
+
+	/*
 	 * A singly-linked list of cgroup structures to be rstat flushed.
 	 * This is a scratch field to be used exclusively by
 	 * cgroup_rstat_flush_locked() and protected by cgroup_rstat_lock.