Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

[PATCH] local_t: Documentation

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Mathieu Desnoyers and committed by Linus Torvalds
f1f8810c d4d23add

+163
Documentation/local_ops.txt
Semantics and Behavior of Local Atomic Operations

Mathieu Desnoyers


This document explains the purpose of the local atomic operations, how to
implement them for any given architecture, and how to use them properly. It
also stresses the precautions that must be taken when reading those local
variables across CPUs when the order of memory writes matters.



* Purpose of local atomic operations

Local atomic operations are meant to provide fast and highly reentrant per CPU
counters. They minimize the performance cost of standard atomic operations by
removing the LOCK prefix and memory barriers normally required to synchronize
across CPUs.

Having fast per CPU atomic counters is useful in many cases: it does not
require disabling interrupts to protect from interrupt handlers and it permits
coherent counters in NMI handlers. It is especially useful for tracing purposes
and for various performance monitoring counters.

Local atomic operations only guarantee variable modification atomicity with
respect to the CPU which owns the data. Therefore, care must be taken to make
sure that only one CPU writes to the local_t data. This is done by using per
cpu data and making sure that it is modified from within a preemption-safe
context. It is however permitted to read local_t data from any CPU: it will
then appear to be written out of order with respect to other memory writes on
the owner CPU.


* Implementation for a given architecture

Local atomic operations can be implemented by slightly modifying the standard
atomic operations: only their UP variant must be kept. This typically means
removing the LOCK prefix (on i386 and x86_64) and any SMP synchronization
barrier. If the architecture does not have a different behavior between SMP and
UP, including asm-generic/local.h in your architecture's local.h is sufficient.
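
As an illustration of what keeping only the UP variant means, here is a minimal
user-space sketch that contrasts an increment without the LOCK prefix with its
SMP counterpart. It assumes GCC or Clang inline assembly. The names ulocal_t,
ulocal_inc(), usmp_inc() and ulocal_read() are hypothetical stand-ins, not
kernel APIs, and the non-x86 branches are placeholders rather than real ports.

```c
/* Hypothetical user-space stand-in for local_t; not the kernel definition. */
typedef struct { long v; } ulocal_t;

/* UP-style increment: a single read-modify-write instruction, no LOCK prefix.
 * It cannot be split by an interrupt on the same CPU, but it gives no
 * atomicity guarantee against writers running on other CPUs.
 */
static inline void ulocal_inc(ulocal_t *l)
{
#if defined(__x86_64__)
	__asm__ __volatile__("incq %0" : "+m" (l->v));
#elif defined(__i386__)
	__asm__ __volatile__("incl %0" : "+m" (l->v));
#else
	/* Placeholder only: a real port would use the architecture's UP
	 * atomic sequence here.
	 */
	l->v++;
#endif
}

/* SMP-style increment, shown for comparison: the LOCK prefix makes the
 * read-modify-write atomic across CPUs, at a higher cost.
 */
static inline void usmp_inc(ulocal_t *l)
{
#if defined(__x86_64__)
	__asm__ __volatile__("lock; incq %0" : "+m" (l->v));
#elif defined(__i386__)
	__asm__ __volatile__("lock; incl %0" : "+m" (l->v));
#else
	/* Compiler builtin used as a generic fallback (an assumption). */
	__atomic_fetch_add(&l->v, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Plain read of the current value. */
static inline long ulocal_read(const ulocal_t *l)
{
	return l->v;
}
```

Both functions give the same result when a single thread owns the counter. The
difference only shows up as cost, and as correctness once several CPUs write to
the same variable, which the local operations forbid.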

The local_t type is defined as an opaque signed long by embedding an
atomic_long_t inside a structure. This is done so that a cast from this type to
a long fails. The definition looks like:

	typedef struct { atomic_long_t a; } local_t;


* How to use local atomic operations

#include <linux/percpu.h>
#include <asm/local.h>

static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);


* Counting

Counting is done on all the bits of a signed long.

In preemptible context, use get_cpu_var() and put_cpu_var() around local atomic
operations: this makes sure that preemption is disabled around write access to
the per cpu variable. For instance:

	local_inc(&get_cpu_var(counters));
	put_cpu_var(counters);

If you are already in a preemption-safe context, you can directly use
__get_cpu_var() instead.

	local_inc(&__get_cpu_var(counters));



* Reading the counters

Those local counters can be read from foreign CPUs to sum the count. Note that
the data seen by local_read() across CPUs must be considered to be out of order
relative to other memory writes happening on the CPU that owns the data.

	long sum = 0;
	for_each_online_cpu(cpu)
		sum += local_read(&per_cpu(counters, cpu));

If you want to use a remote local_read() to synchronize access to a resource
between CPUs, explicit smp_wmb() and smp_rmb() memory barriers must be used
respectively on the writer and the reader CPUs. It would be the case if you use
the local_t variable as a counter of bytes written in a buffer: there should
be a smp_wmb() between the buffer write and the counter increment and also a
smp_rmb() between the counter read and the buffer read.


Here is a sample module which implements a basic per cpu counter using local.h.

--- BEGIN ---
/* test-local.c
 *
 * Sample module for local.h usage.
 */


#include <asm/local.h>
#include <linux/module.h>
#include <linux/timer.h>

static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);

static struct timer_list test_timer;

/* IPI called on each CPU. */
static void test_each(void *info)
{
	/* Increment the counter from a non preemptible context */
	printk("Increment on cpu %d\n", smp_processor_id());
	local_inc(&__get_cpu_var(counters));

	/* This is what incrementing the variable would look like within a
	 * preemptible context (it disables preemption) :
	 *
	 * local_inc(&get_cpu_var(counters));
	 * put_cpu_var(counters);
	 */
}

static void do_test_timer(unsigned long data)
{
	int cpu;

	/* Increment the counters */
	on_each_cpu(test_each, NULL, 0, 1);
	/* Read all the counters */
	printk("Counters read from CPU %d\n", smp_processor_id());
	for_each_online_cpu(cpu) {
		printk("Read : CPU %d, count %ld\n", cpu,
			local_read(&per_cpu(counters, cpu)));
	}
	del_timer(&test_timer);
	test_timer.expires = jiffies + 1000;
	add_timer(&test_timer);
}

static int __init test_init(void)
{
	/* initialize the timer that will increment the counter */
	init_timer(&test_timer);
	test_timer.function = do_test_timer;
	test_timer.expires = jiffies + 1;
	add_timer(&test_timer);

	return 0;
}

static void __exit test_exit(void)
{
	del_timer_sync(&test_timer);
}

module_init(test_init);
module_exit(test_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers");
MODULE_DESCRIPTION("Local Atomic Ops");
--- END ---
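
The barrier pairing described under "Reading the counters" (a byte counter
that publishes data written into a buffer) can be sketched in user space with
C11 fences. This is only an analogy under stated assumptions: struct ubuf,
ubuf_write() and ubuf_read() are hypothetical names, the release fence stands
in for smp_wmb(), the acquire fence stands in for smp_rmb(), and a single
writer thread plays the role of the owner CPU.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define UBUF_SIZE 64

/* Hypothetical stand-in for a per-CPU buffer and its local_t byte counter. */
struct ubuf {
	char data[UBUF_SIZE];
	atomic_long written;	/* number of bytes published so far */
};

/* Writer side (the owner CPU in the kernel case): fill the buffer first,
 * then update the counter. Returns 0 on success, -1 if the data does not fit.
 * Assumes a single writer, so a relaxed load followed by a relaxed store of
 * the counter is enough here.
 */
static int ubuf_write(struct ubuf *b, const char *src, size_t len)
{
	long pos = atomic_load_explicit(&b->written, memory_order_relaxed);

	if (len > (size_t)(UBUF_SIZE - pos))
		return -1;
	memcpy(b->data + pos, src, len);
	/* Plays the role of smp_wmb(): buffer stores are ordered before the
	 * counter update.
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&b->written, pos + (long)len,
			      memory_order_relaxed);
	return 0;
}

/* Reader side (any CPU): read the counter first, then the buffer.
 * Copies at most max bytes into dst and returns the number copied.
 */
static size_t ubuf_read(struct ubuf *b, char *dst, size_t max)
{
	long avail = atomic_load_explicit(&b->written, memory_order_relaxed);

	/* Plays the role of smp_rmb(): the counter load is ordered before
	 * the buffer loads.
	 */
	atomic_thread_fence(memory_order_acquire);
	if ((size_t)avail > max)
		avail = (long)max;
	memcpy(dst, b->data, (size_t)avail);
	return (size_t)avail;
}
```

Without the two fences, a reader could observe the new counter value while
still seeing stale buffer contents, which is the out-of-order behavior the
document warns about.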