Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

lib/percpu_counter: percpu_counter_add_batch() overflow/underflow

Patch series "various irq handling fixes/docu updates".

If an interrupt occurs between __this_cpu_read(*fbc->counters) and
this_cpu_add(*fbc->counters, amount), and that interrupt modifies the
same per-CPU counter, then the this_cpu_add() executed after the
interrupt returns may overflow or underflow.

Link: https://lkml.kernel.org/r/20221216150155.200389-1-manfred@colorfullife.com
Link: https://lkml.kernel.org/r/20221216150441.200533-1-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: "Sun, Jiebin" <jiebin.sun@intel.com>
Cc: <1vier1@web.de>
Cc: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Manfred Spraul and committed by Andrew Morton
805afd83 a9dc087f

+15 -10
lib/percpu_counter.c
@@ -73,28 +73,33 @@
 EXPORT_SYMBOL(percpu_counter_set);
 
 /*
- * This function is both preempt and irq safe. The former is due to explicit
- * preemption disable. The latter is guaranteed by the fact that the slow path
- * is explicitly protected by an irq-safe spinlock whereas the fast patch uses
- * this_cpu_add which is irq-safe by definition. Hence there is no need muck
- * with irq state before calling this one
+ * local_irq_save() is needed to make the function irq safe:
+ * - The slow path would be ok as protected by an irq-safe spinlock.
+ * - this_cpu_add would be ok as it is irq-safe by definition.
+ * But:
+ * The decision slow path/fast path and the actual update must be atomic, too.
+ * Otherwise a call in process context could check the current values and
+ * decide that the fast path can be used. If now an interrupt occurs before
+ * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
+ * then the this_cpu_add() that is executed after the interrupt has completed
+ * can produce values larger than "batch" or even overflows.
  */
 void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 {
 	s64 count;
+	unsigned long flags;
 
-	preempt_disable();
+	local_irq_save(flags);
 	count = __this_cpu_read(*fbc->counters) + amount;
 	if (abs(count) >= batch) {
-		unsigned long flags;
-		raw_spin_lock_irqsave(&fbc->lock, flags);
+		raw_spin_lock(&fbc->lock);
 		fbc->count += count;
 		__this_cpu_sub(*fbc->counters, count - amount);
-		raw_spin_unlock_irqrestore(&fbc->lock, flags);
+		raw_spin_unlock(&fbc->lock);
 	} else {
 		this_cpu_add(*fbc->counters, amount);
 	}
-	preempt_enable();
+	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
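
The window described in the commit message can be replayed deterministically in userspace. The following is only a sketch under stated assumptions, not kernel code: struct model_counter, model_add_batch(), irq_adds_30() and run_demo() are hypothetical names, the "interrupt" is a plain function call placed either inside the read/add window (modelling the old preempt_disable() version) or after the update (modelling an interrupt held off by local_irq_save()), and the amounts 30 and batch 32 are arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical single-CPU model of a percpu_counter. */
struct model_counter {
	int64_t global;	/* stands in for fbc->count */
	int32_t local;	/* stands in for this CPU's *fbc->counters */
};

/* Plays the role of an interrupt handler touching the same counter. */
typedef void (*irq_hook)(struct model_counter *c);

/*
 * Model of the add path. With irqs_disabled == 0 the hook fires between
 * the fast-path decision and the add (the old behaviour). With
 * irqs_disabled == 1 the hook is deferred until the update is complete,
 * which is what local_irq_save()/local_irq_restore() is assumed to give.
 */
void model_add_batch(struct model_counter *c, int64_t amount, int32_t batch,
		     irq_hook irq, int irqs_disabled)
{
	int64_t count = c->local + amount;	/* read + amount */

	if (llabs(count) >= batch) {
		/* Slow path: fold the local value into the global count. */
		c->global += count;
		c->local -= (int32_t)(count - amount);
	} else {
		if (irq && !irqs_disabled)
			irq(c);	/* interrupt lands inside the window */
		/* Fast path, based on a decision that may now be stale. */
		c->local += (int32_t)amount;
	}

	if (irq && irqs_disabled)
		irq(c);	/* pending interrupt runs after the update */
}

/* The modelled interrupt adds 30 to the same counter with batch 32. */
void irq_adds_30(struct model_counter *c)
{
	model_add_batch(c, 30, 32, NULL, 0);
}

/* Process context adds 30 with batch 32 while the interrupt is raised. */
struct model_counter run_demo(int irqs_disabled)
{
	struct model_counter c = { .global = 0, .local = 0 };

	model_add_batch(&c, 30, 32, irq_adds_30, irqs_disabled);
	return c;
}
```

In this model run_demo(0) ends with the local value at 60, well past the batch of 32, because both callers chose the fast path from the same starting value. run_demo(1) ends with the local value at 0 and the global count at 60, since the deferred interrupt sees the updated local value and takes the slow path. The total is the same in both runs; what changes is whether the per-CPU value stays bounded by the batch, and with large enough amounts the unbounded case is where the 32-bit per-CPU value can wrap.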