Pad irq_desc to internode cacheline size

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

We noticed a drop in network performance because irq_desc is cacheline
aligned rather than internode aligned. We see 50% of expected performance
when two e1000 NICs local to two different nodes have consecutive irq
descriptors allocated, due to false sharing.
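
The layout problem described above can be sketched in user space. This is
only an illustration, not kernel code: the struct names, the helper
same_block(), and the sizes (64-byte cache line, 4096-byte internode line)
are assumptions chosen for the example; real values are per-architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for illustration only; real values are per-architecture. */
#define CACHELINE_BYTES 64
#define INTERNODE_BYTES 4096

/* Hypothetical stand-in for a descriptor aligned to the cache line. */
struct desc_cacheline {
	void *handler;
	unsigned int status;
} __attribute__((aligned(CACHELINE_BYTES)));

/* The same fields, aligned (and therefore padded) to the internode line. */
struct desc_internode {
	void *handler;
	unsigned int status;
} __attribute__((aligned(INTERNODE_BYTES)));

/* sizeof is rounded up to the alignment, so it is also the array stride. */
static_assert(sizeof(struct desc_cacheline) == CACHELINE_BYTES,
	      "cacheline-aligned element occupies one cache line");
static_assert(sizeof(struct desc_internode) == INTERNODE_BYTES,
	      "internode-aligned element occupies one internode line");

/* Return 1 if byte offsets a and b lie in the same block of `block` bytes. */
static inline int same_block(size_t a, size_t b, size_t block)
{
	return a / block == b / block;
}
```

Under these assumed sizes, same_block(0, sizeof(struct desc_cacheline),
INTERNODE_BYTES) is 1: array elements 0 and 1 share an internode line, so
two nodes writing to consecutive descriptors contend for it. With
struct desc_internode the same call yields 0.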

Note that this patch does away with cacheline padding for the UP case, as
it does not seem useful for UP configurations.
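
How an "_in_smp" alignment annotation can vanish on UP builds may be
sketched as follows. This is a user-space imitation under stated
assumptions, not the kernel's definition: DEMO_SMP, DEMO_INTERNODE_SHIFT,
demo_internodealigned_in_smp and struct demo_desc are hypothetical names
invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed internode line shift (4096 bytes); illustration only. */
#define DEMO_INTERNODE_SHIFT 12

/* Hypothetical switch standing in for an SMP build option. */
#ifdef DEMO_SMP
#define demo_internodealigned_in_smp \
	__attribute__((aligned(1 << DEMO_INTERNODE_SHIFT)))
#else
/* UP build: the annotation expands to nothing, so no padding is added. */
#define demo_internodealigned_in_smp
#endif

/* Hypothetical descriptor with a few pointer-sized fields. */
struct demo_desc {
	void *handle_irq;
	unsigned int status;
	const char *name;
} demo_internodealigned_in_smp;

#ifdef DEMO_SMP
static_assert(sizeof(struct demo_desc) == (1 << DEMO_INTERNODE_SHIFT),
	      "SMP build: each element is padded to the internode line");
#else
static_assert(sizeof(struct demo_desc) < 64,
	      "UP build: elements keep their natural size");
#endif
```

Compiling with -DDEMO_SMP pads every array element to the assumed internode
line; compiling without it leaves the struct at its natural size, which is
the memory saving the UP case gets.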

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Ravikiran G Thirumalai and committed by Linus Torvalds 19 years ago e729aa16 428e6ce0

+2 -4

2 changed files

+1 -3

include/linux/irq.h

···
  * @dir:		/proc/irq/ procfs entry
  * @affinity_entry:	/proc/irq/smp_affinity procfs entry on SMP
  * @name:		flow handler name for /proc/interrupts output
- *
- * Pad this out to 32 bytes for cache and indexing reasons.
  */
 struct irq_desc {
 	irq_flow_handler_t	handle_irq;
···
 	struct proc_dir_entry	*dir;
 #endif
 	const char		*name;
-} ____cacheline_aligned;
+} ____cacheline_internodealigned_in_smp;
 
 extern struct irq_desc irq_desc[NR_IRQS];

+1 -1

kernel/irq/handle.c

···
  *
  * Controller mappings for all interrupt sources:
  */
-struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned = {
+struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
 	[0 ... NR_IRQS-1] = {
 		.status = IRQ_DISABLED,
 		.chip = &no_irq_chip,