x86_64 irq: use mask/unmask and proper locking in fixup_irqs()

The forced irq migration path during cpu offline does not take the proper
locks or use the irq_chip mask/unmask routines. This results in races (in
particular, the device generating the interrupt can see inconsistent state,
leading to issues such as a stuck irq).

This patch fixes the issue by taking the proper lock and wrapping the
irq_chip set_affinity() call with a mask() before and an unmask() after.
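
For illustration only, the lock / mask / set_affinity / unmask ordering
described above can be modeled in user space. This is a minimal sketch, not
kernel code: struct toy_chip, struct toy_irq, toy_migrate() and the other
toy_* names are hypothetical stand-ins, the lock is modeled as a plain flag,
and CPU sets are modeled as bitmasks rather than cpumask_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_irq;

/* Hypothetical stand-in for the irq_chip callbacks; any of them may be NULL. */
struct toy_chip {
	void (*mask)(struct toy_irq *d);
	void (*unmask)(struct toy_irq *d);
	void (*set_affinity)(struct toy_irq *d, unsigned long cpus);
};

/* Hypothetical stand-in for one irq descriptor. */
struct toy_irq {
	bool locked;			/* models irq_desc[irq].lock */
	bool masked;
	unsigned long affinity;		/* one bit per CPU */
	bool retargeted_while_masked;	/* set by toy_set_affinity() */
	const struct toy_chip *chip;
};

static void toy_mask(struct toy_irq *d)   { d->masked = true; }
static void toy_unmask(struct toy_irq *d) { d->masked = false; }

static void toy_set_affinity(struct toy_irq *d, unsigned long cpus)
{
	/* Record whether the retarget happened with the lock held and irq masked. */
	d->retargeted_while_masked = d->locked && d->masked;
	d->affinity = cpus;
}

static const struct toy_chip toy_chip_ops = {
	toy_mask, toy_unmask, toy_set_affinity
};

/* Returns true if the chip was able to change the affinity. */
static bool toy_migrate(struct toy_irq *d, unsigned long cpus)
{
	bool changed = false;

	assert(!d->locked);
	d->locked = true;		/* spin_lock() in the patch */

	if (d->chip->mask)
		d->chip->mask(d);

	if (d->chip->set_affinity) {
		d->chip->set_affinity(d, cpus);
		changed = true;
	}

	if (d->chip->unmask)
		d->chip->unmask(d);

	d->locked = false;		/* spin_unlock() in the patch */
	return changed;
}
```

In this model the retarget is only ever observed while the descriptor is
both locked and masked, which is the property the patch aims for.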

This fixes a stuck MSI irq issue reported by Darrick Wong.

There are several more general bugs in this area (irq migration in
process context). For example:

1. Possibility of missing an edge-triggered irq.
2. Lack of a reliable method of migrating a level-triggered irq in process
   context.

We plan to look into and close these in the near future.

Eric says:
In addition even with the fix from Suresh there is still at least one
nasty hardware race in fixup_irqs(). However we exercise that code
path rarely enough that we are unlikely to hit it in the real world,
and that race seems to have existed since the code was merged. And a
fix for that is not coming soon as it is an open investigation area
if we can fix irq migration to work outside of irq context or if
we have to rework the requirements imposed by the generic cpu hotplug
and layer on fixup_irqs(). So this may come up again.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Reported-and-tested-by: Darrick Wong <djwong@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

authored by Siddha, Suresh B and committed by Linus Torvalds 48d8d7ee c47e285d

+29 -3
arch/x86_64/kernel/irq.c
@@ -144,17 +144,43 @@
 
 	for (irq = 0; irq < NR_IRQS; irq++) {
 		cpumask_t mask;
+		int break_affinity = 0;
+		int set_affinity = 1;
+
 		if (irq == 2)
 			continue;
 
+		/* interrupt's are disabled at this point */
+		spin_lock(&irq_desc[irq].lock);
+
+		if (!irq_has_action(irq) ||
+		    cpus_equal(irq_desc[irq].affinity, map)) {
+			spin_unlock(&irq_desc[irq].lock);
+			continue;
+		}
+
 		cpus_and(mask, irq_desc[irq].affinity, map);
-		if (any_online_cpu(mask) == NR_CPUS) {
-			printk("Breaking affinity for irq %i\n", irq);
+		if (cpus_empty(mask)) {
+			break_affinity = 1;
 			mask = map;
 		}
+
+		if (irq_desc[irq].chip->mask)
+			irq_desc[irq].chip->mask(irq);
+
 		if (irq_desc[irq].chip->set_affinity)
 			irq_desc[irq].chip->set_affinity(irq, mask);
-		else if (irq_desc[irq].action && !(warned++))
+		else if (!(warned++))
+			set_affinity = 0;
+
+		if (irq_desc[irq].chip->unmask)
+			irq_desc[irq].chip->unmask(irq);
+
+		spin_unlock(&irq_desc[irq].lock);
+
+		if (break_affinity && set_affinity)
+			printk("Broke affinity for irq %i\n", irq);
+		else if (!set_affinity)
 			printk("Cannot set affinity for irq %i\n", irq);
 	}
 
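
As a reading aid for the hunk above, the target-mask selection (cpus_and()
followed by the cpus_empty() fallback) can be sketched in isolation. This is
an assumption-laden model, not the kernel code: toy_pick_target() is a
hypothetical helper, and CPU sets are plain bitmasks instead of cpumask_t.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: choose the CPUs an irq should target once only the
 * CPUs in 'online' remain. Sets *broke_affinity when the configured
 * affinity had to be discarded.
 */
static unsigned long toy_pick_target(unsigned long affinity,
				     unsigned long online,
				     bool *broke_affinity)
{
	unsigned long target = affinity & online;	/* cpus_and() */

	*broke_affinity = false;
	if (target == 0) {				/* cpus_empty() */
		/* No requested CPU is left: fall back to every remaining CPU. */
		*broke_affinity = true;
		target = online;
	}
	return target;
}
```

With affinity 0x3 and online 0x6 the sketch keeps CPU 1 only; with affinity
0x1 and online 0x6 no requested CPU remains, so it falls back to the full
online set and reports that affinity was broken.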