Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc: Fix checkstop in native_hpte_clear() with lockdep

native_hpte_clear() is called in real mode from two places:
- Early in boot during htab initialisation if firmware assisted dump is
active.
- Late in the kexec path.

In both contexts there is no need to disable interrupts, as they are
already disabled. Furthermore, locking around the tlbie() is only required
for pre-POWER5 hardware.

On POWER5 or newer hardware concurrent tlbie()s work as expected. On
pre-POWER5 hardware concurrent tlbie()s could result in deadlock, but in
the second case this code would only be executed at crashdump time, during
which all bets are off: concurrent tlbie()s are unlikely and taking locks
is unsafe, so the best course of action is simply not to lock. Concurrent
tlbie()s are not possible in the first case, as secondary CPUs have not
come up yet.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Authored by Cyril Bur and committed by Michael Ellerman
fdf880a6 4108efb0

+18 -14
+7 -2
arch/powerpc/include/asm/machdep.h
···
 					       unsigned long addr,
 					       unsigned char *hpte_slot_array,
 					       int psize, int ssize, int local);
-	/* special for kexec, to be called in real mode, linear mapping is
-	 * destroyed as well */
+	/*
+	 * Special for kexec.
+	 * To be called in real mode with interrupts disabled. No locks are
+	 * taken as such, concurrent access on pre POWER5 hardware could result
+	 * in a deadlock.
+	 * The linear mapping is destroyed as well.
+	 */
 	void		(*hpte_clear_all)(void);
 
 	void __iomem *	(*ioremap)(phys_addr_t addr, unsigned long size,
+11 -12
arch/powerpc/mm/hash_native_64.c
···
  * be when they isi), and we are the only one left. We rely on our kernel
  * mapping being 0xC0's and the hardware ignoring those two real bits.
  *
+ * This must be called with interrupts disabled.
+ *
+ * Taking the native_tlbie_lock is unsafe here due to the possibility of
+ * lockdep being on. On pre POWER5 hardware, not taking the lock could
+ * cause deadlock. POWER5 and newer not taking the lock is fine. This only
+ * gets called during boot before secondary CPUs have come up and during
+ * crashdump and all bets are off anyway.
+ *
  * TODO: add batching support when enabled. remember, no dynamic memory here,
  * athough there is the control page available...
  */
 static void native_hpte_clear(void)
 {
 	unsigned long vpn = 0;
-	unsigned long slot, slots, flags;
+	unsigned long slot, slots;
 	struct hash_pte *hptep = htab_address;
 	unsigned long hpte_v;
 	unsigned long pteg_count;
 	int psize, apsize, ssize;
 
 	pteg_count = htab_hash_mask + 1;
-
-	local_irq_save(flags);
-
-	/* we take the tlbie lock and hold it. Some hardware will
-	 * deadlock if we try to tlbie from two processors at once.
-	 */
-	raw_spin_lock(&native_tlbie_lock);
 
 	slots = pteg_count * HPTES_PER_GROUP;
 
···
 		hpte_v = be64_to_cpu(hptep->v);
 
 		/*
-		 * Call __tlbie() here rather than tlbie() since we
-		 * already hold the native_tlbie_lock.
+		 * Call __tlbie() here rather than tlbie() since we can't take the
+		 * native_tlbie_lock.
 		 */
 		if (hpte_v & HPTE_V_VALID) {
 			hpte_decode(hptep, slot, &psize, &apsize, &ssize, &vpn);
···
 	}
 
 	asm volatile("eieio; tlbsync; ptesync":::"memory");
-	raw_spin_unlock(&native_tlbie_lock);
-	local_irq_restore(flags);
 }
 
 /*
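
The locking rationale in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct model_cpu`, `model_raw_tlbie()`, `model_tlbie()` and `model_hpte_clear()` are hypothetical names invented for illustration, and the "lock" is just a counter. It assumes that the normal invalidation path takes the lock only on hardware that needs serialisation (pre-POWER5), while the hash table clearing path goes straight to the raw invalidation and relies on the caller having interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct model_cpu {
	bool lockless_tlbie;        /* true models POWER5 or newer */
	bool irqs_disabled;         /* interrupt state set up by the caller */
	unsigned lock_acquisitions; /* how often the modelled lock was taken */
	unsigned invalidations;     /* how many invalidations were issued */
};

/* Models __tlbie(): issues the invalidation and never touches the lock. */
static void model_raw_tlbie(struct model_cpu *cpu)
{
	cpu->invalidations++;
}

/* Models tlbie(): serialises only when the hardware requires it. */
static void model_tlbie(struct model_cpu *cpu)
{
	bool need_lock = !cpu->lockless_tlbie;

	if (need_lock)
		cpu->lock_acquisitions++; /* stands in for taking the lock */
	model_raw_tlbie(cpu);
	/* the matching unlock would go here in a real implementation */
}

/*
 * Models the clearing loop after this change: no interrupt save/restore
 * and no lock, because the caller is already in a context where
 * interrupts are disabled and no other CPU is expected to invalidate.
 */
static void model_hpte_clear(struct model_cpu *cpu,
			     const bool *valid, size_t nslots)
{
	assert(cpu->irqs_disabled); /* precondition from the new comment */

	for (size_t slot = 0; slot < nslots; slot++) {
		if (valid[slot])
			model_raw_tlbie(cpu);
	}
}
```

In this model, clearing a table never increments `lock_acquisitions`, whatever the value of `lockless_tlbie`, whereas `model_tlbie()` does so only when `lockless_tlbie` is false.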