Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc/mm: Update PROTFAULT handling in the page fault path

With radix, we can get a page fault with DSISR_PROTFAULT set in the case of a
PROT_NONE or autonuma mapping. The PROT_NONE case is handled by the vma check,
where we consider the access bad. For autonuma we should fall through and fix
up the access mask correctly.

Without this patch we trigger the WARN_ON() on radix. This patch moves that
WARN_ON() under a !radix_enabled() check. I also moved the WARN_ON() outside
the if condition, making it apply to all types of faults (exec/write/read). It
is also made conditional on book3s, because BOOK3E can also get a PROTFAULT to
handle the D/I cache sync.
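
The resulting condition in arch/powerpc/mm/fault.c can be modelled in user
space roughly as follows. This is only a sketch for reasoning about the patch,
not kernel code: struct fault_ctx and protfault_is_unexpected() are
hypothetical names, the kernel's CONFIG_PPC_STD_MMU and radix_enabled() are
replaced by plain booleans, and the DSISR_PROTFAULT value is assumed rather
than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for the protection-fault bit; illustrative only. */
#define DSISR_PROTFAULT 0x08000000u

/* Hypothetical stand-in for the state the fault handler consults. */
struct fault_ctx {
	bool std_mmu;        /* models CONFIG_PPC_STD_MMU (server MMU) */
	bool radix;          /* models radix_enabled() */
	bool is_write;       /* the fault was caused by a store */
	uint32_t error_code; /* DSISR-style error bits */
};

/* Returns true where the patched code would trip WARN_ON_ONCE(). */
static bool protfault_is_unexpected(const struct fault_ctx *c)
{
	assert(c != NULL);

	if (!c->std_mmu)
		return false; /* embedded: PROTFAULT drives the D/I cache sync */
	if (c->radix)
		return false; /* radix: autonuma mappings raise PROTFAULT */
	if (c->is_write)
		return false; /* hash: relaxing access does not flush the HPTE */

	return (c->error_code & DSISR_PROTFAULT) != 0;
}
```

In this model only a non-write PROTFAULT on a hash (non-radix) server MMU is
reported as unexpected, which mirrors the `!radix_enabled() && !is_write`
test added by the patch.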

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Authored by Aneesh Kumar K.V and committed by Michael Ellerman
18061c17 c21a493a

+39 -14
+6 -4
arch/powerpc/mm/copro_fault.c
···
 		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
 			goto out_unlock;
 		/*
-		 * protfault should only happen due to us
-		 * mapping a region readonly temporarily. PROT_NONE
-		 * is also covered by the VMA check above.
+		 * PROT_NONE is covered by the VMA check above.
+		 * and hash should get a NOHPTE fault instead of
+		 * a PROTFAULT in case fixup is needed for things
+		 * like autonuma.
 		 */
-		WARN_ON_ONCE(dsisr & DSISR_PROTFAULT);
+		if (!radix_enabled())
+			WARN_ON_ONCE(dsisr & DSISR_PROTFAULT);
 	}
 
 	ret = 0;
+33 -10
arch/powerpc/mm/fault.c
···
 		    (cpu_has_feature(CPU_FTR_NOEXECUTE) ||
 		     !(vma->vm_flags & (VM_READ | VM_WRITE))))
 			goto bad_area;
-
-#ifdef CONFIG_PPC_STD_MMU
-		/*
-		 * protfault should only happen due to us
-		 * mapping a region readonly temporarily. PROT_NONE
-		 * is also covered by the VMA check above.
-		 */
-		WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
-#endif /* CONFIG_PPC_STD_MMU */
 	/* a write */
 	} else if (is_write) {
 		if (!(vma->vm_flags & VM_WRITE))
···
 	} else {
 		if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
 			goto bad_area;
-		WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
 	}
+#ifdef CONFIG_PPC_STD_MMU
+	/*
+	 * For hash translation mode, we should never get a
+	 * PROTFAULT. Any update to pte to reduce access will result in us
+	 * removing the hash page table entry, thus resulting in a DSISR_NOHPTE
+	 * fault instead of DSISR_PROTFAULT.
+	 *
+	 * A pte update to relax the access will not result in a hash page table
+	 * entry invalidate and hence can result in DSISR_PROTFAULT.
+	 * ptep_set_access_flags() doesn't do a hpte flush. This is why we have
+	 * the special !is_write in the below conditional.
+	 *
+	 * For platforms that doesn't supports coherent icache and do support
+	 * per page noexec bit, we do setup things such that we do the
+	 * sync between D/I cache via fault. But that is handled via low level
+	 * hash fault code (hash_page_do_lazy_icache()) and we should not reach
+	 * here in such case.
+	 *
+	 * For wrong access that can result in PROTFAULT, the above vma->vm_flags
+	 * check should handle those and hence we should fall to the bad_area
+	 * handling correctly.
+	 *
+	 * For embedded with per page exec support that doesn't support coherent
+	 * icache we do get PROTFAULT and we handle that D/I cache sync in
+	 * set_pte_at while taking the noexec/prot fault. Hence this is WARN_ON
+	 * is conditional for server MMU.
+	 *
+	 * For radix, we can get prot fault for autonuma case, because radix
+	 * page table will have them marked noaccess for user.
+	 */
+	if (!radix_enabled() && !is_write)
+		WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
+#endif /* CONFIG_PPC_STD_MMU */
 
 	/*
 	 * If for any reason at all we couldn't handle the fault,