Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers

Commit ddcadb297ce5 ("KVM: arm64: Ignore EAGAIN for walks outside of a
fault") introduced a new walker flag ('KVM_PGTABLE_WALK_HANDLE_FAULT')
to KVM's page-table code. When set, the walk logic maintains its
previous behaviour of terminating a walk as soon as the visitor callback
returns an error. However, when the flag is clear, the walk continues
when the visitor returns -EAGAIN; the error is suppressed and the walk
reports success (zero) to the caller.

Clearing the flag is beneficial when write-protecting a range of IPAs
with kvm_pgtable_stage2_wrprotect() but is not useful in any other
cases, either because we are operating on a single page (e.g.
kvm_pgtable_stage2_mkyoung() or kvm_phys_addr_ioremap()) or because the
early termination is desirable (e.g. when mapping pages from a fault in
user_mem_abort()).

Subsequently, commit e912efed485a ("KVM: arm64: Introduce the EL1 pKVM
MMU") hooked up pKVM's hypercall interface to the MMU code at EL1 but
failed to propagate any of the walker flags. As a result, page-table
walks at EL2 fail to set KVM_PGTABLE_WALK_HANDLE_FAULT even when the
early termination semantics are desirable on the fault handling path.

Rather than complicate the pKVM hypercall interface, invert the flag so
that the whole thing can be simplified and only pass the new flag
('KVM_PGTABLE_WALK_IGNORE_EAGAIN') from the wrprotect code.

Cc: Fuad Tabba <tabba@google.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://msgid.link/20260105154939.11041-2-will@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>

Authored by Will Deacon and committed by Oliver Upton (19cffd16, 86364832)

3 changed files: +9 -10
arch/arm64/include/asm/kvm_pgtable.h (+3 -3)
@@ -301,8 +301,8 @@
  *					children.
  * @KVM_PGTABLE_WALK_SHARED:		Indicates the page-tables may be shared
  *					with other software walkers.
- * @KVM_PGTABLE_WALK_HANDLE_FAULT:	Indicates the page-table walk was
- *					invoked from a fault handler.
+ * @KVM_PGTABLE_WALK_IGNORE_EAGAIN:	Don't terminate the walk early if
+ *					the walker returns -EAGAIN.
  * @KVM_PGTABLE_WALK_SKIP_BBM_TLBI:	Visit and update table entries
  *					without Break-before-make's
  *					TLB invalidation.
@@ -315,7 +315,7 @@
 	KVM_PGTABLE_WALK_TABLE_PRE		= BIT(1),
 	KVM_PGTABLE_WALK_TABLE_POST		= BIT(2),
 	KVM_PGTABLE_WALK_SHARED			= BIT(3),
-	KVM_PGTABLE_WALK_HANDLE_FAULT		= BIT(4),
+	KVM_PGTABLE_WALK_IGNORE_EAGAIN		= BIT(4),
 	KVM_PGTABLE_WALK_SKIP_BBM_TLBI		= BIT(5),
 	KVM_PGTABLE_WALK_SKIP_CMO		= BIT(6),
 };
arch/arm64/kvm/hyp/pgtable.c (+3 -2)
@@ -144,7 +144,7 @@
 	 * page table walk.
 	 */
 	if (r == -EAGAIN)
-		return !(walker->flags & KVM_PGTABLE_WALK_HANDLE_FAULT);
+		return walker->flags & KVM_PGTABLE_WALK_IGNORE_EAGAIN;
 
 	return !r;
 }
@@ -1262,7 +1262,8 @@
 {
 	return stage2_update_leaf_attrs(pgt, addr, size, 0,
 					KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W,
-					NULL, NULL, 0);
+					NULL, NULL,
+					KVM_PGTABLE_WALK_IGNORE_EAGAIN);
 }
 
 void kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr,
arch/arm64/kvm/mmu.c (+3 -5)
@@ -1563,14 +1563,12 @@
 	*prot &= ~KVM_PGTABLE_PROT_PX;
 }
 
-#define KVM_PGTABLE_WALK_MEMABORT_FLAGS (KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED)
-
 static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		      struct kvm_s2_trans *nested,
 		      struct kvm_memory_slot *memslot, bool is_perm)
 {
 	bool write_fault, exec_fault, writable;
-	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_MEMABORT_FLAGS;
+	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
 	unsigned long mmu_seq;
@@ -1663,7 +1665,7 @@
 	struct kvm_pgtable *pgt;
 	struct page *page;
 	vm_flags_t vm_flags;
-	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_MEMABORT_FLAGS;
+	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
 
 	if (fault_is_perm)
 		fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
@@ -1931,7 +1933,7 @@
 /* Resolve the access fault by making the page young again. */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
-	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED;
+	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
 	struct kvm_s2_mmu *mmu;
 
 	trace_kvm_access_fault(fault_ipa);