Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/mm: Always set the ASID valid bit for the INVLPGB instruction

When executing the INVLPGB instruction on a bare-metal host or hypervisor, if
the ASID valid bit is not set, the instruction will flush the TLB entries that
match the specified criteria for any ASID, not just those of the host. If
virtual machines are running on the system, this may result in inadvertent
flushes of guest TLB entries.

When executing the INVLPGB instruction in a guest and the INVLPGB instruction is
not intercepted by the hypervisor, the hardware will replace the requested ASID
with the guest ASID and set the ASID valid bit before doing the broadcast
invalidation. Thus a guest is only able to flush its own TLB entries.

So to limit the host TLB flushing reach, always set the ASID valid bit using an
ASID value of 0 (which represents the host/hypervisor). This will result in
the desired effect in both host and guest.
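
The host and guest cases described above can be sketched as a small userspace model. This is only an illustration of the matching rule as the commit message states it, not kernel code or a description of real hardware internals: `toy_tlb_entry` and `toy_invlpgb` are hypothetical names, and the model ignores virtual address, PCID and global-entry matching entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVLPGB_FLAG_ASID (1u << 2)	/* ASID valid bit, rax[2] */

/* Toy TLB entry: only the ASID tag matters for this illustration. */
struct toy_tlb_entry {
	uint16_t asid;	/* 0 = host/hypervisor, nonzero = a guest */
	bool valid;
};

/*
 * Hypothetical model of a broadcast flush.
 *  - in_guest: the instruction runs in a guest and is not intercepted, so
 *    the requested ASID is replaced by the guest's ASID and the ASID valid
 *    bit is forced on.
 *  - otherwise the requested ASID is compared only if the valid bit is set.
 * Returns the number of entries invalidated.
 */
static size_t toy_invlpgb(struct toy_tlb_entry *tlb, size_t n,
			  uint8_t flags, uint16_t req_asid,
			  bool in_guest, uint16_t guest_asid)
{
	size_t flushed = 0;

	assert(tlb != NULL || n == 0);

	if (in_guest) {
		req_asid = guest_asid;
		flags |= INVLPGB_FLAG_ASID;
	}

	for (size_t i = 0; i < n; i++) {
		if (!tlb[i].valid)
			continue;
		/* With the valid bit set, entries of other ASIDs survive. */
		if ((flags & INVLPGB_FLAG_ASID) && tlb[i].asid != req_asid)
			continue;
		tlb[i].valid = false;
		flushed++;
	}
	return flushed;
}
```

In this model, a host flush with the valid bit set and ASID 0 leaves guest entries alone, a guest flush only ever reaches the guest's own entries, and a host flush without the valid bit removes every remaining entry, which is the behavior the patch avoids.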

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250304120449.GHZ8bsYYyEBOKQIxBm@fat_crate.local

Authored by Tom Lendacky and committed by Ingo Molnar
634ab761 440a65b7

+32 -26
arch/x86/include/asm/tlb.h
···
 	PMD_STRIDE = 1
 };
 
+/*
+ * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
+ * of the three. For example:
+ * - FLAG_VA | FLAG_INCLUDE_GLOBAL: invalidate all TLB entries at the address
+ * - FLAG_PCID: invalidate all TLB entries matching the PCID
+ *
+ * The first is used to invalidate (kernel) mappings at a particular
+ * address across all processes.
+ *
+ * The latter invalidates all TLB entries matching a PCID.
+ */
+#define INVLPGB_FLAG_VA			BIT(0)
+#define INVLPGB_FLAG_PCID		BIT(1)
+#define INVLPGB_FLAG_ASID		BIT(2)
+#define INVLPGB_FLAG_INCLUDE_GLOBAL	BIT(3)
+#define INVLPGB_FLAG_FINAL_ONLY		BIT(4)
+#define INVLPGB_FLAG_INCLUDE_NESTED	BIT(5)
+
+/* The implied mode when all bits are clear: */
+#define INVLPGB_MODE_ALL_NONGLOBALS	0UL
+
 #ifdef CONFIG_BROADCAST_TLB_FLUSH
 /*
  * INVLPGB does broadcast TLB invalidation across all the CPUs in the system.
···
  * The INVLPGB instruction is weakly ordered, and a batch of invalidations can
  * be done in a parallel fashion.
  *
- * The instruction takes the number of extra pages to invalidate, beyond
- * the first page, while __invlpgb gets the more human readable number of
- * pages to invalidate.
+ * The instruction takes the number of extra pages to invalidate, beyond the
+ * first page, while __invlpgb gets the more human readable number of pages to
+ * invalidate.
  *
  * The bits in rax[0:2] determine respectively which components of the address
  * (VA, PCID, ASID) get compared when flushing. If neither bits are set, *any*
  * address in the specified range matches.
+ *
+ * Since it is desired to only flush TLB entries for the ASID that is executing
+ * the instruction (a host/hypervisor or a guest), the ASID valid bit should
+ * always be set. On a host/hypervisor, the hardware will use the ASID value
+ * specified in EDX[15:0] (which should be 0). On a guest, the hardware will
+ * use the actual ASID value of the guest.
  *
  * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
  * this CPU have completed.
···
 			     unsigned long addr, u16 nr_pages,
 			     enum addr_stride stride, u8 flags)
 {
-	u32 edx = (pcid << 16) | asid;
+	u64 rax = addr | flags | INVLPGB_FLAG_ASID;
 	u32 ecx = (stride << 31) | (nr_pages - 1);
-	u64 rax = addr | flags;
+	u32 edx = (pcid << 16) | asid;
 
 	/* The low bits in rax are for flags. Verify addr is clean. */
 	VM_WARN_ON_ONCE(addr & ~PAGE_MASK);
···
 static inline void __invlpgb_all(unsigned long asid, unsigned long pcid, u8 flags) { }
 static inline void __tlbsync(void) { }
 #endif
-
-/*
- * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
- * of the three. For example:
- * - FLAG_VA | FLAG_INCLUDE_GLOBAL: invalidate all TLB entries at the address
- * - FLAG_PCID: invalidate all TLB entries matching the PCID
- *
- * The first is used to invalidate (kernel) mappings at a particular
- * address across all processes.
- *
- * The latter invalidates all TLB entries matching a PCID.
- */
-#define INVLPGB_FLAG_VA			BIT(0)
-#define INVLPGB_FLAG_PCID		BIT(1)
-#define INVLPGB_FLAG_ASID		BIT(2)
-#define INVLPGB_FLAG_INCLUDE_GLOBAL	BIT(3)
-#define INVLPGB_FLAG_FINAL_ONLY		BIT(4)
-#define INVLPGB_FLAG_INCLUDE_NESTED	BIT(5)
-
-/* The implied mode when all bits are clear: */
-#define INVLPGB_MODE_ALL_NONGLOBALS	0UL
 
 static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						unsigned long addr,
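
The register packing that the patched `__invlpgb()` performs can be checked in isolation with a userspace sketch like the one below. It mirrors the three assignments from the diff; the `invlpgb_regs` struct and the `invlpgb_encode` helper are hypothetical names introduced for illustration, the 4 KiB `PAGE_SHIFT` is an assumption, and nothing here executes the actual instruction.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define PAGE_MASK  (~((UINT64_C(1) << PAGE_SHIFT) - 1))

#define INVLPGB_FLAG_VA   (1u << 0)
#define INVLPGB_FLAG_PCID (1u << 1)
#define INVLPGB_FLAG_ASID (1u << 2)

enum addr_stride { PTE_STRIDE = 0, PMD_STRIDE = 1 };

/* Hypothetical container for the three register operands. */
struct invlpgb_regs {
	uint64_t rax;
	uint32_t ecx;
	uint32_t edx;
};

/* Hypothetical helper: packs the operands the way the patched code does. */
static struct invlpgb_regs invlpgb_encode(uint16_t asid, uint16_t pcid,
					  uint64_t addr, uint16_t nr_pages,
					  enum addr_stride stride,
					  uint8_t flags)
{
	struct invlpgb_regs r;

	/* The low bits of rax carry flags, so addr must be page aligned. */
	assert((addr & ~PAGE_MASK) == 0);
	assert(nr_pages >= 1);

	/* The ASID valid bit is set unconditionally, as in the patch. */
	r.rax = addr | flags | INVLPGB_FLAG_ASID;
	/* The instruction counts extra pages beyond the first. */
	r.ecx = ((uint32_t)stride << 31) | (uint32_t)(nr_pages - 1);
	/* PCID in the upper half, ASID in EDX[15:0]. */
	r.edx = ((uint32_t)pcid << 16) | asid;
	return r;
}
```

Passing `asid = 0` with any caller-supplied flags then always yields a value with bit 2 of `rax` set, so callers cannot accidentally request a flush that ignores the ASID.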