Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/xen: Avoid fast syscall path for Xen PV guests

After the 32-bit syscall rewrite, and specifically after commit:

5f310f739b4c ("x86/entry/32: Re-implement SYSENTER using the new C path")

... the stack frame that is passed to xen_sysexit is no longer a
"standard" one (i.e. it's not pt_regs).

Since we end up calling xen_iret from xen_sysexit anyway, we don't
need to fix up the stack; instead we can follow entry_SYSENTER_32's
IRET path directly to xen_iret.

We can do the same thing for compat mode even though the stack does
not need to be fixed there. This will allow us to drop the
usergs_sysret32 paravirt op (in the subsequent patch).
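
The branch that the ALTERNATIVE in the diff selects can be modelled in plain C. This is only a user-space sketch, not kernel code: `choose_exit_path`, `EXIT_IRET` and `EXIT_FAST` are hypothetical names introduced for illustration, `xen_pv` stands in for the X86_FEATURE_XENPV bit, and `fast_ok` stands in for the value do_fast_syscall_32 leaves in %eax. In the kernel the choice is patched into the instruction stream once at boot rather than evaluated on every syscall.

```c
#include <stdbool.h>

/* Hypothetical labels for the two return paths; not kernel identifiers. */
enum exit_path {
    EXIT_IRET,  /* corresponds to jumping to .Lsyscall_32_done */
    EXIT_FAST   /* corresponds to falling through to SYSEXIT/SYSRET */
};

/*
 * Model of the code that follows "call do_fast_syscall_32".
 *
 * xen_pv:  assumed stand-in for the X86_FEATURE_XENPV feature bit.
 * fast_ok: assumed stand-in for %eax; zero means "use IRET".
 */
static enum exit_path choose_exit_path(bool xen_pv, int fast_ok)
{
    if (xen_pv)
        return EXIT_IRET;   /* "jmp .Lsyscall_32_done" */

    if (fast_ok == 0)
        return EXIT_IRET;   /* "testl %eax, %eax; jz .Lsyscall_32_done" */

    return EXIT_FAST;       /* opportunistic fast return */
}
```

With this model, `choose_exit_path(true, 1)` yields `EXIT_IRET`, while `choose_exit_path(false, 1)` yields `EXIT_FAST`: a Xen PV guest never reaches the fast return, whatever do_fast_syscall_32 reports.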

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-2-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Authored by Boris Ostrovsky and committed by Ingo Molnar
5fdf5d37 1ec21837

+13 -7
+3 -2
arch/x86/entry/entry_32.S
···
 
 	movl	%esp, %eax
 	call	do_fast_syscall_32
-	testl	%eax, %eax
-	jz	.Lsyscall_32_done
+	/* XEN PV guests always use IRET path */
+	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
+		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
+6 -4
arch/x86/entry/entry_64_compat.S
···
 
 	movq	%rsp, %rdi
 	call	do_fast_syscall_32
-	testl	%eax, %eax
-	jz	.Lsyscall_32_done
+	/* XEN PV guests always use IRET path */
+	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
+		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 	jmp	sysret32_from_system_call
 
 sysenter_fix_flags:
···
 
 	movq	%rsp, %rdi
 	call	do_fast_syscall_32
-	testl	%eax, %eax
-	jz	.Lsyscall_32_done
+	/* XEN PV guests always use IRET path */
+	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
+		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+1
arch/x86/include/asm/cpufeature.h
···
 #define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause intercept */
 #define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter threshold */
 #define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_XENPV       ( 8*32+16) /* "" Xen paravirtual guest */
 
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
+3 -1
arch/x86/xen/enlighten.c
···
 
 static void xen_set_cpu_features(struct cpuinfo_x86 *c)
 {
-	if (xen_pv_domain())
+	if (xen_pv_domain()) {
 		clear_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
+		set_cpu_cap(c, X86_FEATURE_XENPV);
+	}
 }
 
 const struct hypervisor_x86 x86_hyper_xen = {
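
The new feature bit is defined as (8*32+16), that is, bit 16 of capability word 8. The following user-space sketch illustrates how such an encoded value could map onto an array of 32-bit words and how the bit would be set only for a PV domain. Everything named `model_*` or `MODEL_*` here is hypothetical and introduced for illustration; it is not the kernel's set_cpu_cap() or struct cpuinfo_x86, and the array size of 10 words is an assumption that merely covers words 0 through 9.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NCAPINTS      10           /* assumed: covers words 0..9 */
#define MODEL_FEATURE_XENPV (8*32 + 16)  /* same encoding as X86_FEATURE_XENPV */

/* Hypothetical stand-in for the per-CPU capability bitmap. */
struct model_cpu {
    uint32_t capability[MODEL_NCAPINTS];
    bool     pv_domain;                  /* stand-in for xen_pv_domain() */
};

/* Set one feature bit: word = feature / 32, bit = feature % 32. */
static void model_set_cap(struct model_cpu *c, unsigned int feature)
{
    c->capability[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Test one feature bit using the same word/bit split. */
static bool model_has_cap(const struct model_cpu *c, unsigned int feature)
{
    return (c->capability[feature / 32] >> (feature % 32)) & 1u;
}

/* Mirrors the shape of the patched function: flag PV domains only. */
static void model_set_cpu_features(struct model_cpu *c)
{
    if (c->pv_domain)
        model_set_cap(c, MODEL_FEATURE_XENPV);
}
```

Calling `model_set_cpu_features()` on a zero-initialized `struct model_cpu` with `pv_domain` set leaves only bit 16 of `capability[8]` set; with `pv_domain` clear, the bitmap stays all zeroes.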