Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull STIBP fallout fixes from Thomas Gleixner:
"The performance destruction department finally got it's act together
and came up with a cure for the STIPB regression:

- Provide a command line option to control the spectre v2 user space
mitigations. The default is seccomp, or prctl if seccomp is
disabled in Kconfig. prctl allows mitigation opt-in, seccomp
enables the mitigation for sandboxed processes.

- Rework the code to handle the conditional STIBP/IBPB control and
remove the now unused ptrace_may_access_sched() optimization
attempt

- Disable STIBP automatically when SMT is disabled

- Optimize the switch_to() logic to avoid MSR writes and invocations
of __switch_to_xtra().

- Make the asynchronous speculation TIF updates synchronous to
prevent stale mitigation state.

As a general cleanup, this also makes retpoline depend directly on
compiler support and removes the 'minimal retpoline' option, which just
pretended to provide some form of security while providing none"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
x86/speculation: Provide IBPB always command line options
x86/speculation: Add seccomp Spectre v2 user space protection mode
x86/speculation: Enable prctl mode for spectre_v2_user
x86/speculation: Add prctl() control for indirect branch speculation
x86/speculation: Prepare arch_smt_update() for PRCTL mode
x86/speculation: Prevent stale SPEC_CTRL msr content
x86/speculation: Split out TIF update
ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS
x86/speculation: Prepare for conditional IBPB in switch_mm()
x86/speculation: Avoid __switch_to_xtra() calls
x86/process: Consolidate and simplify switch_to_xtra() code
x86/speculation: Prepare for per task indirect branch speculation control
x86/speculation: Add command line control for indirect branch speculation
x86/speculation: Unify conditional spectre v2 print functions
x86/speculataion: Mark command line parser data __initdata
x86/speculation: Mark string arrays const correctly
x86/speculation: Reorder the spec_v2 code
x86/l1tf: Show actual SMT state
x86/speculation: Rework SMT state change
sched/smt: Expose sched_smt_present static key
...

Changed files
+786 -288
+54 -2
Documentation/admin-guide/kernel-parameters.txt
··· 4199 4199 4200 4200 spectre_v2= [X86] Control mitigation of Spectre variant 2 4201 4201 (indirect branch speculation) vulnerability. 4202 + The default operation protects the kernel from 4203 + user space attacks. 4202 4204 4203 - on - unconditionally enable 4204 - off - unconditionally disable 4205 + on - unconditionally enable, implies 4206 + spectre_v2_user=on 4207 + off - unconditionally disable, implies 4208 + spectre_v2_user=off 4205 4209 auto - kernel detects whether your CPU model is 4206 4210 vulnerable 4207 4211 ··· 4215 4211 CONFIG_RETPOLINE configuration option, and the 4216 4212 compiler with which the kernel was built. 4217 4213 4214 + Selecting 'on' will also enable the mitigation 4215 + against user space to user space task attacks. 4216 + 4217 + Selecting 'off' will disable both the kernel and 4218 + the user space protections. 4219 + 4218 4220 Specific mitigations can also be selected manually: 4219 4221 4220 4222 retpoline - replace indirect branches ··· 4229 4219 4230 4220 Not specifying this option is equivalent to 4231 4221 spectre_v2=auto. 4222 + 4223 + spectre_v2_user= 4224 + [X86] Control mitigation of Spectre variant 2 4225 + (indirect branch speculation) vulnerability between 4226 + user space tasks 4227 + 4228 + on - Unconditionally enable mitigations. Is 4229 + enforced by spectre_v2=on 4230 + 4231 + off - Unconditionally disable mitigations. Is 4232 + enforced by spectre_v2=off 4233 + 4234 + prctl - Indirect branch speculation is enabled, 4235 + but mitigation can be enabled via prctl 4236 + per thread. The mitigation control state 4237 + is inherited on fork. 4238 + 4239 + prctl,ibpb 4240 + - Like "prctl" above, but only STIBP is 4241 + controlled per thread. IBPB is issued 4242 + always when switching between different user 4243 + space processes. 4244 + 4245 + seccomp 4246 + - Same as "prctl" above, but all seccomp 4247 + threads will enable the mitigation unless 4248 + they explicitly opt out. 4249 + 4250 + seccomp,ibpb 4251 + - Like "seccomp" above, but only STIBP is 4252 + controlled per thread. IBPB is issued 4253 + always when switching between different 4254 + user space processes. 4255 + 4256 + auto - Kernel selects the mitigation depending on 4257 + the available CPU features and vulnerability. 4258 + 4259 + Default mitigation: 4260 + If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl" 4261 + 4262 + Not specifying this option is equivalent to 4263 + spectre_v2_user=auto. 4232 4264 4233 4265 spec_store_bypass_disable= 4234 4266 [HW] Control Speculative Store Bypass (SSB) Disable mitigation
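
For illustration only (an example boot selection, not part of the patch itself): a system that wants per-task opt-in control without the seccomp auto-enable described above could boot with

    spectre_v2_user=prctl

while "spectre_v2_user=seccomp,ibpb" additionally issues an IBPB on every switch between different user space processes.
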
+9
Documentation/userspace-api/spec_ctrl.rst
··· 92 92 * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0); 93 93 * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0); 94 94 * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0); 95 + 96 + - PR_SPEC_INDIR_BRANCH: Indirect Branch Speculation in User Processes 97 + (Mitigate Spectre V2 style attacks against user processes) 98 + 99 + Invocations: 100 + * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0); 101 + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0); 102 + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0); 103 + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
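
For reference, a minimal user-space sketch of the prctl() calls documented above (not part of the patch set; it assumes <linux/prctl.h> from a tree that carries this series, so that PR_SPEC_INDIRECT_BRANCH and the related PR_SPEC_* values are defined):

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>
#include <linux/prctl.h>

int main(void)
{
    int state;

    /* Query the indirect branch speculation state of this task. */
    state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
    if (state < 0) {
        /* Older kernel or non-x86: the control is simply not there. */
        fprintf(stderr, "PR_GET_SPECULATION_CTRL: %s\n", strerror(errno));
        return 1;
    }
    printf("indirect branch speculation state: 0x%x\n", state);

    /* In prctl/seccomp mode, opt this task into the mitigation. */
    if (state & PR_SPEC_PRCTL) {
        if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
                  PR_SPEC_DISABLE, 0, 0))
            fprintf(stderr, "PR_SET_SPECULATION_CTRL: %s\n", strerror(errno));
    }
    return 0;
}

Note that PR_SPEC_DISABLE disables indirect branch speculation, i.e. it opts the calling task into the STIBP-based mitigation; with spectre_v2_user=off the kernel rejects it with -EPERM, and in strict mode it is already enforced (see ib_prctl_set() in the bugs.c hunk below).
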
+1 -11
arch/x86/Kconfig
··· 444 444 branches. Requires a compiler with -mindirect-branch=thunk-extern 445 445 support for full protection. The kernel may run slower. 446 446 447 - Without compiler support, at least indirect branches in assembler 448 - code are eliminated. Since this includes the syscall entry path, 449 - it is not entirely pointless. 450 - 451 447 config INTEL_RDT 452 448 bool "Intel Resource Director Technology support" 453 449 depends on X86 && CPU_SUP_INTEL ··· 1000 1004 to the kernel image. 1001 1005 1002 1006 config SCHED_SMT 1003 - bool "SMT (Hyperthreading) scheduler support" 1004 - depends on SMP 1005 - ---help--- 1006 - SMT scheduler support improves the CPU scheduler's decision making 1007 - when dealing with Intel Pentium 4 chips with HyperThreading at a 1008 - cost of slightly increased overhead in some places. If unsure say 1009 - N here. 1007 + def_bool y if SMP 1010 1008 1011 1009 config SCHED_MC 1012 1010 def_bool y
+3 -2
arch/x86/Makefile
··· 220 220 221 221 # Avoid indirect branches in kernel to deal with Spectre 222 222 ifdef CONFIG_RETPOLINE 223 - ifneq ($(RETPOLINE_CFLAGS),) 224 - KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE 223 + ifeq ($(RETPOLINE_CFLAGS),) 224 + $(error You are building kernel with non-retpoline compiler, please update your compiler.) 225 225 endif 226 + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) 226 227 endif 227 228 228 229 archscripts: scripts_basic
+3 -2
arch/x86/include/asm/msr-index.h
··· 41 41 42 42 #define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ 43 43 #define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ 44 - #define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ 44 + #define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */ 45 + #define SPEC_CTRL_STIBP (1 << SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */ 45 46 #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */ 46 - #define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ 47 + #define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ 47 48 48 49 #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ 49 50 #define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+20 -6
arch/x86/include/asm/nospec-branch.h
··· 3 3 #ifndef _ASM_X86_NOSPEC_BRANCH_H_ 4 4 #define _ASM_X86_NOSPEC_BRANCH_H_ 5 5 6 + #include <linux/static_key.h> 7 + 6 8 #include <asm/alternative.h> 7 9 #include <asm/alternative-asm.h> 8 10 #include <asm/cpufeatures.h> ··· 164 162 _ASM_PTR " 999b\n\t" \ 165 163 ".popsection\n\t" 166 164 167 - #if defined(CONFIG_X86_64) && defined(RETPOLINE) 165 + #ifdef CONFIG_RETPOLINE 166 + #ifdef CONFIG_X86_64 168 167 169 168 /* 170 - * Since the inline asm uses the %V modifier which is only in newer GCC, 171 - * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE. 169 + * Inline asm uses the %V modifier which is only in newer GCC 170 + * which is ensured when CONFIG_RETPOLINE is defined. 172 171 */ 173 172 # define CALL_NOSPEC \ 174 173 ANNOTATE_NOSPEC_ALTERNATIVE \ ··· 184 181 X86_FEATURE_RETPOLINE_AMD) 185 182 # define THUNK_TARGET(addr) [thunk_target] "r" (addr) 186 183 187 - #elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE) 184 + #else /* CONFIG_X86_32 */ 188 185 /* 189 186 * For i386 we use the original ret-equivalent retpoline, because 190 187 * otherwise we'll run out of registers. We don't care about CET ··· 214 211 X86_FEATURE_RETPOLINE_AMD) 215 212 216 213 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr) 214 + #endif 217 215 #else /* No retpoline for C / inline asm */ 218 216 # define CALL_NOSPEC "call *%[thunk_target]\n" 219 217 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr) ··· 223 219 /* The Spectre V2 mitigation variants */ 224 220 enum spectre_v2_mitigation { 225 221 SPECTRE_V2_NONE, 226 - SPECTRE_V2_RETPOLINE_MINIMAL, 227 - SPECTRE_V2_RETPOLINE_MINIMAL_AMD, 228 222 SPECTRE_V2_RETPOLINE_GENERIC, 229 223 SPECTRE_V2_RETPOLINE_AMD, 230 224 SPECTRE_V2_IBRS_ENHANCED, 225 + }; 226 + 227 + /* The indirect branch speculation control variants */ 228 + enum spectre_v2_user_mitigation { 229 + SPECTRE_V2_USER_NONE, 230 + SPECTRE_V2_USER_STRICT, 231 + SPECTRE_V2_USER_PRCTL, 232 + SPECTRE_V2_USER_SECCOMP, 231 233 }; 232 234 233 235 /* The Speculative Store Bypass disable variants */ ··· 312 302 X86_FEATURE_USE_IBRS_FW); \ 313 303 preempt_enable(); \ 314 304 } while (0) 305 + 306 + DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp); 307 + DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); 308 + DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb); 315 309 316 310 #endif /* __ASSEMBLY__ */ 317 311
+14 -6
arch/x86/include/asm/spec-ctrl.h
··· 53 53 return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); 54 54 } 55 55 56 + static inline u64 stibp_tif_to_spec_ctrl(u64 tifn) 57 + { 58 + BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT); 59 + return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT); 60 + } 61 + 56 62 static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl) 57 63 { 58 64 BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT); 59 65 return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); 66 + } 67 + 68 + static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl) 69 + { 70 + BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT); 71 + return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT); 60 72 } 61 73 62 74 static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn) ··· 82 70 static inline void speculative_store_bypass_ht_init(void) { } 83 71 #endif 84 72 85 - extern void speculative_store_bypass_update(unsigned long tif); 86 - 87 - static inline void speculative_store_bypass_update_current(void) 88 - { 89 - speculative_store_bypass_update(current_thread_info()->flags); 90 - } 73 + extern void speculation_ctrl_update(unsigned long tif); 74 + extern void speculation_ctrl_update_current(void); 91 75 92 76 #endif
-3
arch/x86/include/asm/switch_to.h
··· 11 11 12 12 __visible struct task_struct *__switch_to(struct task_struct *prev, 13 13 struct task_struct *next); 14 - struct tss_struct; 15 - void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, 16 - struct tss_struct *tss); 17 14 18 15 /* This runs runs on the previous thread's stack. */ 19 16 static inline void prepare_switch_to(struct task_struct *next)
+17 -3
arch/x86/include/asm/thread_info.h
··· 79 79 #define TIF_SIGPENDING 2 /* signal pending */ 80 80 #define TIF_NEED_RESCHED 3 /* rescheduling necessary */ 81 81 #define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/ 82 - #define TIF_SSBD 5 /* Reduced data speculation */ 82 + #define TIF_SSBD 5 /* Speculative store bypass disable */ 83 83 #define TIF_SYSCALL_EMU 6 /* syscall emulation active */ 84 84 #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ 85 85 #define TIF_SECCOMP 8 /* secure computing */ 86 + #define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */ 87 + #define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */ 86 88 #define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */ 87 89 #define TIF_UPROBE 12 /* breakpointed or singlestepping */ 88 90 #define TIF_PATCH_PENDING 13 /* pending live patching update */ ··· 112 110 #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU) 113 111 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) 114 112 #define _TIF_SECCOMP (1 << TIF_SECCOMP) 113 + #define _TIF_SPEC_IB (1 << TIF_SPEC_IB) 114 + #define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE) 115 115 #define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY) 116 116 #define _TIF_UPROBE (1 << TIF_UPROBE) 117 117 #define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING) ··· 149 145 _TIF_FSCHECK) 150 146 151 147 /* flags to check in __switch_to() */ 152 - #define _TIF_WORK_CTXSW \ 153 - (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD) 148 + #define _TIF_WORK_CTXSW_BASE \ 149 + (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP| \ 150 + _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE) 151 + 152 + /* 153 + * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated. 154 + */ 155 + #ifdef CONFIG_SMP 156 + # define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE | _TIF_SPEC_IB) 157 + #else 158 + # define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE) 159 + #endif 154 160 155 161 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY) 156 162 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
+6 -2
arch/x86/include/asm/tlbflush.h
··· 169 169 170 170 #define LOADED_MM_SWITCHING ((struct mm_struct *)1) 171 171 172 + /* Last user mm for optimizing IBPB */ 173 + union { 174 + struct mm_struct *last_user_mm; 175 + unsigned long last_user_mm_ibpb; 176 + }; 177 + 172 178 u16 loaded_mm_asid; 173 179 u16 next_asid; 174 - /* last user mm's ctx id */ 175 - u64 last_ctx_id; 176 180 177 181 /* 178 182 * We can be in one of several states:
+394 -143
arch/x86/kernel/cpu/bugs.c
··· 14 14 #include <linux/module.h> 15 15 #include <linux/nospec.h> 16 16 #include <linux/prctl.h> 17 + #include <linux/sched/smt.h> 17 18 18 19 #include <asm/spec-ctrl.h> 19 20 #include <asm/cmdline.h> ··· 53 52 */ 54 53 u64 __ro_after_init x86_amd_ls_cfg_base; 55 54 u64 __ro_after_init x86_amd_ls_cfg_ssbd_mask; 55 + 56 + /* Control conditional STIPB in switch_to() */ 57 + DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp); 58 + /* Control conditional IBPB in switch_mm() */ 59 + DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); 60 + /* Control unconditional IBPB in switch_mm() */ 61 + DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb); 56 62 57 63 void __init check_bugs(void) 58 64 { ··· 131 123 #endif 132 124 } 133 125 134 - /* The kernel command line selection */ 135 - enum spectre_v2_mitigation_cmd { 136 - SPECTRE_V2_CMD_NONE, 137 - SPECTRE_V2_CMD_AUTO, 138 - SPECTRE_V2_CMD_FORCE, 139 - SPECTRE_V2_CMD_RETPOLINE, 140 - SPECTRE_V2_CMD_RETPOLINE_GENERIC, 141 - SPECTRE_V2_CMD_RETPOLINE_AMD, 142 - }; 143 - 144 - static const char *spectre_v2_strings[] = { 145 - [SPECTRE_V2_NONE] = "Vulnerable", 146 - [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline", 147 - [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline", 148 - [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", 149 - [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", 150 - [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS", 151 - }; 152 - 153 - #undef pr_fmt 154 - #define pr_fmt(fmt) "Spectre V2 : " fmt 155 - 156 - static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init = 157 - SPECTRE_V2_NONE; 158 - 159 126 void 160 127 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) 161 128 { ··· 151 168 if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) || 152 169 static_cpu_has(X86_FEATURE_AMD_SSBD)) 153 170 hostval |= ssbd_tif_to_spec_ctrl(ti->flags); 171 + 172 + /* Conditional STIBP enabled? */ 173 + if (static_branch_unlikely(&switch_to_cond_stibp)) 174 + hostval |= stibp_tif_to_spec_ctrl(ti->flags); 154 175 155 176 if (hostval != guestval) { 156 177 msrval = setguest ? guestval : hostval; ··· 189 202 tif = setguest ? 
ssbd_spec_ctrl_to_tif(guestval) : 190 203 ssbd_spec_ctrl_to_tif(hostval); 191 204 192 - speculative_store_bypass_update(tif); 205 + speculation_ctrl_update(tif); 193 206 } 194 207 } 195 208 EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl); ··· 203 216 else if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD)) 204 217 wrmsrl(MSR_AMD64_LS_CFG, msrval); 205 218 } 219 + 220 + #undef pr_fmt 221 + #define pr_fmt(fmt) "Spectre V2 : " fmt 222 + 223 + static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init = 224 + SPECTRE_V2_NONE; 225 + 226 + static enum spectre_v2_user_mitigation spectre_v2_user __ro_after_init = 227 + SPECTRE_V2_USER_NONE; 206 228 207 229 #ifdef RETPOLINE 208 230 static bool spectre_v2_bad_module; ··· 234 238 static inline const char *spectre_v2_module_string(void) { return ""; } 235 239 #endif 236 240 237 - static void __init spec2_print_if_insecure(const char *reason) 238 - { 239 - if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) 240 - pr_info("%s selected on command line.\n", reason); 241 - } 242 - 243 - static void __init spec2_print_if_secure(const char *reason) 244 - { 245 - if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) 246 - pr_info("%s selected on command line.\n", reason); 247 - } 248 - 249 - static inline bool retp_compiler(void) 250 - { 251 - return __is_defined(RETPOLINE); 252 - } 253 - 254 241 static inline bool match_option(const char *arg, int arglen, const char *opt) 255 242 { 256 243 int len = strlen(opt); ··· 241 262 return len == arglen && !strncmp(arg, opt, len); 242 263 } 243 264 265 + /* The kernel command line selection for spectre v2 */ 266 + enum spectre_v2_mitigation_cmd { 267 + SPECTRE_V2_CMD_NONE, 268 + SPECTRE_V2_CMD_AUTO, 269 + SPECTRE_V2_CMD_FORCE, 270 + SPECTRE_V2_CMD_RETPOLINE, 271 + SPECTRE_V2_CMD_RETPOLINE_GENERIC, 272 + SPECTRE_V2_CMD_RETPOLINE_AMD, 273 + }; 274 + 275 + enum spectre_v2_user_cmd { 276 + SPECTRE_V2_USER_CMD_NONE, 277 + SPECTRE_V2_USER_CMD_AUTO, 278 + SPECTRE_V2_USER_CMD_FORCE, 279 + SPECTRE_V2_USER_CMD_PRCTL, 280 + SPECTRE_V2_USER_CMD_PRCTL_IBPB, 281 + SPECTRE_V2_USER_CMD_SECCOMP, 282 + SPECTRE_V2_USER_CMD_SECCOMP_IBPB, 283 + }; 284 + 285 + static const char * const spectre_v2_user_strings[] = { 286 + [SPECTRE_V2_USER_NONE] = "User space: Vulnerable", 287 + [SPECTRE_V2_USER_STRICT] = "User space: Mitigation: STIBP protection", 288 + [SPECTRE_V2_USER_PRCTL] = "User space: Mitigation: STIBP via prctl", 289 + [SPECTRE_V2_USER_SECCOMP] = "User space: Mitigation: STIBP via seccomp and prctl", 290 + }; 291 + 292 + static const struct { 293 + const char *option; 294 + enum spectre_v2_user_cmd cmd; 295 + bool secure; 296 + } v2_user_options[] __initdata = { 297 + { "auto", SPECTRE_V2_USER_CMD_AUTO, false }, 298 + { "off", SPECTRE_V2_USER_CMD_NONE, false }, 299 + { "on", SPECTRE_V2_USER_CMD_FORCE, true }, 300 + { "prctl", SPECTRE_V2_USER_CMD_PRCTL, false }, 301 + { "prctl,ibpb", SPECTRE_V2_USER_CMD_PRCTL_IBPB, false }, 302 + { "seccomp", SPECTRE_V2_USER_CMD_SECCOMP, false }, 303 + { "seccomp,ibpb", SPECTRE_V2_USER_CMD_SECCOMP_IBPB, false }, 304 + }; 305 + 306 + static void __init spec_v2_user_print_cond(const char *reason, bool secure) 307 + { 308 + if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure) 309 + pr_info("spectre_v2_user=%s forced on command line.\n", reason); 310 + } 311 + 312 + static enum spectre_v2_user_cmd __init 313 + spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd) 314 + { 315 + char arg[20]; 316 + int ret, i; 317 + 318 + switch (v2_cmd) { 319 + case SPECTRE_V2_CMD_NONE: 320 + return SPECTRE_V2_USER_CMD_NONE; 321 + case 
SPECTRE_V2_CMD_FORCE: 322 + return SPECTRE_V2_USER_CMD_FORCE; 323 + default: 324 + break; 325 + } 326 + 327 + ret = cmdline_find_option(boot_command_line, "spectre_v2_user", 328 + arg, sizeof(arg)); 329 + if (ret < 0) 330 + return SPECTRE_V2_USER_CMD_AUTO; 331 + 332 + for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) { 333 + if (match_option(arg, ret, v2_user_options[i].option)) { 334 + spec_v2_user_print_cond(v2_user_options[i].option, 335 + v2_user_options[i].secure); 336 + return v2_user_options[i].cmd; 337 + } 338 + } 339 + 340 + pr_err("Unknown user space protection option (%s). Switching to AUTO select\n", arg); 341 + return SPECTRE_V2_USER_CMD_AUTO; 342 + } 343 + 344 + static void __init 345 + spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd) 346 + { 347 + enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE; 348 + bool smt_possible = IS_ENABLED(CONFIG_SMP); 349 + enum spectre_v2_user_cmd cmd; 350 + 351 + if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP)) 352 + return; 353 + 354 + if (cpu_smt_control == CPU_SMT_FORCE_DISABLED || 355 + cpu_smt_control == CPU_SMT_NOT_SUPPORTED) 356 + smt_possible = false; 357 + 358 + cmd = spectre_v2_parse_user_cmdline(v2_cmd); 359 + switch (cmd) { 360 + case SPECTRE_V2_USER_CMD_NONE: 361 + goto set_mode; 362 + case SPECTRE_V2_USER_CMD_FORCE: 363 + mode = SPECTRE_V2_USER_STRICT; 364 + break; 365 + case SPECTRE_V2_USER_CMD_PRCTL: 366 + case SPECTRE_V2_USER_CMD_PRCTL_IBPB: 367 + mode = SPECTRE_V2_USER_PRCTL; 368 + break; 369 + case SPECTRE_V2_USER_CMD_AUTO: 370 + case SPECTRE_V2_USER_CMD_SECCOMP: 371 + case SPECTRE_V2_USER_CMD_SECCOMP_IBPB: 372 + if (IS_ENABLED(CONFIG_SECCOMP)) 373 + mode = SPECTRE_V2_USER_SECCOMP; 374 + else 375 + mode = SPECTRE_V2_USER_PRCTL; 376 + break; 377 + } 378 + 379 + /* Initialize Indirect Branch Prediction Barrier */ 380 + if (boot_cpu_has(X86_FEATURE_IBPB)) { 381 + setup_force_cpu_cap(X86_FEATURE_USE_IBPB); 382 + 383 + switch (cmd) { 384 + case SPECTRE_V2_USER_CMD_FORCE: 385 + case SPECTRE_V2_USER_CMD_PRCTL_IBPB: 386 + case SPECTRE_V2_USER_CMD_SECCOMP_IBPB: 387 + static_branch_enable(&switch_mm_always_ibpb); 388 + break; 389 + case SPECTRE_V2_USER_CMD_PRCTL: 390 + case SPECTRE_V2_USER_CMD_AUTO: 391 + case SPECTRE_V2_USER_CMD_SECCOMP: 392 + static_branch_enable(&switch_mm_cond_ibpb); 393 + break; 394 + default: 395 + break; 396 + } 397 + 398 + pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n", 399 + static_key_enabled(&switch_mm_always_ibpb) ? 400 + "always-on" : "conditional"); 401 + } 402 + 403 + /* If enhanced IBRS is enabled no STIPB required */ 404 + if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) 405 + return; 406 + 407 + /* 408 + * If SMT is not possible or STIBP is not available clear the STIPB 409 + * mode. 
410 + */ 411 + if (!smt_possible || !boot_cpu_has(X86_FEATURE_STIBP)) 412 + mode = SPECTRE_V2_USER_NONE; 413 + set_mode: 414 + spectre_v2_user = mode; 415 + /* Only print the STIBP mode when SMT possible */ 416 + if (smt_possible) 417 + pr_info("%s\n", spectre_v2_user_strings[mode]); 418 + } 419 + 420 + static const char * const spectre_v2_strings[] = { 421 + [SPECTRE_V2_NONE] = "Vulnerable", 422 + [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", 423 + [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", 424 + [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS", 425 + }; 426 + 244 427 static const struct { 245 428 const char *option; 246 429 enum spectre_v2_mitigation_cmd cmd; 247 430 bool secure; 248 - } mitigation_options[] = { 249 - { "off", SPECTRE_V2_CMD_NONE, false }, 250 - { "on", SPECTRE_V2_CMD_FORCE, true }, 251 - { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, 252 - { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, 253 - { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, 254 - { "auto", SPECTRE_V2_CMD_AUTO, false }, 431 + } mitigation_options[] __initdata = { 432 + { "off", SPECTRE_V2_CMD_NONE, false }, 433 + { "on", SPECTRE_V2_CMD_FORCE, true }, 434 + { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, 435 + { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, 436 + { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, 437 + { "auto", SPECTRE_V2_CMD_AUTO, false }, 255 438 }; 439 + 440 + static void __init spec_v2_print_cond(const char *reason, bool secure) 441 + { 442 + if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure) 443 + pr_info("%s selected on command line.\n", reason); 444 + } 256 445 257 446 static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) 258 447 { 448 + enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO; 259 449 char arg[20]; 260 450 int ret, i; 261 - enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO; 262 451 263 452 if (cmdline_find_option_bool(boot_command_line, "nospectre_v2")) 264 453 return SPECTRE_V2_CMD_NONE; 265 - else { 266 - ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg)); 267 - if (ret < 0) 268 - return SPECTRE_V2_CMD_AUTO; 269 454 270 - for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) { 271 - if (!match_option(arg, ret, mitigation_options[i].option)) 272 - continue; 273 - cmd = mitigation_options[i].cmd; 274 - break; 275 - } 455 + ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg)); 456 + if (ret < 0) 457 + return SPECTRE_V2_CMD_AUTO; 276 458 277 - if (i >= ARRAY_SIZE(mitigation_options)) { 278 - pr_err("unknown option (%s). Switching to AUTO select\n", arg); 279 - return SPECTRE_V2_CMD_AUTO; 280 - } 459 + for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) { 460 + if (!match_option(arg, ret, mitigation_options[i].option)) 461 + continue; 462 + cmd = mitigation_options[i].cmd; 463 + break; 464 + } 465 + 466 + if (i >= ARRAY_SIZE(mitigation_options)) { 467 + pr_err("unknown option (%s). 
Switching to AUTO select\n", arg); 468 + return SPECTRE_V2_CMD_AUTO; 281 469 } 282 470 283 471 if ((cmd == SPECTRE_V2_CMD_RETPOLINE || ··· 462 316 return SPECTRE_V2_CMD_AUTO; 463 317 } 464 318 465 - if (mitigation_options[i].secure) 466 - spec2_print_if_secure(mitigation_options[i].option); 467 - else 468 - spec2_print_if_insecure(mitigation_options[i].option); 469 - 319 + spec_v2_print_cond(mitigation_options[i].option, 320 + mitigation_options[i].secure); 470 321 return cmd; 471 - } 472 - 473 - static bool stibp_needed(void) 474 - { 475 - if (spectre_v2_enabled == SPECTRE_V2_NONE) 476 - return false; 477 - 478 - if (!boot_cpu_has(X86_FEATURE_STIBP)) 479 - return false; 480 - 481 - return true; 482 - } 483 - 484 - static void update_stibp_msr(void *info) 485 - { 486 - wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); 487 - } 488 - 489 - void arch_smt_update(void) 490 - { 491 - u64 mask; 492 - 493 - if (!stibp_needed()) 494 - return; 495 - 496 - mutex_lock(&spec_ctrl_mutex); 497 - mask = x86_spec_ctrl_base; 498 - if (cpu_smt_control == CPU_SMT_ENABLED) 499 - mask |= SPEC_CTRL_STIBP; 500 - else 501 - mask &= ~SPEC_CTRL_STIBP; 502 - 503 - if (mask != x86_spec_ctrl_base) { 504 - pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n", 505 - cpu_smt_control == CPU_SMT_ENABLED ? 506 - "Enabling" : "Disabling"); 507 - x86_spec_ctrl_base = mask; 508 - on_each_cpu(update_stibp_msr, NULL, 1); 509 - } 510 - mutex_unlock(&spec_ctrl_mutex); 511 322 } 512 323 513 324 static void __init spectre_v2_select_mitigation(void) ··· 520 417 pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n"); 521 418 goto retpoline_generic; 522 419 } 523 - mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD : 524 - SPECTRE_V2_RETPOLINE_MINIMAL_AMD; 420 + mode = SPECTRE_V2_RETPOLINE_AMD; 525 421 setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD); 526 422 setup_force_cpu_cap(X86_FEATURE_RETPOLINE); 527 423 } else { 528 424 retpoline_generic: 529 - mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC : 530 - SPECTRE_V2_RETPOLINE_MINIMAL; 425 + mode = SPECTRE_V2_RETPOLINE_GENERIC; 531 426 setup_force_cpu_cap(X86_FEATURE_RETPOLINE); 532 427 } 533 428 ··· 544 443 setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); 545 444 pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n"); 546 445 547 - /* Initialize Indirect Branch Prediction Barrier if supported */ 548 - if (boot_cpu_has(X86_FEATURE_IBPB)) { 549 - setup_force_cpu_cap(X86_FEATURE_USE_IBPB); 550 - pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n"); 551 - } 552 - 553 446 /* 554 447 * Retpoline means the kernel is safe because it has no indirect 555 448 * branches. Enhanced IBRS protects firmware too, so, enable restricted ··· 560 465 pr_info("Enabling Restricted Speculation for firmware calls\n"); 561 466 } 562 467 468 + /* Set up IBPB and STIBP depending on the general spectre V2 command */ 469 + spectre_v2_user_select_mitigation(cmd); 470 + 563 471 /* Enable STIBP if appropriate */ 564 472 arch_smt_update(); 473 + } 474 + 475 + static void update_stibp_msr(void * __unused) 476 + { 477 + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); 478 + } 479 + 480 + /* Update x86_spec_ctrl_base in case SMT state changed. 
*/ 481 + static void update_stibp_strict(void) 482 + { 483 + u64 mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP; 484 + 485 + if (sched_smt_active()) 486 + mask |= SPEC_CTRL_STIBP; 487 + 488 + if (mask == x86_spec_ctrl_base) 489 + return; 490 + 491 + pr_info("Update user space SMT mitigation: STIBP %s\n", 492 + mask & SPEC_CTRL_STIBP ? "always-on" : "off"); 493 + x86_spec_ctrl_base = mask; 494 + on_each_cpu(update_stibp_msr, NULL, 1); 495 + } 496 + 497 + /* Update the static key controlling the evaluation of TIF_SPEC_IB */ 498 + static void update_indir_branch_cond(void) 499 + { 500 + if (sched_smt_active()) 501 + static_branch_enable(&switch_to_cond_stibp); 502 + else 503 + static_branch_disable(&switch_to_cond_stibp); 504 + } 505 + 506 + void arch_smt_update(void) 507 + { 508 + /* Enhanced IBRS implies STIBP. No update required. */ 509 + if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) 510 + return; 511 + 512 + mutex_lock(&spec_ctrl_mutex); 513 + 514 + switch (spectre_v2_user) { 515 + case SPECTRE_V2_USER_NONE: 516 + break; 517 + case SPECTRE_V2_USER_STRICT: 518 + update_stibp_strict(); 519 + break; 520 + case SPECTRE_V2_USER_PRCTL: 521 + case SPECTRE_V2_USER_SECCOMP: 522 + update_indir_branch_cond(); 523 + break; 524 + } 525 + 526 + mutex_unlock(&spec_ctrl_mutex); 565 527 } 566 528 567 529 #undef pr_fmt ··· 635 483 SPEC_STORE_BYPASS_CMD_SECCOMP, 636 484 }; 637 485 638 - static const char *ssb_strings[] = { 486 + static const char * const ssb_strings[] = { 639 487 [SPEC_STORE_BYPASS_NONE] = "Vulnerable", 640 488 [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled", 641 489 [SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl", ··· 645 493 static const struct { 646 494 const char *option; 647 495 enum ssb_mitigation_cmd cmd; 648 - } ssb_mitigation_options[] = { 496 + } ssb_mitigation_options[] __initdata = { 649 497 { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ 650 498 { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ 651 499 { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ ··· 756 604 #undef pr_fmt 757 605 #define pr_fmt(fmt) "Speculation prctl: " fmt 758 606 607 + static void task_update_spec_tif(struct task_struct *tsk) 608 + { 609 + /* Force the update of the real TIF bits */ 610 + set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE); 611 + 612 + /* 613 + * Immediately update the speculation control MSRs for the current 614 + * task, but for a non-current task delay setting the CPU 615 + * mitigation until it is scheduled next. 616 + * 617 + * This can only happen for SECCOMP mitigation. For PRCTL it's 618 + * always the current task. 
619 + */ 620 + if (tsk == current) 621 + speculation_ctrl_update_current(); 622 + } 623 + 759 624 static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) 760 625 { 761 - bool update; 762 - 763 626 if (ssb_mode != SPEC_STORE_BYPASS_PRCTL && 764 627 ssb_mode != SPEC_STORE_BYPASS_SECCOMP) 765 628 return -ENXIO; ··· 785 618 if (task_spec_ssb_force_disable(task)) 786 619 return -EPERM; 787 620 task_clear_spec_ssb_disable(task); 788 - update = test_and_clear_tsk_thread_flag(task, TIF_SSBD); 621 + task_update_spec_tif(task); 789 622 break; 790 623 case PR_SPEC_DISABLE: 791 624 task_set_spec_ssb_disable(task); 792 - update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); 625 + task_update_spec_tif(task); 793 626 break; 794 627 case PR_SPEC_FORCE_DISABLE: 795 628 task_set_spec_ssb_disable(task); 796 629 task_set_spec_ssb_force_disable(task); 797 - update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); 630 + task_update_spec_tif(task); 798 631 break; 799 632 default: 800 633 return -ERANGE; 801 634 } 635 + return 0; 636 + } 802 637 803 - /* 804 - * If being set on non-current task, delay setting the CPU 805 - * mitigation until it is next scheduled. 806 - */ 807 - if (task == current && update) 808 - speculative_store_bypass_update_current(); 809 - 638 + static int ib_prctl_set(struct task_struct *task, unsigned long ctrl) 639 + { 640 + switch (ctrl) { 641 + case PR_SPEC_ENABLE: 642 + if (spectre_v2_user == SPECTRE_V2_USER_NONE) 643 + return 0; 644 + /* 645 + * Indirect branch speculation is always disabled in strict 646 + * mode. 647 + */ 648 + if (spectre_v2_user == SPECTRE_V2_USER_STRICT) 649 + return -EPERM; 650 + task_clear_spec_ib_disable(task); 651 + task_update_spec_tif(task); 652 + break; 653 + case PR_SPEC_DISABLE: 654 + case PR_SPEC_FORCE_DISABLE: 655 + /* 656 + * Indirect branch speculation is always allowed when 657 + * mitigation is force disabled. 
658 + */ 659 + if (spectre_v2_user == SPECTRE_V2_USER_NONE) 660 + return -EPERM; 661 + if (spectre_v2_user == SPECTRE_V2_USER_STRICT) 662 + return 0; 663 + task_set_spec_ib_disable(task); 664 + if (ctrl == PR_SPEC_FORCE_DISABLE) 665 + task_set_spec_ib_force_disable(task); 666 + task_update_spec_tif(task); 667 + break; 668 + default: 669 + return -ERANGE; 670 + } 810 671 return 0; 811 672 } 812 673 ··· 844 649 switch (which) { 845 650 case PR_SPEC_STORE_BYPASS: 846 651 return ssb_prctl_set(task, ctrl); 652 + case PR_SPEC_INDIRECT_BRANCH: 653 + return ib_prctl_set(task, ctrl); 847 654 default: 848 655 return -ENODEV; 849 656 } ··· 856 659 { 857 660 if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP) 858 661 ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE); 662 + if (spectre_v2_user == SPECTRE_V2_USER_SECCOMP) 663 + ib_prctl_set(task, PR_SPEC_FORCE_DISABLE); 859 664 } 860 665 #endif 861 666 ··· 880 681 } 881 682 } 882 683 684 + static int ib_prctl_get(struct task_struct *task) 685 + { 686 + if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) 687 + return PR_SPEC_NOT_AFFECTED; 688 + 689 + switch (spectre_v2_user) { 690 + case SPECTRE_V2_USER_NONE: 691 + return PR_SPEC_ENABLE; 692 + case SPECTRE_V2_USER_PRCTL: 693 + case SPECTRE_V2_USER_SECCOMP: 694 + if (task_spec_ib_force_disable(task)) 695 + return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE; 696 + if (task_spec_ib_disable(task)) 697 + return PR_SPEC_PRCTL | PR_SPEC_DISABLE; 698 + return PR_SPEC_PRCTL | PR_SPEC_ENABLE; 699 + case SPECTRE_V2_USER_STRICT: 700 + return PR_SPEC_DISABLE; 701 + default: 702 + return PR_SPEC_NOT_AFFECTED; 703 + } 704 + } 705 + 883 706 int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) 884 707 { 885 708 switch (which) { 886 709 case PR_SPEC_STORE_BYPASS: 887 710 return ssb_prctl_get(task); 711 + case PR_SPEC_INDIRECT_BRANCH: 712 + return ib_prctl_get(task); 888 713 default: 889 714 return -ENODEV; 890 715 } ··· 1046 823 #define L1TF_DEFAULT_MSG "Mitigation: PTE Inversion" 1047 824 1048 825 #if IS_ENABLED(CONFIG_KVM_INTEL) 1049 - static const char *l1tf_vmx_states[] = { 826 + static const char * const l1tf_vmx_states[] = { 1050 827 [VMENTER_L1D_FLUSH_AUTO] = "auto", 1051 828 [VMENTER_L1D_FLUSH_NEVER] = "vulnerable", 1052 829 [VMENTER_L1D_FLUSH_COND] = "conditional cache flushes", ··· 1062 839 1063 840 if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED || 1064 841 (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER && 1065 - cpu_smt_control == CPU_SMT_ENABLED)) 842 + sched_smt_active())) { 1066 843 return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG, 1067 844 l1tf_vmx_states[l1tf_vmx_mitigation]); 845 + } 1068 846 1069 847 return sprintf(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG, 1070 848 l1tf_vmx_states[l1tf_vmx_mitigation], 1071 - cpu_smt_control == CPU_SMT_ENABLED ? "vulnerable" : "disabled"); 849 + sched_smt_active() ? 
"vulnerable" : "disabled"); 1072 850 } 1073 851 #else 1074 852 static ssize_t l1tf_show_state(char *buf) ··· 1078 854 } 1079 855 #endif 1080 856 857 + static char *stibp_state(void) 858 + { 859 + if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) 860 + return ""; 861 + 862 + switch (spectre_v2_user) { 863 + case SPECTRE_V2_USER_NONE: 864 + return ", STIBP: disabled"; 865 + case SPECTRE_V2_USER_STRICT: 866 + return ", STIBP: forced"; 867 + case SPECTRE_V2_USER_PRCTL: 868 + case SPECTRE_V2_USER_SECCOMP: 869 + if (static_key_enabled(&switch_to_cond_stibp)) 870 + return ", STIBP: conditional"; 871 + } 872 + return ""; 873 + } 874 + 875 + static char *ibpb_state(void) 876 + { 877 + if (boot_cpu_has(X86_FEATURE_IBPB)) { 878 + if (static_key_enabled(&switch_mm_always_ibpb)) 879 + return ", IBPB: always-on"; 880 + if (static_key_enabled(&switch_mm_cond_ibpb)) 881 + return ", IBPB: conditional"; 882 + return ", IBPB: disabled"; 883 + } 884 + return ""; 885 + } 886 + 1081 887 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, 1082 888 char *buf, unsigned int bug) 1083 889 { 1084 - int ret; 1085 - 1086 890 if (!boot_cpu_has_bug(bug)) 1087 891 return sprintf(buf, "Not affected\n"); 1088 892 ··· 1128 876 return sprintf(buf, "Mitigation: __user pointer sanitization\n"); 1129 877 1130 878 case X86_BUG_SPECTRE_V2: 1131 - ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], 1132 - boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", 879 + return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], 880 + ibpb_state(), 1133 881 boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", 1134 - (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "", 882 + stibp_state(), 1135 883 boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", 1136 884 spectre_v2_module_string()); 1137 - return ret; 1138 885 1139 886 case X86_BUG_SPEC_STORE_BYPASS: 1140 887 return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
+82 -19
arch/x86/kernel/process.c
··· 40 40 #include <asm/prctl.h> 41 41 #include <asm/spec-ctrl.h> 42 42 43 + #include "process.h" 44 + 43 45 /* 44 46 * per-CPU TSS segments. Threads are completely 'soft' on Linux, 45 47 * no more per-task TSS's. The TSS size is kept cacheline-aligned ··· 254 252 enable_cpuid(); 255 253 } 256 254 257 - static inline void switch_to_bitmap(struct tss_struct *tss, 258 - struct thread_struct *prev, 255 + static inline void switch_to_bitmap(struct thread_struct *prev, 259 256 struct thread_struct *next, 260 257 unsigned long tifp, unsigned long tifn) 261 258 { 259 + struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); 260 + 262 261 if (tifn & _TIF_IO_BITMAP) { 263 262 /* 264 263 * Copy the relevant range of the IO bitmap. ··· 398 395 wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn)); 399 396 } 400 397 401 - static __always_inline void intel_set_ssb_state(unsigned long tifn) 398 + /* 399 + * Update the MSRs managing speculation control, during context switch. 400 + * 401 + * tifp: Previous task's thread flags 402 + * tifn: Next task's thread flags 403 + */ 404 + static __always_inline void __speculation_ctrl_update(unsigned long tifp, 405 + unsigned long tifn) 402 406 { 403 - u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); 407 + unsigned long tif_diff = tifp ^ tifn; 408 + u64 msr = x86_spec_ctrl_base; 409 + bool updmsr = false; 404 410 405 - wrmsrl(MSR_IA32_SPEC_CTRL, msr); 411 + /* 412 + * If TIF_SSBD is different, select the proper mitigation 413 + * method. Note that if SSBD mitigation is disabled or permanentely 414 + * enabled this branch can't be taken because nothing can set 415 + * TIF_SSBD. 416 + */ 417 + if (tif_diff & _TIF_SSBD) { 418 + if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) { 419 + amd_set_ssb_virt_state(tifn); 420 + } else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) { 421 + amd_set_core_ssb_state(tifn); 422 + } else if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) || 423 + static_cpu_has(X86_FEATURE_AMD_SSBD)) { 424 + msr |= ssbd_tif_to_spec_ctrl(tifn); 425 + updmsr = true; 426 + } 427 + } 428 + 429 + /* 430 + * Only evaluate TIF_SPEC_IB if conditional STIBP is enabled, 431 + * otherwise avoid the MSR write. 432 + */ 433 + if (IS_ENABLED(CONFIG_SMP) && 434 + static_branch_unlikely(&switch_to_cond_stibp)) { 435 + updmsr |= !!(tif_diff & _TIF_SPEC_IB); 436 + msr |= stibp_tif_to_spec_ctrl(tifn); 437 + } 438 + 439 + if (updmsr) 440 + wrmsrl(MSR_IA32_SPEC_CTRL, msr); 406 441 } 407 442 408 - static __always_inline void __speculative_store_bypass_update(unsigned long tifn) 443 + static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk) 409 444 { 410 - if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) 411 - amd_set_ssb_virt_state(tifn); 412 - else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) 413 - amd_set_core_ssb_state(tifn); 414 - else 415 - intel_set_ssb_state(tifn); 445 + if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) { 446 + if (task_spec_ssb_disable(tsk)) 447 + set_tsk_thread_flag(tsk, TIF_SSBD); 448 + else 449 + clear_tsk_thread_flag(tsk, TIF_SSBD); 450 + 451 + if (task_spec_ib_disable(tsk)) 452 + set_tsk_thread_flag(tsk, TIF_SPEC_IB); 453 + else 454 + clear_tsk_thread_flag(tsk, TIF_SPEC_IB); 455 + } 456 + /* Return the updated threadinfo flags*/ 457 + return task_thread_info(tsk)->flags; 416 458 } 417 459 418 - void speculative_store_bypass_update(unsigned long tif) 460 + void speculation_ctrl_update(unsigned long tif) 419 461 { 462 + /* Forced update. 
Make sure all relevant TIF flags are different */ 420 463 preempt_disable(); 421 - __speculative_store_bypass_update(tif); 464 + __speculation_ctrl_update(~tif, tif); 422 465 preempt_enable(); 423 466 } 424 467 425 - void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, 426 - struct tss_struct *tss) 468 + /* Called from seccomp/prctl update */ 469 + void speculation_ctrl_update_current(void) 470 + { 471 + preempt_disable(); 472 + speculation_ctrl_update(speculation_ctrl_update_tif(current)); 473 + preempt_enable(); 474 + } 475 + 476 + void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p) 427 477 { 428 478 struct thread_struct *prev, *next; 429 479 unsigned long tifp, tifn; ··· 486 430 487 431 tifn = READ_ONCE(task_thread_info(next_p)->flags); 488 432 tifp = READ_ONCE(task_thread_info(prev_p)->flags); 489 - switch_to_bitmap(tss, prev, next, tifp, tifn); 433 + switch_to_bitmap(prev, next, tifp, tifn); 490 434 491 435 propagate_user_return_notify(prev_p, next_p); 492 436 ··· 507 451 if ((tifp ^ tifn) & _TIF_NOCPUID) 508 452 set_cpuid_faulting(!!(tifn & _TIF_NOCPUID)); 509 453 510 - if ((tifp ^ tifn) & _TIF_SSBD) 511 - __speculative_store_bypass_update(tifn); 454 + if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) { 455 + __speculation_ctrl_update(tifp, tifn); 456 + } else { 457 + speculation_ctrl_update_tif(prev_p); 458 + tifn = speculation_ctrl_update_tif(next_p); 459 + 460 + /* Enforce MSR update to ensure consistent state */ 461 + __speculation_ctrl_update(~tifn, tifn); 462 + } 512 463 } 513 464 514 465 /*
+39
arch/x86/kernel/process.h
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + // 3 + // Code shared between 32 and 64 bit 4 + 5 + #include <asm/spec-ctrl.h> 6 + 7 + void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p); 8 + 9 + /* 10 + * This needs to be inline to optimize for the common case where no extra 11 + * work needs to be done. 12 + */ 13 + static inline void switch_to_extra(struct task_struct *prev, 14 + struct task_struct *next) 15 + { 16 + unsigned long next_tif = task_thread_info(next)->flags; 17 + unsigned long prev_tif = task_thread_info(prev)->flags; 18 + 19 + if (IS_ENABLED(CONFIG_SMP)) { 20 + /* 21 + * Avoid __switch_to_xtra() invocation when conditional 22 + * STIPB is disabled and the only different bit is 23 + * TIF_SPEC_IB. For CONFIG_SMP=n TIF_SPEC_IB is not 24 + * in the TIF_WORK_CTXSW masks. 25 + */ 26 + if (!static_branch_likely(&switch_to_cond_stibp)) { 27 + prev_tif &= ~_TIF_SPEC_IB; 28 + next_tif &= ~_TIF_SPEC_IB; 29 + } 30 + } 31 + 32 + /* 33 + * __switch_to_xtra() handles debug registers, i/o bitmaps, 34 + * speculation mitigations etc. 35 + */ 36 + if (unlikely(next_tif & _TIF_WORK_CTXSW_NEXT || 37 + prev_tif & _TIF_WORK_CTXSW_PREV)) 38 + __switch_to_xtra(prev, next); 39 + }
+3 -7
arch/x86/kernel/process_32.c
··· 59 59 #include <asm/intel_rdt_sched.h> 60 60 #include <asm/proto.h> 61 61 62 + #include "process.h" 63 + 62 64 void __show_regs(struct pt_regs *regs, enum show_regs_mode mode) 63 65 { 64 66 unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L; ··· 234 232 struct fpu *prev_fpu = &prev->fpu; 235 233 struct fpu *next_fpu = &next->fpu; 236 234 int cpu = smp_processor_id(); 237 - struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu); 238 235 239 236 /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */ 240 237 ··· 265 264 if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) 266 265 set_iopl_mask(next->iopl); 267 266 268 - /* 269 - * Now maybe handle debug registers and/or IO bitmaps 270 - */ 271 - if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV || 272 - task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT)) 273 - __switch_to_xtra(prev_p, next_p, tss); 267 + switch_to_extra(prev_p, next_p); 274 268 275 269 /* 276 270 * Leave lazy mode, flushing any hypercalls made here.
+3 -7
arch/x86/kernel/process_64.c
··· 60 60 #include <asm/unistd_32_ia32.h> 61 61 #endif 62 62 63 + #include "process.h" 64 + 63 65 /* Prints also some state that isn't saved in the pt_regs */ 64 66 void __show_regs(struct pt_regs *regs, enum show_regs_mode mode) 65 67 { ··· 555 553 struct fpu *prev_fpu = &prev->fpu; 556 554 struct fpu *next_fpu = &next->fpu; 557 555 int cpu = smp_processor_id(); 558 - struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu); 559 556 560 557 WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) && 561 558 this_cpu_read(irq_count) != -1); ··· 618 617 /* Reload sp0. */ 619 618 update_task_stack(next_p); 620 619 621 - /* 622 - * Now maybe reload the debug registers and handle I/O bitmaps 623 - */ 624 - if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT || 625 - task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV)) 626 - __switch_to_xtra(prev_p, next_p, tss); 620 + switch_to_extra(prev_p, next_p); 627 621 628 622 #ifdef CONFIG_XEN_PV 629 623 /*
+86 -29
arch/x86/mm/tlb.c
··· 7 7 #include <linux/export.h> 8 8 #include <linux/cpu.h> 9 9 #include <linux/debugfs.h> 10 - #include <linux/ptrace.h> 11 10 12 11 #include <asm/tlbflush.h> 13 12 #include <asm/mmu_context.h> ··· 28 29 * 29 30 * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi 30 31 */ 32 + 33 + /* 34 + * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is 35 + * stored in cpu_tlb_state.last_user_mm_ibpb. 36 + */ 37 + #define LAST_USER_MM_IBPB 0x1UL 31 38 32 39 /* 33 40 * We get here when we do something requiring a TLB invalidation ··· 186 181 } 187 182 } 188 183 189 - static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id) 184 + static inline unsigned long mm_mangle_tif_spec_ib(struct task_struct *next) 190 185 { 186 + unsigned long next_tif = task_thread_info(next)->flags; 187 + unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB; 188 + 189 + return (unsigned long)next->mm | ibpb; 190 + } 191 + 192 + static void cond_ibpb(struct task_struct *next) 193 + { 194 + if (!next || !next->mm) 195 + return; 196 + 191 197 /* 192 - * Check if the current (previous) task has access to the memory 193 - * of the @tsk (next) task. If access is denied, make sure to 194 - * issue a IBPB to stop user->user Spectre-v2 attacks. 195 - * 196 - * Note: __ptrace_may_access() returns 0 or -ERRNO. 198 + * Both, the conditional and the always IBPB mode use the mm 199 + * pointer to avoid the IBPB when switching between tasks of the 200 + * same process. Using the mm pointer instead of mm->context.ctx_id 201 + * opens a hypothetical hole vs. mm_struct reuse, which is more or 202 + * less impossible to control by an attacker. Aside of that it 203 + * would only affect the first schedule so the theoretically 204 + * exposed data is not really interesting. 197 205 */ 198 - return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id && 199 - ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB)); 206 + if (static_branch_likely(&switch_mm_cond_ibpb)) { 207 + unsigned long prev_mm, next_mm; 208 + 209 + /* 210 + * This is a bit more complex than the always mode because 211 + * it has to handle two cases: 212 + * 213 + * 1) Switch from a user space task (potential attacker) 214 + * which has TIF_SPEC_IB set to a user space task 215 + * (potential victim) which has TIF_SPEC_IB not set. 216 + * 217 + * 2) Switch from a user space task (potential attacker) 218 + * which has TIF_SPEC_IB not set to a user space task 219 + * (potential victim) which has TIF_SPEC_IB set. 220 + * 221 + * This could be done by unconditionally issuing IBPB when 222 + * a task which has TIF_SPEC_IB set is either scheduled in 223 + * or out. Though that results in two flushes when: 224 + * 225 + * - the same user space task is scheduled out and later 226 + * scheduled in again and only a kernel thread ran in 227 + * between. 228 + * 229 + * - a user space task belonging to the same process is 230 + * scheduled in after a kernel thread ran in between 231 + * 232 + * - a user space task belonging to the same process is 233 + * scheduled in immediately. 234 + * 235 + * Optimize this with reasonably small overhead for the 236 + * above cases. Mangle the TIF_SPEC_IB bit into the mm 237 + * pointer of the incoming task which is stored in 238 + * cpu_tlbstate.last_user_mm_ibpb for comparison. 239 + */ 240 + next_mm = mm_mangle_tif_spec_ib(next); 241 + prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb); 242 + 243 + /* 244 + * Issue IBPB only if the mm's are different and one or 245 + * both have the IBPB bit set. 
246 + */ 247 + if (next_mm != prev_mm && 248 + (next_mm | prev_mm) & LAST_USER_MM_IBPB) 249 + indirect_branch_prediction_barrier(); 250 + 251 + this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm); 252 + } 253 + 254 + if (static_branch_unlikely(&switch_mm_always_ibpb)) { 255 + /* 256 + * Only flush when switching to a user space task with a 257 + * different context than the user space task which ran 258 + * last on this CPU. 259 + */ 260 + if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) { 261 + indirect_branch_prediction_barrier(); 262 + this_cpu_write(cpu_tlbstate.last_user_mm, next->mm); 263 + } 264 + } 200 265 } 201 266 202 267 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, ··· 367 292 new_asid = prev_asid; 368 293 need_flush = true; 369 294 } else { 370 - u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id); 371 - 372 295 /* 373 296 * Avoid user/user BTB poisoning by flushing the branch 374 297 * predictor when switching between processes. This stops 375 298 * one process from doing Spectre-v2 attacks on another. 376 - * 377 - * As an optimization, flush indirect branches only when 378 - * switching into a processes that can't be ptrace by the 379 - * current one (as in such case, attacker has much more 380 - * convenient way how to tamper with the next process than 381 - * branch buffer poisoning). 382 299 */ 383 - if (static_cpu_has(X86_FEATURE_USE_IBPB) && 384 - ibpb_needed(tsk, last_ctx_id)) 385 - indirect_branch_prediction_barrier(); 300 + cond_ibpb(tsk); 386 301 387 302 if (IS_ENABLED(CONFIG_VMAP_STACK)) { 388 303 /* ··· 429 364 /* See above wrt _rcuidle. */ 430 365 trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0); 431 366 } 432 - 433 - /* 434 - * Record last user mm's context id, so we can avoid 435 - * flushing branch buffer with IBPB if we switch back 436 - * to the same user. 437 - */ 438 - if (next != &init_mm) 439 - this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id); 440 367 441 368 /* Make sure we write CR3 before loaded_mm. */ 442 369 barrier(); ··· 498 441 write_cr3(build_cr3(mm->pgd, 0)); 499 442 500 443 /* Reinitialize tlbstate. */ 501 - this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id); 444 + this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, LAST_USER_MM_IBPB); 502 445 this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0); 503 446 this_cpu_write(cpu_tlbstate.next_asid, 1); 504 447 this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
-17
include/linux/ptrace.h
··· 64 64 #define PTRACE_MODE_NOAUDIT 0x04 65 65 #define PTRACE_MODE_FSCREDS 0x08 66 66 #define PTRACE_MODE_REALCREDS 0x10 67 - #define PTRACE_MODE_SCHED 0x20 68 - #define PTRACE_MODE_IBPB 0x40 69 67 70 68 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */ 71 69 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS) 72 70 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS) 73 71 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS) 74 72 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS) 75 - #define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB) 76 73 77 74 /** 78 75 * ptrace_may_access - check whether the caller is permitted to access ··· 86 89 * process_vm_writev or ptrace (and should use the real credentials). 87 90 */ 88 91 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode); 89 - 90 - /** 91 - * ptrace_may_access - check whether the caller is permitted to access 92 - * a target task. 93 - * @task: target task 94 - * @mode: selects type of access and caller credentials 95 - * 96 - * Returns true on success, false on denial. 97 - * 98 - * Similar to ptrace_may_access(). Only to be called from context switch 99 - * code. Does not call into audit and the regular LSM hooks due to locking 100 - * constraints. 101 - */ 102 - extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode); 103 92 104 93 static inline int ptrace_reparented(struct task_struct *child) 105 94 {
+9
include/linux/sched.h
··· 1454 1454 #define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ 1455 1455 #define PFA_SPEC_SSB_DISABLE 3 /* Speculative Store Bypass disabled */ 1456 1456 #define PFA_SPEC_SSB_FORCE_DISABLE 4 /* Speculative Store Bypass force disabled*/ 1457 + #define PFA_SPEC_IB_DISABLE 5 /* Indirect branch speculation restricted */ 1458 + #define PFA_SPEC_IB_FORCE_DISABLE 6 /* Indirect branch speculation permanently restricted */ 1457 1459 1458 1460 #define TASK_PFA_TEST(name, func) \ 1459 1461 static inline bool task_##func(struct task_struct *p) \ ··· 1486 1484 1487 1485 TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) 1488 1486 TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) 1487 + 1488 + TASK_PFA_TEST(SPEC_IB_DISABLE, spec_ib_disable) 1489 + TASK_PFA_SET(SPEC_IB_DISABLE, spec_ib_disable) 1490 + TASK_PFA_CLEAR(SPEC_IB_DISABLE, spec_ib_disable) 1491 + 1492 + TASK_PFA_TEST(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable) 1493 + TASK_PFA_SET(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable) 1489 1494 1490 1495 static inline void 1491 1496 current_restore_flags(unsigned long orig_flags, unsigned long flags)
+20
include/linux/sched/smt.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef _LINUX_SCHED_SMT_H 3 + #define _LINUX_SCHED_SMT_H 4 + 5 + #include <linux/static_key.h> 6 + 7 + #ifdef CONFIG_SCHED_SMT 8 + extern struct static_key_false sched_smt_present; 9 + 10 + static __always_inline bool sched_smt_active(void) 11 + { 12 + return static_branch_likely(&sched_smt_present); 13 + } 14 + #else 15 + static inline bool sched_smt_active(void) { return false; } 16 + #endif 17 + 18 + void arch_smt_update(void); 19 + 20 + #endif
+1
include/uapi/linux/prctl.h
··· 212 212 #define PR_SET_SPECULATION_CTRL 53 213 213 /* Speculation control variants */ 214 214 # define PR_SPEC_STORE_BYPASS 0 215 + # define PR_SPEC_INDIRECT_BRANCH 1 215 216 /* Return and control values for PR_SET/GET_SPECULATION_CTRL */ 216 217 # define PR_SPEC_NOT_AFFECTED 0 217 218 # define PR_SPEC_PRCTL (1UL << 0)
+9 -6
kernel/cpu.c
··· 10 10 #include <linux/sched/signal.h> 11 11 #include <linux/sched/hotplug.h> 12 12 #include <linux/sched/task.h> 13 + #include <linux/sched/smt.h> 13 14 #include <linux/unistd.h> 14 15 #include <linux/cpu.h> 15 16 #include <linux/oom.h> ··· 367 366 } 368 367 369 368 #endif /* CONFIG_HOTPLUG_CPU */ 369 + 370 + /* 371 + * Architectures that need SMT-specific errata handling during SMT hotplug 372 + * should override this. 373 + */ 374 + void __weak arch_smt_update(void) { } 370 375 371 376 #ifdef CONFIG_HOTPLUG_SMT 372 377 enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED; ··· 1018 1011 * concurrent CPU hotplug via cpu_add_remove_lock. 1019 1012 */ 1020 1013 lockup_detector_cleanup(); 1014 + arch_smt_update(); 1021 1015 return ret; 1022 1016 } 1023 1017 ··· 1147 1139 ret = cpuhp_up_callbacks(cpu, st, target); 1148 1140 out: 1149 1141 cpus_write_unlock(); 1142 + arch_smt_update(); 1150 1143 return ret; 1151 1144 } 1152 1145 ··· 2063 2054 /* Tell user space about the state change */ 2064 2055 kobject_uevent(&dev->kobj, KOBJ_ONLINE); 2065 2056 } 2066 - 2067 - /* 2068 - * Architectures that need SMT-specific errata handling during SMT hotplug 2069 - * should override this. 2070 - */ 2071 - void __weak arch_smt_update(void) { }; 2072 2057 2073 2058 static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) 2074 2059 {
-10
kernel/ptrace.c
··· 261 261 262 262 static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode) 263 263 { 264 - if (mode & PTRACE_MODE_SCHED) 265 - return false; 266 - 267 264 if (mode & PTRACE_MODE_NOAUDIT) 268 265 return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE); 269 266 else ··· 328 331 !ptrace_has_cap(mm->user_ns, mode))) 329 332 return -EPERM; 330 333 331 - if (mode & PTRACE_MODE_SCHED) 332 - return 0; 333 334 return security_ptrace_access_check(task, mode); 334 - } 335 - 336 - bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode) 337 - { 338 - return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED); 339 335 } 340 336 341 337 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
+11 -8
kernel/sched/core.c
··· 5738 5738 5739 5739 #ifdef CONFIG_SCHED_SMT 5740 5740 /* 5741 - * The sched_smt_present static key needs to be evaluated on every 5742 - * hotplug event because at boot time SMT might be disabled when 5743 - * the number of booted CPUs is limited. 5744 - * 5745 - * If then later a sibling gets hotplugged, then the key would stay 5746 - * off and SMT scheduling would never be functional. 5741 + * When going up, increment the number of cores with SMT present. 5747 5742 */ 5748 - if (cpumask_weight(cpu_smt_mask(cpu)) > 1) 5749 - static_branch_enable_cpuslocked(&sched_smt_present); 5743 + if (cpumask_weight(cpu_smt_mask(cpu)) == 2) 5744 + static_branch_inc_cpuslocked(&sched_smt_present); 5750 5745 #endif 5751 5746 set_cpu_active(cpu, true); 5752 5747 ··· 5784 5789 * Do sync before park smpboot threads to take care the rcu boost case. 5785 5790 */ 5786 5791 synchronize_rcu_mult(call_rcu, call_rcu_sched); 5792 + 5793 + #ifdef CONFIG_SCHED_SMT 5794 + /* 5795 + * When going down, decrement the number of cores with SMT present. 5796 + */ 5797 + if (cpumask_weight(cpu_smt_mask(cpu)) == 2) 5798 + static_branch_dec_cpuslocked(&sched_smt_present); 5799 + #endif 5787 5800 5788 5801 if (!sched_smp_initialized) 5789 5802 return 0;
+1 -3
kernel/sched/sched.h
··· 23 23 #include <linux/sched/prio.h> 24 24 #include <linux/sched/rt.h> 25 25 #include <linux/sched/signal.h> 26 + #include <linux/sched/smt.h> 26 27 #include <linux/sched/stat.h> 27 28 #include <linux/sched/sysctl.h> 28 29 #include <linux/sched/task.h> ··· 937 936 938 937 939 938 #ifdef CONFIG_SCHED_SMT 940 - 941 - extern struct static_key_false sched_smt_present; 942 - 943 939 extern void __update_idle_core(struct rq *rq); 944 940 945 941 static inline void update_idle_core(struct rq *rq)
-2
scripts/Makefile.build
··· 236 236 objtool_args += --no-unreachable 237 237 endif 238 238 ifdef CONFIG_RETPOLINE 239 - ifneq ($(RETPOLINE_CFLAGS),) 240 239 objtool_args += --retpoline 241 - endif 242 240 endif 243 241 244 242
+1
tools/include/uapi/linux/prctl.h
··· 212 212 #define PR_SET_SPECULATION_CTRL 53 213 213 /* Speculation control variants */ 214 214 # define PR_SPEC_STORE_BYPASS 0 215 + # define PR_SPEC_INDIRECT_BRANCH 1 215 216 /* Return and control values for PR_SET/GET_SPECULATION_CTRL */ 216 217 # define PR_SPEC_NOT_AFFECTED 0 217 218 # define PR_SPEC_PRCTL (1UL << 0)