
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"The headline feature is the re-enablement of support for Arm's
Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
from Mark Rutland.

If matrices aren't your thing, then Ryan's page-table optimisation
work is much more interesting.

Summary:

ACPI, EFI and PSCI:

- Decouple Arm's "Software Delegated Exception Interface" (SDEI)
support from the ACPI GHES code so that it can be used by platforms
booted with device-tree

- Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
runtime calls

- Fix a node refcount imbalance in the PSCI device-tree code

CPU Features:

- Ensure register sanitisation is applied to fields in ID_AA64MMFR4

- Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
guests can reliably query the underlying CPU types from the VMM

- Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
to our context-switching, signal handling and ptrace code

Entry code:

- Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
selected

Memory management:

- Prevent BSS exports from being used by the early PI code

- Propagate level and stride information to the low-level TLB
invalidation routines when operating on hugetlb entries

- Use the page-table contiguous hint for vmap() mappings with
VM_ALLOW_HUGE_VMAP where possible

- Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
and hook this up on arm64 so that the trailing DSB (used to publish
the updates to the hardware walker) can be deferred until the end
of the mapping operation

- Extend mmap() randomisation for 52-bit virtual addresses (on par
with 48-bit addressing) and remove limited support for
randomisation of the linear map

Perf and PMUs:

- Add support for probing the CMN-S3 driver using ACPI

- Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers

Selftests:

- Fix FPSIMD and SME tests to align with the freshly re-enabled SME
support

- Fix default setting of the OUTPUT variable so that tests are
installed in the right location

vDSO:

- Replace raw counter access from inline assembly code with a call to
the __arch_counter_get_cntvct() helper function

Miscellaneous:

- Add some missing header inclusions to the CCA headers

- Rework rendering of /proc/cpuinfo to follow the x86 approach and
avoid repeated buffer expansion (the user-visible format remains
identical)

- Remove redundant selection of CONFIG_CRC32

- Extend early error message when failing to map the device-tree
blob"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
arm64: cputype: Add cputype definition for HIP12
arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
perf/arm-cmn: Add CMN S3 ACPI binding
arm64/boot: Disallow BSS exports to startup code
arm64/boot: Move global CPU override variables out of BSS
arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
perf/arm-cmn: Initialise cmn->cpu earlier
kselftest/arm64: Set default OUTPUT path when undefined
arm64: Update comment regarding values in __boot_cpu_mode
arm64: mm: Drop redundant check in pmd_trans_huge()
arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
arm64/mm: Permit lazy_mmu_mode to be nested
arm64/mm: Disable barrier batching in interrupt contexts
arm64/cpuinfo: only show one cpu's info in c_show()
arm64/mm: Batch barriers when updating kernel mappings
mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
arm64/mm: Support huge pte-mapped pages in vmap
mm/vmalloc: Gracefully unmap huge ptes
mm/vmalloc: Warn on improper use of vunmap_range()
arm64/mm: Hoist barriers out of set_ptes_anysz() loop
...

+1065 -896
+1
Documentation/ABI/testing/sysfs-devices-system-cpu
···
 /sys/devices/system/cpu/cpuX/regs/identification/
 /sys/devices/system/cpu/cpuX/regs/identification/midr_el1
 /sys/devices/system/cpu/cpuX/regs/identification/revidr_el1
+/sys/devices/system/cpu/cpuX/regs/identification/aidr_el1
 /sys/devices/system/cpu/cpuX/regs/identification/smidr_el1
 Date: June 2016
 Contact: Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
+7 -6
Documentation/arch/arm64/cpu-feature-registers.rst
···
 process could be migrated to another CPU by the time it uses the
 register value, unless the CPU affinity is set. Hence, there is no
 guarantee that the value reflects the processor that it is
-currently executing on. The REVIDR is not exposed due to this
-constraint, as REVIDR makes sense only in conjunction with the
-MIDR. Alternately, MIDR_EL1 and REVIDR_EL1 are exposed via sysfs
-at::
+currently executing on. REVIDR and AIDR are not exposed due to this
+constraint, as these registers only make sense in conjunction with
+the MIDR. Alternately, MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are exposed
+via sysfs at::

 /sys/devices/system/cpu/cpu$ID/regs/identification/
-\- midr
-\- revidr
+\- midr_el1
+\- revidr_el1
+\- aidr_el1

 3. Implementation
 --------------------
+4 -4
Documentation/arch/arm64/sme.rst
···
 vectors from 0 to VL/8-1 stored in the same endianness invariant format as is
 used for SVE vectors.

-* On thread creation TPIDR2_EL0 is preserved unless CLONE_SETTLS is specified,
-  in which case it is set to 0.
+* On thread creation PSTATE.ZA and TPIDR2_EL0 are preserved unless CLONE_VM
+  is specified, in which case PSTATE.ZA is set to 0 and TPIDR2_EL0 is set to 0.

 2. Vector lengths
 ------------------
···
 5. Signal handling
 -------------------

-* Signal handlers are invoked with streaming mode and ZA disabled.
+* Signal handlers are invoked with PSTATE.SM=0, PSTATE.ZA=0, and TPIDR2_EL0=0.

 * A new signal frame record TPIDR2_MAGIC is added formatted as a struct
   tpidr2_context to allow access to TPIDR2_EL0 from signal handlers.
···
 length, or calling PR_SME_SET_VL with the PR_SME_SET_VL_ONEXEC flag,
 does not constitute a change to the vector length for this purpose.

-* Changing the vector length causes PSTATE.ZA and PSTATE.SM to be cleared.
+* Changing the vector length causes PSTATE.ZA to be cleared.
   Calling PR_SME_SET_VL with vl equal to the thread's current vector
   length, or calling PR_SME_SET_VL with the PR_SME_SET_VL_ONEXEC flag,
   does not constitute a change to the vector length for this purpose.
+4 -5
arch/arm64/Kconfig
···
 select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
+select ARCH_HAS_PREEMPT_LAZY
 select ARCH_HAS_PTDUMP
 select ARCH_HAS_PTE_DEVMAP
 select ARCH_HAS_PTE_SPECIAL
···
 select COMMON_CLK
 select CPU_PM if (SUSPEND || CPU_IDLE)
 select CPUMASK_OFFSTACK if NR_CPUS > 256
-select CRC32
 select DCACHE_WORD_ACCESS
 select DYNAMIC_FTRACE if FUNCTION_TRACER
 select DMA_BOUNCE_UNALIGNED_KMALLOC
···
 default 24 if ARM64_VA_BITS=39
 default 27 if ARM64_VA_BITS=42
 default 30 if ARM64_VA_BITS=47
-default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
-default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
-default 33 if ARM64_VA_BITS=48
+default 29 if (ARM64_VA_BITS=48 || ARM64_VA_BITS=52) && ARM64_64K_PAGES
+default 31 if (ARM64_VA_BITS=48 || ARM64_VA_BITS=52) && ARM64_16K_PAGES
+default 33 if (ARM64_VA_BITS=48 || ARM64_VA_BITS=52)
 default 14 if ARM64_64K_PAGES
 default 16 if ARM64_16K_PAGES
 default 18
···
 bool "ARM Scalable Matrix Extension support"
 default y
 depends on ARM64_SVE
-depends on BROKEN
 help
   The Scalable Matrix Extension (SME) is an extension to the AArch64
   execution state which utilises a substantial subset of the SVE
+1
arch/arm64/include/asm/cpu.h
···
 u64 reg_dczid;
 u64 reg_midr;
 u64 reg_revidr;
+u64 reg_aidr;
 u64 reg_gmid;
 u64 reg_smidr;
 u64 reg_mpamidr;
+2
arch/arm64/include/asm/cputype.h
···

 #define HISI_CPU_PART_TSV110 0xD01
 #define HISI_CPU_PART_HIP09 0xD02
+#define HISI_CPU_PART_HIP12 0xD06

 #define APPLE_CPU_PART_M1_ICESTORM 0x022
 #define APPLE_CPU_PART_M1_FIRESTORM 0x023
···
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
 #define MIDR_HISI_HIP09 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP09)
+#define MIDR_HISI_HIP12 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP12)
 #define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
 #define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
 #define MIDR_APPLE_M1_ICESTORM_PRO MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM_PRO)
+7 -3
arch/arm64/include/asm/el2_setup.h
···
 	orr	x0, x0, #(1 << 62)

 .Lskip_spe_fgt_\@:
+
+.Lset_debug_fgt_\@:
 	msr_s	SYS_HDFGRTR_EL2, x0
 	msr_s	SYS_HDFGWTR_EL2, x0

 	mov	x0, xzr
 	mrs	x1, id_aa64pfr1_el1
 	ubfx	x1, x1, #ID_AA64PFR1_EL1_SME_SHIFT, #4
-	cbz	x1, .Lskip_debug_fgt_\@
+	cbz	x1, .Lskip_sme_fgt_\@

 	/* Disable nVHE traps of TPIDR2 and SMPRI */
 	orr	x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK
 	orr	x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK

-.Lskip_debug_fgt_\@:
+.Lskip_sme_fgt_\@:
 	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
 	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
 	cbz	x1, .Lskip_pie_fgt_\@
···
 	/* GCS depends on PIE so we don't check it if PIE is absent */
 	mrs_s	x1, SYS_ID_AA64PFR1_EL1
 	ubfx	x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
-	cbz	x1, .Lset_fgt_\@
+	cbz	x1, .Lskip_gce_fgt_\@

 	/* Disable traps of access to GCS registers at EL0 and EL1 */
 	orr	x0, x0, #HFGxTR_EL2_nGCS_EL1_MASK
 	orr	x0, x0, #HFGxTR_EL2_nGCS_EL0_MASK
+
+.Lskip_gce_fgt_\@:

 .Lset_fgt_\@:
 	msr_s	SYS_HFGRTR_EL2, x0
+7 -5
arch/arm64/include/asm/esr.h
···
 /*
  * ISS values for SME traps
  */
+#define ESR_ELx_SME_ISS_SMTC_MASK	GENMASK(2, 0)
+#define ESR_ELx_SME_ISS_SMTC(esr)	((esr) & ESR_ELx_SME_ISS_SMTC_MASK)

-#define ESR_ELx_SME_ISS_SME_DISABLED	0
-#define ESR_ELx_SME_ISS_ILL		1
-#define ESR_ELx_SME_ISS_SM_DISABLED	2
-#define ESR_ELx_SME_ISS_ZA_DISABLED	3
-#define ESR_ELx_SME_ISS_ZT_DISABLED	4
+#define ESR_ELx_SME_ISS_SMTC_SME_DISABLED	0
+#define ESR_ELx_SME_ISS_SMTC_ILL		1
+#define ESR_ELx_SME_ISS_SMTC_SM_DISABLED	2
+#define ESR_ELx_SME_ISS_SMTC_ZA_DISABLED	3
+#define ESR_ELx_SME_ISS_SMTC_ZT_DISABLED	4

 /* ISS field definitions for MOPS exceptions */
 #define ESR_ELx_MOPS_ISS_MEM_INST	(UL(1) << 24)
+47 -17
arch/arm64/include/asm/fpsimd.h
···
 #define __ASM_FP_H

 #include <asm/errno.h>
+#include <asm/percpu.h>
 #include <asm/ptrace.h>
 #include <asm/processor.h>
 #include <asm/sigcontext.h>
···
 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);

-extern void fpsimd_signal_preserve_current_state(void);
 extern void fpsimd_preserve_current_state(void);
 extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
···
 	enum fp_type to_save;
 };

+DECLARE_PER_CPU(struct cpu_fp_state, fpsimd_last_state);
+
 extern void fpsimd_bind_state_to_cpu(struct cpu_fp_state *fp_state);

 extern void fpsimd_flush_task_state(struct task_struct *target);
+extern void fpsimd_save_and_flush_current_state(void);
 extern void fpsimd_save_and_flush_cpu_state(void);

 static inline bool thread_sm_enabled(struct thread_struct *thread)
···
 {
 	return system_supports_sme() && (thread->svcr & SVCR_ZA_MASK);
 }
+
+extern void task_smstop_sm(struct task_struct *task);

 /* Maximum VL that SVE/SME VL-agnostic software can transparently support */
 #define VL_ARCH_MAX 0x100
···
 extern void sve_alloc(struct task_struct *task, bool flush);
 extern void fpsimd_release_task(struct task_struct *task);
-extern void fpsimd_sync_to_sve(struct task_struct *task);
-extern void fpsimd_force_sync_to_sve(struct task_struct *task);
-extern void sve_sync_to_fpsimd(struct task_struct *task);
-extern void sve_sync_from_fpsimd_zeropad(struct task_struct *task);
+extern void fpsimd_sync_from_effective_state(struct task_struct *task);
+extern void fpsimd_sync_to_effective_state_zeropad(struct task_struct *task);

 extern int vec_set_vector_length(struct task_struct *task, enum vec_type type,
 				 unsigned long vl, unsigned long flags);
···
 	return vq_available(ARM64_VEC_SVE, vq);
 }

-size_t sve_state_size(struct task_struct const *task);
+static inline size_t __sve_state_size(unsigned int sve_vl, unsigned int sme_vl)
+{
+	unsigned int vl = max(sve_vl, sme_vl);
+	return SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl));
+}
+
+/*
+ * Return how many bytes of memory are required to store the full SVE
+ * state for task, given task's currently configured vector length.
+ */
+static inline size_t sve_state_size(struct task_struct const *task)
+{
+	unsigned int sve_vl = task_get_sve_vl(task);
+	unsigned int sme_vl = task_get_sme_vl(task);
+	return __sve_state_size(sve_vl, sme_vl);
+}

 #else /* ! CONFIG_ARM64_SVE */

 static inline void sve_alloc(struct task_struct *task, bool flush) { }
 static inline void fpsimd_release_task(struct task_struct *task) { }
-static inline void sve_sync_to_fpsimd(struct task_struct *task) { }
-static inline void sve_sync_from_fpsimd_zeropad(struct task_struct *task) { }
+static inline void fpsimd_sync_from_effective_state(struct task_struct *task) { }
+static inline void fpsimd_sync_to_effective_state_zeropad(struct task_struct *task) { }

 static inline int sve_max_virtualisable_vl(void)
 {
···
 static inline void vec_update_vq_map(enum vec_type t) { }
 static inline int vec_verify_vq_map(enum vec_type t) { return 0; }
 static inline void sve_setup(void) { }
+
+static inline size_t __sve_state_size(unsigned int sve_vl, unsigned int sme_vl)
+{
+	return 0;
+}

 static inline size_t sve_state_size(struct task_struct const *task)
 {
···
 extern int sme_get_current_vl(void);
 extern void sme_suspend_exit(void);

+static inline size_t __sme_state_size(unsigned int sme_vl)
+{
+	size_t size = ZA_SIG_REGS_SIZE(sve_vq_from_vl(sme_vl));
+
+	if (system_supports_sme2())
+		size += ZT_SIG_REG_SIZE;
+
+	return size;
+}
+
 /*
  * Return how many bytes of memory are required to store the full SME
  * specific state for task, given task's currently configured vector
···
  */
 static inline size_t sme_state_size(struct task_struct const *task)
 {
-	unsigned int vl = task_get_sme_vl(task);
-	size_t size;
-
-	size = ZA_SIG_REGS_SIZE(sve_vq_from_vl(vl));
-
-	if (system_supports_sme2())
-		size += ZT_SIG_REG_SIZE;
-
-	return size;
+	return __sme_state_size(task_get_sme_vl(task));
 }

 #else
···
 static inline int sme_set_current_vl(unsigned long arg) { return -EINVAL; }
 static inline int sme_get_current_vl(void) { return -EINVAL; }
 static inline void sme_suspend_exit(void) { }
+
+static inline size_t __sme_state_size(unsigned int sme_vl)
+{
+	return 0;
+}

 static inline size_t sme_state_size(struct task_struct const *task)
 {
+25 -16
arch/arm64/include/asm/hugetlb.h
···

 #include <asm-generic/hugetlb.h>

+static inline void __flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end,
+					     unsigned long stride,
+					     bool last_level)
+{
+	switch (stride) {
+#ifndef __PAGETABLE_PMD_FOLDED
+	case PUD_SIZE:
+		__flush_tlb_range(vma, start, end, PUD_SIZE, last_level, 1);
+		break;
+#endif
+	case CONT_PMD_SIZE:
+	case PMD_SIZE:
+		__flush_tlb_range(vma, start, end, PMD_SIZE, last_level, 2);
+		break;
+	case CONT_PTE_SIZE:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, 3);
+		break;
+	default:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, TLBI_TTL_UNKNOWN);
+	}
+}
+
 #define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
 static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 					   unsigned long start,
···
 {
 	unsigned long stride = huge_page_size(hstate_vma(vma));

-	switch (stride) {
-#ifndef __PAGETABLE_PMD_FOLDED
-	case PUD_SIZE:
-		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
-		break;
-#endif
-	case CONT_PMD_SIZE:
-	case PMD_SIZE:
-		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
-		break;
-	case CONT_PTE_SIZE:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
-		break;
-	default:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
-	}
+	__flush_hugetlb_tlb_range(vma, start, end, stride, false);
 }

 #endif /* __ASM_HUGETLB_H */
+2
arch/arm64/include/asm/mem_encrypt.h
···

 #include <asm/rsi.h>

+struct device;
+
 struct arm64_mem_crypt_ops {
 	int (*encrypt)(unsigned long addr, int numpages);
 	int (*decrypt)(unsigned long addr, int numpages);
+172 -66
arch/arm64/include/asm/pgtable.h
···
 #include <linux/sched.h>
 #include <linux/page_table_check.h>

+static inline void emit_pte_barriers(void)
+{
+	/*
+	 * These barriers are emitted under certain conditions after a pte entry
+	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
+	 * visible to the table walker. The isb ensures that any previous
+	 * speculative "invalid translation" marker that is in the CPU's
+	 * pipeline gets cleared, so that any access to that address after
+	 * setting the pte to valid won't cause a spurious fault. If the thread
+	 * gets preempted after storing to the pgtable but before emitting these
+	 * barriers, __switch_to() emits a dsb which ensure the walker gets to
+	 * see the store. There is no guarantee of an isb being issued though.
+	 * This is safe because it will still get issued (albeit on a
+	 * potentially different CPU) when the thread starts running again,
+	 * before any access to the address.
+	 */
+	dsb(ishst);
+	isb();
+}
+
+static inline void queue_pte_barriers(void)
+{
+	unsigned long flags;
+
+	if (in_interrupt()) {
+		emit_pte_barriers();
+		return;
+	}
+
+	flags = read_thread_flags();
+
+	if (flags & BIT(TIF_LAZY_MMU)) {
+		/* Avoid the atomic op if already set. */
+		if (!(flags & BIT(TIF_LAZY_MMU_PENDING)))
+			set_thread_flag(TIF_LAZY_MMU_PENDING);
+	} else {
+		emit_pte_barriers();
+	}
+}
+
+#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
+static inline void arch_enter_lazy_mmu_mode(void)
+{
+	/*
+	 * lazy_mmu_mode is not supposed to permit nesting. But in practice this
+	 * does happen with CONFIG_DEBUG_PAGEALLOC, where a page allocation
+	 * inside a lazy_mmu_mode section (such as zap_pte_range()) will change
+	 * permissions on the linear map with apply_to_page_range(), which
+	 * re-enters lazy_mmu_mode. So we tolerate nesting in our
+	 * implementation. The first call to arch_leave_lazy_mmu_mode() will
+	 * flush and clear the flag such that the remainder of the work in the
+	 * outer nest behaves as if outside of lazy mmu mode. This is safe and
+	 * keeps tracking simple.
+	 */
+
+	if (in_interrupt())
+		return;
+
+	set_thread_flag(TIF_LAZY_MMU);
+}
+
+static inline void arch_flush_lazy_mmu_mode(void)
+{
+	if (in_interrupt())
+		return;
+
+	if (test_and_clear_thread_flag(TIF_LAZY_MMU_PENDING))
+		emit_pte_barriers();
+}
+
+static inline void arch_leave_lazy_mmu_mode(void)
+{
+	if (in_interrupt())
+		return;
+
+	arch_flush_lazy_mmu_mode();
+	clear_thread_flag(TIF_LAZY_MMU);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
···
 	WRITE_ONCE(*ptep, pte);
 }

+static inline void __set_pte_complete(pte_t pte)
+{
+	/*
+	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
+	 * has the necessary barriers.
+	 */
+	if (pte_valid_not_user(pte))
+		queue_pte_barriers();
+}
+
 static inline void __set_pte(pte_t *ptep, pte_t pte)
 {
 	__set_pte_nosync(ptep, pte);
-
-	/*
-	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
-	 * or update_mmu_cache() have the necessary barriers.
-	 */
-	if (pte_valid_not_user(pte)) {
-		dsb(ishst);
-		isb();
-	}
+	__set_pte_complete(pte);
 }

 static inline pte_t __ptep_get(pte_t *ptep)
···
 static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
 {
 	return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte));
-}
-
-static inline void __set_ptes(struct mm_struct *mm,
-			      unsigned long __always_unused addr,
-			      pte_t *ptep, pte_t pte, unsigned int nr)
-{
-	page_table_check_ptes_set(mm, ptep, pte, nr);
-	__sync_cache_and_tags(pte, nr);
-
-	for (;;) {
-		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
-		if (--nr == 0)
-			break;
-		ptep++;
-		pte = pte_advance_pfn(pte, 1);
-	}
 }

 /*
···
 	return __pgprot(pud_val(pfn_pud(pfn, __pgprot(0))) ^ pud_val(pud));
 }

-static inline void __set_pte_at(struct mm_struct *mm,
-				unsigned long __always_unused addr,
-				pte_t *ptep, pte_t pte, unsigned int nr)
+static inline void __set_ptes_anysz(struct mm_struct *mm, pte_t *ptep,
+				    pte_t pte, unsigned int nr,
+				    unsigned long pgsize)
 {
-	__sync_cache_and_tags(pte, nr);
-	__check_safe_pte_update(mm, ptep, pte);
-	__set_pte(ptep, pte);
+	unsigned long stride = pgsize >> PAGE_SHIFT;
+
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_ptes_set(mm, ptep, pte, nr);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmds_set(mm, (pmd_t *)ptep, pte_pmd(pte), nr);
+		break;
+#ifndef __PAGETABLE_PMD_FOLDED
+	case PUD_SIZE:
+		page_table_check_puds_set(mm, (pud_t *)ptep, pte_pud(pte), nr);
+		break;
+#endif
+	default:
+		VM_WARN_ON(1);
+	}
+
+	__sync_cache_and_tags(pte, nr * stride);
+
+	for (;;) {
+		__check_safe_pte_update(mm, ptep, pte);
+		__set_pte_nosync(ptep, pte);
+		if (--nr == 0)
+			break;
+		ptep++;
+		pte = pte_advance_pfn(pte, stride);
+	}
+
+	__set_pte_complete(pte);
 }

-static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-			      pmd_t *pmdp, pmd_t pmd)
+static inline void __set_ptes(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pte_t *ptep, pte_t pte, unsigned int nr)
 {
-	page_table_check_pmd_set(mm, pmdp, pmd);
-	return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd),
-			    PMD_SIZE >> PAGE_SHIFT);
+	__set_ptes_anysz(mm, ptep, pte, nr, PAGE_SIZE);
 }

-static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
-			      pud_t *pudp, pud_t pud)
+static inline void __set_pmds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
-	page_table_check_pud_set(mm, pudp, pud);
-	return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud),
-			    PUD_SIZE >> PAGE_SHIFT);
+	__set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE);
 }
+#define set_pmd_at(mm, addr, pmdp, pmd) __set_pmds(mm, addr, pmdp, pmd, 1)
+
+static inline void __set_puds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pud_t *pudp, pud_t pud, unsigned int nr)
+{
+	__set_ptes_anysz(mm, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE);
+}
+#define set_pud_at(mm, addr, pudp, pud) __set_puds(mm, addr, pudp, pud, 1)

 #define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
 #define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
···
 	 * If pmd is present-invalid, pmd_table() won't detect it
 	 * as a table, so force the valid bit for the comparison.
 	 */
-	return pmd_val(pmd) && pmd_present(pmd) &&
-	       !pmd_table(__pmd(pmd_val(pmd) | PTE_VALID));
+	return pmd_present(pmd) && !pmd_table(__pmd(pmd_val(pmd) | PTE_VALID));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
···
 			 PUD_TYPE_TABLE)
 #endif

-extern pgd_t init_pg_dir[];
-extern pgd_t init_pg_end[];
 extern pgd_t swapper_pg_dir[];
 extern pgd_t idmap_pg_dir[];
 extern pgd_t tramp_pg_dir[];
···

 	WRITE_ONCE(*pmdp, pmd);

-	if (pmd_valid(pmd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pmd_valid(pmd))
+		queue_pte_barriers();
 }

 static inline void pmd_clear(pmd_t *pmdp)
···

 	WRITE_ONCE(*pudp, pud);

-	if (pud_valid(pud)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pud_valid(pud))
+		queue_pte_barriers();
 }

 static inline void pud_clear(pud_t *pudp)
···
 	}

 	WRITE_ONCE(*p4dp, p4d);
-	dsb(ishst);
-	isb();
+	queue_pte_barriers();
 }

 static inline void p4d_clear(p4d_t *p4dp)
···
 	}

 	WRITE_ONCE(*pgdp, pgd);
-	dsb(ishst);
-	isb();
+	queue_pte_barriers();
 }

 static inline void pgd_clear(pgd_t *pgdp)
···
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */

-static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
-					 unsigned long address, pte_t *ptep)
+static inline pte_t __ptep_get_and_clear_anysz(struct mm_struct *mm,
+					       pte_t *ptep,
+					       unsigned long pgsize)
 {
 	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));

-	page_table_check_pte_clear(mm, pte);
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_pte_clear(mm, pte);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmd_clear(mm, pte_pmd(pte));
+		break;
+#ifndef __PAGETABLE_PMD_FOLDED
+	case PUD_SIZE:
+		page_table_check_pud_clear(mm, pte_pud(pte));
+		break;
+#endif
+	default:
+		VM_WARN_ON(1);
+	}

 	return pte;
+}
+
+static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
+					 unsigned long address, pte_t *ptep)
+{
+	return __ptep_get_and_clear_anysz(mm, ptep, PAGE_SIZE);
 }

 static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr,
···
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
-
-	page_table_check_pmd_clear(mm, pmd);
-
-	return pmd;
+	return pte_pmd(__ptep_get_and_clear_anysz(mm, (pte_t *)pmdp, PMD_SIZE));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+2
arch/arm64/include/asm/rsi_cmds.h
···
 #define __ASM_RSI_CMDS_H

 #include <linux/arm-smccc.h>
+#include <linux/string.h>
+#include <asm/memory.h>

 #include <asm/rsi_smc.h>
+11 -7
arch/arm64/include/asm/thread_info.h
···

 #define TIF_SIGPENDING		0	/* signal pending */
 #define TIF_NEED_RESCHED	1	/* rescheduling necessary */
-#define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
-#define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
-#define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
-#define TIF_MTE_ASYNC_FAULT	5	/* MTE Asynchronous Tag Check Fault */
-#define TIF_NOTIFY_SIGNAL	6	/* signal notifications exist */
+#define TIF_NEED_RESCHED_LAZY	2	/* Lazy rescheduling needed */
+#define TIF_NOTIFY_RESUME	3	/* callback before returning to user */
+#define TIF_FOREIGN_FPSTATE	4	/* CPU's FP state is not current's */
+#define TIF_UPROBE		5	/* uprobe breakpoint or singlestep */
+#define TIF_MTE_ASYNC_FAULT	6	/* MTE Asynchronous Tag Check Fault */
+#define TIF_NOTIFY_SIGNAL	7	/* signal notifications exist */
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
···
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
 #define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 #define TIF_TSC_SIGSEGV		30	/* SIGSEGV on counter-timer access */
+#define TIF_LAZY_MMU		31	/* Task in lazy mmu mode */
+#define TIF_LAZY_MMU_PENDING	32	/* Ops pending for lazy mmu mode exit */

 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
+#define _TIF_NEED_RESCHED_LAZY	(1 << TIF_NEED_RESCHED_LAZY)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_FOREIGN_FPSTATE	(1 << TIF_FOREIGN_FPSTATE)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
···
 #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
 #define _TIF_TSC_SIGSEGV	(1 << TIF_TSC_SIGSEGV)

-#define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
-				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
-				 _TIF_NOTIFY_SIGNAL)
+#define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \
+				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
+				 _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
+				 _TIF_NOTIFY_SIGNAL | _TIF_SIGPENDING)

 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
+2 -20
arch/arm64/include/asm/vdso/gettimeofday.h
···
 #ifndef __ASSEMBLY__

 #include <asm/alternative.h>
+#include <asm/arch_timer.h>
 #include <asm/barrier.h>
 #include <asm/unistd.h>
 #include <asm/sysreg.h>
···
 static __always_inline u64 __arch_get_hw_counter(s32 clock_mode,
						 const struct vdso_time_data *vd)
 {
-	u64 res;
-
 	/*
 	 * Core checks for mode already, so this raced against a concurrent
 	 * update. Return something. Core will do another round and then
···
 	if (clock_mode == VDSO_CLOCKMODE_NONE)
 		return 0;

-	/*
-	 * If FEAT_ECV is available, use the self-synchronizing counter.
-	 * Otherwise the isb is required to prevent that the counter value
-	 * is speculated.
-	 */
-	asm volatile(
-	ALTERNATIVE("isb\n"
-		    "mrs %0, cntvct_el0",
-		    "nop\n"
-		    __mrs_s("%0", SYS_CNTVCTSS_EL0),
-		    ARM64_HAS_ECV)
-	: "=r" (res)
-	:
-	: "memory");
-
-	arch_counter_enforce_ordering(res);
-
-	return res;
+	return __arch_counter_get_cntvct();
 }

 #if IS_ENABLED(CONFIG_CC_IS_GCC) && IS_ENABLED(CONFIG_PAGE_SIZE_64KB)
+2 -1
arch/arm64/include/asm/virt.h
···
  * __boot_cpu_mode records what mode CPUs were booted in.
  * A correctly-implemented bootloader must start all CPUs in the same mode:
  * In this case, both 32bit halves of __boot_cpu_mode will contain the
- * same value (either 0 if booted in EL1, BOOT_CPU_MODE_EL2 if booted in EL2).
+ * same value (either BOOT_CPU_MODE_EL1 if booted in EL1, BOOT_CPU_MODE_EL2 if
+ * booted in EL2).
  *
  * Should the bootloader fail to do this, the two values will be different.
  * This allows the kernel to flag an error when the secondaries have come up.
+45
arch/arm64/include/asm/vmalloc.h
···
 	return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
 }

+#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
+						unsigned long end, u64 pfn,
+						unsigned int max_page_shift)
+{
+	/*
+	 * If the block is at least CONT_PTE_SIZE in size, and is naturally
+	 * aligned in both virtual and physical space, then we can pte-map the
+	 * block using the PTE_CONT bit for more efficient use of the TLB.
+	 */
+	if (max_page_shift < CONT_PTE_SHIFT)
+		return PAGE_SIZE;
+
+	if (end - addr < CONT_PTE_SIZE)
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(addr, CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	return CONT_PTE_SIZE;
+}
+
+#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							   pte_t *ptep)
+{
+	/*
+	 * The caller handles alignment so it's sufficient just to check
+	 * PTE_CONT.
+	 */
+	return pte_valid_cont(__ptep_get(ptep)) ? CONT_PTE_SIZE : PAGE_SIZE;
+}
+
+#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
+static inline int arch_vmap_pte_supported_shift(unsigned long size)
+{
+	if (size >= CONT_PTE_SIZE)
+		return CONT_PTE_SHIFT;
+
+	return PAGE_SHIFT;
+}
+
 #endif

 #define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
+2
arch/arm64/kernel/asm-offsets.c
···
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
   DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
 #endif
+  DEFINE(PIE_E0_ASM,		PIE_E0);
+  DEFINE(PIE_E1_ASM,		PIE_E1);
   return 0;
 }
+12 -10
arch/arm64/kernel/cpufeature.c
···
 #define ARM64_FTR_REG(id, table)	\
 	__ARM64_FTR_REG_OVERRIDE(#id, id, table, &no_override)

-struct arm64_ftr_override id_aa64mmfr0_override;
-struct arm64_ftr_override id_aa64mmfr1_override;
-struct arm64_ftr_override id_aa64mmfr2_override;
-struct arm64_ftr_override id_aa64pfr0_override;
-struct arm64_ftr_override id_aa64pfr1_override;
-struct arm64_ftr_override id_aa64zfr0_override;
-struct arm64_ftr_override id_aa64smfr0_override;
-struct arm64_ftr_override id_aa64isar1_override;
-struct arm64_ftr_override id_aa64isar2_override;
+struct arm64_ftr_override __read_mostly id_aa64mmfr0_override;
+struct arm64_ftr_override __read_mostly id_aa64mmfr1_override;
+struct arm64_ftr_override __read_mostly id_aa64mmfr2_override;
+struct arm64_ftr_override __read_mostly id_aa64pfr0_override;
+struct arm64_ftr_override __read_mostly id_aa64pfr1_override;
+struct arm64_ftr_override __read_mostly id_aa64zfr0_override;
+struct arm64_ftr_override __read_mostly id_aa64smfr0_override;
+struct arm64_ftr_override __read_mostly id_aa64isar1_override;
+struct arm64_ftr_override __read_mostly id_aa64isar2_override;

-struct arm64_ftr_override arm64_sw_feature_override;
+struct arm64_ftr_override __read_mostly arm64_sw_feature_override;

 static const struct __ftr_reg_entry {
 	u32 sys_id;
···
 				      info->reg_id_aa64mmfr2, boot->reg_id_aa64mmfr2);
 	taint |= check_update_ftr_reg(SYS_ID_AA64MMFR3_EL1, cpu,
 				      info->reg_id_aa64mmfr3, boot->reg_id_aa64mmfr3);
+	taint |= check_update_ftr_reg(SYS_ID_AA64MMFR4_EL1, cpu,
+				      info->reg_id_aa64mmfr4, boot->reg_id_aa64mmfr4);

 	taint |= check_update_ftr_reg(SYS_ID_AA64PFR0_EL1, cpu,
 				      info->reg_id_aa64pfr0, boot->reg_id_aa64pfr0);
+55 -53
arch/arm64/kernel/cpuinfo.c
···
 static int c_show(struct seq_file *m, void *v)
 {
-	int i, j;
+	int j;
+	int cpu = m->index;
 	bool compat = personality(current->personality) == PER_LINUX32;
+	struct cpuinfo_arm64 *cpuinfo = v;
+	u32 midr = cpuinfo->reg_midr;

-	for_each_online_cpu(i) {
-		struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i);
-		u32 midr = cpuinfo->reg_midr;
+	/*
+	 * glibc reads /proc/cpuinfo to determine the number of
+	 * online processors, looking for lines beginning with
+	 * "processor". Give glibc what it expects.
+	 */
+	seq_printf(m, "processor\t: %d\n", cpu);
+	if (compat)
+		seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
+			   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);

-		/*
-		 * glibc reads /proc/cpuinfo to determine the number of
-		 * online processors, looking for lines beginning with
-		 * "processor". Give glibc what it expects.
-		 */
-		seq_printf(m, "processor\t: %d\n", i);
-		if (compat)
-			seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
-				   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
+	seq_printf(m, "BogoMIPS\t: %lu.%02lu\n",
+		   loops_per_jiffy / (500000UL/HZ),
+		   loops_per_jiffy / (5000UL/HZ) % 100);

-		seq_printf(m, "BogoMIPS\t: %lu.%02lu\n",
-			   loops_per_jiffy / (500000UL/HZ),
-			   loops_per_jiffy / (5000UL/HZ) % 100);
-
-		/*
-		 * Dump out the common processor features in a single line.
-		 * Userspace should read the hwcaps with getauxval(AT_HWCAP)
-		 * rather than attempting to parse this, but there's a body of
-		 * software which does already (at least for 32-bit).
-		 */
-		seq_puts(m, "Features\t:");
-		if (compat) {
+	/*
+	 * Dump out the common processor features in a single line.
+	 * Userspace should read the hwcaps with getauxval(AT_HWCAP)
+	 * rather than attempting to parse this, but there's a body of
+	 * software which does already (at least for 32-bit).
+	 */
+	seq_puts(m, "Features\t:");
+	if (compat) {
 #ifdef CONFIG_COMPAT
-		for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) {
-			if (compat_elf_hwcap & (1 << j)) {
-				/*
-				 * Warn once if any feature should not
-				 * have been present on arm64 platform.
-				 */
-				if (WARN_ON_ONCE(!compat_hwcap_str[j]))
-					continue;
+		for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) {
+			if (compat_elf_hwcap & (1 << j)) {
+				/*
+				 * Warn once if any feature should not
+				 * have been present on arm64 platform.
+				 */
+				if (WARN_ON_ONCE(!compat_hwcap_str[j]))
+					continue;

-				seq_printf(m, " %s", compat_hwcap_str[j]);
-			}
+				seq_printf(m, " %s", compat_hwcap_str[j]);
 			}
-
-		for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++)
-			if (compat_elf_hwcap2 & (1 << j))
-				seq_printf(m, " %s", compat_hwcap2_str[j]);
-#endif /* CONFIG_COMPAT */
-		} else {
-			for (j = 0; j < ARRAY_SIZE(hwcap_str); j++)
-				if (cpu_have_feature(j))
-					seq_printf(m, " %s", hwcap_str[j]);
 		}
-		seq_puts(m, "\n");

-		seq_printf(m, "CPU implementer\t: 0x%02x\n",
-			   MIDR_IMPLEMENTOR(midr));
-		seq_printf(m, "CPU architecture: 8\n");
-		seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
-		seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
-		seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
+		for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++)
+			if (compat_elf_hwcap2 & (1 << j))
+				seq_printf(m, " %s", compat_hwcap2_str[j]);
+#endif /* CONFIG_COMPAT */
+	} else {
+		for (j = 0; j < ARRAY_SIZE(hwcap_str); j++)
+			if (cpu_have_feature(j))
+				seq_printf(m, " %s", hwcap_str[j]);
 	}
+	seq_puts(m, "\n");
+
+	seq_printf(m, "CPU implementer\t: 0x%02x\n",
+		   MIDR_IMPLEMENTOR(midr));
+	seq_puts(m, "CPU architecture: 8\n");
+	seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
+	seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
+	seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));

 	return 0;
 }

 static void *c_start(struct seq_file *m, loff_t *pos)
 {
-	return *pos < 1 ? (void *)1 : NULL;
+	*pos = cpumask_next(*pos - 1, cpu_online_mask);
+	return *pos < nr_cpu_ids ? &per_cpu(cpu_data, *pos) : NULL;
 }

 static void *c_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	++*pos;
-	return NULL;
+	return c_start(m, pos);
 }

 static void c_stop(struct seq_file *m, void *v)
···
 CPUREGS_ATTR_RO(midr_el1, midr);
 CPUREGS_ATTR_RO(revidr_el1, revidr);
+CPUREGS_ATTR_RO(aidr_el1, aidr);
 CPUREGS_ATTR_RO(smidr_el1, smidr);

 static struct attribute *cpuregs_id_attrs[] = {
 	&cpuregs_attr_midr_el1.attr,
 	&cpuregs_attr_revidr_el1.attr,
+	&cpuregs_attr_aidr_el1.attr,
 	NULL
 };
···
 	info->reg_dczid = read_cpuid(DCZID_EL0);
 	info->reg_midr = read_cpuid_id();
 	info->reg_revidr = read_cpuid(REVIDR_EL1);
+	info->reg_aidr = read_cpuid(AIDR_EL1);

 	info->reg_id_aa64dfr0 = read_cpuid(ID_AA64DFR0_EL1);
 	info->reg_id_aa64dfr1 = read_cpuid(ID_AA64DFR1_EL1);
+2 -2
arch/arm64/kernel/efi.c
···
 void arch_efi_call_virt_setup(void)
 {
 	efi_virtmap_load();
-	__efi_fpsimd_begin();
 	raw_spin_lock(&efi_rt_lock);
+	__efi_fpsimd_begin();
 }

 void arch_efi_call_virt_teardown(void)
 {
-	raw_spin_unlock(&efi_rt_lock);
 	__efi_fpsimd_end();
+	raw_spin_unlock(&efi_rt_lock);
 	efi_virtmap_unload();
 }
+36 -12
arch/arm64/kernel/entry-common.c
···
 	do {
 		local_irq_enable();

-		if (thread_flags & _TIF_NEED_RESCHED)
+		if (thread_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
 			schedule();

 		if (thread_flags & _TIF_UPROBE)
···
  * As per the ABI exit SME streaming mode and clear the SVE state not
  * shared with FPSIMD on syscall entry.
  */
-static inline void fp_user_discard(void)
+static inline void fpsimd_syscall_enter(void)
 {
-	/*
-	 * If SME is active then exit streaming mode.  If ZA is active
-	 * then flush the SVE registers but leave userspace access to
-	 * both SVE and SME enabled, otherwise disable SME for the
-	 * task and fall through to disabling SVE too.  This means
-	 * that after a syscall we never have any streaming mode
-	 * register state to track, if this changes the KVM code will
-	 * need updating.
-	 */
+	/* Ensure PSTATE.SM is clear, but leave PSTATE.ZA as-is. */
 	if (system_supports_sme())
 		sme_smstop_sm();

+	/*
+	 * The CPU is not in streaming mode. If non-streaming SVE is not
+	 * supported, there is no SVE state that needs to be discarded.
+	 */
 	if (!system_supports_sve())
 		return;

···
 		sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
 		sve_flush_live(true, sve_vq_minus_one);
 	}
+
+	/*
+	 * Any live non-FPSIMD SVE state has been zeroed. Allow
+	 * fpsimd_save_user_state() to lazily discard SVE state until either
+	 * the live state is unbound or fpsimd_syscall_exit() is called.
+	 */
+	__this_cpu_write(fpsimd_last_state.to_save, FP_STATE_FPSIMD);
+}
+
+static __always_inline void fpsimd_syscall_exit(void)
+{
+	if (!system_supports_sve())
+		return;
+
+	/*
+	 * The current task's user FPSIMD/SVE/SME state is now bound to this
+	 * CPU. The fpsimd_last_state.to_save value is either:
+	 *
+	 * - FP_STATE_FPSIMD, if the state has not been reloaded on this CPU
+	 *   since fpsimd_syscall_enter().
+	 *
+	 * - FP_STATE_CURRENT, if the state has been reloaded on this CPU at
+	 *   any point.
+	 *
+	 * Reset this to FP_STATE_CURRENT to stop lazy discarding.
+	 */
+	__this_cpu_write(fpsimd_last_state.to_save, FP_STATE_CURRENT);
 }

 UNHANDLED(el1t, 64, sync)
···
 {
 	enter_from_user_mode(regs);
 	cortex_a76_erratum_1463225_svc_handler();
-	fp_user_discard();
+	fpsimd_syscall_enter();
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_svc(regs);
 	exit_to_user_mode(regs);
+	fpsimd_syscall_exit();
 }

 static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
+167 -213
arch/arm64/kernel/fpsimd.c
···
  * whatever is in the FPSIMD registers is not saved to memory, but discarded.
  */

-static DEFINE_PER_CPU(struct cpu_fp_state, fpsimd_last_state);
+DEFINE_PER_CPU(struct cpu_fp_state, fpsimd_last_state);

 __ro_after_init struct vl_info vl_info[ARM64_VEC_MAX] = {
 #ifdef CONFIG_ARM64_SVE
···
 	set_default_vl(ARM64_VEC_SVE, val);
 }

-static void __percpu *efi_sve_state;
+static u8 *efi_sve_state;

 #else /* ! CONFIG_ARM64_SVE */

 /* Dummy declaration for code that will be optimised out: */
-extern void __percpu *efi_sve_state;
+extern u8 *efi_sve_state;

 #endif /* ! CONFIG_ARM64_SVE */
···
 	WARN_ON(preemptible());
 	WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));

-	if (system_supports_fpmr())
-		write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);
-
 	if (system_supports_sve() || system_supports_sme()) {
 		switch (current->thread.fp_type) {
 		case FP_STATE_FPSIMD:
 			/* Stop tracking SVE for this task until next use. */
-			if (test_and_clear_thread_flag(TIF_SVE))
-				sve_user_disable();
+			clear_thread_flag(TIF_SVE);
 			break;
 		case FP_STATE_SVE:
-			if (!thread_sm_enabled(&current->thread) &&
-			    !WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE)))
-				sve_user_enable();
+			if (!thread_sm_enabled(&current->thread))
+				WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE));

 			if (test_thread_flag(TIF_SVE))
 				sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
···
 		if (thread_sm_enabled(&current->thread))
 			restore_ffr = system_supports_fa64();
 	}
+
+	if (system_supports_fpmr())
+		write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);

 	if (restore_sve_regs) {
 		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
···
 		*(last->fpmr) = read_sysreg_s(SYS_FPMR);

 	/*
-	 * If a task is in a syscall the ABI allows us to only
-	 * preserve the state shared with FPSIMD so don't bother
-	 * saving the full SVE state in that case.
+	 * Save SVE state if it is live.
+	 *
+	 * The syscall ABI discards live SVE state at syscall entry. When
+	 * entering a syscall, fpsimd_syscall_enter() sets to_save to
+	 * FP_STATE_FPSIMD to allow the SVE state to be lazily discarded until
+	 * either new SVE state is loaded+bound or fpsimd_syscall_exit() is
+	 * called prior to a return to userspace.
 	 */
-	if ((last->to_save == FP_STATE_CURRENT && test_thread_flag(TIF_SVE) &&
-	     !in_syscall(current_pt_regs())) ||
+	if ((last->to_save == FP_STATE_CURRENT && test_thread_flag(TIF_SVE)) ||
 	    last->to_save == FP_STATE_SVE) {
 		save_sve_regs = true;
 		save_ffr = true;
···
  * task->thread.uw.fpsimd_state must be up to date before calling this
  * function.
  */
-static void fpsimd_to_sve(struct task_struct *task)
+static inline void fpsimd_to_sve(struct task_struct *task)
 {
 	unsigned int vq;
 	void *sst = task->thread.sve_state;
···
  * bytes of allocated kernel memory.
  * task->thread.sve_state must be up to date before calling this function.
  */
-static void sve_to_fpsimd(struct task_struct *task)
+static inline void sve_to_fpsimd(struct task_struct *task)
 {
 	unsigned int vq, vl;
 	void const *sst = task->thread.sve_state;
···
 	}
 }

+static inline void __fpsimd_zero_vregs(struct user_fpsimd_state *fpsimd)
+{
+	memset(&fpsimd->vregs, 0, sizeof(fpsimd->vregs));
+}
+
+/*
+ * Simulate the effects of an SMSTOP SM instruction.
+ */
+void task_smstop_sm(struct task_struct *task)
+{
+	if (!thread_sm_enabled(&task->thread))
+		return;
+
+	__fpsimd_zero_vregs(&task->thread.uw.fpsimd_state);
+	task->thread.uw.fpsimd_state.fpsr = 0x0800009f;
+	if (system_supports_fpmr())
+		task->thread.uw.fpmr = 0;
+
+	task->thread.svcr &= ~SVCR_SM_MASK;
+	task->thread.fp_type = FP_STATE_FPSIMD;
+}
+
 void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__always_unused p)
 {
 	write_sysreg_s(read_sysreg_s(SYS_SCTLR_EL1) | SCTLR_EL1_EnFPM_MASK,
···
 }

 #ifdef CONFIG_ARM64_SVE
-/*
- * Call __sve_free() directly only if you know task can't be scheduled
- * or preempted.
- */
-static void __sve_free(struct task_struct *task)
+static void sve_free(struct task_struct *task)
 {
 	kfree(task->thread.sve_state);
 	task->thread.sve_state = NULL;
-}
-
-static void sve_free(struct task_struct *task)
-{
-	WARN_ON(test_tsk_thread_flag(task, TIF_SVE));
-
-	__sve_free(task);
-}
-
-/*
- * Return how many bytes of memory are required to store the full SVE
- * state for task, given task's currently configured vector length.
- */
-size_t sve_state_size(struct task_struct const *task)
-{
-	unsigned int vl = 0;
-
-	if (system_supports_sve())
-		vl = task_get_sve_vl(task);
-	if (system_supports_sme())
-		vl = max(vl, task_get_sme_vl(task));
-
-	return SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl));
 }

 /*
···
 		kzalloc(sve_state_size(task), GFP_KERNEL);
 }

-
 /*
- * Force the FPSIMD state shared with SVE to be updated in the SVE state
- * even if the SVE state is the current active state.
+ * Ensure that task->thread.uw.fpsimd_state is up to date with respect to the
+ * task's currently effective FPSIMD/SVE state.
  *
- * This should only be called by ptrace.  task must be non-runnable.
- * task->thread.sve_state must point to at least sve_state_size(task)
- * bytes of allocated kernel memory.
+ * The task's FPSIMD/SVE/SME state must not be subject to concurrent
+ * manipulation.
  */
-void fpsimd_force_sync_to_sve(struct task_struct *task)
-{
-	fpsimd_to_sve(task);
-}
-
-/*
- * Ensure that task->thread.sve_state is up to date with respect to
- * the user task, irrespective of when SVE is in use or not.
- *
- * This should only be called by ptrace.  task must be non-runnable.
- * task->thread.sve_state must point to at least sve_state_size(task)
- * bytes of allocated kernel memory.
- */
-void fpsimd_sync_to_sve(struct task_struct *task)
-{
-	if (!test_tsk_thread_flag(task, TIF_SVE) &&
-	    !thread_sm_enabled(&task->thread))
-		fpsimd_to_sve(task);
-}
-
-/*
- * Ensure that task->thread.uw.fpsimd_state is up to date with respect to
- * the user task, irrespective of whether SVE is in use or not.
- *
- * This should only be called by ptrace.  task must be non-runnable.
- * task->thread.sve_state must point to at least sve_state_size(task)
- * bytes of allocated kernel memory.
- */
-void sve_sync_to_fpsimd(struct task_struct *task)
+void fpsimd_sync_from_effective_state(struct task_struct *task)
 {
 	if (task->thread.fp_type == FP_STATE_SVE)
 		sve_to_fpsimd(task);
 }

 /*
- * Ensure that task->thread.sve_state is up to date with respect to
- * the task->thread.uw.fpsimd_state.
+ * Ensure that the task's currently effective FPSIMD/SVE state is up to date
+ * with respect to task->thread.uw.fpsimd_state, zeroing any effective
+ * non-FPSIMD (S)SVE state.
  *
- * This should only be called by ptrace to merge new FPSIMD register
- * values into a task for which SVE is currently active.
- * task must be non-runnable.
- * task->thread.sve_state must point to at least sve_state_size(task)
- * bytes of allocated kernel memory.
- * task->thread.uw.fpsimd_state must already have been initialised with
- * the new FPSIMD register values to be merged in.
+ * The task's FPSIMD/SVE/SME state must not be subject to concurrent
+ * manipulation.
  */
-void sve_sync_from_fpsimd_zeropad(struct task_struct *task)
+void fpsimd_sync_to_effective_state_zeropad(struct task_struct *task)
 {
 	unsigned int vq;
 	void *sst = task->thread.sve_state;
 	struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;

-	if (!test_tsk_thread_flag(task, TIF_SVE) &&
-	    !thread_sm_enabled(&task->thread))
+	if (task->thread.fp_type != FP_STATE_SVE)
 		return;

 	vq = sve_vq_from_vl(thread_get_cur_vl(&task->thread));
···
 	__fpsimd_to_sve(sst, fst, vq);
 }

+static int change_live_vector_length(struct task_struct *task,
+				     enum vec_type type,
+				     unsigned long vl)
+{
+	unsigned int sve_vl = task_get_sve_vl(task);
+	unsigned int sme_vl = task_get_sme_vl(task);
+	void *sve_state = NULL, *sme_state = NULL;
+
+	if (type == ARM64_VEC_SME)
+		sme_vl = vl;
+	else
+		sve_vl = vl;
+
+	/*
+	 * Allocate the new sve_state and sme_state before freeing the old
+	 * copies so that allocation failure can be handled without needing to
+	 * mutate the task's state in any way.
+	 *
+	 * Changes to the SVE vector length must not discard live ZA state or
+	 * clear PSTATE.ZA, as userspace code which is unaware of the AAPCS64
+	 * ZA lazy saving scheme may attempt to change the SVE vector length
+	 * while unsaved/dormant ZA state exists.
+	 */
+	sve_state = kzalloc(__sve_state_size(sve_vl, sme_vl), GFP_KERNEL);
+	if (!sve_state)
+		goto out_mem;
+
+	if (type == ARM64_VEC_SME) {
+		sme_state = kzalloc(__sme_state_size(sme_vl), GFP_KERNEL);
+		if (!sme_state)
+			goto out_mem;
+	}
+
+	if (task == current)
+		fpsimd_save_and_flush_current_state();
+	else
+		fpsimd_flush_task_state(task);
+
+	/*
+	 * Always preserve PSTATE.SM and the effective FPSIMD state, zeroing
+	 * other SVE state.
+	 */
+	fpsimd_sync_from_effective_state(task);
+	task_set_vl(task, type, vl);
+	kfree(task->thread.sve_state);
+	task->thread.sve_state = sve_state;
+	fpsimd_sync_to_effective_state_zeropad(task);
+
+	if (type == ARM64_VEC_SME) {
+		task->thread.svcr &= ~SVCR_ZA_MASK;
+		kfree(task->thread.sme_state);
+		task->thread.sme_state = sme_state;
+	}
+
+	return 0;
+
+out_mem:
+	kfree(sve_state);
+	kfree(sme_state);
+	return -ENOMEM;
+}
+
 int vec_set_vector_length(struct task_struct *task, enum vec_type type,
 			  unsigned long vl, unsigned long flags)
 {
-	bool free_sme = false;
+	bool onexec = flags & PR_SVE_SET_VL_ONEXEC;
+	bool inherit = flags & PR_SVE_VL_INHERIT;

 	if (flags & ~(unsigned long)(PR_SVE_VL_INHERIT |
 				     PR_SVE_SET_VL_ONEXEC))
···
 	vl = find_supported_vector_length(type, vl);

-	if (flags & (PR_SVE_VL_INHERIT |
-		     PR_SVE_SET_VL_ONEXEC))
+	if (!onexec && vl != task_get_vl(task, type)) {
+		if (change_live_vector_length(task, type, vl))
+			return -ENOMEM;
+	}
+
+	if (onexec || inherit)
 		task_set_vl_onexec(task, type, vl);
 	else
 		/* Reset VL to system default on next exec: */
 		task_set_vl_onexec(task, type, 0);

-	/* Only actually set the VL if not deferred: */
-	if (flags & PR_SVE_SET_VL_ONEXEC)
-		goto out;
-
-	if (vl == task_get_vl(task, type))
-		goto out;
-
-	/*
-	 * To ensure the FPSIMD bits of the SVE vector registers are preserved,
-	 * write any live register state back to task_struct, and convert to a
-	 * regular FPSIMD thread.
-	 */
-	if (task == current) {
-		get_cpu_fpsimd_context();
-
-		fpsimd_save_user_state();
-	}
-
-	fpsimd_flush_task_state(task);
-	if (test_and_clear_tsk_thread_flag(task, TIF_SVE) ||
-	    thread_sm_enabled(&task->thread)) {
-		sve_to_fpsimd(task);
-		task->thread.fp_type = FP_STATE_FPSIMD;
-	}
-
-	if (system_supports_sme()) {
-		if (type == ARM64_VEC_SME ||
-		    !(task->thread.svcr & (SVCR_SM_MASK | SVCR_ZA_MASK))) {
-			/*
-			 * We are changing the SME VL or weren't using
-			 * SME anyway, discard the state and force a
-			 * reallocation.
-			 */
-			task->thread.svcr &= ~(SVCR_SM_MASK |
-					       SVCR_ZA_MASK);
-			clear_tsk_thread_flag(task, TIF_SME);
-			free_sme = true;
-		}
-	}
-
-	if (task == current)
-		put_cpu_fpsimd_context();
-
-	task_set_vl(task, type, vl);
-
-	/*
-	 * Free the changed states if they are not in use, SME will be
-	 * reallocated to the correct size on next use and we just
-	 * allocate SVE now in case it is needed for use in streaming
-	 * mode.
-	 */
-	sve_free(task);
-	sve_alloc(task, true);
-
-	if (free_sme)
-		sme_free(task);
-
-out:
 	update_tsk_thread_flag(task, vec_vl_inherit_flag(type),
 			       flags & PR_SVE_VL_INHERIT);
···
 	if (!sve_vl_valid(max_vl))
 		goto fail;

-	efi_sve_state = __alloc_percpu(
-		SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl)), SVE_VQ_BYTES);
+	efi_sve_state = kmalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl)),
+				GFP_KERNEL);
 	if (!efi_sve_state)
 		goto fail;

 	return;

 fail:
-	panic("Cannot allocate percpu memory for EFI SVE save/restore");
+	panic("Cannot allocate memory for EFI SVE save/restore");
 }

 void cpu_enable_sve(const struct arm64_cpu_capabilities *__always_unused p)
···
  */
 void fpsimd_release_task(struct task_struct *dead_task)
 {
-	__sve_free(dead_task);
+	sve_free(dead_task);
 	sme_free(dead_task);
 }
···
 	 * If this not a trap due to SME being disabled then something
 	 * is being used in the wrong mode, report as SIGILL.
 	 */
-	if (ESR_ELx_ISS(esr) != ESR_ELx_SME_ISS_SME_DISABLED) {
+	if (ESR_ELx_SME_ISS_SMTC(esr) != ESR_ELx_SME_ISS_SMTC_SME_DISABLED) {
 		force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
 		return;
 	}
···
 		sme_set_vq(vq_minus_one);

 		fpsimd_bind_task_to_cpu();
+	} else {
+		fpsimd_flush_task_state(current);
 	}

 	put_cpu_fpsimd_context();
···
 	fpsimd_save_user_state();

 	if (test_tsk_thread_flag(next, TIF_KERNEL_FPSTATE)) {
-		fpsimd_load_kernel_state(next);
 		fpsimd_flush_cpu_state();
+		fpsimd_load_kernel_state(next);
 	} else {
 		/*
 		 * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
···
 		current->thread.svcr = 0;
 	}

+	if (system_supports_fpmr())
+		current->thread.uw.fpmr = 0;
+
 	current->thread.fp_type = FP_STATE_FPSIMD;

 	put_cpu_fpsimd_context();
···
 	get_cpu_fpsimd_context();
 	fpsimd_save_user_state();
 	put_cpu_fpsimd_context();
-}
-
-/*
- * Like fpsimd_preserve_current_state(), but ensure that
- * current->thread.uw.fpsimd_state is updated so that it can be copied to
- * the signal frame.
- */
-void fpsimd_signal_preserve_current_state(void)
-{
-	fpsimd_preserve_current_state();
-	if (current->thread.fp_type == FP_STATE_SVE)
-		sve_to_fpsimd(current);
 }

 /*
···
 	put_cpu_fpsimd_context();
 }

-/*
- * Load an updated userland FPSIMD state for 'current' from memory and set the
- * flag that indicates that the FPSIMD register contents are the most recent
- * FPSIMD state of 'current'. This is used by the signal code to restore the
- * register state when returning from a signal handler in FPSIMD only cases,
- * any SVE context will be discarded.
- */
 void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 {
 	if (WARN_ON(!system_supports_fpsimd()))
 		return;

-	get_cpu_fpsimd_context();
-
 	current->thread.uw.fpsimd_state = *state;
-	if (test_thread_flag(TIF_SVE))
+	if (current->thread.fp_type == FP_STATE_SVE)
 		fpsimd_to_sve(current);
-
-	task_fpsimd_load();
-	fpsimd_bind_task_to_cpu();
-
-	clear_thread_flag(TIF_FOREIGN_FPSTATE);
-
-	put_cpu_fpsimd_context();
 }

 /*
···
 	set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE);

 	barrier();
+}
+
+void fpsimd_save_and_flush_current_state(void)
+{
+	if (!system_supports_fpsimd())
+		return;
+
+	get_cpu_fpsimd_context();
+	fpsimd_save_user_state();
+	fpsimd_flush_task_state(current);
+	put_cpu_fpsimd_context();
 }

 /*
···
 #ifdef CONFIG_EFI

-static DEFINE_PER_CPU(struct user_fpsimd_state, efi_fpsimd_state);
-static DEFINE_PER_CPU(bool, efi_fpsimd_state_used);
-static DEFINE_PER_CPU(bool, efi_sve_state_used);
-static DEFINE_PER_CPU(bool, efi_sm_state);
+static struct user_fpsimd_state efi_fpsimd_state;
+static bool efi_fpsimd_state_used;
+static bool efi_sve_state_used;
+static bool efi_sm_state;

 /*
  * EFI runtime services support functions
···
 	 * If !efi_sve_state, SVE can't be in use yet and doesn't need
 	 * preserving:
 	 */
-	if (system_supports_sve() && likely(efi_sve_state)) {
-		char *sve_state = this_cpu_ptr(efi_sve_state);
+	if (system_supports_sve() && efi_sve_state != NULL) {
 		bool ffr = true;
 		u64 svcr;

-		__this_cpu_write(efi_sve_state_used, true);
+		efi_sve_state_used = true;

 		if (system_supports_sme()) {
 			svcr = read_sysreg_s(SYS_SVCR);

-			__this_cpu_write(efi_sm_state,
-					 svcr & SVCR_SM_MASK);
+			efi_sm_state = svcr & SVCR_SM_MASK;

 			/*
 			 * Unless we have FA64 FFR does not
···
 			ffr = !(svcr & SVCR_SM_MASK);
 		}

-		sve_save_state(sve_state + sve_ffr_offset(sve_max_vl()),
-			       &this_cpu_ptr(&efi_fpsimd_state)->fpsr,
-			       ffr);
+		sve_save_state(efi_sve_state + sve_ffr_offset(sve_max_vl()),
+			       &efi_fpsimd_state.fpsr, ffr);

 		if (system_supports_sme())
 			sysreg_clear_set_s(SYS_SVCR,
 					   SVCR_SM_MASK, 0);

 	} else {
-		fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state));
+		fpsimd_save_state(&efi_fpsimd_state);
 	}

-	__this_cpu_write(efi_fpsimd_state_used, true);
+	efi_fpsimd_state_used = true;
 }
···
 	if (!system_supports_fpsimd())
 		return;

-	if (!__this_cpu_xchg(efi_fpsimd_state_used, false)) {
+	if (!efi_fpsimd_state_used) {
 		kernel_neon_end();
 	} else {
-		if (system_supports_sve() &&
-		    likely(__this_cpu_read(efi_sve_state_used))) {
-			char const *sve_state = this_cpu_ptr(efi_sve_state);
+		if (system_supports_sve() && efi_sve_state_used) {
 			bool ffr = true;

 			/*
···
 			 * streaming mode.
 			 */
 			if (system_supports_sme()) {
-				if (__this_cpu_read(efi_sm_state)) {
+				if (efi_sm_state) {
 					sysreg_clear_set_s(SYS_SVCR,
 							   0,
 							   SVCR_SM_MASK);
···
 				}
 			}

-			sve_load_state(sve_state + sve_ffr_offset(sve_max_vl()),
-				       &this_cpu_ptr(&efi_fpsimd_state)->fpsr,
-				       ffr);
+			sve_load_state(efi_sve_state + sve_ffr_offset(sve_max_vl()),
+				       &efi_fpsimd_state.fpsr, ffr);

-			__this_cpu_write(efi_sve_state_used, false);
+			efi_sve_state_used = false;
 		} else {
-			fpsimd_load_state(this_cpu_ptr(&efi_fpsimd_state));
+			fpsimd_load_state(&efi_fpsimd_state);
 		}
+
+		efi_fpsimd_state_used = false;
 	}
 }
+3 -3
arch/arm64/kernel/head.S
···
 	adrp	x1, early_init_stack
 	mov	sp, x1
 	mov	x29, xzr
-	adrp	x0, init_idmap_pg_dir
+	adrp	x0, __pi_init_idmap_pg_dir
 	mov	x1, xzr
 	bl	__pi_create_init_idmap
···
 	cbnz	x19, 0f
 	dmb	sy
 	mov	x1, x0			// end of used region
-	adrp	x0, init_idmap_pg_dir
+	adrp	x0, __pi_init_idmap_pg_dir
 	adr_l	x2, dcache_inval_poc
 	blr	x2
 	b	1f
···
 
 SYM_FUNC_START_LOCAL(__primary_switch)
 	adrp	x1, reserved_pg_dir
-	adrp	x2, init_idmap_pg_dir
+	adrp	x2, __pi_init_idmap_pg_dir
 	bl	__enable_mmu
 
 	adrp	x1, early_init_stack
+28 -29
arch/arm64/kernel/image-vars.h
···
 #error This file should only be included in vmlinux.lds.S
 #endif
 
+#define PI_EXPORT_SYM(sym)	\
+	__PI_EXPORT_SYM(sym, __pi_ ## sym, Cannot export BSS symbol sym to startup code)
+#define __PI_EXPORT_SYM(sym, pisym, msg)\
+	PROVIDE(pisym = sym);	\
+	ASSERT((sym - KIMAGE_VADDR) < (__bss_start - KIMAGE_VADDR), #msg)
+
 PROVIDE(__efistub_primary_entry = primary_entry);
 
 /*
···
 PROVIDE(__pi___memmove = __pi_memmove);
 PROVIDE(__pi___memset = __pi_memset);
 
-PROVIDE(__pi_id_aa64isar1_override = id_aa64isar1_override);
-PROVIDE(__pi_id_aa64isar2_override = id_aa64isar2_override);
-PROVIDE(__pi_id_aa64mmfr0_override = id_aa64mmfr0_override);
-PROVIDE(__pi_id_aa64mmfr1_override = id_aa64mmfr1_override);
-PROVIDE(__pi_id_aa64mmfr2_override = id_aa64mmfr2_override);
-PROVIDE(__pi_id_aa64pfr0_override = id_aa64pfr0_override);
-PROVIDE(__pi_id_aa64pfr1_override = id_aa64pfr1_override);
-PROVIDE(__pi_id_aa64smfr0_override = id_aa64smfr0_override);
-PROVIDE(__pi_id_aa64zfr0_override = id_aa64zfr0_override);
-PROVIDE(__pi_arm64_sw_feature_override = arm64_sw_feature_override);
-PROVIDE(__pi_arm64_use_ng_mappings = arm64_use_ng_mappings);
-PROVIDE(__pi__ctype = _ctype);
-PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
+PI_EXPORT_SYM(id_aa64isar1_override);
+PI_EXPORT_SYM(id_aa64isar2_override);
+PI_EXPORT_SYM(id_aa64mmfr0_override);
+PI_EXPORT_SYM(id_aa64mmfr1_override);
+PI_EXPORT_SYM(id_aa64mmfr2_override);
+PI_EXPORT_SYM(id_aa64pfr0_override);
+PI_EXPORT_SYM(id_aa64pfr1_override);
+PI_EXPORT_SYM(id_aa64smfr0_override);
+PI_EXPORT_SYM(id_aa64zfr0_override);
+PI_EXPORT_SYM(arm64_sw_feature_override);
+PI_EXPORT_SYM(arm64_use_ng_mappings);
+PI_EXPORT_SYM(_ctype);
 
-PROVIDE(__pi_init_idmap_pg_dir = init_idmap_pg_dir);
-PROVIDE(__pi_init_idmap_pg_end = init_idmap_pg_end);
-PROVIDE(__pi_init_pg_dir = init_pg_dir);
-PROVIDE(__pi_init_pg_end = init_pg_end);
-PROVIDE(__pi_swapper_pg_dir = swapper_pg_dir);
+PI_EXPORT_SYM(swapper_pg_dir);
 
-PROVIDE(__pi__text = _text);
-PROVIDE(__pi__stext = _stext);
-PROVIDE(__pi__etext = _etext);
-PROVIDE(__pi___start_rodata = __start_rodata);
-PROVIDE(__pi___inittext_begin = __inittext_begin);
-PROVIDE(__pi___inittext_end = __inittext_end);
-PROVIDE(__pi___initdata_begin = __initdata_begin);
-PROVIDE(__pi___initdata_end = __initdata_end);
-PROVIDE(__pi__data = _data);
-PROVIDE(__pi___bss_start = __bss_start);
-PROVIDE(__pi__end = _end);
+PI_EXPORT_SYM(_text);
+PI_EXPORT_SYM(_stext);
+PI_EXPORT_SYM(_etext);
+PI_EXPORT_SYM(__start_rodata);
+PI_EXPORT_SYM(__inittext_begin);
+PI_EXPORT_SYM(__inittext_end);
+PI_EXPORT_SYM(__initdata_begin);
+PI_EXPORT_SYM(__initdata_end);
+PI_EXPORT_SYM(_data);
 
 #ifdef CONFIG_KVM
 
-2
arch/arm64/kernel/kaslr.c
···
 #include <asm/cpufeature.h>
 #include <asm/memory.h>
 
-u16 __initdata memstart_offset_seed;
-
 bool __ro_after_init __kaslr_is_enabled = false;
 
 void __init kaslr_init(void)
-4
arch/arm64/kernel/pi/kaslr_early.c
···
 
 #include "pi.h"
 
-extern u16 memstart_offset_seed;
-
 static u64 __init get_kaslr_seed(void *fdt, int node)
 {
 	static char const seed_str[] __initconst = "kaslr-seed";
···
 		    !__arm64_rndr((unsigned long *)&seed))
 			return 0;
 	}
-
-	memstart_offset_seed = seed & U16_MAX;
 
 	/*
 	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+1
arch/arm64/kernel/pi/pi.h
···
 extern bool dynamic_scs_is_enabled;
 
 extern pgd_t init_idmap_pg_dir[], init_idmap_pg_end[];
+extern pgd_t init_pg_dir[], init_pg_end[];
 
 void init_feature_override(u64 boot_status, const void *fdt, int chosen);
 u64 kaslr_early_init(void *fdt, int chosen);
+80 -46
arch/arm64/kernel/process.c
···
 
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
-	if (current->mm)
-		fpsimd_preserve_current_state();
+	/*
+	 * The current/src task's FPSIMD state may or may not be live, and may
+	 * have been altered by ptrace after entry to the kernel. Save the
+	 * effective FPSIMD state so that this will be copied into dst.
+	 */
+	fpsimd_save_and_flush_current_state();
+	fpsimd_sync_from_effective_state(src);
+
 	*dst = *src;
 
 	/*
-	 * Detach src's sve_state (if any) from dst so that it does not
-	 * get erroneously used or freed prematurely. dst's copies
-	 * will be allocated on demand later on if dst uses SVE.
-	 * For consistency, also clear TIF_SVE here: this could be done
-	 * later in copy_process(), but to avoid tripping up future
-	 * maintainers it is best not to leave TIF flags and buffers in
-	 * an inconsistent state, even temporarily.
+	 * Drop stale reference to src's sve_state and convert dst to
+	 * non-streaming FPSIMD mode.
 	 */
+	dst->thread.fp_type = FP_STATE_FPSIMD;
 	dst->thread.sve_state = NULL;
 	clear_tsk_thread_flag(dst, TIF_SVE);
+	task_smstop_sm(dst);
 
 	/*
-	 * In the unlikely event that we create a new thread with ZA
-	 * enabled we should retain the ZA and ZT state so duplicate
-	 * it here. This may be shortly freed if we exec() or if
-	 * CLONE_SETTLS but it's simpler to do it here. To avoid
-	 * confusing the rest of the code ensure that we have a
-	 * sve_state allocated whenever sme_state is allocated.
+	 * Drop stale reference to src's sme_state and ensure dst has ZA
+	 * disabled.
+	 *
+	 * When necessary, ZA will be inherited later in copy_thread_za().
 	 */
-	if (thread_za_enabled(&src->thread)) {
-		dst->thread.sve_state = kzalloc(sve_state_size(src),
-						GFP_KERNEL);
-		if (!dst->thread.sve_state)
-			return -ENOMEM;
-
-		dst->thread.sme_state = kmemdup(src->thread.sme_state,
-						sme_state_size(src),
-						GFP_KERNEL);
-		if (!dst->thread.sme_state) {
-			kfree(dst->thread.sve_state);
-			dst->thread.sve_state = NULL;
-			return -ENOMEM;
-		}
-	} else {
-		dst->thread.sme_state = NULL;
-		clear_tsk_thread_flag(dst, TIF_SME);
-	}
-
-	dst->thread.fp_type = FP_STATE_FPSIMD;
+	dst->thread.sme_state = NULL;
+	clear_tsk_thread_flag(dst, TIF_SME);
+	dst->thread.svcr &= ~SVCR_ZA_MASK;
 
 	/* clear any pending asynchronous tag fault raised by the parent */
 	clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
+
+	return 0;
+}
+
+static int copy_thread_za(struct task_struct *dst, struct task_struct *src)
+{
+	if (!thread_za_enabled(&src->thread))
+		return 0;
+
+	dst->thread.sve_state = kzalloc(sve_state_size(src),
+					GFP_KERNEL);
+	if (!dst->thread.sve_state)
+		return -ENOMEM;
+
+	dst->thread.sme_state = kmemdup(src->thread.sme_state,
+					sme_state_size(src),
+					GFP_KERNEL);
+	if (!dst->thread.sme_state) {
+		kfree(dst->thread.sve_state);
+		dst->thread.sve_state = NULL;
+		return -ENOMEM;
+	}
+
+	set_tsk_thread_flag(dst, TIF_SME);
+	dst->thread.svcr |= SVCR_ZA_MASK;
 
 	return 0;
 }
···
 	 * out-of-sync with the saved value.
 	 */
 	*task_user_tls(p) = read_sysreg(tpidr_el0);
-	if (system_supports_tpidr2())
-		p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 
 	if (system_supports_poe())
 		p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
···
 	}
 
 	/*
-	 * If a TLS pointer was passed to clone, use it for the new
-	 * thread. We also reset TPIDR2 if it's in use.
+	 * Due to the AAPCS64 "ZA lazy saving scheme", PSTATE.ZA and
+	 * TPIDR2 need to be manipulated as a pair, and either both
+	 * need to be inherited or both need to be reset.
+	 *
+	 * Within a process, child threads must not inherit their
+	 * parent's TPIDR2 value or they may clobber their parent's
+	 * stack at some later point.
+	 *
+	 * When a process is fork()'d, the child must inherit ZA and
+	 * TPIDR2 from its parent in case there was dormant ZA state.
+	 *
+	 * Use CLONE_VM to determine when the child will share the
+	 * address space with the parent, and cannot safely inherit the
+	 * state.
 	 */
-	if (clone_flags & CLONE_SETTLS) {
-		p->thread.uw.tp_value = tls;
-		p->thread.tpidr2_el0 = 0;
+	if (system_supports_sme()) {
+		if (!(clone_flags & CLONE_VM)) {
+			p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
+			ret = copy_thread_za(p, current);
+			if (ret)
+				return ret;
+		} else {
+			p->thread.tpidr2_el0 = 0;
+			WARN_ON_ONCE(p->thread.svcr & SVCR_ZA_MASK);
+		}
 	}
+
+	/*
+	 * If a TLS pointer was passed to clone, use it for the new
+	 * thread.
+	 */
+	if (clone_flags & CLONE_SETTLS)
+		p->thread.uw.tp_value = tls;
 
 	ret = copy_thread_gcs(p, args);
 	if (ret != 0)
···
 	gcs_thread_switch(next);
 
 	/*
-	 * Complete any pending TLB or cache maintenance on this CPU in case
-	 * the thread migrates to a different CPU.
-	 * This full barrier is also required by the membarrier system
-	 * call.
+	 * Complete any pending TLB or cache maintenance on this CPU in case the
+	 * thread migrates to a different CPU. This full barrier is also
+	 * required by the membarrier system call. Additionally it makes any
+	 * in-progress pgtable writes visible to the table walker; See
+	 * emit_pte_barriers().
 	 */
 	dsb(ish);
 
+67 -70
arch/arm64/kernel/ptrace.c
···
 {
 	struct user_fpsimd_state *uregs;
 
-	sve_sync_to_fpsimd(target);
+	fpsimd_sync_from_effective_state(target);
 
 	uregs = &target->thread.uw.fpsimd_state;
 
···
 	 * Ensure target->thread.uw.fpsimd_state is up to date, so that a
 	 * short copyin can't resurrect stale data.
 	 */
-	sve_sync_to_fpsimd(target);
+	fpsimd_sync_from_effective_state(target);
 
 	newstate = target->thread.uw.fpsimd_state;
 
···
 	if (ret)
 		return ret;
 
-	sve_sync_from_fpsimd_zeropad(target);
+	fpsimd_sync_to_effective_state_zeropad(target);
 	fpsimd_flush_task_state(target);
 
 	return ret;
···
 	task_type = ARM64_VEC_SVE;
 	active = (task_type == type);
 
+	if (active && target->thread.fp_type == FP_STATE_SVE)
+		header->flags = SVE_PT_REGS_SVE;
+	else
+		header->flags = SVE_PT_REGS_FPSIMD;
+
 	switch (type) {
 	case ARM64_VEC_SVE:
 		if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
···
 		return;
 	}
 
-	if (active) {
-		if (target->thread.fp_type == FP_STATE_FPSIMD) {
-			header->flags |= SVE_PT_REGS_FPSIMD;
-		} else {
-			header->flags |= SVE_PT_REGS_SVE;
-		}
-	}
-
 	header->vl = task_get_vl(target, type);
 	vq = sve_vq_from_vl(header->vl);
 
 	header->max_vl = vec_max_vl(type);
-	header->size = SVE_PT_SIZE(vq, header->flags);
+	if (active)
+		header->size = SVE_PT_SIZE(vq, header->flags);
+	else
+		header->size = sizeof(header);
 	header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl),
 				       SVE_PT_REGS_SVE);
 }
···
 	unsigned int vq;
 	unsigned long start, end;
 
+	if (target == current)
+		fpsimd_preserve_current_state();
+
 	/* Header */
 	sve_init_header_from_task(&header, target, type);
 	vq = sve_vq_from_vl(header.vl);
 
 	membuf_write(&to, &header, sizeof(header));
 
-	if (target == current)
-		fpsimd_preserve_current_state();
-
 	BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header));
 	BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
+
+	/*
+	 * When the requested vector type is not active, do not present data
+	 * from the other mode to userspace.
+	 */
+	if (header.size == sizeof(header))
+		return 0;
 
 	switch ((header.flags & SVE_PT_REGS_MASK)) {
 	case SVE_PT_REGS_FPSIMD:
···
 		return membuf_zero(&to, end - start);
 
 	default:
-		return 0;
+		BUILD_BUG();
 	}
 }
···
 	struct user_sve_header header;
 	unsigned int vq;
 	unsigned long start, end;
+	bool fpsimd;
+
+	fpsimd_flush_task_state(target);
 
 	/* Header */
 	if (count < sizeof(header))
···
 	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header,
				 0, sizeof(header));
 	if (ret)
-		goto out;
+		return ret;
+
+	/*
+	 * Streaming SVE data is always stored and presented in SVE format.
+	 * Require the user to provide SVE formatted data for consistency, and
+	 * to avoid the risk that we configure the task into an invalid state.
+	 */
+	fpsimd = (header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD;
+	if (fpsimd && type == ARM64_VEC_SME)
+		return -EINVAL;
 
 	/*
 	 * Apart from SVE_PT_REGS_MASK, all SVE_PT_* flags are consumed by
···
 	ret = vec_set_vector_length(target, type, header.vl,
		((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16);
 	if (ret)
-		goto out;
+		return ret;
+
+	/* Allocate SME storage if necessary, preserving any existing ZA/ZT state */
+	if (type == ARM64_VEC_SME) {
+		sme_alloc(target, false);
+		if (!target->thread.sme_state)
+			return -ENOMEM;
+	}
+
+	/* Allocate SVE storage if necessary, zeroing any existing SVE state */
+	if (!fpsimd) {
+		sve_alloc(target, true);
+		if (!target->thread.sve_state)
+			return -ENOMEM;
+	}
 
 	/*
 	 * Actual VL set may be different from what the user asked
···
 
 	/* Enter/exit streaming mode */
 	if (system_supports_sme()) {
-		u64 old_svcr = target->thread.svcr;
-
 		switch (type) {
 		case ARM64_VEC_SVE:
 			target->thread.svcr &= ~SVCR_SM_MASK;
+			set_tsk_thread_flag(target, TIF_SVE);
 			break;
 		case ARM64_VEC_SME:
 			target->thread.svcr |= SVCR_SM_MASK;
-
-			/*
-			 * Disable traps and ensure there is SME storage but
-			 * preserve any currently set values in ZA/ZT.
-			 */
-			sme_alloc(target, false);
 			set_tsk_thread_flag(target, TIF_SME);
 			break;
 		default:
 			WARN_ON_ONCE(1);
-			ret = -EINVAL;
-			goto out;
+			return -EINVAL;
 		}
-
-		/*
-		 * If we switched then invalidate any existing SVE
-		 * state and ensure there's storage.
-		 */
-		if (target->thread.svcr != old_svcr)
-			sve_alloc(target, true);
 	}
+
+	/* Always zero V regs, FPSR, and FPCR */
+	memset(&current->thread.uw.fpsimd_state, 0,
+	       sizeof(current->thread.uw.fpsimd_state));
 
 	/* Registers: FPSIMD-only case */
 
 	BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header));
-	if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) {
-		ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
-				SVE_PT_FPSIMD_OFFSET);
+	if (fpsimd) {
 		clear_tsk_thread_flag(target, TIF_SVE);
 		target->thread.fp_type = FP_STATE_FPSIMD;
-		goto out;
+		ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
+				SVE_PT_FPSIMD_OFFSET);
+		return ret;
 	}
 
-	/*
-	 * Otherwise: no registers or full SVE case. For backwards
-	 * compatibility reasons we treat empty flags as SVE registers.
-	 */
+	/* Otherwise: no registers or full SVE case. */
+
+	target->thread.fp_type = FP_STATE_SVE;
 
 	/*
 	 * If setting a different VL from the requested VL and there is
 	 * register data, the data layout will be wrong: don't even
 	 * try to set the registers in this case.
 	 */
-	if (count && vq != sve_vq_from_vl(header.vl)) {
-		ret = -EIO;
-		goto out;
-	}
-
-	sve_alloc(target, true);
-	if (!target->thread.sve_state) {
-		ret = -ENOMEM;
-		clear_tsk_thread_flag(target, TIF_SVE);
-		target->thread.fp_type = FP_STATE_FPSIMD;
-		goto out;
-	}
-
-	/*
-	 * Ensure target->thread.sve_state is up to date with target's
-	 * FPSIMD regs, so that a short copyin leaves trailing
-	 * registers unmodified. Only enable SVE if we are
-	 * configuring normal SVE, a system with streaming SVE may not
-	 * have normal SVE.
-	 */
-	fpsimd_sync_to_sve(target);
-	if (type == ARM64_VEC_SVE)
-		set_tsk_thread_flag(target, TIF_SVE);
-	target->thread.fp_type = FP_STATE_SVE;
+	if (count && vq != sve_vq_from_vl(header.vl))
+		return -EIO;
 
 	BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
 	start = SVE_PT_SVE_OFFSET;
···
				 target->thread.sve_state,
				 start, end);
 	if (ret)
-		goto out;
+		return ret;
 
 	start = end;
 	end = SVE_PT_SVE_FPSR_OFFSET(vq);
···
				 &target->thread.uw.fpsimd_state.fpsr,
				 start, end);
 
-out:
-	fpsimd_flush_task_state(target);
 	return ret;
 }
+5 -5
arch/arm64/kernel/setup.c
···
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
 {
-	int size;
+	int size = 0;
 	void *dt_virt = fixmap_remap_fdt(dt_phys, &size, PAGE_KERNEL);
 	const char *name;
 
···
 	 */
 	if (!early_init_dt_scan(dt_virt, dt_phys)) {
 		pr_crit("\n"
-			"Error: invalid device tree blob at physical address %pa (virtual address 0x%px)\n"
-			"The dtb must be 8-byte aligned and must not exceed 2 MB in size\n"
-			"\nPlease check your bootloader.",
-			&dt_phys, dt_virt);
+			"Error: invalid device tree blob: PA=%pa, VA=%px, size=%d bytes\n"
+			"The dtb must be 8-byte aligned and must not exceed 2 MB in size.\n"
+			"\nPlease check your bootloader.\n",
+			&dt_phys, dt_virt, size);
 
 		/*
 		 * Note that in this _really_ early stage we cannot even BUG()
+53 -96
arch/arm64/kernel/signal.c
···
 		&current->thread.uw.fpsimd_state;
 	int err;
 
+	fpsimd_sync_from_effective_state(current);
+
 	/* copy the FP and status/control registers */
 	err = __copy_to_user(ctx->vregs, fpsimd->vregs, sizeof(fpsimd->vregs));
 	__put_user_error(fpsimd->fpsr, &ctx->fpsr, err);
···
 	return err ? -EFAULT : 0;
 }
 
-static int restore_fpsimd_context(struct user_ctxs *user)
+static int read_fpsimd_context(struct user_fpsimd_state *fpsimd,
+			       struct user_ctxs *user)
 {
-	struct user_fpsimd_state fpsimd;
-	int err = 0;
+	int err;
 
 	/* check the size information */
 	if (user->fpsimd_size != sizeof(struct fpsimd_context))
 		return -EINVAL;
 
 	/* copy the FP and status/control registers */
-	err = __copy_from_user(fpsimd.vregs, &(user->fpsimd->vregs),
-			       sizeof(fpsimd.vregs));
-	__get_user_error(fpsimd.fpsr, &(user->fpsimd->fpsr), err);
-	__get_user_error(fpsimd.fpcr, &(user->fpsimd->fpcr), err);
+	err = __copy_from_user(fpsimd->vregs, &(user->fpsimd->vregs),
+			       sizeof(fpsimd->vregs));
+	__get_user_error(fpsimd->fpsr, &(user->fpsimd->fpsr), err);
+	__get_user_error(fpsimd->fpcr, &(user->fpsimd->fpcr), err);
+
+	return err ? -EFAULT : 0;
+}
+
+static int restore_fpsimd_context(struct user_ctxs *user)
+{
+	struct user_fpsimd_state fpsimd;
+	int err;
+
+	err = read_fpsimd_context(&fpsimd, user);
+	if (err)
+		return err;
 
 	clear_thread_flag(TIF_SVE);
+	current->thread.svcr &= ~SVCR_SM_MASK;
 	current->thread.fp_type = FP_STATE_FPSIMD;
 
 	/* load the hardware registers from the fpsimd_state structure */
-	if (!err)
-		fpsimd_update_current_state(&fpsimd);
-
-	return err ? -EFAULT : 0;
+	fpsimd_update_current_state(&fpsimd);
+	return 0;
 }
 
 static int preserve_fpmr_context(struct fpmr_context __user *ctx)
 {
 	int err = 0;
-
-	current->thread.uw.fpmr = read_sysreg_s(SYS_FPMR);
 
 	__put_user_error(FPMR_MAGIC, &ctx->head.magic, err);
 	__put_user_error(sizeof(*ctx), &ctx->head.size, err);
···
 
 	__get_user_error(fpmr, &user->fpmr->fpmr, err);
 	if (!err)
-		write_sysreg_s(fpmr, SYS_FPMR);
+		current->thread.uw.fpmr = fpmr;
 
 	return err;
 }
···
 	err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
 
 	if (vq) {
-		/*
-		 * This assumes that the SVE state has already been saved to
-		 * the task struct by calling the function
-		 * fpsimd_signal_preserve_current_state().
-		 */
 		err |= __copy_to_user((char __user *)ctx + SVE_SIG_REGS_OFFSET,
				      current->thread.sve_state,
				      SVE_SIG_REGS_SIZE(vq));
···
 	unsigned int vl, vq;
 	struct user_fpsimd_state fpsimd;
 	u16 user_vl, flags;
+	bool sm;
 
 	if (user->sve_size < sizeof(*user->sve))
 		return -EINVAL;
···
 	if (err)
 		return err;
 
-	if (flags & SVE_SIG_FLAG_SM) {
+	sm = flags & SVE_SIG_FLAG_SM;
+	if (sm) {
 		if (!system_supports_sme())
 			return -EINVAL;
 
···
 	if (user_vl != vl)
 		return -EINVAL;
 
-	if (user->sve_size == sizeof(*user->sve)) {
-		clear_thread_flag(TIF_SVE);
-		current->thread.svcr &= ~SVCR_SM_MASK;
-		current->thread.fp_type = FP_STATE_FPSIMD;
-		goto fpsimd_only;
-	}
+	/*
+	 * Non-streaming SVE state may be preserved without an SVE payload, in
+	 * which case the SVE context only has a header with VL==0, and all
+	 * state can be restored from the FPSIMD context.
+	 *
+	 * Streaming SVE state is always preserved with an SVE payload. For
+	 * consistency and robustness, reject restoring streaming SVE state
+	 * without an SVE payload.
+	 */
+	if (!sm && user->sve_size == sizeof(*user->sve))
+		return restore_fpsimd_context(user);
 
 	vq = sve_vq_from_vl(vl);
 
 	if (user->sve_size < SVE_SIG_CONTEXT_SIZE(vq))
 		return -EINVAL;
-
-	/*
-	 * Careful: we are about __copy_from_user() directly into
-	 * thread.sve_state with preemption enabled, so protection is
-	 * needed to prevent a racing context switch from writing stale
-	 * registers back over the new data.
-	 */
-
-	fpsimd_flush_task_state(current);
-	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
 
 	sve_alloc(current, true);
 	if (!current->thread.sve_state) {
···
 	set_thread_flag(TIF_SVE);
 	current->thread.fp_type = FP_STATE_SVE;
 
-fpsimd_only:
-	/* copy the FP and status/control registers */
-	/* restore_sigframe() already checked that user->fpsimd != NULL. */
-	err = __copy_from_user(fpsimd.vregs, user->fpsimd->vregs,
-			       sizeof(fpsimd.vregs));
-	__get_user_error(fpsimd.fpsr, &user->fpsimd->fpsr, err);
-	__get_user_error(fpsimd.fpcr, &user->fpsimd->fpcr, err);
+	err = read_fpsimd_context(&fpsimd, user);
+	if (err)
+		return err;
 
-	/* load the hardware registers from the fpsimd_state structure */
-	if (!err)
-		fpsimd_update_current_state(&fpsimd);
+	/* Merge the FPSIMD registers into the SVE state */
+	fpsimd_update_current_state(&fpsimd);
 
-	return err ? -EFAULT : 0;
+	return 0;
 }
 
 #else /* ! CONFIG_ARM64_SVE */
···
 
 static int preserve_tpidr2_context(struct tpidr2_context __user *ctx)
 {
+	u64 tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 	int err = 0;
-
-	current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 
 	__put_user_error(TPIDR2_MAGIC, &ctx->head.magic, err);
 	__put_user_error(sizeof(*ctx), &ctx->head.size, err);
-	__put_user_error(current->thread.tpidr2_el0, &ctx->tpidr2, err);
+	__put_user_error(tpidr2_el0, &ctx->tpidr2, err);
 
 	return err;
 }
···
 	err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
 
 	if (vq) {
-		/*
-		 * This assumes that the ZA state has already been saved to
-		 * the task struct by calling the function
-		 * fpsimd_signal_preserve_current_state().
-		 */
 		err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET,
				      current->thread.sme_state,
				      ZA_SIG_REGS_SIZE(vq));
···
 
 	if (user->za_size < ZA_SIG_CONTEXT_SIZE(vq))
 		return -EINVAL;
-
-	/*
-	 * Careful: we are about __copy_from_user() directly into
-	 * thread.sme_state with preemption enabled, so protection is
-	 * needed to prevent a racing context switch from writing stale
-	 * registers back over the new data.
-	 */
-
-	fpsimd_flush_task_state(current);
-	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
 
 	sme_alloc(current, true);
 	if (!current->thread.sme_state) {
···
 	BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
 	err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
 
-	/*
-	 * This assumes that the ZT state has already been saved to
-	 * the task struct by calling the function
-	 * fpsimd_signal_preserve_current_state().
-	 */
 	err |= __copy_to_user((char __user *)ctx + ZT_SIG_REGS_OFFSET,
			      thread_zt_state(&current->thread),
			      ZT_SIG_REGS_SIZE(1));
···
 
 	if (nregs != 1)
 		return -EINVAL;
-
-	/*
-	 * Careful: we are about __copy_from_user() directly into
-	 * thread.zt_state with preemption enabled, so protection is
-	 * needed to prevent a racing context switch from writing stale
-	 * registers back over the new data.
-	 */
-
-	fpsimd_flush_task_state(current);
-	/* From now, fpsimd_thread_switch() won't touch ZT in thread state */
 
 	err = __copy_from_user(thread_zt_state(&current->thread),
			       (char __user const *)user->zt +
···
 	 * Avoid sys_rt_sigreturn() restarting.
 	 */
 	forget_syscall(regs);
+
+	fpsimd_save_and_flush_current_state();
 
 	err |= !valid_user_regs(&regs->user_regs, current);
 	if (err == 0)
···
 
 	/* Signal handlers are invoked with ZA and streaming mode disabled */
 	if (system_supports_sme()) {
-		/*
-		 * If we were in streaming mode the saved register
-		 * state was SVE but we will exit SM and use the
-		 * FPSIMD register state - flush the saved FPSIMD
-		 * register state in case it gets loaded.
-		 */
-		if (current->thread.svcr & SVCR_SM_MASK) {
-			memset(&current->thread.uw.fpsimd_state, 0,
-			       sizeof(current->thread.uw.fpsimd_state));
-			current->thread.fp_type = FP_STATE_FPSIMD;
-		}
-
-		current->thread.svcr &= ~(SVCR_ZA_MASK |
-					  SVCR_SM_MASK);
-		sme_smstop();
+		task_smstop_sm(current);
+		current->thread.svcr &= ~SVCR_ZA_MASK;
+		write_sysreg_s(0, SYS_TPIDR2_EL0);
 	}
 
 	return 0;
···
 	struct user_access_state ua_state;
 	int err = 0;
 
-	fpsimd_signal_preserve_current_state();
+	fpsimd_save_and_flush_current_state();
 
 	if (get_sigframe(&user, ksig, regs))
 		return 1;
+7 -4
arch/arm64/kernel/signal32.c
···
 	 * Note that this also saves V16-31, which aren't visible
 	 * in AArch32.
 	 */
-	fpsimd_signal_preserve_current_state();
+	fpsimd_save_and_flush_current_state();
 
 	/* Place structure header on the stack */
 	__put_user_error(magic, &frame->magic, err);
···
 	fpsimd.fpsr = fpscr & VFP_FPSCR_STAT_MASK;
 	fpsimd.fpcr = fpscr & VFP_FPSCR_CTRL_MASK;
 
+	if (err)
+		return -EFAULT;
+
 	/*
 	 * We don't need to touch the exception register, so
 	 * reload the hardware state.
 	 */
-	if (!err)
-		fpsimd_update_current_state(&fpsimd);
+	fpsimd_save_and_flush_current_state();
+	current->thread.uw.fpsimd_state = fpsimd;
 
-	return err ? -EFAULT : 0;
+	return 0;
 }
 
 static int compat_restore_sigframe(struct pt_regs *regs,
+6 -4
arch/arm64/kernel/vmlinux.lds.S
···
 	__inittext_end = .;
 	__initdata_begin = .;
 
-	init_idmap_pg_dir = .;
+	__pi_init_idmap_pg_dir = .;
 	. += INIT_IDMAP_DIR_SIZE;
-	init_idmap_pg_end = .;
+	__pi_init_idmap_pg_end = .;
 
 	.init.data : {
 		INIT_DATA
···
 
 	/* start of zero-init region */
 	BSS_SECTION(SBSS_ALIGN, 0, 0)
+	__pi___bss_start = __bss_start;
 
 	. = ALIGN(PAGE_SIZE);
-	init_pg_dir = .;
+	__pi_init_pg_dir = .;
 	. += INIT_DIR_SIZE;
-	init_pg_end = .;
+	__pi_init_pg_end = .;
 	/* end of zero-init region */
 
 	. += SZ_4K;		/* stack for the early C runtime */
···
 	. = ALIGN(SEGMENT_ALIGN);
 	__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
 	_end = .;
+	__pi__end = .;
 
 	STABS_DEBUG
 	DWARF_DEBUG
+28 -45
arch/arm64/mm/hugetlbpage.c
···
 	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
 		return orig_pte;
 
-	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
+	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
 	for (i = 0; i < ncontig; i++, ptep++) {
 		pte_t pte = __ptep_get(ptep);
 
···
 	pte_t pte, tmp_pte;
 	bool present;
 
-	pte = __ptep_get_and_clear(mm, addr, ptep);
+	pte = __ptep_get_and_clear_anysz(mm, ptep, pgsize);
 	present = pte_present(pte);
 	while (--ncontig) {
 		ptep++;
-		addr += pgsize;
-		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		tmp_pte = __ptep_get_and_clear_anysz(mm, ptep, pgsize);
 		if (present) {
 			if (pte_dirty(tmp_pte))
 				pte = pte_mkdirty(pte);
···
 {
 	pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long end = addr + (pgsize * ncontig);
 
-	flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
+	__flush_hugetlb_tlb_range(&vma, addr, end, pgsize, true);
 	return orig_pte;
 }
···
 	unsigned long i, saddr = addr;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		__ptep_get_and_clear(mm, addr, ptep);
+		__ptep_get_and_clear_anysz(mm, ptep, pgsize);
 
-	flush_tlb_range(&vma, saddr, addr);
+	if (mm == &init_mm)
+		flush_tlb_kernel_range(saddr, addr);
+	else
+		__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
···
 	size_t pgsize;
 	int i;
 	int ncontig;
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
 
 	ncontig = num_contig_ptes(sz, &pgsize);
 
 	if (!pte_present(pte)) {
 		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
-			__set_ptes(mm, addr, ptep, pte, 1);
+			__set_ptes_anysz(mm, ptep, pte, 1, pgsize);
 		return;
 	}
 
-	if (!pte_cont(pte)) {
-		__set_ptes(mm, addr, ptep, pte, 1);
-		return;
-	}
+	/* Only need to "break" if transitioning valid -> valid. */
+	if (pte_cont(pte) && pte_valid(__ptep_get(ptep)))
+		clear_flush(mm, addr, ptep, pgsize, ncontig);
 
-	pfn = pte_pfn(pte);
-	dpfn = pgsize >> PAGE_SHIFT;
-	hugeprot = pte_pgprot(pte);
-
-	clear_flush(mm, addr, ptep, pgsize, ncontig);
-
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	__set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
···
			       unsigned long addr, pte_t *ptep,
			       pte_t pte, int dirty)
 {
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize = 0;
-	unsigned long pfn = pte_pfn(pte), dpfn;
 	struct mm_struct *mm = vma->vm_mm;
-	pgprot_t hugeprot;
 	pte_t orig_pte;
+
+	VM_WARN_ON(!pte_present(pte));
 
 	if (!pte_cont(pte))
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);
 
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 
 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;
 
 	orig_pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
+	VM_WARN_ON(!pte_present(orig_pte));
 
 	/* Make sure we don't lose the dirty or young state */
 	if (pte_dirty(orig_pte))
···
 	if (pte_young(orig_pte))
 		pte = pte_mkyoung(pte);
 
-	hugeprot = pte_pgprot(pte);
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
-
+	__set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 	return 1;
 }
 
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
			     unsigned long addr, pte_t *ptep)
 {
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize;
 	pte_t pte;
 
-	if (!pte_cont(__ptep_get(ptep))) {
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
+
+	if (!pte_cont(pte)) {
 		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
 
 	pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 	pte = pte_wrprotect(pte);
 
-	hugeprot = pte_pgprot(pte);
-	pfn = pte_pfn(pte);
-
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	__set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
···
 	size_t pgsize;
 	int ncontig;
 
-	if (!pte_cont(__ptep_get(ptep)))
-		return ptep_clear_flush(vma, addr, ptep);
-
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 }
arch/arm64/mm/init.c (-20)
···
 		}
 	}

-	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
-		extern u16 memstart_offset_seed;
-		u64 mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
-		int parange = cpuid_feature_extract_unsigned_field(
-					mmfr0, ID_AA64MMFR0_EL1_PARANGE_SHIFT);
-		s64 range = linear_region_size -
-			    BIT(id_aa64mmfr0_parange_to_phys_shift(parange));
-
-		/*
-		 * If the size of the linear region exceeds, by a sufficient
-		 * margin, the size of the region that the physical memory can
-		 * span, randomize the linear region as well.
-		 */
-		if (memstart_offset_seed > 0 && range >= (s64)ARM64_MEMSTART_ALIGN) {
-			range /= ARM64_MEMSTART_ALIGN;
-			memstart_addr -= ARM64_MEMSTART_ALIGN *
-					 ((range * memstart_offset_seed) >> 16);
-		}
-	}
-
 	/*
 	 * Register the kernel text, kernel data, initrd, and initial
 	 * pagetables with memblock.
arch/arm64/mm/pageattr.c (+3 -3)
···
 	 * we are operating on does not result in such splitting.
 	 *
 	 * Let's restrict ourselves to mappings created by vmalloc (or vmap).
-	 * Those are guaranteed to consist entirely of page mappings, and
-	 * splitting is never needed.
+	 * Disallow VM_ALLOW_HUGE_VMAP mappings to guarantee that only page
+	 * mappings are updated and splitting is never needed.
 	 *
 	 * So check whether the [addr, addr + size) interval is entirely
 	 * covered by precisely one VM area that has the VM_ALLOC flag set.
···
 	area = find_vm_area((void *)addr);
 	if (!area ||
 	    end > (unsigned long)kasan_reset_tag(area->addr) + area->size ||
-	    !(area->flags & VM_ALLOC))
+	    ((area->flags & (VM_ALLOC | VM_ALLOW_HUGE_VMAP)) != VM_ALLOC))
 		return -EINVAL;

 	if (!numpages)
arch/arm64/mm/proc.S (+2 -17)
···
 	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
 	cbz	x1, .Lskip_indirection

-	/*
-	 * The PROT_* macros describing the various memory types may resolve to
-	 * C expressions if they include the PTE_MAYBE_* macros, and so they
-	 * can only be used from C code. The PIE_E* constants below are also
-	 * defined in terms of those macros, but will mask out those
-	 * PTE_MAYBE_* constants, whether they are set or not. So #define them
-	 * as 0x0 here so we can evaluate the PIE_E* constants in asm context.
-	 */
-
-#define PTE_MAYBE_NG		0
-#define PTE_MAYBE_SHARED	0
-
-	mov_q	x0, PIE_E0
+	mov_q	x0, PIE_E0_ASM
 	msr	REG_PIRE0_EL1, x0
-	mov_q	x0, PIE_E1
+	mov_q	x0, PIE_E1_ASM
 	msr	REG_PIR_EL1, x0
-
-#undef PTE_MAYBE_NG
-#undef PTE_MAYBE_SHARED

 	orr	tcr2, tcr2, TCR2_EL1_PIE
 	msr	REG_TCR2_EL1, x0
drivers/acpi/apei/Kconfig (+1)
···
 	select ACPI_HED
 	select IRQ_WORK
 	select GENERIC_ALLOCATOR
+	select ARM_SDE_INTERFACE if ARM64
 	help
 	  Generic Hardware Error Source provides a way to report
 	  platform hardware errors (such as that from chipset). It
drivers/acpi/apei/ghes.c (+1 -1)
···
 {
 	int rc;

-	sdei_init();
+	acpi_sdei_init();

 	if (acpi_disabled)
 		return;
drivers/firmware/Kconfig (-1)
···
 config ARM_SDE_INTERFACE
 	bool "ARM Software Delegated Exception Interface (SDEI)"
 	depends on ARM64
-	depends on ACPI_APEI_GHES
 	help
 	  The Software Delegated Exception Interface (SDEI) is an ARM
 	  standard for registering callbacks from the platform firmware
drivers/firmware/arm_sdei.c (+8 -3)
···
 	return true;
 }

-void __init sdei_init(void)
+void __init acpi_sdei_init(void)
 {
 	struct platform_device *pdev;
 	int ret;

-	ret = platform_driver_register(&sdei_driver);
-	if (ret || !sdei_present_acpi())
+	if (!sdei_present_acpi())
 		return;

 	pdev = platform_device_register_simple(sdei_driver.driver.name,
···
 			ret);
 	}
 }
+
+static int __init sdei_init(void)
+{
+	return platform_driver_register(&sdei_driver);
+}
+arch_initcall(sdei_init);

 int sdei_event_handler(struct pt_regs *regs,
 		       struct sdei_registered_event *arg)
drivers/firmware/psci/psci.c (+3 -1)
···
 	np = of_find_matching_node_and_match(NULL, psci_of_match, &matched_np);

-	if (!np || !of_device_is_available(np))
+	if (!np || !of_device_is_available(np)) {
+		of_node_put(np);
 		return -ENODEV;
+	}

 	init_fn = (psci_initcall_t)matched_np->data;
 	ret = init_fn(np);
drivers/perf/Kconfig (+1 -1)
···
 	tristate "Cavium ThunderX2 SoC PMU UNCORE"
 	depends on ARCH_THUNDER2 || COMPILE_TEST
 	depends on NUMA && ACPI
-	default m
+	default m if ARCH_THUNDER2
 	help
 	  Provides support for ThunderX2 UNCORE events.
 	  The SoC has PMU support in its L3 cache controller (L3C) and
drivers/perf/amlogic/meson_ddr_pmu_core.c (+1 -1)
···
 	fmt_attr_fill(pmu->info.hw_info->fmt_attr);

-	pmu->cpu = smp_processor_id();
+	pmu->cpu = raw_smp_processor_id();

 	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, DDR_PERF_DEV_NAME);
 	if (!name)
drivers/perf/arm-cmn.c (+6 -12)
···
 	if ((chan == 5 && cmn->rsp_vc_num < 2) ||
 	    (chan == 6 && cmn->dat_vc_num < 2) ||
-	    (chan == 7 && cmn->snp_vc_num < 2) ||
-	    (chan == 8 && cmn->req_vc_num < 2))
+	    (chan == 7 && cmn->req_vc_num < 2) ||
+	    (chan == 8 && cmn->snp_vc_num < 2))
 		return 0;
 }
···
 	_CMN_EVENT_XP(pub_##_name, (_event) | (4 << 5)),	\
 	_CMN_EVENT_XP(rsp2_##_name, (_event) | (5 << 5)),	\
 	_CMN_EVENT_XP(dat2_##_name, (_event) | (6 << 5)),	\
-	_CMN_EVENT_XP(snp2_##_name, (_event) | (7 << 5)),	\
-	_CMN_EVENT_XP(req2_##_name, (_event) | (8 << 5))
+	_CMN_EVENT_XP(req2_##_name, (_event) | (7 << 5)),	\
+	_CMN_EVENT_XP(snp2_##_name, (_event) | (8 << 5))

 #define CMN_EVENT_XP_DAT(_name, _event)				\
 	_CMN_EVENT_XP_PORT(dat_##_name, (_event) | (3 << 5)),	\
···
 	cmn->xps = arm_cmn_node(cmn, CMN_TYPE_XP);

-	if (cmn->part == PART_CMN600 && cmn->num_dtcs > 1) {
-		/* We do at least know that a DTC's XP must be in that DTC's domain */
-		dn = arm_cmn_node(cmn, CMN_TYPE_DTC);
-		for (int i = 0; i < cmn->num_dtcs; i++)
-			arm_cmn_node_to_xp(cmn, dn + i)->dtc = i;
-	}
-
 	for (dn = cmn->dns; dn->type; dn++) {
 		if (dn->type == CMN_TYPE_XP)
 			continue;
···
 	cmn->dev = &pdev->dev;
 	cmn->part = (unsigned long)device_get_match_data(cmn->dev);
+	cmn->cpu = cpumask_local_spread(0, dev_to_node(cmn->dev));
 	platform_set_drvdata(pdev, cmn);

 	if (cmn->part == PART_CMN600 && has_acpi_companion(cmn->dev)) {
···
 	if (err)
 		return err;

-	cmn->cpu = cpumask_local_spread(0, dev_to_node(cmn->dev));
 	cmn->pmu = (struct pmu) {
 		.module = THIS_MODULE,
 		.parent = cmn->dev,
···
 	{ "ARMHC600", PART_CMN600 },
 	{ "ARMHC650" },
 	{ "ARMHC700" },
+	{ "ARMHC003" },
 	{}
 };
 MODULE_DEVICE_TABLE(acpi, arm_cmn_acpi_match);
drivers/perf/arm-ni.c (+22 -18)
···
 	return err;
 }

+static void arm_ni_remove(struct platform_device *pdev)
+{
+	struct arm_ni *ni = platform_get_drvdata(pdev);
+
+	for (int i = 0; i < ni->num_cds; i++) {
+		struct arm_ni_cd *cd = ni->cds + i;
+
+		if (!cd->pmu_base)
+			continue;
+
+		writel_relaxed(0, cd->pmu_base + NI_PMCR);
+		writel_relaxed(U32_MAX, cd->pmu_base + NI_PMINTENCLR);
+		perf_pmu_unregister(&cd->pmu);
+		cpuhp_state_remove_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
+	}
+}
+
 static void arm_ni_probe_domain(void __iomem *base, struct arm_ni_node *node)
 {
 	u32 reg = readl_relaxed(base + NI_NODE_TYPE);
···
 	ni->num_cds = num_cds;
 	ni->part = part;
 	ni->id = atomic_fetch_inc(&id);
+	platform_set_drvdata(pdev, ni);

 	for (int v = 0; v < cfg.num_components; v++) {
 		reg = readl_relaxed(cfg.base + NI_CHILD_PTR(v));
···
 			reg = readl_relaxed(pd.base + NI_CHILD_PTR(c));
 			arm_ni_probe_domain(base + reg, &cd);
 			ret = arm_ni_init_cd(ni, &cd, res->start);
-			if (ret)
+			if (ret) {
+				ni->cds[cd.id].pmu_base = NULL;
+				arm_ni_remove(pdev);
 				return ret;
+			}
 		}
 	}

 	return 0;
 }
-
-static void arm_ni_remove(struct platform_device *pdev)
-{
-	struct arm_ni *ni = platform_get_drvdata(pdev);
-
-	for (int i = 0; i < ni->num_cds; i++) {
-		struct arm_ni_cd *cd = ni->cds + i;
-
-		if (!cd->pmu_base)
-			continue;
-
-		writel_relaxed(0, cd->pmu_base + NI_PMCR);
-		writel_relaxed(U32_MAX, cd->pmu_base + NI_PMINTENCLR);
-		perf_pmu_unregister(&cd->pmu);
-		cpuhp_state_remove_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
-	}
-}

 #ifdef CONFIG_OF
include/linux/arm_sdei.h (+2 -2)
···
 /* For use by arch code when CPU hotplug notifiers are not appropriate. */
 int sdei_mask_local_cpu(void);
 int sdei_unmask_local_cpu(void);
-void __init sdei_init(void);
+void __init acpi_sdei_init(void);
 void sdei_handler_abort(void);
 #else
 static inline int sdei_mask_local_cpu(void) { return 0; }
 static inline int sdei_unmask_local_cpu(void) { return 0; }
-static inline void sdei_init(void) { }
+static inline void acpi_sdei_init(void) { }
 static inline void sdei_handler_abort(void) { }
 #endif /* CONFIG_ARM_SDE_INTERFACE */

include/linux/page_table_check.h (+18 -12)
···
 void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud);
 void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 				 unsigned int nr);
-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd);
-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud);
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr);
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr);
 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr,
 					pmd_t pmd);
···
 	__page_table_check_ptes_set(mm, ptep, pte, nr);
 }

-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+					     pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;

-	__page_table_check_pmd_set(mm, pmdp, pmd);
+	__page_table_check_pmds_set(mm, pmdp, pmd, nr);
 }

-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+					     pud_t *pudp, pud_t pud, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;

-	__page_table_check_pud_set(mm, pudp, pud);
+	__page_table_check_puds_set(mm, pudp, pud, nr);
 }

 static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
···
 {
 }

-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+					     pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 }

-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+					     pud_t *pudp, pud_t pud, unsigned int nr)
 {
 }

···
 }

 #endif /* CONFIG_PAGE_TABLE_CHECK */
+
+#define page_table_check_pmd_set(mm, pmdp, pmd)	page_table_check_pmds_set(mm, pmdp, pmd, 1)
+#define page_table_check_pud_set(mm, pudp, pud)	page_table_check_puds_set(mm, pudp, pud, 1)
+
 #endif /* __LINUX_PAGE_TABLE_CHECK_H */
include/linux/vmalloc.h (+8)
···
 }
 #endif

+#ifndef arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							   pte_t *ptep)
+{
+	return PAGE_SIZE;
+}
+#endif
+
 #ifndef arch_vmap_pte_supported_shift
 static inline int arch_vmap_pte_supported_shift(unsigned long size)
 {
mm/page_table_check.c (+20 -14)
···
 	WARN_ON_ONCE(swap_cached_writable(pmd_to_swp_entry(pmd)));
 }

-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd)
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr)
 {
+	unsigned long stride = PMD_SIZE >> PAGE_SHIFT;
+	unsigned int i;
+
 	if (&init_mm == mm)
 		return;

 	page_table_check_pmd_flags(pmd);

-	__page_table_check_pmd_clear(mm, *pmdp);
-	if (pmd_user_accessible_page(pmd)) {
-		page_table_check_set(pmd_pfn(pmd), PMD_SIZE >> PAGE_SHIFT,
-				     pmd_write(pmd));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pmd_clear(mm, *(pmdp + i));
+	if (pmd_user_accessible_page(pmd))
+		page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd));
 }
-EXPORT_SYMBOL(__page_table_check_pmd_set);
+EXPORT_SYMBOL(__page_table_check_pmds_set);

-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud)
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr)
 {
+	unsigned long stride = PUD_SIZE >> PAGE_SHIFT;
+	unsigned int i;
+
 	if (&init_mm == mm)
 		return;

-	__page_table_check_pud_clear(mm, *pudp);
-	if (pud_user_accessible_page(pud)) {
-		page_table_check_set(pud_pfn(pud), PUD_SIZE >> PAGE_SHIFT,
-				     pud_write(pud));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pud_clear(mm, *(pudp + i));
+	if (pud_user_accessible_page(pud))
+		page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud));
 }
-EXPORT_SYMBOL(__page_table_check_pud_set);
+EXPORT_SYMBOL(__page_table_check_puds_set);

 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr,
mm/vmalloc.c (+36 -4)
···
 	pte = pte_alloc_kernel_track(pmd, addr, mask);
 	if (!pte)
 		return -ENOMEM;
+
+	arch_enter_lazy_mmu_mode();
+
 	do {
 		if (unlikely(!pte_none(ptep_get(pte)))) {
 			if (pfn_valid(pfn)) {
···
 		set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
 		pfn++;
 	} while (pte += PFN_DOWN(size), addr += size, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 	return 0;
 }
···
 			    pgtbl_mod_mask *mask)
 {
 	pte_t *pte;
+	pte_t ptent;
+	unsigned long size = PAGE_SIZE;

 	pte = pte_offset_kernel(pmd, addr);
+	arch_enter_lazy_mmu_mode();
+
 	do {
-		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
+#ifdef CONFIG_HUGETLB_PAGE
+		size = arch_vmap_pte_range_unmap_size(addr, pte);
+		if (size != PAGE_SIZE) {
+			if (WARN_ON(!IS_ALIGNED(addr, size))) {
+				addr = ALIGN_DOWN(addr, size);
+				pte = PTR_ALIGN_DOWN(pte, sizeof(*pte) * (size >> PAGE_SHIFT));
+			}
+			ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size);
+			if (WARN_ON(end - addr < size))
+				size = end - addr;
+		} else
+#endif
+			ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+	} while (pte += (size >> PAGE_SHIFT), addr += size, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 }
···
 		if (cleared || pmd_bad(*pmd))
 			*mask |= PGTBL_PMD_MODIFIED;

-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PMD_SIZE);
 			continue;
+		}
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
 		vunmap_pte_range(pmd, addr, next, mask);
···
 		if (cleared || pud_bad(*pud))
 			*mask |= PGTBL_PUD_MODIFIED;

-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PUD_SIZE);
 			continue;
+		}
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		vunmap_pmd_range(pud, addr, next, mask);
···
 	pte = pte_alloc_kernel_track(pmd, addr, mask);
 	if (!pte)
 		return -ENOMEM;
+
+	arch_enter_lazy_mmu_mode();
+
 	do {
 		struct page *page = pages[*nr];

···
 		set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
 		(*nr)++;
 	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	arch_leave_lazy_mmu_mode();
 	*mask |= PGTBL_PTE_MODIFIED;
 	return 0;
 }
tools/testing/selftests/arm64/Makefile (+2)
···
 CFLAGS += -I$(top_srcdir)/tools/include

+OUTPUT ?= $(CURDIR)
+
 export CFLAGS
 export top_srcdir
tools/testing/selftests/arm64/abi/tpidr2.c (+12 -2)
···
 			 child_tidptr);
 }

+#define __STACK_SIZE (8 * 1024 * 1024)
+
 /*
- * If we clone with CLONE_SETTLS then the value in the parent should
+ * If we clone with CLONE_VM then the value in the parent should
  * be unchanged and the child should start with zero and be able to
  * set its own value.
  */
···
 	int parent_tid, child_tid;
 	pid_t parent, waiting;
 	int ret, status;
+	void *stack;

 	parent = getpid();
 	set_tpidr2(parent);

-	ret = sys_clone(CLONE_SETTLS, 0, &parent_tid, 0, &child_tid);
+	stack = malloc(__STACK_SIZE);
+	if (!stack) {
+		putstr("# malloc() failed\n");
+		return 0;
+	}
+
+	ret = sys_clone(CLONE_VM, (unsigned long)stack + __STACK_SIZE,
+			&parent_tid, 0, &child_tid);
 	if (ret == -1) {
 		putstr("# clone() failed\n");
 		putnum(errno);
tools/testing/selftests/arm64/fp/fp-ptrace.c (+26 -36)
···
 		pass = false;
 	}

-	if (sve->size != SVE_PT_SIZE(vq, sve->flags)) {
-		ksft_print_msg("Mismatch in SVE header size: %d != %lu\n",
-			       sve->size, SVE_PT_SIZE(vq, sve->flags));
-		pass = false;
+	if (svcr_in & SVCR_SM) {
+		if (sve->size != sizeof(sve)) {
+			ksft_print_msg("NT_ARM_SVE reports data with PSTATE.SM\n");
+			pass = false;
+		}
+	} else {
+		if (sve->size != SVE_PT_SIZE(vq, sve->flags)) {
+			ksft_print_msg("Mismatch in SVE header size: %d != %lu\n",
+				       sve->size, SVE_PT_SIZE(vq, sve->flags));
+			pass = false;
+		}
 	}

 	/* The registers might be in completely different formats! */
···
 		pass = false;
 	}

-	if (sve->size != SVE_PT_SIZE(vq, sve->flags)) {
-		ksft_print_msg("Mismatch in SSVE header size: %d != %lu\n",
-			       sve->size, SVE_PT_SIZE(vq, sve->flags));
-		pass = false;
+	if (!(svcr_in & SVCR_SM)) {
+		if (sve->size != sizeof(sve)) {
+			ksft_print_msg("NT_ARM_SSVE reports data without PSTATE.SM\n");
+			pass = false;
+		}
+	} else {
+		if (sve->size != SVE_PT_SIZE(vq, sve->flags)) {
+			ksft_print_msg("Mismatch in SSVE header size: %d != %lu\n",
+				       sve->size, SVE_PT_SIZE(vq, sve->flags));
+			pass = false;
+		}
 	}

 	/* The registers might be in completely different formats! */
···
 {
 	int vq = __sve_vq_from_vl(vl_in(config));
 	int sme_vq = __sve_vq_from_vl(config->sme_vl_in);
-	bool sm_change;

 	svcr_in = config->svcr_in;
 	svcr_expected = config->svcr_expected;
 	svcr_out = 0;
-
-	if (sme_supported() &&
-	    (svcr_in & SVCR_SM) != (svcr_expected & SVCR_SM))
-		sm_change = true;
-	else
-		sm_change = false;

 	fill_random(&v_in, sizeof(v_in));
 	memcpy(v_expected, v_in, sizeof(v_in));
···
 	if (fpmr_supported()) {
 		fill_random(&fpmr_in, sizeof(fpmr_in));
 		fpmr_in &= FPMR_SAFE_BITS;
-
-		/* Entering or exiting streaming mode clears FPMR */
-		if (sm_change)
-			fpmr_expected = 0;
-		else
-			fpmr_expected = fpmr_in;
+		fpmr_expected = fpmr_in;
 	} else {
 		fpmr_in = 0;
 		fpmr_expected = 0;
···
 static bool za_write_supported(struct test_config *config)
 {
-	if (config->sme_vl_in != config->sme_vl_expected) {
-		/* Changing the SME VL exits streaming mode. */
-		if (config->svcr_expected & SVCR_SM) {
-			return false;
-		}
-	} else {
-		/* Otherwise we can't change streaming mode */
-		if ((config->svcr_in & SVCR_SM) !=
-		    (config->svcr_expected & SVCR_SM)) {
-			return false;
-		}
-	}
+	if ((config->svcr_in & SVCR_SM) != (config->svcr_expected & SVCR_SM))
+		return false;

 	return true;
 }
···
 		memset(zt_expected, 0, sizeof(zt_expected));
 	}

-	/* Changing the SME VL flushes ZT, SVE state and exits SM */
+	/* Changing the SME VL flushes ZT, SVE state */
 	if (config->sme_vl_in != config->sme_vl_expected) {
-		svcr_expected &= ~SVCR_SM;
-
 		sve_vq = __sve_vq_from_vl(vl_expected(config));
 		memset(z_expected, 0, __SVE_ZREGS_SIZE(sve_vq));
 		memset(p_expected, 0, __SVE_PREGS_SIZE(sve_vq));