Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"The big highlight is support for the Scalable Vector Extension (SVE)
which required extensive ABI work to ensure we don't break existing
applications by blowing away their signal stack with the rather large
new vector context (<= 2 kbit per vector register). There's further
work to be done optimising things like exception return, but the ABI
is solid now.

Much of the line count comes from some new PMU drivers we have, but
they're pretty self-contained and I suspect we'll have more of them in
future.

Plenty of acronym soup here:

- initial support for the Scalable Vector Extension (SVE)

- improved handling for SError interrupts (required to handle RAS
events)

- enable GCC support for 128-bit integer types

- remove kernel text addresses from backtraces and register dumps

- use of WFE to implement long delay()s

- ACPI IORT updates from Lorenzo Pieralisi

- perf PMU driver for the Statistical Profiling Extension (SPE)

- perf PMU driver for Hisilicon's system PMUs

- misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
arm64: Make ARMV8_DEPRECATED depend on SYSCTL
arm64: Implement __lshrti3 library function
arm64: support __int128 on gcc 5+
arm64/sve: Add documentation
arm64/sve: Detect SVE and activate runtime support
arm64/sve: KVM: Hide SVE from CPU features exposed to guests
arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
arm64/sve: KVM: Prevent guests from using SVE
arm64/sve: Add sysctl to set the default vector length for new processes
arm64/sve: Add prctl controls for userspace vector length management
arm64/sve: ptrace and ELF coredump support
arm64/sve: Preserve SVE registers around EFI runtime service calls
arm64/sve: Preserve SVE registers around kernel-mode NEON use
arm64/sve: Probe SVE capabilities and usable vector lengths
arm64: cpufeature: Move sys_caps_initialised declarations
arm64/sve: Backend logic for setting the vector length
arm64/sve: Signal handling support
arm64/sve: Support vector length resetting for new processes
arm64/sve: Core task context handling
arm64/sve: Low-level CPU setup
...

+7399 -601
+16 -2
Documentation/arm64/cpu-feature-registers.txt
··· 110 110 x--------------------------------------------------x 111 111 | Name | bits | visible | 112 112 |--------------------------------------------------| 113 - | RES0 | [63-32] | n | 113 + | RES0 | [63-48] | n | 114 + |--------------------------------------------------| 115 + | DP | [47-44] | y | 116 + |--------------------------------------------------| 117 + | SM4 | [43-40] | y | 118 + |--------------------------------------------------| 119 + | SM3 | [39-36] | y | 120 + |--------------------------------------------------| 121 + | SHA3 | [35-32] | y | 114 122 |--------------------------------------------------| 115 123 | RDM | [31-28] | y | 124 + |--------------------------------------------------| 125 + | RES0 | [27-24] | n | 116 126 |--------------------------------------------------| 117 127 | ATOMICS | [23-20] | y | 118 128 |--------------------------------------------------| ··· 142 132 x--------------------------------------------------x 143 133 | Name | bits | visible | 144 134 |--------------------------------------------------| 145 - | RES0 | [63-28] | n | 135 + | RES0 | [63-36] | n | 136 + |--------------------------------------------------| 137 + | SVE | [35-32] | y | 138 + |--------------------------------------------------| 139 + | RES0 | [31-28] | n | 146 140 |--------------------------------------------------| 147 141 | GIC | [27-24] | n | 148 142 |--------------------------------------------------|
+160
Documentation/arm64/elf_hwcaps.txt
··· 1 + ARM64 ELF hwcaps 2 + ================ 3 + 4 + This document describes the usage and semantics of the arm64 ELF hwcaps. 5 + 6 + 7 + 1. Introduction 8 + --------------- 9 + 10 + Some hardware or software features are only available on some CPU 11 + implementations, and/or with certain kernel configurations, but have no 12 + architected discovery mechanism available to userspace code at EL0. The 13 + kernel exposes the presence of these features to userspace through a set 14 + of flags called hwcaps, exposed in the auxiliary vector. 15 + 16 + Userspace software can test for features by acquiring the AT_HWCAP entry 17 + of the auxiliary vector, and testing whether the relevant flags are 18 + set, e.g. 19 + 20 + bool floating_point_is_present(void) 21 + { 22 + unsigned long hwcaps = getauxval(AT_HWCAP); 23 + if (hwcaps & HWCAP_FP) 24 + return true; 25 + 26 + return false; 27 + } 28 + 29 + Where software relies on a feature described by a hwcap, it should check 30 + the relevant hwcap flag to verify that the feature is present before 31 + attempting to make use of the feature. 32 + 33 + Features cannot be probed reliably through other means. When a feature 34 + is not available, attempting to use it may result in unpredictable 35 + behaviour, and is not guaranteed to result in any reliable indication 36 + that the feature is unavailable, such as a SIGILL. 37 + 38 + 39 + 2. Interpretation of hwcaps 40 + --------------------------- 41 + 42 + The majority of hwcaps are intended to indicate the presence of features 43 + which are described by architected ID registers inaccessible to 44 + userspace code at EL0. These hwcaps are defined in terms of ID register 45 + fields, and should be interpreted with reference to the definition of 46 + these fields in the ARM Architecture Reference Manual (ARM ARM). 47 + 48 + Such hwcaps are described below in the form: 49 + 50 + Functionality implied by idreg.field == val. 
51 + 52 + Such hwcaps indicate the availability of functionality that the ARM ARM 53 + defines as being present when idreg.field has value val, but do not 54 + indicate that idreg.field is precisely equal to val, nor do they 55 + indicate the absence of functionality implied by other values of 56 + idreg.field. 57 + 58 + Other hwcaps may indicate the presence of features which cannot be 59 + described by ID registers alone. These may be described without 60 + reference to ID registers, and may refer to other documentation. 61 + 62 + 63 + 3. The hwcaps exposed in AT_HWCAP 64 + --------------------------------- 65 + 66 + HWCAP_FP 67 + 68 + Functionality implied by ID_AA64PFR0_EL1.FP == 0b0000. 69 + 70 + HWCAP_ASIMD 71 + 72 + Functionality implied by ID_AA64PFR0_EL1.AdvSIMD == 0b0000. 73 + 74 + HWCAP_EVTSTRM 75 + 76 + The generic timer is configured to generate events at a frequency of 77 + approximately 100KHz. 78 + 79 + HWCAP_AES 80 + 81 + Functionality implied by ID_AA64ISAR0_EL1.AES == 0b0001. 82 + 83 + HWCAP_PMULL 84 + 85 + Functionality implied by ID_AA64ISAR0_EL1.AES == 0b0010. 86 + 87 + HWCAP_SHA1 88 + 89 + Functionality implied by ID_AA64ISAR0_EL1.SHA1 == 0b0001. 90 + 91 + HWCAP_SHA2 92 + 93 + Functionality implied by ID_AA64ISAR0_EL1.SHA2 == 0b0001. 94 + 95 + HWCAP_CRC32 96 + 97 + Functionality implied by ID_AA64ISAR0_EL1.CRC32 == 0b0001. 98 + 99 + HWCAP_ATOMICS 100 + 101 + Functionality implied by ID_AA64ISAR0_EL1.Atomic == 0b0010. 102 + 103 + HWCAP_FPHP 104 + 105 + Functionality implied by ID_AA64PFR0_EL1.FP == 0b0001. 106 + 107 + HWCAP_ASIMDHP 108 + 109 + Functionality implied by ID_AA64PFR0_EL1.AdvSIMD == 0b0001. 110 + 111 + HWCAP_CPUID 112 + 113 + EL0 access to certain ID registers is available, to the extent 114 + described by Documentation/arm64/cpu-feature-registers.txt. 115 + 116 + These ID registers may imply the availability of features. 117 + 118 + HWCAP_ASIMDRDM 119 + 120 + Functionality implied by ID_AA64ISAR0_EL1.RDM == 0b0001. 
121 + 122 + HWCAP_JSCVT 123 + 124 + Functionality implied by ID_AA64ISAR1_EL1.JSCVT == 0b0001. 125 + 126 + HWCAP_FCMA 127 + 128 + Functionality implied by ID_AA64ISAR1_EL1.FCMA == 0b0001. 129 + 130 + HWCAP_LRCPC 131 + 132 + Functionality implied by ID_AA64ISAR1_EL1.LRCPC == 0b0001. 133 + 134 + HWCAP_DCPOP 135 + 136 + Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0001. 137 + 138 + HWCAP_SHA3 139 + 140 + Functionality implied by ID_AA64ISAR0_EL1.SHA3 == 0b0001. 141 + 142 + HWCAP_SM3 143 + 144 + Functionality implied by ID_AA64ISAR0_EL1.SM3 == 0b0001. 145 + 146 + HWCAP_SM4 147 + 148 + Functionality implied by ID_AA64ISAR0_EL1.SM4 == 0b0001. 149 + 150 + HWCAP_ASIMDDP 151 + 152 + Functionality implied by ID_AA64ISAR0_EL1.DP == 0b0001. 153 + 154 + HWCAP_SHA512 155 + 156 + Functionality implied by ID_AA64ISAR0_EL1.SHA2 == 0b0010. 157 + 158 + HWCAP_SVE 159 + 160 + Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001.
+5 -5
Documentation/arm64/memory.txt
··· 86 86 +-------------------------------------------------> [63] TTBR0/1 87 87 88 88 89 - When using KVM, the hypervisor maps kernel pages in EL2, at a fixed 90 - offset from the kernel VA (top 24bits of the kernel VA set to zero): 89 + When using KVM without the Virtualization Host Extensions, the hypervisor 90 + maps kernel pages in EL2 at a fixed offset from the kernel VA. See the 91 + kern_hyp_va macro for more details. 91 92 92 - Start End Size Use 93 - ----------------------------------------------------------------------- 94 - 0000004000000000 0000007fffffffff 256GB kernel objects mapped in HYP 93 + When using KVM with the Virtualization Host Extensions, no additional 94 + mappings are created, since the host kernel runs directly in EL2.
+508
Documentation/arm64/sve.txt
··· 1 + Scalable Vector Extension support for AArch64 Linux 2 + =================================================== 3 + 4 + Author: Dave Martin <Dave.Martin@arm.com> 5 + Date: 4 August 2017 6 + 7 + This document outlines briefly the interface provided to userspace by Linux in 8 + order to support use of the ARM Scalable Vector Extension (SVE). 9 + 10 + This is an outline of the most important features and issues only and not 11 + intended to be exhaustive. 12 + 13 + This document does not aim to describe the SVE architecture or programmer's 14 + model. To aid understanding, a minimal description of relevant programmer's 15 + model features for SVE is included in Appendix A. 16 + 17 + 18 + 1. General 19 + ----------- 20 + 21 + * SVE registers Z0..Z31, P0..P15 and FFR and the current vector length VL, are 22 + tracked per-thread. 23 + 24 + * The presence of SVE is reported to userspace via HWCAP_SVE in the aux vector 25 + AT_HWCAP entry. Presence of this flag implies the presence of the SVE 26 + instructions and registers, and the Linux-specific system interfaces 27 + described in this document. SVE is reported in /proc/cpuinfo as "sve". 28 + 29 + * Support for the execution of SVE instructions in userspace can also be 30 + detected by reading the CPU ID register ID_AA64PFR0_EL1 using an MRS 31 + instruction, and checking that the value of the SVE field is nonzero. [3] 32 + 33 + It does not guarantee the presence of the system interfaces described in the 34 + following sections: software that needs to verify that those interfaces are 35 + present must check for HWCAP_SVE instead. 36 + 37 + * Debuggers should restrict themselves to interacting with the target via the 38 + NT_ARM_SVE regset. The recommended way of detecting support for this regset 39 + is to connect to a target process first and then attempt a 40 + ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov). 41 + 42 + 43 + 2. 
Vector length terminology 44 + ----------------------------- 45 + 46 + The size of an SVE vector (Z) register is referred to as the "vector length". 47 + 48 + To avoid confusion about the units used to express vector length, the kernel 49 + adopts the following conventions: 50 + 51 + * Vector length (VL) = size of a Z-register in bytes 52 + 53 + * Vector quadwords (VQ) = size of a Z-register in units of 128 bits 54 + 55 + (So, VL = 16 * VQ.) 56 + 57 + The VQ convention is used where the underlying granularity is important, such 58 + as in data structure definitions. In most other situations, the VL convention 59 + is used. This is consistent with the meaning of the "VL" pseudo-register in 60 + the SVE instruction set architecture. 61 + 62 + 63 + 3. System call behaviour 64 + ------------------------- 65 + 66 + * On syscall, V0..V31 are preserved (as without SVE). Thus, bits [127:0] of 67 + Z0..Z31 are preserved. All other bits of Z0..Z31, and all of P0..P15 and FFR 68 + become unspecified on return from a syscall. 69 + 70 + * The SVE registers are not used to pass arguments to or receive results from 71 + any syscall. 72 + 73 + * In practice the affected registers/bits will be preserved or will be replaced 74 + with zeros on return from a syscall, but userspace should not make 75 + assumptions about this. The kernel behaviour may vary on a case-by-case 76 + basis. 77 + 78 + * All other SVE state of a thread, including the currently configured vector 79 + length, the state of the PR_SVE_VL_INHERIT flag, and the deferred vector 80 + length (if any), is preserved across all syscalls, subject to the specific 81 + exceptions for execve() described in section 6. 82 + 83 + In particular, on return from a fork() or clone(), the parent and new child 84 + process or thread share identical SVE configuration, matching that of the 85 + parent before the call. 86 + 87 + 88 + 4. 
Signal handling 89 + ------------------- 90 + 91 + * A new signal frame record sve_context encodes the SVE registers on signal 92 + delivery. [1] 93 + 94 + * This record is supplementary to fpsimd_context. The FPSR and FPCR registers 95 + are only present in fpsimd_context. For convenience, the content of V0..V31 96 + is duplicated between sve_context and fpsimd_context. 97 + 98 + * The signal frame record for SVE always contains basic metadata, in particular 99 + the thread's vector length (in sve_context.vl). 100 + 101 + * The SVE registers may or may not be included in the record, depending on 102 + whether the registers are live for the thread. The registers are present if 103 + and only if: 104 + sve_context.head.size >= SVE_SIG_CONTEXT_SIZE(sve_vq_from_vl(sve_context.vl)). 105 + 106 + * If the registers are present, the remainder of the record has a vl-dependent 107 + size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to 108 + the members. 109 + 110 + * If the SVE context is too big to fit in sigcontext.__reserved[], then extra 111 + space is allocated on the stack, an extra_context record is written in 112 + __reserved[] referencing this space. sve_context is then written in the 113 + extra space. Refer to [1] for further details about this mechanism. 114 + 115 + 116 + 5. Signal return 117 + ----------------- 118 + 119 + When returning from a signal handler: 120 + 121 + * If there is no sve_context record in the signal frame, or if the record is 122 + present but contains no register data as described in the previous section, 123 + then the SVE registers/bits become non-live and take unspecified values. 124 + 125 + * If sve_context is present in the signal frame and contains full register 126 + data, the SVE registers become live and are populated with the specified 127 + data. 
However, for backward compatibility reasons, bits [127:0] of Z0..Z31 128 + are always restored from the corresponding members of fpsimd_context.vregs[] 129 + and not from sve_context. The remaining bits are restored from sve_context. 130 + 131 + * Inclusion of fpsimd_context in the signal frame remains mandatory, 132 + irrespective of whether sve_context is present or not. 133 + 134 + * The vector length cannot be changed via signal return. If sve_context.vl in 135 + the signal frame does not match the current vector length, the signal return 136 + attempt is treated as illegal, resulting in a forced SIGSEGV. 137 + 138 + 139 + 6. prctl extensions 140 + -------------------- 141 + 142 + Some new prctl() calls are added to allow programs to manage the SVE vector 143 + length: 144 + 145 + prctl(PR_SVE_SET_VL, unsigned long arg) 146 + 147 + Sets the vector length of the calling thread and related flags, where 148 + arg == vl | flags. Other threads of the calling process are unaffected. 149 + 150 + vl is the desired vector length, where sve_vl_valid(vl) must be true. 151 + 152 + flags: 153 + 154 + PR_SVE_VL_INHERIT 155 + 156 + Inherit the current vector length across execve(). Otherwise, the 157 + vector length is reset to the system default at execve(). (See 158 + Section 9.) 159 + 160 + PR_SVE_SET_VL_ONEXEC 161 + 162 + Defer the requested vector length change until the next execve() 163 + performed by this thread. 164 + 165 + The effect is equivalent to implicit execution of the following 166 + call immediately after the next execve() (if any) by the thread: 167 + 168 + prctl(PR_SVE_SET_VL, arg & ~PR_SVE_SET_VL_ONEXEC) 169 + 170 + This allows launching of a new program with a different vector 171 + length, while avoiding runtime side effects in the caller. 172 + 173 + 174 + Without PR_SVE_SET_VL_ONEXEC, the requested change takes effect 175 + immediately. 
176 + 177 + 178 + Return value: a nonnegative value on success, or a negative value on error: 179 + EINVAL: SVE not supported, invalid vector length requested, or 180 + invalid flags. 181 + 182 + 183 + On success: 184 + 185 + * Either the calling thread's vector length or the deferred vector length 186 + to be applied at the next execve() by the thread (dependent on whether 187 + PR_SVE_SET_VL_ONEXEC is present in arg), is set to the largest value 188 + supported by the system that is less than or equal to vl. If vl == 189 + SVE_VL_MAX, the value set will be the largest value supported by the 190 + system. 191 + 192 + * Any previously outstanding deferred vector length change in the calling 193 + thread is cancelled. 194 + 195 + * The returned value describes the resulting configuration, encoded as for 196 + PR_SVE_GET_VL. The vector length reported in this value is the new 197 + current vector length for this thread if PR_SVE_SET_VL_ONEXEC was not 198 + present in arg; otherwise, the reported vector length is the deferred 199 + vector length that will be applied at the next execve() by the calling 200 + thread. 201 + 202 + * Changing the vector length causes all of P0..P15, FFR and all bits of 203 + Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become 204 + unspecified. Calling PR_SVE_SET_VL with vl equal to the thread's current 205 + vector length, or calling PR_SVE_SET_VL with the PR_SVE_SET_VL_ONEXEC 206 + flag, does not constitute a change to the vector length for this purpose. 207 + 208 + 209 + prctl(PR_SVE_GET_VL) 210 + 211 + Gets the vector length of the calling thread. 212 + 213 + The following flag may be OR-ed into the result: 214 + 215 + PR_SVE_VL_INHERIT 216 + 217 + Vector length will be inherited across execve(). 218 + 219 + There is no way to determine whether there is an outstanding deferred 220 + vector length change (which would only normally be the case between a 221 + fork() or vfork() and the corresponding execve() in typical use). 
222 + 223 + To extract the vector length from the result, AND it with 224 + PR_SVE_VL_LEN_MASK. 225 + 226 + Return value: a nonnegative value on success, or a negative value on error: 227 + EINVAL: SVE not supported. 228 + 229 + 230 + 7. ptrace extensions 231 + --------------------- 232 + 233 + * A new regset NT_ARM_SVE is defined for use with PTRACE_GETREGSET and 234 + PTRACE_SETREGSET. 235 + 236 + Refer to [2] for definitions. 237 + 238 + The regset data starts with struct user_sve_header, containing: 239 + 240 + size 241 + 242 + Size of the complete regset, in bytes. 243 + This depends on vl and possibly on other things in the future. 244 + 245 + If a call to PTRACE_GETREGSET requests less data than the value of 246 + size, the caller can allocate a larger buffer and retry in order to 247 + read the complete regset. 248 + 249 + max_size 250 + 251 + Maximum size in bytes that the regset can grow to for the target 252 + thread. The regset won't grow bigger than this even if the target 253 + thread changes its vector length etc. 254 + 255 + vl 256 + 257 + Target thread's current vector length, in bytes. 258 + 259 + max_vl 260 + 261 + Maximum possible vector length for the target thread. 262 + 263 + flags 264 + 265 + either 266 + 267 + SVE_PT_REGS_FPSIMD 268 + 269 + SVE registers are not live (GETREGSET) or are to be made 270 + non-live (SETREGSET). 271 + 272 + The payload is of type struct user_fpsimd_state, with the same 273 + meaning as for NT_PRFPREG, starting at offset 274 + SVE_PT_FPSIMD_OFFSET from the start of user_sve_header. 275 + 276 + Extra data might be appended in the future: the size of the 277 + payload should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags). 278 + 279 + vq should be obtained using sve_vq_from_vl(vl). 280 + 281 + or 282 + 283 + SVE_PT_REGS_SVE 284 + 285 + SVE registers are live (GETREGSET) or are to be made live 286 + (SETREGSET). 
287 + 288 + The payload contains the SVE register data, starting at offset 289 + SVE_PT_SVE_OFFSET from the start of user_sve_header, and with 290 + size SVE_PT_SVE_SIZE(vq, flags); 291 + 292 + ... OR-ed with zero or more of the following flags, which have the same 293 + meaning and behaviour as the corresponding PR_SET_VL_* flags: 294 + 295 + SVE_PT_VL_INHERIT 296 + 297 + SVE_PT_VL_ONEXEC (SETREGSET only). 298 + 299 + * The effects of changing the vector length and/or flags are equivalent to 300 + those documented for PR_SVE_SET_VL. 301 + 302 + The caller must make a further GETREGSET call if it needs to know what VL is 303 + actually set by SETREGSET, unless it is known in advance that the requested 304 + VL is supported. 305 + 306 + * In the SVE_PT_REGS_SVE case, the size and layout of the payload depends on 307 + the header fields. The SVE_PT_SVE_*() macros are provided to facilitate 308 + access to the members. 309 + 310 + * In either case, for SETREGSET it is permissible to omit the payload, in which 311 + case only the vector length and flags are changed (along with any 312 + consequences of those changes). 313 + 314 + * For SETREGSET, if an SVE_PT_REGS_SVE payload is present and the 315 + requested VL is not supported, the effect will be the same as if the 316 + payload were omitted, except that an EIO error is reported. No 317 + attempt is made to translate the payload data to the correct layout 318 + for the vector length actually set. The thread's FPSIMD state is 319 + preserved, but the remaining bits of the SVE registers become 320 + unspecified. It is up to the caller to translate the payload layout 321 + for the actual VL and retry. 322 + 323 + * The effect of writing a partial, incomplete payload is unspecified. 324 + 325 + 326 + 8. ELF coredump extensions 327 + --------------------------- 328 + 329 + * A NT_ARM_SVE note will be added to each coredump for each thread of the 330 + dumped process. 
The contents will be equivalent to the data that would have 331 + been read if a PTRACE_GETREGSET of NT_ARM_SVE were executed for each thread 332 + when the coredump was generated. 333 + 334 + 335 + 9. System runtime configuration 336 + -------------------------------- 337 + 338 + * To mitigate the ABI impact of expansion of the signal frame, a policy 339 + mechanism is provided for administrators, distro maintainers and developers 340 + to set the default vector length for userspace processes: 341 + 342 + /proc/sys/abi/sve_default_vector_length 343 + 344 + Writing the text representation of an integer to this file sets the system 345 + default vector length to the specified value, unless the value is greater 346 + than the maximum vector length supported by the system in which case the 347 + default vector length is set to that maximum. 348 + 349 + The result can be determined by reopening the file and reading its 350 + contents. 351 + 352 + At boot, the default vector length is initially set to 64 or the maximum 353 + supported vector length, whichever is smaller. This determines the initial 354 + vector length of the init process (PID 1). 355 + 356 + Reading this file returns the current system default vector length. 357 + 358 + * At every execve() call, the new vector length of the new process is set to 359 + the system default vector length, unless 360 + 361 + * PR_SVE_VL_INHERIT (or equivalently SVE_PT_VL_INHERIT) is set for the 362 + calling thread, or 363 + 364 + * a deferred vector length change is pending, established via the 365 + PR_SVE_SET_VL_ONEXEC flag (or SVE_PT_VL_ONEXEC). 366 + 367 + * Modifying the system default vector length does not affect the vector length 368 + of any existing process or thread that does not make an execve() call. 369 + 370 + 371 + Appendix A. 
SVE programmer's model (informative) 372 + ================================================= 373 + 374 + This section provides a minimal description of the additions made by SVE to the 375 + ARMv8-A programmer's model that are relevant to this document. 376 + 377 + Note: This section is for information only and not intended to be complete or 378 + to replace any architectural specification. 379 + 380 + A.1. Registers 381 + --------------- 382 + 383 + In A64 state, SVE adds the following: 384 + 385 + * 32 8VL-bit vector registers Z0..Z31 386 + For each Zn, Zn bits [127:0] alias the ARMv8-A vector register Vn. 387 + 388 + A register write using a Vn register name zeros all bits of the corresponding 389 + Zn except for bits [127:0]. 390 + 391 + * 16 VL-bit predicate registers P0..P15 392 + 393 + * 1 VL-bit special-purpose predicate register FFR (the "first-fault register") 394 + 395 + * a VL "pseudo-register" that determines the size of each vector register 396 + 397 + The SVE instruction set architecture provides no way to write VL directly. 398 + Instead, it can be modified only by EL1 and above, by writing appropriate 399 + system registers. 400 + 401 + * The value of VL can be configured at runtime by EL1 and above: 402 + 16 <= VL <= VLmax, where VL must be a multiple of 16. 403 + 404 + * The maximum vector length is determined by the hardware: 405 + 16 <= VLmax <= 256. 406 + 407 + (The SVE architecture specifies 256, but permits future architecture 408 + revisions to raise this limit.) 409 + 410 + * FPSR and FPCR are retained from ARMv8-A, and interact with SVE floating-point 411 + operations in a similar way to the way in which they interact with ARMv8 412 + floating-point operations. 
413 + 414 + 8VL-1 128 0 bit index 415 + +---- //// -----------------+ 416 + Z0 | : V0 | 417 + : : 418 + Z7 | : V7 | 419 + Z8 | : * V8 | 420 + : : : 421 + Z15 | : *V15 | 422 + Z16 | : V16 | 423 + : : 424 + Z31 | : V31 | 425 + +---- //// -----------------+ 426 + 31 0 427 + VL-1 0 +-------+ 428 + +---- //// --+ FPSR | | 429 + P0 | | +-------+ 430 + : | | *FPCR | | 431 + P15 | | +-------+ 432 + +---- //// --+ 433 + FFR | | +-----+ 434 + +---- //// --+ VL | | 435 + +-----+ 436 + 437 + (*) callee-save: 438 + This only applies to bits [63:0] of Z-/V-registers. 439 + FPCR contains callee-save and caller-save bits. See [4] for details. 440 + 441 + 442 + A.2. Procedure call standard 443 + ----------------------------- 444 + 445 + The ARMv8-A base procedure call standard is extended as follows with respect to 446 + the additional SVE register state: 447 + 448 + * All SVE register bits that are not shared with FP/SIMD are caller-save. 449 + 450 + * Z8 bits [63:0] .. Z15 bits [63:0] are callee-save. 451 + 452 + This follows from the way these bits are mapped to V8..V15, which are callee- 453 + save in the base procedure call standard. 454 + 455 + 456 + Appendix B. ARMv8-A FP/SIMD programmer's model 457 + =============================================== 458 + 459 + Note: This section is for information only and not intended to be complete or 460 + to replace any architectural specification. 461 + 462 + Refer to [4] for more information. 
463 + 464 + ARMv8-A defines the following floating-point / SIMD register state: 465 + 466 + * 32 128-bit vector registers V0..V31 467 + * 2 32-bit status/control registers FPSR, FPCR 468 + 469 + 127 0 bit index 470 + +---------------+ 471 + V0 | | 472 + : : : 473 + V7 | | 474 + * V8 | | 475 + : : : : 476 + *V15 | | 477 + V16 | | 478 + : : : 479 + V31 | | 480 + +---------------+ 481 + 482 + 31 0 483 + +-------+ 484 + FPSR | | 485 + +-------+ 486 + *FPCR | | 487 + +-------+ 488 + 489 + (*) callee-save: 490 + This only applies to bits [63:0] of V-registers. 491 + FPCR contains a mixture of callee-save and caller-save bits. 492 + 493 + 494 + References 495 + ========== 496 + 497 + [1] arch/arm64/include/uapi/asm/sigcontext.h 498 + AArch64 Linux signal ABI definitions 499 + 500 + [2] arch/arm64/include/uapi/asm/ptrace.h 501 + AArch64 Linux ptrace ABI definitions 502 + 503 + [3] linux/Documentation/arm64/cpu-feature-registers.txt 504 + 505 + [4] ARM IHI0055C 506 + http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf 507 + http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html 508 + Procedure Call Standard for the ARM 64-bit Architecture (AArch64)
+20
Documentation/devicetree/bindings/arm/spe-pmu.txt
··· 1 + * ARMv8.2 Statistical Profiling Extension (SPE) Performance Monitor Units (PMU) 2 + 3 + ARMv8.2 introduces the optional Statistical Profiling Extension for collecting 4 + performance sample data using an in-memory trace buffer. 5 + 6 + ** SPE Required properties: 7 + 8 + - compatible : should be one of: 9 + "arm,statistical-profiling-extension-v1" 10 + 11 + - interrupts : Exactly 1 PPI must be listed. For heterogeneous systems where 12 + SPE is only supported on a subset of the CPUs, please consult 13 + the arm,gic-v3 binding for details on describing a PPI partition. 14 + 15 + ** Example: 16 + 17 + spe-pmu { 18 + compatible = "arm,statistical-profiling-extension-v1"; 19 + interrupts = <GIC_PPI 05 IRQ_TYPE_LEVEL_HIGH &part1>; 20 + };
+53
Documentation/perf/hisi-pmu.txt
··· 1 + HiSilicon SoC uncore Performance Monitoring Unit (PMU) 2 + ====================================================== 3 + The HiSilicon SoC chip includes various independent system device PMUs 4 + such as L3 cache (L3C), Hydra Home Agent (HHA) and DDRC. These PMUs are 5 + independent and have hardware logic to gather statistics and performance 6 + information. 7 + 8 + The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster 9 + (CCL) is made up of 4 CPU cores sharing one L3 cache; each CPU die is 10 + called Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has 11 + two HHAs (0 - 1) and four DDRCs (0 - 3). 12 + 13 + HiSilicon SoC uncore PMU driver 14 + --------------------------------------- 15 + Each device PMU has separate registers for event counting, control and 16 + interrupt, and the PMU driver shall register perf PMU drivers like L3C, 17 + HHA and DDRC etc. The available events and configuration options shall 18 + be described in the sysfs, see: 19 + /sys/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>/, or 20 + /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>. 21 + The "perf list" command shall list the available events from sysfs. 22 + 23 + Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU 24 + name will appear in event listing as hisi_sccl<sccl-id>_module<index-id>, 25 + where "sccl-id" is the identifier of the SCCL and "index-id" is the index of 26 + the module. 27 + e.g. hisi_sccl3_l3c0/rd_hit_cpipe is READ_HIT_CPIPE event of L3C index #0 in 28 + SCCL ID #3. 29 + e.g. hisi_sccl1_hha0/rx_operations is RX_OPERATIONS event of HHA index #0 in 30 + SCCL ID #1. 31 + 32 + The driver also provides a "cpumask" sysfs attribute, which shows the CPU core 33 + ID used to count the uncore PMU event. 
34 + 35 + Example usage of perf: 36 + $# perf list 37 + hisi_sccl3_l3c0/rd_hit_cpipe/ [kernel PMU event] 38 + ------------------------------------------ 39 + hisi_sccl3_l3c0/wr_hit_cpipe/ [kernel PMU event] 40 + ------------------------------------------ 41 + hisi_sccl1_l3c0/rd_hit_cpipe/ [kernel PMU event] 42 + ------------------------------------------ 43 + hisi_sccl1_l3c0/wr_hit_cpipe/ [kernel PMU event] 44 + ------------------------------------------ 45 + 46 + $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5 47 + $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5 48 + 49 + The current driver does not support sampling, so "perf record" is unsupported. 50 + Attaching to a task is also unsupported, as the events are all uncore. 51 + 52 + Note: Please contact the maintainer if a complete list of the events supported 53 + by the PMU devices in the SoC is needed.
+7
MAINTAINERS
··· 6259 6259 F: drivers/net/ethernet/hisilicon/ 6260 6260 F: Documentation/devicetree/bindings/net/hisilicon*.txt 6261 6261 6262 + HISILICON PMU DRIVER 6263 + M: Shaokun Zhang <zhangshaokun@hisilicon.com> 6264 + W: http://www.hisilicon.com 6265 + S: Supported 6266 + F: drivers/perf/hisilicon 6267 + F: Documentation/perf/hisi-pmu.txt 6268 + 6262 6269 HISILICON ROCE DRIVER 6263 6270 M: Lijun Ou <oulijun@huawei.com> 6264 6271 M: Wei Hu(Xavier) <xavier.huwei@huawei.com>
+1
arch/arm/include/asm/arch_timer.h
··· 107 107 static inline void arch_timer_set_cntkctl(u32 cntkctl) 108 108 { 109 109 asm volatile("mcr p15, 0, %0, c14, c1, 0" : : "r" (cntkctl)); 110 + isb(); 110 111 } 111 112 112 113 #endif
+3
arch/arm/include/asm/kvm_host.h
··· 293 293 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, 294 294 struct kvm_device_attr *attr); 295 295 296 + /* All host FP/SIMD state is restored on guest exit, so nothing to save: */ 297 + static inline void kvm_fpsimd_flush_cpu_state(void) {} 298 + 296 299 #endif /* __ARM_KVM_HOST_H__ */
+16 -2
arch/arm64/Kconfig
··· 21 21 select ARCH_HAS_STRICT_KERNEL_RWX 22 22 select ARCH_HAS_STRICT_MODULE_RWX 23 23 select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST 24 - select ARCH_HAVE_NMI_SAFE_CMPXCHG if ACPI_APEI_SEA 24 + select ARCH_HAVE_NMI_SAFE_CMPXCHG 25 25 select ARCH_INLINE_READ_LOCK if !PREEMPT 26 26 select ARCH_INLINE_READ_LOCK_BH if !PREEMPT 27 27 select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPT ··· 115 115 select HAVE_IRQ_TIME_ACCOUNTING 116 116 select HAVE_MEMBLOCK 117 117 select HAVE_MEMBLOCK_NODE_MAP if NUMA 118 - select HAVE_NMI if ACPI_APEI_SEA 118 + select HAVE_NMI 119 119 select HAVE_PATA_PLATFORM 120 120 select HAVE_PERF_EVENTS 121 121 select HAVE_PERF_REGS ··· 136 136 select PCI_ECAM if ACPI 137 137 select POWER_RESET 138 138 select POWER_SUPPLY 139 + select REFCOUNT_FULL 139 140 select SPARSE_IRQ 140 141 select SYSCTL_EXCEPTION_TRACE 141 142 select THREAD_INFO_IN_TASK ··· 843 842 menuconfig ARMV8_DEPRECATED 844 843 bool "Emulate deprecated/obsolete ARMv8 instructions" 845 844 depends on COMPAT 845 + depends on SYSCTL 846 846 help 847 847 Legacy software support may require certain instructions 848 848 that have been deprecated or obsoleted in the architecture. ··· 1013 1011 1014 1012 endmenu 1015 1013 1014 + config ARM64_SVE 1015 + bool "ARM Scalable Vector Extension support" 1016 + default y 1017 + help 1018 + The Scalable Vector Extension (SVE) is an extension to the AArch64 1019 + execution state which complements and extends the SIMD functionality 1020 + of the base architecture to support much larger vectors and to enable 1021 + additional vectorisation opportunities. 1022 + 1023 + To enable use of this extension on CPUs that implement it, say Y. 1024 + 1016 1025 config ARM64_MODULE_CMODEL_LARGE 1017 1026 bool 1018 1027 ··· 1112 1099 config EFI 1113 1100 bool "UEFI runtime support" 1114 1101 depends on OF && !CPU_BIG_ENDIAN 1102 + depends on KERNEL_MODE_NEON 1115 1103 select LIBFDT 1116 1104 select UCS2_STRING 1117 1105 select EFI_PARAMS_FROM_FDT
+8 -2
arch/arm64/Makefile
··· 14 14 CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET) 15 15 GZFLAGS :=-9 16 16 17 - ifneq ($(CONFIG_RELOCATABLE),) 18 - LDFLAGS_vmlinux += -pie -shared -Bsymbolic 17 + ifeq ($(CONFIG_RELOCATABLE), y) 18 + # Pass --no-apply-dynamic-relocs to restore pre-binutils-2.27 behaviour 19 + # for relative relocs, since this leads to better Image compression 20 + # with the relocation offsets always being zero. 21 + LDFLAGS_vmlinux += -pie -shared -Bsymbolic \ 22 + $(call ld-option, --no-apply-dynamic-relocs) 19 23 endif 20 24 21 25 ifeq ($(CONFIG_ARM64_ERRATUM_843419),y) ··· 56 52 57 53 KBUILD_CFLAGS += $(call cc-option,-mabi=lp64) 58 54 KBUILD_AFLAGS += $(call cc-option,-mabi=lp64) 55 + 56 + KBUILD_CFLAGS += $(call cc-ifversion, -ge, 0500, -DCONFIG_ARCH_SUPPORTS_INT128) 59 57 60 58 ifeq ($(CONFIG_CPU_BIG_ENDIAN), y) 61 59 KBUILD_CPPFLAGS += -mbig-endian
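The new `-DCONFIG_ARCH_SUPPORTS_INT128` flag (gated on GCC 5+ via cc-ifversion) lets generic code use 128-bit arithmetic, which the compiler lowers to a MUL/UMULH pair on AArch64. A minimal userspace illustration of the kind of helper this enables (the name here is illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of a 64x64-bit product, via __int128.
 * On AArch64 with GCC 5+ this compiles down to a single UMULH. */
static uint64_t mul_u64_u64_hi(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}
```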
+1
arch/arm64/include/asm/arch_timer.h
··· 144 144 static inline void arch_timer_set_cntkctl(u32 cntkctl) 145 145 { 146 146 write_sysreg(cntkctl, cntkctl_el1); 147 + isb(); 147 148 } 148 149 149 150 static inline u64 arch_counter_get_cntpct(void)
+4 -4
arch/arm64/include/asm/asm-bug.h
··· 22 22 #define _BUGVERBOSE_LOCATION(file, line) __BUGVERBOSE_LOCATION(file, line) 23 23 #define __BUGVERBOSE_LOCATION(file, line) \ 24 24 .pushsection .rodata.str,"aMS",@progbits,1; \ 25 - 2: .string file; \ 25 + 14472: .string file; \ 26 26 .popsection; \ 27 27 \ 28 - .long 2b - 0b; \ 28 + .long 14472b - 14470b; \ 29 29 .short line; 30 30 #else 31 31 #define _BUGVERBOSE_LOCATION(file, line) ··· 36 36 #define __BUG_ENTRY(flags) \ 37 37 .pushsection __bug_table,"aw"; \ 38 38 .align 2; \ 39 - 0: .long 1f - 0b; \ 39 + 14470: .long 14471f - 14470b; \ 40 40 _BUGVERBOSE_LOCATION(__FILE__, __LINE__) \ 41 41 .short flags; \ 42 42 .popsection; \ 43 - 1: 43 + 14471: 44 44 #else 45 45 #define __BUG_ENTRY(flags) 46 46 #endif
+32 -19
arch/arm64/include/asm/assembler.h
··· 25 25 26 26 #include <asm/asm-offsets.h> 27 27 #include <asm/cpufeature.h> 28 + #include <asm/debug-monitors.h> 28 29 #include <asm/mmu_context.h> 29 30 #include <asm/page.h> 30 31 #include <asm/pgtable-hwdef.h> 31 32 #include <asm/ptrace.h> 32 33 #include <asm/thread_info.h> 34 + 35 + .macro save_and_disable_daif, flags 36 + mrs \flags, daif 37 + msr daifset, #0xf 38 + .endm 39 + 40 + .macro disable_daif 41 + msr daifset, #0xf 42 + .endm 43 + 44 + .macro enable_daif 45 + msr daifclr, #0xf 46 + .endm 47 + 48 + .macro restore_daif, flags:req 49 + msr daif, \flags 50 + .endm 51 + 52 + /* Only on aarch64 pstate, PSR_D_BIT is different for aarch32 */ 53 + .macro inherit_daif, pstate:req, tmp:req 54 + and \tmp, \pstate, #(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT) 55 + msr daif, \tmp 56 + .endm 57 + 58 + /* IRQ is the lowest priority flag, unconditionally unmask the rest. */ 59 + .macro enable_da_f 60 + msr daifclr, #(8 | 4 | 1) 61 + .endm 33 62 34 63 /* 35 64 * Enable and disable interrupts. ··· 80 51 msr daif, \flags 81 52 .endm 82 53 83 - /* 84 - * Enable and disable debug exceptions. 85 - */ 86 - .macro disable_dbg 87 - msr daifset, #8 88 - .endm 89 - 90 54 .macro enable_dbg 91 55 msr daifclr, #8 92 56 .endm ··· 87 65 .macro disable_step_tsk, flgs, tmp 88 66 tbz \flgs, #TIF_SINGLESTEP, 9990f 89 67 mrs \tmp, mdscr_el1 90 - bic \tmp, \tmp, #1 68 + bic \tmp, \tmp, #DBG_MDSCR_SS 91 69 msr mdscr_el1, \tmp 92 70 isb // Synchronise with enable_dbg 93 71 9990: 94 72 .endm 95 73 74 + /* call with daif masked */ 96 75 .macro enable_step_tsk, flgs, tmp 97 76 tbz \flgs, #TIF_SINGLESTEP, 9990f 98 - disable_dbg 99 77 mrs \tmp, mdscr_el1 100 - orr \tmp, \tmp, #1 78 + orr \tmp, \tmp, #DBG_MDSCR_SS 101 79 msr mdscr_el1, \tmp 102 80 9990: 103 - .endm 104 - 105 - /* 106 - * Enable both debug exceptions and interrupts. This is likely to be 107 - * faster than two daifclr operations, since writes to this register 108 - * are self-synchronising. 
109 - */ 110 - .macro enable_dbg_and_irq 111 - msr daifclr, #(8 | 2) 112 81 .endm 113 82 114 83 /*
+2
arch/arm64/include/asm/barrier.h
··· 31 31 #define dmb(opt) asm volatile("dmb " #opt : : : "memory") 32 32 #define dsb(opt) asm volatile("dsb " #opt : : : "memory") 33 33 34 + #define psb_csync() asm volatile("hint #17" : : : "memory") 35 + 34 36 #define mb() dsb(sy) 35 37 #define rmb() dsb(ld) 36 38 #define wmb() dsb(st)
+4
arch/arm64/include/asm/cpu.h
··· 41 41 u64 reg_id_aa64mmfr2; 42 42 u64 reg_id_aa64pfr0; 43 43 u64 reg_id_aa64pfr1; 44 + u64 reg_id_aa64zfr0; 44 45 45 46 u32 reg_id_dfr0; 46 47 u32 reg_id_isar0; ··· 60 59 u32 reg_mvfr0; 61 60 u32 reg_mvfr1; 62 61 u32 reg_mvfr2; 62 + 63 + /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */ 64 + u64 reg_zcr; 63 65 }; 64 66 65 67 DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
+2 -1
arch/arm64/include/asm/cpucaps.h
··· 40 40 #define ARM64_WORKAROUND_858921 19 41 41 #define ARM64_WORKAROUND_CAVIUM_30115 20 42 42 #define ARM64_HAS_DCPOP 21 43 + #define ARM64_SVE 22 43 44 44 - #define ARM64_NCAPS 22 45 + #define ARM64_NCAPS 23 45 46 46 47 #endif /* __ASM_CPUCAPS_H */
+42
arch/arm64/include/asm/cpufeature.h
··· 10 10 #define __ASM_CPUFEATURE_H 11 11 12 12 #include <asm/cpucaps.h> 13 + #include <asm/fpsimd.h> 13 14 #include <asm/hwcap.h> 15 + #include <asm/sigcontext.h> 14 16 #include <asm/sysreg.h> 15 17 16 18 /* ··· 225 223 return val == ID_AA64PFR0_EL0_32BIT_64BIT; 226 224 } 227 225 226 + static inline bool id_aa64pfr0_sve(u64 pfr0) 227 + { 228 + u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT); 229 + 230 + return val > 0; 231 + } 232 + 228 233 void __init setup_cpu_features(void); 229 234 230 235 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, ··· 269 260 { 270 261 return IS_ENABLED(CONFIG_ARM64_SW_TTBR0_PAN) && 271 262 !cpus_have_const_cap(ARM64_HAS_PAN); 263 + } 264 + 265 + static inline bool system_supports_sve(void) 266 + { 267 + return IS_ENABLED(CONFIG_ARM64_SVE) && 268 + cpus_have_const_cap(ARM64_SVE); 269 + } 270 + 271 + /* 272 + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE 273 + * vector length. 274 + * 275 + * Use only if SVE is present. 276 + * This function clobbers the SVE vector length. 277 + */ 278 + static inline u64 read_zcr_features(void) 279 + { 280 + u64 zcr; 281 + unsigned int vq_max; 282 + 283 + /* 284 + * Set the maximum possible VL, and write zeroes to all other 285 + * bits to see if they stick. 286 + */ 287 + sve_kernel_enable(NULL); 288 + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1); 289 + 290 + zcr = read_sysreg_s(SYS_ZCR_EL1); 291 + zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */ 292 + vq_max = sve_vq_from_vl(sve_get_vl()); 293 + zcr |= vq_max - 1; /* set LEN field to maximum effective value */ 294 + 295 + return zcr; 272 296 } 273 297 274 298 #endif /* __ASSEMBLY__ */
+72
arch/arm64/include/asm/daifflags.h
··· 1 + /* 2 + * Copyright (C) 2017 ARM Ltd. 3 + * 4 + * This program is free software; you can redistribute it and/or modify 5 + * it under the terms of the GNU General Public License version 2 as 6 + * published by the Free Software Foundation. 7 + * 8 + * This program is distributed in the hope that it will be useful, 9 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 10 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 11 + * GNU General Public License for more details. 12 + * 13 + * You should have received a copy of the GNU General Public License 14 + * along with this program. If not, see <http://www.gnu.org/licenses/>. 15 + */ 16 + #ifndef __ASM_DAIFFLAGS_H 17 + #define __ASM_DAIFFLAGS_H 18 + 19 + #include <linux/irqflags.h> 20 + 21 + #define DAIF_PROCCTX 0 22 + #define DAIF_PROCCTX_NOIRQ PSR_I_BIT 23 + 24 + /* mask/save/unmask/restore all exceptions, including interrupts. */ 25 + static inline void local_daif_mask(void) 26 + { 27 + asm volatile( 28 + "msr daifset, #0xf // local_daif_mask\n" 29 + : 30 + : 31 + : "memory"); 32 + trace_hardirqs_off(); 33 + } 34 + 35 + static inline unsigned long local_daif_save(void) 36 + { 37 + unsigned long flags; 38 + 39 + asm volatile( 40 + "mrs %0, daif // local_daif_save\n" 41 + : "=r" (flags) 42 + : 43 + : "memory"); 44 + local_daif_mask(); 45 + 46 + return flags; 47 + } 48 + 49 + static inline void local_daif_unmask(void) 50 + { 51 + trace_hardirqs_on(); 52 + asm volatile( 53 + "msr daifclr, #0xf // local_daif_unmask" 54 + : 55 + : 56 + : "memory"); 57 + } 58 + 59 + static inline void local_daif_restore(unsigned long flags) 60 + { 61 + if (!arch_irqs_disabled_flags(flags)) 62 + trace_hardirqs_on(); 63 + asm volatile( 64 + "msr daif, %0 // local_daif_restore" 65 + : 66 + : "r" (flags) 67 + : "memory"); 68 + if (arch_irqs_disabled_flags(flags)) 69 + trace_hardirqs_off(); 70 + } 71 + 72 + #endif
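The save/mask/restore pairing of these helpers can be modelled in plain C (a userspace sketch of the DAIF register, with the tracing hooks omitted — not kernel code): save returns the current flags and then masks everything, restore writes the saved value back verbatim.

```c
#include <assert.h>

/* PSTATE flag bits as defined for arm64 (values from ptrace.h). */
#define PSR_F_BIT 0x00000040UL
#define PSR_I_BIT 0x00000080UL
#define PSR_A_BIT 0x00000100UL
#define PSR_D_BIT 0x00000200UL
#define DAIF_ALL  (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

static unsigned long daif;		/* stands in for the real DAIF register */

static void model_daif_mask(void)
{
	daif = DAIF_ALL;		/* msr daifset, #0xf */
}

static unsigned long model_daif_save(void)
{
	unsigned long flags = daif;	/* mrs %0, daif */

	model_daif_mask();
	return flags;
}

static void model_daif_restore(unsigned long flags)
{
	daif = flags;			/* msr daif, %0 */
}
```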
+2 -2
arch/arm64/include/asm/elf.h
··· 188 188 189 189 #define compat_start_thread compat_start_thread 190 190 /* 191 - * Unlike the native SET_PERSONALITY macro, the compat version inherits 192 - * READ_IMPLIES_EXEC across a fork() since this is the behaviour on 191 + * Unlike the native SET_PERSONALITY macro, the compat version maintains 192 + * READ_IMPLIES_EXEC across an execve() since this is the behaviour on 193 193 * arch/arm/. 194 194 */ 195 195 #define COMPAT_SET_PERSONALITY(ex) \
+2 -1
arch/arm64/include/asm/esr.h
··· 43 43 #define ESR_ELx_EC_HVC64 (0x16) 44 44 #define ESR_ELx_EC_SMC64 (0x17) 45 45 #define ESR_ELx_EC_SYS64 (0x18) 46 - /* Unallocated EC: 0x19 - 0x1E */ 46 + #define ESR_ELx_EC_SVE (0x19) 47 + /* Unallocated EC: 0x1A - 0x1E */ 47 48 #define ESR_ELx_EC_IMP_DEF (0x1f) 48 49 #define ESR_ELx_EC_IABT_LOW (0x20) 49 50 #define ESR_ELx_EC_IABT_CUR (0x21)
+70 -1
arch/arm64/include/asm/fpsimd.h
··· 17 17 #define __ASM_FP_H 18 18 19 19 #include <asm/ptrace.h> 20 + #include <asm/errno.h> 20 21 21 22 #ifndef __ASSEMBLY__ 23 + 24 + #include <linux/cache.h> 25 + #include <linux/stddef.h> 22 26 23 27 /* 24 28 * FP/SIMD storage area has: ··· 39 35 __uint128_t vregs[32]; 40 36 u32 fpsr; 41 37 u32 fpcr; 38 + /* 39 + * For ptrace compatibility, pad to next 128-bit 40 + * boundary here if extending this struct. 41 + */ 42 42 }; 43 43 }; 44 44 /* the id of the last cpu to have restored this state */ 45 45 unsigned int cpu; 46 46 }; 47 - 48 47 49 48 #if defined(__KERNEL__) && defined(CONFIG_COMPAT) 50 49 /* Masks for extracting the FPSR and FPCR from the FPSCR */ ··· 68 61 extern void fpsimd_thread_switch(struct task_struct *next); 69 62 extern void fpsimd_flush_thread(void); 70 63 64 + extern void fpsimd_signal_preserve_current_state(void); 71 65 extern void fpsimd_preserve_current_state(void); 72 66 extern void fpsimd_restore_current_state(void); 73 67 extern void fpsimd_update_current_state(struct fpsimd_state *state); 74 68 75 69 extern void fpsimd_flush_task_state(struct task_struct *target); 70 + extern void sve_flush_cpu_state(void); 71 + 72 + /* Maximum VL that SVE VL-agnostic software can transparently support */ 73 + #define SVE_VL_ARCH_MAX 0x100 74 + 75 + extern void sve_save_state(void *state, u32 *pfpsr); 76 + extern void sve_load_state(void const *state, u32 const *pfpsr, 77 + unsigned long vq_minus_1); 78 + extern unsigned int sve_get_vl(void); 79 + extern int sve_kernel_enable(void *); 80 + 81 + extern int __ro_after_init sve_max_vl; 82 + 83 + #ifdef CONFIG_ARM64_SVE 84 + 85 + extern size_t sve_state_size(struct task_struct const *task); 86 + 87 + extern void sve_alloc(struct task_struct *task); 88 + extern void fpsimd_release_task(struct task_struct *task); 89 + extern void fpsimd_sync_to_sve(struct task_struct *task); 90 + extern void sve_sync_to_fpsimd(struct task_struct *task); 91 + extern void sve_sync_from_fpsimd_zeropad(struct task_struct 
*task); 92 + 93 + extern int sve_set_vector_length(struct task_struct *task, 94 + unsigned long vl, unsigned long flags); 95 + 96 + extern int sve_set_current_vl(unsigned long arg); 97 + extern int sve_get_current_vl(void); 98 + 99 + /* 100 + * Probing and setup functions. 101 + * Calls to these functions must be serialised with one another. 102 + */ 103 + extern void __init sve_init_vq_map(void); 104 + extern void sve_update_vq_map(void); 105 + extern int sve_verify_vq_map(void); 106 + extern void __init sve_setup(void); 107 + 108 + #else /* ! CONFIG_ARM64_SVE */ 109 + 110 + static inline void sve_alloc(struct task_struct *task) { } 111 + static inline void fpsimd_release_task(struct task_struct *task) { } 112 + static inline void sve_sync_to_fpsimd(struct task_struct *task) { } 113 + static inline void sve_sync_from_fpsimd_zeropad(struct task_struct *task) { } 114 + 115 + static inline int sve_set_current_vl(unsigned long arg) 116 + { 117 + return -EINVAL; 118 + } 119 + 120 + static inline int sve_get_current_vl(void) 121 + { 122 + return -EINVAL; 123 + } 124 + 125 + static inline void sve_init_vq_map(void) { } 126 + static inline void sve_update_vq_map(void) { } 127 + static inline int sve_verify_vq_map(void) { return 0; } 128 + static inline void sve_setup(void) { } 129 + 130 + #endif /* ! CONFIG_ARM64_SVE */ 76 131 77 132 /* For use by EFI runtime services calls only */ 78 133 extern void __efi_fpsimd_begin(void);
+148
arch/arm64/include/asm/fpsimdmacros.h
··· 75 75 ldr w\tmpnr, [\state, #16 * 2 + 4] 76 76 fpsimd_restore_fpcr x\tmpnr, \state 77 77 .endm 78 + 79 + /* Sanity-check macros to help avoid encoding garbage instructions */ 80 + 81 + .macro _check_general_reg nr 82 + .if (\nr) < 0 || (\nr) > 30 83 + .error "Bad register number \nr." 84 + .endif 85 + .endm 86 + 87 + .macro _sve_check_zreg znr 88 + .if (\znr) < 0 || (\znr) > 31 89 + .error "Bad Scalable Vector Extension vector register number \znr." 90 + .endif 91 + .endm 92 + 93 + .macro _sve_check_preg pnr 94 + .if (\pnr) < 0 || (\pnr) > 15 95 + .error "Bad Scalable Vector Extension predicate register number \pnr." 96 + .endif 97 + .endm 98 + 99 + .macro _check_num n, min, max 100 + .if (\n) < (\min) || (\n) > (\max) 101 + .error "Number \n out of range [\min,\max]" 102 + .endif 103 + .endm 104 + 105 + /* SVE instruction encodings for non-SVE-capable assemblers */ 106 + 107 + /* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */ 108 + .macro _sve_str_v nz, nxbase, offset=0 109 + _sve_check_zreg \nz 110 + _check_general_reg \nxbase 111 + _check_num (\offset), -0x100, 0xff 112 + .inst 0xe5804000 \ 113 + | (\nz) \ 114 + | ((\nxbase) << 5) \ 115 + | (((\offset) & 7) << 10) \ 116 + | (((\offset) & 0x1f8) << 13) 117 + .endm 118 + 119 + /* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */ 120 + .macro _sve_ldr_v nz, nxbase, offset=0 121 + _sve_check_zreg \nz 122 + _check_general_reg \nxbase 123 + _check_num (\offset), -0x100, 0xff 124 + .inst 0x85804000 \ 125 + | (\nz) \ 126 + | ((\nxbase) << 5) \ 127 + | (((\offset) & 7) << 10) \ 128 + | (((\offset) & 0x1f8) << 13) 129 + .endm 130 + 131 + /* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */ 132 + .macro _sve_str_p np, nxbase, offset=0 133 + _sve_check_preg \np 134 + _check_general_reg \nxbase 135 + _check_num (\offset), -0x100, 0xff 136 + .inst 0xe5800000 \ 137 + | (\np) \ 138 + | ((\nxbase) << 5) \ 139 + | (((\offset) & 7) << 10) \ 140 + | (((\offset) & 0x1f8) << 13) 141 + .endm 142 + 143 + /* 
LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */ 144 + .macro _sve_ldr_p np, nxbase, offset=0 145 + _sve_check_preg \np 146 + _check_general_reg \nxbase 147 + _check_num (\offset), -0x100, 0xff 148 + .inst 0x85800000 \ 149 + | (\np) \ 150 + | ((\nxbase) << 5) \ 151 + | (((\offset) & 7) << 10) \ 152 + | (((\offset) & 0x1f8) << 13) 153 + .endm 154 + 155 + /* RDVL X\nx, #\imm */ 156 + .macro _sve_rdvl nx, imm 157 + _check_general_reg \nx 158 + _check_num (\imm), -0x20, 0x1f 159 + .inst 0x04bf5000 \ 160 + | (\nx) \ 161 + | (((\imm) & 0x3f) << 5) 162 + .endm 163 + 164 + /* RDFFR (unpredicated): RDFFR P\np.B */ 165 + .macro _sve_rdffr np 166 + _sve_check_preg \np 167 + .inst 0x2519f000 \ 168 + | (\np) 169 + .endm 170 + 171 + /* WRFFR P\np.B */ 172 + .macro _sve_wrffr np 173 + _sve_check_preg \np 174 + .inst 0x25289000 \ 175 + | ((\np) << 5) 176 + .endm 177 + 178 + .macro __for from:req, to:req 179 + .if (\from) == (\to) 180 + _for__body \from 181 + .else 182 + __for \from, (\from) + ((\to) - (\from)) / 2 183 + __for (\from) + ((\to) - (\from)) / 2 + 1, \to 184 + .endif 185 + .endm 186 + 187 + .macro _for var:req, from:req, to:req, insn:vararg 188 + .macro _for__body \var:req 189 + \insn 190 + .endm 191 + 192 + __for \from, \to 193 + 194 + .purgem _for__body 195 + .endm 196 + 197 + .macro sve_save nxbase, xpfpsr, nxtmp 198 + _for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34 199 + _for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16 200 + _sve_rdffr 0 201 + _sve_str_p 0, \nxbase 202 + _sve_ldr_p 0, \nxbase, -16 203 + 204 + mrs x\nxtmp, fpsr 205 + str w\nxtmp, [\xpfpsr] 206 + mrs x\nxtmp, fpcr 207 + str w\nxtmp, [\xpfpsr, #4] 208 + .endm 209 + 210 + .macro sve_load nxbase, xpfpsr, xvqminus1, nxtmp 211 + mrs_s x\nxtmp, SYS_ZCR_EL1 212 + bic x\nxtmp, x\nxtmp, ZCR_ELx_LEN_MASK 213 + orr x\nxtmp, x\nxtmp, \xvqminus1 214 + msr_s SYS_ZCR_EL1, x\nxtmp // self-synchronising 215 + 216 + _for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34 217 + _sve_ldr_p 0, \nxbase 218 + _sve_wrffr 0 
219 + _for n, 0, 15, _sve_ldr_p \n, \nxbase, \n - 16 220 + 221 + ldr w\nxtmp, [\xpfpsr] 222 + msr fpsr, x\nxtmp 223 + ldr w\nxtmp, [\xpfpsr, #4] 224 + msr fpcr, x\nxtmp 225 + .endm
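Each `_sve_*` macro above hand-assembles one instruction word by OR-ing the register numbers and the split 9-bit offset into a fixed opcode, so the file builds on assemblers without SVE support. The `_sve_str_v` arithmetic, transcribed to C for inspection (illustration only):

```c
#include <assert.h>
#include <stdint.h>

/* Encoding used by _sve_str_v: STR Z<nz>, [X<nxbase>, #offset, MUL VL].
 * Bits [2:0] of the signed 9-bit offset land at insn bits [12:10]; bits
 * [8:3] land at [21:16] (hence << 13 applied to the pre-shifted 0x1f8). */
static uint32_t sve_str_v_insn(unsigned int nz, unsigned int nxbase, int offset)
{
	assert(nz <= 31 && nxbase <= 30);	  /* _sve_check_zreg / _check_general_reg */
	assert(offset >= -0x100 && offset <= 0xff); /* _check_num */

	return 0xe5804000u
	       | nz
	       | (nxbase << 5)
	       | (((unsigned int)offset & 7) << 10)
	       | (((unsigned int)offset & 0x1f8) << 13);
}
```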
+13 -27
arch/arm64/include/asm/irqflags.h
··· 21 21 #include <asm/ptrace.h> 22 22 23 23 /* 24 + * AArch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and 25 + * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'dai' 26 + * order: 27 + * Masking debug exceptions causes all other exceptions to be masked too. 28 + * Masking SError masks irq, but not debug exceptions. Masking irqs has no 29 + * side effects for other flags. Keeping to this order makes it easier for 30 + * entry.S to know which exceptions should be unmasked. 31 + * 32 + * FIQ is never expected, but we mask it when we disable debug exceptions, and 33 + * unmask it at all other times. 34 + */ 35 + 36 + /* 24 37 * CPU interrupt mask handling. 25 38 */ 26 39 static inline unsigned long arch_local_irq_save(void) ··· 66 53 : "memory"); 67 54 } 68 55 69 - #define local_fiq_enable() asm("msr daifclr, #1" : : : "memory") 70 - #define local_fiq_disable() asm("msr daifset, #1" : : : "memory") 71 - 72 - #define local_async_enable() asm("msr daifclr, #4" : : : "memory") 73 - #define local_async_disable() asm("msr daifset, #4" : : : "memory") 74 - 75 56 /* 76 57 * Save the current interrupt enable state. 77 58 */ ··· 96 89 { 97 90 return flags & PSR_I_BIT; 98 91 } 99 - 100 - /* 101 - * save and restore debug state 102 - */ 103 - #define local_dbg_save(flags) \ 104 - do { \ 105 - typecheck(unsigned long, flags); \ 106 - asm volatile( \ 107 - "mrs %0, daif // local_dbg_save\n" \ 108 - "msr daifset, #8" \ 109 - : "=r" (flags) : : "memory"); \ 110 - } while (0) 111 - 112 - #define local_dbg_restore(flags) \ 113 - do { \ 114 - typecheck(unsigned long, flags); \ 115 - asm volatile( \ 116 - "msr daif, %0 // local_dbg_restore\n" \ 117 - : : "r" (flags) : "memory"); \ 118 - } while (0) 119 - 120 92 #endif 121 93 #endif
+4 -1
arch/arm64/include/asm/kvm_arm.h
··· 185 185 #define CPTR_EL2_TCPAC (1 << 31) 186 186 #define CPTR_EL2_TTA (1 << 20) 187 187 #define CPTR_EL2_TFP (1 << CPTR_EL2_TFP_SHIFT) 188 - #define CPTR_EL2_DEFAULT 0x000033ff 188 + #define CPTR_EL2_TZ (1 << 8) 189 + #define CPTR_EL2_RES1 0x000032ff /* known RES1 bits in CPTR_EL2 */ 190 + #define CPTR_EL2_DEFAULT CPTR_EL2_RES1 189 191 190 192 /* Hyp Debug Configuration Register bits */ 191 193 #define MDCR_EL2_TPMS (1 << 14) ··· 238 236 239 237 #define CPACR_EL1_FPEN (3 << 20) 240 238 #define CPACR_EL1_TTA (1 << 28) 239 + #define CPACR_EL1_DEFAULT (CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN) 241 240 242 241 #endif /* __ARM64_KVM_ARM_H__ */
+11
arch/arm64/include/asm/kvm_host.h
··· 25 25 #include <linux/types.h> 26 26 #include <linux/kvm_types.h> 27 27 #include <asm/cpufeature.h> 28 + #include <asm/fpsimd.h> 28 29 #include <asm/kvm.h> 29 30 #include <asm/kvm_asm.h> 30 31 #include <asm/kvm_mmio.h> ··· 383 382 384 383 WARN_ONCE(parange < 40, 385 384 "PARange is %d bits, unsupported configuration!", parange); 385 + } 386 + 387 + /* 388 + * All host FP/SIMD state is restored on guest exit, so nothing needs 389 + * doing here except in the SVE case: 390 + */ 391 + static inline void kvm_fpsimd_flush_cpu_state(void) 392 + { 393 + if (system_supports_sve()) 394 + sve_flush_cpu_state(); 386 395 } 387 396 388 397 #endif /* __ARM64_KVM_HOST_H__ */
-15
arch/arm64/include/asm/memory.h
··· 61 61 * KIMAGE_VADDR - the virtual address of the start of the kernel image 62 62 * VA_BITS - the maximum number of bits for virtual addresses. 63 63 * VA_START - the first kernel virtual address. 64 - * TASK_SIZE - the maximum size of a user space task. 65 - * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area. 66 64 */ 67 65 #define VA_BITS (CONFIG_ARM64_VA_BITS) 68 66 #define VA_START (UL(0xffffffffffffffff) - \ ··· 75 77 #define PCI_IO_END (VMEMMAP_START - SZ_2M) 76 78 #define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE) 77 79 #define FIXADDR_TOP (PCI_IO_START - SZ_2M) 78 - #define TASK_SIZE_64 (UL(1) << VA_BITS) 79 - 80 - #ifdef CONFIG_COMPAT 81 - #define TASK_SIZE_32 UL(0x100000000) 82 - #define TASK_SIZE (test_thread_flag(TIF_32BIT) ? \ 83 - TASK_SIZE_32 : TASK_SIZE_64) 84 - #define TASK_SIZE_OF(tsk) (test_tsk_thread_flag(tsk, TIF_32BIT) ? \ 85 - TASK_SIZE_32 : TASK_SIZE_64) 86 - #else 87 - #define TASK_SIZE TASK_SIZE_64 88 - #endif /* CONFIG_COMPAT */ 89 - 90 - #define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 4)) 91 80 92 81 #define KERNEL_START _text 93 82 #define KERNEL_END _end
+14
arch/arm64/include/asm/pgtable.h
··· 98 98 ((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == (PTE_VALID | PTE_UXN)) 99 99 #define pte_valid_young(pte) \ 100 100 ((pte_val(pte) & (PTE_VALID | PTE_AF)) == (PTE_VALID | PTE_AF)) 101 + #define pte_valid_user(pte) \ 102 + ((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) 101 103 102 104 /* 103 105 * Could the pte be present in the TLB? We must check mm_tlb_flush_pending ··· 108 106 */ 109 107 #define pte_accessible(mm, pte) \ 110 108 (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte)) 109 + 110 + /* 111 + * p??_access_permitted() is true for valid user mappings (subject to the 112 + * write permission check) other than user execute-only which do not have the 113 + * PTE_USER bit set. PROT_NONE mappings do not have the PTE_VALID bit set. 114 + */ 115 + #define pte_access_permitted(pte, write) \ 116 + (pte_valid_user(pte) && (!(write) || pte_write(pte))) 117 + #define pmd_access_permitted(pmd, write) \ 118 + (pte_access_permitted(pmd_pte(pmd), (write))) 119 + #define pud_access_permitted(pud, write) \ 120 + (pte_access_permitted(pud_pte(pud), (write))) 111 121 112 122 static inline pte_t clear_pte_bit(pte_t pte, pgprot_t prot) 113 123 {
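The shape of the new check, as a standalone C model (the bit positions here are placeholders, not the real arm64 PTE layout — only the logic mirrors the patch): user execute-only mappings fail because PTE_USER is clear, and PROT_NONE mappings fail because PTE_VALID is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define M_PTE_VALID (1u << 0)	/* placeholder bit positions */
#define M_PTE_USER  (1u << 1)
#define M_PTE_WRITE (1u << 2)

static bool model_pte_valid_user(uint32_t pte)
{
	return (pte & (M_PTE_VALID | M_PTE_USER)) == (M_PTE_VALID | M_PTE_USER);
}

/* Valid user mapping required; writability only checked for writes. */
static bool model_pte_access_permitted(uint32_t pte, bool write)
{
	return model_pte_valid_user(pte) && (!write || (pte & M_PTE_WRITE));
}
```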
+28
arch/arm64/include/asm/processor.h
··· 19 19 #ifndef __ASM_PROCESSOR_H 20 20 #define __ASM_PROCESSOR_H 21 21 22 + #define TASK_SIZE_64 (UL(1) << VA_BITS) 23 + 24 + #ifndef __ASSEMBLY__ 25 + 22 26 /* 23 27 * Default implementation of macro that returns current 24 28 * instruction pointer ("program counter"). ··· 40 36 #include <asm/pgtable-hwdef.h> 41 37 #include <asm/ptrace.h> 42 38 #include <asm/types.h> 39 + 40 + /* 41 + * TASK_SIZE - the maximum size of a user space task. 42 + * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area. 43 + */ 44 + #ifdef CONFIG_COMPAT 45 + #define TASK_SIZE_32 UL(0x100000000) 46 + #define TASK_SIZE (test_thread_flag(TIF_32BIT) ? \ 47 + TASK_SIZE_32 : TASK_SIZE_64) 48 + #define TASK_SIZE_OF(tsk) (test_tsk_thread_flag(tsk, TIF_32BIT) ? \ 49 + TASK_SIZE_32 : TASK_SIZE_64) 50 + #else 51 + #define TASK_SIZE TASK_SIZE_64 52 + #endif /* CONFIG_COMPAT */ 53 + 54 + #define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 4)) 43 55 44 56 #define STACK_TOP_MAX TASK_SIZE_64 45 57 #ifdef CONFIG_COMPAT ··· 105 85 unsigned long tp2_value; 106 86 #endif 107 87 struct fpsimd_state fpsimd_state; 88 + void *sve_state; /* SVE registers, if any */ 89 + unsigned int sve_vl; /* SVE vector length */ 90 + unsigned int sve_vl_onexec; /* SVE vl after next exec */ 108 91 unsigned long fault_address; /* fault info */ 109 92 unsigned long fault_code; /* ESR_EL1 value */ 110 93 struct debug_info debug; /* debugging */ ··· 217 194 int cpu_enable_pan(void *__unused); 218 195 int cpu_enable_cache_maint_trap(void *__unused); 219 196 197 + /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */ 198 + #define SVE_SET_VL(arg) sve_set_current_vl(arg) 199 + #define SVE_GET_VL() sve_get_current_vl() 200 + 201 + #endif /* __ASSEMBLY__ */ 220 202 #endif /* __ASM_PROCESSOR_H */
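The relocated TASK_UNMAPPED_BASE definition is plain arithmetic: page-align one quarter of the task size. A userspace sketch of that computation (assuming 4 KiB pages and the 48-bit VA_BITS default; both are configuration choices, not fixed by this patch):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE	4096ULL
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

#define TASK_SIZE_64	(1ULL << 48)	/* VA_BITS = 48 */
#define TASK_SIZE_32	(1ULL << 32)	/* 4 GiB for compat tasks */

/* Lower boundary of the mmap VM area for a given task size. */
static uint64_t task_unmapped_base(uint64_t task_size)
{
	return PAGE_ALIGN(task_size / 4);
}
```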
+121
arch/arm64/include/asm/sysreg.h
··· 145 145 146 146 #define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0) 147 147 #define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1) 148 + #define SYS_ID_AA64ZFR0_EL1 sys_reg(3, 0, 0, 4, 4) 148 149 149 150 #define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0) 150 151 #define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1) 152 + 153 + #define SYS_ID_AA64AFR0_EL1 sys_reg(3, 0, 0, 5, 4) 154 + #define SYS_ID_AA64AFR1_EL1 sys_reg(3, 0, 0, 5, 5) 151 155 152 156 #define SYS_ID_AA64ISAR0_EL1 sys_reg(3, 0, 0, 6, 0) 153 157 #define SYS_ID_AA64ISAR1_EL1 sys_reg(3, 0, 0, 6, 1) ··· 164 160 #define SYS_ACTLR_EL1 sys_reg(3, 0, 1, 0, 1) 165 161 #define SYS_CPACR_EL1 sys_reg(3, 0, 1, 0, 2) 166 162 163 + #define SYS_ZCR_EL1 sys_reg(3, 0, 1, 2, 0) 164 + 167 165 #define SYS_TTBR0_EL1 sys_reg(3, 0, 2, 0, 0) 168 166 #define SYS_TTBR1_EL1 sys_reg(3, 0, 2, 0, 1) 169 167 #define SYS_TCR_EL1 sys_reg(3, 0, 2, 0, 2) ··· 177 171 #define SYS_ESR_EL1 sys_reg(3, 0, 5, 2, 0) 178 172 #define SYS_FAR_EL1 sys_reg(3, 0, 6, 0, 0) 179 173 #define SYS_PAR_EL1 sys_reg(3, 0, 7, 4, 0) 174 + 175 + /*** Statistical Profiling Extension ***/ 176 + /* ID registers */ 177 + #define SYS_PMSIDR_EL1 sys_reg(3, 0, 9, 9, 7) 178 + #define SYS_PMSIDR_EL1_FE_SHIFT 0 179 + #define SYS_PMSIDR_EL1_FT_SHIFT 1 180 + #define SYS_PMSIDR_EL1_FL_SHIFT 2 181 + #define SYS_PMSIDR_EL1_ARCHINST_SHIFT 3 182 + #define SYS_PMSIDR_EL1_LDS_SHIFT 4 183 + #define SYS_PMSIDR_EL1_ERND_SHIFT 5 184 + #define SYS_PMSIDR_EL1_INTERVAL_SHIFT 8 185 + #define SYS_PMSIDR_EL1_INTERVAL_MASK 0xfUL 186 + #define SYS_PMSIDR_EL1_MAXSIZE_SHIFT 12 187 + #define SYS_PMSIDR_EL1_MAXSIZE_MASK 0xfUL 188 + #define SYS_PMSIDR_EL1_COUNTSIZE_SHIFT 16 189 + #define SYS_PMSIDR_EL1_COUNTSIZE_MASK 0xfUL 190 + 191 + #define SYS_PMBIDR_EL1 sys_reg(3, 0, 9, 10, 7) 192 + #define SYS_PMBIDR_EL1_ALIGN_SHIFT 0 193 + #define SYS_PMBIDR_EL1_ALIGN_MASK 0xfU 194 + #define SYS_PMBIDR_EL1_P_SHIFT 4 195 + #define SYS_PMBIDR_EL1_F_SHIFT 5 196 + 197 + /* Sampling controls */ 198 + #define 
SYS_PMSCR_EL1 sys_reg(3, 0, 9, 9, 0) 199 + #define SYS_PMSCR_EL1_E0SPE_SHIFT 0 200 + #define SYS_PMSCR_EL1_E1SPE_SHIFT 1 201 + #define SYS_PMSCR_EL1_CX_SHIFT 3 202 + #define SYS_PMSCR_EL1_PA_SHIFT 4 203 + #define SYS_PMSCR_EL1_TS_SHIFT 5 204 + #define SYS_PMSCR_EL1_PCT_SHIFT 6 205 + 206 + #define SYS_PMSCR_EL2 sys_reg(3, 4, 9, 9, 0) 207 + #define SYS_PMSCR_EL2_E0HSPE_SHIFT 0 208 + #define SYS_PMSCR_EL2_E2SPE_SHIFT 1 209 + #define SYS_PMSCR_EL2_CX_SHIFT 3 210 + #define SYS_PMSCR_EL2_PA_SHIFT 4 211 + #define SYS_PMSCR_EL2_TS_SHIFT 5 212 + #define SYS_PMSCR_EL2_PCT_SHIFT 6 213 + 214 + #define SYS_PMSICR_EL1 sys_reg(3, 0, 9, 9, 2) 215 + 216 + #define SYS_PMSIRR_EL1 sys_reg(3, 0, 9, 9, 3) 217 + #define SYS_PMSIRR_EL1_RND_SHIFT 0 218 + #define SYS_PMSIRR_EL1_INTERVAL_SHIFT 8 219 + #define SYS_PMSIRR_EL1_INTERVAL_MASK 0xffffffUL 220 + 221 + /* Filtering controls */ 222 + #define SYS_PMSFCR_EL1 sys_reg(3, 0, 9, 9, 4) 223 + #define SYS_PMSFCR_EL1_FE_SHIFT 0 224 + #define SYS_PMSFCR_EL1_FT_SHIFT 1 225 + #define SYS_PMSFCR_EL1_FL_SHIFT 2 226 + #define SYS_PMSFCR_EL1_B_SHIFT 16 227 + #define SYS_PMSFCR_EL1_LD_SHIFT 17 228 + #define SYS_PMSFCR_EL1_ST_SHIFT 18 229 + 230 + #define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5) 231 + #define SYS_PMSEVFR_EL1_RES0 0x0000ffff00ff0f55UL 232 + 233 + #define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6) 234 + #define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0 235 + 236 + /* Buffer controls */ 237 + #define SYS_PMBLIMITR_EL1 sys_reg(3, 0, 9, 10, 0) 238 + #define SYS_PMBLIMITR_EL1_E_SHIFT 0 239 + #define SYS_PMBLIMITR_EL1_FM_SHIFT 1 240 + #define SYS_PMBLIMITR_EL1_FM_MASK 0x3UL 241 + #define SYS_PMBLIMITR_EL1_FM_STOP_IRQ (0 << SYS_PMBLIMITR_EL1_FM_SHIFT) 242 + 243 + #define SYS_PMBPTR_EL1 sys_reg(3, 0, 9, 10, 1) 244 + 245 + /* Buffer error reporting */ 246 + #define SYS_PMBSR_EL1 sys_reg(3, 0, 9, 10, 3) 247 + #define SYS_PMBSR_EL1_COLL_SHIFT 16 248 + #define SYS_PMBSR_EL1_S_SHIFT 17 249 + #define SYS_PMBSR_EL1_EA_SHIFT 18 250 + #define SYS_PMBSR_EL1_DL_SHIFT 19 
251 + #define SYS_PMBSR_EL1_EC_SHIFT 26 252 + #define SYS_PMBSR_EL1_EC_MASK 0x3fUL 253 + 254 + #define SYS_PMBSR_EL1_EC_BUF (0x0UL << SYS_PMBSR_EL1_EC_SHIFT) 255 + #define SYS_PMBSR_EL1_EC_FAULT_S1 (0x24UL << SYS_PMBSR_EL1_EC_SHIFT) 256 + #define SYS_PMBSR_EL1_EC_FAULT_S2 (0x25UL << SYS_PMBSR_EL1_EC_SHIFT) 257 + 258 + #define SYS_PMBSR_EL1_FAULT_FSC_SHIFT 0 259 + #define SYS_PMBSR_EL1_FAULT_FSC_MASK 0x3fUL 260 + 261 + #define SYS_PMBSR_EL1_BUF_BSC_SHIFT 0 262 + #define SYS_PMBSR_EL1_BUF_BSC_MASK 0x3fUL 263 + 264 + #define SYS_PMBSR_EL1_BUF_BSC_FULL (0x1UL << SYS_PMBSR_EL1_BUF_BSC_SHIFT) 265 + 266 + /*** End of Statistical Profiling Extension ***/ 180 267 181 268 #define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1) 182 269 #define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2) ··· 349 250 350 251 #define SYS_PMCCFILTR_EL0 sys_reg (3, 3, 14, 15, 7) 351 252 253 + #define SYS_ZCR_EL2 sys_reg(3, 4, 1, 2, 0) 254 + 352 255 #define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0) 353 256 #define SYS_IFSR32_EL2 sys_reg(3, 4, 5, 0, 1) 354 257 #define SYS_FPEXC32_EL2 sys_reg(3, 4, 5, 3, 0) ··· 419 318 #define SCTLR_EL1_CP15BEN (1 << 5) 420 319 421 320 /* id_aa64isar0 */ 321 + #define ID_AA64ISAR0_DP_SHIFT 44 322 + #define ID_AA64ISAR0_SM4_SHIFT 40 323 + #define ID_AA64ISAR0_SM3_SHIFT 36 324 + #define ID_AA64ISAR0_SHA3_SHIFT 32 422 325 #define ID_AA64ISAR0_RDM_SHIFT 28 423 326 #define ID_AA64ISAR0_ATOMICS_SHIFT 20 424 327 #define ID_AA64ISAR0_CRC32_SHIFT 16 ··· 437 332 #define ID_AA64ISAR1_DPB_SHIFT 0 438 333 439 334 /* id_aa64pfr0 */ 335 + #define ID_AA64PFR0_SVE_SHIFT 32 440 336 #define ID_AA64PFR0_GIC_SHIFT 24 441 337 #define ID_AA64PFR0_ASIMD_SHIFT 20 442 338 #define ID_AA64PFR0_FP_SHIFT 16 ··· 446 340 #define ID_AA64PFR0_EL1_SHIFT 4 447 341 #define ID_AA64PFR0_EL0_SHIFT 0 448 342 343 + #define ID_AA64PFR0_SVE 0x1 449 344 #define ID_AA64PFR0_FP_NI 0xf 450 345 #define ID_AA64PFR0_FP_SUPPORTED 0x0 451 346 #define ID_AA64PFR0_ASIMD_NI 0xf ··· 546 439 #define ID_AA64MMFR0_TGRAN_SHIFT 
ID_AA64MMFR0_TGRAN64_SHIFT 547 440 #define ID_AA64MMFR0_TGRAN_SUPPORTED ID_AA64MMFR0_TGRAN64_SUPPORTED 548 441 #endif 442 + 443 + 444 + /* 445 + * The ZCR_ELx_LEN_* definitions intentionally include bits [8:4] which 446 + * are reserved by the SVE architecture for future expansion of the LEN 447 + * field, with compatible semantics. 448 + */ 449 + #define ZCR_ELx_LEN_SHIFT 0 450 + #define ZCR_ELx_LEN_SIZE 9 451 + #define ZCR_ELx_LEN_MASK 0x1ff 452 + 453 + #define CPACR_EL1_ZEN_EL1EN (1 << 16) /* enable EL1 access */ 454 + #define CPACR_EL1_ZEN_EL0EN (1 << 17) /* enable EL0 access, if EL1EN set */ 455 + #define CPACR_EL1_ZEN (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN) 549 456 550 457 551 458 /* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
+5
arch/arm64/include/asm/thread_info.h
··· 63 63 void arch_setup_new_exec(void); 64 64 #define arch_setup_new_exec arch_setup_new_exec 65 65 66 + void arch_release_task_struct(struct task_struct *tsk); 67 + 66 68 #endif 67 69 68 70 /* ··· 94 92 #define TIF_RESTORE_SIGMASK 20 95 93 #define TIF_SINGLESTEP 21 96 94 #define TIF_32BIT 22 /* 32bit process */ 95 + #define TIF_SVE 23 /* Scalable Vector Extension in use */ 96 + #define TIF_SVE_VL_INHERIT 24 /* Inherit sve_vl_onexec across exec */ 97 97 98 98 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) 99 99 #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) ··· 109 105 #define _TIF_UPROBE (1 << TIF_UPROBE) 110 106 #define _TIF_FSCHECK (1 << TIF_FSCHECK) 111 107 #define _TIF_32BIT (1 << TIF_32BIT) 108 + #define _TIF_SVE (1 << TIF_SVE) 112 109 113 110 #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ 114 111 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
+8
arch/arm64/include/asm/traps.h
··· 34 34 35 35 void register_undef_hook(struct undef_hook *hook); 36 36 void unregister_undef_hook(struct undef_hook *hook); 37 + void force_signal_inject(int signal, int code, struct pt_regs *regs, 38 + unsigned long address); 37 39 38 40 void arm64_notify_segfault(struct pt_regs *regs, unsigned long addr); 41 + 42 + /* 43 + * Move regs->pc to next instruction and do necessary setup before it 44 + * is executed. 45 + */ 46 + void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size); 39 47 40 48 static inline int __in_irqentry_text(unsigned long ptr) 41 49 {
+6
arch/arm64/include/uapi/asm/hwcap.h
··· 37 37 #define HWCAP_FCMA (1 << 14) 38 38 #define HWCAP_LRCPC (1 << 15) 39 39 #define HWCAP_DCPOP (1 << 16) 40 + #define HWCAP_SHA3 (1 << 17) 41 + #define HWCAP_SM3 (1 << 18) 42 + #define HWCAP_SM4 (1 << 19) 43 + #define HWCAP_ASIMDDP (1 << 20) 44 + #define HWCAP_SHA512 (1 << 21) 45 + #define HWCAP_SVE (1 << 22) 40 46 41 47 #endif /* _UAPI__ASM_HWCAP_H */
+138 -1
arch/arm64/include/uapi/asm/ptrace.h
··· 23 23 #include <linux/types.h> 24 24 25 25 #include <asm/hwcap.h> 26 + #include <asm/sigcontext.h> 26 27 27 28 28 29 /* ··· 48 47 #define PSR_D_BIT 0x00000200 49 48 #define PSR_PAN_BIT 0x00400000 50 49 #define PSR_UAO_BIT 0x00800000 51 - #define PSR_Q_BIT 0x08000000 52 50 #define PSR_V_BIT 0x10000000 53 51 #define PSR_C_BIT 0x20000000 54 52 #define PSR_Z_BIT 0x40000000 ··· 63 63 64 64 65 65 #ifndef __ASSEMBLY__ 66 + 67 + #include <linux/prctl.h> 66 68 67 69 /* 68 70 * User structures for general purpose, floating point and debug registers. ··· 92 90 __u32 pad; 93 91 } dbg_regs[16]; 94 92 }; 93 + 94 + /* SVE/FP/SIMD state (NT_ARM_SVE) */ 95 + 96 + struct user_sve_header { 97 + __u32 size; /* total meaningful regset content in bytes */ 98 + __u32 max_size; /* maximum possible size for this thread */ 99 + __u16 vl; /* current vector length */ 100 + __u16 max_vl; /* maximum possible vector length */ 101 + __u16 flags; 102 + __u16 __reserved; 103 + }; 104 + 105 + /* Definitions for user_sve_header.flags: */ 106 + #define SVE_PT_REGS_MASK (1 << 0) 107 + 108 + #define SVE_PT_REGS_FPSIMD 0 109 + #define SVE_PT_REGS_SVE SVE_PT_REGS_MASK 110 + 111 + /* 112 + * Common SVE_PT_* flags: 113 + * These must be kept in sync with prctl interface in <linux/prctl.h> 114 + */ 115 + #define SVE_PT_VL_INHERIT (PR_SVE_VL_INHERIT >> 16) 116 + #define SVE_PT_VL_ONEXEC (PR_SVE_SET_VL_ONEXEC >> 16) 117 + 118 + 119 + /* 120 + * The remainder of the SVE state follows struct user_sve_header. The 121 + * total size of the SVE state (including header) depends on the 122 + * metadata in the header: SVE_PT_SIZE(vq, flags) gives the total size 123 + * of the state in bytes, including the header. 124 + * 125 + * Refer to <asm/sigcontext.h> for details of how to pass the correct 126 + * "vq" argument to these macros. 
127 + */ 128 + 129 + /* Offset from the start of struct user_sve_header to the register data */ 130 + #define SVE_PT_REGS_OFFSET \ 131 + ((sizeof(struct sve_context) + (SVE_VQ_BYTES - 1)) \ 132 + / SVE_VQ_BYTES * SVE_VQ_BYTES) 133 + 134 + /* 135 + * The register data content and layout depends on the value of the 136 + * flags field. 137 + */ 138 + 139 + /* 140 + * (flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD case: 141 + * 142 + * The payload starts at offset SVE_PT_FPSIMD_OFFSET, and is of type 143 + * struct user_fpsimd_state. Additional data might be appended in the 144 + * future: use SVE_PT_FPSIMD_SIZE(vq, flags) to compute the total size. 145 + * SVE_PT_FPSIMD_SIZE(vq, flags) will never be less than 146 + * sizeof(struct user_fpsimd_state). 147 + */ 148 + 149 + #define SVE_PT_FPSIMD_OFFSET SVE_PT_REGS_OFFSET 150 + 151 + #define SVE_PT_FPSIMD_SIZE(vq, flags) (sizeof(struct user_fpsimd_state)) 152 + 153 + /* 154 + * (flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE case: 155 + * 156 + * The payload starts at offset SVE_PT_SVE_OFFSET, and is of size 157 + * SVE_PT_SVE_SIZE(vq, flags). 158 + * 159 + * Additional macros describe the contents and layout of the payload. 160 + * For each, SVE_PT_SVE_x_OFFSET(args) is the start offset relative to 161 + * the start of struct user_sve_header, and SVE_PT_SVE_x_SIZE(args) is 162 + * the size in bytes: 163 + * 164 + * x type description 165 + * - ---- ----------- 166 + * ZREGS \ 167 + * ZREG | 168 + * PREGS | refer to <asm/sigcontext.h> 169 + * PREG | 170 + * FFR / 171 + * 172 + * FPSR uint32_t FPSR 173 + * FPCR uint32_t FPCR 174 + * 175 + * Additional data might be appended in the future. 
176 + */ 177 + 178 + #define SVE_PT_SVE_ZREG_SIZE(vq) SVE_SIG_ZREG_SIZE(vq) 179 + #define SVE_PT_SVE_PREG_SIZE(vq) SVE_SIG_PREG_SIZE(vq) 180 + #define SVE_PT_SVE_FFR_SIZE(vq) SVE_SIG_FFR_SIZE(vq) 181 + #define SVE_PT_SVE_FPSR_SIZE sizeof(__u32) 182 + #define SVE_PT_SVE_FPCR_SIZE sizeof(__u32) 183 + 184 + #define __SVE_SIG_TO_PT(offset) \ 185 + ((offset) - SVE_SIG_REGS_OFFSET + SVE_PT_REGS_OFFSET) 186 + 187 + #define SVE_PT_SVE_OFFSET SVE_PT_REGS_OFFSET 188 + 189 + #define SVE_PT_SVE_ZREGS_OFFSET \ 190 + __SVE_SIG_TO_PT(SVE_SIG_ZREGS_OFFSET) 191 + #define SVE_PT_SVE_ZREG_OFFSET(vq, n) \ 192 + __SVE_SIG_TO_PT(SVE_SIG_ZREG_OFFSET(vq, n)) 193 + #define SVE_PT_SVE_ZREGS_SIZE(vq) \ 194 + (SVE_PT_SVE_ZREG_OFFSET(vq, SVE_NUM_ZREGS) - SVE_PT_SVE_ZREGS_OFFSET) 195 + 196 + #define SVE_PT_SVE_PREGS_OFFSET(vq) \ 197 + __SVE_SIG_TO_PT(SVE_SIG_PREGS_OFFSET(vq)) 198 + #define SVE_PT_SVE_PREG_OFFSET(vq, n) \ 199 + __SVE_SIG_TO_PT(SVE_SIG_PREG_OFFSET(vq, n)) 200 + #define SVE_PT_SVE_PREGS_SIZE(vq) \ 201 + (SVE_PT_SVE_PREG_OFFSET(vq, SVE_NUM_PREGS) - \ 202 + SVE_PT_SVE_PREGS_OFFSET(vq)) 203 + 204 + #define SVE_PT_SVE_FFR_OFFSET(vq) \ 205 + __SVE_SIG_TO_PT(SVE_SIG_FFR_OFFSET(vq)) 206 + 207 + #define SVE_PT_SVE_FPSR_OFFSET(vq) \ 208 + ((SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq) + \ 209 + (SVE_VQ_BYTES - 1)) \ 210 + / SVE_VQ_BYTES * SVE_VQ_BYTES) 211 + #define SVE_PT_SVE_FPCR_OFFSET(vq) \ 212 + (SVE_PT_SVE_FPSR_OFFSET(vq) + SVE_PT_SVE_FPSR_SIZE) 213 + 214 + /* 215 + * Any future extension appended after FPCR must be aligned to the next 216 + * 128-bit boundary. 217 + */ 218 + 219 + #define SVE_PT_SVE_SIZE(vq, flags) \ 220 + ((SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE \ 221 + - SVE_PT_SVE_OFFSET + (SVE_VQ_BYTES - 1)) \ 222 + / SVE_VQ_BYTES * SVE_VQ_BYTES) 223 + 224 + #define SVE_PT_SIZE(vq, flags) \ 225 + (((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? 
\ 226 + SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \ 227 + : SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags)) 95 228 96 229 #endif /* __ASSEMBLY__ */ 97 230
+119 -1
arch/arm64/include/uapi/asm/sigcontext.h
··· 17 17 #ifndef _UAPI__ASM_SIGCONTEXT_H 18 18 #define _UAPI__ASM_SIGCONTEXT_H 19 19 20 + #ifndef __ASSEMBLY__ 21 + 20 22 #include <linux/types.h> 21 23 22 24 /* ··· 44 42 * 45 43 * 0x210 fpsimd_context 46 44 * 0x10 esr_context 45 + * 0x8a0 sve_context (vl <= 64) (optional) 47 46 * 0x20 extra_context (optional) 48 47 * 0x10 terminator (null _aarch64_ctx) 49 48 * 50 - * 0xdb0 (reserved for future allocation) 49 + * 0x510 (reserved for future allocation) 51 50 * 52 51 * New records that can exceed this space need to be opt-in for userspace, so 53 52 * that an expanded signal frame is not generated unexpectedly. The mechanism ··· 119 116 __u32 size; /* size in bytes of the extra space */ 120 117 __u32 __reserved[3]; 121 118 }; 119 + 120 + #define SVE_MAGIC 0x53564501 121 + 122 + struct sve_context { 123 + struct _aarch64_ctx head; 124 + __u16 vl; 125 + __u16 __reserved[3]; 126 + }; 127 + 128 + #endif /* !__ASSEMBLY__ */ 129 + 130 + /* 131 + * The SVE architecture leaves space for future expansion of the 132 + * vector length beyond its initial architectural limit of 2048 bits 133 + * (16 quadwords). 134 + * 135 + * See linux/Documentation/arm64/sve.txt for a description of the VL/VQ 136 + * terminology. 
137 + */ 138 + #define SVE_VQ_BYTES 16 /* number of bytes per quadword */ 139 + 140 + #define SVE_VQ_MIN 1 141 + #define SVE_VQ_MAX 512 142 + 143 + #define SVE_VL_MIN (SVE_VQ_MIN * SVE_VQ_BYTES) 144 + #define SVE_VL_MAX (SVE_VQ_MAX * SVE_VQ_BYTES) 145 + 146 + #define SVE_NUM_ZREGS 32 147 + #define SVE_NUM_PREGS 16 148 + 149 + #define sve_vl_valid(vl) \ 150 + ((vl) % SVE_VQ_BYTES == 0 && (vl) >= SVE_VL_MIN && (vl) <= SVE_VL_MAX) 151 + #define sve_vq_from_vl(vl) ((vl) / SVE_VQ_BYTES) 152 + #define sve_vl_from_vq(vq) ((vq) * SVE_VQ_BYTES) 153 + 154 + /* 155 + * If the SVE registers are currently live for the thread at signal delivery, 156 + * sve_context.head.size >= 157 + * SVE_SIG_CONTEXT_SIZE(sve_vq_from_vl(sve_context.vl)) 158 + * and the register data may be accessed using the SVE_SIG_*() macros. 159 + * 160 + * If sve_context.head.size < 161 + * SVE_SIG_CONTEXT_SIZE(sve_vq_from_vl(sve_context.vl)), 162 + * the SVE registers were not live for the thread and no register data 163 + * is included: in this case, the SVE_SIG_*() macros should not be 164 + * used except for this check. 165 + * 166 + * The same convention applies when returning from a signal: a caller 167 + * will need to remove or resize the sve_context block if it wants to 168 + * make the SVE registers live when they were previously non-live or 169 + * vice-versa. This may require the caller to allocate fresh 170 + * memory and/or move other context blocks in the signal frame. 171 + * 172 + * Changing the vector length during signal return is not permitted: 173 + * sve_context.vl must equal the thread's current vector length when 174 + * doing a sigreturn. 175 + * 176 + * 177 + * Note: for all these macros, the "vq" argument denotes the SVE 178 + * vector length in quadwords (i.e., units of 128 bits). 179 + * 180 + * The correct way to obtain vq is to use sve_vq_from_vl(vl). The 181 + * result is valid if and only if sve_vl_valid(vl) is true. 
This is 182 + * guaranteed for a struct sve_context written by the kernel. 183 + * 184 + * 185 + * Additional macros describe the contents and layout of the payload. 186 + * For each, SVE_SIG_x_OFFSET(args) is the start offset relative to 187 + * the start of struct sve_context, and SVE_SIG_x_SIZE(args) is the 188 + * size in bytes: 189 + * 190 + * x type description 191 + * - ---- ----------- 192 + * REGS the entire SVE context 193 + * 194 + * ZREGS __uint128_t[SVE_NUM_ZREGS][vq] all Z-registers 195 + * ZREG __uint128_t[vq] individual Z-register Zn 196 + * 197 + * PREGS uint16_t[SVE_NUM_PREGS][vq] all P-registers 198 + * PREG uint16_t[vq] individual P-register Pn 199 + * 200 + * FFR uint16_t[vq] first-fault status register 201 + * 202 + * Additional data might be appended in the future. 203 + */ 204 + 205 + #define SVE_SIG_ZREG_SIZE(vq) ((__u32)(vq) * SVE_VQ_BYTES) 206 + #define SVE_SIG_PREG_SIZE(vq) ((__u32)(vq) * (SVE_VQ_BYTES / 8)) 207 + #define SVE_SIG_FFR_SIZE(vq) SVE_SIG_PREG_SIZE(vq) 208 + 209 + #define SVE_SIG_REGS_OFFSET \ 210 + ((sizeof(struct sve_context) + (SVE_VQ_BYTES - 1)) \ 211 + / SVE_VQ_BYTES * SVE_VQ_BYTES) 212 + 213 + #define SVE_SIG_ZREGS_OFFSET SVE_SIG_REGS_OFFSET 214 + #define SVE_SIG_ZREG_OFFSET(vq, n) \ 215 + (SVE_SIG_ZREGS_OFFSET + SVE_SIG_ZREG_SIZE(vq) * (n)) 216 + #define SVE_SIG_ZREGS_SIZE(vq) \ 217 + (SVE_SIG_ZREG_OFFSET(vq, SVE_NUM_ZREGS) - SVE_SIG_ZREGS_OFFSET) 218 + 219 + #define SVE_SIG_PREGS_OFFSET(vq) \ 220 + (SVE_SIG_ZREGS_OFFSET + SVE_SIG_ZREGS_SIZE(vq)) 221 + #define SVE_SIG_PREG_OFFSET(vq, n) \ 222 + (SVE_SIG_PREGS_OFFSET(vq) + SVE_SIG_PREG_SIZE(vq) * (n)) 223 + #define SVE_SIG_PREGS_SIZE(vq) \ 224 + (SVE_SIG_PREG_OFFSET(vq, SVE_NUM_PREGS) - SVE_SIG_PREGS_OFFSET(vq)) 225 + 226 + #define SVE_SIG_FFR_OFFSET(vq) \ 227 + (SVE_SIG_PREGS_OFFSET(vq) + SVE_SIG_PREGS_SIZE(vq)) 228 + 229 + #define SVE_SIG_REGS_SIZE(vq) \ 230 + (SVE_SIG_FFR_OFFSET(vq) + SVE_SIG_FFR_SIZE(vq) - SVE_SIG_REGS_OFFSET) 231 + 232 + #define 
SVE_SIG_CONTEXT_SIZE(vq) (SVE_SIG_REGS_OFFSET + SVE_SIG_REGS_SIZE(vq)) 233 + 122 234 123 235 #endif /* _UAPI__ASM_SIGCONTEXT_H */
-2
arch/arm64/kernel/Makefile
··· 11 11 CFLAGS_REMOVE_insn.o = -pg 12 12 CFLAGS_REMOVE_return_address.o = -pg 13 13 14 - CFLAGS_setup.o = -DUTS_MACHINE='"$(UTS_MACHINE)"' 15 - 16 14 # Object file lists. 17 15 arm64-obj-y := debug-monitors.o entry.o irq.o fpsimd.o \ 18 16 entry-fpsimd.o process.o ptrace.o setup.o signal.o \
+7 -16
arch/arm64/kernel/armv8_deprecated.c
··· 228 228 return ret; 229 229 } 230 230 231 - static struct ctl_table ctl_abi[] = { 232 - { 233 - .procname = "abi", 234 - .mode = 0555, 235 - }, 236 - { } 237 - }; 238 - 239 - static void __init register_insn_emulation_sysctl(struct ctl_table *table) 231 + static void __init register_insn_emulation_sysctl(void) 240 232 { 241 233 unsigned long flags; 242 234 int i = 0; ··· 254 262 } 255 263 raw_spin_unlock_irqrestore(&insn_emulation_lock, flags); 256 264 257 - table->child = insns_sysctl; 258 - register_sysctl_table(table); 265 + register_sysctl("abi", insns_sysctl); 259 266 } 260 267 261 268 /* ··· 422 431 pr_warn_ratelimited("\"%s\" (%ld) uses obsolete SWP{B} instruction at 0x%llx\n", 423 432 current->comm, (unsigned long)current->pid, regs->pc); 424 433 425 - regs->pc += 4; 434 + arm64_skip_faulting_instruction(regs, 4); 426 435 return 0; 427 436 428 437 fault: ··· 503 512 pr_warn_ratelimited("\"%s\" (%ld) uses deprecated CP15 Barrier instruction at 0x%llx\n", 504 513 current->comm, (unsigned long)current->pid, regs->pc); 505 514 506 - regs->pc += 4; 515 + arm64_skip_faulting_instruction(regs, 4); 507 516 return 0; 508 517 } 509 518 ··· 577 586 static int a32_setend_handler(struct pt_regs *regs, u32 instr) 578 587 { 579 588 int rc = compat_setend_handler(regs, (instr >> 9) & 1); 580 - regs->pc += 4; 589 + arm64_skip_faulting_instruction(regs, 4); 581 590 return rc; 582 591 } 583 592 584 593 static int t16_setend_handler(struct pt_regs *regs, u32 instr) 585 594 { 586 595 int rc = compat_setend_handler(regs, (instr >> 3) & 1); 587 - regs->pc += 2; 596 + arm64_skip_faulting_instruction(regs, 2); 588 597 return rc; 589 598 } 590 599 ··· 635 644 cpuhp_setup_state_nocalls(CPUHP_AP_ARM64_ISNDEP_STARTING, 636 645 "arm64/isndep:starting", 637 646 run_all_insn_set_hw_mode, NULL); 638 - register_insn_emulation_sysctl(ctl_abi); 647 + register_insn_emulation_sysctl(); 639 648 640 649 return 0; 641 650 }
+141 -63
arch/arm64/kernel/cpufeature.c
··· 27 27 #include <asm/cpu.h> 28 28 #include <asm/cpufeature.h> 29 29 #include <asm/cpu_ops.h> 30 + #include <asm/fpsimd.h> 30 31 #include <asm/mmu_context.h> 31 32 #include <asm/processor.h> 32 33 #include <asm/sysreg.h> ··· 51 50 52 51 DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS); 53 52 EXPORT_SYMBOL(cpu_hwcaps); 53 + 54 + /* 55 + * Flag to indicate if we have computed the system wide 56 + * capabilities based on the boot time active CPUs. This 57 + * will be used to determine if a new booting CPU should 58 + * go through the verification process to make sure that it 59 + * supports the system capabilities, without using a hotplug 60 + * notifier. 61 + */ 62 + static bool sys_caps_initialised; 63 + 64 + static inline void set_sys_caps_initialised(void) 65 + { 66 + sys_caps_initialised = true; 67 + } 54 68 55 69 static int dump_cpu_hwcaps(struct notifier_block *self, unsigned long v, void *p) 56 70 { ··· 123 107 * sync with the documentation of the CPU feature register ABI. 124 108 */ 125 109 static const struct arm64_ftr_bits ftr_id_aa64isar0[] = { 126 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR0_RDM_SHIFT, 4, 0), 110 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_DP_SHIFT, 4, 0), 111 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM4_SHIFT, 4, 0), 112 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM3_SHIFT, 4, 0), 113 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SHA3_SHIFT, 4, 0), 114 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_RDM_SHIFT, 4, 0), 127 115 ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_ATOMICS_SHIFT, 4, 0), 128 116 ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_CRC32_SHIFT, 4, 0), 129 117 ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SHA2_SHIFT, 4, 0), ··· 137 117 }; 138 118 139 119 static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { 
140 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_LRCPC_SHIFT, 4, 0), 141 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_FCMA_SHIFT, 4, 0), 142 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0), 143 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_DPB_SHIFT, 4, 0), 120 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_LRCPC_SHIFT, 4, 0), 121 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FCMA_SHIFT, 4, 0), 122 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0), 123 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_DPB_SHIFT, 4, 0), 144 124 ARM64_FTR_END, 145 125 }; 146 126 147 127 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { 148 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0), 128 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0), 129 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0), 149 130 S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI), 150 131 S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI), 151 132 /* Linux doesn't care about the EL3 */ 152 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64PFR0_EL3_SHIFT, 4, 0), 153 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL2_SHIFT, 4, 0), 154 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL1_SHIFT, 4, ID_AA64PFR0_EL1_64BIT_ONLY), 155 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL0_SHIFT, 4, ID_AA64PFR0_EL0_64BIT_ONLY), 133 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL3_SHIFT, 4, 0), 134 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL2_SHIFT, 4, 0), 135 + ARM64_FTR_BITS(FTR_HIDDEN, 
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_SHIFT, 4, ID_AA64PFR0_EL1_64BIT_ONLY), 136 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL0_SHIFT, 4, ID_AA64PFR0_EL0_64BIT_ONLY), 156 137 ARM64_FTR_END, 157 138 }; 158 139 159 140 static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = { 160 - S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN4_SHIFT, 4, ID_AA64MMFR0_TGRAN4_NI), 161 - S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN64_SHIFT, 4, ID_AA64MMFR0_TGRAN64_NI), 162 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN16_SHIFT, 4, ID_AA64MMFR0_TGRAN16_NI), 163 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_BIGENDEL0_SHIFT, 4, 0), 141 + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_TGRAN4_SHIFT, 4, ID_AA64MMFR0_TGRAN4_NI), 142 + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_TGRAN64_SHIFT, 4, ID_AA64MMFR0_TGRAN64_NI), 143 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_TGRAN16_SHIFT, 4, ID_AA64MMFR0_TGRAN16_NI), 144 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_BIGENDEL0_SHIFT, 4, 0), 164 145 /* Linux shouldn't care about secure memory */ 165 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64MMFR0_SNSMEM_SHIFT, 4, 0), 166 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_BIGENDEL_SHIFT, 4, 0), 167 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_ASID_SHIFT, 4, 0), 146 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_SNSMEM_SHIFT, 4, 0), 147 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_BIGENDEL_SHIFT, 4, 0), 148 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_ASID_SHIFT, 4, 0), 168 149 /* 169 150 * Differing PARange is fine as long as all peripherals and memory are mapped 170 151 * within the minimum PARange of all CPUs ··· 176 155 177 156 static const struct 
arm64_ftr_bits ftr_id_aa64mmfr1[] = { 178 157 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_PAN_SHIFT, 4, 0), 179 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_LOR_SHIFT, 4, 0), 180 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_HPD_SHIFT, 4, 0), 181 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_VHE_SHIFT, 4, 0), 182 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_VMIDBITS_SHIFT, 4, 0), 183 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_HADBS_SHIFT, 4, 0), 158 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_LOR_SHIFT, 4, 0), 159 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_HPD_SHIFT, 4, 0), 160 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_VHE_SHIFT, 4, 0), 161 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_VMIDBITS_SHIFT, 4, 0), 162 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_HADBS_SHIFT, 4, 0), 184 163 ARM64_FTR_END, 185 164 }; 186 165 187 166 static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = { 188 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_LVA_SHIFT, 4, 0), 189 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_IESB_SHIFT, 4, 0), 190 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_LSM_SHIFT, 4, 0), 191 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_UAO_SHIFT, 4, 0), 192 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_CNP_SHIFT, 4, 0), 167 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LVA_SHIFT, 4, 0), 168 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_IESB_SHIFT, 4, 0), 169 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LSM_SHIFT, 4, 0), 170 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_UAO_SHIFT, 4, 0), 171 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, 
FTR_LOWER_SAFE, ID_AA64MMFR2_CNP_SHIFT, 4, 0), 193 172 ARM64_FTR_END, 194 173 }; 195 174 ··· 214 193 }; 215 194 216 195 static const struct arm64_ftr_bits ftr_id_mmfr0[] = { 217 - S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 28, 4, 0xf), /* InnerShr */ 218 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 24, 4, 0), /* FCSE */ 196 + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 28, 4, 0xf), /* InnerShr */ 197 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 24, 4, 0), /* FCSE */ 219 198 ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, 20, 4, 0), /* AuxReg */ 220 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 16, 4, 0), /* TCM */ 221 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 12, 4, 0), /* ShareLvl */ 222 - S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 8, 4, 0xf), /* OuterShr */ 223 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 4, 4, 0), /* PMSA */ 224 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 0, 4, 0), /* VMSA */ 199 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 0), /* TCM */ 200 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 12, 4, 0), /* ShareLvl */ 201 + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 8, 4, 0xf), /* OuterShr */ 202 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), /* PMSA */ 203 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* VMSA */ 225 204 ARM64_FTR_END, 226 205 }; 227 206 ··· 242 221 }; 243 222 244 223 static const struct arm64_ftr_bits ftr_mvfr2[] = { 245 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 4, 4, 0), /* FPMisc */ 246 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 0, 4, 0), /* SIMDMisc */ 224 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), /* FPMisc */ 225 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* SIMDMisc */ 247 226 ARM64_FTR_END, 248 227 }; 249 228 ··· 255 234 256 235 257 236 static const struct arm64_ftr_bits ftr_id_isar5[] = 
{ 258 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_RDM_SHIFT, 4, 0), 259 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_CRC32_SHIFT, 4, 0), 260 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_SHA2_SHIFT, 4, 0), 261 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_SHA1_SHIFT, 4, 0), 262 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_AES_SHIFT, 4, 0), 263 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_ISAR5_SEVL_SHIFT, 4, 0), 237 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_RDM_SHIFT, 4, 0), 238 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_CRC32_SHIFT, 4, 0), 239 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_SHA2_SHIFT, 4, 0), 240 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_SHA1_SHIFT, 4, 0), 241 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_AES_SHIFT, 4, 0), 242 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_ISAR5_SEVL_SHIFT, 4, 0), 264 243 ARM64_FTR_END, 265 244 }; 266 245 267 246 static const struct arm64_ftr_bits ftr_id_mmfr4[] = { 268 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 4, 4, 0), /* ac2 */ 247 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), /* ac2 */ 269 248 ARM64_FTR_END, 270 249 }; 271 250 272 251 static const struct arm64_ftr_bits ftr_id_pfr0[] = { 273 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 12, 4, 0), /* State3 */ 274 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 8, 4, 0), /* State2 */ 275 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 4, 4, 0), /* State1 */ 276 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, 0, 4, 0), /* State0 */ 252 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 12, 4, 0), /* State3 */ 253 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 8, 4, 0), /* State2 */ 254 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), /* State1 */ 255 + 
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* State0 */ 277 256 ARM64_FTR_END, 278 257 }; 279 258 ··· 286 265 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 8, 4, 0), 287 266 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), 288 267 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), 268 + ARM64_FTR_END, 269 + }; 270 + 271 + static const struct arm64_ftr_bits ftr_zcr[] = { 272 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, 273 + ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0), /* LEN */ 289 274 ARM64_FTR_END, 290 275 }; 291 276 ··· 361 334 /* Op1 = 0, CRn = 0, CRm = 4 */ 362 335 ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0), 363 336 ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_raz), 337 + ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_raz), 364 338 365 339 /* Op1 = 0, CRn = 0, CRm = 5 */ 366 340 ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0), ··· 375 347 ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), 376 348 ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1), 377 349 ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2), 350 + 351 + /* Op1 = 0, CRn = 1, CRm = 2 */ 352 + ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr), 378 353 379 354 /* Op1 = 3, CRn = 0, CRm = 0 */ 380 355 { SYS_CTR_EL0, &arm64_ftr_reg_ctrel0 }, ··· 516 485 init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2); 517 486 init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0); 518 487 init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1); 488 + init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0); 519 489 520 490 if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) { 521 491 init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0); ··· 537 505 init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2); 538 506 } 539 507 508 + if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { 509 + init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr); 510 + sve_init_vq_map(); 511 + } 540 512 } 541 513 542 514 static void update_cpu_ftr_reg(struct arm64_ftr_reg 
*reg, u64 new) ··· 644 608 taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu, 645 609 info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1); 646 610 611 + taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu, 612 + info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0); 613 + 647 614 /* 648 615 * If we have AArch32, we care about 32-bit features for compat. 649 616 * If the system doesn't support AArch32, don't update them. ··· 692 653 info->reg_mvfr1, boot->reg_mvfr1); 693 654 taint |= check_update_ftr_reg(SYS_MVFR2_EL1, cpu, 694 655 info->reg_mvfr2, boot->reg_mvfr2); 656 + } 657 + 658 + if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { 659 + taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu, 660 + info->reg_zcr, boot->reg_zcr); 661 + 662 + /* Probe vector lengths, unless we already gave up on SVE */ 663 + if (id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1)) && 664 + !sys_caps_initialised) 665 + sve_update_vq_map(); 695 666 } 696 667 697 668 /* ··· 949 900 .min_field_value = 1, 950 901 }, 951 902 #endif 903 + #ifdef CONFIG_ARM64_SVE 904 + { 905 + .desc = "Scalable Vector Extension", 906 + .capability = ARM64_SVE, 907 + .def_scope = SCOPE_SYSTEM, 908 + .sys_reg = SYS_ID_AA64PFR0_EL1, 909 + .sign = FTR_UNSIGNED, 910 + .field_pos = ID_AA64PFR0_SVE_SHIFT, 911 + .min_field_value = ID_AA64PFR0_SVE, 912 + .matches = has_cpuid_feature, 913 + .enable = sve_kernel_enable, 914 + }, 915 + #endif /* CONFIG_ARM64_SVE */ 952 916 {}, 953 917 }; 954 918 ··· 983 921 HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_AES), 984 922 HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SHA1), 985 923 HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SHA2), 924 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, HWCAP_SHA512), 986 925 HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_CRC32_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_CRC32), 987 926 
HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_ATOMICS_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, HWCAP_ATOMICS), 988 927 HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_RDM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDRDM), 928 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SHA3), 929 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM3), 930 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM4_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM4), 931 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_DP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDDP), 989 932 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_FP), 990 933 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, HWCAP_FPHP), 991 934 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_ASIMD), ··· 999 932 HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_JSCVT), 1000 933 HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_FCMA), 1001 934 HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_LRCPC), 935 + #ifdef CONFIG_ARM64_SVE 936 + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_SVE_SHIFT, FTR_UNSIGNED, ID_AA64PFR0_SVE, CAP_HWCAP, HWCAP_SVE), 937 + #endif 1002 938 {}, 1003 939 }; 1004 940 ··· 1111 1041 } 1112 1042 1113 1043 /* 1114 - * Flag to indicate if we have computed the system wide 1115 - * capabilities based on the boot time active CPUs. This 1116 - * will be used to determine if a new booting CPU should 1117 - * go through the verification process to make sure that it 1118 - * supports the system capabilities, without using a hotplug 1119 - * notifier. 
1120 - */ 1121 - static bool sys_caps_initialised; 1122 - 1123 - static inline void set_sys_caps_initialised(void) 1124 - { 1125 - sys_caps_initialised = true; 1126 - } 1127 - 1128 - /* 1129 1044 * Check for CPU features that are used in early boot 1130 1045 * based on the Boot CPU value. 1131 1046 */ ··· 1152 1097 } 1153 1098 } 1154 1099 1100 + static void verify_sve_features(void) 1101 + { 1102 + u64 safe_zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1); 1103 + u64 zcr = read_zcr_features(); 1104 + 1105 + unsigned int safe_len = safe_zcr & ZCR_ELx_LEN_MASK; 1106 + unsigned int len = zcr & ZCR_ELx_LEN_MASK; 1107 + 1108 + if (len < safe_len || sve_verify_vq_map()) { 1109 + pr_crit("CPU%d: SVE: required vector length(s) missing\n", 1110 + smp_processor_id()); 1111 + cpu_die_early(); 1112 + } 1113 + 1114 + /* Add checks on other ZCR bits here if necessary */ 1115 + } 1116 + 1155 1117 /* 1156 1118 * Run through the enabled system capabilities and enable() it on this CPU. 1157 1119 * The capabilities were decided based on the available CPUs at the boot time. ··· 1182 1110 verify_local_cpu_errata_workarounds(); 1183 1111 verify_local_cpu_features(arm64_features); 1184 1112 verify_local_elf_hwcaps(arm64_elf_hwcaps); 1113 + 1185 1114 if (system_supports_32bit_el0()) 1186 1115 verify_local_elf_hwcaps(compat_elf_hwcaps); 1116 + 1117 + if (system_supports_sve()) 1118 + verify_sve_features(); 1187 1119 } 1188 1120 1189 1121 void check_local_cpu_capabilities(void) ··· 1264 1188 1265 1189 if (system_supports_32bit_el0()) 1266 1190 setup_elf_hwcaps(compat_elf_hwcaps); 1191 + 1192 + sve_setup(); 1267 1193 1268 1194 /* Advertise that we have computed the system capabilities */ 1269 1195 set_sys_caps_initialised(); ··· 1365 1287 if (!rc) { 1366 1288 dst = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); 1367 1289 pt_regs_write_reg(regs, dst, val); 1368 - regs->pc += 4; 1290 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 1369 1291 } 1370 1292 1371 1293 return rc;
+12
arch/arm64/kernel/cpuinfo.c
··· 19 19 #include <asm/cpu.h> 20 20 #include <asm/cputype.h> 21 21 #include <asm/cpufeature.h> 22 + #include <asm/fpsimd.h> 22 23 23 24 #include <linux/bitops.h> 24 25 #include <linux/bug.h> ··· 70 69 "fcma", 71 70 "lrcpc", 72 71 "dcpop", 72 + "sha3", 73 + "sm3", 74 + "sm4", 75 + "asimddp", 76 + "sha512", 77 + "sve", 73 78 NULL 74 79 }; 75 80 ··· 333 326 info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1); 334 327 info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1); 335 328 info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1); 329 + info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1); 336 330 337 331 /* Update the 32bit ID registers only if AArch32 is implemented */ 338 332 if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) { ··· 355 347 info->reg_mvfr1 = read_cpuid(MVFR1_EL1); 356 348 info->reg_mvfr2 = read_cpuid(MVFR2_EL1); 357 349 } 350 + 351 + if (IS_ENABLED(CONFIG_ARM64_SVE) && 352 + id_aa64pfr0_sve(info->reg_id_aa64pfr0)) 353 + info->reg_zcr = read_zcr_features(); 358 354 359 355 cpuinfo_detect_icache_policy(info); 360 356 }
+3 -2
arch/arm64/kernel/debug-monitors.c
··· 30 30 31 31 #include <asm/cpufeature.h> 32 32 #include <asm/cputype.h> 33 + #include <asm/daifflags.h> 33 34 #include <asm/debug-monitors.h> 34 35 #include <asm/system_misc.h> 35 36 ··· 47 46 static void mdscr_write(u32 mdscr) 48 47 { 49 48 unsigned long flags; 50 - local_dbg_save(flags); 49 + flags = local_daif_save(); 51 50 write_sysreg(mdscr, mdscr_el1); 52 - local_dbg_restore(flags); 51 + local_daif_restore(flags); 53 52 } 54 53 NOKPROBE_SYMBOL(mdscr_write); 55 54
+17
arch/arm64/kernel/entry-fpsimd.S
··· 41 41 fpsimd_restore x0, 8 42 42 ret 43 43 ENDPROC(fpsimd_load_state) 44 + 45 + #ifdef CONFIG_ARM64_SVE 46 + ENTRY(sve_save_state) 47 + sve_save 0, x1, 2 48 + ret 49 + ENDPROC(sve_save_state) 50 + 51 + ENTRY(sve_load_state) 52 + sve_load 0, x1, x2, 3 53 + ret 54 + ENDPROC(sve_load_state) 55 + 56 + ENTRY(sve_get_vl) 57 + _sve_rdvl 0, 1 58 + ret 59 + ENDPROC(sve_get_vl) 60 + #endif /* CONFIG_ARM64_SVE */
+3 -9
arch/arm64/kernel/entry-ftrace.S
··· 108 108 mcount_get_lr x1 // function's lr (= parent's pc) 109 109 blr x2 // (*ftrace_trace_function)(pc, lr); 110 110 111 - #ifndef CONFIG_FUNCTION_GRAPH_TRACER 112 - skip_ftrace_call: // return; 113 - mcount_exit // } 114 - #else 115 - mcount_exit // return; 116 - // } 117 - skip_ftrace_call: 111 + skip_ftrace_call: // } 112 + #ifdef CONFIG_FUNCTION_GRAPH_TRACER 118 113 ldr_l x2, ftrace_graph_return 119 114 cmp x0, x2 // if ((ftrace_graph_return 120 115 b.ne ftrace_graph_caller // != ftrace_stub) ··· 118 123 adr_l x0, ftrace_graph_entry_stub // != ftrace_graph_entry_stub)) 119 124 cmp x0, x2 120 125 b.ne ftrace_graph_caller // ftrace_graph_caller(); 121 - 122 - mcount_exit 123 126 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ 127 + mcount_exit 124 128 ENDPROC(_mcount) 125 129 126 130 #else /* CONFIG_DYNAMIC_FTRACE */
+88 -40
arch/arm64/kernel/entry.S
··· 28 28 #include <asm/errno.h> 29 29 #include <asm/esr.h> 30 30 #include <asm/irq.h> 31 - #include <asm/memory.h> 31 + #include <asm/processor.h> 32 32 #include <asm/ptrace.h> 33 33 #include <asm/thread_info.h> 34 34 #include <asm/asm-uaccess.h> ··· 221 221 222 222 .macro kernel_exit, el 223 223 .if \el != 0 224 + disable_daif 225 + 224 226 /* Restore the task's original addr_limit. */ 225 227 ldr x20, [sp, #S_ORIG_ADDR_LIMIT] 226 228 str x20, [tsk, #TSK_TI_ADDR_LIMIT] ··· 375 373 kernel_ventry el1_sync // Synchronous EL1h 376 374 kernel_ventry el1_irq // IRQ EL1h 377 375 kernel_ventry el1_fiq_invalid // FIQ EL1h 378 - kernel_ventry el1_error_invalid // Error EL1h 376 + kernel_ventry el1_error // Error EL1h 379 377 380 378 kernel_ventry el0_sync // Synchronous 64-bit EL0 381 379 kernel_ventry el0_irq // IRQ 64-bit EL0 382 380 kernel_ventry el0_fiq_invalid // FIQ 64-bit EL0 383 - kernel_ventry el0_error_invalid // Error 64-bit EL0 381 + kernel_ventry el0_error // Error 64-bit EL0 384 382 385 383 #ifdef CONFIG_COMPAT 386 384 kernel_ventry el0_sync_compat // Synchronous 32-bit EL0 387 385 kernel_ventry el0_irq_compat // IRQ 32-bit EL0 388 386 kernel_ventry el0_fiq_invalid_compat // FIQ 32-bit EL0 389 - kernel_ventry el0_error_invalid_compat // Error 32-bit EL0 387 + kernel_ventry el0_error_compat // Error 32-bit EL0 390 388 #else 391 389 kernel_ventry el0_sync_invalid // Synchronous 32-bit EL0 392 390 kernel_ventry el0_irq_invalid // IRQ 32-bit EL0 ··· 455 453 el0_fiq_invalid_compat: 456 454 inv_entry 0, BAD_FIQ, 32 457 455 ENDPROC(el0_fiq_invalid_compat) 458 - 459 - el0_error_invalid_compat: 460 - inv_entry 0, BAD_ERROR, 32 461 - ENDPROC(el0_error_invalid_compat) 462 456 #endif 463 457 464 458 el1_sync_invalid: ··· 506 508 * Data abort handling 507 509 */ 508 510 mrs x3, far_el1 509 - enable_dbg 510 - // re-enable interrupts if they were enabled in the aborted context 511 - tbnz x23, #7, 1f // PSR_I_BIT 512 - enable_irq 513 - 1: 511 + inherit_daif pstate=x23, 
tmp=x2 514 512 clear_address_tag x0, x3 515 513 mov x2, sp // struct pt_regs 516 514 bl do_mem_abort 517 515 518 - // disable interrupts before pulling preserved data off the stack 519 - disable_irq 520 516 kernel_exit 1 521 517 el1_sp_pc: 522 518 /* 523 519 * Stack or PC alignment exception handling 524 520 */ 525 521 mrs x0, far_el1 526 - enable_dbg 522 + inherit_daif pstate=x23, tmp=x2 527 523 mov x2, sp 528 524 bl do_sp_pc_abort 529 525 ASM_BUG() ··· 525 533 /* 526 534 * Undefined instruction 527 535 */ 528 - enable_dbg 536 + inherit_daif pstate=x23, tmp=x2 529 537 mov x0, sp 530 538 bl do_undefinstr 531 539 ASM_BUG() ··· 542 550 kernel_exit 1 543 551 el1_inv: 544 552 // TODO: add support for undefined instructions in kernel mode 545 - enable_dbg 553 + inherit_daif pstate=x23, tmp=x2 546 554 mov x0, sp 547 555 mov x2, x1 548 556 mov x1, #BAD_SYNC ··· 553 561 .align 6 554 562 el1_irq: 555 563 kernel_entry 1 556 - enable_dbg 564 + enable_da_f 557 565 #ifdef CONFIG_TRACE_IRQFLAGS 558 566 bl trace_hardirqs_off 559 567 #endif ··· 599 607 b.eq el0_ia 600 608 cmp x24, #ESR_ELx_EC_FP_ASIMD // FP/ASIMD access 601 609 b.eq el0_fpsimd_acc 610 + cmp x24, #ESR_ELx_EC_SVE // SVE access 611 + b.eq el0_sve_acc 602 612 cmp x24, #ESR_ELx_EC_FP_EXC64 // FP/ASIMD exception 603 613 b.eq el0_fpsimd_exc 604 614 cmp x24, #ESR_ELx_EC_SYS64 // configurable trap ··· 652 658 /* 653 659 * AArch32 syscall handling 654 660 */ 661 + ldr x16, [tsk, #TSK_TI_FLAGS] // load thread flags 655 662 adrp stbl, compat_sys_call_table // load compat syscall table pointer 656 663 mov wscno, w7 // syscall number in w7 (r7) 657 664 mov wsc_nr, #__NR_compat_syscalls ··· 662 667 el0_irq_compat: 663 668 kernel_entry 0, 32 664 669 b el0_irq_naked 670 + 671 + el0_error_compat: 672 + kernel_entry 0, 32 673 + b el0_error_naked 665 674 #endif 666 675 667 676 el0_da: ··· 673 674 * Data abort handling 674 675 */ 675 676 mrs x26, far_el1 676 - // enable interrupts before calling the main handler 677 - 
enable_dbg_and_irq 677 + enable_daif 678 678 ct_user_exit 679 679 clear_address_tag x0, x26 680 680 mov x1, x25 ··· 685 687 * Instruction abort handling 686 688 */ 687 689 mrs x26, far_el1 688 - // enable interrupts before calling the main handler 689 - enable_dbg_and_irq 690 + enable_daif 690 691 ct_user_exit 691 692 mov x0, x26 692 693 mov x1, x25 ··· 696 699 /* 697 700 * Floating Point or Advanced SIMD access 698 701 */ 699 - enable_dbg 702 + enable_daif 700 703 ct_user_exit 701 704 mov x0, x25 702 705 mov x1, sp 703 706 bl do_fpsimd_acc 704 707 b ret_to_user 708 + el0_sve_acc: 709 + /* 710 + * Scalable Vector Extension access 711 + */ 712 + enable_daif 713 + ct_user_exit 714 + mov x0, x25 715 + mov x1, sp 716 + bl do_sve_acc 717 + b ret_to_user 705 718 el0_fpsimd_exc: 706 719 /* 707 - * Floating Point or Advanced SIMD exception 720 + * Floating Point, Advanced SIMD or SVE exception 708 721 */ 709 - enable_dbg 722 + enable_daif 710 723 ct_user_exit 711 724 mov x0, x25 712 725 mov x1, sp ··· 727 720 * Stack or PC alignment exception handling 728 721 */ 729 722 mrs x26, far_el1 730 - // enable interrupts before calling the main handler 731 - enable_dbg_and_irq 723 + enable_daif 732 724 ct_user_exit 733 725 mov x0, x26 734 726 mov x1, x25 ··· 738 732 /* 739 733 * Undefined instruction 740 734 */ 741 - // enable interrupts before calling the main handler 742 - enable_dbg_and_irq 735 + enable_daif 743 736 ct_user_exit 744 737 mov x0, sp 745 738 bl do_undefinstr ··· 747 742 /* 748 743 * System instructions, for trapped cache maintenance instructions 749 744 */ 750 - enable_dbg_and_irq 745 + enable_daif 751 746 ct_user_exit 752 747 mov x0, x25 753 748 mov x1, sp ··· 762 757 mov x1, x25 763 758 mov x2, sp 764 759 bl do_debug_exception 765 - enable_dbg 760 + enable_daif 766 761 ct_user_exit 767 762 b ret_to_user 768 763 el0_inv: 769 - enable_dbg 764 + enable_daif 770 765 ct_user_exit 771 766 mov x0, sp 772 767 mov x1, #BAD_SYNC ··· 779 774 el0_irq: 780 775 kernel_entry 0 
781 776 el0_irq_naked: 782 - enable_dbg 777 + enable_da_f 783 778 #ifdef CONFIG_TRACE_IRQFLAGS 784 779 bl trace_hardirqs_off 785 780 #endif ··· 793 788 b ret_to_user 794 789 ENDPROC(el0_irq) 795 790 791 + el1_error: 792 + kernel_entry 1 793 + mrs x1, esr_el1 794 + enable_dbg 795 + mov x0, sp 796 + bl do_serror 797 + kernel_exit 1 798 + ENDPROC(el1_error) 799 + 800 + el0_error: 801 + kernel_entry 0 802 + el0_error_naked: 803 + mrs x1, esr_el1 804 + enable_dbg 805 + mov x0, sp 806 + bl do_serror 807 + enable_daif 808 + ct_user_exit 809 + b ret_to_user 810 + ENDPROC(el0_error) 811 + 812 + 796 813 /* 797 814 * This is the fast syscall return path. We do as little as possible here, 798 815 * and this includes saving x0 back into the kernel stack. 799 816 */ 800 817 ret_fast_syscall: 801 - disable_irq // disable interrupts 818 + disable_daif 802 819 str x0, [sp, #S_X0] // returned x0 803 820 ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing 804 821 and x2, x1, #_TIF_SYSCALL_WORK ··· 830 803 enable_step_tsk x1, x2 831 804 kernel_exit 0 832 805 ret_fast_syscall_trace: 833 - enable_irq // enable interrupts 806 + enable_daif 834 807 b __sys_trace_return_skipped // we already saved x0 835 808 836 809 /* ··· 848 821 * "slow" syscall return path. 
849 822 */ 850 823 ret_to_user: 851 - disable_irq // disable interrupts 824 + disable_daif 852 825 ldr x1, [tsk, #TSK_TI_FLAGS] 853 826 and x2, x1, #_TIF_WORK_MASK 854 827 cbnz x2, work_pending ··· 862 835 */ 863 836 .align 6 864 837 el0_svc: 838 + ldr x16, [tsk, #TSK_TI_FLAGS] // load thread flags 865 839 adrp stbl, sys_call_table // load syscall table pointer 866 840 mov wscno, w8 // syscall number in w8 867 841 mov wsc_nr, #__NR_syscalls 842 + 843 + #ifdef CONFIG_ARM64_SVE 844 + alternative_if_not ARM64_SVE 845 + b el0_svc_naked 846 + alternative_else_nop_endif 847 + tbz x16, #TIF_SVE, el0_svc_naked // Skip unless TIF_SVE set: 848 + bic x16, x16, #_TIF_SVE // discard SVE state 849 + str x16, [tsk, #TSK_TI_FLAGS] 850 + 851 + /* 852 + * task_fpsimd_load() won't be called to update CPACR_EL1 in 853 + * ret_to_user unless TIF_FOREIGN_FPSTATE is still set, which only 854 + * happens if a context switch or kernel_neon_begin() or context 855 + * modification (sigreturn, ptrace) intervenes. 856 + * So, ensure that CPACR_EL1 is already correct for the fast-path case: 857 + */ 858 + mrs x9, cpacr_el1 859 + bic x9, x9, #CPACR_EL1_ZEN_EL0EN // disable SVE for el0 860 + msr cpacr_el1, x9 // synchronised by eret to el0 861 + #endif 862 + 868 863 el0_svc_naked: // compat entry point 869 864 stp x0, xscno, [sp, #S_ORIG_X0] // save the original x0 and syscall number 870 - enable_dbg_and_irq 865 + enable_daif 871 866 ct_user_exit 1 872 867 873 - ldr x16, [tsk, #TSK_TI_FLAGS] // check for syscall hooks 874 - tst x16, #_TIF_SYSCALL_WORK 868 + tst x16, #_TIF_SYSCALL_WORK // check for syscall hooks 875 869 b.ne __sys_trace 876 870 cmp wscno, wsc_nr // check upper syscall limit 877 871 b.hs ni_sys
+881 -27
arch/arm64/kernel/fpsimd.c
··· 17 17 * along with this program. If not, see <http://www.gnu.org/licenses/>. 18 18 */ 19 19 20 + #include <linux/bitmap.h> 20 21 #include <linux/bottom_half.h> 22 + #include <linux/bug.h> 23 + #include <linux/cache.h> 24 + #include <linux/compat.h> 21 25 #include <linux/cpu.h> 22 26 #include <linux/cpu_pm.h> 23 27 #include <linux/kernel.h> 28 + #include <linux/linkage.h> 29 + #include <linux/irqflags.h> 24 30 #include <linux/init.h> 25 31 #include <linux/percpu.h> 32 + #include <linux/prctl.h> 26 33 #include <linux/preempt.h> 34 + #include <linux/prctl.h> 35 + #include <linux/ptrace.h> 27 36 #include <linux/sched/signal.h> 37 + #include <linux/sched/task_stack.h> 28 38 #include <linux/signal.h> 39 + #include <linux/slab.h> 40 + #include <linux/sysctl.h> 29 41 30 42 #include <asm/fpsimd.h> 31 43 #include <asm/cputype.h> 32 44 #include <asm/simd.h> 45 + #include <asm/sigcontext.h> 46 + #include <asm/sysreg.h> 47 + #include <asm/traps.h> 33 48 34 49 #define FPEXC_IOF (1 << 0) 35 50 #define FPEXC_DZF (1 << 1) ··· 54 39 #define FPEXC_IDF (1 << 7) 55 40 56 41 /* 42 + * (Note: in this discussion, statements about FPSIMD apply equally to SVE.) 43 + * 57 44 * In order to reduce the number of times the FPSIMD state is needlessly saved 58 45 * and restored, we need to keep track of two things: 59 46 * (a) for each task, we need to remember which CPU was the last one to have ··· 116 99 */ 117 100 static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state); 118 101 102 + /* Default VL for tasks that don't set it explicitly: */ 103 + static int sve_default_vl = -1; 104 + 105 + #ifdef CONFIG_ARM64_SVE 106 + 107 + /* Maximum supported vector length across all CPUs (initially poisoned) */ 108 + int __ro_after_init sve_max_vl = -1; 109 + /* Set of available vector lengths, as vq_to_bit(vq): */ 110 + static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); 111 + static void __percpu *efi_sve_state; 112 + 113 + #else /* ! 
CONFIG_ARM64_SVE */ 114 + 115 + /* Dummy declaration for code that will be optimised out: */ 116 + extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); 117 + extern void __percpu *efi_sve_state; 118 + 119 + #endif /* ! CONFIG_ARM64_SVE */ 120 + 121 + /* 122 + * Call __sve_free() directly only if you know task can't be scheduled 123 + * or preempted. 124 + */ 125 + static void __sve_free(struct task_struct *task) 126 + { 127 + kfree(task->thread.sve_state); 128 + task->thread.sve_state = NULL; 129 + } 130 + 131 + static void sve_free(struct task_struct *task) 132 + { 133 + WARN_ON(test_tsk_thread_flag(task, TIF_SVE)); 134 + 135 + __sve_free(task); 136 + } 137 + 138 + 139 + /* Offset of FFR in the SVE register dump */ 140 + static size_t sve_ffr_offset(int vl) 141 + { 142 + return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET; 143 + } 144 + 145 + static void *sve_pffr(struct task_struct *task) 146 + { 147 + return (char *)task->thread.sve_state + 148 + sve_ffr_offset(task->thread.sve_vl); 149 + } 150 + 151 + static void change_cpacr(u64 val, u64 mask) 152 + { 153 + u64 cpacr = read_sysreg(CPACR_EL1); 154 + u64 new = (cpacr & ~mask) | val; 155 + 156 + if (new != cpacr) 157 + write_sysreg(new, CPACR_EL1); 158 + } 159 + 160 + static void sve_user_disable(void) 161 + { 162 + change_cpacr(0, CPACR_EL1_ZEN_EL0EN); 163 + } 164 + 165 + static void sve_user_enable(void) 166 + { 167 + change_cpacr(CPACR_EL1_ZEN_EL0EN, CPACR_EL1_ZEN_EL0EN); 168 + } 169 + 170 + /* 171 + * TIF_SVE controls whether a task can use SVE without trapping while 172 + * in userspace, and also the way a task's FPSIMD/SVE state is stored 173 + * in thread_struct. 174 + * 175 + * The kernel uses this flag to track whether a user task is actively 176 + * using SVE, and therefore whether full SVE register state needs to 177 + * be tracked. If not, the cheaper FPSIMD context handling code can 178 + * be used instead of the more costly SVE equivalents. 
179 + *
180 + * * TIF_SVE set:
181 + *
182 + * The task can execute SVE instructions while in userspace without
183 + * trapping to the kernel.
184 + *
185 + * When stored, Z0-Z31 (incorporating Vn in bits[127:0] or the
186 + * corresponding Zn), P0-P15 and FFR are encoded in
187 + * task->thread.sve_state, formatted appropriately for vector
188 + * length task->thread.sve_vl.
189 + *
190 + * task->thread.sve_state must point to a valid buffer at least
191 + * sve_state_size(task) bytes in size.
192 + *
193 + * During any syscall, the kernel may optionally clear TIF_SVE and
194 + * discard the vector state except for the FPSIMD subset.
195 + *
196 + * * TIF_SVE clear:
197 + *
198 + * An attempt by the user task to execute an SVE instruction causes
199 + * do_sve_acc() to be called, which does some preparation and then
200 + * sets TIF_SVE.
201 + *
202 + * When stored, FPSIMD registers V0-V31 are encoded in
203 + * task->fpsimd_state; bits [max : 128] for each of Z0-Z31 are
204 + * logically zero but not stored anywhere; P0-P15 and FFR are not
205 + * stored and have unspecified values from userspace's point of
206 + * view. For hygiene purposes, the kernel zeroes them on next use,
207 + * but userspace is discouraged from relying on this.
208 + *
209 + * task->thread.sve_state does not need to be non-NULL, valid or any
210 + * particular size: it must not be dereferenced.
211 + *
212 + * * FPSR and FPCR are always stored in task->fpsimd_state irrespective of
213 + * whether TIF_SVE is clear or set, since these are not vector length
214 + * dependent.
215 + */
216 +
217 + /*
218 + * Update current's FPSIMD/SVE registers from thread_struct.
219 + *
220 + * This function should be called only when the FPSIMD/SVE state in
221 + * thread_struct is known to be up to date, when preparing to enter
222 + * userspace.
223 + *
224 + * Softirqs (and preemption) must be disabled.
225 + */ 226 + static void task_fpsimd_load(void) 227 + { 228 + WARN_ON(!in_softirq() && !irqs_disabled()); 229 + 230 + if (system_supports_sve() && test_thread_flag(TIF_SVE)) 231 + sve_load_state(sve_pffr(current), 232 + &current->thread.fpsimd_state.fpsr, 233 + sve_vq_from_vl(current->thread.sve_vl) - 1); 234 + else 235 + fpsimd_load_state(&current->thread.fpsimd_state); 236 + 237 + if (system_supports_sve()) { 238 + /* Toggle SVE trapping for userspace if needed */ 239 + if (test_thread_flag(TIF_SVE)) 240 + sve_user_enable(); 241 + else 242 + sve_user_disable(); 243 + 244 + /* Serialised by exception return to user */ 245 + } 246 + } 247 + 248 + /* 249 + * Ensure current's FPSIMD/SVE storage in thread_struct is up to date 250 + * with respect to the CPU registers. 251 + * 252 + * Softirqs (and preemption) must be disabled. 253 + */ 254 + static void task_fpsimd_save(void) 255 + { 256 + WARN_ON(!in_softirq() && !irqs_disabled()); 257 + 258 + if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) { 259 + if (system_supports_sve() && test_thread_flag(TIF_SVE)) { 260 + if (WARN_ON(sve_get_vl() != current->thread.sve_vl)) { 261 + /* 262 + * Can't save the user regs, so current would 263 + * re-enter user with corrupt state. 264 + * There's no way to recover, so kill it: 265 + */ 266 + force_signal_inject( 267 + SIGKILL, 0, current_pt_regs(), 0); 268 + return; 269 + } 270 + 271 + sve_save_state(sve_pffr(current), 272 + &current->thread.fpsimd_state.fpsr); 273 + } else 274 + fpsimd_save_state(&current->thread.fpsimd_state); 275 + } 276 + } 277 + 278 + /* 279 + * Helpers to translate bit indices in sve_vq_map to VQ values (and 280 + * vice versa). This allows find_next_bit() to be used to find the 281 + * _maximum_ VQ not exceeding a certain value. 
282 + */ 283 + 284 + static unsigned int vq_to_bit(unsigned int vq) 285 + { 286 + return SVE_VQ_MAX - vq; 287 + } 288 + 289 + static unsigned int bit_to_vq(unsigned int bit) 290 + { 291 + if (WARN_ON(bit >= SVE_VQ_MAX)) 292 + bit = SVE_VQ_MAX - 1; 293 + 294 + return SVE_VQ_MAX - bit; 295 + } 296 + 297 + /* 298 + * All vector length selection from userspace comes through here. 299 + * We're on a slow path, so some sanity-checks are included. 300 + * If things go wrong there's a bug somewhere, but try to fall back to a 301 + * safe choice. 302 + */ 303 + static unsigned int find_supported_vector_length(unsigned int vl) 304 + { 305 + int bit; 306 + int max_vl = sve_max_vl; 307 + 308 + if (WARN_ON(!sve_vl_valid(vl))) 309 + vl = SVE_VL_MIN; 310 + 311 + if (WARN_ON(!sve_vl_valid(max_vl))) 312 + max_vl = SVE_VL_MIN; 313 + 314 + if (vl > max_vl) 315 + vl = max_vl; 316 + 317 + bit = find_next_bit(sve_vq_map, SVE_VQ_MAX, 318 + vq_to_bit(sve_vq_from_vl(vl))); 319 + return sve_vl_from_vq(bit_to_vq(bit)); 320 + } 321 + 322 + #ifdef CONFIG_SYSCTL 323 + 324 + static int sve_proc_do_default_vl(struct ctl_table *table, int write, 325 + void __user *buffer, size_t *lenp, 326 + loff_t *ppos) 327 + { 328 + int ret; 329 + int vl = sve_default_vl; 330 + struct ctl_table tmp_table = { 331 + .data = &vl, 332 + .maxlen = sizeof(vl), 333 + }; 334 + 335 + ret = proc_dointvec(&tmp_table, write, buffer, lenp, ppos); 336 + if (ret || !write) 337 + return ret; 338 + 339 + /* Writing -1 has the special meaning "set to max": */ 340 + if (vl == -1) { 341 + /* Fail safe if sve_max_vl wasn't initialised */ 342 + if (WARN_ON(!sve_vl_valid(sve_max_vl))) 343 + vl = SVE_VL_MIN; 344 + else 345 + vl = sve_max_vl; 346 + 347 + goto chosen; 348 + } 349 + 350 + if (!sve_vl_valid(vl)) 351 + return -EINVAL; 352 + 353 + vl = find_supported_vector_length(vl); 354 + chosen: 355 + sve_default_vl = vl; 356 + return 0; 357 + } 358 + 359 + static struct ctl_table sve_default_vl_table[] = { 360 + { 361 + .procname = 
"sve_default_vector_length", 362 + .mode = 0644, 363 + .proc_handler = sve_proc_do_default_vl, 364 + }, 365 + { } 366 + }; 367 + 368 + static int __init sve_sysctl_init(void) 369 + { 370 + if (system_supports_sve()) 371 + if (!register_sysctl("abi", sve_default_vl_table)) 372 + return -EINVAL; 373 + 374 + return 0; 375 + } 376 + 377 + #else /* ! CONFIG_SYSCTL */ 378 + static int __init sve_sysctl_init(void) { return 0; } 379 + #endif /* ! CONFIG_SYSCTL */ 380 + 381 + #define ZREG(sve_state, vq, n) ((char *)(sve_state) + \ 382 + (SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET)) 383 + 384 + /* 385 + * Transfer the FPSIMD state in task->thread.fpsimd_state to 386 + * task->thread.sve_state. 387 + * 388 + * Task can be a non-runnable task, or current. In the latter case, 389 + * softirqs (and preemption) must be disabled. 390 + * task->thread.sve_state must point to at least sve_state_size(task) 391 + * bytes of allocated kernel memory. 392 + * task->thread.fpsimd_state must be up to date before calling this function. 393 + */ 394 + static void fpsimd_to_sve(struct task_struct *task) 395 + { 396 + unsigned int vq; 397 + void *sst = task->thread.sve_state; 398 + struct fpsimd_state const *fst = &task->thread.fpsimd_state; 399 + unsigned int i; 400 + 401 + if (!system_supports_sve()) 402 + return; 403 + 404 + vq = sve_vq_from_vl(task->thread.sve_vl); 405 + for (i = 0; i < 32; ++i) 406 + memcpy(ZREG(sst, vq, i), &fst->vregs[i], 407 + sizeof(fst->vregs[i])); 408 + } 409 + 410 + /* 411 + * Transfer the SVE state in task->thread.sve_state to 412 + * task->thread.fpsimd_state. 413 + * 414 + * Task can be a non-runnable task, or current. In the latter case, 415 + * softirqs (and preemption) must be disabled. 416 + * task->thread.sve_state must point to at least sve_state_size(task) 417 + * bytes of allocated kernel memory. 418 + * task->thread.sve_state must be up to date before calling this function. 
419 + */
420 + static void sve_to_fpsimd(struct task_struct *task)
421 + {
422 + unsigned int vq;
423 + void const *sst = task->thread.sve_state;
424 + struct fpsimd_state *fst = &task->thread.fpsimd_state;
425 + unsigned int i;
426 +
427 + if (!system_supports_sve())
428 + return;
429 +
430 + vq = sve_vq_from_vl(task->thread.sve_vl);
431 + for (i = 0; i < 32; ++i)
432 + memcpy(&fst->vregs[i], ZREG(sst, vq, i),
433 + sizeof(fst->vregs[i]));
434 + }
435 +
436 + #ifdef CONFIG_ARM64_SVE
437 +
438 + /*
439 + * Return how many bytes of memory are required to store the full SVE
440 + * state for task, given task's currently configured vector length.
441 + */
442 + size_t sve_state_size(struct task_struct const *task)
443 + {
444 + return SVE_SIG_REGS_SIZE(sve_vq_from_vl(task->thread.sve_vl));
445 + }
446 +
447 + /*
448 + * Ensure that task->thread.sve_state is allocated and sufficiently large.
449 + *
450 + * This function should be used only in preparation for replacing
451 + * task->thread.sve_state with new data. The memory is always zeroed
452 + * here to prevent stale data from showing through: this is done in
453 + * the interest of testability and predictability: except in the
454 + * do_sve_acc() case, there is no ABI requirement to hide stale data
455 + * written previously by the task.
456 + */
457 + void sve_alloc(struct task_struct *task)
458 + {
459 + if (task->thread.sve_state) {
460 + memset(task->thread.sve_state, 0, sve_state_size(task));
461 + return;
462 + }
463 +
464 + /* This is a small allocation (maximum ~8KB) and Should Not Fail. */
465 + task->thread.sve_state =
466 + kzalloc(sve_state_size(task), GFP_KERNEL);
467 +
468 + /*
469 + * If future SVE revisions can have larger vectors though,
470 + * this may cease to be true:
471 + */
472 + BUG_ON(!task->thread.sve_state);
473 + }
474 +
475 +
476 + /*
477 + * Ensure that task->thread.sve_state is up to date with respect to
478 + * the user task, irrespective of whether SVE is in use or not.
479 + * 480 + * This should only be called by ptrace. task must be non-runnable. 481 + * task->thread.sve_state must point to at least sve_state_size(task) 482 + * bytes of allocated kernel memory. 483 + */ 484 + void fpsimd_sync_to_sve(struct task_struct *task) 485 + { 486 + if (!test_tsk_thread_flag(task, TIF_SVE)) 487 + fpsimd_to_sve(task); 488 + } 489 + 490 + /* 491 + * Ensure that task->thread.fpsimd_state is up to date with respect to 492 + * the user task, irrespective of whether SVE is in use or not. 493 + * 494 + * This should only be called by ptrace. task must be non-runnable. 495 + * task->thread.sve_state must point to at least sve_state_size(task) 496 + * bytes of allocated kernel memory. 497 + */ 498 + void sve_sync_to_fpsimd(struct task_struct *task) 499 + { 500 + if (test_tsk_thread_flag(task, TIF_SVE)) 501 + sve_to_fpsimd(task); 502 + } 503 + 504 + /* 505 + * Ensure that task->thread.sve_state is up to date with respect to 506 + * the task->thread.fpsimd_state. 507 + * 508 + * This should only be called by ptrace to merge new FPSIMD register 509 + * values into a task for which SVE is currently active. 510 + * task must be non-runnable. 511 + * task->thread.sve_state must point to at least sve_state_size(task) 512 + * bytes of allocated kernel memory. 513 + * task->thread.fpsimd_state must already have been initialised with 514 + * the new FPSIMD register values to be merged in. 
515 + */ 516 + void sve_sync_from_fpsimd_zeropad(struct task_struct *task) 517 + { 518 + unsigned int vq; 519 + void *sst = task->thread.sve_state; 520 + struct fpsimd_state const *fst = &task->thread.fpsimd_state; 521 + unsigned int i; 522 + 523 + if (!test_tsk_thread_flag(task, TIF_SVE)) 524 + return; 525 + 526 + vq = sve_vq_from_vl(task->thread.sve_vl); 527 + 528 + memset(sst, 0, SVE_SIG_REGS_SIZE(vq)); 529 + 530 + for (i = 0; i < 32; ++i) 531 + memcpy(ZREG(sst, vq, i), &fst->vregs[i], 532 + sizeof(fst->vregs[i])); 533 + } 534 + 535 + int sve_set_vector_length(struct task_struct *task, 536 + unsigned long vl, unsigned long flags) 537 + { 538 + if (flags & ~(unsigned long)(PR_SVE_VL_INHERIT | 539 + PR_SVE_SET_VL_ONEXEC)) 540 + return -EINVAL; 541 + 542 + if (!sve_vl_valid(vl)) 543 + return -EINVAL; 544 + 545 + /* 546 + * Clamp to the maximum vector length that VL-agnostic SVE code can 547 + * work with. A flag may be assigned in the future to allow setting 548 + * of larger vector lengths without confusing older software. 549 + */ 550 + if (vl > SVE_VL_ARCH_MAX) 551 + vl = SVE_VL_ARCH_MAX; 552 + 553 + vl = find_supported_vector_length(vl); 554 + 555 + if (flags & (PR_SVE_VL_INHERIT | 556 + PR_SVE_SET_VL_ONEXEC)) 557 + task->thread.sve_vl_onexec = vl; 558 + else 559 + /* Reset VL to system default on next exec: */ 560 + task->thread.sve_vl_onexec = 0; 561 + 562 + /* Only actually set the VL if not deferred: */ 563 + if (flags & PR_SVE_SET_VL_ONEXEC) 564 + goto out; 565 + 566 + if (vl == task->thread.sve_vl) 567 + goto out; 568 + 569 + /* 570 + * To ensure the FPSIMD bits of the SVE vector registers are preserved, 571 + * write any live register state back to task_struct, and convert to a 572 + * non-SVE thread. 
573 + */ 574 + if (task == current) { 575 + local_bh_disable(); 576 + 577 + task_fpsimd_save(); 578 + set_thread_flag(TIF_FOREIGN_FPSTATE); 579 + } 580 + 581 + fpsimd_flush_task_state(task); 582 + if (test_and_clear_tsk_thread_flag(task, TIF_SVE)) 583 + sve_to_fpsimd(task); 584 + 585 + if (task == current) 586 + local_bh_enable(); 587 + 588 + /* 589 + * Force reallocation of task SVE state to the correct size 590 + * on next use: 591 + */ 592 + sve_free(task); 593 + 594 + task->thread.sve_vl = vl; 595 + 596 + out: 597 + if (flags & PR_SVE_VL_INHERIT) 598 + set_tsk_thread_flag(task, TIF_SVE_VL_INHERIT); 599 + else 600 + clear_tsk_thread_flag(task, TIF_SVE_VL_INHERIT); 601 + 602 + return 0; 603 + } 604 + 605 + /* 606 + * Encode the current vector length and flags for return. 607 + * This is only required for prctl(): ptrace has separate fields 608 + * 609 + * flags are as for sve_set_vector_length(). 610 + */ 611 + static int sve_prctl_status(unsigned long flags) 612 + { 613 + int ret; 614 + 615 + if (flags & PR_SVE_SET_VL_ONEXEC) 616 + ret = current->thread.sve_vl_onexec; 617 + else 618 + ret = current->thread.sve_vl; 619 + 620 + if (test_thread_flag(TIF_SVE_VL_INHERIT)) 621 + ret |= PR_SVE_VL_INHERIT; 622 + 623 + return ret; 624 + } 625 + 626 + /* PR_SVE_SET_VL */ 627 + int sve_set_current_vl(unsigned long arg) 628 + { 629 + unsigned long vl, flags; 630 + int ret; 631 + 632 + vl = arg & PR_SVE_VL_LEN_MASK; 633 + flags = arg & ~vl; 634 + 635 + if (!system_supports_sve()) 636 + return -EINVAL; 637 + 638 + ret = sve_set_vector_length(current, vl, flags); 639 + if (ret) 640 + return ret; 641 + 642 + return sve_prctl_status(flags); 643 + } 644 + 645 + /* PR_SVE_GET_VL */ 646 + int sve_get_current_vl(void) 647 + { 648 + if (!system_supports_sve()) 649 + return -EINVAL; 650 + 651 + return sve_prctl_status(0); 652 + } 653 + 654 + /* 655 + * Bitmap for temporary storage of the per-CPU set of supported vector lengths 656 + * during secondary boot. 
657 + */ 658 + static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX); 659 + 660 + static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX)) 661 + { 662 + unsigned int vq, vl; 663 + unsigned long zcr; 664 + 665 + bitmap_zero(map, SVE_VQ_MAX); 666 + 667 + zcr = ZCR_ELx_LEN_MASK; 668 + zcr = read_sysreg_s(SYS_ZCR_EL1) & ~zcr; 669 + 670 + for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) { 671 + write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */ 672 + vl = sve_get_vl(); 673 + vq = sve_vq_from_vl(vl); /* skip intervening lengths */ 674 + set_bit(vq_to_bit(vq), map); 675 + } 676 + } 677 + 678 + void __init sve_init_vq_map(void) 679 + { 680 + sve_probe_vqs(sve_vq_map); 681 + } 682 + 683 + /* 684 + * If we haven't committed to the set of supported VQs yet, filter out 685 + * those not supported by the current CPU. 686 + */ 687 + void sve_update_vq_map(void) 688 + { 689 + sve_probe_vqs(sve_secondary_vq_map); 690 + bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX); 691 + } 692 + 693 + /* Check whether the current CPU supports all VQs in the committed set */ 694 + int sve_verify_vq_map(void) 695 + { 696 + int ret = 0; 697 + 698 + sve_probe_vqs(sve_secondary_vq_map); 699 + bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map, 700 + SVE_VQ_MAX); 701 + if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) { 702 + pr_warn("SVE: cpu%d: Required vector length(s) missing\n", 703 + smp_processor_id()); 704 + ret = -EINVAL; 705 + } 706 + 707 + return ret; 708 + } 709 + 710 + static void __init sve_efi_setup(void) 711 + { 712 + if (!IS_ENABLED(CONFIG_EFI)) 713 + return; 714 + 715 + /* 716 + * alloc_percpu() warns and prints a backtrace if this goes wrong. 717 + * This is evidence of a crippled system and we are returning void, 718 + * so no attempt is made to handle this situation here. 
719 + */ 720 + if (!sve_vl_valid(sve_max_vl)) 721 + goto fail; 722 + 723 + efi_sve_state = __alloc_percpu( 724 + SVE_SIG_REGS_SIZE(sve_vq_from_vl(sve_max_vl)), SVE_VQ_BYTES); 725 + if (!efi_sve_state) 726 + goto fail; 727 + 728 + return; 729 + 730 + fail: 731 + panic("Cannot allocate percpu memory for EFI SVE save/restore"); 732 + } 733 + 734 + /* 735 + * Enable SVE for EL1. 736 + * Intended for use by the cpufeatures code during CPU boot. 737 + */ 738 + int sve_kernel_enable(void *__always_unused p) 739 + { 740 + write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1); 741 + isb(); 742 + 743 + return 0; 744 + } 745 + 746 + void __init sve_setup(void) 747 + { 748 + u64 zcr; 749 + 750 + if (!system_supports_sve()) 751 + return; 752 + 753 + /* 754 + * The SVE architecture mandates support for 128-bit vectors, 755 + * so sve_vq_map must have at least SVE_VQ_MIN set. 756 + * If something went wrong, at least try to patch it up: 757 + */ 758 + if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map))) 759 + set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map); 760 + 761 + zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1); 762 + sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1); 763 + 764 + /* 765 + * Sanity-check that the max VL we determined through CPU features 766 + * corresponds properly to sve_vq_map. If not, do our best: 767 + */ 768 + if (WARN_ON(sve_max_vl != find_supported_vector_length(sve_max_vl))) 769 + sve_max_vl = find_supported_vector_length(sve_max_vl); 770 + 771 + /* 772 + * For the default VL, pick the maximum supported value <= 64. 773 + * VL == 64 is guaranteed not to grow the signal frame. 
774 + */ 775 + sve_default_vl = find_supported_vector_length(64); 776 + 777 + pr_info("SVE: maximum available vector length %u bytes per vector\n", 778 + sve_max_vl); 779 + pr_info("SVE: default vector length %u bytes per vector\n", 780 + sve_default_vl); 781 + 782 + sve_efi_setup(); 783 + } 784 + 785 + /* 786 + * Called from the put_task_struct() path, which cannot get here 787 + * unless dead_task is really dead and not schedulable. 788 + */ 789 + void fpsimd_release_task(struct task_struct *dead_task) 790 + { 791 + __sve_free(dead_task); 792 + } 793 + 794 + #endif /* CONFIG_ARM64_SVE */ 795 + 796 + /* 797 + * Trapped SVE access 798 + * 799 + * Storage is allocated for the full SVE state, the current FPSIMD 800 + * register contents are migrated across, and TIF_SVE is set so that 801 + * the SVE access trap will be disabled the next time this task 802 + * reaches ret_to_user. 803 + * 804 + * TIF_SVE should be clear on entry: otherwise, task_fpsimd_load() 805 + * would have disabled the SVE access trap for userspace during 806 + * ret_to_user, making an SVE access trap impossible in that case. 807 + */ 808 + asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs) 809 + { 810 + /* Even if we chose not to use SVE, the hardware could still trap: */ 811 + if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) { 812 + force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0); 813 + return; 814 + } 815 + 816 + sve_alloc(current); 817 + 818 + local_bh_disable(); 819 + 820 + task_fpsimd_save(); 821 + fpsimd_to_sve(current); 822 + 823 + /* Force ret_to_user to reload the registers: */ 824 + fpsimd_flush_task_state(current); 825 + set_thread_flag(TIF_FOREIGN_FPSTATE); 826 + 827 + if (test_and_set_thread_flag(TIF_SVE)) 828 + WARN_ON(1); /* SVE access shouldn't have trapped */ 829 + 830 + local_bh_enable(); 831 + } 832 + 119 833 /* 120 834 * Trapped FP/ASIMD access. 
121 835 */ 122 - void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs) 836 + asmlinkage void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs) 123 837 { 124 838 /* TODO: implement lazy context saving/restoring */ 125 839 WARN_ON(1); ··· 859 111 /* 860 112 * Raise a SIGFPE for the current process. 861 113 */ 862 - void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs) 114 + asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs) 863 115 { 864 116 siginfo_t info; 865 117 unsigned int si_code = 0; ··· 892 144 * the registers is in fact the most recent userland FPSIMD state of 893 145 * 'current'. 894 146 */ 895 - if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE)) 896 - fpsimd_save_state(&current->thread.fpsimd_state); 147 + if (current->mm) 148 + task_fpsimd_save(); 897 149 898 150 if (next->mm) { 899 151 /* ··· 907 159 908 160 if (__this_cpu_read(fpsimd_last_state) == st 909 161 && st->cpu == smp_processor_id()) 910 - clear_ti_thread_flag(task_thread_info(next), 911 - TIF_FOREIGN_FPSTATE); 162 + clear_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE); 912 163 else 913 - set_ti_thread_flag(task_thread_info(next), 914 - TIF_FOREIGN_FPSTATE); 164 + set_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE); 915 165 } 916 166 } 917 167 918 168 void fpsimd_flush_thread(void) 919 169 { 170 + int vl, supported_vl; 171 + 920 172 if (!system_supports_fpsimd()) 921 173 return; 922 174 ··· 924 176 925 177 memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state)); 926 178 fpsimd_flush_task_state(current); 179 + 180 + if (system_supports_sve()) { 181 + clear_thread_flag(TIF_SVE); 182 + sve_free(current); 183 + 184 + /* 185 + * Reset the task vector length as required. 186 + * This is where we ensure that all user tasks have a valid 187 + * vector length configured: no kernel task can become a user 188 + * task without an exec and hence a call to this function. 
189 + * By the time the first call to this function is made, all 190 + * early hardware probing is complete, so sve_default_vl 191 + * should be valid. 192 + * If a bug causes this to go wrong, we make some noise and 193 + * try to fudge thread.sve_vl to a safe value here. 194 + */ 195 + vl = current->thread.sve_vl_onexec ? 196 + current->thread.sve_vl_onexec : sve_default_vl; 197 + 198 + if (WARN_ON(!sve_vl_valid(vl))) 199 + vl = SVE_VL_MIN; 200 + 201 + supported_vl = find_supported_vector_length(vl); 202 + if (WARN_ON(supported_vl != vl)) 203 + vl = supported_vl; 204 + 205 + current->thread.sve_vl = vl; 206 + 207 + /* 208 + * If the task is not set to inherit, ensure that the vector 209 + * length will be reset by a subsequent exec: 210 + */ 211 + if (!test_thread_flag(TIF_SVE_VL_INHERIT)) 212 + current->thread.sve_vl_onexec = 0; 213 + } 214 + 927 215 set_thread_flag(TIF_FOREIGN_FPSTATE); 928 216 929 217 local_bh_enable(); ··· 975 191 return; 976 192 977 193 local_bh_disable(); 978 - 979 - if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) 980 - fpsimd_save_state(&current->thread.fpsimd_state); 981 - 194 + task_fpsimd_save(); 982 195 local_bh_enable(); 196 + } 197 + 198 + /* 199 + * Like fpsimd_preserve_current_state(), but ensure that 200 + * current->thread.fpsimd_state is updated so that it can be copied to 201 + * the signal frame. 
202 + */ 203 + void fpsimd_signal_preserve_current_state(void) 204 + { 205 + fpsimd_preserve_current_state(); 206 + if (system_supports_sve() && test_thread_flag(TIF_SVE)) 207 + sve_to_fpsimd(current); 983 208 } 984 209 985 210 /* ··· 1006 213 if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) { 1007 214 struct fpsimd_state *st = &current->thread.fpsimd_state; 1008 215 1009 - fpsimd_load_state(st); 216 + task_fpsimd_load(); 1010 217 __this_cpu_write(fpsimd_last_state, st); 1011 218 st->cpu = smp_processor_id(); 1012 219 } ··· 1026 233 1027 234 local_bh_disable(); 1028 235 1029 - fpsimd_load_state(state); 236 + if (system_supports_sve() && test_thread_flag(TIF_SVE)) { 237 + current->thread.fpsimd_state = *state; 238 + fpsimd_to_sve(current); 239 + } 240 + task_fpsimd_load(); 241 + 1030 242 if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) { 1031 243 struct fpsimd_state *st = &current->thread.fpsimd_state; 1032 244 ··· 1049 251 { 1050 252 t->thread.fpsimd_state.cpu = NR_CPUS; 1051 253 } 254 + 255 + static inline void fpsimd_flush_cpu_state(void) 256 + { 257 + __this_cpu_write(fpsimd_last_state, NULL); 258 + } 259 + 260 + /* 261 + * Invalidate any task SVE state currently held in this CPU's regs. 262 + * 263 + * This is used to prevent the kernel from trying to reuse SVE register data 264 + * that is destroyed by KVM guest enter/exit. This function should go away when 265 + * KVM SVE support is implemented. Don't use it for anything else. 
266 + */ 267 + #ifdef CONFIG_ARM64_SVE 268 + void sve_flush_cpu_state(void) 269 + { 270 + struct fpsimd_state *const fpstate = __this_cpu_read(fpsimd_last_state); 271 + struct task_struct *tsk; 272 + 273 + if (!fpstate) 274 + return; 275 + 276 + tsk = container_of(fpstate, struct task_struct, thread.fpsimd_state); 277 + if (test_tsk_thread_flag(tsk, TIF_SVE)) 278 + fpsimd_flush_cpu_state(); 279 + } 280 + #endif /* CONFIG_ARM64_SVE */ 1052 281 1053 282 #ifdef CONFIG_KERNEL_MODE_NEON 1054 283 ··· 1111 286 __this_cpu_write(kernel_neon_busy, true); 1112 287 1113 288 /* Save unsaved task fpsimd state, if any: */ 1114 - if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE)) 1115 - fpsimd_save_state(&current->thread.fpsimd_state); 289 + if (current->mm) { 290 + task_fpsimd_save(); 291 + set_thread_flag(TIF_FOREIGN_FPSTATE); 292 + } 1116 293 1117 294 /* Invalidate any task state remaining in the fpsimd regs: */ 1118 - __this_cpu_write(fpsimd_last_state, NULL); 295 + fpsimd_flush_cpu_state(); 1119 296 1120 297 preempt_disable(); 1121 298 ··· 1152 325 1153 326 static DEFINE_PER_CPU(struct fpsimd_state, efi_fpsimd_state); 1154 327 static DEFINE_PER_CPU(bool, efi_fpsimd_state_used); 328 + static DEFINE_PER_CPU(bool, efi_sve_state_used); 1155 329 1156 330 /* 1157 331 * EFI runtime services support functions ··· 1178 350 1179 351 WARN_ON(preemptible()); 1180 352 1181 - if (may_use_simd()) 353 + if (may_use_simd()) { 1182 354 kernel_neon_begin(); 1183 - else { 1184 - fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state)); 355 + } else { 356 + /* 357 + * If !efi_sve_state, SVE can't be in use yet and doesn't need 358 + * preserving: 359 + */ 360 + if (system_supports_sve() && likely(efi_sve_state)) { 361 + char *sve_state = this_cpu_ptr(efi_sve_state); 362 + 363 + __this_cpu_write(efi_sve_state_used, true); 364 + 365 + sve_save_state(sve_state + sve_ffr_offset(sve_max_vl), 366 + &this_cpu_ptr(&efi_fpsimd_state)->fpsr); 367 + } else { 368 + 
fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state)); 369 + } 370 + 1185 371 __this_cpu_write(efi_fpsimd_state_used, true); 1186 372 } 1187 373 } ··· 1208 366 if (!system_supports_fpsimd()) 1209 367 return; 1210 368 1211 - if (__this_cpu_xchg(efi_fpsimd_state_used, false)) 1212 - fpsimd_load_state(this_cpu_ptr(&efi_fpsimd_state)); 1213 - else 369 + if (!__this_cpu_xchg(efi_fpsimd_state_used, false)) { 1214 370 kernel_neon_end(); 371 + } else { 372 + if (system_supports_sve() && 373 + likely(__this_cpu_read(efi_sve_state_used))) { 374 + char const *sve_state = this_cpu_ptr(efi_sve_state); 375 + 376 + sve_load_state(sve_state + sve_ffr_offset(sve_max_vl), 377 + &this_cpu_ptr(&efi_fpsimd_state)->fpsr, 378 + sve_vq_from_vl(sve_get_vl()) - 1); 379 + 380 + __this_cpu_write(efi_sve_state_used, false); 381 + } else { 382 + fpsimd_load_state(this_cpu_ptr(&efi_fpsimd_state)); 383 + } 384 + } 1215 385 } 1216 386 1217 387 #endif /* CONFIG_EFI */ ··· 1236 382 { 1237 383 switch (cmd) { 1238 384 case CPU_PM_ENTER: 1239 - if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE)) 1240 - fpsimd_save_state(&current->thread.fpsimd_state); 1241 - this_cpu_write(fpsimd_last_state, NULL); 385 + if (current->mm) 386 + task_fpsimd_save(); 387 + fpsimd_flush_cpu_state(); 1242 388 break; 1243 389 case CPU_PM_EXIT: 1244 390 if (current->mm) ··· 1296 442 if (!(elf_hwcap & HWCAP_ASIMD)) 1297 443 pr_notice("Advanced SIMD is not implemented\n"); 1298 444 1299 - return 0; 445 + return sve_sysctl_init(); 1300 446 } 1301 447 core_initcall(fpsimd_init);
+24 -6
arch/arm64/kernel/head.S
··· 480 480 481 481 /* Statistical profiling */ 482 482 ubfx x0, x1, #32, #4 // Check ID_AA64DFR0_EL1 PMSVer 483 - cbz x0, 6f // Skip if SPE not present 484 - cbnz x2, 5f // VHE? 483 + cbz x0, 7f // Skip if SPE not present 484 + cbnz x2, 6f // VHE? 485 + mrs_s x4, SYS_PMBIDR_EL1 // If SPE available at EL2, 486 + and x4, x4, #(1 << SYS_PMBIDR_EL1_P_SHIFT) 487 + cbnz x4, 5f // then permit sampling of physical 488 + mov x4, #(1 << SYS_PMSCR_EL2_PCT_SHIFT | \ 489 + 1 << SYS_PMSCR_EL2_PA_SHIFT) 490 + msr_s SYS_PMSCR_EL2, x4 // addresses and physical counter 491 + 5: 485 492 mov x1, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT) 486 493 orr x3, x3, x1 // If we don't have VHE, then 487 - b 6f // use EL1&0 translation. 488 - 5: // For VHE, use EL2 translation 494 + b 7f // use EL1&0 translation. 495 + 6: // For VHE, use EL2 translation 489 496 orr x3, x3, #MDCR_EL2_TPMS // and disable access from EL1 490 - 6: 497 + 7: 491 498 msr mdcr_el2, x3 // Configure debug traps 492 499 493 500 /* Stage-2 translation */ ··· 524 517 mov x0, #0x33ff 525 518 msr cptr_el2, x0 // Disable copro. traps to EL2 526 519 520 + /* SVE register access */ 521 + mrs x1, id_aa64pfr0_el1 522 + ubfx x1, x1, #ID_AA64PFR0_SVE_SHIFT, #4 523 + cbz x1, 7f 524 + 525 + bic x0, x0, #CPTR_EL2_TZ // Also disable SVE traps 526 + msr cptr_el2, x0 // Disable copro. traps to EL2 527 + isb 528 + mov x1, #ZCR_ELx_LEN_MASK // SVE: Enable full vector 529 + msr_s SYS_ZCR_EL2, x1 // length for EL1. 530 + 527 531 /* Hypervisor stub */ 528 - adr_l x0, __hyp_stub_vectors 532 + 7: adr_l x0, __hyp_stub_vectors 529 533 msr vbar_el2, x0 530 534 531 535 /* spsr */
+3 -2
arch/arm64/kernel/hibernate.c
··· 27 27 #include <asm/barrier.h> 28 28 #include <asm/cacheflush.h> 29 29 #include <asm/cputype.h> 30 + #include <asm/daifflags.h> 30 31 #include <asm/irqflags.h> 31 32 #include <asm/kexec.h> 32 33 #include <asm/memory.h> ··· 286 285 return -EBUSY; 287 286 } 288 287 289 - local_dbg_save(flags); 288 + flags = local_daif_save(); 290 289 291 290 if (__cpu_suspend_enter(&state)) { 292 291 /* make the crash dump kernel image visible/saveable */ ··· 316 315 __cpu_suspend_exit(); 317 316 } 318 317 319 - local_dbg_restore(flags); 318 + local_daif_restore(flags); 320 319 321 320 return ret; 322 321 }
+5 -7
arch/arm64/kernel/io.c
··· 25 25 */ 26 26 void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count) 27 27 { 28 - while (count && (!IS_ALIGNED((unsigned long)from, 8) || 29 - !IS_ALIGNED((unsigned long)to, 8))) { 28 + while (count && !IS_ALIGNED((unsigned long)from, 8)) { 30 29 *(u8 *)to = __raw_readb(from); 31 30 from++; 32 31 to++; ··· 53 54 */ 54 55 void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count) 55 56 { 56 - while (count && (!IS_ALIGNED((unsigned long)to, 8) || 57 - !IS_ALIGNED((unsigned long)from, 8))) { 58 - __raw_writeb(*(volatile u8 *)from, to); 57 + while (count && !IS_ALIGNED((unsigned long)to, 8)) { 58 + __raw_writeb(*(u8 *)from, to); 59 59 from++; 60 60 to++; 61 61 count--; 62 62 } 63 63 64 64 while (count >= 8) { 65 - __raw_writeq(*(volatile u64 *)from, to); 65 + __raw_writeq(*(u64 *)from, to); 66 66 from += 8; 67 67 to += 8; 68 68 count -= 8; 69 69 } 70 70 71 71 while (count) { 72 - __raw_writeb(*(volatile u8 *)from, to); 72 + __raw_writeb(*(u8 *)from, to); 73 73 from++; 74 74 to++; 75 75 count--;
+2 -2
arch/arm64/kernel/machine_kexec.c
··· 18 18 19 19 #include <asm/cacheflush.h> 20 20 #include <asm/cpu_ops.h> 21 + #include <asm/daifflags.h> 21 22 #include <asm/memory.h> 22 23 #include <asm/mmu.h> 23 24 #include <asm/mmu_context.h> ··· 196 195 197 196 pr_info("Bye!\n"); 198 197 199 - /* Disable all DAIF exceptions. */ 200 - asm volatile ("msr daifset, #0xf" : : : "memory"); 198 + local_daif_mask(); 201 199 202 200 /* 203 201 * cpu_soft_restart will shutdown the MMU, disable data caches, then
+60 -4
arch/arm64/kernel/process.c
··· 49 49 #include <linux/notifier.h> 50 50 #include <trace/events/power.h> 51 51 #include <linux/percpu.h> 52 + #include <linux/thread_info.h> 52 53 53 54 #include <asm/alternative.h> 54 55 #include <asm/compat.h> ··· 171 170 while (1); 172 171 } 173 172 173 + static void print_pstate(struct pt_regs *regs) 174 + { 175 + u64 pstate = regs->pstate; 176 + 177 + if (compat_user_mode(regs)) { 178 + printk("pstate: %08llx (%c%c%c%c %c %s %s %c%c%c)\n", 179 + pstate, 180 + pstate & COMPAT_PSR_N_BIT ? 'N' : 'n', 181 + pstate & COMPAT_PSR_Z_BIT ? 'Z' : 'z', 182 + pstate & COMPAT_PSR_C_BIT ? 'C' : 'c', 183 + pstate & COMPAT_PSR_V_BIT ? 'V' : 'v', 184 + pstate & COMPAT_PSR_Q_BIT ? 'Q' : 'q', 185 + pstate & COMPAT_PSR_T_BIT ? "T32" : "A32", 186 + pstate & COMPAT_PSR_E_BIT ? "BE" : "LE", 187 + pstate & COMPAT_PSR_A_BIT ? 'A' : 'a', 188 + pstate & COMPAT_PSR_I_BIT ? 'I' : 'i', 189 + pstate & COMPAT_PSR_F_BIT ? 'F' : 'f'); 190 + } else { 191 + printk("pstate: %08llx (%c%c%c%c %c%c%c%c %cPAN %cUAO)\n", 192 + pstate, 193 + pstate & PSR_N_BIT ? 'N' : 'n', 194 + pstate & PSR_Z_BIT ? 'Z' : 'z', 195 + pstate & PSR_C_BIT ? 'C' : 'c', 196 + pstate & PSR_V_BIT ? 'V' : 'v', 197 + pstate & PSR_D_BIT ? 'D' : 'd', 198 + pstate & PSR_A_BIT ? 'A' : 'a', 199 + pstate & PSR_I_BIT ? 'I' : 'i', 200 + pstate & PSR_F_BIT ? 'F' : 'f', 201 + pstate & PSR_PAN_BIT ? '+' : '-', 202 + pstate & PSR_UAO_BIT ? 
'+' : '-'); 203 + } 204 + } 205 + 174 206 void __show_regs(struct pt_regs *regs) 175 207 { 176 208 int i, top_reg; ··· 220 186 } 221 187 222 188 show_regs_print_info(KERN_DEFAULT); 223 - print_symbol("PC is at %s\n", instruction_pointer(regs)); 224 - print_symbol("LR is at %s\n", lr); 225 - printk("pc : [<%016llx>] lr : [<%016llx>] pstate: %08llx\n", 226 - regs->pc, lr, regs->pstate); 189 + print_pstate(regs); 190 + print_symbol("pc : %s\n", regs->pc); 191 + print_symbol("lr : %s\n", lr); 227 192 printk("sp : %016llx\n", sp); 228 193 229 194 i = top_reg; ··· 274 241 { 275 242 } 276 243 244 + void arch_release_task_struct(struct task_struct *tsk) 245 + { 246 + fpsimd_release_task(tsk); 247 + } 248 + 249 + /* 250 + * src and dst may temporarily have aliased sve_state after task_struct 251 + * is copied. We cannot fix this properly here, because src may have 252 + * live SVE state and dst's thread_info may not exist yet, so tweaking 253 + * either src's or dst's TIF_SVE is not safe. 254 + * 255 + * The unaliasing is done in copy_thread() instead. This works because 256 + * dst is not schedulable or traceable until both of these functions 257 + * have been called. 258 + */ 277 259 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) 278 260 { 279 261 if (current->mm) 280 262 fpsimd_preserve_current_state(); 281 263 *dst = *src; 264 + 282 265 return 0; 283 266 } 284 267 ··· 306 257 struct pt_regs *childregs = task_pt_regs(p); 307 258 308 259 memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context)); 260 + 261 + /* 262 + * Unalias p->thread.sve_state (if any) from the parent task 263 + * and discard SVE state for p: 264 + */ 265 + clear_tsk_thread_flag(p, TIF_SVE); 266 + p->thread.sve_state = NULL; 309 267 310 268 if (likely(!(p->flags & PF_KTHREAD))) { 311 269 *childregs = *current_pt_regs();
+272 -8
arch/arm64/kernel/ptrace.c
··· 32 32 #include <linux/security.h> 33 33 #include <linux/init.h> 34 34 #include <linux/signal.h> 35 + #include <linux/string.h> 35 36 #include <linux/uaccess.h> 36 37 #include <linux/perf_event.h> 37 38 #include <linux/hw_breakpoint.h> ··· 41 40 #include <linux/elf.h> 42 41 43 42 #include <asm/compat.h> 43 + #include <asm/cpufeature.h> 44 44 #include <asm/debug-monitors.h> 45 45 #include <asm/pgtable.h> 46 46 #include <asm/stacktrace.h> ··· 620 618 /* 621 619 * TODO: update fp accessors for lazy context switching (sync/flush hwstate) 622 620 */ 621 + static int __fpr_get(struct task_struct *target, 622 + const struct user_regset *regset, 623 + unsigned int pos, unsigned int count, 624 + void *kbuf, void __user *ubuf, unsigned int start_pos) 625 + { 626 + struct user_fpsimd_state *uregs; 627 + 628 + sve_sync_to_fpsimd(target); 629 + 630 + uregs = &target->thread.fpsimd_state.user_fpsimd; 631 + 632 + return user_regset_copyout(&pos, &count, &kbuf, &ubuf, uregs, 633 + start_pos, start_pos + sizeof(*uregs)); 634 + } 635 + 623 636 static int fpr_get(struct task_struct *target, const struct user_regset *regset, 624 637 unsigned int pos, unsigned int count, 625 638 void *kbuf, void __user *ubuf) 626 639 { 627 - struct user_fpsimd_state *uregs; 628 - uregs = &target->thread.fpsimd_state.user_fpsimd; 629 - 630 640 if (target == current) 631 641 fpsimd_preserve_current_state(); 632 642 633 - return user_regset_copyout(&pos, &count, &kbuf, &ubuf, uregs, 0, -1); 643 + return __fpr_get(target, regset, pos, count, kbuf, ubuf, 0); 644 + } 645 + 646 + static int __fpr_set(struct task_struct *target, 647 + const struct user_regset *regset, 648 + unsigned int pos, unsigned int count, 649 + const void *kbuf, const void __user *ubuf, 650 + unsigned int start_pos) 651 + { 652 + int ret; 653 + struct user_fpsimd_state newstate; 654 + 655 + /* 656 + * Ensure target->thread.fpsimd_state is up to date, so that a 657 + * short copyin can't resurrect stale data. 
658 + */ 659 + sve_sync_to_fpsimd(target); 660 + 661 + newstate = target->thread.fpsimd_state.user_fpsimd; 662 + 663 + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 664 + start_pos, start_pos + sizeof(newstate)); 665 + if (ret) 666 + return ret; 667 + 668 + target->thread.fpsimd_state.user_fpsimd = newstate; 669 + 670 + return ret; 634 671 } 635 672 636 673 static int fpr_set(struct task_struct *target, const struct user_regset *regset, ··· 677 636 const void *kbuf, const void __user *ubuf) 678 637 { 679 638 int ret; 680 - struct user_fpsimd_state newstate = 681 - target->thread.fpsimd_state.user_fpsimd; 682 639 683 - ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 0, -1); 640 + ret = __fpr_set(target, regset, pos, count, kbuf, ubuf, 0); 684 641 if (ret) 685 642 return ret; 686 643 687 - target->thread.fpsimd_state.user_fpsimd = newstate; 644 + sve_sync_from_fpsimd_zeropad(target); 688 645 fpsimd_flush_task_state(target); 646 + 689 647 return ret; 690 648 } 691 649 ··· 742 702 return ret; 743 703 } 744 704 705 + #ifdef CONFIG_ARM64_SVE 706 + 707 + static void sve_init_header_from_task(struct user_sve_header *header, 708 + struct task_struct *target) 709 + { 710 + unsigned int vq; 711 + 712 + memset(header, 0, sizeof(*header)); 713 + 714 + header->flags = test_tsk_thread_flag(target, TIF_SVE) ? 
715 + SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD; 716 + if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT)) 717 + header->flags |= SVE_PT_VL_INHERIT; 718 + 719 + header->vl = target->thread.sve_vl; 720 + vq = sve_vq_from_vl(header->vl); 721 + 722 + header->max_vl = sve_max_vl; 723 + if (WARN_ON(!sve_vl_valid(sve_max_vl))) 724 + header->max_vl = header->vl; 725 + 726 + header->size = SVE_PT_SIZE(vq, header->flags); 727 + header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl), 728 + SVE_PT_REGS_SVE); 729 + } 730 + 731 + static unsigned int sve_size_from_header(struct user_sve_header const *header) 732 + { 733 + return ALIGN(header->size, SVE_VQ_BYTES); 734 + } 735 + 736 + static unsigned int sve_get_size(struct task_struct *target, 737 + const struct user_regset *regset) 738 + { 739 + struct user_sve_header header; 740 + 741 + if (!system_supports_sve()) 742 + return 0; 743 + 744 + sve_init_header_from_task(&header, target); 745 + return sve_size_from_header(&header); 746 + } 747 + 748 + static int sve_get(struct task_struct *target, 749 + const struct user_regset *regset, 750 + unsigned int pos, unsigned int count, 751 + void *kbuf, void __user *ubuf) 752 + { 753 + int ret; 754 + struct user_sve_header header; 755 + unsigned int vq; 756 + unsigned long start, end; 757 + 758 + if (!system_supports_sve()) 759 + return -EINVAL; 760 + 761 + /* Header */ 762 + sve_init_header_from_task(&header, target); 763 + vq = sve_vq_from_vl(header.vl); 764 + 765 + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &header, 766 + 0, sizeof(header)); 767 + if (ret) 768 + return ret; 769 + 770 + if (target == current) 771 + fpsimd_preserve_current_state(); 772 + 773 + /* Registers: FPSIMD-only case */ 774 + 775 + BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header)); 776 + if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) 777 + return __fpr_get(target, regset, pos, count, kbuf, ubuf, 778 + SVE_PT_FPSIMD_OFFSET); 779 + 780 + /* Otherwise: full SVE case */ 781 + 782 + 
BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header)); 783 + start = SVE_PT_SVE_OFFSET; 784 + end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq); 785 + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, 786 + target->thread.sve_state, 787 + start, end); 788 + if (ret) 789 + return ret; 790 + 791 + start = end; 792 + end = SVE_PT_SVE_FPSR_OFFSET(vq); 793 + ret = user_regset_copyout_zero(&pos, &count, &kbuf, &ubuf, 794 + start, end); 795 + if (ret) 796 + return ret; 797 + 798 + /* 799 + * Copy fpsr, and fpcr which must follow contiguously in 800 + * struct fpsimd_state: 801 + */ 802 + start = end; 803 + end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE; 804 + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, 805 + &target->thread.fpsimd_state.fpsr, 806 + start, end); 807 + if (ret) 808 + return ret; 809 + 810 + start = end; 811 + end = sve_size_from_header(&header); 812 + return user_regset_copyout_zero(&pos, &count, &kbuf, &ubuf, 813 + start, end); 814 + } 815 + 816 + static int sve_set(struct task_struct *target, 817 + const struct user_regset *regset, 818 + unsigned int pos, unsigned int count, 819 + const void *kbuf, const void __user *ubuf) 820 + { 821 + int ret; 822 + struct user_sve_header header; 823 + unsigned int vq; 824 + unsigned long start, end; 825 + 826 + if (!system_supports_sve()) 827 + return -EINVAL; 828 + 829 + /* Header */ 830 + if (count < sizeof(header)) 831 + return -EINVAL; 832 + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header, 833 + 0, sizeof(header)); 834 + if (ret) 835 + goto out; 836 + 837 + /* 838 + * Apart from PT_SVE_REGS_MASK, all PT_SVE_* flags are consumed by 839 + * sve_set_vector_length(), which will also validate them for us: 840 + */ 841 + ret = sve_set_vector_length(target, header.vl, 842 + ((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16); 843 + if (ret) 844 + goto out; 845 + 846 + /* Actual VL set may be less than the user asked for: */ 847 + vq = sve_vq_from_vl(target->thread.sve_vl); 848 
+ 849 + /* Registers: FPSIMD-only case */ 850 + 851 + BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header)); 852 + if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD) { 853 + ret = __fpr_set(target, regset, pos, count, kbuf, ubuf, 854 + SVE_PT_FPSIMD_OFFSET); 855 + clear_tsk_thread_flag(target, TIF_SVE); 856 + goto out; 857 + } 858 + 859 + /* Otherwise: full SVE case */ 860 + 861 + /* 862 + * If setting a different VL from the requested VL and there is 863 + * register data, the data layout will be wrong: don't even 864 + * try to set the registers in this case. 865 + */ 866 + if (count && vq != sve_vq_from_vl(header.vl)) { 867 + ret = -EIO; 868 + goto out; 869 + } 870 + 871 + sve_alloc(target); 872 + 873 + /* 874 + * Ensure target->thread.sve_state is up to date with target's 875 + * FPSIMD regs, so that a short copyin leaves trailing registers 876 + * unmodified. 877 + */ 878 + fpsimd_sync_to_sve(target); 879 + set_tsk_thread_flag(target, TIF_SVE); 880 + 881 + BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header)); 882 + start = SVE_PT_SVE_OFFSET; 883 + end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq); 884 + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, 885 + target->thread.sve_state, 886 + start, end); 887 + if (ret) 888 + goto out; 889 + 890 + start = end; 891 + end = SVE_PT_SVE_FPSR_OFFSET(vq); 892 + ret = user_regset_copyin_ignore(&pos, &count, &kbuf, &ubuf, 893 + start, end); 894 + if (ret) 895 + goto out; 896 + 897 + /* 898 + * Copy fpsr, and fpcr which must follow contiguously in 899 + * struct fpsimd_state: 900 + */ 901 + start = end; 902 + end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE; 903 + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, 904 + &target->thread.fpsimd_state.fpsr, 905 + start, end); 906 + 907 + out: 908 + fpsimd_flush_task_state(target); 909 + return ret; 910 + } 911 + 912 + #endif /* CONFIG_ARM64_SVE */ 913 + 745 914 enum aarch64_regset { 746 915 REGSET_GPR, 747 916 REGSET_FPR, ··· 960 711 
REGSET_HW_WATCH, 961 712 #endif 962 713 REGSET_SYSTEM_CALL, 714 + #ifdef CONFIG_ARM64_SVE 715 + REGSET_SVE, 716 + #endif 963 717 }; 964 718 965 719 static const struct user_regset aarch64_regsets[] = { ··· 1020 768 .get = system_call_get, 1021 769 .set = system_call_set, 1022 770 }, 771 + #ifdef CONFIG_ARM64_SVE 772 + [REGSET_SVE] = { /* Scalable Vector Extension */ 773 + .core_note_type = NT_ARM_SVE, 774 + .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE), 775 + SVE_VQ_BYTES), 776 + .size = SVE_VQ_BYTES, 777 + .align = SVE_VQ_BYTES, 778 + .get = sve_get, 779 + .set = sve_set, 780 + .get_size = sve_get_size, 781 + }, 782 + #endif 1023 783 }; 1024 784 1025 785 static const struct user_regset_view user_aarch64_view = {
+7 -8
arch/arm64/kernel/setup.c
··· 23 23 #include <linux/stddef.h> 24 24 #include <linux/ioport.h> 25 25 #include <linux/delay.h> 26 - #include <linux/utsname.h> 27 26 #include <linux/initrd.h> 28 27 #include <linux/console.h> 29 28 #include <linux/cache.h> ··· 47 48 #include <asm/fixmap.h> 48 49 #include <asm/cpu.h> 49 50 #include <asm/cputype.h> 51 + #include <asm/daifflags.h> 50 52 #include <asm/elf.h> 51 53 #include <asm/cpufeature.h> 52 54 #include <asm/cpu_ops.h> ··· 103 103 * access percpu variable inside lock_release 104 104 */ 105 105 set_my_cpu_offset(0); 106 - pr_info("Booting Linux on physical CPU 0x%lx\n", (unsigned long)mpidr); 106 + pr_info("Booting Linux on physical CPU 0x%010lx [0x%08x]\n", 107 + (unsigned long)mpidr, read_cpuid_id()); 107 108 } 108 109 109 110 bool arch_match_cpu_phys_id(int cpu, u64 phys_id) ··· 245 244 246 245 void __init setup_arch(char **cmdline_p) 247 246 { 248 - pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id()); 249 - 250 - sprintf(init_utsname()->machine, UTS_MACHINE); 251 247 init_mm.start_code = (unsigned long) _text; 252 248 init_mm.end_code = (unsigned long) _etext; 253 249 init_mm.end_data = (unsigned long) _edata; ··· 260 262 parse_early_param(); 261 263 262 264 /* 263 - * Unmask asynchronous aborts after bringing up possible earlycon. 264 - * (Report possible System Errors once we can report this occurred) 265 + * Unmask asynchronous aborts and fiq after bringing up possible 266 + * earlycon. (Report possible System Errors once we can report this 267 + * occurred). 265 268 */ 266 - local_async_enable(); 269 + local_daif_restore(DAIF_PROCCTX_NOIRQ); 267 270 268 271 /* 269 272 * TTBR0 is only used for the identity mapping at this stage. Make it
+169 -10
arch/arm64/kernel/signal.c
··· 31 31 #include <linux/ratelimit.h> 32 32 #include <linux/syscalls.h> 33 33 34 + #include <asm/daifflags.h> 34 35 #include <asm/debug-monitors.h> 35 36 #include <asm/elf.h> 36 37 #include <asm/cacheflush.h> ··· 64 63 65 64 unsigned long fpsimd_offset; 66 65 unsigned long esr_offset; 66 + unsigned long sve_offset; 67 67 unsigned long extra_offset; 68 68 unsigned long end_offset; 69 69 }; ··· 181 179 struct fpsimd_state *fpsimd = &current->thread.fpsimd_state; 182 180 int err; 183 181 184 - /* dump the hardware registers to the fpsimd_state structure */ 185 - fpsimd_preserve_current_state(); 186 - 187 182 /* copy the FP and status/control registers */ 188 183 err = __copy_to_user(ctx->vregs, fpsimd->vregs, sizeof(fpsimd->vregs)); 189 184 __put_user_error(fpsimd->fpsr, &ctx->fpsr, err); ··· 213 214 __get_user_error(fpsimd.fpsr, &ctx->fpsr, err); 214 215 __get_user_error(fpsimd.fpcr, &ctx->fpcr, err); 215 216 217 + clear_thread_flag(TIF_SVE); 218 + 216 219 /* load the hardware registers from the fpsimd_state structure */ 217 220 if (!err) 218 221 fpsimd_update_current_state(&fpsimd); ··· 222 221 return err ? 
-EFAULT : 0; 223 222 } 224 223 224 + 225 225 struct user_ctxs { 226 226 struct fpsimd_context __user *fpsimd; 227 + struct sve_context __user *sve; 227 228 }; 229 + 230 + #ifdef CONFIG_ARM64_SVE 231 + 232 + static int preserve_sve_context(struct sve_context __user *ctx) 233 + { 234 + int err = 0; 235 + u16 reserved[ARRAY_SIZE(ctx->__reserved)]; 236 + unsigned int vl = current->thread.sve_vl; 237 + unsigned int vq = 0; 238 + 239 + if (test_thread_flag(TIF_SVE)) 240 + vq = sve_vq_from_vl(vl); 241 + 242 + memset(reserved, 0, sizeof(reserved)); 243 + 244 + __put_user_error(SVE_MAGIC, &ctx->head.magic, err); 245 + __put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16), 246 + &ctx->head.size, err); 247 + __put_user_error(vl, &ctx->vl, err); 248 + BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved)); 249 + err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved)); 250 + 251 + if (vq) { 252 + /* 253 + * This assumes that the SVE state has already been saved to 254 + * the task struct by calling preserve_fpsimd_context(). 255 + */ 256 + err |= __copy_to_user((char __user *)ctx + SVE_SIG_REGS_OFFSET, 257 + current->thread.sve_state, 258 + SVE_SIG_REGS_SIZE(vq)); 259 + } 260 + 261 + return err ? 
-EFAULT : 0; 262 + } 263 + 264 + static int restore_sve_fpsimd_context(struct user_ctxs *user) 265 + { 266 + int err; 267 + unsigned int vq; 268 + struct fpsimd_state fpsimd; 269 + struct sve_context sve; 270 + 271 + if (__copy_from_user(&sve, user->sve, sizeof(sve))) 272 + return -EFAULT; 273 + 274 + if (sve.vl != current->thread.sve_vl) 275 + return -EINVAL; 276 + 277 + if (sve.head.size <= sizeof(*user->sve)) { 278 + clear_thread_flag(TIF_SVE); 279 + goto fpsimd_only; 280 + } 281 + 282 + vq = sve_vq_from_vl(sve.vl); 283 + 284 + if (sve.head.size < SVE_SIG_CONTEXT_SIZE(vq)) 285 + return -EINVAL; 286 + 287 + /* 288 + * Careful: we are about __copy_from_user() directly into 289 + * thread.sve_state with preemption enabled, so protection is 290 + * needed to prevent a racing context switch from writing stale 291 + * registers back over the new data. 292 + */ 293 + 294 + fpsimd_flush_task_state(current); 295 + barrier(); 296 + /* From now, fpsimd_thread_switch() won't clear TIF_FOREIGN_FPSTATE */ 297 + 298 + set_thread_flag(TIF_FOREIGN_FPSTATE); 299 + barrier(); 300 + /* From now, fpsimd_thread_switch() won't touch thread.sve_state */ 301 + 302 + sve_alloc(current); 303 + err = __copy_from_user(current->thread.sve_state, 304 + (char __user const *)user->sve + 305 + SVE_SIG_REGS_OFFSET, 306 + SVE_SIG_REGS_SIZE(vq)); 307 + if (err) 308 + return -EFAULT; 309 + 310 + set_thread_flag(TIF_SVE); 311 + 312 + fpsimd_only: 313 + /* copy the FP and status/control registers */ 314 + /* restore_sigframe() already checked that user->fpsimd != NULL. */ 315 + err = __copy_from_user(fpsimd.vregs, user->fpsimd->vregs, 316 + sizeof(fpsimd.vregs)); 317 + __get_user_error(fpsimd.fpsr, &user->fpsimd->fpsr, err); 318 + __get_user_error(fpsimd.fpcr, &user->fpsimd->fpcr, err); 319 + 320 + /* load the hardware registers from the fpsimd_state structure */ 321 + if (!err) 322 + fpsimd_update_current_state(&fpsimd); 323 + 324 + return err ? -EFAULT : 0; 325 + } 326 + 327 + #else /* ! 
CONFIG_ARM64_SVE */ 328 + 329 + /* Turn any non-optimised out attempts to use these into a link error: */ 330 + extern int preserve_sve_context(void __user *ctx); 331 + extern int restore_sve_fpsimd_context(struct user_ctxs *user); 332 + 333 + #endif /* ! CONFIG_ARM64_SVE */ 334 + 228 335 229 336 static int parse_user_sigframe(struct user_ctxs *user, 230 337 struct rt_sigframe __user *sf) ··· 346 237 char const __user *const sfp = (char const __user *)sf; 347 238 348 239 user->fpsimd = NULL; 240 + user->sve = NULL; 349 241 350 242 if (!IS_ALIGNED((unsigned long)base, 16)) 351 243 goto invalid; ··· 395 285 396 286 case ESR_MAGIC: 397 287 /* ignore */ 288 + break; 289 + 290 + case SVE_MAGIC: 291 + if (!system_supports_sve()) 292 + goto invalid; 293 + 294 + if (user->sve) 295 + goto invalid; 296 + 297 + if (size < sizeof(*user->sve)) 298 + goto invalid; 299 + 300 + user->sve = (struct sve_context __user *)head; 398 301 break; 399 302 400 303 case EXTRA_MAGIC: ··· 466 343 */ 467 344 offset = 0; 468 345 limit = extra_size; 346 + 347 + if (!access_ok(VERIFY_READ, base, limit)) 348 + goto invalid; 349 + 469 350 continue; 470 351 471 352 default: ··· 486 359 } 487 360 488 361 done: 489 - if (!user->fpsimd) 490 - goto invalid; 491 - 492 362 return 0; 493 363 494 364 invalid: ··· 519 395 if (err == 0) 520 396 err = parse_user_sigframe(&user, sf); 521 397 522 - if (err == 0) 523 - err = restore_fpsimd_context(user.fpsimd); 398 + if (err == 0) { 399 + if (!user.fpsimd) 400 + return -EINVAL; 401 + 402 + if (user.sve) { 403 + if (!system_supports_sve()) 404 + return -EINVAL; 405 + 406 + err = restore_sve_fpsimd_context(&user); 407 + } else { 408 + err = restore_fpsimd_context(user.fpsimd); 409 + } 410 + } 524 411 525 412 return err; 526 413 } ··· 590 455 return err; 591 456 } 592 457 458 + if (system_supports_sve()) { 459 + unsigned int vq = 0; 460 + 461 + if (test_thread_flag(TIF_SVE)) 462 + vq = sve_vq_from_vl(current->thread.sve_vl); 463 + 464 + err = sigframe_alloc(user, 
&user->sve_offset, 465 + SVE_SIG_CONTEXT_SIZE(vq)); 466 + if (err) 467 + return err; 468 + } 469 + 593 470 return sigframe_alloc_end(user); 594 471 } 595 472 ··· 641 494 __put_user_error(ESR_MAGIC, &esr_ctx->head.magic, err); 642 495 __put_user_error(sizeof(*esr_ctx), &esr_ctx->head.size, err); 643 496 __put_user_error(current->thread.fault_code, &esr_ctx->esr, err); 497 + } 498 + 499 + /* Scalable Vector Extension state, if present */ 500 + if (system_supports_sve() && err == 0 && user->sve_offset) { 501 + struct sve_context __user *sve_ctx = 502 + apply_user_offset(user, user->sve_offset); 503 + err |= preserve_sve_context(sve_ctx); 644 504 } 645 505 646 506 if (err == 0 && user->extra_offset) { ··· 748 594 struct rt_sigframe_user_layout user; 749 595 struct rt_sigframe __user *frame; 750 596 int err = 0; 597 + 598 + fpsimd_signal_preserve_current_state(); 751 599 752 600 if (get_sigframe(&user, ksig, regs)) 753 601 return 1; ··· 912 756 addr_limit_user_check(); 913 757 914 758 if (thread_flags & _TIF_NEED_RESCHED) { 759 + /* Unmask Debug and SError for the next task */ 760 + local_daif_restore(DAIF_PROCCTX_NOIRQ); 761 + 915 762 schedule(); 916 763 } else { 917 - local_irq_enable(); 764 + local_daif_restore(DAIF_PROCCTX); 918 765 919 766 if (thread_flags & _TIF_UPROBE) 920 767 uprobe_notify_resume(regs); ··· 934 775 fpsimd_restore_current_state(); 935 776 } 936 777 937 - local_irq_disable(); 778 + local_daif_mask(); 938 779 thread_flags = READ_ONCE(current_thread_info()->flags); 939 780 } while (thread_flags & _TIF_WORK_MASK); 940 781 }
+1 -1
arch/arm64/kernel/signal32.c
··· 239 239 * Note that this also saves V16-31, which aren't visible 240 240 * in AArch32. 241 241 */ 242 - fpsimd_preserve_current_state(); 242 + fpsimd_signal_preserve_current_state(); 243 243 244 244 /* Place structure header on the stack */ 245 245 __put_user_error(magic, &frame->magic, err);
+8 -10
arch/arm64/kernel/smp.c
··· 47 47 #include <asm/cpu.h> 48 48 #include <asm/cputype.h> 49 49 #include <asm/cpu_ops.h> 50 + #include <asm/daifflags.h> 50 51 #include <asm/mmu_context.h> 51 52 #include <asm/numa.h> 52 53 #include <asm/pgtable.h> ··· 217 216 */ 218 217 asmlinkage void secondary_start_kernel(void) 219 218 { 219 + u64 mpidr = read_cpuid_mpidr() & MPIDR_HWID_BITMASK; 220 220 struct mm_struct *mm = &init_mm; 221 221 unsigned int cpu; 222 222 ··· 267 265 * the CPU migration code to notice that the CPU is online 268 266 * before we continue. 269 267 */ 270 - pr_info("CPU%u: Booted secondary processor [%08x]\n", 271 - cpu, read_cpuid_id()); 268 + pr_info("CPU%u: Booted secondary processor 0x%010lx [0x%08x]\n", 269 + cpu, (unsigned long)mpidr, 270 + read_cpuid_id()); 272 271 update_cpu_boot_status(CPU_BOOT_SUCCESS); 273 272 set_cpu_online(cpu, true); 274 273 complete(&cpu_running); 275 274 276 - local_irq_enable(); 277 - local_async_enable(); 275 + local_daif_restore(DAIF_PROCCTX); 278 276 279 277 /* 280 278 * OK, it's off to the idle thread for us ··· 370 368 /* 371 369 * Called from the idle thread for the CPU which has been shutdown. 372 370 * 373 - * Note that we disable IRQs here, but do not re-enable them 374 - * before returning to the caller. This is also the behaviour 375 - * of the other hotplug-cpu capable cores, so presumably coming 376 - * out of idle fixes this. 377 371 */ 378 372 void cpu_die(void) 379 373 { ··· 377 379 378 380 idle_task_exit(); 379 381 380 - local_irq_disable(); 382 + local_daif_mask(); 381 383 382 384 /* Tell __cpu_die() that this CPU is now safe to dispose of */ 383 385 (void)cpu_report_death(); ··· 835 837 { 836 838 set_cpu_online(cpu, false); 837 839 838 - local_irq_disable(); 840 + local_daif_mask(); 839 841 840 842 while (1) 841 843 cpu_relax();
+4 -4
arch/arm64/kernel/suspend.c
··· 5 5 #include <asm/alternative.h> 6 6 #include <asm/cacheflush.h> 7 7 #include <asm/cpufeature.h> 8 + #include <asm/daifflags.h> 8 9 #include <asm/debug-monitors.h> 9 10 #include <asm/exec.h> 10 11 #include <asm/pgtable.h> ··· 13 12 #include <asm/mmu_context.h> 14 13 #include <asm/smp_plat.h> 15 14 #include <asm/suspend.h> 16 - #include <asm/tlbflush.h> 17 15 18 16 /* 19 17 * This is allocated by cpu_suspend_init(), and used to store a pointer to ··· 58 58 /* 59 59 * Restore HW breakpoint registers to sane values 60 60 * before debug exceptions are possibly reenabled 61 - * through local_dbg_restore. 61 + * by cpu_suspend()s local_daif_restore() call. 62 62 */ 63 63 if (hw_breakpoint_restore) 64 64 hw_breakpoint_restore(cpu); ··· 82 82 * updates to mdscr register (saved and restored along with 83 83 * general purpose registers) from kernel debuggers. 84 84 */ 85 - local_dbg_save(flags); 85 + flags = local_daif_save(); 86 86 87 87 /* 88 88 * Function graph tracer state gets incosistent when the kernel ··· 115 115 * restored, so from this point onwards, debugging is fully 116 116 * renabled if it was enabled when core started shutdown. 117 117 */ 118 - local_dbg_restore(flags); 118 + local_daif_restore(flags); 119 119 120 120 return ret; 121 121 }
+38 -71
arch/arm64/kernel/traps.c
··· 38 38 39 39 #include <asm/atomic.h> 40 40 #include <asm/bug.h> 41 + #include <asm/daifflags.h> 41 42 #include <asm/debug-monitors.h> 42 43 #include <asm/esr.h> 43 44 #include <asm/insn.h> ··· 59 58 60 59 int show_unhandled_signals = 1; 61 60 62 - /* 63 - * Dump out the contents of some kernel memory nicely... 64 - */ 65 - static void dump_mem(const char *lvl, const char *str, unsigned long bottom, 66 - unsigned long top) 67 - { 68 - unsigned long first; 69 - mm_segment_t fs; 70 - int i; 71 - 72 - /* 73 - * We need to switch to kernel mode so that we can use __get_user 74 - * to safely read from kernel space. 75 - */ 76 - fs = get_fs(); 77 - set_fs(KERNEL_DS); 78 - 79 - printk("%s%s(0x%016lx to 0x%016lx)\n", lvl, str, bottom, top); 80 - 81 - for (first = bottom & ~31; first < top; first += 32) { 82 - unsigned long p; 83 - char str[sizeof(" 12345678") * 8 + 1]; 84 - 85 - memset(str, ' ', sizeof(str)); 86 - str[sizeof(str) - 1] = '\0'; 87 - 88 - for (p = first, i = 0; i < (32 / 8) 89 - && p < top; i++, p += 8) { 90 - if (p >= bottom && p < top) { 91 - unsigned long val; 92 - 93 - if (__get_user(val, (unsigned long *)p) == 0) 94 - sprintf(str + i * 17, " %016lx", val); 95 - else 96 - sprintf(str + i * 17, " ????????????????"); 97 - } 98 - } 99 - printk("%s%04lx:%s\n", lvl, first & 0xffff, str); 100 - } 101 - 102 - set_fs(fs); 103 - } 104 - 105 61 static void dump_backtrace_entry(unsigned long where) 106 62 { 107 - /* 108 - * Note that 'where' can have a physical address, but it's not handled. 
109 - */ 110 - print_ip_sym(where); 63 + printk(" %pS\n", (void *)where); 111 64 } 112 65 113 66 static void __dump_instr(const char *lvl, struct pt_regs *regs) ··· 126 171 127 172 skip = !!regs; 128 173 printk("Call trace:\n"); 129 - while (1) { 130 - unsigned long stack; 131 - int ret; 132 - 174 + do { 133 175 /* skip until specified stack frame */ 134 176 if (!skip) { 135 177 dump_backtrace_entry(frame.pc); ··· 141 189 */ 142 190 dump_backtrace_entry(regs->pc); 143 191 } 144 - ret = unwind_frame(tsk, &frame); 145 - if (ret < 0) 146 - break; 147 - if (in_entry_text(frame.pc)) { 148 - stack = frame.fp - offsetof(struct pt_regs, stackframe); 149 - 150 - if (on_accessible_stack(tsk, stack)) 151 - dump_mem("", "Exception stack", stack, 152 - stack + sizeof(struct pt_regs)); 153 - } 154 - } 192 + } while (!unwind_frame(tsk, &frame)); 155 193 156 194 put_task_stack(tsk); 157 195 } ··· 235 293 } 236 294 } 237 295 296 + void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size) 297 + { 298 + regs->pc += size; 299 + 300 + /* 301 + * If we were single stepping, we want to get the step exception after 302 + * we return from the trap. 303 + */ 304 + user_fastforward_single_step(current); 305 + } 306 + 238 307 static LIST_HEAD(undef_hook); 239 308 static DEFINE_RAW_SPINLOCK(undef_lock); 240 309 ··· 311 358 return fn ? 
fn(regs, instr) : 1; 312 359 } 313 360 314 - static void force_signal_inject(int signal, int code, struct pt_regs *regs, 315 - unsigned long address) 361 + void force_signal_inject(int signal, int code, struct pt_regs *regs, 362 + unsigned long address) 316 363 { 317 364 siginfo_t info; 318 365 void __user *pc = (void __user *)instruction_pointer(regs); ··· 326 373 desc = "illegal memory access"; 327 374 break; 328 375 default: 329 - desc = "bad mode"; 376 + desc = "unknown or unrecoverable error"; 330 377 break; 331 378 } 332 379 ··· 433 480 if (ret) 434 481 arm64_notify_segfault(regs, address); 435 482 else 436 - regs->pc += 4; 483 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 437 484 } 438 485 439 486 static void ctr_read_handler(unsigned int esr, struct pt_regs *regs) ··· 443 490 444 491 pt_regs_write_reg(regs, rt, val); 445 492 446 - regs->pc += 4; 493 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 447 494 } 448 495 449 496 static void cntvct_read_handler(unsigned int esr, struct pt_regs *regs) ··· 451 498 int rt = (esr & ESR_ELx_SYS64_ISS_RT_MASK) >> ESR_ELx_SYS64_ISS_RT_SHIFT; 452 499 453 500 pt_regs_write_reg(regs, rt, arch_counter_get_cntvct()); 454 - regs->pc += 4; 501 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 455 502 } 456 503 457 504 static void cntfrq_read_handler(unsigned int esr, struct pt_regs *regs) ··· 459 506 int rt = (esr & ESR_ELx_SYS64_ISS_RT_MASK) >> ESR_ELx_SYS64_ISS_RT_SHIFT; 460 507 461 508 pt_regs_write_reg(regs, rt, arch_timer_get_rate()); 462 - regs->pc += 4; 509 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 463 510 } 464 511 465 512 struct sys64_hook { ··· 556 603 [ESR_ELx_EC_HVC64] = "HVC (AArch64)", 557 604 [ESR_ELx_EC_SMC64] = "SMC (AArch64)", 558 605 [ESR_ELx_EC_SYS64] = "MSR/MRS (AArch64)", 606 + [ESR_ELx_EC_SVE] = "SVE", 559 607 [ESR_ELx_EC_IMP_DEF] = "EL3 IMP DEF", 560 608 [ESR_ELx_EC_IABT_LOW] = "IABT (lower EL)", 561 609 [ESR_ELx_EC_IABT_CUR] = "IABT (current EL)", ··· 
596 642 esr_get_class_string(esr)); 597 643 598 644 die("Oops - bad mode", regs, 0); 599 - local_irq_disable(); 645 + local_daif_mask(); 600 646 panic("bad mode"); 601 647 } 602 648 ··· 662 708 } 663 709 #endif 664 710 711 + asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr) 712 + { 713 + nmi_enter(); 714 + 715 + console_verbose(); 716 + 717 + pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n", 718 + smp_processor_id(), esr, esr_get_class_string(esr)); 719 + __show_regs(regs); 720 + 721 + panic("Asynchronous SError Interrupt"); 722 + } 723 + 665 724 void __pte_error(const char *file, int line, unsigned long val) 666 725 { 667 726 pr_err("%s:%d: bad pte %016lx.\n", file, line, val); ··· 728 761 } 729 762 730 763 /* If thread survives, skip over the BUG instruction and continue: */ 731 - regs->pc += AARCH64_INSN_SIZE; /* skip BRK and resume */ 764 + arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE); 732 765 return DBG_HOOK_HANDLED; 733 766 } 734 767
+1 -1
arch/arm64/kernel/vdso/gettimeofday.S
··· 309 309 b.ne 4f 310 310 ldr x2, 6f 311 311 2: 312 - cbz w1, 3f 312 + cbz x1, 3f 313 313 stp xzr, x2, [x1] 314 314 315 315 3: /* res == NULL. */
+8
arch/arm64/kvm/handle_exit.c
··· 147 147 return 1; 148 148 } 149 149 150 + static int handle_sve(struct kvm_vcpu *vcpu, struct kvm_run *run) 151 + { 152 + /* Until SVE is supported for guests: */ 153 + kvm_inject_undefined(vcpu); 154 + return 1; 155 + } 156 + 150 157 static exit_handle_fn arm_exit_handlers[] = { 151 158 [0 ... ESR_ELx_EC_MAX] = kvm_handle_unknown_ec, 152 159 [ESR_ELx_EC_WFx] = kvm_handle_wfx, ··· 167 160 [ESR_ELx_EC_HVC64] = handle_hvc, 168 161 [ESR_ELx_EC_SMC64] = handle_smc, 169 162 [ESR_ELx_EC_SYS64] = kvm_handle_sys_reg, 163 + [ESR_ELx_EC_SVE] = handle_sve, 170 164 [ESR_ELx_EC_IABT_LOW] = kvm_handle_guest_abort, 171 165 [ESR_ELx_EC_DABT_LOW] = kvm_handle_guest_abort, 172 166 [ESR_ELx_EC_SOFTSTP_LOW]= kvm_handle_guest_debug,
+7 -17
arch/arm64/kvm/hyp/debug-sr.c
··· 65 65 default: write_debug(ptr[0], reg, 0); \ 66 66 } 67 67 68 - #define PMSCR_EL1 sys_reg(3, 0, 9, 9, 0) 69 - 70 - #define PMBLIMITR_EL1 sys_reg(3, 0, 9, 10, 0) 71 - #define PMBLIMITR_EL1_E BIT(0) 72 - 73 - #define PMBIDR_EL1 sys_reg(3, 0, 9, 10, 7) 74 - #define PMBIDR_EL1_P BIT(4) 75 - 76 - #define psb_csync() asm volatile("hint #17") 77 - 78 68 static void __hyp_text __debug_save_spe_vhe(u64 *pmscr_el1) 79 69 { 80 70 /* The vcpu can run. but it can't hide. */ ··· 80 90 return; 81 91 82 92 /* Yes; is it owned by EL3? */ 83 - reg = read_sysreg_s(PMBIDR_EL1); 84 - if (reg & PMBIDR_EL1_P) 93 + reg = read_sysreg_s(SYS_PMBIDR_EL1); 94 + if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT)) 85 95 return; 86 96 87 97 /* No; is the host actually using the thing? */ 88 - reg = read_sysreg_s(PMBLIMITR_EL1); 89 - if (!(reg & PMBLIMITR_EL1_E)) 98 + reg = read_sysreg_s(SYS_PMBLIMITR_EL1); 99 + if (!(reg & BIT(SYS_PMBLIMITR_EL1_E_SHIFT))) 90 100 return; 91 101 92 102 /* Yes; save the control register and disable data generation */ 93 - *pmscr_el1 = read_sysreg_s(PMSCR_EL1); 94 - write_sysreg_s(0, PMSCR_EL1); 103 + *pmscr_el1 = read_sysreg_s(SYS_PMSCR_EL1); 104 + write_sysreg_s(0, SYS_PMSCR_EL1); 95 105 isb(); 96 106 97 107 /* Now drain all buffered data to memory */ ··· 112 122 isb(); 113 123 114 124 /* Re-enable data generation */ 115 - write_sysreg_s(pmscr_el1, PMSCR_EL1); 125 + write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); 116 126 } 117 127 118 128 void __hyp_text __debug_save_state(struct kvm_vcpu *vcpu,
+9 -3
arch/arm64/kvm/hyp/switch.c
··· 48 48 49 49 val = read_sysreg(cpacr_el1); 50 50 val |= CPACR_EL1_TTA; 51 - val &= ~CPACR_EL1_FPEN; 51 + val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN); 52 52 write_sysreg(val, cpacr_el1); 53 53 54 54 write_sysreg(__kvm_hyp_vector, vbar_el1); ··· 59 59 u64 val; 60 60 61 61 val = CPTR_EL2_DEFAULT; 62 - val |= CPTR_EL2_TTA | CPTR_EL2_TFP; 62 + val |= CPTR_EL2_TTA | CPTR_EL2_TFP | CPTR_EL2_TZ; 63 63 write_sysreg(val, cptr_el2); 64 64 } 65 65 ··· 81 81 * it will cause an exception. 82 82 */ 83 83 val = vcpu->arch.hcr_el2; 84 + 84 85 if (!(val & HCR_RW) && system_supports_fpsimd()) { 85 86 write_sysreg(1 << 30, fpexc32_el2); 86 87 isb(); 87 88 } 89 + 90 + if (val & HCR_RW) /* for AArch64 only: */ 91 + val |= HCR_TID3; /* TID3: trap feature register accesses */ 92 + 88 93 write_sysreg(val, hcr_el2); 94 + 89 95 /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ 90 96 write_sysreg(1 << 15, hstr_el2); 91 97 /* ··· 117 111 118 112 write_sysreg(mdcr_el2, mdcr_el2); 119 113 write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); 120 - write_sysreg(CPACR_EL1_FPEN, cpacr_el1); 114 + write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1); 121 115 write_sysreg(vectors, vbar_el1); 122 116 } 123 117
+247 -45
arch/arm64/kvm/sys_regs.c
··· 23 23 #include <linux/bsearch.h> 24 24 #include <linux/kvm_host.h> 25 25 #include <linux/mm.h> 26 + #include <linux/printk.h> 26 27 #include <linux/uaccess.h> 27 28 28 29 #include <asm/cacheflush.h> ··· 893 892 return true; 894 893 } 895 894 895 + /* Read a sanitised cpufeature ID register by sys_reg_desc */ 896 + static u64 read_id_reg(struct sys_reg_desc const *r, bool raz) 897 + { 898 + u32 id = sys_reg((u32)r->Op0, (u32)r->Op1, 899 + (u32)r->CRn, (u32)r->CRm, (u32)r->Op2); 900 + u64 val = raz ? 0 : read_sanitised_ftr_reg(id); 901 + 902 + if (id == SYS_ID_AA64PFR0_EL1) { 903 + if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT)) 904 + pr_err_once("kvm [%i]: SVE unsupported for guests, suppressing\n", 905 + task_pid_nr(current)); 906 + 907 + val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); 908 + } 909 + 910 + return val; 911 + } 912 + 913 + /* cpufeature ID register access trap handlers */ 914 + 915 + static bool __access_id_reg(struct kvm_vcpu *vcpu, 916 + struct sys_reg_params *p, 917 + const struct sys_reg_desc *r, 918 + bool raz) 919 + { 920 + if (p->is_write) 921 + return write_to_read_only(vcpu, p, r); 922 + 923 + p->regval = read_id_reg(r, raz); 924 + return true; 925 + } 926 + 927 + static bool access_id_reg(struct kvm_vcpu *vcpu, 928 + struct sys_reg_params *p, 929 + const struct sys_reg_desc *r) 930 + { 931 + return __access_id_reg(vcpu, p, r, false); 932 + } 933 + 934 + static bool access_raz_id_reg(struct kvm_vcpu *vcpu, 935 + struct sys_reg_params *p, 936 + const struct sys_reg_desc *r) 937 + { 938 + return __access_id_reg(vcpu, p, r, true); 939 + } 940 + 941 + static int reg_from_user(u64 *val, const void __user *uaddr, u64 id); 942 + static int reg_to_user(void __user *uaddr, const u64 *val, u64 id); 943 + static u64 sys_reg_to_index(const struct sys_reg_desc *reg); 944 + 945 + /* 946 + * cpufeature ID register user accessors 947 + * 948 + * For now, these registers are immutable for userspace, so no values 949 + * are stored, and for set_id_reg() we don't 
allow the effective value 950 + * to be changed. 951 + */ 952 + static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, 953 + bool raz) 954 + { 955 + const u64 id = sys_reg_to_index(rd); 956 + const u64 val = read_id_reg(rd, raz); 957 + 958 + return reg_to_user(uaddr, &val, id); 959 + } 960 + 961 + static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr, 962 + bool raz) 963 + { 964 + const u64 id = sys_reg_to_index(rd); 965 + int err; 966 + u64 val; 967 + 968 + err = reg_from_user(&val, uaddr, id); 969 + if (err) 970 + return err; 971 + 972 + /* This is what we mean by invariant: you can't change it. */ 973 + if (val != read_id_reg(rd, raz)) 974 + return -EINVAL; 975 + 976 + return 0; 977 + } 978 + 979 + static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, 980 + const struct kvm_one_reg *reg, void __user *uaddr) 981 + { 982 + return __get_id_reg(rd, uaddr, false); 983 + } 984 + 985 + static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, 986 + const struct kvm_one_reg *reg, void __user *uaddr) 987 + { 988 + return __set_id_reg(rd, uaddr, false); 989 + } 990 + 991 + static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, 992 + const struct kvm_one_reg *reg, void __user *uaddr) 993 + { 994 + return __get_id_reg(rd, uaddr, true); 995 + } 996 + 997 + static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, 998 + const struct kvm_one_reg *reg, void __user *uaddr) 999 + { 1000 + return __set_id_reg(rd, uaddr, true); 1001 + } 1002 + 1003 + /* sys_reg_desc initialiser for known cpufeature ID registers */ 1004 + #define ID_SANITISED(name) { \ 1005 + SYS_DESC(SYS_##name), \ 1006 + .access = access_id_reg, \ 1007 + .get_user = get_id_reg, \ 1008 + .set_user = set_id_reg, \ 1009 + } 1010 + 1011 + /* 1012 + * sys_reg_desc initialiser for architecturally unallocated cpufeature ID 1013 + * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2 1014 
+ * (1 <= crm < 8, 0 <= Op2 < 8). 1015 + */ 1016 + #define ID_UNALLOCATED(crm, op2) { \ 1017 + Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2), \ 1018 + .access = access_raz_id_reg, \ 1019 + .get_user = get_raz_id_reg, \ 1020 + .set_user = set_raz_id_reg, \ 1021 + } 1022 + 1023 + /* 1024 + * sys_reg_desc initialiser for known ID registers that we hide from guests. 1025 + * For now, these are exposed just like unallocated ID regs: they appear 1026 + * RAZ for the guest. 1027 + */ 1028 + #define ID_HIDDEN(name) { \ 1029 + SYS_DESC(SYS_##name), \ 1030 + .access = access_raz_id_reg, \ 1031 + .get_user = get_raz_id_reg, \ 1032 + .set_user = set_raz_id_reg, \ 1033 + } 1034 + 896 1035 /* 897 1036 * Architected system registers. 898 1037 * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2 ··· 1085 944 { SYS_DESC(SYS_DBGVCR32_EL2), NULL, reset_val, DBGVCR32_EL2, 0 }, 1086 945 1087 946 { SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 }, 947 + 948 + /* 949 + * ID regs: all ID_SANITISED() entries here must have corresponding 950 + * entries in arm64_ftr_regs[]. 
951 + */ 952 + 953 + /* AArch64 mappings of the AArch32 ID registers */ 954 + /* CRm=1 */ 955 + ID_SANITISED(ID_PFR0_EL1), 956 + ID_SANITISED(ID_PFR1_EL1), 957 + ID_SANITISED(ID_DFR0_EL1), 958 + ID_HIDDEN(ID_AFR0_EL1), 959 + ID_SANITISED(ID_MMFR0_EL1), 960 + ID_SANITISED(ID_MMFR1_EL1), 961 + ID_SANITISED(ID_MMFR2_EL1), 962 + ID_SANITISED(ID_MMFR3_EL1), 963 + 964 + /* CRm=2 */ 965 + ID_SANITISED(ID_ISAR0_EL1), 966 + ID_SANITISED(ID_ISAR1_EL1), 967 + ID_SANITISED(ID_ISAR2_EL1), 968 + ID_SANITISED(ID_ISAR3_EL1), 969 + ID_SANITISED(ID_ISAR4_EL1), 970 + ID_SANITISED(ID_ISAR5_EL1), 971 + ID_SANITISED(ID_MMFR4_EL1), 972 + ID_UNALLOCATED(2,7), 973 + 974 + /* CRm=3 */ 975 + ID_SANITISED(MVFR0_EL1), 976 + ID_SANITISED(MVFR1_EL1), 977 + ID_SANITISED(MVFR2_EL1), 978 + ID_UNALLOCATED(3,3), 979 + ID_UNALLOCATED(3,4), 980 + ID_UNALLOCATED(3,5), 981 + ID_UNALLOCATED(3,6), 982 + ID_UNALLOCATED(3,7), 983 + 984 + /* AArch64 ID registers */ 985 + /* CRm=4 */ 986 + ID_SANITISED(ID_AA64PFR0_EL1), 987 + ID_SANITISED(ID_AA64PFR1_EL1), 988 + ID_UNALLOCATED(4,2), 989 + ID_UNALLOCATED(4,3), 990 + ID_UNALLOCATED(4,4), 991 + ID_UNALLOCATED(4,5), 992 + ID_UNALLOCATED(4,6), 993 + ID_UNALLOCATED(4,7), 994 + 995 + /* CRm=5 */ 996 + ID_SANITISED(ID_AA64DFR0_EL1), 997 + ID_SANITISED(ID_AA64DFR1_EL1), 998 + ID_UNALLOCATED(5,2), 999 + ID_UNALLOCATED(5,3), 1000 + ID_HIDDEN(ID_AA64AFR0_EL1), 1001 + ID_HIDDEN(ID_AA64AFR1_EL1), 1002 + ID_UNALLOCATED(5,6), 1003 + ID_UNALLOCATED(5,7), 1004 + 1005 + /* CRm=6 */ 1006 + ID_SANITISED(ID_AA64ISAR0_EL1), 1007 + ID_SANITISED(ID_AA64ISAR1_EL1), 1008 + ID_UNALLOCATED(6,2), 1009 + ID_UNALLOCATED(6,3), 1010 + ID_UNALLOCATED(6,4), 1011 + ID_UNALLOCATED(6,5), 1012 + ID_UNALLOCATED(6,6), 1013 + ID_UNALLOCATED(6,7), 1014 + 1015 + /* CRm=7 */ 1016 + ID_SANITISED(ID_AA64MMFR0_EL1), 1017 + ID_SANITISED(ID_AA64MMFR1_EL1), 1018 + ID_SANITISED(ID_AA64MMFR2_EL1), 1019 + ID_UNALLOCATED(7,3), 1020 + ID_UNALLOCATED(7,4), 1021 + ID_UNALLOCATED(7,5), 1022 + ID_UNALLOCATED(7,6), 1023 
+	ID_UNALLOCATED(7,7),
+
 	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
 	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
 	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
···
 	if (!r)
 		r = find_reg(&params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));

-	/* Not saved in the sys_reg array? */
-	if (r && !r->reg)
+	/* Not saved in the sys_reg array and not otherwise accessible? */
+	if (r && !(r->reg || r->get_user))
 		r = NULL;

 	return r;
···
 FUNCTION_INVARIANT(midr_el1)
 FUNCTION_INVARIANT(ctr_el0)
 FUNCTION_INVARIANT(revidr_el1)
-FUNCTION_INVARIANT(id_pfr0_el1)
-FUNCTION_INVARIANT(id_pfr1_el1)
-FUNCTION_INVARIANT(id_dfr0_el1)
-FUNCTION_INVARIANT(id_afr0_el1)
-FUNCTION_INVARIANT(id_mmfr0_el1)
-FUNCTION_INVARIANT(id_mmfr1_el1)
-FUNCTION_INVARIANT(id_mmfr2_el1)
-FUNCTION_INVARIANT(id_mmfr3_el1)
-FUNCTION_INVARIANT(id_isar0_el1)
-FUNCTION_INVARIANT(id_isar1_el1)
-FUNCTION_INVARIANT(id_isar2_el1)
-FUNCTION_INVARIANT(id_isar3_el1)
-FUNCTION_INVARIANT(id_isar4_el1)
-FUNCTION_INVARIANT(id_isar5_el1)
 FUNCTION_INVARIANT(clidr_el1)
 FUNCTION_INVARIANT(aidr_el1)
···
 static struct sys_reg_desc invariant_sys_regs[] = {
 	{ SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 },
 	{ SYS_DESC(SYS_REVIDR_EL1), NULL, get_revidr_el1 },
-	{ SYS_DESC(SYS_ID_PFR0_EL1), NULL, get_id_pfr0_el1 },
-	{ SYS_DESC(SYS_ID_PFR1_EL1), NULL, get_id_pfr1_el1 },
-	{ SYS_DESC(SYS_ID_DFR0_EL1), NULL, get_id_dfr0_el1 },
-	{ SYS_DESC(SYS_ID_AFR0_EL1), NULL, get_id_afr0_el1 },
-	{ SYS_DESC(SYS_ID_MMFR0_EL1), NULL, get_id_mmfr0_el1 },
-	{ SYS_DESC(SYS_ID_MMFR1_EL1), NULL, get_id_mmfr1_el1 },
-	{ SYS_DESC(SYS_ID_MMFR2_EL1), NULL, get_id_mmfr2_el1 },
-	{ SYS_DESC(SYS_ID_MMFR3_EL1), NULL, get_id_mmfr3_el1 },
-	{ SYS_DESC(SYS_ID_ISAR0_EL1), NULL, get_id_isar0_el1 },
-	{ SYS_DESC(SYS_ID_ISAR1_EL1), NULL, get_id_isar1_el1 },
-	{ SYS_DESC(SYS_ID_ISAR2_EL1), NULL, get_id_isar2_el1 },
-	{ SYS_DESC(SYS_ID_ISAR3_EL1), NULL, get_id_isar3_el1 },
-	{ SYS_DESC(SYS_ID_ISAR4_EL1), NULL, get_id_isar4_el1 },
-	{ SYS_DESC(SYS_ID_ISAR5_EL1), NULL, get_id_isar5_el1 },
 	{ SYS_DESC(SYS_CLIDR_EL1), NULL, get_clidr_el1 },
 	{ SYS_DESC(SYS_AIDR_EL1), NULL, get_aidr_el1 },
 	{ SYS_DESC(SYS_CTR_EL0), NULL, get_ctr_el0 },
···
 	return true;
 }

+static int walk_one_sys_reg(const struct sys_reg_desc *rd,
+			    u64 __user **uind,
+			    unsigned int *total)
+{
+	/*
+	 * Ignore registers we trap but don't save,
+	 * and for which no custom user accessor is provided.
+	 */
+	if (!(rd->reg || rd->get_user))
+		return 0;
+
+	if (!copy_reg_to_user(rd, uind))
+		return -EFAULT;
+
+	(*total)++;
+	return 0;
+}
+
 /* Assumed ordered tables, see kvm_sys_reg_table_init. */
 static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 {
 	const struct sys_reg_desc *i1, *i2, *end1, *end2;
 	unsigned int total = 0;
 	size_t num;
+	int err;

 	/* We check for duplicates here, to allow arch-specific overrides. */
 	i1 = get_target_table(vcpu->arch.target, true, &num);
···
 	while (i1 || i2) {
 		int cmp = cmp_sys_reg(i1, i2);
 		/* target-specific overrides generic entry. */
-		if (cmp <= 0) {
-			/* Ignore registers we trap but don't save. */
-			if (i1->reg) {
-				if (!copy_reg_to_user(i1, &uind))
-					return -EFAULT;
-				total++;
-			}
-		} else {
-			/* Ignore registers we trap but don't save. */
-			if (i2->reg) {
-				if (!copy_reg_to_user(i2, &uind))
-					return -EFAULT;
-				total++;
-			}
-		}
+		if (cmp <= 0)
+			err = walk_one_sys_reg(i1, &uind, &total);
+		else
+			err = walk_one_sys_reg(i2, &uind, &total);
+
+		if (err)
+			return err;

 		if (cmp <= 0 && ++i1 == end1)
 			i1 = NULL;
+1 -1
arch/arm64/lib/Makefile
···
 		   copy_to_user.o copy_in_user.o copy_page.o	\
 		   clear_page.o memchr.o memcpy.o memmove.o memset.o	\
 		   memcmp.o strcmp.o strncmp.o strlen.o strnlen.o	\
-		   strchr.o strrchr.o
+		   strchr.o strrchr.o tishift.o

 # Tell the compiler to treat all general purpose registers (with the
 # exception of the IP registers, which are already handled by the caller
+19 -4
arch/arm64/lib/delay.c
···
 #include <linux/module.h>
 #include <linux/timex.h>

+#include <clocksource/arm_arch_timer.h>
+
+#define USECS_TO_CYCLES(time_usecs)		\
+	xloops_to_cycles((time_usecs) * 0x10C7UL)
+
+static inline unsigned long xloops_to_cycles(unsigned long xloops)
+{
+	return (xloops * loops_per_jiffy * HZ) >> 32;
+}
+
 void __delay(unsigned long cycles)
 {
 	cycles_t start = get_cycles();
+
+	if (arch_timer_evtstrm_available()) {
+		const cycles_t timer_evt_period =
+			USECS_TO_CYCLES(ARCH_TIMER_EVT_STREAM_PERIOD_US);
+
+		while ((get_cycles() - start + timer_evt_period) < cycles)
+			wfe();
+	}

 	while ((get_cycles() - start) < cycles)
 		cpu_relax();
···
 inline void __const_udelay(unsigned long xloops)
 {
-	unsigned long loops;
-
-	loops = xloops * loops_per_jiffy * HZ;
-	__delay(loops >> 32);
+	__delay(xloops_to_cycles(xloops));
 }
 EXPORT_SYMBOL(__const_udelay);
+80
arch/arm64/lib/tishift.S
···
+/*
+ * Copyright (C) 2017 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+ENTRY(__ashlti3)
+	cbz	x2, 1f
+	mov	x3, #64
+	sub	x3, x3, x2
+	cmp	x3, #0
+	b.le	2f
+	lsl	x1, x1, x2
+	lsr	x3, x0, x3
+	lsl	x2, x0, x2
+	orr	x1, x1, x3
+	mov	x0, x2
+1:
+	ret
+2:
+	neg	w1, w3
+	mov	x2, #0
+	lsl	x1, x0, x1
+	mov	x0, x2
+	ret
+ENDPROC(__ashlti3)
+
+ENTRY(__ashrti3)
+	cbz	x2, 3f
+	mov	x3, #64
+	sub	x3, x3, x2
+	cmp	x3, #0
+	b.le	4f
+	lsr	x0, x0, x2
+	lsl	x3, x1, x3
+	asr	x2, x1, x2
+	orr	x0, x0, x3
+	mov	x1, x2
+3:
+	ret
+4:
+	neg	w0, w3
+	asr	x2, x1, #63
+	asr	x0, x1, x0
+	mov	x1, x2
+	ret
+ENDPROC(__ashrti3)
+
+ENTRY(__lshrti3)
+	cbz	x2, 1f
+	mov	x3, #64
+	sub	x3, x3, x2
+	cmp	x3, #0
+	b.le	2f
+	lsr	x0, x0, x2
+	lsl	x3, x1, x3
+	lsr	x2, x1, x2
+	orr	x0, x0, x3
+	mov	x1, x2
+1:
+	ret
+2:
+	neg	w0, w3
+	mov	x2, #0
+	lsr	x0, x1, x0
+	mov	x1, x2
+	ret
+ENDPROC(__lshrti3)
+2 -3
arch/arm64/mm/dma-mapping.c
···
 	/* create a coherent mapping */
 	page = virt_to_page(ptr);
 	coherent_ptr = dma_common_contiguous_remap(page, size, VM_USERMAP,
-						   prot, NULL);
+						   prot, __builtin_return_address(0));
 	if (!coherent_ptr)
 		goto no_map;
···
 			      unsigned long pfn, size_t size)
 {
 	int ret = -ENXIO;
-	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >>
-					PAGE_SHIFT;
+	unsigned long nr_vma_pages = vma_pages(vma);
 	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long off = vma->vm_pgoff;
+13 -59
arch/arm64/mm/fault.c
···
 		 (esr & ESR_ELx_WNR) >> ESR_ELx_WNR_SHIFT);
 }

-/*
- * Decode mem abort information
- */
 static void mem_abort_decode(unsigned int esr)
 {
 	pr_alert("Mem abort info:\n");

+	pr_alert("  ESR = 0x%08x\n", esr);
 	pr_alert("  Exception class = %s, IL = %u bits\n",
 		 esr_get_class_string(esr),
 		 (esr & ESR_ELx_IL) ? 32 : 16);
···
 	return false;
 }

-/*
- * The kernel tried to access some page that wasn't present.
- */
 static void __do_kernel_fault(unsigned long addr, unsigned int esr,
 			      struct pt_regs *regs)
 {
···
 	if (!is_el1_instruction_abort(esr) && fixup_exception(regs))
 		return;

-	/*
-	 * No handler, we'll have to terminate things with extreme prejudice.
-	 */
 	bust_spinlocks(1);

 	if (is_permission_fault(esr, regs, addr)) {
···
 	do_exit(SIGKILL);
 }

-/*
- * Something tried to access memory that isn't in our memory map. User mode
- * accesses just cause a SIGSEGV
- */
 static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
 			    unsigned int esr, unsigned int sig, int code,
 			    struct pt_regs *regs, int fault)
···
 	return 0;
 }

-/*
- * First Level Translation Fault Handler
- *
- * We enter here because the first level page table doesn't contain a valid
- * entry for the address.
- *
- * If the address is in kernel space (>= TASK_SIZE), then we are probably
- * faulting in the vmalloc() area.
- *
- * If the init_task's first level page tables contains the relevant entry, we
- * copy the it to this task. If not, we send the process a signal, fixup the
- * exception, or oops the kernel.
- *
- * NOTE! We MUST NOT take any locks for this case. We may be in an interrupt
- * or a critical region, and should only copy the information from the master
- * page table, nothing more.
- */
 static int __kprobes do_translation_fault(unsigned long addr,
 					  unsigned int esr,
 					  struct pt_regs *regs)
···
 	return 0;
 }

-/*
- * This abort handler always returns "fault".
- */
 static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 {
-	return 1;
+	return 1; /* "fault" */
 }

-/*
- * This abort handler deals with Synchronous External Abort.
- * It calls notifiers, and then returns "fault".
- */
 static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 {
 	struct siginfo info;
···
 	{ do_sea,		SIGBUS,  0,		"level 1 (translation table walk)" },
 	{ do_sea,		SIGBUS,  0,		"level 2 (translation table walk)" },
 	{ do_sea,		SIGBUS,  0,		"level 3 (translation table walk)" },
-	{ do_sea,		SIGBUS,  0,		"synchronous parity or ECC error" },
+	{ do_sea,		SIGBUS,  0,		"synchronous parity or ECC error" },	// Reserved when RAS is implemented
 	{ do_bad,		SIGBUS,  0,		"unknown 25" },
 	{ do_bad,		SIGBUS,  0,		"unknown 26" },
 	{ do_bad,		SIGBUS,  0,		"unknown 27" },
-	{ do_sea,		SIGBUS,  0,		"level 0 synchronous parity error (translation table walk)" },
-	{ do_sea,		SIGBUS,  0,		"level 1 synchronous parity error (translation table walk)" },
-	{ do_sea,		SIGBUS,  0,		"level 2 synchronous parity error (translation table walk)" },
-	{ do_sea,		SIGBUS,  0,		"level 3 synchronous parity error (translation table walk)" },
+	{ do_sea,		SIGBUS,  0,		"level 0 synchronous parity error (translation table walk)" },	// Reserved when RAS is implemented
+	{ do_sea,		SIGBUS,  0,		"level 1 synchronous parity error (translation table walk)" },	// Reserved when RAS is implemented
+	{ do_sea,		SIGBUS,  0,		"level 2 synchronous parity error (translation table walk)" },	// Reserved when RAS is implemented
+	{ do_sea,		SIGBUS,  0,		"level 3 synchronous parity error (translation table walk)" },	// Reserved when RAS is implemented
 	{ do_bad,		SIGBUS,  0,		"unknown 32" },
 	{ do_alignment_fault,	SIGBUS,  BUS_ADRALN,	"alignment fault" },
 	{ do_bad,		SIGBUS,  0,		"unknown 34" },
···
 	{ do_bad,		SIGBUS,  0,		"unknown 46" },
 	{ do_bad,		SIGBUS,  0,		"unknown 47" },
 	{ do_bad,		SIGBUS,  0,		"TLB conflict abort" },
-	{ do_bad,		SIGBUS,  0,		"unknown 49" },
+	{ do_bad,		SIGBUS,  0,		"Unsupported atomic hardware update fault" },
 	{ do_bad,		SIGBUS,  0,		"unknown 50" },
 	{ do_bad,		SIGBUS,  0,		"unknown 51" },
 	{ do_bad,		SIGBUS,  0,		"implementation fault (lockdown abort)" },
···
 	{ do_bad,		SIGBUS,  0,		"unknown 63" },
 };

-/*
- * Handle Synchronous External Aborts that occur in a guest kernel.
- *
- * The return value will be zero if the SEA was successfully handled
- * and non-zero if there was an error processing the error or there was
- * no error to process.
- */
 int handle_guest_sea(phys_addr_t addr, unsigned int esr)
 {
 	int ret = -ENOENT;
···
 	return ret;
 }

-/*
- * Dispatch a data abort to the relevant handler.
- */
 asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
 					 struct pt_regs *regs)
 {
···
 	if (!inf->fn(addr, esr, regs))
 		return;

-	pr_alert("Unhandled fault: %s (0x%08x) at 0x%016lx\n",
-		 inf->name, esr, addr);
+	pr_alert("Unhandled fault: %s at 0x%016lx\n",
+		 inf->name, addr);

 	mem_abort_decode(esr);

+	if (!user_mode(regs))
+		show_pte(addr);
+
 	info.si_signo = inf->sig;
 	info.si_errno = 0;
···
 	arm64_notify_die("", regs, &info, esr);
 }

-/*
- * Handle stack alignment exceptions.
- */
 asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
 					   unsigned int esr,
 					   struct pt_regs *regs)
+4 -5
arch/arm64/mm/proc.S
···
 	/*
 	 * __cpu_setup() cleared MDSCR_EL1.MDE and friends, before unmasking
 	 * debug exceptions. By restoring MDSCR_EL1 here, we may take a debug
-	 * exception. Mask them until local_dbg_restore() in cpu_suspend()
+	 * exception. Mask them until local_daif_restore() in cpu_suspend()
 	 * resets them.
 	 */
-	disable_dbg
+	disable_daif
 	msr	mdscr_el1, x10

 	msr	sctlr_el1, x12
···
 * called by anything else. It can only be executed from a TTBR0 mapping.
 */
ENTRY(idmap_cpu_replace_ttbr1)
-	mrs	x2, daif
-	msr	daifset, #0xf
+	save_and_disable_daif flags=x2

 	adrp	x1, empty_zero_page
 	msr	ttbr1_el1, x1
···
 	msr	ttbr1_el1, x0
 	isb

-	msr	daif, x2
+	restore_daif x2

 	ret
ENDPROC(idmap_cpu_replace_ttbr1)
+1 -1
drivers/acpi/arm64/gtdt.c
···
 	struct acpi_gtdt_timer_entry *gtdt_frame;

 	if (!block->timer_count) {
-		pr_err(FW_BUG "GT block present, but frame count is zero.");
+		pr_err(FW_BUG "GT block present, but frame count is zero.\n");
 		return -ENODEV;
 	}
+200 -58
drivers/acpi/arm64/iort.c
···
 *
 * Returns: fwnode_handle pointer on success, NULL on failure
 */
-static inline
-struct fwnode_handle *iort_get_fwnode(struct acpi_iort_node *node)
+static inline struct fwnode_handle *iort_get_fwnode(
+			struct acpi_iort_node *node)
 {
 	struct iort_fwnode *curr;
 	struct fwnode_handle *fwnode = NULL;
···
 		}
 	}
 	spin_unlock(&iort_fwnode_lock);
+}
+
+/**
+ * iort_get_iort_node() - Retrieve iort_node associated with an fwnode
+ *
+ * @fwnode: fwnode associated with device to be looked-up
+ *
+ * Returns: iort_node pointer on success, NULL on failure
+ */
+static inline struct acpi_iort_node *iort_get_iort_node(
+			struct fwnode_handle *fwnode)
+{
+	struct iort_fwnode *curr;
+	struct acpi_iort_node *iort_node = NULL;
+
+	spin_lock(&iort_fwnode_lock);
+	list_for_each_entry(curr, &iort_fwnode_list, list) {
+		if (curr->fwnode == fwnode) {
+			iort_node = curr->iort_node;
+			break;
+		}
+	}
+	spin_unlock(&iort_fwnode_lock);
+
+	return iort_node;
 }

 typedef acpi_status (*iort_find_node_callback)
···
 	return 0;
 }

-static
-struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node,
-					u32 *id_out, int index)
+static struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node,
+					       u32 *id_out, int index)
 {
 	struct acpi_iort_node *parent;
 	struct acpi_iort_id_mapping *map;
···
 	if (map->flags & ACPI_IORT_ID_SINGLE_MAPPING) {
 		if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT ||
-		    node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX) {
+		    node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX ||
+		    node->type == ACPI_IORT_NODE_SMMU_V3) {
 			*id_out = map->output_base;
 			return parent;
 		}
···
 	return NULL;
 }
+
+#if (ACPI_CA_VERSION > 0x20170929)
+static int iort_get_id_mapping_index(struct acpi_iort_node *node)
+{
+	struct acpi_iort_smmu_v3 *smmu;
+
+	switch (node->type) {
+	case ACPI_IORT_NODE_SMMU_V3:
+		/*
+		 * SMMUv3 dev ID mapping index was introduced in revision 1
+		 * table, not available in revision 0
+		 */
+		if (node->revision < 1)
+			return -EINVAL;
+
+		smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
+		/*
+		 * ID mapping index is only ignored if all interrupts are
+		 * GSIV based
+		 */
+		if (smmu->event_gsiv && smmu->pri_gsiv && smmu->gerr_gsiv
+		    && smmu->sync_gsiv)
+			return -EINVAL;
+
+		if (smmu->id_mapping_index >= node->mapping_count) {
+			pr_err(FW_BUG "[node %p type %d] ID mapping index overflows valid mappings\n",
+			       node, node->type);
+			return -EINVAL;
+		}
+
+		return smmu->id_mapping_index;
+	default:
+		return -EINVAL;
+	}
+}
+#else
+static inline int iort_get_id_mapping_index(struct acpi_iort_node *node)
+{
+	return -EINVAL;
+}
+#endif

 static struct acpi_iort_node *iort_node_map_id(struct acpi_iort_node *node,
 					       u32 id_in, u32 *id_out,
···
 	/* Parse the ID mapping tree to find specified node type */
 	while (node) {
 		struct acpi_iort_id_mapping *map;
-		int i;
+		int i, index;

 		if (IORT_TYPE_MASK(node->type) & type_mask) {
 			if (id_out)
···
 			goto fail_map;
 		}

+		/*
+		 * Get the special ID mapping index (if any) and skip its
+		 * associated ID map to prevent erroneous multi-stage
+		 * IORT ID translations.
+		 */
+		index = iort_get_id_mapping_index(node);
+
 		/* Do the ID translation */
 		for (i = 0; i < node->mapping_count; i++, map++) {
+			/* if it is special mapping index, skip it */
+			if (i == index)
+				continue;
+
 			if (!iort_id_map(map, node->type, id, &id))
 				break;
 		}
···
 	return NULL;
 }

-static
-struct acpi_iort_node *iort_node_map_platform_id(struct acpi_iort_node *node,
-						 u32 *id_out, u8 type_mask,
-						 int index)
+static struct acpi_iort_node *iort_node_map_platform_id(
+		struct acpi_iort_node *node, u32 *id_out, u8 type_mask,
+		int index)
 {
 	struct acpi_iort_node *parent;
 	u32 id;
···
 {
 	struct pci_bus *pbus;

-	if (!dev_is_pci(dev))
+	if (!dev_is_pci(dev)) {
+		struct acpi_iort_node *node;
+		/*
+		 * scan iort_fwnode_list to see if it's an iort platform
+		 * device (such as SMMU, PMCG),its iort node already cached
+		 * and associated with fwnode when iort platform devices
+		 * were initialized.
+		 */
+		node = iort_get_iort_node(dev->fwnode);
+		if (node)
+			return node;
+
+		/*
+		 * if not, then it should be a platform device defined in
+		 * DSDT/SSDT (with Named Component node in IORT)
+		 */
 		return iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT,
 				      iort_match_node_callback, dev);
+	}

 	/* Find a PCI root bus */
 	pbus = to_pci_dev(dev)->bus;
···
 */
 int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id)
 {
-	int i;
+	int i, index;
 	struct acpi_iort_node *node;

 	node = iort_find_dev_node(dev);
 	if (!node)
 		return -ENODEV;

-	for (i = 0; i < node->mapping_count; i++) {
-		if (iort_node_map_platform_id(node, dev_id, IORT_MSI_TYPE, i))
+	index = iort_get_id_mapping_index(node);
+	/* if there is a valid index, go get the dev_id directly */
+	if (index >= 0) {
+		if (iort_node_get_id(node, dev_id, index))
 			return 0;
+	} else {
+		for (i = 0; i < node->mapping_count; i++) {
+			if (iort_node_map_platform_id(node, dev_id,
+						      IORT_MSI_TYPE, i))
+				return 0;
+		}
 	}

 	return -ENODEV;
···
 		return NULL;

 	return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
+}
+
+static void iort_set_device_domain(struct device *dev,
+				   struct acpi_iort_node *node)
+{
+	struct acpi_iort_its_group *its;
+	struct acpi_iort_node *msi_parent;
+	struct acpi_iort_id_mapping *map;
+	struct fwnode_handle *iort_fwnode;
+	struct irq_domain *domain;
+	int index;
+
+	index = iort_get_id_mapping_index(node);
+	if (index < 0)
+		return;
+
+	map = ACPI_ADD_PTR(struct acpi_iort_id_mapping, node,
+			   node->mapping_offset + index * sizeof(*map));
+
+	/* Firmware bug! */
+	if (!map->output_reference ||
+	    !(map->flags & ACPI_IORT_ID_SINGLE_MAPPING)) {
+		pr_err(FW_BUG "[node %p type %d] Invalid MSI mapping\n",
+		       node, node->type);
+		return;
+	}
+
+	msi_parent = ACPI_ADD_PTR(struct acpi_iort_node, iort_table,
+				  map->output_reference);
+
+	if (!msi_parent || msi_parent->type != ACPI_IORT_NODE_ITS_GROUP)
+		return;
+
+	/* Move to ITS specific data */
+	its = (struct acpi_iort_its_group *)msi_parent->node_data;
+
+	iort_fwnode = iort_find_domain_token(its->identifiers[0]);
+	if (!iort_fwnode)
+		return;
+
+	domain = irq_find_matching_fwnode(iort_fwnode, DOMAIN_BUS_PLATFORM_MSI);
+	if (domain)
+		dev_set_msi_domain(dev, domain);
 }

 /**
···
 }

 #ifdef CONFIG_IOMMU_API
-static inline
-const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
+static inline const struct iommu_ops *iort_fwspec_iommu_ops(
+				struct iommu_fwspec *fwspec)
 {
 	return (fwspec && fwspec->ops) ? fwspec->ops : NULL;
 }

-static inline
-int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
+static inline int iort_add_device_replay(const struct iommu_ops *ops,
+					 struct device *dev)
 {
 	int err = 0;
···
 	return err;
 }
 #else
-static inline
-const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
+static inline const struct iommu_ops *iort_fwspec_iommu_ops(
+				struct iommu_fwspec *fwspec)
 { return NULL; }
-static inline
-int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
+static inline int iort_add_device_replay(const struct iommu_ops *ops,
+					 struct device *dev)
 { return 0; }
 #endif
···
 	return smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE;
 }

-#if defined(CONFIG_ACPI_NUMA) && defined(ACPI_IORT_SMMU_V3_PXM_VALID)
+#if defined(CONFIG_ACPI_NUMA)
 /*
  * set numa proximity domain for smmuv3 device
  */
···
 	return smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK;
 }

-struct iort_iommu_config {
+struct iort_dev_config {
 	const char *name;
-	int (*iommu_init)(struct acpi_iort_node *node);
-	bool (*iommu_is_coherent)(struct acpi_iort_node *node);
-	int (*iommu_count_resources)(struct acpi_iort_node *node);
-	void (*iommu_init_resources)(struct resource *res,
+	int (*dev_init)(struct acpi_iort_node *node);
+	bool (*dev_is_coherent)(struct acpi_iort_node *node);
+	int (*dev_count_resources)(struct acpi_iort_node *node);
+	void (*dev_init_resources)(struct resource *res,
 				     struct acpi_iort_node *node);
-	void (*iommu_set_proximity)(struct device *dev,
+	void (*dev_set_proximity)(struct device *dev,
 				    struct acpi_iort_node *node);
 };

-static const struct iort_iommu_config iort_arm_smmu_v3_cfg __initconst = {
+static const struct iort_dev_config iort_arm_smmu_v3_cfg __initconst = {
 	.name = "arm-smmu-v3",
-	.iommu_is_coherent = arm_smmu_v3_is_coherent,
-	.iommu_count_resources = arm_smmu_v3_count_resources,
-	.iommu_init_resources = arm_smmu_v3_init_resources,
-	.iommu_set_proximity = arm_smmu_v3_set_proximity,
+	.dev_is_coherent = arm_smmu_v3_is_coherent,
+	.dev_count_resources = arm_smmu_v3_count_resources,
+	.dev_init_resources = arm_smmu_v3_init_resources,
+	.dev_set_proximity = arm_smmu_v3_set_proximity,
 };

-static const struct iort_iommu_config iort_arm_smmu_cfg __initconst = {
+static const struct iort_dev_config iort_arm_smmu_cfg __initconst = {
 	.name = "arm-smmu",
-	.iommu_is_coherent = arm_smmu_is_coherent,
-	.iommu_count_resources = arm_smmu_count_resources,
-	.iommu_init_resources = arm_smmu_init_resources
+	.dev_is_coherent = arm_smmu_is_coherent,
+	.dev_count_resources = arm_smmu_count_resources,
+	.dev_init_resources = arm_smmu_init_resources
 };

-static __init
-const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
+static __init const struct iort_dev_config *iort_get_dev_cfg(
+			struct acpi_iort_node *node)
 {
 	switch (node->type) {
 	case ACPI_IORT_NODE_SMMU_V3:
···
 }

 /**
- * iort_add_smmu_platform_device() - Allocate a platform device for SMMU
- * @node: Pointer to SMMU ACPI IORT node
+ * iort_add_platform_device() - Allocate a platform device for IORT node
+ * @node: Pointer to device ACPI IORT node
  *
  * Returns: 0 on success, <0 failure
  */
-static int __init iort_add_smmu_platform_device(struct acpi_iort_node *node)
+static int __init iort_add_platform_device(struct acpi_iort_node *node,
+					   const struct iort_dev_config *ops)
 {
 	struct fwnode_handle *fwnode;
 	struct platform_device *pdev;
 	struct resource *r;
 	enum dev_dma_attr attr;
 	int ret, count;
-	const struct iort_iommu_config *ops = iort_get_iommu_cfg(node);
-
-	if (!ops)
-		return -ENODEV;

 	pdev = platform_device_alloc(ops->name, PLATFORM_DEVID_AUTO);
 	if (!pdev)
 		return -ENOMEM;

-	if (ops->iommu_set_proximity)
-		ops->iommu_set_proximity(&pdev->dev, node);
+	if (ops->dev_set_proximity)
+		ops->dev_set_proximity(&pdev->dev, node);

-	count = ops->iommu_count_resources(node);
+	count = ops->dev_count_resources(node);

 	r = kcalloc(count, sizeof(*r), GFP_KERNEL);
 	if (!r) {
···
 		goto dev_put;
 	}

-	ops->iommu_init_resources(r, node);
+	ops->dev_init_resources(r, node);

 	ret = platform_device_add_resources(pdev, r, count);
 	/*
···
 	pdev->dev.fwnode = fwnode;

-	attr = ops->iommu_is_coherent(node) ?
-			DEV_DMA_COHERENT : DEV_DMA_NON_COHERENT;
+	attr = ops->dev_is_coherent && ops->dev_is_coherent(node) ?
+			DEV_DMA_COHERENT : DEV_DMA_NON_COHERENT;

 	/* Configure DMA for the page table walker */
 	acpi_dma_configure(&pdev->dev, attr);
+
+	iort_set_device_domain(&pdev->dev, node);

 	ret = platform_device_add(pdev);
 	if (ret)
···
 	struct fwnode_handle *fwnode;
 	int i, ret;
 	bool acs_enabled = false;
+	const struct iort_dev_config *ops;

 	/*
 	 * iort_table and iort both point to the start of IORT table, but
···
 		if (!acs_enabled)
 			acs_enabled = iort_enable_acs(iort_node);

-		if ((iort_node->type == ACPI_IORT_NODE_SMMU) ||
-		    (iort_node->type == ACPI_IORT_NODE_SMMU_V3)) {
-
+		ops = iort_get_dev_cfg(iort_node);
+		if (ops) {
 			fwnode = acpi_alloc_fwnode_static();
 			if (!fwnode)
 				return;

 			iort_set_fwnode(iort_node, fwnode);

-			ret = iort_add_smmu_platform_device(iort_node);
+			ret = iort_add_platform_device(iort_node, ops);
 			if (ret) {
 				iort_delete_fwnode(iort_node);
 				acpi_free_fwnode_static(fwnode);
+1
drivers/bus/arm-ccn.c
···
 	/* Perf driver registration */
 	ccn->dt.pmu = (struct pmu) {
+		.module = THIS_MODULE,
 		.attr_groups = arm_ccn_pmu_attr_groups,
 		.task_ctx_nr = perf_invalid_context,
 		.event_init = arm_ccn_pmu_event_init,
+22 -3
drivers/clocksource/arm_arch_timer.c
···
 static bool arch_counter_suspend_stop;
 static bool vdso_default = true;

+static cpumask_t evtstrm_available = CPU_MASK_NONE;
 static bool evtstrm_enable = IS_ENABLED(CONFIG_ARM_ARCH_TIMER_EVTSTREAM);

 static int __init early_evtstrm_cfg(char *buf)
···
 #ifdef CONFIG_COMPAT
 	compat_elf_hwcap |= COMPAT_HWCAP_EVTSTRM;
 #endif
+	cpumask_set_cpu(smp_processor_id(), &evtstrm_available);
 }

 static void arch_timer_configure_evtstream(void)
···
 	return arch_timer_rate;
 }

+bool arch_timer_evtstrm_available(void)
+{
+	/*
+	 * We might get called from a preemptible context. This is fine
+	 * because availability of the event stream should be always the same
+	 * for a preemptible context and context where we might resume a task.
+	 */
+	return cpumask_test_cpu(raw_smp_processor_id(), &evtstrm_available);
+}
+
 static u64 arch_counter_get_cntvct_mem(void)
 {
 	u32 vct_lo, vct_hi, tmp_hi;
···
 {
 	struct clock_event_device *clk = this_cpu_ptr(arch_timer_evt);

+	cpumask_clear_cpu(smp_processor_id(), &evtstrm_available);
+
 	arch_timer_stop(clk);
 	return 0;
 }
···
 static int arch_timer_cpu_pm_notify(struct notifier_block *self,
 				    unsigned long action, void *hcpu)
 {
-	if (action == CPU_PM_ENTER)
+	if (action == CPU_PM_ENTER) {
 		__this_cpu_write(saved_cntkctl, arch_timer_get_cntkctl());
-	else if (action == CPU_PM_ENTER_FAILED || action == CPU_PM_EXIT)
+
+		cpumask_clear_cpu(smp_processor_id(), &evtstrm_available);
+	} else if (action == CPU_PM_ENTER_FAILED || action == CPU_PM_EXIT) {
 		arch_timer_set_cntkctl(__this_cpu_read(saved_cntkctl));
+
+		if (elf_hwcap & HWCAP_EVTSTRM)
+			cpumask_set_cpu(smp_processor_id(), &evtstrm_available);
+	}
 	return NOTIFY_OK;
 }
···
 	err = arch_timer_cpu_pm_init();
 	if (err)
 		goto out_unreg_notify;
-

 	/* Register and immediately configure the timer on the boot CPU */
 	err = cpuhp_setup_state(CPUHP_AP_ARM_ARCH_TIMER_STARTING,
+15
drivers/perf/Kconfig
···
 	depends on ARM_PMU && ACPI
 	def_bool y

+config HISI_PMU
+	bool "HiSilicon SoC PMU"
+	depends on ARM64 && ACPI
+	help
+	  Support for HiSilicon SoC uncore performance monitoring
+	  unit (PMU), such as: L3C, HHA and DDRC.
+
 config QCOM_L2_PMU
 	bool "Qualcomm Technologies L2-cache PMU"
 	depends on ARCH_QCOM && ARM64 && ACPI
···
 	default n
 	help
 	  Say y if you want to use APM X-Gene SoC performance monitors.
+
+config ARM_SPE_PMU
+	tristate "Enable support for the ARMv8.2 Statistical Profiling Extension"
+	depends on PERF_EVENTS && ARM64
+	help
+	  Enable perf support for the ARMv8.2 Statistical Profiling
+	  Extension, which provides periodic sampling of operations in
+	  the CPU pipeline and reports this via the perf AUX interface.

 endmenu
+2
drivers/perf/Makefile
···
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
+obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
+5 -5
drivers/perf/arm_pmu.c
···
 	if (!cpumask_test_and_clear_cpu(cpu, &armpmu->active_irqs))
 		return;

-	if (irq_is_percpu(irq)) {
+	if (irq_is_percpu_devid(irq)) {
 		free_percpu_irq(irq, &hw_events->percpu_pmu);
 		cpumask_clear(&armpmu->active_irqs);
 		return;
···
 	if (!irq)
 		return 0;

-	if (irq_is_percpu(irq) && cpumask_empty(&armpmu->active_irqs)) {
+	if (irq_is_percpu_devid(irq) && cpumask_empty(&armpmu->active_irqs)) {
 		err = request_percpu_irq(irq, handler, "arm-pmu",
 					 &hw_events->percpu_pmu);
-	} else if (irq_is_percpu(irq)) {
+	} else if (irq_is_percpu_devid(irq)) {
 		int other_cpu = cpumask_first(&armpmu->active_irqs);
 		int other_irq = per_cpu(hw_events->irq, other_cpu);
···
 	irq = armpmu_get_cpu_irq(pmu, cpu);
 	if (irq) {
-		if (irq_is_percpu(irq)) {
+		if (irq_is_percpu_devid(irq)) {
 			enable_percpu_irq(irq, IRQ_TYPE_NONE);
 			return 0;
 		}
···
 		return 0;

 	irq = armpmu_get_cpu_irq(pmu, cpu);
-	if (irq && irq_is_percpu(irq))
+	if (irq && irq_is_percpu_devid(irq))
 		disable_percpu_irq(irq);

 	return 0;
-3
drivers/perf/arm_pmu_acpi.c
···
 	int pmu_idx = 0;
 	int cpu, ret;

-	if (acpi_disabled)
-		return 0;
-
 	/*
 	 * Initialise and register the set of PMUs which we know about right
 	 * now. Ideally we'd do this in arm_pmu_acpi_cpu_starting() so that we
+2 -2
drivers/perf/arm_pmu_platform.c
···
 	if (num_irqs == 1) {
 		int irq = platform_get_irq(pdev, 0);
-		if (irq && irq_is_percpu(irq))
+		if (irq && irq_is_percpu_devid(irq))
 			return pmu_parse_percpu_irq(pmu, irq);
 	}
···
 		if (WARN_ON(irq <= 0))
 			continue;

-		if (irq_is_percpu(irq)) {
+		if (irq_is_percpu_devid(irq)) {
 			pr_warn("multiple PPIs or mismatched SPI/PPI detected\n");
 			return -EINVAL;
 		}
+1249
drivers/perf/arm_spe_pmu.c
···
+/*
+ * Perf support for the Statistical Profiling Extension, introduced as
+ * part of ARMv8.2.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2016 ARM Limited
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#define PMUNAME					"arm_spe"
+#define DRVNAME					PMUNAME "_pmu"
+#define pr_fmt(fmt)				DRVNAME ": " fmt
+
+#include <linux/cpuhotplug.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <asm/sysreg.h>
+
+#define ARM_SPE_BUF_PAD_BYTE			0
+
+struct arm_spe_pmu_buf {
+	int					nr_pages;
+	bool					snapshot;
+	void					*base;
+};
+
+struct arm_spe_pmu {
+	struct pmu				pmu;
+	struct platform_device			*pdev;
+	cpumask_t				supported_cpus;
+	struct hlist_node			hotplug_node;
+
+	int					irq; /* PPI */
+
+	u16					min_period;
+	u16					counter_sz;
+
+#define SPE_PMU_FEAT_FILT_EVT			(1UL << 0)
+#define SPE_PMU_FEAT_FILT_TYP			(1UL << 1)
+#define SPE_PMU_FEAT_FILT_LAT			(1UL << 2)
+#define SPE_PMU_FEAT_ARCH_INST			(1UL << 3)
+#define SPE_PMU_FEAT_LDS			(1UL << 4)
+#define SPE_PMU_FEAT_ERND			(1UL << 5)
+#define SPE_PMU_FEAT_DEV_PROBED			(1UL << 63)
+	u64					features;
+
+	u16					max_record_sz;
+	u16					align;
+	struct perf_output_handle __percpu	*handle;
+};
+
+#define to_spe_pmu(p) (container_of(p, struct arm_spe_pmu, pmu))
+
+/* Convert a free-running index from perf into an SPE buffer offset */
+#define PERF_IDX2OFF(idx, buf)	((idx) % ((buf)->nr_pages << PAGE_SHIFT))
+
+/* Keep track of our dynamic hotplug state */
+static enum cpuhp_state arm_spe_pmu_online;
+
+enum arm_spe_pmu_buf_fault_action {
+	SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
+	SPE_PMU_BUF_FAULT_ACT_FATAL,
+	SPE_PMU_BUF_FAULT_ACT_OK,
+};
+
+/* This sysfs gunk was really good fun to write. */
+enum arm_spe_pmu_capabilities {
+	SPE_PMU_CAP_ARCH_INST = 0,
+	SPE_PMU_CAP_ERND,
+	SPE_PMU_CAP_FEAT_MAX,
+	SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX,
+	SPE_PMU_CAP_MIN_IVAL,
+};
+
+static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = {
+	[SPE_PMU_CAP_ARCH_INST]	= SPE_PMU_FEAT_ARCH_INST,
+	[SPE_PMU_CAP_ERND]	= SPE_PMU_FEAT_ERND,
+};
+
+static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap)
+{
+	if (cap < SPE_PMU_CAP_FEAT_MAX)
+		return !!(spe_pmu->features & arm_spe_pmu_feat_caps[cap]);
+
+	switch (cap) {
+	case SPE_PMU_CAP_CNT_SZ:
+		return spe_pmu->counter_sz;
+	case SPE_PMU_CAP_MIN_IVAL:
+		return spe_pmu->min_period;
+	default:
+		WARN(1, "unknown cap %d\n", cap);
+	}
+
+	return 0;
+}
+
+static ssize_t arm_spe_pmu_cap_show(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct arm_spe_pmu *spe_pmu = platform_get_drvdata(pdev);
+	struct dev_ext_attribute *ea =
+		container_of(attr, struct dev_ext_attribute, attr);
+	int cap = (long)ea->var;
+
+	return snprintf(buf, PAGE_SIZE, "%u\n",
+		arm_spe_pmu_cap_get(spe_pmu, cap));
+}
+
+#define SPE_EXT_ATTR_ENTRY(_name, _func, _var)				\
+	&((struct dev_ext_attribute[]) {				\
+		{ __ATTR(_name, S_IRUGO, _func, NULL), (void *)_var }	\
+	})[0].attr.attr
+
+#define SPE_CAP_EXT_ATTR_ENTRY(_name, _var)				\
+	SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show, _var)
+
+static struct attribute *arm_spe_pmu_cap_attr[] = {
+	SPE_CAP_EXT_ATTR_ENTRY(arch_inst, SPE_PMU_CAP_ARCH_INST),
+	SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND),
+	SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ),
+	SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL),
+	NULL,
+};
+
+static struct attribute_group arm_spe_pmu_cap_group = {
+	.name	= "caps",
+	.attrs	= arm_spe_pmu_cap_attr,
+};
+
+/* User ABI */
+#define ATTR_CFG_FLD_ts_enable_CFG		config	/* PMSCR_EL1.TS */
+#define ATTR_CFG_FLD_ts_enable_LO		0
+#define ATTR_CFG_FLD_ts_enable_HI		0
+#define ATTR_CFG_FLD_pa_enable_CFG		config	/* PMSCR_EL1.PA */
+#define ATTR_CFG_FLD_pa_enable_LO		1
+#define ATTR_CFG_FLD_pa_enable_HI		1
+#define ATTR_CFG_FLD_pct_enable_CFG		config	/* PMSCR_EL1.PCT */
+#define ATTR_CFG_FLD_pct_enable_LO		2
+#define ATTR_CFG_FLD_pct_enable_HI		2
+#define ATTR_CFG_FLD_jitter_CFG			config	/* PMSIRR_EL1.RND */
+#define ATTR_CFG_FLD_jitter_LO			16
+#define ATTR_CFG_FLD_jitter_HI			16
+#define ATTR_CFG_FLD_branch_filter_CFG		config	/* PMSFCR_EL1.B */
+#define ATTR_CFG_FLD_branch_filter_LO		32
+#define ATTR_CFG_FLD_branch_filter_HI		32
+#define ATTR_CFG_FLD_load_filter_CFG		config	/* PMSFCR_EL1.LD */
+#define ATTR_CFG_FLD_load_filter_LO		33
+#define ATTR_CFG_FLD_load_filter_HI		33
+#define ATTR_CFG_FLD_store_filter_CFG		config	/* PMSFCR_EL1.ST */
+#define ATTR_CFG_FLD_store_filter_LO		34
+#define ATTR_CFG_FLD_store_filter_HI		34
+
+#define ATTR_CFG_FLD_event_filter_CFG		config1	/* PMSEVFR_EL1 */
+#define ATTR_CFG_FLD_event_filter_LO		0
#define ATTR_CFG_FLD_event_filter_HI 63 177 + 178 + #define ATTR_CFG_FLD_min_latency_CFG config2 /* PMSLATFR_EL1.MINLAT */ 179 + #define ATTR_CFG_FLD_min_latency_LO 0 180 + #define ATTR_CFG_FLD_min_latency_HI 11 181 + 182 + /* Why does everything I do descend into this? */ 183 + #define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 184 + (lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi 185 + 186 + #define _GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \ 187 + __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) 188 + 189 + #define GEN_PMU_FORMAT_ATTR(name) \ 190 + PMU_FORMAT_ATTR(name, \ 191 + _GEN_PMU_FORMAT_ATTR(ATTR_CFG_FLD_##name##_CFG, \ 192 + ATTR_CFG_FLD_##name##_LO, \ 193 + ATTR_CFG_FLD_##name##_HI)) 194 + 195 + #define _ATTR_CFG_GET_FLD(attr, cfg, lo, hi) \ 196 + ((((attr)->cfg) >> lo) & GENMASK(hi - lo, 0)) 197 + 198 + #define ATTR_CFG_GET_FLD(attr, name) \ 199 + _ATTR_CFG_GET_FLD(attr, \ 200 + ATTR_CFG_FLD_##name##_CFG, \ 201 + ATTR_CFG_FLD_##name##_LO, \ 202 + ATTR_CFG_FLD_##name##_HI) 203 + 204 + GEN_PMU_FORMAT_ATTR(ts_enable); 205 + GEN_PMU_FORMAT_ATTR(pa_enable); 206 + GEN_PMU_FORMAT_ATTR(pct_enable); 207 + GEN_PMU_FORMAT_ATTR(jitter); 208 + GEN_PMU_FORMAT_ATTR(branch_filter); 209 + GEN_PMU_FORMAT_ATTR(load_filter); 210 + GEN_PMU_FORMAT_ATTR(store_filter); 211 + GEN_PMU_FORMAT_ATTR(event_filter); 212 + GEN_PMU_FORMAT_ATTR(min_latency); 213 + 214 + static struct attribute *arm_spe_pmu_formats_attr[] = { 215 + &format_attr_ts_enable.attr, 216 + &format_attr_pa_enable.attr, 217 + &format_attr_pct_enable.attr, 218 + &format_attr_jitter.attr, 219 + &format_attr_branch_filter.attr, 220 + &format_attr_load_filter.attr, 221 + &format_attr_store_filter.attr, 222 + &format_attr_event_filter.attr, 223 + &format_attr_min_latency.attr, 224 + NULL, 225 + }; 226 + 227 + static struct attribute_group arm_spe_pmu_format_group = { 228 + .name = "format", 229 + .attrs = arm_spe_pmu_formats_attr, 230 + }; 231 + 232 + static ssize_t arm_spe_pmu_get_attr_cpumask(struct device *dev, 233 + struct 
device_attribute *attr, 234 + char *buf) 235 + { 236 + struct platform_device *pdev = to_platform_device(dev); 237 + struct arm_spe_pmu *spe_pmu = platform_get_drvdata(pdev); 238 + 239 + return cpumap_print_to_pagebuf(true, buf, &spe_pmu->supported_cpus); 240 + } 241 + static DEVICE_ATTR(cpumask, S_IRUGO, arm_spe_pmu_get_attr_cpumask, NULL); 242 + 243 + static struct attribute *arm_spe_pmu_attrs[] = { 244 + &dev_attr_cpumask.attr, 245 + NULL, 246 + }; 247 + 248 + static struct attribute_group arm_spe_pmu_group = { 249 + .attrs = arm_spe_pmu_attrs, 250 + }; 251 + 252 + static const struct attribute_group *arm_spe_pmu_attr_groups[] = { 253 + &arm_spe_pmu_group, 254 + &arm_spe_pmu_cap_group, 255 + &arm_spe_pmu_format_group, 256 + NULL, 257 + }; 258 + 259 + /* Convert between user ABI and register values */ 260 + static u64 arm_spe_event_to_pmscr(struct perf_event *event) 261 + { 262 + struct perf_event_attr *attr = &event->attr; 263 + u64 reg = 0; 264 + 265 + reg |= ATTR_CFG_GET_FLD(attr, ts_enable) << SYS_PMSCR_EL1_TS_SHIFT; 266 + reg |= ATTR_CFG_GET_FLD(attr, pa_enable) << SYS_PMSCR_EL1_PA_SHIFT; 267 + reg |= ATTR_CFG_GET_FLD(attr, pct_enable) << SYS_PMSCR_EL1_PCT_SHIFT; 268 + 269 + if (!attr->exclude_user) 270 + reg |= BIT(SYS_PMSCR_EL1_E0SPE_SHIFT); 271 + 272 + if (!attr->exclude_kernel) 273 + reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT); 274 + 275 + if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && capable(CAP_SYS_ADMIN)) 276 + reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT); 277 + 278 + return reg; 279 + } 280 + 281 + static void arm_spe_event_sanitise_period(struct perf_event *event) 282 + { 283 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu); 284 + u64 period = event->hw.sample_period; 285 + u64 max_period = SYS_PMSIRR_EL1_INTERVAL_MASK 286 + << SYS_PMSIRR_EL1_INTERVAL_SHIFT; 287 + 288 + if (period < spe_pmu->min_period) 289 + period = spe_pmu->min_period; 290 + else if (period > max_period) 291 + period = max_period; 292 + else 293 + period &= max_period; 294 + 295 + 
event->hw.sample_period = period; 296 + } 297 + 298 + static u64 arm_spe_event_to_pmsirr(struct perf_event *event) 299 + { 300 + struct perf_event_attr *attr = &event->attr; 301 + u64 reg = 0; 302 + 303 + arm_spe_event_sanitise_period(event); 304 + 305 + reg |= ATTR_CFG_GET_FLD(attr, jitter) << SYS_PMSIRR_EL1_RND_SHIFT; 306 + reg |= event->hw.sample_period; 307 + 308 + return reg; 309 + } 310 + 311 + static u64 arm_spe_event_to_pmsfcr(struct perf_event *event) 312 + { 313 + struct perf_event_attr *attr = &event->attr; 314 + u64 reg = 0; 315 + 316 + reg |= ATTR_CFG_GET_FLD(attr, load_filter) << SYS_PMSFCR_EL1_LD_SHIFT; 317 + reg |= ATTR_CFG_GET_FLD(attr, store_filter) << SYS_PMSFCR_EL1_ST_SHIFT; 318 + reg |= ATTR_CFG_GET_FLD(attr, branch_filter) << SYS_PMSFCR_EL1_B_SHIFT; 319 + 320 + if (reg) 321 + reg |= BIT(SYS_PMSFCR_EL1_FT_SHIFT); 322 + 323 + if (ATTR_CFG_GET_FLD(attr, event_filter)) 324 + reg |= BIT(SYS_PMSFCR_EL1_FE_SHIFT); 325 + 326 + if (ATTR_CFG_GET_FLD(attr, min_latency)) 327 + reg |= BIT(SYS_PMSFCR_EL1_FL_SHIFT); 328 + 329 + return reg; 330 + } 331 + 332 + static u64 arm_spe_event_to_pmsevfr(struct perf_event *event) 333 + { 334 + struct perf_event_attr *attr = &event->attr; 335 + return ATTR_CFG_GET_FLD(attr, event_filter); 336 + } 337 + 338 + static u64 arm_spe_event_to_pmslatfr(struct perf_event *event) 339 + { 340 + struct perf_event_attr *attr = &event->attr; 341 + return ATTR_CFG_GET_FLD(attr, min_latency) 342 + << SYS_PMSLATFR_EL1_MINLAT_SHIFT; 343 + } 344 + 345 + static void arm_spe_pmu_pad_buf(struct perf_output_handle *handle, int len) 346 + { 347 + struct arm_spe_pmu_buf *buf = perf_get_aux(handle); 348 + u64 head = PERF_IDX2OFF(handle->head, buf); 349 + 350 + memset(buf->base + head, ARM_SPE_BUF_PAD_BYTE, len); 351 + if (!buf->snapshot) 352 + perf_aux_output_skip(handle, len); 353 + } 354 + 355 + static u64 arm_spe_pmu_next_snapshot_off(struct perf_output_handle *handle) 356 + { 357 + struct arm_spe_pmu_buf *buf = perf_get_aux(handle); 358 + 
struct arm_spe_pmu *spe_pmu = to_spe_pmu(handle->event->pmu); 359 + u64 head = PERF_IDX2OFF(handle->head, buf); 360 + u64 limit = buf->nr_pages * PAGE_SIZE; 361 + 362 + /* 363 + * The trace format isn't parseable in reverse, so clamp 364 + * the limit to half of the buffer size in snapshot mode 365 + * so that the worst case is half a buffer of records, as 366 + * opposed to a single record. 367 + */ 368 + if (head < limit >> 1) 369 + limit >>= 1; 370 + 371 + /* 372 + * If we're within max_record_sz of the limit, we must 373 + * pad, move the head index and recompute the limit. 374 + */ 375 + if (limit - head < spe_pmu->max_record_sz) { 376 + arm_spe_pmu_pad_buf(handle, limit - head); 377 + handle->head = PERF_IDX2OFF(limit, buf); 378 + limit = ((buf->nr_pages * PAGE_SIZE) >> 1) + handle->head; 379 + } 380 + 381 + return limit; 382 + } 383 + 384 + static u64 __arm_spe_pmu_next_off(struct perf_output_handle *handle) 385 + { 386 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(handle->event->pmu); 387 + struct arm_spe_pmu_buf *buf = perf_get_aux(handle); 388 + const u64 bufsize = buf->nr_pages * PAGE_SIZE; 389 + u64 limit = bufsize; 390 + u64 head, tail, wakeup; 391 + 392 + /* 393 + * The head can be misaligned for two reasons: 394 + * 395 + * 1. The hardware left PMBPTR pointing to the first byte after 396 + * a record when generating a buffer management event. 397 + * 398 + * 2. We used perf_aux_output_skip to consume handle->size bytes 399 + * and CIRC_SPACE was used to compute the size, which always 400 + * leaves one entry free. 401 + * 402 + * Deal with this by padding to the next alignment boundary and 403 + * moving the head index. If we run out of buffer space, we'll 404 + * reduce handle->size to zero and end up reporting truncation. 
405 + */ 406 + head = PERF_IDX2OFF(handle->head, buf); 407 + if (!IS_ALIGNED(head, spe_pmu->align)) { 408 + unsigned long delta = roundup(head, spe_pmu->align) - head; 409 + 410 + delta = min(delta, handle->size); 411 + arm_spe_pmu_pad_buf(handle, delta); 412 + head = PERF_IDX2OFF(handle->head, buf); 413 + } 414 + 415 + /* If we've run out of free space, then nothing more to do */ 416 + if (!handle->size) 417 + goto no_space; 418 + 419 + /* Compute the tail and wakeup indices now that we've aligned head */ 420 + tail = PERF_IDX2OFF(handle->head + handle->size, buf); 421 + wakeup = PERF_IDX2OFF(handle->wakeup, buf); 422 + 423 + /* 424 + * Avoid clobbering unconsumed data. We know we have space, so 425 + * if we see head == tail we know that the buffer is empty. If 426 + * head > tail, then there's nothing to clobber prior to 427 + * wrapping. 428 + */ 429 + if (head < tail) 430 + limit = round_down(tail, PAGE_SIZE); 431 + 432 + /* 433 + * Wakeup may be arbitrarily far into the future. If it's not in 434 + * the current generation, either we'll wrap before hitting it, 435 + * or it's in the past and has been handled already. 436 + * 437 + * If there's a wakeup before we wrap, arrange to be woken up by 438 + * the page boundary following it. Keep the tail boundary if 439 + * that's lower. 
440 + */ 441 + if (handle->wakeup < (handle->head + handle->size) && head <= wakeup) 442 + limit = min(limit, round_up(wakeup, PAGE_SIZE)); 443 + 444 + if (limit > head) 445 + return limit; 446 + 447 + arm_spe_pmu_pad_buf(handle, handle->size); 448 + no_space: 449 + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); 450 + perf_aux_output_end(handle, 0); 451 + return 0; 452 + } 453 + 454 + static u64 arm_spe_pmu_next_off(struct perf_output_handle *handle) 455 + { 456 + struct arm_spe_pmu_buf *buf = perf_get_aux(handle); 457 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(handle->event->pmu); 458 + u64 limit = __arm_spe_pmu_next_off(handle); 459 + u64 head = PERF_IDX2OFF(handle->head, buf); 460 + 461 + /* 462 + * If the head has come too close to the end of the buffer, 463 + * then pad to the end and recompute the limit. 464 + */ 465 + if (limit && (limit - head < spe_pmu->max_record_sz)) { 466 + arm_spe_pmu_pad_buf(handle, limit - head); 467 + limit = __arm_spe_pmu_next_off(handle); 468 + } 469 + 470 + return limit; 471 + } 472 + 473 + static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, 474 + struct perf_event *event) 475 + { 476 + u64 base, limit; 477 + struct arm_spe_pmu_buf *buf; 478 + 479 + /* Start a new aux session */ 480 + buf = perf_aux_output_begin(handle, event); 481 + if (!buf) { 482 + event->hw.state |= PERF_HES_STOPPED; 483 + /* 484 + * We still need to clear the limit pointer, since the 485 + * profiler might only be disabled by virtue of a fault. 486 + */ 487 + limit = 0; 488 + goto out_write_limit; 489 + } 490 + 491 + limit = buf->snapshot ? 
arm_spe_pmu_next_snapshot_off(handle) 492 + : arm_spe_pmu_next_off(handle); 493 + if (limit) 494 + limit |= BIT(SYS_PMBLIMITR_EL1_E_SHIFT); 495 + 496 + limit += (u64)buf->base; 497 + base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); 498 + write_sysreg_s(base, SYS_PMBPTR_EL1); 499 + 500 + out_write_limit: 501 + write_sysreg_s(limit, SYS_PMBLIMITR_EL1); 502 + } 503 + 504 + static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle) 505 + { 506 + struct arm_spe_pmu_buf *buf = perf_get_aux(handle); 507 + u64 offset, size; 508 + 509 + offset = read_sysreg_s(SYS_PMBPTR_EL1) - (u64)buf->base; 510 + size = offset - PERF_IDX2OFF(handle->head, buf); 511 + 512 + if (buf->snapshot) 513 + handle->head = offset; 514 + 515 + perf_aux_output_end(handle, size); 516 + } 517 + 518 + static void arm_spe_pmu_disable_and_drain_local(void) 519 + { 520 + /* Disable profiling at EL0 and EL1 */ 521 + write_sysreg_s(0, SYS_PMSCR_EL1); 522 + isb(); 523 + 524 + /* Drain any buffered data */ 525 + psb_csync(); 526 + dsb(nsh); 527 + 528 + /* Disable the profiling buffer */ 529 + write_sysreg_s(0, SYS_PMBLIMITR_EL1); 530 + isb(); 531 + } 532 + 533 + /* IRQ handling */ 534 + static enum arm_spe_pmu_buf_fault_action 535 + arm_spe_pmu_buf_get_fault_act(struct perf_output_handle *handle) 536 + { 537 + const char *err_str; 538 + u64 pmbsr; 539 + enum arm_spe_pmu_buf_fault_action ret; 540 + 541 + /* 542 + * Ensure new profiling data is visible to the CPU and any external 543 + * aborts have been resolved. 544 + */ 545 + psb_csync(); 546 + dsb(nsh); 547 + 548 + /* Ensure hardware updates to PMBPTR_EL1 are visible */ 549 + isb(); 550 + 551 + /* Service required? */ 552 + pmbsr = read_sysreg_s(SYS_PMBSR_EL1); 553 + if (!(pmbsr & BIT(SYS_PMBSR_EL1_S_SHIFT))) 554 + return SPE_PMU_BUF_FAULT_ACT_SPURIOUS; 555 + 556 + /* 557 + * If we've lost data, disable profiling and also set the PARTIAL 558 + * flag to indicate that the last record is corrupted. 
559 + */ 560 + if (pmbsr & BIT(SYS_PMBSR_EL1_DL_SHIFT)) 561 + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED | 562 + PERF_AUX_FLAG_PARTIAL); 563 + 564 + /* Report collisions to userspace so that it can up the period */ 565 + if (pmbsr & BIT(SYS_PMBSR_EL1_COLL_SHIFT)) 566 + perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION); 567 + 568 + /* We only expect buffer management events */ 569 + switch (pmbsr & (SYS_PMBSR_EL1_EC_MASK << SYS_PMBSR_EL1_EC_SHIFT)) { 570 + case SYS_PMBSR_EL1_EC_BUF: 571 + /* Handled below */ 572 + break; 573 + case SYS_PMBSR_EL1_EC_FAULT_S1: 574 + case SYS_PMBSR_EL1_EC_FAULT_S2: 575 + err_str = "Unexpected buffer fault"; 576 + goto out_err; 577 + default: 578 + err_str = "Unknown error code"; 579 + goto out_err; 580 + } 581 + 582 + /* Buffer management event */ 583 + switch (pmbsr & 584 + (SYS_PMBSR_EL1_BUF_BSC_MASK << SYS_PMBSR_EL1_BUF_BSC_SHIFT)) { 585 + case SYS_PMBSR_EL1_BUF_BSC_FULL: 586 + ret = SPE_PMU_BUF_FAULT_ACT_OK; 587 + goto out_stop; 588 + default: 589 + err_str = "Unknown buffer status code"; 590 + } 591 + 592 + out_err: 593 + pr_err_ratelimited("%s on CPU %d [PMBSR=0x%016llx, PMBPTR=0x%016llx, PMBLIMITR=0x%016llx]\n", 594 + err_str, smp_processor_id(), pmbsr, 595 + read_sysreg_s(SYS_PMBPTR_EL1), 596 + read_sysreg_s(SYS_PMBLIMITR_EL1)); 597 + ret = SPE_PMU_BUF_FAULT_ACT_FATAL; 598 + 599 + out_stop: 600 + arm_spe_perf_aux_output_end(handle); 601 + return ret; 602 + } 603 + 604 + static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) 605 + { 606 + struct perf_output_handle *handle = dev; 607 + struct perf_event *event = handle->event; 608 + enum arm_spe_pmu_buf_fault_action act; 609 + 610 + if (!perf_get_aux(handle)) 611 + return IRQ_NONE; 612 + 613 + act = arm_spe_pmu_buf_get_fault_act(handle); 614 + if (act == SPE_PMU_BUF_FAULT_ACT_SPURIOUS) 615 + return IRQ_NONE; 616 + 617 + /* 618 + * Ensure perf callbacks have completed, which may disable the 619 + * profiling buffer in response to a TRUNCATION flag. 
620 + */ 621 + irq_work_run(); 622 + 623 + switch (act) { 624 + case SPE_PMU_BUF_FAULT_ACT_FATAL: 625 + /* 626 + * If a fatal exception occurred then leaving the profiling 627 + * buffer enabled is a recipe waiting to happen. Since 628 + * fatal faults don't always imply truncation, make sure 629 + * that the profiling buffer is disabled explicitly before 630 + * clearing the syndrome register. 631 + */ 632 + arm_spe_pmu_disable_and_drain_local(); 633 + break; 634 + case SPE_PMU_BUF_FAULT_ACT_OK: 635 + /* 636 + * We handled the fault (the buffer was full), so resume 637 + * profiling as long as we didn't detect truncation. 638 + * PMBPTR might be misaligned, but we'll burn that bridge 639 + * when we get to it. 640 + */ 641 + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { 642 + arm_spe_perf_aux_output_begin(handle, event); 643 + isb(); 644 + } 645 + break; 646 + case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: 647 + /* We've seen you before, but GCC has the memory of a sieve. */ 648 + break; 649 + } 650 + 651 + /* The buffer pointers are now sane, so resume profiling. */ 652 + write_sysreg_s(0, SYS_PMBSR_EL1); 653 + return IRQ_HANDLED; 654 + } 655 + 656 + /* Perf callbacks */ 657 + static int arm_spe_pmu_event_init(struct perf_event *event) 658 + { 659 + u64 reg; 660 + struct perf_event_attr *attr = &event->attr; 661 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu); 662 + 663 + /* This is, of course, deeply driver-specific */ 664 + if (attr->type != event->pmu->type) 665 + return -ENOENT; 666 + 667 + if (event->cpu >= 0 && 668 + !cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus)) 669 + return -ENOENT; 670 + 671 + if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0) 672 + return -EOPNOTSUPP; 673 + 674 + if (attr->exclude_idle) 675 + return -EOPNOTSUPP; 676 + 677 + /* 678 + * Feedback-directed frequency throttling doesn't work when we 679 + * have a buffer of samples. 
We'd need to manually count the 680 + * samples in the buffer when it fills up and adjust the event 681 + * count to reflect that. Instead, just force the user to specify 682 + * a sample period. 683 + */ 684 + if (attr->freq) 685 + return -EINVAL; 686 + 687 + reg = arm_spe_event_to_pmsfcr(event); 688 + if ((reg & BIT(SYS_PMSFCR_EL1_FE_SHIFT)) && 689 + !(spe_pmu->features & SPE_PMU_FEAT_FILT_EVT)) 690 + return -EOPNOTSUPP; 691 + 692 + if ((reg & BIT(SYS_PMSFCR_EL1_FT_SHIFT)) && 693 + !(spe_pmu->features & SPE_PMU_FEAT_FILT_TYP)) 694 + return -EOPNOTSUPP; 695 + 696 + if ((reg & BIT(SYS_PMSFCR_EL1_FL_SHIFT)) && 697 + !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT)) 698 + return -EOPNOTSUPP; 699 + 700 + reg = arm_spe_event_to_pmscr(event); 701 + if (!capable(CAP_SYS_ADMIN) && 702 + (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) | 703 + BIT(SYS_PMSCR_EL1_CX_SHIFT) | 704 + BIT(SYS_PMSCR_EL1_PCT_SHIFT)))) 705 + return -EACCES; 706 + 707 + return 0; 708 + } 709 + 710 + static void arm_spe_pmu_start(struct perf_event *event, int flags) 711 + { 712 + u64 reg; 713 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu); 714 + struct hw_perf_event *hwc = &event->hw; 715 + struct perf_output_handle *handle = this_cpu_ptr(spe_pmu->handle); 716 + 717 + hwc->state = 0; 718 + arm_spe_perf_aux_output_begin(handle, event); 719 + if (hwc->state) 720 + return; 721 + 722 + reg = arm_spe_event_to_pmsfcr(event); 723 + write_sysreg_s(reg, SYS_PMSFCR_EL1); 724 + 725 + reg = arm_spe_event_to_pmsevfr(event); 726 + write_sysreg_s(reg, SYS_PMSEVFR_EL1); 727 + 728 + reg = arm_spe_event_to_pmslatfr(event); 729 + write_sysreg_s(reg, SYS_PMSLATFR_EL1); 730 + 731 + if (flags & PERF_EF_RELOAD) { 732 + reg = arm_spe_event_to_pmsirr(event); 733 + write_sysreg_s(reg, SYS_PMSIRR_EL1); 734 + isb(); 735 + reg = local64_read(&hwc->period_left); 736 + write_sysreg_s(reg, SYS_PMSICR_EL1); 737 + } 738 + 739 + reg = arm_spe_event_to_pmscr(event); 740 + isb(); 741 + write_sysreg_s(reg, SYS_PMSCR_EL1); 742 + } 743 + 744 + 
static void arm_spe_pmu_stop(struct perf_event *event, int flags) 745 + { 746 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu); 747 + struct hw_perf_event *hwc = &event->hw; 748 + struct perf_output_handle *handle = this_cpu_ptr(spe_pmu->handle); 749 + 750 + /* If we're already stopped, then nothing to do */ 751 + if (hwc->state & PERF_HES_STOPPED) 752 + return; 753 + 754 + /* Stop all trace generation */ 755 + arm_spe_pmu_disable_and_drain_local(); 756 + 757 + if (flags & PERF_EF_UPDATE) { 758 + /* 759 + * If there's a fault pending then ensure we contain it 760 + * to this buffer, since we might be on the context-switch 761 + * path. 762 + */ 763 + if (perf_get_aux(handle)) { 764 + enum arm_spe_pmu_buf_fault_action act; 765 + 766 + act = arm_spe_pmu_buf_get_fault_act(handle); 767 + if (act == SPE_PMU_BUF_FAULT_ACT_SPURIOUS) 768 + arm_spe_perf_aux_output_end(handle); 769 + else 770 + write_sysreg_s(0, SYS_PMBSR_EL1); 771 + } 772 + 773 + /* 774 + * This may also contain ECOUNT, but nobody else should 775 + * be looking at period_left, since we forbid frequency 776 + * based sampling. 777 + */ 778 + local64_set(&hwc->period_left, read_sysreg_s(SYS_PMSICR_EL1)); 779 + hwc->state |= PERF_HES_UPTODATE; 780 + } 781 + 782 + hwc->state |= PERF_HES_STOPPED; 783 + } 784 + 785 + static int arm_spe_pmu_add(struct perf_event *event, int flags) 786 + { 787 + int ret = 0; 788 + struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu); 789 + struct hw_perf_event *hwc = &event->hw; 790 + int cpu = event->cpu == -1 ? 
smp_processor_id() : event->cpu; 791 + 792 + if (!cpumask_test_cpu(cpu, &spe_pmu->supported_cpus)) 793 + return -ENOENT; 794 + 795 + hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED; 796 + 797 + if (flags & PERF_EF_START) { 798 + arm_spe_pmu_start(event, PERF_EF_RELOAD); 799 + if (hwc->state & PERF_HES_STOPPED) 800 + ret = -EINVAL; 801 + } 802 + 803 + return ret; 804 + } 805 + 806 + static void arm_spe_pmu_del(struct perf_event *event, int flags) 807 + { 808 + arm_spe_pmu_stop(event, PERF_EF_UPDATE); 809 + } 810 + 811 + static void arm_spe_pmu_read(struct perf_event *event) 812 + { 813 + } 814 + 815 + static void *arm_spe_pmu_setup_aux(int cpu, void **pages, int nr_pages, 816 + bool snapshot) 817 + { 818 + int i; 819 + struct page **pglist; 820 + struct arm_spe_pmu_buf *buf; 821 + 822 + /* We need at least two pages for this to work. */ 823 + if (nr_pages < 2) 824 + return NULL; 825 + 826 + /* 827 + * We require an even number of pages for snapshot mode, so that 828 + * we can effectively treat the buffer as consisting of two equal 829 + * parts and give userspace a fighting chance of getting some 830 + * useful data out of it. 
831 + */ 832 + if (!nr_pages || (snapshot && (nr_pages & 1))) 833 + return NULL; 834 + 835 + if (cpu == -1) 836 + cpu = raw_smp_processor_id(); 837 + 838 + buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, cpu_to_node(cpu)); 839 + if (!buf) 840 + return NULL; 841 + 842 + pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL); 843 + if (!pglist) 844 + goto out_free_buf; 845 + 846 + for (i = 0; i < nr_pages; ++i) { 847 + struct page *page = virt_to_page(pages[i]); 848 + 849 + if (PagePrivate(page)) { 850 + pr_warn("unexpected high-order page for auxbuf!"); 851 + goto out_free_pglist; 852 + } 853 + 854 + pglist[i] = virt_to_page(pages[i]); 855 + } 856 + 857 + buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL); 858 + if (!buf->base) 859 + goto out_free_pglist; 860 + 861 + buf->nr_pages = nr_pages; 862 + buf->snapshot = snapshot; 863 + 864 + kfree(pglist); 865 + return buf; 866 + 867 + out_free_pglist: 868 + kfree(pglist); 869 + out_free_buf: 870 + kfree(buf); 871 + return NULL; 872 + } 873 + 874 + static void arm_spe_pmu_free_aux(void *aux) 875 + { 876 + struct arm_spe_pmu_buf *buf = aux; 877 + 878 + vunmap(buf->base); 879 + kfree(buf); 880 + } 881 + 882 + /* Initialisation and teardown functions */ 883 + static int arm_spe_pmu_perf_init(struct arm_spe_pmu *spe_pmu) 884 + { 885 + static atomic_t pmu_idx = ATOMIC_INIT(-1); 886 + 887 + int idx; 888 + char *name; 889 + struct device *dev = &spe_pmu->pdev->dev; 890 + 891 + spe_pmu->pmu = (struct pmu) { 892 + .module = THIS_MODULE, 893 + .capabilities = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE, 894 + .attr_groups = arm_spe_pmu_attr_groups, 895 + /* 896 + * We hitch a ride on the software context here, so that 897 + * we can support per-task profiling (which is not possible 898 + * with the invalid context as it doesn't get sched callbacks). 
899 + * This requires that userspace either uses a dummy event for 900 + * perf_event_open, since the aux buffer is not setup until 901 + * a subsequent mmap, or creates the profiling event in a 902 + * disabled state and explicitly PERF_EVENT_IOC_ENABLEs it 903 + * once the buffer has been created. 904 + */ 905 + .task_ctx_nr = perf_sw_context, 906 + .event_init = arm_spe_pmu_event_init, 907 + .add = arm_spe_pmu_add, 908 + .del = arm_spe_pmu_del, 909 + .start = arm_spe_pmu_start, 910 + .stop = arm_spe_pmu_stop, 911 + .read = arm_spe_pmu_read, 912 + .setup_aux = arm_spe_pmu_setup_aux, 913 + .free_aux = arm_spe_pmu_free_aux, 914 + }; 915 + 916 + idx = atomic_inc_return(&pmu_idx); 917 + name = devm_kasprintf(dev, GFP_KERNEL, "%s_%d", PMUNAME, idx); 918 + return perf_pmu_register(&spe_pmu->pmu, name, -1); 919 + } 920 + 921 + static void arm_spe_pmu_perf_destroy(struct arm_spe_pmu *spe_pmu) 922 + { 923 + perf_pmu_unregister(&spe_pmu->pmu); 924 + } 925 + 926 + static void __arm_spe_pmu_dev_probe(void *info) 927 + { 928 + int fld; 929 + u64 reg; 930 + struct arm_spe_pmu *spe_pmu = info; 931 + struct device *dev = &spe_pmu->pdev->dev; 932 + 933 + fld = cpuid_feature_extract_unsigned_field(read_cpuid(ID_AA64DFR0_EL1), 934 + ID_AA64DFR0_PMSVER_SHIFT); 935 + if (!fld) { 936 + dev_err(dev, 937 + "unsupported ID_AA64DFR0_EL1.PMSVer [%d] on CPU %d\n", 938 + fld, smp_processor_id()); 939 + return; 940 + } 941 + 942 + /* Read PMBIDR first to determine whether or not we have access */ 943 + reg = read_sysreg_s(SYS_PMBIDR_EL1); 944 + if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT)) { 945 + dev_err(dev, 946 + "profiling buffer owned by higher exception level\n"); 947 + return; 948 + } 949 + 950 + /* Minimum alignment. 
If it's out-of-range, then fail the probe */ 951 + fld = reg >> SYS_PMBIDR_EL1_ALIGN_SHIFT & SYS_PMBIDR_EL1_ALIGN_MASK; 952 + spe_pmu->align = 1 << fld; 953 + if (spe_pmu->align > SZ_2K) { 954 + dev_err(dev, "unsupported PMBIDR.Align [%d] on CPU %d\n", 955 + fld, smp_processor_id()); 956 + return; 957 + } 958 + 959 + /* It's now safe to read PMSIDR and figure out what we've got */ 960 + reg = read_sysreg_s(SYS_PMSIDR_EL1); 961 + if (reg & BIT(SYS_PMSIDR_EL1_FE_SHIFT)) 962 + spe_pmu->features |= SPE_PMU_FEAT_FILT_EVT; 963 + 964 + if (reg & BIT(SYS_PMSIDR_EL1_FT_SHIFT)) 965 + spe_pmu->features |= SPE_PMU_FEAT_FILT_TYP; 966 + 967 + if (reg & BIT(SYS_PMSIDR_EL1_FL_SHIFT)) 968 + spe_pmu->features |= SPE_PMU_FEAT_FILT_LAT; 969 + 970 + if (reg & BIT(SYS_PMSIDR_EL1_ARCHINST_SHIFT)) 971 + spe_pmu->features |= SPE_PMU_FEAT_ARCH_INST; 972 + 973 + if (reg & BIT(SYS_PMSIDR_EL1_LDS_SHIFT)) 974 + spe_pmu->features |= SPE_PMU_FEAT_LDS; 975 + 976 + if (reg & BIT(SYS_PMSIDR_EL1_ERND_SHIFT)) 977 + spe_pmu->features |= SPE_PMU_FEAT_ERND; 978 + 979 + /* This field has a spaced out encoding, so just use a look-up */ 980 + fld = reg >> SYS_PMSIDR_EL1_INTERVAL_SHIFT & SYS_PMSIDR_EL1_INTERVAL_MASK; 981 + switch (fld) { 982 + case 0: 983 + spe_pmu->min_period = 256; 984 + break; 985 + case 2: 986 + spe_pmu->min_period = 512; 987 + break; 988 + case 3: 989 + spe_pmu->min_period = 768; 990 + break; 991 + case 4: 992 + spe_pmu->min_period = 1024; 993 + break; 994 + case 5: 995 + spe_pmu->min_period = 1536; 996 + break; 997 + case 6: 998 + spe_pmu->min_period = 2048; 999 + break; 1000 + case 7: 1001 + spe_pmu->min_period = 3072; 1002 + break; 1003 + default: 1004 + dev_warn(dev, "unknown PMSIDR_EL1.Interval [%d]; assuming 8\n", 1005 + fld); 1006 + /* Fallthrough */ 1007 + case 8: 1008 + spe_pmu->min_period = 4096; 1009 + } 1010 + 1011 + /* Maximum record size. 
If it's out-of-range, then fail the probe */
	fld = reg >> SYS_PMSIDR_EL1_MAXSIZE_SHIFT & SYS_PMSIDR_EL1_MAXSIZE_MASK;
	spe_pmu->max_record_sz = 1 << fld;
	if (spe_pmu->max_record_sz > SZ_2K || spe_pmu->max_record_sz < 16) {
		dev_err(dev, "unsupported PMSIDR_EL1.MaxSize [%d] on CPU %d\n",
			fld, smp_processor_id());
		return;
	}

	fld = reg >> SYS_PMSIDR_EL1_COUNTSIZE_SHIFT & SYS_PMSIDR_EL1_COUNTSIZE_MASK;
	switch (fld) {
	default:
		dev_warn(dev, "unknown PMSIDR_EL1.CountSize [%d]; assuming 2\n",
			 fld);
		/* Fallthrough */
	case 2:
		spe_pmu->counter_sz = 12;
	}

	dev_info(dev,
		 "probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
		 cpumask_pr_args(&spe_pmu->supported_cpus),
		 spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);

	spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
	return;
}

static void __arm_spe_pmu_reset_local(void)
{
	/*
	 * This is probably overkill, as we have no idea where we're
	 * draining any buffered data to...
	 */
	arm_spe_pmu_disable_and_drain_local();

	/* Reset the buffer base pointer */
	write_sysreg_s(0, SYS_PMBPTR_EL1);
	isb();

	/* Clear any pending management interrupts */
	write_sysreg_s(0, SYS_PMBSR_EL1);
	isb();
}

static void __arm_spe_pmu_setup_one(void *info)
{
	struct arm_spe_pmu *spe_pmu = info;

	__arm_spe_pmu_reset_local();
	enable_percpu_irq(spe_pmu->irq, IRQ_TYPE_NONE);
}

static void __arm_spe_pmu_stop_one(void *info)
{
	struct arm_spe_pmu *spe_pmu = info;

	disable_percpu_irq(spe_pmu->irq);
	__arm_spe_pmu_reset_local();
}

static int arm_spe_pmu_cpu_startup(unsigned int cpu, struct hlist_node *node)
{
	struct arm_spe_pmu *spe_pmu;

	spe_pmu = hlist_entry_safe(node, struct arm_spe_pmu, hotplug_node);
	if (!cpumask_test_cpu(cpu, &spe_pmu->supported_cpus))
		return 0;

	__arm_spe_pmu_setup_one(spe_pmu);
	return 0;
}

static int arm_spe_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
{
	struct arm_spe_pmu *spe_pmu;

	spe_pmu = hlist_entry_safe(node, struct arm_spe_pmu, hotplug_node);
	if (!cpumask_test_cpu(cpu, &spe_pmu->supported_cpus))
		return 0;

	__arm_spe_pmu_stop_one(spe_pmu);
	return 0;
}

static int arm_spe_pmu_dev_init(struct arm_spe_pmu *spe_pmu)
{
	int ret;
	cpumask_t *mask = &spe_pmu->supported_cpus;

	/* Make sure we probe the hardware on a relevant CPU */
	ret = smp_call_function_any(mask, __arm_spe_pmu_dev_probe, spe_pmu, 1);
	if (ret || !(spe_pmu->features & SPE_PMU_FEAT_DEV_PROBED))
		return -ENXIO;

	/* Request our PPIs (note that the IRQ is still disabled) */
	ret = request_percpu_irq(spe_pmu->irq, arm_spe_pmu_irq_handler, DRVNAME,
				 spe_pmu->handle);
	if (ret)
		return ret;

	/*
	 * Register our hotplug notifier now so we don't miss any events.
	 * This will enable the IRQ for any supported CPUs that are already
	 * up.
	 */
	ret = cpuhp_state_add_instance(arm_spe_pmu_online,
				       &spe_pmu->hotplug_node);
	if (ret)
		free_percpu_irq(spe_pmu->irq, spe_pmu->handle);

	return ret;
}

static void arm_spe_pmu_dev_teardown(struct arm_spe_pmu *spe_pmu)
{
	cpuhp_state_remove_instance(arm_spe_pmu_online, &spe_pmu->hotplug_node);
	free_percpu_irq(spe_pmu->irq, spe_pmu->handle);
}

/* Driver and device probing */
static int arm_spe_pmu_irq_probe(struct arm_spe_pmu *spe_pmu)
{
	struct platform_device *pdev = spe_pmu->pdev;
	int irq = platform_get_irq(pdev, 0);

	if (irq < 0) {
		dev_err(&pdev->dev, "failed to get IRQ (%d)\n", irq);
		return -ENXIO;
	}

	if (!irq_is_percpu(irq)) {
		dev_err(&pdev->dev, "expected PPI but got SPI (%d)\n", irq);
		return -EINVAL;
	}

	if (irq_get_percpu_devid_partition(irq, &spe_pmu->supported_cpus)) {
		dev_err(&pdev->dev, "failed to get PPI partition (%d)\n", irq);
		return -EINVAL;
	}

	spe_pmu->irq = irq;
	return 0;
}

static const struct of_device_id arm_spe_pmu_of_match[] = {
	{ .compatible = "arm,statistical-profiling-extension-v1", .data = (void *)1 },
	{ /* Sentinel */ },
};

static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
{
	int ret;
	struct arm_spe_pmu *spe_pmu;
	struct device *dev = &pdev->dev;

	spe_pmu = devm_kzalloc(dev, sizeof(*spe_pmu), GFP_KERNEL);
	if (!spe_pmu) {
		dev_err(dev, "failed to allocate spe_pmu\n");
		return -ENOMEM;
	}

	spe_pmu->handle = alloc_percpu(typeof(*spe_pmu->handle));
	if (!spe_pmu->handle)
		return -ENOMEM;

	spe_pmu->pdev = pdev;
	platform_set_drvdata(pdev, spe_pmu);

	ret = arm_spe_pmu_irq_probe(spe_pmu);
	if (ret)
		goto out_free_handle;

	ret = arm_spe_pmu_dev_init(spe_pmu);
	if (ret)
		goto out_free_handle;

	ret = arm_spe_pmu_perf_init(spe_pmu);
	if (ret)
		goto out_teardown_dev;

	return 0;

out_teardown_dev:
	arm_spe_pmu_dev_teardown(spe_pmu);
out_free_handle:
	free_percpu(spe_pmu->handle);
	return ret;
}

static int arm_spe_pmu_device_remove(struct platform_device *pdev)
{
	struct arm_spe_pmu *spe_pmu = platform_get_drvdata(pdev);

	arm_spe_pmu_perf_destroy(spe_pmu);
	arm_spe_pmu_dev_teardown(spe_pmu);
	free_percpu(spe_pmu->handle);
	return 0;
}

static struct platform_driver arm_spe_pmu_driver = {
	.driver = {
		.name = DRVNAME,
		.of_match_table = of_match_ptr(arm_spe_pmu_of_match),
	},
	.probe = arm_spe_pmu_device_dt_probe,
	.remove = arm_spe_pmu_device_remove,
};

static int __init arm_spe_pmu_init(void)
{
	int ret;

	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
				      arm_spe_pmu_cpu_startup,
				      arm_spe_pmu_cpu_teardown);
	if (ret < 0)
		return ret;
	arm_spe_pmu_online = ret;

	ret = platform_driver_register(&arm_spe_pmu_driver);
	if (ret)
		cpuhp_remove_multi_state(arm_spe_pmu_online);

	return ret;
}

static void __exit arm_spe_pmu_exit(void)
{
	platform_driver_unregister(&arm_spe_pmu_driver);
	cpuhp_remove_multi_state(arm_spe_pmu_online);
}

module_init(arm_spe_pmu_init);
module_exit(arm_spe_pmu_exit);

MODULE_DESCRIPTION("Perf driver for the ARMv8.2 Statistical Profiling Extension");
MODULE_AUTHOR("Will Deacon <will.deacon@arm.com>");
MODULE_LICENSE("GPL v2");
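The SPE probe shown above derives the maximum record size as a power of two of the PMSIDR_EL1.MaxSize field and rejects anything outside 16 bytes to 2 KiB. A standalone sketch of that decoding follows; note the `MAXSIZE_SHIFT`/`MAXSIZE_MASK` values here are illustrative placeholders, not the kernel's actual sysreg definitions:

```c
/*
 * Illustrative field placement only: the real shift and mask come from
 * the kernel's sysreg headers, not from these placeholder values.
 */
#define MAXSIZE_SHIFT 12
#define MAXSIZE_MASK  0xfUL

/* Mirror of the probe's derivation: max_record_sz = 1 << PMSIDR_EL1.MaxSize */
unsigned long spe_max_record_sz(unsigned long pmsidr)
{
	return 1UL << ((pmsidr >> MAXSIZE_SHIFT) & MAXSIZE_MASK);
}

/* The probe bails out unless the record size is within 16 bytes .. 2 KiB */
int spe_record_sz_ok(unsigned long sz)
{
	return sz >= 16 && sz <= 2048;
}
```

With a MaxSize field of 6 the derived record size is 64 bytes, which passes the range check; a field encoding 4 KiB would make the probe fail.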
+1
drivers/perf/hisilicon/Makefile
obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o
+463
drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
/*
 * HiSilicon SoC DDRC uncore Hardware event counters support
 *
 * Copyright (C) 2017 Hisilicon Limited
 * Author: Shaokun Zhang <zhangshaokun@hisilicon.com>
 *         Anurup M <anurup.m@huawei.com>
 *
 * This code is based on the uncore PMUs like arm-cci and arm-ccn.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#include <linux/acpi.h>
#include <linux/bug.h>
#include <linux/cpuhotplug.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/list.h>
#include <linux/platform_device.h>
#include <linux/smp.h>

#include "hisi_uncore_pmu.h"

/* DDRC register definition */
#define DDRC_PERF_CTRL		0x010
#define DDRC_FLUX_WR		0x380
#define DDRC_FLUX_RD		0x384
#define DDRC_FLUX_WCMD		0x388
#define DDRC_FLUX_RCMD		0x38c
#define DDRC_PRE_CMD		0x3c0
#define DDRC_ACT_CMD		0x3c4
#define DDRC_BNK_CHG		0x3c8
#define DDRC_RNK_CHG		0x3cc
#define DDRC_EVENT_CTRL		0x6C0
#define DDRC_INT_MASK		0x6c8
#define DDRC_INT_STATUS		0x6cc
#define DDRC_INT_CLEAR		0x6d0

/* DDRC has 8 counters */
#define DDRC_NR_COUNTERS	0x8
#define DDRC_PERF_CTRL_EN	0x2

/*
 * The DDRC PMU has eight events, each mapped to a fixed-purpose counter
 * whose register offset is not consistent. Therefore there is no
 * write_evtype operation and we assume that the event code (0 to 7)
 * is equal to the counter index in the PMU driver.
 */
#define GET_DDRC_EVENTID(hwc)	(hwc->config_base & 0x7)

static const u32 ddrc_reg_off[] = {
	DDRC_FLUX_WR, DDRC_FLUX_RD, DDRC_FLUX_WCMD, DDRC_FLUX_RCMD,
	DDRC_PRE_CMD, DDRC_ACT_CMD, DDRC_BNK_CHG, DDRC_RNK_CHG
};

/*
 * Select the counter register offset using the counter index.
 * In the DDRC there are no programmable counters; the count is
 * read from the statistics counter register itself.
 */
static u32 hisi_ddrc_pmu_get_counter_offset(int cntr_idx)
{
	return ddrc_reg_off[cntr_idx];
}

static u64 hisi_ddrc_pmu_read_counter(struct hisi_pmu *ddrc_pmu,
				      struct hw_perf_event *hwc)
{
	/* Use event code as counter index */
	u32 idx = GET_DDRC_EVENTID(hwc);

	if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
		dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
		return 0;
	}

	return readl(ddrc_pmu->base + hisi_ddrc_pmu_get_counter_offset(idx));
}

static void hisi_ddrc_pmu_write_counter(struct hisi_pmu *ddrc_pmu,
					struct hw_perf_event *hwc, u64 val)
{
	u32 idx = GET_DDRC_EVENTID(hwc);

	if (!hisi_uncore_pmu_counter_valid(ddrc_pmu, idx)) {
		dev_err(ddrc_pmu->dev, "Unsupported event index:%d!\n", idx);
		return;
	}

	writel((u32)val,
	       ddrc_pmu->base + hisi_ddrc_pmu_get_counter_offset(idx));
}

/*
 * For the DDRC PMU, each event is mapped to a fixed-purpose counter by
 * hardware, so there is no need to write an event type.
 */
static void hisi_ddrc_pmu_write_evtype(struct hisi_pmu *ddrc_pmu, int idx,
				       u32 type)
{
}

static void hisi_ddrc_pmu_start_counters(struct hisi_pmu *ddrc_pmu)
{
	u32 val;

	/* Set perf_enable in DDRC_PERF_CTRL to start event counting */
	val = readl(ddrc_pmu->base + DDRC_PERF_CTRL);
	val |= DDRC_PERF_CTRL_EN;
	writel(val, ddrc_pmu->base + DDRC_PERF_CTRL);
}

static void hisi_ddrc_pmu_stop_counters(struct hisi_pmu *ddrc_pmu)
{
	u32 val;

	/* Clear perf_enable in DDRC_PERF_CTRL to stop event counting */
	val = readl(ddrc_pmu->base + DDRC_PERF_CTRL);
	val &= ~DDRC_PERF_CTRL_EN;
	writel(val, ddrc_pmu->base + DDRC_PERF_CTRL);
}

static void hisi_ddrc_pmu_enable_counter(struct hisi_pmu *ddrc_pmu,
					 struct hw_perf_event *hwc)
{
	u32 val;

	/* Set counter index (event code) in DDRC_EVENT_CTRL register */
	val = readl(ddrc_pmu->base + DDRC_EVENT_CTRL);
	val |= (1 << GET_DDRC_EVENTID(hwc));
	writel(val, ddrc_pmu->base + DDRC_EVENT_CTRL);
}

static void hisi_ddrc_pmu_disable_counter(struct hisi_pmu *ddrc_pmu,
					  struct hw_perf_event *hwc)
{
	u32 val;

	/* Clear counter index (event code) in DDRC_EVENT_CTRL register */
	val = readl(ddrc_pmu->base + DDRC_EVENT_CTRL);
	val &= ~(1 << GET_DDRC_EVENTID(hwc));
	writel(val, ddrc_pmu->base + DDRC_EVENT_CTRL);
}

static int hisi_ddrc_pmu_get_event_idx(struct perf_event *event)
{
	struct hisi_pmu *ddrc_pmu = to_hisi_pmu(event->pmu);
	unsigned long *used_mask = ddrc_pmu->pmu_events.used_mask;
	struct hw_perf_event *hwc = &event->hw;
	/* For DDRC PMU, we use event code as counter index */
	int idx = GET_DDRC_EVENTID(hwc);

	if (test_bit(idx, used_mask))
		return -EAGAIN;

	set_bit(idx, used_mask);

	return idx;
}

static void hisi_ddrc_pmu_enable_counter_int(struct hisi_pmu *ddrc_pmu,
					     struct hw_perf_event *hwc)
{
	u32 val;

	/* Write 0 to enable interrupt */
	val = readl(ddrc_pmu->base + DDRC_INT_MASK);
	val &= ~(1 << GET_DDRC_EVENTID(hwc));
	writel(val, ddrc_pmu->base + DDRC_INT_MASK);
}

static void hisi_ddrc_pmu_disable_counter_int(struct hisi_pmu *ddrc_pmu,
					      struct hw_perf_event *hwc)
{
	u32 val;

	/* Write 1 to mask interrupt */
	val = readl(ddrc_pmu->base + DDRC_INT_MASK);
	val |= (1 << GET_DDRC_EVENTID(hwc));
	writel(val, ddrc_pmu->base + DDRC_INT_MASK);
}

static irqreturn_t hisi_ddrc_pmu_isr(int irq, void *dev_id)
{
	struct hisi_pmu *ddrc_pmu = dev_id;
	struct perf_event *event;
	unsigned long overflown;
	int idx;

	/* Read the DDRC_INT_STATUS register */
	overflown = readl(ddrc_pmu->base + DDRC_INT_STATUS);
	if (!overflown)
		return IRQ_NONE;

	/*
	 * Find the counter index which overflowed if the bit was set
	 * and handle it
	 */
	for_each_set_bit(idx, &overflown, DDRC_NR_COUNTERS) {
		/* Write 1 to clear the IRQ status flag */
		writel((1 << idx), ddrc_pmu->base + DDRC_INT_CLEAR);

		/* Get the corresponding event struct */
		event = ddrc_pmu->pmu_events.hw_events[idx];
		if (!event)
			continue;

		hisi_uncore_pmu_event_update(event);
		hisi_uncore_pmu_set_event_period(event);
	}

	return IRQ_HANDLED;
}

static int hisi_ddrc_pmu_init_irq(struct hisi_pmu *ddrc_pmu,
				  struct platform_device *pdev)
{
	int irq, ret;

	/* Read and init IRQ */
	irq = platform_get_irq(pdev, 0);
	if (irq < 0) {
		dev_err(&pdev->dev, "DDRC PMU get irq fail; irq:%d\n", irq);
		return irq;
	}

	ret = devm_request_irq(&pdev->dev, irq, hisi_ddrc_pmu_isr,
			       IRQF_NOBALANCING | IRQF_NO_THREAD,
			       dev_name(&pdev->dev), ddrc_pmu);
	if (ret < 0) {
		dev_err(&pdev->dev,
			"Fail to request IRQ:%d ret:%d\n", irq, ret);
		return ret;
	}

	ddrc_pmu->irq = irq;

	return 0;
}

static const struct acpi_device_id hisi_ddrc_pmu_acpi_match[] = {
	{ "HISI0233", },
	{},
};
MODULE_DEVICE_TABLE(acpi, hisi_ddrc_pmu_acpi_match);

static int hisi_ddrc_pmu_init_data(struct platform_device *pdev,
				   struct hisi_pmu *ddrc_pmu)
{
	struct resource *res;

	/*
	 * Use the SCCL_ID and DDRC channel ID to identify the
	 * DDRC PMU, while SCCL_ID is in MPIDR[aff2].
	 */
	if (device_property_read_u32(&pdev->dev, "hisilicon,ch-id",
				     &ddrc_pmu->index_id)) {
		dev_err(&pdev->dev, "Can not read ddrc channel-id!\n");
		return -EINVAL;
	}

	if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id",
				     &ddrc_pmu->sccl_id)) {
		dev_err(&pdev->dev, "Can not read ddrc sccl-id!\n");
		return -EINVAL;
	}
	/* DDRC PMUs only share the same SCCL */
	ddrc_pmu->ccl_id = -1;

	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	ddrc_pmu->base = devm_ioremap_resource(&pdev->dev, res);
	if (IS_ERR(ddrc_pmu->base)) {
		dev_err(&pdev->dev, "ioremap failed for ddrc_pmu resource\n");
		return PTR_ERR(ddrc_pmu->base);
	}

	return 0;
}

static struct attribute *hisi_ddrc_pmu_format_attr[] = {
	HISI_PMU_FORMAT_ATTR(event, "config:0-4"),
	NULL,
};

static const struct attribute_group hisi_ddrc_pmu_format_group = {
	.name = "format",
	.attrs = hisi_ddrc_pmu_format_attr,
};

static struct attribute *hisi_ddrc_pmu_events_attr[] = {
	HISI_PMU_EVENT_ATTR(flux_wr,	0x00),
	HISI_PMU_EVENT_ATTR(flux_rd,	0x01),
	HISI_PMU_EVENT_ATTR(flux_wcmd,	0x02),
	HISI_PMU_EVENT_ATTR(flux_rcmd,	0x03),
	HISI_PMU_EVENT_ATTR(pre_cmd,	0x04),
	HISI_PMU_EVENT_ATTR(act_cmd,	0x05),
	HISI_PMU_EVENT_ATTR(rnk_chg,	0x06),
	HISI_PMU_EVENT_ATTR(rw_chg,	0x07),
	NULL,
};

static const struct attribute_group hisi_ddrc_pmu_events_group = {
	.name = "events",
	.attrs = hisi_ddrc_pmu_events_attr,
};

static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL);

static struct attribute *hisi_ddrc_pmu_cpumask_attrs[] = {
	&dev_attr_cpumask.attr,
	NULL,
};

static const struct attribute_group hisi_ddrc_pmu_cpumask_attr_group = {
	.attrs = hisi_ddrc_pmu_cpumask_attrs,
};

static const struct attribute_group *hisi_ddrc_pmu_attr_groups[] = {
	&hisi_ddrc_pmu_format_group,
	&hisi_ddrc_pmu_events_group,
	&hisi_ddrc_pmu_cpumask_attr_group,
	NULL,
};

static const struct hisi_uncore_ops hisi_uncore_ddrc_ops = {
	.write_evtype		= hisi_ddrc_pmu_write_evtype,
	.get_event_idx		= hisi_ddrc_pmu_get_event_idx,
	.start_counters		= hisi_ddrc_pmu_start_counters,
	.stop_counters		= hisi_ddrc_pmu_stop_counters,
	.enable_counter		= hisi_ddrc_pmu_enable_counter,
	.disable_counter	= hisi_ddrc_pmu_disable_counter,
	.enable_counter_int	= hisi_ddrc_pmu_enable_counter_int,
	.disable_counter_int	= hisi_ddrc_pmu_disable_counter_int,
	.write_counter		= hisi_ddrc_pmu_write_counter,
	.read_counter		= hisi_ddrc_pmu_read_counter,
};

static int hisi_ddrc_pmu_dev_probe(struct platform_device *pdev,
				   struct hisi_pmu *ddrc_pmu)
{
	int ret;

	ret = hisi_ddrc_pmu_init_data(pdev, ddrc_pmu);
	if (ret)
		return ret;

	ret = hisi_ddrc_pmu_init_irq(ddrc_pmu, pdev);
	if (ret)
		return ret;

	ddrc_pmu->num_counters = DDRC_NR_COUNTERS;
	ddrc_pmu->counter_bits = 32;
	ddrc_pmu->ops = &hisi_uncore_ddrc_ops;
	ddrc_pmu->dev = &pdev->dev;
	ddrc_pmu->on_cpu = -1;
	ddrc_pmu->check_event = 7;

	return 0;
}

static int hisi_ddrc_pmu_probe(struct platform_device *pdev)
{
	struct hisi_pmu *ddrc_pmu;
	char *name;
	int ret;

	ddrc_pmu = devm_kzalloc(&pdev->dev, sizeof(*ddrc_pmu), GFP_KERNEL);
	if (!ddrc_pmu)
		return -ENOMEM;

	platform_set_drvdata(pdev, ddrc_pmu);

	ret = hisi_ddrc_pmu_dev_probe(pdev, ddrc_pmu);
	if (ret)
		return ret;

	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
				       &ddrc_pmu->node);
	if (ret) {
		dev_err(&pdev->dev, "Error %d registering hotplug\n", ret);
		return ret;
	}

	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_ddrc%u",
			      ddrc_pmu->sccl_id, ddrc_pmu->index_id);
	ddrc_pmu->pmu = (struct pmu) {
		.name		= name,
		.task_ctx_nr	= perf_invalid_context,
		.event_init	= hisi_uncore_pmu_event_init,
		.pmu_enable	= hisi_uncore_pmu_enable,
		.pmu_disable	= hisi_uncore_pmu_disable,
		.add		= hisi_uncore_pmu_add,
		.del		= hisi_uncore_pmu_del,
		.start		= hisi_uncore_pmu_start,
		.stop		= hisi_uncore_pmu_stop,
		.read		= hisi_uncore_pmu_read,
		.attr_groups	= hisi_ddrc_pmu_attr_groups,
	};

	ret = perf_pmu_register(&ddrc_pmu->pmu, name, -1);
	if (ret) {
		dev_err(ddrc_pmu->dev, "DDRC PMU register failed!\n");
		cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
					    &ddrc_pmu->node);
	}

	return ret;
}

static int hisi_ddrc_pmu_remove(struct platform_device *pdev)
{
	struct hisi_pmu *ddrc_pmu = platform_get_drvdata(pdev);

	perf_pmu_unregister(&ddrc_pmu->pmu);
	cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
				    &ddrc_pmu->node);

	return 0;
}

static struct platform_driver hisi_ddrc_pmu_driver = {
	.driver = {
		.name = "hisi_ddrc_pmu",
		.acpi_match_table = ACPI_PTR(hisi_ddrc_pmu_acpi_match),
	},
	.probe = hisi_ddrc_pmu_probe,
	.remove = hisi_ddrc_pmu_remove,
};

static int __init hisi_ddrc_pmu_module_init(void)
{
	int ret;

	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
				      "AP_PERF_ARM_HISI_DDRC_ONLINE",
				      hisi_uncore_pmu_online_cpu,
				      hisi_uncore_pmu_offline_cpu);
	if (ret) {
		pr_err("DDRC PMU: setup hotplug, ret = %d\n", ret);
		return ret;
	}

	ret = platform_driver_register(&hisi_ddrc_pmu_driver);
	if (ret)
		cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE);

	return ret;
}
module_init(hisi_ddrc_pmu_module_init);

static void __exit hisi_ddrc_pmu_module_exit(void)
{
	platform_driver_unregister(&hisi_ddrc_pmu_driver);
	cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE);
}
module_exit(hisi_ddrc_pmu_module_exit);

MODULE_DESCRIPTION("HiSilicon SoC DDRC uncore PMU driver");
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Shaokun Zhang <zhangshaokun@hisilicon.com>");
MODULE_AUTHOR("Anurup M <anurup.m@huawei.com>");
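The DDRC driver above has no programmable counters: the event code (0 to 7) doubles as the counter index, and each event is read from its own fixed statistics register. That mapping can be sketched in isolation, using the register offsets from the file; the `_sketch` names are local copies introduced here, not kernel identifiers:

```c
/* Offsets copied from the DDRC register definitions in the driver above */
static const unsigned int ddrc_reg_off_sketch[] = {
	0x380 /* FLUX_WR */,  0x384 /* FLUX_RD */,
	0x388 /* FLUX_WCMD */, 0x38c /* FLUX_RCMD */,
	0x3c0 /* PRE_CMD */,  0x3c4 /* ACT_CMD */,
	0x3c8 /* BNK_CHG */,  0x3cc /* RNK_CHG */,
};

/* Event code (0-7) doubles as the counter index; map it to its register */
unsigned int ddrc_counter_offset(unsigned int event_code)
{
	return ddrc_reg_off_sketch[event_code & 0x7];
}
```

This is why the driver's `write_evtype` callback is empty and `get_event_idx` simply claims the bit matching the event code.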
+473
drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
/*
 * HiSilicon SoC HHA uncore Hardware event counters support
 *
 * Copyright (C) 2017 Hisilicon Limited
 * Author: Shaokun Zhang <zhangshaokun@hisilicon.com>
 *         Anurup M <anurup.m@huawei.com>
 *
 * This code is based on the uncore PMUs like arm-cci and arm-ccn.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#include <linux/acpi.h>
#include <linux/bug.h>
#include <linux/cpuhotplug.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/list.h>
#include <linux/platform_device.h>
#include <linux/smp.h>

#include "hisi_uncore_pmu.h"

/* HHA register definition */
#define HHA_INT_MASK		0x0804
#define HHA_INT_STATUS		0x0808
#define HHA_INT_CLEAR		0x080C
#define HHA_PERF_CTRL		0x1E00
#define HHA_EVENT_CTRL		0x1E04
#define HHA_EVENT_TYPE0		0x1E80
/*
 * Each counter is 48 bits wide; bits [48:63] are reserved,
 * Read-As-Zero and Writes-Ignored.
 */
#define HHA_CNT0_LOWER		0x1F00

/* HHA has 16 counters */
#define HHA_NR_COUNTERS		0x10

#define HHA_PERF_CTRL_EN	0x1
#define HHA_EVTYPE_NONE		0xff

/*
 * Select the counter register offset using the counter index;
 * each counter is 48 bits wide.
 */
static u32 hisi_hha_pmu_get_counter_offset(int cntr_idx)
{
	return (HHA_CNT0_LOWER + (cntr_idx * 8));
}

static u64 hisi_hha_pmu_read_counter(struct hisi_pmu *hha_pmu,
				     struct hw_perf_event *hwc)
{
	u32 idx = hwc->idx;

	if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
		dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
		return 0;
	}

	/* Read 64 bits and, as in L3C, the top 16 bits are RAZ */
	return readq(hha_pmu->base + hisi_hha_pmu_get_counter_offset(idx));
}

static void hisi_hha_pmu_write_counter(struct hisi_pmu *hha_pmu,
				       struct hw_perf_event *hwc, u64 val)
{
	u32 idx = hwc->idx;

	if (!hisi_uncore_pmu_counter_valid(hha_pmu, idx)) {
		dev_err(hha_pmu->dev, "Unsupported event index:%d!\n", idx);
		return;
	}

	/* Write 64 bits and, as in L3C, the top 16 bits are WI */
	writeq(val, hha_pmu->base + hisi_hha_pmu_get_counter_offset(idx));
}

static void hisi_hha_pmu_write_evtype(struct hisi_pmu *hha_pmu, int idx,
				      u32 type)
{
	u32 reg, reg_idx, shift, val;

	/*
	 * Select the appropriate event select register (HHA_EVENT_TYPEx).
	 * There are 4 event select registers for the 16 hardware counters.
	 * The event code is 8 bits: for the first 4 hardware counters,
	 * HHA_EVENT_TYPE0 is chosen; for the next 4 hardware counters,
	 * HHA_EVENT_TYPE1 is chosen, and so on.
	 */
	reg = HHA_EVENT_TYPE0 + 4 * (idx / 4);
	reg_idx = idx % 4;
	shift = 8 * reg_idx;

	/* Write event code to HHA_EVENT_TYPEx register */
	val = readl(hha_pmu->base + reg);
	val &= ~(HHA_EVTYPE_NONE << shift);
	val |= (type << shift);
	writel(val, hha_pmu->base + reg);
}

static void hisi_hha_pmu_start_counters(struct hisi_pmu *hha_pmu)
{
	u32 val;

	/*
	 * Set perf_enable bit in HHA_PERF_CTRL to start event
	 * counting for all enabled counters.
	 */
	val = readl(hha_pmu->base + HHA_PERF_CTRL);
	val |= HHA_PERF_CTRL_EN;
	writel(val, hha_pmu->base + HHA_PERF_CTRL);
}

static void hisi_hha_pmu_stop_counters(struct hisi_pmu *hha_pmu)
{
	u32 val;

	/*
	 * Clear perf_enable bit in HHA_PERF_CTRL to stop event
	 * counting for all enabled counters.
	 */
	val = readl(hha_pmu->base + HHA_PERF_CTRL);
	val &= ~(HHA_PERF_CTRL_EN);
	writel(val, hha_pmu->base + HHA_PERF_CTRL);
}

static void hisi_hha_pmu_enable_counter(struct hisi_pmu *hha_pmu,
					struct hw_perf_event *hwc)
{
	u32 val;

	/* Enable counter index in HHA_EVENT_CTRL register */
	val = readl(hha_pmu->base + HHA_EVENT_CTRL);
	val |= (1 << hwc->idx);
	writel(val, hha_pmu->base + HHA_EVENT_CTRL);
}

static void hisi_hha_pmu_disable_counter(struct hisi_pmu *hha_pmu,
					 struct hw_perf_event *hwc)
{
	u32 val;

	/* Clear counter index in HHA_EVENT_CTRL register */
	val = readl(hha_pmu->base + HHA_EVENT_CTRL);
	val &= ~(1 << hwc->idx);
	writel(val, hha_pmu->base + HHA_EVENT_CTRL);
}

static void hisi_hha_pmu_enable_counter_int(struct hisi_pmu *hha_pmu,
					    struct hw_perf_event *hwc)
{
	u32 val;

	/* Write 0 to enable interrupt */
	val = readl(hha_pmu->base + HHA_INT_MASK);
	val &= ~(1 << hwc->idx);
	writel(val, hha_pmu->base + HHA_INT_MASK);
}

static void hisi_hha_pmu_disable_counter_int(struct hisi_pmu *hha_pmu,
					     struct hw_perf_event *hwc)
{
	u32 val;

	/* Write 1 to mask interrupt */
	val = readl(hha_pmu->base + HHA_INT_MASK);
	val |= (1 << hwc->idx);
	writel(val, hha_pmu->base + HHA_INT_MASK);
}

static irqreturn_t hisi_hha_pmu_isr(int irq, void *dev_id)
{
	struct hisi_pmu *hha_pmu = dev_id;
	struct perf_event *event;
	unsigned long overflown;
	int idx;

	/* Read HHA_INT_STATUS register */
	overflown = readl(hha_pmu->base + HHA_INT_STATUS);
	if (!overflown)
		return IRQ_NONE;

	/*
	 * Find the counter index which overflowed if the bit was set
	 * and handle it
	 */
	for_each_set_bit(idx, &overflown, HHA_NR_COUNTERS) {
		/* Write 1 to clear the IRQ status flag */
		writel((1 << idx), hha_pmu->base + HHA_INT_CLEAR);

		/* Get the corresponding event struct */
		event = hha_pmu->pmu_events.hw_events[idx];
		if (!event)
			continue;

		hisi_uncore_pmu_event_update(event);
		hisi_uncore_pmu_set_event_period(event);
	}

	return IRQ_HANDLED;
}

static int hisi_hha_pmu_init_irq(struct hisi_pmu *hha_pmu,
				 struct platform_device *pdev)
{
	int irq, ret;

	/* Read and init IRQ */
	irq = platform_get_irq(pdev, 0);
	if (irq < 0) {
		dev_err(&pdev->dev, "HHA PMU get irq fail; irq:%d\n", irq);
		return irq;
	}

	ret = devm_request_irq(&pdev->dev, irq, hisi_hha_pmu_isr,
			       IRQF_NOBALANCING | IRQF_NO_THREAD,
			       dev_name(&pdev->dev), hha_pmu);
	if (ret < 0) {
		dev_err(&pdev->dev,
			"Fail to request IRQ:%d ret:%d\n", irq, ret);
		return ret;
	}

	hha_pmu->irq = irq;

	return 0;
}

static const struct acpi_device_id hisi_hha_pmu_acpi_match[] = {
	{ "HISI0243", },
	{},
};
MODULE_DEVICE_TABLE(acpi, hisi_hha_pmu_acpi_match);

static int hisi_hha_pmu_init_data(struct platform_device *pdev,
				  struct hisi_pmu *hha_pmu)
{
	unsigned long long id;
	struct resource *res;
	acpi_status status;

	status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
				       "_UID", NULL, &id);
	if (ACPI_FAILURE(status))
		return -EINVAL;

	hha_pmu->index_id = id;

	/*
	 * Use SCCL_ID and UID to identify the HHA PMU, while
	 * SCCL_ID is in MPIDR[aff2].
	 */
	if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id",
				     &hha_pmu->sccl_id)) {
		dev_err(&pdev->dev, "Can not read hha sccl-id!\n");
		return -EINVAL;
	}
	/* HHA PMUs only share the same SCCL */
	hha_pmu->ccl_id = -1;

	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	hha_pmu->base = devm_ioremap_resource(&pdev->dev, res);
	if (IS_ERR(hha_pmu->base)) {
		dev_err(&pdev->dev, "ioremap failed for hha_pmu resource\n");
		return PTR_ERR(hha_pmu->base);
	}

	return 0;
}

static struct attribute *hisi_hha_pmu_format_attr[] = {
	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
	NULL,
};

static const struct attribute_group hisi_hha_pmu_format_group = {
	.name = "format",
	.attrs = hisi_hha_pmu_format_attr,
};

static struct attribute *hisi_hha_pmu_events_attr[] = {
	HISI_PMU_EVENT_ATTR(rx_ops_num,		0x00),
	HISI_PMU_EVENT_ATTR(rx_outer,		0x01),
	HISI_PMU_EVENT_ATTR(rx_sccl,		0x02),
	HISI_PMU_EVENT_ATTR(rx_ccix,		0x03),
	HISI_PMU_EVENT_ATTR(rx_wbi,		0x04),
	HISI_PMU_EVENT_ATTR(rx_wbip,		0x05),
	HISI_PMU_EVENT_ATTR(rx_wtistash,	0x11),
	HISI_PMU_EVENT_ATTR(rd_ddr_64b,		0x1c),
	HISI_PMU_EVENT_ATTR(wr_ddr_64b,		0x1d),
	HISI_PMU_EVENT_ATTR(rd_ddr_128b,	0x1e),
	HISI_PMU_EVENT_ATTR(wr_ddr_128b,	0x1f),
	HISI_PMU_EVENT_ATTR(spill_num,		0x20),
	HISI_PMU_EVENT_ATTR(spill_success,	0x21),
	HISI_PMU_EVENT_ATTR(bi_num,		0x23),
	HISI_PMU_EVENT_ATTR(mediated_num,	0x32),
	HISI_PMU_EVENT_ATTR(tx_snp_num,		0x33),
	HISI_PMU_EVENT_ATTR(tx_snp_outer,	0x34),
	HISI_PMU_EVENT_ATTR(tx_snp_ccix,	0x35),
	HISI_PMU_EVENT_ATTR(rx_snprspdata,	0x38),
	HISI_PMU_EVENT_ATTR(rx_snprsp_outer,	0x3c),
	HISI_PMU_EVENT_ATTR(sdir-lookup,	0x40),
	HISI_PMU_EVENT_ATTR(edir-lookup,	0x41),
	HISI_PMU_EVENT_ATTR(sdir-hit,		0x42),
	HISI_PMU_EVENT_ATTR(edir-hit,		0x43),
	HISI_PMU_EVENT_ATTR(sdir-home-migrate,	0x4c),
	HISI_PMU_EVENT_ATTR(edir-home-migrate,	0x4d),
	NULL,
};

static const struct attribute_group hisi_hha_pmu_events_group = {
	.name = "events",
	.attrs = hisi_hha_pmu_events_attr,
};

static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL);

static struct attribute *hisi_hha_pmu_cpumask_attrs[] = {
	&dev_attr_cpumask.attr,
	NULL,
};

static const struct attribute_group hisi_hha_pmu_cpumask_attr_group = {
	.attrs = hisi_hha_pmu_cpumask_attrs,
};

static const struct attribute_group *hisi_hha_pmu_attr_groups[] = {
	&hisi_hha_pmu_format_group,
	&hisi_hha_pmu_events_group,
	&hisi_hha_pmu_cpumask_attr_group,
	NULL,
};

static const struct hisi_uncore_ops hisi_uncore_hha_ops = {
	.write_evtype		= hisi_hha_pmu_write_evtype,
	.get_event_idx		= hisi_uncore_pmu_get_event_idx,
	.start_counters		= hisi_hha_pmu_start_counters,
	.stop_counters		= hisi_hha_pmu_stop_counters,
	.enable_counter		= hisi_hha_pmu_enable_counter,
	.disable_counter	= hisi_hha_pmu_disable_counter,
	.enable_counter_int	= hisi_hha_pmu_enable_counter_int,
	.disable_counter_int	= hisi_hha_pmu_disable_counter_int,
	.write_counter		= hisi_hha_pmu_write_counter,
	.read_counter		= hisi_hha_pmu_read_counter,
};

static int hisi_hha_pmu_dev_probe(struct platform_device *pdev,
				  struct hisi_pmu *hha_pmu)
{
	int ret;

	ret = hisi_hha_pmu_init_data(pdev, hha_pmu);
	if (ret)
		return ret;

	ret = hisi_hha_pmu_init_irq(hha_pmu, pdev);
	if (ret)
		return ret;

	hha_pmu->num_counters = HHA_NR_COUNTERS;
	hha_pmu->counter_bits = 48;
	hha_pmu->ops = &hisi_uncore_hha_ops;
	hha_pmu->dev = &pdev->dev;
	hha_pmu->on_cpu = -1;
	hha_pmu->check_event = 0x65;

	return 0;
}

static int hisi_hha_pmu_probe(struct platform_device *pdev)
{
	struct hisi_pmu *hha_pmu;
	char *name;
	int ret;

	hha_pmu = devm_kzalloc(&pdev->dev, sizeof(*hha_pmu), GFP_KERNEL);
	if (!hha_pmu)
		return -ENOMEM;

	platform_set_drvdata(pdev, hha_pmu);

	ret = hisi_hha_pmu_dev_probe(pdev, hha_pmu);
	if (ret)
		return ret;

	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
				       &hha_pmu->node);
	if (ret) {
		dev_err(&pdev->dev, "Error %d registering hotplug\n", ret);
		return ret;
	}

	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_hha%u",
			      hha_pmu->sccl_id, hha_pmu->index_id);
	hha_pmu->pmu = (struct pmu) {
		.name		= name,
		.task_ctx_nr	= perf_invalid_context,
		.event_init	= hisi_uncore_pmu_event_init,
		.pmu_enable	= hisi_uncore_pmu_enable,
		.pmu_disable	= hisi_uncore_pmu_disable,
		.add		= hisi_uncore_pmu_add,
		.del		= hisi_uncore_pmu_del,
		.start		= hisi_uncore_pmu_start,
		.stop		= hisi_uncore_pmu_stop,
		.read		= hisi_uncore_pmu_read,
		.attr_groups	= hisi_hha_pmu_attr_groups,
	};

	ret = perf_pmu_register(&hha_pmu->pmu, name, -1);
	if (ret) {
		dev_err(hha_pmu->dev, "HHA PMU register failed!\n");
		cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
					    &hha_pmu->node);
	}

	return ret;
}

static int hisi_hha_pmu_remove(struct platform_device *pdev)
{
	struct hisi_pmu *hha_pmu = platform_get_drvdata(pdev);

	perf_pmu_unregister(&hha_pmu->pmu);
	cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
				    &hha_pmu->node);

	return 0;
}

static struct platform_driver hisi_hha_pmu_driver = {
	.driver = {
		.name = "hisi_hha_pmu",
		.acpi_match_table = ACPI_PTR(hisi_hha_pmu_acpi_match),
	},
	.probe = hisi_hha_pmu_probe,
	.remove = hisi_hha_pmu_remove,
};

static int __init hisi_hha_pmu_module_init(void)
{
	int ret;

	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
				      "AP_PERF_ARM_HISI_HHA_ONLINE",
				      hisi_uncore_pmu_online_cpu,
				      hisi_uncore_pmu_offline_cpu);
	if (ret) {
		pr_err("HHA PMU: Error setup hotplug, ret = %d\n", ret);
		return ret;
	}

	ret = platform_driver_register(&hisi_hha_pmu_driver);
	if (ret)
		cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE);

	return ret;
}
module_init(hisi_hha_pmu_module_init);

static void __exit hisi_hha_pmu_module_exit(void)
{
	platform_driver_unregister(&hisi_hha_pmu_driver);
	cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE);
}
module_exit(hisi_hha_pmu_module_exit);

MODULE_DESCRIPTION("HiSilicon SoC HHA uncore PMU driver");
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Shaokun Zhang <zhangshaokun@hisilicon.com>");
MODULE_AUTHOR("Anurup M <anurup.m@huawei.com>");
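hisi_hha_pmu_write_evtype() above packs four 8-bit event codes into each 32-bit HHA_EVENT_TYPEx register, so selecting the register and bit position is plain arithmetic on the counter index. A sketch of just that selection (the base offset is copied from the file; the helper names are hypothetical):

```c
/* Base offset of HHA_EVENT_TYPE0, copied from the driver above */
#define HHA_EVENT_TYPE0_SKETCH 0x1E80

/* Four 8-bit event fields per 32-bit HHA_EVENT_TYPEx register */
unsigned int hha_evtype_reg(int idx)
{
	return HHA_EVENT_TYPE0_SKETCH + 4 * (idx / 4);
}

/* Bit position of counter idx's event field within that register */
unsigned int hha_evtype_shift(int idx)
{
	return 8 * (idx % 4);
}
```

Counter 0 lands in HHA_EVENT_TYPE0 bits [7:0], while counter 5 lands in HHA_EVENT_TYPE1 bits [15:8]; the same pattern covers all 16 counters across the four registers.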
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
··· 1 + /* 2 + * HiSilicon SoC L3C uncore Hardware event counters support 3 + * 4 + * Copyright (C) 2017 Hisilicon Limited 5 + * Author: Anurup M <anurup.m@huawei.com> 6 + * Shaokun Zhang <zhangshaokun@hisilicon.com> 7 + * 8 + * This code is based on the uncore PMUs like arm-cci and arm-ccn. 9 + * 10 + * This program is free software; you can redistribute it and/or modify 11 + * it under the terms of the GNU General Public License version 2 as 12 + * published by the Free Software Foundation. 13 + */ 14 + #include <linux/acpi.h> 15 + #include <linux/bug.h> 16 + #include <linux/cpuhotplug.h> 17 + #include <linux/interrupt.h> 18 + #include <linux/irq.h> 19 + #include <linux/list.h> 20 + #include <linux/platform_device.h> 21 + #include <linux/smp.h> 22 + 23 + #include "hisi_uncore_pmu.h" 24 + 25 + /* L3C register definition */ 26 + #define L3C_PERF_CTRL 0x0408 27 + #define L3C_INT_MASK 0x0800 28 + #define L3C_INT_STATUS 0x0808 29 + #define L3C_INT_CLEAR 0x080c 30 + #define L3C_EVENT_CTRL 0x1c00 31 + #define L3C_EVENT_TYPE0 0x1d00 32 + /* 33 + * Each counter is 48-bits and [48:63] are reserved 34 + * which are Read-As-Zero and Writes-Ignored. 
35 + */ 36 + #define L3C_CNTR0_LOWER 0x1e00 37 + 38 + /* L3C has 8-counters */ 39 + #define L3C_NR_COUNTERS 0x8 40 + 41 + #define L3C_PERF_CTRL_EN 0x20000 42 + #define L3C_EVTYPE_NONE 0xff 43 + 44 + /* 45 + * Select the counter register offset using the counter index 46 + */ 47 + static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx) 48 + { 49 + return (L3C_CNTR0_LOWER + (cntr_idx * 8)); 50 + } 51 + 52 + static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu, 53 + struct hw_perf_event *hwc) 54 + { 55 + u32 idx = hwc->idx; 56 + 57 + if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) { 58 + dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx); 59 + return 0; 60 + } 61 + 62 + /* Read 64-bits and the upper 16 bits are RAZ */ 63 + return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(idx)); 64 + } 65 + 66 + static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu, 67 + struct hw_perf_event *hwc, u64 val) 68 + { 69 + u32 idx = hwc->idx; 70 + 71 + if (!hisi_uncore_pmu_counter_valid(l3c_pmu, idx)) { 72 + dev_err(l3c_pmu->dev, "Unsupported event index:%d!\n", idx); 73 + return; 74 + } 75 + 76 + /* Write 64-bits and the upper 16 bits are WI */ 77 + writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(idx)); 78 + } 79 + 80 + static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx, 81 + u32 type) 82 + { 83 + u32 reg, reg_idx, shift, val; 84 + 85 + /* 86 + * Select the appropriate event select register(L3C_EVENT_TYPE0/1). 87 + * There are 2 event select registers for the 8 hardware counters. 88 + * Event code is 8-bits and for the former 4 hardware counters, 89 + * L3C_EVENT_TYPE0 is chosen. For the latter 4 hardware counters, 90 + * L3C_EVENT_TYPE1 is chosen. 
91 + */ 92 + reg = L3C_EVENT_TYPE0 + (idx / 4) * 4; 93 + reg_idx = idx % 4; 94 + shift = 8 * reg_idx; 95 + 96 + /* Write event code to L3C_EVENT_TYPEx Register */ 97 + val = readl(l3c_pmu->base + reg); 98 + val &= ~(L3C_EVTYPE_NONE << shift); 99 + val |= (type << shift); 100 + writel(val, l3c_pmu->base + reg); 101 + } 102 + 103 + static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu) 104 + { 105 + u32 val; 106 + 107 + /* 108 + * Set perf_enable bit in L3C_PERF_CTRL register to start counting 109 + * for all enabled counters. 110 + */ 111 + val = readl(l3c_pmu->base + L3C_PERF_CTRL); 112 + val |= L3C_PERF_CTRL_EN; 113 + writel(val, l3c_pmu->base + L3C_PERF_CTRL); 114 + } 115 + 116 + static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu) 117 + { 118 + u32 val; 119 + 120 + /* 121 + * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting 122 + * for all enabled counters. 123 + */ 124 + val = readl(l3c_pmu->base + L3C_PERF_CTRL); 125 + val &= ~(L3C_PERF_CTRL_EN); 126 + writel(val, l3c_pmu->base + L3C_PERF_CTRL); 127 + } 128 + 129 + static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu, 130 + struct hw_perf_event *hwc) 131 + { 132 + u32 val; 133 + 134 + /* Enable counter index in L3C_EVENT_CTRL register */ 135 + val = readl(l3c_pmu->base + L3C_EVENT_CTRL); 136 + val |= (1 << hwc->idx); 137 + writel(val, l3c_pmu->base + L3C_EVENT_CTRL); 138 + } 139 + 140 + static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu, 141 + struct hw_perf_event *hwc) 142 + { 143 + u32 val; 144 + 145 + /* Clear counter index in L3C_EVENT_CTRL register */ 146 + val = readl(l3c_pmu->base + L3C_EVENT_CTRL); 147 + val &= ~(1 << hwc->idx); 148 + writel(val, l3c_pmu->base + L3C_EVENT_CTRL); 149 + } 150 + 151 + static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu, 152 + struct hw_perf_event *hwc) 153 + { 154 + u32 val; 155 + 156 + val = readl(l3c_pmu->base + L3C_INT_MASK); 157 + /* Write 0 to enable interrupt */ 158 + val &= ~(1 << 
hwc->idx); 159 + writel(val, l3c_pmu->base + L3C_INT_MASK); 160 + } 161 + 162 + static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu, 163 + struct hw_perf_event *hwc) 164 + { 165 + u32 val; 166 + 167 + val = readl(l3c_pmu->base + L3C_INT_MASK); 168 + /* Write 1 to mask interrupt */ 169 + val |= (1 << hwc->idx); 170 + writel(val, l3c_pmu->base + L3C_INT_MASK); 171 + } 172 + 173 + static irqreturn_t hisi_l3c_pmu_isr(int irq, void *dev_id) 174 + { 175 + struct hisi_pmu *l3c_pmu = dev_id; 176 + struct perf_event *event; 177 + unsigned long overflown; 178 + int idx; 179 + 180 + /* Read L3C_INT_STATUS register */ 181 + overflown = readl(l3c_pmu->base + L3C_INT_STATUS); 182 + if (!overflown) 183 + return IRQ_NONE; 184 + 185 + /* 186 + * Find the counter index which overflowed if the bit was set 187 + * and handle it. 188 + */ 189 + for_each_set_bit(idx, &overflown, L3C_NR_COUNTERS) { 190 + /* Write 1 to clear the IRQ status flag */ 191 + writel((1 << idx), l3c_pmu->base + L3C_INT_CLEAR); 192 + 193 + /* Get the corresponding event struct */ 194 + event = l3c_pmu->pmu_events.hw_events[idx]; 195 + if (!event) 196 + continue; 197 + 198 + hisi_uncore_pmu_event_update(event); 199 + hisi_uncore_pmu_set_event_period(event); 200 + } 201 + 202 + return IRQ_HANDLED; 203 + } 204 + 205 + static int hisi_l3c_pmu_init_irq(struct hisi_pmu *l3c_pmu, 206 + struct platform_device *pdev) 207 + { 208 + int irq, ret; 209 + 210 + /* Read and init IRQ */ 211 + irq = platform_get_irq(pdev, 0); 212 + if (irq < 0) { 213 + dev_err(&pdev->dev, "L3C PMU get irq fail; irq:%d\n", irq); 214 + return irq; 215 + } 216 + 217 + ret = devm_request_irq(&pdev->dev, irq, hisi_l3c_pmu_isr, 218 + IRQF_NOBALANCING | IRQF_NO_THREAD, 219 + dev_name(&pdev->dev), l3c_pmu); 220 + if (ret < 0) { 221 + dev_err(&pdev->dev, 222 + "Fail to request IRQ:%d ret:%d\n", irq, ret); 223 + return ret; 224 + } 225 + 226 + l3c_pmu->irq = irq; 227 + 228 + return 0; 229 + } 230 + 231 + static const struct acpi_device_id 
hisi_l3c_pmu_acpi_match[] = { 232 + { "HISI0213", }, 233 + {}, 234 + }; 235 + MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match); 236 + 237 + static int hisi_l3c_pmu_init_data(struct platform_device *pdev, 238 + struct hisi_pmu *l3c_pmu) 239 + { 240 + unsigned long long id; 241 + struct resource *res; 242 + acpi_status status; 243 + 244 + status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev), 245 + "_UID", NULL, &id); 246 + if (ACPI_FAILURE(status)) 247 + return -EINVAL; 248 + 249 + l3c_pmu->index_id = id; 250 + 251 + /* 252 + * Use the SCCL_ID and CCL_ID to identify the L3C PMU, while 253 + * SCCL_ID is in MPIDR[aff2] and CCL_ID is in MPIDR[aff1]. 254 + */ 255 + if (device_property_read_u32(&pdev->dev, "hisilicon,scl-id", 256 + &l3c_pmu->sccl_id)) { 257 + dev_err(&pdev->dev, "Can not read l3c sccl-id!\n"); 258 + return -EINVAL; 259 + } 260 + 261 + if (device_property_read_u32(&pdev->dev, "hisilicon,ccl-id", 262 + &l3c_pmu->ccl_id)) { 263 + dev_err(&pdev->dev, "Can not read l3c ccl-id!\n"); 264 + return -EINVAL; 265 + } 266 + 267 + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); 268 + l3c_pmu->base = devm_ioremap_resource(&pdev->dev, res); 269 + if (IS_ERR(l3c_pmu->base)) { 270 + dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n"); 271 + return PTR_ERR(l3c_pmu->base); 272 + } 273 + 274 + return 0; 275 + } 276 + 277 + static struct attribute *hisi_l3c_pmu_format_attr[] = { 278 + HISI_PMU_FORMAT_ATTR(event, "config:0-7"), 279 + NULL, 280 + }; 281 + 282 + static const struct attribute_group hisi_l3c_pmu_format_group = { 283 + .name = "format", 284 + .attrs = hisi_l3c_pmu_format_attr, 285 + }; 286 + 287 + static struct attribute *hisi_l3c_pmu_events_attr[] = { 288 + HISI_PMU_EVENT_ATTR(rd_cpipe, 0x00), 289 + HISI_PMU_EVENT_ATTR(wr_cpipe, 0x01), 290 + HISI_PMU_EVENT_ATTR(rd_hit_cpipe, 0x02), 291 + HISI_PMU_EVENT_ATTR(wr_hit_cpipe, 0x03), 292 + HISI_PMU_EVENT_ATTR(victim_num, 0x04), 293 + HISI_PMU_EVENT_ATTR(rd_spipe, 0x20), 294 + 
HISI_PMU_EVENT_ATTR(wr_spipe, 0x21), 295 + HISI_PMU_EVENT_ATTR(rd_hit_spipe, 0x22), 296 + HISI_PMU_EVENT_ATTR(wr_hit_spipe, 0x23), 297 + HISI_PMU_EVENT_ATTR(back_invalid, 0x29), 298 + HISI_PMU_EVENT_ATTR(retry_cpu, 0x40), 299 + HISI_PMU_EVENT_ATTR(retry_ring, 0x41), 300 + HISI_PMU_EVENT_ATTR(prefetch_drop, 0x42), 301 + NULL, 302 + }; 303 + 304 + static const struct attribute_group hisi_l3c_pmu_events_group = { 305 + .name = "events", 306 + .attrs = hisi_l3c_pmu_events_attr, 307 + }; 308 + 309 + static DEVICE_ATTR(cpumask, 0444, hisi_cpumask_sysfs_show, NULL); 310 + 311 + static struct attribute *hisi_l3c_pmu_cpumask_attrs[] = { 312 + &dev_attr_cpumask.attr, 313 + NULL, 314 + }; 315 + 316 + static const struct attribute_group hisi_l3c_pmu_cpumask_attr_group = { 317 + .attrs = hisi_l3c_pmu_cpumask_attrs, 318 + }; 319 + 320 + static const struct attribute_group *hisi_l3c_pmu_attr_groups[] = { 321 + &hisi_l3c_pmu_format_group, 322 + &hisi_l3c_pmu_events_group, 323 + &hisi_l3c_pmu_cpumask_attr_group, 324 + NULL, 325 + }; 326 + 327 + static const struct hisi_uncore_ops hisi_uncore_l3c_ops = { 328 + .write_evtype = hisi_l3c_pmu_write_evtype, 329 + .get_event_idx = hisi_uncore_pmu_get_event_idx, 330 + .start_counters = hisi_l3c_pmu_start_counters, 331 + .stop_counters = hisi_l3c_pmu_stop_counters, 332 + .enable_counter = hisi_l3c_pmu_enable_counter, 333 + .disable_counter = hisi_l3c_pmu_disable_counter, 334 + .enable_counter_int = hisi_l3c_pmu_enable_counter_int, 335 + .disable_counter_int = hisi_l3c_pmu_disable_counter_int, 336 + .write_counter = hisi_l3c_pmu_write_counter, 337 + .read_counter = hisi_l3c_pmu_read_counter, 338 + }; 339 + 340 + static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev, 341 + struct hisi_pmu *l3c_pmu) 342 + { 343 + int ret; 344 + 345 + ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu); 346 + if (ret) 347 + return ret; 348 + 349 + ret = hisi_l3c_pmu_init_irq(l3c_pmu, pdev); 350 + if (ret) 351 + return ret; 352 + 353 + l3c_pmu->num_counters = 
L3C_NR_COUNTERS; 354 + l3c_pmu->counter_bits = 48; 355 + l3c_pmu->ops = &hisi_uncore_l3c_ops; 356 + l3c_pmu->dev = &pdev->dev; 357 + l3c_pmu->on_cpu = -1; 358 + l3c_pmu->check_event = 0x59; 359 + 360 + return 0; 361 + } 362 + 363 + static int hisi_l3c_pmu_probe(struct platform_device *pdev) 364 + { 365 + struct hisi_pmu *l3c_pmu; 366 + char *name; 367 + int ret; 368 + 369 + l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL); 370 + if (!l3c_pmu) 371 + return -ENOMEM; 372 + 373 + platform_set_drvdata(pdev, l3c_pmu); 374 + 375 + ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu); 376 + if (ret) 377 + return ret; 378 + 379 + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 380 + &l3c_pmu->node); 381 + if (ret) { 382 + dev_err(&pdev->dev, "Error %d registering hotplug\n", ret); 383 + return ret; 384 + } 385 + 386 + name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_l3c%u", 387 + l3c_pmu->sccl_id, l3c_pmu->index_id); 388 + l3c_pmu->pmu = (struct pmu) { 389 + .name = name, 390 + .task_ctx_nr = perf_invalid_context, 391 + .event_init = hisi_uncore_pmu_event_init, 392 + .pmu_enable = hisi_uncore_pmu_enable, 393 + .pmu_disable = hisi_uncore_pmu_disable, 394 + .add = hisi_uncore_pmu_add, 395 + .del = hisi_uncore_pmu_del, 396 + .start = hisi_uncore_pmu_start, 397 + .stop = hisi_uncore_pmu_stop, 398 + .read = hisi_uncore_pmu_read, 399 + .attr_groups = hisi_l3c_pmu_attr_groups, 400 + }; 401 + 402 + ret = perf_pmu_register(&l3c_pmu->pmu, name, -1); 403 + if (ret) { 404 + dev_err(l3c_pmu->dev, "L3C PMU register failed!\n"); 405 + cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 406 + &l3c_pmu->node); 407 + } 408 + 409 + return ret; 410 + } 411 + 412 + static int hisi_l3c_pmu_remove(struct platform_device *pdev) 413 + { 414 + struct hisi_pmu *l3c_pmu = platform_get_drvdata(pdev); 415 + 416 + perf_pmu_unregister(&l3c_pmu->pmu); 417 + cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 418 + &l3c_pmu->node); 419 + 420 + return 0; 
421 + } 422 + 423 + static struct platform_driver hisi_l3c_pmu_driver = { 424 + .driver = { 425 + .name = "hisi_l3c_pmu", 426 + .acpi_match_table = ACPI_PTR(hisi_l3c_pmu_acpi_match), 427 + }, 428 + .probe = hisi_l3c_pmu_probe, 429 + .remove = hisi_l3c_pmu_remove, 430 + }; 431 + 432 + static int __init hisi_l3c_pmu_module_init(void) 433 + { 434 + int ret; 435 + 436 + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 437 + "AP_PERF_ARM_HISI_L3_ONLINE", 438 + hisi_uncore_pmu_online_cpu, 439 + hisi_uncore_pmu_offline_cpu); 440 + if (ret) { 441 + pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret); 442 + return ret; 443 + } 444 + 445 + ret = platform_driver_register(&hisi_l3c_pmu_driver); 446 + if (ret) 447 + cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE); 448 + 449 + return ret; 450 + } 451 + module_init(hisi_l3c_pmu_module_init); 452 + 453 + static void __exit hisi_l3c_pmu_module_exit(void) 454 + { 455 + platform_driver_unregister(&hisi_l3c_pmu_driver); 456 + cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE); 457 + } 458 + module_exit(hisi_l3c_pmu_module_exit); 459 + 460 + MODULE_DESCRIPTION("HiSilicon SoC L3C uncore PMU driver"); 461 + MODULE_LICENSE("GPL v2"); 462 + MODULE_AUTHOR("Anurup M <anurup.m@huawei.com>"); 463 + MODULE_AUTHOR("Shaokun Zhang <zhangshaokun@hisilicon.com>");
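The L3C driver above packs four 8-bit event codes into each 32-bit L3C_EVENT_TYPEx register and addresses counters at 8-byte strides from L3C_CNTR0_LOWER. The register/shift arithmetic from hisi_l3c_pmu_write_evtype() and hisi_l3c_pmu_get_counter_offset() can be checked in user space; the helper names below are illustrative, not the driver's:

```c
#include <assert.h>
#include <stdint.h>

#define L3C_EVENT_TYPE0	0x1d00
#define L3C_CNTR0_LOWER	0x1e00
#define L3C_EVTYPE_NONE	0xff

/* Offset of the event-select register holding counter idx (4 fields per reg) */
static uint32_t l3c_evtype_reg(int idx)
{
	return L3C_EVENT_TYPE0 + (idx / 4) * 4;
}

/* Insert an 8-bit event code into the packed register value */
static uint32_t l3c_evtype_pack(uint32_t old, int idx, uint32_t type)
{
	uint32_t shift = 8 * (idx % 4);

	old &= ~((uint32_t)L3C_EVTYPE_NONE << shift);	/* clear old field */
	return old | (type << shift);			/* write new code */
}

/* Offset of the 64-bit counter register for counter idx */
static uint32_t l3c_counter_offset(int idx)
{
	return L3C_CNTR0_LOWER + idx * 8;
}
```

In the driver the packed value is applied with a readl()/writel() read-modify-write on the MMIO base, so sibling counters in the same register are preserved.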
drivers/perf/hisilicon/hisi_uncore_pmu.c
··· 1 + /* 2 + * HiSilicon SoC Hardware event counters support 3 + * 4 + * Copyright (C) 2017 Hisilicon Limited 5 + * Author: Anurup M <anurup.m@huawei.com> 6 + * Shaokun Zhang <zhangshaokun@hisilicon.com> 7 + * 8 + * This code is based on the uncore PMUs like arm-cci and arm-ccn. 9 + * 10 + * This program is free software; you can redistribute it and/or modify 11 + * it under the terms of the GNU General Public License version 2 as 12 + * published by the Free Software Foundation. 13 + */ 14 + #include <linux/bitmap.h> 15 + #include <linux/bitops.h> 16 + #include <linux/bug.h> 17 + #include <linux/err.h> 18 + #include <linux/errno.h> 19 + #include <linux/interrupt.h> 20 + 21 + #include <asm/local64.h> 22 + 23 + #include "hisi_uncore_pmu.h" 24 + 25 + #define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff) 26 + #define HISI_MAX_PERIOD(nr) (BIT_ULL(nr) - 1) 27 + 28 + /* 29 + * PMU format attributes 30 + */ 31 + ssize_t hisi_format_sysfs_show(struct device *dev, 32 + struct device_attribute *attr, char *buf) 33 + { 34 + struct dev_ext_attribute *eattr; 35 + 36 + eattr = container_of(attr, struct dev_ext_attribute, attr); 37 + 38 + return sprintf(buf, "%s\n", (char *)eattr->var); 39 + } 40 + 41 + /* 42 + * PMU event attributes 43 + */ 44 + ssize_t hisi_event_sysfs_show(struct device *dev, 45 + struct device_attribute *attr, char *page) 46 + { 47 + struct dev_ext_attribute *eattr; 48 + 49 + eattr = container_of(attr, struct dev_ext_attribute, attr); 50 + 51 + return sprintf(page, "config=0x%lx\n", (unsigned long)eattr->var); 52 + } 53 + 54 + /* 55 + * sysfs cpumask attributes. 
For uncore PMU, we only have a single CPU to show 56 + */ 57 + ssize_t hisi_cpumask_sysfs_show(struct device *dev, 58 + struct device_attribute *attr, char *buf) 59 + { 60 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(dev_get_drvdata(dev)); 61 + 62 + return sprintf(buf, "%d\n", hisi_pmu->on_cpu); 63 + } 64 + 65 + static bool hisi_validate_event_group(struct perf_event *event) 66 + { 67 + struct perf_event *sibling, *leader = event->group_leader; 68 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 69 + /* Include count for the event */ 70 + int counters = 1; 71 + 72 + if (!is_software_event(leader)) { 73 + /* 74 + * We must NOT create groups containing mixed PMUs, although 75 + * software events are acceptable 76 + */ 77 + if (leader->pmu != event->pmu) 78 + return false; 79 + 80 + /* Increment counter for the leader */ 81 + if (leader != event) 82 + counters++; 83 + } 84 + 85 + list_for_each_entry(sibling, &event->group_leader->sibling_list, 86 + group_entry) { 87 + if (is_software_event(sibling)) 88 + continue; 89 + if (sibling->pmu != event->pmu) 90 + return false; 91 + /* Increment counter for each sibling */ 92 + counters++; 93 + } 94 + 95 + /* The group can not count events more than the counters in the HW */ 96 + return counters <= hisi_pmu->num_counters; 97 + } 98 + 99 + int hisi_uncore_pmu_counter_valid(struct hisi_pmu *hisi_pmu, int idx) 100 + { 101 + return idx >= 0 && idx < hisi_pmu->num_counters; 102 + } 103 + 104 + int hisi_uncore_pmu_get_event_idx(struct perf_event *event) 105 + { 106 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 107 + unsigned long *used_mask = hisi_pmu->pmu_events.used_mask; 108 + u32 num_counters = hisi_pmu->num_counters; 109 + int idx; 110 + 111 + idx = find_first_zero_bit(used_mask, num_counters); 112 + if (idx == num_counters) 113 + return -EAGAIN; 114 + 115 + set_bit(idx, used_mask); 116 + 117 + return idx; 118 + } 119 + 120 + static void hisi_uncore_pmu_clear_event_idx(struct hisi_pmu *hisi_pmu, int idx) 121 + { 122 
+ if (!hisi_uncore_pmu_counter_valid(hisi_pmu, idx)) { 123 + dev_err(hisi_pmu->dev, "Unsupported event index:%d!\n", idx); 124 + return; 125 + } 126 + 127 + clear_bit(idx, hisi_pmu->pmu_events.used_mask); 128 + } 129 + 130 + int hisi_uncore_pmu_event_init(struct perf_event *event) 131 + { 132 + struct hw_perf_event *hwc = &event->hw; 133 + struct hisi_pmu *hisi_pmu; 134 + 135 + if (event->attr.type != event->pmu->type) 136 + return -ENOENT; 137 + 138 + /* 139 + * We do not support sampling as the counters are all 140 + * shared by all CPU cores in a CPU die(SCCL). Also we 141 + * do not support attach to a task(per-process mode) 142 + */ 143 + if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK) 144 + return -EOPNOTSUPP; 145 + 146 + /* counters do not have these bits */ 147 + if (event->attr.exclude_user || 148 + event->attr.exclude_kernel || 149 + event->attr.exclude_host || 150 + event->attr.exclude_guest || 151 + event->attr.exclude_hv || 152 + event->attr.exclude_idle) 153 + return -EINVAL; 154 + 155 + /* 156 + * The uncore counters not specific to any CPU, so cannot 157 + * support per-task 158 + */ 159 + if (event->cpu < 0) 160 + return -EINVAL; 161 + 162 + /* 163 + * Validate if the events in group does not exceed the 164 + * available counters in hardware. 165 + */ 166 + if (!hisi_validate_event_group(event)) 167 + return -EINVAL; 168 + 169 + hisi_pmu = to_hisi_pmu(event->pmu); 170 + if (event->attr.config > hisi_pmu->check_event) 171 + return -EINVAL; 172 + 173 + if (hisi_pmu->on_cpu == -1) 174 + return -EINVAL; 175 + /* 176 + * We don't assign an index until we actually place the event onto 177 + * hardware. Use -1 to signify that we haven't decided where to put it 178 + * yet. 
179 + */ 180 + hwc->idx = -1; 181 + hwc->config_base = event->attr.config; 182 + 183 + /* Enforce to use the same CPU for all events in this PMU */ 184 + event->cpu = hisi_pmu->on_cpu; 185 + 186 + return 0; 187 + } 188 + 189 + /* 190 + * Set the counter to count the event that we're interested in, 191 + * and enable interrupt and counter. 192 + */ 193 + static void hisi_uncore_pmu_enable_event(struct perf_event *event) 194 + { 195 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 196 + struct hw_perf_event *hwc = &event->hw; 197 + 198 + hisi_pmu->ops->write_evtype(hisi_pmu, hwc->idx, 199 + HISI_GET_EVENTID(event)); 200 + 201 + hisi_pmu->ops->enable_counter_int(hisi_pmu, hwc); 202 + hisi_pmu->ops->enable_counter(hisi_pmu, hwc); 203 + } 204 + 205 + /* 206 + * Disable counter and interrupt. 207 + */ 208 + static void hisi_uncore_pmu_disable_event(struct perf_event *event) 209 + { 210 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 211 + struct hw_perf_event *hwc = &event->hw; 212 + 213 + hisi_pmu->ops->disable_counter(hisi_pmu, hwc); 214 + hisi_pmu->ops->disable_counter_int(hisi_pmu, hwc); 215 + } 216 + 217 + void hisi_uncore_pmu_set_event_period(struct perf_event *event) 218 + { 219 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 220 + struct hw_perf_event *hwc = &event->hw; 221 + 222 + /* 223 + * The HiSilicon PMU counters support 32 bits or 48 bits, depending on 224 + * the PMU. We reduce it to 2^(counter_bits - 1) to account for the 225 + * extreme interrupt latency. So we could hopefully handle the overflow 226 + * interrupt before another 2^(counter_bits - 1) events occur and the 227 + * counter overtakes its previous value. 
228 + */ 229 + u64 val = BIT_ULL(hisi_pmu->counter_bits - 1); 230 + 231 + local64_set(&hwc->prev_count, val); 232 + /* Write start value to the hardware event counter */ 233 + hisi_pmu->ops->write_counter(hisi_pmu, hwc, val); 234 + } 235 + 236 + void hisi_uncore_pmu_event_update(struct perf_event *event) 237 + { 238 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 239 + struct hw_perf_event *hwc = &event->hw; 240 + u64 delta, prev_raw_count, new_raw_count; 241 + 242 + do { 243 + /* Read the count from the counter register */ 244 + new_raw_count = hisi_pmu->ops->read_counter(hisi_pmu, hwc); 245 + prev_raw_count = local64_read(&hwc->prev_count); 246 + } while (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 247 + new_raw_count) != prev_raw_count); 248 + /* 249 + * compute the delta 250 + */ 251 + delta = (new_raw_count - prev_raw_count) & 252 + HISI_MAX_PERIOD(hisi_pmu->counter_bits); 253 + local64_add(delta, &event->count); 254 + } 255 + 256 + void hisi_uncore_pmu_start(struct perf_event *event, int flags) 257 + { 258 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 259 + struct hw_perf_event *hwc = &event->hw; 260 + 261 + if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED))) 262 + return; 263 + 264 + WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE)); 265 + hwc->state = 0; 266 + hisi_uncore_pmu_set_event_period(event); 267 + 268 + if (flags & PERF_EF_RELOAD) { 269 + u64 prev_raw_count = local64_read(&hwc->prev_count); 270 + 271 + hisi_pmu->ops->write_counter(hisi_pmu, hwc, prev_raw_count); 272 + } 273 + 274 + hisi_uncore_pmu_enable_event(event); 275 + perf_event_update_userpage(event); 276 + } 277 + 278 + void hisi_uncore_pmu_stop(struct perf_event *event, int flags) 279 + { 280 + struct hw_perf_event *hwc = &event->hw; 281 + 282 + hisi_uncore_pmu_disable_event(event); 283 + WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED); 284 + hwc->state |= PERF_HES_STOPPED; 285 + 286 + if (hwc->state & PERF_HES_UPTODATE) 287 + return; 288 + 289 + /* Read hardware counter 
and update the perf counter statistics */ 290 + hisi_uncore_pmu_event_update(event); 291 + hwc->state |= PERF_HES_UPTODATE; 292 + } 293 + 294 + int hisi_uncore_pmu_add(struct perf_event *event, int flags) 295 + { 296 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 297 + struct hw_perf_event *hwc = &event->hw; 298 + int idx; 299 + 300 + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 301 + 302 + /* Get an available counter index for counting */ 303 + idx = hisi_pmu->ops->get_event_idx(event); 304 + if (idx < 0) 305 + return idx; 306 + 307 + event->hw.idx = idx; 308 + hisi_pmu->pmu_events.hw_events[idx] = event; 309 + 310 + if (flags & PERF_EF_START) 311 + hisi_uncore_pmu_start(event, PERF_EF_RELOAD); 312 + 313 + return 0; 314 + } 315 + 316 + void hisi_uncore_pmu_del(struct perf_event *event, int flags) 317 + { 318 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(event->pmu); 319 + struct hw_perf_event *hwc = &event->hw; 320 + 321 + hisi_uncore_pmu_stop(event, PERF_EF_UPDATE); 322 + hisi_uncore_pmu_clear_event_idx(hisi_pmu, hwc->idx); 323 + perf_event_update_userpage(event); 324 + hisi_pmu->pmu_events.hw_events[hwc->idx] = NULL; 325 + } 326 + 327 + void hisi_uncore_pmu_read(struct perf_event *event) 328 + { 329 + /* Read hardware counter and update the perf counter statistics */ 330 + hisi_uncore_pmu_event_update(event); 331 + } 332 + 333 + void hisi_uncore_pmu_enable(struct pmu *pmu) 334 + { 335 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu); 336 + int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask, 337 + hisi_pmu->num_counters); 338 + 339 + if (!enabled) 340 + return; 341 + 342 + hisi_pmu->ops->start_counters(hisi_pmu); 343 + } 344 + 345 + void hisi_uncore_pmu_disable(struct pmu *pmu) 346 + { 347 + struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu); 348 + 349 + hisi_pmu->ops->stop_counters(hisi_pmu); 350 + } 351 + 352 + /* 353 + * Read Super CPU cluster and CPU cluster ID from MPIDR_EL1. 
354 + * If multi-threading is supported, SCCL_ID is in MPIDR[aff3] and CCL_ID 355 + * is in MPIDR[aff2]; if not, SCCL_ID is in MPIDR[aff2] and CCL_ID is 356 + * in MPIDR[aff1]. If this changes in future, this shall be updated. 357 + */ 358 + static void hisi_read_sccl_and_ccl_id(int *sccl_id, int *ccl_id) 359 + { 360 + u64 mpidr = read_cpuid_mpidr(); 361 + 362 + if (mpidr & MPIDR_MT_BITMASK) { 363 + if (sccl_id) 364 + *sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 3); 365 + if (ccl_id) 366 + *ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); 367 + } else { 368 + if (sccl_id) 369 + *sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); 370 + if (ccl_id) 371 + *ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); 372 + } 373 + } 374 + 375 + /* 376 + * Check whether the CPU is associated with this uncore PMU 377 + */ 378 + static bool hisi_pmu_cpu_is_associated_pmu(struct hisi_pmu *hisi_pmu) 379 + { 380 + int sccl_id, ccl_id; 381 + 382 + if (hisi_pmu->ccl_id == -1) { 383 + /* If CCL_ID is -1, the PMU only shares the same SCCL */ 384 + hisi_read_sccl_and_ccl_id(&sccl_id, NULL); 385 + 386 + return sccl_id == hisi_pmu->sccl_id; 387 + } 388 + 389 + hisi_read_sccl_and_ccl_id(&sccl_id, &ccl_id); 390 + 391 + return sccl_id == hisi_pmu->sccl_id && ccl_id == hisi_pmu->ccl_id; 392 + } 393 + 394 + int hisi_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) 395 + { 396 + struct hisi_pmu *hisi_pmu = hlist_entry_safe(node, struct hisi_pmu, 397 + node); 398 + 399 + if (!hisi_pmu_cpu_is_associated_pmu(hisi_pmu)) 400 + return 0; 401 + 402 + cpumask_set_cpu(cpu, &hisi_pmu->associated_cpus); 403 + 404 + /* If another CPU is already managing this PMU, simply return. 
*/ 405 + if (hisi_pmu->on_cpu != -1) 406 + return 0; 407 + 408 + /* Use this CPU in cpumask for event counting */ 409 + hisi_pmu->on_cpu = cpu; 410 + 411 + /* Overflow interrupt also should use the same CPU */ 412 + WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(cpu))); 413 + 414 + return 0; 415 + } 416 + 417 + int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node) 418 + { 419 + struct hisi_pmu *hisi_pmu = hlist_entry_safe(node, struct hisi_pmu, 420 + node); 421 + cpumask_t pmu_online_cpus; 422 + unsigned int target; 423 + 424 + if (!cpumask_test_and_clear_cpu(cpu, &hisi_pmu->associated_cpus)) 425 + return 0; 426 + 427 + /* Nothing to do if this CPU doesn't own the PMU */ 428 + if (hisi_pmu->on_cpu != cpu) 429 + return 0; 430 + 431 + /* Give up ownership of the PMU */ 432 + hisi_pmu->on_cpu = -1; 433 + 434 + /* Choose a new CPU to migrate ownership of the PMU to */ 435 + cpumask_and(&pmu_online_cpus, &hisi_pmu->associated_cpus, 436 + cpu_online_mask); 437 + target = cpumask_any_but(&pmu_online_cpus, cpu); 438 + if (target >= nr_cpu_ids) 439 + return 0; 440 + 441 + perf_pmu_migrate_context(&hisi_pmu->pmu, cpu, target); 442 + /* Use this CPU for event counting */ 443 + hisi_pmu->on_cpu = target; 444 + WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(target))); 445 + 446 + return 0; 447 + }
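hisi_uncore_pmu_set_event_period() and hisi_uncore_pmu_event_update() above rely on two pieces of arithmetic: the counter is started at half its range (BIT_ULL(counter_bits - 1)) to leave headroom for interrupt latency, and deltas are masked to counter_bits so a wrap between reads still yields the right count. A user-space sketch of both, with hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Start value: half the counter range, as in set_event_period() */
static uint64_t hisi_start_value(int counter_bits)
{
	return 1ULL << (counter_bits - 1);	/* BIT_ULL(bits - 1) */
}

/* Wraparound-safe delta between two raw reads, as in event_update() */
static uint64_t hisi_counter_delta(uint64_t prev, uint64_t now,
				   int counter_bits)
{
	uint64_t max_period = (1ULL << counter_bits) - 1; /* HISI_MAX_PERIOD */

	/* unsigned subtraction then mask handles now < prev after a wrap */
	return (now - prev) & max_period;
}
```

Valid only for counter_bits < 64 (the drivers use 48), since 1ULL << 64 is undefined in C; the kernel's BIT_ULL has the same constraint.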
drivers/perf/hisilicon/hisi_uncore_pmu.h
/*
 * HiSilicon SoC Hardware event counters support
 *
 * Copyright (C) 2017 Hisilicon Limited
 * Author: Anurup M <anurup.m@huawei.com>
 *         Shaokun Zhang <zhangshaokun@hisilicon.com>
 *
 * This code is based on the uncore PMUs like arm-cci and arm-ccn.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#ifndef __HISI_UNCORE_PMU_H__
#define __HISI_UNCORE_PMU_H__

#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/perf_event.h>
#include <linux/types.h>

#undef pr_fmt
#define pr_fmt(fmt)     "hisi_pmu: " fmt

#define HISI_MAX_COUNTERS 0x10
#define to_hisi_pmu(p)	(container_of(p, struct hisi_pmu, pmu))

#define HISI_PMU_ATTR(_name, _func, _config)				\
	(&((struct dev_ext_attribute[]) {				\
		{ __ATTR(_name, 0444, _func, NULL), (void *)_config }	\
	})[0].attr.attr)

#define HISI_PMU_FORMAT_ATTR(_name, _config)		\
	HISI_PMU_ATTR(_name, hisi_format_sysfs_show, (void *)_config)
#define HISI_PMU_EVENT_ATTR(_name, _config)		\
	HISI_PMU_ATTR(_name, hisi_event_sysfs_show, (unsigned long)_config)

struct hisi_pmu;

struct hisi_uncore_ops {
	void (*write_evtype)(struct hisi_pmu *, int, u32);
	int (*get_event_idx)(struct perf_event *);
	u64 (*read_counter)(struct hisi_pmu *, struct hw_perf_event *);
	void (*write_counter)(struct hisi_pmu *, struct hw_perf_event *, u64);
	void (*enable_counter)(struct hisi_pmu *, struct hw_perf_event *);
	void (*disable_counter)(struct hisi_pmu *, struct hw_perf_event *);
	void (*enable_counter_int)(struct hisi_pmu *, struct hw_perf_event *);
	void (*disable_counter_int)(struct hisi_pmu *, struct hw_perf_event *);
	void (*start_counters)(struct hisi_pmu *);
	void (*stop_counters)(struct hisi_pmu *);
};

struct hisi_pmu_hwevents {
	struct perf_event *hw_events[HISI_MAX_COUNTERS];
	DECLARE_BITMAP(used_mask, HISI_MAX_COUNTERS);
};

/* Generic pmu struct for different pmu types */
struct hisi_pmu {
	struct pmu pmu;
	const struct hisi_uncore_ops *ops;
	struct hisi_pmu_hwevents pmu_events;
	/* associated_cpus: All CPUs associated with the PMU */
	cpumask_t associated_cpus;
	/* CPU used for counting */
	int on_cpu;
	int irq;
	struct device *dev;
	struct hlist_node node;
	int sccl_id;
	int ccl_id;
	void __iomem *base;
	/* the ID of the PMU modules */
	u32 index_id;
	int num_counters;
	int counter_bits;
	/* check event code range */
	int check_event;
};

int hisi_uncore_pmu_counter_valid(struct hisi_pmu *hisi_pmu, int idx);
int hisi_uncore_pmu_get_event_idx(struct perf_event *event);
void hisi_uncore_pmu_read(struct perf_event *event);
int hisi_uncore_pmu_add(struct perf_event *event, int flags);
void hisi_uncore_pmu_del(struct perf_event *event, int flags);
void hisi_uncore_pmu_start(struct perf_event *event, int flags);
void hisi_uncore_pmu_stop(struct perf_event *event, int flags);
void hisi_uncore_pmu_set_event_period(struct perf_event *event);
void hisi_uncore_pmu_event_update(struct perf_event *event);
int hisi_uncore_pmu_event_init(struct perf_event *event);
void hisi_uncore_pmu_enable(struct pmu *pmu);
void hisi_uncore_pmu_disable(struct pmu *pmu);
ssize_t hisi_event_sysfs_show(struct device *dev,
			      struct device_attribute *attr, char *buf);
ssize_t hisi_format_sysfs_show(struct device *dev,
			       struct device_attribute *attr, char *buf);
ssize_t hisi_cpumask_sysfs_show(struct device *dev,
				struct device_attribute *attr, char *buf);
int hisi_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *node);
int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node);
#endif /* __HISI_UNCORE_PMU_H__ */
+54
drivers/perf/qcom_l2_pmu.c
··· 92 92 93 93 #define reg_idx(reg, i) (((i) * IA_L2_REG_OFFSET) + reg##_BASE) 94 94 95 + /* 96 + * Events 97 + */ 98 + #define L2_EVENT_CYCLES 0xfe 99 + #define L2_EVENT_DCACHE_OPS 0x400 100 + #define L2_EVENT_ICACHE_OPS 0x401 101 + #define L2_EVENT_TLBI 0x402 102 + #define L2_EVENT_BARRIERS 0x403 103 + #define L2_EVENT_TOTAL_READS 0x405 104 + #define L2_EVENT_TOTAL_WRITES 0x406 105 + #define L2_EVENT_TOTAL_REQUESTS 0x407 106 + #define L2_EVENT_LDREX 0x420 107 + #define L2_EVENT_STREX 0x421 108 + #define L2_EVENT_CLREX 0x422 109 + 95 110 static DEFINE_RAW_SPINLOCK(l2_access_lock); 96 111 97 112 /** ··· 715 700 /* CCG format for perf RAW codes. */ 716 701 PMU_FORMAT_ATTR(l2_code, "config:4-11"); 717 702 PMU_FORMAT_ATTR(l2_group, "config:0-3"); 703 + PMU_FORMAT_ATTR(event, "config:0-11"); 704 + 718 705 static struct attribute *l2_cache_pmu_formats[] = { 719 706 &format_attr_l2_code.attr, 720 707 &format_attr_l2_group.attr, 708 + &format_attr_event.attr, 721 709 NULL, 722 710 }; 723 711 ··· 729 711 .attrs = l2_cache_pmu_formats, 730 712 }; 731 713 714 + static ssize_t l2cache_pmu_event_show(struct device *dev, 715 + struct device_attribute *attr, char *page) 716 + { 717 + struct perf_pmu_events_attr *pmu_attr; 718 + 719 + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); 720 + return sprintf(page, "event=0x%02llx\n", pmu_attr->id); 721 + } 722 + 723 + #define L2CACHE_EVENT_ATTR(_name, _id) \ 724 + (&((struct perf_pmu_events_attr[]) { \ 725 + { .attr = __ATTR(_name, 0444, l2cache_pmu_event_show, NULL), \ 726 + .id = _id, } \ 727 + })[0].attr.attr) 728 + 729 + static struct attribute *l2_cache_pmu_events[] = { 730 + L2CACHE_EVENT_ATTR(cycles, L2_EVENT_CYCLES), 731 + L2CACHE_EVENT_ATTR(dcache-ops, L2_EVENT_DCACHE_OPS), 732 + L2CACHE_EVENT_ATTR(icache-ops, L2_EVENT_ICACHE_OPS), 733 + L2CACHE_EVENT_ATTR(tlbi, L2_EVENT_TLBI), 734 + L2CACHE_EVENT_ATTR(barriers, L2_EVENT_BARRIERS), 735 + L2CACHE_EVENT_ATTR(total-reads, L2_EVENT_TOTAL_READS), 736 + 
L2CACHE_EVENT_ATTR(total-writes, L2_EVENT_TOTAL_WRITES), 737 + L2CACHE_EVENT_ATTR(total-requests, L2_EVENT_TOTAL_REQUESTS), 738 + L2CACHE_EVENT_ATTR(ldrex, L2_EVENT_LDREX), 739 + L2CACHE_EVENT_ATTR(strex, L2_EVENT_STREX), 740 + L2CACHE_EVENT_ATTR(clrex, L2_EVENT_CLREX), 741 + NULL 742 + }; 743 + 744 + static struct attribute_group l2_cache_pmu_events_group = { 745 + .name = "events", 746 + .attrs = l2_cache_pmu_events, 747 + }; 748 + 732 749 static const struct attribute_group *l2_cache_pmu_attr_grps[] = { 733 750 &l2_cache_pmu_format_group, 734 751 &l2_cache_pmu_cpumask_group, 752 + &l2_cache_pmu_events_group, 735 753 NULL, 736 754 }; 737 755
+5 -5
fs/binfmt_elf.c
··· 1699 1699 long signr, size_t *total) 1700 1700 { 1701 1701 unsigned int i; 1702 - unsigned int regset_size = view->regsets[0].n * view->regsets[0].size; 1702 + unsigned int regset0_size = regset_size(t->task, &view->regsets[0]); 1703 1703 1704 1704 /* 1705 1705 * NT_PRSTATUS is the one special case, because the regset data ··· 1708 1708 * We assume that regset 0 is NT_PRSTATUS. 1709 1709 */ 1710 1710 fill_prstatus(&t->prstatus, t->task, signr); 1711 - (void) view->regsets[0].get(t->task, &view->regsets[0], 0, regset_size, 1711 + (void) view->regsets[0].get(t->task, &view->regsets[0], 0, regset0_size, 1712 1712 &t->prstatus.pr_reg, NULL); 1713 1713 1714 1714 fill_note(&t->notes[0], "CORE", NT_PRSTATUS, 1715 - PRSTATUS_SIZE(t->prstatus, regset_size), &t->prstatus); 1715 + PRSTATUS_SIZE(t->prstatus, regset0_size), &t->prstatus); 1716 1716 *total += notesize(&t->notes[0]); 1717 1717 1718 1718 do_thread_regset_writeback(t->task, &view->regsets[0]); ··· 1728 1728 if (regset->core_note_type && regset->get && 1729 1729 (!regset->active || regset->active(t->task, regset))) { 1730 1730 int ret; 1731 - size_t size = regset->n * regset->size; 1731 + size_t size = regset_size(t->task, regset); 1732 1732 void *data = kmalloc(size, GFP_KERNEL); 1733 1733 if (unlikely(!data)) 1734 1734 return 0; ··· 1743 1743 size, data); 1744 1744 else { 1745 1745 SET_PR_FPVALID(&t->prstatus, 1746 - 1, regset_size); 1746 + 1, regset0_size); 1747 1747 fill_note(&t->notes[i], "CORE", 1748 1748 NT_PRFPREG, size, data); 1749 1749 }
+9 -1
include/clocksource/arm_arch_timer.h
··· 67 67 #define ARCH_TIMER_USR_VT_ACCESS_EN (1 << 8) /* virtual timer registers */ 68 68 #define ARCH_TIMER_USR_PT_ACCESS_EN (1 << 9) /* physical timer registers */ 69 69 70 - #define ARCH_TIMER_EVT_STREAM_FREQ 10000 /* 100us */ 70 + #define ARCH_TIMER_EVT_STREAM_PERIOD_US 100 71 + #define ARCH_TIMER_EVT_STREAM_FREQ \ 72 + (USEC_PER_SEC / ARCH_TIMER_EVT_STREAM_PERIOD_US) 71 73 72 74 struct arch_timer_kvm_info { 73 75 struct timecounter timecounter; ··· 95 93 extern u32 arch_timer_get_rate(void); 96 94 extern u64 (*arch_timer_read_counter)(void); 97 95 extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void); 96 + extern bool arch_timer_evtstrm_available(void); 98 97 99 98 #else 100 99 ··· 107 104 static inline u64 arch_timer_read_counter(void) 108 105 { 109 106 return 0; 107 + } 108 + 109 + static inline bool arch_timer_evtstrm_available(void) 110 + { 111 + return false; 110 112 } 111 113 112 114 #endif
+2 -2
include/linux/acpi_iort.h
··· 49 49 /* IOMMU interface */ 50 50 static inline void iort_dma_setup(struct device *dev, u64 *dma_addr, 51 51 u64 *size) { } 52 - static inline 53 - const struct iommu_ops *iort_iommu_configure(struct device *dev) 52 + static inline const struct iommu_ops *iort_iommu_configure( 53 + struct device *dev) 54 54 { return NULL; } 55 55 #endif 56 56
+3
include/linux/cpuhotplug.h
··· 155 155 CPUHP_AP_PERF_S390_SF_ONLINE, 156 156 CPUHP_AP_PERF_ARM_CCI_ONLINE, 157 157 CPUHP_AP_PERF_ARM_CCN_ONLINE, 158 + CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE, 159 + CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE, 160 + CPUHP_AP_PERF_ARM_HISI_L3_ONLINE, 158 161 CPUHP_AP_PERF_ARM_L2X0_ONLINE, 159 162 CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE, 160 163 CPUHP_AP_PERF_ARM_QCOM_L3_ONLINE,
+8
include/linux/irqdesc.h
··· 246 246 return desc->status_use_accessors & IRQ_PER_CPU; 247 247 } 248 248 249 + static inline int irq_is_percpu_devid(unsigned int irq) 250 + { 251 + struct irq_desc *desc; 252 + 253 + desc = irq_to_desc(irq); 254 + return desc->status_use_accessors & IRQ_PER_CPU_DEVID; 255 + } 256 + 249 257 static inline void 250 258 irq_set_lockdep_class(unsigned int irq, struct lock_class_key *class) 251 259 {
+60 -7
include/linux/regset.h
··· 107 107 int immediate); 108 108 109 109 /** 110 + * user_regset_get_size_fn - type of @get_size function in &struct user_regset 111 + * @target: thread being examined 112 + * @regset: regset being examined 113 + * 114 + * This call is optional; usually the pointer is %NULL. 115 + * 116 + * When provided, this function must return the current size of regset 117 + * data, as observed by the @get function in &struct user_regset. The 118 + * value returned must be a multiple of @size. The returned size is 119 + * required to be valid only until the next time (if any) @regset is 120 + * modified for @target. 121 + * 122 + * This function is intended for dynamically sized regsets. A regset 123 + * that is statically sized does not need to implement it. 124 + * 125 + * This function should not be called directly: instead, callers should 126 + * call regset_size() to determine the current size of a regset. 127 + */ 128 + typedef unsigned int user_regset_get_size_fn(struct task_struct *target, 129 + const struct user_regset *regset); 130 + 131 + /** 110 132 * struct user_regset - accessible thread CPU state 111 133 * @n: Number of slots (registers). 112 134 * @size: Size in bytes of a slot (register). ··· 139 117 * @set: Function to store values. 140 118 * @active: Function to report if regset is active, or %NULL. 141 119 * @writeback: Function to write data back to user memory, or %NULL. 120 + * @get_size: Function to return the regset's size, or %NULL. 142 121 * 143 122 * This data structure describes a machine resource we call a register set. 144 123 * This is part of the state of an individual thread, not necessarily 145 124 * actual CPU registers per se. A register set consists of a number of 146 125 * similar slots, given by @n. Each slot is @size bytes, and aligned to 147 - * @align bytes (which is at least @size). 126 + * @align bytes (which is at least @size). 
For dynamically-sized 127 + * regsets, @n must contain the maximum possible number of slots for the 128 + * regset, and @get_size must point to a function that returns the 129 + * current regset size. 148 130 * 149 - * These functions must be called only on the current thread or on a 150 - * thread that is in %TASK_STOPPED or %TASK_TRACED state, that we are 151 - * guaranteed will not be woken up and return to user mode, and that we 152 - * have called wait_task_inactive() on. (The target thread always might 153 - * wake up for SIGKILL while these functions are working, in which case 154 - * that thread's user_regset state might be scrambled.) 131 + * Callers that need to know only the current size of the regset and do 132 + * not care about its internal structure should call regset_size() 133 + * instead of inspecting @n or calling @get_size. 134 + * 135 + * For backward compatibility, the @get and @set methods must pad to, or 136 + * accept, @n * @size bytes, even if the current regset size is smaller. 137 + * The precise semantics of these operations depend on the regset being 138 + * accessed. 139 + * 140 + * The functions to which &struct user_regset members point must be 141 + * called only on the current thread or on a thread that is in 142 + * %TASK_STOPPED or %TASK_TRACED state, that we are guaranteed will not 143 + * be woken up and return to user mode, and that we have called 144 + * wait_task_inactive() on. (The target thread always might wake up for 145 + * SIGKILL while these functions are working, in which case that 146 + * thread's user_regset state might be scrambled.) 155 147 * 156 148 * The @pos argument must be aligned according to @align; the @count 157 149 * argument must be a multiple of @size. 
These functions are not ··· 192 156 user_regset_set_fn *set; 193 157 user_regset_active_fn *active; 194 158 user_regset_writeback_fn *writeback; 159 + user_regset_get_size_fn *get_size; 195 160 unsigned int n; 196 161 unsigned int size; 197 162 unsigned int align; ··· 408 371 return regset->set(target, regset, offset, size, NULL, data); 409 372 } 410 373 374 + /** 375 + * regset_size - determine the current size of a regset 376 + * @target: thread to be examined 377 + * @regset: regset to be examined 378 + * 379 + * Note that the returned size is valid only until the next time 380 + * (if any) @regset is modified for @target. 381 + */ 382 + static inline unsigned int regset_size(struct task_struct *target, 383 + const struct user_regset *regset) 384 + { 385 + if (!regset->get_size) 386 + return regset->n * regset->size; 387 + else 388 + return regset->get_size(target, regset); 389 + } 411 390 412 391 #endif /* <linux/regset.h> */
+1
include/uapi/linux/elf.h
··· 418 418 #define NT_ARM_HW_BREAK 0x402 /* ARM hardware breakpoint registers */ 419 419 #define NT_ARM_HW_WATCH 0x403 /* ARM hardware watchpoint registers */ 420 420 #define NT_ARM_SYSTEM_CALL 0x404 /* ARM system call number */ 421 + #define NT_ARM_SVE 0x405 /* ARM Scalable Vector Extension registers */ 421 422 #define NT_METAG_CBUF 0x500 /* Metag catch buffer registers */ 422 423 #define NT_METAG_RPIPE 0x501 /* Metag read pipeline state */ 423 424 #define NT_METAG_TLS 0x502 /* Metag TLS pointer */
+1
include/uapi/linux/perf_event.h
··· 942 942 #define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */ 943 943 #define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */ 944 944 #define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */ 945 + #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */ 945 946 946 947 #define PERF_FLAG_FD_NO_GROUP (1UL << 0) 947 948 #define PERF_FLAG_FD_OUTPUT (1UL << 1)
+9
include/uapi/linux/prctl.h
··· 198 198 # define PR_CAP_AMBIENT_LOWER 3 199 199 # define PR_CAP_AMBIENT_CLEAR_ALL 4 200 200 201 + /* arm64 Scalable Vector Extension controls */ 202 + /* Flag values must be kept in sync with ptrace NT_ARM_SVE interface */ 203 + #define PR_SVE_SET_VL 50 /* set task vector length */ 204 + # define PR_SVE_SET_VL_ONEXEC (1 << 18) /* defer effect until exec */ 205 + #define PR_SVE_GET_VL 51 /* get task vector length */ 206 + /* Bits common to PR_SVE_SET_VL and PR_SVE_GET_VL */ 207 + # define PR_SVE_VL_LEN_MASK 0xffff 208 + # define PR_SVE_VL_INHERIT (1 << 17) /* inherit across exec */ 209 + 201 210 #endif /* _LINUX_PRCTL_H */
+4
kernel/events/ring_buffer.c
··· 411 411 412 412 return NULL; 413 413 } 414 + EXPORT_SYMBOL_GPL(perf_aux_output_begin); 414 415 415 416 static bool __always_inline rb_need_aux_wakeup(struct ring_buffer *rb) 416 417 { ··· 481 480 rb_free_aux(rb); 482 481 ring_buffer_put(rb); 483 482 } 483 + EXPORT_SYMBOL_GPL(perf_aux_output_end); 484 484 485 485 /* 486 486 * Skip over a given number of bytes in the AUX buffer, due to, for example, ··· 507 505 508 506 return 0; 509 507 } 508 + EXPORT_SYMBOL_GPL(perf_aux_output_skip); 510 509 511 510 void *perf_get_aux(struct perf_output_handle *handle) 512 511 { ··· 517 514 518 515 return handle->rb->aux_priv; 519 516 } 517 + EXPORT_SYMBOL_GPL(perf_get_aux); 520 518 521 519 #define PERF_AUX_GFP (GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY) 522 520
+1
kernel/irq/irqdesc.c
··· 862 862 863 863 return 0; 864 864 } 865 + EXPORT_SYMBOL_GPL(irq_get_percpu_devid_partition); 865 866 866 867 void kstat_incr_irq_this_cpu(unsigned int irq) 867 868 {
+12
kernel/sys.c
··· 111 111 #ifndef SET_FP_MODE 112 112 # define SET_FP_MODE(a,b) (-EINVAL) 113 113 #endif 114 + #ifndef SVE_SET_VL 115 + # define SVE_SET_VL(a) (-EINVAL) 116 + #endif 117 + #ifndef SVE_GET_VL 118 + # define SVE_GET_VL() (-EINVAL) 119 + #endif 114 120 115 121 /* 116 122 * this is where the system-wide overflow UID and GID are defined, for ··· 2391 2385 break; 2392 2386 case PR_GET_FP_MODE: 2393 2387 error = GET_FP_MODE(me); 2388 + break; 2389 + case PR_SVE_SET_VL: 2390 + error = SVE_SET_VL(arg2); 2391 + break; 2392 + case PR_SVE_GET_VL: 2393 + error = SVE_GET_VL(); 2394 2394 break; 2395 2395 default: 2396 2396 error = -EINVAL;
+3
virt/kvm/arm/arm.c
··· 652 652 */ 653 653 preempt_disable(); 654 654 655 + /* Flush FP/SIMD state that can't survive guest entry/exit */ 656 + kvm_fpsimd_flush_cpu_state(); 657 + 655 658 kvm_pmu_flush_hwstate(vcpu); 656 659 657 660 kvm_timer_flush_hwstate(vcpu);