
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

- Kexec support for arm64

- Kprobes support

- Expose MIDR_EL1 and REVIDR_EL1 CPU identification registers to sysfs

- Trapping of user space cache maintenance operations and emulation in
the kernel (CPU errata workaround)

- Clean-up of the early page table creation (kernel linear mapping,
EFI run-time maps) to avoid splitting larger blocks (e.g. pmds) into
smaller ones (e.g. ptes)

- VDSO support for CLOCK_MONOTONIC_RAW in clock_gettime()

- ARCH_HAS_KCOV enabled for arm64

- Optimise IP checksum helpers

- SWIOTLB optimisation to only allocate/initialise the buffer if the
available RAM is beyond the 32-bit mask

- Properly handle the "nosmp" command line argument

- Fix for the initialisation of the CPU debug state during early boot

- vdso-offsets.h build dependency workaround

- Build fix when RANDOMIZE_BASE is enabled with MODULES off

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
arm64: arm: Fix-up the removal of the arm64 regs_query_register_name() prototype
arm64: Only select ARM64_MODULE_PLTS if MODULES=y
arm64: mm: run pgtable_page_ctor() on non-swapper translation table pages
arm64: mm: make create_mapping_late() non-allocating
arm64: Honor nosmp kernel command line option
arm64: Fix incorrect per-cpu usage for boot CPU
arm64: kprobes: Add KASAN instrumentation around stack accesses
arm64: kprobes: Cleanup jprobe_return
arm64: kprobes: Fix overflow when saving stack
arm64: kprobes: WARN if attempting to step with PSTATE.D=1
arm64: debug: remove unused local_dbg_{enable, disable} macros
arm64: debug: remove redundant spsr manipulation
arm64: debug: unmask PSTATE.D earlier
arm64: localise Image objcopy flags
arm64: ptrace: remove extra define for CPSR's E bit
kprobes: Add arm64 case in kprobe example module
arm64: Add kernel return probes support (kretprobes)
arm64: Add trampoline code for kretprobes
arm64: kprobes instruction simulation support
arm64: Treat all entry code as non-kprobe-able
...

+3363 -587
+10
Documentation/ABI/testing/sysfs-devices-system-cpu
···
 		'policyX/throttle_stats' directory and all the attributes are same as
 		the /sys/devices/system/cpu/cpuX/cpufreq/throttle_stats directory and
 		attributes which give the frequency throttle information of the chip.
+
+What:		/sys/devices/system/cpu/cpuX/regs/
+		/sys/devices/system/cpu/cpuX/regs/identification/
+		/sys/devices/system/cpu/cpuX/regs/identification/midr_el1
+		/sys/devices/system/cpu/cpuX/regs/identification/revidr_el1
+Date:		June 2016
+Contact:	Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
+Description:	AArch64 CPU registers
+		'identification' directory exposes the CPU ID registers for
+		identifying model and revision of the CPU.
+185 -156
Documentation/arm64/acpi_object_usage.txt
···
 
 -- Required: DSDT, FADT, GTDT, MADT, MCFG, RSDP, SPCR, XSDT
 
--- Recommended: BERT, EINJ, ERST, HEST, SSDT
+-- Recommended: BERT, EINJ, ERST, HEST, PCCT, SSDT
 
--- Optional: BGRT, CPEP, CSRT, DRTM, ECDT, FACS, FPDT, MCHI, MPST,
-   MSCT, RASF, SBST, SLIT, SPMI, SRAT, TCPA, TPM2, UEFI
+-- Optional: BGRT, CPEP, CSRT, DBG2, DRTM, ECDT, FACS, FPDT, IORT,
+   MCHI, MPST, MSCT, NFIT, PMTT, RASF, SBST, SLIT, SPMI, SRAT, STAO,
+   TCPA, TPM2, UEFI, XENV
 
--- Not supported: BOOT, DBG2, DBGP, DMAR, ETDT, HPET, IBFT, IVRS,
-   LPIT, MSDM, RSDT, SLIC, WAET, WDAT, WDRT, WPBT
-
+-- Not supported: BOOT, DBGP, DMAR, ETDT, HPET, IBFT, IVRS, LPIT,
+   MSDM, OEMx, PSDT, RSDT, SLIC, WAET, WDAT, WDRT, WPBT
 
 Table Usage for ARMv8 Linux
 ----- ----------------------------------------------------------------
···
 
 DBG2 Signature Reserved (signature == "DBG2")
      == DeBuG port table 2 ==
-     Microsoft only table, will not be supported.
+     License has changed and should be usable.  Optional if used instead
+     of earlycon=<device> on the command line.
 
 DBGP Signature Reserved (signature == "DBGP")
      == DeBuG Port table ==
···
 
 HEST Section 18.3.2 (signature == "HEST")
      == Hardware Error Source Table ==
-     Until further error source types are defined, use only types 6 (AER
-     Root Port), 7 (AER Endpoint), 8 (AER Bridge), or 9 (Generic Hardware
-     Error Source).  Firmware first error handling is possible if and only
-     if Trusted Firmware is being used on arm64.
+     ARM-specific error sources have been defined; please use those or the
+     PCI types such as type 6 (AER Root Port), 7 (AER Endpoint), or 8 (AER
+     Bridge), or use type 9 (Generic Hardware Error Source).  Firmware first
+     error handling is possible if and only if Trusted Firmware is being
+     used on arm64.
 
      Must be supplied if RAS support is provided by the platform.  It
      is recommended this table be supplied.
···
      == iSCSI Boot Firmware Table ==
      Microsoft defined table, support TBD.
 
+IORT Signature Reserved (signature == "IORT")
+     == Input Output Remapping Table ==
+     arm64 only table, required in order to describe IO topology, SMMUs,
+     and GIC ITSs, and how those various components are connected together,
+     such as identifying which components are behind which SMMUs/ITSs.
+     This table will only be required on certain SBSA platforms (e.g.,
+     when using GICv3-ITS and an SMMU); on SBSA Level 0 platforms, it
+     remains optional.
+
 IVRS Signature Reserved (signature == "IVRS")
      == I/O Virtualization Reporting Structure ==
      x86_64 (AMD) only table, will not be supported.
 
 LPIT Signature Reserved (signature == "LPIT")
      == Low Power Idle Table ==
-     x86 only table as of ACPI 5.1; future versions have been adapted for
-     use with ARM and will be recommended in order to support ACPI power
-     management.
+     x86 only table as of ACPI 5.1; starting with ACPI 6.0, processor
+     descriptions and power states on ARM platforms should use the DSDT
+     and define processor container devices (_HID ACPI0010, Section 8.4,
+     and more specifically 8.4.3 and 8.4.4).
 
 MADT Section 5.2.12 (signature == "APIC")
      == Multiple APIC Description Table ==
      Required for arm64.  Only the GIC interrupt controller structures
-     should be used (types 0xA - 0xE).
+     should be used (types 0xA - 0xF).
 
 MCFG Signature Reserved (signature == "MCFG")
      == Memory-mapped ConFiGuration space ==
···
      == Memory Power State Table ==
      Optional, not currently supported.
 
+MSCT Section 5.2.19 (signature == "MSCT")
+     == Maximum System Characteristic Table ==
+     Optional, not currently supported.
+
 MSDM Signature Reserved (signature == "MSDM")
      == Microsoft Data Management table ==
      Microsoft only table, will not be supported.
 
-MSCT Section 5.2.19 (signature == "MSCT")
-     == Maximum System Characteristic Table ==
+NFIT Section 5.2.25 (signature == "NFIT")
+     == NVDIMM Firmware Interface Table ==
      Optional, not currently supported.
+
+OEMx Signature of "OEMx" only
+     == OEM Specific Tables ==
+     All tables starting with a signature of "OEM" are reserved for OEM
+     use.  Since these are not meant to be of general use but are limited
+     to very specific end users, they are not recommended for use and are
+     not supported by the kernel for arm64.
+
+PCCT Section 14.1 (signature == "PCCT")
+     == Platform Communications Channel Table ==
+     Recommended for use on arm64; use of PCC is recommended when using CPPC
+     to control performance and power for platform processors.
+
+PMTT Section 5.2.21.12 (signature == "PMTT")
+     == Platform Memory Topology Table ==
+     Optional, not currently supported.
+
+PSDT Section 5.2.11.3 (signature == "PSDT")
+     == Persistent System Description Table ==
+     Obsolete table, will not be supported.
 
 RASF Section 5.2.20 (signature == "RASF")
      == RAS Feature table ==
···
 RSDT Section 5.2.7 (signature == "RSDT")
      == Root System Description Table ==
      Since this table can only provide 32-bit addresses, it is deprecated
-     on arm64, and will not be used.
+     on arm64, and will not be used.  If provided, it will be ignored.
 
 SBST Section 5.2.14 (signature == "SBST")
      == Smart Battery Subsystem Table ==
···
 SRAT Section 5.2.16 (signature == "SRAT")
      == System Resource Affinity Table ==
      Optional, but if used, only the GICC Affinity structures are read.
-     To support NUMA, this table is required.
+     To support arm64 NUMA, this table is required.
 
 SSDT Section 5.2.11.2 (signature == "SSDT")
      == Secondary System Description Table ==
···
 
      These tables are optional, however.  ACPI tables should contain only
      one DSDT but can contain many SSDTs.
+
+STAO Signature Reserved (signature == "STAO")
+     == _STA Override table ==
+     Optional, but only necessary in virtualized environments in order to
+     hide devices from guest OSs.
 
 TCPA Signature Reserved (signature == "TCPA")
      == Trusted Computing Platform Alliance table ==
···
      == Windows Platform Binary Table ==
      Microsoft only table, will not be supported.
 
+XENV Signature Reserved (signature == "XENV")
+     == Xen project table ==
+     Optional, used only by Xen at present.
+
 XSDT Section 5.2.8 (signature == "XSDT")
      == eXtended System Description Table ==
      Required for arm64.
···
 
 ACPI Objects
 ------------
-The expectations on individual ACPI objects are discussed in the list that
-follows:
+The expectations on individual ACPI objects that are likely to be used are
+shown in the list that follows; any object not explicitly mentioned below
+should be used as needed for a particular platform or particular subsystem,
+such as power management or PCI.
 
 Name   Section      Usage for ARMv8 Linux
 ----   ------------ -------------------------------------------------
-_ADR   6.1.1        Use as needed.
+_CCA   6.2.17       This method must be defined for all bus masters
+                    on arm64 -- there are no assumptions made about
+                    whether such devices are cache coherent or not.
+                    The _CCA value is inherited by all descendants of
+                    these devices so it does not need to be repeated.
+                    Without _CCA on arm64, the kernel does not know what
+                    to do about setting up DMA for the device.
 
-_BBN   6.5.5        Use as needed; PCI-specific.
+                    NB: this method provides default cache coherency
+                    attributes; the presence of an SMMU can be used to
+                    modify that, however.  For example, a master could
+                    default to non-coherent, but be made coherent with
+                    the appropriate SMMU configuration (see Table 17 of
+                    the IORT specification, ARM Document DEN 0049B).
 
-_BDN   6.5.3        Optional; not likely to be used on arm64.
+_CID   6.1.2        Use as needed, see also _HID.
 
-_CCA   6.2.17       This method should be defined for all bus masters
-                    on arm64.  While cache coherency is assumed, making
-                    it explicit ensures the kernel will set up DMA as
-                    it should.
+_CLS   6.1.3        Use as needed, see also _HID.
 
-_CDM   6.2.1        Optional, to be used only for processor devices.
-
-_CID   6.1.2        Use as needed.
-
-_CLS   6.1.3        Use as needed.
+_CPC   8.4.7.1      Use as needed, power management specific.  CPPC is
+                    recommended on arm64.
 
 _CRS   6.2.2        Required on arm64.
 
-_DCK   6.5.2        Optional; not likely to be used on arm64.
+_CSD   8.4.2.2      Use as needed, used only in conjunction with _CST.
+
+_CST   8.4.2.1      Low power idle states (8.4.4) are recommended instead
+                    of C-states.
 
 _DDN   6.1.4        This field can be used for a device name.  However,
                     it is meant for DOS device names (e.g., COM1), so be
                     careful of its use across OSes.
-
-_DEP   6.5.8        Use as needed.
-
-_DIS   6.2.3        Optional, for power management use.
-
-_DLM   5.7.5        Optional.
-
-_DMA   6.2.4        Optional.
 
 _DSD   6.2.5        To be used with caution.  If this object is used, try
                     to use it within the constraints already defined by the
···
                     with the UEFI Forum; this may cause some iteration as
                     more than one OS will be registering entries.
 
-_DSM                Do not use this method.  It is not standardized, the
+_DSM   9.1.1        Do not use this method.  It is not standardized, the
                     return values are not well documented, and it is
                     currently a frequent source of error.
-
-_DSW   7.2.1        Use as needed; power management specific.
-
-_EDL   6.3.1        Optional.
-
-_EJD   6.3.2        Optional.
-
-_EJx   6.3.3        Optional.
-
-_FIX   6.2.7        x86 specific, not used on arm64.
 
 \_GL   5.7.1        This object is not to be used in hardware reduced
                     mode, and therefore should not be used on arm64.
···
 \_GPE  5.3.1        This namespace is for x86 use only.  Do not use it
                     on arm64.
 
-_GSB   6.2.7        Optional.
-
-_HID   6.1.5        Use as needed.  This is the primary object to use in
-                    device probing, though _CID and _CLS may also be used.
-
-_HPP   6.2.8        Optional, PCI specific.
-
-_HPX   6.2.9        Optional, PCI specific.
-
-_HRV   6.1.6        Optional, use as needed to clarify device behavior; in
-                    some cases, this may be easier to use than _DSD.
+_HID   6.1.5        This is the primary object to use in device probing,
+                    though _CID and _CLS may also be used.
 
 _INI   6.5.1        Not required, but can be useful in setting up devices
                     when UEFI leaves them in a state that may not be what
                     the driver expects before it starts probing.
 
-_IRC   7.2.15       Use as needed; power management specific.
+_LPI   8.4.4.3      Recommended for use with processor definitions (_HID
+                    ACPI0010) on arm64.  See also _RDI.
 
-_LCK   6.3.4        Optional.
+_MLS   6.1.7        Highly recommended for use in internationalization.
 
-_MAT   6.2.10       Optional; see also the MADT.
-
-_MLS   6.1.7        Optional, but highly recommended for use in
-                    internationalization.
-
-_OFF   7.1.2        It is recommended to define this method for any device
+_OFF   7.2.2        It is recommended to define this method for any device
                     that can be turned on or off.
 
-_ON    7.1.3        It is recommended to define this method for any device
+_ON    7.2.3        It is recommended to define this method for any device
                     that can be turned on or off.
 
 \_OS   5.7.3        This method will return "Linux" by default (this is
···
                     by the kernel community, then register it with the
                     UEFI Forum.
 
-\_OSI  5.7.2        Deprecated on ARM64.  Any invocation of this method
-                    will print a warning on the console and return false.
-                    That is, as far as ACPI firmware is concerned, _OSI
-                    cannot be used to determine what sort of system is
-                    being used or what functionality is provided.  The
-                    _OSC method is to be used instead.
-
-_OST   6.3.5        Optional.
+\_OSI  5.7.2        Deprecated on ARM64.  As far as ACPI firmware is
+                    concerned, _OSI is not to be used to determine what
+                    sort of system is being used or what functionality
+                    is provided.  The _OSC method is to be used instead.
 
 _PDC   8.4.1        Deprecated, do not use on arm64.
 
 \_PIC  5.8.1        The method should not be used.  On arm64, the only
                     interrupt model available is GIC.
 
-_PLD   6.1.8        Optional.
-
 \_PR   5.3.1        This namespace is for x86 use only on legacy systems.
                     Do not use it on arm64.
-
-_PRS   6.2.12       Optional.
 
 _PRT   6.2.13       Required as part of the definition of all PCI root
                     devices.
 
-_PRW   7.2.13       Use as needed; power management specific.
-
-_PRx   7.2.8-11     Use as needed; power management specific.  If _PR0 is
+_PRx   7.3.8-11     Use as needed; power management specific.  If _PR0 is
                     defined, _PR3 must also be defined.
 
-_PSC   7.2.6        Use as needed; power management specific.
-
-_PSE   7.2.7        Use as needed; power management specific.
-
-_PSW   7.2.14       Use as needed; power management specific.
-
-_PSx   7.2.2-5      Use as needed; power management specific.  If _PS0 is
+_PSx   7.3.2-5      Use as needed; power management specific.  If _PS0 is
                     defined, _PS3 must also be defined.  If clocks or
                     regulators need adjusting to be consistent with power
                     usage, change them in these methods.
 
-\_PTS  7.3.1        Use as needed; power management specific.
-
-_PXM   6.2.14       Optional.
-
-_REG   6.5.4        Use as needed.
+_RDI   8.4.4.4      Recommended for use with processor definitions (_HID
+                    ACPI0010) on arm64.  This should only be used in
+                    conjunction with _LPI.
 
 \_REV  5.7.4        Always returns the latest version of ACPI supported.
-
-_RMV   6.3.6        Optional.
 
 \_SB   5.3.1        Required on arm64; all devices must be defined in this
                     namespace.
 
-_SEG   6.5.6        Use as needed; PCI-specific.
-
-\_SI   5.3.1,       Optional.
-       9.1
-
-_SLI   6.2.15       Optional; recommended when SLIT table is in use.
+_SLI   6.2.15       Use is recommended when SLIT table is in use.
 
 _STA   6.3.7,       It is recommended to define this method for any device
-       7.1.4        that can be turned on or off.
+       7.2.4        that can be turned on or off.  See also the STAO table
+                    that provides overrides to hide devices in virtualized
+                    environments.
 
-_SRS   6.2.16       Optional; see also _PRS.
+_SRS   6.2.16       Use as needed; see also _PRS.
 
 _STR   6.1.10       Recommended for conveying device names to end users;
                     this is preferred over using _DDN.
 
 _SUB   6.1.9        Use as needed; _HID or _CID are preferred.
 
-_SUN   6.1.11       Optional.
+_SUN   6.1.11       Use as needed, but recommended.
 
-\_Sx   7.3.2        Use as needed; power management specific.
-
-_SxD   7.2.16-19    Use as needed; power management specific.
-
-_SxW   7.2.20-24    Use as needed; power management specific.
-
-_SWS   7.3.3        Use as needed; power management specific; this may
+_SWS   7.4.3        Use as needed; power management specific; this may
                     require specification changes for use on arm64.
-
-\_TTS  7.3.4        Use as needed; power management specific.
-
-\_TZ   5.3.1        Optional.
 
 _UID   6.1.12       Recommended for distinguishing devices of the same
                     class; define it if at all possible.
 
-\_WAK  7.3.5        Use as needed; power management specific.
+
 
 
 ACPI Event Model
 ----------------
 Do not use GPE block devices; these are not supported in the hardware reduced
 profile used by arm64.  Since there are no GPE blocks defined for use on ARM
-platforms, GPIO-signaled interrupts should be used for creating system events.
+platforms, ACPI events must be signaled differently.
+
+There are two options: GPIO-signaled interrupts (Section 5.6.5), and
+interrupt-signaled events (Section 5.6.9).  Interrupt-signaled events are a
+new feature in the ACPI 6.1 specification.  Either -- or both -- can be used
+on a given platform, and which to use may be dependent on limitations in any
+given SoC.  If possible, interrupt-signaled events are recommended.
 
 
 ACPI Processor Control
 ----------------------
-Section 8 of the ACPI specification is currently undergoing change that
-should be completed in the 6.0 version of the specification.  Processor
-performance control will be handled differently for arm64 at that point
-in time.  Processor aggregator devices (section 8.5) will not be used,
-for example, but another similar mechanism instead.
+Section 8 of the ACPI specification changed significantly in version 6.0.
+Processors should now be defined as Device objects with _HID ACPI0007; do
+not use the deprecated Processor statement in ASL.  All multiprocessor systems
+should also define a hierarchy of processors, done with Processor Container
+Devices (see Section 8.4.3.1, _HID ACPI0010); do not use processor aggregator
+devices (Section 8.5) to describe processor topology.  Section 8.4 of the
+specification describes the semantics of these object definitions and how
+they interrelate.
 
-While UEFI constrains what we can say until the release of 6.0, it is
-recommended that CPPC (8.4.5) be used as the primary model.  This will
-still be useful into the future.  C-states and P-states will still be
-provided, but most of the current design work appears to favor CPPC.
+Most importantly, the processor hierarchy defined also defines the low power
+idle states that are available to the platform, along with the rules for
+determining which processors can be turned on or off and the circumstances
+that control that.  Without this information, the processors will run in
+whatever power state they were left in by UEFI.
+
+Note too, that the processor Device objects defined and the entries in the
+MADT for GICs are expected to be in synchronization.  The _UID of the Device
+object must correspond to processor IDs used in the MADT.
+
+It is recommended that CPPC (8.4.5) be used as the primary model for processor
+performance control on arm64.  C-states and P-states may become available at
+some point in the future, but most current design work appears to favor CPPC.
 
 Further, it is essential that the ARMv8 SoC provide a fully functional
 implementation of PSCI; this will be the only mechanism supported by ACPI
-to control CPU power state (including secondary CPU booting).
-
-More details will be provided on the release of the ACPI 6.0 specification.
+to control CPU power state.  Booting of secondary CPUs using the ACPI
+parking protocol is possible, but discouraged, since only PSCI is supported
+for ARM servers.
 
 
 ACPI System Address Map Interfaces
···
 attention.
 
 Since there is no direct equivalent of the x86 SCI or NMI, arm64 handles
-these slightly differently.  The SCI is handled as a normal GPIO-signaled
-interrupt; given that these are corrected (or correctable) errors being
-reported, this is sufficient.  The NMI is emulated as the highest priority
-GPIO-signaled interrupt possible.  This implies some caution must be used
-since there could be interrupts at higher privilege levels or even interrupts
-at the same priority as the emulated NMI.  In Linux, this should not be the
-case but one should be aware it could happen.
+these slightly differently.  The SCI is handled as a high priority interrupt;
+given that these are corrected (or correctable) errors being reported, this
+is sufficient.  The NMI is emulated as the highest priority interrupt
+possible.  This implies some caution must be used since there could be
+interrupts at higher privilege levels or even interrupts at the same priority
+as the emulated NMI.  In Linux, this should not be the case but one should
+be aware it could happen.
 
 
 ACPI Objects Not Supported on ARM64
 -----------------------------------
 While this may change in the future, there are several classes of objects
 that can be defined, but are not currently of general interest to ARM servers.
+Some of these objects have x86 equivalents, and may actually make sense in ARM
+servers.  However, there is either no hardware available at present, or there
+may not even be a non-ARM implementation yet.  Hence, they are not currently
+supported.
 
-These are not supported:
+The following classes of objects are not supported:
 
 -- Section 9.2: ambient light sensor devices
 
···
 
 -- Section 9.18: time and alarm devices (see 9.15)
 
-
-ACPI Objects Not Yet Implemented
---------------------------------
-While these objects have x86 equivalents, and they do make some sense in ARM
-servers, there is either no hardware available at present, or in some cases
-there may not yet be a non-ARM implementation.  Hence, they are currently not
-implemented though that may change in the future.
-
-Not yet implemented are:
-
 -- Section 10: power source and power meter devices
 
 -- Section 11: thermal management
···
 
 -- Section 13: SMBus interfaces
 
--- Section 17: NUMA support (prototypes have been submitted for
-   review)
+
+This also means that there is no support for the following objects:
+
+Name   Section       Name   Section
+----   ------------  ----   ------------
+_ALC   9.3.4         _FDM   9.10.3
+_ALI   9.3.2         _FIX   6.2.7
+_ALP   9.3.6         _GAI   10.4.5
+_ALR   9.3.5         _GHL   10.4.7
+_ALT   9.3.3         _GTM   9.9.2.1.1
+_BCT   10.2.2.10     _LID   9.5.1
+_BDN   6.5.3         _PAI   10.4.4
+_BIF   10.2.2.1      _PCL   10.3.2
+_BIX   10.2.2.1      _PIF   10.3.3
+_BLT   9.2.3         _PMC   10.4.1
+_BMA   10.2.2.4      _PMD   10.4.8
+_BMC   10.2.2.12     _PMM   10.4.3
+_BMD   10.2.2.11     _PRL   10.3.4
+_BMS   10.2.2.5      _PSR   10.3.1
+_BST   10.2.2.6      _PTP   10.4.2
+_BTH   10.2.2.7      _SBS   10.1.3
+_BTM   10.2.2.9      _SHL   10.4.6
+_BTP   10.2.2.8      _STM   9.9.2.1.1
+_DCK   6.5.2         _UPD   9.16.1
+_EC    12.12         _UPP   9.16.2
+_FDE   9.10.1        _WPC   10.5.2
+_FDI   9.10.2        _WPP   10.5.3
+
+27 -13
Documentation/arm64/arm-acpi.txt
···
 
 The short form of the rationale for ACPI on ARM is:
 
--- ACPI’s bytecode (AML) allows the platform to encode hardware behavior,
+-- ACPI’s byte code (AML) allows the platform to encode hardware behavior,
    while DT explicitly does not support this.  For hardware vendors, being
    able to encode behavior is a key tool used in supporting operating
    system releases on new hardware.
···
 
 -- The new ACPI governance process works well and Linux is now at the same
    table as hardware vendors and other OS vendors.  In fact, there is no
-   longer any reason to feel that ACPI is only belongs to Windows or that
+   longer any reason to feel that ACPI only belongs to Windows or that
    Linux is in any way secondary to Microsoft in this arena.  The move of
    ACPI governance into the UEFI forum has significantly opened up the
    specification development process, and currently, a large portion of the
-   changes being made to ACPI is being driven by Linux.
+   changes being made to ACPI are being driven by Linux.
 
 Key to the use of ACPI is the support model.  For servers in general, the
 responsibility for hardware behaviour cannot solely be the domain of the
···
 exclusive with DT support at compile time.
 
 At boot time the kernel will only use one description method depending on
-parameters passed from the bootloader (including kernel bootargs).
+parameters passed from the boot loader (including kernel bootargs).
 
 Regardless of whether DT or ACPI is used, the kernel must always be capable
 of booting with either scheme (in kernels with both schemes enabled at compile
···
 (Fixed ACPI Description Table).  Any 32-bit address fields in the FADT will
 be ignored on arm64.
 
-Hardware reduced mode (see Section 4.1 of the ACPI 5.1 specification) will
+Hardware reduced mode (see Section 4.1 of the ACPI 6.1 specification) will
 be enforced by the ACPI core on arm64.  Doing so allows the ACPI core to
 run less complex code since it no longer has to provide support for legacy
 hardware from other architectures.  Any fields that are not to be used for
···
 
 For the ACPI core to operate properly, and in turn provide the information
 the kernel needs to configure devices, it expects to find the following
-tables (all section numbers refer to the ACPI 5.1 specfication):
+tables (all section numbers refer to the ACPI 6.1 specification):
 
    -- RSDP (Root System Description Pointer), section 5.2.5
 
···
    -- If PCI is supported, the MCFG (Memory mapped ConFiGuration
       Table), section 5.2.6, specifically Table 5-31.
 
+   -- If booting without a console=<device> kernel parameter is
+      supported, the SPCR (Serial Port Console Redirection table),
+      section 5.2.6, specifically Table 5-31.
+
+   -- If necessary to describe the I/O topology, SMMUs and GIC ITSs,
+      the IORT (Input Output Remapping Table, section 5.2.6, specifically
+      Table 5-31).
+
+   -- If NUMA is supported, the SRAT (System Resource Affinity Table)
+      and SLIT (System Locality distance Information Table), sections
+      5.2.16 and 5.2.17, respectively.
+
 If the above tables are not all present, the kernel may or may not be
 able to boot properly since it may not be able to configure all of the
-devices available.
+devices available.  This list of tables is not meant to be all inclusive;
+in some environments other tables may be needed (e.g., any of the APEI
+tables from section 18) to support specific functionality.
 
 
 ACPI Detection
···
 Recommendations" section.
 
 In non-driver code, if the presence of ACPI needs to be detected at
-runtime, then check the value of acpi_disabled.  If CONFIG_ACPI is not
+run time, then check the value of acpi_disabled.  If CONFIG_ACPI is not
 set, acpi_disabled will always be 1.
 
···
 then retrieve the value of the property by evaluating the KEY0 object.
 However, using Name() this way has multiple problems: (1) ACPI limits
 names ("KEY0") to four characters unlike DT; (2) there is no industry
-wide registry that maintains a list of names, minimzing re-use; (3)
+wide registry that maintains a list of names, minimizing re-use; (3)
 there is also no registry for the definition of property values ("value0"),
 again making re-use difficult; and (4) how does one maintain backward
 compatibility as new hardware comes out?  The _DSD method was created
···
 version 5.1 was released and version 6.0 substantially completed, with most of
 the changes being driven by ARM-specific requirements.  Proposed changes are
 presented and discussed in the ASWG (ACPI Specification Working Group) which
-is a part of the UEFI Forum.
+is a part of the UEFI Forum.  The current version of the ACPI specification
+is 6.1, released in January 2016.
 
 Participation in this group is open to all UEFI members.  Please see
 http://www.uefi.org/workinggroup for details on group membership.
···
 as closely as possible, and to only implement functionality that complies with
 the released standards from UEFI ASWG.  As a practical matter, there will be
 vendors that provide bad ACPI tables or violate the standards in some way.
-If this is because of errors, quirks and fixups may be necessary, but will
+If this is because of errors, quirks and fix-ups may be necessary, but will
 be avoided if possible.  If there are features missing from ACPI that preclude
 it from being used on a platform, ECRs (Engineering Change Requests) should be
 submitted to ASWG and go through the normal approval process; for those that
···
     Software on ARM Platforms", dated 16 Aug 2014
 
 [2] http://www.secretlab.ca/archives/151, 10 Jan 2015, Copyright (c) 2015,
-    Linaro Ltd., written by Grant Likely.  A copy of the verbatim text (apart
-    from formatting) is also in Documentation/arm64/why_use_acpi.txt.
+    Linaro Ltd., written by Grant Likely.
 
 [3] AMD ACPI for Seattle platform documentation:
     http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Seattle_ACPI_Guide.pdf
+3 -1
Documentation/devicetree/bindings/arm/pmu.txt
··· 39 39 When using a PPI, specifies a list of phandles to CPU 40 40 nodes corresponding to the set of CPUs which have 41 41 a PMU of this type signalling the PPI listed in the 42 - interrupts property. 42 + interrupts property, unless this is already specified 43 + by the PPI interrupt specifier itself (in which case 44 + the interrupt-affinity property shouldn't be present). 43 45 44 46 This property should be present when there is more than 45 47 a single SPI.
+15 -1
arch/arm64/Kconfig
··· 8 8 select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE 9 9 select ARCH_HAS_ELF_RANDOMIZE 10 10 select ARCH_HAS_GCOV_PROFILE_ALL 11 + select ARCH_HAS_KCOV 11 12 select ARCH_HAS_SG_CHAIN 12 13 select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST 13 14 select ARCH_USE_CMPXCHG_LOCKREF ··· 87 86 select HAVE_PERF_EVENTS 88 87 select HAVE_PERF_REGS 89 88 select HAVE_PERF_USER_STACK_DUMP 89 + select HAVE_REGS_AND_STACK_ACCESS_API 90 90 select HAVE_RCU_TABLE_FREE 91 91 select HAVE_SYSCALL_TRACEPOINTS 92 + select HAVE_KPROBES 93 + select HAVE_KRETPROBES if HAVE_KPROBES 92 94 select IOMMU_DMA if IOMMU_SUPPORT 93 95 select IRQ_DOMAIN 94 96 select IRQ_FORCED_THREADING ··· 669 665 670 666 If in doubt, say N here. 671 667 668 + config KEXEC 669 + depends on PM_SLEEP_SMP 670 + select KEXEC_CORE 671 + bool "kexec system call" 672 + ---help--- 673 + kexec is a system call that implements the ability to shutdown your 674 + current kernel, and to start another kernel. It is like a reboot 675 + but it is independent of the system firmware. And like a reboot 676 + you can start any kernel with it, not just Linux. 677 + 672 678 config XEN_DOM0 673 679 def_bool y 674 680 depends on XEN ··· 887 873 888 874 config RANDOMIZE_BASE 889 875 bool "Randomize the address of the kernel image" 890 - select ARM64_MODULE_PLTS 876 + select ARM64_MODULE_PLTS if MODULES 891 877 select RELOCATABLE 892 878 help 893 879 Randomizes the virtual address at which the kernel image is
+10 -1
arch/arm64/Makefile
··· 12 12 13 13 LDFLAGS_vmlinux :=-p --no-undefined -X 14 14 CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET) 15 - OBJCOPYFLAGS :=-O binary -R .note -R .note.gnu.build-id -R .comment -S 16 15 GZFLAGS :=-9 17 16 18 17 ifneq ($(CONFIG_RELOCATABLE),) ··· 119 120 archclean: 120 121 $(Q)$(MAKE) $(clean)=$(boot) 121 122 $(Q)$(MAKE) $(clean)=$(boot)/dts 123 + 124 + # We need to generate vdso-offsets.h before compiling certain files in kernel/. 125 + # In order to do that, we should use the archprepare target, but we can't since 126 + # asm-offsets.h is included in some files used to generate vdso-offsets.h, and 127 + # asm-offsets.h is built in prepare0, for which archprepare is a dependency. 128 + # Therefore we need to generate the header after prepare0 has been made, hence 129 + # this hack. 130 + prepare: vdso_prepare 131 + vdso_prepare: prepare0 132 + $(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso include/generated/vdso-offsets.h 122 133 123 134 define archhelp 124 135 echo '* Image.gz - Compressed kernel image (arch/$(ARCH)/boot/Image.gz)'
+2
arch/arm64/boot/Makefile
··· 14 14 # Based on the ia64 boot/Makefile. 15 15 # 16 16 17 + OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R .comment -S 18 + 17 19 targets := Image Image.gz 18 20 19 21 $(obj)/Image: vmlinux FORCE
+1
arch/arm64/configs/defconfig
··· 70 70 CONFIG_TRANSPARENT_HUGEPAGE=y 71 71 CONFIG_CMA=y 72 72 CONFIG_XEN=y 73 + CONFIG_KEXEC=y 73 74 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set 74 75 CONFIG_COMPAT=y 75 76 CONFIG_CPU_IDLE=y
-1
arch/arm64/include/asm/Kbuild
··· 1 1 generic-y += bug.h 2 2 generic-y += bugs.h 3 - generic-y += checksum.h 4 3 generic-y += clkdev.h 5 4 generic-y += cputime.h 6 5 generic-y += current.h
+7 -9
arch/arm64/include/asm/alternative.h
··· 95 95 * The code that follows this macro will be assembled and linked as 96 96 * normal. There are no restrictions on this code. 97 97 */ 98 - .macro alternative_if_not cap, enable = 1 99 - .if \enable 98 + .macro alternative_if_not cap 100 99 .pushsection .altinstructions, "a" 101 100 altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f 102 101 .popsection 103 102 661: 104 - .endif 105 103 .endm 106 104 107 105 /* ··· 116 118 * alternative sequence it is defined in (branches into an 117 119 * alternative sequence are not fixed up). 118 120 */ 119 - .macro alternative_else, enable = 1 120 - .if \enable 121 + .macro alternative_else 121 122 662: .pushsection .altinstr_replacement, "ax" 122 123 663: 123 - .endif 124 124 .endm 125 125 126 126 /* 127 127 * Complete an alternative code sequence. 128 128 */ 129 - .macro alternative_endif, enable = 1 130 - .if \enable 129 + .macro alternative_endif 131 130 664: .popsection 132 131 .org . - (664b-663b) + (662b-661b) 133 132 .org . - (662b-661b) + (664b-663b) 134 - .endif 135 133 .endm 136 134 137 135 #define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...) \ 138 136 alternative_insn insn1, insn2, cap, IS_ENABLED(cfg) 139 137 138 + .macro user_alt, label, oldinstr, newinstr, cond 139 + 9999: alternative_insn "\oldinstr", "\newinstr", \cond 140 + _ASM_EXTABLE 9999b, \label 141 + .endm 140 142 141 143 /* 142 144 * Generate the assembly for UAO alternatives with exception table entries.
+11 -1
arch/arm64/include/asm/assembler.h
··· 24 24 #define __ASM_ASSEMBLER_H 25 25 26 26 #include <asm/asm-offsets.h> 27 + #include <asm/cpufeature.h> 27 28 #include <asm/page.h> 28 29 #include <asm/pgtable-hwdef.h> 29 30 #include <asm/ptrace.h> ··· 262 261 add \size, \kaddr, \size 263 262 sub \tmp2, \tmp1, #1 264 263 bic \kaddr, \kaddr, \tmp2 265 - 9998: dc \op, \kaddr 264 + 9998: 265 + .if (\op == cvau || \op == cvac) 266 + alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE 267 + dc \op, \kaddr 268 + alternative_else 269 + dc civac, \kaddr 270 + alternative_endif 271 + .else 272 + dc \op, \kaddr 273 + .endif 266 274 add \kaddr, \kaddr, \tmp1 267 275 cmp \kaddr, \size 268 276 b.lo 9998b
+51
arch/arm64/include/asm/checksum.h
··· 1 + /* 2 + * Copyright (C) 2016 ARM Ltd. 3 + * 4 + * This program is free software; you can redistribute it and/or modify 5 + * it under the terms of the GNU General Public License version 2 as 6 + * published by the Free Software Foundation. 7 + * 8 + * This program is distributed in the hope that it will be useful, 9 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 10 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 11 + * GNU General Public License for more details. 12 + * 13 + * You should have received a copy of the GNU General Public License 14 + * along with this program. If not, see <http://www.gnu.org/licenses/>. 15 + */ 16 + #ifndef __ASM_CHECKSUM_H 17 + #define __ASM_CHECKSUM_H 18 + 19 + #include <linux/types.h> 20 + 21 + static inline __sum16 csum_fold(__wsum csum) 22 + { 23 + u32 sum = (__force u32)csum; 24 + sum += (sum >> 16) | (sum << 16); 25 + return ~(__force __sum16)(sum >> 16); 26 + } 27 + #define csum_fold csum_fold 28 + 29 + static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) 30 + { 31 + __uint128_t tmp; 32 + u64 sum; 33 + 34 + tmp = *(const __uint128_t *)iph; 35 + iph += 16; 36 + ihl -= 4; 37 + tmp += ((tmp >> 64) | (tmp << 64)); 38 + sum = tmp >> 64; 39 + do { 40 + sum += *(const u32 *)iph; 41 + iph += 4; 42 + } while (--ihl); 43 + 44 + sum += ((sum >> 32) | (sum << 32)); 45 + return csum_fold(sum >> 32); 46 + } 47 + #define ip_fast_csum ip_fast_csum 48 + 49 + #include <asm-generic/checksum.h> 50 + 51 + #endif /* __ASM_CHECKSUM_H */
+2
arch/arm64/include/asm/cpu.h
··· 25 25 */ 26 26 struct cpuinfo_arm64 { 27 27 struct cpu cpu; 28 + struct kobject kobj; 28 29 u32 reg_ctr; 29 30 u32 reg_cntfrq; 30 31 u32 reg_dczid; 31 32 u32 reg_midr; 33 + u32 reg_revidr; 32 34 33 35 u64 reg_id_aa64dfr0; 34 36 u64 reg_id_aa64dfr1;
+2
arch/arm64/include/asm/cpufeature.h
··· 191 191 192 192 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, 193 193 const char *info); 194 + void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps); 194 195 void check_local_cpu_errata(void); 196 + void __init enable_errata_workarounds(void); 195 197 196 198 void verify_local_cpu_errata(void); 197 199 void verify_local_cpu_capabilities(void);
+5
arch/arm64/include/asm/debug-monitors.h
··· 66 66 67 67 #define CACHE_FLUSH_IS_SAFE 1 68 68 69 + /* kprobes BRK opcodes with ESR encoding */ 70 + #define BRK64_ESR_MASK 0xFFFF 71 + #define BRK64_ESR_KPROBES 0x0004 72 + #define BRK64_OPCODE_KPROBES (AARCH64_BREAK_MON | (BRK64_ESR_KPROBES << 5)) 73 + 69 74 /* AArch32 */ 70 75 #define DBG_ESR_EVT_BKPT 0x4 71 76 #define DBG_ESR_EVT_VECC 0x5
+1 -2
arch/arm64/include/asm/efi.h
··· 14 14 #endif 15 15 16 16 int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md); 17 - 18 - #define efi_set_mapping_permissions efi_create_mapping 17 + int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); 19 18 20 19 #define arch_efi_call_virt_setup() \ 21 20 ({ \
+1
arch/arm64/include/asm/esr.h
··· 74 74 75 75 #define ESR_ELx_EC_SHIFT (26) 76 76 #define ESR_ELx_EC_MASK (UL(0x3F) << ESR_ELx_EC_SHIFT) 77 + #define ESR_ELx_EC(esr) (((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT) 77 78 78 79 #define ESR_ELx_IL (UL(1) << 25) 79 80 #define ESR_ELx_ISS_MASK (ESR_ELx_IL - 1)
+41
arch/arm64/include/asm/insn.h
··· 120 120 AARCH64_INSN_REG_SP = 31 /* Stack pointer: as load/store base reg */ 121 121 }; 122 122 123 + enum aarch64_insn_special_register { 124 + AARCH64_INSN_SPCLREG_SPSR_EL1 = 0xC200, 125 + AARCH64_INSN_SPCLREG_ELR_EL1 = 0xC201, 126 + AARCH64_INSN_SPCLREG_SP_EL0 = 0xC208, 127 + AARCH64_INSN_SPCLREG_SPSEL = 0xC210, 128 + AARCH64_INSN_SPCLREG_CURRENTEL = 0xC212, 129 + AARCH64_INSN_SPCLREG_DAIF = 0xDA11, 130 + AARCH64_INSN_SPCLREG_NZCV = 0xDA10, 131 + AARCH64_INSN_SPCLREG_FPCR = 0xDA20, 132 + AARCH64_INSN_SPCLREG_DSPSR_EL0 = 0xDA28, 133 + AARCH64_INSN_SPCLREG_DLR_EL0 = 0xDA29, 134 + AARCH64_INSN_SPCLREG_SPSR_EL2 = 0xE200, 135 + AARCH64_INSN_SPCLREG_ELR_EL2 = 0xE201, 136 + AARCH64_INSN_SPCLREG_SP_EL1 = 0xE208, 137 + AARCH64_INSN_SPCLREG_SPSR_INQ = 0xE218, 138 + AARCH64_INSN_SPCLREG_SPSR_ABT = 0xE219, 139 + AARCH64_INSN_SPCLREG_SPSR_UND = 0xE21A, 140 + AARCH64_INSN_SPCLREG_SPSR_FIQ = 0xE21B, 141 + AARCH64_INSN_SPCLREG_SPSR_EL3 = 0xF200, 142 + AARCH64_INSN_SPCLREG_ELR_EL3 = 0xF201, 143 + AARCH64_INSN_SPCLREG_SP_EL2 = 0xF210 144 + }; 145 + 123 146 enum aarch64_insn_variant { 124 147 AARCH64_INSN_VARIANT_32BIT, 125 148 AARCH64_INSN_VARIANT_64BIT ··· 246 223 static __always_inline u32 aarch64_insn_get_##abbr##_value(void) \ 247 224 { return (val); } 248 225 226 + __AARCH64_INSN_FUNCS(adr_adrp, 0x1F000000, 0x10000000) 227 + __AARCH64_INSN_FUNCS(prfm_lit, 0xFF000000, 0xD8000000) 249 228 __AARCH64_INSN_FUNCS(str_reg, 0x3FE0EC00, 0x38206800) 250 229 __AARCH64_INSN_FUNCS(ldr_reg, 0x3FE0EC00, 0x38606800) 230 + __AARCH64_INSN_FUNCS(ldr_lit, 0xBF000000, 0x18000000) 231 + __AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000) 232 + __AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000) 233 + __AARCH64_INSN_FUNCS(load_ex, 0x3F400000, 0x08400000) 234 + __AARCH64_INSN_FUNCS(store_ex, 0x3F400000, 0x08000000) 251 235 __AARCH64_INSN_FUNCS(stp_post, 0x7FC00000, 0x28800000) 252 236 __AARCH64_INSN_FUNCS(ldp_post, 0x7FC00000, 0x28C00000) 253 237 __AARCH64_INSN_FUNCS(stp_pre, 
0x7FC00000, 0x29800000) ··· 303 273 __AARCH64_INSN_FUNCS(hvc, 0xFFE0001F, 0xD4000002) 304 274 __AARCH64_INSN_FUNCS(smc, 0xFFE0001F, 0xD4000003) 305 275 __AARCH64_INSN_FUNCS(brk, 0xFFE0001F, 0xD4200000) 276 + __AARCH64_INSN_FUNCS(exception, 0xFF000000, 0xD4000000) 306 277 __AARCH64_INSN_FUNCS(hint, 0xFFFFF01F, 0xD503201F) 307 278 __AARCH64_INSN_FUNCS(br, 0xFFFFFC1F, 0xD61F0000) 308 279 __AARCH64_INSN_FUNCS(blr, 0xFFFFFC1F, 0xD63F0000) 309 280 __AARCH64_INSN_FUNCS(ret, 0xFFFFFC1F, 0xD65F0000) 281 + __AARCH64_INSN_FUNCS(eret, 0xFFFFFFFF, 0xD69F03E0) 282 + __AARCH64_INSN_FUNCS(mrs, 0xFFF00000, 0xD5300000) 283 + __AARCH64_INSN_FUNCS(msr_imm, 0xFFF8F01F, 0xD500401F) 284 + __AARCH64_INSN_FUNCS(msr_reg, 0xFFF00000, 0xD5100000) 310 285 311 286 #undef __AARCH64_INSN_FUNCS 312 287 ··· 321 286 int aarch64_insn_read(void *addr, u32 *insnp); 322 287 int aarch64_insn_write(void *addr, u32 insn); 323 288 enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn); 289 + bool aarch64_insn_uses_literal(u32 insn); 290 + bool aarch64_insn_is_branch(u32 insn); 324 291 u64 aarch64_insn_decode_immediate(enum aarch64_insn_imm_type type, u32 insn); 325 292 u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type, 326 293 u32 insn, u64 imm); ··· 404 367 #define A32_RT_OFFSET 12 405 368 #define A32_RT2_OFFSET 0 406 369 370 + u32 aarch64_insn_extract_system_reg(u32 insn); 407 371 u32 aarch32_insn_extract_reg_num(u32 insn, int offset); 408 372 u32 aarch32_insn_mcr_extract_opc2(u32 insn); 409 373 u32 aarch32_insn_mcr_extract_crm(u32 insn); 374 + 375 + typedef bool (pstate_check_t)(unsigned long); 376 + extern pstate_check_t * const aarch32_opcode_cond_checks[16]; 410 377 #endif /* __ASSEMBLY__ */ 411 378 412 379 #endif /* __ASM_INSN_H */
-3
arch/arm64/include/asm/irqflags.h
··· 110 110 : : "r" (flags) : "memory"); \ 111 111 } while (0) 112 112 113 - #define local_dbg_enable() asm("msr daifclr, #8" : : : "memory") 114 - #define local_dbg_disable() asm("msr daifset, #8" : : : "memory") 115 - 116 113 #endif 117 114 #endif
+48
arch/arm64/include/asm/kexec.h
··· 1 + /* 2 + * kexec for arm64 3 + * 4 + * Copyright (C) Linaro. 5 + * Copyright (C) Huawei Futurewei Technologies. 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License version 2 as 9 + * published by the Free Software Foundation. 10 + */ 11 + 12 + #ifndef _ARM64_KEXEC_H 13 + #define _ARM64_KEXEC_H 14 + 15 + /* Maximum physical address we can use pages from */ 16 + 17 + #define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) 18 + 19 + /* Maximum address we can reach in physical address mode */ 20 + 21 + #define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL) 22 + 23 + /* Maximum address we can use for the control code buffer */ 24 + 25 + #define KEXEC_CONTROL_MEMORY_LIMIT (-1UL) 26 + 27 + #define KEXEC_CONTROL_PAGE_SIZE 4096 28 + 29 + #define KEXEC_ARCH KEXEC_ARCH_AARCH64 30 + 31 + #ifndef __ASSEMBLY__ 32 + 33 + /** 34 + * crash_setup_regs() - save registers for the panic kernel 35 + * 36 + * @newregs: registers are saved here 37 + * @oldregs: registers to be saved (may be %NULL) 38 + */ 39 + 40 + static inline void crash_setup_regs(struct pt_regs *newregs, 41 + struct pt_regs *oldregs) 42 + { 43 + /* Empty routine needed to avoid build errors. */ 44 + } 45 + 46 + #endif /* __ASSEMBLY__ */ 47 + 48 + #endif
+62
arch/arm64/include/asm/kprobes.h
··· 1 + /* 2 + * arch/arm64/include/asm/kprobes.h 3 + * 4 + * Copyright (C) 2013 Linaro Limited 5 + * 6 + * This program is free software; you can redistribute it and/or modify 7 + * it under the terms of the GNU General Public License version 2 as 8 + * published by the Free Software Foundation. 9 + * 10 + * This program is distributed in the hope that it will be useful, 11 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 12 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 13 + * General Public License for more details. 14 + */ 15 + 16 + #ifndef _ARM_KPROBES_H 17 + #define _ARM_KPROBES_H 18 + 19 + #include <linux/types.h> 20 + #include <linux/ptrace.h> 21 + #include <linux/percpu.h> 22 + 23 + #define __ARCH_WANT_KPROBES_INSN_SLOT 24 + #define MAX_INSN_SIZE 1 25 + #define MAX_STACK_SIZE 128 26 + 27 + #define flush_insn_slot(p) do { } while (0) 28 + #define kretprobe_blacklist_size 0 29 + 30 + #include <asm/probes.h> 31 + 32 + struct prev_kprobe { 33 + struct kprobe *kp; 34 + unsigned int status; 35 + }; 36 + 37 + /* Single step context for kprobe */ 38 + struct kprobe_step_ctx { 39 + unsigned long ss_pending; 40 + unsigned long match_addr; 41 + }; 42 + 43 + /* per-cpu kprobe control block */ 44 + struct kprobe_ctlblk { 45 + unsigned int kprobe_status; 46 + unsigned long saved_irqflag; 47 + struct prev_kprobe prev_kprobe; 48 + struct kprobe_step_ctx ss_ctx; 49 + struct pt_regs jprobe_saved_regs; 50 + char jprobes_stack[MAX_STACK_SIZE]; 51 + }; 52 + 53 + void arch_remove_kprobe(struct kprobe *); 54 + int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr); 55 + int kprobe_exceptions_notify(struct notifier_block *self, 56 + unsigned long val, void *data); 57 + int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr); 58 + int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr); 59 + void kretprobe_trampoline(void); 60 + void __kprobes *trampoline_probe_handler(struct pt_regs *regs); 61 + 62 + 
#endif /* _ARM_KPROBES_H */
+1 -1
arch/arm64/include/asm/kvm_emulate.h
··· 210 210 211 211 static inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu) 212 212 { 213 - return kvm_vcpu_get_hsr(vcpu) >> ESR_ELx_EC_SHIFT; 213 + return ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu)); 214 214 } 215 215 216 216 static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
+1 -1
arch/arm64/include/asm/mmu.h
··· 34 34 extern void init_mem_pgprot(void); 35 35 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, 36 36 unsigned long virt, phys_addr_t size, 37 - pgprot_t prot); 37 + pgprot_t prot, bool allow_block_mappings); 38 38 extern void *fixmap_remap_fdt(phys_addr_t dt_phys); 39 39 40 40 #endif
+35
arch/arm64/include/asm/probes.h
··· 1 + /* 2 + * arch/arm64/include/asm/probes.h 3 + * 4 + * Copyright (C) 2013 Linaro Limited 5 + * 6 + * This program is free software; you can redistribute it and/or modify 7 + * it under the terms of the GNU General Public License version 2 as 8 + * published by the Free Software Foundation. 9 + * 10 + * This program is distributed in the hope that it will be useful, 11 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 12 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 13 + * General Public License for more details. 14 + */ 15 + #ifndef _ARM_PROBES_H 16 + #define _ARM_PROBES_H 17 + 18 + #include <asm/opcodes.h> 19 + 20 + struct kprobe; 21 + struct arch_specific_insn; 22 + 23 + typedef u32 kprobe_opcode_t; 24 + typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *); 25 + 26 + /* architecture specific copy of original instruction */ 27 + struct arch_specific_insn { 28 + kprobe_opcode_t *insn; 29 + pstate_check_t *pstate_cc; 30 + kprobes_handler_t *handler; 31 + /* restore address after step xol */ 32 + unsigned long restore; 33 + }; 34 + 35 + #endif
+1
arch/arm64/include/asm/processor.h
··· 192 192 193 193 void cpu_enable_pan(void *__unused); 194 194 void cpu_enable_uao(void *__unused); 195 + void cpu_enable_cache_maint_trap(void *__unused); 195 196 196 197 #endif /* __ASM_PROCESSOR_H */
+44
arch/arm64/include/asm/ptdump.h
··· 1 + /* 2 + * Copyright (C) 2014 ARM Ltd. 3 + * 4 + * This program is free software; you can redistribute it and/or modify 5 + * it under the terms of the GNU General Public License version 2 as 6 + * published by the Free Software Foundation. 7 + * 8 + * This program is distributed in the hope that it will be useful, 9 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 10 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 11 + * GNU General Public License for more details. 12 + * 13 + * You should have received a copy of the GNU General Public License 14 + * along with this program. If not, see <http://www.gnu.org/licenses/>. 15 + */ 16 + #ifndef __ASM_PTDUMP_H 17 + #define __ASM_PTDUMP_H 18 + 19 + #ifdef CONFIG_ARM64_PTDUMP 20 + 21 + #include <linux/mm_types.h> 22 + 23 + struct addr_marker { 24 + unsigned long start_address; 25 + char *name; 26 + }; 27 + 28 + struct ptdump_info { 29 + struct mm_struct *mm; 30 + const struct addr_marker *markers; 31 + unsigned long base_addr; 32 + unsigned long max_addr; 33 + }; 34 + 35 + int ptdump_register(struct ptdump_info *info, const char *name); 36 + 37 + #else 38 + static inline int ptdump_register(struct ptdump_info *info, const char *name) 39 + { 40 + return 0; 41 + } 42 + #endif /* CONFIG_ARM64_PTDUMP */ 43 + 44 + #endif /* __ASM_PTDUMP_H */
+61 -3
arch/arm64/include/asm/ptrace.h
··· 46 46 #define COMPAT_PSR_MODE_UND 0x0000001b 47 47 #define COMPAT_PSR_MODE_SYS 0x0000001f 48 48 #define COMPAT_PSR_T_BIT 0x00000020 49 - #define COMPAT_PSR_E_BIT 0x00000200 50 49 #define COMPAT_PSR_F_BIT 0x00000040 51 50 #define COMPAT_PSR_I_BIT 0x00000080 52 51 #define COMPAT_PSR_A_BIT 0x00000100 ··· 73 74 #define COMPAT_PT_DATA_ADDR 0x10004 74 75 #define COMPAT_PT_TEXT_END_ADDR 0x10008 75 76 #ifndef __ASSEMBLY__ 77 + #include <linux/bug.h> 76 78 77 79 /* sizeof(struct user) for AArch32 */ 78 80 #define COMPAT_USER_SZ 296 ··· 121 121 u64 unused; // maintain 16 byte alignment 122 122 }; 123 123 124 + #define MAX_REG_OFFSET offsetof(struct pt_regs, pstate) 125 + 124 126 #define arch_has_single_step() (1) 125 127 126 128 #ifdef CONFIG_COMPAT ··· 148 146 #define fast_interrupts_enabled(regs) \ 149 147 (!((regs)->pstate & PSR_F_BIT)) 150 148 151 - #define user_stack_pointer(regs) \ 149 + #define GET_USP(regs) \ 152 150 (!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp) 151 + 152 + #define SET_USP(ptregs, value) \ 153 + (!compat_user_mode(regs) ? ((regs)->sp = value) : ((regs)->compat_sp = value)) 154 + 155 + extern int regs_query_register_offset(const char *name); 156 + extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, 157 + unsigned int n); 158 + 159 + /** 160 + * regs_get_register() - get register value from its offset 161 + * @regs: pt_regs from which register value is gotten 162 + * @offset: offset of the register. 163 + * 164 + * regs_get_register returns the value of a register whose offset from @regs. 165 + * The @offset is the offset of the register in struct pt_regs. 166 + * If @offset is bigger than MAX_REG_OFFSET, this returns 0. 167 + */ 168 + static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset) 169 + { 170 + u64 val = 0; 171 + 172 + WARN_ON(offset & 7); 173 + 174 + offset >>= 3; 175 + switch (offset) { 176 + case 0 ... 
30: 177 + val = regs->regs[offset]; 178 + break; 179 + case offsetof(struct pt_regs, sp) >> 3: 180 + val = regs->sp; 181 + break; 182 + case offsetof(struct pt_regs, pc) >> 3: 183 + val = regs->pc; 184 + break; 185 + case offsetof(struct pt_regs, pstate) >> 3: 186 + val = regs->pstate; 187 + break; 188 + default: 189 + val = 0; 190 + } 191 + 192 + return val; 193 + } 194 + 195 + /* Valid only for Kernel mode traps. */ 196 + static inline unsigned long kernel_stack_pointer(struct pt_regs *regs) 197 + { 198 + return regs->sp; 199 + } 153 200 154 201 static inline unsigned long regs_return_value(struct pt_regs *regs) 155 202 { ··· 209 158 struct task_struct; 210 159 int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task); 211 160 212 - #define instruction_pointer(regs) ((unsigned long)(regs)->pc) 161 + #define GET_IP(regs) ((unsigned long)(regs)->pc) 162 + #define SET_IP(regs, value) ((regs)->pc = ((u64) (value))) 213 163 164 + #define GET_FP(ptregs) ((unsigned long)(ptregs)->regs[29]) 165 + #define SET_FP(ptregs, value) ((ptregs)->regs[29] = ((u64) (value))) 166 + 167 + #include <asm-generic/ptrace.h> 168 + 169 + #undef profile_pc 214 170 extern unsigned long profile_pc(struct pt_regs *regs); 215 171 216 172 #endif /* __ASSEMBLY__ */
+1 -1
arch/arm64/include/asm/sysreg.h
··· 98 98 SCTLR_ELx_SA | SCTLR_ELx_I) 99 99 100 100 /* SCTLR_EL1 specific flags. */ 101 + #define SCTLR_EL1_UCI (1 << 26) 101 102 #define SCTLR_EL1_SPAN (1 << 23) 102 103 #define SCTLR_EL1_SED (1 << 8) 103 104 #define SCTLR_EL1_CP15BEN (1 << 5) 104 - 105 105 106 106 /* id_aa64isar0 */ 107 107 #define ID_AA64ISAR0_RDM_SHIFT 28
+2
arch/arm64/include/asm/traps.h
··· 34 34 void register_undef_hook(struct undef_hook *hook); 35 35 void unregister_undef_hook(struct undef_hook *hook); 36 36 37 + void arm64_notify_segfault(struct pt_regs *regs, unsigned long addr); 38 + 37 39 #ifdef CONFIG_FUNCTION_GRAPH_TRACER 38 40 static inline int __in_irqentry_text(unsigned long ptr) 39 41 {
+21 -4
arch/arm64/include/asm/uaccess.h
··· 21 21 /* 22 22 * User space memory access functions 23 23 */ 24 + #include <linux/kasan-checks.h> 24 25 #include <linux/string.h> 25 26 #include <linux/thread_info.h> 26 27 ··· 257 256 -EFAULT; \ 258 257 }) 259 258 260 - extern unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n); 261 - extern unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n); 259 + extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n); 260 + extern unsigned long __must_check __arch_copy_to_user(void __user *to, const void *from, unsigned long n); 262 261 extern unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n); 263 262 extern unsigned long __must_check __clear_user(void __user *addr, unsigned long n); 264 263 264 + static inline unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n) 265 + { 266 + kasan_check_write(to, n); 267 + return __arch_copy_from_user(to, from, n); 268 + } 269 + 270 + static inline unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n) 271 + { 272 + kasan_check_read(from, n); 273 + return __arch_copy_to_user(to, from, n); 274 + } 275 + 265 276 static inline unsigned long __must_check copy_from_user(void *to, const void __user *from, unsigned long n) 266 277 { 278 + kasan_check_write(to, n); 279 + 267 280 if (access_ok(VERIFY_READ, from, n)) 268 - n = __copy_from_user(to, from, n); 281 + n = __arch_copy_from_user(to, from, n); 269 282 else /* security hole - plug it */ 270 283 memset(to, 0, n); 271 284 return n; ··· 287 272 288 273 static inline unsigned long __must_check copy_to_user(void __user *to, const void *from, unsigned long n) 289 274 { 275 + kasan_check_read(from, n); 276 + 290 277 if (access_ok(VERIFY_WRITE, to, n)) 291 - n = __copy_to_user(to, from, n); 278 + n = __arch_copy_to_user(to, from, 
n); 292 279 return n; 293 280 } 294 281
+6 -2
arch/arm64/include/asm/vdso_datapage.h
··· 22 22 23 23 struct vdso_data { 24 24 __u64 cs_cycle_last; /* Timebase at clocksource init */ 25 + __u64 raw_time_sec; /* Raw time */ 26 + __u64 raw_time_nsec; 25 27 __u64 xtime_clock_sec; /* Kernel time */ 26 28 __u64 xtime_clock_nsec; 27 29 __u64 xtime_coarse_sec; /* Coarse time */ ··· 31 29 __u64 wtm_clock_sec; /* Wall to monotonic time */ 32 30 __u64 wtm_clock_nsec; 33 31 __u32 tb_seq_count; /* Timebase sequence counter */ 34 - __u32 cs_mult; /* Clocksource multiplier */ 35 - __u32 cs_shift; /* Clocksource shift */ 32 + /* cs_* members must be adjacent and in this order (ldp accesses) */ 33 + __u32 cs_mono_mult; /* NTP-adjusted clocksource multiplier */ 34 + __u32 cs_shift; /* Clocksource shift (mono = raw) */ 35 + __u32 cs_raw_mult; /* Raw clocksource multiplier */ 36 36 __u32 tz_minuteswest; /* Whacky timezone stuff */ 37 37 __u32 tz_dsttime; 38 38 __u32 use_syscall;
+5
arch/arm64/include/asm/virt.h
··· 34 34 */ 35 35 #define HVC_SET_VECTORS 1 36 36 37 + /* 38 + * HVC_SOFT_RESTART - CPU soft reset, used by the cpu_soft_restart routine. 39 + */ 40 + #define HVC_SOFT_RESTART 2 41 + 37 42 #define BOOT_CPU_MODE_EL1 (0xe11) 38 43 #define BOOT_CPU_MODE_EL2 (0xe12) 39 44
+4 -7
arch/arm64/kernel/Makefile
··· 26 26 $(call if_changed,objcopy) 27 27 28 28 arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \ 29 - sys_compat.o entry32.o \ 30 - ../../arm/kernel/opcodes.o 29 + sys_compat.o entry32.o 31 30 arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o 32 31 arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o 33 32 arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o ··· 46 47 arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o 47 48 arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o 48 49 arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o 50 + arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \ 51 + cpu-reset.o 49 52 50 - obj-y += $(arm64-obj-y) vdso/ 53 + obj-y += $(arm64-obj-y) vdso/ probes/ 51 54 obj-m += $(arm64-obj-m) 52 55 head-y := head.o 53 56 extra-y += $(head-y) vmlinux.lds 54 - 55 - # vDSO - this must be built first to generate the symbol offsets 56 - $(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h 57 - $(obj)/vdso/vdso-offsets.h: $(obj)/vdso
+4 -2
arch/arm64/kernel/arm64ksyms.c
··· 27 27 #include <linux/uaccess.h> 28 28 #include <linux/io.h> 29 29 #include <linux/arm-smccc.h> 30 + #include <linux/kprobes.h> 30 31 31 32 #include <asm/checksum.h> 32 33 ··· 35 34 EXPORT_SYMBOL(clear_page); 36 35 37 36 /* user mem (segment) */ 38 - EXPORT_SYMBOL(__copy_from_user); 39 - EXPORT_SYMBOL(__copy_to_user); 37 + EXPORT_SYMBOL(__arch_copy_from_user); 38 + EXPORT_SYMBOL(__arch_copy_to_user); 40 39 EXPORT_SYMBOL(__clear_user); 41 40 EXPORT_SYMBOL(__copy_in_user); 42 41 ··· 69 68 70 69 #ifdef CONFIG_FUNCTION_TRACER 71 70 EXPORT_SYMBOL(_mcount); 71 + NOKPROBE_SYMBOL(_mcount); 72 72 #endif 73 73 74 74 /* arm-smccc */
+19 -25
arch/arm64/kernel/armv8_deprecated.c
··· 316 316 */ 317 317 #define TYPE_SWPB (1 << 22) 318 318 319 - /* 320 - * Set up process info to signal segmentation fault - called on access error. 321 - */ 322 - static void set_segfault(struct pt_regs *regs, unsigned long addr) 323 - { 324 - siginfo_t info; 325 - 326 - down_read(&current->mm->mmap_sem); 327 - if (find_vma(current->mm, addr) == NULL) 328 - info.si_code = SEGV_MAPERR; 329 - else 330 - info.si_code = SEGV_ACCERR; 331 - up_read(&current->mm->mmap_sem); 332 - 333 - info.si_signo = SIGSEGV; 334 - info.si_errno = 0; 335 - info.si_addr = (void *) instruction_pointer(regs); 336 - 337 - pr_debug("SWP{B} emulation: access caused memory abort!\n"); 338 - arm64_notify_die("Illegal memory access", regs, &info, 0); 339 - } 340 - 341 319 static int emulate_swpX(unsigned int address, unsigned int *data, 342 320 unsigned int type) 343 321 { ··· 344 366 return res; 345 367 } 346 368 369 + #define ARM_OPCODE_CONDITION_UNCOND 0xf 370 + 371 + static unsigned int __kprobes aarch32_check_condition(u32 opcode, u32 psr) 372 + { 373 + u32 cc_bits = opcode >> 28; 374 + 375 + if (cc_bits != ARM_OPCODE_CONDITION_UNCOND) { 376 + if ((*aarch32_opcode_cond_checks[cc_bits])(psr)) 377 + return ARM_OPCODE_CONDTEST_PASS; 378 + else 379 + return ARM_OPCODE_CONDTEST_FAIL; 380 + } 381 + return ARM_OPCODE_CONDTEST_UNCOND; 382 + } 383 + 347 384 /* 348 385 * swp_handler logs the id of calling process, dissects the instruction, sanity 349 386 * checks the memory location, calls emulate_swpX for the actual operation and ··· 373 380 374 381 type = instr & TYPE_SWPB; 375 382 376 - switch (arm_check_condition(instr, regs->pstate)) { 383 + switch (aarch32_check_condition(instr, regs->pstate)) { 377 384 case ARM_OPCODE_CONDTEST_PASS: 378 385 break; 379 386 case ARM_OPCODE_CONDTEST_FAIL: ··· 423 430 return 0; 424 431 425 432 fault: 426 - set_segfault(regs, address); 433 + pr_debug("SWP{B} emulation: access caused memory abort!\n"); 434 + arm64_notify_segfault(regs, address); 427 435 428 436 
return 0; 429 437 } ··· 455 461 { 456 462 perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, 1, regs, regs->pc); 457 463 458 - switch (arm_check_condition(instr, regs->pstate)) { 464 + switch (aarch32_check_condition(instr, regs->pstate)) { 459 465 case ARM_OPCODE_CONDTEST_PASS: 460 466 break; 461 467 case ARM_OPCODE_CONDTEST_FAIL:
+16 -1
arch/arm64/kernel/asm-offsets.c
··· 51 51 DEFINE(S_X5, offsetof(struct pt_regs, regs[5])); 52 52 DEFINE(S_X6, offsetof(struct pt_regs, regs[6])); 53 53 DEFINE(S_X7, offsetof(struct pt_regs, regs[7])); 54 + DEFINE(S_X8, offsetof(struct pt_regs, regs[8])); 55 + DEFINE(S_X10, offsetof(struct pt_regs, regs[10])); 56 + DEFINE(S_X12, offsetof(struct pt_regs, regs[12])); 57 + DEFINE(S_X14, offsetof(struct pt_regs, regs[14])); 58 + DEFINE(S_X16, offsetof(struct pt_regs, regs[16])); 59 + DEFINE(S_X18, offsetof(struct pt_regs, regs[18])); 60 + DEFINE(S_X20, offsetof(struct pt_regs, regs[20])); 61 + DEFINE(S_X22, offsetof(struct pt_regs, regs[22])); 62 + DEFINE(S_X24, offsetof(struct pt_regs, regs[24])); 63 + DEFINE(S_X26, offsetof(struct pt_regs, regs[26])); 64 + DEFINE(S_X28, offsetof(struct pt_regs, regs[28])); 54 65 DEFINE(S_LR, offsetof(struct pt_regs, regs[30])); 55 66 DEFINE(S_SP, offsetof(struct pt_regs, sp)); 56 67 #ifdef CONFIG_COMPAT ··· 89 78 BLANK(); 90 79 DEFINE(CLOCK_REALTIME, CLOCK_REALTIME); 91 80 DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC); 81 + DEFINE(CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC_RAW); 92 82 DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC); 93 83 DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE); 94 84 DEFINE(CLOCK_MONOTONIC_COARSE,CLOCK_MONOTONIC_COARSE); ··· 97 85 DEFINE(NSEC_PER_SEC, NSEC_PER_SEC); 98 86 BLANK(); 99 87 DEFINE(VDSO_CS_CYCLE_LAST, offsetof(struct vdso_data, cs_cycle_last)); 88 + DEFINE(VDSO_RAW_TIME_SEC, offsetof(struct vdso_data, raw_time_sec)); 89 + DEFINE(VDSO_RAW_TIME_NSEC, offsetof(struct vdso_data, raw_time_nsec)); 100 90 DEFINE(VDSO_XTIME_CLK_SEC, offsetof(struct vdso_data, xtime_clock_sec)); 101 91 DEFINE(VDSO_XTIME_CLK_NSEC, offsetof(struct vdso_data, xtime_clock_nsec)); 102 92 DEFINE(VDSO_XTIME_CRS_SEC, offsetof(struct vdso_data, xtime_coarse_sec)); ··· 106 92 DEFINE(VDSO_WTM_CLK_SEC, offsetof(struct vdso_data, wtm_clock_sec)); 107 93 DEFINE(VDSO_WTM_CLK_NSEC, offsetof(struct vdso_data, wtm_clock_nsec)); 108 94 DEFINE(VDSO_TB_SEQ_COUNT, offsetof(struct 
vdso_data, tb_seq_count)); 109 - DEFINE(VDSO_CS_MULT, offsetof(struct vdso_data, cs_mult)); 95 + DEFINE(VDSO_CS_MONO_MULT, offsetof(struct vdso_data, cs_mono_mult)); 96 + DEFINE(VDSO_CS_RAW_MULT, offsetof(struct vdso_data, cs_raw_mult)); 110 97 DEFINE(VDSO_CS_SHIFT, offsetof(struct vdso_data, cs_shift)); 111 98 DEFINE(VDSO_TZ_MINWEST, offsetof(struct vdso_data, tz_minuteswest)); 112 99 DEFINE(VDSO_TZ_DSTTIME, offsetof(struct vdso_data, tz_dsttime));
+54
arch/arm64/kernel/cpu-reset.S
···
1 + /*
2 + * CPU reset routines
3 + *
4 + * Copyright (C) 2001 Deep Blue Solutions Ltd.
5 + * Copyright (C) 2012 ARM Ltd.
6 + * Copyright (C) 2015 Huawei Futurewei Technologies.
7 + *
8 + * This program is free software; you can redistribute it and/or modify
9 + * it under the terms of the GNU General Public License version 2 as
10 + * published by the Free Software Foundation.
11 + */
12 +
13 + #include <linux/linkage.h>
14 + #include <asm/assembler.h>
15 + #include <asm/sysreg.h>
16 + #include <asm/virt.h>
17 +
18 + .text
19 + .pushsection .idmap.text, "ax"
20 +
21 + /*
22 + * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
23 + * cpu_soft_restart.
24 + *
25 + * @el2_switch: Flag to indicate a switch to EL2 is needed.
26 + * @entry: Location to jump to for soft reset.
27 + * arg0: First argument passed to @entry.
28 + * arg1: Second argument passed to @entry.
29 + * arg2: Third argument passed to @entry.
30 + *
31 + * Put the CPU into the same state as it would be if it had been reset, and
32 + * branch to what would be the reset vector. It must be executed with the
33 + * flat identity mapping.
34 + */
35 + ENTRY(__cpu_soft_restart)
36 + /* Clear sctlr_el1 flags. */
37 + mrs x12, sctlr_el1
38 + ldr x13, =SCTLR_ELx_FLAGS
39 + bic x12, x12, x13
40 + msr sctlr_el1, x12
41 + isb
42 +
43 + cbz x0, 1f // el2_switch?
44 + mov x0, #HVC_SOFT_RESTART
45 + hvc #0 // no return
46 +
47 + 1: mov x18, x1 // entry
48 + mov x0, x2 // arg0
49 + mov x1, x3 // arg1
50 + mov x2, x4 // arg2
51 + br x18
52 + ENDPROC(__cpu_soft_restart)
53 +
54 + .popsection
+34
arch/arm64/kernel/cpu-reset.h
···
1 + /*
2 + * CPU reset routines
3 + *
4 + * Copyright (C) 2015 Huawei Futurewei Technologies.
5 + *
6 + * This program is free software; you can redistribute it and/or modify
7 + * it under the terms of the GNU General Public License version 2 as
8 + * published by the Free Software Foundation.
9 + */
10 +
11 + #ifndef _ARM64_CPU_RESET_H
12 + #define _ARM64_CPU_RESET_H
13 +
14 + #include <asm/virt.h>
15 +
16 + void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
17 + unsigned long arg0, unsigned long arg1, unsigned long arg2);
18 +
19 + static inline void __noreturn cpu_soft_restart(unsigned long el2_switch,
20 + unsigned long entry, unsigned long arg0, unsigned long arg1,
21 + unsigned long arg2)
22 + {
23 + typeof(__cpu_soft_restart) *restart;
24 +
25 + el2_switch = el2_switch && !is_kernel_in_hyp_mode() &&
26 + is_hyp_mode_available();
27 + restart = (void *)virt_to_phys(__cpu_soft_restart);
28 +
29 + cpu_install_idmap();
30 + restart(el2_switch, entry, arg0, arg1, arg2);
31 + unreachable();
32 + }
33 +
34 + #endif
+7
arch/arm64/kernel/cpu_errata.c
···
46 46 .desc = "ARM errata 826319, 827319, 824069",
47 47 .capability = ARM64_WORKAROUND_CLEAN_CACHE,
48 48 MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x02),
49 + .enable = cpu_enable_cache_maint_trap,
49 50 },
50 51 #endif
51 52 #ifdef CONFIG_ARM64_ERRATUM_819472
···
55 54 .desc = "ARM errata 819472",
56 55 .capability = ARM64_WORKAROUND_CLEAN_CACHE,
57 56 MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x01),
57 + .enable = cpu_enable_cache_maint_trap,
58 58 },
59 59 #endif
60 60 #ifdef CONFIG_ARM64_ERRATUM_832075
···
134 132 void check_local_cpu_errata(void)
135 133 {
136 134 update_cpu_capabilities(arm64_errata, "enabling workaround for");
135 + }
136 +
137 + void __init enable_errata_workarounds(void)
138 + {
139 + enable_cpu_capabilities(arm64_errata);
137 140 }
+2 -2
arch/arm64/kernel/cpufeature.c
···
913 913 * Run through the enabled capabilities and enable() it on all active
914 914 * CPUs
915 915 */
916 - static void __init
917 - enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
916 + void __init enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
918 917 {
919 918 for (; caps->matches; caps++)
920 919 if (caps->enable && cpus_have_cap(caps->capability))
···
1035 1036
1036 1037 /* Set the CPU feature capabilities */
1037 1038 setup_feature_capabilities();
1039 + enable_errata_workarounds();
1038 1040 setup_elf_hwcaps(arm64_elf_hwcaps);
1039 1041
1040 1042 if (system_supports_32bit_el0())
+120
arch/arm64/kernel/cpuinfo.c
··· 183 183 .show = c_show 184 184 }; 185 185 186 + 187 + static struct kobj_type cpuregs_kobj_type = { 188 + .sysfs_ops = &kobj_sysfs_ops, 189 + }; 190 + 191 + /* 192 + * The ARM ARM uses the phrase "32-bit register" to describe a register 193 + * whose upper 32 bits are RES0 (per C5.1.1, ARM DDI 0487A.i), however 194 + * no statement is made as to whether the upper 32 bits will or will not 195 + * be made use of in future, and between ARM DDI 0487A.c and ARM DDI 196 + * 0487A.d CLIDR_EL1 was expanded from 32-bit to 64-bit. 197 + * 198 + * Thus, while both MIDR_EL1 and REVIDR_EL1 are described as 32-bit 199 + * registers, we expose them both as 64 bit values to cater for possible 200 + * future expansion without an ABI break. 201 + */ 202 + #define kobj_to_cpuinfo(kobj) container_of(kobj, struct cpuinfo_arm64, kobj) 203 + #define CPUREGS_ATTR_RO(_name, _field) \ 204 + static ssize_t _name##_show(struct kobject *kobj, \ 205 + struct kobj_attribute *attr, char *buf) \ 206 + { \ 207 + struct cpuinfo_arm64 *info = kobj_to_cpuinfo(kobj); \ 208 + \ 209 + if (info->reg_midr) \ 210 + return sprintf(buf, "0x%016x\n", info->reg_##_field); \ 211 + else \ 212 + return 0; \ 213 + } \ 214 + static struct kobj_attribute cpuregs_attr_##_name = __ATTR_RO(_name) 215 + 216 + CPUREGS_ATTR_RO(midr_el1, midr); 217 + CPUREGS_ATTR_RO(revidr_el1, revidr); 218 + 219 + static struct attribute *cpuregs_id_attrs[] = { 220 + &cpuregs_attr_midr_el1.attr, 221 + &cpuregs_attr_revidr_el1.attr, 222 + NULL 223 + }; 224 + 225 + static struct attribute_group cpuregs_attr_group = { 226 + .attrs = cpuregs_id_attrs, 227 + .name = "identification" 228 + }; 229 + 230 + static int cpuid_add_regs(int cpu) 231 + { 232 + int rc; 233 + struct device *dev; 234 + struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu); 235 + 236 + dev = get_cpu_device(cpu); 237 + if (!dev) { 238 + rc = -ENODEV; 239 + goto out; 240 + } 241 + rc = kobject_add(&info->kobj, &dev->kobj, "regs"); 242 + if (rc) 243 + goto out; 244 + rc = 
sysfs_create_group(&info->kobj, &cpuregs_attr_group); 245 + if (rc) 246 + kobject_del(&info->kobj); 247 + out: 248 + return rc; 249 + } 250 + 251 + static int cpuid_remove_regs(int cpu) 252 + { 253 + struct device *dev; 254 + struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu); 255 + 256 + dev = get_cpu_device(cpu); 257 + if (!dev) 258 + return -ENODEV; 259 + if (info->kobj.parent) { 260 + sysfs_remove_group(&info->kobj, &cpuregs_attr_group); 261 + kobject_del(&info->kobj); 262 + } 263 + 264 + return 0; 265 + } 266 + 267 + static int cpuid_callback(struct notifier_block *nb, 268 + unsigned long action, void *hcpu) 269 + { 270 + int rc = 0; 271 + unsigned long cpu = (unsigned long)hcpu; 272 + 273 + switch (action & ~CPU_TASKS_FROZEN) { 274 + case CPU_ONLINE: 275 + rc = cpuid_add_regs(cpu); 276 + break; 277 + case CPU_DEAD: 278 + rc = cpuid_remove_regs(cpu); 279 + break; 280 + } 281 + 282 + return notifier_from_errno(rc); 283 + } 284 + 285 + static int __init cpuinfo_regs_init(void) 286 + { 287 + int cpu; 288 + 289 + cpu_notifier_register_begin(); 290 + 291 + for_each_possible_cpu(cpu) { 292 + struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu); 293 + 294 + kobject_init(&info->kobj, &cpuregs_kobj_type); 295 + if (cpu_online(cpu)) 296 + cpuid_add_regs(cpu); 297 + } 298 + __hotcpu_notifier(cpuid_callback, 0); 299 + 300 + cpu_notifier_register_done(); 301 + return 0; 302 + } 186 303 static void cpuinfo_detect_icache_policy(struct cpuinfo_arm64 *info) 187 304 { 188 305 unsigned int cpu = smp_processor_id(); ··· 329 212 info->reg_ctr = read_cpuid_cachetype(); 330 213 info->reg_dczid = read_cpuid(DCZID_EL0); 331 214 info->reg_midr = read_cpuid_id(); 215 + info->reg_revidr = read_cpuid(REVIDR_EL1); 332 216 333 217 info->reg_id_aa64dfr0 = read_cpuid(ID_AA64DFR0_EL1); 334 218 info->reg_id_aa64dfr1 = read_cpuid(ID_AA64DFR1_EL1); ··· 382 264 boot_cpu_data = *info; 383 265 init_cpu_features(&boot_cpu_data); 384 266 } 267 + 268 + device_initcall(cpuinfo_regs_init);
+33 -14
arch/arm64/kernel/debug-monitors.c
··· 23 23 #include <linux/hardirq.h> 24 24 #include <linux/init.h> 25 25 #include <linux/ptrace.h> 26 + #include <linux/kprobes.h> 26 27 #include <linux/stat.h> 27 28 #include <linux/uaccess.h> 28 29 ··· 49 48 asm volatile("msr mdscr_el1, %0" :: "r" (mdscr)); 50 49 local_dbg_restore(flags); 51 50 } 51 + NOKPROBE_SYMBOL(mdscr_write); 52 52 53 53 static u32 mdscr_read(void) 54 54 { ··· 57 55 asm volatile("mrs %0, mdscr_el1" : "=r" (mdscr)); 58 56 return mdscr; 59 57 } 58 + NOKPROBE_SYMBOL(mdscr_read); 60 59 61 60 /* 62 61 * Allow root to disable self-hosted debug from userspace. ··· 106 103 mdscr_write(mdscr); 107 104 } 108 105 } 106 + NOKPROBE_SYMBOL(enable_debug_monitors); 109 107 110 108 void disable_debug_monitors(enum dbg_active_el el) 111 109 { ··· 127 123 mdscr_write(mdscr); 128 124 } 129 125 } 126 + NOKPROBE_SYMBOL(disable_debug_monitors); 130 127 131 128 /* 132 129 * OS lock clearing. ··· 156 151 /* Clear the OS lock. */ 157 152 on_each_cpu(clear_os_lock, NULL, 1); 158 153 isb(); 159 - local_dbg_enable(); 160 154 161 155 /* Register hotplug handler. 
*/ 162 156 __register_cpu_notifier(&os_lock_nb); ··· 170 166 */ 171 167 static void set_regs_spsr_ss(struct pt_regs *regs) 172 168 { 173 - unsigned long spsr; 174 - 175 - spsr = regs->pstate; 176 - spsr &= ~DBG_SPSR_SS; 177 - spsr |= DBG_SPSR_SS; 178 - regs->pstate = spsr; 169 + regs->pstate |= DBG_SPSR_SS; 179 170 } 171 + NOKPROBE_SYMBOL(set_regs_spsr_ss); 180 172 181 173 static void clear_regs_spsr_ss(struct pt_regs *regs) 182 174 { 183 - unsigned long spsr; 184 - 185 - spsr = regs->pstate; 186 - spsr &= ~DBG_SPSR_SS; 187 - regs->pstate = spsr; 175 + regs->pstate &= ~DBG_SPSR_SS; 188 176 } 177 + NOKPROBE_SYMBOL(clear_regs_spsr_ss); 189 178 190 179 /* EL1 Single Step Handler hooks */ 191 180 static LIST_HEAD(step_hook); ··· 222 225 223 226 return retval; 224 227 } 228 + NOKPROBE_SYMBOL(call_step_hook); 225 229 226 230 static void send_user_sigtrap(int si_code) 227 231 { ··· 264 266 */ 265 267 user_rewind_single_step(current); 266 268 } else { 269 + #ifdef CONFIG_KPROBES 270 + if (kprobe_single_step_handler(regs, esr) == DBG_HOOK_HANDLED) 271 + return 0; 272 + #endif 267 273 if (call_step_hook(regs, esr) == DBG_HOOK_HANDLED) 268 274 return 0; 269 275 ··· 281 279 282 280 return 0; 283 281 } 282 + NOKPROBE_SYMBOL(single_step_handler); 284 283 285 284 /* 286 285 * Breakpoint handler is re-entrant as another breakpoint can ··· 319 316 320 317 return fn ? 
fn(regs, esr) : DBG_HOOK_ERROR; 321 318 } 319 + NOKPROBE_SYMBOL(call_break_hook); 322 320 323 321 static int brk_handler(unsigned long addr, unsigned int esr, 324 322 struct pt_regs *regs) 325 323 { 326 324 if (user_mode(regs)) { 327 325 send_user_sigtrap(TRAP_BRKPT); 328 - } else if (call_break_hook(regs, esr) != DBG_HOOK_HANDLED) { 329 - pr_warning("Unexpected kernel BRK exception at EL1\n"); 326 + } 327 + #ifdef CONFIG_KPROBES 328 + else if ((esr & BRK64_ESR_MASK) == BRK64_ESR_KPROBES) { 329 + if (kprobe_breakpoint_handler(regs, esr) != DBG_HOOK_HANDLED) 330 + return -EFAULT; 331 + } 332 + #endif 333 + else if (call_break_hook(regs, esr) != DBG_HOOK_HANDLED) { 334 + pr_warn("Unexpected kernel BRK exception at EL1\n"); 330 335 return -EFAULT; 331 336 } 332 337 333 338 return 0; 334 339 } 340 + NOKPROBE_SYMBOL(brk_handler); 335 341 336 342 int aarch32_break_handler(struct pt_regs *regs) 337 343 { ··· 377 365 send_user_sigtrap(TRAP_BRKPT); 378 366 return 0; 379 367 } 368 + NOKPROBE_SYMBOL(aarch32_break_handler); 380 369 381 370 static int __init debug_traps_init(void) 382 371 { ··· 399 386 if (test_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP)) 400 387 set_regs_spsr_ss(task_pt_regs(task)); 401 388 } 389 + NOKPROBE_SYMBOL(user_rewind_single_step); 402 390 403 391 void user_fastforward_single_step(struct task_struct *task) 404 392 { ··· 415 401 mdscr_write(mdscr_read() | DBG_MDSCR_SS); 416 402 enable_debug_monitors(DBG_ACTIVE_EL1); 417 403 } 404 + NOKPROBE_SYMBOL(kernel_enable_single_step); 418 405 419 406 void kernel_disable_single_step(void) 420 407 { ··· 423 408 mdscr_write(mdscr_read() & ~DBG_MDSCR_SS); 424 409 disable_debug_monitors(DBG_ACTIVE_EL1); 425 410 } 411 + NOKPROBE_SYMBOL(kernel_disable_single_step); 426 412 427 413 int kernel_active_single_step(void) 428 414 { 429 415 WARN_ON(!irqs_disabled()); 430 416 return mdscr_read() & DBG_MDSCR_SS; 431 417 } 418 + NOKPROBE_SYMBOL(kernel_active_single_step); 432 419 433 420 /* ptrace API */ 434 421 void 
user_enable_single_step(struct task_struct *task) ··· 438 421 set_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP); 439 422 set_regs_spsr_ss(task_pt_regs(task)); 440 423 } 424 + NOKPROBE_SYMBOL(user_enable_single_step); 441 425 442 426 void user_disable_single_step(struct task_struct *task) 443 427 { 444 428 clear_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP); 445 429 } 430 + NOKPROBE_SYMBOL(user_disable_single_step);
+49 -1
arch/arm64/kernel/efi.c
··· 62 62 int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md) 63 63 { 64 64 pteval_t prot_val = create_mapping_protection(md); 65 + bool allow_block_mappings = (md->type != EFI_RUNTIME_SERVICES_CODE && 66 + md->type != EFI_RUNTIME_SERVICES_DATA); 67 + 68 + if (!PAGE_ALIGNED(md->phys_addr) || 69 + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { 70 + /* 71 + * If the end address of this region is not aligned to page 72 + * size, the mapping is rounded up, and may end up sharing a 73 + * page frame with the next UEFI memory region. If we create 74 + * a block entry now, we may need to split it again when mapping 75 + * the next region, and support for that is going to be removed 76 + * from the MMU routines. So avoid block mappings altogether in 77 + * that case. 78 + */ 79 + allow_block_mappings = false; 80 + } 65 81 66 82 create_pgd_mapping(mm, md->phys_addr, md->virt_addr, 67 83 md->num_pages << EFI_PAGE_SHIFT, 68 - __pgprot(prot_val | PTE_NG)); 84 + __pgprot(prot_val | PTE_NG), allow_block_mappings); 69 85 return 0; 86 + } 87 + 88 + static int __init set_permissions(pte_t *ptep, pgtable_t token, 89 + unsigned long addr, void *data) 90 + { 91 + efi_memory_desc_t *md = data; 92 + pte_t pte = *ptep; 93 + 94 + if (md->attribute & EFI_MEMORY_RO) 95 + pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); 96 + if (md->attribute & EFI_MEMORY_XP) 97 + pte = set_pte_bit(pte, __pgprot(PTE_PXN)); 98 + set_pte(ptep, pte); 99 + return 0; 100 + } 101 + 102 + int __init efi_set_mapping_permissions(struct mm_struct *mm, 103 + efi_memory_desc_t *md) 104 + { 105 + BUG_ON(md->type != EFI_RUNTIME_SERVICES_CODE && 106 + md->type != EFI_RUNTIME_SERVICES_DATA); 107 + 108 + /* 109 + * Calling apply_to_page_range() is only safe on regions that are 110 + * guaranteed to be mapped down to pages. 
Since we are only called 111 + * for regions that have been mapped using efi_create_mapping() above 112 + * (and this is checked by the generic Memory Attributes table parsing 113 + * routines), there is no need to check that again here. 114 + */ 115 + return apply_to_page_range(mm, md->virt_addr, 116 + md->num_pages << EFI_PAGE_SHIFT, 117 + set_permissions, md); 70 118 } 71 119 72 120 static int __init arm64_dmi_init(void)
+15 -2
arch/arm64/kernel/entry.S
··· 258 258 /* 259 259 * Exception vectors. 260 260 */ 261 + .pushsection ".entry.text", "ax" 261 262 262 263 .align 11 263 264 ENTRY(vectors) ··· 467 466 cmp x24, #ESR_ELx_EC_FP_EXC64 // FP/ASIMD exception 468 467 b.eq el0_fpsimd_exc 469 468 cmp x24, #ESR_ELx_EC_SYS64 // configurable trap 470 - b.eq el0_undef 469 + b.eq el0_sys 471 470 cmp x24, #ESR_ELx_EC_SP_ALIGN // stack alignment exception 472 471 b.eq el0_sp_pc 473 472 cmp x24, #ESR_ELx_EC_PC_ALIGN // pc alignment exception ··· 548 547 enable_dbg_and_irq 549 548 ct_user_exit 550 549 mov x0, x26 551 - orr x1, x25, #1 << 24 // use reserved ISS bit for instruction aborts 550 + mov x1, x25 552 551 mov x2, sp 553 552 bl do_mem_abort 554 553 b ret_to_user ··· 594 593 ct_user_exit 595 594 mov x0, sp 596 595 bl do_undefinstr 596 + b ret_to_user 597 + el0_sys: 598 + /* 599 + * System instructions, for trapped cache maintenance instructions 600 + */ 601 + enable_dbg_and_irq 602 + ct_user_exit 603 + mov x0, x25 604 + mov x1, sp 605 + bl do_sysinstr 597 606 b ret_to_user 598 607 el0_dbg: 599 608 /* ··· 799 788 mov x0, sp 800 789 bl do_ni_syscall 801 790 b __sys_trace_return 791 + 792 + .popsection // .entry.text 802 793 803 794 /* 804 795 * Special system call wrappers.
+8
arch/arm64/kernel/hw_breakpoint.c
··· 24 24 #include <linux/cpu_pm.h> 25 25 #include <linux/errno.h> 26 26 #include <linux/hw_breakpoint.h> 27 + #include <linux/kprobes.h> 27 28 #include <linux/perf_event.h> 28 29 #include <linux/ptrace.h> 29 30 #include <linux/smp.h> ··· 128 127 129 128 return val; 130 129 } 130 + NOKPROBE_SYMBOL(read_wb_reg); 131 131 132 132 static void write_wb_reg(int reg, int n, u64 val) 133 133 { ··· 142 140 } 143 141 isb(); 144 142 } 143 + NOKPROBE_SYMBOL(write_wb_reg); 145 144 146 145 /* 147 146 * Convert a breakpoint privilege level to the corresponding exception ··· 160 157 return -EINVAL; 161 158 } 162 159 } 160 + NOKPROBE_SYMBOL(debug_exception_level); 163 161 164 162 enum hw_breakpoint_ops { 165 163 HW_BREAKPOINT_INSTALL, ··· 579 575 write_wb_reg(reg, i, ctrl); 580 576 } 581 577 } 578 + NOKPROBE_SYMBOL(toggle_bp_registers); 582 579 583 580 /* 584 581 * Debug exception handlers. ··· 659 654 660 655 return 0; 661 656 } 657 + NOKPROBE_SYMBOL(breakpoint_handler); 662 658 663 659 static int watchpoint_handler(unsigned long addr, unsigned int esr, 664 660 struct pt_regs *regs) ··· 762 756 763 757 return 0; 764 758 } 759 + NOKPROBE_SYMBOL(watchpoint_handler); 765 760 766 761 /* 767 762 * Handle single-step exception. ··· 820 813 821 814 return !handled_exception; 822 815 } 816 + NOKPROBE_SYMBOL(reinstall_suspended_bps); 823 817 824 818 /* 825 819 * Context-switcher for restoring suspended breakpoints.
+9 -1
arch/arm64/kernel/hyp-stub.S
···
71 71 msr vbar_el2, x1
72 72 b 9f
73 73
74 + 2: cmp x0, #HVC_SOFT_RESTART
75 + b.ne 3f
76 + mov x0, x2
77 + mov x2, x4
78 + mov x4, x1
79 + mov x1, x3
80 + br x4 // no return
81 +
74 82 /* Someone called kvm_call_hyp() against the hyp-stub... */
75 - 2: mov x0, #ARM_EXCEPTION_HYP_GONE
83 + 3: mov x0, #ARM_EXCEPTION_HYP_GONE
76 84
77 85 9: eret
78 86 ENDPROC(el1_sync)
+133
arch/arm64/kernel/insn.c
··· 30 30 #include <asm/cacheflush.h> 31 31 #include <asm/debug-monitors.h> 32 32 #include <asm/fixmap.h> 33 + #include <asm/opcodes.h> 33 34 #include <asm/insn.h> 34 35 35 36 #define AARCH64_INSN_SF_BIT BIT(31) ··· 161 160 aarch64_insn_is_smc(insn) || 162 161 aarch64_insn_is_brk(insn) || 163 162 aarch64_insn_is_nop(insn); 163 + } 164 + 165 + bool __kprobes aarch64_insn_uses_literal(u32 insn) 166 + { 167 + /* ldr/ldrsw (literal), prfm */ 168 + 169 + return aarch64_insn_is_ldr_lit(insn) || 170 + aarch64_insn_is_ldrsw_lit(insn) || 171 + aarch64_insn_is_adr_adrp(insn) || 172 + aarch64_insn_is_prfm_lit(insn); 173 + } 174 + 175 + bool __kprobes aarch64_insn_is_branch(u32 insn) 176 + { 177 + /* b, bl, cb*, tb*, b.cond, br, blr */ 178 + 179 + return aarch64_insn_is_b(insn) || 180 + aarch64_insn_is_bl(insn) || 181 + aarch64_insn_is_cbz(insn) || 182 + aarch64_insn_is_cbnz(insn) || 183 + aarch64_insn_is_tbz(insn) || 184 + aarch64_insn_is_tbnz(insn) || 185 + aarch64_insn_is_ret(insn) || 186 + aarch64_insn_is_br(insn) || 187 + aarch64_insn_is_blr(insn) || 188 + aarch64_insn_is_bcond(insn); 164 189 } 165 190 166 191 /* ··· 1202 1175 BUG(); 1203 1176 } 1204 1177 1178 + /* 1179 + * Extract the Op/CR data from a msr/mrs instruction. 
1180 + */ 1181 + u32 aarch64_insn_extract_system_reg(u32 insn) 1182 + { 1183 + return (insn & 0x1FFFE0) >> 5; 1184 + } 1185 + 1205 1186 bool aarch32_insn_is_wide(u32 insn) 1206 1187 { 1207 1188 return insn >= 0xe800; ··· 1235 1200 { 1236 1201 return insn & CRM_MASK; 1237 1202 } 1203 + 1204 + static bool __kprobes __check_eq(unsigned long pstate) 1205 + { 1206 + return (pstate & PSR_Z_BIT) != 0; 1207 + } 1208 + 1209 + static bool __kprobes __check_ne(unsigned long pstate) 1210 + { 1211 + return (pstate & PSR_Z_BIT) == 0; 1212 + } 1213 + 1214 + static bool __kprobes __check_cs(unsigned long pstate) 1215 + { 1216 + return (pstate & PSR_C_BIT) != 0; 1217 + } 1218 + 1219 + static bool __kprobes __check_cc(unsigned long pstate) 1220 + { 1221 + return (pstate & PSR_C_BIT) == 0; 1222 + } 1223 + 1224 + static bool __kprobes __check_mi(unsigned long pstate) 1225 + { 1226 + return (pstate & PSR_N_BIT) != 0; 1227 + } 1228 + 1229 + static bool __kprobes __check_pl(unsigned long pstate) 1230 + { 1231 + return (pstate & PSR_N_BIT) == 0; 1232 + } 1233 + 1234 + static bool __kprobes __check_vs(unsigned long pstate) 1235 + { 1236 + return (pstate & PSR_V_BIT) != 0; 1237 + } 1238 + 1239 + static bool __kprobes __check_vc(unsigned long pstate) 1240 + { 1241 + return (pstate & PSR_V_BIT) == 0; 1242 + } 1243 + 1244 + static bool __kprobes __check_hi(unsigned long pstate) 1245 + { 1246 + pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */ 1247 + return (pstate & PSR_C_BIT) != 0; 1248 + } 1249 + 1250 + static bool __kprobes __check_ls(unsigned long pstate) 1251 + { 1252 + pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */ 1253 + return (pstate & PSR_C_BIT) == 0; 1254 + } 1255 + 1256 + static bool __kprobes __check_ge(unsigned long pstate) 1257 + { 1258 + pstate ^= (pstate << 3); /* PSR_N_BIT ^= PSR_V_BIT */ 1259 + return (pstate & PSR_N_BIT) == 0; 1260 + } 1261 + 1262 + static bool __kprobes __check_lt(unsigned long pstate) 1263 + { 1264 + pstate ^= (pstate << 3); /* PSR_N_BIT ^= 
PSR_V_BIT */ 1265 + return (pstate & PSR_N_BIT) != 0; 1266 + } 1267 + 1268 + static bool __kprobes __check_gt(unsigned long pstate) 1269 + { 1270 + /*PSR_N_BIT ^= PSR_V_BIT */ 1271 + unsigned long temp = pstate ^ (pstate << 3); 1272 + 1273 + temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */ 1274 + return (temp & PSR_N_BIT) == 0; 1275 + } 1276 + 1277 + static bool __kprobes __check_le(unsigned long pstate) 1278 + { 1279 + /*PSR_N_BIT ^= PSR_V_BIT */ 1280 + unsigned long temp = pstate ^ (pstate << 3); 1281 + 1282 + temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */ 1283 + return (temp & PSR_N_BIT) != 0; 1284 + } 1285 + 1286 + static bool __kprobes __check_al(unsigned long pstate) 1287 + { 1288 + return true; 1289 + } 1290 + 1291 + /* 1292 + * Note that the ARMv8 ARM calls condition code 0b1111 "nv", but states that 1293 + * it behaves identically to 0b1110 ("al"). 1294 + */ 1295 + pstate_check_t * const aarch32_opcode_cond_checks[16] = { 1296 + __check_eq, __check_ne, __check_cs, __check_cc, 1297 + __check_mi, __check_pl, __check_vs, __check_vc, 1298 + __check_hi, __check_ls, __check_ge, __check_lt, 1299 + __check_gt, __check_le, __check_al, __check_al 1300 + };
+4
arch/arm64/kernel/kgdb.c
···
22 22 #include <linux/irq.h>
23 23 #include <linux/kdebug.h>
24 24 #include <linux/kgdb.h>
25 + #include <linux/kprobes.h>
25 26 #include <asm/traps.h>
26 27
27 28 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
···
231 230 kgdb_handle_exception(1, SIGTRAP, 0, regs);
232 231 return 0;
233 232 }
233 + NOKPROBE_SYMBOL(kgdb_brk_fn);
234 234
235 235 static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int esr)
236 236 {
···
240 238
241 239 return 0;
242 240 }
241 + NOKPROBE_SYMBOL(kgdb_compiled_brk_fn);
243 242
244 243 static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned int esr)
245 244 {
246 245 kgdb_handle_exception(1, SIGTRAP, 0, regs);
247 246 return 0;
248 247 }
248 + NOKPROBE_SYMBOL(kgdb_step_brk_fn);
249 249
250 250 static struct break_hook kgdb_brkpt_hook = {
251 251 .esr_mask = 0xffffffff,
+212
arch/arm64/kernel/machine_kexec.c
··· 1 + /* 2 + * kexec for arm64 3 + * 4 + * Copyright (C) Linaro. 5 + * Copyright (C) Huawei Futurewei Technologies. 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License version 2 as 9 + * published by the Free Software Foundation. 10 + */ 11 + 12 + #include <linux/kexec.h> 13 + #include <linux/smp.h> 14 + 15 + #include <asm/cacheflush.h> 16 + #include <asm/cpu_ops.h> 17 + #include <asm/mmu_context.h> 18 + 19 + #include "cpu-reset.h" 20 + 21 + /* Global variables for the arm64_relocate_new_kernel routine. */ 22 + extern const unsigned char arm64_relocate_new_kernel[]; 23 + extern const unsigned long arm64_relocate_new_kernel_size; 24 + 25 + static unsigned long kimage_start; 26 + 27 + /** 28 + * kexec_image_info - For debugging output. 29 + */ 30 + #define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i) 31 + static void _kexec_image_info(const char *func, int line, 32 + const struct kimage *kimage) 33 + { 34 + unsigned long i; 35 + 36 + pr_debug("%s:%d:\n", func, line); 37 + pr_debug(" kexec kimage info:\n"); 38 + pr_debug(" type: %d\n", kimage->type); 39 + pr_debug(" start: %lx\n", kimage->start); 40 + pr_debug(" head: %lx\n", kimage->head); 41 + pr_debug(" nr_segments: %lu\n", kimage->nr_segments); 42 + 43 + for (i = 0; i < kimage->nr_segments; i++) { 44 + pr_debug(" segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n", 45 + i, 46 + kimage->segment[i].mem, 47 + kimage->segment[i].mem + kimage->segment[i].memsz, 48 + kimage->segment[i].memsz, 49 + kimage->segment[i].memsz / PAGE_SIZE); 50 + } 51 + } 52 + 53 + void machine_kexec_cleanup(struct kimage *kimage) 54 + { 55 + /* Empty routine needed to avoid build errors. */ 56 + } 57 + 58 + /** 59 + * machine_kexec_prepare - Prepare for a kexec reboot. 60 + * 61 + * Called from the core kexec code when a kernel image is loaded. 
62 + * Forbid loading a kexec kernel if we have no way of hotplugging cpus or cpus 63 + * are stuck in the kernel. This avoids a panic once we hit machine_kexec(). 64 + */ 65 + int machine_kexec_prepare(struct kimage *kimage) 66 + { 67 + kimage_start = kimage->start; 68 + 69 + kexec_image_info(kimage); 70 + 71 + if (kimage->type != KEXEC_TYPE_CRASH && cpus_are_stuck_in_kernel()) { 72 + pr_err("Can't kexec: CPUs are stuck in the kernel.\n"); 73 + return -EBUSY; 74 + } 75 + 76 + return 0; 77 + } 78 + 79 + /** 80 + * kexec_list_flush - Helper to flush the kimage list and source pages to PoC. 81 + */ 82 + static void kexec_list_flush(struct kimage *kimage) 83 + { 84 + kimage_entry_t *entry; 85 + 86 + for (entry = &kimage->head; ; entry++) { 87 + unsigned int flag; 88 + void *addr; 89 + 90 + /* flush the list entries. */ 91 + __flush_dcache_area(entry, sizeof(kimage_entry_t)); 92 + 93 + flag = *entry & IND_FLAGS; 94 + if (flag == IND_DONE) 95 + break; 96 + 97 + addr = phys_to_virt(*entry & PAGE_MASK); 98 + 99 + switch (flag) { 100 + case IND_INDIRECTION: 101 + /* Set entry point just before the new list page. */ 102 + entry = (kimage_entry_t *)addr - 1; 103 + break; 104 + case IND_SOURCE: 105 + /* flush the source pages. */ 106 + __flush_dcache_area(addr, PAGE_SIZE); 107 + break; 108 + case IND_DESTINATION: 109 + break; 110 + default: 111 + BUG(); 112 + } 113 + } 114 + } 115 + 116 + /** 117 + * kexec_segment_flush - Helper to flush the kimage segments to PoC. 
118 + */ 119 + static void kexec_segment_flush(const struct kimage *kimage) 120 + { 121 + unsigned long i; 122 + 123 + pr_debug("%s:\n", __func__); 124 + 125 + for (i = 0; i < kimage->nr_segments; i++) { 126 + pr_debug(" segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n", 127 + i, 128 + kimage->segment[i].mem, 129 + kimage->segment[i].mem + kimage->segment[i].memsz, 130 + kimage->segment[i].memsz, 131 + kimage->segment[i].memsz / PAGE_SIZE); 132 + 133 + __flush_dcache_area(phys_to_virt(kimage->segment[i].mem), 134 + kimage->segment[i].memsz); 135 + } 136 + } 137 + 138 + /** 139 + * machine_kexec - Do the kexec reboot. 140 + * 141 + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC. 142 + */ 143 + void machine_kexec(struct kimage *kimage) 144 + { 145 + phys_addr_t reboot_code_buffer_phys; 146 + void *reboot_code_buffer; 147 + 148 + /* 149 + * New cpus may have become stuck_in_kernel after we loaded the image. 150 + */ 151 + BUG_ON(cpus_are_stuck_in_kernel() || (num_online_cpus() > 1)); 152 + 153 + reboot_code_buffer_phys = page_to_phys(kimage->control_code_page); 154 + reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys); 155 + 156 + kexec_image_info(kimage); 157 + 158 + pr_debug("%s:%d: control_code_page: %p\n", __func__, __LINE__, 159 + kimage->control_code_page); 160 + pr_debug("%s:%d: reboot_code_buffer_phys: %pa\n", __func__, __LINE__, 161 + &reboot_code_buffer_phys); 162 + pr_debug("%s:%d: reboot_code_buffer: %p\n", __func__, __LINE__, 163 + reboot_code_buffer); 164 + pr_debug("%s:%d: relocate_new_kernel: %p\n", __func__, __LINE__, 165 + arm64_relocate_new_kernel); 166 + pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n", 167 + __func__, __LINE__, arm64_relocate_new_kernel_size, 168 + arm64_relocate_new_kernel_size); 169 + 170 + /* 171 + * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use 172 + * after the kernel is shut down. 
173 + */ 174 + memcpy(reboot_code_buffer, arm64_relocate_new_kernel, 175 + arm64_relocate_new_kernel_size); 176 + 177 + /* Flush the reboot_code_buffer in preparation for its execution. */ 178 + __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size); 179 + flush_icache_range((uintptr_t)reboot_code_buffer, 180 + arm64_relocate_new_kernel_size); 181 + 182 + /* Flush the kimage list and its buffers. */ 183 + kexec_list_flush(kimage); 184 + 185 + /* Flush the new image if already in place. */ 186 + if (kimage->head & IND_DONE) 187 + kexec_segment_flush(kimage); 188 + 189 + pr_info("Bye!\n"); 190 + 191 + /* Disable all DAIF exceptions. */ 192 + asm volatile ("msr daifset, #0xf" : : : "memory"); 193 + 194 + /* 195 + * cpu_soft_restart will shutdown the MMU, disable data caches, then 196 + * transfer control to the reboot_code_buffer which contains a copy of 197 + * the arm64_relocate_new_kernel routine. arm64_relocate_new_kernel 198 + * uses physical addressing to relocate the new image to its final 199 + * position and transfers control to the image entry point when the 200 + * relocation is complete. 201 + */ 202 + 203 + cpu_soft_restart(1, reboot_code_buffer_phys, kimage->head, 204 + kimage_start, 0); 205 + 206 + BUG(); /* Should never get here. */ 207 + } 208 + 209 + void machine_crash_shutdown(struct pt_regs *regs) 210 + { 211 + /* Empty routine needed to avoid build errors. */ 212 + }
arch/arm64/kernel/probes/Makefile (+3)
obj-$(CONFIG_KPROBES)		+= kprobes.o decode-insn.o	\
				   kprobes_trampoline.o		\
				   simulate-insn.o
arch/arm64/kernel/probes/decode-insn.c (+174)
/*
 * arch/arm64/kernel/probes/decode-insn.c
 *
 * Copyright (C) 2013 Linaro Limited.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 */

#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/module.h>
#include <asm/kprobes.h>
#include <asm/insn.h>
#include <asm/sections.h>

#include "decode-insn.h"
#include "simulate-insn.h"

static bool __kprobes aarch64_insn_is_steppable(u32 insn)
{
	/*
	 * Branch instructions will write a new value into the PC which is
	 * likely to be relative to the XOL address and therefore invalid.
	 * Deliberate generation of an exception during stepping is also not
	 * currently safe.  Lastly, MSR instructions can do any number of nasty
	 * things we can't handle during single-stepping.
	 */
	if (aarch64_get_insn_class(insn) == AARCH64_INSN_CLS_BR_SYS) {
		if (aarch64_insn_is_branch(insn) ||
		    aarch64_insn_is_msr_imm(insn) ||
		    aarch64_insn_is_msr_reg(insn) ||
		    aarch64_insn_is_exception(insn) ||
		    aarch64_insn_is_eret(insn))
			return false;

		/*
		 * The MRS instruction may not return a correct value when
		 * executing in the single-stepping environment.  We do make
		 * one exception, for reading the DAIF bits.
		 */
		if (aarch64_insn_is_mrs(insn))
			return aarch64_insn_extract_system_reg(insn)
			     != AARCH64_INSN_SPCLREG_DAIF;

		/*
		 * The HINT instruction is problematic when single-stepping,
		 * except for the NOP case.
		 */
		if (aarch64_insn_is_hint(insn))
			return aarch64_insn_is_nop(insn);

		return true;
	}

	/*
	 * Instructions which load PC relative literals are not going to work
	 * when executed from an XOL slot.  Instructions doing an exclusive
	 * load/store are not going to complete successfully when single-step
	 * exception handling happens in the middle of the sequence.
	 */
	if (aarch64_insn_uses_literal(insn) ||
	    aarch64_insn_is_exclusive(insn))
		return false;

	return true;
}

/* Return:
 *   INSN_REJECTED     If instruction is one not allowed to kprobe,
 *   INSN_GOOD         If instruction is supported and uses instruction slot,
 *   INSN_GOOD_NO_SLOT If instruction is supported but doesn't use its slot.
 */
static enum kprobe_insn __kprobes
arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
{
	/*
	 * Instructions reading or modifying the PC won't work from the XOL
	 * slot.
	 */
	if (aarch64_insn_is_steppable(insn))
		return INSN_GOOD;

	if (aarch64_insn_is_bcond(insn)) {
		asi->handler = simulate_b_cond;
	} else if (aarch64_insn_is_cbz(insn) ||
	    aarch64_insn_is_cbnz(insn)) {
		asi->handler = simulate_cbz_cbnz;
	} else if (aarch64_insn_is_tbz(insn) ||
	    aarch64_insn_is_tbnz(insn)) {
		asi->handler = simulate_tbz_tbnz;
	} else if (aarch64_insn_is_adr_adrp(insn)) {
		asi->handler = simulate_adr_adrp;
	} else if (aarch64_insn_is_b(insn) ||
	    aarch64_insn_is_bl(insn)) {
		asi->handler = simulate_b_bl;
	} else if (aarch64_insn_is_br(insn) ||
	    aarch64_insn_is_blr(insn) ||
	    aarch64_insn_is_ret(insn)) {
		asi->handler = simulate_br_blr_ret;
	} else if (aarch64_insn_is_ldr_lit(insn)) {
		asi->handler = simulate_ldr_literal;
	} else if (aarch64_insn_is_ldrsw_lit(insn)) {
		asi->handler = simulate_ldrsw_literal;
	} else {
		/*
		 * Instruction cannot be stepped out-of-line and we don't
		 * (yet) simulate it.
		 */
		return INSN_REJECTED;
	}

	return INSN_GOOD_NO_SLOT;
}

static bool __kprobes
is_probed_address_atomic(kprobe_opcode_t *scan_start, kprobe_opcode_t *scan_end)
{
	while (scan_start > scan_end) {
		/*
		 * An atomic region starts with an exclusive load and ends
		 * with an exclusive store.
		 */
		if (aarch64_insn_is_store_ex(le32_to_cpu(*scan_start)))
			return false;
		else if (aarch64_insn_is_load_ex(le32_to_cpu(*scan_start)))
			return true;
		scan_start--;
	}

	return false;
}

enum kprobe_insn __kprobes
arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
{
	enum kprobe_insn decoded;
	kprobe_opcode_t insn = le32_to_cpu(*addr);
	kprobe_opcode_t *scan_start = addr - 1;
	kprobe_opcode_t *scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;
#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
	struct module *mod;
#endif

	if (addr >= (kprobe_opcode_t *)_text &&
	    scan_end < (kprobe_opcode_t *)_text)
		scan_end = (kprobe_opcode_t *)_text;
#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
	else {
		preempt_disable();
		mod = __module_address((unsigned long)addr);
		if (mod && within_module_init((unsigned long)addr, mod) &&
			!within_module_init((unsigned long)scan_end, mod))
			scan_end = (kprobe_opcode_t *)mod->init_layout.base;
		else if (mod && within_module_core((unsigned long)addr, mod) &&
			!within_module_core((unsigned long)scan_end, mod))
			scan_end = (kprobe_opcode_t *)mod->core_layout.base;
		preempt_enable();
	}
#endif
	decoded = arm_probe_decode_insn(insn, asi);

	if (decoded == INSN_REJECTED ||
	    is_probed_address_atomic(scan_start, scan_end))
		return INSN_REJECTED;

	return decoded;
}
arch/arm64/kernel/probes/decode-insn.h (+35)
/*
 * arch/arm64/kernel/probes/decode-insn.h
 *
 * Copyright (C) 2013 Linaro Limited.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 */

#ifndef _ARM_KERNEL_KPROBES_ARM64_H
#define _ARM_KERNEL_KPROBES_ARM64_H

/*
 * ARM strongly recommends a limit of 128 bytes between LoadExcl and
 * StoreExcl instructions in a single thread of execution.  So keep the
 * max atomic context size as 32.
 */
#define MAX_ATOMIC_CONTEXT_SIZE	(128 / sizeof(kprobe_opcode_t))

enum kprobe_insn {
	INSN_REJECTED,
	INSN_GOOD_NO_SLOT,
	INSN_GOOD,
};

enum kprobe_insn __kprobes
arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi);

#endif /* _ARM_KERNEL_KPROBES_ARM64_H */
arch/arm64/kernel/probes/kprobes.c (+686)
/*
 * arch/arm64/kernel/probes/kprobes.c
 *
 * Kprobes support for ARM64
 *
 * Copyright (C) 2013 Linaro Limited.
 * Author: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 *
 */
#include <linux/kasan.h>
#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/stop_machine.h>
#include <linux/stringify.h>
#include <asm/traps.h>
#include <asm/ptrace.h>
#include <asm/cacheflush.h>
#include <asm/debug-monitors.h>
#include <asm/system_misc.h>
#include <asm/insn.h>
#include <asm/uaccess.h>
#include <asm/irq.h>
#include <asm-generic/sections.h>

#include "decode-insn.h"

DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);

static void __kprobes
post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);

static inline unsigned long min_stack_size(unsigned long addr)
{
	unsigned long size;

	if (on_irq_stack(addr, raw_smp_processor_id()))
		size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
	else
		size = (unsigned long)current_thread_info() + THREAD_START_SP - addr;

	return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
}

static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
{
	/* prepare insn slot */
	p->ainsn.insn[0] = cpu_to_le32(p->opcode);

	flush_icache_range((uintptr_t) (p->ainsn.insn),
			   (uintptr_t) (p->ainsn.insn) +
			   MAX_INSN_SIZE * sizeof(kprobe_opcode_t));

	/*
	 * Needs restoring of return address after stepping xol.
	 */
	p->ainsn.restore = (unsigned long) p->addr +
		sizeof(kprobe_opcode_t);
}

static void __kprobes arch_prepare_simulate(struct kprobe *p)
{
	/* This instruction is not executed xol.  No need to adjust the PC. */
	p->ainsn.restore = 0;
}

static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

	if (p->ainsn.handler)
		p->ainsn.handler((u32)p->opcode, (long)p->addr, regs);

	/* single step simulated, now go for post processing */
	post_kprobe_handler(kcb, regs);
}

int __kprobes arch_prepare_kprobe(struct kprobe *p)
{
	unsigned long probe_addr = (unsigned long)p->addr;
	extern char __start_rodata[];
	extern char __end_rodata[];

	if (probe_addr & 0x3)
		return -EINVAL;

	/* copy instruction */
	p->opcode = le32_to_cpu(*p->addr);

	if (in_exception_text(probe_addr))
		return -EINVAL;
	if (probe_addr >= (unsigned long) __start_rodata &&
	    probe_addr <= (unsigned long) __end_rodata)
		return -EINVAL;

	/* decode instruction */
	switch (arm_kprobe_decode_insn(p->addr, &p->ainsn)) {
	case INSN_REJECTED:	/* insn not supported */
		return -EINVAL;

	case INSN_GOOD_NO_SLOT:	/* insn needs simulation */
		p->ainsn.insn = NULL;
		break;

	case INSN_GOOD:	/* instruction uses slot */
		p->ainsn.insn = get_insn_slot();
		if (!p->ainsn.insn)
			return -ENOMEM;
		break;
	}

	/* prepare the instruction */
	if (p->ainsn.insn)
		arch_prepare_ss_slot(p);
	else
		arch_prepare_simulate(p);

	return 0;
}

static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opcode)
{
	void *addrs[1];
	u32 insns[1];

	addrs[0] = (void *)addr;
	insns[0] = (u32)opcode;

	return aarch64_insn_patch_text(addrs, insns, 1);
}

/* arm kprobe: install breakpoint in text */
void __kprobes arch_arm_kprobe(struct kprobe *p)
{
	patch_text(p->addr, BRK64_OPCODE_KPROBES);
}

/* disarm kprobe: remove breakpoint from text */
void __kprobes arch_disarm_kprobe(struct kprobe *p)
{
	patch_text(p->addr, p->opcode);
}

void __kprobes arch_remove_kprobe(struct kprobe *p)
{
	if (p->ainsn.insn) {
		free_insn_slot(p->ainsn.insn, 0);
		p->ainsn.insn = NULL;
	}
}

static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
{
	kcb->prev_kprobe.kp = kprobe_running();
	kcb->prev_kprobe.status = kcb->kprobe_status;
}

static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
{
	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
	kcb->kprobe_status = kcb->prev_kprobe.status;
}

static void __kprobes set_current_kprobe(struct kprobe *p)
{
	__this_cpu_write(current_kprobe, p);
}

/*
 * The D-flag (Debug mask) is set (masked) upon debug exception entry.
 * Kprobes needs to clear (unmask) the D-flag ONLY in case of a recursive
 * probe, i.e. when a probe is hit from kprobe handler context upon
 * executing the pre/post handlers.  In this case we return with the
 * D-flag clear so that single-stepping can be carried out.
 *
 * Leave the D-flag set in all other cases.
 */
static void __kprobes
spsr_set_debug_flag(struct pt_regs *regs, int mask)
{
	unsigned long spsr = regs->pstate;

	if (mask)
		spsr |= PSR_D_BIT;
	else
		spsr &= ~PSR_D_BIT;

	regs->pstate = spsr;
}

/*
 * Interrupts need to be disabled before single-step mode is set, and not
 * reenabled until after single-step mode ends.
 * Without disabling interrupts on the local CPU, an interrupt could fire
 * between the exception return and the start of the out-of-line single
 * step, and we would wrongly single-step into the interrupt handler.
 */
static void __kprobes kprobes_save_local_irqflag(struct kprobe_ctlblk *kcb,
						struct pt_regs *regs)
{
	kcb->saved_irqflag = regs->pstate;
	regs->pstate |= PSR_I_BIT;
}

static void __kprobes kprobes_restore_local_irqflag(struct kprobe_ctlblk *kcb,
						struct pt_regs *regs)
{
	if (kcb->saved_irqflag & PSR_I_BIT)
		regs->pstate |= PSR_I_BIT;
	else
		regs->pstate &= ~PSR_I_BIT;
}

static void __kprobes
set_ss_context(struct kprobe_ctlblk *kcb, unsigned long addr)
{
	kcb->ss_ctx.ss_pending = true;
	kcb->ss_ctx.match_addr = addr + sizeof(kprobe_opcode_t);
}

static void __kprobes clear_ss_context(struct kprobe_ctlblk *kcb)
{
	kcb->ss_ctx.ss_pending = false;
	kcb->ss_ctx.match_addr = 0;
}

static void __kprobes setup_singlestep(struct kprobe *p,
				       struct pt_regs *regs,
				       struct kprobe_ctlblk *kcb, int reenter)
{
	unsigned long slot;

	if (reenter) {
		save_previous_kprobe(kcb);
		set_current_kprobe(p);
		kcb->kprobe_status = KPROBE_REENTER;
	} else {
		kcb->kprobe_status = KPROBE_HIT_SS;
	}

	if (p->ainsn.insn) {
		/* prepare for single stepping */
		slot = (unsigned long)p->ainsn.insn;

		set_ss_context(kcb, slot);	/* mark pending ss */

		if (kcb->kprobe_status == KPROBE_REENTER)
			spsr_set_debug_flag(regs, 0);
		else
			WARN_ON(regs->pstate & PSR_D_BIT);

		/* IRQs and single stepping do not mix well. */
		kprobes_save_local_irqflag(kcb, regs);
		kernel_enable_single_step(regs);
		instruction_pointer_set(regs, slot);
	} else {
		/* insn simulation */
		arch_simulate_insn(p, regs);
	}
}

static int __kprobes reenter_kprobe(struct kprobe *p,
				    struct pt_regs *regs,
				    struct kprobe_ctlblk *kcb)
{
	switch (kcb->kprobe_status) {
	case KPROBE_HIT_SSDONE:
	case KPROBE_HIT_ACTIVE:
		kprobes_inc_nmissed_count(p);
		setup_singlestep(p, regs, kcb, 1);
		break;
	case KPROBE_HIT_SS:
	case KPROBE_REENTER:
		pr_warn("Unrecoverable kprobe detected at %p.\n", p->addr);
		dump_kprobe(p);
		BUG();
		break;
	default:
		WARN_ON(1);
		return 0;
	}

	return 1;
}

static void __kprobes
post_kprobe_handler(struct kprobe_ctlblk *kcb, struct pt_regs *regs)
{
	struct kprobe *cur = kprobe_running();

	if (!cur)
		return;

	/* return addr restore if non-branching insn */
	if (cur->ainsn.restore != 0)
		instruction_pointer_set(regs, cur->ainsn.restore);

	/* restore back original saved kprobe variables and continue */
	if (kcb->kprobe_status == KPROBE_REENTER) {
		restore_previous_kprobe(kcb);
		return;
	}
	/* call post handler */
	kcb->kprobe_status = KPROBE_HIT_SSDONE;
	if (cur->post_handler) {
		/* post_handler can hit a breakpoint and single step
		 * again, so we enable the D-flag for the recursive
		 * exception.
		 */
		cur->post_handler(cur, regs, 0);
	}

	reset_current_kprobe();
}

int __kprobes kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr)
{
	struct kprobe *cur = kprobe_running();
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

	switch (kcb->kprobe_status) {
	case KPROBE_HIT_SS:
	case KPROBE_REENTER:
		/*
		 * We are here because the instruction being single
		 * stepped caused a page fault.  We reset the current
		 * kprobe so that the ip points back to the probe address,
		 * and allow the page fault handler to continue as a
		 * normal page fault.
		 */
		instruction_pointer_set(regs, (unsigned long) cur->addr);
		if (!instruction_pointer(regs))
			BUG();

		kernel_disable_single_step();
		if (kcb->kprobe_status == KPROBE_REENTER)
			spsr_set_debug_flag(regs, 1);

		if (kcb->kprobe_status == KPROBE_REENTER)
			restore_previous_kprobe(kcb);
		else
			reset_current_kprobe();

		break;
	case KPROBE_HIT_ACTIVE:
	case KPROBE_HIT_SSDONE:
		/*
		 * We increment the nmissed count for accounting;
		 * we can also use the npre/npostfault counts for
		 * accounting these specific fault cases.
		 */
		kprobes_inc_nmissed_count(cur);

		/*
		 * We come here because instructions in the pre/post
		 * handler caused the page fault.  This could happen
		 * if the handler tries to access user space, e.g. via
		 * copy_from_user() or get_user().  Let the
		 * user-specified handler try to fix it first.
		 */
		if (cur->fault_handler && cur->fault_handler(cur, regs, fsr))
			return 1;

		/*
		 * In case the user-specified fault handler returned
		 * zero, try to fix up.
		 */
		if (fixup_exception(regs))
			return 1;
	}
	return 0;
}

int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
				       unsigned long val, void *data)
{
	return NOTIFY_DONE;
}

static void __kprobes kprobe_handler(struct pt_regs *regs)
{
	struct kprobe *p, *cur_kprobe;
	struct kprobe_ctlblk *kcb;
	unsigned long addr = instruction_pointer(regs);

	kcb = get_kprobe_ctlblk();
	cur_kprobe = kprobe_running();

	p = get_kprobe((kprobe_opcode_t *) addr);

	if (p) {
		if (cur_kprobe) {
			if (reenter_kprobe(p, regs, kcb))
				return;
		} else {
			/* Probe hit */
			set_current_kprobe(p);
			kcb->kprobe_status = KPROBE_HIT_ACTIVE;

			/*
			 * If we have no pre-handler or it returned 0, we
			 * continue with normal processing.  If we have a
			 * pre-handler and it returned non-zero, it prepped
			 * for calling the break_handler below on re-entry,
			 * so get out doing nothing more here.
			 *
			 * pre_handler can hit a breakpoint and can step
			 * through before returning; keep the PSTATE D-flag
			 * enabled until pre_handler returns.
			 */
			if (!p->pre_handler || !p->pre_handler(p, regs)) {
				setup_singlestep(p, regs, kcb, 0);
				return;
			}
		}
	} else if ((le32_to_cpu(*(kprobe_opcode_t *) addr) ==
	    BRK64_OPCODE_KPROBES) && cur_kprobe) {
		/* We probably hit a jprobe.  Call its break handler. */
		if (cur_kprobe->break_handler &&
		     cur_kprobe->break_handler(cur_kprobe, regs)) {
			setup_singlestep(cur_kprobe, regs, kcb, 0);
			return;
		}
	}
	/*
	 * The breakpoint instruction was removed right
	 * after we hit it.  Another cpu has removed
	 * either a probepoint or a debugger breakpoint
	 * at this address.  In either case, no further
	 * handling of this interrupt is appropriate.
	 * Return back to original instruction, and continue.
	 */
}

static int __kprobes
kprobe_ss_hit(struct kprobe_ctlblk *kcb, unsigned long addr)
{
	if ((kcb->ss_ctx.ss_pending)
	    && (kcb->ss_ctx.match_addr == addr)) {
		clear_ss_context(kcb);	/* clear pending ss */
		return DBG_HOOK_HANDLED;
	}
	/* not ours, kprobes should ignore it */
	return DBG_HOOK_ERROR;
}

int __kprobes
kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	int retval;

	/* return error if this is not our step */
	retval = kprobe_ss_hit(kcb, instruction_pointer(regs));

	if (retval == DBG_HOOK_HANDLED) {
		kprobes_restore_local_irqflag(kcb, regs);
		kernel_disable_single_step();

		if (kcb->kprobe_status == KPROBE_REENTER)
			spsr_set_debug_flag(regs, 1);

		post_kprobe_handler(kcb, regs);
	}

	return retval;
}

int __kprobes
kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr)
{
	kprobe_handler(regs);
	return DBG_HOOK_HANDLED;
}

int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
	struct jprobe *jp = container_of(p, struct jprobe, kp);
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	long stack_ptr = kernel_stack_pointer(regs);

	kcb->jprobe_saved_regs = *regs;
	/*
	 * As Linus pointed out, gcc assumes that the callee
	 * owns the argument space and could overwrite it, e.g.
	 * tailcall optimization.  So, to be absolutely safe
	 * we also save and restore enough stack bytes to cover
	 * the argument area.
	 */
	kasan_disable_current();
	memcpy(kcb->jprobes_stack, (void *)stack_ptr,
	       min_stack_size(stack_ptr));
	kasan_enable_current();

	instruction_pointer_set(regs, (unsigned long) jp->entry);
	preempt_disable();
	pause_graph_tracing();
	return 1;
}

void __kprobes jprobe_return(void)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();

	/*
	 * Jprobe handlers return by entering a break exception,
	 * encoded the same as for kprobes, but with the following
	 * conditions:
	 * - a special PC to identify it from the other kprobes,
	 * - the stack address restored to the original saved pt_regs.
	 */
	asm volatile("				mov sp, %0	\n"
		     "jprobe_return_break:	brk %1		\n"
		     :
		     : "r" (kcb->jprobe_saved_regs.sp),
		       "I" (BRK64_ESR_KPROBES)
		     : "memory");

	unreachable();
}

int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	long stack_addr = kcb->jprobe_saved_regs.sp;
	long orig_sp = kernel_stack_pointer(regs);
	struct jprobe *jp = container_of(p, struct jprobe, kp);
	extern const char jprobe_return_break[];

	if (instruction_pointer(regs) != (u64) jprobe_return_break)
		return 0;

	if (orig_sp != stack_addr) {
		struct pt_regs *saved_regs =
			(struct pt_regs *)kcb->jprobe_saved_regs.sp;
		pr_err("current sp %lx does not match saved sp %lx\n",
		       orig_sp, stack_addr);
		pr_err("Saved registers for jprobe %p\n", jp);
		show_regs(saved_regs);
		pr_err("Current registers\n");
		show_regs(regs);
		BUG();
	}
	unpause_graph_tracing();
	*regs = kcb->jprobe_saved_regs;
	kasan_disable_current();
	memcpy((void *)stack_addr, kcb->jprobes_stack,
	       min_stack_size(stack_addr));
	kasan_enable_current();
	preempt_enable_no_resched();
	return 1;
}

bool arch_within_kprobe_blacklist(unsigned long addr)
{
	extern char __idmap_text_start[], __idmap_text_end[];
	extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];

	if ((addr >= (unsigned long)__kprobes_text_start &&
	    addr < (unsigned long)__kprobes_text_end) ||
	    (addr >= (unsigned long)__entry_text_start &&
	    addr < (unsigned long)__entry_text_end) ||
	    (addr >= (unsigned long)__idmap_text_start &&
	    addr < (unsigned long)__idmap_text_end) ||
	    !!search_exception_tables(addr))
		return true;

	if (!is_kernel_in_hyp_mode()) {
		if ((addr >= (unsigned long)__hyp_text_start &&
		    addr < (unsigned long)__hyp_text_end) ||
		    (addr >= (unsigned long)__hyp_idmap_text_start &&
		    addr < (unsigned long)__hyp_idmap_text_end))
			return true;
	}

	return false;
}

void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
{
	struct kretprobe_instance *ri = NULL;
	struct hlist_head *head, empty_rp;
	struct hlist_node *tmp;
	unsigned long flags, orig_ret_address = 0;
	unsigned long trampoline_address =
		(unsigned long)&kretprobe_trampoline;
	kprobe_opcode_t *correct_ret_addr = NULL;

	INIT_HLIST_HEAD(&empty_rp);
	kretprobe_hash_lock(current, &head, &flags);

	/*
	 * It is possible to have multiple instances associated with a given
	 * task either because multiple functions in the call path have
	 * return probes installed on them, and/or more than one
	 * return probe was registered for a target function.
	 *
	 * We can handle this because:
	 *     - instances are always pushed into the head of the list
	 *     - when multiple return probes are registered for the same
	 *	 function, the (chronologically) first instance's ret_addr
	 *	 will be the real return address, and all the rest will
	 *	 point to kretprobe_trampoline.
	 */
	hlist_for_each_entry_safe(ri, tmp, head, hlist) {
		if (ri->task != current)
			/* another task is sharing our hash bucket */
			continue;

		orig_ret_address = (unsigned long)ri->ret_addr;

		if (orig_ret_address != trampoline_address)
			/*
			 * This is the real return address.  Any other
			 * instances associated with this task are for
			 * other calls deeper on the call stack
			 */
			break;
	}

	kretprobe_assert(ri, orig_ret_address, trampoline_address);

	correct_ret_addr = ri->ret_addr;
	hlist_for_each_entry_safe(ri, tmp, head, hlist) {
		if (ri->task != current)
			/* another task is sharing our hash bucket */
			continue;

		orig_ret_address = (unsigned long)ri->ret_addr;
		if (ri->rp && ri->rp->handler) {
			__this_cpu_write(current_kprobe, &ri->rp->kp);
			get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
			ri->ret_addr = correct_ret_addr;
			ri->rp->handler(ri, regs);
			__this_cpu_write(current_kprobe, NULL);
		}

		recycle_rp_inst(ri, &empty_rp);

		if (orig_ret_address != trampoline_address)
			/*
			 * This is the real return address.  Any other
			 * instances associated with this task are for
			 * other calls deeper on the call stack
			 */
			break;
	}

	kretprobe_hash_unlock(current, &flags);

	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
		hlist_del(&ri->hlist);
		kfree(ri);
	}
	return (void *)orig_ret_address;
}

void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
				      struct pt_regs *regs)
{
	ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];

	/* replace return addr (x30) with trampoline */
	regs->regs[30] = (long)&kretprobe_trampoline;
}

int __kprobes arch_trampoline_kprobe(struct kprobe *p)
{
	return 0;
}

int __init arch_init_kprobes(void)
{
	return 0;
}
arch/arm64/kernel/probes/kprobes_trampoline.S (+81)
/*
 * trampoline entry and return code for kretprobes.
 */

#include <linux/linkage.h>
#include <asm/asm-offsets.h>
#include <asm/assembler.h>

	.text

	.macro	save_all_base_regs
	stp x0, x1, [sp, #S_X0]
	stp x2, x3, [sp, #S_X2]
	stp x4, x5, [sp, #S_X4]
	stp x6, x7, [sp, #S_X6]
	stp x8, x9, [sp, #S_X8]
	stp x10, x11, [sp, #S_X10]
	stp x12, x13, [sp, #S_X12]
	stp x14, x15, [sp, #S_X14]
	stp x16, x17, [sp, #S_X16]
	stp x18, x19, [sp, #S_X18]
	stp x20, x21, [sp, #S_X20]
	stp x22, x23, [sp, #S_X22]
	stp x24, x25, [sp, #S_X24]
	stp x26, x27, [sp, #S_X26]
	stp x28, x29, [sp, #S_X28]
	add x0, sp, #S_FRAME_SIZE
	stp lr, x0, [sp, #S_LR]
	/*
	 * Construct a useful saved PSTATE
	 */
	mrs x0, nzcv
	mrs x1, daif
	orr x0, x0, x1
	mrs x1, CurrentEL
	orr x0, x0, x1
	mrs x1, SPSel
	orr x0, x0, x1
	stp xzr, x0, [sp, #S_PC]
	.endm

	.macro	restore_all_base_regs
	ldr x0, [sp, #S_PSTATE]
	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
	msr nzcv, x0
	ldp x0, x1, [sp, #S_X0]
	ldp x2, x3, [sp, #S_X2]
	ldp x4, x5, [sp, #S_X4]
	ldp x6, x7, [sp, #S_X6]
	ldp x8, x9, [sp, #S_X8]
	ldp x10, x11, [sp, #S_X10]
	ldp x12, x13, [sp, #S_X12]
	ldp x14, x15, [sp, #S_X14]
	ldp x16, x17, [sp, #S_X16]
	ldp x18, x19, [sp, #S_X18]
	ldp x20, x21, [sp, #S_X20]
	ldp x22, x23, [sp, #S_X22]
	ldp x24, x25, [sp, #S_X24]
	ldp x26, x27, [sp, #S_X26]
	ldp x28, x29, [sp, #S_X28]
	.endm

ENTRY(kretprobe_trampoline)
	sub sp, sp, #S_FRAME_SIZE

	save_all_base_regs

	mov x0, sp
	bl trampoline_probe_handler
	/*
	 * Replace trampoline address in lr with actual orig_ret_addr return
	 * address.
	 */
	mov lr, x0

	restore_all_base_regs

	add sp, sp, #S_FRAME_SIZE
	ret

ENDPROC(kretprobe_trampoline)
arch/arm64/kernel/probes/simulate-insn.c (+217)
/*
 * arch/arm64/kernel/probes/simulate-insn.c
 *
 * Copyright (C) 2013 Linaro Limited.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 */

#include <linux/kernel.h>
#include <linux/kprobes.h>

#include "simulate-insn.h"

#define sign_extend(x, signbit)		\
	((x) | (0 - ((x) & (1 << (signbit)))))

#define bbl_displacement(insn)		\
	sign_extend(((insn) & 0x3ffffff) << 2, 27)

#define bcond_displacement(insn)	\
	sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)

#define cbz_displacement(insn)		\
	sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)

#define tbz_displacement(insn)		\
	sign_extend(((insn >> 5) & 0x3fff) << 2, 15)

#define ldr_displacement(insn)		\
	sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)

static inline void set_x_reg(struct pt_regs *regs, int reg, u64 val)
{
	if (reg < 31)
		regs->regs[reg] = val;
}

static inline void set_w_reg(struct pt_regs *regs, int reg, u64 val)
{
	if (reg < 31)
		regs->regs[reg] = lower_32_bits(val);
}

static inline u64 get_x_reg(struct pt_regs *regs, int reg)
{
	if (reg < 31)
		return regs->regs[reg];
	else
		return 0;
}

static inline u32 get_w_reg(struct pt_regs *regs, int reg)
{
	if (reg < 31)
		return lower_32_bits(regs->regs[reg]);
	else
		return 0;
}

static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
{
	int xn = opcode & 0x1f;

	return (opcode & (1 << 31)) ?
	    (get_x_reg(regs, xn) == 0) : (get_w_reg(regs, xn) == 0);
}

static bool __kprobes check_cbnz(u32 opcode, struct pt_regs *regs)
{
	int xn = opcode & 0x1f;

	return (opcode & (1 << 31)) ?
	    (get_x_reg(regs, xn) != 0) : (get_w_reg(regs, xn) != 0);
}

static bool __kprobes check_tbz(u32 opcode, struct pt_regs *regs)
{
	int xn = opcode & 0x1f;
	int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);

	return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) == 0;
}

static bool __kprobes check_tbnz(u32 opcode, struct pt_regs *regs)
{
	int xn = opcode & 0x1f;
	int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);

	return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) != 0;
}

/*
 * instruction simulation functions
 */
void __kprobes
simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs)
{
	long imm, xn, val;

	xn = opcode & 0x1f;
	imm = ((opcode >> 3) & 0x1ffffc) | ((opcode >> 29) & 0x3);
	imm = sign_extend(imm, 20);
	if (opcode & 0x80000000)
		val = (imm << 12) + (addr & 0xfffffffffffff000);
	else
		val = imm + addr;

	set_x_reg(regs, xn, val);

	instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}

void __kprobes
simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs)
{
	int disp = bbl_displacement(opcode);

	/* Link register is x30 */
	if (opcode & (1 << 31))
		set_x_reg(regs, 30, addr + 4);

	instruction_pointer_set(regs, addr + disp);
}

void __kprobes
simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs)
{
	int disp = 4;

	if (aarch32_opcode_cond_checks[opcode & 0xf](regs->pstate & 0xffffffff))
		disp = bcond_displacement(opcode);

	instruction_pointer_set(regs, addr + disp);
}

void __kprobes
simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs)
{
	int xn = (opcode >> 5) & 0x1f;

	/* update pc first in case we're doing a "blr lr" */
	instruction_pointer_set(regs, get_x_reg(regs, xn));

	/* Link register is x30 */
	if (((opcode >> 21) & 0x3) == 1)
		set_x_reg(regs, 30, addr + 4);
}

void __kprobes
simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs)
{
	int disp = 4;

	if (opcode & (1 << 24)) {
		if (check_cbnz(opcode, regs))
			disp = cbz_displacement(opcode);
	} else {
		if (check_cbz(opcode, regs))
			disp = cbz_displacement(opcode);
	}
	instruction_pointer_set(regs, addr + disp);
}

void __kprobes
simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)
{
	int disp = 4;

	if (opcode & (1 << 24)) {
		if (check_tbnz(opcode, regs))
			disp = tbz_displacement(opcode);
	} else {
		if (check_tbz(opcode, regs))
			disp = tbz_displacement(opcode);
	}
	instruction_pointer_set(regs, addr + disp);
}

void __kprobes
simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
{
	u64 *load_addr;
	int xn = opcode & 0x1f;
	int disp;

	disp = ldr_displacement(opcode);
	load_addr = (u64 *) (addr + disp);

	if (opcode & (1 << 30))	/* x0-x30 */
		set_x_reg(regs, xn, *load_addr);
	else			/* w0-w30 */
		set_w_reg(regs, xn, *load_addr);

	instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}

void __kprobes
simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
{
	s32 *load_addr;
	int xn = opcode & 0x1f;
	int disp;

	disp = ldr_displacement(opcode);
	load_addr = (s32 *) (addr + disp);

	set_x_reg(regs, xn, *load_addr);

	instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}
+28
arch/arm64/kernel/probes/simulate-insn.h
/*
 * arch/arm64/kernel/probes/simulate-insn.h
 *
 * Copyright (C) 2013 Linaro Limited
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 */

#ifndef _ARM_KERNEL_KPROBES_SIMULATE_INSN_H
#define _ARM_KERNEL_KPROBES_SIMULATE_INSN_H

void simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs);
void simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs);
void simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs);
void simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs);
void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
void simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs);

#endif /* _ARM_KERNEL_KPROBES_SIMULATE_INSN_H */
+101
arch/arm64/kernel/ptrace.c
···
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>

+struct pt_regs_offset {
+	const char *name;
+	int offset;
+};
+
+#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+#define GPR_OFFSET_NAME(r) \
+	{.name = "x" #r, .offset = offsetof(struct pt_regs, regs[r])}
+
+static const struct pt_regs_offset regoffset_table[] = {
+	GPR_OFFSET_NAME(0),
+	GPR_OFFSET_NAME(1),
+	GPR_OFFSET_NAME(2),
+	GPR_OFFSET_NAME(3),
+	GPR_OFFSET_NAME(4),
+	GPR_OFFSET_NAME(5),
+	GPR_OFFSET_NAME(6),
+	GPR_OFFSET_NAME(7),
+	GPR_OFFSET_NAME(8),
+	GPR_OFFSET_NAME(9),
+	GPR_OFFSET_NAME(10),
+	GPR_OFFSET_NAME(11),
+	GPR_OFFSET_NAME(12),
+	GPR_OFFSET_NAME(13),
+	GPR_OFFSET_NAME(14),
+	GPR_OFFSET_NAME(15),
+	GPR_OFFSET_NAME(16),
+	GPR_OFFSET_NAME(17),
+	GPR_OFFSET_NAME(18),
+	GPR_OFFSET_NAME(19),
+	GPR_OFFSET_NAME(20),
+	GPR_OFFSET_NAME(21),
+	GPR_OFFSET_NAME(22),
+	GPR_OFFSET_NAME(23),
+	GPR_OFFSET_NAME(24),
+	GPR_OFFSET_NAME(25),
+	GPR_OFFSET_NAME(26),
+	GPR_OFFSET_NAME(27),
+	GPR_OFFSET_NAME(28),
+	GPR_OFFSET_NAME(29),
+	GPR_OFFSET_NAME(30),
+	{.name = "lr", .offset = offsetof(struct pt_regs, regs[30])},
+	REG_OFFSET_NAME(sp),
+	REG_OFFSET_NAME(pc),
+	REG_OFFSET_NAME(pstate),
+	REG_OFFSET_END,
+};
+
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name:	the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL.
+ */
+int regs_query_register_offset(const char *name)
+{
+	const struct pt_regs_offset *roff;
+
+	for (roff = regoffset_table; roff->name != NULL; roff++)
+		if (!strcmp(roff->name, name))
+			return roff->offset;
+	return -EINVAL;
+}
+
+/**
+ * regs_within_kernel_stack() - check the address in the stack
+ * @regs:	pt_regs which contains kernel stack pointer.
+ * @addr:	address which is checked.
+ *
+ * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
+ * If @addr is within the kernel stack, it returns true. If not, returns false.
+ */
+static bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
+{
+	return ((addr & ~(THREAD_SIZE - 1)) ==
+		(kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1))) ||
+		on_irq_stack(addr, raw_smp_processor_id());
+}
+
+/**
+ * regs_get_kernel_stack_nth() - get Nth entry of the stack
+ * @regs:	pt_regs which contains kernel stack pointer.
+ * @n:		stack entry number.
+ *
+ * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
+ * is specified by @regs. If the @n th entry is NOT in the kernel stack,
+ * this returns 0.
+ */
+unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
+{
+	unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
+
+	addr += n;
+	if (regs_within_kernel_stack(regs, (unsigned long)addr))
+		return *addr;
+	else
+		return 0;
+}
+
 /*
  * TODO: does not yet catch signals sent when the child dies.
  * in exit.c or in signal.c.
+130
arch/arm64/kernel/relocate_kernel.S
/*
 * kexec for arm64
 *
 * Copyright (C) Linaro.
 * Copyright (C) Huawei Futurewei Technologies.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */

#include <linux/kexec.h>
#include <linux/linkage.h>

#include <asm/assembler.h>
#include <asm/kexec.h>
#include <asm/page.h>
#include <asm/sysreg.h>

/*
 * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
 *
 * The memory that the old kernel occupies may be overwritten when copying the
 * new image to its final location. To assure that the
 * arm64_relocate_new_kernel routine which does that copy is not overwritten,
 * all code and data needed by arm64_relocate_new_kernel must be between the
 * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end. The
 * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
 * control_code_page, a special page which has been set up to be preserved
 * during the copy operation.
 */
ENTRY(arm64_relocate_new_kernel)

	/* Setup the list loop variables. */
	mov	x17, x1			/* x17 = kimage_start */
	mov	x16, x0			/* x16 = kimage_head */
	dcache_line_size x15, x0	/* x15 = dcache line size */
	mov	x14, xzr		/* x14 = entry ptr */
	mov	x13, xzr		/* x13 = copy dest */

	/* Clear the sctlr_el2 flags. */
	mrs	x0, CurrentEL
	cmp	x0, #CurrentEL_EL2
	b.ne	1f
	mrs	x0, sctlr_el2
	ldr	x1, =SCTLR_ELx_FLAGS
	bic	x0, x0, x1
	msr	sctlr_el2, x0
	isb
1:

	/* Check if the new image needs relocation. */
	tbnz	x16, IND_DONE_BIT, .Ldone

.Lloop:
	and	x12, x16, PAGE_MASK	/* x12 = addr */

	/* Test the entry flags. */
.Ltest_source:
	tbz	x16, IND_SOURCE_BIT, .Ltest_indirection

	/* Invalidate dest page to PoC. */
	mov	x0, x13
	add	x20, x0, #PAGE_SIZE
	sub	x1, x15, #1
	bic	x0, x0, x1
2:	dc	ivac, x0
	add	x0, x0, x15
	cmp	x0, x20
	b.lo	2b
	dsb	sy

	mov	x20, x13
	mov	x21, x12
	copy_page x20, x21, x0, x1, x2, x3, x4, x5, x6, x7

	/* dest += PAGE_SIZE */
	add	x13, x13, PAGE_SIZE
	b	.Lnext

.Ltest_indirection:
	tbz	x16, IND_INDIRECTION_BIT, .Ltest_destination

	/* ptr = addr */
	mov	x14, x12
	b	.Lnext

.Ltest_destination:
	tbz	x16, IND_DESTINATION_BIT, .Lnext

	/* dest = addr */
	mov	x13, x12

.Lnext:
	/* entry = *ptr++ */
	ldr	x16, [x14], #8

	/* while (!(entry & DONE)) */
	tbz	x16, IND_DONE_BIT, .Lloop

.Ldone:
	/* wait for writes from copy_page to finish */
	dsb	nsh
	ic	iallu
	dsb	nsh
	isb

	/* Start new image. */
	mov	x0, xzr
	mov	x1, xzr
	mov	x2, xzr
	mov	x3, xzr
	br	x17

ENDPROC(arm64_relocate_new_kernel)

.ltorg

.align 3	/* To keep the 64-bit values below naturally aligned. */

.Lcopy_end:
.org	KEXEC_CONTROL_PAGE_SIZE

/*
 * arm64_relocate_new_kernel_size - Number of bytes to copy to the
 * control_code_page.
 */
.globl arm64_relocate_new_kernel_size
arm64_relocate_new_kernel_size:
	.quad	.Lcopy_end - arm64_relocate_new_kernel
+1 -1
arch/arm64/kernel/setup.c
···
 	struct resource *res;

 	kernel_code.start   = virt_to_phys(_text);
-	kernel_code.end     = virt_to_phys(_etext - 1);
+	kernel_code.end     = virt_to_phys(__init_begin - 1);
 	kernel_data.start   = virt_to_phys(_sdata);
 	kernel_data.end     = virt_to_phys(_end - 1);
+8 -2
arch/arm64/kernel/smp.c
···
 	set_cpu_online(cpu, true);
 	complete(&cpu_running);

-	local_dbg_enable();
 	local_irq_enable();
 	local_async_enable();
···
 void __init smp_prepare_boot_cpu(void)
 {
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 	cpuinfo_store_boot_cpu();
 	save_boot_cpu_run_el();
-	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 }

 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
···
 	init_cpu_topology();

 	smp_store_cpu_info(smp_processor_id());
+
+	/*
+	 * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
+	 * secondary CPUs present.
+	 */
+	if (max_cpus == 0)
+		return;

 	/*
 	 * Initialise the present map (which describes the set of CPUs
+120 -32
arch/arm64/kernel/traps.c
···
 #include <asm/stacktrace.h>
 #include <asm/exception.h>
 #include <asm/system_misc.h>
+#include <asm/sysreg.h>

 static const char *handler[]= {
 	"Synchronous Abort",
···
 int show_unhandled_signals = 1;

 /*
- * Dump out the contents of some memory nicely...
+ * Dump out the contents of some kernel memory nicely...
  */
 static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
-		     unsigned long top, bool compat)
+		     unsigned long top)
 {
 	unsigned long first;
 	mm_segment_t fs;
 	int i;
-	unsigned int width = compat ? 4 : 8;

 	/*
 	 * We need to switch to kernel mode so that we can use __get_user
···
 		memset(str, ' ', sizeof(str));
 		str[sizeof(str) - 1] = '\0';

-		for (p = first, i = 0; i < (32 / width)
-					&& p < top; i++, p += width) {
+		for (p = first, i = 0; i < (32 / 8)
+					&& p < top; i++, p += 8) {
 			if (p >= bottom && p < top) {
 				unsigned long val;

-				if (width == 8) {
-					if (__get_user(val, (unsigned long *)p) == 0)
-						sprintf(str + i * 17, " %016lx", val);
-					else
-						sprintf(str + i * 17, " ????????????????");
-				} else {
-					if (__get_user(val, (unsigned int *)p) == 0)
-						sprintf(str + i * 9, " %08lx", val);
-					else
-						sprintf(str + i * 9, " ????????");
-				}
+				if (__get_user(val, (unsigned long *)p) == 0)
+					sprintf(str + i * 17, " %016lx", val);
+				else
+					sprintf(str + i * 17, " ????????????????");
 			}
 		}
 		printk("%s%04lx:%s\n", lvl, first & 0xffff, str);
···
 			stack = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);

 			dump_mem("", "Exception stack", stack,
-				 stack + sizeof(struct pt_regs), false);
+				 stack + sizeof(struct pt_regs));
 		}
 	}
 }
···
 	pr_emerg("Process %.*s (pid: %d, stack limit = 0x%p)\n",
 		 TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), thread + 1);

-	if (!user_mode(regs) || in_interrupt()) {
+	if (!user_mode(regs)) {
 		dump_mem(KERN_EMERG, "Stack: ", regs->sp,
-			 THREAD_SIZE + (unsigned long)task_stack_page(tsk),
-			 compat_user_mode(regs));
+			 THREAD_SIZE + (unsigned long)task_stack_page(tsk));
 		dump_backtrace(regs, tsk);
 		dump_instr(KERN_EMERG, regs);
 	}
···
 	return fn ? fn(regs, instr) : 1;
 }

-asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
+static void force_signal_inject(int signal, int code, struct pt_regs *regs,
+				unsigned long address)
 {
 	siginfo_t info;
 	void __user *pc = (void __user *)instruction_pointer(regs);
+	const char *desc;

+	switch (signal) {
+	case SIGILL:
+		desc = "undefined instruction";
+		break;
+	case SIGSEGV:
+		desc = "illegal memory access";
+		break;
+	default:
+		desc = "bad mode";
+		break;
+	}
+
+	if (unhandled_signal(current, signal) &&
+	    show_unhandled_signals_ratelimited()) {
+		pr_info("%s[%d]: %s: pc=%p\n",
+			current->comm, task_pid_nr(current), desc, pc);
+		dump_instr(KERN_INFO, regs);
+	}
+
+	info.si_signo = signal;
+	info.si_errno = 0;
+	info.si_code  = code;
+	info.si_addr  = pc;
+
+	arm64_notify_die(desc, regs, &info, 0);
+}
+
+/*
+ * Set up process info to signal segmentation fault - called on access error.
+ */
+void arm64_notify_segfault(struct pt_regs *regs, unsigned long addr)
+{
+	int code;
+
+	down_read(&current->mm->mmap_sem);
+	if (find_vma(current->mm, addr) == NULL)
+		code = SEGV_MAPERR;
+	else
+		code = SEGV_ACCERR;
+	up_read(&current->mm->mmap_sem);
+
+	force_signal_inject(SIGSEGV, code, regs, addr);
+}
+
+asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
+{
 	/* check for AArch32 breakpoint instructions */
 	if (!aarch32_break_handler(regs))
 		return;
···
 	if (call_undef_hook(regs) == 0)
 		return;

-	if (unhandled_signal(current, SIGILL) && show_unhandled_signals_ratelimited()) {
-		pr_info("%s[%d]: undefined instruction: pc=%p\n",
-			current->comm, task_pid_nr(current), pc);
-		dump_instr(KERN_INFO, regs);
+	force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
+}
+
+void cpu_enable_cache_maint_trap(void *__unused)
+{
+	config_sctlr_el1(SCTLR_EL1_UCI, 0);
+}
+
+#define __user_cache_maint(insn, address, res)			\
+	asm volatile (						\
+		"1:	" insn ", %1\n"				\
+		"	mov	%w0, #0\n"			\
+		"2:\n"						\
+		"	.pushsection .fixup,\"ax\"\n"		\
+		"	.align	2\n"				\
+		"3:	mov	%w0, %w2\n"			\
+		"	b	2b\n"				\
+		"	.popsection\n"				\
+		_ASM_EXTABLE(1b, 3b)				\
+		: "=r" (res)					\
+		: "r" (address), "i" (-EFAULT) )
+
+asmlinkage void __exception do_sysinstr(unsigned int esr, struct pt_regs *regs)
+{
+	unsigned long address;
+	int ret;
+
+	/* if this is a write with: Op0=1, Op2=1, Op1=3, CRn=7 */
+	if ((esr & 0x01fffc01) == 0x0012dc00) {
+		int rt = (esr >> 5) & 0x1f;
+		int crm = (esr >> 1) & 0x0f;
+
+		address = (rt == 31) ? 0 : regs->regs[rt];
+
+		switch (crm) {
+		case 11:		/* DC CVAU, gets promoted */
+			__user_cache_maint("dc civac", address, ret);
+			break;
+		case 10:		/* DC CVAC, gets promoted */
+			__user_cache_maint("dc civac", address, ret);
+			break;
+		case 14:		/* DC CIVAC */
+			__user_cache_maint("dc civac", address, ret);
+			break;
+		case 5:			/* IC IVAU */
+			__user_cache_maint("ic ivau", address, ret);
+			break;
+		default:
+			force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
+			return;
+		}
+	} else {
+		force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
+		return;
 	}

-	info.si_signo = SIGILL;
-	info.si_errno = 0;
-	info.si_code  = ILL_ILLOPC;
-	info.si_addr  = pc;
-
-	arm64_notify_die("Oops - undefined instruction", regs, &info, 0);
+	if (ret)
+		arm64_notify_segfault(regs, address);
+	else
+		regs->pc += 4;
 }

 long compat_arm_syscall(struct pt_regs *regs);
···
 const char *esr_get_class_string(u32 esr)
 {
-	return esr_class_str[esr >> ESR_ELx_EC_SHIFT];
+	return esr_class_str[ESR_ELx_EC(esr)];
 }

 /*
+7 -1
arch/arm64/kernel/vdso.c
···
 	vdso_data->wtm_clock_nsec		= tk->wall_to_monotonic.tv_nsec;

 	if (!use_syscall) {
+		/* tkr_mono.cycle_last == tkr_raw.cycle_last */
 		vdso_data->cs_cycle_last	= tk->tkr_mono.cycle_last;
+		vdso_data->raw_time_sec		= tk->raw_time.tv_sec;
+		vdso_data->raw_time_nsec	= tk->raw_time.tv_nsec;
 		vdso_data->xtime_clock_sec	= tk->xtime_sec;
 		vdso_data->xtime_clock_nsec	= tk->tkr_mono.xtime_nsec;
-		vdso_data->cs_mult		= tk->tkr_mono.mult;
+		/* tkr_raw.xtime_nsec == 0 */
+		vdso_data->cs_mono_mult		= tk->tkr_mono.mult;
+		vdso_data->cs_raw_mult		= tk->tkr_raw.mult;
+		/* tkr_mono.shift == tkr_raw.shift */
 		vdso_data->cs_shift		= tk->tkr_mono.shift;
 	}
+3 -4
arch/arm64/kernel/vdso/Makefile
···
 ccflags-y += -Wl,-shared

 obj-y += vdso.o
-extra-y += vdso.lds vdso-offsets.h
+extra-y += vdso.lds
 CPPFLAGS_vdso.lds += -P -C -U$(ARCH)

 # Force dependency (incbin is bad)
···
 gen-vdsosym := $(srctree)/$(src)/gen_vdso_offsets.sh
 quiet_cmd_vdsosym = VDSOSYM $@
 define cmd_vdsosym
-	$(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ && \
-	cp $@ include/generated/
+	$(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@
 endef

-$(obj)/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE
+include/generated/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE
 	$(call if_changed,vdsosym)

 # Assembly rules for the .S files
+206 -125
arch/arm64/kernel/vdso/gettimeofday.S
···
 #define NSEC_PER_SEC_HI16	0x3b9a

 vdso_data	.req	x6
-use_syscall	.req	w7
-seqcnt		.req	w8
+seqcnt		.req	w7
+w_tmp		.req	w8
+x_tmp		.req	x8
+
+/*
+ * Conventions for macro arguments:
+ * - An argument is write-only if its name starts with "res".
+ * - All other arguments are read-only, unless otherwise specified.
+ */

 	.macro	seqcnt_acquire
 9999:	ldr	seqcnt, [vdso_data, #VDSO_TB_SEQ_COUNT]
 	tbnz	seqcnt, #0, 9999b
 	dmb	ishld
-	ldr	use_syscall, [vdso_data, #VDSO_USE_SYSCALL]
 	.endm

-	.macro	seqcnt_read, cnt
+	.macro	seqcnt_check fail
 	dmb	ishld
-	ldr	\cnt, [vdso_data, #VDSO_TB_SEQ_COUNT]
+	ldr	w_tmp, [vdso_data, #VDSO_TB_SEQ_COUNT]
+	cmp	w_tmp, seqcnt
+	b.ne	\fail
 	.endm

-	.macro	seqcnt_check, cnt, fail
-	cmp	\cnt, seqcnt
-	b.ne	\fail
+	.macro	syscall_check fail
+	ldr	w_tmp, [vdso_data, #VDSO_USE_SYSCALL]
+	cbnz	w_tmp, \fail
+	.endm
+
+	.macro	get_nsec_per_sec res
+	mov	\res, #NSEC_PER_SEC_LO16
+	movk	\res, #NSEC_PER_SEC_HI16, lsl #16
+	.endm
+
+	/*
+	 * Returns the clock delta, in nanoseconds left-shifted by the clock
+	 * shift.
+	 */
+	.macro	get_clock_shifted_nsec res, cycle_last, mult
+	/* Read the virtual counter. */
+	isb
+	mrs	x_tmp, cntvct_el0
+	/* Calculate cycle delta and convert to ns. */
+	sub	\res, x_tmp, \cycle_last
+	/* We can only guarantee 56 bits of precision. */
+	movn	x_tmp, #0xff00, lsl #48
+	and	\res, x_tmp, \res
+	mul	\res, \res, \mult
+	.endm
+
+	/*
+	 * Returns in res_{sec,nsec} the REALTIME timespec, based on the
+	 * "wall time" (xtime) and the clock_mono delta.
+	 */
+	.macro	get_ts_realtime res_sec, res_nsec, \
+			clock_nsec, xtime_sec, xtime_nsec, nsec_to_sec
+	add	\res_nsec, \clock_nsec, \xtime_nsec
+	udiv	x_tmp, \res_nsec, \nsec_to_sec
+	add	\res_sec, \xtime_sec, x_tmp
+	msub	\res_nsec, x_tmp, \nsec_to_sec, \res_nsec
+	.endm
+
+	/*
+	 * Returns in res_{sec,nsec} the timespec based on the clock_raw delta,
+	 * used for CLOCK_MONOTONIC_RAW.
+	 */
+	.macro	get_ts_clock_raw res_sec, res_nsec, clock_nsec, nsec_to_sec
+	udiv	\res_sec, \clock_nsec, \nsec_to_sec
+	msub	\res_nsec, \res_sec, \nsec_to_sec, \clock_nsec
+	.endm
+
+	/* sec and nsec are modified in place. */
+	.macro	add_ts sec, nsec, ts_sec, ts_nsec, nsec_to_sec
+	/* Add timespec. */
+	add	\sec, \sec, \ts_sec
+	add	\nsec, \nsec, \ts_nsec
+
+	/* Normalise the new timespec. */
+	cmp	\nsec, \nsec_to_sec
+	b.lt	9999f
+	sub	\nsec, \nsec, \nsec_to_sec
+	add	\sec, \sec, #1
+9999:
+	cmp	\nsec, #0
+	b.ge	9998f
+	add	\nsec, \nsec, \nsec_to_sec
+	sub	\sec, \sec, #1
+9998:
+	.endm
+
+	.macro	clock_gettime_return, shift=0
+	.if \shift == 1
+	lsr	x11, x11, x12
+	.endif
+	stp	x10, x11, [x1, #TSPEC_TV_SEC]
+	mov	x0, xzr
+	ret
+	.endm
+
+	.macro	jump_slot jumptable, index, label
+	.if (. - \jumptable) != 4 * (\index)
+	.error "Jump slot index mismatch"
+	.endif
+	b	\label
 	.endm

 	.text
···
 /* int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz); */
 ENTRY(__kernel_gettimeofday)
 	.cfi_startproc
-	mov	x2, x30
-	.cfi_register x30, x2
-
-	/* Acquire the sequence counter and get the timespec. */
 	adr	vdso_data, _vdso_data
-1:	seqcnt_acquire
-	cbnz	use_syscall, 4f
-
 	/* If tv is NULL, skip to the timezone code. */
 	cbz	x0, 2f
-	bl	__do_get_tspec
-	seqcnt_check w9, 1b
+
+	/* Compute the time of day. */
+1:	seqcnt_acquire
+	syscall_check fail=4f
+	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
+	/* w11 = cs_mono_mult, w12 = cs_shift */
+	ldp	w11, w12, [vdso_data, #VDSO_CS_MONO_MULT]
+	ldp	x13, x14, [vdso_data, #VDSO_XTIME_CLK_SEC]
+	seqcnt_check fail=1b
+
+	get_nsec_per_sec res=x9
+	lsl	x9, x9, x12
+
+	get_clock_shifted_nsec res=x15, cycle_last=x10, mult=x11
+	get_ts_realtime res_sec=x10, res_nsec=x11, \
+		clock_nsec=x15, xtime_sec=x13, xtime_nsec=x14, nsec_to_sec=x9

 	/* Convert ns to us. */
 	mov	x13, #1000
···
 	stp	w4, w5, [x1, #TZ_MINWEST]
3:
 	mov	x0, xzr
-	ret	x2
+	ret
4:
 	/* Syscall fallback. */
 	mov	x8, #__NR_gettimeofday
 	svc	#0
-	ret	x2
+	ret
 	.cfi_endproc
 ENDPROC(__kernel_gettimeofday)
+
+#define JUMPSLOT_MAX CLOCK_MONOTONIC_COARSE

 /* int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp); */
 ENTRY(__kernel_clock_gettime)
 	.cfi_startproc
-	cmp	w0, #CLOCK_REALTIME
-	ccmp	w0, #CLOCK_MONOTONIC, #0x4, ne
-	b.ne	2f
-
-	mov	x2, x30
-	.cfi_register x30, x2
-
-	/* Get kernel timespec. */
+	cmp	w0, #JUMPSLOT_MAX
+	b.hi	syscall
 	adr	vdso_data, _vdso_data
-1:	seqcnt_acquire
-	cbnz	use_syscall, 7f
+	adr	x_tmp, jumptable
+	add	x_tmp, x_tmp, w0, uxtw #2
+	br	x_tmp

-	bl	__do_get_tspec
-	seqcnt_check w9, 1b
+	ALIGN
+jumptable:
+	jump_slot jumptable, CLOCK_REALTIME, realtime
+	jump_slot jumptable, CLOCK_MONOTONIC, monotonic
+	b	syscall
+	b	syscall
+	jump_slot jumptable, CLOCK_MONOTONIC_RAW, monotonic_raw
+	jump_slot jumptable, CLOCK_REALTIME_COARSE, realtime_coarse
+	jump_slot jumptable, CLOCK_MONOTONIC_COARSE, monotonic_coarse

-	mov	x30, x2
+	.if (. - jumptable) != 4 * (JUMPSLOT_MAX + 1)
+	.error	"Wrong jumptable size"
+	.endif

-	cmp	w0, #CLOCK_MONOTONIC
-	b.ne	6f
+	ALIGN
+realtime:
+	seqcnt_acquire
+	syscall_check fail=syscall
+	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
+	/* w11 = cs_mono_mult, w12 = cs_shift */
+	ldp	w11, w12, [vdso_data, #VDSO_CS_MONO_MULT]
+	ldp	x13, x14, [vdso_data, #VDSO_XTIME_CLK_SEC]
+	seqcnt_check fail=realtime

-	/* Get wtm timespec. */
-	ldp	x13, x14, [vdso_data, #VDSO_WTM_CLK_SEC]
+	/* All computations are done with left-shifted nsecs. */
+	get_nsec_per_sec res=x9
+	lsl	x9, x9, x12

-	/* Check the sequence counter. */
-	seqcnt_read w9
-	seqcnt_check w9, 1b
-	b	4f
-2:
-	cmp	w0, #CLOCK_REALTIME_COARSE
-	ccmp	w0, #CLOCK_MONOTONIC_COARSE, #0x4, ne
-	b.ne	8f
+	get_clock_shifted_nsec res=x15, cycle_last=x10, mult=x11
+	get_ts_realtime res_sec=x10, res_nsec=x11, \
+		clock_nsec=x15, xtime_sec=x13, xtime_nsec=x14, nsec_to_sec=x9
+	clock_gettime_return, shift=1

-	/* xtime_coarse_nsec is already right-shifted */
-	mov	x12, #0
+	ALIGN
+monotonic:
+	seqcnt_acquire
+	syscall_check fail=syscall
+	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
+	/* w11 = cs_mono_mult, w12 = cs_shift */
+	ldp	w11, w12, [vdso_data, #VDSO_CS_MONO_MULT]
+	ldp	x13, x14, [vdso_data, #VDSO_XTIME_CLK_SEC]
+	ldp	x3, x4, [vdso_data, #VDSO_WTM_CLK_SEC]
+	seqcnt_check fail=monotonic

-	/* Get coarse timespec. */
-	adr	vdso_data, _vdso_data
-3:	seqcnt_acquire
-	ldp	x10, x11, [vdso_data, #VDSO_XTIME_CRS_SEC]
+	/* All computations are done with left-shifted nsecs. */
+	lsl	x4, x4, x12
+	get_nsec_per_sec res=x9
+	lsl	x9, x9, x12

-	/* Get wtm timespec. */
-	ldp	x13, x14, [vdso_data, #VDSO_WTM_CLK_SEC]
+	get_clock_shifted_nsec res=x15, cycle_last=x10, mult=x11
+	get_ts_realtime res_sec=x10, res_nsec=x11, \
+		clock_nsec=x15, xtime_sec=x13, xtime_nsec=x14, nsec_to_sec=x9

-	/* Check the sequence counter. */
-	seqcnt_read w9
-	seqcnt_check w9, 3b
+	add_ts sec=x10, nsec=x11, ts_sec=x3, ts_nsec=x4, nsec_to_sec=x9
+	clock_gettime_return, shift=1

-	cmp	w0, #CLOCK_MONOTONIC_COARSE
-	b.ne	6f
-4:
-	/* Add on wtm timespec. */
-	add	x10, x10, x13
+	ALIGN
+monotonic_raw:
+	seqcnt_acquire
+	syscall_check fail=syscall
+	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
+	/* w11 = cs_raw_mult, w12 = cs_shift */
+	ldp	w12, w11, [vdso_data, #VDSO_CS_SHIFT]
+	ldp	x13, x14, [vdso_data, #VDSO_RAW_TIME_SEC]
+	seqcnt_check fail=monotonic_raw
+
+	/* All computations are done with left-shifted nsecs. */
 	lsl	x14, x14, x12
-	add	x11, x11, x14
+	get_nsec_per_sec res=x9
+	lsl	x9, x9, x12

-	/* Normalise the new timespec. */
-	mov	x15, #NSEC_PER_SEC_LO16
-	movk	x15, #NSEC_PER_SEC_HI16, lsl #16
-	lsl	x15, x15, x12
-	cmp	x11, x15
-	b.lt	5f
-	sub	x11, x11, x15
-	add	x10, x10, #1
-5:
-	cmp	x11, #0
-	b.ge	6f
-	add	x11, x11, x15
-	sub	x10, x10, #1
+	get_clock_shifted_nsec res=x15, cycle_last=x10, mult=x11
+	get_ts_clock_raw res_sec=x10, res_nsec=x11, \
+		clock_nsec=x15, nsec_to_sec=x9

-6:	/* Store to the user timespec. */
-	lsr	x11, x11, x12
-	stp	x10, x11, [x1, #TSPEC_TV_SEC]
-	mov	x0, xzr
-	ret
-7:
-	mov	x30, x2
-8:	/* Syscall fallback. */
+	add_ts sec=x10, nsec=x11, ts_sec=x13, ts_nsec=x14, nsec_to_sec=x9
+	clock_gettime_return, shift=1
+
+	ALIGN
+realtime_coarse:
+	seqcnt_acquire
+	ldp	x10, x11, [vdso_data, #VDSO_XTIME_CRS_SEC]
+	seqcnt_check fail=realtime_coarse
+	clock_gettime_return
+
+	ALIGN
+monotonic_coarse:
+	seqcnt_acquire
+	ldp	x10, x11, [vdso_data, #VDSO_XTIME_CRS_SEC]
+	ldp	x13, x14, [vdso_data, #VDSO_WTM_CLK_SEC]
+	seqcnt_check fail=monotonic_coarse
+
+	/* Computations are done in (non-shifted) nsecs. */
+	get_nsec_per_sec res=x9
+	add_ts sec=x10, nsec=x11, ts_sec=x13, ts_nsec=x14, nsec_to_sec=x9
+	clock_gettime_return
+
+	ALIGN
+syscall: /* Syscall fallback. */
 	mov	x8, #__NR_clock_gettime
 	svc	#0
 	ret
···
 	.cfi_startproc
 	cmp	w0, #CLOCK_REALTIME
 	ccmp	w0, #CLOCK_MONOTONIC, #0x4, ne
+	ccmp	w0, #CLOCK_MONOTONIC_RAW, #0x4, ne
 	b.ne	1f

 	ldr	x2, 5f
···
 	.quad	CLOCK_COARSE_RES
 	.cfi_endproc
 ENDPROC(__kernel_clock_getres)
-
-/*
- * Read the current time from the architected counter.
- * Expects vdso_data to be initialised.
- * Clobbers the temporary registers (x9 - x15).
- * Returns:
- *	- w9		= vDSO sequence counter
- *	- (x10, x11)	= (ts->tv_sec, shifted ts->tv_nsec)
- *	- w12		= cs_shift
- */
-ENTRY(__do_get_tspec)
-	.cfi_startproc
-
-	/* Read from the vDSO data page. */
-	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
-	ldp	x13, x14, [vdso_data, #VDSO_XTIME_CLK_SEC]
-	ldp	w11, w12, [vdso_data, #VDSO_CS_MULT]
-	seqcnt_read w9
-
-	/* Read the virtual counter. */
-	isb
-	mrs	x15, cntvct_el0
-
-	/* Calculate cycle delta and convert to ns. */
-	sub	x10, x15, x10
-	/* We can only guarantee 56 bits of precision. */
-	movn	x15, #0xff00, lsl #48
-	and	x10, x15, x10
-	mul	x10, x10, x11
-
-	/* Use the kernel time to calculate the new timespec. */
-	mov	x11, #NSEC_PER_SEC_LO16
-	movk	x11, #NSEC_PER_SEC_HI16, lsl #16
-	lsl	x11, x11, x12
-	add	x15, x10, x14
-	udiv	x14, x15, x11
-	add	x10, x13, x14
-	mul	x13, x14, x11
-	sub	x11, x15, x13
-
-	ret
-	.cfi_endproc
-ENDPROC(__do_get_tspec)
+6 -3
arch/arm64/kernel/vmlinux.lds.S
···
 		__exception_text_end = .;
 		IRQENTRY_TEXT
 		SOFTIRQENTRY_TEXT
+		ENTRY_TEXT
 		TEXT_TEXT
 		SCHED_TEXT
 		LOCK_TEXT
+		KPROBES_TEXT
 		HYPERVISOR_TEXT
 		IDMAP_TEXT
 		HIBERNATE_TEXT
···
 	}

 	. = ALIGN(SEGMENT_ALIGN);
-	RO_DATA(PAGE_SIZE)		/* everything from this point to */
-	EXCEPTION_TABLE(8)		/* _etext will be marked RO NX */
+	_etext = .;			/* End of text section */
+
+	RO_DATA(PAGE_SIZE)		/* everything from this point to */
+	EXCEPTION_TABLE(8)		/* __init_begin will be marked RO NX */
 	NOTES

 	. = ALIGN(SEGMENT_ALIGN);
-	_etext = .;			/* End of text and rodata section */
 	__init_begin = .;

 	INIT_TEXT_SECTION(8)
+2 -2
arch/arm64/kvm/handle_exit.c
···
 		run->exit_reason = KVM_EXIT_DEBUG;
 		run->debug.arch.hsr = hsr;

-		switch (hsr >> ESR_ELx_EC_SHIFT) {
+		switch (ESR_ELx_EC(hsr)) {
 		case ESR_ELx_EC_WATCHPT_LOW:
 			run->debug.arch.far = vcpu->arch.fault.far_el2;
 			/* fall through */
···
 static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
 {
 	u32 hsr = kvm_vcpu_get_hsr(vcpu);
-	u8 hsr_ec = hsr >> ESR_ELx_EC_SHIFT;
+	u8 hsr_ec = ESR_ELx_EC(hsr);

 	if (hsr_ec >= ARRAY_SIZE(arm_exit_handlers) ||
 	    !arm_exit_handlers[hsr_ec]) {
+4
arch/arm64/kvm/hyp/Makefile
···
 obj-$(CONFIG_KVM_ARM_HOST) += hyp-entry.o
 obj-$(CONFIG_KVM_ARM_HOST) += s2-setup.o

+# KVM code is run at a different exception code with a different map, so
+# compiler instrumentation that inserts callbacks or checks into the code may
+# cause crashes. Just disable it.
 GCOV_PROFILE	:= n
 KASAN_SANITIZE	:= n
 UBSAN_SANITIZE	:= n
+KCOV_INSTRUMENT	:= n
+1 -1
arch/arm64/kvm/hyp/switch.c
···
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
 	u64 esr = read_sysreg_el2(esr);
-	u8 ec = esr >> ESR_ELx_EC_SHIFT;
+	u8 ec = ESR_ELx_EC(esr);
 	u64 hpfar, far;

 	vcpu->arch.fault.esr_el2 = esr;
+2 -2
arch/arm64/lib/copy_from_user.S
···
 	.endm

 end	.req	x5
-ENTRY(__copy_from_user)
+ENTRY(__arch_copy_from_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_ALT_PAN_NOT_UAO, \
 	    CONFIG_ARM64_PAN)
 	add	end, x0, x2
···
 	    CONFIG_ARM64_PAN)
 	mov	x0, #0				// Nothing to copy
 	ret
-ENDPROC(__copy_from_user)
+ENDPROC(__arch_copy_from_user)

 	.section .fixup,"ax"
 	.align	2
+2 -2
arch/arm64/lib/copy_to_user.S
···
 	.endm

 end	.req	x5
-ENTRY(__copy_to_user)
+ENTRY(__arch_copy_to_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_ALT_PAN_NOT_UAO, \
 	    CONFIG_ARM64_PAN)
 	add	end, x0, x2
···
 	    CONFIG_ARM64_PAN)
 	mov	x0, #0
 	ret
-ENDPROC(__copy_to_user)
+ENDPROC(__arch_copy_to_user)

 	.section .fixup,"ax"
 	.align	2
+1 -1
arch/arm64/mm/cache.S
···
 	sub	x3, x2, #1
 	bic	x4, x0, x3
 1:
-USER(9f, dc	cvau, x4	)		// clean D line to PoU
+user_alt 9f, "dc cvau, x4",  "dc civac, x4",  ARM64_WORKAROUND_CLEAN_CACHE
 	add	x4, x4, x2
 	cmp	x4, x1
 	b.lo	1b
+19 -18
arch/arm64/mm/dma-mapping.c
···

 #include <linux/gfp.h>
 #include <linux/acpi.h>
+#include <linux/bootmem.h>
 #include <linux/export.h>
 #include <linux/slab.h>
 #include <linux/genalloc.h>
···
 #include <linux/swiotlb.h>

 #include <asm/cacheflush.h>
+
+static int swiotlb __read_mostly;

 static pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
 				 bool coherent)
···
 	return ret;
 }

+static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
+{
+	if (swiotlb)
+		return swiotlb_dma_supported(hwdev, mask);
+	return 1;
+}
+
 static struct dma_map_ops swiotlb_dma_ops = {
 	.alloc = __dma_alloc,
 	.free = __dma_free,
···
 	.sync_single_for_device = __swiotlb_sync_single_for_device,
 	.sync_sg_for_cpu = __swiotlb_sync_sg_for_cpu,
 	.sync_sg_for_device = __swiotlb_sync_sg_for_device,
-	.dma_supported = swiotlb_dma_supported,
+	.dma_supported = __swiotlb_dma_supported,
 	.mapping_error = swiotlb_dma_mapping_error,
 };
···

 static int __init arm64_dma_init(void)
 {
+	if (swiotlb_force || max_pfn > (arm64_dma_phys_limit >> PAGE_SHIFT))
+		swiotlb = 1;
+
 	return atomic_pool_init();
 }
 arch_initcall(arm64_dma_init);
···
 {
 	struct iommu_dma_notifier_data *master, *tmp;

-	if (action != BUS_NOTIFY_ADD_DEVICE)
+	if (action != BUS_NOTIFY_BIND_DRIVER)
 		return 0;

 	mutex_lock(&iommu_dma_notifier_lock);
 	list_for_each_entry_safe(master, tmp, &iommu_dma_masters, list) {
-		if (do_iommu_attach(master->dev, master->ops,
-				master->dma_base, master->size)) {
+		if (data == master->dev && do_iommu_attach(master->dev,
+				master->ops, master->dma_base, master->size)) {
 			list_del(&master->list);
 			kfree(master);
+			break;
 		}
 	}
 	mutex_unlock(&iommu_dma_notifier_lock);
···

 	if (!nb)
 		return -ENOMEM;
-	/*
-	 * The device must be attached to a domain before the driver probe
-	 * routine gets a chance to start allocating DMA buffers. However,
-	 * the IOMMU driver also needs a chance to configure the iommu_group
-	 * via its add_device callback first, so we need to make the attach
-	 * happen between those two points. Since the IOMMU core uses a bus
-	 * notifier with default priority for add_device, do the same but
-	 * with a lower priority to ensure the appropriate ordering.
-	 */
+
 	nb->notifier_call = __iommu_attach_notifier;
-	nb->priority = -100;

 	ret = bus_register_notifier(bus, nb);
 	if (ret) {
···
 	if (!ret)
 		ret = register_iommu_dma_ops_notifier(&pci_bus_type);
 #endif
-
-	/* handle devices queued before this arch_initcall */
-	if (!ret)
-		__iommu_attach_notifier(NULL, BUS_NOTIFY_ADD_DEVICE, NULL);
 	return ret;
 }
 arch_initcall(__iommu_dma_init);
+20 -12
arch/arm64/mm/dump.c
···
 #include <asm/memory.h>
 #include <asm/pgtable.h>
 #include <asm/pgtable-hwdef.h>
-
-struct addr_marker {
-	unsigned long start_address;
-	const char *name;
-};
+#include <asm/ptdump.h>

 static const struct addr_marker address_markers[] = {
 #ifdef CONFIG_KASAN
···
 	}
 }

-static void walk_pgd(struct pg_state *st, struct mm_struct *mm, unsigned long start)
+static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
+		     unsigned long start)
 {
 	pgd_t *pgd = pgd_offset(mm, 0UL);
 	unsigned i;
···

 static int ptdump_show(struct seq_file *m, void *v)
 {
+	struct ptdump_info *info = m->private;
 	struct pg_state st = {
 		.seq = m,
-		.marker = address_markers,
+		.marker = info->markers,
 	};

-	walk_pgd(&st, &init_mm, VA_START);
+	walk_pgd(&st, info->mm, info->base_addr);

 	note_page(&st, 0, 0, 0);
 	return 0;
···

 static int ptdump_open(struct inode *inode, struct file *file)
 {
-	return single_open(file, ptdump_show, NULL);
+	return single_open(file, ptdump_show, inode->i_private);
 }

 static const struct file_operations ptdump_fops = {
···
 	.release = single_release,
 };

-static int ptdump_init(void)
+int ptdump_register(struct ptdump_info *info, const char *name)
 {
 	struct dentry *pe;
 	unsigned i, j;
···
 		for (j = 0; j < pg_level[i].num; j++)
 			pg_level[i].mask |= pg_level[i].bits[j].mask;

-	pe = debugfs_create_file("kernel_page_tables", 0400, NULL, NULL,
-				 &ptdump_fops);
+	pe = debugfs_create_file(name, 0400, NULL, info, &ptdump_fops);
 	return pe ? 0 : -ENOMEM;
+}
+
+static struct ptdump_info kernel_ptdump_info = {
+	.mm		= &init_mm,
+	.markers	= address_markers,
+	.base_addr	= VA_START,
+};
+
+static int ptdump_init(void)
+{
+	return ptdump_register(&kernel_ptdump_info, "kernel_page_tables");
 }
 device_initcall(ptdump_init);
+35 -6
arch/arm64/mm/fault.c
···

 static const char *fault_name(unsigned int esr);

+#ifdef CONFIG_KPROBES
+static inline int notify_page_fault(struct pt_regs *regs, unsigned int esr)
+{
+	int ret = 0;
+
+	/* kprobe_running() needs smp_processor_id() */
+	if (!user_mode(regs)) {
+		preempt_disable();
+		if (kprobe_running() && kprobe_fault_handler(regs, esr))
+			ret = 1;
+		preempt_enable();
+	}
+
+	return ret;
+}
+#else
+static inline int notify_page_fault(struct pt_regs *regs, unsigned int esr)
+{
+	return 0;
+}
+#endif
+
 /*
  * Dump out the page tables associated with 'addr' in mm 'mm'.
  */
···
 #define VM_FAULT_BADMAP		0x010000
 #define VM_FAULT_BADACCESS	0x020000

-#define ESR_LNX_EXEC		(1 << 24)
-
 static int __do_page_fault(struct mm_struct *mm, unsigned long addr,
 			   unsigned int mm_flags, unsigned long vm_flags,
 			   struct task_struct *tsk)
···
 	return fault;
 }

-static inline int permission_fault(unsigned int esr)
+static inline bool is_permission_fault(unsigned int esr)
 {
-	unsigned int ec       = (esr & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT;
+	unsigned int ec       = ESR_ELx_EC(esr);
 	unsigned int fsc_type = esr & ESR_ELx_FSC_TYPE;

 	return (ec == ESR_ELx_EC_DABT_CUR && fsc_type == ESR_ELx_FSC_PERM);
+}
+
+static bool is_el0_instruction_abort(unsigned int esr)
+{
+	return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
 }

 static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
···
 	int fault, sig, code;
 	unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
 	unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+
+	if (notify_page_fault(regs, esr))
+		return 0;

 	tsk = current;
 	mm  = tsk->mm;
···
 	if (user_mode(regs))
 		mm_flags |= FAULT_FLAG_USER;

-	if (esr & ESR_LNX_EXEC) {
+	if (is_el0_instruction_abort(esr)) {
 		vm_flags = VM_EXEC;
 	} else if ((esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM)) {
 		vm_flags = VM_WRITE;
 		mm_flags |= FAULT_FLAG_WRITE;
 	}

-	if (permission_fault(esr) && (addr < USER_DS)) {
+	if (is_permission_fault(esr) && (addr < USER_DS)) {
 		/* regs->orig_addr_limit may be 0 if we entered from EL0 */
 		if (regs->orig_addr_limit == KERNEL_DS)
 			die("Accessing user space memory with fs=KERNEL_DS", regs, esr);
···

 	return rv;
 }
+NOKPROBE_SYMBOL(do_debug_exception);

 #ifdef CONFIG_ARM64_PAN
 void cpu_enable_pan(void *__unused)
+6 -7
arch/arm64/mm/init.c
···
 static void __init arm64_memory_present(void)
 {
 	struct memblock_region *reg;
-	int nid = 0;

 	for_each_memblock(memory, reg) {
-#ifdef CONFIG_NUMA
-		nid = reg->nid;
-#endif
+		int nid = memblock_get_region_node(reg);
+
 		memory_present(nid, memblock_region_memory_base_pfn(reg),
 				memblock_region_memory_end_pfn(reg));
 	}
···
  */
 void __init mem_init(void)
 {
-	swiotlb_init(1);
+	if (swiotlb_force || max_pfn > (arm64_dma_phys_limit >> PAGE_SHIFT))
+		swiotlb_init(1);

 	set_max_mapnr(pfn_to_page(max_pfn) - mem_map);

···
 	pr_cont("    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n",
 		MLG(VMALLOC_START, VMALLOC_END));
 	pr_cont("      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(_text, __start_rodata));
+		MLK_ROUNDUP(_text, _etext));
 	pr_cont("    .rodata : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(__start_rodata, _etext));
+		MLK_ROUNDUP(__start_rodata, __init_begin));
 	pr_cont("      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n",
 		MLK_ROUNDUP(__init_begin, __init_end));
 	pr_cont("      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
+46 -108
arch/arm64/mm/mmu.c
···
 	void *ptr;

 	phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
-	BUG_ON(!phys);

 	/*
 	 * The FIX_{PGD,PUD,PMD} slots may be in active use, but the FIX_PTE
···
 	return phys;
 }

-/*
- * remap a PMD into pages
- */
-static void split_pmd(pmd_t *pmd, pte_t *pte)
-{
-	unsigned long pfn = pmd_pfn(*pmd);
-	int i = 0;
-
-	do {
-		/*
-		 * Need to have the least restrictive permissions available
-		 * permissions will be fixed up later
-		 */
-		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
-		pfn++;
-	} while (pte++, i++, i < PTRS_PER_PTE);
-}
-
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 				  unsigned long end, unsigned long pfn,
 				  pgprot_t prot,
···
 {
 	pte_t *pte;

-	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
+	BUG_ON(pmd_sect(*pmd));
+	if (pmd_none(*pmd)) {
 		phys_addr_t pte_phys;
 		BUG_ON(!pgtable_alloc);
 		pte_phys = pgtable_alloc();
 		pte = pte_set_fixmap(pte_phys);
-		if (pmd_sect(*pmd))
-			split_pmd(pmd, pte);
 		__pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE);
-		flush_tlb_all();
 		pte_clear_fixmap();
 	}
 	BUG_ON(pmd_bad(*pmd));
···
 	pte_clear_fixmap();
 }

-static void split_pud(pud_t *old_pud, pmd_t *pmd)
-{
-	unsigned long addr = pud_pfn(*old_pud) << PAGE_SHIFT;
-	pgprot_t prot = __pgprot(pud_val(*old_pud) ^ addr);
-	int i = 0;
-
-	do {
-		set_pmd(pmd, __pmd(addr | pgprot_val(prot)));
-		addr += PMD_SIZE;
-	} while (pmd++, i++, i < PTRS_PER_PMD);
-}
-
-#ifdef CONFIG_DEBUG_PAGEALLOC
-static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void))
-{
-
-	/*
-	 * If debug_page_alloc is enabled we must map the linear map
-	 * using pages. However, other mappings created by
-	 * create_mapping_noalloc must use sections in some cases. Allow
-	 * sections to be used in those cases, where no pgtable_alloc
-	 * function is provided.
-	 */
-	return !pgtable_alloc || !debug_pagealloc_enabled();
-}
-#else
-static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void))
-{
-	return true;
-}
-#endif
-
 static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  phys_addr_t (*pgtable_alloc)(void))
+				  phys_addr_t (*pgtable_alloc)(void),
+				  bool allow_block_mappings)
 {
 	pmd_t *pmd;
 	unsigned long next;
···
 	/*
 	 * Check for initial section mappings in the pgd/pud and remove them.
 	 */
-	if (pud_none(*pud) || pud_sect(*pud)) {
+	BUG_ON(pud_sect(*pud));
+	if (pud_none(*pud)) {
 		phys_addr_t pmd_phys;
 		BUG_ON(!pgtable_alloc);
 		pmd_phys = pgtable_alloc();
 		pmd = pmd_set_fixmap(pmd_phys);
-		if (pud_sect(*pud)) {
-			/*
-			 * need to have the 1G of mappings continue to be
-			 * present
-			 */
-			split_pud(pud, pmd);
-		}
 		__pud_populate(pud, pmd_phys, PUD_TYPE_TABLE);
-		flush_tlb_all();
 		pmd_clear_fixmap();
 	}
 	BUG_ON(pud_bad(*pud));
···
 		next = pmd_addr_end(addr, end);
 		/* try section mapping first */
 		if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
-		      block_mappings_allowed(pgtable_alloc)) {
+		      allow_block_mappings) {
 			pmd_t old_pmd =*pmd;
 			pmd_set_huge(pmd, phys, prot);
 			/*
···

 static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
 				  phys_addr_t phys, pgprot_t prot,
-				  phys_addr_t (*pgtable_alloc)(void))
+				  phys_addr_t (*pgtable_alloc)(void),
+				  bool allow_block_mappings)
 {
 	pud_t *pud;
 	unsigned long next;
···
 		/*
 		 * For 4K granule only, attempt to put down a 1GB block
 		 */
-		if (use_1G_block(addr, next, phys) &&
-		    block_mappings_allowed(pgtable_alloc)) {
+		if (use_1G_block(addr, next, phys) && allow_block_mappings) {
 			pud_t old_pud = *pud;
 			pud_set_huge(pud, phys, prot);

···
 			}
 		} else {
 			alloc_init_pmd(pud, addr, next, phys, prot,
-				       pgtable_alloc);
+				       pgtable_alloc, allow_block_mappings);
 		}
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
···
 	pud_clear_fixmap();
 }

-/*
- * Create the page directory entries and any necessary page tables for the
- * mapping specified by 'md'.
- */
-static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt,
-				    phys_addr_t size, pgprot_t prot,
-				    phys_addr_t (*pgtable_alloc)(void))
+static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
+				 unsigned long virt, phys_addr_t size,
+				 pgprot_t prot,
+				 phys_addr_t (*pgtable_alloc)(void),
+				 bool allow_block_mappings)
 {
 	unsigned long addr, length, end, next;
+	pgd_t *pgd = pgd_offset_raw(pgdir, virt);

 	/*
 	 * If the virtual and physical address don't have the same offset
···
 	end = addr + length;
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc);
+		alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc,
+			       allow_block_mappings);
 		phys += next - addr;
 	} while (pgd++, addr = next, addr != end);
 }

-static phys_addr_t late_pgtable_alloc(void)
+static phys_addr_t pgd_pgtable_alloc(void)
 {
 	void *ptr = (void *)__get_free_page(PGALLOC_GFP);
-	BUG_ON(!ptr);
+	if (!ptr || !pgtable_page_ctor(virt_to_page(ptr)))
+		BUG();

 	/* Ensure the zeroed page is visible to the page table walker */
 	dsb(ishst);
 	return __pa(ptr);
-}
-
-static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
-				 unsigned long virt, phys_addr_t size,
-				 pgprot_t prot,
-				 phys_addr_t (*alloc)(void))
-{
-	init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc);
 }

 /*
···
 			&phys, virt);
 		return;
 	}
-	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
-			     NULL);
+	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, true);
 }

 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
-			       pgprot_t prot)
+			       pgprot_t prot, bool allow_block_mappings)
 {
+	BUG_ON(mm == &init_mm);
+
 	__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
-			     late_pgtable_alloc);
+			     pgd_pgtable_alloc, allow_block_mappings);
 }

 static void create_mapping_late(phys_addr_t phys, unsigned long virt,
···
 	}

 	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
-			     late_pgtable_alloc);
+			     NULL, !debug_pagealloc_enabled());
 }

 static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
 {
 	unsigned long kernel_start = __pa(_text);
-	unsigned long kernel_end = __pa(_etext);
+	unsigned long kernel_end = __pa(__init_begin);

 	/*
 	 * Take care not to create a writable alias for the
 	 * read-only text and rodata sections of the kernel image.
 	 */

-	/* No overlap with the kernel text */
+	/* No overlap with the kernel text/rodata */
 	if (end < kernel_start || start >= kernel_end) {
 		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
 				     end - start, PAGE_KERNEL,
-				     early_pgtable_alloc);
+				     early_pgtable_alloc,
+				     !debug_pagealloc_enabled());
 		return;
 	}

 	/*
-	 * This block overlaps the kernel text mapping.
+	 * This block overlaps the kernel text/rodata mappings.
 	 * Map the portion(s) which don't overlap.
 	 */
 	if (start < kernel_start)
 		__create_pgd_mapping(pgd, start,
 				     __phys_to_virt(start),
 				     kernel_start - start, PAGE_KERNEL,
-				     early_pgtable_alloc);
+				     early_pgtable_alloc,
+				     !debug_pagealloc_enabled());
 	if (kernel_end < end)
 		__create_pgd_mapping(pgd, kernel_end,
 				     __phys_to_virt(kernel_end),
 				     end - kernel_end, PAGE_KERNEL,
-				     early_pgtable_alloc);
+				     early_pgtable_alloc,
+				     !debug_pagealloc_enabled());

 	/*
-	 * Map the linear alias of the [_text, _etext) interval as
+	 * Map the linear alias of the [_text, __init_begin) interval as
 	 * read-only/non-executable. This makes the contents of the
 	 * region accessible to subsystems such as hibernate, but
 	 * protects it from inadvertent modification or execution.
 	 */
 	__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
 			     kernel_end - kernel_start, PAGE_KERNEL_RO,
-			     early_pgtable_alloc);
+			     early_pgtable_alloc, !debug_pagealloc_enabled());
 }

 static void __init map_mem(pgd_t *pgd)
···
 {
 	unsigned long section_size;

-	section_size = (unsigned long)__start_rodata - (unsigned long)_text;
+	section_size = (unsigned long)_etext - (unsigned long)_text;
 	create_mapping_late(__pa(_text), (unsigned long)_text,
 			    section_size, PAGE_KERNEL_ROX);
 	/*
-	 * mark .rodata as read only. Use _etext rather than __end_rodata to
-	 * cover NOTES and EXCEPTION_TABLE.
+	 * mark .rodata as read only. Use __init_begin rather than __end_rodata
+	 * to cover NOTES and EXCEPTION_TABLE.
 	 */
-	section_size = (unsigned long)_etext - (unsigned long)__start_rodata;
+	section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
 	create_mapping_late(__pa(__start_rodata), (unsigned long)__start_rodata,
 			    section_size, PAGE_KERNEL_RO);
 }
···
 	BUG_ON(!PAGE_ALIGNED(size));

 	__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
-			     early_pgtable_alloc);
+			     early_pgtable_alloc, !debug_pagealloc_enabled());

 	vma->addr	= va_start;
 	vma->phys_addr	= pa_start;
···
 {
 	static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;

-	map_kernel_segment(pgd, _text, __start_rodata, PAGE_KERNEL_EXEC, &vmlinux_text);
-	map_kernel_segment(pgd, __start_rodata, _etext, PAGE_KERNEL, &vmlinux_rodata);
+	map_kernel_segment(pgd, _text, _etext, PAGE_KERNEL_EXEC, &vmlinux_text);
+	map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, &vmlinux_rodata);
 	map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
 			   &vmlinux_init);
 	map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, &vmlinux_data);
+2
arch/arm64/mm/proc.S
···
 	msr	cpacr_el1, x0			// Enable FP/ASIMD
 	mov	x0, #1 << 12			// Reset mdscr_el1 and disable
 	msr	mdscr_el1, x0			// access to the DCC from EL0
+	isb					// Unmask debug exceptions now,
+	enable_dbg				// since this is per-cpu
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
 	/*
 	 * Memory region attributes for LPAE:
+22 -5
drivers/perf/arm_pmu.c
···

 	irq = platform_get_irq(pmu_device, 0);
 	if (irq >= 0 && irq_is_percpu(irq)) {
-		on_each_cpu(cpu_pmu_disable_percpu_irq, &irq, 1);
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_disable_percpu_irq, &irq, 1);
 		free_percpu_irq(irq, &hw_events->percpu_pmu);
 	} else {
 		for (i = 0; i < irqs; ++i) {
···
 				irq);
 			return err;
 		}
-		on_each_cpu(cpu_pmu_enable_percpu_irq, &irq, 1);
+
+		on_each_cpu_mask(&cpu_pmu->supported_cpus,
+				 cpu_pmu_enable_percpu_irq, &irq, 1);
 	} else {
 		for (i = 0; i < irqs; ++i) {
 			int cpu = i;
···
 		i++;
 	} while (1);

-	/* If we didn't manage to parse anything, claim to support all CPUs */
-	if (cpumask_weight(&pmu->supported_cpus) == 0)
-		cpumask_setall(&pmu->supported_cpus);
+	/* If we didn't manage to parse anything, try the interrupt affinity */
+	if (cpumask_weight(&pmu->supported_cpus) == 0) {
+		if (!using_spi) {
+			/* If using PPIs, check the affinity of the partition */
+			int ret, irq;
+
+			irq = platform_get_irq(pdev, 0);
+			ret = irq_get_percpu_devid_partition(irq, &pmu->supported_cpus);
+			if (ret) {
+				kfree(irqs);
+				return ret;
+			}
+		} else {
+			/* Otherwise default to all CPUs */
+			cpumask_setall(&pmu->supported_cpus);
+		}
+	}

 	/* If we matched up the IRQ affinities, use them to route the SPIs */
 	if (using_spi && i == pdev->num_resources)
+1
include/uapi/linux/kexec.h
···
 #define KEXEC_ARCH_SH      (42 << 16)
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
+#define KEXEC_ARCH_AARCH64 (183 << 16)

 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
+9
samples/kprobes/kprobe_example.c
···
 			" ex1 = 0x%lx\n",
 		p->symbol_name, p->addr, regs->pc, regs->ex1);
 #endif
+#ifdef CONFIG_ARM64
+	pr_info("<%s> pre_handler: p->addr = 0x%p, pc = 0x%lx,"
+			" pstate = 0x%lx\n",
+		p->symbol_name, p->addr, (long)regs->pc, (long)regs->pstate);
+#endif

 	/* A dump_stack() here will give a stack backtrace */
 	return 0;
···
 #ifdef CONFIG_TILEGX
 	printk(KERN_INFO "<%s> post_handler: p->addr = 0x%p, ex1 = 0x%lx\n",
 		p->symbol_name, p->addr, regs->ex1);
 #endif
+#ifdef CONFIG_ARM64
+	pr_info("<%s> post_handler: p->addr = 0x%p, pstate = 0x%lx\n",
+		p->symbol_name, p->addr, (long)regs->pstate);
+#endif
 }
