Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

- "genirq: Introduce generic irq migration for cpu hotunplug" patch
merged from tip/irq/for-arm to allow the arm64-specific part to be
upstreamed via the arm64 tree

- CPU feature detection reworked to cope with heterogeneous systems
where CPUs may not have exactly the same features. The features
reported by the kernel via internal data structures or ELF_HWCAP are
delayed until all the CPUs are up (and before user space starts)

- Support for 16KB pages, with the additional bonus of a 36-bit VA
space, though the latter is only selectable under EXPERT

- Implement native {relaxed, acquire, release} atomics for arm64

- New ASID allocation algorithm which avoids IPI on roll-over, together
with TLB invalidation optimisations (using local vs global where
feasible)

- KASan support for arm64

- EFI_STUB clean-up and isolation for the kernel proper (required by
KASan)

- copy_{to,from,in}_user optimisations (sharing the memcpy template)

- perf: moving arm64 to the arm32/64 shared PMU framework

- L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware

- Support for the contiguous PTE hint on kernel mapping (16 consecutive
entries may be able to use a single TLB entry)

- Generic CONFIG_HZ now used on arm64

- defconfig updates

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (91 commits)
arm64/efi: fix libstub build under CONFIG_MODVERSIONS
ARM64: Enable multi-core scheduler support by default
arm64/efi: move arm64 specific stub C code to libstub
arm64: page-align sections for DEBUG_RODATA
arm64: Fix build with CONFIG_ZONE_DMA=n
arm64: Fix compat register mappings
arm64: Increase the max granular size
arm64: remove bogus TASK_SIZE_64 check
arm64: make Timer Interrupt Frequency selectable
arm64/mm: use PAGE_ALIGNED instead of IS_ALIGNED
arm64: cachetype: fix definitions of ICACHEF_* flags
arm64: cpufeature: declare enable_cpu_capabilities as static
genirq: Make the cpuhotplug migration code less noisy
arm64: Constify hwcap name string arrays
arm64/kvm: Make use of the system wide safe values
arm64/debug: Make use of the system wide safe value
arm64: Move FP/ASIMD hwcap handling to common code
arm64/HWCAP: Use system wide safe values
arm64/capabilities: Make use of system wide safe value
arm64: Delay cpu feature capability checks
...

+3319 -2444
-2
Documentation/arm/uefi.txt
···
  58  58   --------------------------------------------------------------------------------
  59  59   linux,uefi-mmap-desc-ver | 32-bit | Version of the mmap descriptor format.
  60  60   --------------------------------------------------------------------------------
  61   -   linux,uefi-stub-kern-ver | string | Copy of linux_banner from build.
  62   -   --------------------------------------------------------------------------------
+6 -1
Documentation/arm64/booting.txt
···
 104 104   - The flags field (introduced in v3.17) is a little-endian 64-bit field
 105 105     composed as follows:
 106 106     Bit 0:	Kernel endianness.  1 if BE, 0 if LE.
 107   -     Bits 1-63:	Reserved.
     107 +   Bit 1-2:	Kernel Page size.
     108 +   		0 - Unspecified.
     109 +   		1 - 4K
     110 +   		2 - 16K
     111 +   		3 - 64K
     112 +   Bits 3-63:	Reserved.
 108 113
 109 114   - When image_size is zero, a bootloader should attempt to keep as much
 110 115     memory as possible free for use by the kernel immediately after the
+2
Documentation/devicetree/bindings/arm/pmu.txt
···
   8   8
   9   9   - compatible : should be one of
  10  10   	"arm,armv8-pmuv3"
      11 + 	"arm,cortex-a57-pmu"
      12 + 	"arm,cortex-a53-pmu"
  11  13   	"arm,cortex-a17-pmu"
  12  14   	"arm,cortex-a15-pmu"
  13  15   	"arm,cortex-a12-pmu"
+1 -1
Documentation/features/debug/KASAN/arch-support.txt
···
   9   9   | alpha: | TODO |
  10  10   | arc: | TODO |
  11  11   | arm: | TODO |
  12   -   | arm64: | TODO |
      12 + | arm64: | ok |
  13  13   | avr32: | TODO |
  14  14   | blackfin: | TODO |
  15  15   | c6x: | TODO |
+5 -4
MAINTAINERS
···
 823 823
 824 824   ARM PMU PROFILING AND DEBUGGING
 825 825   M:	Will Deacon <will.deacon@arm.com>
     826 + R:	Mark Rutland <mark.rutland@arm.com>
 826 827   S:	Maintained
 827   -   F:	arch/arm/kernel/perf_*
     828 + F:	arch/arm*/kernel/perf_*
 828 829   F:	arch/arm/oprofile/common.c
 829   -   F:	arch/arm/kernel/hw_breakpoint.c
 830   -   F:	arch/arm/include/asm/hw_breakpoint.h
 831   -   F:	arch/arm/include/asm/perf_event.h
     830 + F:	arch/arm*/kernel/hw_breakpoint.c
     831 + F:	arch/arm*/include/asm/hw_breakpoint.h
     832 + F:	arch/arm*/include/asm/perf_event.h
 832 833   F:	drivers/perf/arm_pmu.c
 833 834   F:	include/linux/perf/arm_pmu.h
 834 835
+51 -18
arch/arm64/Kconfig
··· 48 48 select HAVE_ARCH_AUDITSYSCALL 49 49 select HAVE_ARCH_BITREVERSE 50 50 select HAVE_ARCH_JUMP_LABEL 51 + select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP 51 52 select HAVE_ARCH_KGDB 52 53 select HAVE_ARCH_SECCOMP_FILTER 53 54 select HAVE_ARCH_TRACEHOOK ··· 170 169 171 170 config PGTABLE_LEVELS 172 171 int 172 + default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36 173 173 default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42 174 174 default 3 if ARM64_64K_PAGES && ARM64_VA_BITS_48 175 175 default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39 176 - default 4 if ARM64_4K_PAGES && ARM64_VA_BITS_48 176 + default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47 177 + default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48 177 178 178 179 source "init/Kconfig" 179 180 ··· 392 389 help 393 390 This feature enables 4KB pages support. 394 391 392 + config ARM64_16K_PAGES 393 + bool "16KB" 394 + help 395 + The system will use 16KB pages support. AArch32 emulation 396 + requires applications compiled with 16K (or a multiple of 16K) 397 + aligned segments. 398 + 395 399 config ARM64_64K_PAGES 396 400 bool "64KB" 397 401 help 398 402 This feature enables 64KB pages support (4KB by default) 399 403 allowing only two levels of page tables and faster TLB 400 - look-up. AArch32 emulation is not available when this feature 401 - is enabled. 404 + look-up. AArch32 emulation requires applications compiled 405 + with 64K aligned segments. 402 406 403 407 endchoice 404 408 405 409 choice 406 410 prompt "Virtual address space size" 407 411 default ARM64_VA_BITS_39 if ARM64_4K_PAGES 412 + default ARM64_VA_BITS_47 if ARM64_16K_PAGES 408 413 default ARM64_VA_BITS_42 if ARM64_64K_PAGES 409 414 help 410 415 Allows choosing one of multiple possible virtual address 411 416 space sizes. The level of translation table is determined by 412 417 a combination of page size and virtual address space size. 
418 + 419 + config ARM64_VA_BITS_36 420 + bool "36-bit" if EXPERT 421 + depends on ARM64_16K_PAGES 413 422 414 423 config ARM64_VA_BITS_39 415 424 bool "39-bit" ··· 431 416 bool "42-bit" 432 417 depends on ARM64_64K_PAGES 433 418 419 + config ARM64_VA_BITS_47 420 + bool "47-bit" 421 + depends on ARM64_16K_PAGES 422 + 434 423 config ARM64_VA_BITS_48 435 424 bool "48-bit" 436 425 ··· 442 423 443 424 config ARM64_VA_BITS 444 425 int 426 + default 36 if ARM64_VA_BITS_36 445 427 default 39 if ARM64_VA_BITS_39 446 428 default 42 if ARM64_VA_BITS_42 429 + default 47 if ARM64_VA_BITS_47 447 430 default 48 if ARM64_VA_BITS_48 448 431 449 432 config CPU_BIG_ENDIAN ··· 475 454 476 455 config HOTPLUG_CPU 477 456 bool "Support for hot-pluggable CPUs" 457 + select GENERIC_IRQ_MIGRATION 478 458 help 479 459 Say Y here to experiment with turning CPUs off and on. CPUs 480 460 can be controlled through /sys/devices/system/cpu. 481 461 482 462 source kernel/Kconfig.preempt 483 - 484 - config HZ 485 - int 486 - default 100 463 + source kernel/Kconfig.hz 487 464 488 465 config ARCH_HAS_HOLES_MEMORYMODEL 489 466 def_bool y if SPARSEMEM ··· 500 481 def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM 501 482 502 483 config HW_PERF_EVENTS 503 - bool "Enable hardware performance counter support for perf events" 504 - depends on PERF_EVENTS 505 - default y 506 - help 507 - Enable hardware performance counter support for perf events. If 508 - disabled, perf events will use software events only. 
484 + def_bool y 485 + depends on ARM_PMU 509 486 510 487 config SYS_SUPPORTS_HUGETLBFS 511 488 def_bool y ··· 510 495 def_bool y 511 496 512 497 config ARCH_WANT_HUGE_PMD_SHARE 513 - def_bool y if !ARM64_64K_PAGES 498 + def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36) 514 499 515 500 config HAVE_ARCH_TRANSPARENT_HUGEPAGE 516 501 def_bool y ··· 547 532 config FORCE_MAX_ZONEORDER 548 533 int 549 534 default "14" if (ARM64_64K_PAGES && TRANSPARENT_HUGEPAGE) 535 + default "12" if (ARM64_16K_PAGES && TRANSPARENT_HUGEPAGE) 550 536 default "11" 537 + help 538 + The kernel memory allocator divides physically contiguous memory 539 + blocks into "zones", where each zone is a power of two number of 540 + pages. This option selects the largest power of two that the kernel 541 + keeps in the memory allocator. If you need to allocate very large 542 + blocks of physically contiguous memory, then you may need to 543 + increase this value. 544 + 545 + This config option is actually maximum order plus one. For example, 546 + a value of 11 means that the largest free memory block is 2^10 pages. 547 + 548 + We make sure that we can allocate upto a HugePage size for each configuration. 549 + Hence we have : 550 + MAX_ORDER = (PMD_SHIFT - PAGE_SHIFT) + 1 => PAGE_SHIFT - 2 551 + 552 + However for 4K, we choose a higher default value, 11 as opposed to 10, giving us 553 + 4M allocations matching the default size used by generic code. 551 554 552 555 menuconfig ARMV8_DEPRECATED 553 556 bool "Emulate deprecated/obsolete ARMv8 instructions" ··· 740 707 741 708 config COMPAT 742 709 bool "Kernel support for 32-bit EL0" 743 - depends on !ARM64_64K_PAGES || EXPERT 710 + depends on ARM64_4K_PAGES || EXPERT 744 711 select COMPAT_BINFMT_ELF 745 712 select HAVE_UID16 746 713 select OLD_SIGSUSPEND3 ··· 751 718 the user helper functions, VFP support and the ptrace interface are 752 719 handled appropriately by the kernel. 
753 720 754 - If you also enabled CONFIG_ARM64_64K_PAGES, please be aware that you 755 - will only be able to execute AArch32 binaries that were compiled with 756 - 64k aligned segments. 721 + If you use a page size other than 4KB (i.e, 16KB or 64KB), please be aware 722 + that you will only be able to execute AArch32 binaries that were compiled 723 + with page size aligned segments. 757 724 758 725 If you want to execute 32-bit userspace applications, say Y. 759 726
+1 -1
arch/arm64/Kconfig.debug
···
  77  77   	  If in doubt, say Y
  78  78
  79  79   config DEBUG_ALIGN_RODATA
  80   -   	depends on DEBUG_RODATA && !ARM64_64K_PAGES
      80 + 	depends on DEBUG_RODATA && ARM64_4K_PAGES
  81  81   	bool "Align linker sections up to SECTION_SIZE"
  82  82   	help
  83  83   	  If this option is enabled, sections that may potentially be marked as
+7
arch/arm64/Makefile
···
  55  55   TEXT_OFFSET := 0x00080000
  56  56   endif
  57  57
      58 + # KASAN_SHADOW_OFFSET = VA_START + (1 << (VA_BITS - 3)) - (1 << 61)
      59 + # in 32-bit arithmetic
      60 + KASAN_SHADOW_OFFSET := $(shell printf "0x%08x00000000\n" $$(( \
      61 + 	(0xffffffff & (-1 << ($(CONFIG_ARM64_VA_BITS) - 32))) \
      62 + 	+ (1 << ($(CONFIG_ARM64_VA_BITS) - 32 - 3)) \
      63 + 	- (1 << (64 - 32 - 3)) )) )
      64 +
  58  65   export TEXT_OFFSET GZFLAGS
  59  66
  60  67   core-y += arch/arm64/kernel/ arch/arm64/mm/
+11 -7
arch/arm64/boot/dts/arm/juno-r1.dts
···
  91  91   	};
  92  92   };
  93  93
  94   -   	pmu {
  95   -   		compatible = "arm,armv8-pmuv3";
      94 + 	pmu_a57 {
      95 + 		compatible = "arm,cortex-a57-pmu";
  96  96   		interrupts = <GIC_SPI 02 IRQ_TYPE_LEVEL_HIGH>,
  97   -   			     <GIC_SPI 06 IRQ_TYPE_LEVEL_HIGH>,
  98   -   			     <GIC_SPI 18 IRQ_TYPE_LEVEL_HIGH>,
      97 + 			     <GIC_SPI 06 IRQ_TYPE_LEVEL_HIGH>;
      98 + 		interrupt-affinity = <&A57_0>,
      99 + 				     <&A57_1>;
     100 + 	};
     101 +
     102 + 	pmu_a53 {
     103 + 		compatible = "arm,cortex-a53-pmu";
     104 + 		interrupts = <GIC_SPI 18 IRQ_TYPE_LEVEL_HIGH>,
  99 105   			     <GIC_SPI 22 IRQ_TYPE_LEVEL_HIGH>,
 100 106   			     <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>,
 101 107   			     <GIC_SPI 30 IRQ_TYPE_LEVEL_HIGH>;
 102   -   		interrupt-affinity = <&A57_0>,
 103   -   				     <&A57_1>,
 104   -   				     <&A53_0>,
     108 + 		interrupt-affinity = <&A53_0>,
 105 109   				     <&A53_1>,
 106 110   				     <&A53_2>,
 107 111   				     <&A53_3>;
+11 -7
arch/arm64/boot/dts/arm/juno.dts
···
  91  91   	};
  92  92   };
  93  93
  94   -   	pmu {
  95   -   		compatible = "arm,armv8-pmuv3";
      94 + 	pmu_a57 {
      95 + 		compatible = "arm,cortex-a57-pmu";
  96  96   		interrupts = <GIC_SPI 02 IRQ_TYPE_LEVEL_HIGH>,
  97   -   			     <GIC_SPI 06 IRQ_TYPE_LEVEL_HIGH>,
  98   -   			     <GIC_SPI 18 IRQ_TYPE_LEVEL_HIGH>,
      97 + 			     <GIC_SPI 06 IRQ_TYPE_LEVEL_HIGH>;
      98 + 		interrupt-affinity = <&A57_0>,
      99 + 				     <&A57_1>;
     100 + 	};
     101 +
     102 + 	pmu_a53 {
     103 + 		compatible = "arm,cortex-a53-pmu";
     104 + 		interrupts = <GIC_SPI 18 IRQ_TYPE_LEVEL_HIGH>,
  99 105   			     <GIC_SPI 22 IRQ_TYPE_LEVEL_HIGH>,
 100 106   			     <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>,
 101 107   			     <GIC_SPI 30 IRQ_TYPE_LEVEL_HIGH>;
 102   -   		interrupt-affinity = <&A57_0>,
 103   -   				     <&A57_1>,
 104   -   				     <&A53_0>,
     108 + 		interrupt-affinity = <&A53_0>,
 105 109   				     <&A53_1>,
 106 110   				     <&A53_2>,
 107 111   				     <&A53_3>;
+9
arch/arm64/configs/defconfig
···
  51  51   CONFIG_PCI_MSI=y
  52  52   CONFIG_PCI_XGENE=y
  53  53   CONFIG_SMP=y
      54 + CONFIG_SCHED_MC=y
  54  55   CONFIG_PREEMPT=y
  55  56   CONFIG_KSM=y
  56  57   CONFIG_TRANSPARENT_HUGEPAGE=y
···
 110 109   CONFIG_SERIAL_8250_MT6577=y
 111 110   CONFIG_SERIAL_AMBA_PL011=y
 112 111   CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
     112 + CONFIG_SERIAL_SAMSUNG=y
     113 + CONFIG_SERIAL_SAMSUNG_UARTS_4=y
     114 + CONFIG_SERIAL_SAMSUNG_UARTS=4
     115 + CONFIG_SERIAL_SAMSUNG_CONSOLE=y
 113 116   CONFIG_SERIAL_MSM=y
 114 117   CONFIG_SERIAL_MSM_CONSOLE=y
 115 118   CONFIG_SERIAL_OF_PLATFORM=y
···
 150 145   CONFIG_MMC_SDHCI=y
 151 146   CONFIG_MMC_SDHCI_PLTFM=y
 152 147   CONFIG_MMC_SPI=y
     148 + CONFIG_MMC_DW=y
     149 + CONFIG_MMC_DW_IDMAC=y
     150 + CONFIG_MMC_DW_PLTFM=y
     151 + CONFIG_MMC_DW_EXYNOS=y
 153 152   CONFIG_NEW_LEDS=y
 154 153   CONFIG_LEDS_CLASS=y
 155 154   CONFIG_LEDS_SYSCON=y
+11
arch/arm64/include/asm/assembler.h
···
 193 193   	str	\src, [\tmp, :lo12:\sym]
 194 194   	.endm
 195 195
     196 + /*
     197 +  * Annotate a function as position independent, i.e., safe to be called before
     198 +  * the kernel virtual mapping is activated.
     199 +  */
     200 + #define ENDPIPROC(x)			\
     201 + 	.globl	__pi_##x;		\
     202 + 	.type	__pi_##x, %function;	\
     203 + 	.set	__pi_##x, x;		\
     204 + 	.size	__pi_##x, . - x;	\
     205 + 	ENDPROC(x)
     206 +
 196 207   #endif /* __ASM_ASSEMBLER_H */
+59 -4
arch/arm64/include/asm/atomic.h
··· 55 55 56 56 #define atomic_read(v) READ_ONCE((v)->counter) 57 57 #define atomic_set(v, i) WRITE_ONCE(((v)->counter), (i)) 58 + 59 + #define atomic_add_return_relaxed atomic_add_return_relaxed 60 + #define atomic_add_return_acquire atomic_add_return_acquire 61 + #define atomic_add_return_release atomic_add_return_release 62 + #define atomic_add_return atomic_add_return 63 + 64 + #define atomic_inc_return_relaxed(v) atomic_add_return_relaxed(1, (v)) 65 + #define atomic_inc_return_acquire(v) atomic_add_return_acquire(1, (v)) 66 + #define atomic_inc_return_release(v) atomic_add_return_release(1, (v)) 67 + #define atomic_inc_return(v) atomic_add_return(1, (v)) 68 + 69 + #define atomic_sub_return_relaxed atomic_sub_return_relaxed 70 + #define atomic_sub_return_acquire atomic_sub_return_acquire 71 + #define atomic_sub_return_release atomic_sub_return_release 72 + #define atomic_sub_return atomic_sub_return 73 + 74 + #define atomic_dec_return_relaxed(v) atomic_sub_return_relaxed(1, (v)) 75 + #define atomic_dec_return_acquire(v) atomic_sub_return_acquire(1, (v)) 76 + #define atomic_dec_return_release(v) atomic_sub_return_release(1, (v)) 77 + #define atomic_dec_return(v) atomic_sub_return(1, (v)) 78 + 79 + #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) 80 + #define atomic_xchg_acquire(v, new) xchg_acquire(&((v)->counter), (new)) 81 + #define atomic_xchg_release(v, new) xchg_release(&((v)->counter), (new)) 58 82 #define atomic_xchg(v, new) xchg(&((v)->counter), (new)) 83 + 84 + #define atomic_cmpxchg_relaxed(v, old, new) \ 85 + cmpxchg_relaxed(&((v)->counter), (old), (new)) 86 + #define atomic_cmpxchg_acquire(v, old, new) \ 87 + cmpxchg_acquire(&((v)->counter), (old), (new)) 88 + #define atomic_cmpxchg_release(v, old, new) \ 89 + cmpxchg_release(&((v)->counter), (old), (new)) 59 90 #define atomic_cmpxchg(v, old, new) cmpxchg(&((v)->counter), (old), (new)) 60 91 61 92 #define atomic_inc(v) atomic_add(1, (v)) 62 93 #define atomic_dec(v) 
atomic_sub(1, (v)) 63 - #define atomic_inc_return(v) atomic_add_return(1, (v)) 64 - #define atomic_dec_return(v) atomic_sub_return(1, (v)) 65 94 #define atomic_inc_and_test(v) (atomic_inc_return(v) == 0) 66 95 #define atomic_dec_and_test(v) (atomic_dec_return(v) == 0) 67 96 #define atomic_sub_and_test(i, v) (atomic_sub_return((i), (v)) == 0) ··· 104 75 #define ATOMIC64_INIT ATOMIC_INIT 105 76 #define atomic64_read atomic_read 106 77 #define atomic64_set atomic_set 78 + 79 + #define atomic64_add_return_relaxed atomic64_add_return_relaxed 80 + #define atomic64_add_return_acquire atomic64_add_return_acquire 81 + #define atomic64_add_return_release atomic64_add_return_release 82 + #define atomic64_add_return atomic64_add_return 83 + 84 + #define atomic64_inc_return_relaxed(v) atomic64_add_return_relaxed(1, (v)) 85 + #define atomic64_inc_return_acquire(v) atomic64_add_return_acquire(1, (v)) 86 + #define atomic64_inc_return_release(v) atomic64_add_return_release(1, (v)) 87 + #define atomic64_inc_return(v) atomic64_add_return(1, (v)) 88 + 89 + #define atomic64_sub_return_relaxed atomic64_sub_return_relaxed 90 + #define atomic64_sub_return_acquire atomic64_sub_return_acquire 91 + #define atomic64_sub_return_release atomic64_sub_return_release 92 + #define atomic64_sub_return atomic64_sub_return 93 + 94 + #define atomic64_dec_return_relaxed(v) atomic64_sub_return_relaxed(1, (v)) 95 + #define atomic64_dec_return_acquire(v) atomic64_sub_return_acquire(1, (v)) 96 + #define atomic64_dec_return_release(v) atomic64_sub_return_release(1, (v)) 97 + #define atomic64_dec_return(v) atomic64_sub_return(1, (v)) 98 + 99 + #define atomic64_xchg_relaxed atomic_xchg_relaxed 100 + #define atomic64_xchg_acquire atomic_xchg_acquire 101 + #define atomic64_xchg_release atomic_xchg_release 107 102 #define atomic64_xchg atomic_xchg 103 + 104 + #define atomic64_cmpxchg_relaxed atomic_cmpxchg_relaxed 105 + #define atomic64_cmpxchg_acquire atomic_cmpxchg_acquire 106 + #define atomic64_cmpxchg_release 
atomic_cmpxchg_release 108 107 #define atomic64_cmpxchg atomic_cmpxchg 109 108 110 109 #define atomic64_inc(v) atomic64_add(1, (v)) 111 110 #define atomic64_dec(v) atomic64_sub(1, (v)) 112 - #define atomic64_inc_return(v) atomic64_add_return(1, (v)) 113 - #define atomic64_dec_return(v) atomic64_sub_return(1, (v)) 114 111 #define atomic64_inc_and_test(v) (atomic64_inc_return(v) == 0) 115 112 #define atomic64_dec_and_test(v) (atomic64_dec_return(v) == 0) 116 113 #define atomic64_sub_and_test(i, v) (atomic64_sub_return((i), (v)) == 0)
+60 -38
arch/arm64/include/asm/atomic_ll_sc.h
··· 55 55 } \ 56 56 __LL_SC_EXPORT(atomic_##op); 57 57 58 - #define ATOMIC_OP_RETURN(op, asm_op) \ 58 + #define ATOMIC_OP_RETURN(name, mb, acq, rel, cl, op, asm_op) \ 59 59 __LL_SC_INLINE int \ 60 - __LL_SC_PREFIX(atomic_##op##_return(int i, atomic_t *v)) \ 60 + __LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v)) \ 61 61 { \ 62 62 unsigned long tmp; \ 63 63 int result; \ 64 64 \ 65 - asm volatile("// atomic_" #op "_return\n" \ 65 + asm volatile("// atomic_" #op "_return" #name "\n" \ 66 66 " prfm pstl1strm, %2\n" \ 67 - "1: ldxr %w0, %2\n" \ 67 + "1: ld" #acq "xr %w0, %2\n" \ 68 68 " " #asm_op " %w0, %w0, %w3\n" \ 69 - " stlxr %w1, %w0, %2\n" \ 70 - " cbnz %w1, 1b" \ 69 + " st" #rel "xr %w1, %w0, %2\n" \ 70 + " cbnz %w1, 1b\n" \ 71 + " " #mb \ 71 72 : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ 72 73 : "Ir" (i) \ 73 - : "memory"); \ 74 + : cl); \ 74 75 \ 75 - smp_mb(); \ 76 76 return result; \ 77 77 } \ 78 - __LL_SC_EXPORT(atomic_##op##_return); 78 + __LL_SC_EXPORT(atomic_##op##_return##name); 79 79 80 - #define ATOMIC_OPS(op, asm_op) \ 81 - ATOMIC_OP(op, asm_op) \ 82 - ATOMIC_OP_RETURN(op, asm_op) 80 + #define ATOMIC_OPS(...) \ 81 + ATOMIC_OP(__VA_ARGS__) \ 82 + ATOMIC_OP_RETURN( , dmb ish, , l, "memory", __VA_ARGS__) 83 83 84 - ATOMIC_OPS(add, add) 85 - ATOMIC_OPS(sub, sub) 84 + #define ATOMIC_OPS_RLX(...) 
\ 85 + ATOMIC_OPS(__VA_ARGS__) \ 86 + ATOMIC_OP_RETURN(_relaxed, , , , , __VA_ARGS__)\ 87 + ATOMIC_OP_RETURN(_acquire, , a, , "memory", __VA_ARGS__)\ 88 + ATOMIC_OP_RETURN(_release, , , l, "memory", __VA_ARGS__) 89 + 90 + ATOMIC_OPS_RLX(add, add) 91 + ATOMIC_OPS_RLX(sub, sub) 86 92 87 93 ATOMIC_OP(and, and) 88 94 ATOMIC_OP(andnot, bic) 89 95 ATOMIC_OP(or, orr) 90 96 ATOMIC_OP(xor, eor) 91 97 98 + #undef ATOMIC_OPS_RLX 92 99 #undef ATOMIC_OPS 93 100 #undef ATOMIC_OP_RETURN 94 101 #undef ATOMIC_OP ··· 118 111 } \ 119 112 __LL_SC_EXPORT(atomic64_##op); 120 113 121 - #define ATOMIC64_OP_RETURN(op, asm_op) \ 114 + #define ATOMIC64_OP_RETURN(name, mb, acq, rel, cl, op, asm_op) \ 122 115 __LL_SC_INLINE long \ 123 - __LL_SC_PREFIX(atomic64_##op##_return(long i, atomic64_t *v)) \ 116 + __LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v)) \ 124 117 { \ 125 118 long result; \ 126 119 unsigned long tmp; \ 127 120 \ 128 - asm volatile("// atomic64_" #op "_return\n" \ 121 + asm volatile("// atomic64_" #op "_return" #name "\n" \ 129 122 " prfm pstl1strm, %2\n" \ 130 - "1: ldxr %0, %2\n" \ 123 + "1: ld" #acq "xr %0, %2\n" \ 131 124 " " #asm_op " %0, %0, %3\n" \ 132 - " stlxr %w1, %0, %2\n" \ 133 - " cbnz %w1, 1b" \ 125 + " st" #rel "xr %w1, %0, %2\n" \ 126 + " cbnz %w1, 1b\n" \ 127 + " " #mb \ 134 128 : "=&r" (result), "=&r" (tmp), "+Q" (v->counter) \ 135 129 : "Ir" (i) \ 136 - : "memory"); \ 130 + : cl); \ 137 131 \ 138 - smp_mb(); \ 139 132 return result; \ 140 133 } \ 141 - __LL_SC_EXPORT(atomic64_##op##_return); 134 + __LL_SC_EXPORT(atomic64_##op##_return##name); 142 135 143 - #define ATOMIC64_OPS(op, asm_op) \ 144 - ATOMIC64_OP(op, asm_op) \ 145 - ATOMIC64_OP_RETURN(op, asm_op) 136 + #define ATOMIC64_OPS(...) \ 137 + ATOMIC64_OP(__VA_ARGS__) \ 138 + ATOMIC64_OP_RETURN(, dmb ish, , l, "memory", __VA_ARGS__) 146 139 147 - ATOMIC64_OPS(add, add) 148 - ATOMIC64_OPS(sub, sub) 140 + #define ATOMIC64_OPS_RLX(...) 
\ 141 + ATOMIC64_OPS(__VA_ARGS__) \ 142 + ATOMIC64_OP_RETURN(_relaxed,, , , , __VA_ARGS__) \ 143 + ATOMIC64_OP_RETURN(_acquire,, a, , "memory", __VA_ARGS__) \ 144 + ATOMIC64_OP_RETURN(_release,, , l, "memory", __VA_ARGS__) 145 + 146 + ATOMIC64_OPS_RLX(add, add) 147 + ATOMIC64_OPS_RLX(sub, sub) 149 148 150 149 ATOMIC64_OP(and, and) 151 150 ATOMIC64_OP(andnot, bic) 152 151 ATOMIC64_OP(or, orr) 153 152 ATOMIC64_OP(xor, eor) 154 153 154 + #undef ATOMIC64_OPS_RLX 155 155 #undef ATOMIC64_OPS 156 156 #undef ATOMIC64_OP_RETURN 157 157 #undef ATOMIC64_OP ··· 186 172 } 187 173 __LL_SC_EXPORT(atomic64_dec_if_positive); 188 174 189 - #define __CMPXCHG_CASE(w, sz, name, mb, rel, cl) \ 175 + #define __CMPXCHG_CASE(w, sz, name, mb, acq, rel, cl) \ 190 176 __LL_SC_INLINE unsigned long \ 191 177 __LL_SC_PREFIX(__cmpxchg_case_##name(volatile void *ptr, \ 192 178 unsigned long old, \ ··· 196 182 \ 197 183 asm volatile( \ 198 184 " prfm pstl1strm, %[v]\n" \ 199 - "1: ldxr" #sz "\t%" #w "[oldval], %[v]\n" \ 185 + "1: ld" #acq "xr" #sz "\t%" #w "[oldval], %[v]\n" \ 200 186 " eor %" #w "[tmp], %" #w "[oldval], %" #w "[old]\n" \ 201 187 " cbnz %" #w "[tmp], 2f\n" \ 202 188 " st" #rel "xr" #sz "\t%w[tmp], %" #w "[new], %[v]\n" \ ··· 213 199 } \ 214 200 __LL_SC_EXPORT(__cmpxchg_case_##name); 215 201 216 - __CMPXCHG_CASE(w, b, 1, , , ) 217 - __CMPXCHG_CASE(w, h, 2, , , ) 218 - __CMPXCHG_CASE(w, , 4, , , ) 219 - __CMPXCHG_CASE( , , 8, , , ) 220 - __CMPXCHG_CASE(w, b, mb_1, dmb ish, l, "memory") 221 - __CMPXCHG_CASE(w, h, mb_2, dmb ish, l, "memory") 222 - __CMPXCHG_CASE(w, , mb_4, dmb ish, l, "memory") 223 - __CMPXCHG_CASE( , , mb_8, dmb ish, l, "memory") 202 + __CMPXCHG_CASE(w, b, 1, , , , ) 203 + __CMPXCHG_CASE(w, h, 2, , , , ) 204 + __CMPXCHG_CASE(w, , 4, , , , ) 205 + __CMPXCHG_CASE( , , 8, , , , ) 206 + __CMPXCHG_CASE(w, b, acq_1, , a, , "memory") 207 + __CMPXCHG_CASE(w, h, acq_2, , a, , "memory") 208 + __CMPXCHG_CASE(w, , acq_4, , a, , "memory") 209 + __CMPXCHG_CASE( , , acq_8, , a, , 
"memory") 210 + __CMPXCHG_CASE(w, b, rel_1, , , l, "memory") 211 + __CMPXCHG_CASE(w, h, rel_2, , , l, "memory") 212 + __CMPXCHG_CASE(w, , rel_4, , , l, "memory") 213 + __CMPXCHG_CASE( , , rel_8, , , l, "memory") 214 + __CMPXCHG_CASE(w, b, mb_1, dmb ish, , l, "memory") 215 + __CMPXCHG_CASE(w, h, mb_2, dmb ish, , l, "memory") 216 + __CMPXCHG_CASE(w, , mb_4, dmb ish, , l, "memory") 217 + __CMPXCHG_CASE( , , mb_8, dmb ish, , l, "memory") 224 218 225 219 #undef __CMPXCHG_CASE 226 220
+119 -80
arch/arm64/include/asm/atomic_lse.h
··· 75 75 : "x30"); 76 76 } 77 77 78 - static inline int atomic_add_return(int i, atomic_t *v) 79 - { 80 - register int w0 asm ("w0") = i; 81 - register atomic_t *x1 asm ("x1") = v; 82 - 83 - asm volatile(ARM64_LSE_ATOMIC_INSN( 84 - /* LL/SC */ 85 - " nop\n" 86 - __LL_SC_ATOMIC(add_return), 87 - /* LSE atomics */ 88 - " ldaddal %w[i], w30, %[v]\n" 89 - " add %w[i], %w[i], w30") 90 - : [i] "+r" (w0), [v] "+Q" (v->counter) 91 - : "r" (x1) 92 - : "x30", "memory"); 93 - 94 - return w0; 78 + #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ 79 + static inline int atomic_add_return##name(int i, atomic_t *v) \ 80 + { \ 81 + register int w0 asm ("w0") = i; \ 82 + register atomic_t *x1 asm ("x1") = v; \ 83 + \ 84 + asm volatile(ARM64_LSE_ATOMIC_INSN( \ 85 + /* LL/SC */ \ 86 + " nop\n" \ 87 + __LL_SC_ATOMIC(add_return##name), \ 88 + /* LSE atomics */ \ 89 + " ldadd" #mb " %w[i], w30, %[v]\n" \ 90 + " add %w[i], %w[i], w30") \ 91 + : [i] "+r" (w0), [v] "+Q" (v->counter) \ 92 + : "r" (x1) \ 93 + : "x30" , ##cl); \ 94 + \ 95 + return w0; \ 95 96 } 97 + 98 + ATOMIC_OP_ADD_RETURN(_relaxed, ) 99 + ATOMIC_OP_ADD_RETURN(_acquire, a, "memory") 100 + ATOMIC_OP_ADD_RETURN(_release, l, "memory") 101 + ATOMIC_OP_ADD_RETURN( , al, "memory") 102 + 103 + #undef ATOMIC_OP_ADD_RETURN 96 104 97 105 static inline void atomic_and(int i, atomic_t *v) 98 106 { ··· 136 128 : "x30"); 137 129 } 138 130 139 - static inline int atomic_sub_return(int i, atomic_t *v) 140 - { 141 - register int w0 asm ("w0") = i; 142 - register atomic_t *x1 asm ("x1") = v; 143 - 144 - asm volatile(ARM64_LSE_ATOMIC_INSN( 145 - /* LL/SC */ 146 - " nop\n" 147 - __LL_SC_ATOMIC(sub_return) 148 - " nop", 149 - /* LSE atomics */ 150 - " neg %w[i], %w[i]\n" 151 - " ldaddal %w[i], w30, %[v]\n" 152 - " add %w[i], %w[i], w30") 153 - : [i] "+r" (w0), [v] "+Q" (v->counter) 154 - : "r" (x1) 155 - : "x30", "memory"); 156 - 157 - return w0; 131 + #define ATOMIC_OP_SUB_RETURN(name, mb, cl...) 
\ 132 + static inline int atomic_sub_return##name(int i, atomic_t *v) \ 133 + { \ 134 + register int w0 asm ("w0") = i; \ 135 + register atomic_t *x1 asm ("x1") = v; \ 136 + \ 137 + asm volatile(ARM64_LSE_ATOMIC_INSN( \ 138 + /* LL/SC */ \ 139 + " nop\n" \ 140 + __LL_SC_ATOMIC(sub_return##name) \ 141 + " nop", \ 142 + /* LSE atomics */ \ 143 + " neg %w[i], %w[i]\n" \ 144 + " ldadd" #mb " %w[i], w30, %[v]\n" \ 145 + " add %w[i], %w[i], w30") \ 146 + : [i] "+r" (w0), [v] "+Q" (v->counter) \ 147 + : "r" (x1) \ 148 + : "x30" , ##cl); \ 149 + \ 150 + return w0; \ 158 151 } 159 152 153 + ATOMIC_OP_SUB_RETURN(_relaxed, ) 154 + ATOMIC_OP_SUB_RETURN(_acquire, a, "memory") 155 + ATOMIC_OP_SUB_RETURN(_release, l, "memory") 156 + ATOMIC_OP_SUB_RETURN( , al, "memory") 157 + 158 + #undef ATOMIC_OP_SUB_RETURN 160 159 #undef __LL_SC_ATOMIC 161 160 162 161 #define __LL_SC_ATOMIC64(op) __LL_SC_CALL(atomic64_##op) ··· 216 201 : "x30"); 217 202 } 218 203 219 - static inline long atomic64_add_return(long i, atomic64_t *v) 220 - { 221 - register long x0 asm ("x0") = i; 222 - register atomic64_t *x1 asm ("x1") = v; 223 - 224 - asm volatile(ARM64_LSE_ATOMIC_INSN( 225 - /* LL/SC */ 226 - " nop\n" 227 - __LL_SC_ATOMIC64(add_return), 228 - /* LSE atomics */ 229 - " ldaddal %[i], x30, %[v]\n" 230 - " add %[i], %[i], x30") 231 - : [i] "+r" (x0), [v] "+Q" (v->counter) 232 - : "r" (x1) 233 - : "x30", "memory"); 234 - 235 - return x0; 204 + #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...) 
\ 205 + static inline long atomic64_add_return##name(long i, atomic64_t *v) \ 206 + { \ 207 + register long x0 asm ("x0") = i; \ 208 + register atomic64_t *x1 asm ("x1") = v; \ 209 + \ 210 + asm volatile(ARM64_LSE_ATOMIC_INSN( \ 211 + /* LL/SC */ \ 212 + " nop\n" \ 213 + __LL_SC_ATOMIC64(add_return##name), \ 214 + /* LSE atomics */ \ 215 + " ldadd" #mb " %[i], x30, %[v]\n" \ 216 + " add %[i], %[i], x30") \ 217 + : [i] "+r" (x0), [v] "+Q" (v->counter) \ 218 + : "r" (x1) \ 219 + : "x30" , ##cl); \ 220 + \ 221 + return x0; \ 236 222 } 223 + 224 + ATOMIC64_OP_ADD_RETURN(_relaxed, ) 225 + ATOMIC64_OP_ADD_RETURN(_acquire, a, "memory") 226 + ATOMIC64_OP_ADD_RETURN(_release, l, "memory") 227 + ATOMIC64_OP_ADD_RETURN( , al, "memory") 228 + 229 + #undef ATOMIC64_OP_ADD_RETURN 237 230 238 231 static inline void atomic64_and(long i, atomic64_t *v) 239 232 { ··· 277 254 : "x30"); 278 255 } 279 256 280 - static inline long atomic64_sub_return(long i, atomic64_t *v) 281 - { 282 - register long x0 asm ("x0") = i; 283 - register atomic64_t *x1 asm ("x1") = v; 284 - 285 - asm volatile(ARM64_LSE_ATOMIC_INSN( 286 - /* LL/SC */ 287 - " nop\n" 288 - __LL_SC_ATOMIC64(sub_return) 289 - " nop", 290 - /* LSE atomics */ 291 - " neg %[i], %[i]\n" 292 - " ldaddal %[i], x30, %[v]\n" 293 - " add %[i], %[i], x30") 294 - : [i] "+r" (x0), [v] "+Q" (v->counter) 295 - : "r" (x1) 296 - : "x30", "memory"); 297 - 298 - return x0; 257 + #define ATOMIC64_OP_SUB_RETURN(name, mb, cl...) 
\ 258 + static inline long atomic64_sub_return##name(long i, atomic64_t *v) \ 259 + { \ 260 + register long x0 asm ("x0") = i; \ 261 + register atomic64_t *x1 asm ("x1") = v; \ 262 + \ 263 + asm volatile(ARM64_LSE_ATOMIC_INSN( \ 264 + /* LL/SC */ \ 265 + " nop\n" \ 266 + __LL_SC_ATOMIC64(sub_return##name) \ 267 + " nop", \ 268 + /* LSE atomics */ \ 269 + " neg %[i], %[i]\n" \ 270 + " ldadd" #mb " %[i], x30, %[v]\n" \ 271 + " add %[i], %[i], x30") \ 272 + : [i] "+r" (x0), [v] "+Q" (v->counter) \ 273 + : "r" (x1) \ 274 + : "x30" , ##cl); \ 275 + \ 276 + return x0; \ 299 277 } 278 + 279 + ATOMIC64_OP_SUB_RETURN(_relaxed, ) 280 + ATOMIC64_OP_SUB_RETURN(_acquire, a, "memory") 281 + ATOMIC64_OP_SUB_RETURN(_release, l, "memory") 282 + ATOMIC64_OP_SUB_RETURN( , al, "memory") 283 + 284 + #undef ATOMIC64_OP_SUB_RETURN 300 285 301 286 static inline long atomic64_dec_if_positive(atomic64_t *v) 302 287 { ··· 364 333 return x0; \ 365 334 } 366 335 367 - __CMPXCHG_CASE(w, b, 1, ) 368 - __CMPXCHG_CASE(w, h, 2, ) 369 - __CMPXCHG_CASE(w, , 4, ) 370 - __CMPXCHG_CASE(x, , 8, ) 371 - __CMPXCHG_CASE(w, b, mb_1, al, "memory") 372 - __CMPXCHG_CASE(w, h, mb_2, al, "memory") 373 - __CMPXCHG_CASE(w, , mb_4, al, "memory") 374 - __CMPXCHG_CASE(x, , mb_8, al, "memory") 336 + __CMPXCHG_CASE(w, b, 1, ) 337 + __CMPXCHG_CASE(w, h, 2, ) 338 + __CMPXCHG_CASE(w, , 4, ) 339 + __CMPXCHG_CASE(x, , 8, ) 340 + __CMPXCHG_CASE(w, b, acq_1, a, "memory") 341 + __CMPXCHG_CASE(w, h, acq_2, a, "memory") 342 + __CMPXCHG_CASE(w, , acq_4, a, "memory") 343 + __CMPXCHG_CASE(x, , acq_8, a, "memory") 344 + __CMPXCHG_CASE(w, b, rel_1, l, "memory") 345 + __CMPXCHG_CASE(w, h, rel_2, l, "memory") 346 + __CMPXCHG_CASE(w, , rel_4, l, "memory") 347 + __CMPXCHG_CASE(x, , rel_8, l, "memory") 348 + __CMPXCHG_CASE(w, b, mb_1, al, "memory") 349 + __CMPXCHG_CASE(w, h, mb_2, al, "memory") 350 + __CMPXCHG_CASE(w, , mb_4, al, "memory") 351 + __CMPXCHG_CASE(x, , mb_8, al, "memory") 375 352 376 353 #undef __LL_SC_CMPXCHG 377 354 #undef 
__CMPXCHG_CASE
+1 -1
arch/arm64/include/asm/cache.h
···
  18  18
  19  19   #include <asm/cachetype.h>
  20  20
  21   -   #define L1_CACHE_SHIFT		6
      21 + #define L1_CACHE_SHIFT		7
  22  22   #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
  23  23
  24  24   /*
+7
arch/arm64/include/asm/cacheflush.h
···
 115 115   #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 116 116   extern void flush_dcache_page(struct page *);
 117 117
     118 + static inline void __local_flush_icache_all(void)
     119 + {
     120 + 	asm("ic iallu");
     121 + 	dsb(nsh);
     122 + 	isb();
     123 + }
     124 +
 118 125   static inline void __flush_icache_all(void)
 119 126   {
 120 127   	asm("ic	ialluis");
+2 -2
arch/arm64/include/asm/cachetype.h
··· 34 34 35 35 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) 36 36 37 - #define ICACHEF_ALIASING BIT(0) 38 - #define ICACHEF_AIVIVT BIT(1) 37 + #define ICACHEF_ALIASING 0 38 + #define ICACHEF_AIVIVT 1 39 39 40 40 extern unsigned long __icache_flags; 41 41
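The ICACHEF_* constants change from BIT() masks to plain bit numbers, presumably so they can be passed to the `<linux/bitops.h>` set_bit()/test_bit() family, which takes bit *indices* rather than masks. A sketch of the distinction (the flag names are real; the helpers stand in for the kernel bitops):

```c
#include <assert.h>

#define ICACHEF_ALIASING	0
#define ICACHEF_AIVIVT		1

static unsigned long icache_flags;

/* Stand-ins for set_bit()/test_bit(): note they take a bit number. */
static void flags_set_bit(int nr, unsigned long *word)
{
	*word |= 1UL << nr;
}

static int flags_test_bit(int nr, const unsigned long *word)
{
	return (int)((*word >> nr) & 1);
}
```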
+134 -141
arch/arm64/include/asm/cmpxchg.h
··· 25 25 #include <asm/barrier.h> 26 26 #include <asm/lse.h> 27 27 28 - static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size) 29 - { 30 - unsigned long ret, tmp; 31 - 32 - switch (size) { 33 - case 1: 34 - asm volatile(ARM64_LSE_ATOMIC_INSN( 35 - /* LL/SC */ 36 - " prfm pstl1strm, %2\n" 37 - "1: ldxrb %w0, %2\n" 38 - " stlxrb %w1, %w3, %2\n" 39 - " cbnz %w1, 1b\n" 40 - " dmb ish", 41 - /* LSE atomics */ 42 - " nop\n" 43 - " nop\n" 44 - " swpalb %w3, %w0, %2\n" 45 - " nop\n" 46 - " nop") 47 - : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr) 48 - : "r" (x) 49 - : "memory"); 50 - break; 51 - case 2: 52 - asm volatile(ARM64_LSE_ATOMIC_INSN( 53 - /* LL/SC */ 54 - " prfm pstl1strm, %2\n" 55 - "1: ldxrh %w0, %2\n" 56 - " stlxrh %w1, %w3, %2\n" 57 - " cbnz %w1, 1b\n" 58 - " dmb ish", 59 - /* LSE atomics */ 60 - " nop\n" 61 - " nop\n" 62 - " swpalh %w3, %w0, %2\n" 63 - " nop\n" 64 - " nop") 65 - : "=&r" (ret), "=&r" (tmp), "+Q" (*(u16 *)ptr) 66 - : "r" (x) 67 - : "memory"); 68 - break; 69 - case 4: 70 - asm volatile(ARM64_LSE_ATOMIC_INSN( 71 - /* LL/SC */ 72 - " prfm pstl1strm, %2\n" 73 - "1: ldxr %w0, %2\n" 74 - " stlxr %w1, %w3, %2\n" 75 - " cbnz %w1, 1b\n" 76 - " dmb ish", 77 - /* LSE atomics */ 78 - " nop\n" 79 - " nop\n" 80 - " swpal %w3, %w0, %2\n" 81 - " nop\n" 82 - " nop") 83 - : "=&r" (ret), "=&r" (tmp), "+Q" (*(u32 *)ptr) 84 - : "r" (x) 85 - : "memory"); 86 - break; 87 - case 8: 88 - asm volatile(ARM64_LSE_ATOMIC_INSN( 89 - /* LL/SC */ 90 - " prfm pstl1strm, %2\n" 91 - "1: ldxr %0, %2\n" 92 - " stlxr %w1, %3, %2\n" 93 - " cbnz %w1, 1b\n" 94 - " dmb ish", 95 - /* LSE atomics */ 96 - " nop\n" 97 - " nop\n" 98 - " swpal %3, %0, %2\n" 99 - " nop\n" 100 - " nop") 101 - : "=&r" (ret), "=&r" (tmp), "+Q" (*(u64 *)ptr) 102 - : "r" (x) 103 - : "memory"); 104 - break; 105 - default: 106 - BUILD_BUG(); 107 - } 108 - 109 - return ret; 28 + /* 29 + * We need separate acquire parameters for ll/sc and lse, since the full 30 + * barrier case is generated 
as release+dmb for the former and 31 + * acquire+release for the latter. 32 + */ 33 + #define __XCHG_CASE(w, sz, name, mb, nop_lse, acq, acq_lse, rel, cl) \ 34 + static inline unsigned long __xchg_case_##name(unsigned long x, \ 35 + volatile void *ptr) \ 36 + { \ 37 + unsigned long ret, tmp; \ 38 + \ 39 + asm volatile(ARM64_LSE_ATOMIC_INSN( \ 40 + /* LL/SC */ \ 41 + " prfm pstl1strm, %2\n" \ 42 + "1: ld" #acq "xr" #sz "\t%" #w "0, %2\n" \ 43 + " st" #rel "xr" #sz "\t%w1, %" #w "3, %2\n" \ 44 + " cbnz %w1, 1b\n" \ 45 + " " #mb, \ 46 + /* LSE atomics */ \ 47 + " nop\n" \ 48 + " nop\n" \ 49 + " swp" #acq_lse #rel #sz "\t%" #w "3, %" #w "0, %2\n" \ 50 + " nop\n" \ 51 + " " #nop_lse) \ 52 + : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr) \ 53 + : "r" (x) \ 54 + : cl); \ 55 + \ 56 + return ret; \ 110 57 } 111 58 112 - #define xchg(ptr,x) \ 113 - ({ \ 114 - __typeof__(*(ptr)) __ret; \ 115 - __ret = (__typeof__(*(ptr))) \ 116 - __xchg((unsigned long)(x), (ptr), sizeof(*(ptr))); \ 117 - __ret; \ 118 - }) 59 + __XCHG_CASE(w, b, 1, , , , , , ) 60 + __XCHG_CASE(w, h, 2, , , , , , ) 61 + __XCHG_CASE(w, , 4, , , , , , ) 62 + __XCHG_CASE( , , 8, , , , , , ) 63 + __XCHG_CASE(w, b, acq_1, , , a, a, , "memory") 64 + __XCHG_CASE(w, h, acq_2, , , a, a, , "memory") 65 + __XCHG_CASE(w, , acq_4, , , a, a, , "memory") 66 + __XCHG_CASE( , , acq_8, , , a, a, , "memory") 67 + __XCHG_CASE(w, b, rel_1, , , , , l, "memory") 68 + __XCHG_CASE(w, h, rel_2, , , , , l, "memory") 69 + __XCHG_CASE(w, , rel_4, , , , , l, "memory") 70 + __XCHG_CASE( , , rel_8, , , , , l, "memory") 71 + __XCHG_CASE(w, b, mb_1, dmb ish, nop, , a, l, "memory") 72 + __XCHG_CASE(w, h, mb_2, dmb ish, nop, , a, l, "memory") 73 + __XCHG_CASE(w, , mb_4, dmb ish, nop, , a, l, "memory") 74 + __XCHG_CASE( , , mb_8, dmb ish, nop, , a, l, "memory") 119 75 120 - static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, 121 - unsigned long new, int size) 122 - { 123 - switch (size) { 124 - case 1: 125 - return 
__cmpxchg_case_1(ptr, (u8)old, new); 126 - case 2: 127 - return __cmpxchg_case_2(ptr, (u16)old, new); 128 - case 4: 129 - return __cmpxchg_case_4(ptr, old, new); 130 - case 8: 131 - return __cmpxchg_case_8(ptr, old, new); 132 - default: 133 - BUILD_BUG(); 134 - } 76 + #undef __XCHG_CASE 135 77 136 - unreachable(); 78 + #define __XCHG_GEN(sfx) \ 79 + static inline unsigned long __xchg##sfx(unsigned long x, \ 80 + volatile void *ptr, \ 81 + int size) \ 82 + { \ 83 + switch (size) { \ 84 + case 1: \ 85 + return __xchg_case##sfx##_1(x, ptr); \ 86 + case 2: \ 87 + return __xchg_case##sfx##_2(x, ptr); \ 88 + case 4: \ 89 + return __xchg_case##sfx##_4(x, ptr); \ 90 + case 8: \ 91 + return __xchg_case##sfx##_8(x, ptr); \ 92 + default: \ 93 + BUILD_BUG(); \ 94 + } \ 95 + \ 96 + unreachable(); \ 137 97 } 138 98 139 - static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, 140 - unsigned long new, int size) 141 - { 142 - switch (size) { 143 - case 1: 144 - return __cmpxchg_case_mb_1(ptr, (u8)old, new); 145 - case 2: 146 - return __cmpxchg_case_mb_2(ptr, (u16)old, new); 147 - case 4: 148 - return __cmpxchg_case_mb_4(ptr, old, new); 149 - case 8: 150 - return __cmpxchg_case_mb_8(ptr, old, new); 151 - default: 152 - BUILD_BUG(); 153 - } 99 + __XCHG_GEN() 100 + __XCHG_GEN(_acq) 101 + __XCHG_GEN(_rel) 102 + __XCHG_GEN(_mb) 154 103 155 - unreachable(); 104 + #undef __XCHG_GEN 105 + 106 + #define __xchg_wrapper(sfx, ptr, x) \ 107 + ({ \ 108 + __typeof__(*(ptr)) __ret; \ 109 + __ret = (__typeof__(*(ptr))) \ 110 + __xchg##sfx((unsigned long)(x), (ptr), sizeof(*(ptr))); \ 111 + __ret; \ 112 + }) 113 + 114 + /* xchg */ 115 + #define xchg_relaxed(...) __xchg_wrapper( , __VA_ARGS__) 116 + #define xchg_acquire(...) __xchg_wrapper(_acq, __VA_ARGS__) 117 + #define xchg_release(...) __xchg_wrapper(_rel, __VA_ARGS__) 118 + #define xchg(...) 
__xchg_wrapper( _mb, __VA_ARGS__) 119 + 120 + #define __CMPXCHG_GEN(sfx) \ 121 + static inline unsigned long __cmpxchg##sfx(volatile void *ptr, \ 122 + unsigned long old, \ 123 + unsigned long new, \ 124 + int size) \ 125 + { \ 126 + switch (size) { \ 127 + case 1: \ 128 + return __cmpxchg_case##sfx##_1(ptr, (u8)old, new); \ 129 + case 2: \ 130 + return __cmpxchg_case##sfx##_2(ptr, (u16)old, new); \ 131 + case 4: \ 132 + return __cmpxchg_case##sfx##_4(ptr, old, new); \ 133 + case 8: \ 134 + return __cmpxchg_case##sfx##_8(ptr, old, new); \ 135 + default: \ 136 + BUILD_BUG(); \ 137 + } \ 138 + \ 139 + unreachable(); \ 156 140 } 157 141 158 - #define cmpxchg(ptr, o, n) \ 159 - ({ \ 160 - __typeof__(*(ptr)) __ret; \ 161 - __ret = (__typeof__(*(ptr))) \ 162 - __cmpxchg_mb((ptr), (unsigned long)(o), (unsigned long)(n), \ 163 - sizeof(*(ptr))); \ 164 - __ret; \ 142 + __CMPXCHG_GEN() 143 + __CMPXCHG_GEN(_acq) 144 + __CMPXCHG_GEN(_rel) 145 + __CMPXCHG_GEN(_mb) 146 + 147 + #undef __CMPXCHG_GEN 148 + 149 + #define __cmpxchg_wrapper(sfx, ptr, o, n) \ 150 + ({ \ 151 + __typeof__(*(ptr)) __ret; \ 152 + __ret = (__typeof__(*(ptr))) \ 153 + __cmpxchg##sfx((ptr), (unsigned long)(o), \ 154 + (unsigned long)(n), sizeof(*(ptr))); \ 155 + __ret; \ 165 156 }) 166 157 167 - #define cmpxchg_local(ptr, o, n) \ 168 - ({ \ 169 - __typeof__(*(ptr)) __ret; \ 170 - __ret = (__typeof__(*(ptr))) \ 171 - __cmpxchg((ptr), (unsigned long)(o), \ 172 - (unsigned long)(n), sizeof(*(ptr))); \ 173 - __ret; \ 174 - }) 158 + /* cmpxchg */ 159 + #define cmpxchg_relaxed(...) __cmpxchg_wrapper( , __VA_ARGS__) 160 + #define cmpxchg_acquire(...) __cmpxchg_wrapper(_acq, __VA_ARGS__) 161 + #define cmpxchg_release(...) __cmpxchg_wrapper(_rel, __VA_ARGS__) 162 + #define cmpxchg(...) 
__cmpxchg_wrapper( _mb, __VA_ARGS__) 163 + #define cmpxchg_local cmpxchg_relaxed 175 164 165 + /* cmpxchg64 */ 166 + #define cmpxchg64_relaxed cmpxchg_relaxed 167 + #define cmpxchg64_acquire cmpxchg_acquire 168 + #define cmpxchg64_release cmpxchg_release 169 + #define cmpxchg64 cmpxchg 170 + #define cmpxchg64_local cmpxchg_local 171 + 172 + /* cmpxchg_double */ 176 173 #define system_has_cmpxchg_double() 1 177 174 178 175 #define __cmpxchg_double_check(ptr1, ptr2) \ ··· 199 202 __ret; \ 200 203 }) 201 204 205 + /* this_cpu_cmpxchg */ 202 206 #define _protect_cmpxchg_local(pcp, o, n) \ 203 207 ({ \ 204 208 typeof(*raw_cpu_ptr(&(pcp))) __ret; \ ··· 224 226 preempt_enable(); \ 225 227 __ret; \ 226 228 }) 227 - 228 - #define cmpxchg64(ptr,o,n) cmpxchg((ptr),(o),(n)) 229 - #define cmpxchg64_local(ptr,o,n) cmpxchg_local((ptr),(o),(n)) 230 - 231 - #define cmpxchg64_relaxed(ptr,o,n) cmpxchg_local((ptr),(o),(n)) 232 229 233 230 #endif /* __ASM_CMPXCHG_H */
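The rework above exposes cmpxchg in four ordering flavours: `cmpxchg_relaxed`, `cmpxchg_acquire`, `cmpxchg_release` and the fully-ordered `cmpxchg`. A portable sketch of the calling convention using the GCC/Clang `__atomic` builtins (the kernel macro names are real; `demo_cmpxchg` is illustrative, not the kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* Like cmpxchg(): try to replace *ptr == old with new under the given
 * success ordering, and return the value actually found at *ptr. */
static unsigned long demo_cmpxchg(unsigned long *ptr, unsigned long old,
				  unsigned long new, int order)
{
	unsigned long expected = old;

	__atomic_compare_exchange_n(ptr, &expected, new, false,
				    order, __ATOMIC_RELAXED);
	return expected;
}
```

Callers test success the same way as with the kernel API: the operation took effect iff the returned value equals `old`.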
+4
arch/arm64/include/asm/cpu.h
··· 63 63 void cpuinfo_store_cpu(void); 64 64 void __init cpuinfo_store_boot_cpu(void); 65 65 66 + void __init init_cpu_features(struct cpuinfo_arm64 *info); 67 + void update_cpu_features(int cpu, struct cpuinfo_arm64 *info, 68 + struct cpuinfo_arm64 *boot); 69 + 66 70 #endif /* __ASM_CPU_H */
+83 -8
arch/arm64/include/asm/cpufeature.h
··· 10 10 #define __ASM_CPUFEATURE_H 11 11 12 12 #include <asm/hwcap.h> 13 + #include <asm/sysreg.h> 13 14 14 15 /* 15 16 * In the arm64 world (as in the ARM world), elf_hwcap is used both internally ··· 36 35 37 36 #include <linux/kernel.h> 38 37 38 + /* CPU feature register tracking */ 39 + enum ftr_type { 40 + FTR_EXACT, /* Use a predefined safe value */ 41 + FTR_LOWER_SAFE, /* Smaller value is safe */ 42 + FTR_HIGHER_SAFE,/* Bigger value is safe */ 43 + }; 44 + 45 + #define FTR_STRICT true /* SANITY check strict matching required */ 46 + #define FTR_NONSTRICT false /* SANITY check ignored */ 47 + 48 + struct arm64_ftr_bits { 49 + bool strict; /* CPU Sanity check: strict matching required ? */ 50 + enum ftr_type type; 51 + u8 shift; 52 + u8 width; 53 + s64 safe_val; /* safe value for discrete features */ 54 + }; 55 + 56 + /* 57 + * @arm64_ftr_reg - Feature register 58 + * @strict_mask Bits which should match across all CPUs for sanity. 59 + * @sys_val Safe value across the CPUs (system view) 60 + */ 61 + struct arm64_ftr_reg { 62 + u32 sys_id; 63 + const char *name; 64 + u64 strict_mask; 65 + u64 sys_val; 66 + struct arm64_ftr_bits *ftr_bits; 67 + }; 68 + 39 69 struct arm64_cpu_capabilities { 40 70 const char *desc; 41 71 u16 capability; 42 72 bool (*matches)(const struct arm64_cpu_capabilities *); 43 - void (*enable)(void); 73 + void (*enable)(void *); /* Called on all active CPUs */ 44 74 union { 45 75 struct { /* To be used for erratum handling only */ 46 76 u32 midr_model; ··· 79 47 }; 80 48 81 49 struct { /* Feature register checking */ 50 + u32 sys_reg; 82 51 int field_pos; 83 52 int min_field_value; 53 + int hwcap_type; 54 + unsigned long hwcap; 84 55 }; 85 56 }; 86 57 }; ··· 111 76 __set_bit(num, cpu_hwcaps); 112 77 } 113 78 114 - static inline int __attribute_const__ cpuid_feature_extract_field(u64 features, 115 - int field) 79 + static inline int __attribute_const__ 80 + cpuid_feature_extract_field_width(u64 features, int field, int width) 116 81 { 117 
- return (s64)(features << (64 - 4 - field)) >> (64 - 4); 82 + return (s64)(features << (64 - width - field)) >> (64 - width); 118 83 } 119 84 85 + static inline int __attribute_const__ 86 + cpuid_feature_extract_field(u64 features, int field) 87 + { 88 + return cpuid_feature_extract_field_width(features, field, 4); 89 + } 120 90 121 - void check_cpu_capabilities(const struct arm64_cpu_capabilities *caps, 91 + static inline u64 arm64_ftr_mask(struct arm64_ftr_bits *ftrp) 92 + { 93 + return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift); 94 + } 95 + 96 + static inline s64 arm64_ftr_value(struct arm64_ftr_bits *ftrp, u64 val) 97 + { 98 + return cpuid_feature_extract_field_width(val, ftrp->shift, ftrp->width); 99 + } 100 + 101 + static inline bool id_aa64mmfr0_mixed_endian_el0(u64 mmfr0) 102 + { 103 + return cpuid_feature_extract_field(mmfr0, ID_AA64MMFR0_BIGENDEL_SHIFT) == 0x1 || 104 + cpuid_feature_extract_field(mmfr0, ID_AA64MMFR0_BIGENDEL0_SHIFT) == 0x1; 105 + } 106 + 107 + void __init setup_cpu_features(void); 108 + 109 + void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, 122 110 const char *info); 123 111 void check_local_cpu_errata(void); 124 - void check_local_cpu_features(void); 125 - bool cpu_supports_mixed_endian_el0(void); 126 - bool system_supports_mixed_endian_el0(void); 112 + 113 + #ifdef CONFIG_HOTPLUG_CPU 114 + void verify_local_cpu_capabilities(void); 115 + #else 116 + static inline void verify_local_cpu_capabilities(void) 117 + { 118 + } 119 + #endif 120 + 121 + u64 read_system_reg(u32 id); 122 + 123 + static inline bool cpu_supports_mixed_endian_el0(void) 124 + { 125 + return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1)); 126 + } 127 + 128 + static inline bool system_supports_mixed_endian_el0(void) 129 + { 130 + return id_aa64mmfr0_mixed_endian_el0(read_system_reg(SYS_ID_AA64MMFR0_EL1)); 131 + } 127 132 128 133 #endif /* __ASSEMBLY__ */ 129 134
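cpuid_feature_extract_field_width() above shifts the field to the top of the 64-bit word and then arithmetic-shifts it back, so ID register fields come out sign-extended (a field of 0xf reads as -1, meaning "not implemented" in some registers). A self-contained copy of the arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as the kernel helper; relies on arithmetic right
 * shift of signed values, as the kernel does. */
static int feature_extract_field_width(uint64_t features, int field,
				       int width)
{
	return (int)((int64_t)(features << (64 - width - field)) >>
		     (64 - width));
}
```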
-15
arch/arm64/include/asm/cputype.h
··· 75 75 76 76 #define CAVIUM_CPU_PART_THUNDERX 0x0A1 77 77 78 - #define ID_AA64MMFR0_BIGENDEL0_SHIFT 16 79 - #define ID_AA64MMFR0_BIGENDEL0_MASK (0xf << ID_AA64MMFR0_BIGENDEL0_SHIFT) 80 - #define ID_AA64MMFR0_BIGENDEL0(mmfr0) \ 81 - (((mmfr0) & ID_AA64MMFR0_BIGENDEL0_MASK) >> ID_AA64MMFR0_BIGENDEL0_SHIFT) 82 - #define ID_AA64MMFR0_BIGEND_SHIFT 8 83 - #define ID_AA64MMFR0_BIGEND_MASK (0xf << ID_AA64MMFR0_BIGEND_SHIFT) 84 - #define ID_AA64MMFR0_BIGEND(mmfr0) \ 85 - (((mmfr0) & ID_AA64MMFR0_BIGEND_MASK) >> ID_AA64MMFR0_BIGEND_SHIFT) 86 - 87 78 #ifndef __ASSEMBLY__ 88 79 89 80 /* ··· 105 114 static inline u32 __attribute_const__ read_cpuid_cachetype(void) 106 115 { 107 116 return read_cpuid(CTR_EL0); 108 - } 109 - 110 - static inline bool id_aa64mmfr0_mixed_endian_el0(u64 mmfr0) 111 - { 112 - return (ID_AA64MMFR0_BIGEND(mmfr0) == 0x1) || 113 - (ID_AA64MMFR0_BIGENDEL0(mmfr0) == 0x1); 114 117 } 115 118 #endif /* __ASSEMBLY__ */ 116 119
+2 -5
arch/arm64/include/asm/fixmap.h
··· 17 17 18 18 #ifndef __ASSEMBLY__ 19 19 #include <linux/kernel.h> 20 + #include <linux/sizes.h> 20 21 #include <asm/boot.h> 21 22 #include <asm/page.h> 22 23 ··· 56 55 * Temporary boot-time mappings, used by early_ioremap(), 57 56 * before ioremap() is functional. 58 57 */ 59 - #ifdef CONFIG_ARM64_64K_PAGES 60 - #define NR_FIX_BTMAPS 4 61 - #else 62 - #define NR_FIX_BTMAPS 64 63 - #endif 58 + #define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE) 64 59 #define FIX_BTMAPS_SLOTS 7 65 60 #define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS) 66 61
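Replacing the per-page-size #ifdef with `SZ_256K / PAGE_SIZE` keeps the old values (64 slots for 4K pages, 4 for 64K) while naturally covering the new 16K page size. A quick check of the arithmetic (`SZ_256K` and `NR_FIX_BTMAPS` are the real kernel names; the helper just parameterises the page shift):

```c
#include <assert.h>

#define SZ_256K	(256 * 1024)

/* NR_FIX_BTMAPS for a given PAGE_SHIFT. */
static int nr_fix_btmaps(int page_shift)
{
	return SZ_256K / (1 << page_shift);
}
```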
+7 -2
arch/arm64/include/asm/hw_breakpoint.h
··· 17 17 #define __ASM_HW_BREAKPOINT_H 18 18 19 19 #include <asm/cputype.h> 20 + #include <asm/cpufeature.h> 20 21 21 22 #ifdef __KERNEL__ 22 23 ··· 138 137 /* Determine number of BRP registers available. */ 139 138 static inline int get_num_brps(void) 140 139 { 141 - return ((read_cpuid(ID_AA64DFR0_EL1) >> 12) & 0xf) + 1; 140 + return 1 + 141 + cpuid_feature_extract_field(read_system_reg(SYS_ID_AA64DFR0_EL1), 142 + ID_AA64DFR0_BRPS_SHIFT); 142 143 } 143 144 144 145 /* Determine number of WRP registers available. */ 145 146 static inline int get_num_wrps(void) 146 147 { 147 - return ((read_cpuid(ID_AA64DFR0_EL1) >> 20) & 0xf) + 1; 148 + return 1 + 149 + cpuid_feature_extract_field(read_system_reg(SYS_ID_AA64DFR0_EL1), 150 + ID_AA64DFR0_WRPS_SHIFT); 148 151 } 149 152 150 153 #endif /* __KERNEL__ */
+8
arch/arm64/include/asm/hwcap.h
··· 52 52 extern unsigned int compat_elf_hwcap, compat_elf_hwcap2; 53 53 #endif 54 54 55 + enum { 56 + CAP_HWCAP = 1, 57 + #ifdef CONFIG_COMPAT 58 + CAP_COMPAT_HWCAP, 59 + CAP_COMPAT_HWCAP2, 60 + #endif 61 + }; 62 + 55 63 extern unsigned long elf_hwcap; 56 64 #endif 57 65 #endif
-1
arch/arm64/include/asm/irq.h
··· 7 7 8 8 struct pt_regs; 9 9 10 - extern void migrate_irqs(void); 11 10 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); 12 11 13 12 static inline void acpi_irq_init(void)
+38
arch/arm64/include/asm/kasan.h
··· 1 + #ifndef __ASM_KASAN_H 2 + #define __ASM_KASAN_H 3 + 4 + #ifndef __ASSEMBLY__ 5 + 6 + #ifdef CONFIG_KASAN 7 + 8 + #include <linux/linkage.h> 9 + #include <asm/memory.h> 10 + 11 + /* 12 + * KASAN_SHADOW_START: beginning of the kernel virtual addresses. 13 + * KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses. 14 + */ 15 + #define KASAN_SHADOW_START (VA_START) 16 + #define KASAN_SHADOW_END (KASAN_SHADOW_START + (1UL << (VA_BITS - 3))) 17 + 18 + /* 19 + * This value is used to map an address to the corresponding shadow 20 + * address by the following formula: 21 + * shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET; 22 + * 23 + * (1 << 61) shadow addresses - [KASAN_SHADOW_OFFSET,KASAN_SHADOW_END] 24 + * cover all 64-bits of virtual addresses. So KASAN_SHADOW_OFFSET 25 + * should satisfy the following equation: 26 + * KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - (1ULL << 61) 27 + */ 28 + #define KASAN_SHADOW_OFFSET (KASAN_SHADOW_END - (1ULL << (64 - 3))) 29 + 30 + void kasan_init(void); 31 + asmlinkage void kasan_early_init(void); 32 + 33 + #else 34 + static inline void kasan_init(void) { } 35 + #endif 36 + 37 + #endif 38 + #endif
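The shadow-mapping formula in the new kasan.h can be checked numerically: with 1 shadow byte per 8 bytes, the offset is chosen so the shadow of the very top of the address space lands exactly at KASAN_SHADOW_END. A sketch for VA_BITS = 48 (macro names mirror the kernel's; the function is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define VA_BITS			48
#define VA_START		(~0ULL << VA_BITS)
#define KASAN_SHADOW_START	VA_START
#define KASAN_SHADOW_END	(KASAN_SHADOW_START + (1ULL << (VA_BITS - 3)))
#define KASAN_SHADOW_OFFSET	(KASAN_SHADOW_END - (1ULL << (64 - 3)))

/* shadow_addr = (address >> 3) + KASAN_SHADOW_OFFSET, per the comment. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> 3) + KASAN_SHADOW_OFFSET;
}
```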
+83
arch/arm64/include/asm/kernel-pgtable.h
··· 1 + /* 2 + * Kernel page table mapping 3 + * 4 + * Copyright (C) 2015 ARM Ltd. 5 + * 6 + * This program is free software; you can redistribute it and/or modify 7 + * it under the terms of the GNU General Public License version 2 as 8 + * published by the Free Software Foundation. 9 + * 10 + * This program is distributed in the hope that it will be useful, 11 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 12 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 + * GNU General Public License for more details. 14 + * 15 + * You should have received a copy of the GNU General Public License 16 + * along with this program. If not, see <http://www.gnu.org/licenses/>. 17 + */ 18 + 19 + #ifndef __ASM_KERNEL_PGTABLE_H 20 + #define __ASM_KERNEL_PGTABLE_H 21 + 22 + 23 + /* 24 + * The linear mapping and the start of memory are both 2M aligned (per 25 + * the arm64 booting.txt requirements). Hence we can use section mapping 26 + * with 4K (section size = 2M) but not with 16K (section size = 32M) or 27 + * 64K (section size = 512M). 28 + */ 29 + #ifdef CONFIG_ARM64_4K_PAGES 30 + #define ARM64_SWAPPER_USES_SECTION_MAPS 1 31 + #else 32 + #define ARM64_SWAPPER_USES_SECTION_MAPS 0 33 + #endif 34 + 35 + /* 36 + * The idmap and swapper page tables need some space reserved in the kernel 37 + * image. Both require pgd, pud (4 levels only) and pmd tables to (section) 38 + * map the kernel. With the 64K page configuration, swapper and idmap need to 39 + * map to pte level. The swapper also maps the FDT (see __create_page_tables 40 + * for more information). Note that the number of ID map translation levels 41 + * could be increased on the fly if system RAM is out of reach for the default 42 + * VA range, so pages required to map highest possible PA are reserved in all 43 + * cases. 
44 + */ 45 + #if ARM64_SWAPPER_USES_SECTION_MAPS 46 + #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1) 47 + #define IDMAP_PGTABLE_LEVELS (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT) - 1) 48 + #else 49 + #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS) 50 + #define IDMAP_PGTABLE_LEVELS (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT)) 51 + #endif 52 + 53 + #define SWAPPER_DIR_SIZE (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE) 54 + #define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE) 55 + 56 + /* Initial memory map size */ 57 + #if ARM64_SWAPPER_USES_SECTION_MAPS 58 + #define SWAPPER_BLOCK_SHIFT SECTION_SHIFT 59 + #define SWAPPER_BLOCK_SIZE SECTION_SIZE 60 + #define SWAPPER_TABLE_SHIFT PUD_SHIFT 61 + #else 62 + #define SWAPPER_BLOCK_SHIFT PAGE_SHIFT 63 + #define SWAPPER_BLOCK_SIZE PAGE_SIZE 64 + #define SWAPPER_TABLE_SHIFT PMD_SHIFT 65 + #endif 66 + 67 + /* The size of the initial kernel direct mapping */ 68 + #define SWAPPER_INIT_MAP_SIZE (_AC(1, UL) << SWAPPER_TABLE_SHIFT) 69 + 70 + /* 71 + * Initial memory map attributes. 72 + */ 73 + #define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED) 74 + #define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S) 75 + 76 + #if ARM64_SWAPPER_USES_SECTION_MAPS 77 + #define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS) 78 + #else 79 + #define SWAPPER_MM_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS) 80 + #endif 81 + 82 + 83 + #endif /* __ASM_KERNEL_PGTABLE_H */
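The SWAPPER_PGTABLE_LEVELS logic above drops one level when section maps are usable (4K pages), since the kernel is then mapped at PMD rather than PTE granularity. A sketch of the resulting reservation size (names follow the diff; the helper is illustrative):

```c
#include <assert.h>

/* SWAPPER_DIR_SIZE = SWAPPER_PGTABLE_LEVELS * PAGE_SIZE, where section
 * maps shave one level off CONFIG_PGTABLE_LEVELS. */
static unsigned long swapper_dir_size(int pgtable_levels,
				      int uses_section_maps,
				      unsigned long page_size)
{
	int levels = uses_section_maps ? pgtable_levels - 1 : pgtable_levels;

	return levels * page_size;
}
```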
+2 -4
arch/arm64/include/asm/memory.h
··· 42 42 * PAGE_OFFSET - the virtual address of the start of the kernel image (top 43 43 * (VA_BITS - 1)) 44 44 * VA_BITS - the maximum number of bits for virtual addresses. 45 + * VA_START - the first kernel virtual address. 45 46 * TASK_SIZE - the maximum size of a user space task. 46 47 * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area. 47 48 * The module space lives between the addresses given by TASK_SIZE 48 49 * and PAGE_OFFSET - it must be within 128MB of the kernel text. 49 50 */ 50 51 #define VA_BITS (CONFIG_ARM64_VA_BITS) 52 + #define VA_START (UL(0xffffffffffffffff) << VA_BITS) 51 53 #define PAGE_OFFSET (UL(0xffffffffffffffff) << (VA_BITS - 1)) 52 54 #define MODULES_END (PAGE_OFFSET) 53 55 #define MODULES_VADDR (MODULES_END - SZ_64M) ··· 69 67 #endif /* CONFIG_COMPAT */ 70 68 71 69 #define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 4)) 72 - 73 - #if TASK_SIZE_64 > MODULES_VADDR 74 - #error Top of 64-bit user space clashes with start of module space 75 - #endif 76 70 77 71 /* 78 72 * Physical vs virtual RAM address space conversion. These are
+8 -7
arch/arm64/include/asm/mmu.h
··· 17 17 #define __ASM_MMU_H 18 18 19 19 typedef struct { 20 - unsigned int id; 21 - raw_spinlock_t id_lock; 22 - void *vdso; 20 + atomic64_t id; 21 + void *vdso; 23 22 } mm_context_t; 24 23 25 - #define INIT_MM_CONTEXT(name) \ 26 - .context.id_lock = __RAW_SPIN_LOCK_UNLOCKED(name.context.id_lock), 27 - 28 - #define ASID(mm) ((mm)->context.id & 0xffff) 24 + /* 25 + * This macro is only used by the TLBI code, which cannot race with an 26 + * ASID change and therefore doesn't need to reload the counter using 27 + * atomic64_read. 28 + */ 29 + #define ASID(mm) ((mm)->context.id.counter & 0xffff) 29 30 30 31 extern void paging_init(void); 31 32 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
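With context.id now an atomic64_t, the ASID() macro implies a packing where the low 16 bits carry the hardware ASID and the upper bits carry the allocator generation that is bumped on roll-over. A sketch of that layout (only the `& 0xffff` extraction is taken from the diff; the generation helper is an assumption about the allocator's encoding):

```c
#include <assert.h>
#include <stdint.h>

#define ASID_MASK	0xffffULL

/* Hardware ASID: low 16 bits, as in the ASID() macro above. */
static uint16_t ctx_asid(uint64_t id)
{
	return (uint16_t)(id & ASID_MASK);
}

/* Allocator generation: the remaining upper bits (illustrative). */
static uint64_t ctx_generation(uint64_t id)
{
	return id & ~ASID_MASK;
}
```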
+27 -90
arch/arm64/include/asm/mmu_context.h
··· 28 28 #include <asm/cputype.h> 29 29 #include <asm/pgtable.h> 30 30 31 - #define MAX_ASID_BITS 16 32 - 33 - extern unsigned int cpu_last_asid; 34 - 35 - void __init_new_context(struct task_struct *tsk, struct mm_struct *mm); 36 - void __new_context(struct mm_struct *mm); 37 - 38 31 #ifdef CONFIG_PID_IN_CONTEXTIDR 39 32 static inline void contextidr_thread_switch(struct task_struct *next) 40 33 { ··· 70 77 unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS))); 71 78 } 72 79 73 - static inline void __cpu_set_tcr_t0sz(u64 t0sz) 74 - { 75 - unsigned long tcr; 76 - 77 - if (__cpu_uses_extended_idmap()) 78 - asm volatile ( 79 - " mrs %0, tcr_el1 ;" 80 - " bfi %0, %1, %2, %3 ;" 81 - " msr tcr_el1, %0 ;" 82 - " isb" 83 - : "=&r" (tcr) 84 - : "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH)); 85 - } 86 - 87 - /* 88 - * Set TCR.T0SZ to the value appropriate for activating the identity map. 89 - */ 90 - static inline void cpu_set_idmap_tcr_t0sz(void) 91 - { 92 - __cpu_set_tcr_t0sz(idmap_t0sz); 93 - } 94 - 95 80 /* 96 81 * Set TCR.T0SZ to its default value (based on VA_BITS) 97 82 */ 98 83 static inline void cpu_set_default_tcr_t0sz(void) 99 84 { 100 - __cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS)); 85 + unsigned long tcr; 86 + 87 + if (!__cpu_uses_extended_idmap()) 88 + return; 89 + 90 + asm volatile ( 91 + " mrs %0, tcr_el1 ;" 92 + " bfi %0, %1, %2, %3 ;" 93 + " msr tcr_el1, %0 ;" 94 + " isb" 95 + : "=&r" (tcr) 96 + : "r"(TCR_T0SZ(VA_BITS)), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH)); 101 97 } 102 98 103 - static inline void switch_new_context(struct mm_struct *mm) 104 - { 105 - unsigned long flags; 106 - 107 - __new_context(mm); 108 - 109 - local_irq_save(flags); 110 - cpu_switch_mm(mm->pgd, mm); 111 - local_irq_restore(flags); 112 - } 113 - 114 - static inline void check_and_switch_context(struct mm_struct *mm, 115 - struct task_struct *tsk) 116 - { 117 - /* 118 - * Required during context switch to avoid speculative page table 119 - * walking with the wrong TTBR. 
120 - */ 121 - cpu_set_reserved_ttbr0(); 122 - 123 - if (!((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS)) 124 - /* 125 - * The ASID is from the current generation, just switch to the 126 - * new pgd. This condition is only true for calls from 127 - * context_switch() and interrupts are already disabled. 128 - */ 129 - cpu_switch_mm(mm->pgd, mm); 130 - else if (irqs_disabled()) 131 - /* 132 - * Defer the new ASID allocation until after the context 133 - * switch critical region since __new_context() cannot be 134 - * called with interrupts disabled. 135 - */ 136 - set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM); 137 - else 138 - /* 139 - * That is a direct call to switch_mm() or activate_mm() with 140 - * interrupts enabled and a new context. 141 - */ 142 - switch_new_context(mm); 143 - } 144 - 145 - #define init_new_context(tsk,mm) (__init_new_context(tsk,mm),0) 99 + /* 100 + * It would be nice to return ASIDs back to the allocator, but unfortunately 101 + * that introduces a race with a generation rollover where we could erroneously 102 + * free an ASID allocated in a future generation. We could workaround this by 103 + * freeing the ASID from the context of the dying mm (e.g. in arch_exit_mmap), 104 + * but we'd then need to make sure that we didn't dirty any TLBs afterwards. 105 + * Setting a reserved TTBR0 or EPD0 would work, but it all gets ugly when you 106 + * take CPU migration into account. 
107 + */ 146 108 #define destroy_context(mm) do { } while(0) 109 + void check_and_switch_context(struct mm_struct *mm, unsigned int cpu); 147 110 148 - #define finish_arch_post_lock_switch \ 149 - finish_arch_post_lock_switch 150 - static inline void finish_arch_post_lock_switch(void) 151 - { 152 - if (test_and_clear_thread_flag(TIF_SWITCH_MM)) { 153 - struct mm_struct *mm = current->mm; 154 - unsigned long flags; 155 - 156 - __new_context(mm); 157 - 158 - local_irq_save(flags); 159 - cpu_switch_mm(mm->pgd, mm); 160 - local_irq_restore(flags); 161 - } 162 - } 111 + #define init_new_context(tsk,mm) ({ atomic64_set(&mm->context.id, 0); 0; }) 163 112 164 113 /* 165 114 * This is called when "tsk" is about to enter lazy TLB mode. ··· 129 194 { 130 195 unsigned int cpu = smp_processor_id(); 131 196 197 + if (prev == next) 198 + return; 199 + 132 200 /* 133 201 * init_mm.pgd does not contain any user mappings and it is always 134 202 * active for kernel addresses in TTBR1. Just set the reserved TTBR0. ··· 141 203 return; 142 204 } 143 205 144 - if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next)) || prev != next) 145 - check_and_switch_context(next, tsk); 206 + check_and_switch_context(next, cpu); 146 207 } 147 208 148 209 #define deactivate_mm(tsk,mm) do { } while (0)
+9 -18
arch/arm64/include/asm/page.h
··· 20 20 #define __ASM_PAGE_H 21 21 22 22 /* PAGE_SHIFT determines the page size */ 23 + /* CONT_SHIFT determines the number of pages which can be tracked together */ 23 24 #ifdef CONFIG_ARM64_64K_PAGES 24 25 #define PAGE_SHIFT 16 26 + #define CONT_SHIFT 5 27 + #elif defined(CONFIG_ARM64_16K_PAGES) 28 + #define PAGE_SHIFT 14 29 + #define CONT_SHIFT 7 25 30 #else 26 31 #define PAGE_SHIFT 12 32 + #define CONT_SHIFT 4 27 33 #endif 28 - #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) 34 + #define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT) 29 35 #define PAGE_MASK (~(PAGE_SIZE-1)) 30 36 31 - /* 32 - * The idmap and swapper page tables need some space reserved in the kernel 33 - * image. Both require pgd, pud (4 levels only) and pmd tables to (section) 34 - * map the kernel. With the 64K page configuration, swapper and idmap need to 35 - * map to pte level. The swapper also maps the FDT (see __create_page_tables 36 - * for more information). Note that the number of ID map translation levels 37 - * could be increased on the fly if system RAM is out of reach for the default 38 - * VA range, so 3 pages are reserved in all cases. 39 - */ 40 - #ifdef CONFIG_ARM64_64K_PAGES 41 - #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS) 42 - #else 43 - #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1) 44 - #endif 45 - 46 - #define SWAPPER_DIR_SIZE (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE) 47 - #define IDMAP_DIR_SIZE (3 * PAGE_SIZE) 37 + #define CONT_SIZE (_AC(1, UL) << (CONT_SHIFT + PAGE_SHIFT)) 38 + #define CONT_MASK (~(CONT_SIZE-1)) 48 39 49 40 #ifndef __ASSEMBLY__ 50 41
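The CONT_SHIFT values above size the contiguous-hint range for each page size: 16 x 4K = 64K, 128 x 16K = 2M, 32 x 64K = 2M, each potentially covered by a single TLB entry. A quick check of CONT_SIZE for the three configurations:

```c
#include <assert.h>

/* CONT_SIZE = 1 << (CONT_SHIFT + PAGE_SHIFT), as defined in the diff. */
static unsigned long cont_size(int page_shift, int cont_shift)
{
	return 1UL << (cont_shift + page_shift);
}
```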
+1
arch/arm64/include/asm/pgalloc.h
··· 27 27 #define check_pgt_cache() do { } while (0) 28 28 29 29 #define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO) 30 + #define PGD_SIZE (PTRS_PER_PGD * sizeof(pgd_t)) 30 31 31 32 #if CONFIG_PGTABLE_LEVELS > 2 32 33
+45 -3
arch/arm64/include/asm/pgtable-hwdef.h
··· 16 16 #ifndef __ASM_PGTABLE_HWDEF_H 17 17 #define __ASM_PGTABLE_HWDEF_H 18 18 19 + /* 20 + * Number of page-table levels required to address 'va_bits' wide 21 + * address, without section mapping. We resolve the top (va_bits - PAGE_SHIFT) 22 + * bits with (PAGE_SHIFT - 3) bits at each page table level. Hence: 23 + * 24 + * levels = DIV_ROUND_UP((va_bits - PAGE_SHIFT), (PAGE_SHIFT - 3)) 25 + * 26 + * where DIV_ROUND_UP(n, d) => (((n) + (d) - 1) / (d)) 27 + * 28 + * We cannot include linux/kernel.h which defines DIV_ROUND_UP here 29 + * due to build issues. So we open code DIV_ROUND_UP here: 30 + * 31 + * ((((va_bits) - PAGE_SHIFT) + (PAGE_SHIFT - 3) - 1) / (PAGE_SHIFT - 3)) 32 + * 33 + * which gets simplified as : 34 + */ 35 + #define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3)) 36 + 37 + /* 38 + * Size mapped by an entry at level n ( 0 <= n <= 3) 39 + * We map (PAGE_SHIFT - 3) at all translation levels and PAGE_SHIFT bits 40 + * in the final page. The maximum number of translation levels supported by 41 + * the architecture is 4. Hence, starting at level n, we have further 42 + * ((4 - n) - 1) levels of translation excluding the offset within the page. 43 + * So, the total number of bits mapped by an entry at level n is : 44 + * 45 + * ((4 - n) - 1) * (PAGE_SHIFT - 3) + PAGE_SHIFT 46 + * 47 + * Rearranging it a bit we get : 48 + * (4 - n) * (PAGE_SHIFT - 3) + 3 49 + */ 50 + #define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3) 51 + 19 52 #define PTRS_PER_PTE (1 << (PAGE_SHIFT - 3)) 20 53 21 54 /* 22 55 * PMD_SHIFT determines the size a level 2 page table entry can map. 
23 56 */ 24 57 #if CONFIG_PGTABLE_LEVELS > 2 25 - #define PMD_SHIFT ((PAGE_SHIFT - 3) * 2 + 3) 58 + #define PMD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(2) 26 59 #define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) 27 60 #define PMD_MASK (~(PMD_SIZE-1)) 28 61 #define PTRS_PER_PMD PTRS_PER_PTE ··· 65 32 * PUD_SHIFT determines the size a level 1 page table entry can map. 66 33 */ 67 34 #if CONFIG_PGTABLE_LEVELS > 3 68 - #define PUD_SHIFT ((PAGE_SHIFT - 3) * 3 + 3) 35 + #define PUD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(1) 69 36 #define PUD_SIZE (_AC(1, UL) << PUD_SHIFT) 70 37 #define PUD_MASK (~(PUD_SIZE-1)) 71 38 #define PTRS_PER_PUD PTRS_PER_PTE ··· 75 42 * PGDIR_SHIFT determines the size a top-level page table entry can map 76 43 * (depending on the configuration, this level can be 0, 1 or 2). 77 44 */ 78 - #define PGDIR_SHIFT ((PAGE_SHIFT - 3) * CONFIG_PGTABLE_LEVELS + 3) 45 + #define PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - CONFIG_PGTABLE_LEVELS) 79 46 #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT) 80 47 #define PGDIR_MASK (~(PGDIR_SIZE-1)) 81 48 #define PTRS_PER_PGD (1 << (VA_BITS - PGDIR_SHIFT)) ··· 86 53 #define SECTION_SHIFT PMD_SHIFT 87 54 #define SECTION_SIZE (_AC(1, UL) << SECTION_SHIFT) 88 55 #define SECTION_MASK (~(SECTION_SIZE-1)) 56 + 57 + /* 58 + * Contiguous page definitions. 59 + */ 60 + #define CONT_PTES (_AC(1, UL) << CONT_SHIFT) 61 + /* the numerical offset of the PTE within a range of CONT_PTES */ 62 + #define CONT_RANGE_OFFSET(addr) (((addr)>>PAGE_SHIFT)&(CONT_PTES-1)) 89 63 /* 90 64 * Hardware page table definitions. 
··· 123 83 #define PMD_SECT_S (_AT(pmdval_t, 3) << 8) 124 84 #define PMD_SECT_AF (_AT(pmdval_t, 1) << 10) 125 85 #define PMD_SECT_NG (_AT(pmdval_t, 1) << 11) 86 + #define PMD_SECT_CONT (_AT(pmdval_t, 1) << 52) 126 87 #define PMD_SECT_PXN (_AT(pmdval_t, 1) << 53) 127 88 #define PMD_SECT_UXN (_AT(pmdval_t, 1) << 54) 128 89 ··· 146 105 #define PTE_AF (_AT(pteval_t, 1) << 10) /* Access Flag */ 147 106 #define PTE_NG (_AT(pteval_t, 1) << 11) /* nG */ 148 107 #define PTE_DBM (_AT(pteval_t, 1) << 51) /* Dirty Bit Management */ 108 + #define PTE_CONT (_AT(pteval_t, 1) << 52) /* Contiguous range */ 149 109 #define PTE_PXN (_AT(pteval_t, 1) << 53) /* Privileged XN */ 150 110 #define PTE_UXN (_AT(pteval_t, 1) << 54) /* User XN */ 151 111
+26 -4
arch/arm64/include/asm/pgtable.h
··· 41 41 * fixed mappings and modules 42 42 */ 43 43 #define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE) 44 - #define VMALLOC_START (UL(0xffffffffffffffff) << VA_BITS) 44 + 45 + #ifndef CONFIG_KASAN 46 + #define VMALLOC_START (VA_START) 47 + #else 48 + #include <asm/kasan.h> 49 + #define VMALLOC_START (KASAN_SHADOW_END + SZ_64K) 50 + #endif 51 + 45 52 #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K) 46 53 47 54 #define vmemmap ((struct page *)(VMALLOC_END + SZ_64K)) ··· 81 74 82 75 #define PAGE_KERNEL __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE) 83 76 #define PAGE_KERNEL_EXEC __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE) 77 + #define PAGE_KERNEL_EXEC_CONT __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_CONT) 84 78 85 79 #define PAGE_HYP __pgprot(_PAGE_DEFAULT | PTE_HYP) 86 80 #define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP) ··· 150 142 #define pte_special(pte) (!!(pte_val(pte) & PTE_SPECIAL)) 151 143 #define pte_write(pte) (!!(pte_val(pte) & PTE_WRITE)) 152 144 #define pte_exec(pte) (!(pte_val(pte) & PTE_UXN)) 145 + #define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT)) 153 146 154 147 #ifdef CONFIG_ARM64_HW_AFDBM 155 148 #define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY)) ··· 211 202 static inline pte_t pte_mkspecial(pte_t pte) 212 203 { 213 204 return set_pte_bit(pte, __pgprot(PTE_SPECIAL)); 205 + } 206 + 207 + static inline pte_t pte_mkcont(pte_t pte) 208 + { 209 + return set_pte_bit(pte, __pgprot(PTE_CONT)); 210 + } 211 + 212 + static inline pte_t pte_mknoncont(pte_t pte) 213 + { 214 + return clear_pte_bit(pte, __pgprot(PTE_CONT)); 214 215 } 215 216 216 217 static inline void set_pte(pte_t *ptep, pte_t pte) ··· 667 648 unsigned long addr, pte_t *ptep) 668 649 { 669 650 /* 670 - * set_pte() does not have a DSB for user mappings, so make sure that 671 - * the page table write is visible. 
651 + * We don't do anything here, so there's a very small chance of 652 + * us retaking a user fault which we just fixed up. The alternative 653 + * is doing a dsb(ishst), but that penalises the fastpath. 672 654 */ 673 - dsb(ishst); 674 655 } 675 656 676 657 #define update_mmu_cache_pmd(vma, address, pmd) do { } while (0) 658 + 659 + #define kc_vaddr_to_offset(v) ((v) & ~VA_START) 660 + #define kc_offset_to_vaddr(o) ((o) | VA_START) 677 661 678 662 #endif /* !__ASSEMBLY__ */ 679 663
-83
arch/arm64/include/asm/pmu.h
··· 1 - /* 2 - * Based on arch/arm/include/asm/pmu.h 3 - * 4 - * Copyright (C) 2009 picoChip Designs Ltd, Jamie Iles 5 - * Copyright (C) 2012 ARM Ltd. 6 - * 7 - * This program is free software; you can redistribute it and/or modify 8 - * it under the terms of the GNU General Public License version 2 as 9 - * published by the Free Software Foundation. 10 - * 11 - * This program is distributed in the hope that it will be useful, 12 - * but WITHOUT ANY WARRANTY; without even the implied warranty of 13 - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 - * GNU General Public License for more details. 15 - * 16 - * You should have received a copy of the GNU General Public License 17 - * along with this program. If not, see <http://www.gnu.org/licenses/>. 18 - */ 19 - #ifndef __ASM_PMU_H 20 - #define __ASM_PMU_H 21 - 22 - #ifdef CONFIG_HW_PERF_EVENTS 23 - 24 - /* The events for a given PMU register set. */ 25 - struct pmu_hw_events { 26 - /* 27 - * The events that are active on the PMU for the given index. 28 - */ 29 - struct perf_event **events; 30 - 31 - /* 32 - * A 1 bit for an index indicates that the counter is being used for 33 - * an event. A 0 means that the counter can be used. 34 - */ 35 - unsigned long *used_mask; 36 - 37 - /* 38 - * Hardware lock to serialize accesses to PMU registers. Needed for the 39 - * read/modify/write sequences. 
40 - */ 41 - raw_spinlock_t pmu_lock; 42 - }; 43 - 44 - struct arm_pmu { 45 - struct pmu pmu; 46 - cpumask_t active_irqs; 47 - int *irq_affinity; 48 - const char *name; 49 - irqreturn_t (*handle_irq)(int irq_num, void *dev); 50 - void (*enable)(struct hw_perf_event *evt, int idx); 51 - void (*disable)(struct hw_perf_event *evt, int idx); 52 - int (*get_event_idx)(struct pmu_hw_events *hw_events, 53 - struct hw_perf_event *hwc); 54 - int (*set_event_filter)(struct hw_perf_event *evt, 55 - struct perf_event_attr *attr); 56 - u32 (*read_counter)(int idx); 57 - void (*write_counter)(int idx, u32 val); 58 - void (*start)(void); 59 - void (*stop)(void); 60 - void (*reset)(void *); 61 - int (*map_event)(struct perf_event *event); 62 - int num_events; 63 - atomic_t active_events; 64 - struct mutex reserve_mutex; 65 - u64 max_period; 66 - struct platform_device *plat_device; 67 - struct pmu_hw_events *(*get_hw_events)(void); 68 - }; 69 - 70 - #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu)) 71 - 72 - int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type); 73 - 74 - u64 armpmu_event_update(struct perf_event *event, 75 - struct hw_perf_event *hwc, 76 - int idx); 77 - 78 - int armpmu_event_set_period(struct perf_event *event, 79 - struct hw_perf_event *hwc, 80 - int idx); 81 - 82 - #endif /* CONFIG_HW_PERF_EVENTS */ 83 - #endif /* __ASM_PMU_H */
+1 -1
arch/arm64/include/asm/processor.h
··· 186 186 187 187 #endif 188 188 189 - void cpu_enable_pan(void); 189 + void cpu_enable_pan(void *__unused); 190 190 191 191 #endif /* __ASM_PROCESSOR_H */
+8 -8
arch/arm64/include/asm/ptrace.h
··· 83 83 #define compat_sp regs[13] 84 84 #define compat_lr regs[14] 85 85 #define compat_sp_hyp regs[15] 86 - #define compat_sp_irq regs[16] 87 - #define compat_lr_irq regs[17] 88 - #define compat_sp_svc regs[18] 89 - #define compat_lr_svc regs[19] 90 - #define compat_sp_abt regs[20] 91 - #define compat_lr_abt regs[21] 92 - #define compat_sp_und regs[22] 93 - #define compat_lr_und regs[23] 86 + #define compat_lr_irq regs[16] 87 + #define compat_sp_irq regs[17] 88 + #define compat_lr_svc regs[18] 89 + #define compat_sp_svc regs[19] 90 + #define compat_lr_abt regs[20] 91 + #define compat_sp_abt regs[21] 92 + #define compat_lr_und regs[22] 93 + #define compat_sp_und regs[23] 94 94 #define compat_r8_fiq regs[24] 95 95 #define compat_r9_fiq regs[25] 96 96 #define compat_r10_fiq regs[26]
+16
arch/arm64/include/asm/string.h
··· 36 36 37 37 #define __HAVE_ARCH_MEMCPY 38 38 extern void *memcpy(void *, const void *, __kernel_size_t); 39 + extern void *__memcpy(void *, const void *, __kernel_size_t); 39 40 40 41 #define __HAVE_ARCH_MEMMOVE 41 42 extern void *memmove(void *, const void *, __kernel_size_t); 43 + extern void *__memmove(void *, const void *, __kernel_size_t); 42 44 43 45 #define __HAVE_ARCH_MEMCHR 44 46 extern void *memchr(const void *, int, __kernel_size_t); 45 47 46 48 #define __HAVE_ARCH_MEMSET 47 49 extern void *memset(void *, int, __kernel_size_t); 50 + extern void *__memset(void *, int, __kernel_size_t); 48 51 49 52 #define __HAVE_ARCH_MEMCMP 50 53 extern int memcmp(const void *, const void *, size_t); 54 + 55 + 56 + #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__) 57 + 58 + /* 59 + * For files that are not instrumented (e.g. mm/slub.c) we 60 + * should use not instrumented version of mem* functions. 61 + */ 62 + 63 + #define memcpy(dst, src, len) __memcpy(dst, src, len) 64 + #define memmove(dst, src, len) __memmove(dst, src, len) 65 + #define memset(s, c, n) __memset(s, c, n) 66 + #endif 51 67 52 68 #endif
+152 -5
arch/arm64/include/asm/sysreg.h
··· 22 22 23 23 #include <asm/opcodes.h> 24 24 25 - #define SCTLR_EL1_CP15BEN (0x1 << 5) 26 - #define SCTLR_EL1_SED (0x1 << 8) 27 - 28 25 /* 29 26 * ARMv8 ARM reserves the following encoding for system registers: 30 27 * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview", ··· 35 38 #define sys_reg(op0, op1, crn, crm, op2) \ 36 39 ((((op0)&3)<<19)|((op1)<<16)|((crn)<<12)|((crm)<<8)|((op2)<<5)) 37 40 38 - #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4) 39 - #define SCTLR_EL1_SPAN (1 << 23) 41 + #define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0) 42 + #define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5) 43 + #define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6) 44 + 45 + #define SYS_ID_PFR0_EL1 sys_reg(3, 0, 0, 1, 0) 46 + #define SYS_ID_PFR1_EL1 sys_reg(3, 0, 0, 1, 1) 47 + #define SYS_ID_DFR0_EL1 sys_reg(3, 0, 0, 1, 2) 48 + #define SYS_ID_MMFR0_EL1 sys_reg(3, 0, 0, 1, 4) 49 + #define SYS_ID_MMFR1_EL1 sys_reg(3, 0, 0, 1, 5) 50 + #define SYS_ID_MMFR2_EL1 sys_reg(3, 0, 0, 1, 6) 51 + #define SYS_ID_MMFR3_EL1 sys_reg(3, 0, 0, 1, 7) 52 + 53 + #define SYS_ID_ISAR0_EL1 sys_reg(3, 0, 0, 2, 0) 54 + #define SYS_ID_ISAR1_EL1 sys_reg(3, 0, 0, 2, 1) 55 + #define SYS_ID_ISAR2_EL1 sys_reg(3, 0, 0, 2, 2) 56 + #define SYS_ID_ISAR3_EL1 sys_reg(3, 0, 0, 2, 3) 57 + #define SYS_ID_ISAR4_EL1 sys_reg(3, 0, 0, 2, 4) 58 + #define SYS_ID_ISAR5_EL1 sys_reg(3, 0, 0, 2, 5) 59 + #define SYS_ID_MMFR4_EL1 sys_reg(3, 0, 0, 2, 6) 60 + 61 + #define SYS_MVFR0_EL1 sys_reg(3, 0, 0, 3, 0) 62 + #define SYS_MVFR1_EL1 sys_reg(3, 0, 0, 3, 1) 63 + #define SYS_MVFR2_EL1 sys_reg(3, 0, 0, 3, 2) 64 + 65 + #define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0) 66 + #define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1) 67 + 68 + #define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0) 69 + #define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1) 70 + 71 + #define SYS_ID_AA64ISAR0_EL1 sys_reg(3, 0, 0, 6, 0) 72 + #define SYS_ID_AA64ISAR1_EL1 sys_reg(3, 0, 0, 6, 1) 73 + 74 + #define SYS_ID_AA64MMFR0_EL1 sys_reg(3, 0, 0, 7, 0) 75 + #define 
SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) 76 + 77 + #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) 78 + #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) 79 + #define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7) 80 + 81 + #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4) 40 82 41 83 #define SET_PSTATE_PAN(x) __inst_arm(0xd5000000 | REG_PSTATE_PAN_IMM |\ 42 84 (!!x)<<8 | 0x1f) 85 + 86 + /* SCTLR_EL1 */ 87 + #define SCTLR_EL1_CP15BEN (0x1 << 5) 88 + #define SCTLR_EL1_SED (0x1 << 8) 89 + #define SCTLR_EL1_SPAN (0x1 << 23) 90 + 91 + 92 + /* id_aa64isar0 */ 93 + #define ID_AA64ISAR0_RDM_SHIFT 28 94 + #define ID_AA64ISAR0_ATOMICS_SHIFT 20 95 + #define ID_AA64ISAR0_CRC32_SHIFT 16 96 + #define ID_AA64ISAR0_SHA2_SHIFT 12 97 + #define ID_AA64ISAR0_SHA1_SHIFT 8 98 + #define ID_AA64ISAR0_AES_SHIFT 4 99 + 100 + /* id_aa64pfr0 */ 101 + #define ID_AA64PFR0_GIC_SHIFT 24 102 + #define ID_AA64PFR0_ASIMD_SHIFT 20 103 + #define ID_AA64PFR0_FP_SHIFT 16 104 + #define ID_AA64PFR0_EL3_SHIFT 12 105 + #define ID_AA64PFR0_EL2_SHIFT 8 106 + #define ID_AA64PFR0_EL1_SHIFT 4 107 + #define ID_AA64PFR0_EL0_SHIFT 0 108 + 109 + #define ID_AA64PFR0_FP_NI 0xf 110 + #define ID_AA64PFR0_FP_SUPPORTED 0x0 111 + #define ID_AA64PFR0_ASIMD_NI 0xf 112 + #define ID_AA64PFR0_ASIMD_SUPPORTED 0x0 113 + #define ID_AA64PFR0_EL1_64BIT_ONLY 0x1 114 + #define ID_AA64PFR0_EL0_64BIT_ONLY 0x1 115 + 116 + /* id_aa64mmfr0 */ 117 + #define ID_AA64MMFR0_TGRAN4_SHIFT 28 118 + #define ID_AA64MMFR0_TGRAN64_SHIFT 24 119 + #define ID_AA64MMFR0_TGRAN16_SHIFT 20 120 + #define ID_AA64MMFR0_BIGENDEL0_SHIFT 16 121 + #define ID_AA64MMFR0_SNSMEM_SHIFT 12 122 + #define ID_AA64MMFR0_BIGENDEL_SHIFT 8 123 + #define ID_AA64MMFR0_ASID_SHIFT 4 124 + #define ID_AA64MMFR0_PARANGE_SHIFT 0 125 + 126 + #define ID_AA64MMFR0_TGRAN4_NI 0xf 127 + #define ID_AA64MMFR0_TGRAN4_SUPPORTED 0x0 128 + #define ID_AA64MMFR0_TGRAN64_NI 0xf 129 + #define ID_AA64MMFR0_TGRAN64_SUPPORTED 0x0 130 + #define ID_AA64MMFR0_TGRAN16_NI 0x0 131 + #define ID_AA64MMFR0_TGRAN16_SUPPORTED 
0x1 132 + 133 + /* id_aa64mmfr1 */ 134 + #define ID_AA64MMFR1_PAN_SHIFT 20 135 + #define ID_AA64MMFR1_LOR_SHIFT 16 136 + #define ID_AA64MMFR1_HPD_SHIFT 12 137 + #define ID_AA64MMFR1_VHE_SHIFT 8 138 + #define ID_AA64MMFR1_VMIDBITS_SHIFT 4 139 + #define ID_AA64MMFR1_HADBS_SHIFT 0 140 + 141 + /* id_aa64dfr0 */ 142 + #define ID_AA64DFR0_CTX_CMPS_SHIFT 28 143 + #define ID_AA64DFR0_WRPS_SHIFT 20 144 + #define ID_AA64DFR0_BRPS_SHIFT 12 145 + #define ID_AA64DFR0_PMUVER_SHIFT 8 146 + #define ID_AA64DFR0_TRACEVER_SHIFT 4 147 + #define ID_AA64DFR0_DEBUGVER_SHIFT 0 148 + 149 + #define ID_ISAR5_RDM_SHIFT 24 150 + #define ID_ISAR5_CRC32_SHIFT 16 151 + #define ID_ISAR5_SHA2_SHIFT 12 152 + #define ID_ISAR5_SHA1_SHIFT 8 153 + #define ID_ISAR5_AES_SHIFT 4 154 + #define ID_ISAR5_SEVL_SHIFT 0 155 + 156 + #define MVFR0_FPROUND_SHIFT 28 157 + #define MVFR0_FPSHVEC_SHIFT 24 158 + #define MVFR0_FPSQRT_SHIFT 20 159 + #define MVFR0_FPDIVIDE_SHIFT 16 160 + #define MVFR0_FPTRAP_SHIFT 12 161 + #define MVFR0_FPDP_SHIFT 8 162 + #define MVFR0_FPSP_SHIFT 4 163 + #define MVFR0_SIMD_SHIFT 0 164 + 165 + #define MVFR1_SIMDFMAC_SHIFT 28 166 + #define MVFR1_FPHP_SHIFT 24 167 + #define MVFR1_SIMDHP_SHIFT 20 168 + #define MVFR1_SIMDSP_SHIFT 16 169 + #define MVFR1_SIMDINT_SHIFT 12 170 + #define MVFR1_SIMDLS_SHIFT 8 171 + #define MVFR1_FPDNAN_SHIFT 4 172 + #define MVFR1_FPFTZ_SHIFT 0 173 + 174 + 175 + #define ID_AA64MMFR0_TGRAN4_SHIFT 28 176 + #define ID_AA64MMFR0_TGRAN64_SHIFT 24 177 + #define ID_AA64MMFR0_TGRAN16_SHIFT 20 178 + 179 + #define ID_AA64MMFR0_TGRAN4_NI 0xf 180 + #define ID_AA64MMFR0_TGRAN4_SUPPORTED 0x0 181 + #define ID_AA64MMFR0_TGRAN64_NI 0xf 182 + #define ID_AA64MMFR0_TGRAN64_SUPPORTED 0x0 183 + #define ID_AA64MMFR0_TGRAN16_NI 0x0 184 + #define ID_AA64MMFR0_TGRAN16_SUPPORTED 0x1 185 + 186 + #if defined(CONFIG_ARM64_4K_PAGES) 187 + #define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN4_SHIFT 188 + #define ID_AA64MMFR0_TGRAN_SUPPORTED ID_AA64MMFR0_TGRAN4_SUPPORTED 189 + #elif 
defined(CONFIG_ARM64_16K_PAGES) 190 + #define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN16_SHIFT 191 + #define ID_AA64MMFR0_TGRAN_SUPPORTED ID_AA64MMFR0_TGRAN16_SUPPORTED 192 + #elif defined(CONFIG_ARM64_64K_PAGES) 193 + #define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN64_SHIFT 194 + #define ID_AA64MMFR0_TGRAN_SUPPORTED ID_AA64MMFR0_TGRAN64_SUPPORTED 195 + #endif 43 196 44 197 #ifdef __ASSEMBLY__ 45 198
+3 -2
arch/arm64/include/asm/thread_info.h
··· 23 23 24 24 #include <linux/compiler.h> 25 25 26 - #ifndef CONFIG_ARM64_64K_PAGES 26 + #ifdef CONFIG_ARM64_4K_PAGES 27 27 #define THREAD_SIZE_ORDER 2 28 + #elif defined(CONFIG_ARM64_16K_PAGES) 29 + #define THREAD_SIZE_ORDER 0 28 30 #endif 29 31 30 32 #define THREAD_SIZE 16384 ··· 113 111 #define TIF_RESTORE_SIGMASK 20 114 112 #define TIF_SINGLESTEP 21 115 113 #define TIF_32BIT 22 /* 32bit process */ 116 - #define TIF_SWITCH_MM 23 /* deferred switch_mm */ 117 114 118 115 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) 119 116 #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
+15 -11
arch/arm64/include/asm/tlb.h
··· 37 37 38 38 static inline void tlb_flush(struct mmu_gather *tlb) 39 39 { 40 - if (tlb->fullmm) { 41 - flush_tlb_mm(tlb->mm); 42 - } else { 43 - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; 44 - /* 45 - * The intermediate page table levels are already handled by 46 - * the __(pte|pmd|pud)_free_tlb() functions, so last level 47 - * TLBI is sufficient here. 48 - */ 49 - __flush_tlb_range(&vma, tlb->start, tlb->end, true); 50 - } 40 + struct vm_area_struct vma = { .vm_mm = tlb->mm, }; 41 + 42 + /* 43 + * The ASID allocator will either invalidate the ASID or mark 44 + * it as used. 45 + */ 46 + if (tlb->fullmm) 47 + return; 48 + 49 + /* 50 + * The intermediate page table levels are already handled by 51 + * the __(pte|pmd|pud)_free_tlb() functions, so last level 52 + * TLBI is sufficient here. 53 + */ 54 + __flush_tlb_range(&vma, tlb->start, tlb->end, true); 51 55 } 52 56 53 57 static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
+12 -6
arch/arm64/include/asm/tlbflush.h
··· 63 63 * only require the D-TLB to be invalidated. 64 64 * - kaddr - Kernel virtual memory address 65 65 */ 66 + static inline void local_flush_tlb_all(void) 67 + { 68 + dsb(nshst); 69 + asm("tlbi vmalle1"); 70 + dsb(nsh); 71 + isb(); 72 + } 73 + 66 74 static inline void flush_tlb_all(void) 67 75 { 68 76 dsb(ishst); ··· 81 73 82 74 static inline void flush_tlb_mm(struct mm_struct *mm) 83 75 { 84 - unsigned long asid = (unsigned long)ASID(mm) << 48; 76 + unsigned long asid = ASID(mm) << 48; 85 77 86 78 dsb(ishst); 87 79 asm("tlbi aside1is, %0" : : "r" (asid)); ··· 91 83 static inline void flush_tlb_page(struct vm_area_struct *vma, 92 84 unsigned long uaddr) 93 85 { 94 - unsigned long addr = uaddr >> 12 | 95 - ((unsigned long)ASID(vma->vm_mm) << 48); 86 + unsigned long addr = uaddr >> 12 | (ASID(vma->vm_mm) << 48); 96 87 97 88 dsb(ishst); 98 89 asm("tlbi vale1is, %0" : : "r" (addr)); ··· 108 101 unsigned long start, unsigned long end, 109 102 bool last_level) 110 103 { 111 - unsigned long asid = (unsigned long)ASID(vma->vm_mm) << 48; 104 + unsigned long asid = ASID(vma->vm_mm) << 48; 112 105 unsigned long addr; 113 106 114 107 if ((end - start) > MAX_TLB_RANGE) { ··· 161 154 static inline void __flush_tlb_pgtable(struct mm_struct *mm, 162 155 unsigned long uaddr) 163 156 { 164 - unsigned long addr = uaddr >> 12 | ((unsigned long)ASID(mm) << 48); 157 + unsigned long addr = uaddr >> 12 | (ASID(mm) << 48); 165 158 166 - dsb(ishst); 167 159 asm("tlbi vae1is, %0" : : "r" (addr)); 168 160 dsb(ish); 169 161 }
+8 -3
arch/arm64/kernel/Makefile
··· 4 4 5 5 CPPFLAGS_vmlinux.lds := -DTEXT_OFFSET=$(TEXT_OFFSET) 6 6 AFLAGS_head.o := -DTEXT_OFFSET=$(TEXT_OFFSET) 7 - CFLAGS_efi-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET) 8 7 CFLAGS_armv8_deprecated.o := -I$(src) 9 8 10 9 CFLAGS_REMOVE_ftrace.o = -pg ··· 19 20 cpufeature.o alternative.o cacheinfo.o \ 20 21 smp.o smp_spin_table.o topology.o 21 22 23 + extra-$(CONFIG_EFI) := efi-entry.o 24 + 25 + OBJCOPYFLAGS := --prefix-symbols=__efistub_ 26 + $(obj)/%.stub.o: $(obj)/%.o FORCE 27 + $(call if_changed,objcopy) 28 + 22 29 arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \ 23 30 sys_compat.o entry32.o \ 24 31 ../../arm/kernel/opcodes.o ··· 37 32 arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o 38 33 arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o 39 34 arm64-obj-$(CONFIG_KGDB) += kgdb.o 40 - arm64-obj-$(CONFIG_EFI) += efi.o efi-stub.o efi-entry.o 35 + arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o 41 36 arm64-obj-$(CONFIG_PCI) += pci.o 42 37 arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o 43 38 arm64-obj-$(CONFIG_ACPI) += acpi.o ··· 45 40 obj-y += $(arm64-obj-y) vdso/ 46 41 obj-m += $(arm64-obj-m) 47 42 head-y := head.o 48 - extra-y := $(head-y) vmlinux.lds 43 + extra-y += $(head-y) vmlinux.lds 49 44 50 45 # vDSO - this must be built first to generate the symbol offsets 51 46 $(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h
+3
arch/arm64/kernel/arm64ksyms.c
··· 51 51 EXPORT_SYMBOL(memset); 52 52 EXPORT_SYMBOL(memcpy); 53 53 EXPORT_SYMBOL(memmove); 54 + EXPORT_SYMBOL(__memset); 55 + EXPORT_SYMBOL(__memcpy); 56 + EXPORT_SYMBOL(__memmove); 54 57 EXPORT_SYMBOL(memchr); 55 58 EXPORT_SYMBOL(memcmp); 56 59
+1 -1
arch/arm64/kernel/asm-offsets.c
··· 60 60 DEFINE(S_SYSCALLNO, offsetof(struct pt_regs, syscallno)); 61 61 DEFINE(S_FRAME_SIZE, sizeof(struct pt_regs)); 62 62 BLANK(); 63 - DEFINE(MM_CONTEXT_ID, offsetof(struct mm_struct, context.id)); 63 + DEFINE(MM_CONTEXT_ID, offsetof(struct mm_struct, context.id.counter)); 64 64 BLANK(); 65 65 DEFINE(VMA_VM_MM, offsetof(struct vm_area_struct, vm_mm)); 66 66 DEFINE(VMA_VM_FLAGS, offsetof(struct vm_area_struct, vm_flags));
+1 -1
arch/arm64/kernel/cpu_errata.c
··· 97 97 98 98 void check_local_cpu_errata(void) 99 99 { 100 - check_cpu_capabilities(arm64_errata, "enabling workaround for"); 100 + update_cpu_capabilities(arm64_errata, "enabling workaround for"); 101 101 }
+827 -26
arch/arm64/kernel/cpufeature.c
··· 16 16 * along with this program. If not, see <http://www.gnu.org/licenses/>. 17 17 */ 18 18 19 - #define pr_fmt(fmt) "alternatives: " fmt 19 + #define pr_fmt(fmt) "CPU features: " fmt 20 20 21 + #include <linux/bsearch.h> 22 + #include <linux/sort.h> 21 23 #include <linux/types.h> 22 24 #include <asm/cpu.h> 23 25 #include <asm/cpufeature.h> 26 + #include <asm/cpu_ops.h> 24 27 #include <asm/processor.h> 28 + #include <asm/sysreg.h> 29 + 30 + unsigned long elf_hwcap __read_mostly; 31 + EXPORT_SYMBOL_GPL(elf_hwcap); 32 + 33 + #ifdef CONFIG_COMPAT 34 + #define COMPAT_ELF_HWCAP_DEFAULT \ 35 + (COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\ 36 + COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\ 37 + COMPAT_HWCAP_TLS|COMPAT_HWCAP_VFP|\ 38 + COMPAT_HWCAP_VFPv3|COMPAT_HWCAP_VFPv4|\ 39 + COMPAT_HWCAP_NEON|COMPAT_HWCAP_IDIV|\ 40 + COMPAT_HWCAP_LPAE) 41 + unsigned int compat_elf_hwcap __read_mostly = COMPAT_ELF_HWCAP_DEFAULT; 42 + unsigned int compat_elf_hwcap2 __read_mostly; 43 + #endif 44 + 45 + DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS); 46 + 47 + #define ARM64_FTR_BITS(STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \ 48 + { \ 49 + .strict = STRICT, \ 50 + .type = TYPE, \ 51 + .shift = SHIFT, \ 52 + .width = WIDTH, \ 53 + .safe_val = SAFE_VAL, \ 54 + } 55 + 56 + #define ARM64_FTR_END \ 57 + { \ 58 + .width = 0, \ 59 + } 60 + 61 + static struct arm64_ftr_bits ftr_id_aa64isar0[] = { 62 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0), 63 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64ISAR0_RDM_SHIFT, 4, 0), 64 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 24, 4, 0), 65 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_ATOMICS_SHIFT, 4, 0), 66 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_CRC32_SHIFT, 4, 0), 67 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SHA2_SHIFT, 4, 0), 68 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SHA1_SHIFT, 4, 0), 69 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_AES_SHIFT, 4, 0), 70 + ARM64_FTR_BITS(FTR_STRICT, 
FTR_EXACT, 0, 4, 0), /* RAZ */ 71 + ARM64_FTR_END, 72 + }; 73 + 74 + static struct arm64_ftr_bits ftr_id_aa64pfr0[] = { 75 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0), 76 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 28, 4, 0), 77 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0), 78 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI), 79 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI), 80 + /* Linux doesn't care about the EL3 */ 81 + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, ID_AA64PFR0_EL3_SHIFT, 4, 0), 82 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL2_SHIFT, 4, 0), 83 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL1_SHIFT, 4, ID_AA64PFR0_EL1_64BIT_ONLY), 84 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL0_SHIFT, 4, ID_AA64PFR0_EL0_64BIT_ONLY), 85 + ARM64_FTR_END, 86 + }; 87 + 88 + static struct arm64_ftr_bits ftr_id_aa64mmfr0[] = { 89 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0), 90 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN4_SHIFT, 4, ID_AA64MMFR0_TGRAN4_NI), 91 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN64_SHIFT, 4, ID_AA64MMFR0_TGRAN64_NI), 92 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_TGRAN16_SHIFT, 4, ID_AA64MMFR0_TGRAN16_NI), 93 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_BIGENDEL0_SHIFT, 4, 0), 94 + /* Linux shouldn't care about secure memory */ 95 + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, ID_AA64MMFR0_SNSMEM_SHIFT, 4, 0), 96 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_BIGENDEL_SHIFT, 4, 0), 97 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR0_ASID_SHIFT, 4, 0), 98 + /* 99 + * Differing PARange is fine as long as all peripherals and memory are mapped 100 + * within the minimum PARange of all CPUs 101 + */ 102 + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_PARANGE_SHIFT, 4, 0), 103 + ARM64_FTR_END, 104 + }; 105 + 106 + static struct arm64_ftr_bits 
ftr_id_aa64mmfr1[] = { 107 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0), 108 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_PAN_SHIFT, 4, 0), 109 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_LOR_SHIFT, 4, 0), 110 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_HPD_SHIFT, 4, 0), 111 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_VHE_SHIFT, 4, 0), 112 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_VMIDBITS_SHIFT, 4, 0), 113 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR1_HADBS_SHIFT, 4, 0), 114 + ARM64_FTR_END, 115 + }; 116 + 117 + static struct arm64_ftr_bits ftr_ctr[] = { 118 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RAO */ 119 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 28, 3, 0), 120 + ARM64_FTR_BITS(FTR_STRICT, FTR_HIGHER_SAFE, 24, 4, 0), /* CWG */ 121 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 20, 4, 0), /* ERG */ 122 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 1), /* DminLine */ 123 + /* 124 + * Linux can handle differing I-cache policies. 
Userspace JITs will 125 + * make use of *minLine 126 + */ 127 + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, 14, 2, 0), /* L1Ip */ 128 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 10, 0), /* RAZ */ 129 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* IminLine */ 130 + ARM64_FTR_END, 131 + }; 132 + 133 + static struct arm64_ftr_bits ftr_id_mmfr0[] = { 134 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 28, 4, 0), /* InnerShr */ 135 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 24, 4, 0), /* FCSE */ 136 + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, 20, 4, 0), /* AuxReg */ 137 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 16, 4, 0), /* TCM */ 138 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 12, 4, 0), /* ShareLvl */ 139 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 8, 4, 0), /* OuterShr */ 140 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 4, 0), /* PMSA */ 141 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 4, 0), /* VMSA */ 142 + ARM64_FTR_END, 143 + }; 144 + 145 + static struct arm64_ftr_bits ftr_id_aa64dfr0[] = { 146 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0), 147 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_CTX_CMPS_SHIFT, 4, 0), 148 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_WRPS_SHIFT, 4, 0), 149 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64DFR0_BRPS_SHIFT, 4, 0), 150 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64DFR0_PMUVER_SHIFT, 4, 0), 151 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64DFR0_TRACEVER_SHIFT, 4, 0), 152 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64DFR0_DEBUGVER_SHIFT, 4, 0x6), 153 + ARM64_FTR_END, 154 + }; 155 + 156 + static struct arm64_ftr_bits ftr_mvfr2[] = { 157 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 8, 24, 0), /* RAZ */ 158 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 4, 0), /* FPMisc */ 159 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 4, 0), /* SIMDMisc */ 160 + ARM64_FTR_END, 161 + }; 162 + 163 + static struct arm64_ftr_bits ftr_dczid[] = { 164 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 5, 27, 0), /* RAZ */ 165 + 
ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 1, 1), /* DZP */ 166 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* BS */ 167 + ARM64_FTR_END, 168 + }; 169 + 170 + 171 + static struct arm64_ftr_bits ftr_id_isar5[] = { 172 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_RDM_SHIFT, 4, 0), 173 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 20, 4, 0), /* RAZ */ 174 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_CRC32_SHIFT, 4, 0), 175 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_SHA2_SHIFT, 4, 0), 176 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_SHA1_SHIFT, 4, 0), 177 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_AES_SHIFT, 4, 0), 178 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_ISAR5_SEVL_SHIFT, 4, 0), 179 + ARM64_FTR_END, 180 + }; 181 + 182 + static struct arm64_ftr_bits ftr_id_mmfr4[] = { 183 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 8, 24, 0), /* RAZ */ 184 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 4, 0), /* ac2 */ 185 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 4, 0), /* RAZ */ 186 + ARM64_FTR_END, 187 + }; 188 + 189 + static struct arm64_ftr_bits ftr_id_pfr0[] = { 190 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 16, 16, 0), /* RAZ */ 191 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 12, 4, 0), /* State3 */ 192 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 8, 4, 0), /* State2 */ 193 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 4, 0), /* State1 */ 194 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 4, 0), /* State0 */ 195 + ARM64_FTR_END, 196 + }; 197 + 198 + /* 199 + * Common ftr bits for a 32bit register with all hidden, strict 200 + * attributes, with 4bit feature fields and a default safe value of 201 + * 0. 
Covers the following 32bit registers: 202 + * id_isar[0-4], id_mmfr[1-3], id_pfr1, mvfr[0-1] 203 + */ 204 + static struct arm64_ftr_bits ftr_generic_32bits[] = { 205 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 28, 4, 0), 206 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 24, 4, 0), 207 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 20, 4, 0), 208 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 0), 209 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 12, 4, 0), 210 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 8, 4, 0), 211 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 4, 4, 0), 212 + ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), 213 + ARM64_FTR_END, 214 + }; 215 + 216 + static struct arm64_ftr_bits ftr_generic[] = { 217 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 64, 0), 218 + ARM64_FTR_END, 219 + }; 220 + 221 + static struct arm64_ftr_bits ftr_generic32[] = { 222 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 32, 0), 223 + ARM64_FTR_END, 224 + }; 225 + 226 + static struct arm64_ftr_bits ftr_aa64raz[] = { 227 + ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 0, 64, 0), 228 + ARM64_FTR_END, 229 + }; 230 + 231 + #define ARM64_FTR_REG(id, table) \ 232 + { \ 233 + .sys_id = id, \ 234 + .name = #id, \ 235 + .ftr_bits = &((table)[0]), \ 236 + } 237 + 238 + static struct arm64_ftr_reg arm64_ftr_regs[] = { 239 + 240 + /* Op1 = 0, CRn = 0, CRm = 1 */ 241 + ARM64_FTR_REG(SYS_ID_PFR0_EL1, ftr_id_pfr0), 242 + ARM64_FTR_REG(SYS_ID_PFR1_EL1, ftr_generic_32bits), 243 + ARM64_FTR_REG(SYS_ID_DFR0_EL1, ftr_generic_32bits), 244 + ARM64_FTR_REG(SYS_ID_MMFR0_EL1, ftr_id_mmfr0), 245 + ARM64_FTR_REG(SYS_ID_MMFR1_EL1, ftr_generic_32bits), 246 + ARM64_FTR_REG(SYS_ID_MMFR2_EL1, ftr_generic_32bits), 247 + ARM64_FTR_REG(SYS_ID_MMFR3_EL1, ftr_generic_32bits), 248 + 249 + /* Op1 = 0, CRn = 0, CRm = 2 */ 250 + ARM64_FTR_REG(SYS_ID_ISAR0_EL1, ftr_generic_32bits), 251 + ARM64_FTR_REG(SYS_ID_ISAR1_EL1, ftr_generic_32bits), 252 + ARM64_FTR_REG(SYS_ID_ISAR2_EL1, ftr_generic_32bits), 253 + 
ARM64_FTR_REG(SYS_ID_ISAR3_EL1, ftr_generic_32bits), 254 + ARM64_FTR_REG(SYS_ID_ISAR4_EL1, ftr_generic_32bits), 255 + ARM64_FTR_REG(SYS_ID_ISAR5_EL1, ftr_id_isar5), 256 + ARM64_FTR_REG(SYS_ID_MMFR4_EL1, ftr_id_mmfr4), 257 + 258 + /* Op1 = 0, CRn = 0, CRm = 3 */ 259 + ARM64_FTR_REG(SYS_MVFR0_EL1, ftr_generic_32bits), 260 + ARM64_FTR_REG(SYS_MVFR1_EL1, ftr_generic_32bits), 261 + ARM64_FTR_REG(SYS_MVFR2_EL1, ftr_mvfr2), 262 + 263 + /* Op1 = 0, CRn = 0, CRm = 4 */ 264 + ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0), 265 + ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_aa64raz), 266 + 267 + /* Op1 = 0, CRn = 0, CRm = 5 */ 268 + ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0), 269 + ARM64_FTR_REG(SYS_ID_AA64DFR1_EL1, ftr_generic), 270 + 271 + /* Op1 = 0, CRn = 0, CRm = 6 */ 272 + ARM64_FTR_REG(SYS_ID_AA64ISAR0_EL1, ftr_id_aa64isar0), 273 + ARM64_FTR_REG(SYS_ID_AA64ISAR1_EL1, ftr_aa64raz), 274 + 275 + /* Op1 = 0, CRn = 0, CRm = 7 */ 276 + ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0), 277 + ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1), 278 + 279 + /* Op1 = 3, CRn = 0, CRm = 0 */ 280 + ARM64_FTR_REG(SYS_CTR_EL0, ftr_ctr), 281 + ARM64_FTR_REG(SYS_DCZID_EL0, ftr_dczid), 282 + 283 + /* Op1 = 3, CRn = 14, CRm = 0 */ 284 + ARM64_FTR_REG(SYS_CNTFRQ_EL0, ftr_generic32), 285 + }; 286 + 287 + static int search_cmp_ftr_reg(const void *id, const void *regp) 288 + { 289 + return (int)(unsigned long)id - (int)((const struct arm64_ftr_reg *)regp)->sys_id; 290 + } 291 + 292 + /* 293 + * get_arm64_ftr_reg - Lookup a feature register entry using its 294 + * sys_reg() encoding. With the array arm64_ftr_regs sorted in the 295 + * ascending order of sys_id, we use binary search to find a matching 296 + * entry. 297 + * 298 + * returns - Upon success, matching ftr_reg entry for id. 299 + * - NULL on failure. It is up to the caller to decide 300 + * the impact of a failure.
+ */
+static struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id)
+{
+	return bsearch((const void *)(unsigned long)sys_id,
+			arm64_ftr_regs,
+			ARRAY_SIZE(arm64_ftr_regs),
+			sizeof(arm64_ftr_regs[0]),
+			search_cmp_ftr_reg);
+}
+
+static u64 arm64_ftr_set_value(struct arm64_ftr_bits *ftrp, s64 reg, s64 ftr_val)
+{
+	u64 mask = arm64_ftr_mask(ftrp);
+
+	reg &= ~mask;
+	reg |= (ftr_val << ftrp->shift) & mask;
+	return reg;
+}
+
+static s64 arm64_ftr_safe_value(struct arm64_ftr_bits *ftrp, s64 new, s64 cur)
+{
+	s64 ret = 0;
+
+	switch (ftrp->type) {
+	case FTR_EXACT:
+		ret = ftrp->safe_val;
+		break;
+	case FTR_LOWER_SAFE:
+		ret = new < cur ? new : cur;
+		break;
+	case FTR_HIGHER_SAFE:
+		ret = new > cur ? new : cur;
+		break;
+	default:
+		BUG();
+	}
+
+	return ret;
+}
+
+static int __init sort_cmp_ftr_regs(const void *a, const void *b)
+{
+	return ((const struct arm64_ftr_reg *)a)->sys_id -
+	       ((const struct arm64_ftr_reg *)b)->sys_id;
+}
+
+static void __init swap_ftr_regs(void *a, void *b, int size)
+{
+	struct arm64_ftr_reg tmp = *(struct arm64_ftr_reg *)a;
+	*(struct arm64_ftr_reg *)a = *(struct arm64_ftr_reg *)b;
+	*(struct arm64_ftr_reg *)b = tmp;
+}
+
+static void __init sort_ftr_regs(void)
+{
+	/* Keep the array sorted so that we can do the binary search */
+	sort(arm64_ftr_regs,
+		ARRAY_SIZE(arm64_ftr_regs),
+		sizeof(arm64_ftr_regs[0]),
+		sort_cmp_ftr_regs,
+		swap_ftr_regs);
+}
+
+/*
+ * Initialise the CPU feature register from Boot CPU values.
+ * Also initialises the strict_mask for the register.
367 + */ 368 + static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new) 369 + { 370 + u64 val = 0; 371 + u64 strict_mask = ~0x0ULL; 372 + struct arm64_ftr_bits *ftrp; 373 + struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg); 374 + 375 + BUG_ON(!reg); 376 + 377 + for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { 378 + s64 ftr_new = arm64_ftr_value(ftrp, new); 379 + 380 + val = arm64_ftr_set_value(ftrp, val, ftr_new); 381 + if (!ftrp->strict) 382 + strict_mask &= ~arm64_ftr_mask(ftrp); 383 + } 384 + reg->sys_val = val; 385 + reg->strict_mask = strict_mask; 386 + } 387 + 388 + void __init init_cpu_features(struct cpuinfo_arm64 *info) 389 + { 390 + /* Before we start using the tables, make sure it is sorted */ 391 + sort_ftr_regs(); 392 + 393 + init_cpu_ftr_reg(SYS_CTR_EL0, info->reg_ctr); 394 + init_cpu_ftr_reg(SYS_DCZID_EL0, info->reg_dczid); 395 + init_cpu_ftr_reg(SYS_CNTFRQ_EL0, info->reg_cntfrq); 396 + init_cpu_ftr_reg(SYS_ID_AA64DFR0_EL1, info->reg_id_aa64dfr0); 397 + init_cpu_ftr_reg(SYS_ID_AA64DFR1_EL1, info->reg_id_aa64dfr1); 398 + init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0); 399 + init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1); 400 + init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0); 401 + init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1); 402 + init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0); 403 + init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1); 404 + init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0); 405 + init_cpu_ftr_reg(SYS_ID_ISAR0_EL1, info->reg_id_isar0); 406 + init_cpu_ftr_reg(SYS_ID_ISAR1_EL1, info->reg_id_isar1); 407 + init_cpu_ftr_reg(SYS_ID_ISAR2_EL1, info->reg_id_isar2); 408 + init_cpu_ftr_reg(SYS_ID_ISAR3_EL1, info->reg_id_isar3); 409 + init_cpu_ftr_reg(SYS_ID_ISAR4_EL1, info->reg_id_isar4); 410 + init_cpu_ftr_reg(SYS_ID_ISAR5_EL1, info->reg_id_isar5); 411 + init_cpu_ftr_reg(SYS_ID_MMFR0_EL1, info->reg_id_mmfr0); 412 + 
init_cpu_ftr_reg(SYS_ID_MMFR1_EL1, info->reg_id_mmfr1); 413 + init_cpu_ftr_reg(SYS_ID_MMFR2_EL1, info->reg_id_mmfr2); 414 + init_cpu_ftr_reg(SYS_ID_MMFR3_EL1, info->reg_id_mmfr3); 415 + init_cpu_ftr_reg(SYS_ID_PFR0_EL1, info->reg_id_pfr0); 416 + init_cpu_ftr_reg(SYS_ID_PFR1_EL1, info->reg_id_pfr1); 417 + init_cpu_ftr_reg(SYS_MVFR0_EL1, info->reg_mvfr0); 418 + init_cpu_ftr_reg(SYS_MVFR1_EL1, info->reg_mvfr1); 419 + init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2); 420 + } 421 + 422 + static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new) 423 + { 424 + struct arm64_ftr_bits *ftrp; 425 + 426 + for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) { 427 + s64 ftr_cur = arm64_ftr_value(ftrp, reg->sys_val); 428 + s64 ftr_new = arm64_ftr_value(ftrp, new); 429 + 430 + if (ftr_cur == ftr_new) 431 + continue; 432 + /* Find a safe value */ 433 + ftr_new = arm64_ftr_safe_value(ftrp, ftr_new, ftr_cur); 434 + reg->sys_val = arm64_ftr_set_value(ftrp, reg->sys_val, ftr_new); 435 + } 436 + 437 + } 438 + 439 + static int check_update_ftr_reg(u32 sys_id, int cpu, u64 val, u64 boot) 440 + { 441 + struct arm64_ftr_reg *regp = get_arm64_ftr_reg(sys_id); 442 + 443 + BUG_ON(!regp); 444 + update_cpu_ftr_reg(regp, val); 445 + if ((boot & regp->strict_mask) == (val & regp->strict_mask)) 446 + return 0; 447 + pr_warn("SANITY CHECK: Unexpected variation in %s. Boot CPU: %#016llx, CPU%d: %#016llx\n", 448 + regp->name, boot, cpu, val); 449 + return 1; 450 + } 451 + 452 + /* 453 + * Update system wide CPU feature registers with the values from a 454 + * non-boot CPU. Also performs SANITY checks to make sure that there 455 + * aren't any insane variations from that of the boot CPU. 456 + */ 457 + void update_cpu_features(int cpu, 458 + struct cpuinfo_arm64 *info, 459 + struct cpuinfo_arm64 *boot) 460 + { 461 + int taint = 0; 462 + 463 + /* 464 + * The kernel can handle differing I-cache policies, but otherwise 465 + * caches should look identical. 
Userspace JITs will make use of 466 + * *minLine. 467 + */ 468 + taint |= check_update_ftr_reg(SYS_CTR_EL0, cpu, 469 + info->reg_ctr, boot->reg_ctr); 470 + 471 + /* 472 + * Userspace may perform DC ZVA instructions. Mismatched block sizes 473 + * could result in too much or too little memory being zeroed if a 474 + * process is preempted and migrated between CPUs. 475 + */ 476 + taint |= check_update_ftr_reg(SYS_DCZID_EL0, cpu, 477 + info->reg_dczid, boot->reg_dczid); 478 + 479 + /* If different, timekeeping will be broken (especially with KVM) */ 480 + taint |= check_update_ftr_reg(SYS_CNTFRQ_EL0, cpu, 481 + info->reg_cntfrq, boot->reg_cntfrq); 482 + 483 + /* 484 + * The kernel uses self-hosted debug features and expects CPUs to 485 + * support identical debug features. We presently need CTX_CMPs, WRPs, 486 + * and BRPs to be identical. 487 + * ID_AA64DFR1 is currently RES0. 488 + */ 489 + taint |= check_update_ftr_reg(SYS_ID_AA64DFR0_EL1, cpu, 490 + info->reg_id_aa64dfr0, boot->reg_id_aa64dfr0); 491 + taint |= check_update_ftr_reg(SYS_ID_AA64DFR1_EL1, cpu, 492 + info->reg_id_aa64dfr1, boot->reg_id_aa64dfr1); 493 + /* 494 + * Even in big.LITTLE, processors should be identical instruction-set 495 + * wise. 496 + */ 497 + taint |= check_update_ftr_reg(SYS_ID_AA64ISAR0_EL1, cpu, 498 + info->reg_id_aa64isar0, boot->reg_id_aa64isar0); 499 + taint |= check_update_ftr_reg(SYS_ID_AA64ISAR1_EL1, cpu, 500 + info->reg_id_aa64isar1, boot->reg_id_aa64isar1); 501 + 502 + /* 503 + * Differing PARange support is fine as long as all peripherals and 504 + * memory are mapped within the minimum PARange of all CPUs. 505 + * Linux should not care about secure memory. 506 + */ 507 + taint |= check_update_ftr_reg(SYS_ID_AA64MMFR0_EL1, cpu, 508 + info->reg_id_aa64mmfr0, boot->reg_id_aa64mmfr0); 509 + taint |= check_update_ftr_reg(SYS_ID_AA64MMFR1_EL1, cpu, 510 + info->reg_id_aa64mmfr1, boot->reg_id_aa64mmfr1); 511 + 512 + /* 513 + * EL3 is not our concern. 
514 + * ID_AA64PFR1 is currently RES0. 515 + */ 516 + taint |= check_update_ftr_reg(SYS_ID_AA64PFR0_EL1, cpu, 517 + info->reg_id_aa64pfr0, boot->reg_id_aa64pfr0); 518 + taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu, 519 + info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1); 520 + 521 + /* 522 + * If we have AArch32, we care about 32-bit features for compat. These 523 + * registers should be RES0 otherwise. 524 + */ 525 + taint |= check_update_ftr_reg(SYS_ID_DFR0_EL1, cpu, 526 + info->reg_id_dfr0, boot->reg_id_dfr0); 527 + taint |= check_update_ftr_reg(SYS_ID_ISAR0_EL1, cpu, 528 + info->reg_id_isar0, boot->reg_id_isar0); 529 + taint |= check_update_ftr_reg(SYS_ID_ISAR1_EL1, cpu, 530 + info->reg_id_isar1, boot->reg_id_isar1); 531 + taint |= check_update_ftr_reg(SYS_ID_ISAR2_EL1, cpu, 532 + info->reg_id_isar2, boot->reg_id_isar2); 533 + taint |= check_update_ftr_reg(SYS_ID_ISAR3_EL1, cpu, 534 + info->reg_id_isar3, boot->reg_id_isar3); 535 + taint |= check_update_ftr_reg(SYS_ID_ISAR4_EL1, cpu, 536 + info->reg_id_isar4, boot->reg_id_isar4); 537 + taint |= check_update_ftr_reg(SYS_ID_ISAR5_EL1, cpu, 538 + info->reg_id_isar5, boot->reg_id_isar5); 539 + 540 + /* 541 + * Regardless of the value of the AuxReg field, the AIFSR, ADFSR, and 542 + * ACTLR formats could differ across CPUs and therefore would have to 543 + * be trapped for virtualization anyway. 
544 + */ 545 + taint |= check_update_ftr_reg(SYS_ID_MMFR0_EL1, cpu, 546 + info->reg_id_mmfr0, boot->reg_id_mmfr0); 547 + taint |= check_update_ftr_reg(SYS_ID_MMFR1_EL1, cpu, 548 + info->reg_id_mmfr1, boot->reg_id_mmfr1); 549 + taint |= check_update_ftr_reg(SYS_ID_MMFR2_EL1, cpu, 550 + info->reg_id_mmfr2, boot->reg_id_mmfr2); 551 + taint |= check_update_ftr_reg(SYS_ID_MMFR3_EL1, cpu, 552 + info->reg_id_mmfr3, boot->reg_id_mmfr3); 553 + taint |= check_update_ftr_reg(SYS_ID_PFR0_EL1, cpu, 554 + info->reg_id_pfr0, boot->reg_id_pfr0); 555 + taint |= check_update_ftr_reg(SYS_ID_PFR1_EL1, cpu, 556 + info->reg_id_pfr1, boot->reg_id_pfr1); 557 + taint |= check_update_ftr_reg(SYS_MVFR0_EL1, cpu, 558 + info->reg_mvfr0, boot->reg_mvfr0); 559 + taint |= check_update_ftr_reg(SYS_MVFR1_EL1, cpu, 560 + info->reg_mvfr1, boot->reg_mvfr1); 561 + taint |= check_update_ftr_reg(SYS_MVFR2_EL1, cpu, 562 + info->reg_mvfr2, boot->reg_mvfr2); 563 + 564 + /* 565 + * Mismatched CPU features are a recipe for disaster. Don't even 566 + * pretend to support them. 
567 + */ 568 + WARN_TAINT_ONCE(taint, TAINT_CPU_OUT_OF_SPEC, 569 + "Unsupported CPU feature variation.\n"); 570 + } 571 + 572 + u64 read_system_reg(u32 id) 573 + { 574 + struct arm64_ftr_reg *regp = get_arm64_ftr_reg(id); 575 + 576 + /* We shouldn't get a request for an unsupported register */ 577 + BUG_ON(!regp); 578 + return regp->sys_val; 579 + } 25 580 26 581 #include <linux/irqchip/arm-gic-v3.h> 27 582 ··· 588 33 return val >= entry->min_field_value; 589 34 } 590 35 591 - #define __ID_FEAT_CHK(reg) \ 592 - static bool __maybe_unused \ 593 - has_##reg##_feature(const struct arm64_cpu_capabilities *entry) \ 594 - { \ 595 - u64 val; \ 596 - \ 597 - val = read_cpuid(reg##_el1); \ 598 - return feature_matches(val, entry); \ 599 - } 36 + static bool 37 + has_cpuid_feature(const struct arm64_cpu_capabilities *entry) 38 + { 39 + u64 val; 600 40 601 - __ID_FEAT_CHK(id_aa64pfr0); 602 - __ID_FEAT_CHK(id_aa64mmfr1); 603 - __ID_FEAT_CHK(id_aa64isar0); 41 + val = read_system_reg(entry->sys_reg); 42 + return feature_matches(val, entry); 43 + } 604 44 605 45 static bool has_useable_gicv3_cpuif(const struct arm64_cpu_capabilities *entry) 606 46 { 607 47 bool has_sre; 608 48 609 - if (!has_id_aa64pfr0_feature(entry)) 49 + if (!has_cpuid_feature(entry)) 610 50 return false; 611 51 612 52 has_sre = gic_enable_sre(); ··· 617 67 .desc = "GIC system register CPU interface", 618 68 .capability = ARM64_HAS_SYSREG_GIC_CPUIF, 619 69 .matches = has_useable_gicv3_cpuif, 620 - .field_pos = 24, 70 + .sys_reg = SYS_ID_AA64PFR0_EL1, 71 + .field_pos = ID_AA64PFR0_GIC_SHIFT, 621 72 .min_field_value = 1, 622 73 }, 623 74 #ifdef CONFIG_ARM64_PAN 624 75 { 625 76 .desc = "Privileged Access Never", 626 77 .capability = ARM64_HAS_PAN, 627 - .matches = has_id_aa64mmfr1_feature, 628 - .field_pos = 20, 78 + .matches = has_cpuid_feature, 79 + .sys_reg = SYS_ID_AA64MMFR1_EL1, 80 + .field_pos = ID_AA64MMFR1_PAN_SHIFT, 629 81 .min_field_value = 1, 630 82 .enable = cpu_enable_pan, 631 83 }, ··· 636 84 { 637 
85 .desc = "LSE atomic instructions", 638 86 .capability = ARM64_HAS_LSE_ATOMICS, 639 - .matches = has_id_aa64isar0_feature, 640 - .field_pos = 20, 87 + .matches = has_cpuid_feature, 88 + .sys_reg = SYS_ID_AA64ISAR0_EL1, 89 + .field_pos = ID_AA64ISAR0_ATOMICS_SHIFT, 641 90 .min_field_value = 2, 642 91 }, 643 92 #endif /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */ 644 93 {}, 645 94 }; 646 95 647 - void check_cpu_capabilities(const struct arm64_cpu_capabilities *caps, 96 + #define HWCAP_CAP(reg, field, min_value, type, cap) \ 97 + { \ 98 + .desc = #cap, \ 99 + .matches = has_cpuid_feature, \ 100 + .sys_reg = reg, \ 101 + .field_pos = field, \ 102 + .min_field_value = min_value, \ 103 + .hwcap_type = type, \ 104 + .hwcap = cap, \ 105 + } 106 + 107 + static const struct arm64_cpu_capabilities arm64_hwcaps[] = { 108 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, 2, CAP_HWCAP, HWCAP_PMULL), 109 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, 1, CAP_HWCAP, HWCAP_AES), 110 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA1_SHIFT, 1, CAP_HWCAP, HWCAP_SHA1), 111 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SHA2_SHIFT, 1, CAP_HWCAP, HWCAP_SHA2), 112 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_CRC32_SHIFT, 1, CAP_HWCAP, HWCAP_CRC32), 113 + HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_ATOMICS_SHIFT, 2, CAP_HWCAP, HWCAP_ATOMICS), 114 + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, 0, CAP_HWCAP, HWCAP_FP), 115 + HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, 0, CAP_HWCAP, HWCAP_ASIMD), 116 + #ifdef CONFIG_COMPAT 117 + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL), 118 + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES), 119 + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1), 120 + HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA2_SHIFT, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA2), 121 + HWCAP_CAP(SYS_ID_ISAR5_EL1, 
ID_ISAR5_CRC32_SHIFT, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_CRC32), 122 + #endif 123 + {}, 124 + }; 125 + 126 + static void cap_set_hwcap(const struct arm64_cpu_capabilities *cap) 127 + { 128 + switch (cap->hwcap_type) { 129 + case CAP_HWCAP: 130 + elf_hwcap |= cap->hwcap; 131 + break; 132 + #ifdef CONFIG_COMPAT 133 + case CAP_COMPAT_HWCAP: 134 + compat_elf_hwcap |= (u32)cap->hwcap; 135 + break; 136 + case CAP_COMPAT_HWCAP2: 137 + compat_elf_hwcap2 |= (u32)cap->hwcap; 138 + break; 139 + #endif 140 + default: 141 + WARN_ON(1); 142 + break; 143 + } 144 + } 145 + 146 + /* Check if we have a particular HWCAP enabled */ 147 + static bool cpus_have_hwcap(const struct arm64_cpu_capabilities *cap) 148 + { 149 + bool rc; 150 + 151 + switch (cap->hwcap_type) { 152 + case CAP_HWCAP: 153 + rc = (elf_hwcap & cap->hwcap) != 0; 154 + break; 155 + #ifdef CONFIG_COMPAT 156 + case CAP_COMPAT_HWCAP: 157 + rc = (compat_elf_hwcap & (u32)cap->hwcap) != 0; 158 + break; 159 + case CAP_COMPAT_HWCAP2: 160 + rc = (compat_elf_hwcap2 & (u32)cap->hwcap) != 0; 161 + break; 162 + #endif 163 + default: 164 + WARN_ON(1); 165 + rc = false; 166 + } 167 + 168 + return rc; 169 + } 170 + 171 + static void setup_cpu_hwcaps(void) 172 + { 173 + int i; 174 + const struct arm64_cpu_capabilities *hwcaps = arm64_hwcaps; 175 + 176 + for (i = 0; hwcaps[i].desc; i++) 177 + if (hwcaps[i].matches(&hwcaps[i])) 178 + cap_set_hwcap(&hwcaps[i]); 179 + } 180 + 181 + void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, 648 182 const char *info) 649 183 { 650 184 int i; ··· 743 105 pr_info("%s %s\n", info, caps[i].desc); 744 106 cpus_set_cap(caps[i].capability); 745 107 } 108 + } 746 109 747 - /* second pass allows enable() to consider interacting capabilities */ 748 - for (i = 0; caps[i].desc; i++) { 749 - if (cpus_have_cap(caps[i].capability) && caps[i].enable) 750 - caps[i].enable(); 110 + /* 111 + * Run through the enabled capabilities and enable() it on all active 112 + * CPUs 113 + */ 114 + static 
void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
+{
+	int i;
+
+	for (i = 0; caps[i].desc; i++)
+		if (caps[i].enable && cpus_have_cap(caps[i].capability))
+			on_each_cpu(caps[i].enable, NULL, true);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+
+/*
+ * Flag to indicate if we have computed the system wide
+ * capabilities based on the boot time active CPUs. This
+ * will be used to determine if a new booting CPU should
+ * go through the verification process to make sure that it
+ * supports the system capabilities, without using a hotplug
+ * notifier.
+ */
+static bool sys_caps_initialised;
+
+static inline void set_sys_caps_initialised(void)
+{
+	sys_caps_initialised = true;
+}
+
+/*
+ * __raw_read_system_reg() - Used by a STARTING cpu before cpuinfo is populated.
+ */
+static u64 __raw_read_system_reg(u32 sys_id)
+{
+	switch (sys_id) {
+	case SYS_ID_PFR0_EL1:		return (u64)read_cpuid(ID_PFR0_EL1);
+	case SYS_ID_PFR1_EL1:		return (u64)read_cpuid(ID_PFR1_EL1);
+	case SYS_ID_DFR0_EL1:		return (u64)read_cpuid(ID_DFR0_EL1);
+	case SYS_ID_MMFR0_EL1:		return (u64)read_cpuid(ID_MMFR0_EL1);
+	case SYS_ID_MMFR1_EL1:		return (u64)read_cpuid(ID_MMFR1_EL1);
+	case SYS_ID_MMFR2_EL1:		return (u64)read_cpuid(ID_MMFR2_EL1);
+	case SYS_ID_MMFR3_EL1:		return (u64)read_cpuid(ID_MMFR3_EL1);
+	case SYS_ID_ISAR0_EL1:		return (u64)read_cpuid(ID_ISAR0_EL1);
+	case SYS_ID_ISAR1_EL1:		return (u64)read_cpuid(ID_ISAR1_EL1);
+	case SYS_ID_ISAR2_EL1:		return (u64)read_cpuid(ID_ISAR2_EL1);
+	case SYS_ID_ISAR3_EL1:		return (u64)read_cpuid(ID_ISAR3_EL1);
+	case SYS_ID_ISAR4_EL1:		return (u64)read_cpuid(ID_ISAR4_EL1);
+	case SYS_ID_ISAR5_EL1:		return (u64)read_cpuid(ID_ISAR5_EL1);
+	case SYS_MVFR0_EL1:		return (u64)read_cpuid(MVFR0_EL1);
+	case SYS_MVFR1_EL1:		return (u64)read_cpuid(MVFR1_EL1);
+	case SYS_MVFR2_EL1:		return (u64)read_cpuid(MVFR2_EL1);
+
+	case SYS_ID_AA64PFR0_EL1:	return (u64)read_cpuid(ID_AA64PFR0_EL1);
+	case SYS_ID_AA64PFR1_EL1:	return (u64)read_cpuid(ID_AA64PFR1_EL1);
+	case SYS_ID_AA64DFR0_EL1:	return (u64)read_cpuid(ID_AA64DFR0_EL1);
+	case SYS_ID_AA64DFR1_EL1:	return (u64)read_cpuid(ID_AA64DFR1_EL1);
+	case SYS_ID_AA64MMFR0_EL1:	return (u64)read_cpuid(ID_AA64MMFR0_EL1);
+	case SYS_ID_AA64MMFR1_EL1:	return (u64)read_cpuid(ID_AA64MMFR1_EL1);
+	case SYS_ID_AA64ISAR0_EL1:	return (u64)read_cpuid(ID_AA64ISAR0_EL1);
+	case SYS_ID_AA64ISAR1_EL1:	return (u64)read_cpuid(ID_AA64ISAR1_EL1);
+
+	case SYS_CNTFRQ_EL0:		return (u64)read_cpuid(CNTFRQ_EL0);
+	case SYS_CTR_EL0:		return (u64)read_cpuid(CTR_EL0);
+	case SYS_DCZID_EL0:		return (u64)read_cpuid(DCZID_EL0);
+	default:
+		BUG();
+		return 0;
 	}
 }
 
-void check_local_cpu_features(void)
+/*
+ * Park the CPU which doesn't have the capability as advertised
+ * by the system.
+ */
+static void fail_incapable_cpu(char *cap_type,
+			       const struct arm64_cpu_capabilities *cap)
 {
-	check_cpu_capabilities(arm64_features, "detected feature:");
+	int cpu = smp_processor_id();
+
+	pr_crit("CPU%d: missing %s : %s\n", cpu, cap_type, cap->desc);
+	/* Mark this CPU absent */
+	set_cpu_present(cpu, 0);
+
+	/* Check if we can park ourselves */
+	if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_die)
+		cpu_ops[cpu]->cpu_die(cpu);
+	asm(
+	"1:	wfe\n"
+	"	wfi\n"
+	"	b	1b");
+}
+
+/*
+ * Run through the enabled system capabilities and enable() it on this CPU.
+ * The capabilities were decided based on the available CPUs at the boot time.
+ * Any new CPU should match the system wide status of the capability.
+ * If the new CPU doesn't have a capability which the system now has enabled,
+ * we cannot do anything to fix it up and could cause unexpected failures. So
+ * we park the CPU.
+ */
+void verify_local_cpu_capabilities(void)
+{
+	int i;
+	const struct arm64_cpu_capabilities *caps;
+
+	/*
+	 * If we haven't computed the system capabilities, there is nothing
+	 * to verify.
+	 */
+	if (!sys_caps_initialised)
+		return;
+
+	caps = arm64_features;
+	for (i = 0; caps[i].desc; i++) {
+		if (!cpus_have_cap(caps[i].capability) || !caps[i].sys_reg)
+			continue;
+		/*
+		 * If the new CPU misses an advertised feature, we cannot
+		 * proceed further, park the CPU.
+		 */
+		if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i]))
+			fail_incapable_cpu("arm64_features", &caps[i]);
+		if (caps[i].enable)
+			caps[i].enable(NULL);
+	}
+
+	for (i = 0, caps = arm64_hwcaps; caps[i].desc; i++) {
+		if (!cpus_have_hwcap(&caps[i]))
+			continue;
+		if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i]))
+			fail_incapable_cpu("arm64_hwcaps", &caps[i]);
+	}
+}
+
+#else	/* !CONFIG_HOTPLUG_CPU */
+
+static inline void set_sys_caps_initialised(void)
+{
+}
+
+#endif	/* CONFIG_HOTPLUG_CPU */
+
+static void setup_feature_capabilities(void)
+{
+	update_cpu_capabilities(arm64_features, "detected feature:");
+	enable_cpu_capabilities(arm64_features);
+}
+
+void __init setup_cpu_features(void)
+{
+	u32 cwg;
+	int cls;
+
+	/* Set the CPU feature capabilities */
+	setup_feature_capabilities();
+	setup_cpu_hwcaps();
+
+	/* Advertise that we have computed the system capabilities */
+	set_sys_caps_initialised();
+
+	/*
+	 * Check for sane CTR_EL0.CWG value.
+	 */
+	cwg = cache_type_cwg();
+	cls = cache_line_size();
+	if (!cwg)
+		pr_warn("No Cache Writeback Granule information, assuming cache line size %d\n",
+			cls);
+	if (L1_CACHE_BYTES < cls)
+		pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n",
+			L1_CACHE_BYTES, cls);
 }
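The per-field sanitisation above boils down to simple arithmetic: pull a 4-bit field out of an ID register and, for FTR_LOWER_SAFE fields, keep the minimum value seen across CPUs. A minimal user-space sketch of that idea (hypothetical helper names, not the kernel's API, and assuming an unsigned field of width 4):

```c
#include <assert.h>
#include <stdint.h>

/* Extract an unsigned 4-bit feature field at 'shift', mirroring the
 * shape of the kernel's arm64_ftr_value() for unsigned fields. */
static int64_t ftr_field(uint64_t reg, unsigned int shift)
{
	return (int64_t)((reg >> shift) & 0xf);
}

/* FTR_LOWER_SAFE: the system-wide safe value for a field is the
 * minimum of the values reported by the individual CPUs. */
static int64_t lower_safe(int64_t new_val, int64_t cur)
{
	return new_val < cur ? new_val : cur;
}
```

So if one CPU reports 3 and another reports 1 in a LOWER_SAFE field, the sanitised system-wide view advertises 1.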
+126 -134
arch/arm64/kernel/cpuinfo.c
··· 24 24 #include <linux/bug.h> 25 25 #include <linux/init.h> 26 26 #include <linux/kernel.h> 27 + #include <linux/personality.h> 27 28 #include <linux/preempt.h> 28 29 #include <linux/printk.h> 30 + #include <linux/seq_file.h> 31 + #include <linux/sched.h> 29 32 #include <linux/smp.h> 30 33 31 34 /* ··· 38 35 */ 39 36 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data); 40 37 static struct cpuinfo_arm64 boot_cpu_data; 41 - static bool mixed_endian_el0 = true; 42 38 43 39 static char *icache_policy_str[] = { 44 40 [ICACHE_POLICY_RESERVED] = "RESERVED/UNKNOWN", ··· 47 45 }; 48 46 49 47 unsigned long __icache_flags; 48 + 49 + static const char *const hwcap_str[] = { 50 + "fp", 51 + "asimd", 52 + "evtstrm", 53 + "aes", 54 + "pmull", 55 + "sha1", 56 + "sha2", 57 + "crc32", 58 + "atomics", 59 + NULL 60 + }; 61 + 62 + #ifdef CONFIG_COMPAT 63 + static const char *const compat_hwcap_str[] = { 64 + "swp", 65 + "half", 66 + "thumb", 67 + "26bit", 68 + "fastmult", 69 + "fpa", 70 + "vfp", 71 + "edsp", 72 + "java", 73 + "iwmmxt", 74 + "crunch", 75 + "thumbee", 76 + "neon", 77 + "vfpv3", 78 + "vfpv3d16", 79 + "tls", 80 + "vfpv4", 81 + "idiva", 82 + "idivt", 83 + "vfpd32", 84 + "lpae", 85 + "evtstrm" 86 + }; 87 + 88 + static const char *const compat_hwcap2_str[] = { 89 + "aes", 90 + "pmull", 91 + "sha1", 92 + "sha2", 93 + "crc32", 94 + NULL 95 + }; 96 + #endif /* CONFIG_COMPAT */ 97 + 98 + static int c_show(struct seq_file *m, void *v) 99 + { 100 + int i, j; 101 + 102 + for_each_online_cpu(i) { 103 + struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i); 104 + u32 midr = cpuinfo->reg_midr; 105 + 106 + /* 107 + * glibc reads /proc/cpuinfo to determine the number of 108 + * online processors, looking for lines beginning with 109 + * "processor". Give glibc what it expects. 110 + */ 111 + seq_printf(m, "processor\t: %d\n", i); 112 + 113 + /* 114 + * Dump out the common processor features in a single line. 
115 + * Userspace should read the hwcaps with getauxval(AT_HWCAP) 116 + * rather than attempting to parse this, but there's a body of 117 + * software which does already (at least for 32-bit). 118 + */ 119 + seq_puts(m, "Features\t:"); 120 + if (personality(current->personality) == PER_LINUX32) { 121 + #ifdef CONFIG_COMPAT 122 + for (j = 0; compat_hwcap_str[j]; j++) 123 + if (compat_elf_hwcap & (1 << j)) 124 + seq_printf(m, " %s", compat_hwcap_str[j]); 125 + 126 + for (j = 0; compat_hwcap2_str[j]; j++) 127 + if (compat_elf_hwcap2 & (1 << j)) 128 + seq_printf(m, " %s", compat_hwcap2_str[j]); 129 + #endif /* CONFIG_COMPAT */ 130 + } else { 131 + for (j = 0; hwcap_str[j]; j++) 132 + if (elf_hwcap & (1 << j)) 133 + seq_printf(m, " %s", hwcap_str[j]); 134 + } 135 + seq_puts(m, "\n"); 136 + 137 + seq_printf(m, "CPU implementer\t: 0x%02x\n", 138 + MIDR_IMPLEMENTOR(midr)); 139 + seq_printf(m, "CPU architecture: 8\n"); 140 + seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr)); 141 + seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr)); 142 + seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr)); 143 + } 144 + 145 + return 0; 146 + } 147 + 148 + static void *c_start(struct seq_file *m, loff_t *pos) 149 + { 150 + return *pos < 1 ? 
(void *)1 : NULL; 151 + } 152 + 153 + static void *c_next(struct seq_file *m, void *v, loff_t *pos) 154 + { 155 + ++*pos; 156 + return NULL; 157 + } 158 + 159 + static void c_stop(struct seq_file *m, void *v) 160 + { 161 + } 162 + 163 + const struct seq_operations cpuinfo_op = { 164 + .start = c_start, 165 + .next = c_next, 166 + .stop = c_stop, 167 + .show = c_show 168 + }; 50 169 51 170 static void cpuinfo_detect_icache_policy(struct cpuinfo_arm64 *info) 52 171 { ··· 190 67 set_bit(ICACHEF_AIVIVT, &__icache_flags); 191 68 192 69 pr_info("Detected %s I-cache on CPU%d\n", icache_policy_str[l1ip], cpu); 193 - } 194 - 195 - bool cpu_supports_mixed_endian_el0(void) 196 - { 197 - return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1)); 198 - } 199 - 200 - bool system_supports_mixed_endian_el0(void) 201 - { 202 - return mixed_endian_el0; 203 - } 204 - 205 - static void update_mixed_endian_el0_support(struct cpuinfo_arm64 *info) 206 - { 207 - mixed_endian_el0 &= id_aa64mmfr0_mixed_endian_el0(info->reg_id_aa64mmfr0); 208 - } 209 - 210 - static void update_cpu_features(struct cpuinfo_arm64 *info) 211 - { 212 - update_mixed_endian_el0_support(info); 213 - } 214 - 215 - static int check_reg_mask(char *name, u64 mask, u64 boot, u64 cur, int cpu) 216 - { 217 - if ((boot & mask) == (cur & mask)) 218 - return 0; 219 - 220 - pr_warn("SANITY CHECK: Unexpected variation in %s. Boot CPU: %#016lx, CPU%d: %#016lx\n", 221 - name, (unsigned long)boot, cpu, (unsigned long)cur); 222 - 223 - return 1; 224 - } 225 - 226 - #define CHECK_MASK(field, mask, boot, cur, cpu) \ 227 - check_reg_mask(#field, mask, (boot)->reg_ ## field, (cur)->reg_ ## field, cpu) 228 - 229 - #define CHECK(field, boot, cur, cpu) \ 230 - CHECK_MASK(field, ~0ULL, boot, cur, cpu) 231 - 232 - /* 233 - * Verify that CPUs don't have unexpected differences that will cause problems. 
234 - */ 235 - static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur) 236 - { 237 - unsigned int cpu = smp_processor_id(); 238 - struct cpuinfo_arm64 *boot = &boot_cpu_data; 239 - unsigned int diff = 0; 240 - 241 - /* 242 - * The kernel can handle differing I-cache policies, but otherwise 243 - * caches should look identical. Userspace JITs will make use of 244 - * *minLine. 245 - */ 246 - diff |= CHECK_MASK(ctr, 0xffff3fff, boot, cur, cpu); 247 - 248 - /* 249 - * Userspace may perform DC ZVA instructions. Mismatched block sizes 250 - * could result in too much or too little memory being zeroed if a 251 - * process is preempted and migrated between CPUs. 252 - */ 253 - diff |= CHECK(dczid, boot, cur, cpu); 254 - 255 - /* If different, timekeeping will be broken (especially with KVM) */ 256 - diff |= CHECK(cntfrq, boot, cur, cpu); 257 - 258 - /* 259 - * The kernel uses self-hosted debug features and expects CPUs to 260 - * support identical debug features. We presently need CTX_CMPs, WRPs, 261 - * and BRPs to be identical. 262 - * ID_AA64DFR1 is currently RES0. 263 - */ 264 - diff |= CHECK(id_aa64dfr0, boot, cur, cpu); 265 - diff |= CHECK(id_aa64dfr1, boot, cur, cpu); 266 - 267 - /* 268 - * Even in big.LITTLE, processors should be identical instruction-set 269 - * wise. 270 - */ 271 - diff |= CHECK(id_aa64isar0, boot, cur, cpu); 272 - diff |= CHECK(id_aa64isar1, boot, cur, cpu); 273 - 274 - /* 275 - * Differing PARange support is fine as long as all peripherals and 276 - * memory are mapped within the minimum PARange of all CPUs. 277 - * Linux should not care about secure memory. 278 - * ID_AA64MMFR1 is currently RES0. 279 - */ 280 - diff |= CHECK_MASK(id_aa64mmfr0, 0xffffffffffff0ff0, boot, cur, cpu); 281 - diff |= CHECK(id_aa64mmfr1, boot, cur, cpu); 282 - 283 - /* 284 - * EL3 is not our concern. 285 - * ID_AA64PFR1 is currently RES0. 
286 - */ 287 - diff |= CHECK_MASK(id_aa64pfr0, 0xffffffffffff0fff, boot, cur, cpu); 288 - diff |= CHECK(id_aa64pfr1, boot, cur, cpu); 289 - 290 - /* 291 - * If we have AArch32, we care about 32-bit features for compat. These 292 - * registers should be RES0 otherwise. 293 - */ 294 - diff |= CHECK(id_dfr0, boot, cur, cpu); 295 - diff |= CHECK(id_isar0, boot, cur, cpu); 296 - diff |= CHECK(id_isar1, boot, cur, cpu); 297 - diff |= CHECK(id_isar2, boot, cur, cpu); 298 - diff |= CHECK(id_isar3, boot, cur, cpu); 299 - diff |= CHECK(id_isar4, boot, cur, cpu); 300 - diff |= CHECK(id_isar5, boot, cur, cpu); 301 - /* 302 - * Regardless of the value of the AuxReg field, the AIFSR, ADFSR, and 303 - * ACTLR formats could differ across CPUs and therefore would have to 304 - * be trapped for virtualization anyway. 305 - */ 306 - diff |= CHECK_MASK(id_mmfr0, 0xff0fffff, boot, cur, cpu); 307 - diff |= CHECK(id_mmfr1, boot, cur, cpu); 308 - diff |= CHECK(id_mmfr2, boot, cur, cpu); 309 - diff |= CHECK(id_mmfr3, boot, cur, cpu); 310 - diff |= CHECK(id_pfr0, boot, cur, cpu); 311 - diff |= CHECK(id_pfr1, boot, cur, cpu); 312 - 313 - diff |= CHECK(mvfr0, boot, cur, cpu); 314 - diff |= CHECK(mvfr1, boot, cur, cpu); 315 - diff |= CHECK(mvfr2, boot, cur, cpu); 316 - 317 - /* 318 - * Mismatched CPU features are a recipe for disaster. Don't even 319 - * pretend to support them. 
320 - */ 321 - WARN_TAINT_ONCE(diff, TAINT_CPU_OUT_OF_SPEC, 322 - "Unsupported CPU feature variation.\n"); 323 70 } 324 71 325 72 static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) ··· 229 236 cpuinfo_detect_icache_policy(info); 230 237 231 238 check_local_cpu_errata(); 232 - check_local_cpu_features(); 233 - update_cpu_features(info); 234 239 } 235 240 236 241 void cpuinfo_store_cpu(void) 237 242 { 238 243 struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data); 239 244 __cpuinfo_store_cpu(info); 240 - cpuinfo_sanity_check(info); 245 + update_cpu_features(smp_processor_id(), info, &boot_cpu_data); 241 246 } 242 247 243 248 void __init cpuinfo_store_boot_cpu(void) ··· 244 253 __cpuinfo_store_cpu(info); 245 254 246 255 boot_cpu_data = *info; 256 + init_cpu_features(&boot_cpu_data); 247 257 }
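c_show() above maps each set bit in elf_hwcap to the string at the same index in hwcap_str when it prints the "Features" line. A stand-alone sketch of that bit-to-name walk (the bit positions here follow the arm64 convention that entry N of the table corresponds to bit N; treat the helper as illustrative, not the kernel function):

```c
#include <assert.h>
#include <string.h>

static const char *const hwcap_str[] = {
	"fp", "asimd", "evtstrm", "aes", "pmull",
	"sha1", "sha2", "crc32", "atomics", NULL
};

/* Append " <name>" for every set capability bit, as c_show() does
 * when building the Features line for /proc/cpuinfo. */
static void format_features(unsigned long hwcap, char *buf, size_t len)
{
	buf[0] = '\0';
	for (int j = 0; hwcap_str[j]; j++)
		if (hwcap & (1UL << j)) {
			strncat(buf, " ", len - strlen(buf) - 1);
			strncat(buf, hwcap_str[j], len - strlen(buf) - 1);
		}
}
```

This is why glibc and other parsers can rely on the ordering: the strings appear in table order, gated purely by the bitmask.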
+4 -2
arch/arm64/kernel/debug-monitors.c
···
 #include <linux/stat.h>
 #include <linux/uaccess.h>
 
-#include <asm/debug-monitors.h>
+#include <asm/cpufeature.h>
 #include <asm/cputype.h>
+#include <asm/debug-monitors.h>
 #include <asm/system_misc.h>
 
 /* Determine debug architecture. */
 u8 debug_monitors_arch(void)
 {
-	return read_cpuid(ID_AA64DFR0_EL1) & 0xf;
+	return cpuid_feature_extract_field(read_system_reg(SYS_ID_AA64DFR0_EL1),
+						ID_AA64DFR0_DEBUGVER_SHIFT);
 }
 
 /*
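The debug-monitors change swaps a raw `& 0xf` for cpuid_feature_extract_field(), which in this series sign-extends the 4-bit ID field so that the "not implemented" encoding 0xf reads as -1. A sketch of that extraction (a stand-alone re-derivation for a fixed width of 4, not the kernel header):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extending extraction of a 4-bit ID register field at 'shift':
 * shift the field up to the top of the 64-bit value, then arithmetic
 * shift it back down so the field's top bit becomes the sign. */
static int64_t extract_signed_field(uint64_t reg, unsigned int shift)
{
	return (int64_t)(reg << (60 - shift)) >> 60;
}
```

With this, a field value of 0xf compares as negative, which is how signed ID fields mark an absent feature.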
+5 -5
arch/arm64/kernel/efi-entry.S
··· 29 29 * we want to be. The kernel image wants to be placed at TEXT_OFFSET 30 30 * from start of RAM. 31 31 */ 32 - ENTRY(efi_stub_entry) 32 + ENTRY(entry) 33 33 /* 34 34 * Create a stack frame to save FP/LR with extra space 35 35 * for image_addr variable passed to efi_entry(). ··· 86 86 * entries for the VA range of the current image, so no maintenance is 87 87 * necessary. 88 88 */ 89 - adr x0, efi_stub_entry 90 - adr x1, efi_stub_entry_end 89 + adr x0, entry 90 + adr x1, entry_end 91 91 sub x1, x1, x0 92 92 bl __flush_dcache_area 93 93 ··· 120 120 ldp x29, x30, [sp], #32 121 121 ret 122 122 123 - efi_stub_entry_end: 124 - ENDPROC(efi_stub_entry) 123 + entry_end: 124 + ENDPROC(entry)
arch/arm64/kernel/efi-stub.c drivers/firmware/efi/libstub/arm64-stub.c
+2 -3
arch/arm64/kernel/efi.c
···
  48  48          .mmap_sem               = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
  49  49          .page_table_lock        = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
  50  50          .mmlist                 = LIST_HEAD_INIT(efi_mm.mmlist),
  51   -          INIT_MM_CONTEXT(efi_mm)
  52  51  };
  53  52
  54  53  static int __init is_normal_ram(efi_memory_desc_t *md)
···
 334 335          else
 335 336                  cpu_switch_mm(mm->pgd, mm);
 336 337
 337   -          flush_tlb_all();
 338   +          local_flush_tlb_all();
 338 339          if (icache_is_aivivt())
 339   -                  __flush_icache_all();
 340   +                  __local_flush_icache_all();
 340 341  }
 341 342
 342 343  void efi_virtmap_load(void)
+2
arch/arm64/kernel/entry.S
···
 430 430          b.eq    el0_fpsimd_acc
 431 431          cmp     x24, #ESR_ELx_EC_FP_EXC32       // FP/ASIMD exception
 432 432          b.eq    el0_fpsimd_exc
 433   +          cmp     x24, #ESR_ELx_EC_PC_ALIGN       // pc alignment exception
 434   +          b.eq    el0_sp_pc
 433 435          cmp     x24, #ESR_ELx_EC_UNKNOWN        // unknown exception in EL0
 434 436          b.eq    el0_undef
 435 437          cmp     x24, #ESR_ELx_EC_CP15_32        // CP15 MRC/MCR trap
+5 -11
arch/arm64/kernel/fpsimd.c
···
 332 332   */
 333 333  static int __init fpsimd_init(void)
 334 334  {
 335   -          u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
 336   -
 337   -          if (pfr & (0xf << 16)) {
 335   +          if (elf_hwcap & HWCAP_FP) {
 336   +                  fpsimd_pm_init();
 337   +                  fpsimd_hotplug_init();
 338   +          } else {
 338 339                  pr_notice("Floating-point is not implemented\n");
 339   -                  return 0;
 340 340          }
 341   -          elf_hwcap |= HWCAP_FP;
 342 341
 343   -          if (pfr & (0xf << 20))
 342   +          if (!(elf_hwcap & HWCAP_ASIMD))
 344 343                  pr_notice("Advanced SIMD is not implemented\n");
 345   -          else
 346   -                  elf_hwcap |= HWCAP_ASIMD;
 347   -
 348   -          fpsimd_pm_init();
 349   -          fpsimd_hotplug_init();
 350 344
 351 345          return 0;
 352 346  }
+37 -39
arch/arm64/kernel/head.S
···
  29  29  #include <asm/asm-offsets.h>
  30  30  #include <asm/cache.h>
  31  31  #include <asm/cputype.h>
  32   +  #include <asm/kernel-pgtable.h>
  32  33  #include <asm/memory.h>
  33   -  #include <asm/thread_info.h>
  34  34  #include <asm/pgtable-hwdef.h>
  35  35  #include <asm/pgtable.h>
  36  36  #include <asm/page.h>
  37   +  #include <asm/sysreg.h>
  38   +  #include <asm/thread_info.h>
  37  39  #include <asm/virt.h>
  38  40
  39  41  #define __PHYS_OFFSET   (KERNEL_START - TEXT_OFFSET)
···
  48  46  #error TEXT_OFFSET must be less than 2MB
  49  47  #endif
  50  48
  51   -  #ifdef CONFIG_ARM64_64K_PAGES
  52   -  #define BLOCK_SHIFT     PAGE_SHIFT
  53   -  #define BLOCK_SIZE      PAGE_SIZE
  54   -  #define TABLE_SHIFT     PMD_SHIFT
  55   -  #else
  56   -  #define BLOCK_SHIFT     SECTION_SHIFT
  57   -  #define BLOCK_SIZE      SECTION_SIZE
  58   -  #define TABLE_SHIFT     PUD_SHIFT
  59   -  #endif
  60   -
  61  49  #define KERNEL_START    _text
  62  50  #define KERNEL_END      _end
  63   -
  64   -  /*
  65   -   * Initial memory map attributes.
  66   -   */
  67   -  #define PTE_FLAGS       PTE_TYPE_PAGE | PTE_AF | PTE_SHARED
  68   -  #define PMD_FLAGS       PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S
  69   -
  70   -  #ifdef CONFIG_ARM64_64K_PAGES
  71   -  #define MM_MMUFLAGS     PTE_ATTRINDX(MT_NORMAL) | PTE_FLAGS
  72   -  #else
  73   -  #define MM_MMUFLAGS     PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS
  74   -  #endif
  75  51
  76  52  /*
  77  53   * Kernel startup entry point.
···
 100 120  #endif
 101 121
 102 122  #ifdef CONFIG_EFI
 103   -          .globl  stext_offset
 104   -          .set    stext_offset, stext - efi_head
 123   +          .globl  __efistub_stext_offset
 124   +          .set    __efistub_stext_offset, stext - efi_head
 105 125          .align 3
 106 126  pe_header:
 107 127          .ascii  "PE"
···
 124 144          .long   _end - stext                    // SizeOfCode
 125 145          .long   0                               // SizeOfInitializedData
 126 146          .long   0                               // SizeOfUninitializedData
 127   -          .long   efi_stub_entry - efi_head       // AddressOfEntryPoint
 128   -          .long   stext_offset                    // BaseOfCode
 147   +          .long   __efistub_entry - efi_head      // AddressOfEntryPoint
 148   +          .long   __efistub_stext_offset          // BaseOfCode
 129 149
 130 150  extra_header_fields:
 131 151          .quad   0                               // ImageBase
···
 142 162          .long   _end - efi_head                 // SizeOfImage
 143 163
 144 164          // Everything before the kernel image is considered part of the header
 145   -          .long   stext_offset                    // SizeOfHeaders
 165   +          .long   __efistub_stext_offset          // SizeOfHeaders
 146 166          .long   0                               // CheckSum
 147 167          .short  0xa                             // Subsystem (EFI application)
 148 168          .short  0                               // DllCharacteristics
···
 187 207          .byte   0
 188 208          .byte   0                               // end of 0 padding of section name
 189 209          .long   _end - stext                    // VirtualSize
 190   -          .long   stext_offset                    // VirtualAddress
 210   +          .long   __efistub_stext_offset          // VirtualAddress
 191 211          .long   _edata - stext                  // SizeOfRawData
 192   -          .long   stext_offset                    // PointerToRawData
 212   +          .long   __efistub_stext_offset          // PointerToRawData
 193 213
 194 214          .long   0                               // PointerToRelocations  (0 for executables)
 195 215          .long   0                               // PointerToLineNumbers  (0 for executables)
···
 272 292   */
 273 293          .macro  create_pgd_entry, tbl, virt, tmp1, tmp2
 274 294          create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
 275   -  #if SWAPPER_PGTABLE_LEVELS == 3
 276   -          create_table_entry \tbl, \virt, TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
 295   +  #if SWAPPER_PGTABLE_LEVELS > 3
 296   +          create_table_entry \tbl, \virt, PUD_SHIFT, PTRS_PER_PUD, \tmp1, \tmp2
 297   +  #endif
 298   +  #if SWAPPER_PGTABLE_LEVELS > 2
 299   +          create_table_entry \tbl, \virt, SWAPPER_TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
 277 300  #endif
 278 301          .endm
 279 302
···
 288 305   * Corrupts:   phys, start, end, pstate
 289 306   */
 290 307          .macro  create_block_map, tbl, flags, phys, start, end
 291   -          lsr     \phys, \phys, #BLOCK_SHIFT
 292   -          lsr     \start, \start, #BLOCK_SHIFT
 308   +          lsr     \phys, \phys, #SWAPPER_BLOCK_SHIFT
 309   +          lsr     \start, \start, #SWAPPER_BLOCK_SHIFT
 293 310          and     \start, \start, #PTRS_PER_PTE - 1        // table index
 294   -          orr     \phys, \flags, \phys, lsl #BLOCK_SHIFT  // table entry
 295   -          lsr     \end, \end, #BLOCK_SHIFT
 311   +          orr     \phys, \flags, \phys, lsl #SWAPPER_BLOCK_SHIFT  // table entry
 312   +          lsr     \end, \end, #SWAPPER_BLOCK_SHIFT
 296 313          and     \end, \end, #PTRS_PER_PTE - 1           // table end index
 297 314  9999:   str     \phys, [\tbl, \start, lsl #3]           // store the entry
 298 315          add     \start, \start, #1                      // next entry
 299   -          add     \phys, \phys, #BLOCK_SIZE               // next block
 316   +          add     \phys, \phys, #SWAPPER_BLOCK_SIZE       // next block
 300 317          cmp     \start, \end
 301 318          b.ls    9999b
 302 319          .endm
···
 333 350          cmp     x0, x6
 334 351          b.lo    1b
 335 352
 336   -          ldr     x7, =MM_MMUFLAGS
 353   +          ldr     x7, =SWAPPER_MM_MMUFLAGS
 337 354
 338 355          /*
 339 356           * Create the identity mapping.
···
 427 444          str_l   x21, __fdt_pointer, x5          // Save FDT pointer
 428 445          str_l   x24, memstart_addr, x6          // Save PHYS_OFFSET
 429 446          mov     x29, #0
 447   +  #ifdef CONFIG_KASAN
 448   +          bl      kasan_early_init
 449   +  #endif
 430 450          b       start_kernel
 431 451  ENDPROC(__mmap_switched)
···
 616 630   * x0  = SCTLR_EL1 value for turning on the MMU.
 617 631   * x27 = *virtual* address to jump to upon completion
 618 632   *
 619   -   * other registers depend on the function called upon completion
 633   +   * Other registers depend on the function called upon completion.
 634   +   *
 635   +   * Checks if the selected granule size is supported by the CPU.
 636   +   * If it isn't, park the CPU
 620 637   */
 621 638          .section        ".idmap.text", "ax"
 622 639  __enable_mmu:
 640   +          mrs     x1, ID_AA64MMFR0_EL1
 641   +          ubfx    x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 642   +          cmp     x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
 643   +          b.ne    __no_granule_support
 623 644          ldr     x5, =vectors
 624 645          msr     vbar_el1, x5
 625 646          msr     ttbr0_el1, x25                  // load TTBR0
···
 644 651          isb
 645 652          br      x27
 646 653  ENDPROC(__enable_mmu)
 654   +
 655   +  __no_granule_support:
 656   +          wfe
 657   +          b       __no_granule_support
 658   +  ENDPROC(__no_granule_support)
+17 -2
arch/arm64/kernel/hw_breakpoint.c
···
  28  28  #include <linux/ptrace.h>
  29  29  #include <linux/smp.h>
  30  30
  31   +  #include <asm/compat.h>
  31  32  #include <asm/current.h>
  32  33  #include <asm/debug-monitors.h>
  33  34  #include <asm/hw_breakpoint.h>
···
 163 162          HW_BREAKPOINT_UNINSTALL,
 164 163          HW_BREAKPOINT_RESTORE
 165 164  };
 165   +
 166   +  static int is_compat_bp(struct perf_event *bp)
 167   +  {
 168   +          struct task_struct *tsk = bp->hw.target;
 169   +
 170   +          /*
 171   +           * tsk can be NULL for per-cpu (non-ptrace) breakpoints.
 172   +           * In this case, use the native interface, since we don't have
 173   +           * the notion of a "compat CPU" and could end up relying on
 174   +           * deprecated behaviour if we use unaligned watchpoints in
 175   +           * AArch64 state.
 176   +           */
 177   +          return tsk && is_compat_thread(task_thread_info(tsk));
 178   +  }
 166 179
 167 180  /**
 168 181   * hw_breakpoint_slot_setup - Find and setup a perf slot according to
···
 435 420           * Watchpoints can be of length 1, 2, 4 or 8 bytes.
 436 421           */
 437 422          if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
 438   -                  if (is_compat_task()) {
 423   +                  if (is_compat_bp(bp)) {
 439 424                          if (info->ctrl.len != ARM_BREAKPOINT_LEN_2 &&
 440 425                              info->ctrl.len != ARM_BREAKPOINT_LEN_4)
 441 426                                  return -EINVAL;
···
 492 477           * AArch32 tasks expect some simple alignment fixups, so emulate
 493 478           * that here.
 494 479           */
 495   -          if (is_compat_task()) {
 480   +          if (is_compat_bp(bp)) {
 496 481                  if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
 497 482                          alignment_mask = 0x7;
 498 483                  else
+37 -1
arch/arm64/kernel/image.h
···
  47  47  #define __HEAD_FLAG_BE  0
  48  48  #endif
  49  49
  50   -  #define __HEAD_FLAGS    (__HEAD_FLAG_BE << 0)
  50   +  #define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
  51   +
  52   +  #define __HEAD_FLAGS    ((__HEAD_FLAG_BE << 0) |        \
  53   +                           (__HEAD_FLAG_PAGE_SIZE << 1))
  51  54
  52  55  /*
  53  56   * These will output as part of the Image header, which should be little-endian
···
  61  58          _kernel_size_le         = DATA_LE64(_end - _text);      \
  62  59          _kernel_offset_le       = DATA_LE64(TEXT_OFFSET);       \
  63  60          _kernel_flags_le        = DATA_LE64(__HEAD_FLAGS);
  61   +
  62   +  #ifdef CONFIG_EFI
  63   +
  64   +  /*
  65   +   * The EFI stub has its own symbol namespace prefixed by __efistub_, to
  66   +   * isolate it from the kernel proper. The following symbols are legally
  67   +   * accessed by the stub, so provide some aliases to make them accessible.
  68   +   * Only include data symbols here, or text symbols of functions that are
  69   +   * guaranteed to be safe when executed at another offset than they were
  70   +   * linked at. The routines below are all implemented in assembler in a
  71   +   * position independent manner
  72   +   */
  73   +  __efistub_memcmp                = __pi_memcmp;
  74   +  __efistub_memchr                = __pi_memchr;
  75   +  __efistub_memcpy                = __pi_memcpy;
  76   +  __efistub_memmove               = __pi_memmove;
  77   +  __efistub_memset                = __pi_memset;
  78   +  __efistub_strlen                = __pi_strlen;
  79   +  __efistub_strcmp                = __pi_strcmp;
  80   +  __efistub_strncmp               = __pi_strncmp;
  81   +  __efistub___flush_dcache_area   = __pi___flush_dcache_area;
  82   +
  83   +  #ifdef CONFIG_KASAN
  84   +  __efistub___memcpy              = __pi_memcpy;
  85   +  __efistub___memmove             = __pi_memmove;
  86   +  __efistub___memset              = __pi_memset;
  87   +  #endif
  88   +
  89   +  __efistub__text                 = _text;
  90   +  __efistub__end                  = _end;
  91   +  __efistub__edata                = _edata;
  92   +
  93   +  #endif
  64  94
  65  95  #endif /* __ASM_IMAGE_H */
-62
arch/arm64/kernel/irq.c
···
  27  27  #include <linux/init.h>
  28  28  #include <linux/irqchip.h>
  29  29  #include <linux/seq_file.h>
  30   -  #include <linux/ratelimit.h>
  31  30
  32  31  unsigned long irq_err_count;
  33  32
···
  53  54          if (!handle_arch_irq)
  54  55                  panic("No interrupt controller found.");
  55  56  }
  56   -
  57   -  #ifdef CONFIG_HOTPLUG_CPU
  58   -  static bool migrate_one_irq(struct irq_desc *desc)
  59   -  {
  60   -          struct irq_data *d = irq_desc_get_irq_data(desc);
  61   -          const struct cpumask *affinity = irq_data_get_affinity_mask(d);
  62   -          struct irq_chip *c;
  63   -          bool ret = false;
  64   -
  65   -          /*
  66   -           * If this is a per-CPU interrupt, or the affinity does not
  67   -           * include this CPU, then we have nothing to do.
  68   -           */
  69   -          if (irqd_is_per_cpu(d) || !cpumask_test_cpu(smp_processor_id(), affinity))
  70   -                  return false;
  71   -
  72   -          if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
  73   -                  affinity = cpu_online_mask;
  74   -                  ret = true;
  75   -          }
  76   -
  77   -          c = irq_data_get_irq_chip(d);
  78   -          if (!c->irq_set_affinity)
  79   -                  pr_debug("IRQ%u: unable to set affinity\n", d->irq);
  80   -          else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
  81   -                  cpumask_copy(irq_data_get_affinity_mask(d), affinity);
  82   -
  83   -          return ret;
  84   -  }
  85   -
  86   -  /*
  87   -   * The current CPU has been marked offline.  Migrate IRQs off this CPU.
  88   -   * If the affinity settings do not allow other CPUs, force them onto any
  89   -   * available CPU.
  90   -   *
  91   -   * Note: we must iterate over all IRQs, whether they have an attached
  92   -   * action structure or not, as we need to get chained interrupts too.
  93   -   */
  94   -  void migrate_irqs(void)
  95   -  {
  96   -          unsigned int i;
  97   -          struct irq_desc *desc;
  98   -          unsigned long flags;
  99   -
 100   -          local_irq_save(flags);
 101   -
 102   -          for_each_irq_desc(i, desc) {
 103   -                  bool affinity_broken;
 104   -
 105   -                  raw_spin_lock(&desc->lock);
 106   -                  affinity_broken = migrate_one_irq(desc);
 107   -                  raw_spin_unlock(&desc->lock);
 108   -
 109   -                  if (affinity_broken)
 110   -                          pr_warn_ratelimited("IRQ%u no longer affine to CPU%u\n",
 111   -                                              i, smp_processor_id());
 112   -          }
 113   -
 114   -          local_irq_restore(flags);
 115   -  }
 116   -  #endif  /* CONFIG_HOTPLUG_CPU */
+13 -3
arch/arm64/kernel/module.c
···
  21  21  #include <linux/bitops.h>
  22  22  #include <linux/elf.h>
  23  23  #include <linux/gfp.h>
  24   +  #include <linux/kasan.h>
  24  25  #include <linux/kernel.h>
  25  26  #include <linux/mm.h>
  26  27  #include <linux/moduleloader.h>
···
  35  34
  36  35  void *module_alloc(unsigned long size)
  37  36  {
  38   -          return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
  39   -                                      GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
  40   -                                      NUMA_NO_NODE, __builtin_return_address(0));
  37   +          void *p;
  38   +
  39   +          p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
  40   +                                  GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
  41   +                                  NUMA_NO_NODE, __builtin_return_address(0));
  42   +
  43   +          if (p && (kasan_module_alloc(p, size) < 0)) {
  44   +                  vfree(p);
  45   +                  return NULL;
  46   +          }
  47   +
  48   +          return p;
  41  49  }
  42  50
  43  51  enum aarch64_reloc_op {
+211 -861
arch/arm64/kernel/perf_event.c
··· 18 18 * You should have received a copy of the GNU General Public License 19 19 * along with this program. If not, see <http://www.gnu.org/licenses/>. 20 20 */ 21 - #define pr_fmt(fmt) "hw perfevents: " fmt 22 21 23 - #include <linux/bitmap.h> 24 - #include <linux/interrupt.h> 25 - #include <linux/irq.h> 26 - #include <linux/kernel.h> 27 - #include <linux/export.h> 28 - #include <linux/of_device.h> 29 - #include <linux/perf_event.h> 30 - #include <linux/platform_device.h> 31 - #include <linux/slab.h> 32 - #include <linux/spinlock.h> 33 - #include <linux/uaccess.h> 34 - 35 - #include <asm/cputype.h> 36 - #include <asm/irq.h> 37 22 #include <asm/irq_regs.h> 38 - #include <asm/pmu.h> 39 23 40 - /* 41 - * ARMv8 supports a maximum of 32 events. 42 - * The cycle counter is included in this total. 43 - */ 44 - #define ARMPMU_MAX_HWEVENTS 32 45 - 46 - static DEFINE_PER_CPU(struct perf_event * [ARMPMU_MAX_HWEVENTS], hw_events); 47 - static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)], used_mask); 48 - static DEFINE_PER_CPU(struct pmu_hw_events, cpu_hw_events); 49 - 50 - #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu)) 51 - 52 - /* Set at runtime when we know what CPU type we are. */ 53 - static struct arm_pmu *cpu_pmu; 54 - 55 - int 56 - armpmu_get_max_events(void) 57 - { 58 - int max_events = 0; 59 - 60 - if (cpu_pmu != NULL) 61 - max_events = cpu_pmu->num_events; 62 - 63 - return max_events; 64 - } 65 - EXPORT_SYMBOL_GPL(armpmu_get_max_events); 66 - 67 - int perf_num_counters(void) 68 - { 69 - return armpmu_get_max_events(); 70 - } 71 - EXPORT_SYMBOL_GPL(perf_num_counters); 72 - 73 - #define HW_OP_UNSUPPORTED 0xFFFF 74 - 75 - #define C(_x) \ 76 - PERF_COUNT_HW_CACHE_##_x 77 - 78 - #define CACHE_OP_UNSUPPORTED 0xFFFF 79 - 80 - #define PERF_MAP_ALL_UNSUPPORTED \ 81 - [0 ... PERF_COUNT_HW_MAX - 1] = HW_OP_UNSUPPORTED 82 - 83 - #define PERF_CACHE_MAP_ALL_UNSUPPORTED \ 84 - [0 ... C(MAX) - 1] = { \ 85 - [0 ... 
C(OP_MAX) - 1] = { \ 86 - [0 ... C(RESULT_MAX) - 1] = CACHE_OP_UNSUPPORTED, \ 87 - }, \ 88 - } 89 - 90 - static int 91 - armpmu_map_cache_event(const unsigned (*cache_map) 92 - [PERF_COUNT_HW_CACHE_MAX] 93 - [PERF_COUNT_HW_CACHE_OP_MAX] 94 - [PERF_COUNT_HW_CACHE_RESULT_MAX], 95 - u64 config) 96 - { 97 - unsigned int cache_type, cache_op, cache_result, ret; 98 - 99 - cache_type = (config >> 0) & 0xff; 100 - if (cache_type >= PERF_COUNT_HW_CACHE_MAX) 101 - return -EINVAL; 102 - 103 - cache_op = (config >> 8) & 0xff; 104 - if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX) 105 - return -EINVAL; 106 - 107 - cache_result = (config >> 16) & 0xff; 108 - if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX) 109 - return -EINVAL; 110 - 111 - ret = (int)(*cache_map)[cache_type][cache_op][cache_result]; 112 - 113 - if (ret == CACHE_OP_UNSUPPORTED) 114 - return -ENOENT; 115 - 116 - return ret; 117 - } 118 - 119 - static int 120 - armpmu_map_event(const unsigned (*event_map)[PERF_COUNT_HW_MAX], u64 config) 121 - { 122 - int mapping; 123 - 124 - if (config >= PERF_COUNT_HW_MAX) 125 - return -EINVAL; 126 - 127 - mapping = (*event_map)[config]; 128 - return mapping == HW_OP_UNSUPPORTED ? 
-ENOENT : mapping; 129 - } 130 - 131 - static int 132 - armpmu_map_raw_event(u32 raw_event_mask, u64 config) 133 - { 134 - return (int)(config & raw_event_mask); 135 - } 136 - 137 - static int map_cpu_event(struct perf_event *event, 138 - const unsigned (*event_map)[PERF_COUNT_HW_MAX], 139 - const unsigned (*cache_map) 140 - [PERF_COUNT_HW_CACHE_MAX] 141 - [PERF_COUNT_HW_CACHE_OP_MAX] 142 - [PERF_COUNT_HW_CACHE_RESULT_MAX], 143 - u32 raw_event_mask) 144 - { 145 - u64 config = event->attr.config; 146 - 147 - switch (event->attr.type) { 148 - case PERF_TYPE_HARDWARE: 149 - return armpmu_map_event(event_map, config); 150 - case PERF_TYPE_HW_CACHE: 151 - return armpmu_map_cache_event(cache_map, config); 152 - case PERF_TYPE_RAW: 153 - return armpmu_map_raw_event(raw_event_mask, config); 154 - } 155 - 156 - return -ENOENT; 157 - } 158 - 159 - int 160 - armpmu_event_set_period(struct perf_event *event, 161 - struct hw_perf_event *hwc, 162 - int idx) 163 - { 164 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 165 - s64 left = local64_read(&hwc->period_left); 166 - s64 period = hwc->sample_period; 167 - int ret = 0; 168 - 169 - if (unlikely(left <= -period)) { 170 - left = period; 171 - local64_set(&hwc->period_left, left); 172 - hwc->last_period = period; 173 - ret = 1; 174 - } 175 - 176 - if (unlikely(left <= 0)) { 177 - left += period; 178 - local64_set(&hwc->period_left, left); 179 - hwc->last_period = period; 180 - ret = 1; 181 - } 182 - 183 - /* 184 - * Limit the maximum period to prevent the counter value 185 - * from overtaking the one we are about to program. In 186 - * effect we are reducing max_period to account for 187 - * interrupt latency (and we are being very conservative). 
188 - */ 189 - if (left > (armpmu->max_period >> 1)) 190 - left = armpmu->max_period >> 1; 191 - 192 - local64_set(&hwc->prev_count, (u64)-left); 193 - 194 - armpmu->write_counter(idx, (u64)(-left) & 0xffffffff); 195 - 196 - perf_event_update_userpage(event); 197 - 198 - return ret; 199 - } 200 - 201 - u64 202 - armpmu_event_update(struct perf_event *event, 203 - struct hw_perf_event *hwc, 204 - int idx) 205 - { 206 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 207 - u64 delta, prev_raw_count, new_raw_count; 208 - 209 - again: 210 - prev_raw_count = local64_read(&hwc->prev_count); 211 - new_raw_count = armpmu->read_counter(idx); 212 - 213 - if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 214 - new_raw_count) != prev_raw_count) 215 - goto again; 216 - 217 - delta = (new_raw_count - prev_raw_count) & armpmu->max_period; 218 - 219 - local64_add(delta, &event->count); 220 - local64_sub(delta, &hwc->period_left); 221 - 222 - return new_raw_count; 223 - } 224 - 225 - static void 226 - armpmu_read(struct perf_event *event) 227 - { 228 - struct hw_perf_event *hwc = &event->hw; 229 - 230 - /* Don't read disabled counters! */ 231 - if (hwc->idx < 0) 232 - return; 233 - 234 - armpmu_event_update(event, hwc, hwc->idx); 235 - } 236 - 237 - static void 238 - armpmu_stop(struct perf_event *event, int flags) 239 - { 240 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 241 - struct hw_perf_event *hwc = &event->hw; 242 - 243 - /* 244 - * ARM pmu always has to update the counter, so ignore 245 - * PERF_EF_UPDATE, see comments in armpmu_start(). 246 - */ 247 - if (!(hwc->state & PERF_HES_STOPPED)) { 248 - armpmu->disable(hwc, hwc->idx); 249 - barrier(); /* why? 
*/ 250 - armpmu_event_update(event, hwc, hwc->idx); 251 - hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 252 - } 253 - } 254 - 255 - static void 256 - armpmu_start(struct perf_event *event, int flags) 257 - { 258 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 259 - struct hw_perf_event *hwc = &event->hw; 260 - 261 - /* 262 - * ARM pmu always has to reprogram the period, so ignore 263 - * PERF_EF_RELOAD, see the comment below. 264 - */ 265 - if (flags & PERF_EF_RELOAD) 266 - WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE)); 267 - 268 - hwc->state = 0; 269 - /* 270 - * Set the period again. Some counters can't be stopped, so when we 271 - * were stopped we simply disabled the IRQ source and the counter 272 - * may have been left counting. If we don't do this step then we may 273 - * get an interrupt too soon or *way* too late if the overflow has 274 - * happened since disabling. 275 - */ 276 - armpmu_event_set_period(event, hwc, hwc->idx); 277 - armpmu->enable(hwc, hwc->idx); 278 - } 279 - 280 - static void 281 - armpmu_del(struct perf_event *event, int flags) 282 - { 283 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 284 - struct pmu_hw_events *hw_events = armpmu->get_hw_events(); 285 - struct hw_perf_event *hwc = &event->hw; 286 - int idx = hwc->idx; 287 - 288 - WARN_ON(idx < 0); 289 - 290 - armpmu_stop(event, PERF_EF_UPDATE); 291 - hw_events->events[idx] = NULL; 292 - clear_bit(idx, hw_events->used_mask); 293 - 294 - perf_event_update_userpage(event); 295 - } 296 - 297 - static int 298 - armpmu_add(struct perf_event *event, int flags) 299 - { 300 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 301 - struct pmu_hw_events *hw_events = armpmu->get_hw_events(); 302 - struct hw_perf_event *hwc = &event->hw; 303 - int idx; 304 - int err = 0; 305 - 306 - perf_pmu_disable(event->pmu); 307 - 308 - /* If we don't have a space for the counter then finish early. 
*/ 309 - idx = armpmu->get_event_idx(hw_events, hwc); 310 - if (idx < 0) { 311 - err = idx; 312 - goto out; 313 - } 314 - 315 - /* 316 - * If there is an event in the counter we are going to use then make 317 - * sure it is disabled. 318 - */ 319 - event->hw.idx = idx; 320 - armpmu->disable(hwc, idx); 321 - hw_events->events[idx] = event; 322 - 323 - hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 324 - if (flags & PERF_EF_START) 325 - armpmu_start(event, PERF_EF_RELOAD); 326 - 327 - /* Propagate our changes to the userspace mapping. */ 328 - perf_event_update_userpage(event); 329 - 330 - out: 331 - perf_pmu_enable(event->pmu); 332 - return err; 333 - } 334 - 335 - static int 336 - validate_event(struct pmu *pmu, struct pmu_hw_events *hw_events, 337 - struct perf_event *event) 338 - { 339 - struct arm_pmu *armpmu; 340 - struct hw_perf_event fake_event = event->hw; 341 - struct pmu *leader_pmu = event->group_leader->pmu; 342 - 343 - if (is_software_event(event)) 344 - return 1; 345 - 346 - /* 347 - * Reject groups spanning multiple HW PMUs (e.g. CPU + CCI). The 348 - * core perf code won't check that the pmu->ctx == leader->ctx 349 - * until after pmu->event_init(event). 350 - */ 351 - if (event->pmu != pmu) 352 - return 0; 353 - 354 - if (event->pmu != leader_pmu || event->state < PERF_EVENT_STATE_OFF) 355 - return 1; 356 - 357 - if (event->state == PERF_EVENT_STATE_OFF && !event->attr.enable_on_exec) 358 - return 1; 359 - 360 - armpmu = to_arm_pmu(event->pmu); 361 - return armpmu->get_event_idx(hw_events, &fake_event) >= 0; 362 - } 363 - 364 - static int 365 - validate_group(struct perf_event *event) 366 - { 367 - struct perf_event *sibling, *leader = event->group_leader; 368 - struct pmu_hw_events fake_pmu; 369 - DECLARE_BITMAP(fake_used_mask, ARMPMU_MAX_HWEVENTS); 370 - 371 - /* 372 - * Initialise the fake PMU. We only need to populate the 373 - * used_mask for the purposes of validation. 
374 - */ 375 - memset(fake_used_mask, 0, sizeof(fake_used_mask)); 376 - fake_pmu.used_mask = fake_used_mask; 377 - 378 - if (!validate_event(event->pmu, &fake_pmu, leader)) 379 - return -EINVAL; 380 - 381 - list_for_each_entry(sibling, &leader->sibling_list, group_entry) { 382 - if (!validate_event(event->pmu, &fake_pmu, sibling)) 383 - return -EINVAL; 384 - } 385 - 386 - if (!validate_event(event->pmu, &fake_pmu, event)) 387 - return -EINVAL; 388 - 389 - return 0; 390 - } 391 - 392 - static void 393 - armpmu_disable_percpu_irq(void *data) 394 - { 395 - unsigned int irq = *(unsigned int *)data; 396 - disable_percpu_irq(irq); 397 - } 398 - 399 - static void 400 - armpmu_release_hardware(struct arm_pmu *armpmu) 401 - { 402 - int irq; 403 - unsigned int i, irqs; 404 - struct platform_device *pmu_device = armpmu->plat_device; 405 - 406 - irqs = min(pmu_device->num_resources, num_possible_cpus()); 407 - if (!irqs) 408 - return; 409 - 410 - irq = platform_get_irq(pmu_device, 0); 411 - if (irq <= 0) 412 - return; 413 - 414 - if (irq_is_percpu(irq)) { 415 - on_each_cpu(armpmu_disable_percpu_irq, &irq, 1); 416 - free_percpu_irq(irq, &cpu_hw_events); 417 - } else { 418 - for (i = 0; i < irqs; ++i) { 419 - int cpu = i; 420 - 421 - if (armpmu->irq_affinity) 422 - cpu = armpmu->irq_affinity[i]; 423 - 424 - if (!cpumask_test_and_clear_cpu(cpu, &armpmu->active_irqs)) 425 - continue; 426 - irq = platform_get_irq(pmu_device, i); 427 - if (irq > 0) 428 - free_irq(irq, armpmu); 429 - } 430 - } 431 - } 432 - 433 - static void 434 - armpmu_enable_percpu_irq(void *data) 435 - { 436 - unsigned int irq = *(unsigned int *)data; 437 - enable_percpu_irq(irq, IRQ_TYPE_NONE); 438 - } 439 - 440 - static int 441 - armpmu_reserve_hardware(struct arm_pmu *armpmu) 442 - { 443 - int err, irq; 444 - unsigned int i, irqs; 445 - struct platform_device *pmu_device = armpmu->plat_device; 446 - 447 - if (!pmu_device) 448 - return -ENODEV; 449 - 450 - irqs = min(pmu_device->num_resources, 
num_possible_cpus()); 451 - if (!irqs) { 452 - pr_err("no irqs for PMUs defined\n"); 453 - return -ENODEV; 454 - } 455 - 456 - irq = platform_get_irq(pmu_device, 0); 457 - if (irq <= 0) { 458 - pr_err("failed to get valid irq for PMU device\n"); 459 - return -ENODEV; 460 - } 461 - 462 - if (irq_is_percpu(irq)) { 463 - err = request_percpu_irq(irq, armpmu->handle_irq, 464 - "arm-pmu", &cpu_hw_events); 465 - 466 - if (err) { 467 - pr_err("unable to request percpu IRQ%d for ARM PMU counters\n", 468 - irq); 469 - armpmu_release_hardware(armpmu); 470 - return err; 471 - } 472 - 473 - on_each_cpu(armpmu_enable_percpu_irq, &irq, 1); 474 - } else { 475 - for (i = 0; i < irqs; ++i) { 476 - int cpu = i; 477 - 478 - err = 0; 479 - irq = platform_get_irq(pmu_device, i); 480 - if (irq <= 0) 481 - continue; 482 - 483 - if (armpmu->irq_affinity) 484 - cpu = armpmu->irq_affinity[i]; 485 - 486 - /* 487 - * If we have a single PMU interrupt that we can't shift, 488 - * assume that we're running on a uniprocessor machine and 489 - * continue. Otherwise, continue without this interrupt. 
490 - */ 491 - if (irq_set_affinity(irq, cpumask_of(cpu)) && irqs > 1) { 492 - pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n", 493 - irq, cpu); 494 - continue; 495 - } 496 - 497 - err = request_irq(irq, armpmu->handle_irq, 498 - IRQF_NOBALANCING | IRQF_NO_THREAD, 499 - "arm-pmu", armpmu); 500 - if (err) { 501 - pr_err("unable to request IRQ%d for ARM PMU counters\n", 502 - irq); 503 - armpmu_release_hardware(armpmu); 504 - return err; 505 - } 506 - 507 - cpumask_set_cpu(cpu, &armpmu->active_irqs); 508 - } 509 - } 510 - 511 - return 0; 512 - } 513 - 514 - static void 515 - hw_perf_event_destroy(struct perf_event *event) 516 - { 517 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 518 - atomic_t *active_events = &armpmu->active_events; 519 - struct mutex *pmu_reserve_mutex = &armpmu->reserve_mutex; 520 - 521 - if (atomic_dec_and_mutex_lock(active_events, pmu_reserve_mutex)) { 522 - armpmu_release_hardware(armpmu); 523 - mutex_unlock(pmu_reserve_mutex); 524 - } 525 - } 526 - 527 - static int 528 - event_requires_mode_exclusion(struct perf_event_attr *attr) 529 - { 530 - return attr->exclude_idle || attr->exclude_user || 531 - attr->exclude_kernel || attr->exclude_hv; 532 - } 533 - 534 - static int 535 - __hw_perf_event_init(struct perf_event *event) 536 - { 537 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 538 - struct hw_perf_event *hwc = &event->hw; 539 - int mapping, err; 540 - 541 - mapping = armpmu->map_event(event); 542 - 543 - if (mapping < 0) { 544 - pr_debug("event %x:%llx not supported\n", event->attr.type, 545 - event->attr.config); 546 - return mapping; 547 - } 548 - 549 - /* 550 - * We don't assign an index until we actually place the event onto 551 - * hardware. Use -1 to signify that we haven't decided where to put it 552 - * yet. For SMP systems, each core has it's own PMU so we can't do any 553 - * clever allocation or constraints checking at this point. 
554 - */ 555 - hwc->idx = -1; 556 - hwc->config_base = 0; 557 - hwc->config = 0; 558 - hwc->event_base = 0; 559 - 560 - /* 561 - * Check whether we need to exclude the counter from certain modes. 562 - */ 563 - if ((!armpmu->set_event_filter || 564 - armpmu->set_event_filter(hwc, &event->attr)) && 565 - event_requires_mode_exclusion(&event->attr)) { 566 - pr_debug("ARM performance counters do not support mode exclusion\n"); 567 - return -EPERM; 568 - } 569 - 570 - /* 571 - * Store the event encoding into the config_base field. 572 - */ 573 - hwc->config_base |= (unsigned long)mapping; 574 - 575 - if (!hwc->sample_period) { 576 - /* 577 - * For non-sampling runs, limit the sample_period to half 578 - * of the counter width. That way, the new counter value 579 - * is far less likely to overtake the previous one unless 580 - * you have some serious IRQ latency issues. 581 - */ 582 - hwc->sample_period = armpmu->max_period >> 1; 583 - hwc->last_period = hwc->sample_period; 584 - local64_set(&hwc->period_left, hwc->sample_period); 585 - } 586 - 587 - err = 0; 588 - if (event->group_leader != event) { 589 - err = validate_group(event); 590 - if (err) 591 - return -EINVAL; 592 - } 593 - 594 - return err; 595 - } 596 - 597 - static int armpmu_event_init(struct perf_event *event) 598 - { 599 - struct arm_pmu *armpmu = to_arm_pmu(event->pmu); 600 - int err = 0; 601 - atomic_t *active_events = &armpmu->active_events; 602 - 603 - if (armpmu->map_event(event) == -ENOENT) 604 - return -ENOENT; 605 - 606 - event->destroy = hw_perf_event_destroy; 607 - 608 - if (!atomic_inc_not_zero(active_events)) { 609 - mutex_lock(&armpmu->reserve_mutex); 610 - if (atomic_read(active_events) == 0) 611 - err = armpmu_reserve_hardware(armpmu); 612 - 613 - if (!err) 614 - atomic_inc(active_events); 615 - mutex_unlock(&armpmu->reserve_mutex); 616 - } 617 - 618 - if (err) 619 - return err; 620 - 621 - err = __hw_perf_event_init(event); 622 - if (err) 623 - hw_perf_event_destroy(event); 624 - 625 - 
return err; 626 - } 627 - 628 - static void armpmu_enable(struct pmu *pmu) 629 - { 630 - struct arm_pmu *armpmu = to_arm_pmu(pmu); 631 - struct pmu_hw_events *hw_events = armpmu->get_hw_events(); 632 - int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events); 633 - 634 - if (enabled) 635 - armpmu->start(); 636 - } 637 - 638 - static void armpmu_disable(struct pmu *pmu) 639 - { 640 - struct arm_pmu *armpmu = to_arm_pmu(pmu); 641 - armpmu->stop(); 642 - } 643 - 644 - static void __init armpmu_init(struct arm_pmu *armpmu) 645 - { 646 - atomic_set(&armpmu->active_events, 0); 647 - mutex_init(&armpmu->reserve_mutex); 648 - 649 - armpmu->pmu = (struct pmu) { 650 - .pmu_enable = armpmu_enable, 651 - .pmu_disable = armpmu_disable, 652 - .event_init = armpmu_event_init, 653 - .add = armpmu_add, 654 - .del = armpmu_del, 655 - .start = armpmu_start, 656 - .stop = armpmu_stop, 657 - .read = armpmu_read, 658 - }; 659 - } 660 - 661 - int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type) 662 - { 663 - armpmu_init(armpmu); 664 - return perf_pmu_register(&armpmu->pmu, name, type); 665 - } 24 + #include <linux/of.h> 25 + #include <linux/perf/arm_pmu.h> 26 + #include <linux/platform_device.h> 666 27 667 28 /* 668 29 * ARMv8 PMUv3 Performance Events handling code. ··· 69 708 ARMV8_PMUV3_PERFCTR_BUS_CYCLES = 0x1D, 70 709 }; 71 710 711 + /* ARMv8 Cortex-A53 specific event types. */ 712 + enum armv8_a53_pmu_perf_types { 713 + ARMV8_A53_PERFCTR_PREFETCH_LINEFILL = 0xC2, 714 + }; 715 + 716 + /* ARMv8 Cortex-A57 specific event types. */ 717 + enum armv8_a57_perf_types { 718 + ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD = 0x40, 719 + ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST = 0x41, 720 + ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD = 0x42, 721 + ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST = 0x43, 722 + ARMV8_A57_PERFCTR_DTLB_REFILL_LD = 0x4c, 723 + ARMV8_A57_PERFCTR_DTLB_REFILL_ST = 0x4d, 724 + }; 725 + 72 726 /* PMUv3 HW events mapping. 
*/ 73 727 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = { 74 728 PERF_MAP_ALL_UNSUPPORTED, ··· 92 716 [PERF_COUNT_HW_CACHE_REFERENCES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS, 93 717 [PERF_COUNT_HW_CACHE_MISSES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL, 94 718 [PERF_COUNT_HW_BRANCH_MISSES] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 719 + }; 720 + 721 + /* ARM Cortex-A53 HW events mapping. */ 722 + static const unsigned armv8_a53_perf_map[PERF_COUNT_HW_MAX] = { 723 + PERF_MAP_ALL_UNSUPPORTED, 724 + [PERF_COUNT_HW_CPU_CYCLES] = ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES, 725 + [PERF_COUNT_HW_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED, 726 + [PERF_COUNT_HW_CACHE_REFERENCES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS, 727 + [PERF_COUNT_HW_CACHE_MISSES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL, 728 + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_PC_WRITE, 729 + [PERF_COUNT_HW_BRANCH_MISSES] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 730 + [PERF_COUNT_HW_BUS_CYCLES] = ARMV8_PMUV3_PERFCTR_BUS_CYCLES, 731 + }; 732 + 733 + static const unsigned armv8_a57_perf_map[PERF_COUNT_HW_MAX] = { 734 + PERF_MAP_ALL_UNSUPPORTED, 735 + [PERF_COUNT_HW_CPU_CYCLES] = ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES, 736 + [PERF_COUNT_HW_INSTRUCTIONS] = ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED, 737 + [PERF_COUNT_HW_CACHE_REFERENCES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS, 738 + [PERF_COUNT_HW_CACHE_MISSES] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL, 739 + [PERF_COUNT_HW_BRANCH_MISSES] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 740 + [PERF_COUNT_HW_BUS_CYCLES] = ARMV8_PMUV3_PERFCTR_BUS_CYCLES, 95 741 }; 96 742 97 743 static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] ··· 132 734 [C(BPU)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 133 735 }; 134 736 737 + static const unsigned armv8_a53_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] 738 + [PERF_COUNT_HW_CACHE_OP_MAX] 739 + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { 740 + PERF_CACHE_MAP_ALL_UNSUPPORTED, 741 + 742 + 
[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS, 743 + [C(L1D)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL, 744 + [C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS, 745 + [C(L1D)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL, 746 + [C(L1D)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_A53_PERFCTR_PREFETCH_LINEFILL, 747 + 748 + [C(L1I)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS, 749 + [C(L1I)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL, 750 + 751 + [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_ITLB_REFILL, 752 + 753 + [C(BPU)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED, 754 + [C(BPU)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 755 + [C(BPU)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED, 756 + [C(BPU)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 757 + }; 758 + 759 + static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] 760 + [PERF_COUNT_HW_CACHE_OP_MAX] 761 + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { 762 + PERF_CACHE_MAP_ALL_UNSUPPORTED, 763 + 764 + [C(L1D)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD, 765 + [C(L1D)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD, 766 + [C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST, 767 + [C(L1D)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST, 768 + 769 + [C(L1I)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS, 770 + [C(L1I)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL, 771 + 772 + [C(DTLB)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_A57_PERFCTR_DTLB_REFILL_LD, 773 + [C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_A57_PERFCTR_DTLB_REFILL_ST, 774 + 775 + [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = 
ARMV8_PMUV3_PERFCTR_ITLB_REFILL, 776 + 777 + [C(BPU)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED, 778 + [C(BPU)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 779 + [C(BPU)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED, 780 + [C(BPU)][C(OP_WRITE)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED, 781 + }; 782 + 783 + 135 784 /* 136 785 * Perf Events' indices 137 786 */ 138 787 #define ARMV8_IDX_CYCLE_COUNTER 0 139 788 #define ARMV8_IDX_COUNTER0 1 140 - #define ARMV8_IDX_COUNTER_LAST (ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1) 789 + #define ARMV8_IDX_COUNTER_LAST(cpu_pmu) \ 790 + (ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1) 141 791 142 792 #define ARMV8_MAX_COUNTERS 32 143 793 #define ARMV8_COUNTER_MASK (ARMV8_MAX_COUNTERS - 1) ··· 251 805 return pmovsr & ARMV8_OVERFLOWED_MASK; 252 806 } 253 807 254 - static inline int armv8pmu_counter_valid(int idx) 808 + static inline int armv8pmu_counter_valid(struct arm_pmu *cpu_pmu, int idx) 255 809 { 256 - return idx >= ARMV8_IDX_CYCLE_COUNTER && idx <= ARMV8_IDX_COUNTER_LAST; 810 + return idx >= ARMV8_IDX_CYCLE_COUNTER && 811 + idx <= ARMV8_IDX_COUNTER_LAST(cpu_pmu); 257 812 } 258 813 259 814 static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx) 260 815 { 261 - int ret = 0; 262 - u32 counter; 263 - 264 - if (!armv8pmu_counter_valid(idx)) { 265 - pr_err("CPU%u checking wrong counter %d overflow status\n", 266 - smp_processor_id(), idx); 267 - } else { 268 - counter = ARMV8_IDX_TO_COUNTER(idx); 269 - ret = pmnc & BIT(counter); 270 - } 271 - 272 - return ret; 816 + return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); 273 817 } 274 818 275 819 static inline int armv8pmu_select_counter(int idx) 276 820 { 277 - u32 counter; 278 - 279 - if (!armv8pmu_counter_valid(idx)) { 280 - pr_err("CPU%u selecting wrong PMNC counter %d\n", 281 - smp_processor_id(), idx); 282 - return -EINVAL; 283 - } 284 - 285 - counter = 
ARMV8_IDX_TO_COUNTER(idx); 821 + u32 counter = ARMV8_IDX_TO_COUNTER(idx); 286 822 asm volatile("msr pmselr_el0, %0" :: "r" (counter)); 287 823 isb(); 288 824 289 825 return idx; 290 826 } 291 827 292 - static inline u32 armv8pmu_read_counter(int idx) 828 + static inline u32 armv8pmu_read_counter(struct perf_event *event) 293 829 { 830 + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 831 + struct hw_perf_event *hwc = &event->hw; 832 + int idx = hwc->idx; 294 833 u32 value = 0; 295 834 296 - if (!armv8pmu_counter_valid(idx)) 835 + if (!armv8pmu_counter_valid(cpu_pmu, idx)) 297 836 pr_err("CPU%u reading wrong counter %d\n", 298 837 smp_processor_id(), idx); 299 838 else if (idx == ARMV8_IDX_CYCLE_COUNTER) ··· 289 858 return value; 290 859 } 291 860 292 - static inline void armv8pmu_write_counter(int idx, u32 value) 861 + static inline void armv8pmu_write_counter(struct perf_event *event, u32 value) 293 862 { 294 - if (!armv8pmu_counter_valid(idx)) 863 + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 864 + struct hw_perf_event *hwc = &event->hw; 865 + int idx = hwc->idx; 866 + 867 + if (!armv8pmu_counter_valid(cpu_pmu, idx)) 295 868 pr_err("CPU%u writing wrong counter %d\n", 296 869 smp_processor_id(), idx); 297 870 else if (idx == ARMV8_IDX_CYCLE_COUNTER) ··· 314 879 315 880 static inline int armv8pmu_enable_counter(int idx) 316 881 { 317 - u32 counter; 318 - 319 - if (!armv8pmu_counter_valid(idx)) { 320 - pr_err("CPU%u enabling wrong PMNC counter %d\n", 321 - smp_processor_id(), idx); 322 - return -EINVAL; 323 - } 324 - 325 - counter = ARMV8_IDX_TO_COUNTER(idx); 882 + u32 counter = ARMV8_IDX_TO_COUNTER(idx); 326 883 asm volatile("msr pmcntenset_el0, %0" :: "r" (BIT(counter))); 327 884 return idx; 328 885 } 329 886 330 887 static inline int armv8pmu_disable_counter(int idx) 331 888 { 332 - u32 counter; 333 - 334 - if (!armv8pmu_counter_valid(idx)) { 335 - pr_err("CPU%u disabling wrong PMNC counter %d\n", 336 - smp_processor_id(), idx); 337 - return -EINVAL; 338 
- } 339 - 340 - counter = ARMV8_IDX_TO_COUNTER(idx); 889 + u32 counter = ARMV8_IDX_TO_COUNTER(idx); 341 890 asm volatile("msr pmcntenclr_el0, %0" :: "r" (BIT(counter))); 342 891 return idx; 343 892 } 344 893 345 894 static inline int armv8pmu_enable_intens(int idx) 346 895 { 347 - u32 counter; 348 - 349 - if (!armv8pmu_counter_valid(idx)) { 350 - pr_err("CPU%u enabling wrong PMNC counter IRQ enable %d\n", 351 - smp_processor_id(), idx); 352 - return -EINVAL; 353 - } 354 - 355 - counter = ARMV8_IDX_TO_COUNTER(idx); 896 + u32 counter = ARMV8_IDX_TO_COUNTER(idx); 356 897 asm volatile("msr pmintenset_el1, %0" :: "r" (BIT(counter))); 357 898 return idx; 358 899 } 359 900 360 901 static inline int armv8pmu_disable_intens(int idx) 361 902 { 362 - u32 counter; 363 - 364 - if (!armv8pmu_counter_valid(idx)) { 365 - pr_err("CPU%u disabling wrong PMNC counter IRQ enable %d\n", 366 - smp_processor_id(), idx); 367 - return -EINVAL; 368 - } 369 - 370 - counter = ARMV8_IDX_TO_COUNTER(idx); 903 + u32 counter = ARMV8_IDX_TO_COUNTER(idx); 371 904 asm volatile("msr pmintenclr_el1, %0" :: "r" (BIT(counter))); 372 905 isb(); 373 906 /* Clear the overflow flag in case an interrupt is pending. 
*/ 374 907 asm volatile("msr pmovsclr_el0, %0" :: "r" (BIT(counter))); 375 908 isb(); 909 + 376 910 return idx; 377 911 } 378 912 ··· 359 955 return value; 360 956 } 361 957 362 - static void armv8pmu_enable_event(struct hw_perf_event *hwc, int idx) 958 + static void armv8pmu_enable_event(struct perf_event *event) 363 959 { 364 960 unsigned long flags; 365 - struct pmu_hw_events *events = cpu_pmu->get_hw_events(); 961 + struct hw_perf_event *hwc = &event->hw; 962 + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 963 + struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 964 + int idx = hwc->idx; 366 965 367 966 /* 368 967 * Enable counter and interrupt, and set the counter to count ··· 396 989 raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 397 990 } 398 991 399 - static void armv8pmu_disable_event(struct hw_perf_event *hwc, int idx) 992 + static void armv8pmu_disable_event(struct perf_event *event) 400 993 { 401 994 unsigned long flags; 402 - struct pmu_hw_events *events = cpu_pmu->get_hw_events(); 995 + struct hw_perf_event *hwc = &event->hw; 996 + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 997 + struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 998 + int idx = hwc->idx; 403 999 404 1000 /* 405 1001 * Disable counter and interrupt ··· 426 1016 { 427 1017 u32 pmovsr; 428 1018 struct perf_sample_data data; 429 - struct pmu_hw_events *cpuc; 1019 + struct arm_pmu *cpu_pmu = (struct arm_pmu *)dev; 1020 + struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events); 430 1021 struct pt_regs *regs; 431 1022 int idx; 432 1023 ··· 447 1036 */ 448 1037 regs = get_irq_regs(); 449 1038 450 - cpuc = this_cpu_ptr(&cpu_hw_events); 451 1039 for (idx = 0; idx < cpu_pmu->num_events; ++idx) { 452 1040 struct perf_event *event = cpuc->events[idx]; 453 1041 struct hw_perf_event *hwc; ··· 463 1053 continue; 464 1054 465 1055 hwc = &event->hw; 466 - armpmu_event_update(event, hwc, idx); 1056 + armpmu_event_update(event); 467 1057 
perf_sample_data_init(&data, 0, hwc->last_period); 468 - if (!armpmu_event_set_period(event, hwc, idx)) 1058 + if (!armpmu_event_set_period(event)) 469 1059 continue; 470 1060 471 1061 if (perf_event_overflow(event, &data, regs)) 472 - cpu_pmu->disable(hwc, idx); 1062 + cpu_pmu->disable(event); 473 1063 } 474 1064 475 1065 /* ··· 484 1074 return IRQ_HANDLED; 485 1075 } 486 1076 487 - static void armv8pmu_start(void) 1077 + static void armv8pmu_start(struct arm_pmu *cpu_pmu) 488 1078 { 489 1079 unsigned long flags; 490 - struct pmu_hw_events *events = cpu_pmu->get_hw_events(); 1080 + struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 491 1081 492 1082 raw_spin_lock_irqsave(&events->pmu_lock, flags); 493 1083 /* Enable all counters */ ··· 495 1085 raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 496 1086 } 497 1087 498 - static void armv8pmu_stop(void) 1088 + static void armv8pmu_stop(struct arm_pmu *cpu_pmu) 499 1089 { 500 1090 unsigned long flags; 501 - struct pmu_hw_events *events = cpu_pmu->get_hw_events(); 1091 + struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); 502 1092 503 1093 raw_spin_lock_irqsave(&events->pmu_lock, flags); 504 1094 /* Disable all counters */ ··· 507 1097 } 508 1098 509 1099 static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc, 510 - struct hw_perf_event *event) 1100 + struct perf_event *event) 511 1101 { 512 1102 int idx; 513 - unsigned long evtype = event->config_base & ARMV8_EVTYPE_EVENT; 1103 + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); 1104 + struct hw_perf_event *hwc = &event->hw; 1105 + unsigned long evtype = hwc->config_base & ARMV8_EVTYPE_EVENT; 514 1106 515 1107 /* Always place a cycle counter into the cycle counter. 
*/ 516 1108 if (evtype == ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES) { ··· 563 1151 564 1152 static void armv8pmu_reset(void *info) 565 1153 { 1154 + struct arm_pmu *cpu_pmu = (struct arm_pmu *)info; 566 1155 u32 idx, nb_cnt = cpu_pmu->num_events; 567 1156 568 1157 /* The counter and interrupt enable registers are unknown at reset. */ 569 - for (idx = ARMV8_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) 570 - armv8pmu_disable_event(NULL, idx); 1158 + for (idx = ARMV8_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) { 1159 + armv8pmu_disable_counter(idx); 1160 + armv8pmu_disable_intens(idx); 1161 + } 571 1162 572 1163 /* Initialize & Reset PMNC: C and P bits. */ 573 1164 armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C); ··· 581 1166 582 1167 static int armv8_pmuv3_map_event(struct perf_event *event) 583 1168 { 584 - return map_cpu_event(event, &armv8_pmuv3_perf_map, 1169 + return armpmu_map_event(event, &armv8_pmuv3_perf_map, 585 1170 &armv8_pmuv3_perf_cache_map, 586 1171 ARMV8_EVTYPE_EVENT); 587 1172 } 588 1173 589 - static struct arm_pmu armv8pmu = { 590 - .handle_irq = armv8pmu_handle_irq, 591 - .enable = armv8pmu_enable_event, 592 - .disable = armv8pmu_disable_event, 593 - .read_counter = armv8pmu_read_counter, 594 - .write_counter = armv8pmu_write_counter, 595 - .get_event_idx = armv8pmu_get_event_idx, 596 - .start = armv8pmu_start, 597 - .stop = armv8pmu_stop, 598 - .reset = armv8pmu_reset, 599 - .max_period = (1LLU << 32) - 1, 600 - }; 601 - 602 - static u32 __init armv8pmu_read_num_pmnc_events(void) 1174 + static int armv8_a53_map_event(struct perf_event *event) 603 1175 { 604 - u32 nb_cnt; 1176 + return armpmu_map_event(event, &armv8_a53_perf_map, 1177 + &armv8_a53_perf_cache_map, 1178 + ARMV8_EVTYPE_EVENT); 1179 + } 1180 + 1181 + static int armv8_a57_map_event(struct perf_event *event) 1182 + { 1183 + return armpmu_map_event(event, &armv8_a57_perf_map, 1184 + &armv8_a57_perf_cache_map, 1185 + ARMV8_EVTYPE_EVENT); 1186 + } 1187 + 1188 + static void 
armv8pmu_read_num_pmnc_events(void *info) 1189 + { 1190 + int *nb_cnt = info; 605 1191 606 1192 /* Read the nb of CNTx counters supported from PMNC */ 607 - nb_cnt = (armv8pmu_pmcr_read() >> ARMV8_PMCR_N_SHIFT) & ARMV8_PMCR_N_MASK; 1193 + *nb_cnt = (armv8pmu_pmcr_read() >> ARMV8_PMCR_N_SHIFT) & ARMV8_PMCR_N_MASK; 608 1194 609 - /* Add the CPU cycles counter and return */ 610 - return nb_cnt + 1; 1195 + /* Add the CPU cycles counter */ 1196 + *nb_cnt += 1; 611 1197 } 612 1198 613 - static struct arm_pmu *__init armv8_pmuv3_pmu_init(void) 1199 + static int armv8pmu_probe_num_events(struct arm_pmu *arm_pmu) 614 1200 { 615 - armv8pmu.name = "arm/armv8-pmuv3"; 616 - armv8pmu.map_event = armv8_pmuv3_map_event; 617 - armv8pmu.num_events = armv8pmu_read_num_pmnc_events(); 618 - armv8pmu.set_event_filter = armv8pmu_set_event_filter; 619 - return &armv8pmu; 1201 + return smp_call_function_any(&arm_pmu->supported_cpus, 1202 + armv8pmu_read_num_pmnc_events, 1203 + &arm_pmu->num_events, 1); 620 1204 } 621 1205 622 - /* 623 - * Ensure the PMU has sane values out of reset. 624 - * This requires SMP to be available, so exists as a separate initcall. 
625 - */ 626 - static int __init 627 - cpu_pmu_reset(void) 1206 + static void armv8_pmu_init(struct arm_pmu *cpu_pmu) 628 1207 { 629 - if (cpu_pmu && cpu_pmu->reset) 630 - return on_each_cpu(cpu_pmu->reset, NULL, 1); 631 - return 0; 1208 + cpu_pmu->handle_irq = armv8pmu_handle_irq, 1209 + cpu_pmu->enable = armv8pmu_enable_event, 1210 + cpu_pmu->disable = armv8pmu_disable_event, 1211 + cpu_pmu->read_counter = armv8pmu_read_counter, 1212 + cpu_pmu->write_counter = armv8pmu_write_counter, 1213 + cpu_pmu->get_event_idx = armv8pmu_get_event_idx, 1214 + cpu_pmu->start = armv8pmu_start, 1215 + cpu_pmu->stop = armv8pmu_stop, 1216 + cpu_pmu->reset = armv8pmu_reset, 1217 + cpu_pmu->max_period = (1LLU << 32) - 1, 1218 + cpu_pmu->set_event_filter = armv8pmu_set_event_filter; 632 1219 } 633 - arch_initcall(cpu_pmu_reset); 634 1220 635 - /* 636 - * PMU platform driver and devicetree bindings. 637 - */ 638 - static const struct of_device_id armpmu_of_device_ids[] = { 639 - {.compatible = "arm,armv8-pmuv3"}, 1221 + static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu) 1222 + { 1223 + armv8_pmu_init(cpu_pmu); 1224 + cpu_pmu->name = "armv8_pmuv3"; 1225 + cpu_pmu->map_event = armv8_pmuv3_map_event; 1226 + return armv8pmu_probe_num_events(cpu_pmu); 1227 + } 1228 + 1229 + static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu) 1230 + { 1231 + armv8_pmu_init(cpu_pmu); 1232 + cpu_pmu->name = "armv8_cortex_a53"; 1233 + cpu_pmu->map_event = armv8_a53_map_event; 1234 + return armv8pmu_probe_num_events(cpu_pmu); 1235 + } 1236 + 1237 + static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu) 1238 + { 1239 + armv8_pmu_init(cpu_pmu); 1240 + cpu_pmu->name = "armv8_cortex_a57"; 1241 + cpu_pmu->map_event = armv8_a57_map_event; 1242 + return armv8pmu_probe_num_events(cpu_pmu); 1243 + } 1244 + 1245 + static const struct of_device_id armv8_pmu_of_device_ids[] = { 1246 + {.compatible = "arm,armv8-pmuv3", .data = armv8_pmuv3_init}, 1247 + {.compatible = "arm,cortex-a53-pmu", .data = armv8_a53_pmu_init}, 
1248 + {.compatible = "arm,cortex-a57-pmu", .data = armv8_a57_pmu_init}, 640 1249 {}, 641 1250 }; 642 1251 643 - static int armpmu_device_probe(struct platform_device *pdev) 1252 + static int armv8_pmu_device_probe(struct platform_device *pdev) 644 1253 { 645 - int i, irq, *irqs; 646 - 647 - if (!cpu_pmu) 648 - return -ENODEV; 649 - 650 - /* Don't bother with PPIs; they're already affine */ 651 - irq = platform_get_irq(pdev, 0); 652 - if (irq >= 0 && irq_is_percpu(irq)) 653 - goto out; 654 - 655 - irqs = kcalloc(pdev->num_resources, sizeof(*irqs), GFP_KERNEL); 656 - if (!irqs) 657 - return -ENOMEM; 658 - 659 - for (i = 0; i < pdev->num_resources; ++i) { 660 - struct device_node *dn; 661 - int cpu; 662 - 663 - dn = of_parse_phandle(pdev->dev.of_node, "interrupt-affinity", 664 - i); 665 - if (!dn) { 666 - pr_warn("Failed to parse %s/interrupt-affinity[%d]\n", 667 - of_node_full_name(pdev->dev.of_node), i); 668 - break; 669 - } 670 - 671 - for_each_possible_cpu(cpu) 672 - if (dn == of_cpu_device_node_get(cpu)) 673 - break; 674 - 675 - if (cpu >= nr_cpu_ids) { 676 - pr_warn("Failed to find logical CPU for %s\n", 677 - dn->name); 678 - of_node_put(dn); 679 - break; 680 - } 681 - of_node_put(dn); 682 - 683 - irqs[i] = cpu; 684 - } 685 - 686 - if (i == pdev->num_resources) 687 - cpu_pmu->irq_affinity = irqs; 688 - else 689 - kfree(irqs); 690 - 691 - out: 692 - cpu_pmu->plat_device = pdev; 693 - return 0; 1254 + return arm_pmu_device_probe(pdev, armv8_pmu_of_device_ids, NULL); 694 1255 } 695 1256 696 - static struct platform_driver armpmu_driver = { 1257 + static struct platform_driver armv8_pmu_driver = { 697 1258 .driver = { 698 - .name = "arm-pmu", 699 - .of_match_table = armpmu_of_device_ids, 1259 + .name = "armv8-pmu", 1260 + .of_match_table = armv8_pmu_of_device_ids, 700 1261 }, 701 - .probe = armpmu_device_probe, 1262 + .probe = armv8_pmu_device_probe, 702 1263 }; 703 1264 704 - static int __init register_pmu_driver(void) 1265 + static int __init 
register_armv8_pmu_driver(void) 705 1266 { 706 - return platform_driver_register(&armpmu_driver); 1267 + return platform_driver_register(&armv8_pmu_driver); 707 1268 } 708 - device_initcall(register_pmu_driver); 709 - 710 - static struct pmu_hw_events *armpmu_get_cpu_events(void) 711 - { 712 - return this_cpu_ptr(&cpu_hw_events); 713 - } 714 - 715 - static void __init cpu_pmu_init(struct arm_pmu *armpmu) 716 - { 717 - int cpu; 718 - for_each_possible_cpu(cpu) { 719 - struct pmu_hw_events *events = &per_cpu(cpu_hw_events, cpu); 720 - events->events = per_cpu(hw_events, cpu); 721 - events->used_mask = per_cpu(used_mask, cpu); 722 - raw_spin_lock_init(&events->pmu_lock); 723 - } 724 - armpmu->get_hw_events = armpmu_get_cpu_events; 725 - } 726 - 727 - static int __init init_hw_perf_events(void) 728 - { 729 - u64 dfr = read_cpuid(ID_AA64DFR0_EL1); 730 - 731 - switch ((dfr >> 8) & 0xf) { 732 - case 0x1: /* PMUv3 */ 733 - cpu_pmu = armv8_pmuv3_pmu_init(); 734 - break; 735 - } 736 - 737 - if (cpu_pmu) { 738 - pr_info("enabled with %s PMU driver, %d counters available\n", 739 - cpu_pmu->name, cpu_pmu->num_events); 740 - cpu_pmu_init(cpu_pmu); 741 - armpmu_register(cpu_pmu, "cpu", PERF_TYPE_RAW); 742 - } else { 743 - pr_info("no hardware support available\n"); 744 - } 745 - 746 - return 0; 747 - } 748 - early_initcall(init_hw_perf_events); 749 - 1269 + device_initcall(register_armv8_pmu_driver);
+3
arch/arm64/kernel/process.c
···
44 44 #include <linux/hw_breakpoint.h>
45 45 #include <linux/personality.h>
46 46 #include <linux/notifier.h>
47 + #include <trace/events/power.h>
47 48
48 49 #include <asm/compat.h>
49 50 #include <asm/cacheflush.h>
···
76 75 	 * This should do all the clock switching and wait for interrupt
77 76 	 * tricks
78 77 	 */
78 + 	trace_cpu_idle_rcuidle(1, smp_processor_id());
79 79 	cpu_do_idle();
80 80 	local_irq_enable();
81 + 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
81 82 }
82 83
83 84 #ifdef CONFIG_HOTPLUG_CPU
+6 -239
arch/arm64/kernel/setup.c
··· 28 28 #include <linux/console.h> 29 29 #include <linux/cache.h> 30 30 #include <linux/bootmem.h> 31 - #include <linux/seq_file.h> 32 31 #include <linux/screen_info.h> 33 32 #include <linux/init.h> 34 33 #include <linux/kexec.h> ··· 43 44 #include <linux/of_fdt.h> 44 45 #include <linux/of_platform.h> 45 46 #include <linux/efi.h> 46 - #include <linux/personality.h> 47 47 #include <linux/psci.h> 48 48 49 49 #include <asm/acpi.h> ··· 52 54 #include <asm/elf.h> 53 55 #include <asm/cpufeature.h> 54 56 #include <asm/cpu_ops.h> 57 + #include <asm/kasan.h> 55 58 #include <asm/sections.h> 56 59 #include <asm/setup.h> 57 60 #include <asm/smp_plat.h> ··· 62 63 #include <asm/memblock.h> 63 64 #include <asm/efi.h> 64 65 #include <asm/xen/hypervisor.h> 65 - 66 - unsigned long elf_hwcap __read_mostly; 67 - EXPORT_SYMBOL_GPL(elf_hwcap); 68 - 69 - #ifdef CONFIG_COMPAT 70 - #define COMPAT_ELF_HWCAP_DEFAULT \ 71 - (COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\ 72 - COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\ 73 - COMPAT_HWCAP_TLS|COMPAT_HWCAP_VFP|\ 74 - COMPAT_HWCAP_VFPv3|COMPAT_HWCAP_VFPv4|\ 75 - COMPAT_HWCAP_NEON|COMPAT_HWCAP_IDIV|\ 76 - COMPAT_HWCAP_LPAE) 77 - unsigned int compat_elf_hwcap __read_mostly = COMPAT_ELF_HWCAP_DEFAULT; 78 - unsigned int compat_elf_hwcap2 __read_mostly; 79 - #endif 80 - 81 - DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS); 82 66 83 67 phys_addr_t __fdt_pointer __initdata; 84 68 ··· 177 195 __flush_dcache_area(&mpidr_hash, sizeof(struct mpidr_hash)); 178 196 } 179 197 180 - static void __init setup_processor(void) 181 - { 182 - u64 features; 183 - s64 block; 184 - u32 cwg; 185 - int cls; 186 - 187 - printk("CPU: AArch64 Processor [%08x] revision %d\n", 188 - read_cpuid_id(), read_cpuid_id() & 15); 189 - 190 - sprintf(init_utsname()->machine, ELF_PLATFORM); 191 - elf_hwcap = 0; 192 - 193 - cpuinfo_store_boot_cpu(); 194 - 195 - /* 196 - * Check for sane CTR_EL0.CWG value. 
197 - */ 198 - cwg = cache_type_cwg(); 199 - cls = cache_line_size(); 200 - if (!cwg) 201 - pr_warn("No Cache Writeback Granule information, assuming cache line size %d\n", 202 - cls); 203 - if (L1_CACHE_BYTES < cls) 204 - pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n", 205 - L1_CACHE_BYTES, cls); 206 - 207 - /* 208 - * ID_AA64ISAR0_EL1 contains 4-bit wide signed feature blocks. 209 - * The blocks we test below represent incremental functionality 210 - * for non-negative values. Negative values are reserved. 211 - */ 212 - features = read_cpuid(ID_AA64ISAR0_EL1); 213 - block = cpuid_feature_extract_field(features, 4); 214 - if (block > 0) { 215 - switch (block) { 216 - default: 217 - case 2: 218 - elf_hwcap |= HWCAP_PMULL; 219 - case 1: 220 - elf_hwcap |= HWCAP_AES; 221 - case 0: 222 - break; 223 - } 224 - } 225 - 226 - if (cpuid_feature_extract_field(features, 8) > 0) 227 - elf_hwcap |= HWCAP_SHA1; 228 - 229 - if (cpuid_feature_extract_field(features, 12) > 0) 230 - elf_hwcap |= HWCAP_SHA2; 231 - 232 - if (cpuid_feature_extract_field(features, 16) > 0) 233 - elf_hwcap |= HWCAP_CRC32; 234 - 235 - block = cpuid_feature_extract_field(features, 20); 236 - if (block > 0) { 237 - switch (block) { 238 - default: 239 - case 2: 240 - elf_hwcap |= HWCAP_ATOMICS; 241 - case 1: 242 - /* RESERVED */ 243 - case 0: 244 - break; 245 - } 246 - } 247 - 248 - #ifdef CONFIG_COMPAT 249 - /* 250 - * ID_ISAR5_EL1 carries similar information as above, but pertaining to 251 - * the AArch32 32-bit execution state. 
252 - */ 253 - features = read_cpuid(ID_ISAR5_EL1); 254 - block = cpuid_feature_extract_field(features, 4); 255 - if (block > 0) { 256 - switch (block) { 257 - default: 258 - case 2: 259 - compat_elf_hwcap2 |= COMPAT_HWCAP2_PMULL; 260 - case 1: 261 - compat_elf_hwcap2 |= COMPAT_HWCAP2_AES; 262 - case 0: 263 - break; 264 - } 265 - } 266 - 267 - if (cpuid_feature_extract_field(features, 8) > 0) 268 - compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA1; 269 - 270 - if (cpuid_feature_extract_field(features, 12) > 0) 271 - compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA2; 272 - 273 - if (cpuid_feature_extract_field(features, 16) > 0) 274 - compat_elf_hwcap2 |= COMPAT_HWCAP2_CRC32; 275 - #endif 276 - } 277 - 278 198 static void __init setup_machine_fdt(phys_addr_t dt_phys) 279 199 { 280 200 void *dt_virt = fixmap_remap_fdt(dt_phys); ··· 290 406 291 407 void __init setup_arch(char **cmdline_p) 292 408 { 293 - setup_processor(); 409 + pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id()); 294 410 411 + sprintf(init_utsname()->machine, ELF_PLATFORM); 295 412 init_mm.start_code = (unsigned long) _text; 296 413 init_mm.end_code = (unsigned long) _etext; 297 414 init_mm.end_data = (unsigned long) _edata; ··· 321 436 322 437 paging_init(); 323 438 relocate_initrd(); 439 + 440 + kasan_init(); 441 + 324 442 request_standard_resources(); 325 443 326 444 early_ioremap_reset(); ··· 381 493 return 0; 382 494 } 383 495 subsys_initcall(topology_init); 384 - 385 - static const char *hwcap_str[] = { 386 - "fp", 387 - "asimd", 388 - "evtstrm", 389 - "aes", 390 - "pmull", 391 - "sha1", 392 - "sha2", 393 - "crc32", 394 - "atomics", 395 - NULL 396 - }; 397 - 398 - #ifdef CONFIG_COMPAT 399 - static const char *compat_hwcap_str[] = { 400 - "swp", 401 - "half", 402 - "thumb", 403 - "26bit", 404 - "fastmult", 405 - "fpa", 406 - "vfp", 407 - "edsp", 408 - "java", 409 - "iwmmxt", 410 - "crunch", 411 - "thumbee", 412 - "neon", 413 - "vfpv3", 414 - "vfpv3d16", 415 - "tls", 416 - "vfpv4", 417 - "idiva", 418 - 
"idivt", 419 - "vfpd32", 420 - "lpae", 421 - "evtstrm" 422 - }; 423 - 424 - static const char *compat_hwcap2_str[] = { 425 - "aes", 426 - "pmull", 427 - "sha1", 428 - "sha2", 429 - "crc32", 430 - NULL 431 - }; 432 - #endif /* CONFIG_COMPAT */ 433 - 434 - static int c_show(struct seq_file *m, void *v) 435 - { 436 - int i, j; 437 - 438 - for_each_online_cpu(i) { 439 - struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i); 440 - u32 midr = cpuinfo->reg_midr; 441 - 442 - /* 443 - * glibc reads /proc/cpuinfo to determine the number of 444 - * online processors, looking for lines beginning with 445 - * "processor". Give glibc what it expects. 446 - */ 447 - seq_printf(m, "processor\t: %d\n", i); 448 - 449 - /* 450 - * Dump out the common processor features in a single line. 451 - * Userspace should read the hwcaps with getauxval(AT_HWCAP) 452 - * rather than attempting to parse this, but there's a body of 453 - * software which does already (at least for 32-bit). 454 - */ 455 - seq_puts(m, "Features\t:"); 456 - if (personality(current->personality) == PER_LINUX32) { 457 - #ifdef CONFIG_COMPAT 458 - for (j = 0; compat_hwcap_str[j]; j++) 459 - if (compat_elf_hwcap & (1 << j)) 460 - seq_printf(m, " %s", compat_hwcap_str[j]); 461 - 462 - for (j = 0; compat_hwcap2_str[j]; j++) 463 - if (compat_elf_hwcap2 & (1 << j)) 464 - seq_printf(m, " %s", compat_hwcap2_str[j]); 465 - #endif /* CONFIG_COMPAT */ 466 - } else { 467 - for (j = 0; hwcap_str[j]; j++) 468 - if (elf_hwcap & (1 << j)) 469 - seq_printf(m, " %s", hwcap_str[j]); 470 - } 471 - seq_puts(m, "\n"); 472 - 473 - seq_printf(m, "CPU implementer\t: 0x%02x\n", 474 - MIDR_IMPLEMENTOR(midr)); 475 - seq_printf(m, "CPU architecture: 8\n"); 476 - seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr)); 477 - seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr)); 478 - seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr)); 479 - } 480 - 481 - return 0; 482 - } 483 - 484 - static void *c_start(struct seq_file *m, loff_t 
*pos) 485 - { 486 - return *pos < 1 ? (void *)1 : NULL; 487 - } 488 - 489 - static void *c_next(struct seq_file *m, void *v, loff_t *pos) 490 - { 491 - ++*pos; 492 - return NULL; 493 - } 494 - 495 - static void c_stop(struct seq_file *m, void *v) 496 - { 497 - } 498 - 499 - const struct seq_operations cpuinfo_op = { 500 - .start = c_start, 501 - .next = c_next, 502 - .stop = c_stop, 503 - .show = c_show 504 - };
+13 -9
arch/arm64/kernel/smp.c
··· 142 142 */ 143 143 atomic_inc(&mm->mm_count); 144 144 current->active_mm = mm; 145 - cpumask_set_cpu(cpu, mm_cpumask(mm)); 146 145 147 146 set_my_cpu_offset(per_cpu_offset(smp_processor_id())); 148 - printk("CPU%u: Booted secondary processor\n", cpu); 149 147 150 148 /* 151 149 * TTBR0 is only used for the identity mapping at this stage. Make it 152 150 * point to zero page to avoid speculatively fetching new entries. 153 151 */ 154 152 cpu_set_reserved_ttbr0(); 155 - flush_tlb_all(); 153 + local_flush_tlb_all(); 156 154 cpu_set_default_tcr_t0sz(); 157 155 158 156 preempt_disable(); 159 157 trace_hardirqs_off(); 158 + 159 + /* 160 + * If the system has established the capabilities, make sure 161 + * this CPU ticks all of those. If it doesn't, the CPU will 162 + * fail to come online. 163 + */ 164 + verify_local_cpu_capabilities(); 160 165 161 166 if (cpu_ops[cpu]->cpu_postboot) 162 167 cpu_ops[cpu]->cpu_postboot(); ··· 183 178 * the CPU migration code to notice that the CPU is online 184 179 * before we continue. 185 180 */ 181 + pr_info("CPU%u: Booted secondary processor [%08x]\n", 182 + cpu, read_cpuid_id()); 186 183 set_cpu_online(cpu, true); 187 184 complete(&cpu_running); 188 185 ··· 239 232 /* 240 233 * OK - migrate IRQs away from this CPU 241 234 */ 242 - migrate_irqs(); 243 - 244 - /* 245 - * Remove this CPU from the vm mask set of all processes. 246 - */ 247 - clear_tasks_mm_cpumask(cpu); 235 + irq_migrate_all_off_this_cpu(); 248 236 249 237 return 0; 250 238 } ··· 327 325 void __init smp_cpus_done(unsigned int max_cpus) 328 326 { 329 327 pr_info("SMP: Total of %d processors activated.\n", num_online_cpus()); 328 + setup_cpu_features(); 330 329 hyp_mode_check(); 331 330 apply_alternatives_all(); 332 331 } 333 332 334 333 void __init smp_prepare_boot_cpu(void) 335 334 { 335 + cpuinfo_store_boot_cpu(); 336 336 set_my_cpu_offset(per_cpu_offset(smp_processor_id())); 337 337 } 338 338
+1 -1
arch/arm64/kernel/suspend.c
···
90 90 	 * restoration before returning.
91 91 	 */
92 92 	cpu_set_reserved_ttbr0();
93 - 	flush_tlb_all();
93 + 	local_flush_tlb_all();
94 94 	cpu_set_default_tcr_t0sz();
95 95
96 96 	if (mm != &init_mm)
+10 -5
arch/arm64/kernel/traps.c
···
 	set_fs(fs);
 }

-static void dump_backtrace_entry(unsigned long where, unsigned long stack)
+static void dump_backtrace_entry(unsigned long where)
 {
+	/*
+	 * Note that 'where' can have a physical address, but it's not handled.
+	 */
 	print_ip_sym(where);
-	if (in_exception_text(where))
-		dump_mem("", "Exception stack", stack,
-			 stack + sizeof(struct pt_regs), false);
 }

 static void dump_instr(const char *lvl, struct pt_regs *regs)
···
 	pr_emerg("Call trace:\n");
 	while (1) {
 		unsigned long where = frame.pc;
+		unsigned long stack;
 		int ret;

+		dump_backtrace_entry(where);
 		ret = unwind_frame(&frame);
 		if (ret < 0)
 			break;
-		dump_backtrace_entry(where, frame.sp);
+		stack = frame.sp;
+		if (in_exception_text(where))
+			dump_mem("", "Exception stack", stack,
+				 stack + sizeof(struct pt_regs), false);
 	}
 }
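The traps.c hunk above moves the print of each backtrace entry to before the unwind step, so the innermost PC is reported even if unwinding then fails. A minimal C sketch of that loop shape (the `frame` struct and `walk` helper are invented for illustration, not kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame record: each entry links to its caller. */
struct frame {
	unsigned long pc;
	struct frame *caller;
};

/* Record every pc *before* trying to unwind further, mirroring the
 * reordering in the hunk: the entry is dumped first, so the innermost
 * pc survives even when the very first unwind attempt fails. */
static int walk(struct frame *f, unsigned long *out, int max)
{
	int n = 0;

	while (f && n < max) {
		out[n++] = f->pc;	/* dump_backtrace_entry(where) */
		f = f->caller;		/* unwind_frame() */
	}
	return n;
}
```

With the old ordering, a chain whose first unwind failed produced an empty trace; with this ordering it still yields one entry.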
+5 -1
arch/arm64/kernel/vmlinux.lds.S
···
  */

 #include <asm-generic/vmlinux.lds.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
···
 #define PECOFF_EDATA_PADDING
 #endif

-#ifdef CONFIG_DEBUG_ALIGN_RODATA
+#if defined(CONFIG_DEBUG_ALIGN_RODATA)
 #define ALIGN_DEBUG_RO			. = ALIGN(1<<SECTION_SHIFT);
+#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+#elif defined(CONFIG_DEBUG_RODATA)
+#define ALIGN_DEBUG_RO			. = ALIGN(1<<PAGE_SHIFT);
 #define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
 #else
 #define ALIGN_DEBUG_RO
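The linker-script `ALIGN(1<<PAGE_SHIFT)` above rounds the location counter up to the next page boundary. The same power-of-two round-up in C, as a quick sketch (the `align_up` name is mine, not from the patch):

```c
#include <assert.h>

/* Round x up to the next multiple of a, where a is a power of two.
 * This is the arithmetic behind the linker's ALIGN(1 << PAGE_SHIFT). */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}
```

A value already on the boundary is left unchanged, which is why page-aligning sections for DEBUG_RODATA costs at most one page of padding per section.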
+3
arch/arm64/kvm/Kconfig
···
 config KVM
 	bool "Kernel-based Virtual Machine (KVM) support"
 	depends on OF
+	depends on !ARM64_16K_PAGES
 	select MMU_NOTIFIER
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
···
 	select KVM_ARM_VGIC_V3
 	---help---
 	  Support hosting virtualized guest machines.
+	  We don't support KVM with 16K page tables yet, due to the multiple
+	  levels of fake page tables.

 	  If unsure, say N.
+1 -1
arch/arm64/kvm/reset.c
···
 {
 	u64 pfr0;

-	pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+	pfr0 = read_system_reg(SYS_ID_AA64PFR0_EL1);
 	return !!(pfr0 & 0x20);
 }
+6 -6
arch/arm64/kvm/sys_regs.c
···
 	if (p->is_write) {
 		return ignore_write(vcpu, p);
 	} else {
-		u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
-		u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
-		u32 el3 = !!((pfr >> 12) & 0xf);
+		u64 dfr = read_system_reg(SYS_ID_AA64DFR0_EL1);
+		u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
+		u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);

-		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
-					  (((dfr >> 12) & 0xf) << 24) |
-					  (((dfr >> 28) & 0xf) << 20) |
+		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
+					  (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
+					  (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
 					  (6 << 16) | (el3 << 14) | (el3 << 12));
 		return true;
 	}
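The hunk above replaces open-coded `(reg >> shift) & 0xf` masks with named shift constants and `cpuid_feature_extract_field()`. A standalone approximation of that 4-bit ID-register field extraction (note: the real kernel helper also sign-extends the field; this unsigned sketch deliberately skips that):

```c
#include <assert.h>
#include <stdint.h>

/* ARM64 ID registers pack features as 4-bit fields. Extract the field
 * that starts at bit 'shift' (unsigned variant, no sign extension). */
static uint32_t extract_field(uint64_t reg, unsigned int shift)
{
	return (uint32_t)((reg >> shift) & 0xf);
}
```

Using named shifts such as `ID_AA64PFR0_EL3_SHIFT` instead of magic numbers like `12` is the whole point of the change: the packed layout stays in one header.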
+44 -34
arch/arm64/lib/copy_from_user.S
···
 #include <asm/alternative.h>
 #include <asm/assembler.h>
+#include <asm/cache.h>
 #include <asm/cpufeature.h>
 #include <asm/sysreg.h>
···
  * Returns:
  *	x0 - bytes not copied
  */
+
+	.macro ldrb1 ptr, regB, val
+	USER(9998f, ldrb \ptr, [\regB], \val)
+	.endm
+
+	.macro strb1 ptr, regB, val
+	strb \ptr, [\regB], \val
+	.endm
+
+	.macro ldrh1 ptr, regB, val
+	USER(9998f, ldrh \ptr, [\regB], \val)
+	.endm
+
+	.macro strh1 ptr, regB, val
+	strh \ptr, [\regB], \val
+	.endm
+
+	.macro ldr1 ptr, regB, val
+	USER(9998f, ldr \ptr, [\regB], \val)
+	.endm
+
+	.macro str1 ptr, regB, val
+	str \ptr, [\regB], \val
+	.endm
+
+	.macro ldp1 ptr, regB, regC, val
+	USER(9998f, ldp \ptr, \regB, [\regC], \val)
+	.endm
+
+	.macro stp1 ptr, regB, regC, val
+	stp \ptr, \regB, [\regC], \val
+	.endm
+
+end	.req	x5
 ENTRY(__copy_from_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
-	add	x5, x1, x2		// upper user buffer boundary
-	subs	x2, x2, #16
-	b.mi	1f
-0:
-USER(9f, ldp	x3, x4, [x1], #16)
-	subs	x2, x2, #16
-	stp	x3, x4, [x0], #16
-	b.pl	0b
-1:	adds	x2, x2, #8
-	b.mi	2f
-USER(9f, ldr	x3, [x1], #8	)
-	sub	x2, x2, #8
-	str	x3, [x0], #8
-2:	adds	x2, x2, #4
-	b.mi	3f
-USER(9f, ldr	w3, [x1], #4	)
-	sub	x2, x2, #4
-	str	w3, [x0], #4
-3:	adds	x2, x2, #2
-	b.mi	4f
-USER(9f, ldrh	w3, [x1], #2	)
-	sub	x2, x2, #2
-	strh	w3, [x0], #2
-4:	adds	x2, x2, #1
-	b.mi	5f
-USER(9f, ldrb	w3, [x1]	)
-	strb	w3, [x0]
-5:	mov	x0, #0
+	add	end, x0, x2
+#include "copy_template.S"
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
+	mov	x0, #0				// Nothing to copy
 	ret
 ENDPROC(__copy_from_user)

 	.section .fixup,"ax"
 	.align	2
-9:	sub	x2, x5, x1
-	mov	x3, x2
-10:	strb	wzr, [x0], #1			// zero remaining buffer space
-	subs	x3, x3, #1
-	b.ne	10b
-	mov	x0, x2				// bytes not copied
+9998:
+	sub	x0, end, dst
+9999:
+	strb	wzr, [dst], #1			// zero remaining buffer space
+	cmp	dst, end
+	b.lo	9999b
 	ret
 	.previous
+38 -29
arch/arm64/lib/copy_in_user.S
···
 #include <asm/alternative.h>
 #include <asm/assembler.h>
+#include <asm/cache.h>
 #include <asm/cpufeature.h>
 #include <asm/sysreg.h>
···
  * Returns:
  *	x0 - bytes not copied
  */
+	.macro ldrb1 ptr, regB, val
+	USER(9998f, ldrb \ptr, [\regB], \val)
+	.endm
+
+	.macro strb1 ptr, regB, val
+	USER(9998f, strb \ptr, [\regB], \val)
+	.endm
+
+	.macro ldrh1 ptr, regB, val
+	USER(9998f, ldrh \ptr, [\regB], \val)
+	.endm
+
+	.macro strh1 ptr, regB, val
+	USER(9998f, strh \ptr, [\regB], \val)
+	.endm
+
+	.macro ldr1 ptr, regB, val
+	USER(9998f, ldr \ptr, [\regB], \val)
+	.endm
+
+	.macro str1 ptr, regB, val
+	USER(9998f, str \ptr, [\regB], \val)
+	.endm
+
+	.macro ldp1 ptr, regB, regC, val
+	USER(9998f, ldp \ptr, \regB, [\regC], \val)
+	.endm
+
+	.macro stp1 ptr, regB, regC, val
+	USER(9998f, stp \ptr, \regB, [\regC], \val)
+	.endm
+
+end	.req	x5
 ENTRY(__copy_in_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
-	add	x5, x0, x2		// upper user buffer boundary
-	subs	x2, x2, #16
-	b.mi	1f
-0:
-USER(9f, ldp	x3, x4, [x1], #16)
-	subs	x2, x2, #16
-USER(9f, stp	x3, x4, [x0], #16)
-	b.pl	0b
-1:	adds	x2, x2, #8
-	b.mi	2f
-USER(9f, ldr	x3, [x1], #8	)
-	sub	x2, x2, #8
-USER(9f, str	x3, [x0], #8	)
-2:	adds	x2, x2, #4
-	b.mi	3f
-USER(9f, ldr	w3, [x1], #4	)
-	sub	x2, x2, #4
-USER(9f, str	w3, [x0], #4	)
-3:	adds	x2, x2, #2
-	b.mi	4f
-USER(9f, ldrh	w3, [x1], #2	)
-	sub	x2, x2, #2
-USER(9f, strh	w3, [x0], #2	)
-4:	adds	x2, x2, #1
-	b.mi	5f
-USER(9f, ldrb	w3, [x1]	)
-USER(9f, strb	w3, [x0]	)
-5:	mov	x0, #0
+	add	end, x0, x2
+#include "copy_template.S"
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
+	mov	x0, #0
 	ret
 ENDPROC(__copy_in_user)

 	.section .fixup,"ax"
 	.align	2
-9:	sub	x0, x5, x0		// bytes not copied
+9998:	sub	x0, end, dst		// bytes not copied
 	ret
 	.previous
+193
arch/arm64/lib/copy_template.S
···
+/*
+ * Copyright (C) 2013 ARM Ltd.
+ * Copyright (C) 2013 Linaro.
+ *
+ * This code is based on glibc cortex strings work originally authored by Linaro
+ * and re-licensed under GPLv2 for the Linux kernel. The original code can
+ * be found @
+ *
+ * http://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/
+ * files/head:/src/aarch64/
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+/*
+ * Copy a buffer from src to dest (alignment handled by the hardware)
+ *
+ * Parameters:
+ *	x0 - dest
+ *	x1 - src
+ *	x2 - n
+ * Returns:
+ *	x0 - dest
+ */
+dstin	.req	x0
+src	.req	x1
+count	.req	x2
+tmp1	.req	x3
+tmp1w	.req	w3
+tmp2	.req	x4
+tmp2w	.req	w4
+dst	.req	x6
+
+A_l	.req	x7
+A_h	.req	x8
+B_l	.req	x9
+B_h	.req	x10
+C_l	.req	x11
+C_h	.req	x12
+D_l	.req	x13
+D_h	.req	x14
+
+	mov	dst, dstin
+	cmp	count, #16
+	/*When memory length is less than 16, the accessed are not aligned.*/
+	b.lo	.Ltiny15
+
+	neg	tmp2, src
+	ands	tmp2, tmp2, #15/* Bytes to reach alignment. */
+	b.eq	.LSrcAligned
+	sub	count, count, tmp2
+	/*
+	* Copy the leading memory data from src to dst in an increasing
+	* address order.By this way,the risk of overwritting the source
+	* memory data is eliminated when the distance between src and
+	* dst is less than 16. The memory accesses here are alignment.
+	*/
+	tbz	tmp2, #0, 1f
+	ldrb1	tmp1w, src, #1
+	strb1	tmp1w, dst, #1
+1:
+	tbz	tmp2, #1, 2f
+	ldrh1	tmp1w, src, #2
+	strh1	tmp1w, dst, #2
+2:
+	tbz	tmp2, #2, 3f
+	ldr1	tmp1w, src, #4
+	str1	tmp1w, dst, #4
+3:
+	tbz	tmp2, #3, .LSrcAligned
+	ldr1	tmp1, src, #8
+	str1	tmp1, dst, #8
+
+.LSrcAligned:
+	cmp	count, #64
+	b.ge	.Lcpy_over64
+	/*
+	* Deal with small copies quickly by dropping straight into the
+	* exit block.
+	*/
+.Ltail63:
+	/*
+	* Copy up to 48 bytes of data. At this point we only need the
+	* bottom 6 bits of count to be accurate.
+	*/
+	ands	tmp1, count, #0x30
+	b.eq	.Ltiny15
+	cmp	tmp1w, #0x20
+	b.eq	1f
+	b.lt	2f
+	ldp1	A_l, A_h, src, #16
+	stp1	A_l, A_h, dst, #16
+1:
+	ldp1	A_l, A_h, src, #16
+	stp1	A_l, A_h, dst, #16
+2:
+	ldp1	A_l, A_h, src, #16
+	stp1	A_l, A_h, dst, #16
+.Ltiny15:
+	/*
+	* Prefer to break one ldp/stp into several load/store to access
+	* memory in an increasing address order,rather than to load/store 16
+	* bytes from (src-16) to (dst-16) and to backward the src to aligned
+	* address,which way is used in original cortex memcpy. If keeping
+	* the original memcpy process here, memmove need to satisfy the
+	* precondition that src address is at least 16 bytes bigger than dst
+	* address,otherwise some source data will be overwritten when memove
+	* call memcpy directly. To make memmove simpler and decouple the
+	* memcpy's dependency on memmove, withdrew the original process.
+	*/
+	tbz	count, #3, 1f
+	ldr1	tmp1, src, #8
+	str1	tmp1, dst, #8
+1:
+	tbz	count, #2, 2f
+	ldr1	tmp1w, src, #4
+	str1	tmp1w, dst, #4
+2:
+	tbz	count, #1, 3f
+	ldrh1	tmp1w, src, #2
+	strh1	tmp1w, dst, #2
+3:
+	tbz	count, #0, .Lexitfunc
+	ldrb1	tmp1w, src, #1
+	strb1	tmp1w, dst, #1
+
+	b	.Lexitfunc
+
+.Lcpy_over64:
+	subs	count, count, #128
+	b.ge	.Lcpy_body_large
+	/*
+	* Less than 128 bytes to copy, so handle 64 here and then jump
+	* to the tail.
+	*/
+	ldp1	A_l, A_h, src, #16
+	stp1	A_l, A_h, dst, #16
+	ldp1	B_l, B_h, src, #16
+	ldp1	C_l, C_h, src, #16
+	stp1	B_l, B_h, dst, #16
+	stp1	C_l, C_h, dst, #16
+	ldp1	D_l, D_h, src, #16
+	stp1	D_l, D_h, dst, #16
+
+	tst	count, #0x3f
+	b.ne	.Ltail63
+	b	.Lexitfunc
+
+	/*
+	* Critical loop.  Start at a new cache line boundary.  Assuming
+	* 64 bytes per line this ensures the entire loop is in one line.
+	*/
+	.p2align	L1_CACHE_SHIFT
+.Lcpy_body_large:
+	/* pre-get 64 bytes data. */
+	ldp1	A_l, A_h, src, #16
+	ldp1	B_l, B_h, src, #16
+	ldp1	C_l, C_h, src, #16
+	ldp1	D_l, D_h, src, #16
+1:
+	/*
+	* interlace the load of next 64 bytes data block with store of the last
+	* loaded 64 bytes data.
+	*/
+	stp1	A_l, A_h, dst, #16
+	ldp1	A_l, A_h, src, #16
+	stp1	B_l, B_h, dst, #16
+	ldp1	B_l, B_h, src, #16
+	stp1	C_l, C_h, dst, #16
+	ldp1	C_l, C_h, src, #16
+	stp1	D_l, D_h, dst, #16
+	ldp1	D_l, D_h, src, #16
+	subs	count, count, #64
+	b.ge	1b
+	stp1	A_l, A_h, dst, #16
+	stp1	B_l, B_h, dst, #16
+	stp1	C_l, C_h, dst, #16
+	stp1	D_l, D_h, dst, #16
+
+	tst	count, #0x3f
+	b.ne	.Ltail63
+.Lexitfunc:
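The new copy_template.S leaves `ldrb1`/`strb1`/`ldp1`/`stp1` undefined; each user (memcpy, copy_to_user, copy_from_user, copy_in_user) defines them first, choosing which side of the copy is wrapped in a `USER()` fault-handling annotation. A rough C analogue of the same "one body, pluggable accessors" idea, with invented names (real specialisation happens at assembly time, with zero indirection cost):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hooks standing in for the ldrb1/strb1 macro definitions. */
struct copy_ops {
	unsigned char (*load)(const unsigned char *p);
	void (*store)(unsigned char *p, unsigned char v);
};

static unsigned char plain_load(const unsigned char *p)  { return *p; }
static void plain_store(unsigned char *p, unsigned char v) { *p = v; }

/* One copy body serves every variant; a user-space variant would supply
 * fault-checking load/store hooks instead of the plain ones. */
static void copy_template(unsigned char *dst, const unsigned char *src,
			  size_t n, const struct copy_ops *ops)
{
	while (n--)
		ops->store(dst++, ops->load(src++));
}
```

The payoff in the patch is that the tuned cortex-strings copy loop is written once and shared by all four routines instead of each carrying its own byte-at-a-time fallback.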
+38 -29
arch/arm64/lib/copy_to_user.S
···
 #include <asm/alternative.h>
 #include <asm/assembler.h>
+#include <asm/cache.h>
 #include <asm/cpufeature.h>
 #include <asm/sysreg.h>
···
  * Returns:
  *	x0 - bytes not copied
  */
+	.macro ldrb1 ptr, regB, val
+	ldrb \ptr, [\regB], \val
+	.endm
+
+	.macro strb1 ptr, regB, val
+	USER(9998f, strb \ptr, [\regB], \val)
+	.endm
+
+	.macro ldrh1 ptr, regB, val
+	ldrh \ptr, [\regB], \val
+	.endm
+
+	.macro strh1 ptr, regB, val
+	USER(9998f, strh \ptr, [\regB], \val)
+	.endm
+
+	.macro ldr1 ptr, regB, val
+	ldr \ptr, [\regB], \val
+	.endm
+
+	.macro str1 ptr, regB, val
+	USER(9998f, str \ptr, [\regB], \val)
+	.endm
+
+	.macro ldp1 ptr, regB, regC, val
+	ldp \ptr, \regB, [\regC], \val
+	.endm
+
+	.macro stp1 ptr, regB, regC, val
+	USER(9998f, stp \ptr, \regB, [\regC], \val)
+	.endm
+
+end	.req	x5
 ENTRY(__copy_to_user)
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
-	add	x5, x0, x2		// upper user buffer boundary
-	subs	x2, x2, #16
-	b.mi	1f
-0:
-	ldp	x3, x4, [x1], #16
-	subs	x2, x2, #16
-USER(9f, stp	x3, x4, [x0], #16)
-	b.pl	0b
-1:	adds	x2, x2, #8
-	b.mi	2f
-	ldr	x3, [x1], #8
-	sub	x2, x2, #8
-USER(9f, str	x3, [x0], #8	)
-2:	adds	x2, x2, #4
-	b.mi	3f
-	ldr	w3, [x1], #4
-	sub	x2, x2, #4
-USER(9f, str	w3, [x0], #4	)
-3:	adds	x2, x2, #2
-	b.mi	4f
-	ldrh	w3, [x1], #2
-	sub	x2, x2, #2
-USER(9f, strh	w3, [x0], #2	)
-4:	adds	x2, x2, #1
-	b.mi	5f
-	ldrb	w3, [x1]
-USER(9f, strb	w3, [x0]	)
-5:	mov	x0, #0
+	add	end, x0, x2
+#include "copy_template.S"
 ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \
 	    CONFIG_ARM64_PAN)
+	mov	x0, #0
 	ret
 ENDPROC(__copy_to_user)

 	.section .fixup,"ax"
 	.align	2
-9:	sub	x0, x5, x0		// bytes not copied
+9998:	sub	x0, end, dst		// bytes not copied
 	ret
 	.previous
+1 -1
arch/arm64/lib/memchr.S
···
 	ret
 2:	mov	x0, #0
 	ret
-ENDPROC(memchr)
+ENDPIPROC(memchr)
+1 -1
arch/arm64/lib/memcmp.S
···
 .Lret0:
 	mov	result, #0
 	ret
-ENDPROC(memcmp)
+ENDPIPROC(memcmp)
+35 -159
arch/arm64/lib/memcpy.S
···
  * Returns:
  *	x0 - dest
  */
-dstin	.req	x0
-src	.req	x1
-count	.req	x2
-tmp1	.req	x3
-tmp1w	.req	w3
-tmp2	.req	x4
-tmp2w	.req	w4
-tmp3	.req	x5
-tmp3w	.req	w5
-dst	.req	x6
-
-A_l	.req	x7
-A_h	.req	x8
-B_l	.req	x9
-B_h	.req	x10
-C_l	.req	x11
-C_h	.req	x12
-D_l	.req	x13
-D_h	.req	x14
+	.macro ldrb1 ptr, regB, val
+	ldrb \ptr, [\regB], \val
+	.endm

+	.macro strb1 ptr, regB, val
+	strb \ptr, [\regB], \val
+	.endm
+
+	.macro ldrh1 ptr, regB, val
+	ldrh \ptr, [\regB], \val
+	.endm
+
+	.macro strh1 ptr, regB, val
+	strh \ptr, [\regB], \val
+	.endm
+
+	.macro ldr1 ptr, regB, val
+	ldr \ptr, [\regB], \val
+	.endm
+
+	.macro str1 ptr, regB, val
+	str \ptr, [\regB], \val
+	.endm
+
+	.macro ldp1 ptr, regB, regC, val
+	ldp \ptr, \regB, [\regC], \val
+	.endm
+
+	.macro stp1 ptr, regB, regC, val
+	stp \ptr, \regB, [\regC], \val
+	.endm
+
+	.weak memcpy
+ENTRY(__memcpy)
 ENTRY(memcpy)
-	mov	dst, dstin
-	cmp	count, #16
-	/*When memory length is less than 16, the accessed are not aligned.*/
-	b.lo	.Ltiny15
-
-	neg	tmp2, src
-	ands	tmp2, tmp2, #15/* Bytes to reach alignment. */
-	b.eq	.LSrcAligned
-	sub	count, count, tmp2
-	/*
-	* Copy the leading memory data from src to dst in an increasing
-	* address order.By this way,the risk of overwritting the source
-	* memory data is eliminated when the distance between src and
-	* dst is less than 16. The memory accesses here are alignment.
-	*/
-	tbz	tmp2, #0, 1f
-	ldrb	tmp1w, [src], #1
-	strb	tmp1w, [dst], #1
-1:
-	tbz	tmp2, #1, 2f
-	ldrh	tmp1w, [src], #2
-	strh	tmp1w, [dst], #2
-2:
-	tbz	tmp2, #2, 3f
-	ldr	tmp1w, [src], #4
-	str	tmp1w, [dst], #4
-3:
-	tbz	tmp2, #3, .LSrcAligned
-	ldr	tmp1, [src],#8
-	str	tmp1, [dst],#8
-
-.LSrcAligned:
-	cmp	count, #64
-	b.ge	.Lcpy_over64
-	/*
-	* Deal with small copies quickly by dropping straight into the
-	* exit block.
-	*/
-.Ltail63:
-	/*
-	* Copy up to 48 bytes of data. At this point we only need the
-	* bottom 6 bits of count to be accurate.
-	*/
-	ands	tmp1, count, #0x30
-	b.eq	.Ltiny15
-	cmp	tmp1w, #0x20
-	b.eq	1f
-	b.lt	2f
-	ldp	A_l, A_h, [src], #16
-	stp	A_l, A_h, [dst], #16
-1:
-	ldp	A_l, A_h, [src], #16
-	stp	A_l, A_h, [dst], #16
-2:
-	ldp	A_l, A_h, [src], #16
-	stp	A_l, A_h, [dst], #16
-.Ltiny15:
-	/*
-	* Prefer to break one ldp/stp into several load/store to access
-	* memory in an increasing address order,rather than to load/store 16
-	* bytes from (src-16) to (dst-16) and to backward the src to aligned
-	* address,which way is used in original cortex memcpy. If keeping
-	* the original memcpy process here, memmove need to satisfy the
-	* precondition that src address is at least 16 bytes bigger than dst
-	* address,otherwise some source data will be overwritten when memove
-	* call memcpy directly. To make memmove simpler and decouple the
-	* memcpy's dependency on memmove, withdrew the original process.
-	*/
-	tbz	count, #3, 1f
-	ldr	tmp1, [src], #8
-	str	tmp1, [dst], #8
-1:
-	tbz	count, #2, 2f
-	ldr	tmp1w, [src], #4
-	str	tmp1w, [dst], #4
-2:
-	tbz	count, #1, 3f
-	ldrh	tmp1w, [src], #2
-	strh	tmp1w, [dst], #2
-3:
-	tbz	count, #0, .Lexitfunc
-	ldrb	tmp1w, [src]
-	strb	tmp1w, [dst]
-
-.Lexitfunc:
+#include "copy_template.S"
 	ret
-
-.Lcpy_over64:
-	subs	count, count, #128
-	b.ge	.Lcpy_body_large
-	/*
-	* Less than 128 bytes to copy, so handle 64 here and then jump
-	* to the tail.
-	*/
-	ldp	A_l, A_h, [src],#16
-	stp	A_l, A_h, [dst],#16
-	ldp	B_l, B_h, [src],#16
-	ldp	C_l, C_h, [src],#16
-	stp	B_l, B_h, [dst],#16
-	stp	C_l, C_h, [dst],#16
-	ldp	D_l, D_h, [src],#16
-	stp	D_l, D_h, [dst],#16
-
-	tst	count, #0x3f
-	b.ne	.Ltail63
-	ret
-
-	/*
-	* Critical loop.  Start at a new cache line boundary.  Assuming
-	* 64 bytes per line this ensures the entire loop is in one line.
-	*/
-	.p2align	L1_CACHE_SHIFT
-.Lcpy_body_large:
-	/* pre-get 64 bytes data. */
-	ldp	A_l, A_h, [src],#16
-	ldp	B_l, B_h, [src],#16
-	ldp	C_l, C_h, [src],#16
-	ldp	D_l, D_h, [src],#16
-1:
-	/*
-	* interlace the load of next 64 bytes data block with store of the last
-	* loaded 64 bytes data.
-	*/
-	stp	A_l, A_h, [dst],#16
-	ldp	A_l, A_h, [src],#16
-	stp	B_l, B_h, [dst],#16
-	ldp	B_l, B_h, [src],#16
-	stp	C_l, C_h, [dst],#16
-	ldp	C_l, C_h, [src],#16
-	stp	D_l, D_h, [dst],#16
-	ldp	D_l, D_h, [src],#16
-	subs	count, count, #64
-	b.ge	1b
-	stp	A_l, A_h, [dst],#16
-	stp	B_l, B_h, [dst],#16
-	stp	C_l, C_h, [dst],#16
-	stp	D_l, D_h, [dst],#16
-
-	tst	count, #0x3f
-	b.ne	.Ltail63
-	ret
-ENDPROC(memcpy)
+ENDPIPROC(memcpy)
+ENDPROC(__memcpy)
+6 -3
arch/arm64/lib/memmove.S
···
 D_l	.req	x13
 D_h	.req	x14

+	.weak memmove
+ENTRY(__memmove)
 ENTRY(memmove)
 	cmp	dstin, src
-	b.lo	memcpy
+	b.lo	__memcpy
 	add	tmp1, src, count
 	cmp	dstin, tmp1
-	b.hs	memcpy		/* No overlap.  */
+	b.hs	__memcpy	/* No overlap.  */

 	add	dst, dstin, count
 	add	src, src, count
···
 	tst	count, #0x3f
 	b.ne	.Ltail63
 	ret
-ENDPROC(memmove)
+ENDPIPROC(memmove)
+ENDPROC(__memmove)
+4 -1
arch/arm64/lib/memset.S
···
 tmp3w	.req	w9
 tmp3	.req	x9

+	.weak memset
+ENTRY(__memset)
 ENTRY(memset)
 	mov	dst, dstin	/* Preserve return value.  */
 	and	A_lw, val, #255
···
 	ands	count, count, zva_bits_x
 	b.ne	.Ltail_maybe_long
 	ret
-ENDPROC(memset)
+ENDPIPROC(memset)
+ENDPROC(__memset)
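The `.weak memset` / `ENTRY(__memset)` pattern in the hunks above makes memcpy/memmove/memset weak aliases of double-underscore implementations, so an instrumented build (KASan here) can supply its own strong definition while the raw routine stays reachable as `__memset`. The same mechanism in C with GCC attributes, using made-up names so the sketch does not clash with libc:

```c
#include <assert.h>
#include <string.h>

/* The always-available, uninstrumented implementation. */
void *__my_memcpy(void *dst, const void *src, unsigned long n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Weak alias: any strong my_memcpy() elsewhere (e.g. a checking wrapper)
 * silently wins at link time; otherwise __my_memcpy is used. */
void *my_memcpy(void *dst, const void *src, unsigned long n)
	__attribute__((weak, alias("__my_memcpy")));
```

`ENDPIPROC` additionally emits a position-independent `__pi_`-prefixed alias in the kernel, but that part is arm64-specific plumbing rather than the weak-symbol trick shown here.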
+1 -1
arch/arm64/lib/strcmp.S
···
 	lsr	data1, data1, #56
 	sub	result, data1, data2, lsr #56
 	ret
-ENDPROC(strcmp)
+ENDPIPROC(strcmp)
+1 -1
arch/arm64/lib/strlen.S
···
 	csinv	data1, data1, xzr, le
 	csel	data2, data2, data2a, le
 	b	.Lrealigned
-ENDPROC(strlen)
+ENDPIPROC(strlen)
+1 -1
arch/arm64/lib/strncmp.S
···
 .Lret0:
 	mov	result, #0
 	ret
-ENDPROC(strncmp)
+ENDPIPROC(strncmp)
+3
arch/arm64/mm/Makefile
···
 				   context.o proc.o pageattr.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_ARM64_PTDUMP)	+= dump.o
+
+obj-$(CONFIG_KASAN)		+= kasan_init.o
+KASAN_SANITIZE_kasan_init.o	:= n
+5 -5
arch/arm64/mm/cache.S
···
 	b.lo	1b
 	dsb	sy
 	ret
-ENDPROC(__flush_dcache_area)
+ENDPIPROC(__flush_dcache_area)

 /*
  *	__inval_cache_range(start, end)
···
 	b.lo	2b
 	dsb	sy
 	ret
-ENDPROC(__inval_cache_range)
+ENDPIPROC(__inval_cache_range)
 ENDPROC(__dma_inv_range)

 /*
···
 	b.lo	1b
 	dsb	sy
 	ret
-ENDPROC(__dma_flush_range)
+ENDPIPROC(__dma_flush_range)

 /*
  *	__dma_map_area(start, size, dir)
···
 	cmp	w2, #DMA_FROM_DEVICE
 	b.eq	__dma_inv_range
 	b	__dma_clean_range
-ENDPROC(__dma_map_area)
+ENDPIPROC(__dma_map_area)

 /*
  *	__dma_unmap_area(start, size, dir)
···
 	cmp	w2, #DMA_TO_DEVICE
 	b.ne	__dma_inv_range
 	ret
-ENDPROC(__dma_unmap_area)
+ENDPIPROC(__dma_unmap_area)
+150 -100
arch/arm64/mm/context.c
···
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */

-#include <linux/init.h>
+#include <linux/bitops.h>
 #include <linux/sched.h>
+#include <linux/slab.h>
 #include <linux/mm.h>
-#include <linux/smp.h>
-#include <linux/percpu.h>

+#include <asm/cpufeature.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
-#include <asm/cachetype.h>

-#define asid_bits(reg) \
-	(((read_cpuid(ID_AA64MMFR0_EL1) & 0xf0) >> 2) + 8)
-
-#define ASID_FIRST_VERSION	(1 << MAX_ASID_BITS)
-
+static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
-unsigned int cpu_last_asid = ASID_FIRST_VERSION;

-/*
- * We fork()ed a process, and we need a new context for the child to run in.
- */
-void __init_new_context(struct task_struct *tsk, struct mm_struct *mm)
-{
-	mm->context.id = 0;
-	raw_spin_lock_init(&mm->context.id_lock);
-}
+static atomic64_t asid_generation;
+static unsigned long *asid_map;

-static void flush_context(void)
+static DEFINE_PER_CPU(atomic64_t, active_asids);
+static DEFINE_PER_CPU(u64, reserved_asids);
+static cpumask_t tlb_flush_pending;
+
+#define ASID_MASK		(~GENMASK(asid_bits - 1, 0))
+#define ASID_FIRST_VERSION	(1UL << asid_bits)
+#define NUM_USER_ASIDS		ASID_FIRST_VERSION
+
+static void flush_context(unsigned int cpu)
 {
-	/* set the reserved TTBR0 before flushing the TLB */
-	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
+	int i;
+	u64 asid;
+
+	/* Update the list of reserved ASIDs and the ASID bitmap. */
+	bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
+
+	/*
+	 * Ensure the generation bump is observed before we xchg the
+	 * active_asids.
+	 */
+	smp_wmb();
+
+	for_each_possible_cpu(i) {
+		asid = atomic64_xchg_relaxed(&per_cpu(active_asids, i), 0);
+		/*
+		 * If this CPU has already been through a
+		 * rollover, but hasn't run another task in
+		 * the meantime, we must preserve its reserved
+		 * ASID, as this is the only trace we have of
+		 * the process it is still running.
+		 */
+		if (asid == 0)
+			asid = per_cpu(reserved_asids, i);
+		__set_bit(asid & ~ASID_MASK, asid_map);
+		per_cpu(reserved_asids, i) = asid;
+	}
+
+	/* Queue a TLB invalidate and flush the I-cache if necessary. */
+	cpumask_setall(&tlb_flush_pending);
+
 	if (icache_is_aivivt())
 		__flush_icache_all();
 }

-static void set_mm_context(struct mm_struct *mm, unsigned int asid)
+static int is_reserved_asid(u64 asid)
 {
-	unsigned long flags;
-
-	/*
-	 * Locking needed for multi-threaded applications where the same
-	 * mm->context.id could be set from different CPUs during the
-	 * broadcast. This function is also called via IPI so the
-	 * mm->context.id_lock has to be IRQ-safe.
-	 */
-	raw_spin_lock_irqsave(&mm->context.id_lock, flags);
-	if (likely((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS)) {
-		/*
-		 * Old version of ASID found. Set the new one and reset
-		 * mm_cpumask(mm).
-		 */
-		mm->context.id = asid;
-		cpumask_clear(mm_cpumask(mm));
-	}
-	raw_spin_unlock_irqrestore(&mm->context.id_lock, flags);
-
-	/*
-	 * Set the mm_cpumask(mm) bit for the current CPU.
-	 */
-	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
+	int cpu;
+	for_each_possible_cpu(cpu)
+		if (per_cpu(reserved_asids, cpu) == asid)
+			return 1;
+	return 0;
 }

-/*
- * Reset the ASID on the current CPU. This function call is broadcast from the
- * CPU handling the ASID rollover and holding cpu_asid_lock.
- */
-static void reset_context(void *info)
+static u64 new_context(struct mm_struct *mm, unsigned int cpu)
 {
-	unsigned int asid;
-	unsigned int cpu = smp_processor_id();
-	struct mm_struct *mm = current->active_mm;
+	static u32 cur_idx = 1;
+	u64 asid = atomic64_read(&mm->context.id);
+	u64 generation = atomic64_read(&asid_generation);
+
+	if (asid != 0) {
+		/*
+		 * If our current ASID was active during a rollover, we
+		 * can continue to use it and this was just a false alarm.
+		 */
+		if (is_reserved_asid(asid))
+			return generation | (asid & ~ASID_MASK);
+
+		/*
+		 * We had a valid ASID in a previous life, so try to re-use
+		 * it if possible.
+		 */
+		asid &= ~ASID_MASK;
+		if (!__test_and_set_bit(asid, asid_map))
+			goto bump_gen;
+	}

 	/*
-	 * current->active_mm could be init_mm for the idle thread immediately
-	 * after secondary CPU boot or hotplug. TTBR0_EL1 is already set to
-	 * the reserved value, so no need to reset any context.
+	 * Allocate a free ASID. If we can't find one, take a note of the
+	 * currently active ASIDs and mark the TLBs as requiring flushes.
+	 * We always count from ASID #1, as we use ASID #0 when setting a
+	 * reserved TTBR0 for the init_mm.
 	 */
-	if (mm == &init_mm)
-		return;
+	asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
+	if (asid != NUM_USER_ASIDS)
+		goto set_asid;

-	smp_rmb();
-	asid = cpu_last_asid + cpu;
+	/* We're out of ASIDs, so increment the global generation count */
+	generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
+						 &asid_generation);
+	flush_context(cpu);

-	flush_context();
-	set_mm_context(mm, asid);
+	/* We have at least 1 ASID per CPU, so this will always succeed */
+	asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, 1);

-	/* set the new ASID */
+set_asid:
+	__set_bit(asid, asid_map);
+	cur_idx = asid;
+
+bump_gen:
+	asid |= generation;
+	return asid;
+}
+
+void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+{
+	unsigned long flags;
+	u64 asid;
+
+	asid = atomic64_read(&mm->context.id);
+
+	/*
+	 * The memory ordering here is subtle. We rely on the control
+	 * dependency between the generation read and the update of
+	 * active_asids to ensure that we are synchronised with a
+	 * parallel rollover (i.e. this pairs with the smp_wmb() in
+	 * flush_context).
+	 */
+	if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
+	    && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
+		goto switch_mm_fastpath;
+
+	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
+	/* Check that our ASID belongs to the current generation. */
+	asid = atomic64_read(&mm->context.id);
+	if ((asid ^ atomic64_read(&asid_generation)) >> asid_bits) {
+		asid = new_context(mm, cpu);
+		atomic64_set(&mm->context.id, asid);
+	}
+
+	if (cpumask_test_and_clear_cpu(cpu, &tlb_flush_pending))
+		local_flush_tlb_all();
+
+	atomic64_set(&per_cpu(active_asids, cpu), asid);
+	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
+
+switch_mm_fastpath:
 	cpu_switch_mm(mm->pgd, mm);
 }

-void __new_context(struct mm_struct *mm)
+static int asids_init(void)
 {
-	unsigned int asid;
-	unsigned int bits = asid_bits();
+	int fld = cpuid_feature_extract_field(read_cpuid(ID_AA64MMFR0_EL1), 4);

-	raw_spin_lock(&cpu_asid_lock);
-	/*
-	 * Check the ASID again, in case the change was broadcast from another
-	 * CPU before we acquired the lock.
-	 */
-	if (!unlikely((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS)) {
-		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
-		raw_spin_unlock(&cpu_asid_lock);
-		return;
-	}
-	/*
-	 * At this point, it is guaranteed that the current mm (with an old
-	 * ASID) isn't active on any other CPU since the ASIDs are changed
-	 * simultaneously via IPI.
-	 */
-	asid = ++cpu_last_asid;
-
-	/*
-	 * If we've used up all our ASIDs, we need to start a new version and
-	 * flush the TLB.
-	 */
-	if (unlikely((asid & ((1 << bits) - 1)) == 0)) {
-		/* increment the ASID version */
-		cpu_last_asid += (1 << MAX_ASID_BITS) - (1 << bits);
-		if (cpu_last_asid == 0)
-			cpu_last_asid = ASID_FIRST_VERSION;
-		asid = cpu_last_asid + smp_processor_id();
-		flush_context();
-		smp_wmb();
-		smp_call_function(reset_context, NULL, 1);
-		cpu_last_asid += NR_CPUS - 1;
+	switch (fld) {
+	default:
+		pr_warn("Unknown ASID size (%d); assuming 8-bit\n", fld);
+		/* Fallthrough */
+	case 0:
+		asid_bits = 8;
+		break;
+	case 2:
+		asid_bits = 16;
 	}

-	set_mm_context(mm, asid);
-	raw_spin_unlock(&cpu_asid_lock);
+	/* If we end up with more CPUs than ASIDs, expect things to crash */
+	WARN_ON(NUM_USER_ASIDS < num_possible_cpus());
+	atomic64_set(&asid_generation, ASID_FIRST_VERSION);
+	asid_map = kzalloc(BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(*asid_map),
+			   GFP_KERNEL);
+	if (!asid_map)
+		panic("Failed to allocate bitmap for %lu ASIDs\n",
+		      NUM_USER_ASIDS);
+
+	pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS);
+	return 0;
 }
+early_initcall(asids_init);
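The core trick of the new allocator above is packing a global generation number in the bits above the hardware ASID: a context's `mm->context.id` is current exactly when its high bits match `asid_generation`, which is what the `(asid ^ generation) >> asid_bits` test checks on both the fast and slow paths. A tiny C sketch of just that packing (8-bit ASIDs, the names are mine):

```c
#include <assert.h>
#include <stdint.h>

#define ASID_BITS		8
#define ASID_FIRST_VERSION	(1ULL << ASID_BITS)

/* Non-zero high bits in ctx_id ^ generation mean the context was
 * allocated before the last rollover and needs a new ASID. */
static int needs_new_context(uint64_t ctx_id, uint64_t generation)
{
	return ((ctx_id ^ generation) >> ASID_BITS) != 0;
}
```

Because stale contexts are detected lazily by this comparison, a rollover only bumps the generation and flags pending TLB flushes; no IPI broadcast to every CPU is needed, which is the optimisation the merge summary advertises.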
+17 -1
arch/arm64/mm/dump.c
···
 	{ -1, NULL },
 };
 
+/*
+ * The page dumper groups page table entries of the same type into a single
+ * description. It uses pg_state to track the range information while
+ * iterating over the pte entries. When the continuity is broken it then
+ * dumps out a description of the range.
+ */
 struct pg_state {
 	struct seq_file *seq;
 	const struct addr_marker *marker;
···
 		.val	= PTE_NG,
 		.set	= "NG",
 		.clear	= " ",
+	}, {
+		.mask	= PTE_CONT,
+		.val	= PTE_CONT,
+		.set	= "CON",
+		.clear	= " ",
+	}, {
+		.mask	= PTE_TABLE_BIT,
+		.val	= PTE_TABLE_BIT,
+		.set	= " ",
+		.clear	= "BLK",
 	}, {
 		.mask	= PTE_UXN,
 		.val	= PTE_UXN,
···
 	unsigned long delta;
 
 	if (st->current_prot) {
-		seq_printf(st->seq, "0x%16lx-0x%16lx   ",
+		seq_printf(st->seq, "0x%016lx-0x%016lx   ",
 			   st->start_address, addr);
 
 		delta = (addr - st->start_address) >> 10;
+1 -1
arch/arm64/mm/fault.c
···
 }
 
 #ifdef CONFIG_ARM64_PAN
-void cpu_enable_pan(void)
+void cpu_enable_pan(void *__unused)
 {
 	config_sctlr_el1(SCTLR_EL1_SPAN, 0);
 }
+13 -6
arch/arm64/mm/init.c
···
 	memset(zone_size, 0, sizeof(zone_size));
 
 	/* 4GB maximum for 32-bit only capable devices */
-	if (IS_ENABLED(CONFIG_ZONE_DMA)) {
-		max_dma = PFN_DOWN(arm64_dma_phys_limit);
-		zone_size[ZONE_DMA] = max_dma - min;
-	}
+#ifdef CONFIG_ZONE_DMA
+	max_dma = PFN_DOWN(arm64_dma_phys_limit);
+	zone_size[ZONE_DMA] = max_dma - min;
+#endif
 	zone_size[ZONE_NORMAL] = max - max_dma;
 
 	memcpy(zhole_size, zone_size, sizeof(zhole_size));
···
 		if (start >= max)
 			continue;
 
-		if (IS_ENABLED(CONFIG_ZONE_DMA) && start < max_dma) {
+#ifdef CONFIG_ZONE_DMA
+		if (start < max_dma) {
 			unsigned long dma_end = min(end, max_dma);
 			zhole_size[ZONE_DMA] -= dma_end - start;
 		}
-
+#endif
 		if (end > max_dma) {
 			unsigned long normal_end = min(end, max);
 			unsigned long normal_start = max(start, max_dma);
···
 #define MLK_ROUNDUP(b, t) b, t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
 
 	pr_notice("Virtual kernel memory layout:\n"
+#ifdef CONFIG_KASAN
+		  "    kasan   : 0x%16lx - 0x%16lx   (%6ld GB)\n"
+#endif
 		  "    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n"
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  "    vmemmap : 0x%16lx - 0x%16lx   (%6ld GB maximum)\n"
···
 		  "      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n"
 		  "      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n"
 		  "      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
+#ifdef CONFIG_KASAN
+		  MLG(KASAN_SHADOW_START, KASAN_SHADOW_END),
+#endif
 		  MLG(VMALLOC_START, VMALLOC_END),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 		  MLG((unsigned long)vmemmap,
+165
arch/arm64/mm/kasan_init.c
···
+/*
+ * This file contains kasan initialization code for ARM64.
+ *
+ * Copyright (c) 2015 Samsung Electronics Co., Ltd.
+ * Author: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "kasan: " fmt
+#include <linux/kasan.h>
+#include <linux/kernel.h>
+#include <linux/memblock.h>
+#include <linux/start_kernel.h>
+
+#include <asm/page.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+
+static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
+
+static void __init kasan_early_pte_populate(pmd_t *pmd, unsigned long addr,
+					unsigned long end)
+{
+	pte_t *pte;
+	unsigned long next;
+
+	if (pmd_none(*pmd))
+		pmd_populate_kernel(&init_mm, pmd, kasan_zero_pte);
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		next = addr + PAGE_SIZE;
+		set_pte(pte, pfn_pte(virt_to_pfn(kasan_zero_page),
+					PAGE_KERNEL));
+	} while (pte++, addr = next, addr != end && pte_none(*pte));
+}
+
+static void __init kasan_early_pmd_populate(pud_t *pud,
+					unsigned long addr,
+					unsigned long end)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	if (pud_none(*pud))
+		pud_populate(&init_mm, pud, kasan_zero_pmd);
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		kasan_early_pte_populate(pmd, addr, next);
+	} while (pmd++, addr = next, addr != end && pmd_none(*pmd));
+}
+
+static void __init kasan_early_pud_populate(pgd_t *pgd,
+					unsigned long addr,
+					unsigned long end)
+{
+	pud_t *pud;
+	unsigned long next;
+
+	if (pgd_none(*pgd))
+		pgd_populate(&init_mm, pgd, kasan_zero_pud);
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		kasan_early_pmd_populate(pud, addr, next);
+	} while (pud++, addr = next, addr != end && pud_none(*pud));
+}
+
+static void __init kasan_map_early_shadow(void)
+{
+	unsigned long addr = KASAN_SHADOW_START;
+	unsigned long end = KASAN_SHADOW_END;
+	unsigned long next;
+	pgd_t *pgd;
+
+	pgd = pgd_offset_k(addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		kasan_early_pud_populate(pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
+}
+
+asmlinkage void __init kasan_early_init(void)
+{
+	BUILD_BUG_ON(KASAN_SHADOW_OFFSET != KASAN_SHADOW_END - (1UL << 61));
+	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_START, PGDIR_SIZE));
+	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
+	kasan_map_early_shadow();
+}
+
+static void __init clear_pgds(unsigned long start,
+			unsigned long end)
+{
+	/*
+	 * Remove references to kasan page tables from
+	 * swapper_pg_dir. pgd_clear() can't be used
+	 * here because it's nop on 2,3-level pagetable setups
+	 */
+	for (; start < end; start += PGDIR_SIZE)
+		set_pgd(pgd_offset_k(start), __pgd(0));
+}
+
+static void __init cpu_set_ttbr1(unsigned long ttbr1)
+{
+	asm(
+	"	msr	ttbr1_el1, %0\n"
+	"	isb"
+	:
+	: "r" (ttbr1));
+}
+
+void __init kasan_init(void)
+{
+	struct memblock_region *reg;
+
+	/*
+	 * We are going to perform proper setup of shadow memory.
+	 * At first we should unmap early shadow (clear_pgds() call bellow).
+	 * However, instrumented code couldn't execute without shadow memory.
+	 * tmp_pg_dir used to keep early shadow mapped until full shadow
+	 * setup will be finished.
+	 */
+	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
+	cpu_set_ttbr1(__pa(tmp_pg_dir));
+	flush_tlb_all();
+
+	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
+
+	kasan_populate_zero_shadow((void *)KASAN_SHADOW_START,
+			kasan_mem_to_shadow((void *)MODULES_VADDR));
+
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+
+		if (start >= end)
+			break;
+
+		/*
+		 * end + 1 here is intentional. We check several shadow bytes in
+		 * advance to slightly speed up fastpath. In some rare cases
+		 * we could cross boundary of mapped shadow, so we just map
+		 * some more here.
+		 */
+		vmemmap_populate((unsigned long)kasan_mem_to_shadow(start),
+				(unsigned long)kasan_mem_to_shadow(end) + 1,
+				pfn_to_nid(virt_to_pfn(start)));
+	}
+
+	memset(kasan_zero_page, 0, PAGE_SIZE);
+	cpu_set_ttbr1(__pa(swapper_pg_dir));
+	flush_tlb_all();
+
+	/* At this point kasan is fully initialized. Enable error messages */
+	init_task.kasan_depth = 0;
+	pr_info("KernelAddressSanitizer initialized\n");
+}
+95 -50
arch/arm64/mm/mmu.c
···
 
 #include <asm/cputype.h>
 #include <asm/fixmap.h>
+#include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/sizes.h>
···
 	do {
 		/*
 		 * Need to have the least restrictive permissions available
-		 * permissions will be fixed up later
+		 * permissions will be fixed up later. Default the new page
+		 * range as contiguous ptes.
 		 */
-		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC_CONT));
 		pfn++;
 	} while (pte++, i++, i < PTRS_PER_PTE);
 }
 
+/*
+ * Given a PTE with the CONT bit set, determine where the CONT range
+ * starts, and clear the entire range of PTE CONT bits.
+ */
+static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
+{
+	int i;
+
+	pte -= CONT_RANGE_OFFSET(addr);
+	for (i = 0; i < CONT_PTES; i++) {
+		set_pte(pte, pte_mknoncont(*pte));
+		pte++;
+	}
+	flush_tlb_all();
+}
+
+/*
+ * Given a range of PTEs set the pfn and provided page protection flags
+ */
+static void __populate_init_pte(pte_t *pte, unsigned long addr,
+				unsigned long end, phys_addr_t phys,
+				pgprot_t prot)
+{
+	unsigned long pfn = __phys_to_pfn(phys);
+
+	do {
+		/* clear all the bits except the pfn, then apply the prot */
+		set_pte(pte, pfn_pte(pfn, prot));
+		pte++;
+		pfn++;
+		addr += PAGE_SIZE;
+	} while (addr != end);
+}
+
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
-				  unsigned long end, unsigned long pfn,
+				  unsigned long end, phys_addr_t phys,
 				  pgprot_t prot,
 				  void *(*alloc)(unsigned long size))
 {
 	pte_t *pte;
+	unsigned long next;
 
 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
 		pte = alloc(PTRS_PER_PTE * sizeof(pte_t));
···
 
 	pte = pte_offset_kernel(pmd, addr);
 	do {
-		set_pte(pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+		next = min(end, (addr + CONT_SIZE) & CONT_MASK);
+		if (((addr | next | phys) & ~CONT_MASK) == 0) {
+			/* a block of CONT_PTES */
+			__populate_init_pte(pte, addr, next, phys,
+					    prot | __pgprot(PTE_CONT));
+		} else {
+			/*
+			 * If the range being split is already inside of a
+			 * contiguous range but this PTE isn't going to be
+			 * contiguous, then we want to unmark the adjacent
+			 * ranges, then update the portion of the range we
+			 * are interrested in.
+			 */
+			clear_cont_pte_range(pte, addr);
+			__populate_init_pte(pte, addr, next, phys, prot);
+		}
+
+		pte += (next - addr) >> PAGE_SHIFT;
+		phys += next - addr;
+		addr = next;
+	} while (addr != end);
 }
 
 void split_pud(pud_t *old_pud, pmd_t *pmd)
···
 			}
 		}
 	} else {
-		alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
-			       prot, alloc);
+		alloc_init_pte(pmd, addr, next, phys, prot, alloc);
 	}
 	phys += next - addr;
 } while (pmd++, addr = next, addr != end);
···
 	 * memory addressable from the initial direct kernel mapping.
 	 *
 	 * The initial direct kernel mapping, located at swapper_pg_dir, gives
-	 * us PUD_SIZE (4K pages) or PMD_SIZE (64K pages) memory starting from
-	 * PHYS_OFFSET (which must be aligned to 2MB as per
-	 * Documentation/arm64/booting.txt).
+	 * us PUD_SIZE (with SECTION maps) or PMD_SIZE (without SECTION maps,
+	 * memory starting from PHYS_OFFSET (which must be aligned to 2MB as
+	 * per Documentation/arm64/booting.txt).
 	 */
-	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
-		limit = PHYS_OFFSET + PMD_SIZE;
-	else
-		limit = PHYS_OFFSET + PUD_SIZE;
+	limit = PHYS_OFFSET + SWAPPER_INIT_MAP_SIZE;
 	memblock_set_current_limit(limit);
 
 	/* map all the memory banks */
···
 		if (start >= end)
 			break;
 
-#ifndef CONFIG_ARM64_64K_PAGES
-		/*
-		 * For the first memory bank align the start address and
-		 * current memblock limit to prevent create_mapping() from
-		 * allocating pte page tables from unmapped memory.
-		 * When 64K pages are enabled, the pte page table for the
-		 * first PGDIR_SIZE is already present in swapper_pg_dir.
-		 */
-		if (start < limit)
-			start = ALIGN(start, PMD_SIZE);
-		if (end < limit) {
-			limit = end & PMD_MASK;
-			memblock_set_current_limit(limit);
+		if (ARM64_SWAPPER_USES_SECTION_MAPS) {
+			/*
+			 * For the first memory bank align the start address and
+			 * current memblock limit to prevent create_mapping() from
+			 * allocating pte page tables from unmapped memory. With
+			 * the section maps, if the first block doesn't end on section
+			 * size boundary, create_mapping() will try to allocate a pte
+			 * page, which may be returned from an unmapped area.
+			 * When section maps are not used, the pte page table for the
+			 * current limit is already present in swapper_pg_dir.
+			 */
+			if (start < limit)
+				start = ALIGN(start, SECTION_SIZE);
+			if (end < limit) {
+				limit = end & SECTION_MASK;
+				memblock_set_current_limit(limit);
+			}
 		}
-#endif
 		__map_memblock(start, end);
 	}
 
···
 	 * point to zero page to avoid speculatively fetching new entries.
 	 */
 	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
+	local_flush_tlb_all();
 	cpu_set_default_tcr_t0sz();
 }
 
···
 	return pfn_valid(pte_pfn(*pte));
 }
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-#ifdef CONFIG_ARM64_64K_PAGES
+#if !ARM64_SWAPPER_USES_SECTION_MAPS
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 {
 	return vmemmap_populate_basepages(start, end, node);
 }
-#else	/* !CONFIG_ARM64_64K_PAGES */
+#else	/* !ARM64_SWAPPER_USES_SECTION_MAPS */
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 {
 	unsigned long addr = start;
···
 {
 	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
 	pgprot_t prot = PAGE_KERNEL | PTE_RDONLY;
-	int granularity, size, offset;
+	int size, offset;
 	void *dt_virt;
 
 	/*
···
 	 */
 	BUILD_BUG_ON(dt_virt_base % SZ_2M);
 
-	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) {
-		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PMD_SHIFT !=
-			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT);
+	BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> SWAPPER_TABLE_SHIFT !=
+		     __fix_to_virt(FIX_BTMAP_BEGIN) >> SWAPPER_TABLE_SHIFT);
 
-		granularity = PAGE_SIZE;
-	} else {
-		BUILD_BUG_ON(__fix_to_virt(FIX_FDT_END) >> PUD_SHIFT !=
-			     __fix_to_virt(FIX_BTMAP_BEGIN) >> PUD_SHIFT);
-
-		granularity = PMD_SIZE;
-	}
-
-	offset = dt_phys % granularity;
+	offset = dt_phys % SWAPPER_BLOCK_SIZE;
 	dt_virt = (void *)dt_virt_base + offset;
 
 	/* map the first chunk so we can read the size from the header */
-	create_mapping(round_down(dt_phys, granularity), dt_virt_base,
-		       granularity, prot);
+	create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
+		       SWAPPER_BLOCK_SIZE, prot);
 
 	if (fdt_check_header(dt_virt) != 0)
 		return NULL;
···
 	if (size > MAX_FDT_SIZE)
 		return NULL;
 
-	if (offset + size > granularity)
-		create_mapping(round_down(dt_phys, granularity), dt_virt_base,
-			       round_up(offset + size, granularity), prot);
+	if (offset + size > SWAPPER_BLOCK_SIZE)
+		create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
+			       round_up(offset + size, SWAPPER_BLOCK_SIZE), prot);
 
 	memblock_reserve(dt_phys, size);
 
+1 -1
arch/arm64/mm/pageattr.c
···
 	int ret;
 	struct page_change_data data;
 
-	if (!IS_ALIGNED(addr, PAGE_SIZE)) {
+	if (!PAGE_ALIGNED(addr)) {
 		start &= PAGE_MASK;
 		end = start + size;
 		WARN_ON_ONCE(1);
-2
arch/arm64/mm/pgd.c
···
 
 #include "mm.h"
 
-#define PGD_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
-
 static struct kmem_cache *pgd_cache;
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
+6 -4
arch/arm64/mm/proc.S
···
 
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
-#else
+#elif defined(CONFIG_ARM64_16K_PAGES)
+#define TCR_TG_FLAGS	TCR_TG0_16K | TCR_TG1_16K
+#else /* CONFIG_ARM64_4K_PAGES */
 #define TCR_TG_FLAGS	TCR_TG0_4K | TCR_TG1_4K
 #endif
 
···
  *	- pgd_phys - physical address of new TTB
  */
 ENTRY(cpu_do_switch_mm)
-	mmid	w1, x1				// get mm->context.id
+	mmid	x1, x1				// get mm->context.id
 	bfi	x0, x1, #48, #16		// set the ASID
 	msr	ttbr0_el1, x0			// set TTBR0
 	isb
···
  *	value of the SCTLR_EL1 register.
  */
 ENTRY(__cpu_setup)
-	tlbi	vmalle1				// Invalidate local TLB
-	dsb	nsh
+	tlbi	vmalle1				// Invalidate local TLB
+	dsb	nsh
 
 	mov	x0, #3 << 20
 	msr	cpacr_el1, x0			// Enable FP/ASIMD
+8
drivers/firmware/efi/Makefile
···
 #
 # Makefile for linux kernel
 #
+
+#
+# ARM64 maps efi runtime services in userspace addresses
+# which don't have KASAN shadow. So dereference of these addresses
+# in efi_call_virt() will cause crash if this code instrumented.
+#
+KASAN_SANITIZE_runtime-wrappers.o := n
+
 obj-$(CONFIG_EFI)		+= efi.o vars.o reboot.o
 obj-$(CONFIG_EFI_VARS)		+= efivars.o
 obj-$(CONFIG_EFI_ESRT)		+= esrt.o
+36 -6
drivers/firmware/efi/libstub/Makefile
···
 cflags-$(CONFIG_ARM)		:= $(subst -pg,,$(KBUILD_CFLAGS)) \
 				   -fno-builtin -fpic -mno-single-pic-base
 
+cflags-$(CONFIG_EFI_ARMSTUB)	+= -I$(srctree)/scripts/dtc/libfdt
+
 KBUILD_CFLAGS			:= $(cflags-y) \
 				   $(call cc-option,-ffreestanding) \
 				   $(call cc-option,-fno-stack-protector)
···
 KASAN_SANITIZE			:= n
 
 lib-y				:= efi-stub-helper.o
-lib-$(CONFIG_EFI_ARMSTUB)	+= arm-stub.o fdt.o
+
+# include the stub's generic dependencies from lib/ when building for ARM/arm64
+arm-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c sort.c
+
+$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
+	$(call if_changed_rule,cc_o_c)
+
+lib-$(CONFIG_EFI_ARMSTUB)	+= arm-stub.o fdt.o string.o \
+				   $(patsubst %.c,lib-%.o,$(arm-deps))
+
+lib-$(CONFIG_ARM64)		+= arm64-stub.o
+CFLAGS_arm64-stub.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
 
 #
 # arm64 puts the stub in the kernel proper, which will unnecessarily retain all
···
 # So let's apply the __init annotations at the section level, by prefixing
 # the section names directly. This will ensure that even all the inline string
 # literals are covered.
+# The fact that the stub and the kernel proper are essentially the same binary
+# also means that we need to be extra careful to make sure that the stub does
+# not rely on any absolute symbol references, considering that the virtual
+# kernel mapping that the linker uses is not active yet when the stub is
+# executing. So build all C dependencies of the EFI stub into libstub, and do
+# a verification pass to see if any absolute relocations exist in any of the
+# object files.
 #
-extra-$(CONFIG_ARM64)		:= $(lib-y)
-lib-$(CONFIG_ARM64)		:= $(patsubst %.o,%.init.o,$(lib-y))
+extra-$(CONFIG_EFI_ARMSTUB)	:= $(lib-y)
+lib-$(CONFIG_EFI_ARMSTUB)	:= $(patsubst %.o,%.stub.o,$(lib-y))
 
-OBJCOPYFLAGS := --prefix-alloc-sections=.init
-$(obj)/%.init.o: $(obj)/%.o FORCE
-	$(call if_changed,objcopy)
+STUBCOPY_FLAGS-y		:= -R .debug* -R *ksymtab* -R *kcrctab*
+STUBCOPY_FLAGS-$(CONFIG_ARM64)	+= --prefix-alloc-sections=.init \
+				   --prefix-symbols=__efistub_
+STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS
+
+$(obj)/%.stub.o: $(obj)/%.o FORCE
+	$(call if_changed,stubcopy)
+
+quiet_cmd_stubcopy = STUBCPY $@
+      cmd_stubcopy = if $(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@; then	\
+		     $(OBJDUMP) -r $@ | grep $(STUBCOPY_RELOC-y)	\
+		     && (echo >&2 "$@: absolute symbol references not allowed in the EFI stub"; \
+			 rm -f $@; /bin/false); 			\
+		     else /bin/false; fi
-9
drivers/firmware/efi/libstub/fdt.c
···
 	if (status)
 		goto fdt_set_fail;
 
-	/*
-	 * Add kernel version banner so stub/kernel match can be
-	 * verified.
-	 */
-	status = fdt_setprop_string(fdt, node, "linux,uefi-stub-kern-ver",
-				    linux_banner);
-	if (status)
-		goto fdt_set_fail;
-
 	return EFI_SUCCESS;
 
 fdt_set_fail:
+57
drivers/firmware/efi/libstub/string.c
···
+/*
+ * Taken from:
+ *  linux/lib/string.c
+ *
+ *  Copyright (C) 1991, 1992  Linus Torvalds
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+
+#ifndef __HAVE_ARCH_STRSTR
+/**
+ * strstr - Find the first substring in a %NUL terminated string
+ * @s1: The string to be searched
+ * @s2: The string to search for
+ */
+char *strstr(const char *s1, const char *s2)
+{
+	size_t l1, l2;
+
+	l2 = strlen(s2);
+	if (!l2)
+		return (char *)s1;
+	l1 = strlen(s1);
+	while (l1 >= l2) {
+		l1--;
+		if (!memcmp(s1, s2, l2))
+			return (char *)s1;
+		s1++;
+	}
+	return NULL;
+}
+#endif
+
+#ifndef __HAVE_ARCH_STRNCMP
+/**
+ * strncmp - Compare two length-limited strings
+ * @cs: One string
+ * @ct: Another string
+ * @count: The maximum number of bytes to compare
+ */
+int strncmp(const char *cs, const char *ct, size_t count)
+{
+	unsigned char c1, c2;
+
+	while (count) {
+		c1 = *cs++;
+		c2 = *ct++;
+		if (c1 != c2)
+			return c1 < c2 ? -1 : 1;
+		if (!c1)
+			break;
+		count--;
+	}
+	return 0;
+}
+#endif
+1 -1
drivers/perf/Kconfig
···
 menu "Performance monitor support"
 
 config ARM_PMU
-	depends on PERF_EVENTS && ARM
+	depends on PERF_EVENTS && (ARM || ARM64)
 	bool "ARM PMU framework"
 	default y
 	help
+1 -1
kernel/irq/cpuhotplug.c
···
 
 	c = irq_data_get_irq_chip(d);
 	if (!c->irq_set_affinity) {
-		pr_warn_ratelimited("IRQ%u: unable to set affinity\n", d->irq);
+		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
 	} else {
 		int r = irq_do_set_affinity(d, affinity, false);
 		if (r)
+3 -1
scripts/Makefile.kasan
···
 	call_threshold := 0
 endif
 
+KASAN_SHADOW_OFFSET ?= $(CONFIG_KASAN_SHADOW_OFFSET)
+
 CFLAGS_KASAN_MINIMAL := -fsanitize=kernel-address
 
 CFLAGS_KASAN := $(call cc-option, -fsanitize=kernel-address \
-		-fasan-shadow-offset=$(CONFIG_KASAN_SHADOW_OFFSET) \
+		-fasan-shadow-offset=$(KASAN_SHADOW_OFFSET) \
 		--param asan-stack=1 --param asan-globals=1 \
 		--param asan-instrumentation-with-call-threshold=$(call_threshold))
 