Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

- Support for the new "riscv,isa-extensions" and "riscv,isa-base"
device tree interfaces for probing extensions

- Support for userspace access to the performance counters

- Support for more instructions in kprobes

- Crash kernels can be allocated above 4GiB

- Support for KCFI

- Support for ELFs in !MMU configurations

- ARCH_KMALLOC_MINALIGN has been reduced to 8

- mmap() defaults to sv48-sized addresses, with longer addresses hidden
behind a hint (similar to Arm and Intel)

- Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
riscv: support PREEMPT_DYNAMIC with static keys
riscv: Move create_tmp_mapping() to init sections
riscv: Mark KASAN tmp* page tables variables as static
riscv: mm: use bitmap_zero() API
riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
riscv: remove redundant mv instructions
RISC-V: mm: Document mmap changes
RISC-V: mm: Update pgtable comment documentation
RISC-V: mm: Add tests for RISC-V mm
RISC-V: mm: Restrict address space for sv39,sv48,sv57
riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
riscv: allow kmalloc() caches aligned to the smallest value
riscv: support the elf-fdpic binfmt loader
binfmt_elf_fdpic: support 64-bit systems
riscv: Allow CONFIG_CFI_CLANG to be selected
riscv/purgatory: Disable CFI
riscv: Add CFI error handling
riscv: Add ftrace_stub_graph
riscv: Add types to indirectly called assembly functions
...

+1712 -412
+15 -7
Documentation/admin-guide/kernel-parameters.txt
··· 873 873 memory region [offset, offset + size] for that kernel 874 874 image. If '@offset' is omitted, then a suitable offset 875 875 is selected automatically. 876 - [KNL, X86-64, ARM64] Select a region under 4G first, and 876 + [KNL, X86-64, ARM64, RISCV] Select a region under 4G first, and 877 877 fall back to reserve region above 4G when '@offset' 878 878 hasn't been specified. 879 879 See Documentation/admin-guide/kdump/kdump.rst for further details. ··· 886 886 Documentation/admin-guide/kdump/kdump.rst for an example. 887 887 888 888 crashkernel=size[KMG],high 889 - [KNL, X86-64, ARM64] range could be above 4G. Allow kernel 890 - to allocate physical memory region from top, so could 891 - be above 4G if system have more than 4G ram installed. 892 - Otherwise memory region will be allocated below 4G, if 893 - available. 889 + [KNL, X86-64, ARM64, RISCV] range could be above 4G. 890 + Allow kernel to allocate physical memory region from top, 891 + so could be above 4G if system have more than 4G ram 892 + installed. Otherwise memory region will be allocated 893 + below 4G, if available. 894 894 It will be ignored if crashkernel=X is specified. 895 895 crashkernel=size[KMG],low 896 - [KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high 896 + [KNL, X86-64, ARM64, RISCV] range under 4G. When crashkernel=X,high 897 897 is passed, kernel could allocate physical memory region 898 898 above 4G, that cause second kernel crash on system 899 899 that require some amount of low memory, e.g. swiotlb ··· 904 904 size is platform dependent. 905 905 --> x86: max(swiotlb_size_or_default() + 8MiB, 256MiB) 906 906 --> arm64: 128MiB 907 + --> riscv: 128MiB 907 908 This one lets the user specify own low range under 4G 908 909 for second kernel instead. 909 910 0: to disable low allocation. ··· 5554 5553 ring3mwait=disable 5555 5554 [KNL] Disable ring 3 MONITOR/MWAIT feature on supported 5556 5555 CPUs. 
5556 + 5557 + riscv_isa_fallback [RISCV] 5558 + When CONFIG_RISCV_ISA_FALLBACK is not enabled, permit 5559 + falling back to detecting extension support by parsing 5560 + "riscv,isa" property on devicetree systems when the 5561 + replacement properties are not found. See the Kconfig 5562 + entry for RISCV_ISA_FALLBACK. 5557 5563 5558 5564 ro [KNL] Mount root device read-only on boot 5559 5565
+23 -4
Documentation/admin-guide/sysctl/kernel.rst
··· 941 941 The default value is 8. 942 942 943 943 944 - perf_user_access (arm64 only) 945 - ================================= 944 + perf_user_access (arm64 and riscv only) 945 + ======================================= 946 946 947 - Controls user space access for reading perf event counters. When set to 1, 948 - user space can read performance monitor counter registers directly. 947 + Controls user space access for reading perf event counters. 948 + 949 + arm64 950 + ===== 949 951 950 952 The default value is 0 (access disabled). 951 953 954 + When set to 1, user space can read performance monitor counter registers 955 + directly. 956 + 952 957 See Documentation/arch/arm64/perf.rst for more information. 953 958 959 + riscv 960 + ===== 961 + 962 + When set to 0, user space access is disabled. 963 + 964 + The default value is 1, user space can read performance monitor counter 965 + registers through perf, any direct access without perf intervention will trigger 966 + an illegal instruction. 967 + 968 + When set to 2, which enables legacy mode (user space has direct access to cycle 969 + and instret CSRs only). Note that this legacy value is deprecated and will be 970 + removed once all user space applications are fixed. 971 + 972 + Note that the time CSR is always directly accessible to all modes. 954 973 955 974 pid_max 956 975 =======
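The three riscv values of this sysctl can be restated in a small sketch. The helper below is hypothetical (not a kernel API); it only mirrors the semantics documented in the hunk above:

```c
/*
 * Hypothetical helper mapping the riscv kernel.perf_user_access sysctl
 * values to the documented policies. Purely illustrative.
 */
static const char *perf_user_access_mode(int val)
{
	switch (val) {
	case 0:
		return "disabled";      /* no user-space counter access */
	case 1:
		return "perf-mediated"; /* default: reads must go through perf */
	case 2:
		return "legacy-direct"; /* direct cycle/instret CSRs; deprecated */
	default:
		return "invalid";
	}
}
```

Note that the `time` CSR stays directly readable regardless of the mode.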
+22
Documentation/riscv/vm-layout.rst
··· 133 133 ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules, BPF 134 134 ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel 135 135 __________________|____________|__________________|_________|____________________________________________________________ 136 + 137 + 138 + Userspace VAs 139 + -------------------- 140 + To maintain compatibility with software that relies on the VA space with a 141 + maximum of 48 bits the kernel will, by default, return virtual addresses to 142 + userspace from a 48-bit range (sv48). This default behavior is achieved by 143 + passing 0 into the hint address parameter of mmap. On CPUs with an address space 144 + smaller than sv48, the CPU maximum supported address space will be the default. 145 + 146 + Software can "opt-in" to receiving VAs from another VA space by providing 147 + a hint address to mmap. A hint address passed to mmap will cause the largest 148 + address space that fits entirely into the hint to be used, unless there is no 149 + space left in the address space. If there is no space available in the requested 150 + address space, an address in the next smallest available address space will be 151 + returned. 152 + 153 + For example, in order to obtain 48-bit VA space, a hint address greater than 154 + :code:`1 << 47` must be provided. Note that this is 47 due to sv48 userspace 155 + ending at :code:`1 << 47` and the addresses beyond this are reserved for the 156 + kernel. Similarly, to obtain 57-bit VA space addresses, a hint address greater 157 + than or equal to :code:`1 << 56` must be provided.
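The hint policy documented above (and implemented by `arch_get_mmap_end()` in the riscv `processor.h` changes in this series) can be restated as a sketch. The helper name is hypothetical; it only mirrors the documented fallback rules for a CPU whose largest supported address space is `va_bits` wide:

```c
#include <stdint.h>

/*
 * Hypothetical restatement of the documented policy: given an mmap hint
 * address and the CPU's largest supported VA width, return the width of
 * the address space the kernel would return addresses from.
 */
static unsigned int mmap_va_width(uint64_t hint, unsigned int va_bits)
{
	if (hint == 0)                               /* default: cap at sv48 */
		return va_bits >= 48 ? 48 : va_bits;
	if (hint >= (1ULL << 56) && va_bits >= 57)   /* opt in to sv57 */
		return 57;
	if (hint >= (1ULL << 47) && va_bits >= 48)   /* opt in to sv48 */
		return 48;
	return 39;                                   /* smallest available space */
}
```

So a NULL hint on an sv57 machine still yields sv48-sized addresses, while a hint at or above `1 << 56` unlocks the full 57-bit range.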
+24
arch/riscv/Kconfig
··· 35 35 select ARCH_HAS_SET_MEMORY if MMU 36 36 select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL 37 37 select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL 38 + select ARCH_HAS_SYSCALL_WRAPPER 38 39 select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST 39 40 select ARCH_HAS_UBSAN_SANITIZE_ALL 40 41 select ARCH_HAS_VDSO_DATA ··· 43 42 select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT 44 43 select ARCH_STACKWALK 45 44 select ARCH_SUPPORTS_ATOMIC_RMW 45 + select ARCH_SUPPORTS_CFI_CLANG 46 46 select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU 47 47 select ARCH_SUPPORTS_HUGETLBFS if MMU 48 48 select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU 49 49 select ARCH_SUPPORTS_PER_VMA_LOCK if MMU 50 50 select ARCH_USE_MEMTEST 51 51 select ARCH_USE_QUEUED_RWLOCKS 52 + select ARCH_USES_CFI_TRAPS if CFI_CLANG 52 53 select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU 53 54 select ARCH_WANT_FRAME_POINTERS 54 55 select ARCH_WANT_GENERAL_HUGETLB if !RISCV_ISA_SVNAPOT ··· 65 62 select COMMON_CLK 66 63 select CPU_PM if CPU_IDLE || HIBERNATION 67 64 select EDAC_SUPPORT 65 + select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE) 68 66 select GENERIC_ARCH_TOPOLOGY 69 67 select GENERIC_ATOMIC64 if !64BIT 70 68 select GENERIC_CLOCKEVENTS_BROADCAST if SMP ··· 134 130 select HAVE_PERF_REGS 135 131 select HAVE_PERF_USER_STACK_DUMP 136 132 select HAVE_POSIX_CPU_TIMERS_TASK_WORK 133 + select HAVE_PREEMPT_DYNAMIC_KEY if !XIP_KERNEL 137 134 select HAVE_REGS_AND_STACK_ACCESS_API 138 135 select HAVE_RETHOOK if !XIP_KERNEL 139 136 select HAVE_RSEQ ··· 272 267 select ARCH_HAS_SETUP_DMA_OPS 273 268 select ARCH_HAS_SYNC_DMA_FOR_CPU 274 269 select ARCH_HAS_SYNC_DMA_FOR_DEVICE 270 + select DMA_BOUNCE_UNALIGNED_KMALLOC if SWIOTLB 275 271 select DMA_DIRECT_REMAP 276 272 277 273 config AS_HAS_INSN ··· 841 835 This is the physical address in your flash memory the kernel will 842 836 be linked for and stored to. This address is dependent on your 843 837 own flash usage. 
838 + 839 + config RISCV_ISA_FALLBACK 840 + bool "Permit falling back to parsing riscv,isa for extension support by default" 841 + default y 842 + help 843 + Parsing the "riscv,isa" devicetree property has been deprecated and 844 + replaced by a list of explicitly defined strings. For compatibility 845 + with existing platforms, the kernel will fall back to parsing the 846 + "riscv,isa" property if the replacements are not found. 847 + 848 + Selecting N here will result in a kernel that does not use the 849 + fallback, unless the commandline "riscv_isa_fallback" parameter is 850 + present. 851 + 852 + Please see the dt-binding, located at 853 + Documentation/devicetree/bindings/riscv/extensions.yaml for details 854 + on the replacement properties, "riscv,isa-base" and 855 + "riscv,isa-extensions". 844 856 845 857 endmenu # "Boot options" 846 858
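For illustration, a sketch of the two devicetree interfaces side by side. The property values here are made up; the dt-binding referenced in the help text above (Documentation/devicetree/bindings/riscv/extensions.yaml) is the authoritative schema:

```dts
cpu@0 {
	compatible = "riscv";
	/* deprecated single-string interface, parsed only when
	 * RISCV_ISA_FALLBACK (or riscv_isa_fallback on the command
	 * line) permits it: */
	riscv,isa = "rv64imafdc_zicsr_zifencei";
	/* replacement interface, probed first: */
	riscv,isa-base = "rv64i";
	riscv,isa-extensions = "i", "m", "a", "f", "d", "c",
			       "zicsr", "zifencei";
};
```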
-3
arch/riscv/Makefile
··· 87 87 ifeq ($(CONFIG_CMODEL_MEDANY),y) 88 88 KBUILD_CFLAGS += -mcmodel=medany 89 89 endif 90 - ifeq ($(CONFIG_PERF_EVENTS),y) 91 - KBUILD_CFLAGS += -fno-omit-frame-pointer 92 - endif 93 90 94 91 # Avoid generating .eh_frame sections. 95 92 KBUILD_CFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables
+1 -1
arch/riscv/include/asm/alternative-macros.h
··· 146 146 * vendor_id: The CPU vendor ID. 147 147 * patch_id: The patch ID (erratum ID or cpufeature ID). 148 148 * CONFIG_k: The Kconfig of this patch ID. When Kconfig is disabled, the old 149 - * content will alwyas be executed. 149 + * content will always be executed. 150 150 */ 151 151 #define ALTERNATIVE(old_content, new_content, vendor_id, patch_id, CONFIG_k) \ 152 152 _ALTERNATIVE_CFG(old_content, new_content, vendor_id, patch_id, CONFIG_k)
+14
arch/riscv/include/asm/cache.h
··· 13 13 14 14 #ifdef CONFIG_RISCV_DMA_NONCOHERENT 15 15 #define ARCH_DMA_MINALIGN L1_CACHE_BYTES 16 + #define ARCH_KMALLOC_MINALIGN (8) 16 17 #endif 17 18 18 19 /* ··· 23 22 #ifndef CONFIG_MMU 24 23 #define ARCH_SLAB_MINALIGN 16 25 24 #endif 25 + 26 + #ifndef __ASSEMBLY__ 27 + 28 + #ifdef CONFIG_RISCV_DMA_NONCOHERENT 29 + extern int dma_cache_alignment; 30 + #define dma_get_cache_alignment dma_get_cache_alignment 31 + static inline int dma_get_cache_alignment(void) 32 + { 33 + return dma_cache_alignment; 34 + } 35 + #endif 36 + 37 + #endif /* __ASSEMBLY__ */ 26 38 27 39 #endif /* _ASM_RISCV_CACHE_H */
+2
arch/riscv/include/asm/cacheflush.h
··· 58 58 59 59 #ifdef CONFIG_RISCV_DMA_NONCOHERENT 60 60 void riscv_noncoherent_supported(void); 61 + void __init riscv_set_dma_cache_alignment(void); 61 62 #else 62 63 static inline void riscv_noncoherent_supported(void) {} 64 + static inline void riscv_set_dma_cache_alignment(void) {} 63 65 #endif 64 66 65 67 /*
+22
arch/riscv/include/asm/cfi.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef _ASM_RISCV_CFI_H 3 + #define _ASM_RISCV_CFI_H 4 + 5 + /* 6 + * Clang Control Flow Integrity (CFI) support. 7 + * 8 + * Copyright (C) 2023 Google LLC 9 + */ 10 + 11 + #include <linux/cfi.h> 12 + 13 + #ifdef CONFIG_CFI_CLANG 14 + enum bug_trap_type handle_cfi_failure(struct pt_regs *regs); 15 + #else 16 + static inline enum bug_trap_type handle_cfi_failure(struct pt_regs *regs) 17 + { 18 + return BUG_TRAP_TYPE_NONE; 19 + } 20 + #endif /* CONFIG_CFI_CLANG */ 21 + 22 + #endif /* _ASM_RISCV_CFI_H */
+11 -2
arch/riscv/include/asm/elf.h
··· 41 41 #define compat_elf_check_arch compat_elf_check_arch 42 42 43 43 #define CORE_DUMP_USE_REGSET 44 + #define ELF_FDPIC_CORE_EFLAGS 0 44 45 #define ELF_EXEC_PAGESIZE (PAGE_SIZE) 45 46 46 47 /* ··· 50 49 * the loader. We need to make sure that it is out of the way of the program 51 50 * that it will "exec", and that there is sufficient room for the brk. 52 51 */ 53 - #define ELF_ET_DYN_BASE ((TASK_SIZE / 3) * 2) 52 + #define ELF_ET_DYN_BASE ((DEFAULT_MAP_WINDOW / 3) * 2) 54 53 55 54 #ifdef CONFIG_64BIT 56 55 #ifdef CONFIG_COMPAT ··· 70 69 #define ELF_HWCAP riscv_get_elf_hwcap() 71 70 extern unsigned long elf_hwcap; 72 71 72 + #define ELF_FDPIC_PLAT_INIT(_r, _exec_map_addr, _interp_map_addr, dynamic_addr) \ 73 + do { \ 74 + (_r)->a1 = _exec_map_addr; \ 75 + (_r)->a2 = _interp_map_addr; \ 76 + (_r)->a3 = dynamic_addr; \ 77 + } while (0) 78 + 73 79 /* 74 80 * This yields a string that ld.so will use to load implementation 75 81 * specific libraries for optimization. This is more specific in ··· 86 78 87 79 #define COMPAT_ELF_PLATFORM (NULL) 88 80 89 - #ifdef CONFIG_MMU 90 81 #define ARCH_DLINFO \ 91 82 do { \ 92 83 /* \ ··· 122 115 else \ 123 116 NEW_AUX_ENT(AT_IGNORE, 0); \ 124 117 } while (0) 118 + 119 + #ifdef CONFIG_MMU 125 120 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 126 121 struct linux_binprm; 127 122 extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+12 -5
arch/riscv/include/asm/hwcap.h
··· 14 14 #include <uapi/asm/hwcap.h> 15 15 16 16 #define RISCV_ISA_EXT_a ('a' - 'a') 17 + #define RISCV_ISA_EXT_b ('b' - 'a') 17 18 #define RISCV_ISA_EXT_c ('c' - 'a') 18 19 #define RISCV_ISA_EXT_d ('d' - 'a') 19 20 #define RISCV_ISA_EXT_f ('f' - 'a') 20 21 #define RISCV_ISA_EXT_h ('h' - 'a') 21 22 #define RISCV_ISA_EXT_i ('i' - 'a') 23 + #define RISCV_ISA_EXT_j ('j' - 'a') 24 + #define RISCV_ISA_EXT_k ('k' - 'a') 22 25 #define RISCV_ISA_EXT_m ('m' - 'a') 26 + #define RISCV_ISA_EXT_p ('p' - 'a') 27 + #define RISCV_ISA_EXT_q ('q' - 'a') 23 28 #define RISCV_ISA_EXT_s ('s' - 'a') 24 29 #define RISCV_ISA_EXT_u ('u' - 'a') 25 30 #define RISCV_ISA_EXT_v ('v' - 'a') ··· 60 55 #define RISCV_ISA_EXT_ZIHPM 42 61 56 62 57 #define RISCV_ISA_EXT_MAX 64 63 - #define RISCV_ISA_EXT_NAME_LEN_MAX 32 64 58 65 59 #ifdef CONFIG_RISCV_M_MODE 66 60 #define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA ··· 74 70 unsigned long riscv_get_elf_hwcap(void); 75 71 76 72 struct riscv_isa_ext_data { 77 - /* Name of the extension displayed to userspace via /proc/cpuinfo */ 78 - char uprop[RISCV_ISA_EXT_NAME_LEN_MAX]; 79 - /* The logical ISA extension ID */ 80 - unsigned int isa_ext_id; 73 + const unsigned int id; 74 + const char *name; 75 + const char *property; 81 76 }; 77 + 78 + extern const struct riscv_isa_ext_data riscv_isa_ext[]; 79 + extern const size_t riscv_isa_ext_count; 80 + extern bool riscv_isa_fallback; 82 81 83 82 unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap); 84 83
+10
arch/riscv/include/asm/insn.h
··· 63 63 #define RVG_RS1_OPOFF 15 64 64 #define RVG_RS2_OPOFF 20 65 65 #define RVG_RD_OPOFF 7 66 + #define RVG_RS1_MASK GENMASK(4, 0) 66 67 #define RVG_RD_MASK GENMASK(4, 0) 67 68 68 69 /* The bit field of immediate value in RVC J instruction */ ··· 131 130 #define RVC_C2_RS1_OPOFF 7 132 131 #define RVC_C2_RS2_OPOFF 2 133 132 #define RVC_C2_RD_OPOFF 7 133 + #define RVC_C2_RS1_MASK GENMASK(4, 0) 134 134 135 135 /* parts of opcode for RVG*/ 136 136 #define RVG_OPCODE_FENCE 0x0f ··· 291 289 #define RV_X(X, s, mask) (((X) >> (s)) & (mask)) 292 290 #define RVC_X(X, s, mask) RV_X(X, s, mask) 293 291 292 + #define RV_EXTRACT_RS1_REG(x) \ 293 + ({typeof(x) x_ = (x); \ 294 + (RV_X(x_, RVG_RS1_OPOFF, RVG_RS1_MASK)); }) 295 + 294 296 #define RV_EXTRACT_RD_REG(x) \ 295 297 ({typeof(x) x_ = (x); \ 296 298 (RV_X(x_, RVG_RD_OPOFF, RVG_RD_MASK)); }) ··· 321 315 (RV_X(x_, RV_B_IMM_10_5_OPOFF, RV_B_IMM_10_5_MASK) << RV_B_IMM_10_5_OFF) | \ 322 316 (RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \ 323 317 (RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); }) 318 + 319 + #define RVC_EXTRACT_C2_RS1_REG(x) \ 320 + ({typeof(x) x_ = (x); \ 321 + (RV_X(x_, RVC_C2_RS1_OPOFF, RVC_C2_RS1_MASK)); }) 324 322 325 323 #define RVC_EXTRACT_JTYPE_IMM(x) \ 326 324 ({typeof(x) x_ = (x); \
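As a standalone illustration of how these shift-and-mask helpers decode a register field: rs1 occupies bits [19:15] and rd bits [11:7] of a 32-bit instruction. The constants below are copied from the header; the encoded example used in the comment is `jalr x1, 0(x5)`:

```c
#include <stdint.h>

/* RV_X() is the kernel's generic extract-field helper: shift the
 * instruction word right by the field offset, then mask. */
#define RV_X(x, s, mask)  (((x) >> (s)) & (mask))
#define RVG_RS1_OPOFF 15
#define RVG_RS1_MASK  0x1fu
#define RVG_RD_OPOFF  7
#define RVG_RD_MASK   0x1fu

static unsigned int rv_extract_rs1(uint32_t insn)
{
	return RV_X(insn, RVG_RS1_OPOFF, RVG_RS1_MASK);
}

static unsigned int rv_extract_rd(uint32_t insn)
{
	return RV_X(insn, RVG_RD_OPOFF, RVG_RD_MASK);
}

/* e.g. 0x000280e7 encodes "jalr x1, 0(x5)": rs1 = 5, rd = 1 */
```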
+4
arch/riscv/include/asm/mmu.h
··· 20 20 /* A local icache flush is needed before user execution can resume. */ 21 21 cpumask_t icache_stale_mask; 22 22 #endif 23 + #ifdef CONFIG_BINFMT_ELF_FDPIC 24 + unsigned long exec_fdpic_loadmap; 25 + unsigned long interp_fdpic_loadmap; 26 + #endif 23 27 } mm_context_t; 24 28 25 29 void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
+28 -5
arch/riscv/include/asm/pgtable.h
··· 62 62 * struct pages to map half the virtual address space. Then 63 63 * position vmemmap directly below the VMALLOC region. 64 64 */ 65 + #define VA_BITS_SV32 32 65 66 #ifdef CONFIG_64BIT 67 + #define VA_BITS_SV39 39 68 + #define VA_BITS_SV48 48 69 + #define VA_BITS_SV57 57 70 + 66 71 #define VA_BITS (pgtable_l5_enabled ? \ 67 - 57 : (pgtable_l4_enabled ? 48 : 39)) 72 + VA_BITS_SV57 : (pgtable_l4_enabled ? VA_BITS_SV48 : VA_BITS_SV39)) 68 73 #else 69 - #define VA_BITS 32 74 + #define VA_BITS VA_BITS_SV32 70 75 #endif 71 76 72 77 #define VMEMMAP_SHIFT \ ··· 116 111 #include <asm/page.h> 117 112 #include <asm/tlbflush.h> 118 113 #include <linux/mm_types.h> 114 + #include <asm/compat.h> 119 115 120 116 #define __page_val_to_pfn(_val) (((_val) & _PAGE_PFN_MASK) >> _PAGE_PFN_SHIFT) 121 117 122 118 #ifdef CONFIG_64BIT 123 119 #include <asm/pgtable-64.h> 120 + 121 + #define VA_USER_SV39 (UL(1) << (VA_BITS_SV39 - 1)) 122 + #define VA_USER_SV48 (UL(1) << (VA_BITS_SV48 - 1)) 123 + #define VA_USER_SV57 (UL(1) << (VA_BITS_SV57 - 1)) 124 + 125 + #ifdef CONFIG_COMPAT 126 + #define MMAP_VA_BITS_64 ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS) 127 + #define MMAP_MIN_VA_BITS_64 (VA_BITS_SV39) 128 + #define MMAP_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_VA_BITS_64) 129 + #define MMAP_MIN_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_MIN_VA_BITS_64) 130 + #else 131 + #define MMAP_VA_BITS ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS) 132 + #define MMAP_MIN_VA_BITS (VA_BITS_SV39) 133 + #endif /* CONFIG_COMPAT */ 134 + 124 135 #else 125 136 #include <asm/pgtable-32.h> 126 137 #endif /* CONFIG_64BIT */ ··· 864 843 * Task size is 0x4000000000 for RV64 or 0x9fc00000 for RV32. 865 844 * Note that PGDIR_SIZE must evenly divide TASK_SIZE. 866 845 * Task size is: 867 - * - 0x9fc00000 (~2.5GB) for RV32. 868 - * - 0x4000000000 ( 256GB) for RV64 using SV39 mmu 869 - * - 0x800000000000 ( 128TB) for RV64 using SV48 mmu 846 + * - 0x9fc00000 (~2.5GB) for RV32. 
847 + * - 0x4000000000 ( 256GB) for RV64 using SV39 mmu 848 + * - 0x800000000000 ( 128TB) for RV64 using SV48 mmu 849 + * - 0x100000000000000 ( 64PB) for RV64 using SV57 mmu 870 850 * 871 851 * Note that PGDIR_SIZE must evenly divide TASK_SIZE since "RISC-V 872 852 * Instruction Set Manual Volume II: Privileged Architecture" states that 873 853 * "load and store effective addresses, which are 64bits, must have bits 874 854 * 63–48 all equal to bit 47, or else a page-fault exception will occur." 855 + * Similarly for SV57, bits 63–57 must be equal to bit 56. 875 856 */ 876 857 #ifdef CONFIG_64BIT 877 858 #define TASK_SIZE_64 (PGDIR_SIZE * PTRS_PER_PGD / 2)
+46 -6
arch/riscv/include/asm/processor.h
··· 13 13 14 14 #include <asm/ptrace.h> 15 15 16 + #ifdef CONFIG_64BIT 17 + #define DEFAULT_MAP_WINDOW (UL(1) << (MMAP_VA_BITS - 1)) 18 + #define STACK_TOP_MAX TASK_SIZE_64 19 + 20 + #define arch_get_mmap_end(addr, len, flags) \ 21 + ({ \ 22 + unsigned long mmap_end; \ 23 + typeof(addr) _addr = (addr); \ 24 + if ((_addr) == 0 || (IS_ENABLED(CONFIG_COMPAT) && is_compat_task())) \ 25 + mmap_end = STACK_TOP_MAX; \ 26 + else if ((_addr) >= VA_USER_SV57) \ 27 + mmap_end = STACK_TOP_MAX; \ 28 + else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \ 29 + mmap_end = VA_USER_SV48; \ 30 + else \ 31 + mmap_end = VA_USER_SV39; \ 32 + mmap_end; \ 33 + }) 34 + 35 + #define arch_get_mmap_base(addr, base) \ 36 + ({ \ 37 + unsigned long mmap_base; \ 38 + typeof(addr) _addr = (addr); \ 39 + typeof(base) _base = (base); \ 40 + unsigned long rnd_gap = DEFAULT_MAP_WINDOW - (_base); \ 41 + if ((_addr) == 0 || (IS_ENABLED(CONFIG_COMPAT) && is_compat_task())) \ 42 + mmap_base = (_base); \ 43 + else if (((_addr) >= VA_USER_SV57) && (VA_BITS >= VA_BITS_SV57)) \ 44 + mmap_base = VA_USER_SV57 - rnd_gap; \ 45 + else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \ 46 + mmap_base = VA_USER_SV48 - rnd_gap; \ 47 + else \ 48 + mmap_base = VA_USER_SV39 - rnd_gap; \ 49 + mmap_base; \ 50 + }) 51 + 52 + #else 53 + #define DEFAULT_MAP_WINDOW TASK_SIZE 54 + #define STACK_TOP_MAX TASK_SIZE 55 + #endif 56 + #define STACK_ALIGN 16 57 + 58 + #define STACK_TOP DEFAULT_MAP_WINDOW 59 + 16 60 /* 17 61 * This decides where the kernel will search for a free chunk of vm 18 62 * space during mmap's. 
19 63 */ 20 - #define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3) 21 - 22 - #define STACK_TOP TASK_SIZE 23 64 #ifdef CONFIG_64BIT 24 - #define STACK_TOP_MAX TASK_SIZE_64 65 + #define TASK_UNMAPPED_BASE PAGE_ALIGN((UL(1) << MMAP_MIN_VA_BITS) / 3) 25 66 #else 26 - #define STACK_TOP_MAX TASK_SIZE 67 + #define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3) 27 68 #endif 28 - #define STACK_ALIGN 16 29 69 30 70 #ifndef __ASSEMBLY__ 31 71
+2 -3
arch/riscv/include/asm/syscall.h
··· 75 75 #endif 76 76 } 77 77 78 - typedef long (*syscall_t)(ulong, ulong, ulong, ulong, ulong, ulong, ulong); 78 + typedef long (*syscall_t)(const struct pt_regs *); 79 79 static inline void syscall_handler(struct pt_regs *regs, ulong syscall) 80 80 { 81 81 syscall_t fn; ··· 87 87 #endif 88 88 fn = sys_call_table[syscall]; 89 89 90 - regs->a0 = fn(regs->orig_a0, regs->a1, regs->a2, 91 - regs->a3, regs->a4, regs->a5, regs->a6); 90 + regs->a0 = fn(regs); 92 91 } 93 92 94 93 static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+87
arch/riscv/include/asm/syscall_wrapper.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* 3 + * syscall_wrapper.h - riscv specific wrappers to syscall definitions 4 + * 5 + * Based on arch/arm64/include/syscall_wrapper.h 6 + */ 7 + 8 + #ifndef __ASM_SYSCALL_WRAPPER_H 9 + #define __ASM_SYSCALL_WRAPPER_H 10 + 11 + #include <asm/ptrace.h> 12 + 13 + asmlinkage long __riscv_sys_ni_syscall(const struct pt_regs *); 14 + 15 + #define SC_RISCV_REGS_TO_ARGS(x, ...) \ 16 + __MAP(x,__SC_ARGS \ 17 + ,,regs->orig_a0,,regs->a1,,regs->a2 \ 18 + ,,regs->a3,,regs->a4,,regs->a5,,regs->a6) 19 + 20 + #ifdef CONFIG_COMPAT 21 + 22 + #define COMPAT_SYSCALL_DEFINEx(x, name, ...) \ 23 + asmlinkage long __riscv_compat_sys##name(const struct pt_regs *regs); \ 24 + ALLOW_ERROR_INJECTION(__riscv_compat_sys##name, ERRNO); \ 25 + static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \ 26 + static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \ 27 + asmlinkage long __riscv_compat_sys##name(const struct pt_regs *regs) \ 28 + { \ 29 + return __se_compat_sys##name(SC_RISCV_REGS_TO_ARGS(x,__VA_ARGS__)); \ 30 + } \ 31 + static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \ 32 + { \ 33 + return __do_compat_sys##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \ 34 + } \ 35 + static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) 36 + 37 + #define COMPAT_SYSCALL_DEFINE0(sname) \ 38 + asmlinkage long __riscv_compat_sys_##sname(const struct pt_regs *__unused); \ 39 + ALLOW_ERROR_INJECTION(__riscv_compat_sys_##sname, ERRNO); \ 40 + asmlinkage long __riscv_compat_sys_##sname(const struct pt_regs *__unused) 41 + 42 + #define COND_SYSCALL_COMPAT(name) \ 43 + asmlinkage long __weak __riscv_compat_sys_##name(const struct pt_regs *regs); \ 44 + asmlinkage long __weak __riscv_compat_sys_##name(const struct pt_regs *regs) \ 45 + { \ 46 + return sys_ni_syscall(); \ 47 + } 48 + 49 + #define COMPAT_SYS_NI(name) \ 50 + SYSCALL_ALIAS(__riscv_compat_sys_##name, sys_ni_posix_timers); 51 + 52 + #endif 
/* CONFIG_COMPAT */ 53 + 54 + #define __SYSCALL_DEFINEx(x, name, ...) \ 55 + asmlinkage long __riscv_sys##name(const struct pt_regs *regs); \ 56 + ALLOW_ERROR_INJECTION(__riscv_sys##name, ERRNO); \ 57 + static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \ 58 + static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \ 59 + asmlinkage long __riscv_sys##name(const struct pt_regs *regs) \ 60 + { \ 61 + return __se_sys##name(SC_RISCV_REGS_TO_ARGS(x,__VA_ARGS__)); \ 62 + } \ 63 + static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \ 64 + { \ 65 + long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \ 66 + __MAP(x,__SC_TEST,__VA_ARGS__); \ 67 + __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \ 68 + return ret; \ 69 + } \ 70 + static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) 71 + 72 + #define SYSCALL_DEFINE0(sname) \ 73 + SYSCALL_METADATA(_##sname, 0); \ 74 + asmlinkage long __riscv_sys_##sname(const struct pt_regs *__unused); \ 75 + ALLOW_ERROR_INJECTION(__riscv_sys_##sname, ERRNO); \ 76 + asmlinkage long __riscv_sys_##sname(const struct pt_regs *__unused) 77 + 78 + #define COND_SYSCALL(name) \ 79 + asmlinkage long __weak __riscv_sys_##name(const struct pt_regs *regs); \ 80 + asmlinkage long __weak __riscv_sys_##name(const struct pt_regs *regs) \ 81 + { \ 82 + return sys_ni_syscall(); \ 83 + } 84 + 85 + #define SYS_NI(name) SYSCALL_ALIAS(__riscv_sys_##name, sys_ni_posix_timers); 86 + 87 + #endif /* __ASM_SYSCALL_WRAPPER_H */
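The shape of the wrappers these macros generate can be sketched outside the macro machinery. The struct and function names below are illustrative stand-ins, not the kernel's: the point is that the syscall table no longer holds 7-argument functions, but `pt_regs`-taking wrappers that each unpack only the registers their syscall uses.

```c
/* simplified stand-in for struct pt_regs; field names follow the
 * riscv calling convention (orig_a0 holds the first argument) */
struct pt_regs_sketch {
	long orig_a0, a1, a2, a3, a4, a5, a6;
};

/* the typed syscall body the macro would name __do_sys##name */
static long do_dummy_write(int fd, long count)
{
	return fd >= 0 ? count : -9 /* -EBADF */;
}

/* the generated-style wrapper a sys_call_table entry points at */
static long riscv_sys_dummy_write(const struct pt_regs_sketch *regs)
{
	return do_dummy_write((int)regs->orig_a0, (long)regs->a1);
}
```

This is the same indirection `SC_RISCV_REGS_TO_ARGS()` provides above, and it is what lets `syscall_handler()` in syscall.h shrink to a single `fn(regs)` call.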
+5
arch/riscv/include/uapi/asm/ptrace.h
··· 10 10 11 11 #include <linux/types.h> 12 12 13 + #define PTRACE_GETFDPIC 33 14 + 15 + #define PTRACE_GETFDPIC_EXEC 0 16 + #define PTRACE_GETFDPIC_INTERP 1 17 + 13 18 /* 14 19 * User-mode register state for core dumps, ptrace, sigcontext 15 20 *
+1 -1
arch/riscv/include/uapi/asm/sigcontext.h
··· 25 25 * Signal context structure 26 26 * 27 27 * This contains the context saved before a signal handler is invoked; 28 - * it is restored by sys_sigreturn / sys_rt_sigreturn. 28 + * it is restored by sys_rt_sigreturn. 29 29 */ 30 30 struct sigcontext { 31 31 struct user_regs_struct sc_regs;
+2
arch/riscv/kernel/Makefile
··· 91 91 92 92 obj-$(CONFIG_JUMP_LABEL) += jump_label.o 93 93 94 + obj-$(CONFIG_CFI_CLANG) += cfi.o 95 + 94 96 obj-$(CONFIG_EFI) += efi.o 95 97 obj-$(CONFIG_COMPAT) += compat_syscall_table.o 96 98 obj-$(CONFIG_COMPAT) += compat_signal.o
+77
arch/riscv/kernel/cfi.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Clang Control Flow Integrity (CFI) support. 4 + * 5 + * Copyright (C) 2023 Google LLC 6 + */ 7 + #include <asm/cfi.h> 8 + #include <asm/insn.h> 9 + 10 + /* 11 + * Returns the target address and the expected type when regs->epc points 12 + * to a compiler-generated CFI trap. 13 + */ 14 + static bool decode_cfi_insn(struct pt_regs *regs, unsigned long *target, 15 + u32 *type) 16 + { 17 + unsigned long *regs_ptr = (unsigned long *)regs; 18 + int rs1_num; 19 + u32 insn; 20 + 21 + *target = *type = 0; 22 + 23 + /* 24 + * The compiler generates the following instruction sequence 25 + * for indirect call checks: 26 + * 27 + *   lw t1, -4(<reg>) 28 + * lui t2, <hi20> 29 + * addiw t2, t2, <lo12> 30 + * beq t1, t2, .Ltmp1 31 + * ebreak ; <- regs->epc 32 + * .Ltmp1: 33 + * jalr <reg> 34 + * 35 + * We can read the expected type and the target address from the 36 + * registers passed to the beq/jalr instructions. 37 + */ 38 + if (get_kernel_nofault(insn, (void *)regs->epc - 4)) 39 + return false; 40 + if (!riscv_insn_is_beq(insn)) 41 + return false; 42 + 43 + *type = (u32)regs_ptr[RV_EXTRACT_RS1_REG(insn)]; 44 + 45 + if (get_kernel_nofault(insn, (void *)regs->epc) || 46 + get_kernel_nofault(insn, (void *)regs->epc + GET_INSN_LENGTH(insn))) 47 + return false; 48 + 49 + if (riscv_insn_is_jalr(insn)) 50 + rs1_num = RV_EXTRACT_RS1_REG(insn); 51 + else if (riscv_insn_is_c_jalr(insn)) 52 + rs1_num = RVC_EXTRACT_C2_RS1_REG(insn); 53 + else 54 + return false; 55 + 56 + *target = regs_ptr[rs1_num]; 57 + 58 + return true; 59 + } 60 + 61 + /* 62 + * Checks if the ebreak trap is because of a CFI failure, and handles the trap 63 + * if needed. Returns a bug_trap_type value similarly to report_bug. 
64 + */ 65 + enum bug_trap_type handle_cfi_failure(struct pt_regs *regs) 66 + { 67 + unsigned long target; 68 + u32 type; 69 + 70 + if (!is_cfi_trap(regs->epc)) 71 + return BUG_TRAP_TYPE_NONE; 72 + 73 + if (!decode_cfi_insn(regs, &target, &type)) 74 + return report_cfi_failure_noaddr(regs, regs->epc); 75 + 76 + return report_cfi_failure(regs, regs->epc, &target, type); 77 + }
+6 -2
arch/riscv/kernel/compat_syscall_table.c
··· 9 9 #include <asm/syscall.h> 10 10 11 11 #undef __SYSCALL 12 - #define __SYSCALL(nr, call) [nr] = (call), 12 + #define __SYSCALL(nr, call) asmlinkage long __riscv_##call(const struct pt_regs *); 13 + #include <asm/unistd.h> 14 + 15 + #undef __SYSCALL 16 + #define __SYSCALL(nr, call) [nr] = __riscv_##call, 13 17 14 18 asmlinkage long compat_sys_rt_sigreturn(void); 15 19 16 20 void * const compat_sys_call_table[__NR_syscalls] = { 17 - [0 ... __NR_syscalls - 1] = sys_ni_syscall, 21 + [0 ... __NR_syscalls - 1] = __riscv_sys_ni_syscall, 18 22 #include <asm/unistd.h> 19 23 };
+68 -123
arch/riscv/kernel/cpu.c
··· 46 46 return 0; 47 47 } 48 48 49 - int riscv_early_of_processor_hartid(struct device_node *node, unsigned long *hart) 49 + int __init riscv_early_of_processor_hartid(struct device_node *node, unsigned long *hart) 50 50 { 51 51 const char *isa; 52 52 ··· 66 66 return -ENODEV; 67 67 } 68 68 69 - if (of_property_read_string(node, "riscv,isa", &isa)) { 70 - pr_warn("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hart); 69 + if (of_property_read_string(node, "riscv,isa-base", &isa)) 70 + goto old_interface; 71 + 72 + if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32i", 5)) { 73 + pr_warn("CPU with hartid=%lu does not support rv32i", *hart); 71 74 return -ENODEV; 72 75 } 73 76 74 - if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32ima", 7)) 77 + if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64i", 5)) { 78 + pr_warn("CPU with hartid=%lu does not support rv64i", *hart); 79 + return -ENODEV; 80 + } 81 + 82 + if (!of_property_present(node, "riscv,isa-extensions")) 75 83 return -ENODEV; 76 84 77 - if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64ima", 7)) 85 + if (of_property_match_string(node, "riscv,isa-extensions", "i") < 0 || 86 + of_property_match_string(node, "riscv,isa-extensions", "m") < 0 || 87 + of_property_match_string(node, "riscv,isa-extensions", "a") < 0) { 88 + pr_warn("CPU with hartid=%lu does not support ima", *hart); 78 89 return -ENODEV; 90 + } 91 + 92 + return 0; 93 + 94 + old_interface: 95 + if (!riscv_isa_fallback) { 96 + pr_warn("CPU with hartid=%lu is invalid: this kernel does not parse \"riscv,isa\"", 97 + *hart); 98 + return -ENODEV; 99 + } 100 + 101 + if (of_property_read_string(node, "riscv,isa", &isa)) { 102 + pr_warn("CPU with hartid=%lu has no \"riscv,isa-base\" or \"riscv,isa\" property\n", 103 + *hart); 104 + return -ENODEV; 105 + } 106 + 107 + if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32ima", 7)) { 108 + pr_warn("CPU with hartid=%lu does not support rv32ima", *hart); 109 + return -ENODEV; 110 + } 111 
+ 112 + if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64ima", 7)) { 113 + pr_warn("CPU with hartid=%lu does not support rv64ima", *hart); 114 + return -ENODEV; 115 + } 79 116 80 117 return 0; 81 118 } ··· 202 165 203 166 #ifdef CONFIG_PROC_FS 204 167 205 - #define __RISCV_ISA_EXT_DATA(UPROP, EXTID) \ 206 - { \ 207 - .uprop = #UPROP, \ 208 - .isa_ext_id = EXTID, \ 209 - } 210 - 211 - /* 212 - * The canonical order of ISA extension names in the ISA string is defined in 213 - * chapter 27 of the unprivileged specification. 214 - * 215 - * Ordinarily, for in-kernel data structures, this order is unimportant but 216 - * isa_ext_arr defines the order of the ISA string in /proc/cpuinfo. 217 - * 218 - * The specification uses vague wording, such as should, when it comes to 219 - * ordering, so for our purposes the following rules apply: 220 - * 221 - * 1. All multi-letter extensions must be separated from other extensions by an 222 - * underscore. 223 - * 224 - * 2. Additional standard extensions (starting with 'Z') must be sorted after 225 - * single-letter extensions and before any higher-privileged extensions. 226 - 227 - * 3. The first letter following the 'Z' conventionally indicates the most 228 - * closely related alphabetical extension category, IMAFDQLCBKJTPVH. 229 - * If multiple 'Z' extensions are named, they must be ordered first by 230 - * category, then alphabetically within a category. 231 - * 232 - * 3. Standard supervisor-level extensions (starting with 'S') must be listed 233 - * after standard unprivileged extensions. If multiple supervisor-level 234 - * extensions are listed, they must be ordered alphabetically. 235 - * 236 - * 4. Standard machine-level extensions (starting with 'Zxm') must be listed 237 - * after any lower-privileged, standard extensions. If multiple 238 - * machine-level extensions are listed, they must be ordered 239 - * alphabetically. 240 - * 241 - * 5. 
Non-standard extensions (starting with 'X') must be listed after all 242 - * standard extensions. If multiple non-standard extensions are listed, they 243 - * must be ordered alphabetically. 244 - * 245 - * An example string following the order is: 246 - * rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux 247 - * 248 - * New entries to this struct should follow the ordering rules described above. 249 - */ 250 - static struct riscv_isa_ext_data isa_ext_arr[] = { 251 - __RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM), 252 - __RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ), 253 - __RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR), 254 - __RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), 255 - __RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI), 256 - __RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), 257 - __RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM), 258 - __RISCV_ISA_EXT_DATA(zba, RISCV_ISA_EXT_ZBA), 259 - __RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), 260 - __RISCV_ISA_EXT_DATA(zbs, RISCV_ISA_EXT_ZBS), 261 - __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), 262 - __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), 263 - __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), 264 - __RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC), 265 - __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), 266 - __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), 267 - __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), 268 - __RISCV_ISA_EXT_DATA("", RISCV_ISA_EXT_MAX), 269 - }; 270 - 271 - static void print_isa_ext(struct seq_file *f) 168 + static void print_isa(struct seq_file *f) 272 169 { 273 - struct riscv_isa_ext_data *edata; 274 - int i = 0, arr_sz; 275 - 276 - arr_sz = ARRAY_SIZE(isa_ext_arr) - 1; 277 - 278 - /* No extension support available */ 279 - if (arr_sz <= 0) 280 - return; 281 - 282 - for (i = 0; i <= arr_sz; i++) { 283 - edata = &isa_ext_arr[i]; 284 - if (!__riscv_isa_extension_available(NULL, edata->isa_ext_id)) 285 - continue; 286 - 
seq_printf(f, "_%s", edata->uprop); 287 - } 288 - } 289 - 290 - /* 291 - * These are the only valid base (single letter) ISA extensions as per the spec. 292 - * It also specifies the canonical order in which it appears in the spec. 293 - * Some of the extension may just be a place holder for now (B, K, P, J). 294 - * This should be updated once corresponding extensions are ratified. 295 - */ 296 - static const char base_riscv_exts[13] = "imafdqcbkjpvh"; 297 - 298 - static void print_isa(struct seq_file *f, const char *isa) 299 - { 300 - int i; 301 - 302 170 seq_puts(f, "isa\t\t: "); 303 - /* Print the rv[64/32] part */ 304 - seq_write(f, isa, 4); 305 - for (i = 0; i < sizeof(base_riscv_exts); i++) { 306 - if (__riscv_isa_extension_available(NULL, base_riscv_exts[i] - 'a')) 307 - /* Print only enabled the base ISA extensions */ 308 - seq_write(f, &base_riscv_exts[i], 1); 171 + 172 + if (IS_ENABLED(CONFIG_32BIT)) 173 + seq_write(f, "rv32", 4); 174 + else 175 + seq_write(f, "rv64", 4); 176 + 177 + for (int i = 0; i < riscv_isa_ext_count; i++) { 178 + if (!__riscv_isa_extension_available(NULL, riscv_isa_ext[i].id)) 179 + continue; 180 + 181 + /* Only multi-letter extensions are split by underscores */ 182 + if (strnlen(riscv_isa_ext[i].name, 2) != 1) 183 + seq_puts(f, "_"); 184 + 185 + seq_printf(f, "%s", riscv_isa_ext[i].name); 309 186 } 310 - print_isa_ext(f); 187 + 311 188 seq_puts(f, "\n"); 312 189 } 313 190 314 191 static void print_mmu(struct seq_file *f) 315 192 { 316 - char sv_type[16]; 193 + const char *sv_type; 317 194 318 195 #ifdef CONFIG_MMU 319 196 #if defined(CONFIG_32BIT) 320 - strncpy(sv_type, "sv32", 5); 197 + sv_type = "sv32"; 321 198 #elif defined(CONFIG_64BIT) 322 199 if (pgtable_l5_enabled) 323 - strncpy(sv_type, "sv57", 5); 200 + sv_type = "sv57"; 324 201 else if (pgtable_l4_enabled) 325 - strncpy(sv_type, "sv48", 5); 202 + sv_type = "sv48"; 326 203 else 327 - strncpy(sv_type, "sv39", 5); 204 + sv_type = "sv39"; 328 205 #endif 329 206 #else 330 - 
strncpy(sv_type, "none", 5); 207 + sv_type = "none"; 331 208 #endif /* CONFIG_MMU */ 332 209 seq_printf(f, "mmu\t\t: %s\n", sv_type); 333 210 } ··· 272 321 unsigned long cpu_id = (unsigned long)v - 1; 273 322 struct riscv_cpuinfo *ci = per_cpu_ptr(&riscv_cpuinfo, cpu_id); 274 323 struct device_node *node; 275 - const char *compat, *isa; 324 + const char *compat; 276 325 277 326 seq_printf(m, "processor\t: %lu\n", cpu_id); 278 327 seq_printf(m, "hart\t\t: %lu\n", cpuid_to_hartid_map(cpu_id)); 328 + print_isa(m); 329 + print_mmu(m); 279 330 280 331 if (acpi_disabled) { 281 332 node = of_get_cpu_node(cpu_id, NULL); 282 - if (!of_property_read_string(node, "riscv,isa", &isa)) 283 - print_isa(m, isa); 284 333 285 - print_mmu(m); 286 334 if (!of_property_read_string(node, "compatible", &compat) && 287 335 strcmp(compat, "riscv")) 288 336 seq_printf(m, "uarch\t\t: %s\n", compat); 289 337 290 338 of_node_put(node); 291 - } else { 292 - if (!acpi_get_riscv_isa(NULL, cpu_id, &isa)) 293 - print_isa(m, isa); 294 - 295 - print_mmu(m); 296 339 } 297 340 298 341 seq_printf(m, "mvendorid\t: 0x%lx\n", ci->mvendorid);
+340 -181
arch/riscv/kernel/cpufeature.c
··· 98 98 return true; 99 99 } 100 100 101 - void __init riscv_fill_hwcap(void) 101 + #define __RISCV_ISA_EXT_DATA(_name, _id) { \ 102 + .name = #_name, \ 103 + .property = #_name, \ 104 + .id = _id, \ 105 + } 106 + 107 + /* 108 + * The canonical order of ISA extension names in the ISA string is defined in 109 + * chapter 27 of the unprivileged specification. 110 + * 111 + * Ordinarily, for in-kernel data structures, this order is unimportant but 112 + * isa_ext_arr defines the order of the ISA string in /proc/cpuinfo. 113 + * 114 + * The specification uses vague wording, such as should, when it comes to 115 + * ordering, so for our purposes the following rules apply: 116 + * 117 + * 1. All multi-letter extensions must be separated from other extensions by an 118 + * underscore. 119 + * 120 + * 2. Additional standard extensions (starting with 'Z') must be sorted after 121 + * single-letter extensions and before any higher-privileged extensions. 122 + * 123 + * 3. The first letter following the 'Z' conventionally indicates the most 124 + * closely related alphabetical extension category, IMAFDQLCBKJTPVH. 125 + * If multiple 'Z' extensions are named, they must be ordered first by 126 + * category, then alphabetically within a category. 127 + * 128 + * 3. Standard supervisor-level extensions (starting with 'S') must be listed 129 + * after standard unprivileged extensions. If multiple supervisor-level 130 + * extensions are listed, they must be ordered alphabetically. 131 + * 132 + * 4. Standard machine-level extensions (starting with 'Zxm') must be listed 133 + * after any lower-privileged, standard extensions. If multiple 134 + * machine-level extensions are listed, they must be ordered 135 + * alphabetically. 136 + * 137 + * 5. Non-standard extensions (starting with 'X') must be listed after all 138 + * standard extensions. If multiple non-standard extensions are listed, they 139 + * must be ordered alphabetically. 
140 + * 141 + * An example string following the order is: 142 + * rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux 143 + * 144 + * New entries to this struct should follow the ordering rules described above. 145 + */ 146 + const struct riscv_isa_ext_data riscv_isa_ext[] = { 147 + __RISCV_ISA_EXT_DATA(i, RISCV_ISA_EXT_i), 148 + __RISCV_ISA_EXT_DATA(m, RISCV_ISA_EXT_m), 149 + __RISCV_ISA_EXT_DATA(a, RISCV_ISA_EXT_a), 150 + __RISCV_ISA_EXT_DATA(f, RISCV_ISA_EXT_f), 151 + __RISCV_ISA_EXT_DATA(d, RISCV_ISA_EXT_d), 152 + __RISCV_ISA_EXT_DATA(q, RISCV_ISA_EXT_q), 153 + __RISCV_ISA_EXT_DATA(c, RISCV_ISA_EXT_c), 154 + __RISCV_ISA_EXT_DATA(b, RISCV_ISA_EXT_b), 155 + __RISCV_ISA_EXT_DATA(k, RISCV_ISA_EXT_k), 156 + __RISCV_ISA_EXT_DATA(j, RISCV_ISA_EXT_j), 157 + __RISCV_ISA_EXT_DATA(p, RISCV_ISA_EXT_p), 158 + __RISCV_ISA_EXT_DATA(v, RISCV_ISA_EXT_v), 159 + __RISCV_ISA_EXT_DATA(h, RISCV_ISA_EXT_h), 160 + __RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM), 161 + __RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ), 162 + __RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR), 163 + __RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), 164 + __RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI), 165 + __RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), 166 + __RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM), 167 + __RISCV_ISA_EXT_DATA(zba, RISCV_ISA_EXT_ZBA), 168 + __RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), 169 + __RISCV_ISA_EXT_DATA(zbs, RISCV_ISA_EXT_ZBS), 170 + __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), 171 + __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), 172 + __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), 173 + __RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC), 174 + __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), 175 + __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), 176 + __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), 177 + }; 178 + 179 + const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext); 180 + 181 + static void __init 
riscv_parse_isa_string(unsigned long *this_hwcap, struct riscv_isainfo *isainfo, 182 + unsigned long *isa2hwcap, const char *isa) 183 + { 184 + /* 185 + * For all possible cpus, we have already validated in 186 + * the boot process that they at least contain "rv" and 187 + * whichever of "32"/"64" this kernel supports, and so this 188 + * section can be skipped. 189 + */ 190 + isa += 4; 191 + 192 + while (*isa) { 193 + const char *ext = isa++; 194 + const char *ext_end = isa; 195 + bool ext_long = false, ext_err = false; 196 + 197 + switch (*ext) { 198 + case 's': 199 + /* 200 + * Workaround for invalid single-letter 's' & 'u'(QEMU). 201 + * No need to set the bit in riscv_isa as 's' & 'u' are 202 + * not valid ISA extensions. It works until multi-letter 203 + * extension starting with "Su" appears. 204 + */ 205 + if (ext[-1] != '_' && ext[1] == 'u') { 206 + ++isa; 207 + ext_err = true; 208 + break; 209 + } 210 + fallthrough; 211 + case 'S': 212 + case 'x': 213 + case 'X': 214 + case 'z': 215 + case 'Z': 216 + /* 217 + * Before attempting to parse the extension itself, we find its end. 218 + * As multi-letter extensions must be split from other multi-letter 219 + * extensions with an "_", the end of a multi-letter extension will 220 + * either be the null character or the "_" at the start of the next 221 + * multi-letter extension. 222 + * 223 + * Next, as the extensions version is currently ignored, we 224 + * eliminate that portion. This is done by parsing backwards from 225 + * the end of the extension, removing any numbers. This may be a 226 + * major or minor number however, so the process is repeated if a 227 + * minor number was found. 228 + * 229 + * ext_end is intended to represent the first character *after* the 230 + * name portion of an extension, but will be decremented to the last 231 + * character itself while eliminating the extensions version number. 232 + * A simple re-increment solves this problem. 
233 + */ 234 + ext_long = true; 235 + for (; *isa && *isa != '_'; ++isa) 236 + if (unlikely(!isalnum(*isa))) 237 + ext_err = true; 238 + 239 + ext_end = isa; 240 + if (unlikely(ext_err)) 241 + break; 242 + 243 + if (!isdigit(ext_end[-1])) 244 + break; 245 + 246 + while (isdigit(*--ext_end)) 247 + ; 248 + 249 + if (tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1])) { 250 + ++ext_end; 251 + break; 252 + } 253 + 254 + while (isdigit(*--ext_end)) 255 + ; 256 + 257 + ++ext_end; 258 + break; 259 + default: 260 + /* 261 + * Things are a little easier for single-letter extensions, as they 262 + * are parsed forwards. 263 + * 264 + * After checking that our starting position is valid, we need to 265 + * ensure that, when isa was incremented at the start of the loop, 266 + * that it arrived at the start of the next extension. 267 + * 268 + * If we are already on a non-digit, there is nothing to do. Either 269 + * we have a multi-letter extension's _, or the start of an 270 + * extension. 271 + * 272 + * Otherwise we have found the current extension's major version 273 + * number. Parse past it, and a subsequent p/minor version number 274 + * if present. The `p` extension must not appear immediately after 275 + * a number, so there is no fear of missing it. 276 + * 277 + */ 278 + if (unlikely(!isalpha(*ext))) { 279 + ext_err = true; 280 + break; 281 + } 282 + 283 + if (!isdigit(*isa)) 284 + break; 285 + 286 + while (isdigit(*++isa)) 287 + ; 288 + 289 + if (tolower(*isa) != 'p') 290 + break; 291 + 292 + if (!isdigit(*++isa)) { 293 + --isa; 294 + break; 295 + } 296 + 297 + while (isdigit(*++isa)) 298 + ; 299 + 300 + break; 301 + } 302 + 303 + /* 304 + * The parser expects that at the start of an iteration isa points to the 305 + * first character of the next extension. As we stop parsing an extension 306 + * on meeting a non-alphanumeric character, an extra increment is needed 307 + * where the succeeding extension is a multi-letter prefixed with an "_". 
308 + */ 309 + if (*isa == '_') 310 + ++isa; 311 + 312 + #define SET_ISA_EXT_MAP(name, bit) \ 313 + do { \ 314 + if ((ext_end - ext == strlen(name)) && \ 315 + !strncasecmp(ext, name, strlen(name)) && \ 316 + riscv_isa_extension_check(bit)) \ 317 + set_bit(bit, isainfo->isa); \ 318 + } while (false) \ 319 + 320 + if (unlikely(ext_err)) 321 + continue; 322 + if (!ext_long) { 323 + int nr = tolower(*ext) - 'a'; 324 + 325 + if (riscv_isa_extension_check(nr)) { 326 + *this_hwcap |= isa2hwcap[nr]; 327 + set_bit(nr, isainfo->isa); 328 + } 329 + } else { 330 + for (int i = 0; i < riscv_isa_ext_count; i++) 331 + SET_ISA_EXT_MAP(riscv_isa_ext[i].name, 332 + riscv_isa_ext[i].id); 333 + } 334 + #undef SET_ISA_EXT_MAP 335 + } 336 + } 337 + 338 + static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) 102 339 { 103 340 struct device_node *node; 104 341 const char *isa; 105 - char print_str[NUM_ALPHA_EXTS + 1]; 106 - int i, j, rc; 107 - unsigned long isa2hwcap[26] = {0}; 342 + int rc; 108 343 struct acpi_table_header *rhct; 109 344 acpi_status status; 110 345 unsigned int cpu; 111 - 112 - isa2hwcap['i' - 'a'] = COMPAT_HWCAP_ISA_I; 113 - isa2hwcap['m' - 'a'] = COMPAT_HWCAP_ISA_M; 114 - isa2hwcap['a' - 'a'] = COMPAT_HWCAP_ISA_A; 115 - isa2hwcap['f' - 'a'] = COMPAT_HWCAP_ISA_F; 116 - isa2hwcap['d' - 'a'] = COMPAT_HWCAP_ISA_D; 117 - isa2hwcap['c' - 'a'] = COMPAT_HWCAP_ISA_C; 118 - isa2hwcap['v' - 'a'] = COMPAT_HWCAP_ISA_V; 119 - 120 - elf_hwcap = 0; 121 - 122 - bitmap_zero(riscv_isa, RISCV_ISA_EXT_MAX); 123 346 124 347 if (!acpi_disabled) { 125 348 status = acpi_get_table(ACPI_SIG_RHCT, 0, &rhct); ··· 375 152 } 376 153 } 377 154 378 - /* 379 - * For all possible cpus, we have already validated in 380 - * the boot process that they at least contain "rv" and 381 - * whichever of "32"/"64" this kernel supports, and so this 382 - * section can be skipped. 
383 - */ 384 - isa += 4; 385 - 386 - while (*isa) { 387 - const char *ext = isa++; 388 - const char *ext_end = isa; 389 - bool ext_long = false, ext_err = false; 390 - 391 - switch (*ext) { 392 - case 's': 393 - /* 394 - * Workaround for invalid single-letter 's' & 'u'(QEMU). 395 - * No need to set the bit in riscv_isa as 's' & 'u' are 396 - * not valid ISA extensions. It works until multi-letter 397 - * extension starting with "Su" appears. 398 - */ 399 - if (ext[-1] != '_' && ext[1] == 'u') { 400 - ++isa; 401 - ext_err = true; 402 - break; 403 - } 404 - fallthrough; 405 - case 'S': 406 - case 'x': 407 - case 'X': 408 - case 'z': 409 - case 'Z': 410 - /* 411 - * Before attempting to parse the extension itself, we find its end. 412 - * As multi-letter extensions must be split from other multi-letter 413 - * extensions with an "_", the end of a multi-letter extension will 414 - * either be the null character or the "_" at the start of the next 415 - * multi-letter extension. 416 - * 417 - * Next, as the extensions version is currently ignored, we 418 - * eliminate that portion. This is done by parsing backwards from 419 - * the end of the extension, removing any numbers. This may be a 420 - * major or minor number however, so the process is repeated if a 421 - * minor number was found. 422 - * 423 - * ext_end is intended to represent the first character *after* the 424 - * name portion of an extension, but will be decremented to the last 425 - * character itself while eliminating the extensions version number. 426 - * A simple re-increment solves this problem. 
427 - */ 428 - ext_long = true; 429 - for (; *isa && *isa != '_'; ++isa) 430 - if (unlikely(!isalnum(*isa))) 431 - ext_err = true; 432 - 433 - ext_end = isa; 434 - if (unlikely(ext_err)) 435 - break; 436 - 437 - if (!isdigit(ext_end[-1])) 438 - break; 439 - 440 - while (isdigit(*--ext_end)) 441 - ; 442 - 443 - if (tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1])) { 444 - ++ext_end; 445 - break; 446 - } 447 - 448 - while (isdigit(*--ext_end)) 449 - ; 450 - 451 - ++ext_end; 452 - break; 453 - default: 454 - /* 455 - * Things are a little easier for single-letter extensions, as they 456 - * are parsed forwards. 457 - * 458 - * After checking that our starting position is valid, we need to 459 - * ensure that, when isa was incremented at the start of the loop, 460 - * that it arrived at the start of the next extension. 461 - * 462 - * If we are already on a non-digit, there is nothing to do. Either 463 - * we have a multi-letter extension's _, or the start of an 464 - * extension. 465 - * 466 - * Otherwise we have found the current extension's major version 467 - * number. Parse past it, and a subsequent p/minor version number 468 - * if present. The `p` extension must not appear immediately after 469 - * a number, so there is no fear of missing it. 470 - * 471 - */ 472 - if (unlikely(!isalpha(*ext))) { 473 - ext_err = true; 474 - break; 475 - } 476 - 477 - if (!isdigit(*isa)) 478 - break; 479 - 480 - while (isdigit(*++isa)) 481 - ; 482 - 483 - if (tolower(*isa) != 'p') 484 - break; 485 - 486 - if (!isdigit(*++isa)) { 487 - --isa; 488 - break; 489 - } 490 - 491 - while (isdigit(*++isa)) 492 - ; 493 - 494 - break; 495 - } 496 - 497 - /* 498 - * The parser expects that at the start of an iteration isa points to the 499 - * first character of the next extension. As we stop parsing an extension 500 - * on meeting a non-alphanumeric character, an extra increment is needed 501 - * where the succeeding extension is a multi-letter prefixed with an "_". 
502 - */ 503 - if (*isa == '_') 504 - ++isa; 505 - 506 - #define SET_ISA_EXT_MAP(name, bit) \ 507 - do { \ 508 - if ((ext_end - ext == sizeof(name) - 1) && \ 509 - !strncasecmp(ext, name, sizeof(name) - 1) && \ 510 - riscv_isa_extension_check(bit)) \ 511 - set_bit(bit, isainfo->isa); \ 512 - } while (false) \ 513 - 514 - if (unlikely(ext_err)) 515 - continue; 516 - if (!ext_long) { 517 - int nr = tolower(*ext) - 'a'; 518 - 519 - if (riscv_isa_extension_check(nr)) { 520 - this_hwcap |= isa2hwcap[nr]; 521 - set_bit(nr, isainfo->isa); 522 - } 523 - } else { 524 - /* sorted alphabetically */ 525 - SET_ISA_EXT_MAP("smaia", RISCV_ISA_EXT_SMAIA); 526 - SET_ISA_EXT_MAP("ssaia", RISCV_ISA_EXT_SSAIA); 527 - SET_ISA_EXT_MAP("sscofpmf", RISCV_ISA_EXT_SSCOFPMF); 528 - SET_ISA_EXT_MAP("sstc", RISCV_ISA_EXT_SSTC); 529 - SET_ISA_EXT_MAP("svinval", RISCV_ISA_EXT_SVINVAL); 530 - SET_ISA_EXT_MAP("svnapot", RISCV_ISA_EXT_SVNAPOT); 531 - SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT); 532 - SET_ISA_EXT_MAP("zba", RISCV_ISA_EXT_ZBA); 533 - SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB); 534 - SET_ISA_EXT_MAP("zbs", RISCV_ISA_EXT_ZBS); 535 - SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM); 536 - SET_ISA_EXT_MAP("zicboz", RISCV_ISA_EXT_ZICBOZ); 537 - SET_ISA_EXT_MAP("zihintpause", RISCV_ISA_EXT_ZIHINTPAUSE); 538 - } 539 - #undef SET_ISA_EXT_MAP 540 - } 155 + riscv_parse_isa_string(&this_hwcap, isainfo, isa2hwcap, isa); 541 156 542 157 /* 543 158 * These ones were as they were part of the base ISA when the ··· 407 346 408 347 if (!acpi_disabled && rhct) 409 348 acpi_put_table((struct acpi_table_header *)rhct); 349 + } 410 350 411 - /* We don't support systems with F but without D, so mask those out 412 - * here. 
*/ 351 + static int __init riscv_fill_hwcap_from_ext_list(unsigned long *isa2hwcap) 352 + { 353 + unsigned int cpu; 354 + 355 + for_each_possible_cpu(cpu) { 356 + unsigned long this_hwcap = 0; 357 + struct device_node *cpu_node; 358 + struct riscv_isainfo *isainfo = &hart_isa[cpu]; 359 + 360 + cpu_node = of_cpu_device_node_get(cpu); 361 + if (!cpu_node) { 362 + pr_warn("Unable to find cpu node\n"); 363 + continue; 364 + } 365 + 366 + if (!of_property_present(cpu_node, "riscv,isa-extensions")) { 367 + of_node_put(cpu_node); 368 + continue; 369 + } 370 + 371 + for (int i = 0; i < riscv_isa_ext_count; i++) { 372 + if (of_property_match_string(cpu_node, "riscv,isa-extensions", 373 + riscv_isa_ext[i].property) < 0) 374 + continue; 375 + 376 + if (!riscv_isa_extension_check(riscv_isa_ext[i].id)) 377 + continue; 378 + 379 + /* Only single letter extensions get set in hwcap */ 380 + if (strnlen(riscv_isa_ext[i].name, 2) == 1) 381 + this_hwcap |= isa2hwcap[riscv_isa_ext[i].id]; 382 + 383 + set_bit(riscv_isa_ext[i].id, isainfo->isa); 384 + } 385 + 386 + of_node_put(cpu_node); 387 + 388 + /* 389 + * All "okay" harts should have same isa. Set HWCAP based on 390 + * common capabilities of every "okay" hart, in case they don't. 
391 + */ 392 + if (elf_hwcap) 393 + elf_hwcap &= this_hwcap; 394 + else 395 + elf_hwcap = this_hwcap; 396 + 397 + if (bitmap_empty(riscv_isa, RISCV_ISA_EXT_MAX)) 398 + bitmap_copy(riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX); 399 + else 400 + bitmap_and(riscv_isa, riscv_isa, isainfo->isa, RISCV_ISA_EXT_MAX); 401 + } 402 + 403 + if (bitmap_empty(riscv_isa, RISCV_ISA_EXT_MAX)) 404 + return -ENOENT; 405 + 406 + return 0; 407 + } 408 + 409 + #ifdef CONFIG_RISCV_ISA_FALLBACK 410 + bool __initdata riscv_isa_fallback = true; 411 + #else 412 + bool __initdata riscv_isa_fallback; 413 + static int __init riscv_isa_fallback_setup(char *__unused) 414 + { 415 + riscv_isa_fallback = true; 416 + return 1; 417 + } 418 + early_param("riscv_isa_fallback", riscv_isa_fallback_setup); 419 + #endif 420 + 421 + void __init riscv_fill_hwcap(void) 422 + { 423 + char print_str[NUM_ALPHA_EXTS + 1]; 424 + unsigned long isa2hwcap[26] = {0}; 425 + int i, j; 426 + 427 + isa2hwcap['i' - 'a'] = COMPAT_HWCAP_ISA_I; 428 + isa2hwcap['m' - 'a'] = COMPAT_HWCAP_ISA_M; 429 + isa2hwcap['a' - 'a'] = COMPAT_HWCAP_ISA_A; 430 + isa2hwcap['f' - 'a'] = COMPAT_HWCAP_ISA_F; 431 + isa2hwcap['d' - 'a'] = COMPAT_HWCAP_ISA_D; 432 + isa2hwcap['c' - 'a'] = COMPAT_HWCAP_ISA_C; 433 + isa2hwcap['v' - 'a'] = COMPAT_HWCAP_ISA_V; 434 + 435 + if (!acpi_disabled) { 436 + riscv_fill_hwcap_from_isa_string(isa2hwcap); 437 + } else { 438 + int ret = riscv_fill_hwcap_from_ext_list(isa2hwcap); 439 + 440 + if (ret && riscv_isa_fallback) { 441 + pr_info("Falling back to deprecated \"riscv,isa\"\n"); 442 + riscv_fill_hwcap_from_isa_string(isa2hwcap); 443 + } 444 + } 445 + 446 + /* 447 + * We don't support systems with F but without D, so mask those out 448 + * here. 449 + */ 413 450 if ((elf_hwcap & COMPAT_HWCAP_ISA_F) && !(elf_hwcap & COMPAT_HWCAP_ISA_D)) { 414 451 pr_info("This kernel does not support systems with F but not D\n"); 415 452 elf_hwcap &= ~COMPAT_HWCAP_ISA_F;
+1 -5
arch/riscv/kernel/head.S
··· 289 289 blt a3, a4, clear_bss 290 290 clear_bss_done: 291 291 #endif 292 - /* Save hart ID and DTB physical address */ 293 - mv s0, a0 294 - mv s1, a1 295 - 296 292 la a2, boot_cpu_hartid 297 293 XIP_FIXUP_OFFSET a2 298 294 REG_S a0, (a2) ··· 302 306 la a0, __dtb_start 303 307 XIP_FIXUP_OFFSET a0 304 308 #else 305 - mv a0, s1 309 + mv a0, a1 306 310 #endif /* CONFIG_BUILTIN_DTB */ 307 311 call setup_vm 308 312 #ifdef CONFIG_MMU
+7 -2
arch/riscv/kernel/mcount.S
··· 3 3 4 4 #include <linux/init.h> 5 5 #include <linux/linkage.h> 6 + #include <linux/cfi_types.h> 6 7 #include <asm/asm.h> 7 8 #include <asm/csr.h> 8 9 #include <asm/unistd.h> ··· 48 47 addi sp, sp, 4*SZREG 49 48 .endm 50 49 51 - ENTRY(ftrace_stub) 50 + SYM_TYPED_FUNC_START(ftrace_stub) 52 51 #ifdef CONFIG_DYNAMIC_FTRACE 53 52 .global MCOUNT_NAME 54 53 .set MCOUNT_NAME, ftrace_stub 55 54 #endif 56 55 ret 57 - ENDPROC(ftrace_stub) 56 + SYM_FUNC_END(ftrace_stub) 58 57 59 58 #ifdef CONFIG_FUNCTION_GRAPH_TRACER 59 + SYM_TYPED_FUNC_START(ftrace_stub_graph) 60 + ret 61 + SYM_FUNC_END(ftrace_stub_graph) 62 + 60 63 ENTRY(return_to_handler) 61 64 /* 62 65 * On implementing the frame point test, the ideal way is to compare the
+6 -5
arch/riscv/kernel/probes/decode-insn.c
··· 29 29 * TODO: the REJECTED ones below need to be implemented 30 30 */ 31 31 #ifdef CONFIG_RISCV_ISA_C 32 - RISCV_INSN_REJECTED(c_j, insn); 33 - RISCV_INSN_REJECTED(c_jr, insn); 34 32 RISCV_INSN_REJECTED(c_jal, insn); 35 - RISCV_INSN_REJECTED(c_jalr, insn); 36 - RISCV_INSN_REJECTED(c_beqz, insn); 37 - RISCV_INSN_REJECTED(c_bnez, insn); 38 33 RISCV_INSN_REJECTED(c_ebreak, insn); 34 + 35 + RISCV_INSN_SET_SIMULATE(c_j, insn); 36 + RISCV_INSN_SET_SIMULATE(c_jr, insn); 37 + RISCV_INSN_SET_SIMULATE(c_jalr, insn); 38 + RISCV_INSN_SET_SIMULATE(c_beqz, insn); 39 + RISCV_INSN_SET_SIMULATE(c_bnez, insn); 39 40 #endif 40 41 41 42 RISCV_INSN_SET_SIMULATE(jal, insn);
+105
arch/riscv/kernel/probes/simulate-insn.c
··· 188 188 189 189 return true; 190 190 } 191 + 192 + bool __kprobes simulate_c_j(u32 opcode, unsigned long addr, struct pt_regs *regs) 193 + { 194 + /* 195 + * 15 13 12 2 1 0 196 + * | funct3 | offset[11|4|9:8|10|6|7|3:1|5] | opcode | 197 + * 3 11 2 198 + */ 199 + 200 + s32 offset; 201 + 202 + offset = ((opcode >> 3) & 0x7) << 1; 203 + offset |= ((opcode >> 11) & 0x1) << 4; 204 + offset |= ((opcode >> 2) & 0x1) << 5; 205 + offset |= ((opcode >> 7) & 0x1) << 6; 206 + offset |= ((opcode >> 6) & 0x1) << 7; 207 + offset |= ((opcode >> 9) & 0x3) << 8; 208 + offset |= ((opcode >> 8) & 0x1) << 10; 209 + offset |= ((opcode >> 12) & 0x1) << 11; 210 + 211 + instruction_pointer_set(regs, addr + sign_extend32(offset, 11)); 212 + 213 + return true; 214 + } 215 + 216 + static bool __kprobes simulate_c_jr_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs, 217 + bool is_jalr) 218 + { 219 + /* 220 + * 15 12 11 7 6 2 1 0 221 + * | funct4 | rs1 | rs2 | op | 222 + * 4 5 5 2 223 + */ 224 + 225 + unsigned long jump_addr; 226 + 227 + u32 rs1 = (opcode >> 7) & 0x1f; 228 + 229 + if (rs1 == 0) /* C.JR is only valid when rs1 != x0 */ 230 + return false; 231 + 232 + if (!rv_insn_reg_get_val(regs, rs1, &jump_addr)) 233 + return false; 234 + 235 + if (is_jalr && !rv_insn_reg_set_val(regs, 1, addr + 2)) 236 + return false; 237 + 238 + instruction_pointer_set(regs, jump_addr); 239 + 240 + return true; 241 + } 242 + 243 + bool __kprobes simulate_c_jr(u32 opcode, unsigned long addr, struct pt_regs *regs) 244 + { 245 + return simulate_c_jr_jalr(opcode, addr, regs, false); 246 + } 247 + 248 + bool __kprobes simulate_c_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs) 249 + { 250 + return simulate_c_jr_jalr(opcode, addr, regs, true); 251 + } 252 + 253 + static bool __kprobes simulate_c_bnez_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs, 254 + bool is_bnez) 255 + { 256 + /* 257 + * 15 13 12 10 9 7 6 2 1 0 258 + * | funct3 | offset[8|4:3] | rs1' | offset[7:6|2:1|5] | op | 
259 + * 3 3 3 5 2 260 + */ 261 + 262 + s32 offset; 263 + u32 rs1; 264 + unsigned long rs1_val; 265 + 266 + rs1 = 0x8 | ((opcode >> 7) & 0x7); 267 + 268 + if (!rv_insn_reg_get_val(regs, rs1, &rs1_val)) 269 + return false; 270 + 271 + if ((rs1_val != 0 && is_bnez) || (rs1_val == 0 && !is_bnez)) { 272 + offset = ((opcode >> 3) & 0x3) << 1; 273 + offset |= ((opcode >> 10) & 0x3) << 3; 274 + offset |= ((opcode >> 2) & 0x1) << 5; 275 + offset |= ((opcode >> 5) & 0x3) << 6; 276 + offset |= ((opcode >> 12) & 0x1) << 8; 277 + offset = sign_extend32(offset, 8); 278 + } else { 279 + offset = 2; 280 + } 281 + 282 + instruction_pointer_set(regs, addr + offset); 283 + 284 + return true; 285 + } 286 + 287 + bool __kprobes simulate_c_bnez(u32 opcode, unsigned long addr, struct pt_regs *regs) 288 + { 289 + return simulate_c_bnez_beqz(opcode, addr, regs, true); 290 + } 291 + 292 + bool __kprobes simulate_c_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs) 293 + { 294 + return simulate_c_bnez_beqz(opcode, addr, regs, false); 295 + }
+5
arch/riscv/kernel/probes/simulate-insn.h
··· 24 24 bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs); 25 25 bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs); 26 26 bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs); 27 + bool simulate_c_j(u32 opcode, unsigned long addr, struct pt_regs *regs); 28 + bool simulate_c_jr(u32 opcode, unsigned long addr, struct pt_regs *regs); 29 + bool simulate_c_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs); 30 + bool simulate_c_bnez(u32 opcode, unsigned long addr, struct pt_regs *regs); 31 + bool simulate_c_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs); 27 32 28 33 #endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */
+6
arch/riscv/kernel/setup.c
··· 178 178 if (ret < 0) 179 179 goto error; 180 180 } 181 + if (crashk_low_res.start != crashk_low_res.end) { 182 + ret = add_resource(&iomem_resource, &crashk_low_res); 183 + if (ret < 0) 184 + goto error; 185 + } 181 186 #endif 182 187 183 188 #ifdef CONFIG_CRASH_DUMP ··· 316 311 if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) && 317 312 riscv_isa_extension_available(NULL, ZICBOM)) 318 313 riscv_noncoherent_supported(); 314 + riscv_set_dma_cache_alignment(); 319 315 } 320 316 321 317 static int __init topology_init(void)
+3 -2
arch/riscv/kernel/suspend_entry.S
··· 5 5 */ 6 6 7 7 #include <linux/linkage.h> 8 + #include <linux/cfi_types.h> 8 9 #include <asm/asm.h> 9 10 #include <asm/asm-offsets.h> 10 11 #include <asm/assembler.h> ··· 59 58 ret 60 59 END(__cpu_suspend_enter) 61 60 62 - ENTRY(__cpu_resume_enter) 61 + SYM_TYPED_FUNC_START(__cpu_resume_enter) 63 62 /* Load the global pointer */ 64 63 .option push 65 64 .option norelax ··· 95 94 96 95 /* Return to C code */ 97 96 ret 98 - END(__cpu_resume_enter) 97 + SYM_FUNC_END(__cpu_resume_enter)
+6
arch/riscv/kernel/sys_riscv.c
··· 335 335 return do_riscv_hwprobe(pairs, pair_count, cpu_count, 336 336 cpus, flags); 337 337 } 338 + 339 + /* Not defined using SYSCALL_DEFINE0 to avoid error injection */ 340 + asmlinkage long __riscv_sys_ni_syscall(const struct pt_regs *__unused) 341 + { 342 + return -ENOSYS; 343 + }
+6 -2
arch/riscv/kernel/syscall_table.c
··· 10 10 #include <asm/syscall.h> 11 11 12 12 #undef __SYSCALL 13 - #define __SYSCALL(nr, call) [nr] = (call), 13 + #define __SYSCALL(nr, call) asmlinkage long __riscv_##call(const struct pt_regs *); 14 + #include <asm/unistd.h> 15 + 16 + #undef __SYSCALL 17 + #define __SYSCALL(nr, call) [nr] = __riscv_##call, 14 18 15 19 void * const sys_call_table[__NR_syscalls] = { 16 - [0 ... __NR_syscalls - 1] = sys_ni_syscall, 20 + [0 ... __NR_syscalls - 1] = __riscv_sys_ni_syscall, 17 21 #include <asm/unistd.h> 18 22 };
+3 -1
arch/riscv/kernel/traps.c
··· 21 21 22 22 #include <asm/asm-prototypes.h> 23 23 #include <asm/bug.h> 24 + #include <asm/cfi.h> 24 25 #include <asm/csr.h> 25 26 #include <asm/processor.h> 26 27 #include <asm/ptrace.h> ··· 272 271 == NOTIFY_STOP) 273 272 return; 274 273 #endif 275 - else if (report_bug(regs->epc, regs) == BUG_TRAP_TYPE_WARN) 274 + else if (report_bug(regs->epc, regs) == BUG_TRAP_TYPE_WARN || 275 + handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) 276 276 regs->epc += get_break_insn_length(regs->epc); 277 277 else 278 278 die(regs, "Kernel BUG");
+1 -1
arch/riscv/mm/context.c
··· 67 67 lockdep_assert_held(&context_lock); 68 68 69 69 /* Update the list of reserved ASIDs and the ASID bitmap. */ 70 - bitmap_clear(context_asid_map, 0, num_asids); 70 + bitmap_zero(context_asid_map, num_asids); 71 71 72 72 /* Mark already active ASIDs as used */ 73 73 for_each_possible_cpu(i) {
+8
arch/riscv/mm/dma-noncoherent.c
··· 11 11 #include <asm/cacheflush.h> 12 12 13 13 static bool noncoherent_supported __ro_after_init; 14 + int dma_cache_alignment __ro_after_init = ARCH_DMA_MINALIGN; 15 + EXPORT_SYMBOL_GPL(dma_cache_alignment); 14 16 15 17 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size, 16 18 enum dma_data_direction dir) ··· 79 77 WARN(!riscv_cbom_block_size, 80 78 "Non-coherent DMA support enabled without a block size\n"); 81 79 noncoherent_supported = true; 80 + } 81 + 82 + void __init riscv_set_dma_cache_alignment(void) 83 + { 84 + if (!noncoherent_supported) 85 + dma_cache_alignment = 1; 82 86 }
+86 -7
arch/riscv/mm/init.c
··· 1299 1299 } 1300 1300 #endif /* CONFIG_MMU */ 1301 1301 1302 + /* Reserve 128M low memory by default for swiotlb buffer */ 1303 + #define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20) 1304 + 1305 + static int __init reserve_crashkernel_low(unsigned long long low_size) 1306 + { 1307 + unsigned long long low_base; 1308 + 1309 + low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit); 1310 + if (!low_base) { 1311 + pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size); 1312 + return -ENOMEM; 1313 + } 1314 + 1315 + pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n", 1316 + low_base, low_base + low_size, low_size >> 20); 1317 + 1318 + crashk_low_res.start = low_base; 1319 + crashk_low_res.end = low_base + low_size - 1; 1320 + 1321 + return 0; 1322 + } 1323 + 1302 1324 /* 1303 1325 * reserve_crashkernel() - reserves memory for crash kernel 1304 1326 * ··· 1332 1310 { 1333 1311 unsigned long long crash_base = 0; 1334 1312 unsigned long long crash_size = 0; 1313 + unsigned long long crash_low_size = 0; 1335 1314 unsigned long search_start = memblock_start_of_DRAM(); 1336 - unsigned long search_end = memblock_end_of_DRAM(); 1315 + unsigned long search_end = (unsigned long)dma32_phys_limit; 1316 + char *cmdline = boot_command_line; 1317 + bool fixed_base = false; 1318 + bool high = false; 1337 1319 1338 1320 int ret = 0; 1339 1321 ··· 1353 1327 return; 1354 1328 } 1355 1329 1356 - ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(), 1330 + ret = parse_crashkernel(cmdline, memblock_phys_mem_size(), 1357 1331 &crash_size, &crash_base); 1358 - if (ret || !crash_size) 1332 + if (ret == -ENOENT) { 1333 + /* Fallback to crashkernel=X,[high,low] */ 1334 + ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base); 1335 + if (ret || !crash_size) 1336 + return; 1337 + 1338 + /* 1339 + * crashkernel=Y,low is valid only when crashkernel=X,high 1340 + * is passed. 
1341 + */ 1342 + ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base); 1343 + if (ret == -ENOENT) 1344 + crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE; 1345 + else if (ret) 1346 + return; 1347 + 1348 + search_start = (unsigned long)dma32_phys_limit; 1349 + search_end = memblock_end_of_DRAM(); 1350 + high = true; 1351 + } else if (ret || !crash_size) { 1352 + /* Invalid argument value specified */ 1359 1353 return; 1354 + } 1360 1355 1361 1356 crash_size = PAGE_ALIGN(crash_size); 1362 1357 1363 1358 if (crash_base) { 1359 + fixed_base = true; 1364 1360 search_start = crash_base; 1365 1361 search_end = crash_base + crash_size; 1366 1362 } ··· 1395 1347 * swiotlb can work on the crash kernel. 1396 1348 */ 1397 1349 crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE, 1398 - search_start, 1399 - min(search_end, (unsigned long)(SZ_4G - 1))); 1350 + search_start, search_end); 1400 1351 if (crash_base == 0) { 1401 - /* Try again without restricting region to 32bit addressible memory */ 1352 + /* 1353 + * For crashkernel=size[KMG]@offset[KMG], print out failure 1354 + * message if can't reserve the specified region. 1355 + */ 1356 + if (fixed_base) { 1357 + pr_warn("crashkernel: allocating failed with given size@offset\n"); 1358 + return; 1359 + } 1360 + 1361 + if (high) { 1362 + /* 1363 + * For crashkernel=size[KMG],high, if the first attempt was 1364 + * for high memory, fall back to low memory. 1365 + */ 1366 + search_start = memblock_start_of_DRAM(); 1367 + search_end = (unsigned long)dma32_phys_limit; 1368 + } else { 1369 + /* 1370 + * For crashkernel=size[KMG], if the first attempt was for 1371 + * low memory, fall back to high memory, the minimum required 1372 + * low memory will be reserved later. 
1373 + */ 1374 + search_start = (unsigned long)dma32_phys_limit; 1375 + search_end = memblock_end_of_DRAM(); 1376 + crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE; 1377 + } 1378 + 1402 1379 crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE, 1403 - search_start, search_end); 1380 + search_start, search_end); 1404 1381 if (crash_base == 0) { 1405 1382 pr_warn("crashkernel: couldn't allocate %lldKB\n", 1406 1383 crash_size >> 10); 1407 1384 return; 1408 1385 } 1386 + } 1387 + 1388 + if ((crash_base >= dma32_phys_limit) && crash_low_size && 1389 + reserve_crashkernel_low(crash_low_size)) { 1390 + memblock_phys_free(crash_base, crash_size); 1391 + return; 1409 1392 } 1410 1393 1411 1394 pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
+4 -4
arch/riscv/mm/kasan_init.c
··· 22 22 * region is not and then we have to go down to the PUD level. 23 23 */ 24 24 25 - pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss; 26 - p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss; 27 - pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss; 25 + static pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss; 26 + static p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss; 27 + static pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss; 28 28 29 29 static void __init kasan_populate_pte(pmd_t *pmd, unsigned long vaddr, unsigned long end) 30 30 { ··· 438 438 kasan_shallow_populate_pgd(vaddr, vend); 439 439 } 440 440 441 - static void create_tmp_mapping(void) 441 + static void __init create_tmp_mapping(void) 442 442 { 443 443 void *ptr; 444 444 p4d_t *base_p4d;
+4
arch/riscv/purgatory/Makefile
··· 77 77 PURGATORY_CFLAGS_REMOVE += -fstack-protector-strong 78 78 endif 79 79 80 + ifdef CONFIG_CFI_CLANG 81 + PURGATORY_CFLAGS_REMOVE += $(CC_FLAGS_CFI) 82 + endif 83 + 80 84 CFLAGS_REMOVE_purgatory.o += $(PURGATORY_CFLAGS_REMOVE) 81 85 CFLAGS_purgatory.o += $(PURGATORY_CFLAGS) 82 86
+113
drivers/perf/riscv_pmu.c
··· 14 14 #include <linux/perf/riscv_pmu.h> 15 15 #include <linux/printk.h> 16 16 #include <linux/smp.h> 17 + #include <linux/sched_clock.h> 17 18 18 19 #include <asm/sbi.h> 20 + 21 + static bool riscv_perf_user_access(struct perf_event *event) 22 + { 23 + return ((event->attr.type == PERF_TYPE_HARDWARE) || 24 + (event->attr.type == PERF_TYPE_HW_CACHE) || 25 + (event->attr.type == PERF_TYPE_RAW)) && 26 + !!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT); 27 + } 28 + 29 + void arch_perf_update_userpage(struct perf_event *event, 30 + struct perf_event_mmap_page *userpg, u64 now) 31 + { 32 + struct clock_read_data *rd; 33 + unsigned int seq; 34 + u64 ns; 35 + 36 + userpg->cap_user_time = 0; 37 + userpg->cap_user_time_zero = 0; 38 + userpg->cap_user_time_short = 0; 39 + userpg->cap_user_rdpmc = riscv_perf_user_access(event); 40 + 41 + #ifdef CONFIG_RISCV_PMU 42 + /* 43 + * The counters are 64-bit but the priv spec doesn't mandate all the 44 + * bits to be implemented: that's why, counter width can vary based on 45 + * the cpu vendor. 46 + */ 47 + if (userpg->cap_user_rdpmc) 48 + userpg->pmc_width = to_riscv_pmu(event->pmu)->ctr_get_width(event->hw.idx) + 1; 49 + #endif 50 + 51 + do { 52 + rd = sched_clock_read_begin(&seq); 53 + 54 + userpg->time_mult = rd->mult; 55 + userpg->time_shift = rd->shift; 56 + userpg->time_zero = rd->epoch_ns; 57 + userpg->time_cycles = rd->epoch_cyc; 58 + userpg->time_mask = rd->sched_clock_mask; 59 + 60 + /* 61 + * Subtract the cycle base, such that software that 62 + * doesn't know about cap_user_time_short still 'works' 63 + * assuming no wraps. 
64 + */ 65 + ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift); 66 + userpg->time_zero -= ns; 67 + 68 + } while (sched_clock_read_retry(seq)); 69 + 70 + userpg->time_offset = userpg->time_zero - now; 71 + 72 + /* 73 + * time_shift is not expected to be greater than 31 due to 74 + * the original published conversion algorithm shifting a 75 + * 32-bit value (now specifies a 64-bit value) - refer 76 + * perf_event_mmap_page documentation in perf_event.h. 77 + */ 78 + if (userpg->time_shift == 32) { 79 + userpg->time_shift = 31; 80 + userpg->time_mult >>= 1; 81 + } 82 + 83 + /* 84 + * Internal timekeeping for enabled/running/stopped times 85 + * is always computed with the sched_clock. 86 + */ 87 + userpg->cap_user_time = 1; 88 + userpg->cap_user_time_zero = 1; 89 + userpg->cap_user_time_short = 1; 90 + } 19 91 20 92 static unsigned long csr_read_num(int csr_num) 21 93 { ··· 243 171 244 172 local64_set(&hwc->prev_count, (u64)-left); 245 173 174 + perf_event_update_userpage(event); 175 + 246 176 return overflow; 247 177 } 248 178 ··· 338 264 hwc->idx = -1; 339 265 hwc->event_base = mapped_event; 340 266 267 + if (rvpmu->event_init) 268 + rvpmu->event_init(event); 269 + 341 270 if (!is_sampling_event(event)) { 342 271 /* 343 272 * For non-sampling runs, limit the sample_period to half ··· 355 278 } 356 279 357 280 return 0; 281 + } 282 + 283 + static int riscv_pmu_event_idx(struct perf_event *event) 284 + { 285 + struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); 286 + 287 + if (!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT)) 288 + return 0; 289 + 290 + if (rvpmu->csr_index) 291 + return rvpmu->csr_index(event) + 1; 292 + 293 + return 0; 294 + } 295 + 296 + static void riscv_pmu_event_mapped(struct perf_event *event, struct mm_struct *mm) 297 + { 298 + struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); 299 + 300 + if (rvpmu->event_mapped) { 301 + rvpmu->event_mapped(event, mm); 302 + perf_event_update_userpage(event); 303 + } 304 + } 305 + 306 + static void 
riscv_pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) 307 + { 308 + struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); 309 + 310 + if (rvpmu->event_unmapped) { 311 + rvpmu->event_unmapped(event, mm); 312 + perf_event_update_userpage(event); 313 + } 358 314 } 359 315 360 316 struct riscv_pmu *riscv_pmu_alloc(void) ··· 414 304 } 415 305 pmu->pmu = (struct pmu) { 416 306 .event_init = riscv_pmu_event_init, 307 + .event_mapped = riscv_pmu_event_mapped, 308 + .event_unmapped = riscv_pmu_event_unmapped, 309 + .event_idx = riscv_pmu_event_idx, 417 310 .add = riscv_pmu_add, 418 311 .del = riscv_pmu_del, 419 312 .start = riscv_pmu_start,
+27 -1
drivers/perf/riscv_pmu_legacy.c
··· 13 13 #include <linux/platform_device.h> 14 14 15 15 #define RISCV_PMU_LEGACY_CYCLE 0 16 - #define RISCV_PMU_LEGACY_INSTRET 1 16 + #define RISCV_PMU_LEGACY_INSTRET 2 17 17 18 18 static bool pmu_init_done; 19 19 ··· 71 71 local64_set(&hwc->prev_count, initial_val); 72 72 } 73 73 74 + static uint8_t pmu_legacy_csr_index(struct perf_event *event) 75 + { 76 + return event->hw.idx; 77 + } 78 + 79 + static void pmu_legacy_event_mapped(struct perf_event *event, struct mm_struct *mm) 80 + { 81 + if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES && 82 + event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) 83 + return; 84 + 85 + event->hw.flags |= PERF_EVENT_FLAG_USER_READ_CNT; 86 + } 87 + 88 + static void pmu_legacy_event_unmapped(struct perf_event *event, struct mm_struct *mm) 89 + { 90 + if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES && 91 + event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) 92 + return; 93 + 94 + event->hw.flags &= ~PERF_EVENT_FLAG_USER_READ_CNT; 95 + } 96 + 74 97 /* 75 98 * This is just a simple implementation to allow legacy implementations 76 99 * compatible with new RISC-V PMU driver framework. ··· 114 91 pmu->ctr_get_width = NULL; 115 92 pmu->ctr_clear_idx = NULL; 116 93 pmu->ctr_read = pmu_legacy_read_ctr; 94 + pmu->event_mapped = pmu_legacy_event_mapped; 95 + pmu->event_unmapped = pmu_legacy_event_unmapped; 96 + pmu->csr_index = pmu_legacy_csr_index; 117 97 118 98 perf_pmu_register(&pmu->pmu, "cpu", PERF_TYPE_RAW); 119 99 }
+188 -8
drivers/perf/riscv_pmu_sbi.c
··· 24 24 #include <asm/sbi.h> 25 25 #include <asm/hwcap.h> 26 26 27 + #define SYSCTL_NO_USER_ACCESS 0 28 + #define SYSCTL_USER_ACCESS 1 29 + #define SYSCTL_LEGACY 2 30 + 31 + #define PERF_EVENT_FLAG_NO_USER_ACCESS BIT(SYSCTL_NO_USER_ACCESS) 32 + #define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS) 33 + #define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY) 34 + 27 35 PMU_FORMAT_ATTR(event, "config:0-47"); 28 36 PMU_FORMAT_ATTR(firmware, "config:63"); 29 37 ··· 50 42 &riscv_pmu_format_group, 51 43 NULL, 52 44 }; 45 + 46 + /* Allow user mode access by default */ 47 + static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS; 53 48 54 49 /* 55 50 * RISC-V doesn't have heterogeneous harts yet. This need to be part of ··· 312 301 } 313 302 EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info); 314 303 304 + static uint8_t pmu_sbi_csr_index(struct perf_event *event) 305 + { 306 + return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE; 307 + } 308 + 315 309 static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event) 316 310 { 317 311 unsigned long cflags = 0; ··· 345 329 struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events); 346 330 struct sbiret ret; 347 331 int idx; 348 - uint64_t cbase = 0; 332 + uint64_t cbase = 0, cmask = rvpmu->cmask; 349 333 unsigned long cflags = 0; 350 334 351 335 cflags = pmu_sbi_get_filter_flags(event); 336 + 337 + /* 338 + * In legacy mode, we have to force the fixed counters for those events 339 + * but not in the user access mode as we want to use the other counters 340 + * that support sampling/filtering. 
341 + */ 342 + if (hwc->flags & PERF_EVENT_FLAG_LEGACY) { 343 + if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) { 344 + cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH; 345 + cmask = 1; 346 + } else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) { 347 + cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH; 348 + cmask = 1UL << (CSR_INSTRET - CSR_CYCLE); 349 + } 350 + } 351 + 352 352 /* retrieve the available counter index */ 353 353 #if defined(CONFIG_32BIT) 354 354 ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, 355 - rvpmu->cmask, cflags, hwc->event_base, hwc->config, 355 + cmask, cflags, hwc->event_base, hwc->config, 356 356 hwc->config >> 32); 357 357 #else 358 358 ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, 359 - rvpmu->cmask, cflags, hwc->event_base, hwc->config, 0); 359 + cmask, cflags, hwc->event_base, hwc->config, 0); 360 360 #endif 361 361 if (ret.error) { 362 362 pr_debug("Not able to find a counter for event %lx config %llx\n", ··· 506 474 return val; 507 475 } 508 476 477 + static void pmu_sbi_set_scounteren(void *arg) 478 + { 479 + struct perf_event *event = (struct perf_event *)arg; 480 + 481 + csr_write(CSR_SCOUNTEREN, 482 + csr_read(CSR_SCOUNTEREN) | (1 << pmu_sbi_csr_index(event))); 483 + } 484 + 485 + static void pmu_sbi_reset_scounteren(void *arg) 486 + { 487 + struct perf_event *event = (struct perf_event *)arg; 488 + 489 + csr_write(CSR_SCOUNTEREN, 490 + csr_read(CSR_SCOUNTEREN) & ~(1 << pmu_sbi_csr_index(event))); 491 + } 492 + 509 493 static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival) 510 494 { 511 495 struct sbiret ret; ··· 538 490 if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED)) 539 491 pr_err("Starting counter idx %d failed with error %d\n", 540 492 hwc->idx, sbi_err_map_linux_errno(ret.error)); 493 + 494 + if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && 495 + (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) 496 + pmu_sbi_set_scounteren((void *)event); 541 497 } 542 498 543 499 static void 
pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) 544 500 { 545 501 struct sbiret ret; 546 502 struct hw_perf_event *hwc = &event->hw; 503 + 504 + if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && 505 + (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) 506 + pmu_sbi_reset_scounteren((void *)event); 547 507 548 508 ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, hwc->idx, 1, flag, 0, 0, 0); 549 509 if (ret.error && (ret.error != SBI_ERR_ALREADY_STOPPED) && ··· 760 704 struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); 761 705 762 706 /* 763 - * Enable the access for CYCLE, TIME, and INSTRET CSRs from userspace, 764 - * as is necessary to maintain uABI compatibility. 707 + * We keep enabling userspace access to CYCLE, TIME and INSTRET via the 708 + * legacy option but that will be removed in the future. 765 709 */ 766 - csr_write(CSR_SCOUNTEREN, 0x7); 710 + if (sysctl_perf_user_access == SYSCTL_LEGACY) 711 + csr_write(CSR_SCOUNTEREN, 0x7); 712 + else 713 + csr_write(CSR_SCOUNTEREN, 0x2); 767 714 768 715 /* Stop all the counters so that they can be enabled from perf */ 769 716 pmu_sbi_stop_all(pmu); ··· 897 838 cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node); 898 839 } 899 840 841 + static void pmu_sbi_event_init(struct perf_event *event) 842 + { 843 + /* 844 + * The permissions are set at event_init so that we do not depend 845 + * on the sysctl value that can change. 
846 + */ 847 + if (sysctl_perf_user_access == SYSCTL_NO_USER_ACCESS) 848 + event->hw.flags |= PERF_EVENT_FLAG_NO_USER_ACCESS; 849 + else if (sysctl_perf_user_access == SYSCTL_USER_ACCESS) 850 + event->hw.flags |= PERF_EVENT_FLAG_USER_ACCESS; 851 + else 852 + event->hw.flags |= PERF_EVENT_FLAG_LEGACY; 853 + } 854 + 855 + static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm) 856 + { 857 + if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS) 858 + return; 859 + 860 + if (event->hw.flags & PERF_EVENT_FLAG_LEGACY) { 861 + if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES && 862 + event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) { 863 + return; 864 + } 865 + } 866 + 867 + /* 868 + * The user mmapped the event to directly access it: this is where 869 + * we determine based on sysctl_perf_user_access if we grant userspace 870 + * the direct access to this event. That means that within the same 871 + * task, some events may be directly accessible and some other may not, 872 + * if the user changes the value of sysctl_perf_user_access in the 873 + * meantime. 
884 + */ 885 + if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS) 886 + on_each_cpu_mask(mm_cpumask(mm), 887 + pmu_sbi_set_scounteren, (void *)event, 1); 888 + } 889 + 890 + static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm) 891 + { 892 + if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS) 893 + return; 894 + 895 + if (event->hw.flags & PERF_EVENT_FLAG_LEGACY) { 896 + if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES && 897 + event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) { 898 + return; 899 + } 900 + } 901 + 902 + /* 903 + * Here we can directly remove user access since the user does not have 904 + * access to the user page anymore so we avoid the racy window where the 905 + * user could have read cap_user_rdpmc to true right before we disable 906 + * it. 907 + */ 908 + event->hw.flags &= ~PERF_EVENT_FLAG_USER_READ_CNT; 909 + 910 + if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS) 911 + on_each_cpu_mask(mm_cpumask(mm), 912 + pmu_sbi_reset_scounteren, (void *)event, 1); 913 + } 914 + 915 + static void riscv_pmu_update_counter_access(void *info) 916 + { 917 + if (sysctl_perf_user_access == SYSCTL_LEGACY) 918 + csr_write(CSR_SCOUNTEREN, 0x7); 919 + else 920 + csr_write(CSR_SCOUNTEREN, 0x2); 921 + } 922 + 923 + static int riscv_pmu_proc_user_access_handler(struct ctl_table *table, 924 + int write, void *buffer, 925 + size_t *lenp, loff_t *ppos) 926 + { 927 + int prev = sysctl_perf_user_access; 928 + int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); 929 + 930 + /* 931 + * Test against the previous value since we clear SCOUNTEREN when 932 + * sysctl_perf_user_access is set to SYSCTL_USER_ACCESS, but we should 933 + * not do that if that was already the case. 
934 + */ 935 + if (ret || !write || prev == sysctl_perf_user_access) 936 + return ret; 937 + 938 + on_each_cpu(riscv_pmu_update_counter_access, NULL, 1); 939 + 940 + return 0; 941 + } 942 + 943 + static struct ctl_table sbi_pmu_sysctl_table[] = { 944 + { 945 + .procname = "perf_user_access", 946 + .data = &sysctl_perf_user_access, 947 + .maxlen = sizeof(unsigned int), 948 + .mode = 0644, 949 + .proc_handler = riscv_pmu_proc_user_access_handler, 950 + .extra1 = SYSCTL_ZERO, 951 + .extra2 = SYSCTL_TWO, 952 + }, 953 + { } 954 + }; 955 + 900 956 static int pmu_sbi_device_probe(struct platform_device *pdev) 901 957 { 902 958 struct riscv_pmu *pmu = NULL; ··· 1055 881 pmu->ctr_get_width = pmu_sbi_ctr_get_width; 1056 882 pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx; 1057 883 pmu->ctr_read = pmu_sbi_ctr_read; 884 + pmu->event_init = pmu_sbi_event_init; 885 + pmu->event_mapped = pmu_sbi_event_mapped; 886 + pmu->event_unmapped = pmu_sbi_event_unmapped; 887 + pmu->csr_index = pmu_sbi_csr_index; 1058 888 1059 889 ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node); 1060 890 if (ret) ··· 1071 893 ret = perf_pmu_register(&pmu->pmu, "cpu", PERF_TYPE_RAW); 1072 894 if (ret) 1073 895 goto out_unregister; 896 + 897 + register_sysctl("kernel", sbi_pmu_sysctl_table); 1074 898 1075 899 return 0; 1076 900 ··· 1087 907 static struct platform_driver pmu_sbi_driver = { 1088 908 .probe = pmu_sbi_device_probe, 1089 909 .driver = { 1090 - .name = RISCV_PMU_PDEV_NAME, 910 + .name = RISCV_PMU_SBI_PDEV_NAME, 1091 911 }, 1092 912 }; 1093 913 ··· 1114 934 if (ret) 1115 935 return ret; 1116 936 1117 - pdev = platform_device_register_simple(RISCV_PMU_PDEV_NAME, -1, NULL, 0); 937 + pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0); 1118 938 if (IS_ERR(pdev)) { 1119 939 platform_driver_unregister(&pmu_sbi_driver); 1120 940 return PTR_ERR(pdev);
+1 -1
fs/Kconfig.binfmt
··· 58 58 config BINFMT_ELF_FDPIC 59 59 bool "Kernel support for FDPIC ELF binaries" 60 60 default y if !BINFMT_ELF 61 - depends on ARM || ((M68K || SUPERH || XTENSA) && !MMU) 61 + depends on ARM || ((M68K || RISCV || SUPERH || XTENSA) && !MMU) 62 62 select ELFCORE 63 63 help 64 64 ELF FDPIC binaries are based on ELF, but allow the individual load
+19 -19
fs/binfmt_elf_fdpic.c
··· 138 138 static int elf_fdpic_fetch_phdrs(struct elf_fdpic_params *params, 139 139 struct file *file) 140 140 { 141 - struct elf32_phdr *phdr; 141 + struct elf_phdr *phdr; 142 142 unsigned long size; 143 143 int retval, loop; 144 144 loff_t pos = params->hdr.e_phoff; ··· 560 560 sp &= ~7UL; 561 561 562 562 /* stack the load map(s) */ 563 - len = sizeof(struct elf32_fdpic_loadmap); 564 - len += sizeof(struct elf32_fdpic_loadseg) * exec_params->loadmap->nsegs; 563 + len = sizeof(struct elf_fdpic_loadmap); 564 + len += sizeof(struct elf_fdpic_loadseg) * exec_params->loadmap->nsegs; 565 565 sp = (sp - len) & ~7UL; 566 566 exec_params->map_addr = sp; 567 567 ··· 571 571 current->mm->context.exec_fdpic_loadmap = (unsigned long) sp; 572 572 573 573 if (interp_params->loadmap) { 574 - len = sizeof(struct elf32_fdpic_loadmap); 575 - len += sizeof(struct elf32_fdpic_loadseg) * 574 + len = sizeof(struct elf_fdpic_loadmap); 575 + len += sizeof(struct elf_fdpic_loadseg) * 576 576 interp_params->loadmap->nsegs; 577 577 sp = (sp - len) & ~7UL; 578 578 interp_params->map_addr = sp; ··· 740 740 struct mm_struct *mm, 741 741 const char *what) 742 742 { 743 - struct elf32_fdpic_loadmap *loadmap; 743 + struct elf_fdpic_loadmap *loadmap; 744 744 #ifdef CONFIG_MMU 745 - struct elf32_fdpic_loadseg *mseg; 745 + struct elf_fdpic_loadseg *mseg; 746 746 unsigned long load_addr; 747 747 #endif 748 - struct elf32_fdpic_loadseg *seg; 749 - struct elf32_phdr *phdr; 748 + struct elf_fdpic_loadseg *seg; 749 + struct elf_phdr *phdr; 750 750 unsigned nloads, tmp; 751 751 unsigned long stop; 752 752 int loop, ret; ··· 766 766 767 767 params->loadmap = loadmap; 768 768 769 - loadmap->version = ELF32_FDPIC_LOADMAP_VERSION; 769 + loadmap->version = ELF_FDPIC_LOADMAP_VERSION; 770 770 loadmap->nsegs = nloads; 771 771 772 772 /* map the requested LOADs into the memory space */ ··· 839 839 if (phdr->p_vaddr >= seg->p_vaddr && 840 840 phdr->p_vaddr + phdr->p_memsz <= 841 841 seg->p_vaddr + seg->p_memsz) { 
842 - Elf32_Dyn __user *dyn; 843 - Elf32_Sword d_tag; 842 + Elf_Dyn __user *dyn; 843 + Elf_Sword d_tag; 844 844 845 845 params->dynamic_addr = 846 846 (phdr->p_vaddr - seg->p_vaddr) + ··· 850 850 * one item, and that the last item is a NULL 851 851 * entry */ 852 852 if (phdr->p_memsz == 0 || 853 - phdr->p_memsz % sizeof(Elf32_Dyn) != 0) 853 + phdr->p_memsz % sizeof(Elf_Dyn) != 0) 854 854 goto dynamic_error; 855 855 856 - tmp = phdr->p_memsz / sizeof(Elf32_Dyn); 857 - dyn = (Elf32_Dyn __user *)params->dynamic_addr; 856 + tmp = phdr->p_memsz / sizeof(Elf_Dyn); 857 + dyn = (Elf_Dyn __user *)params->dynamic_addr; 858 858 if (get_user(d_tag, &dyn[tmp - 1].d_tag) || 859 859 d_tag != 0) 860 860 goto dynamic_error; ··· 923 923 struct file *file, 924 924 struct mm_struct *mm) 925 925 { 926 - struct elf32_fdpic_loadseg *seg; 927 - struct elf32_phdr *phdr; 926 + struct elf_fdpic_loadseg *seg; 927 + struct elf_phdr *phdr; 928 928 unsigned long load_addr, base = ULONG_MAX, top = 0, maddr = 0; 929 929 int loop, ret; 930 930 ··· 1007 1007 struct file *file, 1008 1008 struct mm_struct *mm) 1009 1009 { 1010 - struct elf32_fdpic_loadseg *seg; 1011 - struct elf32_phdr *phdr; 1010 + struct elf_fdpic_loadseg *seg; 1011 + struct elf_phdr *phdr; 1012 1012 unsigned long load_addr, delta_vaddr; 1013 1013 int loop, dvset; 1014 1014
+13 -1
include/asm-generic/preempt.h
··· 80 80 81 81 #ifdef CONFIG_PREEMPTION 82 82 extern asmlinkage void preempt_schedule(void); 83 - #define __preempt_schedule() preempt_schedule() 84 83 extern asmlinkage void preempt_schedule_notrace(void); 84 + 85 + #if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY) 86 + 87 + void dynamic_preempt_schedule(void); 88 + void dynamic_preempt_schedule_notrace(void); 89 + #define __preempt_schedule() dynamic_preempt_schedule() 90 + #define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace() 91 + 92 + #else /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_KEY*/ 93 + 94 + #define __preempt_schedule() preempt_schedule() 85 95 #define __preempt_schedule_notrace() preempt_schedule_notrace() 96 + 97 + #endif /* CONFIG_PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_KEY*/ 86 98 #endif /* CONFIG_PREEMPTION */ 87 99 88 100 #endif /* __ASM_PREEMPT_H */
+13 -1
include/linux/elf-fdpic.h
··· 10 10 11 11 #include <uapi/linux/elf-fdpic.h> 12 12 13 + #if ELF_CLASS == ELFCLASS32 14 + #define Elf_Sword Elf32_Sword 15 + #define elf_fdpic_loadseg elf32_fdpic_loadseg 16 + #define elf_fdpic_loadmap elf32_fdpic_loadmap 17 + #define ELF_FDPIC_LOADMAP_VERSION ELF32_FDPIC_LOADMAP_VERSION 18 + #else 19 + #define Elf_Sword Elf64_Sxword 20 + #define elf_fdpic_loadmap elf64_fdpic_loadmap 21 + #define elf_fdpic_loadseg elf64_fdpic_loadseg 22 + #define ELF_FDPIC_LOADMAP_VERSION ELF64_FDPIC_LOADMAP_VERSION 23 + #endif 24 + 13 25 /* 14 26 * binfmt binary parameters structure 15 27 */ 16 28 struct elf_fdpic_params { 17 29 struct elfhdr hdr; /* ref copy of ELF header */ 18 30 struct elf_phdr *phdrs; /* ref copy of PT_PHDR table */ 19 - struct elf32_fdpic_loadmap *loadmap; /* loadmap to be passed to userspace */ 31 + struct elf_fdpic_loadmap *loadmap; /* loadmap to be passed to userspace */ 20 32 unsigned long elfhdr_addr; /* mapped ELF header user address */ 21 33 unsigned long ph_addr; /* mapped PT_PHDR user address */ 22 34 unsigned long map_addr; /* mapped loadmap user address */
+8 -4
include/linux/perf/riscv_pmu.h
··· 6 6 * 7 7 */ 8 8 9 - #ifndef _ASM_RISCV_PERF_EVENT_H 10 - #define _ASM_RISCV_PERF_EVENT_H 9 + #ifndef _RISCV_PMU_H 10 + #define _RISCV_PMU_H 11 11 12 12 #include <linux/perf_event.h> 13 13 #include <linux/ptrace.h> ··· 21 21 22 22 #define RISCV_MAX_COUNTERS 64 23 23 #define RISCV_OP_UNSUPP (-EOPNOTSUPP) 24 - #define RISCV_PMU_PDEV_NAME "riscv-pmu" 24 + #define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi" 25 25 #define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy" 26 26 27 27 #define RISCV_PMU_STOP_FLAG_RESET 1 ··· 55 55 void (*ctr_start)(struct perf_event *event, u64 init_val); 56 56 void (*ctr_stop)(struct perf_event *event, unsigned long flag); 57 57 int (*event_map)(struct perf_event *event, u64 *config); 58 + void (*event_init)(struct perf_event *event); 59 + void (*event_mapped)(struct perf_event *event, struct mm_struct *mm); 60 + void (*event_unmapped)(struct perf_event *event, struct mm_struct *mm); 61 + uint8_t (*csr_index)(struct perf_event *event); 58 62 59 63 struct cpu_hw_events __percpu *hw_events; 60 64 struct hlist_node node; ··· 85 81 86 82 #endif /* CONFIG_RISCV_PMU */ 87 83 88 - #endif /* _ASM_RISCV_PERF_EVENT_H */ 84 + #endif /* _RISCV_PMU_H */
+2 -1
include/linux/perf_event.h
··· 444 444 445 445 /* 446 446 * Will return the value for perf_event_mmap_page::index for this event, 447 - * if no implementation is provided it will default to: event->hw.idx + 1. 447 + * if no implementation is provided it will default to 0 (see 448 + * perf_event_idx_default). 448 449 */ 449 450 int (*event_idx) (struct perf_event *event); /*optional */ 450 451
+15
include/uapi/linux/elf-fdpic.h
··· 32 32 33 33 #define ELF32_FDPIC_LOADMAP_VERSION 0x0000 34 34 35 + /* segment mappings for ELF FDPIC libraries/executables/interpreters */ 36 + struct elf64_fdpic_loadseg { 37 + Elf64_Addr addr; /* core address to which mapped */ 38 + Elf64_Addr p_vaddr; /* VMA recorded in file */ 39 + Elf64_Word p_memsz; /* allocation size recorded in file */ 40 + }; 41 + 42 + struct elf64_fdpic_loadmap { 43 + Elf64_Half version; /* version of these structures, just in case... */ 44 + Elf64_Half nsegs; /* number of segments */ 45 + struct elf64_fdpic_loadseg segs[]; 46 + }; 47 + 48 + #define ELF64_FDPIC_LOADMAP_VERSION 0x0000 49 + 35 50 #endif /* _UAPI_LINUX_ELF_FDPIC_H */
+6 -1
lib/Kconfig.debug
··· 355 355 config DEBUG_INFO_SPLIT 356 356 bool "Produce split debuginfo in .dwo files" 357 357 depends on $(cc-option,-gsplit-dwarf) 358 + # RISC-V linker relaxation + -gsplit-dwarf has issues with LLVM and GCC 359 + # prior to 12.x: 360 + # https://github.com/llvm/llvm-project/issues/56642 361 + # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99090 362 + depends on !RISCV || GCC_VERSION >= 120000 358 363 help 359 364 Generate debug info into separate .dwo files. This significantly 360 365 reduces the build directory size for builds with DEBUG_INFO, ··· 507 502 508 503 config DEBUG_FORCE_FUNCTION_ALIGN_64B 509 504 bool "Force all function address 64B aligned" 510 - depends on EXPERT && (X86_64 || ARM64 || PPC32 || PPC64 || ARC || S390) 505 + depends on EXPERT && (X86_64 || ARM64 || PPC32 || PPC64 || ARC || RISCV || S390) 511 506 select FUNCTION_ALIGNMENT_64B 512 507 help 513 508 There are cases that a commit from one domain changes the function
+66
tools/lib/perf/mmap.c
··· 392 392 393 393 static u64 read_timestamp(void) { return read_sysreg(cntvct_el0); } 394 394 395 + /* __riscv_xlen contains the width of the native base integer, here 64-bit */ 396 + #elif defined(__riscv) && __riscv_xlen == 64 397 + 398 + /* TODO: implement rv32 support */ 399 + 400 + #define CSR_CYCLE 0xc00 401 + #define CSR_TIME 0xc01 402 + 403 + #define csr_read(csr) \ 404 + ({ \ 405 + register unsigned long __v; \ 406 + __asm__ __volatile__ ("csrr %0, %1" \ 407 + : "=r" (__v) \ 408 + : "i" (csr) : ); \ 409 + __v; \ 410 + }) 411 + 412 + static unsigned long csr_read_num(int csr_num) 413 + { 414 + #define switchcase_csr_read(__csr_num, __val) {\ 415 + case __csr_num: \ 416 + __val = csr_read(__csr_num); \ 417 + break; } 418 + #define switchcase_csr_read_2(__csr_num, __val) {\ 419 + switchcase_csr_read(__csr_num + 0, __val) \ 420 + switchcase_csr_read(__csr_num + 1, __val)} 421 + #define switchcase_csr_read_4(__csr_num, __val) {\ 422 + switchcase_csr_read_2(__csr_num + 0, __val) \ 423 + switchcase_csr_read_2(__csr_num + 2, __val)} 424 + #define switchcase_csr_read_8(__csr_num, __val) {\ 425 + switchcase_csr_read_4(__csr_num + 0, __val) \ 426 + switchcase_csr_read_4(__csr_num + 4, __val)} 427 + #define switchcase_csr_read_16(__csr_num, __val) {\ 428 + switchcase_csr_read_8(__csr_num + 0, __val) \ 429 + switchcase_csr_read_8(__csr_num + 8, __val)} 430 + #define switchcase_csr_read_32(__csr_num, __val) {\ 431 + switchcase_csr_read_16(__csr_num + 0, __val) \ 432 + switchcase_csr_read_16(__csr_num + 16, __val)} 433 + 434 + unsigned long ret = 0; 435 + 436 + switch (csr_num) { 437 + switchcase_csr_read_32(CSR_CYCLE, ret) 438 + default: 439 + break; 440 + } 441 + 442 + return ret; 443 + #undef switchcase_csr_read_32 444 + #undef switchcase_csr_read_16 445 + #undef switchcase_csr_read_8 446 + #undef switchcase_csr_read_4 447 + #undef switchcase_csr_read_2 448 + #undef switchcase_csr_read 449 + } 450 + 451 + static u64 read_perf_counter(unsigned int counter) 452 + {
453 + return csr_read_num(CSR_CYCLE + counter); 454 + } 455 + 456 + static u64 read_timestamp(void) 457 + { 458 + return csr_read_num(CSR_TIME); 459 + } 460 + 395 461 #else 396 462 static u64 read_perf_counter(unsigned int counter __maybe_unused) { return 0; } 397 463 static u64 read_timestamp(void) { return 0; }
+4 -2
tools/perf/tests/mmap-basic.c
··· 284 284 "permissions"), 285 285 TEST_CASE_REASON("User space counter reading of instructions", 286 286 mmap_user_read_instr, 287 - #if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) 287 + #if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) || \ 288 + (defined(__riscv) && __riscv_xlen == 64) 288 289 "permissions" 289 290 #else 290 291 "unsupported" ··· 293 292 ), 294 293 TEST_CASE_REASON("User space counter reading of cycles", 295 294 mmap_user_read_cycles, 296 - #if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) 295 + #if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) || \ 296 + (defined(__riscv) && __riscv_xlen == 64) 297 297 "permissions" 298 298 #else 299 299 "unsupported"
+1 -1
tools/testing/selftests/riscv/Makefile
··· 5 5 ARCH ?= $(shell uname -m 2>/dev/null || echo not) 6 6 7 7 ifneq (,$(filter $(ARCH),riscv)) 8 - RISCV_SUBTARGETS ?= hwprobe vector 8 + RISCV_SUBTARGETS ?= hwprobe vector mm 9 9 else 10 10 RISCV_SUBTARGETS := 11 11 endif
+2
tools/testing/selftests/riscv/mm/.gitignore
··· 1 + mmap_bottomup 2 + mmap_default
+15
tools/testing/selftests/riscv/mm/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0 2 + # Copyright (C) 2021 ARM Limited 3 + # Originally tools/testing/arm64/abi/Makefile 4 + 5 + # Additional include paths needed by kselftest.h and local headers 6 + CFLAGS += -D_GNU_SOURCE -std=gnu99 -I. 7 + 8 + TEST_GEN_FILES := testcases/mmap_default testcases/mmap_bottomup 9 + 10 + TEST_PROGS := testcases/run_mmap.sh 11 + 12 + include ../../lib.mk 13 + 14 + $(OUTPUT)/mm: testcases/mmap_default.c testcases/mmap_bottomup.c testcases/mmap_tests.h 15 + $(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^
+35
tools/testing/selftests/riscv/mm/testcases/mmap_bottomup.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + #include <sys/mman.h> 3 + #include <testcases/mmap_test.h> 4 + 5 + #include "../../kselftest_harness.h" 6 + 7 + TEST(infinite_rlimit) 8 + { 9 + // Only works on 64 bit 10 + #if __riscv_xlen == 64 11 + struct addresses mmap_addresses; 12 + 13 + EXPECT_EQ(BOTTOM_UP, memory_layout()); 14 + 15 + do_mmaps(&mmap_addresses); 16 + 17 + EXPECT_NE(MAP_FAILED, mmap_addresses.no_hint); 18 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_37_addr); 19 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_38_addr); 20 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_46_addr); 21 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_47_addr); 22 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_55_addr); 23 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_56_addr); 24 + 25 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.no_hint); 26 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_37_addr); 27 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_38_addr); 28 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_46_addr); 29 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_47_addr); 30 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_55_addr); 31 + EXPECT_GT(1UL << 56, (unsigned long)mmap_addresses.on_56_addr); 32 + #endif 33 + } 34 + 35 + TEST_HARNESS_MAIN
+35
tools/testing/selftests/riscv/mm/testcases/mmap_default.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + #include <sys/mman.h> 3 + #include <testcases/mmap_test.h> 4 + 5 + #include "../../kselftest_harness.h" 6 + 7 + TEST(default_rlimit) 8 + { 9 + // Only works on 64 bit 10 + #if __riscv_xlen == 64 11 + struct addresses mmap_addresses; 12 + 13 + EXPECT_EQ(TOP_DOWN, memory_layout()); 14 + 15 + do_mmaps(&mmap_addresses); 16 + 17 + EXPECT_NE(MAP_FAILED, mmap_addresses.no_hint); 18 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_37_addr); 19 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_38_addr); 20 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_46_addr); 21 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_47_addr); 22 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_55_addr); 23 + EXPECT_NE(MAP_FAILED, mmap_addresses.on_56_addr); 24 + 25 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.no_hint); 26 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_37_addr); 27 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_38_addr); 28 + EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_46_addr); 29 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_47_addr); 30 + EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_55_addr); 31 + EXPECT_GT(1UL << 56, (unsigned long)mmap_addresses.on_56_addr); 32 + #endif 33 + } 34 + 35 + TEST_HARNESS_MAIN
+64
tools/testing/selftests/riscv/mm/testcases/mmap_test.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only */ 2 + #ifndef _TESTCASES_MMAP_TEST_H 3 + #define _TESTCASES_MMAP_TEST_H 4 + #include <sys/mman.h> 5 + #include <sys/resource.h> 6 + #include <stddef.h> 7 + 8 + #define TOP_DOWN 0 9 + #define BOTTOM_UP 1 10 + 11 + struct addresses { 12 + int *no_hint; 13 + int *on_37_addr; 14 + int *on_38_addr; 15 + int *on_46_addr; 16 + int *on_47_addr; 17 + int *on_55_addr; 18 + int *on_56_addr; 19 + }; 20 + 21 + static inline void do_mmaps(struct addresses *mmap_addresses) 22 + { 23 + /* 24 + * Place all of the hint addresses on the boundaries of mmap 25 + * sv39, sv48, sv57 26 + * User addresses end at 1<<38, 1<<47, 1<<56 respectively 27 + */ 28 + void *on_37_bits = (void *)(1UL << 37); 29 + void *on_38_bits = (void *)(1UL << 38); 30 + void *on_46_bits = (void *)(1UL << 46); 31 + void *on_47_bits = (void *)(1UL << 47); 32 + void *on_55_bits = (void *)(1UL << 55); 33 + void *on_56_bits = (void *)(1UL << 56); 34 + 35 + int prot = PROT_READ | PROT_WRITE; 36 + int flags = MAP_PRIVATE | MAP_ANONYMOUS; 37 + 38 + mmap_addresses->no_hint = 39 + mmap(NULL, 5 * sizeof(int), prot, flags, 0, 0); 40 + mmap_addresses->on_37_addr = 41 + mmap(on_37_bits, 5 * sizeof(int), prot, flags, 0, 0); 42 + mmap_addresses->on_38_addr = 43 + mmap(on_38_bits, 5 * sizeof(int), prot, flags, 0, 0); 44 + mmap_addresses->on_46_addr = 45 + mmap(on_46_bits, 5 * sizeof(int), prot, flags, 0, 0); 46 + mmap_addresses->on_47_addr = 47 + mmap(on_47_bits, 5 * sizeof(int), prot, flags, 0, 0); 48 + mmap_addresses->on_55_addr = 49 + mmap(on_55_bits, 5 * sizeof(int), prot, flags, 0, 0); 50 + mmap_addresses->on_56_addr = 51 + mmap(on_56_bits, 5 * sizeof(int), prot, flags, 0, 0); 52 + } 53 + 54 + static inline int memory_layout(void) 55 + { 56 + int prot = PROT_READ | PROT_WRITE; 57 + int flags = MAP_PRIVATE | MAP_ANONYMOUS; 58 + 59 + void *value1 = mmap(NULL, sizeof(int), prot, flags, 0, 0); 60 + void *value2 = mmap(NULL, sizeof(int), prot, flags, 0, 0); 61 + 62 + return value2 
> value1; 63 + } 64 + #endif /* _TESTCASES_MMAP_TEST_H */
+12
tools/testing/selftests/riscv/mm/testcases/run_mmap.sh
··· 1 + #!/bin/sh 2 + # SPDX-License-Identifier: GPL-2.0 3 + 4 + original_stack_limit=$(ulimit -s) 5 + 6 + ./mmap_default 7 + 8 + # Force mmap_bottomup to be run with bottomup memory due to 9 + # the unlimited stack 10 + ulimit -s unlimited 11 + ./mmap_bottomup 12 + ulimit -s $original_stack_limit