Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

s390/boot: rework decompressor reserved tracking

Currently, several approaches for finding unused memory in the decompressor
are in use. While "safe_addr" grows towards higher addresses, the vmem
code allocates paging structures top down. The former requires careful
ordering. In addition, the ipl report handling code verifies potential
intersections with secure boot certificates on its own. Neither approach
is aware of memory holes, and they are not consistent with each other in
low-memory conditions.

To solve this, the existing approaches are generalized and combined, and
online memory ranges are now taken into consideration.

physmem_info has been extended to contain reserved memory ranges. A new
set of functions allows handling reservations and finding unused memory.
All reservations and memory allocations are "typed". On an out-of-memory
condition the decompressor fails with detailed info on the current
reserved ranges and usable online memory, for example:

Linux version 6.2.0 ...
Kernel command line: ... mem=100M
Out of memory allocating 100000 bytes 100000 aligned in range 0:5800000
Reserved memory ranges:
0000000000000000 0000000003e33000 DECOMPRESSOR
0000000003f00000 00000000057648a3 INITRD
00000000063e0000 00000000063e8000 VMEM
00000000063eb000 00000000063f4000 VMEM
00000000063f7800 0000000006400000 VMEM
0000000005800000 0000000006300000 KASAN
Usable online memory ranges (info source: sclp read info [3]):
0000000000000000 0000000006400000
Usable online memory total: 6400000 Reserved: 61b10a3 Free: 24ef5d
Call Trace:
(sp:000000000002bd58 [<0000000000012a70>] physmem_alloc_top_down+0x60/0x14c)
sp:000000000002bdc8 [<0000000000013756>] _pa+0x56/0x6a
sp:000000000002bdf0 [<0000000000013bcc>] pgtable_populate+0x45c/0x65e
sp:000000000002be90 [<00000000000140aa>] setup_vmem+0x2da/0x424
sp:000000000002bec8 [<0000000000011c20>] startup_kernel+0x428/0x8b4
sp:000000000002bf60 [<00000000000100f4>] startup_normal+0xd4/0xd4

physmem_alloc_range allows finding free memory in a specified range. It
should be used for one-time allocations only, such as finding positions
for amode31 and vmlinux.
physmem_alloc_top_down can be used just like physmem_alloc_range, but it
also allows multiple allocations per type and tries to merge sequential
allocations together, which is useful for paging structure allocations.
If sequential allocations cannot be merged, they are "chained", allowing
easy per-type enumeration of reserved ranges and migration to memblock
later. The extra "struct reserved_range" entries allocated for chaining
are not tracked or reserved, but rely on the fact that both
physmem_alloc_range and physmem_alloc_top_down search for free memory
only below the current top-down allocator position. All reserved ranges
should be transferred to memblock before memblock allocations are
enabled.

The startup code has been reordered to delay any memory allocations until
online memory ranges are detected and occupied memory ranges are marked
as reserved, to be excluded from subsequent allocations.
Ipl report certificates are a special case: the certificate list is
checked together with the other memory reserves until the certificates
are saved elsewhere.
The memory KASAN requires for shadow memory allocation and mapping is
reserved as one large chunk, which is later passed on to the KASAN early
initialization code.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Authored by Vasily Gorbik, committed by Heiko Carstens
f913a660 8c37cb7d

+436 -411
+25 -5
arch/s390/boot/boot.h
···
 #ifndef __ASSEMBLY__
 
+#include <asm/physmem_info.h>
+
 struct machine_info {
 	unsigned char has_edat1 : 1;
 	unsigned char has_edat2 : 1;
···
 };
 
 void startup_kernel(void);
-unsigned long detect_memory(unsigned long *safe_addr);
+unsigned long detect_max_physmem_end(void);
+void detect_physmem_online_ranges(unsigned long max_physmem_end);
 void physmem_set_usable_limit(unsigned long limit);
+void physmem_reserve(enum reserved_range_type type, unsigned long addr, unsigned long size);
+void physmem_free(enum reserved_range_type type);
+/* for continuous/multiple allocations per type */
+unsigned long physmem_alloc_top_down(enum reserved_range_type type, unsigned long size,
+				     unsigned long align);
+/* for single allocations, 1 per type */
+unsigned long physmem_alloc_range(enum reserved_range_type type, unsigned long size,
+				  unsigned long align, unsigned long min, unsigned long max,
+				  bool die_on_oom);
+bool ipl_report_certs_intersects(unsigned long addr, unsigned long size,
+				 unsigned long *intersection_start);
 bool is_ipl_block_dump(void);
 void store_ipl_parmblock(void);
-unsigned long read_ipl_report(unsigned long safe_addr);
+int read_ipl_report(void);
+void save_ipl_cert_comp_list(void);
 void setup_boot_command_line(void);
 void parse_boot_command_line(void);
 void verify_facilities(void);
 void print_missing_facilities(void);
 void sclp_early_setup_buffer(void);
 void print_pgm_check_info(void);
-unsigned long get_random_base(unsigned long safe_addr);
+unsigned long get_random_base(void);
 void setup_vmem(unsigned long asce_limit);
-unsigned long vmem_estimate_memory_needs(unsigned long online_mem_total);
 void __printf(1, 2) decompressor_printk(const char *fmt, ...);
+void print_stacktrace(unsigned long sp);
 void error(char *m);
 
 extern struct machine_info machine;
···
 extern char __boot_data_preserved_start[], __boot_data_preserved_end[];
 extern char _decompressor_syms_start[], _decompressor_syms_end[];
 extern char _stack_start[], _stack_end[];
-extern char _end[];
+extern char _end[], _decompressor_end[];
 extern unsigned char _compressed_start[];
 extern unsigned char _compressed_end[];
 extern struct vmlinux_info _vmlinux_info;
···
 
 #define __abs_lowcore_pa(x)	(((unsigned long)(x) - __abs_lowcore) % sizeof(struct lowcore))
 
+static inline bool intersects(unsigned long addr0, unsigned long size0,
+			      unsigned long addr1, unsigned long size1)
+{
+	return addr0 + size0 > addr1 && addr1 + size1 > addr0;
+}
 #endif /* __ASSEMBLY__ */
 #endif /* BOOT_BOOT_H */
+52 -60
arch/s390/boot/ipl_report.c
···
 #include <asm/sclp.h>
 #include <asm/sections.h>
 #include <asm/boot_data.h>
+#include <asm/physmem_info.h>
 #include <uapi/asm/ipl.h>
 #include "boot.h"
 
···
 unsigned long __bootdata(early_ipl_comp_list_addr);
 unsigned long __bootdata(early_ipl_comp_list_size);
 
+static struct ipl_rb_certificates *certs;
+static struct ipl_rb_components *comps;
+static bool ipl_report_needs_saving;
+
 #define for_each_rb_entry(entry, rb) \
 	for (entry = rb->entries; \
 	     (void *) entry + sizeof(*entry) <= (void *) rb + rb->len; \
 	     entry++)
 
-static inline bool intersects(unsigned long addr0, unsigned long size0,
-			      unsigned long addr1, unsigned long size1)
-{
-	return addr0 + size0 > addr1 && addr1 + size1 > addr0;
-}
-
-static unsigned long find_bootdata_space(struct ipl_rb_components *comps,
-					 struct ipl_rb_certificates *certs,
-					 unsigned long safe_addr)
+static unsigned long get_cert_comp_list_size(void)
 {
 	struct ipl_rb_certificate_entry *cert;
 	struct ipl_rb_component_entry *comp;
···
 	ipl_cert_list_size = 0;
 	for_each_rb_entry(cert, certs)
 		ipl_cert_list_size += sizeof(unsigned int) + cert->len;
-	size = ipl_cert_list_size + early_ipl_comp_list_size;
-
-	/*
-	 * Start from safe_addr to find a free memory area large
-	 * enough for the IPL report boot data. This area is used
-	 * for ipl_cert_list_addr/ipl_cert_list_size and
-	 * early_ipl_comp_list_addr/early_ipl_comp_list_size. It must
-	 * not overlap with any component or any certificate.
-	 */
-repeat:
-	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && initrd_data.start && initrd_data.size &&
-	    intersects(initrd_data.start, initrd_data.size, safe_addr, size))
-		safe_addr = initrd_data.start + initrd_data.size;
-	if (intersects(safe_addr, size, (unsigned long)comps, comps->len)) {
-		safe_addr = (unsigned long)comps + comps->len;
-		goto repeat;
-	}
-	for_each_rb_entry(comp, comps)
-		if (intersects(safe_addr, size, comp->addr, comp->len)) {
-			safe_addr = comp->addr + comp->len;
-			goto repeat;
-		}
-	if (intersects(safe_addr, size, (unsigned long)certs, certs->len)) {
-		safe_addr = (unsigned long)certs + certs->len;
-		goto repeat;
-	}
-	for_each_rb_entry(cert, certs)
-		if (intersects(safe_addr, size, cert->addr, cert->len)) {
-			safe_addr = cert->addr + cert->len;
-			goto repeat;
-		}
-	early_ipl_comp_list_addr = safe_addr;
-	ipl_cert_list_addr = safe_addr + early_ipl_comp_list_size;
-
-	return safe_addr + size;
+	return ipl_cert_list_size + early_ipl_comp_list_size;
 }
 
-static void copy_components_bootdata(struct ipl_rb_components *comps)
+bool ipl_report_certs_intersects(unsigned long addr, unsigned long size,
+				 unsigned long *intersection_start)
+{
+	struct ipl_rb_certificate_entry *cert;
+
+	if (!ipl_report_needs_saving)
+		return false;
+
+	for_each_rb_entry(cert, certs) {
+		if (intersects(addr, size, cert->addr, cert->len)) {
+			*intersection_start = cert->addr;
+			return true;
+		}
+	}
+	return false;
+}
+
+static void copy_components_bootdata(void)
 {
 	struct ipl_rb_component_entry *comp, *ptr;
 
···
 		memcpy(ptr++, comp, sizeof(*ptr));
 }
 
-static void copy_certificates_bootdata(struct ipl_rb_certificates *certs)
+static void copy_certificates_bootdata(void)
 {
 	struct ipl_rb_certificate_entry *cert;
 	void *ptr;
···
 	}
 }
 
-unsigned long read_ipl_report(unsigned long safe_addr)
+int read_ipl_report(void)
 {
-	struct ipl_rb_certificates *certs;
-	struct ipl_rb_components *comps;
 	struct ipl_pl_hdr *pl_hdr;
 	struct ipl_rl_hdr *rl_hdr;
 	struct ipl_rb_hdr *rb_hdr;
···
 	 */
 	if (!ipl_block_valid ||
 	    !(ipl_block.hdr.flags & IPL_PL_FLAG_IPLSR))
-		return safe_addr;
+		return -1;
 	ipl_secure_flag = !!(ipl_block.hdr.flags & IPL_PL_FLAG_SIPL);
 	/*
 	 * There is an IPL report, to find it load the pointer to the
···
 	 * With either the component list or the certificate list
 	 * missing the kernel will stay ignorant of secure IPL.
 	 */
-	if (!comps || !certs)
-		return safe_addr;
+	if (!comps || !certs) {
+		certs = NULL;
+		return -1;
+	}
 
-	/*
-	 * Copy component and certificate list to a safe area
-	 * where the decompressed kernel can find them.
-	 */
-	safe_addr = find_bootdata_space(comps, certs, safe_addr);
-	copy_components_bootdata(comps);
-	copy_certificates_bootdata(certs);
+	ipl_report_needs_saving = true;
+	physmem_reserve(RR_IPLREPORT, (unsigned long)pl_hdr,
+			(unsigned long)rl_end - (unsigned long)pl_hdr);
+	return 0;
+}
 
-	return safe_addr;
+void save_ipl_cert_comp_list(void)
+{
+	unsigned long size;
+
+	if (!ipl_report_needs_saving)
+		return;
+
+	size = get_cert_comp_list_size();
+	early_ipl_comp_list_addr = physmem_alloc_top_down(RR_CERT_COMP_LIST, size, sizeof(int));
+	ipl_cert_list_addr = early_ipl_comp_list_addr + early_ipl_comp_list_size;
+
+	copy_components_bootdata();
+	copy_certificates_bootdata();
+	physmem_free(RR_IPLREPORT);
+	ipl_report_needs_saving = false;
 }
+8 -105
arch/s390/boot/kaslr.c
···
 	return 0;
 }
 
-/*
- * To randomize kernel base address we have to consider several facts:
- * 1. physical online memory might not be continuous and have holes. physmem
- *    info contains list of online memory ranges we should consider.
- * 2. we have several memory regions which are occupied and we should not
- *    overlap and destroy them. Currently safe_addr tells us the border below
- *    which all those occupied regions are. We are safe to use anything above
- *    safe_addr.
- * 3. the upper limit might apply as well, even if memory above that limit is
- *    online. Currently those limitations are:
- *    3.1. Limit set by "mem=" kernel command line option
- *    3.2. memory reserved at the end for kasan initialization.
- * 4. kernel base address must be aligned to THREAD_SIZE (kernel stack size).
- *    Which is required for CONFIG_CHECK_STACK. Currently THREAD_SIZE is 4 pages
- *    (16 pages when the kernel is built with kasan enabled)
- * Assumptions:
- * 1. kernel size (including .bss size) and upper memory limit are page aligned.
- * 2. physmem online region start is THREAD_SIZE aligned / end is PAGE_SIZE
- *    aligned (in practice memory configurations granularity on z/VM and LPAR
- *    is 1mb).
- *
- * To guarantee uniform distribution of kernel base address among all suitable
- * addresses we generate random value just once. For that we need to build a
- * continuous range in which every value would be suitable. We can build this
- * range by simply counting all suitable addresses (let's call them positions)
- * which would be valid as kernel base address. To count positions we iterate
- * over online memory ranges. For each range which is big enough for the
- * kernel image we count all suitable addresses we can put the kernel image at
- * that is
- * (end - start - kernel_size) / THREAD_SIZE + 1
- * Two functions count_valid_kernel_positions and position_to_address help
- * to count positions in memory range given and then convert position back
- * to address.
- */
-static unsigned long count_valid_kernel_positions(unsigned long kernel_size,
-						  unsigned long _min,
-						  unsigned long _max)
+unsigned long get_random_base(void)
 {
-	unsigned long start, end, pos = 0;
-	int i;
+	unsigned long vmlinux_size = vmlinux.image_size + vmlinux.bss_size;
+	unsigned long minimal_pos = vmlinux.default_lma + vmlinux_size;
+	unsigned long random;
 
-	for_each_physmem_usable_range(i, &start, &end) {
-		if (_min >= end)
-			continue;
-		if (start >= _max)
-			break;
-		start = max(_min, start);
-		end = min(_max, end);
-		if (end - start < kernel_size)
-			continue;
-		pos += (end - start - kernel_size) / THREAD_SIZE + 1;
-	}
-
-	return pos;
-}
-
-static unsigned long position_to_address(unsigned long pos, unsigned long kernel_size,
-					 unsigned long _min, unsigned long _max)
-{
-	unsigned long start, end;
-	int i;
-
-	for_each_physmem_usable_range(i, &start, &end) {
-		if (_min >= end)
-			continue;
-		if (start >= _max)
-			break;
-		start = max(_min, start);
-		end = min(_max, end);
-		if (end - start < kernel_size)
-			continue;
-		if ((end - start - kernel_size) / THREAD_SIZE + 1 >= pos)
-			return start + (pos - 1) * THREAD_SIZE;
-		pos -= (end - start - kernel_size) / THREAD_SIZE + 1;
-	}
-
-	return 0;
-}
-
-unsigned long get_random_base(unsigned long safe_addr)
-{
-	unsigned long usable_total = get_physmem_usable_total();
-	unsigned long memory_limit = get_physmem_usable_end();
-	unsigned long base_pos, max_pos, kernel_size;
-	int i;
-
-	/*
-	 * Avoid putting kernel in the end of physical memory
-	 * which vmem and kasan code will use for shadow memory and
-	 * pgtable mapping allocations.
-	 */
-	memory_limit -= kasan_estimate_memory_needs(usable_total);
-	memory_limit -= vmem_estimate_memory_needs(usable_total);
-
-	safe_addr = ALIGN(safe_addr, THREAD_SIZE);
-	kernel_size = vmlinux.image_size + vmlinux.bss_size;
-	if (safe_addr + kernel_size > memory_limit)
+	/* [vmlinux.default_lma + vmlinux.image_size + vmlinux.bss_size : physmem_info.usable] */
+	if (get_random(physmem_info.usable - minimal_pos, &random))
 		return 0;
 
-	max_pos = count_valid_kernel_positions(kernel_size, safe_addr, memory_limit);
-	if (!max_pos) {
-		sclp_early_printk("KASLR disabled: not enough memory\n");
-		return 0;
-	}
-
-	/* we need a value in the range [1, base_pos] inclusive */
-	if (get_random(max_pos, &base_pos))
-		return 0;
-	return position_to_address(base_pos + 1, kernel_size, safe_addr, memory_limit);
+	return physmem_alloc_range(RR_VMLINUX, vmlinux_size, THREAD_SIZE,
+				   vmlinux.default_lma, minimal_pos + random, false);
 }
+2 -3
arch/s390/boot/pgm_check_info.c
···
 	sclp_early_printk(buf);
 }
 
-static noinline void print_stacktrace(void)
+void print_stacktrace(unsigned long sp)
 {
 	struct stack_info boot_stack = { STACK_TYPE_TASK, (unsigned long)_stack_start,
 					 (unsigned long)_stack_end };
-	unsigned long sp = S390_lowcore.gpregs_save_area[15];
 	bool first = true;
 
 	decompressor_printk("Call Trace:\n");
···
 			    gpregs[8], gpregs[9], gpregs[10], gpregs[11]);
 	decompressor_printk("      %016lx %016lx %016lx %016lx\n",
 			    gpregs[12], gpregs[13], gpregs[14], gpregs[15]);
-	print_stacktrace();
+	print_stacktrace(S390_lowcore.gpregs_save_area[15]);
 	decompressor_printk("Last Breaking-Event-Address:\n");
 	decompressor_printk(" [<%016lx>] %pS\n", (unsigned long)S390_lowcore.pgm_last_break,
 			    (void *)S390_lowcore.pgm_last_break);
+163 -31
arch/s390/boot/physmem_info.c
···
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/processor.h>
 #include <linux/errno.h>
 #include <linux/init.h>
-#include <asm/setup.h>
-#include <asm/processor.h>
-#include <asm/sclp.h>
-#include <asm/sections.h>
 #include <asm/physmem_info.h>
+#include <asm/stacktrace.h>
+#include <asm/boot_data.h>
 #include <asm/sparsemem.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/sclp.h>
+#include <asm/uv.h>
 #include "decompressor.h"
 #include "boot.h"
 
 struct physmem_info __bootdata(physmem_info);
+static unsigned int physmem_alloc_ranges;
+static unsigned long physmem_alloc_pos;
 
 /* up to 256 storage elements, 1020 subincrements each */
 #define ENTRIES_EXTENDED_MAX \
···
 {
 	if (n < MEM_INLINED_ENTRIES)
 		return &physmem_info.online[n];
+	if (unlikely(!physmem_info.online_extended)) {
+		physmem_info.online_extended = (struct physmem_range *)physmem_alloc_range(
+			RR_MEM_DETECT_EXTENDED, ENTRIES_EXTENDED_MAX, sizeof(long), 0,
+			physmem_alloc_pos, true);
+	}
 	return &physmem_info.online_extended[n - MEM_INLINED_ENTRIES];
 }
 
···
 	return (offset + 1) << 20;
 }
 
-unsigned long detect_memory(unsigned long *safe_addr)
+unsigned long detect_max_physmem_end(void)
 {
 	unsigned long max_physmem_end = 0;
 
-	sclp_early_get_memsize(&max_physmem_end);
-	physmem_info.online_extended = (struct physmem_range *)ALIGN(*safe_addr, sizeof(u64));
+	if (!sclp_early_get_memsize(&max_physmem_end)) {
+		physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
+	} else {
+		max_physmem_end = search_mem_end();
+		physmem_info.info_source = MEM_DETECT_BIN_SEARCH;
+	}
+	return max_physmem_end;
+}
 
+void detect_physmem_online_ranges(unsigned long max_physmem_end)
+{
 	if (!sclp_early_read_storage_info()) {
 		physmem_info.info_source = MEM_DETECT_SCLP_STOR_INFO;
 	} else if (!diag260()) {
 		physmem_info.info_source = MEM_DETECT_DIAG260;
-		max_physmem_end = max_physmem_end ?: get_physmem_usable_end();
 	} else if (max_physmem_end) {
 		add_physmem_online_range(0, max_physmem_end);
-		physmem_info.info_source = MEM_DETECT_SCLP_READ_INFO;
-	} else {
-		max_physmem_end = search_mem_end();
-		add_physmem_online_range(0, max_physmem_end);
-		physmem_info.info_source = MEM_DETECT_BIN_SEARCH;
 	}
-
-	if (physmem_info.range_count > MEM_INLINED_ENTRIES) {
-		*safe_addr += (physmem_info.range_count - MEM_INLINED_ENTRIES) *
-			      sizeof(struct physmem_range);
-	}
-
-	return max_physmem_end;
 }
 
 void physmem_set_usable_limit(unsigned long limit)
 {
-	struct physmem_range *range;
+	physmem_info.usable = limit;
+	physmem_alloc_pos = limit;
+}
+
+static void die_oom(unsigned long size, unsigned long align, unsigned long min, unsigned long max)
+{
+	unsigned long start, end, total_mem = 0, total_reserved_mem = 0;
+	struct reserved_range *range;
+	enum reserved_range_type t;
 	int i;
 
-	/* make sure mem_detect.usable ends up within online memory block */
-	for (i = 0; i < physmem_info.range_count; i++) {
-		range = __get_physmem_range_ptr(i);
-		if (range->start >= limit)
-			break;
-		if (range->end >= limit) {
-			physmem_info.usable = limit;
-			break;
-		}
-		physmem_info.usable = range->end;
+	decompressor_printk("Linux version %s\n", kernel_version);
+	if (!is_prot_virt_guest() && early_command_line[0])
+		decompressor_printk("Kernel command line: %s\n", early_command_line);
+	decompressor_printk("Out of memory allocating %lx bytes %lx aligned in range %lx:%lx\n",
+			    size, align, min, max);
+	decompressor_printk("Reserved memory ranges:\n");
+	for_each_physmem_reserved_range(t, range, &start, &end) {
+		decompressor_printk("%016lx %016lx %s\n", start, end, get_rr_type_name(t));
+		total_reserved_mem += end - start;
 	}
+	decompressor_printk("Usable online memory ranges (info source: %s [%x]):\n",
+			    get_physmem_info_source(), physmem_info.info_source);
+	for_each_physmem_usable_range(i, &start, &end) {
+		decompressor_printk("%016lx %016lx\n", start, end);
+		total_mem += end - start;
+	}
+	decompressor_printk("Usable online memory total: %lx Reserved: %lx Free: %lx\n",
+			    total_mem, total_reserved_mem,
+			    total_mem > total_reserved_mem ? total_mem - total_reserved_mem : 0);
+	print_stacktrace(current_frame_address());
+	sclp_early_printk("\n\n -- System halted\n");
+	disabled_wait();
+}
+
+void physmem_reserve(enum reserved_range_type type, unsigned long addr, unsigned long size)
+{
+	physmem_info.reserved[type].start = addr;
+	physmem_info.reserved[type].end = addr + size;
+}
+
+void physmem_free(enum reserved_range_type type)
+{
+	physmem_info.reserved[type].start = 0;
+	physmem_info.reserved[type].end = 0;
+}
+
+static bool __physmem_alloc_intersects(unsigned long addr, unsigned long size,
+				       unsigned long *intersection_start)
+{
+	unsigned long res_addr, res_size;
+	int t;
+
+	for (t = 0; t < RR_MAX; t++) {
+		if (!get_physmem_reserved(t, &res_addr, &res_size))
+			continue;
+		if (intersects(addr, size, res_addr, res_size)) {
+			*intersection_start = res_addr;
+			return true;
+		}
+	}
+	return ipl_report_certs_intersects(addr, size, intersection_start);
+}
+
+static unsigned long __physmem_alloc_range(unsigned long size, unsigned long align,
+					   unsigned long min, unsigned long max,
+					   unsigned int from_ranges, unsigned int *ranges_left,
+					   bool die_on_oom)
+{
+	unsigned int nranges = from_ranges ?: physmem_info.range_count;
+	unsigned long range_start, range_end;
+	unsigned long intersection_start;
+	unsigned long addr, pos = max;
+
+	align = max(align, 8UL);
+	while (nranges) {
+		__get_physmem_range(nranges - 1, &range_start, &range_end, false);
+		pos = min(range_end, pos);
+
+		if (round_up(min, align) + size > pos)
+			break;
+		addr = round_down(pos - size, align);
+		if (range_start > addr) {
+			nranges--;
+			continue;
+		}
+		if (__physmem_alloc_intersects(addr, size, &intersection_start)) {
+			pos = intersection_start;
+			continue;
+		}
+
+		if (ranges_left)
+			*ranges_left = nranges;
+		return addr;
+	}
+	if (die_on_oom)
+		die_oom(size, align, min, max);
+	return 0;
+}
+
+unsigned long physmem_alloc_range(enum reserved_range_type type, unsigned long size,
+				  unsigned long align, unsigned long min, unsigned long max,
+				  bool die_on_oom)
+{
+	unsigned long addr;
+
+	max = min(max, physmem_alloc_pos);
+	addr = __physmem_alloc_range(size, align, min, max, 0, NULL, die_on_oom);
+	if (addr)
+		physmem_reserve(type, addr, size);
+	return addr;
+}
+
+unsigned long physmem_alloc_top_down(enum reserved_range_type type, unsigned long size,
+				     unsigned long align)
+{
+	struct reserved_range *range = &physmem_info.reserved[type];
+	struct reserved_range *new_range;
+	unsigned int ranges_left;
+	unsigned long addr;
+
+	addr = __physmem_alloc_range(size, align, 0, physmem_alloc_pos, physmem_alloc_ranges,
+				     &ranges_left, true);
+	/* if not a consecutive allocation of the same type or first allocation */
+	if (range->start != addr + size) {
+		if (range->end) {
+			physmem_alloc_pos = __physmem_alloc_range(
+				sizeof(struct reserved_range), 0, 0, physmem_alloc_pos,
+				physmem_alloc_ranges, &ranges_left, true);
+			new_range = (struct reserved_range *)physmem_alloc_pos;
+			*new_range = *range;
+			range->chain = new_range;
+			addr = __physmem_alloc_range(size, align, 0, physmem_alloc_pos,
+						     ranges_left, &ranges_left, true);
+		}
+		range->end = addr + size;
+	}
+	range->start = addr;
+	physmem_alloc_pos = addr;
+	physmem_alloc_ranges = ranges_left;
+	return addr;
 }
+48 -38
arch/s390/boot/startup.c
···
 unsigned long __bootdata_preserved(__abs_lowcore);
 unsigned long __bootdata_preserved(__memcpy_real_area);
 pte_t *__bootdata_preserved(memcpy_real_ptep);
-unsigned long __bootdata(__amode31_base);
 unsigned long __bootdata_preserved(VMALLOC_START);
 unsigned long __bootdata_preserved(VMALLOC_END);
 struct page *__bootdata_preserved(vmemmap);
···
 unsigned long __bootdata_preserved(MODULES_VADDR);
 unsigned long __bootdata_preserved(MODULES_END);
 unsigned long __bootdata(ident_map_size);
-struct initrd_data __bootdata(initrd_data);
 
 u64 __bootdata_preserved(stfle_fac_list[16]);
 u64 __bootdata_preserved(alt_stfle_fac_list[16]);
···
 }
 #endif
 
-static unsigned long rescue_initrd(unsigned long safe_addr)
+static void rescue_initrd(unsigned long min, unsigned long max)
 {
+	unsigned long old_addr, addr, size;
+
 	if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD))
-		return safe_addr;
-	if (!initrd_data.start || !initrd_data.size)
-		return safe_addr;
-	if (initrd_data.start < safe_addr) {
-		memmove((void *)safe_addr, (void *)initrd_data.start, initrd_data.size);
-		initrd_data.start = safe_addr;
-	}
-	return initrd_data.start + initrd_data.size;
+		return;
+	if (!get_physmem_reserved(RR_INITRD, &addr, &size))
+		return;
+	if (addr >= min && addr + size <= max)
+		return;
+	old_addr = addr;
+	physmem_free(RR_INITRD);
+	addr = physmem_alloc_top_down(RR_INITRD, size, 0);
+	memmove((void *)addr, (void *)old_addr, size);
 }
 
 static void copy_bootdata(void)
···
 	vmlinux.invalid_pg_dir_off += offset;
 }
 
-static unsigned long reserve_amode31(unsigned long safe_addr)
-{
-	__amode31_base = PAGE_ALIGN(safe_addr);
-	return __amode31_base + vmlinux.amode31_size;
-}
-
 void startup_kernel(void)
 {
 	unsigned long max_physmem_end;
 	unsigned long random_lma;
-	unsigned long safe_addr;
 	unsigned long asce_limit;
+	unsigned long safe_addr;
 	void *img;
 	psw_t psw;
 
-	initrd_data.start = parmarea.initrd_start;
-	initrd_data.size = parmarea.initrd_size;
+	setup_lpp();
+	safe_addr = mem_safe_offset();
+	/*
+	 * reserve decompressor memory together with decompression heap, buffer and
+	 * memory which might be occupied by uncompressed kernel at default 1Mb
+	 * position (if KASLR is off or failed).
+	 */
+	physmem_reserve(RR_DECOMPRESSOR, 0, safe_addr);
+	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && parmarea.initrd_size)
+		physmem_reserve(RR_INITRD, parmarea.initrd_start, parmarea.initrd_size);
 	oldmem_data.start = parmarea.oldmem_base;
 	oldmem_data.size = parmarea.oldmem_size;
 
-	setup_lpp();
 	store_ipl_parmblock();
-	safe_addr = mem_safe_offset();
-	safe_addr = reserve_amode31(safe_addr);
-	safe_addr = read_ipl_report(safe_addr);
+	read_ipl_report();
 	uv_query_info();
-	safe_addr = rescue_initrd(safe_addr);
 	sclp_early_read_info();
 	setup_boot_command_line();
 	parse_boot_command_line();
 	detect_facilities();
 	sanitize_prot_virt_host();
-	max_physmem_end = detect_memory(&safe_addr);
+	max_physmem_end = detect_max_physmem_end();
 	setup_ident_map_size(max_physmem_end);
 	setup_vmalloc_size();
 	asce_limit = setup_kernel_memory_layout();
+	/* got final ident_map_size, physmem allocations could be performed now */
 	physmem_set_usable_limit(ident_map_size);
+	detect_physmem_online_ranges(max_physmem_end);
+	save_ipl_cert_comp_list();
+	rescue_initrd(safe_addr, ident_map_size);
+#ifdef CONFIG_KASAN
+	physmem_alloc_top_down(RR_KASAN, kasan_estimate_memory_needs(get_physmem_usable_total()),
+			       _SEGMENT_SIZE);
+#endif
 
 	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled) {
-		random_lma = get_random_base(safe_addr);
+		random_lma = get_random_base();
 		if (random_lma) {
 			__kaslr_offset = random_lma - vmlinux.default_lma;
 			img = (void *)vmlinux.default_lma;
···
 	if (!IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED)) {
 		img = decompress_kernel();
 		memmove((void *)vmlinux.default_lma, img, vmlinux.image_size);
-	} else if (__kaslr_offset)
+	} else if (__kaslr_offset) {
 		memcpy((void *)vmlinux.default_lma, img, vmlinux.image_size);
+		memset(img, 0, vmlinux.image_size);
+	}
+
+	/* vmlinux decompression is done, shrink reserved low memory */
+	physmem_reserve(RR_DECOMPRESSOR, 0, (unsigned long)_decompressor_end);
+	if (!__kaslr_offset)
+		physmem_reserve(RR_VMLINUX, vmlinux.default_lma, vmlinux.image_size + vmlinux.bss_size);
+	physmem_alloc_range(RR_AMODE31, vmlinux.amode31_size, PAGE_SIZE, 0, SZ_2G, true);
 
 	/*
 	 * The order of the following operations is important:
···
 	setup_vmem(asce_limit);
 	copy_bootdata();
 
-	if (__kaslr_offset) {
-		/*
-		 * Save KASLR offset for early dumps, before vmcore_info is set.
-		 * Mark as uneven to distinguish from real vmcore_info pointer.
-		 */
-		S390_lowcore.vmcore_info = __kaslr_offset | 0x1UL;
-		/* Clear non-relocated kernel */
-		if (IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED))
-			memset(img, 0, vmlinux.image_size);
-	}
+	/*
+	 * Save KASLR offset for early dumps, before vmcore_info is set.
+	 * Mark as uneven to distinguish from real vmcore_info pointer.
+	 */
+	S390_lowcore.vmcore_info = __kaslr_offset ? __kaslr_offset | 0x1UL : 0;
 
 	/*
 	 * Jump to the decompressed kernel entry point and switch DAT mode on.
arch/s390/boot/vmem.c | +6 -63
···
 #include "decompressor.h"
 #include "boot.h"
 
+unsigned long __bootdata_preserved(s390_invalid_asce);
+
 #define init_mm (*(struct mm_struct *)vmlinux.init_mm_off)
 #define swapper_pg_dir vmlinux.swapper_pg_dir_off
 #define invalid_pg_dir vmlinux.invalid_pg_dir_off
···
 	return pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va), va);
 }
 
-unsigned long __bootdata_preserved(s390_invalid_asce);
-unsigned long __bootdata(pgalloc_pos);
-unsigned long __bootdata(pgalloc_end);
-unsigned long __bootdata(pgalloc_low);
-
 enum populate_mode {
 	POPULATE_NONE,
 	POPULATE_ONE2ONE,
 	POPULATE_ABS_LOWCORE,
 };
 
-static void boot_check_oom(void)
-{
-	if (pgalloc_pos < pgalloc_low)
-		error("out of memory on boot\n");
-}
-
-static void pgtable_populate_init(void)
-{
-	unsigned long initrd_end;
-	unsigned long kernel_end;
-
-	kernel_end = vmlinux.default_lma + vmlinux.image_size + vmlinux.bss_size;
-	pgalloc_low = round_up(kernel_end, PAGE_SIZE);
-	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD)) {
-		initrd_end = round_up(initrd_data.start + initrd_data.size, _SEGMENT_SIZE);
-		pgalloc_low = max(pgalloc_low, initrd_end);
-	}
-
-	pgalloc_end = round_down(get_physmem_usable_end(), PAGE_SIZE);
-	pgalloc_pos = pgalloc_end;
-
-	boot_check_oom();
-}
-
-static void *boot_alloc_pages(unsigned int order)
-{
-	unsigned long size = PAGE_SIZE << order;
-
-	pgalloc_pos -= size;
-	pgalloc_pos = round_down(pgalloc_pos, size);
-
-	boot_check_oom();
-
-	return (void *)pgalloc_pos;
-}
-
 static void *boot_crst_alloc(unsigned long val)
 {
+	unsigned long size = PAGE_SIZE << CRST_ALLOC_ORDER;
 	unsigned long *table;
 
-	table = boot_alloc_pages(CRST_ALLOC_ORDER);
-	if (table)
-		crst_table_init(table, val);
+	table = (unsigned long *)physmem_alloc_top_down(RR_VMEM, size, size);
+	crst_table_init(table, val);
 	return table;
 }
 
 static pte_t *boot_pte_alloc(void)
 {
-	static void *pte_leftover;
 	pte_t *pte;
 
-	BUILD_BUG_ON(_PAGE_TABLE_SIZE * 2 != PAGE_SIZE);
-
-	if (!pte_leftover) {
-		pte_leftover = boot_alloc_pages(0);
-		pte = pte_leftover + _PAGE_TABLE_SIZE;
-	} else {
-		pte = pte_leftover;
-		pte_leftover = NULL;
-	}
+	pte = (pte_t *)physmem_alloc_top_down(RR_VMEM, _PAGE_TABLE_SIZE, _PAGE_TABLE_SIZE);
 	memset64((u64 *)pte, _PAGE_INVALID, PTRS_PER_PTE);
 	return pte;
 }
···
 static void pgtable_pte_populate(pmd_t *pmd, unsigned long addr, unsigned long end,
 				 enum populate_mode mode)
 {
-	unsigned long next;
 	pte_t *pte, entry;
 
 	pte = pte_offset_kernel(pmd, addr);
···
 	 * To prevent creation of a large page at address 0 first map
 	 * the lowcore and create the identity mapping only afterwards.
 	 */
-	pgtable_populate_init();
 	pgtable_populate(0, sizeof(struct lowcore), POPULATE_ONE2ONE);
 	for_each_physmem_usable_range(i, &start, &end)
 		pgtable_populate(start, end, POPULATE_ONE2ONE);
···
 	__ctl_load(S390_lowcore.kernel_asce, 13, 13);
 
 	init_mm.context.asce = S390_lowcore.kernel_asce;
-}
-
-unsigned long vmem_estimate_memory_needs(unsigned long online_mem_total)
-{
-	unsigned long pages = DIV_ROUND_UP(online_mem_total, PAGE_SIZE);
-
-	return DIV_ROUND_UP(pages, _PAGE_ENTRIES) * _PAGE_TABLE_SIZE * 2;
 }
arch/s390/boot/vmlinux.lds.S | +2
···
 		_decompressor_syms_end = .;
 	}
 
+	_decompressor_end = .;
+
 #ifdef CONFIG_KERNEL_UNCOMPRESSED
 	. = 0x100000;
 #else
arch/s390/include/asm/physmem_info.h | +91 -21
···
 	u64 end;
 };
 
+enum reserved_range_type {
+	RR_DECOMPRESSOR,
+	RR_INITRD,
+	RR_VMLINUX,
+	RR_AMODE31,
+	RR_IPLREPORT,
+	RR_CERT_COMP_LIST,
+	RR_MEM_DETECT_EXTENDED,
+	RR_VMEM,
+#ifdef CONFIG_KASAN
+	RR_KASAN,
+#endif
+	RR_MAX
+};
+
+struct reserved_range {
+	unsigned long start;
+	unsigned long end;
+	struct reserved_range *chain;
+};
+
 /*
  * Storage element id is defined as 1 byte (up to 256 storage elements).
  * In practise only storage element id 0 and 1 are used).
···
 	u32 range_count;
 	u8 info_source;
 	unsigned long usable;
+	struct reserved_range reserved[RR_MAX];
 	struct physmem_range online[MEM_INLINED_ENTRIES];
 	struct physmem_range *online_extended;
 };
···
 #define for_each_physmem_online_range(i, p_start, p_end) \
 	for (i = 0; !__get_physmem_range(i, p_start, p_end, false); i++)
 
+static inline const char *get_physmem_info_source(void)
+{
+	switch (physmem_info.info_source) {
+	case MEM_DETECT_SCLP_STOR_INFO:
+		return "sclp storage info";
+	case MEM_DETECT_DIAG260:
+		return "diag260";
+	case MEM_DETECT_SCLP_READ_INFO:
+		return "sclp read info";
+	case MEM_DETECT_BIN_SEARCH:
+		return "binary search";
+	}
+	return "none";
+}
+
+#define RR_TYPE_NAME(t) case RR_ ## t: return #t
+static inline const char *get_rr_type_name(enum reserved_range_type t)
+{
+	switch (t) {
+	RR_TYPE_NAME(DECOMPRESSOR);
+	RR_TYPE_NAME(INITRD);
+	RR_TYPE_NAME(VMLINUX);
+	RR_TYPE_NAME(AMODE31);
+	RR_TYPE_NAME(IPLREPORT);
+	RR_TYPE_NAME(CERT_COMP_LIST);
+	RR_TYPE_NAME(MEM_DETECT_EXTENDED);
+	RR_TYPE_NAME(VMEM);
+#ifdef CONFIG_KASAN
+	RR_TYPE_NAME(KASAN);
+#endif
+	default:
+		return "UNKNOWN";
+	}
+}
+
+#define for_each_physmem_reserved_type_range(t, range, p_start, p_end) \
+	for (range = &physmem_info.reserved[t], *p_start = range->start, *p_end = range->end; \
+	     range && range->end; range = range->chain, \
+	     *p_start = range ? range->start : 0, *p_end = range ? range->end : 0)
+
+static inline struct reserved_range *__physmem_reserved_next(enum reserved_range_type *t,
+							     struct reserved_range *range)
+{
+	if (!range) {
+		range = &physmem_info.reserved[*t];
+		if (range->end)
+			return range;
+	}
+	if (range->chain)
+		return range->chain;
+	while (++*t < RR_MAX) {
+		range = &physmem_info.reserved[*t];
+		if (range->end)
+			return range;
+	}
+	return NULL;
+}
+
+#define for_each_physmem_reserved_range(t, range, p_start, p_end) \
+	for (t = 0, range = __physmem_reserved_next(&t, NULL), \
+	     *p_start = range ? range->start : 0, *p_end = range ? range->end : 0; \
+	     range; range = __physmem_reserved_next(&t, range), \
+	     *p_start = range ? range->start : 0, *p_end = range ? range->end : 0)
+
 static inline unsigned long get_physmem_usable_total(void)
 {
 	unsigned long start, end, total = 0;
···
 	return total;
 }
 
-static inline void get_physmem_reserved(unsigned long *start, unsigned long *size)
+static inline unsigned long get_physmem_reserved(enum reserved_range_type type,
+						 unsigned long *addr, unsigned long *size)
 {
-	*start = (unsigned long)physmem_info.online_extended;
-	if (physmem_info.range_count > MEM_INLINED_ENTRIES)
-		*size = (physmem_info.range_count - MEM_INLINED_ENTRIES) *
-			sizeof(struct physmem_range);
-	else
-		*size = 0;
-}
-
-static inline unsigned long get_physmem_usable_end(void)
-{
-	unsigned long start;
-	unsigned long end;
-
-	if (physmem_info.usable)
-		return physmem_info.usable;
-	if (physmem_info.range_count) {
-		__get_physmem_range(physmem_info.range_count - 1, &start, &end, false);
-		return end;
-	}
-	return 0;
+	*addr = physmem_info.reserved[type].start;
+	*size = physmem_info.reserved[type].end - physmem_info.reserved[type].start;
+	return *size;
 }
 
 #endif
arch/s390/include/asm/setup.h | -9
···
 
 extern int noexec_disabled;
 extern unsigned long ident_map_size;
-extern unsigned long pgalloc_pos;
-extern unsigned long pgalloc_end;
-extern unsigned long pgalloc_low;
-extern unsigned long __amode31_base;
 
 /* The Write Back bit position in the physaddr is given by the SLPC PCI */
 extern unsigned long mio_wb_bit_mask;
···
 	return __kaslr_offset;
 }
 
-struct initrd_data {
-	unsigned long start;
-	unsigned long size;
-};
-extern struct initrd_data initrd_data;
 
 struct oldmem_data {
 	unsigned long start;
arch/s390/kernel/setup.c | +21 -55
···
 int __bootdata(noexec_disabled);
 unsigned long __bootdata(ident_map_size);
 struct physmem_info __bootdata(physmem_info);
-struct initrd_data __bootdata(initrd_data);
-unsigned long __bootdata(pgalloc_pos);
-unsigned long __bootdata(pgalloc_end);
-unsigned long __bootdata(pgalloc_low);
 
 unsigned long __bootdata_preserved(__kaslr_offset);
-unsigned long __bootdata(__amode31_base);
 unsigned int __bootdata_preserved(zlib_dfltcc_support);
 EXPORT_SYMBOL(zlib_dfltcc_support);
 u64 __bootdata_preserved(stfle_fac_list[16]);
···
  */
 static void __init reserve_pgtables(void)
 {
-	memblock_reserve(pgalloc_pos, pgalloc_end - pgalloc_pos);
+	unsigned long start, end;
+	struct reserved_range *range;
+
+	for_each_physmem_reserved_type_range(RR_VMEM, range, &start, &end)
+		memblock_reserve(start, end - start);
 }
 
 /*
···
  */
 static void __init reserve_initrd(void)
 {
-#ifdef CONFIG_BLK_DEV_INITRD
-	if (!initrd_data.start || !initrd_data.size)
+	unsigned long addr, size;
+
+	if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD) || !get_physmem_reserved(RR_INITRD, &addr, &size))
 		return;
-	initrd_start = (unsigned long)__va(initrd_data.start);
-	initrd_end = initrd_start + initrd_data.size;
-	memblock_reserve(initrd_data.start, initrd_data.size);
-#endif
+	initrd_start = (unsigned long)__va(addr);
+	initrd_end = initrd_start + size;
+	memblock_reserve(addr, size);
 }
 
 /*
···
 
 static void __init reserve_physmem_info(void)
 {
-	unsigned long start, size;
+	unsigned long addr, size;
 
-	get_physmem_reserved(&start, &size);
-	if (size)
-		memblock_reserve(start, size);
+	if (get_physmem_reserved(RR_MEM_DETECT_EXTENDED, &addr, &size))
+		memblock_reserve(addr, size);
 }
 
 static void __init free_physmem_info(void)
 {
-	unsigned long start, size;
+	unsigned long addr, size;
 
-	get_physmem_reserved(&start, &size);
-	if (size)
-		memblock_phys_free(start, size);
-}
-
-static const char * __init get_mem_info_source(void)
-{
-	switch (physmem_info.info_source) {
-	case MEM_DETECT_SCLP_STOR_INFO:
-		return "sclp storage info";
-	case MEM_DETECT_DIAG260:
-		return "diag260";
-	case MEM_DETECT_SCLP_READ_INFO:
-		return "sclp read info";
-	case MEM_DETECT_BIN_SEARCH:
-		return "binary search";
-	}
-	return "none";
+	if (get_physmem_reserved(RR_MEM_DETECT_EXTENDED, &addr, &size))
+		memblock_phys_free(addr, size);
 }
 
 static void __init memblock_add_physmem_info(void)
···
 	int i;
 
 	pr_debug("physmem info source: %s (%hhd)\n",
-		 get_mem_info_source(), physmem_info.info_source);
+		 get_physmem_info_source(), physmem_info.info_source);
 	/* keep memblock lists close to the kernel */
 	memblock_set_bottom_up(true);
 	for_each_physmem_usable_range(i, &start, &end)
···
 }
 
 /*
- * Check for initrd being in usable memory
- */
-static void __init check_initrd(void)
-{
-#ifdef CONFIG_BLK_DEV_INITRD
-	if (initrd_data.start && initrd_data.size &&
-	    !memblock_is_region_memory(initrd_data.start, initrd_data.size)) {
-		pr_err("The initial RAM disk does not fit into the memory\n");
-		memblock_phys_free(initrd_data.start, initrd_data.size);
-		initrd_start = initrd_end = 0;
-	}
-#endif
-}
-
-/*
  * Reserve memory used for lowcore/command line/kernel image.
  */
 static void __init reserve_kernel(void)
···
 	memblock_reserve(0, STARTUP_NORMAL_OFFSET);
 	memblock_reserve(OLDMEM_BASE, sizeof(unsigned long));
 	memblock_reserve(OLDMEM_SIZE, sizeof(unsigned long));
-	memblock_reserve(__amode31_base, __eamode31 - __samode31);
+	memblock_reserve(physmem_info.reserved[RR_AMODE31].start, __eamode31 - __samode31);
 	memblock_reserve(__pa(sclp_early_sccb), EXT_SCCB_READ_SCP);
 	memblock_reserve(__pa(_stext), _end - _stext);
 }
···
 static void __init relocate_amode31_section(void)
 {
 	unsigned long amode31_size = __eamode31 - __samode31;
-	long amode31_offset = __amode31_base - __samode31;
+	long amode31_offset = physmem_info.reserved[RR_AMODE31].start - __samode31;
 	long *ptr;
 
 	pr_info("Relocating AMODE31 section of size 0x%08lx\n", amode31_size);
 
 	/* Move original AMODE31 section to the new one */
-	memmove((void *)__amode31_base, (void *)__samode31, amode31_size);
+	memmove((void *)physmem_info.reserved[RR_AMODE31].start, (void *)__samode31, amode31_size);
 	/* Zero out the old AMODE31 section to catch invalid accesses within it */
 	memset((void *)__samode31, 0, amode31_size);
···
 	if (MACHINE_HAS_EDAT2)
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
 
-	check_initrd();
 	reserve_crashkernel();
 #ifdef CONFIG_CRASH_DUMP
 	/*
arch/s390/mm/kasan_init.c | +18 -21
···
 // SPDX-License-Identifier: GPL-2.0
-#include <linux/kasan.h>
-#include <linux/sched/task.h>
+#include <linux/memblock.h>
 #include <linux/pgtable.h>
-#include <asm/pgalloc.h>
-#include <asm/kasan.h>
+#include <linux/kasan.h>
 #include <asm/physmem_info.h>
 #include <asm/processor.h>
-#include <asm/sclp.h>
 #include <asm/facility.h>
-#include <asm/sections.h>
-#include <asm/setup.h>
-#include <asm/uv.h>
+#include <asm/pgalloc.h>
+#include <asm/sclp.h>
 
+static unsigned long pgalloc_pos __initdata;
 static unsigned long segment_pos __initdata;
-static unsigned long segment_low __initdata;
 static bool has_edat __initdata;
 static bool has_nx __initdata;
···
 
 static void * __init kasan_early_alloc_segment(void)
 {
-	segment_pos -= _SEGMENT_SIZE;
+	unsigned long addr = segment_pos;
 
-	if (segment_pos < segment_low)
+	segment_pos += _SEGMENT_SIZE;
+	if (segment_pos > pgalloc_pos)
 		kasan_early_panic("out of memory during initialisation\n");
 
-	return __va(segment_pos);
+	return __va(addr);
 }
 
 static void * __init kasan_early_alloc_pages(unsigned int order)
 {
 	pgalloc_pos -= (PAGE_SIZE << order);
 
-	if (pgalloc_pos < pgalloc_low)
+	if (segment_pos > pgalloc_pos)
 		kasan_early_panic("out of memory during initialisation\n");
 
 	return __va(pgalloc_pos);
···
 	pmd_t pmd_z = __pmd(__pa(kasan_early_shadow_pte) | _SEGMENT_ENTRY);
 	pud_t pud_z = __pud(__pa(kasan_early_shadow_pmd) | _REGION3_ENTRY);
 	p4d_t p4d_z = __p4d(__pa(kasan_early_shadow_pud) | _REGION2_ENTRY);
+	unsigned long pgalloc_pos_initial, segment_pos_initial;
 	unsigned long untracked_end = MODULES_VADDR;
-	unsigned long shadow_alloc_size;
 	unsigned long start, end;
 	int i;
···
 	crst_table_init((unsigned long *)kasan_early_shadow_pmd, pmd_val(pmd_z));
 	memset64((u64 *)kasan_early_shadow_pte, pte_val(pte_z), PTRS_PER_PTE);
 
-	if (has_edat) {
-		shadow_alloc_size = get_physmem_usable_total() >> KASAN_SHADOW_SCALE_SHIFT;
-		segment_pos = round_down(pgalloc_pos, _SEGMENT_SIZE);
-		segment_low = segment_pos - shadow_alloc_size;
-		segment_low = round_down(segment_low, _SEGMENT_SIZE);
-		pgalloc_pos = segment_low;
-	}
+	/* segment allocations go bottom up -> <- pgalloc go top down */
+	segment_pos_initial = physmem_info.reserved[RR_KASAN].start;
+	segment_pos = segment_pos_initial;
+	pgalloc_pos_initial = physmem_info.reserved[RR_KASAN].end;
+	pgalloc_pos = pgalloc_pos_initial;
 	/*
 	 * Current memory layout:
 	 * +- 0 -------------+	 +- shadow start -+
···
 	/* enable kasan */
 	init_task.kasan_depth = 0;
 	sclp_early_printk("KernelAddressSanitizer initialized\n");
+	memblock_reserve(segment_pos_initial, segment_pos - segment_pos_initial);
+	memblock_reserve(pgalloc_pos, pgalloc_pos_initial - pgalloc_pos);
 }