Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:
"Fixes and new features for pstore.

This is a pretty big set of changes (relative to past pstore pulls),
but it has been in -next for a while. The biggest change here is the
ability to support a block device as a pstore backend, which has been
desired for a while. A lot of additional fixes and refactorings are
also included, mostly in support of the new features.

- refactor pstore locking for safer module unloading (Kees Cook)

- remove orphaned records from pstorefs when backend unloaded (Kees
Cook)

- refactor dump_oops parameter into max_reason (Pavel Tatashin)

- introduce pstore/zone for common code for contiguous storage
(WeiXiong Liao)

- introduce pstore/blk for block device backend (WeiXiong Liao)

- introduce mtd backend (WeiXiong Liao)"

* tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (35 commits)
mtd: Support kmsg dumper based on pstore/blk
pstore/blk: Introduce "best_effort" mode
pstore/blk: Support non-block storage devices
pstore/blk: Provide way to query pstore configuration
pstore/zone: Provide way to skip "broken" zone for MTD devices
Documentation: Add details for pstore/blk
pstore/zone,blk: Add ftrace frontend support
pstore/zone,blk: Add console frontend support
pstore/zone,blk: Add support for pmsg frontend
pstore/blk: Introduce backend for block devices
pstore/zone: Introduce common layer to manage storage zones
ramoops: Add "max-reason" optional field to ramoops DT node
pstore/ram: Introduce max_reason and convert dump_oops
pstore/platform: Pass max_reason to kmesg dump
printk: Introduce kmsg_dump_reason_str()
printk: honor the max_reason field in kmsg_dumper
printk: Collapse shutdown types into a single dump reason
pstore/ftrace: Provide ftrace log merging routine
pstore/ram: Refactor ftrace buffer merging
pstore/ram: Refactor DT size parsing
...

+3464 -206
+243
Documentation/admin-guide/pstore-blk.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0 2 + 3 + pstore block oops/panic logger 4 + ============================== 5 + 6 + Introduction 7 + ------------ 8 + 9 + pstore block (pstore/blk) is an oops/panic logger that writes its logs to a 10 + block device (or non-block device) before the system crashes. You can get 11 + these log files by mounting the pstore filesystem like:: 12 + 13 + mount -t pstore pstore /sys/fs/pstore 14 + 15 + 16 + pstore block concepts 17 + --------------------- 18 + 19 + pstore/blk provides an efficient configuration method for pstore/blk, which 20 + divides all configurations into two parts: configurations for the user and 21 + configurations for the driver. 22 + 23 + Configurations for the user determine how pstore/blk works, such as pmsg_size, 24 + kmsg_size and so on. All of them support both Kconfig and module parameters, 25 + but module parameters take priority over Kconfig. 26 + 27 + Configurations for the driver describe the block (or non-block) device, 28 + such as the total_size of the device and its read/write operations. 29 + 30 + Configurations for user 31 + ----------------------- 32 + 33 + All of these configurations support both Kconfig and module parameters, but 34 + module parameters take priority over Kconfig. 35 + 36 + Here is an example of module parameters:: 37 + 38 + pstore_blk.blkdev=179:7 pstore_blk.kmsg_size=64 39 + 40 + The details of each configuration are described below. 41 + 42 + blkdev 43 + ~~~~~~ 44 + 45 + The block device to use. Most of the time, it is a partition of a block device. 46 + It's required for pstore/blk. It is also used for MTD devices. 47 + 48 + It accepts the following variants for block devices: 49 + 50 + 1. <hex_major><hex_minor> device number in hexadecimal, with no 51 + leading 0x, for example b302. 52 + #. /dev/<disk_name> represents the device number of the disk 53 + #. 
/dev/<disk_name><decimal> represents the device number of a partition - the device 54 + number of the disk plus the partition number 55 + #. /dev/<disk_name>p<decimal> - same as the above; this form is used when the disk 56 + name of a partitioned disk ends with a digit. 57 + #. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of 58 + a partition if the partition table provides it. The UUID may be either an 59 + EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP, 60 + where SSSSSSSS is a zero-filled hex representation of the 32-bit 61 + "NT disk signature", and PP is a zero-filled hex representation of the 62 + 1-based partition number. 63 + #. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a 64 + partition with a known unique id. 65 + #. <major>:<minor> major and minor number of the device separated by a colon. 66 + 67 + It accepts the following variants for MTD devices: 68 + 69 + 1. <device name> MTD device name. "pstore" is recommended. 70 + #. <device number> MTD device number. 71 + 72 + kmsg_size 73 + ~~~~~~~~~ 74 + 75 + The chunk size in KB for the oops/panic front-end. It **MUST** be a multiple of 4. 76 + It's optional if you do not care about the oops/panic log. 77 + 78 + The oops/panic front-end uses multiple chunks; their number depends on the 79 + space remaining after the other pstore front-ends have been allocated. 80 + 81 + pstore/blk will log to the oops/panic chunks one by one, and always overwrite the 82 + oldest chunk if there is no free chunk left. 83 + 84 + pmsg_size 85 + ~~~~~~~~~ 86 + 87 + The chunk size in KB for the pmsg front-end. It **MUST** be a multiple of 4. 88 + It's optional if you do not care about the pmsg log. 89 + 90 + Unlike the oops/panic front-end, there is only one chunk for the pmsg front-end. 91 + 92 + Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are 93 + appended to the chunk. On reboot the contents are available in 94 + */sys/fs/pstore/pmsg-pstore-blk-0*. 
95 + 96 + console_size 97 + ~~~~~~~~~~~~ 98 + 99 + The chunk size in KB for the console front-end. It **MUST** be a multiple of 4. 100 + It's optional if you do not care about the console log. 101 + 102 + Similar to the pmsg front-end, there is only one chunk for the console front-end. 103 + 104 + All console log is appended to the chunk. On reboot the contents are 105 + available in */sys/fs/pstore/console-pstore-blk-0*. 106 + 107 + ftrace_size 108 + ~~~~~~~~~~~ 109 + 110 + The chunk size in KB for the ftrace front-end. It **MUST** be a multiple of 4. 111 + It's optional if you do not care about the ftrace log. 112 + 113 + Similar to the oops front-end, there are multiple chunks for the ftrace 114 + front-end, one per CPU. Each chunk's size is equal to 115 + ftrace_size / processors_count. 116 + 117 + All ftrace log is appended to the per-CPU chunks. On reboot the contents are 118 + combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*. 119 + 120 + Persistent function tracing might be useful for debugging software or hardware 121 + related hangs. Here is an example of usage:: 122 + 123 + # mount -t pstore pstore /sys/fs/pstore 124 + # mount -t debugfs debugfs /sys/kernel/debug/ 125 + # echo 1 > /sys/kernel/debug/pstore/record_ftrace 126 + # reboot -f 127 + [...] 
128 + # mount -t pstore pstore /sys/fs/pstore 129 + # tail /sys/fs/pstore/ftrace-pstore-blk-0 130 + CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0 131 + CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48 132 + CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314 133 + CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314 134 + CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204 135 + CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204 136 + CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358 137 + CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358 138 + 139 + max_reason 140 + ~~~~~~~~~~ 141 + 142 + Limiting which kinds of kmsg dumps are stored can be controlled via 143 + the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's 144 + ``enum kmsg_dump_reason``. For example, to store both Oopses and Panics, 145 + ``max_reason`` should be set to 2 (KMSG_DUMP_OOPS), to store only Panics 146 + ``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0 147 + (KMSG_DUMP_UNDEF), means the reason filtering will be controlled by the 148 + ``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS, 149 + otherwise KMSG_DUMP_MAX. 150 + 151 + Configurations for driver 152 + ------------------------- 153 + 154 + Only a block device driver cares about these configurations. A block device 155 + driver uses ``register_pstore_blk`` to register to pstore/blk. 156 + 157 + .. kernel-doc:: fs/pstore/blk.c 158 + :identifiers: register_pstore_blk 159 + 160 + A non-block device driver uses ``register_pstore_device`` with 161 + ``struct pstore_device_info`` to register to pstore/blk. 162 + 163 + .. kernel-doc:: fs/pstore/blk.c 164 + :identifiers: register_pstore_device 165 + 166 + .. 
kernel-doc:: include/linux/pstore_blk.h 167 + :identifiers: pstore_device_info 168 + 169 + Compression and header 170 + ---------------------- 171 + 172 + Block device is large enough for uncompressed oops data. Actually we do not 173 + recommend data compression because pstore/blk will insert some information into 174 + the first line of oops/panic data. For example:: 175 + 176 + Panic: Total 16 times 177 + 178 + It means that it's OOPS|Panic for the 16th time since the first booting. 179 + Sometimes the number of occurrences of oops|panic since the first booting is 180 + important to judge whether the system is stable. 181 + 182 + The following line is inserted by pstore filesystem. For example:: 183 + 184 + Oops#2 Part1 185 + 186 + It means that it's OOPS for the 2nd time on the last boot. 187 + 188 + Reading the data 189 + ---------------- 190 + 191 + The dump data can be read from the pstore filesystem. The format for these 192 + files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end, 193 + ``pmsg-pstore-blk-0`` for pmsg front-end and so on. The timestamp of the 194 + dump file records the trigger time. To delete a stored record from block 195 + device, simply unlink the respective pstore file. 196 + 197 + Attentions in panic read/write APIs 198 + ----------------------------------- 199 + 200 + If on panic, the kernel is not going to run for much longer, the tasks will not 201 + be scheduled and most kernel resources will be out of service. It 202 + looks like a single-threaded program running on a single-core computer. 203 + 204 + The following points require special attention for panic read/write APIs: 205 + 206 + 1. Can **NOT** allocate any memory. 207 + If you need memory, just allocate while the block driver is initializing 208 + rather than waiting until the panic. 209 + #. Must be polled, **NOT** interrupt driven. 210 + No task schedule any more. The block driver should delay to ensure the write 211 + succeeds, but NOT sleep. 212 + #. 
Can **NOT** take any lock. 213 + There is no other task, nor any shared resource; you are safe to break all 214 + locks. 215 + #. Just use CPU to transfer. 216 + Do not use DMA to transfer unless you are sure that DMA will not keep lock. 217 + #. Control registers directly. 218 + Please control registers directly rather than use Linux kernel resources. 219 + Do I/O map while initializing rather than wait until a panic occurs. 220 + #. Reset your block device and controller if necessary. 221 + If you are not sure of the state of your block device and controller when 222 + a panic occurs, you are safe to stop and reset them. 223 + 224 + pstore/blk supports psblk_blkdev_info(), which is defined in 225 + *linux/pstore_blk.h*, to get information of using block device, such as the 226 + device number, sector count and start sector of the whole disk. 227 + 228 + pstore block internals 229 + ---------------------- 230 + 231 + For developer reference, here are all the important structures and APIs: 232 + 233 + .. kernel-doc:: fs/pstore/zone.c 234 + :internal: 235 + 236 + .. kernel-doc:: include/linux/pstore_zone.h 237 + :internal: 238 + 239 + .. kernel-doc:: fs/pstore/blk.c 240 + :export: 241 + 242 + .. kernel-doc:: include/linux/pstore_blk.h 243 + :internal:
+10 -4
Documentation/admin-guide/ramoops.rst
··· 32 32 memory are implementation defined, and won't work on many ARMs such as omaps. 33 33 34 34 The memory area is divided into ``record_size`` chunks (also rounded down to 35 - power of two) and each oops/panic writes a ``record_size`` chunk of 35 + power of two) and each kmsg dump writes a ``record_size`` chunk of 36 36 information. 37 37 38 - Dumping both oopses and panics can be done by setting 1 in the ``dump_oops`` 39 - variable while setting 0 in that variable dumps only the panics. 38 + Limiting which kinds of kmsg dumps are stored can be controlled via 39 + the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's 40 + ``enum kmsg_dump_reason``. For example, to store both Oopses and Panics, 41 + ``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics, 42 + ``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0 43 + (KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the 44 + ``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS, 45 + otherwise KMSG_DUMP_MAX. 40 46 41 47 The module uses a counter to record multiple dumps but the counter gets reset 42 48 on restart (i.e. new dumps after the restart will overwrite old ones). ··· 96 90 .mem_address = <...>, 97 91 .mem_type = <...>, 98 92 .record_size = <...>, 99 - .dump_oops = <...>, 93 + .max_reason = <...>, 100 94 .ecc = <...>, 101 95 }; 102 96
+11 -2
Documentation/devicetree/bindings/reserved-memory/ramoops.txt
··· 30 30 - ecc-size: enables ECC support and specifies ECC buffer size in bytes 31 31 (defaults to 0: no ECC) 32 32 33 - - record-size: maximum size in bytes of each dump done on oops/panic 33 + - record-size: maximum size in bytes of each kmsg dump. 34 34 (defaults to 0: disabled) 35 35 36 36 - console-size: size in bytes of log buffer reserved for kernel messages ··· 45 45 - unbuffered: if present, use unbuffered mappings to map the reserved region 46 46 (defaults to buffered mappings) 47 47 48 - - no-dump-oops: if present, only dump panics (defaults to panics and oops) 48 + - max-reason: if present, sets maximum type of kmsg dump reasons to store 49 + (defaults to 2: log Oopses and Panics). This can be set to INT_MAX to 50 + store all kmsg dumps. See include/linux/kmsg_dump.h KMSG_DUMP_* for other 51 + kmsg dump reason values. Setting this to 0 (KMSG_DUMP_UNDEF) means the 52 + reason filtering will be controlled by the printk.always_kmsg_dump boot 53 + param: if unset, it will be KMSG_DUMP_OOPS, otherwise KMSG_DUMP_MAX. 54 + 55 + - no-dump-oops: deprecated, use max-reason instead. If present, and 56 + max-reason is not specified, it is equivalent to max-reason = 1 57 + (KMSG_DUMP_PANIC). 49 58 50 59 - flags: if present, pass ramoops behavioral flags (defaults to 0, 51 60 see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
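The properties above can be combined into a reserved-memory node like the following sketch. This is a hypothetical usage example, not part of the patch: the node name, carve-out address, and sizes are placeholder values chosen for illustration.

```
reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	/* hypothetical carve-out; address and sizes are examples only */
	ramoops@8f000000 {
		compatible = "ramoops";
		reg = <0x8f000000 0x100000>;
		record-size = <0x4000>;
		console-size = <0x4000>;
		max-reason = <2>;	/* KMSG_DUMP_OOPS: store Oopses and Panics */
	};
};
```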
+1
MAINTAINERS
··· 13715 13715 S: Maintained 13716 13716 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore 13717 13717 F: Documentation/admin-guide/ramoops.rst 13718 + F: Documentation/admin-guide/pstore-blk.rst 13718 13719 F: Documentation/devicetree/bindings/reserved-memory/ramoops.txt 13719 13720 F: drivers/acpi/apei/erst.c 13720 13721 F: drivers/firmware/efi/efi-pstore.c
+1 -3
arch/powerpc/kernel/nvram_64.c
··· 655 655 int rc = -1; 656 656 657 657 switch (reason) { 658 - case KMSG_DUMP_RESTART: 659 - case KMSG_DUMP_HALT: 660 - case KMSG_DUMP_POWEROFF: 658 + case KMSG_DUMP_SHUTDOWN: 661 659 /* These are almost always orderly shutdowns. */ 662 660 return; 663 661 case KMSG_DUMP_OOPS:
+10
drivers/mtd/Kconfig
··· 170 170 buffer in a flash partition where it can be read back at some 171 171 later point. 172 172 173 + config MTD_PSTORE 174 + tristate "Log panic/oops to an MTD buffer based on pstore" 175 + depends on PSTORE_BLK 176 + help 177 + This enables panic and oops messages to be logged to a circular 178 + buffer in a flash partition where it can be read back as files after 179 + mounting pstore filesystem. 180 + 181 + If unsure, say N. 182 + 173 183 config MTD_SWAP 174 184 tristate "Swap on MTD device support" 175 185 depends on MTD && SWAP
+1
drivers/mtd/Makefile
··· 20 20 obj-$(CONFIG_SSFDC) += ssfdc.o 21 21 obj-$(CONFIG_SM_FTL) += sm_ftl.o 22 22 obj-$(CONFIG_MTD_OOPS) += mtdoops.o 23 + obj-$(CONFIG_MTD_PSTORE) += mtdpstore.o 23 24 obj-$(CONFIG_MTD_SWAP) += mtdswap.o 24 25 25 26 nftl-objs := nftlcore.o nftlmount.o
+578
drivers/mtd/mtdpstore.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #define dev_fmt(fmt) "mtdoops-pstore: " fmt 4 + 5 + #include <linux/kernel.h> 6 + #include <linux/module.h> 7 + #include <linux/pstore_blk.h> 8 + #include <linux/mtd/mtd.h> 9 + #include <linux/bitops.h> 10 + 11 + static struct mtdpstore_context { 12 + int index; 13 + struct pstore_blk_config info; 14 + struct pstore_device_info dev; 15 + struct mtd_info *mtd; 16 + unsigned long *rmmap; /* removed bit map */ 17 + unsigned long *usedmap; /* used bit map */ 18 + /* 19 + * used for panic write 20 + * As there are no block_isbad for panic case, we should keep this 21 + * status before panic to ensure panic_write not failed. 22 + */ 23 + unsigned long *badmap; /* bad block bit map */ 24 + } oops_cxt; 25 + 26 + static int mtdpstore_block_isbad(struct mtdpstore_context *cxt, loff_t off) 27 + { 28 + int ret; 29 + struct mtd_info *mtd = cxt->mtd; 30 + u64 blknum; 31 + 32 + off = ALIGN_DOWN(off, mtd->erasesize); 33 + blknum = div_u64(off, mtd->erasesize); 34 + 35 + if (test_bit(blknum, cxt->badmap)) 36 + return true; 37 + ret = mtd_block_isbad(mtd, off); 38 + if (ret < 0) { 39 + dev_err(&mtd->dev, "mtd_block_isbad failed, aborting\n"); 40 + return ret; 41 + } else if (ret > 0) { 42 + set_bit(blknum, cxt->badmap); 43 + return true; 44 + } 45 + return false; 46 + } 47 + 48 + static inline int mtdpstore_panic_block_isbad(struct mtdpstore_context *cxt, 49 + loff_t off) 50 + { 51 + struct mtd_info *mtd = cxt->mtd; 52 + u64 blknum; 53 + 54 + off = ALIGN_DOWN(off, mtd->erasesize); 55 + blknum = div_u64(off, mtd->erasesize); 56 + return test_bit(blknum, cxt->badmap); 57 + } 58 + 59 + static inline void mtdpstore_mark_used(struct mtdpstore_context *cxt, 60 + loff_t off) 61 + { 62 + struct mtd_info *mtd = cxt->mtd; 63 + u64 zonenum = div_u64(off, cxt->info.kmsg_size); 64 + 65 + dev_dbg(&mtd->dev, "mark zone %llu used\n", zonenum); 66 + set_bit(zonenum, cxt->usedmap); 67 + } 68 + 69 + static inline void mtdpstore_mark_unused(struct 
mtdpstore_context *cxt, 70 + loff_t off) 71 + { 72 + struct mtd_info *mtd = cxt->mtd; 73 + u64 zonenum = div_u64(off, cxt->info.kmsg_size); 74 + 75 + dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum); 76 + clear_bit(zonenum, cxt->usedmap); 77 + } 78 + 79 + static inline void mtdpstore_block_mark_unused(struct mtdpstore_context *cxt, 80 + loff_t off) 81 + { 82 + struct mtd_info *mtd = cxt->mtd; 83 + u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size; 84 + u64 zonenum; 85 + 86 + off = ALIGN_DOWN(off, mtd->erasesize); 87 + zonenum = div_u64(off, cxt->info.kmsg_size); 88 + while (zonecnt > 0) { 89 + dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum); 90 + clear_bit(zonenum, cxt->usedmap); 91 + zonenum++; 92 + zonecnt--; 93 + } 94 + } 95 + 96 + static inline int mtdpstore_is_used(struct mtdpstore_context *cxt, loff_t off) 97 + { 98 + u64 zonenum = div_u64(off, cxt->info.kmsg_size); 99 + u64 blknum = div_u64(off, cxt->mtd->erasesize); 100 + 101 + if (test_bit(blknum, cxt->badmap)) 102 + return true; 103 + return test_bit(zonenum, cxt->usedmap); 104 + } 105 + 106 + static int mtdpstore_block_is_used(struct mtdpstore_context *cxt, 107 + loff_t off) 108 + { 109 + struct mtd_info *mtd = cxt->mtd; 110 + u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size; 111 + u64 zonenum; 112 + 113 + off = ALIGN_DOWN(off, mtd->erasesize); 114 + zonenum = div_u64(off, cxt->info.kmsg_size); 115 + while (zonecnt > 0) { 116 + if (test_bit(zonenum, cxt->usedmap)) 117 + return true; 118 + zonenum++; 119 + zonecnt--; 120 + } 121 + return false; 122 + } 123 + 124 + static int mtdpstore_is_empty(struct mtdpstore_context *cxt, char *buf, 125 + size_t size) 126 + { 127 + struct mtd_info *mtd = cxt->mtd; 128 + size_t sz; 129 + int i; 130 + 131 + sz = min_t(uint32_t, size, mtd->writesize / 4); 132 + for (i = 0; i < sz; i++) { 133 + if (buf[i] != (char)0xFF) 134 + return false; 135 + } 136 + return true; 137 + } 138 + 139 + static void mtdpstore_mark_removed(struct mtdpstore_context *cxt, loff_t 
off) 140 + { 141 + struct mtd_info *mtd = cxt->mtd; 142 + u64 zonenum = div_u64(off, cxt->info.kmsg_size); 143 + 144 + dev_dbg(&mtd->dev, "mark zone %llu removed\n", zonenum); 145 + set_bit(zonenum, cxt->rmmap); 146 + } 147 + 148 + static void mtdpstore_block_clear_removed(struct mtdpstore_context *cxt, 149 + loff_t off) 150 + { 151 + struct mtd_info *mtd = cxt->mtd; 152 + u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size; 153 + u64 zonenum; 154 + 155 + off = ALIGN_DOWN(off, mtd->erasesize); 156 + zonenum = div_u64(off, cxt->info.kmsg_size); 157 + while (zonecnt > 0) { 158 + clear_bit(zonenum, cxt->rmmap); 159 + zonenum++; 160 + zonecnt--; 161 + } 162 + } 163 + 164 + static int mtdpstore_block_is_removed(struct mtdpstore_context *cxt, 165 + loff_t off) 166 + { 167 + struct mtd_info *mtd = cxt->mtd; 168 + u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size; 169 + u64 zonenum; 170 + 171 + off = ALIGN_DOWN(off, mtd->erasesize); 172 + zonenum = div_u64(off, cxt->info.kmsg_size); 173 + while (zonecnt > 0) { 174 + if (test_bit(zonenum, cxt->rmmap)) 175 + return true; 176 + zonenum++; 177 + zonecnt--; 178 + } 179 + return false; 180 + } 181 + 182 + static int mtdpstore_erase_do(struct mtdpstore_context *cxt, loff_t off) 183 + { 184 + struct mtd_info *mtd = cxt->mtd; 185 + struct erase_info erase; 186 + int ret; 187 + 188 + off = ALIGN_DOWN(off, cxt->mtd->erasesize); 189 + dev_dbg(&mtd->dev, "try to erase off 0x%llx\n", off); 190 + erase.len = cxt->mtd->erasesize; 191 + erase.addr = off; 192 + ret = mtd_erase(cxt->mtd, &erase); 193 + if (!ret) 194 + mtdpstore_block_clear_removed(cxt, off); 195 + else 196 + dev_err(&mtd->dev, "erase of region [0x%llx, 0x%llx] on \"%s\" failed\n", 197 + (unsigned long long)erase.addr, 198 + (unsigned long long)erase.len, cxt->info.device); 199 + return ret; 200 + } 201 + 202 + /* 203 + * called while removing file 204 + * 205 + * Avoiding over erasing, do erase block only when the whole block is unused. 
206 + * If the block contains valid log, do erase lazily on flush_removed() when 207 + * unregister. 208 + */ 209 + static ssize_t mtdpstore_erase(size_t size, loff_t off) 210 + { 211 + struct mtdpstore_context *cxt = &oops_cxt; 212 + 213 + if (mtdpstore_block_isbad(cxt, off)) 214 + return -EIO; 215 + 216 + mtdpstore_mark_unused(cxt, off); 217 + 218 + /* If the block still has valid data, mtdpstore do erase lazily */ 219 + if (likely(mtdpstore_block_is_used(cxt, off))) { 220 + mtdpstore_mark_removed(cxt, off); 221 + return 0; 222 + } 223 + 224 + /* all zones are unused, erase it */ 225 + return mtdpstore_erase_do(cxt, off); 226 + } 227 + 228 + /* 229 + * What is security for mtdpstore? 230 + * As there is no erase for panic case, we should ensure at least one zone 231 + * is writable. Otherwise, panic write will fail. 232 + * If zone is used, write operation will return -ENOMSG, which means that 233 + * pstore/blk will try one by one until gets an empty zone. So, it is not 234 + * needed to ensure the next zone is empty, but at least one. 
235 + */ 236 + static int mtdpstore_security(struct mtdpstore_context *cxt, loff_t off) 237 + { 238 + int ret = 0, i; 239 + struct mtd_info *mtd = cxt->mtd; 240 + u32 zonenum = (u32)div_u64(off, cxt->info.kmsg_size); 241 + u32 zonecnt = (u32)div_u64(cxt->mtd->size, cxt->info.kmsg_size); 242 + u32 blkcnt = (u32)div_u64(cxt->mtd->size, cxt->mtd->erasesize); 243 + u32 erasesize = cxt->mtd->erasesize; 244 + 245 + for (i = 0; i < zonecnt; i++) { 246 + u32 num = (zonenum + i) % zonecnt; 247 + 248 + /* found empty zone */ 249 + if (!test_bit(num, cxt->usedmap)) 250 + return 0; 251 + } 252 + 253 + /* If there is no any empty zone, we have no way but to do erase */ 254 + while (blkcnt--) { 255 + div64_u64_rem(off + erasesize, cxt->mtd->size, (u64 *)&off); 256 + 257 + if (mtdpstore_block_isbad(cxt, off)) 258 + continue; 259 + 260 + ret = mtdpstore_erase_do(cxt, off); 261 + if (!ret) { 262 + mtdpstore_block_mark_unused(cxt, off); 263 + break; 264 + } 265 + } 266 + 267 + if (ret) 268 + dev_err(&mtd->dev, "all blocks bad!\n"); 269 + dev_dbg(&mtd->dev, "end security\n"); 270 + return ret; 271 + } 272 + 273 + static ssize_t mtdpstore_write(const char *buf, size_t size, loff_t off) 274 + { 275 + struct mtdpstore_context *cxt = &oops_cxt; 276 + struct mtd_info *mtd = cxt->mtd; 277 + size_t retlen; 278 + int ret; 279 + 280 + if (mtdpstore_block_isbad(cxt, off)) 281 + return -ENOMSG; 282 + 283 + /* zone is used, please try next one */ 284 + if (mtdpstore_is_used(cxt, off)) 285 + return -ENOMSG; 286 + 287 + dev_dbg(&mtd->dev, "try to write off 0x%llx size %zu\n", off, size); 288 + ret = mtd_write(cxt->mtd, off, size, &retlen, (u_char *)buf); 289 + if (ret < 0 || retlen != size) { 290 + dev_err(&mtd->dev, "write failure at %lld (%zu of %zu written), err %d\n", 291 + off, retlen, size, ret); 292 + return -EIO; 293 + } 294 + mtdpstore_mark_used(cxt, off); 295 + 296 + mtdpstore_security(cxt, off); 297 + return retlen; 298 + } 299 + 300 + static inline bool mtdpstore_is_io_error(int ret) 
301 + { 302 + return ret < 0 && !mtd_is_bitflip(ret) && !mtd_is_eccerr(ret); 303 + } 304 + 305 + /* 306 + * All zones will be read as pstore/blk will read zone one by one when do 307 + * recover. 308 + */ 309 + static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off) 310 + { 311 + struct mtdpstore_context *cxt = &oops_cxt; 312 + struct mtd_info *mtd = cxt->mtd; 313 + size_t retlen, done; 314 + int ret; 315 + 316 + if (mtdpstore_block_isbad(cxt, off)) 317 + return -ENOMSG; 318 + 319 + dev_dbg(&mtd->dev, "try to read off 0x%llx size %zu\n", off, size); 320 + for (done = 0, retlen = 0; done < size; done += retlen) { 321 + retlen = 0; 322 + 323 + ret = mtd_read(cxt->mtd, off + done, size - done, &retlen, 324 + (u_char *)buf + done); 325 + if (mtdpstore_is_io_error(ret)) { 326 + dev_err(&mtd->dev, "read failure at %lld (%zu of %zu read), err %d\n", 327 + off + done, retlen, size - done, ret); 328 + /* the zone may be broken, try next one */ 329 + return -ENOMSG; 330 + } 331 + 332 + /* 333 + * ECC error. The impact on log data is so small. Maybe we can 334 + * still read it and try to understand. So mtdpstore just hands 335 + * over what it gets and user can judge whether the data is 336 + * valid or not. 337 + */ 338 + if (mtd_is_eccerr(ret)) { 339 + dev_err(&mtd->dev, "ecc error at %lld (%zu of %zu read), err %d\n", 340 + off + done, retlen, size - done, ret); 341 + /* driver may not set retlen when ecc error */ 342 + retlen = retlen == 0 ? 
size - done : retlen; 343 + } 344 + } 345 + 346 + if (mtdpstore_is_empty(cxt, buf, size)) 347 + mtdpstore_mark_unused(cxt, off); 348 + else 349 + mtdpstore_mark_used(cxt, off); 350 + 351 + mtdpstore_security(cxt, off); 352 + return retlen; 353 + } 354 + 355 + static ssize_t mtdpstore_panic_write(const char *buf, size_t size, loff_t off) 356 + { 357 + struct mtdpstore_context *cxt = &oops_cxt; 358 + struct mtd_info *mtd = cxt->mtd; 359 + size_t retlen; 360 + int ret; 361 + 362 + if (mtdpstore_panic_block_isbad(cxt, off)) 363 + return -ENOMSG; 364 + 365 + /* zone is used, please try next one */ 366 + if (mtdpstore_is_used(cxt, off)) 367 + return -ENOMSG; 368 + 369 + ret = mtd_panic_write(cxt->mtd, off, size, &retlen, (u_char *)buf); 370 + if (ret < 0 || size != retlen) { 371 + dev_err(&mtd->dev, "panic write failure at %lld (%zu of %zu read), err %d\n", 372 + off, retlen, size, ret); 373 + return -EIO; 374 + } 375 + mtdpstore_mark_used(cxt, off); 376 + 377 + return retlen; 378 + } 379 + 380 + static void mtdpstore_notify_add(struct mtd_info *mtd) 381 + { 382 + int ret; 383 + struct mtdpstore_context *cxt = &oops_cxt; 384 + struct pstore_blk_config *info = &cxt->info; 385 + unsigned long longcnt; 386 + 387 + if (!strcmp(mtd->name, info->device)) 388 + cxt->index = mtd->index; 389 + 390 + if (mtd->index != cxt->index || cxt->index < 0) 391 + return; 392 + 393 + dev_dbg(&mtd->dev, "found matching MTD device %s\n", mtd->name); 394 + 395 + if (mtd->size < info->kmsg_size * 2) { 396 + dev_err(&mtd->dev, "MTD partition %d not big enough\n", 397 + mtd->index); 398 + return; 399 + } 400 + /* 401 + * kmsg_size must be aligned to 4096 Bytes, which is limited by 402 + * psblk. The default value of kmsg_size is 64KB. If kmsg_size 403 + * is larger than erasesize, some errors will occur since mtdpsotre 404 + * is designed on it. 
405 + */ 406 + if (mtd->erasesize < info->kmsg_size) { 407 + dev_err(&mtd->dev, "eraseblock size of MTD partition %d too small\n", 408 + mtd->index); 409 + return; 410 + } 411 + if (unlikely(info->kmsg_size % mtd->writesize)) { 412 + dev_err(&mtd->dev, "record size %lu KB must align to write size %d KB\n", 413 + info->kmsg_size / 1024, 414 + mtd->writesize / 1024); 415 + return; 416 + } 417 + 418 + longcnt = BITS_TO_LONGS(div_u64(mtd->size, info->kmsg_size)); 419 + cxt->rmmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL); 420 + cxt->usedmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL); 421 + 422 + longcnt = BITS_TO_LONGS(div_u64(mtd->size, mtd->erasesize)); 423 + cxt->badmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL); 424 + 425 + cxt->dev.total_size = mtd->size; 426 + /* just support dmesg right now */ 427 + cxt->dev.flags = PSTORE_FLAGS_DMESG; 428 + cxt->dev.read = mtdpstore_read; 429 + cxt->dev.write = mtdpstore_write; 430 + cxt->dev.erase = mtdpstore_erase; 431 + cxt->dev.panic_write = mtdpstore_panic_write; 432 + 433 + ret = register_pstore_device(&cxt->dev); 434 + if (ret) { 435 + dev_err(&mtd->dev, "mtd%d register to psblk failed\n", 436 + mtd->index); 437 + return; 438 + } 439 + cxt->mtd = mtd; 440 + dev_info(&mtd->dev, "Attached to MTD device %d\n", mtd->index); 441 + } 442 + 443 + static int mtdpstore_flush_removed_do(struct mtdpstore_context *cxt, 444 + loff_t off, size_t size) 445 + { 446 + struct mtd_info *mtd = cxt->mtd; 447 + u_char *buf; 448 + int ret; 449 + size_t retlen; 450 + struct erase_info erase; 451 + 452 + buf = kmalloc(mtd->erasesize, GFP_KERNEL); 453 + if (!buf) 454 + return -ENOMEM; 455 + 456 + /* 1st. read to cache */ 457 + ret = mtd_read(mtd, off, mtd->erasesize, &retlen, buf); 458 + if (mtdpstore_is_io_error(ret)) 459 + goto free; 460 + 461 + /* 2nd. erase block */ 462 + erase.len = mtd->erasesize; 463 + erase.addr = off; 464 + ret = mtd_erase(mtd, &erase); 465 + if (ret) 466 + goto free; 467 + 468 + /* 3rd. 
write back */ 469 + while (size) { 470 + unsigned int zonesize = cxt->info.kmsg_size; 471 + 472 + /* there is valid data on block, write back */ 473 + if (mtdpstore_is_used(cxt, off)) { 474 + ret = mtd_write(mtd, off, zonesize, &retlen, buf); 475 + if (ret) 476 + dev_err(&mtd->dev, "write failure at %lld (%zu of %u written), err %d\n", 477 + off, retlen, zonesize, ret); 478 + } 479 + 480 + off += zonesize; 481 + size -= min_t(unsigned int, zonesize, size); 482 + } 483 + 484 + free: 485 + kfree(buf); 486 + return ret; 487 + } 488 + 489 + /* 490 + * What does mtdpstore_flush_removed() do? 491 + * When user remove any log file on pstore filesystem, mtdpstore should do 492 + * something to ensure log file removed. If the whole block is no longer used, 493 + * it's nice to erase the block. However if the block still contains valid log, 494 + * what mtdpstore can do is to erase and write the valid log back. 495 + */ 496 + static int mtdpstore_flush_removed(struct mtdpstore_context *cxt) 497 + { 498 + struct mtd_info *mtd = cxt->mtd; 499 + int ret; 500 + loff_t off; 501 + u32 blkcnt = (u32)div_u64(mtd->size, mtd->erasesize); 502 + 503 + for (off = 0; blkcnt > 0; blkcnt--, off += mtd->erasesize) { 504 + ret = mtdpstore_block_isbad(cxt, off); 505 + if (ret) 506 + continue; 507 + 508 + ret = mtdpstore_block_is_removed(cxt, off); 509 + if (!ret) 510 + continue; 511 + 512 + ret = mtdpstore_flush_removed_do(cxt, off, mtd->erasesize); 513 + if (ret) 514 + return ret; 515 + } 516 + return 0; 517 + } 518 + 519 + static void mtdpstore_notify_remove(struct mtd_info *mtd) 520 + { 521 + struct mtdpstore_context *cxt = &oops_cxt; 522 + 523 + if (mtd->index != cxt->index || cxt->index < 0) 524 + return; 525 + 526 + mtdpstore_flush_removed(cxt); 527 + 528 + unregister_pstore_device(&cxt->dev); 529 + kfree(cxt->badmap); 530 + kfree(cxt->usedmap); 531 + kfree(cxt->rmmap); 532 + cxt->mtd = NULL; 533 + cxt->index = -1; 534 + } 535 + 536 + static struct mtd_notifier mtdpstore_notifier = { 537 
+ .add = mtdpstore_notify_add, 538 + .remove = mtdpstore_notify_remove, 539 + }; 540 + 541 + static int __init mtdpstore_init(void) 542 + { 543 + int ret; 544 + struct mtdpstore_context *cxt = &oops_cxt; 545 + struct pstore_blk_config *info = &cxt->info; 546 + 547 + ret = pstore_blk_get_config(info); 548 + if (unlikely(ret)) 549 + return ret; 550 + 551 + if (strlen(info->device) == 0) { 552 + pr_err("mtd device must be supplied (device name is empty)\n"); 553 + return -EINVAL; 554 + } 555 + if (!info->kmsg_size) { 556 + pr_err("no backend enabled (kmsg_size is 0)\n"); 557 + return -EINVAL; 558 + } 559 + 560 + /* Setup the MTD device to use */ 561 + ret = kstrtoint((char *)info->device, 0, &cxt->index); 562 + if (ret) 563 + cxt->index = -1; 564 + 565 + register_mtd_user(&mtdpstore_notifier); 566 + return 0; 567 + } 568 + module_init(mtdpstore_init); 569 + 570 + static void __exit mtdpstore_exit(void) 571 + { 572 + unregister_mtd_user(&mtdpstore_notifier); 573 + } 574 + module_exit(mtdpstore_exit); 575 + 576 + MODULE_LICENSE("GPL"); 577 + MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>"); 578 + MODULE_DESCRIPTION("MTD backend for pstore/blk");
+1 -1
drivers/platform/chrome/chromeos_pstore.c
··· 57 57 .record_size = 0x40000, 58 58 .console_size = 0x20000, 59 59 .ftrace_size = 0x20000, 60 - .dump_oops = 1, 60 + .max_reason = KMSG_DUMP_OOPS, 61 61 }; 62 62 63 63 static struct platform_device chromeos_ramoops = {
+109
fs/pstore/Kconfig
··· 153 153 "ramoops.ko". 154 154 155 155 For more information, see Documentation/admin-guide/ramoops.rst. 156 + 157 + config PSTORE_ZONE 158 + tristate 159 + depends on PSTORE 160 + help 161 + The common layer for pstore/blk (and pstore/ram in the future) 162 + to manage storage in zones. 163 + 164 + config PSTORE_BLK 165 + tristate "Log panic/oops to a block device" 166 + depends on PSTORE 167 + depends on BLOCK 168 + select PSTORE_ZONE 169 + default n 170 + help 171 + This enables panic and oops messages to be logged to a block 172 + device where they can be read back at some later point. 173 + 174 + For more information, see Documentation/admin-guide/pstore-blk.rst 175 + 176 + If unsure, say N. 177 + 178 + config PSTORE_BLK_BLKDEV 179 + string "block device identifier" 180 + depends on PSTORE_BLK 181 + default "" 182 + help 183 + Which block device should be used for pstore/blk. 184 + 185 + It accepts the following variants: 186 + 1) <hex_major><hex_minor> device number in hexadecimal representation, 187 + with no leading 0x, for example b302. 188 + 2) /dev/<disk_name> represents the device name of disk 189 + 3) /dev/<disk_name><decimal> represents the device name and number 190 + of partition - device number of disk plus the partition number 191 + 4) /dev/<disk_name>p<decimal> - same as the above, this form is 192 + used when disk name of partitioned disk ends with a digit. 193 + 5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the 194 + unique id of a partition if the partition table provides it. 195 + The UUID may be either an EFI/GPT UUID, or refer to an MSDOS 196 + partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero- 197 + filled hex representation of the 32-bit "NT disk signature", and PP 198 + is a zero-filled hex representation of the 1-based partition number. 199 + 6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation 200 + to a partition with a known unique id. 
201 + 7) <major>:<minor> major and minor number of the device separated by 202 + a colon. 203 + 204 + NOTE that, both Kconfig and module parameters can configure 205 + pstore/blk, but module parameters have priority over Kconfig. 206 + 207 + config PSTORE_BLK_KMSG_SIZE 208 + int "Size in Kbytes of kmsg dump log to store" 209 + depends on PSTORE_BLK 210 + default 64 211 + help 212 + This just sets size of kmsg dump (oops, panic, etc) log for 213 + pstore/blk. The size is in KB and must be a multiple of 4. 214 + 215 + NOTE that, both Kconfig and module parameters can configure 216 + pstore/blk, but module parameters have priority over Kconfig. 217 + 218 + config PSTORE_BLK_MAX_REASON 219 + int "Maximum kmsg dump reason to store" 220 + depends on PSTORE_BLK 221 + default 2 222 + help 223 + The maximum reason for kmsg dumps to store. The default is 224 + 2 (KMSG_DUMP_OOPS), see include/linux/kmsg_dump.h's 225 + enum kmsg_dump_reason for more details. 226 + 227 + NOTE that, both Kconfig and module parameters can configure 228 + pstore/blk, but module parameters have priority over Kconfig. 229 + 230 + config PSTORE_BLK_PMSG_SIZE 231 + int "Size in Kbytes of pmsg to store" 232 + depends on PSTORE_BLK 233 + depends on PSTORE_PMSG 234 + default 64 235 + help 236 + This just sets size of pmsg (pmsg_size) for pstore/blk. The size is 237 + in KB and must be a multiple of 4. 238 + 239 + NOTE that, both Kconfig and module parameters can configure 240 + pstore/blk, but module parameters have priority over Kconfig. 241 + 242 + config PSTORE_BLK_CONSOLE_SIZE 243 + int "Size in Kbytes of console log to store" 244 + depends on PSTORE_BLK 245 + depends on PSTORE_CONSOLE 246 + default 64 247 + help 248 + This just sets size of console log (console_size) to store via 249 + pstore/blk. The size is in KB and must be a multiple of 4. 250 + 251 + NOTE that, both Kconfig and module parameters can configure 252 + pstore/blk, but module parameters have priority over Kconfig. 
253 + 254 + config PSTORE_BLK_FTRACE_SIZE 255 + int "Size in Kbytes of ftrace log to store" 256 + depends on PSTORE_BLK 257 + depends on PSTORE_FTRACE 258 + default 64 259 + help 260 + This just sets size of ftrace log (ftrace_size) for pstore/blk. The 261 + size is in KB and must be a multiple of 4. 262 + 263 + NOTE that, both Kconfig and module parameters can configure 264 + pstore/blk, but module parameters have priority over Kconfig.
+6
fs/pstore/Makefile
··· 12 12 13 13 ramoops-objs += ram.o ram_core.o 14 14 obj-$(CONFIG_PSTORE_RAM) += ramoops.o 15 + 16 + pstore_zone-objs += zone.o 17 + obj-$(CONFIG_PSTORE_ZONE) += pstore_zone.o 18 + 19 + pstore_blk-objs += blk.o 20 + obj-$(CONFIG_PSTORE_BLK) += pstore_blk.o
+517
fs/pstore/blk.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Implements pstore backend driver that write to block (or non-block) storage 4 + * devices, using the pstore/zone API. 5 + */ 6 + 7 + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 8 + 9 + #include <linux/kernel.h> 10 + #include <linux/module.h> 11 + #include "../../block/blk.h" 12 + #include <linux/blkdev.h> 13 + #include <linux/string.h> 14 + #include <linux/of.h> 15 + #include <linux/of_address.h> 16 + #include <linux/platform_device.h> 17 + #include <linux/pstore_blk.h> 18 + #include <linux/mount.h> 19 + #include <linux/uio.h> 20 + 21 + static long kmsg_size = CONFIG_PSTORE_BLK_KMSG_SIZE; 22 + module_param(kmsg_size, long, 0400); 23 + MODULE_PARM_DESC(kmsg_size, "kmsg dump record size in kbytes"); 24 + 25 + static int max_reason = CONFIG_PSTORE_BLK_MAX_REASON; 26 + module_param(max_reason, int, 0400); 27 + MODULE_PARM_DESC(max_reason, 28 + "maximum reason for kmsg dump (default 2: Oops and Panic)"); 29 + 30 + #if IS_ENABLED(CONFIG_PSTORE_PMSG) 31 + static long pmsg_size = CONFIG_PSTORE_BLK_PMSG_SIZE; 32 + #else 33 + static long pmsg_size = -1; 34 + #endif 35 + module_param(pmsg_size, long, 0400); 36 + MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes"); 37 + 38 + #if IS_ENABLED(CONFIG_PSTORE_CONSOLE) 39 + static long console_size = CONFIG_PSTORE_BLK_CONSOLE_SIZE; 40 + #else 41 + static long console_size = -1; 42 + #endif 43 + module_param(console_size, long, 0400); 44 + MODULE_PARM_DESC(console_size, "console size in kbytes"); 45 + 46 + #if IS_ENABLED(CONFIG_PSTORE_FTRACE) 47 + static long ftrace_size = CONFIG_PSTORE_BLK_FTRACE_SIZE; 48 + #else 49 + static long ftrace_size = -1; 50 + #endif 51 + module_param(ftrace_size, long, 0400); 52 + MODULE_PARM_DESC(ftrace_size, "ftrace size in kbytes"); 53 + 54 + static bool best_effort; 55 + module_param(best_effort, bool, 0400); 56 + MODULE_PARM_DESC(best_effort, "use best effort to write (i.e. 
do not require storage driver pstore support, default: off)"); 57 + 58 + /* 59 + * blkdev - the block device to use for pstore storage 60 + * 61 + * Usually, this will be a partition of a block device. 62 + * 63 + * blkdev accepts the following variants: 64 + * 1) <hex_major><hex_minor> device number in hexadecimal representation, 65 + * with no leading 0x, for example b302. 66 + * 2) /dev/<disk_name> represents the device number of disk 67 + * 3) /dev/<disk_name><decimal> represents the device number 68 + * of partition - device number of disk plus the partition number 69 + * 4) /dev/<disk_name>p<decimal> - same as the above, that form is 70 + * used when disk name of partitioned disk ends on a digit. 71 + * 5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the 72 + * unique id of a partition if the partition table provides it. 73 + * The UUID may be either an EFI/GPT UUID, or refer to an MSDOS 74 + * partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero- 75 + * filled hex representation of the 32-bit "NT disk signature", and PP 76 + * is a zero-filled hex representation of the 1-based partition number. 77 + * 6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to 78 + * a partition with a known unique id. 79 + * 7) <major>:<minor> major and minor number of the device separated by 80 + * a colon. 81 + */ 82 + static char blkdev[80] = CONFIG_PSTORE_BLK_BLKDEV; 83 + module_param_string(blkdev, blkdev, 80, 0400); 84 + MODULE_PARM_DESC(blkdev, "block device for pstore storage"); 85 + 86 + /* 87 + * All globals must only be accessed under the pstore_blk_lock 88 + * during the register/unregister functions. 
89 + */ 90 + static DEFINE_MUTEX(pstore_blk_lock); 91 + static struct block_device *psblk_bdev; 92 + static struct pstore_zone_info *pstore_zone_info; 93 + static pstore_blk_panic_write_op blkdev_panic_write; 94 + 95 + struct bdev_info { 96 + dev_t devt; 97 + sector_t nr_sects; 98 + sector_t start_sect; 99 + }; 100 + 101 + #define check_size(name, alignsize) ({ \ 102 + long _##name_ = (name); \ 103 + _##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024); \ 104 + if (_##name_ & ((alignsize) - 1)) { \ 105 + pr_info(#name " must align to %d\n", \ 106 + (alignsize)); \ 107 + _##name_ = ALIGN(name, (alignsize)); \ 108 + } \ 109 + _##name_; \ 110 + }) 111 + 112 + static int __register_pstore_device(struct pstore_device_info *dev) 113 + { 114 + int ret; 115 + 116 + lockdep_assert_held(&pstore_blk_lock); 117 + 118 + if (!dev || !dev->total_size || !dev->read || !dev->write) 119 + return -EINVAL; 120 + 121 + /* someone already registered before */ 122 + if (pstore_zone_info) 123 + return -EBUSY; 124 + 125 + pstore_zone_info = kzalloc(sizeof(struct pstore_zone_info), GFP_KERNEL); 126 + if (!pstore_zone_info) 127 + return -ENOMEM; 128 + 129 + /* Zero means no limit on which backends to attempt to store. 
*/ 130 + if (!dev->flags) 131 + dev->flags = UINT_MAX; 132 + 133 + #define verify_size(name, alignsize, enabled) { \ 134 + long _##name_; \ 135 + if (enabled) \ 136 + _##name_ = check_size(name, alignsize); \ 137 + else \ 138 + _##name_ = 0; \ 139 + name = _##name_ / 1024; \ 140 + pstore_zone_info->name = _##name_; \ 141 + } 142 + 143 + verify_size(kmsg_size, 4096, dev->flags & PSTORE_FLAGS_DMESG); 144 + verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG); 145 + verify_size(console_size, 4096, dev->flags & PSTORE_FLAGS_CONSOLE); 146 + verify_size(ftrace_size, 4096, dev->flags & PSTORE_FLAGS_FTRACE); 147 + #undef verify_size 148 + 149 + pstore_zone_info->total_size = dev->total_size; 150 + pstore_zone_info->max_reason = max_reason; 151 + pstore_zone_info->read = dev->read; 152 + pstore_zone_info->write = dev->write; 153 + pstore_zone_info->erase = dev->erase; 154 + pstore_zone_info->panic_write = dev->panic_write; 155 + pstore_zone_info->name = KBUILD_MODNAME; 156 + pstore_zone_info->owner = THIS_MODULE; 157 + 158 + ret = register_pstore_zone(pstore_zone_info); 159 + if (ret) { 160 + kfree(pstore_zone_info); 161 + pstore_zone_info = NULL; 162 + } 163 + return ret; 164 + } 165 + /** 166 + * register_pstore_device() - register non-block device to pstore/blk 167 + * 168 + * @dev: non-block device information 169 + * 170 + * Return: 171 + * * 0 - OK 172 + * * Others - something went wrong. 
173 + */ 174 + int register_pstore_device(struct pstore_device_info *dev) 175 + { 176 + int ret; 177 + 178 + mutex_lock(&pstore_blk_lock); 179 + ret = __register_pstore_device(dev); 180 + mutex_unlock(&pstore_blk_lock); 181 + 182 + return ret; 183 + } 184 + EXPORT_SYMBOL_GPL(register_pstore_device); 185 + 186 + static void __unregister_pstore_device(struct pstore_device_info *dev) 187 + { 188 + lockdep_assert_held(&pstore_blk_lock); 189 + if (pstore_zone_info && pstore_zone_info->read == dev->read) { 190 + unregister_pstore_zone(pstore_zone_info); 191 + kfree(pstore_zone_info); 192 + pstore_zone_info = NULL; 193 + } 194 + } 195 + 196 + /** 197 + * unregister_pstore_device() - unregister non-block device from pstore/blk 198 + * 199 + * @dev: non-block device information 200 + */ 201 + void unregister_pstore_device(struct pstore_device_info *dev) 202 + { 203 + mutex_lock(&pstore_blk_lock); 204 + __unregister_pstore_device(dev); 205 + mutex_unlock(&pstore_blk_lock); 206 + } 207 + EXPORT_SYMBOL_GPL(unregister_pstore_device); 208 + 209 + /** 210 + * psblk_get_bdev() - open block device 211 + * 212 + * @holder: Exclusive holder identifier 213 + * @info: Information about bdev to fill in 214 + * 215 + * Return: pointer to block device on success and others on error. 216 + * 217 + * On success, the returned block_device has reference count of one. 
218 + */ 219 + static struct block_device *psblk_get_bdev(void *holder, 220 + struct bdev_info *info) 221 + { 222 + struct block_device *bdev = ERR_PTR(-ENODEV); 223 + fmode_t mode = FMODE_READ | FMODE_WRITE; 224 + sector_t nr_sects; 225 + 226 + lockdep_assert_held(&pstore_blk_lock); 227 + 228 + if (pstore_zone_info) 229 + return ERR_PTR(-EBUSY); 230 + 231 + if (!blkdev[0]) 232 + return ERR_PTR(-ENODEV); 233 + 234 + if (holder) 235 + mode |= FMODE_EXCL; 236 + bdev = blkdev_get_by_path(blkdev, mode, holder); 237 + if (IS_ERR(bdev)) { 238 + dev_t devt; 239 + 240 + devt = name_to_dev_t(blkdev); 241 + if (devt == 0) 242 + return ERR_PTR(-ENODEV); 243 + bdev = blkdev_get_by_dev(devt, mode, holder); 244 + if (IS_ERR(bdev)) 245 + return bdev; 246 + } 247 + 248 + nr_sects = part_nr_sects_read(bdev->bd_part); 249 + if (!nr_sects) { 250 + pr_err("not enough space for '%s'\n", blkdev); 251 + blkdev_put(bdev, mode); 252 + return ERR_PTR(-ENOSPC); 253 + } 254 + 255 + if (info) { 256 + info->devt = bdev->bd_dev; 257 + info->nr_sects = nr_sects; 258 + info->start_sect = get_start_sect(bdev); 259 + } 260 + 261 + return bdev; 262 + } 263 + 264 + static void psblk_put_bdev(struct block_device *bdev, void *holder) 265 + { 266 + fmode_t mode = FMODE_READ | FMODE_WRITE; 267 + 268 + lockdep_assert_held(&pstore_blk_lock); 269 + 270 + if (!bdev) 271 + return; 272 + 273 + if (holder) 274 + mode |= FMODE_EXCL; 275 + blkdev_put(bdev, mode); 276 + } 277 + 278 + static ssize_t psblk_generic_blk_read(char *buf, size_t bytes, loff_t pos) 279 + { 280 + struct block_device *bdev = psblk_bdev; 281 + struct file file; 282 + struct kiocb kiocb; 283 + struct iov_iter iter; 284 + struct kvec iov = {.iov_base = buf, .iov_len = bytes}; 285 + 286 + if (!bdev) 287 + return -ENODEV; 288 + 289 + memset(&file, 0, sizeof(struct file)); 290 + file.f_mapping = bdev->bd_inode->i_mapping; 291 + file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME; 292 + file.f_inode = bdev->bd_inode; 293 + file_ra_state_init(&file.f_ra, 
file.f_mapping); 294 + 295 + init_sync_kiocb(&kiocb, &file); 296 + kiocb.ki_pos = pos; 297 + iov_iter_kvec(&iter, READ, &iov, 1, bytes); 298 + 299 + return generic_file_read_iter(&kiocb, &iter); 300 + } 301 + 302 + static ssize_t psblk_generic_blk_write(const char *buf, size_t bytes, 303 + loff_t pos) 304 + { 305 + struct block_device *bdev = psblk_bdev; 306 + struct iov_iter iter; 307 + struct kiocb kiocb; 308 + struct file file; 309 + ssize_t ret; 310 + struct kvec iov = {.iov_base = (void *)buf, .iov_len = bytes}; 311 + 312 + if (!bdev) 313 + return -ENODEV; 314 + 315 + /* Console/Ftrace backend may handle buffer until flush dirty zones */ 316 + if (in_interrupt() || irqs_disabled()) 317 + return -EBUSY; 318 + 319 + memset(&file, 0, sizeof(struct file)); 320 + file.f_mapping = bdev->bd_inode->i_mapping; 321 + file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME; 322 + file.f_inode = bdev->bd_inode; 323 + 324 + init_sync_kiocb(&kiocb, &file); 325 + kiocb.ki_pos = pos; 326 + iov_iter_kvec(&iter, WRITE, &iov, 1, bytes); 327 + 328 + inode_lock(bdev->bd_inode); 329 + ret = generic_write_checks(&kiocb, &iter); 330 + if (ret > 0) 331 + ret = generic_perform_write(&file, &iter, pos); 332 + inode_unlock(bdev->bd_inode); 333 + 334 + if (likely(ret > 0)) { 335 + const struct file_operations f_op = {.fsync = blkdev_fsync}; 336 + 337 + file.f_op = &f_op; 338 + kiocb.ki_pos += ret; 339 + ret = generic_write_sync(&kiocb, ret); 340 + } 341 + return ret; 342 + } 343 + 344 + static ssize_t psblk_blk_panic_write(const char *buf, size_t size, 345 + loff_t off) 346 + { 347 + int ret; 348 + 349 + if (!blkdev_panic_write) 350 + return -EOPNOTSUPP; 351 + 352 + /* size and off must align to SECTOR_SIZE for block device */ 353 + ret = blkdev_panic_write(buf, off >> SECTOR_SHIFT, 354 + size >> SECTOR_SHIFT); 355 + /* try next zone */ 356 + if (ret == -ENOMSG) 357 + return ret; 358 + return ret ? 
-EIO : size; 359 + } 360 + 361 + static int __register_pstore_blk(struct pstore_blk_info *info) 362 + { 363 + char bdev_name[BDEVNAME_SIZE]; 364 + struct block_device *bdev; 365 + struct pstore_device_info dev; 366 + struct bdev_info binfo; 367 + void *holder = blkdev; 368 + int ret = -ENODEV; 369 + 370 + lockdep_assert_held(&pstore_blk_lock); 371 + 372 + /* hold bdev exclusively */ 373 + memset(&binfo, 0, sizeof(binfo)); 374 + bdev = psblk_get_bdev(holder, &binfo); 375 + if (IS_ERR(bdev)) { 376 + pr_err("failed to open '%s'!\n", blkdev); 377 + return PTR_ERR(bdev); 378 + } 379 + 380 + /* only allow driver matching the @blkdev */ 381 + if (!binfo.devt || (!best_effort && 382 + MAJOR(binfo.devt) != info->major)) { 383 + pr_debug("invalid major %u (expect %u)\n", 384 + info->major, MAJOR(binfo.devt)); 385 + ret = -ENODEV; 386 + goto err_put_bdev; 387 + } 388 + 389 + /* psblk_bdev must be assigned before register to pstore/blk */ 390 + psblk_bdev = bdev; 391 + blkdev_panic_write = info->panic_write; 392 + 393 + /* Copy back block device details. */ 394 + info->devt = binfo.devt; 395 + info->nr_sects = binfo.nr_sects; 396 + info->start_sect = binfo.start_sect; 397 + 398 + memset(&dev, 0, sizeof(dev)); 399 + dev.total_size = info->nr_sects << SECTOR_SHIFT; 400 + dev.flags = info->flags; 401 + dev.read = psblk_generic_blk_read; 402 + dev.write = psblk_generic_blk_write; 403 + dev.erase = NULL; 404 + dev.panic_write = info->panic_write ? psblk_blk_panic_write : NULL; 405 + 406 + ret = __register_pstore_device(&dev); 407 + if (ret) 408 + goto err_put_bdev; 409 + 410 + bdevname(bdev, bdev_name); 411 + pr_info("attached %s%s\n", bdev_name, 412 + info->panic_write ? 
"" : " (no dedicated panic_write!)"); 413 + return 0; 414 + 415 + err_put_bdev: 416 + psblk_bdev = NULL; 417 + blkdev_panic_write = NULL; 418 + psblk_put_bdev(bdev, holder); 419 + return ret; 420 + } 421 + 422 + /** 423 + * register_pstore_blk() - register block device to pstore/blk 424 + * 425 + * @info: details on the desired block device interface 426 + * 427 + * Return: 428 + * * 0 - OK 429 + * * Others - something went wrong. 430 + */ 431 + int register_pstore_blk(struct pstore_blk_info *info) 432 + { 433 + int ret; 434 + 435 + mutex_lock(&pstore_blk_lock); 436 + ret = __register_pstore_blk(info); 437 + mutex_unlock(&pstore_blk_lock); 438 + 439 + return ret; 440 + } 441 + EXPORT_SYMBOL_GPL(register_pstore_blk); 442 + 443 + static void __unregister_pstore_blk(unsigned int major) 444 + { 445 + struct pstore_device_info dev = { .read = psblk_generic_blk_read }; 446 + void *holder = blkdev; 447 + 448 + lockdep_assert_held(&pstore_blk_lock); 449 + if (psblk_bdev && MAJOR(psblk_bdev->bd_dev) == major) { 450 + __unregister_pstore_device(&dev); 451 + psblk_put_bdev(psblk_bdev, holder); 452 + blkdev_panic_write = NULL; 453 + psblk_bdev = NULL; 454 + } 455 + } 456 + 457 + /** 458 + * unregister_pstore_blk() - unregister block device from pstore/blk 459 + * 460 + * @major: the major device number of device 461 + */ 462 + void unregister_pstore_blk(unsigned int major) 463 + { 464 + mutex_lock(&pstore_blk_lock); 465 + __unregister_pstore_blk(major); 466 + mutex_unlock(&pstore_blk_lock); 467 + } 468 + EXPORT_SYMBOL_GPL(unregister_pstore_blk); 469 + 470 + /* get information of pstore/blk */ 471 + int pstore_blk_get_config(struct pstore_blk_config *info) 472 + { 473 + strncpy(info->device, blkdev, 80); 474 + info->max_reason = max_reason; 475 + info->kmsg_size = check_size(kmsg_size, 4096); 476 + info->pmsg_size = check_size(pmsg_size, 4096); 477 + info->ftrace_size = check_size(ftrace_size, 4096); 478 + info->console_size = check_size(console_size, 4096); 479 + 480 + return 0; 
481 + } 482 + EXPORT_SYMBOL_GPL(pstore_blk_get_config); 483 + 484 + static int __init pstore_blk_init(void) 485 + { 486 + struct pstore_blk_info info = { }; 487 + int ret = 0; 488 + 489 + mutex_lock(&pstore_blk_lock); 490 + if (!pstore_zone_info && best_effort && blkdev[0]) 491 + ret = __register_pstore_blk(&info); 492 + mutex_unlock(&pstore_blk_lock); 493 + 494 + return ret; 495 + } 496 + late_initcall(pstore_blk_init); 497 + 498 + static void __exit pstore_blk_exit(void) 499 + { 500 + mutex_lock(&pstore_blk_lock); 501 + if (psblk_bdev) 502 + __unregister_pstore_blk(MAJOR(psblk_bdev->bd_dev)); 503 + else { 504 + struct pstore_device_info dev = { }; 505 + 506 + if (pstore_zone_info) 507 + dev.read = pstore_zone_info->read; 508 + __unregister_pstore_device(&dev); 509 + } 510 + mutex_unlock(&pstore_blk_lock); 511 + } 512 + module_exit(pstore_blk_exit); 513 + 514 + MODULE_LICENSE("GPL"); 515 + MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>"); 516 + MODULE_AUTHOR("Kees Cook <keescook@chromium.org>"); 517 + MODULE_DESCRIPTION("pstore backend for block devices");
+54
fs/pstore/ftrace.c
··· 16 16 #include <linux/debugfs.h> 17 17 #include <linux/err.h> 18 18 #include <linux/cache.h> 19 + #include <linux/slab.h> 19 20 #include <asm/barrier.h> 20 21 #include "internal.h" 21 22 ··· 133 132 134 133 debugfs_remove_recursive(pstore_ftrace_dir); 135 134 } 135 + 136 + ssize_t pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size, 137 + const char *src_log, size_t src_log_size) 138 + { 139 + size_t dest_size, src_size, total, dest_off, src_off; 140 + size_t dest_idx = 0, src_idx = 0, merged_idx = 0; 141 + void *merged_buf; 142 + struct pstore_ftrace_record *drec, *srec, *mrec; 143 + size_t record_size = sizeof(struct pstore_ftrace_record); 144 + 145 + dest_off = *dest_log_size % record_size; 146 + dest_size = *dest_log_size - dest_off; 147 + 148 + src_off = src_log_size % record_size; 149 + src_size = src_log_size - src_off; 150 + 151 + total = dest_size + src_size; 152 + merged_buf = kmalloc(total, GFP_KERNEL); 153 + if (!merged_buf) 154 + return -ENOMEM; 155 + 156 + drec = (struct pstore_ftrace_record *)(*dest_log + dest_off); 157 + srec = (struct pstore_ftrace_record *)(src_log + src_off); 158 + mrec = (struct pstore_ftrace_record *)(merged_buf); 159 + 160 + while (dest_size > 0 && src_size > 0) { 161 + if (pstore_ftrace_read_timestamp(&drec[dest_idx]) < 162 + pstore_ftrace_read_timestamp(&srec[src_idx])) { 163 + mrec[merged_idx++] = drec[dest_idx++]; 164 + dest_size -= record_size; 165 + } else { 166 + mrec[merged_idx++] = srec[src_idx++]; 167 + src_size -= record_size; 168 + } 169 + } 170 + 171 + while (dest_size > 0) { 172 + mrec[merged_idx++] = drec[dest_idx++]; 173 + dest_size -= record_size; 174 + } 175 + 176 + while (src_size > 0) { 177 + mrec[merged_idx++] = srec[src_idx++]; 178 + src_size -= record_size; 179 + } 180 + 181 + kfree(*dest_log); 182 + *dest_log = merged_buf; 183 + *dest_log_size = total; 184 + 185 + return 0; 186 + } 187 + EXPORT_SYMBOL_GPL(pstore_ftrace_combine_log);
+94 -37
fs/pstore/inode.c
··· 22 22 #include <linux/magic.h> 23 23 #include <linux/pstore.h> 24 24 #include <linux/slab.h> 25 - #include <linux/spinlock.h> 26 25 #include <linux/uaccess.h> 27 26 28 27 #include "internal.h" 29 28 30 29 #define PSTORE_NAMELEN 64 31 30 32 - static DEFINE_SPINLOCK(allpstore_lock); 33 - static LIST_HEAD(allpstore); 31 + static DEFINE_MUTEX(records_list_lock); 32 + static LIST_HEAD(records_list); 33 + 34 + static DEFINE_MUTEX(pstore_sb_lock); 35 + static struct super_block *pstore_sb; 34 36 35 37 struct pstore_private { 36 38 struct list_head list; 39 + struct dentry *dentry; 37 40 struct pstore_record *record; 38 41 size_t total_size; 39 42 }; ··· 181 178 { 182 179 struct pstore_private *p = d_inode(dentry)->i_private; 183 180 struct pstore_record *record = p->record; 181 + int rc = 0; 184 182 185 183 if (!record->psi->erase) 186 184 return -EPERM; 185 + 186 + /* Make sure we can't race while removing this file. */ 187 + mutex_lock(&records_list_lock); 188 + if (!list_empty(&p->list)) 189 + list_del_init(&p->list); 190 + else 191 + rc = -ENOENT; 192 + p->dentry = NULL; 193 + mutex_unlock(&records_list_lock); 194 + if (rc) 195 + return rc; 187 196 188 197 mutex_lock(&record->psi->read_mutex); 189 198 record->psi->erase(record); ··· 207 192 static void pstore_evict_inode(struct inode *inode) 208 193 { 209 194 struct pstore_private *p = inode->i_private; 210 - unsigned long flags; 211 195 212 196 clear_inode(inode); 213 - if (p) { 214 - spin_lock_irqsave(&allpstore_lock, flags); 215 - list_del(&p->list); 216 - spin_unlock_irqrestore(&allpstore_lock, flags); 217 - free_pstore_private(p); 218 - } 197 + free_pstore_private(p); 219 198 } 220 199 221 200 static const struct inode_operations pstore_dir_inode_operations = { ··· 287 278 .show_options = pstore_show_options, 288 279 }; 289 280 290 - static struct super_block *pstore_sb; 291 - 292 - bool pstore_is_mounted(void) 281 + static struct dentry *psinfo_lock_root(void) 293 282 { 294 - return pstore_sb != NULL; 283 + 
struct dentry *root; 284 + 285 + mutex_lock(&pstore_sb_lock); 286 + /* 287 + * Having no backend is fine -- no records appear. 288 + * Not being mounted is fine -- nothing to do. 289 + */ 290 + if (!psinfo || !pstore_sb) { 291 + mutex_unlock(&pstore_sb_lock); 292 + return NULL; 293 + } 294 + 295 + root = pstore_sb->s_root; 296 + inode_lock(d_inode(root)); 297 + mutex_unlock(&pstore_sb_lock); 298 + 299 + return root; 300 + } 301 + 302 + int pstore_put_backend_records(struct pstore_info *psi) 303 + { 304 + struct pstore_private *pos, *tmp; 305 + struct dentry *root; 306 + int rc = 0; 307 + 308 + root = psinfo_lock_root(); 309 + if (!root) 310 + return 0; 311 + 312 + mutex_lock(&records_list_lock); 313 + list_for_each_entry_safe(pos, tmp, &records_list, list) { 314 + if (pos->record->psi == psi) { 315 + list_del_init(&pos->list); 316 + rc = simple_unlink(d_inode(root), pos->dentry); 317 + if (WARN_ON(rc)) 318 + break; 319 + d_drop(pos->dentry); 320 + dput(pos->dentry); 321 + pos->dentry = NULL; 322 + } 323 + } 324 + mutex_unlock(&records_list_lock); 325 + 326 + inode_unlock(d_inode(root)); 327 + 328 + return rc; 295 329 } 296 330 297 331 /* ··· 349 297 int rc = 0; 350 298 char name[PSTORE_NAMELEN]; 351 299 struct pstore_private *private, *pos; 352 - unsigned long flags; 353 300 size_t size = record->size + record->ecc_notice_size; 354 301 355 - WARN_ON(!inode_is_locked(d_inode(root))); 302 + if (WARN_ON(!inode_is_locked(d_inode(root)))) 303 + return -EINVAL; 356 304 357 - spin_lock_irqsave(&allpstore_lock, flags); 358 - list_for_each_entry(pos, &allpstore, list) { 305 + rc = -EEXIST; 306 + /* Skip records that are already present in the filesystem. 
*/ 307 + mutex_lock(&records_list_lock); 308 + list_for_each_entry(pos, &records_list, list) { 359 309 if (pos->record->type == record->type && 360 310 pos->record->id == record->id && 361 - pos->record->psi == record->psi) { 362 - rc = -EEXIST; 363 - break; 364 - } 311 + pos->record->psi == record->psi) 312 + goto fail; 365 313 } 366 - spin_unlock_irqrestore(&allpstore_lock, flags); 367 - if (rc) 368 - return rc; 369 314 370 315 rc = -ENOMEM; 371 316 inode = pstore_get_inode(root->d_sb); ··· 383 334 if (!dentry) 384 335 goto fail_private; 385 336 337 + private->dentry = dentry; 386 338 private->record = record; 387 339 inode->i_size = private->total_size = size; 388 340 inode->i_private = private; ··· 393 343 394 344 d_add(dentry, inode); 395 345 396 - spin_lock_irqsave(&allpstore_lock, flags); 397 - list_add(&private->list, &allpstore); 398 - spin_unlock_irqrestore(&allpstore_lock, flags); 346 + list_add(&private->list, &records_list); 347 + mutex_unlock(&records_list_lock); 399 348 400 349 return 0; 401 350 ··· 402 353 free_pstore_private(private); 403 354 fail_inode: 404 355 iput(inode); 405 - 406 356 fail: 357 + mutex_unlock(&records_list_lock); 407 358 return rc; 408 359 } 409 360 ··· 415 366 */ 416 367 void pstore_get_records(int quiet) 417 368 { 418 - struct pstore_info *psi = psinfo; 419 369 struct dentry *root; 420 370 421 - if (!psi || !pstore_sb) 371 + root = psinfo_lock_root(); 372 + if (!root) 422 373 return; 423 374 424 - root = pstore_sb->s_root; 425 - 426 - inode_lock(d_inode(root)); 427 - pstore_get_backend_records(psi, root, quiet); 375 + pstore_get_backend_records(psinfo, root, quiet); 428 376 inode_unlock(d_inode(root)); 429 377 } 430 378 431 379 static int pstore_fill_super(struct super_block *sb, void *data, int silent) 432 380 { 433 381 struct inode *inode; 434 - 435 - pstore_sb = sb; 436 382 437 383 sb->s_maxbytes = MAX_LFS_FILESIZE; 438 384 sb->s_blocksize = PAGE_SIZE; ··· 449 405 if (!sb->s_root) 450 406 return -ENOMEM; 451 407 408 + 
mutex_lock(&pstore_sb_lock); 409 + pstore_sb = sb; 410 + mutex_unlock(&pstore_sb_lock); 411 + 452 412 pstore_get_records(0); 453 413 454 414 return 0; ··· 466 418 467 419 static void pstore_kill_sb(struct super_block *sb) 468 420 { 421 + mutex_lock(&pstore_sb_lock); 422 + WARN_ON(pstore_sb != sb); 423 + 469 424 kill_litter_super(sb); 470 425 pstore_sb = NULL; 426 + 427 + mutex_lock(&records_list_lock); 428 + INIT_LIST_HEAD(&records_list); 429 + mutex_unlock(&records_list_lock); 430 + 431 + mutex_unlock(&pstore_sb_lock); 471 432 } 472 433 473 434 static struct file_system_type pstore_fs_type = {
+10 -1
fs/pstore/internal.h
··· 12 12 #ifdef CONFIG_PSTORE_FTRACE 13 13 extern void pstore_register_ftrace(void); 14 14 extern void pstore_unregister_ftrace(void); 15 + ssize_t pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size, 16 + const char *src_log, size_t src_log_size); 15 17 #else 16 18 static inline void pstore_register_ftrace(void) {} 17 19 static inline void pstore_unregister_ftrace(void) {} 20 + static inline ssize_t 21 + pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size, 22 + const char *src_log, size_t src_log_size) 23 + { 24 + *dest_log_size = 0; 25 + return 0; 26 + } 18 27 #endif 19 28 20 29 #ifdef CONFIG_PSTORE_PMSG ··· 40 31 extern void pstore_get_records(int); 41 32 extern void pstore_get_backend_records(struct pstore_info *psi, 42 33 struct dentry *root, int quiet); 34 + extern int pstore_put_backend_records(struct pstore_info *psi); 43 35 extern int pstore_mkfile(struct dentry *root, 44 36 struct pstore_record *record); 45 - extern bool pstore_is_mounted(void); 46 37 extern void pstore_record_init(struct pstore_record *record, 47 38 struct pstore_info *psi); 48 39
+58 -59
fs/pstore/platform.c
··· 44 44 module_param_named(update_ms, pstore_update_ms, int, 0600); 45 45 MODULE_PARM_DESC(update_ms, "milliseconds before pstore updates its content " 46 46 "(default is -1, which means runtime updates are disabled; " 47 - "enabling this option is not safe, it may lead to further " 47 + "enabling this option may not be safe; it may lead to further " 48 48 "corruption on Oopses)"); 49 49 50 50 /* Names should be in the same order as the enum pstore_type_id */ ··· 69 69 static DECLARE_WORK(pstore_work, pstore_dowork); 70 70 71 71 /* 72 - * pstore_lock just protects "psinfo" during 73 - * calls to pstore_register() 72 + * psinfo_lock protects "psinfo" during calls to 73 + * pstore_register(), pstore_unregister(), and 74 + * the filesystem mount/unmount routines. 74 75 */ 75 - static DEFINE_SPINLOCK(pstore_lock); 76 + static DEFINE_MUTEX(psinfo_lock); 76 77 struct pstore_info *psinfo; 77 78 78 79 static char *backend; 80 + module_param(backend, charp, 0444); 81 + MODULE_PARM_DESC(backend, "specific backend to use"); 82 + 79 83 static char *compress = 80 84 #ifdef CONFIG_PSTORE_COMPRESS_DEFAULT 81 85 CONFIG_PSTORE_COMPRESS_DEFAULT; 82 86 #else 83 87 NULL; 84 88 #endif 89 + module_param(compress, charp, 0444); 90 + MODULE_PARM_DESC(compress, "compression to use"); 85 91 86 92 /* Compression parameters */ 87 93 static struct crypto_comp *tfm; ··· 135 129 } 136 130 EXPORT_SYMBOL_GPL(pstore_name_to_type); 137 131 138 - static const char *get_reason_str(enum kmsg_dump_reason reason) 132 + static void pstore_timer_kick(void) 139 133 { 140 - switch (reason) { 141 - case KMSG_DUMP_PANIC: 142 - return "Panic"; 143 - case KMSG_DUMP_OOPS: 144 - return "Oops"; 145 - case KMSG_DUMP_EMERG: 146 - return "Emergency"; 147 - case KMSG_DUMP_RESTART: 148 - return "Restart"; 149 - case KMSG_DUMP_HALT: 150 - return "Halt"; 151 - case KMSG_DUMP_POWEROFF: 152 - return "Poweroff"; 153 - default: 154 - return "Unknown"; 155 - } 134 + if (pstore_update_ms < 0) 135 + return; 136 + 137 + 
mod_timer(&pstore_timer, jiffies + msecs_to_jiffies(pstore_update_ms)); 156 138 } 157 139 158 140 /* ··· 387 393 unsigned int part = 1; 388 394 int ret; 389 395 390 - why = get_reason_str(reason); 396 + why = kmsg_dump_reason_str(reason); 391 397 392 398 if (down_trylock(&psinfo->buf_lock)) { 393 399 /* Failed to acquire lock: give up if we cannot wait. */ ··· 453 459 } 454 460 455 461 ret = psinfo->write(&record); 456 - if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted()) 462 + if (ret == 0 && reason == KMSG_DUMP_OOPS) { 457 463 pstore_new_entry = 1; 464 + pstore_timer_kick(); 465 + } 458 466 459 467 total += record.size; 460 468 part++; ··· 499 503 } 500 504 501 505 static struct console pstore_console = { 502 - .name = "pstore", 503 506 .write = pstore_console_write, 504 - .flags = CON_PRINTBUFFER | CON_ENABLED | CON_ANYTIME, 505 507 .index = -1, 506 508 }; 507 509 508 510 static void pstore_register_console(void) 509 511 { 512 + /* Show which backend is going to get console writes. */ 513 + strscpy(pstore_console.name, psinfo->name, 514 + sizeof(pstore_console.name)); 515 + /* 516 + * Always initialize flags here since prior unregister_console() 517 + * calls may have changed settings (specifically CON_ENABLED). 
 */
518 + */ 519 + pstore_console.flags = CON_PRINTBUFFER | CON_ENABLED | CON_ANYTIME; 510 520 register_console(&pstore_console); 511 521 } 512 522 ··· 557 555 */ 558 556 int pstore_register(struct pstore_info *psi) 559 557 { 560 - struct module *owner = psi->owner; 561 - 562 558 if (backend && strcmp(backend, psi->name)) { 563 559 pr_warn("ignoring unexpected backend '%s'\n", psi->name); 564 560 return -EPERM; ··· 576 576 return -EINVAL; 577 577 } 578 578 579 - spin_lock(&pstore_lock); 579 + mutex_lock(&psinfo_lock); 580 580 if (psinfo) { 581 581 pr_warn("backend '%s' already loaded: ignoring '%s'\n", 582 582 psinfo->name, psi->name); 583 - spin_unlock(&pstore_lock); 583 + mutex_unlock(&psinfo_lock); 584 584 return -EBUSY; 585 585 } 586 586 ··· 589 589 psinfo = psi; 590 590 mutex_init(&psinfo->read_mutex); 591 591 sema_init(&psinfo->buf_lock, 1); 592 - spin_unlock(&pstore_lock); 593 - 594 - if (owner && !try_module_get(owner)) { 595 - psinfo = NULL; 596 - return -EINVAL; 597 - } 598 592 599 593 if (psi->flags & PSTORE_FLAGS_DMESG) 600 594 allocate_buf_for_compression(); 601 595 602 - if (pstore_is_mounted()) 603 - pstore_get_records(0); 596 + pstore_get_records(0); 604 597 605 - if (psi->flags & PSTORE_FLAGS_DMESG) 598 + if (psi->flags & PSTORE_FLAGS_DMESG) { 599 + pstore_dumper.max_reason = psinfo->max_reason; 606 600 pstore_register_kmsg(); 601 + } 607 602 if (psi->flags & PSTORE_FLAGS_CONSOLE) 608 603 pstore_register_console(); 609 604 if (psi->flags & PSTORE_FLAGS_FTRACE) ··· 607 612 pstore_register_pmsg(); 608 613 609 614 /* Start watching for new records, if desired. 
*/ 610 - if (pstore_update_ms >= 0) { 611 - pstore_timer.expires = jiffies + 612 - msecs_to_jiffies(pstore_update_ms); 613 - add_timer(&pstore_timer); 614 - } 615 + pstore_timer_kick(); 615 616 616 617 /* 617 618 * Update the module parameter backend, so it is visible 618 619 * through /sys/module/pstore/parameters/backend 619 620 */ 620 - backend = psi->name; 621 + backend = kstrdup(psi->name, GFP_KERNEL); 621 622 622 623 pr_info("Registered %s as persistent store backend\n", psi->name); 623 624 624 - module_put(owner); 625 - 625 + mutex_unlock(&psinfo_lock); 626 626 return 0; 627 627 } 628 628 EXPORT_SYMBOL_GPL(pstore_register); 629 629 630 630 void pstore_unregister(struct pstore_info *psi) 631 631 { 632 - /* Stop timer and make sure all work has finished. */ 633 - pstore_update_ms = -1; 634 - del_timer_sync(&pstore_timer); 635 - flush_work(&pstore_work); 632 + /* It's okay to unregister nothing. */ 633 + if (!psi) 634 + return; 636 635 636 + mutex_lock(&psinfo_lock); 637 + 638 + /* Only one backend can be registered at a time. */ 639 + if (WARN_ON(psi != psinfo)) { 640 + mutex_unlock(&psinfo_lock); 641 + return; 642 + } 643 + 644 + /* Unregister all callbacks. */ 637 645 if (psi->flags & PSTORE_FLAGS_PMSG) 638 646 pstore_unregister_pmsg(); 639 647 if (psi->flags & PSTORE_FLAGS_FTRACE) ··· 646 648 if (psi->flags & PSTORE_FLAGS_DMESG) 647 649 pstore_unregister_kmsg(); 648 650 651 + /* Stop timer and make sure all work has finished. */ 652 + del_timer_sync(&pstore_timer); 653 + flush_work(&pstore_work); 654 + 655 + /* Remove all backend records from filesystem tree. 
*/ 656 + pstore_put_backend_records(psi); 657 + 649 658 free_buf_for_compression(); 650 659 651 660 psinfo = NULL; 661 + kfree(backend); 652 662 backend = NULL; 663 + mutex_unlock(&psinfo_lock); 653 664 } 654 665 EXPORT_SYMBOL_GPL(pstore_unregister); 655 666 ··· 795 788 schedule_work(&pstore_work); 796 789 } 797 790 798 - if (pstore_update_ms >= 0) 799 - mod_timer(&pstore_timer, 800 - jiffies + msecs_to_jiffies(pstore_update_ms)); 791 + pstore_timer_kick(); 801 792 } 802 793 803 794 static void __init pstore_choose_compression(void) ··· 839 834 pstore_exit_fs(); 840 835 } 841 836 module_exit(pstore_exit) 842 - 843 - module_param(compress, charp, 0444); 844 - MODULE_PARM_DESC(compress, "Pstore compression to use"); 845 - 846 - module_param(backend, charp, 0444); 847 - MODULE_PARM_DESC(backend, "Pstore backend to use"); 848 837 849 838 MODULE_AUTHOR("Tony Luck <tony.luck@intel.com>"); 850 839 MODULE_LICENSE("GPL");
+67 -86
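The series' "refactor dump_oops parameter into max_reason" change (see the ramoops hunks below and the new `pstore_dumper.max_reason` assignment above) boils down to two small rules: a precedence order between the new and deprecated module parameters, and a "reason is at least this severe" filter in the kmsg dumper. A minimal userspace sketch of that logic follows; the enum ordering mirrors the kernel's `kmsg_dump_reason` around v5.8, but the function names and exact values here are illustrative, not kernel symbols.

```c
#include <assert.h>

/* Mirrors the kernel's kmsg_dump_reason ordering (circa v5.8): a lower
 * value means "more severe". Values are illustrative, not ABI. */
enum kmsg_dump_reason {
	KMSG_DUMP_UNDEF,
	KMSG_DUMP_PANIC,
	KMSG_DUMP_OOPS,
	KMSG_DUMP_EMERG,
	KMSG_DUMP_SHUTDOWN,
};

/* Fold the deprecated dump_oops (-1 = unset) into max_reason, giving an
 * explicit max_reason priority, as the ramoops registration path does. */
static enum kmsg_dump_reason
choose_max_reason(int max_reason, int dump_oops)
{
	if (max_reason >= 0)
		return max_reason;
	if (dump_oops != -1)
		return dump_oops ? KMSG_DUMP_OOPS : KMSG_DUMP_PANIC;
	return KMSG_DUMP_OOPS;	/* default: Oops and Panic */
}

/* A dumper accepts a record when its reason is at least as severe as
 * (numerically no greater than) the configured max_reason. */
static int dumper_wants(enum kmsg_dump_reason reason,
			enum kmsg_dump_reason max_reason)
{
	return reason != KMSG_DUMP_UNDEF && reason <= max_reason;
}
```

With this ordering, raising max_reason to KMSG_DUMP_SHUTDOWN also captures the less severe Emergency/Oops/Panic records, which is why collapsing the shutdown types into one reason (another patch in this tag) composes cleanly with the filter.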
fs/pstore/ram.c
··· 21 21 #include <linux/pstore_ram.h> 22 22 #include <linux/of.h> 23 23 #include <linux/of_address.h> 24 + #include "internal.h" 24 25 25 26 #define RAMOOPS_KERNMSG_HDR "====" 26 27 #define MIN_MEM_SIZE 4096UL ··· 54 53 "size of reserved RAM used to store oops/panic logs"); 55 54 56 55 static unsigned int mem_type; 57 - module_param(mem_type, uint, 0600); 56 + module_param(mem_type, uint, 0400); 58 57 MODULE_PARM_DESC(mem_type, 59 58 "set to 1 to try to use unbuffered memory (default 0)"); 60 59 61 - static int dump_oops = 1; 62 - module_param(dump_oops, int, 0600); 63 - MODULE_PARM_DESC(dump_oops, 64 - "set to 1 to dump oopses, 0 to only dump panics (default 1)"); 60 + static int ramoops_max_reason = -1; 61 + module_param_named(max_reason, ramoops_max_reason, int, 0400); 62 + MODULE_PARM_DESC(max_reason, 63 + "maximum reason for kmsg dump (default 2: Oops and Panic) "); 65 64 66 65 static int ramoops_ecc; 67 - module_param_named(ecc, ramoops_ecc, int, 0600); 66 + module_param_named(ecc, ramoops_ecc, int, 0400); 68 67 MODULE_PARM_DESC(ramoops_ecc, 69 68 "if non-zero, the option enables ECC support and specifies " 70 69 "ECC buffer size in bytes (1 is a special value, means 16 " 71 70 "bytes ECC)"); 71 + 72 + static int ramoops_dump_oops = -1; 73 + module_param_named(dump_oops, ramoops_dump_oops, int, 0400); 74 + MODULE_PARM_DESC(dump_oops, 75 + "(deprecated: use max_reason instead) set to 1 to dump oopses & panics, 0 to only dump panics"); 72 76 73 77 struct ramoops_context { 74 78 struct persistent_ram_zone **dprzs; /* Oops dump zones */ ··· 87 81 size_t console_size; 88 82 size_t ftrace_size; 89 83 size_t pmsg_size; 90 - int dump_oops; 91 84 u32 flags; 92 85 struct persistent_ram_ecc_info ecc_info; 93 86 unsigned int max_dump_cnt; ··· 173 168 persistent_ram_ecc_string(prz, NULL, 0)); 174 169 } 175 170 176 - static ssize_t ftrace_log_combine(struct persistent_ram_zone *dest, 177 - struct persistent_ram_zone *src) 178 - { 179 - size_t dest_size, src_size, total, 
dest_off, src_off; 180 - size_t dest_idx = 0, src_idx = 0, merged_idx = 0; 181 - void *merged_buf; 182 - struct pstore_ftrace_record *drec, *srec, *mrec; 183 - size_t record_size = sizeof(struct pstore_ftrace_record); 184 - 185 - dest_off = dest->old_log_size % record_size; 186 - dest_size = dest->old_log_size - dest_off; 187 - 188 - src_off = src->old_log_size % record_size; 189 - src_size = src->old_log_size - src_off; 190 - 191 - total = dest_size + src_size; 192 - merged_buf = kmalloc(total, GFP_KERNEL); 193 - if (!merged_buf) 194 - return -ENOMEM; 195 - 196 - drec = (struct pstore_ftrace_record *)(dest->old_log + dest_off); 197 - srec = (struct pstore_ftrace_record *)(src->old_log + src_off); 198 - mrec = (struct pstore_ftrace_record *)(merged_buf); 199 - 200 - while (dest_size > 0 && src_size > 0) { 201 - if (pstore_ftrace_read_timestamp(&drec[dest_idx]) < 202 - pstore_ftrace_read_timestamp(&srec[src_idx])) { 203 - mrec[merged_idx++] = drec[dest_idx++]; 204 - dest_size -= record_size; 205 - } else { 206 - mrec[merged_idx++] = srec[src_idx++]; 207 - src_size -= record_size; 208 - } 209 - } 210 - 211 - while (dest_size > 0) { 212 - mrec[merged_idx++] = drec[dest_idx++]; 213 - dest_size -= record_size; 214 - } 215 - 216 - while (src_size > 0) { 217 - mrec[merged_idx++] = srec[src_idx++]; 218 - src_size -= record_size; 219 - } 220 - 221 - kfree(dest->old_log); 222 - dest->old_log = merged_buf; 223 - dest->old_log_size = total; 224 - 225 - return 0; 226 - } 227 - 228 171 static ssize_t ramoops_pstore_read(struct pstore_record *record) 229 172 { 230 173 ssize_t size = 0; ··· 244 291 tmp_prz->corrected_bytes += 245 292 prz_next->corrected_bytes; 246 293 tmp_prz->bad_blocks += prz_next->bad_blocks; 247 - size = ftrace_log_combine(tmp_prz, prz_next); 294 + 295 + size = pstore_ftrace_combine_log( 296 + &tmp_prz->old_log, 297 + &tmp_prz->old_log_size, 298 + prz_next->old_log, 299 + prz_next->old_log_size); 248 300 if (size) 249 301 goto out; 250 302 } ··· 340 382 return 
-EINVAL; 341 383 342 384 /* 343 - * Out of the various dmesg dump types, ramoops is currently designed 344 - * to only store crash logs, rather than storing general kernel logs. 385 + * We could filter on record->reason here if we wanted to (which 386 + * would duplicate what happened before the "max_reason" setting 387 + * was added), but that would defeat the purpose of a system 388 + * changing printk.always_kmsg_dump, so instead log everything that 389 + * the kmsg dumper sends us, since it should be doing the filtering 390 + * based on the combination of printk.always_kmsg_dump and our 391 + * requested "max_reason". 345 392 */ 346 - if (record->reason != KMSG_DUMP_OOPS && 347 - record->reason != KMSG_DUMP_PANIC) 348 - return -EINVAL; 349 - 350 - /* Skip Oopes when configured to do so. */ 351 - if (record->reason == KMSG_DUMP_OOPS && !cxt->dump_oops) 352 - return -EINVAL; 353 393 354 394 /* 355 395 * Explicitly only take the first part of any new crash. ··· 600 644 return 0; 601 645 } 602 646 603 - static int ramoops_parse_dt_size(struct platform_device *pdev, 604 - const char *propname, u32 *value) 647 + /* Read a u32 from a dt property and make sure it's safe for an int. */ 648 + static int ramoops_parse_dt_u32(struct platform_device *pdev, 649 + const char *propname, 650 + u32 default_value, u32 *value) 605 651 { 606 652 u32 val32 = 0; 607 653 int ret; 608 654 609 655 ret = of_property_read_u32(pdev->dev.of_node, propname, &val32); 610 - if (ret < 0 && ret != -EINVAL) { 656 + if (ret == -EINVAL) { 657 + /* field is missing, use default value. */ 658 + val32 = default_value; 659 + } else if (ret < 0) { 611 660 dev_err(&pdev->dev, "failed to parse property %s: %d\n", 612 661 propname, ret); 613 662 return ret; 614 663 } 615 664 665 + /* Sanity check our results. 
*/ 616 666 if (val32 > INT_MAX) { 617 667 dev_err(&pdev->dev, "%s %u > INT_MAX\n", propname, val32); 618 668 return -EOVERFLOW; ··· 649 687 pdata->mem_size = resource_size(res); 650 688 pdata->mem_address = res->start; 651 689 pdata->mem_type = of_property_read_bool(of_node, "unbuffered"); 652 - pdata->dump_oops = !of_property_read_bool(of_node, "no-dump-oops"); 690 + /* 691 + * Setting "no-dump-oops" is deprecated and will be ignored if 692 + * "max_reason" is also specified. 693 + */ 694 + if (of_property_read_bool(of_node, "no-dump-oops")) 695 + pdata->max_reason = KMSG_DUMP_PANIC; 696 + else 697 + pdata->max_reason = KMSG_DUMP_OOPS; 653 698 654 - #define parse_size(name, field) { \ 655 - ret = ramoops_parse_dt_size(pdev, name, &value); \ 699 + #define parse_u32(name, field, default_value) { \ 700 + ret = ramoops_parse_dt_u32(pdev, name, default_value, \ 701 + &value); \ 656 702 if (ret < 0) \ 657 703 return ret; \ 658 704 field = value; \ 659 705 } 660 706 661 - parse_size("record-size", pdata->record_size); 662 - parse_size("console-size", pdata->console_size); 663 - parse_size("ftrace-size", pdata->ftrace_size); 664 - parse_size("pmsg-size", pdata->pmsg_size); 665 - parse_size("ecc-size", pdata->ecc_info.ecc_size); 666 - parse_size("flags", pdata->flags); 707 + parse_u32("record-size", pdata->record_size, 0); 708 + parse_u32("console-size", pdata->console_size, 0); 709 + parse_u32("ftrace-size", pdata->ftrace_size, 0); 710 + parse_u32("pmsg-size", pdata->pmsg_size, 0); 711 + parse_u32("ecc-size", pdata->ecc_info.ecc_size, 0); 712 + parse_u32("flags", pdata->flags, 0); 713 + parse_u32("max-reason", pdata->max_reason, pdata->max_reason); 667 714 668 - #undef parse_size 715 + #undef parse_u32 669 716 670 717 /* 671 718 * Some old Chromebooks relied on the kernel setting the ··· 756 785 cxt->console_size = pdata->console_size; 757 786 cxt->ftrace_size = pdata->ftrace_size; 758 787 cxt->pmsg_size = pdata->pmsg_size; 759 - cxt->dump_oops = pdata->dump_oops; 760 788 
cxt->flags = pdata->flags; 761 789 cxt->ecc_info = pdata->ecc_info; 762 790 ··· 798 828 * the single region size is how to check. 799 829 */ 800 830 cxt->pstore.flags = 0; 801 - if (cxt->max_dump_cnt) 831 + if (cxt->max_dump_cnt) { 802 832 cxt->pstore.flags |= PSTORE_FLAGS_DMESG; 833 + cxt->pstore.max_reason = pdata->max_reason; 834 + } 803 835 if (cxt->console_size) 804 836 cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE; 805 837 if (cxt->max_ftrace_cnt) ··· 837 865 mem_size = pdata->mem_size; 838 866 mem_address = pdata->mem_address; 839 867 record_size = pdata->record_size; 840 - dump_oops = pdata->dump_oops; 868 + ramoops_max_reason = pdata->max_reason; 841 869 ramoops_console_size = pdata->console_size; 842 870 ramoops_pmsg_size = pdata->pmsg_size; 843 871 ramoops_ftrace_size = pdata->ftrace_size; ··· 920 948 pdata.console_size = ramoops_console_size; 921 949 pdata.ftrace_size = ramoops_ftrace_size; 922 950 pdata.pmsg_size = ramoops_pmsg_size; 923 - pdata.dump_oops = dump_oops; 951 + /* If "max_reason" is set, its value has priority over "dump_oops". */ 952 + if (ramoops_max_reason >= 0) 953 + pdata.max_reason = ramoops_max_reason; 954 + /* Otherwise, if "dump_oops" is set, parse it into "max_reason". */ 955 + else if (ramoops_dump_oops != -1) 956 + pdata.max_reason = ramoops_dump_oops ? KMSG_DUMP_OOPS 957 + : KMSG_DUMP_PANIC; 958 + /* And if neither are explicitly set, use the default. */ 959 + else 960 + pdata.max_reason = KMSG_DUMP_OOPS; 924 961 pdata.flags = RAMOOPS_FLAG_FTRACE_PER_CPU; 925 962 926 963 /*
+1465
fs/pstore/zone.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Provide a pstore intermediate backend, organized into kernel memory 4 + * allocated zones that are then mapped and flushed into a single 5 + * contiguous region on a storage backend of some kind (block, mtd, etc). 6 + */ 7 + 8 + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 9 + 10 + #include <linux/kernel.h> 11 + #include <linux/module.h> 12 + #include <linux/slab.h> 13 + #include <linux/mount.h> 14 + #include <linux/printk.h> 15 + #include <linux/fs.h> 16 + #include <linux/pstore_zone.h> 17 + #include <linux/kdev_t.h> 18 + #include <linux/device.h> 19 + #include <linux/namei.h> 20 + #include <linux/fcntl.h> 21 + #include <linux/uio.h> 22 + #include <linux/writeback.h> 23 + #include "internal.h" 24 + 25 + /** 26 + * struct psz_buffer - header of zone to flush to storage 27 + * 28 + * @sig: signature to indicate header (PSZ_SIG xor PSZONE-type value) 29 + * @datalen: length of data in @data 30 + * @start: offset into @data where the stored bytes begin 31 + * @data: zone data. 32 + */ 33 + struct psz_buffer { 34 + #define PSZ_SIG (0x43474244) /* DBGC */ 35 + uint32_t sig; 36 + atomic_t datalen; 37 + atomic_t start; 38 + uint8_t data[]; 39 + }; 40 + 41 + /** 42 + * struct psz_kmsg_header - kmsg dump-specific header to flush to storage 43 + * 44 + * @magic: magic num for kmsg dump header 45 + * @time: kmsg dump trigger time 46 + * @compressed: whether compressed 47 + * @counter: kmsg dump counter 48 + * @reason: the kmsg dump reason (e.g. oops, panic, etc) 49 + * @data: pointer to log data 50 + * 51 + * This is a sub-header for a kmsg dump, trailing after &psz_buffer. 
52 + */ 53 + struct psz_kmsg_header { 54 + #define PSTORE_KMSG_HEADER_MAGIC 0x4dfc3ae5 /* Just a random number */ 55 + uint32_t magic; 56 + struct timespec64 time; 57 + bool compressed; 58 + uint32_t counter; 59 + enum kmsg_dump_reason reason; 60 + uint8_t data[]; 61 + }; 62 + 63 + /** 64 + * struct pstore_zone - single stored buffer 65 + * 66 + * @off: zone offset of storage 67 + * @type: front-end type for this zone 68 + * @name: front-end name for this zone 69 + * @buffer: pointer to data buffer managed by this zone 70 + * @oldbuf: pointer to old data buffer 71 + * @buffer_size: bytes in @buffer->data 72 + * @should_recover: whether this zone should recover from storage 73 + * @dirty: whether the data in @buffer dirty 74 + * 75 + * zone structure in memory. 76 + */ 77 + struct pstore_zone { 78 + loff_t off; 79 + const char *name; 80 + enum pstore_type_id type; 81 + 82 + struct psz_buffer *buffer; 83 + struct psz_buffer *oldbuf; 84 + size_t buffer_size; 85 + bool should_recover; 86 + atomic_t dirty; 87 + }; 88 + 89 + /** 90 + * struct psz_context - all about running state of pstore/zone 91 + * 92 + * @kpszs: kmsg dump storage zones 93 + * @ppsz: pmsg storage zone 94 + * @cpsz: console storage zone 95 + * @fpszs: ftrace storage zones 96 + * @kmsg_max_cnt: max count of @kpszs 97 + * @kmsg_read_cnt: counter of total read kmsg dumps 98 + * @kmsg_write_cnt: counter of total kmsg dump writes 99 + * @pmsg_read_cnt: counter of total read pmsg zone 100 + * @console_read_cnt: counter of total read console zone 101 + * @ftrace_max_cnt: max count of @fpszs 102 + * @ftrace_read_cnt: counter of max read ftrace zone 103 + * @oops_counter: counter of oops dumps 104 + * @panic_counter: counter of panic dumps 105 + * @recovered: whether finished recovering data from storage 106 + * @on_panic: whether panic is happening 107 + * @pstore_zone_info_lock: lock to @pstore_zone_info 108 + * @pstore_zone_info: information from backend 109 + * @pstore: structure for pstore 110 + */ 111 + 
 */
struct psz_context { 112 + struct pstore_zone **kpszs; 113 + struct pstore_zone *ppsz; 114 + struct pstore_zone *cpsz; 115 + struct pstore_zone **fpszs; 116 + unsigned int kmsg_max_cnt; 117 + unsigned int kmsg_read_cnt; 118 + unsigned int kmsg_write_cnt; 119 + unsigned int pmsg_read_cnt; 120 + unsigned int console_read_cnt; 121 + unsigned int ftrace_max_cnt; 122 + unsigned int ftrace_read_cnt; 123 + /* 124 + * These counters should be calculated during recovery. 125 + * It records the oops/panic times after crashes rather than boots. 126 + */ 127 + unsigned int oops_counter; 128 + unsigned int panic_counter; 129 + atomic_t recovered; 130 + atomic_t on_panic; 131 + 132 + /* 133 + * pstore_zone_info_lock protects this entire structure during calls 134 + * to register_pstore_zone()/unregister_pstore_zone(). 135 + */ 136 + struct mutex pstore_zone_info_lock; 137 + struct pstore_zone_info *pstore_zone_info; 138 + struct pstore_info pstore; 139 + }; 140 + static struct psz_context pstore_zone_cxt; 141 + 142 + static void psz_flush_all_dirty_zones(struct work_struct *); 143 + static DECLARE_DELAYED_WORK(psz_cleaner, psz_flush_all_dirty_zones); 144 + 145 + /** 146 + * enum psz_flush_mode - flush mode for psz_zone_write() 147 + * 148 + * @FLUSH_NONE: do not flush to storage but update data on memory 149 + * @FLUSH_PART: just flush part of data including meta data to storage 150 + * @FLUSH_META: just flush meta data of zone to storage 151 + * @FLUSH_ALL: flush all of zone 152 + */ 153 + enum psz_flush_mode { 154 + FLUSH_NONE = 0, 155 + FLUSH_PART, 156 + FLUSH_META, 157 + FLUSH_ALL, 158 + }; 159 + 160 + static inline int buffer_datalen(struct pstore_zone *zone) 161 + { 162 + return atomic_read(&zone->buffer->datalen); 163 + } 164 + 165 + static inline int buffer_start(struct pstore_zone *zone) 166 + { 167 + return atomic_read(&zone->buffer->start); 168 + } 169 + 170 + static inline bool is_on_panic(void) 171 + { 172 + return atomic_read(&pstore_zone_cxt.on_panic); 173 + } 174 
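The `start`/`datalen` accessors just above describe `psz_buffer` as a wrap-around record: the oldest valid byte sits at `start`, and `datalen` bytes in total are valid. Recovery (in `psz_recover_zone()` further down) reassembles a linear record by reading the tail first, then the wrapped head. A userspace sketch of that reassembly, with invented names (`struct ring`, `ring_linearize`) standing in for the kernel structures:

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* Minimal model of a psz_buffer ring: the oldest valid byte is at
 * `start`, `datalen` bytes are valid in total, and writes wrap around
 * the end of `data`. Userspace sketch, not kernel code. */
struct ring {
	size_t start;    /* offset of the oldest byte */
	size_t datalen;  /* total valid bytes */
	char data[9];
};

/* Reassemble the record in logical order, the same way the recovery
 * path does: first data[start..datalen), then the wrapped data[0..start). */
static void ring_linearize(const struct ring *r, char *out)
{
	memcpy(out, r->data + r->start, r->datalen - r->start);
	memcpy(out + (r->datalen - r->start), r->data, r->start);
}
```

Storing `start` and `datalen` as part of the on-storage header is what lets a record survive a crash mid-wrap: two reads at recovery time are enough to restore the original byte order.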
+ 175 + static ssize_t psz_zone_read_buffer(struct pstore_zone *zone, char *buf, 176 + size_t len, unsigned long off) 177 + { 178 + if (!buf || !zone || !zone->buffer) 179 + return -EINVAL; 180 + if (off > zone->buffer_size) 181 + return -EINVAL; 182 + len = min_t(size_t, len, zone->buffer_size - off); 183 + memcpy(buf, zone->buffer->data + off, len); 184 + return len; 185 + } 186 + 187 + static int psz_zone_read_oldbuf(struct pstore_zone *zone, char *buf, 188 + size_t len, unsigned long off) 189 + { 190 + if (!buf || !zone || !zone->oldbuf) 191 + return -EINVAL; 192 + if (off > zone->buffer_size) 193 + return -EINVAL; 194 + len = min_t(size_t, len, zone->buffer_size - off); 195 + memcpy(buf, zone->oldbuf->data + off, len); 196 + return 0; 197 + } 198 + 199 + static int psz_zone_write(struct pstore_zone *zone, 200 + enum psz_flush_mode flush_mode, const char *buf, 201 + size_t len, unsigned long off) 202 + { 203 + struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info; 204 + ssize_t wcnt = 0; 205 + ssize_t (*writeop)(const char *buf, size_t bytes, loff_t pos); 206 + size_t wlen; 207 + 208 + if (off > zone->buffer_size) 209 + return -EINVAL; 210 + 211 + wlen = min_t(size_t, len, zone->buffer_size - off); 212 + if (buf && wlen) { 213 + memcpy(zone->buffer->data + off, buf, wlen); 214 + atomic_set(&zone->buffer->datalen, wlen + off); 215 + } 216 + 217 + /* avoid to damage old records */ 218 + if (!is_on_panic() && !atomic_read(&pstore_zone_cxt.recovered)) 219 + goto dirty; 220 + 221 + writeop = is_on_panic() ? 
info->panic_write : info->write; 222 + if (!writeop) 223 + goto dirty; 224 + 225 + switch (flush_mode) { 226 + case FLUSH_NONE: 227 + if (unlikely(buf && wlen)) 228 + goto dirty; 229 + return 0; 230 + case FLUSH_PART: 231 + wcnt = writeop((const char *)zone->buffer->data + off, wlen, 232 + zone->off + sizeof(*zone->buffer) + off); 233 + if (wcnt != wlen) 234 + goto dirty; 235 + fallthrough; 236 + case FLUSH_META: 237 + wlen = sizeof(struct psz_buffer); 238 + wcnt = writeop((const char *)zone->buffer, wlen, zone->off); 239 + if (wcnt != wlen) 240 + goto dirty; 241 + break; 242 + case FLUSH_ALL: 243 + wlen = zone->buffer_size + sizeof(*zone->buffer); 244 + wcnt = writeop((const char *)zone->buffer, wlen, zone->off); 245 + if (wcnt != wlen) 246 + goto dirty; 247 + break; 248 + } 249 + 250 + return 0; 251 + dirty: 252 + /* no need to mark dirty if going to try next zone */ 253 + if (wcnt == -ENOMSG) 254 + return -ENOMSG; 255 + atomic_set(&zone->dirty, true); 256 + /* flush dirty zones nicely */ 257 + if (wcnt == -EBUSY && !is_on_panic()) 258 + schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(500)); 259 + return -EBUSY; 260 + } 261 + 262 + static int psz_flush_dirty_zone(struct pstore_zone *zone) 263 + { 264 + int ret; 265 + 266 + if (unlikely(!zone)) 267 + return -EINVAL; 268 + 269 + if (unlikely(!atomic_read(&pstore_zone_cxt.recovered))) 270 + return -EBUSY; 271 + 272 + if (!atomic_xchg(&zone->dirty, false)) 273 + return 0; 274 + 275 + ret = psz_zone_write(zone, FLUSH_ALL, NULL, 0, 0); 276 + if (ret) 277 + atomic_set(&zone->dirty, true); 278 + return ret; 279 + } 280 + 281 + static int psz_flush_dirty_zones(struct pstore_zone **zones, unsigned int cnt) 282 + { 283 + int i, ret; 284 + struct pstore_zone *zone; 285 + 286 + if (!zones) 287 + return -EINVAL; 288 + 289 + for (i = 0; i < cnt; i++) { 290 + zone = zones[i]; 291 + if (!zone) 292 + return -EINVAL; 293 + ret = psz_flush_dirty_zone(zone); 294 + if (ret) 295 + return ret; 296 + } 297 + return 0; 298 + } 299 + 
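`psz_flush_dirty_zone()` above uses an atomic exchange on the zone's dirty flag so that exactly one caller claims the flush, and re-arms the flag on failure so the delayed work (`psz_cleaner`) retries later. The protocol can be sketched in plain C; `struct zone_model` and `flush_dirty` are illustrative names, and an ordinary bool stands in for the kernel's `atomic_t`:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace sketch of the dirty-flag protocol: claim the flag, try the
 * writeback, and re-arm the flag on failure so a later pass retries. */
struct zone_model {
	bool dirty;
	int fail_writes;  /* simulate this many failed (-EBUSY) writes */
	int flushed;      /* completed writebacks */
};

static int flush_dirty(struct zone_model *z)
{
	bool was_dirty = z->dirty;

	z->dirty = false;          /* models atomic_xchg(&zone->dirty, false) */
	if (!was_dirty)
		return 0;          /* someone else flushed, or zone is clean */

	if (z->fail_writes > 0) {
		z->fail_writes--;
		z->dirty = true;   /* keep dirty for the delayed-work retry */
		return -1;         /* -EBUSY in the kernel code */
	}
	z->flushed++;
	return 0;
}
```

The claim-then-re-arm shape is what makes the 500ms `schedule_delayed_work()` retry in `psz_zone_write()` safe: a zone is never flushed twice for one dirtying, and a failed flush is never silently dropped.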
300 + static int psz_move_zone(struct pstore_zone *old, struct pstore_zone *new) 301 + { 302 + const char *data = (const char *)old->buffer->data; 303 + int ret; 304 + 305 + ret = psz_zone_write(new, FLUSH_ALL, data, buffer_datalen(old), 0); 306 + if (ret) { 307 + atomic_set(&new->buffer->datalen, 0); 308 + atomic_set(&new->dirty, false); 309 + return ret; 310 + } 311 + atomic_set(&old->buffer->datalen, 0); 312 + return 0; 313 + } 314 + 315 + static void psz_flush_all_dirty_zones(struct work_struct *work) 316 + { 317 + struct psz_context *cxt = &pstore_zone_cxt; 318 + int ret = 0; 319 + 320 + if (cxt->ppsz) 321 + ret |= psz_flush_dirty_zone(cxt->ppsz); 322 + if (cxt->cpsz) 323 + ret |= psz_flush_dirty_zone(cxt->cpsz); 324 + if (cxt->kpszs) 325 + ret |= psz_flush_dirty_zones(cxt->kpszs, cxt->kmsg_max_cnt); 326 + if (cxt->fpszs) 327 + ret |= psz_flush_dirty_zones(cxt->fpszs, cxt->ftrace_max_cnt); 328 + if (ret && cxt->pstore_zone_info) 329 + schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(1000)); 330 + } 331 + 332 + static int psz_kmsg_recover_data(struct psz_context *cxt) 333 + { 334 + struct pstore_zone_info *info = cxt->pstore_zone_info; 335 + struct pstore_zone *zone = NULL; 336 + struct psz_buffer *buf; 337 + unsigned long i; 338 + ssize_t rcnt; 339 + 340 + if (!info->read) 341 + return -EINVAL; 342 + 343 + for (i = 0; i < cxt->kmsg_max_cnt; i++) { 344 + zone = cxt->kpszs[i]; 345 + if (unlikely(!zone)) 346 + return -EINVAL; 347 + if (atomic_read(&zone->dirty)) { 348 + unsigned int wcnt = cxt->kmsg_write_cnt; 349 + struct pstore_zone *new = cxt->kpszs[wcnt]; 350 + int ret; 351 + 352 + ret = psz_move_zone(zone, new); 353 + if (ret) { 354 + pr_err("move zone from %lu to %d failed\n", 355 + i, wcnt); 356 + return ret; 357 + } 358 + cxt->kmsg_write_cnt = (wcnt + 1) % cxt->kmsg_max_cnt; 359 + } 360 + if (!zone->should_recover) 361 + continue; 362 + buf = zone->buffer; 363 + rcnt = info->read((char *)buf, zone->buffer_size + sizeof(*buf), 364 + zone->off); 365 + 
if (rcnt != zone->buffer_size + sizeof(*buf)) 366 + return (int)rcnt < 0 ? (int)rcnt : -EIO; 367 + } 368 + return 0; 369 + } 370 + 371 + static int psz_kmsg_recover_meta(struct psz_context *cxt) 372 + { 373 + struct pstore_zone_info *info = cxt->pstore_zone_info; 374 + struct pstore_zone *zone; 375 + size_t rcnt, len; 376 + struct psz_buffer *buf; 377 + struct psz_kmsg_header *hdr; 378 + struct timespec64 time = { }; 379 + unsigned long i; 380 + /* 381 + * Recover may on panic, we can't allocate any memory by kmalloc. 382 + * So, we use local array instead. 383 + */ 384 + char buffer_header[sizeof(*buf) + sizeof(*hdr)] = {0}; 385 + 386 + if (!info->read) 387 + return -EINVAL; 388 + 389 + len = sizeof(*buf) + sizeof(*hdr); 390 + buf = (struct psz_buffer *)buffer_header; 391 + for (i = 0; i < cxt->kmsg_max_cnt; i++) { 392 + zone = cxt->kpszs[i]; 393 + if (unlikely(!zone)) 394 + return -EINVAL; 395 + 396 + rcnt = info->read((char *)buf, len, zone->off); 397 + if (rcnt == -ENOMSG) { 398 + pr_debug("%s with id %lu may be broken, skip\n", 399 + zone->name, i); 400 + continue; 401 + } else if (rcnt != len) { 402 + pr_err("read %s with id %lu failed\n", zone->name, i); 403 + return (int)rcnt < 0 ? (int)rcnt : -EIO; 404 + } 405 + 406 + if (buf->sig != zone->buffer->sig) { 407 + pr_debug("no valid data in kmsg dump zone %lu\n", i); 408 + continue; 409 + } 410 + 411 + if (zone->buffer_size < atomic_read(&buf->datalen)) { 412 + pr_info("found overtop zone: %s: id %lu, off %lld, size %zu\n", 413 + zone->name, i, zone->off, 414 + zone->buffer_size); 415 + continue; 416 + } 417 + 418 + hdr = (struct psz_kmsg_header *)buf->data; 419 + if (hdr->magic != PSTORE_KMSG_HEADER_MAGIC) { 420 + pr_info("found invalid zone: %s: id %lu, off %lld, size %zu\n", 421 + zone->name, i, zone->off, 422 + zone->buffer_size); 423 + continue; 424 + } 425 + 426 + /* 427 + * we get the newest zone, and the next one must be the oldest 428 + * or unused zone, because we do write one by one like a circle. 
429 + */ 430 + if (hdr->time.tv_sec >= time.tv_sec) { 431 + time.tv_sec = hdr->time.tv_sec; 432 + cxt->kmsg_write_cnt = (i + 1) % cxt->kmsg_max_cnt; 433 + } 434 + 435 + if (hdr->reason == KMSG_DUMP_OOPS) 436 + cxt->oops_counter = 437 + max(cxt->oops_counter, hdr->counter); 438 + else if (hdr->reason == KMSG_DUMP_PANIC) 439 + cxt->panic_counter = 440 + max(cxt->panic_counter, hdr->counter); 441 + 442 + if (!atomic_read(&buf->datalen)) { 443 + pr_debug("found erased zone: %s: id %lu, off %lld, size %zu, datalen %d\n", 444 + zone->name, i, zone->off, 445 + zone->buffer_size, 446 + atomic_read(&buf->datalen)); 447 + continue; 448 + } 449 + 450 + if (!is_on_panic()) 451 + zone->should_recover = true; 452 + pr_debug("found nice zone: %s: id %lu, off %lld, size %zu, datalen %d\n", 453 + zone->name, i, zone->off, 454 + zone->buffer_size, atomic_read(&buf->datalen)); 455 + } 456 + 457 + return 0; 458 + } 459 + 460 + static int psz_kmsg_recover(struct psz_context *cxt) 461 + { 462 + int ret; 463 + 464 + if (!cxt->kpszs) 465 + return 0; 466 + 467 + ret = psz_kmsg_recover_meta(cxt); 468 + if (ret) 469 + goto recover_fail; 470 + 471 + ret = psz_kmsg_recover_data(cxt); 472 + if (ret) 473 + goto recover_fail; 474 + 475 + return 0; 476 + recover_fail: 477 + pr_debug("psz_recover_kmsg failed\n"); 478 + return ret; 479 + } 480 + 481 + static int psz_recover_zone(struct psz_context *cxt, struct pstore_zone *zone) 482 + { 483 + struct pstore_zone_info *info = cxt->pstore_zone_info; 484 + struct psz_buffer *oldbuf, tmpbuf; 485 + int ret = 0; 486 + char *buf; 487 + ssize_t rcnt, len, start, off; 488 + 489 + if (!zone || zone->oldbuf) 490 + return 0; 491 + 492 + if (is_on_panic()) { 493 + /* save data as much as possible */ 494 + psz_flush_dirty_zone(zone); 495 + return 0; 496 + } 497 + 498 + if (unlikely(!info->read)) 499 + return -EINVAL; 500 + 501 + len = sizeof(struct psz_buffer); 502 + rcnt = info->read((char *)&tmpbuf, len, zone->off); 503 + if (rcnt != len) { 504 + pr_debug("read 
zone %s failed\n", zone->name); 505 + return (int)rcnt < 0 ? (int)rcnt : -EIO; 506 + } 507 + 508 + if (tmpbuf.sig != zone->buffer->sig) { 509 + pr_debug("no valid data in zone %s\n", zone->name); 510 + return 0; 511 + } 512 + 513 + if (zone->buffer_size < atomic_read(&tmpbuf.datalen) || 514 + zone->buffer_size < atomic_read(&tmpbuf.start)) { 515 + pr_info("found overtop zone: %s: off %lld, size %zu\n", 516 + zone->name, zone->off, zone->buffer_size); 517 + /* just keep going */ 518 + return 0; 519 + } 520 + 521 + if (!atomic_read(&tmpbuf.datalen)) { 522 + pr_debug("found erased zone: %s: off %lld, size %zu, datalen %d\n", 523 + zone->name, zone->off, zone->buffer_size, 524 + atomic_read(&tmpbuf.datalen)); 525 + return 0; 526 + } 527 + 528 + pr_debug("found nice zone: %s: off %lld, size %zu, datalen %d\n", 529 + zone->name, zone->off, zone->buffer_size, 530 + atomic_read(&tmpbuf.datalen)); 531 + 532 + len = atomic_read(&tmpbuf.datalen) + sizeof(*oldbuf); 533 + oldbuf = kzalloc(len, GFP_KERNEL); 534 + if (!oldbuf) 535 + return -ENOMEM; 536 + 537 + memcpy(oldbuf, &tmpbuf, sizeof(*oldbuf)); 538 + buf = (char *)oldbuf + sizeof(*oldbuf); 539 + len = atomic_read(&oldbuf->datalen); 540 + start = atomic_read(&oldbuf->start); 541 + off = zone->off + sizeof(*oldbuf); 542 + 543 + /* get part of data */ 544 + rcnt = info->read(buf, len - start, off + start); 545 + if (rcnt != len - start) { 546 + pr_err("read zone %s failed\n", zone->name); 547 + ret = (int)rcnt < 0 ? (int)rcnt : -EIO; 548 + goto free_oldbuf; 549 + } 550 + 551 + /* get the rest of data */ 552 + rcnt = info->read(buf + len - start, start, off); 553 + if (rcnt != start) { 554 + pr_err("read zone %s failed\n", zone->name); 555 + ret = (int)rcnt < 0 ? 
(int)rcnt : -EIO; 556 + goto free_oldbuf; 557 + } 558 + 559 + zone->oldbuf = oldbuf; 560 + psz_flush_dirty_zone(zone); 561 + return 0; 562 + 563 + free_oldbuf: 564 + kfree(oldbuf); 565 + return ret; 566 + } 567 + 568 + static int psz_recover_zones(struct psz_context *cxt, 569 + struct pstore_zone **zones, unsigned int cnt) 570 + { 571 + int ret; 572 + unsigned int i; 573 + struct pstore_zone *zone; 574 + 575 + if (!zones) 576 + return 0; 577 + 578 + for (i = 0; i < cnt; i++) { 579 + zone = zones[i]; 580 + if (unlikely(!zone)) 581 + continue; 582 + ret = psz_recover_zone(cxt, zone); 583 + if (ret) 584 + goto recover_fail; 585 + } 586 + 587 + return 0; 588 + recover_fail: 589 + pr_debug("recover %s[%u] failed\n", zone->name, i); 590 + return ret; 591 + } 592 + 593 + /** 594 + * psz_recovery() - recover data from storage 595 + * @cxt: the context of pstore/zone 596 + * 597 + * recovery means reading data back from storage after rebooting 598 + * 599 + * Return: 0 on success, others on failure. 
600 + */ 601 + static inline int psz_recovery(struct psz_context *cxt) 602 + { 603 + int ret; 604 + 605 + if (atomic_read(&cxt->recovered)) 606 + return 0; 607 + 608 + ret = psz_kmsg_recover(cxt); 609 + if (ret) 610 + goto out; 611 + 612 + ret = psz_recover_zone(cxt, cxt->ppsz); 613 + if (ret) 614 + goto out; 615 + 616 + ret = psz_recover_zone(cxt, cxt->cpsz); 617 + if (ret) 618 + goto out; 619 + 620 + ret = psz_recover_zones(cxt, cxt->fpszs, cxt->ftrace_max_cnt); 621 + 622 + out: 623 + if (unlikely(ret)) 624 + pr_err("recover failed\n"); 625 + else { 626 + pr_debug("recover end!\n"); 627 + atomic_set(&cxt->recovered, 1); 628 + } 629 + return ret; 630 + } 631 + 632 + static int psz_pstore_open(struct pstore_info *psi) 633 + { 634 + struct psz_context *cxt = psi->data; 635 + 636 + cxt->kmsg_read_cnt = 0; 637 + cxt->pmsg_read_cnt = 0; 638 + cxt->console_read_cnt = 0; 639 + cxt->ftrace_read_cnt = 0; 640 + return 0; 641 + } 642 + 643 + static inline bool psz_old_ok(struct pstore_zone *zone) 644 + { 645 + if (zone && zone->oldbuf && atomic_read(&zone->oldbuf->datalen)) 646 + return true; 647 + return false; 648 + } 649 + 650 + static inline bool psz_ok(struct pstore_zone *zone) 651 + { 652 + if (zone && zone->buffer && buffer_datalen(zone)) 653 + return true; 654 + return false; 655 + } 656 + 657 + static inline int psz_kmsg_erase(struct psz_context *cxt, 658 + struct pstore_zone *zone, struct pstore_record *record) 659 + { 660 + struct psz_buffer *buffer = zone->buffer; 661 + struct psz_kmsg_header *hdr = 662 + (struct psz_kmsg_header *)buffer->data; 663 + size_t size; 664 + 665 + if (unlikely(!psz_ok(zone))) 666 + return 0; 667 + 668 + /* this zone is already updated, no need to erase */ 669 + if (record->count != hdr->counter) 670 + return 0; 671 + 672 + size = buffer_datalen(zone) + sizeof(*zone->buffer); 673 + atomic_set(&zone->buffer->datalen, 0); 674 + if (cxt->pstore_zone_info->erase) 675 + return cxt->pstore_zone_info->erase(size, zone->off); 676 + else 677 + 
return psz_zone_write(zone, FLUSH_META, NULL, 0, 0); 678 + } 679 + 680 + static inline int psz_record_erase(struct psz_context *cxt, 681 + struct pstore_zone *zone) 682 + { 683 + if (unlikely(!psz_old_ok(zone))) 684 + return 0; 685 + 686 + kfree(zone->oldbuf); 687 + zone->oldbuf = NULL; 688 + /* 689 + * If there is new data in the zone buffer, the old data is 690 + * already invalid; there is no need to flush zeroes (erase) to 691 + * the block device. 692 + */ 693 + if (!buffer_datalen(zone)) 694 + return psz_zone_write(zone, FLUSH_META, NULL, 0, 0); 695 + psz_flush_dirty_zone(zone); 696 + return 0; 697 + } 698 + 699 + static int psz_pstore_erase(struct pstore_record *record) 700 + { 701 + struct psz_context *cxt = record->psi->data; 702 + 703 + switch (record->type) { 704 + case PSTORE_TYPE_DMESG: 705 + if (record->id >= cxt->kmsg_max_cnt) 706 + return -EINVAL; 707 + return psz_kmsg_erase(cxt, cxt->kpszs[record->id], record); 708 + case PSTORE_TYPE_PMSG: 709 + return psz_record_erase(cxt, cxt->ppsz); 710 + case PSTORE_TYPE_CONSOLE: 711 + return psz_record_erase(cxt, cxt->cpsz); 712 + case PSTORE_TYPE_FTRACE: 713 + if (record->id >= cxt->ftrace_max_cnt) 714 + return -EINVAL; 715 + return psz_record_erase(cxt, cxt->fpszs[record->id]); 716 + default: return -EINVAL; 717 + } 718 + } 719 + 720 + static void psz_write_kmsg_hdr(struct pstore_zone *zone, 721 + struct pstore_record *record) 722 + { 723 + struct psz_context *cxt = record->psi->data; 724 + struct psz_buffer *buffer = zone->buffer; 725 + struct psz_kmsg_header *hdr = 726 + (struct psz_kmsg_header *)buffer->data; 727 + 728 + hdr->magic = PSTORE_KMSG_HEADER_MAGIC; 729 + hdr->compressed = record->compressed; 730 + hdr->time.tv_sec = record->time.tv_sec; 731 + hdr->time.tv_nsec = record->time.tv_nsec; 732 + hdr->reason = record->reason; 733 + if (hdr->reason == KMSG_DUMP_OOPS) 734 + hdr->counter = ++cxt->oops_counter; 735 + else if (hdr->reason == KMSG_DUMP_PANIC) 736 + hdr->counter = ++cxt->panic_counter; 737 +
else 738 + hdr->counter = 0; 739 + } 740 + 741 + /* 742 + * In case a zone is broken, which may occur on MTD devices, we try each 743 + * zone, starting at cxt->kmsg_write_cnt. 744 + */ 745 + static inline int notrace psz_kmsg_write_record(struct psz_context *cxt, 746 + struct pstore_record *record) 747 + { 748 + size_t size, hlen; 749 + struct pstore_zone *zone; 750 + unsigned int i; 751 + 752 + for (i = 0; i < cxt->kmsg_max_cnt; i++) { 753 + unsigned int zonenum, len; 754 + int ret; 755 + 756 + zonenum = (cxt->kmsg_write_cnt + i) % cxt->kmsg_max_cnt; 757 + zone = cxt->kpszs[zonenum]; 758 + if (unlikely(!zone)) 759 + return -ENOSPC; 760 + 761 + /* avoid destroying old data; allocate a new buffer */ 762 + len = zone->buffer_size + sizeof(*zone->buffer); 763 + zone->oldbuf = zone->buffer; 764 + zone->buffer = kzalloc(len, GFP_KERNEL); 765 + if (!zone->buffer) { 766 + zone->buffer = zone->oldbuf; 767 + return -ENOMEM; 768 + } 769 + zone->buffer->sig = zone->oldbuf->sig; 770 + 771 + pr_debug("write %s to zone id %d\n", zone->name, zonenum); 772 + psz_write_kmsg_hdr(zone, record); 773 + hlen = sizeof(struct psz_kmsg_header); 774 + size = min_t(size_t, record->size, zone->buffer_size - hlen); 775 + ret = psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen); 776 + if (likely(!ret || ret != -ENOMSG)) { 777 + cxt->kmsg_write_cnt = zonenum + 1; 778 + cxt->kmsg_write_cnt %= cxt->kmsg_max_cnt; 779 + /* no need to try next zone, free last zone buffer */ 780 + kfree(zone->oldbuf); 781 + zone->oldbuf = NULL; 782 + return ret; 783 + } 784 + 785 + pr_debug("zone %u may be broken, try next dmesg zone\n", 786 + zonenum); 787 + kfree(zone->buffer); 788 + zone->buffer = zone->oldbuf; 789 + zone->oldbuf = NULL; 790 + } 791 + 792 + return -EBUSY; 793 + } 794 + 795 + static int notrace psz_kmsg_write(struct psz_context *cxt, 796 + struct pstore_record *record) 797 + { 798 + int ret; 799 + 800 + /* 801 + * Explicitly only take the first part of any new crash.
802 + * If our buffer is larger than kmsg_bytes, this can never happen, 803 + * and if our buffer is smaller than kmsg_bytes, we don't want the 804 + * report split across multiple records. 805 + */ 806 + if (record->part != 1) 807 + return -ENOSPC; 808 + 809 + if (!cxt->kpszs) 810 + return -ENOSPC; 811 + 812 + ret = psz_kmsg_write_record(cxt, record); 813 + if (!ret && is_on_panic()) { 814 + /* ensure all data is flushed to storage on panic */ 815 + pr_debug("try to flush other dirty zones\n"); 816 + psz_flush_all_dirty_zones(NULL); 817 + } 818 + 819 + /* always return 0, as the record has been handled in the buffer */ 820 + return 0; 821 + } 822 + 823 + static int notrace psz_record_write(struct pstore_zone *zone, 824 + struct pstore_record *record) 825 + { 826 + size_t start, rem; 827 + bool is_full_data = false; 828 + char *buf; 829 + int cnt; 830 + 831 + if (!zone || !record) 832 + return -ENOSPC; 833 + 834 + if (atomic_read(&zone->buffer->datalen) >= zone->buffer_size) 835 + is_full_data = true; 836 + 837 + cnt = record->size; 838 + buf = record->buf; 839 + if (unlikely(cnt > zone->buffer_size)) { 840 + buf += cnt - zone->buffer_size; 841 + cnt = zone->buffer_size; 842 + } 843 + 844 + start = buffer_start(zone); 845 + rem = zone->buffer_size - start; 846 + if (unlikely(rem < cnt)) { 847 + psz_zone_write(zone, FLUSH_PART, buf, rem, start); 848 + buf += rem; 849 + cnt -= rem; 850 + start = 0; 851 + is_full_data = true; 852 + } 853 + 854 + atomic_set(&zone->buffer->start, cnt + start); 855 + psz_zone_write(zone, FLUSH_PART, buf, cnt, start); 856 + 857 + /* 858 + * psz_zone_write() sets datalen to start + cnt, which is correct 859 + * while the actual data length is less than the buffer size. 860 + * Once it exceeds the buffer size, pmsg wraps around to the 861 + * beginning of the zone, which would leave buffer->datalen wrong. 862 + * So reset datalen to the buffer size once the actual data 863 + * length exceeds it.
864 + */ 865 + if (is_full_data) { 866 + atomic_set(&zone->buffer->datalen, zone->buffer_size); 867 + psz_zone_write(zone, FLUSH_META, NULL, 0, 0); 868 + } 869 + return 0; 870 + } 871 + 872 + static int notrace psz_pstore_write(struct pstore_record *record) 873 + { 874 + struct psz_context *cxt = record->psi->data; 875 + 876 + if (record->type == PSTORE_TYPE_DMESG && 877 + record->reason == KMSG_DUMP_PANIC) 878 + atomic_set(&cxt->on_panic, 1); 879 + 880 + /* 881 + * If on panic, do not write anything except panic (dmesg) records; 882 + * this avoids panic_write printing a log that wakes up the console backend. 883 + */ 884 + if (is_on_panic() && record->type != PSTORE_TYPE_DMESG) 885 + return -EBUSY; 886 + 887 + switch (record->type) { 888 + case PSTORE_TYPE_DMESG: 889 + return psz_kmsg_write(cxt, record); 890 + case PSTORE_TYPE_CONSOLE: 891 + return psz_record_write(cxt->cpsz, record); 892 + case PSTORE_TYPE_PMSG: 893 + return psz_record_write(cxt->ppsz, record); 894 + case PSTORE_TYPE_FTRACE: { 895 + int zonenum = smp_processor_id(); 896 + 897 + if (!cxt->fpszs) 898 + return -ENOSPC; 899 + return psz_record_write(cxt->fpszs[zonenum], record); 900 + } 901 + default: 902 + return -EINVAL; 903 + } 904 + } 905 + 906 + static struct pstore_zone *psz_read_next_zone(struct psz_context *cxt) 907 + { 908 + struct pstore_zone *zone = NULL; 909 + 910 + while (cxt->kmsg_read_cnt < cxt->kmsg_max_cnt) { 911 + zone = cxt->kpszs[cxt->kmsg_read_cnt++]; 912 + if (psz_ok(zone)) 913 + return zone; 914 + } 915 + 916 + if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt) 917 + /* 918 + * No need for psz_old_ok() here; psz_ftrace_read() checks it 919 + * while combining logs, and must traverse all zones anyway 920 + * in case some of them hold no data.
921 + */ 922 + return cxt->fpszs[cxt->ftrace_read_cnt++]; 923 + 924 + if (cxt->pmsg_read_cnt == 0) { 925 + cxt->pmsg_read_cnt++; 926 + zone = cxt->ppsz; 927 + if (psz_old_ok(zone)) 928 + return zone; 929 + } 930 + 931 + if (cxt->console_read_cnt == 0) { 932 + cxt->console_read_cnt++; 933 + zone = cxt->cpsz; 934 + if (psz_old_ok(zone)) 935 + return zone; 936 + } 937 + 938 + return NULL; 939 + } 940 + 941 + static int psz_kmsg_read_hdr(struct pstore_zone *zone, 942 + struct pstore_record *record) 943 + { 944 + struct psz_buffer *buffer = zone->buffer; 945 + struct psz_kmsg_header *hdr = 946 + (struct psz_kmsg_header *)buffer->data; 947 + 948 + if (hdr->magic != PSTORE_KMSG_HEADER_MAGIC) 949 + return -EINVAL; 950 + record->compressed = hdr->compressed; 951 + record->time.tv_sec = hdr->time.tv_sec; 952 + record->time.tv_nsec = hdr->time.tv_nsec; 953 + record->reason = hdr->reason; 954 + record->count = hdr->counter; 955 + return 0; 956 + } 957 + 958 + static ssize_t psz_kmsg_read(struct pstore_zone *zone, 959 + struct pstore_record *record) 960 + { 961 + ssize_t size, hlen = 0; 962 + 963 + size = buffer_datalen(zone); 964 + /* Clear and skip this kmsg dump record if it has no valid header */ 965 + if (psz_kmsg_read_hdr(zone, record)) { 966 + atomic_set(&zone->buffer->datalen, 0); 967 + atomic_set(&zone->dirty, 0); 968 + return -ENOMSG; 969 + } 970 + size -= sizeof(struct psz_kmsg_header); 971 + 972 + if (!record->compressed) { 973 + char *buf = kasprintf(GFP_KERNEL, "%s: Total %d times\n", 974 + kmsg_dump_reason_str(record->reason), 975 + record->count); 976 + hlen = strlen(buf); 977 + record->buf = krealloc(buf, hlen + size, GFP_KERNEL); 978 + if (!record->buf) { 979 + kfree(buf); 980 + return -ENOMEM; 981 + } 982 + } else { 983 + record->buf = kmalloc(size, GFP_KERNEL); 984 + if (!record->buf) 985 + return -ENOMEM; 986 + } 987 + 988 + size = psz_zone_read_buffer(zone, record->buf + hlen, size, 989 + sizeof(struct psz_kmsg_header)); 990 + if (unlikely(size < 0)) { 991 
+ kfree(record->buf); 992 + return -ENOMSG; 993 + } 994 + 995 + return size + hlen; 996 + } 997 + 998 + /* try to combine all ftrace zones */ 999 + static ssize_t psz_ftrace_read(struct pstore_zone *zone, 1000 + struct pstore_record *record) 1001 + { 1002 + struct psz_context *cxt; 1003 + struct psz_buffer *buf; 1004 + int ret; 1005 + 1006 + if (!zone || !record) 1007 + return -ENOSPC; 1008 + 1009 + if (!psz_old_ok(zone)) 1010 + goto out; 1011 + 1012 + buf = (struct psz_buffer *)zone->oldbuf; 1013 + if (!buf) 1014 + return -ENOMSG; 1015 + 1016 + ret = pstore_ftrace_combine_log(&record->buf, &record->size, 1017 + (char *)buf->data, atomic_read(&buf->datalen)); 1018 + if (unlikely(ret)) 1019 + return ret; 1020 + 1021 + out: 1022 + cxt = record->psi->data; 1023 + if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt) 1024 + /* then, read next ftrace zone */ 1025 + return -ENOMSG; 1026 + record->id = 0; 1027 + return record->size ? record->size : -ENOMSG; 1028 + } 1029 + 1030 + static ssize_t psz_record_read(struct pstore_zone *zone, 1031 + struct pstore_record *record) 1032 + { 1033 + size_t len; 1034 + struct psz_buffer *buf; 1035 + 1036 + if (!zone || !record) 1037 + return -ENOSPC; 1038 + 1039 + buf = (struct psz_buffer *)zone->oldbuf; 1040 + if (!buf) 1041 + return -ENOMSG; 1042 + 1043 + len = atomic_read(&buf->datalen); 1044 + record->buf = kmalloc(len, GFP_KERNEL); 1045 + if (!record->buf) 1046 + return -ENOMEM; 1047 + 1048 + if (unlikely(psz_zone_read_oldbuf(zone, record->buf, len, 0))) { 1049 + kfree(record->buf); 1050 + return -ENOMSG; 1051 + } 1052 + 1053 + return len; 1054 + } 1055 + 1056 + static ssize_t psz_pstore_read(struct pstore_record *record) 1057 + { 1058 + struct psz_context *cxt = record->psi->data; 1059 + ssize_t (*readop)(struct pstore_zone *zone, 1060 + struct pstore_record *record); 1061 + struct pstore_zone *zone; 1062 + ssize_t ret; 1063 + 1064 + /* before read, we must recover from storage */ 1065 + ret = psz_recovery(cxt); 1066 + if (ret) 1067 + 
return ret; 1068 + 1069 + next_zone: 1070 + zone = psz_read_next_zone(cxt); 1071 + if (!zone) 1072 + return 0; 1073 + 1074 + record->type = zone->type; 1075 + switch (record->type) { 1076 + case PSTORE_TYPE_DMESG: 1077 + readop = psz_kmsg_read; 1078 + record->id = cxt->kmsg_read_cnt - 1; 1079 + break; 1080 + case PSTORE_TYPE_FTRACE: 1081 + readop = psz_ftrace_read; 1082 + break; 1083 + case PSTORE_TYPE_CONSOLE: 1084 + fallthrough; 1085 + case PSTORE_TYPE_PMSG: 1086 + readop = psz_record_read; 1087 + break; 1088 + default: 1089 + goto next_zone; 1090 + } 1091 + 1092 + ret = readop(zone, record); 1093 + if (ret == -ENOMSG) 1094 + goto next_zone; 1095 + return ret; 1096 + } 1097 + 1098 + static struct psz_context pstore_zone_cxt = { 1099 + .pstore_zone_info_lock = 1100 + __MUTEX_INITIALIZER(pstore_zone_cxt.pstore_zone_info_lock), 1101 + .recovered = ATOMIC_INIT(0), 1102 + .on_panic = ATOMIC_INIT(0), 1103 + .pstore = { 1104 + .owner = THIS_MODULE, 1105 + .open = psz_pstore_open, 1106 + .read = psz_pstore_read, 1107 + .write = psz_pstore_write, 1108 + .erase = psz_pstore_erase, 1109 + }, 1110 + }; 1111 + 1112 + static void psz_free_zone(struct pstore_zone **pszone) 1113 + { 1114 + struct pstore_zone *zone = *pszone; 1115 + 1116 + if (!zone) 1117 + return; 1118 + 1119 + kfree(zone->buffer); 1120 + kfree(zone); 1121 + *pszone = NULL; 1122 + } 1123 + 1124 + static void psz_free_zones(struct pstore_zone ***pszones, unsigned int *cnt) 1125 + { 1126 + struct pstore_zone **zones = *pszones; 1127 + 1128 + if (!zones) 1129 + return; 1130 + 1131 + while (*cnt > 0) { 1132 + (*cnt)--; 1133 + psz_free_zone(&(zones[*cnt])); 1134 + } 1135 + kfree(zones); 1136 + *pszones = NULL; 1137 + } 1138 + 1139 + static void psz_free_all_zones(struct psz_context *cxt) 1140 + { 1141 + if (cxt->kpszs) 1142 + psz_free_zones(&cxt->kpszs, &cxt->kmsg_max_cnt); 1143 + if (cxt->ppsz) 1144 + psz_free_zone(&cxt->ppsz); 1145 + if (cxt->cpsz) 1146 + psz_free_zone(&cxt->cpsz); 1147 + if (cxt->fpszs) 1148 + 
psz_free_zones(&cxt->fpszs, &cxt->ftrace_max_cnt); 1149 + } 1150 + 1151 + static struct pstore_zone *psz_init_zone(enum pstore_type_id type, 1152 + loff_t *off, size_t size) 1153 + { 1154 + struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info; 1155 + struct pstore_zone *zone; 1156 + const char *name = pstore_type_to_name(type); 1157 + 1158 + if (!size) 1159 + return NULL; 1160 + 1161 + if (*off + size > info->total_size) { 1162 + pr_err("no room for %s (0x%zx@0x%llx over 0x%lx)\n", 1163 + name, size, *off, info->total_size); 1164 + return ERR_PTR(-ENOMEM); 1165 + } 1166 + 1167 + zone = kzalloc(sizeof(struct pstore_zone), GFP_KERNEL); 1168 + if (!zone) 1169 + return ERR_PTR(-ENOMEM); 1170 + 1171 + zone->buffer = kmalloc(size, GFP_KERNEL); 1172 + if (!zone->buffer) { 1173 + kfree(zone); 1174 + return ERR_PTR(-ENOMEM); 1175 + } 1176 + memset(zone->buffer, 0xFF, size); 1177 + zone->off = *off; 1178 + zone->name = name; 1179 + zone->type = type; 1180 + zone->buffer_size = size - sizeof(struct psz_buffer); 1181 + zone->buffer->sig = type ^ PSZ_SIG; 1182 + zone->oldbuf = NULL; 1183 + atomic_set(&zone->dirty, 0); 1184 + atomic_set(&zone->buffer->datalen, 0); 1185 + atomic_set(&zone->buffer->start, 0); 1186 + 1187 + *off += size; 1188 + 1189 + pr_debug("pszone %s: off 0x%llx, %zu header, %zu data\n", zone->name, 1190 + zone->off, sizeof(*zone->buffer), zone->buffer_size); 1191 + return zone; 1192 + } 1193 + 1194 + static struct pstore_zone **psz_init_zones(enum pstore_type_id type, 1195 + loff_t *off, size_t total_size, ssize_t record_size, 1196 + unsigned int *cnt) 1197 + { 1198 + struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info; 1199 + struct pstore_zone **zones, *zone; 1200 + const char *name = pstore_type_to_name(type); 1201 + int c, i; 1202 + 1203 + *cnt = 0; 1204 + if (!total_size || !record_size) 1205 + return NULL; 1206 + 1207 + if (*off + total_size > info->total_size) { 1208 + pr_err("no room for zones %s (0x%zx@0x%llx over 0x%lx)\n", 
1209 + name, total_size, *off, info->total_size); 1210 + return ERR_PTR(-ENOMEM); 1211 + } 1212 + 1213 + c = total_size / record_size; 1214 + zones = kcalloc(c, sizeof(*zones), GFP_KERNEL); 1215 + if (!zones) { 1216 + pr_err("allocate for zones %s failed\n", name); 1217 + return ERR_PTR(-ENOMEM); 1218 + } 1219 + memset(zones, 0, c * sizeof(*zones)); 1220 + 1221 + for (i = 0; i < c; i++) { 1222 + zone = psz_init_zone(type, off, record_size); 1223 + if (!zone || IS_ERR(zone)) { 1224 + pr_err("initialize zones %s failed\n", name); 1225 + psz_free_zones(&zones, &i); 1226 + return (void *)zone; 1227 + } 1228 + zones[i] = zone; 1229 + } 1230 + 1231 + *cnt = c; 1232 + return zones; 1233 + } 1234 + 1235 + static int psz_alloc_zones(struct psz_context *cxt) 1236 + { 1237 + struct pstore_zone_info *info = cxt->pstore_zone_info; 1238 + loff_t off = 0; 1239 + int err; 1240 + size_t off_size = 0; 1241 + 1242 + off_size += info->pmsg_size; 1243 + cxt->ppsz = psz_init_zone(PSTORE_TYPE_PMSG, &off, info->pmsg_size); 1244 + if (IS_ERR(cxt->ppsz)) { 1245 + err = PTR_ERR(cxt->ppsz); 1246 + cxt->ppsz = NULL; 1247 + goto free_out; 1248 + } 1249 + 1250 + off_size += info->console_size; 1251 + cxt->cpsz = psz_init_zone(PSTORE_TYPE_CONSOLE, &off, 1252 + info->console_size); 1253 + if (IS_ERR(cxt->cpsz)) { 1254 + err = PTR_ERR(cxt->cpsz); 1255 + cxt->cpsz = NULL; 1256 + goto free_out; 1257 + } 1258 + 1259 + off_size += info->ftrace_size; 1260 + cxt->fpszs = psz_init_zones(PSTORE_TYPE_FTRACE, &off, 1261 + info->ftrace_size, 1262 + info->ftrace_size / nr_cpu_ids, 1263 + &cxt->ftrace_max_cnt); 1264 + if (IS_ERR(cxt->fpszs)) { 1265 + err = PTR_ERR(cxt->fpszs); 1266 + cxt->fpszs = NULL; 1267 + goto free_out; 1268 + } 1269 + 1270 + cxt->kpszs = psz_init_zones(PSTORE_TYPE_DMESG, &off, 1271 + info->total_size - off_size, 1272 + info->kmsg_size, &cxt->kmsg_max_cnt); 1273 + if (IS_ERR(cxt->kpszs)) { 1274 + err = PTR_ERR(cxt->kpszs); 1275 + cxt->kpszs = NULL; 1276 + goto free_out; 1277 + } 1278 + 1279 
+ return 0; 1280 + free_out: 1281 + psz_free_all_zones(cxt); 1282 + return err; 1283 + } 1284 + 1285 + /** 1286 + * register_pstore_zone() - register to pstore/zone 1287 + * 1288 + * @info: back-end driver information. See &struct pstore_zone_info. 1289 + * 1290 + * Only one back-end can be registered at a time. 1291 + * 1292 + * Return: 0 on success, others on failure. 1293 + */ 1294 + int register_pstore_zone(struct pstore_zone_info *info) 1295 + { 1296 + int err = -EINVAL; 1297 + struct psz_context *cxt = &pstore_zone_cxt; 1298 + 1299 + if (info->total_size < 4096) { 1300 + pr_warn("total_size must be >= 4096\n"); 1301 + return -EINVAL; 1302 + } 1303 + 1304 + if (!info->kmsg_size && !info->pmsg_size && !info->console_size && 1305 + !info->ftrace_size) { 1306 + pr_warn("at least one record size must be non-zero\n"); 1307 + return -EINVAL; 1308 + } 1309 + 1310 + if (!info->name || !info->name[0]) 1311 + return -EINVAL; 1312 + 1313 + #define check_size(name, size) { \ 1314 + if (info->name > 0 && info->name < (size)) { \ 1315 + pr_err(#name " must be over %d\n", (size)); \ 1316 + return -EINVAL; \ 1317 + } \ 1318 + if (info->name & (size - 1)) { \ 1319 + pr_err(#name " must be a multiple of %d\n", \ 1320 + (size)); \ 1321 + return -EINVAL; \ 1322 + } \ 1323 + } 1324 + 1325 + check_size(total_size, 4096); 1326 + check_size(kmsg_size, SECTOR_SIZE); 1327 + check_size(pmsg_size, SECTOR_SIZE); 1328 + check_size(console_size, SECTOR_SIZE); 1329 + check_size(ftrace_size, SECTOR_SIZE); 1330 + 1331 + #undef check_size 1332 + 1333 + /* 1334 + * The @read and @write callbacks are required. 1335 + * Without @read, mounting pstore will fail. 1336 + * Without @write, records cannot be removed from pstorefs.
1337 + */ 1338 + if (!info->read || !info->write) { 1339 + pr_err("no valid general read/write interface\n"); 1340 + return -EINVAL; 1341 + } 1342 + 1343 + mutex_lock(&cxt->pstore_zone_info_lock); 1344 + if (cxt->pstore_zone_info) { 1345 + pr_warn("'%s' already loaded: ignoring '%s'\n", 1346 + cxt->pstore_zone_info->name, info->name); 1347 + mutex_unlock(&cxt->pstore_zone_info_lock); 1348 + return -EBUSY; 1349 + } 1350 + cxt->pstore_zone_info = info; 1351 + 1352 + pr_debug("register %s with properties:\n", info->name); 1353 + pr_debug("\ttotal size : %ld Bytes\n", info->total_size); 1354 + pr_debug("\tkmsg size : %ld Bytes\n", info->kmsg_size); 1355 + pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size); 1356 + pr_debug("\tconsole size : %ld Bytes\n", info->console_size); 1357 + pr_debug("\tftrace size : %ld Bytes\n", info->ftrace_size); 1358 + 1359 + err = psz_alloc_zones(cxt); 1360 + if (err) { 1361 + pr_err("alloc zones failed\n"); 1362 + goto fail_out; 1363 + } 1364 + 1365 + if (info->kmsg_size) { 1366 + cxt->pstore.bufsize = cxt->kpszs[0]->buffer_size - 1367 + sizeof(struct psz_kmsg_header); 1368 + cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL); 1369 + if (!cxt->pstore.buf) { 1370 + err = -ENOMEM; 1371 + goto fail_free; 1372 + } 1373 + } 1374 + cxt->pstore.data = cxt; 1375 + 1376 + pr_info("registered %s as backend for", info->name); 1377 + cxt->pstore.max_reason = info->max_reason; 1378 + cxt->pstore.name = info->name; 1379 + if (info->kmsg_size) { 1380 + cxt->pstore.flags |= PSTORE_FLAGS_DMESG; 1381 + pr_cont(" kmsg(%s", 1382 + kmsg_dump_reason_str(cxt->pstore.max_reason)); 1383 + if (cxt->pstore_zone_info->panic_write) 1384 + pr_cont(",panic_write"); 1385 + pr_cont(")"); 1386 + } 1387 + if (info->pmsg_size) { 1388 + cxt->pstore.flags |= PSTORE_FLAGS_PMSG; 1389 + pr_cont(" pmsg"); 1390 + } 1391 + if (info->console_size) { 1392 + cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE; 1393 + pr_cont(" console"); 1394 + } 1395 + if (info->ftrace_size) { 1396 
+ cxt->pstore.flags |= PSTORE_FLAGS_FTRACE; 1397 + pr_cont(" ftrace"); 1398 + } 1399 + pr_cont("\n"); 1400 + 1401 + err = pstore_register(&cxt->pstore); 1402 + if (err) { 1403 + pr_err("registering with pstore failed\n"); 1404 + goto fail_free; 1405 + } 1406 + mutex_unlock(&pstore_zone_cxt.pstore_zone_info_lock); 1407 + 1408 + return 0; 1409 + 1410 + fail_free: 1411 + kfree(cxt->pstore.buf); 1412 + cxt->pstore.buf = NULL; 1413 + cxt->pstore.bufsize = 0; 1414 + psz_free_all_zones(cxt); 1415 + fail_out: 1416 + pstore_zone_cxt.pstore_zone_info = NULL; 1417 + mutex_unlock(&pstore_zone_cxt.pstore_zone_info_lock); 1418 + return err; 1419 + } 1420 + EXPORT_SYMBOL_GPL(register_pstore_zone); 1421 + 1422 + /** 1423 + * unregister_pstore_zone() - unregister to pstore/zone 1424 + * 1425 + * @info: back-end driver information. See struct pstore_zone_info. 1426 + */ 1427 + void unregister_pstore_zone(struct pstore_zone_info *info) 1428 + { 1429 + struct psz_context *cxt = &pstore_zone_cxt; 1430 + 1431 + mutex_lock(&cxt->pstore_zone_info_lock); 1432 + if (!cxt->pstore_zone_info) { 1433 + mutex_unlock(&cxt->pstore_zone_info_lock); 1434 + return; 1435 + } 1436 + 1437 + /* Stop incoming writes from pstore. */ 1438 + pstore_unregister(&cxt->pstore); 1439 + 1440 + /* Flush any pending writes. */ 1441 + psz_flush_all_dirty_zones(NULL); 1442 + flush_delayed_work(&psz_cleaner); 1443 + 1444 + /* Clean up allocations. */ 1445 + kfree(cxt->pstore.buf); 1446 + cxt->pstore.buf = NULL; 1447 + cxt->pstore.bufsize = 0; 1448 + cxt->pstore_zone_info = NULL; 1449 + 1450 + psz_free_all_zones(cxt); 1451 + 1452 + /* Clear counters and zone state. 
*/ 1453 + cxt->oops_counter = 0; 1454 + cxt->panic_counter = 0; 1455 + atomic_set(&cxt->recovered, 0); 1456 + atomic_set(&cxt->on_panic, 0); 1457 + 1458 + mutex_unlock(&cxt->pstore_zone_info_lock); 1459 + } 1460 + EXPORT_SYMBOL_GPL(unregister_pstore_zone); 1461 + 1462 + MODULE_LICENSE("GPL"); 1463 + MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>"); 1464 + MODULE_AUTHOR("Kees Cook <keescook@chromium.org>"); 1465 + MODULE_DESCRIPTION("Storage Manager for pstore/blk");
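The two-part write in psz_record_write() treats each zone as a circular buffer: `start` is the next write offset, and `datalen` saturates at the zone size once the log has wrapped at least once. A minimal userspace sketch of that logic (the `mini_zone` name and the 8-byte zone size are illustrative, not from the kernel):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model of psz_record_write()'s circular layout. */
struct mini_zone {
	char data[8];
	size_t start;    /* next write position */
	size_t datalen;  /* valid bytes, capped at sizeof(data) */
};

static void mini_record_write(struct mini_zone *z, const char *buf, size_t cnt)
{
	size_t rem;

	/* Oversized input: keep only the newest sizeof(data) bytes. */
	if (cnt > sizeof(z->data)) {
		buf += cnt - sizeof(z->data);
		cnt = sizeof(z->data);
	}

	rem = sizeof(z->data) - z->start;
	if (rem < cnt) {
		/* First part fills up to the end of the zone... */
		if (rem)
			memcpy(z->data + z->start, buf, rem);
		buf += rem;
		cnt -= rem;
		z->start = 0;
		z->datalen = sizeof(z->data); /* wrapped: zone is full */
	}

	/* ...second (or only) part lands at the current start. */
	memcpy(z->data + z->start, buf, cnt);
	z->start += cnt;
	if (z->datalen < sizeof(z->data))
		z->datalen = z->start;
}
```

The saturating `datalen` is exactly why psz_record_write() needs the `is_full_data` fixup above: after a wrap, `start + cnt` no longer reflects the amount of valid data.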
+9 -3
include/linux/kmsg_dump.h
··· 25 25 KMSG_DUMP_PANIC, 26 26 KMSG_DUMP_OOPS, 27 27 KMSG_DUMP_EMERG, 28 - KMSG_DUMP_RESTART, 29 - KMSG_DUMP_HALT, 30 - KMSG_DUMP_POWEROFF, 28 + KMSG_DUMP_SHUTDOWN, 29 + KMSG_DUMP_MAX 31 30 }; 32 31 33 32 /** ··· 70 71 int kmsg_dump_register(struct kmsg_dumper *dumper); 71 72 72 73 int kmsg_dump_unregister(struct kmsg_dumper *dumper); 74 + 75 + const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason); 73 76 #else 74 77 static inline void kmsg_dump(enum kmsg_dump_reason reason) 75 78 { ··· 112 111 static inline int kmsg_dump_unregister(struct kmsg_dumper *dumper) 113 112 { 114 113 return -EINVAL; 114 + } 115 + 116 + static inline const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason) 117 + { 118 + return "Disabled"; 115 119 } 116 120 #endif 117 121
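The collapsed reason list pairs with the per-dumper filtering added elsewhere in this series ("printk: honor the max_reason field in kmsg_dumper"): lower enum values are more severe, so a dumper fires only for reasons at or below its threshold. A self-contained sketch of that comparison (the enum is copied from the hunk above; the function name and the UNDEF-falls-back-to-OOPS default are illustrative, modeling the historic dump_oops behavior):

```c
#include <assert.h>

/* Illustrative copy of the reduced enum from this hunk. */
enum kmsg_dump_reason {
	KMSG_DUMP_UNDEF,
	KMSG_DUMP_PANIC,
	KMSG_DUMP_OOPS,
	KMSG_DUMP_EMERG,
	KMSG_DUMP_SHUTDOWN,
	KMSG_DUMP_MAX
};

/* A dumper fires when the reason is at least as severe as its
 * threshold; UNDEF falls back to a default. */
static int dumper_wants(enum kmsg_dump_reason reason,
			enum kmsg_dump_reason max_reason)
{
	if (max_reason == KMSG_DUMP_UNDEF)
		max_reason = KMSG_DUMP_OOPS;
	return reason <= max_reason;
}
```

Note that KMSG_DUMP_MAX acts as "accept everything", which is what `printk.always_kmsg_dump` maps to.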
+8 -1
include/linux/pstore.h
··· 96 96 * 97 97 * @read_mutex: serializes @open, @read, @close, and @erase callbacks 98 98 * @flags: bitfield of frontends the backend can accept writes for 99 + * @max_reason: Used when PSTORE_FLAGS_DMESG is set. Contains the 100 + * kmsg_dump_reason enum value. KMSG_DUMP_UNDEF means 101 + * "use existing kmsg_dump() filtering, based on the 102 + * printk.always_kmsg_dump boot param" (which is either 103 + * KMSG_DUMP_OOPS when false, or KMSG_DUMP_MAX when 104 + * true); see printk.always_kmsg_dump for more details. 99 105 * @data: backend-private pointer passed back during callbacks 100 106 * 101 107 * Callbacks: ··· 176 170 */ 177 171 struct pstore_info { 178 172 struct module *owner; 179 - char *name; 173 + const char *name; 180 174 181 175 struct semaphore buf_lock; 182 176 char *buf; ··· 185 179 struct mutex read_mutex; 186 180 187 181 int flags; 182 + int max_reason; 188 183 void *data; 189 184 190 185 int (*open)(struct pstore_info *psi);
+118
include/linux/pstore_blk.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + 3 + #ifndef __PSTORE_BLK_H_ 4 + #define __PSTORE_BLK_H_ 5 + 6 + #include <linux/types.h> 7 + #include <linux/pstore.h> 8 + #include <linux/pstore_zone.h> 9 + 10 + /** 11 + * typedef pstore_blk_panic_write_op - panic write operation to block device 12 + * 13 + * @buf: the data to write 14 + * @start_sect: start sector to block device 15 + * @sects: sectors count on buf 16 + * 17 + * Return: On success, zero should be returned. Others excluding -ENOMSG 18 + * mean error. -ENOMSG means to try next zone. 19 + * 20 + * Panic write to block device must be aligned to SECTOR_SIZE. 21 + */ 22 + typedef int (*pstore_blk_panic_write_op)(const char *buf, sector_t start_sect, 23 + sector_t sects); 24 + 25 + /** 26 + * struct pstore_blk_info - pstore/blk registration details 27 + * 28 + * @major: Which major device number to support with pstore/blk 29 + * @flags: The supported PSTORE_FLAGS_* from linux/pstore.h. 30 + * @panic_write:The write operation only used for the panic case. 31 + * This can be NULL, but is recommended to avoid losing 32 + * crash data if the kernel's IO path or work queues are 33 + * broken during a panic. 34 + * @devt: The dev_t that pstore/blk has attached to. 35 + * @nr_sects: Number of sectors on @devt. 36 + * @start_sect: Starting sector on @devt. 37 + */ 38 + struct pstore_blk_info { 39 + unsigned int major; 40 + unsigned int flags; 41 + pstore_blk_panic_write_op panic_write; 42 + 43 + /* Filled in by pstore/blk after registration. */ 44 + dev_t devt; 45 + sector_t nr_sects; 46 + sector_t start_sect; 47 + }; 48 + 49 + int register_pstore_blk(struct pstore_blk_info *info); 50 + void unregister_pstore_blk(unsigned int major); 51 + 52 + /** 53 + * struct pstore_device_info - back-end pstore/blk driver structure. 54 + * 55 + * @total_size: The total size in bytes pstore/blk can use. It must be greater 56 + * than 4096 and be multiple of 4096. 
57 + * @flags: Bitfield of the PSTORE_FLAGS_* macros defined in 58 + * linux/pstore.h, indicating which frontends this device 59 + * supports. Zero means all frontends, for compatibility. 60 + * @read: The general read operation. Both function parameters 61 + * @size and @offset are relative to the block device (not the 62 + * whole disk). 63 + * On success, the number of bytes read should be returned; 64 + * other values mean an error. 65 + * @write: The same as @read, but with the following error semantics: 66 + * -EBUSY means try to write again later. 67 + * -ENOMSG means to try the next zone. 68 + * @erase: The general erase operation, for devices that need a special 69 + * erase step. Both function parameters @size and @offset are 70 + * relative to the storage. 71 + * Returns 0 on success and others on failure. 72 + * @panic_write:The write operation used only in the panic case. It is 73 + * optional if you do not care about panic logs. The parameters 74 + * are relative to the storage. 75 + * On success, the number of bytes written should be returned; 76 + * other values excluding -ENOMSG mean an error. -ENOMSG means to try the next zone.
77 + */ 78 + struct pstore_device_info { 79 + unsigned long total_size; 80 + unsigned int flags; 81 + pstore_zone_read_op read; 82 + pstore_zone_write_op write; 83 + pstore_zone_erase_op erase; 84 + pstore_zone_write_op panic_write; 85 + }; 86 + 87 + int register_pstore_device(struct pstore_device_info *dev); 88 + void unregister_pstore_device(struct pstore_device_info *dev); 89 + 90 + /** 91 + * struct pstore_blk_config - the pstore_blk backend configuration 92 + * 93 + * @device: Name of the desired block device 94 + * @max_reason: Maximum kmsg dump reason to store to block device 95 + * @kmsg_size: Total size of the kmsg dump storage area 96 + * @pmsg_size: Total size of the pmsg storage area 97 + * @console_size: Total size of the console storage area 98 + * @ftrace_size: Total size for ftrace logging data (for all CPUs) 99 + */ 100 + struct pstore_blk_config { 101 + char device[80]; 102 + enum kmsg_dump_reason max_reason; 103 + unsigned long kmsg_size; 104 + unsigned long pmsg_size; 105 + unsigned long console_size; 106 + unsigned long ftrace_size; 107 + }; 108 + 109 + /** 110 + * pstore_blk_get_config - get a copy of the pstore_blk backend configuration 111 + * 112 + * @info: The struct pstore_blk_config to be filled in 113 + * 114 + * Returns 0 on success, or a negative error code on failure. 115 + */ 116 + int pstore_blk_get_config(struct pstore_blk_config *info); 117 + 118 + #endif
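The "greater than 4096 and a multiple of 4096" rule on @total_size is enforced at registration time (the check_size() macro in register_pstore_zone()) with a mask test: for a power-of-two alignment, `v & (align - 1)` is non-zero exactly when v is not a multiple of the alignment. A userspace model of that check (function name is illustrative):

```c
#include <assert.h>

/* Model of the check_size() logic: zero means "disabled" and is
 * accepted; otherwise the value must be >= align and a multiple of
 * align, where align is a power of two (4096 or SECTOR_SIZE). */
static int psz_size_ok(unsigned long v, unsigned long align)
{
	if (v > 0 && v < align)
		return 0;          /* too small to hold even one unit */
	if (v & (align - 1))
		return 0;          /* not a multiple of align */
	return 1;
}
```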
+1 -1
include/linux/pstore_ram.h
··· 133 133 unsigned long console_size; 134 134 unsigned long ftrace_size; 135 135 unsigned long pmsg_size; 136 - int dump_oops; 136 + int max_reason; 137 137 u32 flags; 138 138 struct persistent_ram_ecc_info ecc_info; 139 139 };
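Callers that previously set the removed dump_oops flag are converted by "pstore/ram: Introduce max_reason and convert dump_oops" in this series. A sketch of the assumed mapping (enum copied from the kmsg_dump.h hunk above; the helper name is illustrative): dumping oopses means accepting everything up to OOPS, otherwise panics only.

```c
#include <assert.h>

/* Illustrative copy of the reduced reason enum. */
enum kmsg_dump_reason {
	KMSG_DUMP_UNDEF,
	KMSG_DUMP_PANIC,
	KMSG_DUMP_OOPS,
	KMSG_DUMP_EMERG,
	KMSG_DUMP_SHUTDOWN,
	KMSG_DUMP_MAX
};

/* Assumed legacy conversion: dump_oops=1 stored oopses and panics,
 * dump_oops=0 stored panics only. */
static enum kmsg_dump_reason ramoops_max_reason(int dump_oops)
{
	return dump_oops ? KMSG_DUMP_OOPS : KMSG_DUMP_PANIC;
}
```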
+60
include/linux/pstore_zone.h
···
1 + /* SPDX-License-Identifier: GPL-2.0 */
2 +
3 + #ifndef __PSTORE_ZONE_H_
4 + #define __PSTORE_ZONE_H_
5 +
6 + #include <linux/types.h>
7 +
8 + typedef ssize_t (*pstore_zone_read_op)(char *, size_t, loff_t);
9 + typedef ssize_t (*pstore_zone_write_op)(const char *, size_t, loff_t);
10 + typedef ssize_t (*pstore_zone_erase_op)(size_t, loff_t);
11 + /**
12 +  * struct pstore_zone_info - pstore/zone back-end driver structure
13 +  *
14 +  * @owner:	Module which is responsible for this back-end driver.
15 +  * @name:	Name of the back-end driver.
16 +  * @total_size: The total size in bytes pstore/zone can use. It must be
17 +  *		greater than 4096 and a multiple of 4096.
18 +  * @kmsg_size:	The size of the oops/panic zone. Zero means disabled;
19 +  *		otherwise, it must be a multiple of SECTOR_SIZE (512 bytes).
20 +  * @max_reason: Maximum kmsg dump reason to store.
21 +  * @pmsg_size:	The size of the pmsg zone; same constraints as @kmsg_size.
22 +  * @console_size: The size of the console zone; same constraints as @kmsg_size.
23 +  * @ftrace_size: The size of the ftrace zone; same constraints as @kmsg_size.
24 +  * @read:	The general read operation. Both function parameters
25 +  *		@size and @offset are relative to the storage.
26 +  *		On success, the number of bytes read should be returned;
27 +  *		any other value means an error.
28 +  * @write:	The same as @read, except for the following error codes:
29 +  *		-EBUSY means try the write again later,
30 +  *		-ENOMSG means try the next zone.
31 +  * @erase:	The general erase operation, for devices that need a
32 +  *		special removal job. Both function parameters @size and
33 +  *		@offset are relative to the storage.
34 +  *		Returns 0 on success, anything else on failure.
35 +  * @panic_write: The write operation used only in the panic case. It is
36 +  *		optional if you do not care about the panic log. The
37 +  *		parameters are relative to the storage.
38 +  *		On success, the number of bytes written should be returned.
39 +  *		Any other value except -ENOMSG means an error; -ENOMSG means try the next zone.
40 +  */
41 + struct pstore_zone_info {
42 + 	struct module *owner;
43 + 	const char *name;
44 +
45 + 	unsigned long total_size;
46 + 	unsigned long kmsg_size;
47 + 	int max_reason;
48 + 	unsigned long pmsg_size;
49 + 	unsigned long console_size;
50 + 	unsigned long ftrace_size;
51 + 	pstore_zone_read_op read;
52 + 	pstore_zone_write_op write;
53 + 	pstore_zone_erase_op erase;
54 + 	pstore_zone_write_op panic_write;
55 + };
56 +
57 + extern int register_pstore_zone(struct pstore_zone_info *info);
58 + extern void unregister_pstore_zone(struct pstore_zone_info *info);
59 +
60 + #endif
+28 -4
kernel/printk/printk.c
···
3144 3144 static bool always_kmsg_dump;
3145 3145 module_param_named(always_kmsg_dump, always_kmsg_dump, bool, S_IRUGO | S_IWUSR);
3146 3146
     3147 + const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason)
     3148 + {
     3149 + 	switch (reason) {
     3150 + 	case KMSG_DUMP_PANIC:
     3151 + 		return "Panic";
     3152 + 	case KMSG_DUMP_OOPS:
     3153 + 		return "Oops";
     3154 + 	case KMSG_DUMP_EMERG:
     3155 + 		return "Emergency";
     3156 + 	case KMSG_DUMP_SHUTDOWN:
     3157 + 		return "Shutdown";
     3158 + 	default:
     3159 + 		return "Unknown";
     3160 + 	}
     3161 + }
     3162 + EXPORT_SYMBOL_GPL(kmsg_dump_reason_str);
     3163 +
3147 3164 /**
3148 3165  * kmsg_dump - dump kernel log to kernel message dumpers.
3149 3166  * @reason: the reason (oops, panic etc) for dumping
···
3174 3157 	struct kmsg_dumper *dumper;
3175 3158 	unsigned long flags;
3176 3159
3177      -	if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
3178      -		return;
3179      -
3180 3160 	rcu_read_lock();
3181 3161 	list_for_each_entry_rcu(dumper, &dump_list, list) {
3182      -		if (dumper->max_reason && reason > dumper->max_reason)
     3162 +		enum kmsg_dump_reason max_reason = dumper->max_reason;
     3163 +
     3164 +		/*
     3165 +		 * If client has not provided a specific max_reason, default
     3166 +		 * to KMSG_DUMP_OOPS, unless always_kmsg_dump was set.
     3167 +		 */
     3168 +		if (max_reason == KMSG_DUMP_UNDEF) {
     3169 +			max_reason = always_kmsg_dump ? KMSG_DUMP_MAX :
     3170 +				KMSG_DUMP_OOPS;
     3171 +		}
     3172 +		if (reason > max_reason)
3183 3173 			continue;
3184 3174
3185 3175 		/* initialize iterator with data about the stored records */
+3 -3
kernel/reboot.c
···
250 250 		pr_emerg("Restarting system\n");
251 251 	else
252 252 		pr_emerg("Restarting system with command '%s'\n", cmd);
253     -	kmsg_dump(KMSG_DUMP_RESTART);
    253 +	kmsg_dump(KMSG_DUMP_SHUTDOWN);
254 254 	machine_restart(cmd);
255 255 }
256 256 EXPORT_SYMBOL_GPL(kernel_restart);
···
274 274 	migrate_to_reboot_cpu();
275 275 	syscore_shutdown();
276 276 	pr_emerg("System halted\n");
277     -	kmsg_dump(KMSG_DUMP_HALT);
    277 +	kmsg_dump(KMSG_DUMP_SHUTDOWN);
278 278 	machine_halt();
279 279 }
280 280 EXPORT_SYMBOL_GPL(kernel_halt);
···
292 292 	migrate_to_reboot_cpu();
293 293 	syscore_shutdown();
294 294 	pr_emerg("Power down\n");
295     -	kmsg_dump(KMSG_DUMP_POWEROFF);
    295 +	kmsg_dump(KMSG_DUMP_SHUTDOWN);
296 296 	machine_power_off();
297 297 }
298 298 EXPORT_SYMBOL_GPL(kernel_power_off);
+1 -1
tools/testing/selftests/pstore/pstore_tests
···
10 10 . ./common_tests
11 11
12 12 prlog -n "Checking pstore console is registered ... "
13    -dmesg | grep -q "console \[pstore"
   13 +dmesg | grep -Eq "console \[(pstore|${backend})"
14 14 show_result $?
15 15
16 16 prlog -n "Checking /dev/pmsg0 exists ... "
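The widened pattern accepts either the generic "pstore" console name or a backend-specific one. A standalone sketch of the matching behavior (the `check` function and the hard-coded `backend=ramoops` are assumptions for illustration; in the selftest, `backend` comes from `common_tests`):

```shell
#!/bin/sh
backend=ramoops

# Echo "match" if the line would satisfy the selftest's grep, else "no-match".
check() {
	echo "$1" | grep -Eq "console \[(pstore|${backend})" \
		&& echo match || echo no-match
}

check "console [pstore0] enabled"   # match
check "console [ramoops-1] enabled" # match
check "console [ttyS0] enabled"     # no-match
```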