Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

reboot: add support for configuring emergency hardware protection action

We currently leave the decision of whether to shut down or reboot to
protect hardware in an emergency situation to the individual drivers.

This works in some cases, where the driver detecting the critical
failure has inside knowledge: for example, it binds to the system
management controller or is guided by a hardware description that defines
what to do.

In the general case, however, the driver detecting the issue can't know
what the appropriate course of action is and shouldn't be dictating the
policy of dealing with it.

Therefore, add a global hw_protection toggle that allows the user to
specify whether shutdown or reboot should be the default action when the
driver doesn't set policy.

This introduces no functional change yet as hw_protection_trigger() has no
callers, but these will be added in subsequent commits.

[arnd@arndb.de: hide unused hw_protection_attr]
Link: https://lkml.kernel.org/r/20250224141849.1546019-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20250217-hw_protection-reboot-v3-7-e1c09b090c0c@pengutronix.de
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Fabio Estevam <festevam@denx.de>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matteo Croce <teknoraver@meta.com>
Cc: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Ahmad Fatoum and committed by Andrew Morton
e016173f 96201a8a

+84 -1
Documentation/ABI/testing/sysfs-kernel-reboot (+8)
···
 Contact:	Matteo Croce <mcroce@microsoft.com>
 Description:	Don't wait for any other CPUs on reboot and
 		avoid anything that could hang.
+
+What:		/sys/kernel/reboot/hw_protection
+Date:		April 2025
+KernelVersion:	6.15
+Contact:	Ahmad Fatoum <a.fatoum@pengutronix.de>
+Description:	Hardware protection action taken on critical events like
+		overtemperature or imminent voltage loss.
+		Valid values are: reboot shutdown
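
The ABI entry above describes a plain-text attribute, so a user-space tool could query or change it with ordinary file I/O. The following is a minimal sketch under that assumption, not part of the patch: the helper names hwprot_value_is_valid(), hwprot_read() and hwprot_write() are hypothetical, and only the path and the two keywords come from the documentation. Writing is expected to fail unless the caller has CAP_SYS_ADMIN, because the store handler in this commit checks for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Attribute path as documented in the ABI entry. */
#define HW_PROTECTION_PATH "/sys/kernel/reboot/hw_protection"

/* Hypothetical helper: true if value is one of the documented keywords. */
static bool hwprot_value_is_valid(const char *value)
{
	return strcmp(value, "reboot") == 0 || strcmp(value, "shutdown") == 0;
}

/*
 * Hypothetical helper: read the current action from path into buf and
 * strip the trailing newline. Returns 0 on success, -1 on any error.
 */
static int hwprot_read(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: validate value locally, then write it to path.
 * Returns 0 on success, -1 if the value is unknown or the write fails.
 */
static int hwprot_write(const char *path, const char *value)
{
	FILE *f;
	int ret = 0;

	if (!hwprot_value_is_valid(value))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(value, f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)	/* buffered write errors surface here */
		ret = -1;
	return ret;
}
```

A caller would pass HW_PROTECTION_PATH to both helpers, for example hwprot_write(HW_PROTECTION_PATH, "reboot") followed by hwprot_read() to confirm the new setting.
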
Documentation/admin-guide/kernel-parameters.txt (+6)
···
 			which allow the hypervisor to 'idle' the guest
 			on lock contention.

+	hw_protection=	[HW]
+			Format: reboot | shutdown
+
+			Hardware protection action taken on critical events like
+			overtemperature or imminent voltage loss.
+
 	i2c_bus=	[HW]	Override the default board specific I2C bus speed
 				or register an additional I2C bus that is not
 				registered from board initialization code.
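
With the parameter documented above, a boot command line might carry a token such as hw_protection=reboot. As an illustration only, the sketch below extracts that token from a command-line string of the kind found in /proc/cmdline. The function name cmdline_get_hw_protection() is hypothetical, and the parsing is deliberately simplified: it splits on whitespace, ignores quoting, and returns the last occurrence without checking that the value is one of the two documented keywords. The kernel's own parameter parsing is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the last "hw_protection=" token in a
 * whitespace-separated command line and copy its value into out.
 * Returns true if the token was present. Values longer than len - 1
 * bytes are truncated. Simplified: quoted parameters are not handled.
 */
static bool cmdline_get_hw_protection(const char *cmdline, char *out,
				      size_t len)
{
	static const char key[] = "hw_protection=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = cmdline;
	bool found = false;

	assert(cmdline != NULL && out != NULL && len > 0);

	while (*p) {
		size_t toklen;

		p += strspn(p, " \t\n");	/* skip separators */
		toklen = strcspn(p, " \t\n");	/* length of this token */

		if (toklen >= keylen && strncmp(p, key, keylen) == 0) {
			size_t vallen = toklen - keylen;

			if (vallen >= len)
				vallen = len - 1;	/* truncate to fit */
			memcpy(out, p + keylen, vallen);
			out[vallen] = '\0';
			found = true;
		}
		p += toklen;
	}
	return found;
}
```
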
include/linux/reboot.h (+21 -1)
···
 /**
  * enum hw_protection_action - Hardware protection action
  *
+ * @HWPROT_ACT_DEFAULT:
+ *	The default action should be taken. This is HWPROT_ACT_SHUTDOWN
+ *	by default, but can be overridden.
  * @HWPROT_ACT_SHUTDOWN:
  *	The system should be shut down (powered off) for HW protection.
  * @HWPROT_ACT_REBOOT:
  *	The system should be rebooted for HW protection.
  */
-enum hw_protection_action { HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
+enum hw_protection_action { HWPROT_ACT_DEFAULT, HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };

 void __hw_protection_trigger(const char *reason, int ms_until_forced,
 			     enum hw_protection_action action);
+
+/**
+ * hw_protection_trigger - Trigger default emergency system hardware protection action
+ *
+ * @reason:		Reason of emergency shutdown or reboot to be printed.
+ * @ms_until_forced:	Time to wait for orderly shutdown or reboot before
+ *			triggering it. Negative value disables the forced
+ *			shutdown or reboot.
+ *
+ * Initiate an emergency system shutdown or reboot in order to protect
+ * hardware from further damage. The exact action taken is controllable at
+ * runtime and defaults to shutdown.
+ */
+static inline void hw_protection_trigger(const char *reason, int ms_until_forced)
+{
+	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_DEFAULT);
+}

 static inline void hw_protection_reboot(const char *reason, int ms_until_forced)
 {
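
The header change above adds HWPROT_ACT_DEFAULT as a placeholder that is resolved to the configured action only when the trigger runs. The following user-space model sketches that fallback. It is not kernel code: configured_action, last_triggered, model_trigger_action() and model_trigger() are hypothetical names, and the model records the chosen action instead of shutting anything down.

```c
#include <assert.h>
#include <stddef.h>

/* Same enumerators as the patched include/linux/reboot.h. */
enum hw_protection_action {
	HWPROT_ACT_DEFAULT,
	HWPROT_ACT_SHUTDOWN,
	HWPROT_ACT_REBOOT,
};

/* Hypothetical stand-in for the kernel's file-local default. */
static enum hw_protection_action configured_action = HWPROT_ACT_SHUTDOWN;

/* Hypothetical record of the last action the model resolved to. */
static enum hw_protection_action last_triggered = HWPROT_ACT_DEFAULT;

/* Models the fallback added to __hw_protection_trigger(). */
static void model_trigger_action(const char *reason,
				 enum hw_protection_action action)
{
	assert(reason != NULL);

	/* A caller without a policy defers to the configured action. */
	if (action == HWPROT_ACT_DEFAULT)
		action = configured_action;

	last_triggered = action;
}

/* Models hw_protection_trigger(): the caller states no policy. */
static void model_trigger(const char *reason)
{
	model_trigger_action(reason, HWPROT_ACT_DEFAULT);
}
```

In this model, a caller that passes an explicit action keeps it regardless of the configured default, which mirrors the commit's intent that the toggle applies only when the driver doesn't set policy.
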
include/uapi/linux/capability.h (+1)
···
 /* Allow setting encryption key on loopback filesystem */
 /* Allow setting zone reclaim policy */
 /* Allow everything under CAP_BPF and CAP_PERFMON for backward compatibility */
+/* Allow setting hardware protection emergency action */

 #define CAP_SYS_ADMIN        21

kernel/reboot.c (+48)
···
 EXPORT_SYMBOL_GPL(reboot_mode);
 enum reboot_mode panic_reboot_mode = REBOOT_UNDEFINED;

+static enum hw_protection_action hw_protection_action = HWPROT_ACT_SHUTDOWN;
+
 /*
  * This variable is used privately to keep track of whether or not
  * reboot_type is still set to its default value (i.e., reboot= hasn't
···
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);

+	if (action == HWPROT_ACT_DEFAULT)
+		action = hw_protection_action;
+
 	pr_emerg("HARDWARE PROTECTION %s (%s)\n",
 		 hw_protection_action_str(action), reason);

···
 		orderly_poweroff(true);
 }
 EXPORT_SYMBOL_GPL(__hw_protection_trigger);
+
+static bool hw_protection_action_parse(const char *str,
+				       enum hw_protection_action *action)
+{
+	if (sysfs_streq(str, "shutdown"))
+		*action = HWPROT_ACT_SHUTDOWN;
+	else if (sysfs_streq(str, "reboot"))
+		*action = HWPROT_ACT_REBOOT;
+	else
+		return false;
+
+	return true;
+}
+
+static int __init hw_protection_setup(char *str)
+{
+	hw_protection_action_parse(str, &hw_protection_action);
+	return 1;
+}
+__setup("hw_protection=", hw_protection_setup);
+
+#ifdef CONFIG_SYSFS
+static ssize_t hw_protection_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n",
+			  hw_protection_action_str(hw_protection_action));
+}
+static ssize_t hw_protection_store(struct kobject *kobj,
+				   struct kobj_attribute *attr, const char *buf,
+				   size_t count)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!hw_protection_action_parse(buf, &hw_protection_action))
+		return -EINVAL;
+
+	return count;
+}
+static struct kobj_attribute hw_protection_attr = __ATTR_RW(hw_protection);
+#endif

 static int __init reboot_setup(char *str)
 {
···
 #endif

 static struct attribute *reboot_attrs[] = {
+	&hw_protection_attr.attr,
 	&reboot_mode_attr.attr,
 #ifdef CONFIG_X86
 	&reboot_force_attr.attr,
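
The parser in kernel/reboot.c above leaves the stored action untouched when the input is not recognized, and it compares with sysfs_streq() so that a value written with a trailing newline (as `echo` produces) still matches. The user-space model below sketches both properties. It is an approximation for illustration: model_sysfs_streq() is a reimplementation of the semantics I understand sysfs_streq() to have (equal, or differing only by one trailing newline), and model_action_parse() is a hypothetical copy of the parser's logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators as the patched include/linux/reboot.h. */
enum hw_protection_action {
	HWPROT_ACT_DEFAULT,
	HWPROT_ACT_SHUTDOWN,
	HWPROT_ACT_REBOOT,
};

/*
 * User-space model of sysfs_streq(): true if the strings are identical
 * or differ only by a single trailing newline on one of them.
 */
static bool model_sysfs_streq(const char *s1, const char *s2)
{
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}

	if (*s1 == *s2)
		return true;
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true;
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true;
	return false;
}

/*
 * Model of hw_protection_action_parse(): on success store the parsed
 * action and return true; on failure return false and leave *action
 * unchanged.
 */
static bool model_action_parse(const char *str,
			       enum hw_protection_action *action)
{
	assert(str != NULL && action != NULL);

	if (model_sysfs_streq(str, "shutdown"))
		*action = HWPROT_ACT_SHUTDOWN;
	else if (model_sysfs_streq(str, "reboot"))
		*action = HWPROT_ACT_REBOOT;
	else
		return false;

	return true;
}
```

Because a failed parse does not modify its output, the boot-time handler in the patch can ignore the return value: an unrecognized hw_protection= value simply keeps the shutdown default, whereas the sysfs store handler reports -EINVAL.
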