Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

panic: fix the panic_print NMI backtrace setting

Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
panic_print") introduced a setting for the "panic_print" kernel parameter
that lets users request an NMI backtrace on panic. The problem is that
panic_print handling happens after the secondary CPUs have already been
disabled, so the option ended up being effectively a no-op: the kernel skips
the NMI trace on idling CPUs, which is the case for offline CPUs.

Fix it by checking the NMI backtrace bit in panic_print before calling the
CPU disabling function.

Link: https://lkml.kernel.org/r/20230226160838.414257-1-gpiccoli@igalia.com
Fixes: 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in panic_print")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Cc: <stable@vger.kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Guilherme G. Piccoli and committed by Andrew Morton
b905039e 359d6255

+26 -18
kernel/panic.c
···
 		return;
 	}
 
-	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
-		trigger_all_cpu_backtrace();
-
 	if (panic_print & PANIC_PRINT_TASK_INFO)
 		show_state();
 
···
 	if (atomic_inc_return(&warn_count) >= limit && limit)
 		panic("%s: system warned too often (kernel.warn_limit is %d)",
 		      origin, limit);
+}
+
+/*
+ * Helper that triggers the NMI backtrace (if set in panic_print)
+ * and then performs the secondary CPUs shutdown - we cannot have
+ * the NMI backtrace after the CPUs are off!
+ */
+static void panic_other_cpus_shutdown(bool crash_kexec)
+{
+	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
+		trigger_all_cpu_backtrace();
+
+	/*
+	 * Note that smp_send_stop() is the usual SMP shutdown function,
+	 * which unfortunately may not be hardened to work in a panic
+	 * situation. If we want to do crash dump after notifier calls
+	 * and kmsg_dump, we will need architecture dependent extra
+	 * bits in addition to stopping other CPUs, hence we rely on
+	 * crash_smp_send_stop() for that.
+	 */
+	if (!crash_kexec)
+		smp_send_stop();
+	else
+		crash_smp_send_stop();
 }
 
 /**
···
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
-	if (!_crash_kexec_post_notifiers) {
+	if (!_crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
-		/*
-		 * Note smp_send_stop is the usual smp shutdown function, which
-		 * unfortunately means it may not be hardened to work in a
-		 * panic situation.
-		 */
-		smp_send_stop();
-	} else {
-		/*
-		 * If we want to do crash dump after notifier calls and
-		 * kmsg_dump, we will need architecture dependent extra
-		 * works in addition to stopping other CPUs.
-		 */
-		crash_smp_send_stop();
-	}
+	panic_other_cpus_shutdown(_crash_kexec_post_notifiers);
 
 	/*
 	 * Run any panic handlers, including those that might need to
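
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_*` name, the CPU count, and the `MODEL_PRINT_ALL_CPU_BT` value are hypothetical stand-ins, and the model simply assumes that a stopped CPU never produces a trace. It counts only traces from secondary CPUs and ignores the CPU running the panic path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4               /* hypothetical CPU count */
#define MODEL_PANIC_CPU 0             /* CPU running the panic path */
#define MODEL_PRINT_ALL_CPU_BT 0x40u  /* stand-in bit; the model only needs one distinct flag */

struct model_system {
	bool online[MODEL_NR_CPUS];
	unsigned int panic_print;
};

void model_init(struct model_system *sys, unsigned int panic_print)
{
	assert(sys != NULL);
	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sys->online[cpu] = true;
	sys->panic_print = panic_print;
}

/* Stand-in for the backtrace request: only online secondary CPUs respond. */
int model_trigger_backtrace(const struct model_system *sys)
{
	int traces = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != MODEL_PANIC_CPU && sys->online[cpu])
			traces++;
	return traces;
}

/* Stand-in for the CPU disabling step: every secondary CPU goes offline. */
void model_stop_other_cpus(struct model_system *sys)
{
	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != MODEL_PANIC_CPU)
			sys->online[cpu] = false;
}

/* Ordering before the fix: CPUs are stopped first, backtrace is requested later. */
int model_panic_old(unsigned int panic_print)
{
	struct model_system sys;
	int traces = 0;

	model_init(&sys, panic_print);
	model_stop_other_cpus(&sys);
	if (sys.panic_print & MODEL_PRINT_ALL_CPU_BT)
		traces = model_trigger_backtrace(&sys);
	return traces;
}

/* Ordering after the fix: the backtrace bit is checked before stopping CPUs. */
int model_panic_new(unsigned int panic_print)
{
	struct model_system sys;
	int traces = 0;

	model_init(&sys, panic_print);
	if (sys.panic_print & MODEL_PRINT_ALL_CPU_BT)
		traces = model_trigger_backtrace(&sys);
	model_stop_other_cpus(&sys);
	return traces;
}
```

With the flag set, `model_panic_old()` returns 0 because no secondary CPU is left to respond, while `model_panic_new()` returns `MODEL_NR_CPUS - 1`. This mirrors why the patch moves the check into the helper that runs immediately before the shutdown call.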