Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Version: v2.6.21-rc2
================================================================
Documentation for Kdump - The kexec-based Crash Dumping Solution
================================================================

This document includes overview, setup and installation, and analysis
information.

Overview
========

Kdump uses kexec to quickly boot to a dump-capture kernel whenever a
dump of the system kernel's memory needs to be taken (for example, when
the system panics). The system kernel's memory image is preserved across
the reboot and is accessible to the dump-capture kernel.

You can use common Linux commands, such as cp and scp, to copy the
memory image to a dump file on the local disk, or across the network to
a remote system.

Kdump and kexec are currently supported on the x86, x86_64, ppc64 and ia64
architectures.

When the system kernel boots, it reserves a small section of memory for
the dump-capture kernel. This ensures that ongoing Direct Memory Access
(DMA) from the system kernel does not corrupt the dump-capture kernel.
The kexec -p command loads the dump-capture kernel into this reserved
memory.

On x86 machines, the first 640 KB of physical memory is needed to boot,
regardless of where the kernel loads. Therefore, kexec backs up this
region just before rebooting into the dump-capture kernel.

Similarly, on PPC64 machines the first 32 KB of physical memory is needed
to boot, regardless of where the kernel is loaded. To support a 64 KB page
size, kexec backs up the first 64 KB of memory.

All of the necessary information about the system kernel's core image is
encoded in the ELF format, and stored in a reserved area of memory
before a crash. The physical address of the start of the ELF header is
passed to the dump-capture kernel through the elfcorehdr= boot
parameter.
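
As a small illustration of the elfcorehdr= hand-off described above, the
following sketch pulls a named parameter out of a kernel command line
string. The helper name get_boot_param and the sample command line
(including the address value) are assumptions made for illustration; they
are not part of kexec-tools or the kernel.

```shell
#!/bin/sh
# Print the value of a "name=value" boot parameter found in a kernel
# command line string. Returns non-zero if the parameter is absent.
# get_boot_param is a hypothetical helper, not part of kexec-tools.
get_boot_param() {
    name=$1
    cmdline=$2
    # Split the command line on whitespace and test each word.
    for word in $cmdline; do
        case $word in
            "$name"=*)
                # Strip the leading "name=" and print what remains.
                printf '%s\n' "${word#"$name"=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Made-up command line of a dump-capture kernel; the address is illustrative.
sample='root=/dev/sda1 1 irqpoll maxcpus=1 elfcorehdr=0x4fe0000'
get_boot_param elfcorehdr "$sample"
```

On a running dump-capture kernel, the same helper could be given the
contents of /proc/cmdline instead of the sample string.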

With the dump-capture kernel, you can access the memory image, or "old
memory," in two ways:

- Through a /dev/oldmem device interface. A capture utility can read the
  device file and write out the memory in raw format. This is a raw dump
  of memory. Analysis and capture tools must be intelligent enough to
  determine where to look for the right information.

- Through /proc/vmcore. This exports the dump as an ELF-format file that
  you can write out using file copy commands such as cp or scp. Further,
  you can use analysis tools such as the GNU Debugger (GDB) and the Crash
  tool to debug the dump file. This method ensures that the dump pages are
  correctly ordered.


Setup and Installation
======================

Install kexec-tools
-------------------

1) Log in as the root user.

2) Download the kexec-tools user-space package from the following URL:

http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/kexec-tools-testing.tar.gz

This is a symlink to the latest version, which at the time of writing is
20061214, the only release of kexec-tools-testing so far.
As other versions are released, the older ones will remain available at
http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/

Note: Latest kexec-tools-testing git tree is available at

git://git.kernel.org/pub/scm/linux/kernel/git/horms/kexec-tools-testing.git
or
http://www.kernel.org/git/?p=linux/kernel/git/horms/kexec-tools-testing.git;a=summary

3) Unpack the tarball with the tar command, as follows:

   tar xvpzf kexec-tools-testing.tar.gz

4) Change to the kexec-tools directory, as follows:

   cd kexec-tools-testing-VERSION

5) Configure the package, as follows:

   ./configure

6) Compile the package, as follows:

   make

7) Install the package, as follows:

   make install


Build the system and dump-capture kernels
-----------------------------------------
There are two possible methods of using Kdump.

1) Build a separate custom dump-capture kernel for capturing the
   kernel core dump.

2) Use the system kernel binary itself as the dump-capture kernel, so
   that there is no need to build a separate dump-capture kernel. This is
   possible only on architectures that support a relocatable kernel. As
   of today, the i386 and ia64 architectures support a relocatable kernel.

Building a relocatable kernel is advantageous in that one does not have
to build a second kernel for capturing the dump. At the same time, one
might want to build a custom dump-capture kernel suited to one's needs.

The following configuration settings are required for the system and
dump-capture kernels to enable kdump support.

System kernel config options
----------------------------

1) Enable "kexec system call" in "Processor type and features."

   CONFIG_KEXEC=y

2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
   filesystems." This is usually enabled by default.

   CONFIG_SYSFS=y

   Note that "sysfs file system support" might not appear in the "Pseudo
   filesystems" menu if "Configure standard kernel features (for small
   systems)" is not enabled in "General Setup." In this case, check the
   .config file itself to ensure that sysfs is turned on, as follows:

   grep 'CONFIG_SYSFS' .config

3) Enable "Compile the kernel with debug info" in "Kernel hacking."

   CONFIG_DEBUG_INFO=y

   This causes the kernel to be built with debug symbols. The dump
   analysis tools require a vmlinux with debug symbols in order to read
   and analyze a dump file.

Dump-capture kernel config options (Arch Independent)
-----------------------------------------------------

1) Enable "kernel crash dumps" support under "Processor type and
   features":

   CONFIG_CRASH_DUMP=y

2) Enable "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems".

   CONFIG_PROC_VMCORE=y
   (CONFIG_PROC_VMCORE is set by default when CONFIG_CRASH_DUMP is selected.)

Dump-capture kernel config options (Arch Dependent, i386)
---------------------------------------------------------
1) On x86, enable high memory support under "Processor type and
   features":

   CONFIG_HIGHMEM64G=y
   or
   CONFIG_HIGHMEM4G=y

2) On x86 and x86_64, disable symmetric multi-processing support
   under "Processor type and features":

   CONFIG_SMP=n

   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
   when loading the dump-capture kernel, see section "Load the Dump-capture
   Kernel".)

3) If you want to build and use a relocatable kernel, enable "Build a
   relocatable kernel" support under "Processor type and features":

   CONFIG_RELOCATABLE=y

4) Use a suitable value for "Physical address where the kernel is
   loaded" (under "Processor type and features"). This only appears when
   "kernel crash dumps" is enabled.
   A suitable value depends upon whether the kernel is relocatable or not.

   If you are using a relocatable kernel, use CONFIG_PHYSICAL_START=0x100000.
   This will compile the kernel for physical address 1MB, but because the
   kernel is relocatable, it can be run from any physical address. Hence
   the kexec boot loader will load it in the memory region reserved for
   the dump-capture kernel.

   Otherwise it should be the start of the memory region reserved for the
   second kernel using the boot parameter "crashkernel=Y@X". Here X is the
   start of the memory region reserved for the dump-capture kernel.
   Generally X is 16MB (0x1000000). So you can set
   CONFIG_PHYSICAL_START=0x1000000

5) Make and install the kernel and its modules. DO NOT add this kernel
   to the boot loader configuration files.

Dump-capture kernel config options (Arch Dependent, x86_64)
-----------------------------------------------------------
1) On x86 and x86_64, disable symmetric multi-processing support
   under "Processor type and features":

   CONFIG_SMP=n

   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
   when loading the dump-capture kernel, see section "Load the Dump-capture
   Kernel".)

2) Use a suitable value for "Physical address where the kernel is
   loaded" (under "Processor type and features"). This only appears when
   "kernel crash dumps" is enabled. By default this value is 0x1000000
   (16MB). It should be the same as X in the "crashkernel=Y@X" boot
   parameter.

   For x86_64, normally "CONFIG_PHYSICAL_START=0x1000000".

3) Make and install the kernel and its modules. DO NOT add this kernel
   to the boot loader configuration files.

Dump-capture kernel config options (Arch Dependent, ppc64)
----------------------------------------------------------

*  Make and install the kernel and its modules. DO NOT add this kernel
   to the boot loader configuration files.
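
The architecture-independent options listed above can be checked against a
built kernel's .config in the same spirit as the grep shown for
CONFIG_SYSFS. The following is a minimal sketch; the helper name
check_config is an assumption made for illustration, and it only
recognizes options set to "y".

```shell
#!/bin/sh
# Report whether each named option is set to "y" in a kernel .config file.
# Returns non-zero if any option is missing or not set to "y".
# check_config is a hypothetical helper, not part of the kernel tree.
check_config() {
    config=$1
    shift
    status=0
    for opt in "$@"; do
        # Match the whole line so CONFIG_FOO does not match CONFIG_FOO_BAR.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            status=1
        fi
    done
    return $status
}

# Example invocations (run from a configured kernel source tree):
#   System kernel:
#     check_config .config CONFIG_KEXEC CONFIG_SYSFS CONFIG_DEBUG_INFO
#   Dump-capture kernel:
#     check_config .config CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE
```

The architecture-dependent options (for example CONFIG_SMP or
CONFIG_PHYSICAL_START) take values other than "y" in some setups, so they
are deliberately left out of this sketch.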

Dump-capture kernel config options (Arch Dependent, ia64)
---------------------------------------------------------

- No specific options are required to create a dump-capture kernel
  for ia64, other than those specified in the arch independent section
  above. This means that it is possible to use the system kernel
  as a dump-capture kernel if desired.

  The crashkernel region can be automatically placed by the system
  kernel at run time. This is done by specifying the base address as 0,
  or omitting it altogether.

  crashkernel=256M@0
  or
  crashkernel=256M

  If the start address is specified, note that the start address of the
  kernel will be aligned to 64MB, so if the specified start address is
  not aligned, any space below the alignment point will be wasted.


Boot into System Kernel
=======================

1) Update the boot loader (such as grub, yaboot, or lilo) configuration
   files as necessary.

2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
   where Y specifies how much memory to reserve for the dump-capture kernel
   and X specifies the beginning of this reserved memory. For example,
   "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
   starting at physical address 0x01000000 (16MB) for the dump-capture kernel.

   On x86 and x86_64, use "crashkernel=64M@16M".

   On ppc64, use "crashkernel=128M@32M".

   On ia64, 256M@256M is a generous value that typically works.
   The region may be automatically placed on ia64, see the
   dump-capture kernel config option notes above.

Load the Dump-capture Kernel
============================

After booting to the system kernel, the dump-capture kernel needs to be
loaded.

Based on the architecture and type of image (relocatable or not), one
can choose to load the uncompressed vmlinux or compressed bzImage/vmlinuz
of the dump-capture kernel.
The following is a summary.

For i386:
	- Use vmlinux if kernel is not relocatable.
	- Use bzImage/vmlinuz if kernel is relocatable.
For x86_64:
	- Use vmlinux
For ppc64:
	- Use vmlinux
For ia64:
	- Use vmlinux or vmlinuz.gz


If you are using an uncompressed vmlinux image, use the following command
to load the dump-capture kernel.

   kexec -p <dump-capture-kernel-vmlinux-image> \
   --initrd=<initrd-for-dump-capture-kernel> --args-linux \
   --append="root=<root-dev> <arch-specific-options>"

If you are using a compressed bzImage/vmlinuz, use the following command
to load the dump-capture kernel.

   kexec -p <dump-capture-kernel-bzImage> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --append="root=<root-dev> <arch-specific-options>"

Please note that --args-linux does not need to be specified for ia64.
It is planned to make this a no-op on that architecture, but for now
it should be omitted.

The following arch-specific command line options should be used when
loading the dump-capture kernel.

For i386, x86_64 and ia64:
	"1 irqpoll maxcpus=1"

For ppc64:
	"1 maxcpus=1 noirqdistrib"


Notes on loading the dump-capture kernel:

* By default, the ELF headers are stored in ELF64 format to support
  systems with more than 4GB memory. The --elf32-core-headers option can
  be used to force the generation of ELF32 headers. This is necessary
  because GDB currently cannot open vmcore files with ELF64 headers on
  32-bit systems. ELF32 headers can be used on non-PAE systems (that is,
  less than 4GB of memory).

* The "irqpoll" boot parameter reduces driver initialization failures
  due to shared interrupts in the dump-capture kernel.

* You must specify <root-dev> in the format corresponding to the root
  device name in the output of the mount command.

* Boot parameter "1" boots the dump-capture kernel into single-user
  mode without networking. If you want networking, use "3".

* We generally don't have to bring up an SMP kernel just to capture the
  dump. Hence it is generally useful either to build a UP dump-capture
  kernel or to specify the maxcpus=1 option while loading the dump-capture
  kernel.

Kernel Panic
============

After successfully loading the dump-capture kernel as previously
described, the system will reboot into the dump-capture kernel if a
system crash is triggered. Trigger points are located in panic(),
die(), die_nmi() and in the sysrq handler (ALT-SysRq-c).

The following conditions will execute a crash trigger point:

If a hard lockup is detected and "NMI watchdog" is configured, the system
will boot into the dump-capture kernel ( die_nmi() ).

If die() is called, and it happens to be a thread with pid 0 or 1, or die()
is called inside interrupt context or die() is called and panic_on_oops is set,
the system will boot into the dump-capture kernel.

On powerpc systems, when a soft-reset is generated, die() is called by all
CPUs and the system will boot into the dump-capture kernel.

For testing purposes, you can trigger a crash by using "ALT-SysRq-c",
"echo c > /proc/sysrq-trigger" or write a module to force the panic.

Write Out the Dump File
=======================

After the dump-capture kernel is booted, write out the dump file with
the following command:

   cp /proc/vmcore <dump-file>

You can also access dumped memory as a /dev/oldmem device for a linear
and raw view. To create the device, use the following command:

    mknod /dev/oldmem c 1 12

Use the dd command with suitable options for count, bs, and skip to
access specific portions of the dump.
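
To make the count, bs, and skip arithmetic concrete, the sketch below
prints the dd command for a given physical address range. The helper name
oldmem_dd_cmd, the 4096-byte block size, and the output file name are
assumptions chosen for illustration; the function only prints the command
and does not read /dev/oldmem itself.

```shell
#!/bin/sh
# Print a dd command that would copy LENGTH bytes of old memory starting
# at physical address START into file OUT. START and LENGTH may be given
# in decimal or hex, and must be multiples of the block size.
# oldmem_dd_cmd is a hypothetical helper for illustration only.
oldmem_dd_cmd() {
    start=$(( $1 ))
    length=$(( $2 ))
    out=$3
    bs=4096   # assumed block size; dd's skip and count are in units of bs
    if [ $(( start % bs )) -ne 0 ] || [ $(( length % bs )) -ne 0 ]; then
        echo "start and length must be multiples of $bs" >&2
        return 1
    fi
    echo "dd if=/dev/oldmem of=$out bs=$bs skip=$(( start / bs )) count=$(( length / bs ))"
}

# 1 MB of old memory starting at the 16 MB boundary.
oldmem_dd_cmd 0x1000000 0x100000 oldmem.16M
```

For this range the helper prints
"dd if=/dev/oldmem of=oldmem.16M bs=4096 skip=4096 count=256", since
0x1000000 / 4096 = 4096 blocks are skipped and 0x100000 / 4096 = 256
blocks are copied.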

To see the entire memory, use the following command:

   dd if=/dev/oldmem of=oldmem.001


Analysis
========

Before analyzing the dump image, you should reboot into a stable kernel.

You can do limited analysis using GDB on the dump file copied out of
/proc/vmcore. Use the debug vmlinux built with -g and run the following
command:

   gdb vmlinux <dump-file>

Stack trace for the task on processor 0, register display, and memory
display work fine.

Note: GDB cannot analyze core files generated in ELF64 format for x86.
On systems with a maximum of 4GB of memory, you can generate
ELF32-format headers by passing the --elf32-core-headers option to kexec
when loading the dump-capture kernel.

You can also use the Crash utility to analyze dump files in Kdump
format. Crash is available on Dave Anderson's site at the following URL:

   http://people.redhat.com/~anderson/


To Do
=====

1) Provide relocatable kernels for all architectures to help in maintaining
   multiple kernels for crash_dump, so that the same kernel as the system
   kernel can be used to capture the dump.


Contact
=======

Vivek Goyal (vgoyal@in.ibm.com)
Maneesh Soni (maneesh@in.ibm.com)


Trademark
=========

Linux is a trademark of Linus Torvalds in the United States, other
countries, or both.