Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Documentation: Update to BUG-HUNTING

Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

Authored by Ian McDonald and committed by Adrian Bunk
43019a56 a609164f

+113
Documentation/BUG-HUNTING
+Table of contents
+=================
+
+Last updated: 20 December 2005
+
+Contents
+========
+
+- Introduction
+- Devices not appearing
+- Finding patch that caused a bug
+-- Finding using git-bisect
+-- Finding it the old way
+- Fixing the bug
+
+Introduction
+============
+
+Always try the latest kernel from kernel.org and build from source. If you are
+not confident in doing that please report the bug to your distribution vendor
+instead of to a kernel developer.
+
+Finding bugs is not always easy. Have a go though. If you can't find it don't
+give up. Report as much as you have found to the relevant maintainer. See
+MAINTAINERS for who that is for the subsystem you have worked on.
+
+Before you submit a bug report read REPORTING-BUGS.
+
+Devices not appearing
+=====================
+
+Often this is caused by udev. Check that first before blaming it on the
+kernel.
+
+Finding patch that caused a bug
+===============================
+
+
+
+Finding using git-bisect
+------------------------
+
+Using the provided tools with git makes finding bugs easy provided the bug is
+reproducible.
+
+Steps to do it:
+- start using git for the kernel source
+- read the man page for git-bisect
+- have fun
+
+Finding it the old way
+----------------------
+
 [Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)]
 
 This is how to track down a bug if you know nothing about kernel hacking.
···
 because Linux snapshots will let you do this - something that you can't
 do with vendor supplied releases.
 
+Fixing the bug
+==============
+
+Nobody is going to tell you how to fix bugs. Seriously. You need to work it
+out. But below are some hints on how to use the tools.
+
+To debug a kernel, use objdump and look for the hex offset from the crash
+output to find the valid line of code/assembler. Without debug symbols, you
+will see the assembler code for the routine shown, but if your kernel has
+debug symbols the C code will also be available. (Debug symbols can be enabled
+in the kernel hacking menu of the menu configuration.) For example:
+
+  objdump -r -S -l --disassemble net/dccp/ipv4.o
+
+NB.: you need to be at the top level of the kernel tree for this to pick up
+your C files.
+
+If you don't have access to the code you can also debug on some crash dumps
+e.g. crash dump output as shown by Dave Miller.
+
+> EIP is at ip_queue_xmit+0x14/0x4c0
+> ...
+> Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
+> 00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
+> <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
+>
+> Put the bytes into a "foo.s" file like this:
+>
+>        .text
+>        .globl foo
+> foo:
+>        .byte .... /* bytes from Code: part of OOPS dump */
+>
+> Compile it with "gcc -c -o foo.o foo.s" then look at the output of
+> "objdump --disassemble foo.o".
+>
+> Output:
+>
+> ip_queue_xmit:
+>     push %ebp
+>     push %edi
+>     push %esi
+>     push %ebx
+>     sub $0xbc, %esp
+>     mov 0xd0(%esp), %ebp     ! %ebp = arg0 (skb)
+>     mov 0x8(%ebp), %ebx      ! %ebx = skb->sk
+>     mov 0x13c(%ebx), %eax    ! %eax = inet_sk(sk)->opt
+
+Another very useful option of the Kernel Hacking section in menuconfig is
+Debug memory allocations. This will help you see whether data has been
+initialised and not set before use etc. To see the values that get assigned
+with this look at mm/slab.c and search for POISON_INUSE. When using this an
+Oops will often show the poisoned data instead of zero which is the default.
+
+Once you have worked out a fix please submit it upstream. After all open
+source is about sharing what you do and don't you want to be recognised for
+your genius?
+
+Please do read Documentation/SubmittingPatches though to help your code get
+accepted.
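
Note on the "Code:" step quoted in "Fixing the bug": the patch says to put the
bytes from the Oops into a "foo.s" file by hand. The sketch below is one
possible way to automate that step and is not part of the patch. The function
name code_bytes_to_asm, the constant OOPS_CODE and the choice of eight bytes
per .byte line are assumptions made for illustration; only the foo.s layout
(.text, .globl foo, foo:, .byte) comes from the quoted text.

```python
import re

# A two-digit hex value, e.g. "8b" (case-insensitive).
HEX_BYTE = re.compile(r"[0-9a-fA-F]{2}")

# The "Code:" lines from the quoted crash dump, mail-quote markers included.
OOPS_CODE = """\
> Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
> 00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
> <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
"""


def code_bytes_to_asm(code_text, symbol="foo"):
    """Return assembler source holding the bytes of an Oops "Code:" dump.

    Hypothetical helper: it only rewrites text, it does not run gcc or objdump.
    """
    byte_values = []
    for token in code_text.split():
        # "<8b>" marks the byte at the faulting EIP; a bare ">" is mail quoting.
        token = token.strip("<>")
        if not token or token == "Code:":
            continue
        if not HEX_BYTE.fullmatch(token):
            raise ValueError(f"unexpected token in Code: dump: {token!r}")
        byte_values.append("0x" + token.lower())

    lines = ["\t.text", f"\t.globl {symbol}", f"{symbol}:"]
    # Emit eight bytes per .byte directive to keep the lines short.
    for start in range(0, len(byte_values), 8):
        lines.append("\t.byte " + ", ".join(byte_values[start:start + 8]))
    return "\n".join(lines) + "\n"
```

Writing the result of code_bytes_to_asm(OOPS_CODE) to foo.s gives a file that
can then be built and disassembled with the two commands from the quoted
text, "gcc -c -o foo.o foo.s" and "objdump --disassemble foo.o". The dump
starts some bytes before the faulting instruction, so the first few decoded
instructions may be misaligned.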