0% found this document useful (0 votes)
110 views

Debugging Kernel OOPs FUDCon2011

The document discusses debugging kernel oopses or crashes in Linux. It begins with an overview of typical oopses, including that the kernel exception handler kills the offending process and prints registers and stack trace. Key details are discussed like the kernel symbol table (stored in System.map and /proc/kallsyms), an example oops crash from a sample driver, and how oopses can provide clues to help debug issues in the running kernel.

Uploaded by

tuxracersv
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
110 views

Debugging Kernel OOPs FUDCon2011

The document discusses debugging kernel oopses or crashes in Linux. It begins with an overview of typical oopses, including that the kernel exception handler kills the offending process and prints registers and stack trace. Key details are discussed like the kernel symbol table (stored in System.map and /proc/kallsyms), an example oops crash from a sample driver, and how oopses can provide clues to help debug issues in the running kernel.

Uploaded by

tuxracersv
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 31

Debugging Kernel OOPs

Neependra Khare
STEC India

November 3, 2011

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

1 / 19

OOPs

Agenda

What is OOPs Typical OOPs Kernel Symbol Table Examples Debugging a running Kernel with gdb

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

2 / 19

OOPs

What is OOPs

Whats the most common Bug in user space program.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

3 / 19

OOPs

What is OOPs

Whats the most common Bug in user space program.


The segfault.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

3 / 19

OOPs

What is OOPs

Whats the most common Bug in user space program.


The segfault.

What is the most common bug in the Linux kernel?

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

3 / 19

OOPs

What is OOPs

Whats the most common Bug in user space program.


The segfault.

What is the most common bug in the Linux kernel?


The segfault. Except here:The notion of a segfault is much more complicated When the kernel de-references an invalid pointer, its not called a segfault its called an oops.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

3 / 19

OOPs

Typical OOPs

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

4 / 19

OOPs

Typical OOPs

Kernel exception handler


Kills oending process Prints registers, stack trace with symbolic info

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

4 / 19

OOPs

Typical OOPs

Kernel exception handler


Kills oending process Prints registers, stack trace with symbolic info

Some exceptions non-recoverable (panic())


Prints message on console, halts kernel Oops in interrupt handler, idle (0) or init (1)

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

4 / 19

OOPs

Typical OOPs

Kernel exception handler


Kills oending process Prints registers, stack trace with symbolic info

Some exceptions non-recoverable (panic())


Prints message on console, halts kernel Oops in interrupt handler, idle (0) or init (1)

Oops generated by macros:


BUG(), BUG ON(condition)

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

4 / 19

OOPs

Typical OOPs

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

5 / 19

OOPs

Tainted kernels

Some oops reports contain the string Tainted: after the program counter. G if all modules loaded have a GPL or compatible license, P any proprietary module has been loaded. F if any module was force loaded by insmod -f, if all modules were loaded normally. M if any processor has reported a Machine Check Exception, if no Machine Check Exceptions have occurred. etc.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

6 / 19

OOPs

Kernel Symbol Table


c04a1210 c04a1525 c04a7a18 c04ac37a ... c068b262 c068c4a0 c068d01f c068d055 c0698a01 c0853400 T T T T t T T T T r vfs_readv vfs_read vfs_readlink vfs_readdir net_rx_action netif_rx netif_rx_ni netdev_rx_csum_fault __netpoll_rx __ksymtab_netif_rx

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

7 / 19

OOPs

Kernel Symbol Table


c04a1210 c04a1525 c04a7a18 c04ac37a ... c068b262 c068c4a0 c068d01f c068d055 c0698a01 c0853400 T T T T t T T T T r vfs_readv vfs_read vfs_readlink vfs_readdir net_rx_action netif_rx netif_rx_ni netdev_rx_csum_fault __netpoll_rx __ksymtab_netif_rx

System.map

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

7 / 19

OOPs

Kernel Symbol Table


c04a1210 c04a1525 c04a7a18 c04ac37a ... c068b262 c068c4a0 c068d01f c068d055 c0698a01 c0853400 T T T T t T T T T r vfs_readv vfs_read vfs_readlink vfs_readdir net_rx_action netif_rx netif_rx_ni netdev_rx_csum_fault __netpoll_rx __ksymtab_netif_rx

System.map /proc/kallsyms
Neependra Khare (STEC India) FUDCon 2011 November 3, 2011 7 / 19

OOPs

System.map File

System.map is a phone directory list of function in a particular build of a kernel.


[root@nkhare ~]# ls -l /boot/System.map* lrwxrwxrwx 1 root root 25 2009-08-26 -rw-r--r--. 1 root root 1274567 2009-05-28 -rw-r--r-- 1 root root 1225104 2009-08-26 -rw-r--r-- 1 root root 1257149 2009-06-17 17:05 03:09 17:05 09:02 /boot/System.map -> /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.4-167.fc11.i686.PAE /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.5-191.fc11.i586

How it is produced?

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

8 / 19

OOPs

System.map File

System.map is a phone directory list of function in a particular build of a kernel.


[root@nkhare ~]# ls -l /boot/System.map* lrwxrwxrwx 1 root root 25 2009-08-26 -rw-r--r--. 1 root root 1274567 2009-05-28 -rw-r--r-- 1 root root 1225104 2009-08-26 -rw-r--r-- 1 root root 1257149 2009-06-17 17:05 03:09 17:05 09:02 /boot/System.map -> /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.4-167.fc11.i686.PAE /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.5-191.fc11.i586

How it is produced? When you compile the kernel.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

8 / 19

OOPs

System.map File

System.map is a phone directory list of function in a particular build of a kernel.


[root@nkhare ~]# ls -l /boot/System.map* lrwxrwxrwx 1 root root 25 2009-08-26 -rw-r--r--. 1 root root 1274567 2009-05-28 -rw-r--r-- 1 root root 1225104 2009-08-26 -rw-r--r-- 1 root root 1257149 2009-06-17 17:05 03:09 17:05 09:02 /boot/System.map -> /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.4-167.fc11.i686.PAE /boot/System.map-2.6.29.5 /boot/System.map-2.6.29.5-191.fc11.i586

How it is produced? When you compile the kernel. nm vmlinux

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

8 / 19

OOPs

/proc/kallsyms

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

9 / 19

OOPs

/proc/kallsyms

Created on the y when a kernel boots up.

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

9 / 19

OOPs

/proc/kallsyms

Created on the y when a kernel boots up. Kernel data which is given the illusion of being a disk le. # le /proc/kallsyms /proc/kallsyms: empty Symbol for custom modules # grep mycdrv /proc/kallsyms f7d14000 t mycdrv1_read [char_read_write] f7d14014 t mycdrv1_release [char_read_write] f7d1402f t mycdrv1_open [char_read_write] f7d140ec r mycdrv1_fops [char_read_write] f7d1404a t mycdrv1_write [char_read_write]

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

9 / 19

OOPs

OOPs Example 1

Aug Aug Aug Aug

28 28 28 28

21:22:55 21:22:55 21:22:55 21:22:55

localhost localhost localhost localhost

kernel: kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [<f7d18008>] mycdrv1_read+0x8/0x14 [char_read_write] kernel: *pde = 5f136067

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:Oops: 0002 [#1] SMP Aug 28 21:22:55 localhost kernel: Oops: 0002 [#1] SMP

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:00/power_su Aug 28 21:22:55 localhost kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0 Aug 28 21:22:55 localhost kernel: Modules linked in: char_read_write ppp_synctty n_hdlc ppp_deflate zlib_deflat Aug 28 21:22:55 localhost kernel: i2c_core video output [last unloaded: scsi_wait_scan] Aug 28 21:22:55 localhost kernel: Aug 28 21:22:55 localhost kernel: Pid: 4716, comm: cat Tainted: G W (2.6.29.5-191.fc11.i586 #1) 7735AE7 Aug 28 21:22:55 localhost kernel: EIP: 0060:[<f7d18008>] EFLAGS: 00010286 CPU: 0 Aug 28 21:22:55 localhost kernel: EIP is at mycdrv1_read+0x8/0x14 [char_read_write] Aug 28 21:22:55 localhost kernel: EAX: f3cb9d80 EBX: f3cb9d80 ECX: 00008000 EDX: 0861a000 Aug 28 21:22:55 localhost kernel: ESI: 0861a000 EDI: f7d18000 EBP: f1a67f5c ESP: f1a67f5c Aug 28 21:22:55 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

10 / 19

OOPs

OOPs Example 1

Aug Aug Aug Aug

28 28 28 28

21:22:55 21:22:55 21:22:55 21:22:55

localhost localhost localhost localhost

kernel: kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [<f7d18008>] mycdrv1_read+0x8/0x14 [char_read_write] kernel: *pde = 5f136067

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:Oops: 0002 [#1] SMP Aug 28 21:22:55 localhost kernel: Oops: 0002 [#1] SMP

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:00/power_su Aug 28 21:22:55 localhost kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0 Aug 28 21:22:55 localhost kernel: Modules linked in: char_read_write ppp_synctty n_hdlc ppp_deflate zlib_deflat Aug 28 21:22:55 localhost kernel: i2c_core video output [last unloaded: scsi_wait_scan] Aug 28 21:22:55 localhost kernel: Aug 28 21:22:55 localhost kernel: Pid: 4716, comm: cat Tainted: G W (2.6.29.5-191.fc11.i586 #1) 7735AE7 Aug 28 21:22:55 localhost kernel: EIP: 0060:[<f7d18008>] EFLAGS: 00010286 CPU: 0 Aug 28 21:22:55 localhost kernel: EIP is at mycdrv1_read+0x8/0x14 [char_read_write] Aug 28 21:22:55 localhost kernel: EAX: f3cb9d80 EBX: f3cb9d80 ECX: 00008000 EDX: 0861a000 Aug 28 21:22:55 localhost kernel: ESI: 0861a000 EDI: f7d18000 EBP: f1a67f5c ESP: f1a67f5c Aug 28 21:22:55 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

11 / 19

OOPs

OOPs Example 1

Aug Aug Aug Aug

28 28 28 28

21:22:55 21:22:55 21:22:55 21:22:55

localhost localhost localhost localhost

kernel: kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [<f7d18008>] mycdrv1_read+0x8/0x14 [char_read_write] kernel: *pde = 5f136067

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:Oops: 0002 [#1] SMP Aug 28 21:22:55 localhost kernel: Oops: 0002 [#1] SMP

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:00/power_su Aug 28 21:22:55 localhost kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0 Aug 28 21:22:55 localhost kernel: Modules linked in: char_read_write ppp_synctty n_hdlc ppp_deflate zlib_deflat Aug 28 21:22:55 localhost kernel: i2c_core video output [last unloaded: scsi_wait_scan] Aug 28 21:22:55 localhost kernel: Aug 28 21:22:55 localhost kernel: Pid: 4716, comm: cat Tainted: G W (2.6.29.5-191.fc11.i586 #1) 7735AE Aug 28 21:22:55 localhost kernel: EIP: 0060:[<f7d18008>] EFLAGS: 00010286 CPU: 0 Aug 28 21:22:55 localhost kernel: EIP is at mycdrv1_read+0x8/0x14 [char_read_write] Aug 28 21:22:55 localhost kernel: EAX: f3cb9d80 EBX: f3cb9d80 ECX: 00008000 EDX: 0861a000 Aug 28 21:22:55 localhost kernel: ESI: 0861a000 EDI: f7d18000 EBP: f1a67f5c ESP: f1a67f5c Aug 28 21:22:55 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

12 / 19

OOPs

OOPs Example 1

Aug Aug Aug Aug

28 28 28 28

21:22:55 21:22:55 21:22:55 21:22:55

localhost localhost localhost localhost

kernel: kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [<f7d18008>] mycdrv1_read+0x8/0x14 [char_read_write] kernel: *pde = 5f136067

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:Oops: 0002 [#1] SMP Aug 28 21:22:55 localhost kernel: Oops: 0002 [#1] SMP

Message from syslogd@localhost at Aug 28 21:22:55 ... kernel:last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:00/power_su Aug 28 21:22:55 localhost kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0 Aug 28 21:22:55 localhost kernel: Modules linked in: char_read_write ppp_synctty n_hdlc ppp_deflate zlib_deflat Aug 28 21:22:55 localhost kernel: i2c_core video output [last unloaded: scsi_wait_scan] Aug 28 21:22:55 localhost kernel: Aug 28 21:22:55 localhost kernel: Pid: 4716, comm: cat Tainted: G W (2.6.29.5-191.fc11.i586 #1) 7735AE7 Aug 28 21:22:55 localhost kernel: EIP: 0060:[<f7d18008>] EFLAGS: 00010286 CPU: 0 Aug 28 21:22:55 localhost kernel: EIP is at mycdrv1_read+0x8/0x14 [char_read_write] Aug 28 21:22:55 localhost kernel: EAX: f3cb9d80 EBX: f3cb9d80 ECX: 00008000 EDX: 0861a000 Aug 28 21:22:55 localhost kernel: ESI: 0861a000 EDI: f7d18000 EBP: f1a67f5c ESP: f1a67f5c Aug 28 21:22:55 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

13 / 19

OOPs

OOPs Example 1

\# objdump

-S char-read-write.ko (struct file *file, char __user * buf, size_t lbuf, loff_t * ppos) push mov call movl \%ebp \%esp,\%ebp 4 <mycdrv1_read+0x4> \$0x0,0x0

static ssize_t mycdrv1_read { 0: 55 1: 89 e5 3: e8 fc ff ff ff *(int *)0 = 0; 8: c7 05 00 00 00 00 00 f: 00 00 00 } 12: 5d 13: c3

pop ret

\%ebp

00000014 <mycdrv1_release>: printk (" Opening : \%s:\n\n", MYDEV_NAME); return 0; }

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

14 / 19

OOPs

OOPs Example 2

Aug 28 21:40:59 localhost kernel: BUG: unable to handle kernel paging request at ffffffff Aug 28 21:40:59 localhost kernel: IP: [<ffffffff>] 0xffffffff Aug 28 21:40:59 localhost kernel: *pde = 00957067 *pte = 00000000 Message from syslogd@localhost at Aug 28 21:40:59 ... kernel:Oops: 0000 [#3] SMP Aug 28 21:40:59 localhost kernel: Oops: 0000 [#3] SMP

Message from syslogd@localhost at Aug 28 21:40:59 ... kernel:last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:00/power_su Aug 28 21:40:59 localhost kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0 Aug 28 21:40:59 localhost kernel: Modules linked in: char_read_write1 char_read_write ppp_synctty n_hdlc ppp_de Aug 28 21:40:59 localhost kernel: drm i2c_algo_bit i2c_core video output [last unloaded: char_read_write1] Aug 28 21:40:59 localhost kernel: Aug 28 21:40:59 localhost kernel: Pid: 6050, comm: cat Tainted: G D W (2.6.29.5-191.fc11.i586 #1) 7735AE7 Aug 28 21:40:59 localhost kernel: EIP: 0060:[<ffffffff>] EFLAGS: 00010246 CPU: 0 Aug 28 21:40:59 localhost kernel: EIP is at 0xffffffff Aug 28 21:40:59 localhost kernel: EAX: 00000001 EBX: ffffffff ECX: 00000000 EDX: ffffffff Aug 28 21:40:59 localhost kernel: ESI: 09282000 EDI: ffffffff EBP: ffffffff ESP: f3cebf64 Aug 28 21:40:59 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

15 / 19

OOPs

Debugging a running Kernel with gdb

\# gdb vmlinux /proc/kcore (gdb) p jiffies_64 \$1 = 4295075081 (gdb) core-file /proc/kcore

(gdb) p jiffies_64 \$2 = 4295225679

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

16 / 19

OOPs

Debugging a running Kernel with gdb

(gdb) p mycdrv1_read No symbol "mycdrv1_read" in current context. Why? We need to educate about the module symbols

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

17 / 19

OOPs

Debugging a running Kernel with gdb

cd /sys/module/chra read write/sections cat .bss .data .text 0xf8e1e830 0xf8e1e6f4 0xf8e1e000

(gdb) add-symbol-file /home/nkhare/Training/LinuxKernel/oops/char-dev-read-write-oop/char-read-write.ko 0xf8e1 add symbol table from file "/home/nkhare/Training/LinuxKernel/oops/char-dev-read-write-oop/char-read-write.ko" .text_addr = 0xf8e1e000 .data_addr = 0xf8e1e6f4 .bss_addr = 0xf8e1e830 (y or n) y Reading symbols from /home/nkhare/Training/LinuxKernel/oops/char-dev-read-write-oop/char-read-write.ko...done. (gdb) p mycdrv1_read \$1 = {ssize_t (struct file *, char *, size_t, loff_t *)} 0xf8e1e000 <mycdrv1_read> (gdb) list mycdrv1_read

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

18 / 19

OOPs

References

http://www.faqs.org/docs/Linux-HOWTO/Kernel-HOWTO.html http://www.cs.fsu.edu/~baker/devices/notes/ http://linux.com/learn/linux-training

Neependra Khare (STEC India)

FUDCon 2011

November 3, 2011

19 / 19

You might also like