Linux Kernel Programming Intro

This document introduces writing Linux kernel modules. It describes kernel modules as similar to dynamically linked libraries that can be loaded and unloaded, and covers loading and removing modules, module initialization and exit routines, accessing kernel functions, differences from userspace such as the lack of standard libraries, validating inputs, object-oriented aspects of the virtual file system, logging with printk, and common data structures such as lists.

Uploaded by

Jérôme Antoine
Copyright
© All Rights Reserved

Intro to Linux Kernel Programming
Don Porter
Lab 4
ò  You will write a Linux kernel module

ò  Linux is written in C, but does not include all standard


libraries
ò  And some other idiosyncrasies
ò  This lecture will give you a crash course in writing Linux
kernel code
Kernel Modules
ò  Sort of like a dynamically linked library

ò  How different?


ò  Not linked at load (boot) time
ò  Loaded dynamically
ò  Often in response to realizing a particular piece of hardware
is present on the system
ò  For more, check out udev and lspci
ò  Built with .ko extension (kernel object), but still an ELF
binary
Kernel Modules, cont.
ò  Load a module
ò  insmod – Just load it
ò  modprobe – Do some dependency checks
ò  Examples?
ò  rmmod – Remove a module
ò  Module internally has init and exit routines, which can
in turn create device files or otherwise register other call
back functions
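A minimal sketch of the init and exit routines described above. The names lab4_init, lab4_exit, and lab4_registered and the log text are placeholders, not taken from the lab handout. The #else branch is an assumption added only so that the same file also compiles as an ordinary userspace program; a real module needs only the __KERNEL__ half and is built with the kernel's own build system.

```c
#ifdef __KERNEL__
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#else
/* Userspace shims (assumption: only for trying the file outside the kernel). */
#include <assert.h>
#include <stdio.h>
#define __init
#define __exit
#define KERN_INFO "<6>"
#define printk printf
#define MODULE_LICENSE(s) const char lab4_license[] = s
#define module_init(fn) int (*lab4_init_hook)(void) = fn
#define module_exit(fn) void (*lab4_exit_hook)(void) = fn
#endif

/* Hypothetical state; stands in for a device file or other registration. */
static int lab4_registered;

/* Runs once when the module is loaded (insmod/modprobe). */
static int __init lab4_init(void)
{
    lab4_registered = 1;
    printk(KERN_INFO "lab4: loaded\n");
    return 0; /* returning a negative errno instead aborts the load */
}

/* Runs once when the module is removed (rmmod); undo what init did. */
static void __exit lab4_exit(void)
{
    lab4_registered = 0;
    printk(KERN_INFO "lab4: unloaded\n");
}

module_init(lab4_init);
module_exit(lab4_exit);
MODULE_LICENSE("GPL");
```

Built as a module, this would be loaded with insmod and removed with rmmod, and the two messages would appear in the kernel log.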
Events and hooks
ò  When you write module code, there isn’t a main()
routine, just init()

ò  Most kernel code is servicing events---either from an


application or hardware

ò  Thus, most modules will either create a device file,


register a file system type, network protocol, or other
event that will lead to further callbacks to its functions
Kernel Modules, cont.
ò  When a module is loaded, it runs in the kernel’s address
space
ò  And in ring 0
ò  So what does this say about trust in this code?
ò  It is completely trusted as part of the kernel
ò  And if this code has a bug?
ò  It can crash the kernel
Accessing Kernel Functions
•  Linux defines public and private functions (similar to Java)
    •  Look for “EXPORT_SYMBOL” in the Linux source
•  The kernel exports a “jump table” with the addresses of public functions
    •  At load time, the module’s jump table is connected with the kernel’s jump table
•  But what prevents a module from using a “private” function?
    •  Nothing, except that it is a bit more work to find the right address
    •  Example code to do this is in the lab4 handout
Kernel Programming
ò  Big difference: No standard C library!
ò  Sound familiar from lab 1?
ò  Why no libc?
ò  But some libc-like interfaces
ò  malloc -> kmalloc
ò  printf(“boo”) -> printk(KERN_ERR “boo”)
ò  Some things are missing, like floating point division
Kernel Programming, ctd
ò  Stack can’t grow dynamically
ò  Generally limited to 4 or 8KB
ò  So avoid deep recursion, stack allocating substantial
buffers, etc.
ò  Why not?
ò  Mostly for simplicity, and to keep per-thread memory
overheads down
ò  Also, the current task struct can be found by rounding
down the stack pointer (esp/rsp)
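The rounding trick relies on every kernel stack having a fixed, power-of-two size and being aligned to that size. A minimal sketch of the arithmetic, assuming an 8KB stack (THREAD_SIZE here is an assumed value, and stack_base is a hypothetical helper):

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u  /* assumed stack size; must be a power of two */

/* Round an address inside the stack down to the stack's base. */
static uintptr_t stack_base(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE - 1);
}
```

Any stack pointer value within the same 8KB region maps to the same base, so a per-thread structure stored at that base can be found without a global lookup.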
Validating inputs: super-important!
•  Input parsing bugs can crash or compromise the entire OS!
•  Example: Pass the read() system call a null pointer for the buffer
    •  The OS needs to validate that the buffer is really mapped
•  Tools: copy_from_user(), copy_to_user(), access_ok(), etc.
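The checks behind these tools can be modeled in userspace. In this sketch the user_mem array stands in for a user address space, and model_access_ok and model_copy_from_user are hypothetical stand-ins for the real functions. The one detail borrowed from the kernel is the return convention: copy_from_user() returns the number of bytes it could not copy, so 0 means success.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USER_SIZE 4096u
static unsigned char user_mem[USER_SIZE];  /* pretend user address space */

/* Model of access_ok(): is [uaddr, uaddr + n) entirely inside user_mem? */
static int model_access_ok(size_t uaddr, size_t n)
{
    /* Written this way so that uaddr + n can never overflow. */
    return uaddr <= USER_SIZE && n <= USER_SIZE - uaddr;
}

/* Model of copy_from_user(): returns the number of bytes NOT copied. */
static size_t model_copy_from_user(void *dst, size_t uaddr, size_t n)
{
    if (!model_access_ok(uaddr, n))
        return n;               /* reject the whole request */
    memcpy(dst, user_mem + uaddr, n);
    return 0;
}
```

A caller treats any nonzero return as a failure and typically reports -EFAULT back to the application.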
Cleaning up
ò  After an error, you have to be careful to put things back
the way you found them (generally in reverse order)
ò  Release locks, free memory, decrement ref counts, etc.
ò  The _one_ acceptable use of goto is to compensate for
the lack of exceptions in C
Clean Up Example
    str = getname(name);
    if (IS_ERR(str)) {
        err = -EFAULT;
        printk(KERN_DEBUG "hash_name: getname(str) error!\n");
        goto out;
    }
    /* Kernels from 5.0 on drop the first argument: access_ok(hash, HASH_BYTES) */
    if (!access_ok(VERIFY_WRITE, hash, HASH_BYTES)) {
        err = -EFAULT;
        printk(KERN_DEBUG "hash_name: access_ok(hash) error!\n");
        goto putname_out;
    }
    /* helper function does all the work here */
putname_out:
    putname(str);    /* undo getname() */
out:
    return err;
}
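The same unwind-in-reverse pattern can be tried in userspace. This is a sketch under stated assumptions: grab(), drop(), do_work(), and the live_allocs counter are hypothetical, and the fail flag exists only to force each error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_allocs;  /* buffers currently held, for checking leaks */

static void *grab(size_t n, int fail)
{
    void *p = fail ? NULL : malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void drop(void *p)
{
    free(p);
    live_allocs--;
}

/* fail_at: 0 = no failure, 1 = first allocation fails, 2 = second fails. */
static int do_work(int fail_at)
{
    int err = 0;
    char *a, *b;

    a = grab(64, fail_at == 1);
    if (!a) {
        err = -ENOMEM;
        goto out;
    }
    b = grab(64, fail_at == 2);
    if (!b) {
        err = -ENOMEM;
        goto free_a;
    }

    /* ... real work using a and b would go here ... */

    drop(b);
free_a:
    drop(a);
out:
    return err;
}
```

Each label undoes exactly one earlier step, and the success path falls through the same labels, so there is a single place where each resource is released.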
Key objects
ò  task_struct – a kernel-schedulable thread
ò  current points to the current task
ò  inode and dentry – refer to a file’s inode and dentry, as
discussed in the VFS lectures
ò  Handy to find these by calling helper functions in the fs
directory
ò  Read through open and friends
Object-orientation in the VFS
•  Files have a standard set of operations
    •  Read, write, truncate, etc.
•  Each inode includes a pointer to a ‘file_operations’ struct
    •  Which in turn points to a lot of functions
•  VFS code is full of things like this:
    •  int rv = inode->f_op->stat(inode, statbuf);
OO, cont.
ò  When an inode is created for a given file system, the file
system initializes the file_operation structure

ò  For lab 4, you may find it handy to modify/replace a


given file’s file_operation structure
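A userspace model of this dispatch, and of swapping in a replacement table. The structs here are heavily simplified stand-ins that reuse the kernel's names for familiarity. The read signature, the data field, and the functions plain_read, upper_read, and model_vfs_read are assumptions made for illustration and do not match the real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct inode;

/* Model of file_operations: a table of function pointers. */
struct file_operations {
    size_t (*read)(struct inode *ino, char *buf, size_t len);
};

struct inode {
    const struct file_operations *i_fop;
    const char *data;  /* hypothetical per-file payload */
};

/* One implementation: returns the stored bytes. */
static size_t plain_read(struct inode *ino, char *buf, size_t len)
{
    size_t n = strlen(ino->data);
    if (n > len)
        n = len;
    memcpy(buf, ino->data, n);
    return n;
}

/* Replacement hook: like plain_read, but upper-cases ASCII letters. */
static size_t upper_read(struct inode *ino, char *buf, size_t len)
{
    size_t n = plain_read(ino, buf, len);
    for (size_t i = 0; i < n; i++)
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] -= 'a' - 'A';
    return n;
}

static const struct file_operations plain_fops = { .read = plain_read };
static const struct file_operations upper_fops = { .read = upper_read };

/* Generic code: it never knows which implementation it is calling. */
static size_t model_vfs_read(struct inode *ino, char *buf, size_t len)
{
    return ino->i_fop->read(ino, buf, len);
}
```

Pointing ino.i_fop at upper_fops changes what every later model_vfs_read() on that inode does, without touching the generic code.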
/proc
ò  The kernel exports a lot of statistics, configuration data,
etc. via this pseudo-file system

ò  These “files” are not stored anywhere on any disk

ò  The kernel just creates a bunch of inodes/dentries


ò  And provides read/write and other file_operations hooks
that are backed by kernel-internal functions
ò  Check out fs/proc source code
Logs?
ò  The kernel log goes into /var/log/dmesg by default
ò  And to the console
ò  Visible in vsphere for your VM
ò  Also dumped by the dmesg command

ò  printk is your friend for debugging!


Verbosity
ò  The kernel is dynamically configured with a given level
of verbosity in the logs

ò  The first argument to printk is the importance level


ò  printk(KERN_ERR “I am serious”);
ò  printk(KERN_INFO “I can be filtered”);
ò  This style creates an integer that is placed at the front of
the character array, and transparently filtered

ò  For your debugging, just use a high importance level


Lists
ò  Linux embeds lists and other data structures in the
objects, rather than dynamically allocate list nodes

ò  Check out include/linux/list.h

ò  It has nice-looking macro loops like list_for_each_entry

ò  In each iteration, it actually uses compiler macros to


figure out the offset from a next pointer to the “top” of a
struct
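A userspace re-creation of the embedded-list idea, simplified from what include/linux/list.h provides. The macro and function names mirror the kernel's, but these are cut-down versions written for this sketch; struct task, list_init, and sum_pids are hypothetical. The __typeof__ keyword is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Step back from a member pointer to the struct that contains it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;  /* empty list points at itself */
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* The list node lives inside the object; no separate node allocation. */
struct task {
    int pid;
    struct list_head link;
};

static int sum_pids(struct list_head *head)
{
    struct task *t;
    int sum = 0;

    list_for_each_entry(t, head, link)
        sum += t->pid;
    return sum;
}
```

The loop only ever follows pointers between link fields; container_of subtracts the offset of link within struct task to recover the enclosing object on each step.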
Assertions
ò  BUG_ON(condition)

ò  Use this.

ò  How does it work?


ò  if (!condition) crash the kernel;
ò  It actually uses the ‘ud2a’ instruction, which is a
purposefully undefined x86 instruction that will cause a
trap
ò  The trap handler can unpack a more detailed crash report
Other tips
ò  Snapshot your VM for quick recreation if the file system
is corrupted

ò  Always save your code on another machine before


testing
ò  git push is helpful for this
ò  Write defensively: lots of test cases and assertions, test
each line you write carefully
ò  Anything you guess might be true, add an assertion
Good luck!
