Introduction To Linux Device Drivers: Recreating Life One Driver at A Time
[Figure: User, Kernel, Device Driver, Device File, Hardware]
Linux Device Drivers, Technion, Jan 2005 p.4/50
[Figure: Kernel Event List (File Open, Interrupt, ..., Page Fault, Hotplug) and the Device Driver]
Driver Initialization
One function (init) is called on the driver's initialization.
One function (exit) is called when the driver is removed from the system.
Question: what happens if the driver is compiled into the kernel, rather than as a module?
The init function will register hooks that will get the driver's code called when the appropriate event happens.
Question: what if the init function doesn't register any hooks?
There are various hooks that can be registered: file operations, PCI operations, USB operations, network operations. It all depends on what kind of device this is.
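To make the hook idea concrete, here is a minimal userspace model, not kernel code. The names hooks, register_hook, raise_event and the demo_* functions are all hypothetical; a real driver registers through kernel interfaces instead.

```c
#include <stddef.h>

/* Hypothetical event types, loosely mirroring the kernel event list. */
enum event { EV_OPEN, EV_INTERRUPT, EV_MAX };

typedef int (*hook_fn)(void);

/* The "kernel" side: one hook slot per event, empty until a driver registers. */
static hook_fn hooks[EV_MAX];

static void register_hook(enum event ev, hook_fn fn)
{
	hooks[ev] = fn;
}

static void unregister_hook(enum event ev)
{
	hooks[ev] = NULL;
}

/* Deliver an event; returns -1 if no driver asked to hear about it. */
static int raise_event(enum event ev)
{
	if (!hooks[ev])
		return -1;
	return hooks[ev]();
}

/* The "driver" side. */
static int opens;

static int demo_open(void)
{
	opens++;
	return 0;
}

/* Called once when the driver is loaded: all it does is register hooks. */
static int demo_init(void)
{
	register_hook(EV_OPEN, demo_open);
	return 0;
}

/* Called once when the driver is removed: undo what init did. */
static void demo_exit(void)
{
	unregister_hook(EV_OPEN);
}
```

In this model, a driver whose init registers nothing is loaded but never runs again: raise_event finds an empty slot for every event.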
File Operations
... and then you start talking to the device. klife uses the following device file operations:
open for starting a game (allocating resources).
release for finishing a game (releasing resources).
write for initializing the game (setting the starting positions on the grid).
read for generating and then reading the next state of the game's grid.
ioctl for querying the current generation number, and for enabling or disabling hooking into the timer interrupt (more on this later).
mmap for potentially faster but more complex direct access to the game's grid.
klife_open
klife's open routine allocates the klife structure, which holds all of the state for this game (the grid, starting positions, current generation, etc.).

static int klife_open(struct inode *inode, struct file *filp)
{
	struct klife *k;
	int ret;

	ret = alloc_klife(&k);
	if (ret)
		return ret;

	filp->private_data = k;

	return 0;
}
klife_open - alloc_klife
static int alloc_klife(struct klife **pk)
{
	int ret;
	struct klife *k;

	k = kmalloc(sizeof(*k), GFP_KERNEL);
	if (!k)
		return -ENOMEM;

	ret = init_klife(k);
	if (ret) {
		kfree(k);
		k = NULL;
	}

	*pk = k;
	return ret;
}
klife_open - init_klife
static int init_klife(struct klife *k)
{
	int ret;

	memset(k, 0, sizeof(*k));
	spin_lock_init(&k->lock);

	ret = -ENOMEM;
	/* one page to be exported to userspace */
	k->grid = (void *)get_zeroed_page(GFP_KERNEL);
	if (!k->grid)
		goto done;

	k->tmpgrid = kmalloc(sizeof(*k->tmpgrid), GFP_KERNEL);
	if (!k->tmpgrid)
		goto free_grid;
	...
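The slide cuts off before the done and free_grid labels appear. The following is a userspace sketch of how such goto-based unwinding usually looks; it is not the author's code. struct life, GRID_BYTES and the fail_at knob are assumptions made for illustration, and malloc/calloc/free stand in for the kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define GRID_BYTES 4096	/* stand-in for one page; an assumption */

struct life {
	unsigned char *grid;
	unsigned char *tmpgrid;
};

/*
 * fail_at is a hypothetical test knob: 1 fails the first allocation,
 * 2 fails the second, 0 lets everything succeed.
 */
static int init_life(struct life *k, int fail_at)
{
	int ret;

	memset(k, 0, sizeof(*k));

	ret = -ENOMEM;
	k->grid = (fail_at == 1) ? NULL : calloc(1, GRID_BYTES);
	if (!k->grid)
		goto done;

	k->tmpgrid = (fail_at == 2) ? NULL : malloc(GRID_BYTES);
	if (!k->tmpgrid)
		goto free_grid;

	return 0;

	/* Labels run in reverse order of allocation: each undoes one step. */
free_grid:
	free(k->grid);
	k->grid = NULL;
done:
	return ret;
}

/* Release everything init_life allocated. */
static void free_life(struct life *k)
{
	free(k->tmpgrid);
	free(k->grid);
	k->tmpgrid = NULL;
	k->grid = NULL;
}
```

A failure at any step jumps to the label that frees exactly what was allocated so far, so the error paths never leak and never free something twice.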
klife_release
klife's release routine frees the resources allocated at open time.

static int klife_release(struct inode *inode, struct file *filp)
{
	struct klife *k = filp->private_data;

	if (k->timer)
		klife_timer_unregister(k);

	if (k->mapped) {
		/* undo setting the grid page to be reserved */
		ClearPageReserved(virt_to_page(k->grid));
	}

	free_klife(k);

	return 0;
}
write
For klife, I "hijacked" write to mean "please initialize the grid to these starting positions". There are no hard and fast rules about what write has to mean, but it's good to KISS (Keep It Simple, Silly...).
klife_write - 1
static ssize_t klife_write(struct file *filp, const char __user *ubuf,
			   size_t count, loff_t *f_pos)
{
	size_t sz;
	char *kbuf;
	struct klife *k = filp->private_data;
	ssize_t ret;

	sz = count > PAGE_SIZE ? PAGE_SIZE : count;

	kbuf = kmalloc(sz, GFP_KERNEL);
	if (!kbuf)
		return -ENOMEM;
klife_write - 2
	ret = -EFAULT;
	if (copy_from_user(kbuf, ubuf, sz))
		goto free_buf;

	ret = klife_add_position(k, kbuf, sz);
	if (ret == 0)
		ret = sz;

free_buf:
	kfree(kbuf);
	return ret;
}
Commentary on write
Note that even for such a simple function, care must be exercised when dealing with untrusted users. Users are always untrusted. Always be prepared to handle errors!
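One way to treat input as untrusted is to validate all of it before changing any state. The sketch below is a userspace illustration only: the slides do not show the format klife_add_position accepts, so the (x, y) byte-pair format, the grid dimensions and the function name add_positions are all assumptions.

```c
#include <errno.h>
#include <stddef.h>

#define GRID_W 32	/* hypothetical grid dimensions */
#define GRID_H 32

/*
 * Hypothetical wire format: a sequence of (x, y) byte pairs.
 * Returns 0 on success, or -EINVAL without touching the grid.
 */
static int add_positions(unsigned char grid[GRID_H][GRID_W],
			 const unsigned char *buf, size_t len)
{
	size_t i;

	/* A trailing half-pair means the input is malformed. */
	if (len % 2 != 0)
		return -EINVAL;

	/* First pass: validate everything before changing any state. */
	for (i = 0; i < len; i += 2)
		if (buf[i] >= GRID_W || buf[i + 1] >= GRID_H)
			return -EINVAL;

	/* Second pass: the input is known-good, so apply it. */
	for (i = 0; i < len; i += 2)
		grid[buf[i + 1]][buf[i]] = 1;

	return 0;
}
```

Validating first means a bad request is rejected as a whole, rather than leaving the grid half-updated.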
read
For klife, read means "please calculate and give me the next generation". The bulk of the work is done in two other routines:
klife_next_generation calculates the next generation based on the current one, according to the rules of the Game of Life.
klife_draw takes a grid and draws it as a single string in a page of memory.
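The body of klife_next_generation is not shown on the slides. As a rough userspace sketch of the standard rules (a live cell survives with 2 or 3 live neighbours, a dead cell is born with exactly 3), assuming a small fixed grid whose cells hold 0 or 1 and whose edges count as dead:

```c
#include <string.h>

#define LIFE_W 8	/* hypothetical grid size */
#define LIFE_H 8

/* Count live neighbours of (x, y); cells beyond the edge count as dead. */
static int live_neighbours(unsigned char g[LIFE_H][LIFE_W], int x, int y)
{
	int dx, dy, n = 0;

	for (dy = -1; dy <= 1; dy++) {
		for (dx = -1; dx <= 1; dx++) {
			int nx = x + dx, ny = y + dy;

			if (dx == 0 && dy == 0)
				continue;
			if (nx < 0 || nx >= LIFE_W || ny < 0 || ny >= LIFE_H)
				continue;
			n += g[ny][nx];
		}
	}
	return n;
}

/* Advance the grid by one generation, using tmp as scratch space. */
static void next_generation(unsigned char g[LIFE_H][LIFE_W],
			    unsigned char tmp[LIFE_H][LIFE_W])
{
	int x, y;

	for (y = 0; y < LIFE_H; y++) {
		for (x = 0; x < LIFE_W; x++) {
			int n = live_neighbours(g, x, y);

			/* Survive with 2 or 3 neighbours; be born with exactly 3. */
			tmp[y][x] = g[y][x] ? (n == 2 || n == 3) : (n == 3);
		}
	}
	memcpy(g, tmp, sizeof(unsigned char) * LIFE_W * LIFE_H);
}
```

The scratch grid is needed because every cell of the new generation depends on the old values of its neighbours, so the update cannot be done in place.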
klife_read - 1
static ssize_t klife_read(struct file *filp, char *ubuf,
			  size_t count, loff_t *f_pos)
{
	struct klife *klife;
	char *page;
	ssize_t len;
	ssize_t ret;
	unsigned long flags;

	klife = filp->private_data;

	/* special handling for mmap */
	if (klife->mapped)
		return klife_read_mapped(filp, ubuf, count, f_pos);

	if (!(page = kmalloc(PAGE_SIZE, GFP_KERNEL)))
		return -ENOMEM;
klife_read - 2
	spin_lock_irqsave(&klife->lock, flags);

	klife_next_generation(klife);
	len = klife_draw(klife, page);

	spin_unlock_irqrestore(&klife->lock, flags);

	if (len < 0) {
		ret = len;
		goto free_page;
	}

	/* len can't be negative */
	len = min(count, (size_t)len);
Note that the lock is held for the shortest possible time. We will see later what the lock protects us against.
klife_read - 3
	if (copy_to_user(ubuf, page, len)) {
		ret = -EFAULT;
		goto free_page;
	}

	*f_pos += len;
	ret = len;

free_page:
	kfree(page);
	return ret;
}
klife_read - 4
static ssize_t klife_read_mapped(struct file *filp, char *ubuf,
				 size_t count, loff_t *f_pos)
{
	struct klife *klife;
	unsigned long flags;

	klife = filp->private_data;

	spin_lock_irqsave(&klife->lock, flags);

	klife_next_generation(klife);

	spin_unlock_irqrestore(&klife->lock, flags);

	return 0;
}
Commentary on read
There's plenty of room for optimization in this code ... can you see where?
ioctl
ioctl is a "special access" mechanism, for operations that do not cleanly map anywhere else. It is considered extremely bad taste to use ioctls in Linux where not absolutely necessary. New drivers should use either sysfs (a /proc-like virtual file system) or a driver-specific file system (you can write a Linux file system in fewer than 100 lines of code). In klife, we use ioctl to get the current generation number, for demonstration purposes only ...
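The slides do not show how klife defines its command numbers. A common Linux convention is to build them with the _IO* macros from <linux/ioctl.h>, which pack a direction, a driver "magic" byte, a sequence number and an argument size into one integer. The magic 'k', the sequence numbers and the helper klife_cmd_is_valid below are assumptions for illustration.

```c
#include <linux/ioctl.h>

/* Hypothetical definitions: the slides do not show klife's real ones. */
#define KLIFE_IOC_MAGIC		'k'
#define KLIFE_GET_GENERATION	_IOR(KLIFE_IOC_MAGIC, 1, unsigned long)
#define KLIFE_SET_TIMER_MODE	_IOW(KLIFE_IOC_MAGIC, 2, int)
#define KLIFE_IOC_MAXNR		2

/* Reject commands that were not built for this (hypothetical) driver. */
static int klife_cmd_is_valid(unsigned int cmd)
{
	if (_IOC_TYPE(cmd) != KLIFE_IOC_MAGIC)
		return 0;
	if (_IOC_NR(cmd) < 1 || _IOC_NR(cmd) > KLIFE_IOC_MAXNR)
		return 0;
	return 1;
}
```

Encoding the argument size and direction in the number lets a driver sanity-check a request before it copies anything to or from userspace.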
klife_ioctl - 1
static int klife_ioctl(struct inode *inode, struct file *file,
		       unsigned int cmd, unsigned long data)
{
	struct klife *klife = file->private_data;
	unsigned long gen;
	int enable;
	int ret;
	unsigned long flags;

	ret = 0;
	switch (cmd) {
	case KLIFE_GET_GENERATION:
		spin_lock_irqsave(&klife->lock, flags);
		gen = klife->gen;
		spin_unlock_irqrestore(&klife->lock, flags);
		if (copy_to_user((void *)data, &gen, sizeof(gen))) {
			ret = -EFAULT;
			goto done;
		}
klife_ioctl - 2
		break;
	case KLIFE_SET_TIMER_MODE:
		if (copy_from_user(&enable, (void *)data, sizeof(enable))) {
			ret = -EFAULT;
			goto done;
		}
		pr_debug("user request to %s timer mode\n",
			 enable ? "enable" : "disable");
		if (klife->timer && !enable)
			klife_timer_unregister(klife);
		else if (!klife->timer && enable)
			klife_timer_register(klife);
		break;
	}
done:
	return ret;
}
memory mapping
The read/write mechanism described previously incurs the overhead of a system call, the related context switching, and memory copying. mmap maps pages of a file into memory, enabling programs to access the memory directly and avoid that overhead ... but: fast synchronization between kernel space and user space is a pain (why do we need it?), and Linux read and write are really quite fast. mmap is implemented in klife for demonstration purposes, with read() calls used for synchronization and triggering a generation update.
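From the userspace side, direct access through a mapping looks like ordinary memory access. The sketch below uses a regular file rather than the klife device, so it only illustrates the mechanism; poke_via_mmap is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write msg into the start of an open file through a shared mapping,
 * with no write() call. Returns 0 on success, -1 on failure.
 */
static int poke_via_mmap(int fd, const char *msg, size_t maplen)
{
	size_t n = strlen(msg);
	char *p;

	if (n > maplen)
		return -1;

	p = mmap(NULL, maplen, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return -1;

	memcpy(p, msg, n);	/* plain stores; no system call per access */

	return munmap(p, maplen);
}
```

Once the mapping exists, each access is a plain load or store, which is where the savings come from. The cost is that nothing tells the other side when the data changed, hence the synchronization problem mentioned above.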
klife_mmap
...
	SetPageReserved(virt_to_page(klife->grid));

	ret = remap_pfn_range(vma, vma->vm_start,
			      virt_to_phys(klife->grid) >> PAGE_SHIFT,
			      PAGE_SIZE, vma->vm_page_prot);

	pr_debug("remap_pfn_range returned %d\n", ret);

	if (ret == 0)
		klife->mapped = 1;

	return ret;
}
Deferring Work
You were supposed to learn in class about bottom halves, softirqs, tasklets and other such curse words. The timer interrupt (and every other interrupt) has to happen very quickly. Why? The interrupt handler (top half, hard irq) usually just sets a flag which says "there is work to be done". The work is then deferred to a bottom half context, where it is done by an (old style) bottom half, softirq, or tasklet. For klife, we defer the work we wish to do (updating the grid) to a bottom half context by scheduling a tasklet.
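The split between the two halves can be modelled in plain userspace C. This is only a sketch of the idea, not the tasklet API: work_pending, top_half, run_bottom_half and update_grid are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical state shared between the two halves. */
static volatile bool work_pending;	/* set by the top half */
static unsigned long generation;	/* updated only by the bottom half */

/* Top half: stands in for the interrupt handler, so it only sets a flag. */
static void top_half(void)
{
	work_pending = true;
}

/* The slow part, standing in for updating the grid. */
static void update_grid(void)
{
	generation++;
}

/* Bottom half: run later, at a safer time. Returns true if it did work. */
static bool run_bottom_half(void)
{
	if (!work_pending)
		return false;
	work_pending = false;
	update_grid();
	return true;
}
```

In this model, two interrupts that arrive before the bottom half runs are coalesced into a single update, because the flag only records that work exists, not how many times it was requested.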
In drivers/char/Makefile:
+obj-$(CONFIG_GAME_OF_LIFE) += klife.o
Summary
Writing Linux drivers is easy ... and fun!
Most drivers do fairly simple things, which Linux provides APIs for.
The real fun is when dealing with the hardware's quirks.
It gets easier with practice ... but it never gets boring.
Questions?
Bibliography
kernelnewbies - http://www.kernelnewbies.org
linux-kernel mailing list archives - http://marc.theaimsgroup.com/?l=linux-kernel&w=2
Understanding the Linux Kernel, by Bovet and Cesati
Linux Device Drivers, 3rd edition, by Rubini et al.
Linux Kernel Development, 2nd edition, by Robert Love
/usr/src/linux-xxx/