This paper presents a series of programming projects based on the Linux kernel. The projects cover several key operating systems concepts, including process scheduling. Students' response to these projects is overwhelmingly positive, the authors say.
Linux Kernel Projects for an Undergraduate Operating Systems Course
Rob Hess and Paul Paulson
School of Electrical Engineering and Computer Science
Oregon State University
Corvallis, OR 97331, USA
{hess,paulson}@eecs.oregonstate.edu
ABSTRACT
In this paper, we present a series of programming projects based on the Linux kernel for students in a senior-level undergraduate operating systems course. The projects we describe cover several key operating systems concepts, including process scheduling, I/O scheduling, memory management, and device drivers. In addition, we assess these projects along several dimensions, from their difficulty to their capacity to help students understand operating systems concepts, based on six terms (three years) of detailed student exit surveys along with observations and anecdotal evidence. Through this assessment, we conclude that our Linux-based projects are an effective means by which to teach operating systems concepts and, additionally, that students’ response to these projects is overwhelmingly positive.

Categories and Subject Descriptors
K.3.2 [Computers and Education]: Computer and Information Science Education; D.4.7 [Operating Systems]: Organization and Design

General Terms
Human Factors, Design, Experimentation

Keywords
Linux, Kernel, Operating Systems, Projects, Open Source

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGCSE’10, March 10–13, 2010, Milwaukee, Wisconsin, USA.
Copyright 2010 ACM 978-1-60558-885-8/10/03 ...$10.00.

1. INTRODUCTION
Effectively teaching operating systems requires providing students with a good balance of theory and practice. How to teach operating systems theory is fairly well established, and a number of good textbooks exist for this purpose. How to teach operating systems practice, however, is a topic that still receives a great deal of attention in the literature.

The most basic approach to teaching operating systems practice is to have students implement solutions to hypothetical problems that are analogously related to operating systems concepts. Examples of such problems include the dining philosophers problem, which illustrates concurrency issues, or the elevator problem, which illustrates issues related to disk I/O. These problems and their solutions, however, really skirt the issue of operating systems practice by sidestepping most of the important details that lie at the heart of a functional operating system, and, in the end, they really only reinforce operating systems theory rather than teaching operating systems practice.

An alternative approach has students program within an instructional operating system such as Nachos [4] or Minix [12]. Anderson and Nguyen [2] provide an excellent survey of the many existing such systems. The motivation behind the use of these instructional systems is to expose students to some of the issues involved in making an operating system “work” while, at the same time, protecting them from as many as possible of the more formidable, nuts-and-bolts aspects of a real production operating system.

In this paper, we argue that protecting senior-level students completely from the details of a production operating system is not necessary and, further, that exposing them to some of these details can, in fact, prove to be beneficial. Many educators have already documented their success in having students work on production software in other domains [7, 3, 6, 11, 1]. Some of the benefits of classroom production programming described by these educators include teaching students that most real programming projects are not started and finished by a single programmer alone, but are rather the collaborative effort of many; teaching them how to break down, understand and contribute to very large pieces of software written by other people; introducing them to programming environments similar to those they will use later in their careers; teaching them the importance of good documentation and well-written, self-documenting code; increasing their interest in programming and hence their motivation to program; and, most importantly, giving them a sense of achievement and pride in their work.

In fact, some educators have already begun having students do production programming in their operating systems courses. Indeed, as of 2005, an almost surprising 14% of the top 100 computer science schools reported using the Linux kernel in some form in their undergraduate operating systems course [2]. In the literature, Nieh and Vaill [10], like us, describe several projects that involve students programming directly within the Linux kernel. Similarly, Lawson and Barnett [8] describe having students modify a custom Linux kernel designed to run on the iPod. In both of these cases, the authors report not only that their students were able to successfully program within a complex production operating system, but that they did so with tremendous enthusiasm.

The goal of this paper is thus not to introduce Linux programming in an operating systems course as a novel practice, but rather to present a set of Linux kernel-based operating systems projects which can be seen as complementary to the ones already described in the literature. Specifically, we describe five projects that cover system calls, process scheduling, memory allocation, I/O scheduling, and device drivers. Each of these projects (with the exception of the process scheduler, for which we provide students with virtual machine and skeleton scheduler code derived from the Linux process scheduler) is implemented directly within the Linux kernel and involves understanding and modifying existing Linux code to create new functionality.

We have run our senior-level operating systems course for six terms (three years) using these Linux-based projects, and like previous authors, we too have seen a great deal of enthusiasm in our students as they complete these projects. Our previous experience using an instructional operating system in the same course suggests that it is far more difficult for students to achieve the same level of enthusiasm using an instructional operating system as we have seen them achieve when programming in the Linux kernel.

We have seen other benefits to having students do Linux programming in our course, as well. The foremost of these is the exposure it gives them to the type of programming many of them will perform later in their careers, involving understanding and modifying a large, complex piece of software written by someone else. After doing this type of programming, many students demonstrate noticeably improved self-confidence in their own computer science abilities. Again, this is markedly different from our experience using an instructional operating system in the same course.

After outlining our set of Linux-based projects, we more fully assess their effectiveness in an analysis based on data from exit surveys distributed in each of the six terms we used our Linux-based projects. Through this analysis we show that students find the difficulty of our Linux-based projects to be appropriate for a senior-level course and, moreover, that students find these projects to be an effective way to learn operating systems concepts in practice. In addition, we combine our survey results with personal observation and anecdotal evidence to demonstrate the level of enthusiasm our course generates and, further, to describe some of the benefits students see as a result of having completed our Linux kernel development-based projects.

2. COURSE DESCRIPTION
Our work is set in a one-quarter, senior-level course on the principles of operating systems. The course’s learning objectives include concepts related to concurrency, process management, CPU scheduling, synchronization, deadlock, memory management, disk management, I/O scheduling, etc. We address the theoretical aspects of these concepts in our lectures. However, another primary course objective is to have students design, implement and test functions related to these concepts within a large, complex code base.

For many years, we used the Java-based Nachos instructional operating system [4] as our code base for student development, but, while this package included a virtual machine and a small set of projects with auto-graders which made it fairly easy to use, we found that it was poorly documented and poorly maintained. In addition, because we had assigned the projects for several years, students were beginning to reference other students’ solutions from previous offerings of the course. Students also commonly complained that the projects had no relevance to the theoretical topics discussed in the course, and exit surveys showed that they would prefer more realistic projects.

Instead of developing new projects for a system we felt was too limited, we decided to choose a new system, but using another instructional operating system seemed like only a partial solution to our problems. Linux was attractive to us because it is both a real operating system and an open-source project with which students are familiar. This latter point was important to us because we felt it would help students approach our operating systems course enthusiastically, knowing that part of our course would involve them modifying a well-known and well-respected piece of software.

2.1 Development Environment
In the six terms we have run our course using Linux, we have evolved a development environment similar to the one described by Nieh and Vaill in [10]. Specifically, students work in teams of three to four members, and each team is provided with a virtual machine (VM) for testing their kernel code. Our VMs are hosted using VMware Server (freely available at www.vmware.com). This setup allows students to access their VMs from anywhere on or off campus, and it provides an inexpensive and less risky alternative to supplying each team with its own physical testing machine.

Each student team is also provided with a Subversion repository for revision control of the code they develop. This is useful, as it permits different team members to easily work on separate parts of their code in parallel. More importantly, using Subversion helps us to simulate a real production development environment where some form of version control software is generally employed.

To help students browse the Linux source code and easily track down function, structure and variable definitions, we refer them to an indexed, cross-referenced version of the Linux source, available online at lxr.linux.no. This has proved to be an invaluable resource for students, as has a course email list, over which we permit a certain degree of collaboration among student teams.

3. LINUX KERNEL-BASED OS PROJECTS
Below we describe our set of five Linux-based projects.¹ Following that, we discuss several ideas for projects that are under development. As a complement to these projects, we recommend using a good Linux kernel reference, such as the excellent book by Love [9]. Note that, because the projects we describe here involve low-level Linux kernel programming, familiarity with C or a similar language is a prerequisite. Also, note that the descriptions below are based on version 2.6.23 of the Linux kernel, though they likely apply to other versions as well.

¹ Full project descriptions and the virtual machine and skeleton code for our process scheduling project can be found online at http://eecs.oregonstate.edu/~hess/cs411.html.

3.1 Simple System Call
Our first programming project is designed to help students set up and acclimate themselves to our Linux kernel development environment and to help them overcome any initial apprehension they might have about modifying Linux kernel code. In this project, each student team must download and initialize their Subversion repository with a copy of a recent stable version of the Linux kernel source code and then compile and install that version of the kernel. Once they have successfully done this, they must create a new branch in their repository, and within this branch, they must write a simple system call and register it with the kernel. To test their work, they must compile and install their modified version of the kernel and write a simple C program to invoke their system call. This project is very straightforward, and we generally allot one week for students to complete it.

3.2 Process Scheduler
The process scheduler is one of the most important parts of an operating system, and it is thus something we feel students should gain practical experience with in an operating systems course. However, the Linux process scheduler is one area of the kernel that does not lend itself well to student modification, as the fact that it must deal with issues like symmetric multi-processing, kernel preemption, and atomic execution makes it an incredibly complex piece of code.

For this reason, we have developed a simplified version of the Linux process scheduler, which does not deal with such complicating issues as the ones mentioned above but still retains the basic structure and functionality of the original Linux process scheduler. This scheduler runs on a simple virtual machine, which we also developed.

In our process scheduler project, we provide students with a skeleton version of our simplified Linux scheduler, which they must modify to implement a round-robin scheduler. Specifically, processes which are created by the virtual machine are submitted to the scheduler in the same manner as processes are submitted in the Linux kernel. Students must design a scheduler that assigns a time-slice to each of these processes and cycles through processes as their time-slices expire. Students’ schedulers must also account for issues such as processes sleeping while waiting for I/O.

Our simplified process scheduler is designed to be as similar as possible to the real Linux scheduler. Thus, to complete this project, students must still use most of the original structures and functions from the Linux process scheduler. For example, in this project, students are introduced to Linux’s rather elegant linked list implementation (located in the Linux kernel tree at include/linux/list.h).

This project is fairly extensible, in that the skeleton we provide to students can be filled out to implement any number of different process scheduling algorithms. For example, it would be straightforward to have students implement Linux’s constant time scheduler. In addition, our virtual machine operates on process “profiles” which can be hand designed and produce a deterministic process ordering for a given scheduling algorithm with fixed parameters. This makes testing students’ schedulers for correctness very easy.

Because this project does not involve the full kernel development process, which includes kernel compilation, installation, and debugging, it helps students begin to get comfortable accomplishing tasks within an existing piece of Linux kernel-like code. For this reason, we assign this project second, before projects that involve going deeper into the machinery of the Linux kernel. This project is moderately challenging and we typically allot two weeks for it.

3.3 Memory Allocator
The Linux kernel is equipped with three memory allocators: SLAB and SLUB, which are complex frameworks designed for use in resource-rich systems to reduce internal fragmentation of memory and to permit efficient reuse of freed memory; and SLOB, which is a lightweight, efficient framework designed for use in embedded systems and other systems with limited resources.

The SLOB (Simple List Of Blocks) allocator, located in the Linux kernel tree at mm/slob.c, is a piece of the kernel that, unlike the process scheduler, can easily be extended by students. It works by maintaining a linked list of available blocks of contiguous memory called the free list, and, when a request for memory is made, it is serviced from the free list in first-fit fashion. If there are no available blocks large enough to service a given request, a new page of memory is allocated and prepended to the free list. Requests larger than one page are serviced separately.

As a project, we have students modify the SLOB allocator to service requests from the free list in either best-fit or worst-fit fashion, both of which are classical allocation strategies discussed in most operating systems textbooks. To verify that their implementation works, students use configuration scripts to select their allocator and then compile and install the modified kernel. A defective implementation will typically not survive a small amount of normal usage.

To test the correctness of their implementation, we also have students write system calls to compute the total amount of memory on the free list as well as the total amount claimed by the SLOB allocator for allocations less than one page (i.e. memory that is either on the free list or has been allocated off the free list and not released). The values returned by these functions can be used to compute a rough measure of internal fragmentation, and students can use these values to compare the amount of fragmentation that results from using different allocation strategies in the SLOB allocator (typically, the best-fit and worst-fit strategies will suffer less from fragmentation than does the first-fit strategy).

This project is also straightforward and does not require a great amount of coding, but it begins to bring students into the inner workings of the Linux kernel and requires them to understand the code they are modifying before they start hacking. We typically allot two weeks for this project.

3.4 I/O Scheduler
Another piece of the Linux kernel that is simple and easily extensible by students is the no-op I/O scheduler, located at block/noop-iosched.c in the kernel tree. This scheduler services I/O requests in first-in, first-out (FIFO) order and is designed to be used with devices such as flash drives, where disk access time is constant and independent of physical location on the disk. To determine the FIFO ordering of requests, the no-op scheduler maintains a request queue using the Linux kernel’s linked list implementation.

As a project, we have students extend the no-op scheduler to implement the shortest seek time first algorithm (SSTF), which services I/O requests in order of increasing distance from the current location of the disk head. This is another classical algorithm discussed in most operating system textbooks. Extending the no-op scheduler to implement SSTF can be achieved in several ways. The most straightforward of these involves sorting the list of pending requests by their physical location on the disk and selecting the one closest to the current location of the disk head whenever the kernel signals that a request should be serviced.

This project is quite versatile. For example, it can be changed from term to term by having students implement a different classical scheduling algorithm, such as SCAN or LOOK. In addition, it can be made more challenging by having students implement the sorted request list using the Linux kernel’s red-black tree implementation (include/linux/rbtree.h and lib/rbtree.c); by having them include additional features, like request aging; or by having them implement various optimization features available in Linux’s I/O scheduler API, such as request merging.

To verify the correctness of their implementation, we find it easiest to have students print to a log file the order in which requests are submitted to their scheduler and the order in which they are serviced. A quick examination of this file should suffice to show that requests are serviced in proper order. Alternative, though perhaps more complicated, methods for testing the I/O scheduler are available, such as the blktrace functionality built into the kernel.

This project is more challenging than the previous two, as it requires significantly more design and coding, and we typically give students three weeks to finish it.

3.5 Device Driver
A common boast among Linux kernel developers is that Linux supports more devices than any other operating system in history. Indeed, device drivers account for more than half of the code in the Linux kernel. However, many devices exist that are still unsupported by Linux, and one of the best ways for a developer to get his or her code into the Linux kernel is by writing a driver for one of these. Writing a driver is also one of the most common tasks initially assigned to those hired by a company to do Linux kernel development. For these reasons, students find it particularly interesting to write a device driver as a project.

Unfortunately, since our development environment involves VMs hosted on a remote server and because we are constrained to work within a ten-week quarter, it is difficult to find real devices for which students can easily write drivers. Moreover, providing devices to students might be financially infeasible, even if an appropriate device could be found. We have thus found it easiest for students to write a driver for a virtual device. Specifically, we have them implement a RAM disk driver, which allocates a large block of memory and presents it in the form of a block device (i.e. a disk).

Chapter 16 of Corbet et al.’s book on Linux device drivers [5] provides a nice step-by-step tutorial on writing a RAM disk driver, and we have students follow this tutorial to produce a basic, working RAM disk. We typically have students extend their basic driver, for example, by using the Linux kernel’s built-in cryptography API to encrypt their disk or by having them add additional functionality, such as an “eject” function, which can be implemented as an ioctl command. Validating a RAM disk driver implementation involves creating a filesystem on the disk, mounting it, and then using that filesystem and observing whether data is correctly stored to and retrieved from the disk.

This project serves not only to have students write a real Linux device driver; it also introduces them to several key kernel APIs, such as the block device API and the cryptography API. Because this project does not deal directly with any of the theoretical material we cover in lecture, because students find it more interesting than other projects, and because they are allowed to follow the RAM disk driver tutorial in [5], we typically assign this project last, at a time when students are generally busy with final projects for other classes and studying for final exams. Two weeks are usually sufficient for students to complete this project.

3.6 Projects Under Development
Our course is under continual development, and there are a number of projects we are developing that we have not yet assigned to students or which we decided to develop further due to poor outcomes. Here, we discuss some of these.

Filesystem. Filesystems are another important part of the operating system, so we are working to develop a filesystem project. We are specifically considering one in which students modify the RAM filesystem (located at fs/ramfs/) to support version control functionality. As an alternative, we are also considering a project in which students would implement a filesystem in userspace using FUSE, an open-source Linux kernel module that allows users to develop filesystems outside of the kernel (available at fuse.sf.net).

Process Synchronization. Process synchronization is also an important part of an operating system, and we would like to develop a project in which students implement a process synchronization method and replace calls to existing synchronization methods in Linux with calls to their own. Nieh and Vaill describe a project along these lines [10].

More Device Drivers. Although the RAM disk driver project is one of our more popular ones, it does not give students the satisfaction of writing a driver for a real, physical device and seeing that device function properly when the driver is finished. Virtual machine-related issues aside, to have students write a driver for a real, physical device would require finding an inexpensive device that was simple enough and that came with a clear set of hardware specifications so that students could be reasonably expected to write a driver for it as a two- to four-week project. We do believe, though, that such a device exists, and we are considering options such as a simple USB device, like a thermometer, or a simple serial port device. Recently, our department also instituted an initiative in which students complete various course-related projects on their own Linux-based handheld devices as they progress through the Computer Science curriculum, and we are now working to develop driver projects for these devices.

4. COURSE ASSESSMENT
In every term we taught our course using the Linux kernel development-based projects, we conducted a detailed, open-ended exit survey of all of the students. In all, we received 243 responses to these surveys over six terms. While our surveys were designed to gather information of our own interest rather than as the basis for a formal research study, they still provide a great deal of information that is both useful and relevant to this paper.

Here, we use these surveys to answer three main questions:

1. Are our projects appropriately difficult for a senior-level undergraduate course?
2. From our students’ perspective, are our projects an effective means by which to teach operating systems concepts in practice?
3. What degree of interest and enthusiasm do our projects engender in our students?

The first two of these questions we ask students directly in the surveys. To answer the third, we use the following question as a proxy:

3′. Have the projects increased your interest in doing work on the Linux kernel or on other open-source projects?

Our rationale here is that students who are enthusiastic about the work they do in our course will be interested in continuing that work after the course has ended.

Difficulty. To gauge whether our projects are appropriately difficult, we classified each student’s responses to question 1 as indicating that the projects were either appropriately difficult, too difficult, or too easy. In this way we determined that 66% of the 243 responding students found our projects to be appropriately difficult, and 7% found them to be too easy, while only 24% found them to be too difficult. These numbers suggest fairly conclusively that our projects are appropriately difficult for senior-level undergraduates. One student’s answer to this question sums up students’ general attitude about our projects’ difficulty: “kernel programming... isn’t the black magic I used to think it to be.”

Effectiveness. To determine whether students felt that our projects were an effective means through which to learn operating systems concepts, we similarly classified students’ responses to question 2. Through this classification, we determined that 74% of the 243 responding students did feel the Linux-based projects were a good way to learn operating systems concepts, while only 23% felt they were not.

Again, we consider these numbers to be fairly conclusive proof of our students’ satisfaction with these projects. However, it is also informative to read some of what they wrote in this regard. For example, one student writes, “this was a great way to teach operating system concepts, just because it’s so practical, which makes us want to really understand what we’re doing.” Another writes, “I’ve always had a conceptual idea of how an OS works, but diving in like this made it very clear.”

Enthusiasm. We analyzed question 3′ in the same manner as we did questions 1 and 2, and found that 48% of students described an increased interest in open-source programming or Linux kernel development in particular, while 52% did not. Note, however, that this question, unlike questions 1 and 2, is not a zero-sum question. In other words, here, we are looking not for a majority opinion of the students surveyed but, rather, for the total number of students who reported increased interest. Indeed, we find it quite compelling that nearly half of our students have wanted to continue working on Linux or another open-source project after completing our course.

Again, it is informative to read some of what students wrote here. For example, one writes, “thanks to this class, I can follow my interest in developing device drivers for the kernel.” Another writes, “I’ve been interested in kernel development, but afraid to try. This course has given me a whole new perspective.”

Self-confidence. The last quote above also touches on a benefit of our projects that we didn’t try to measure formally through our surveys. Specifically, after completing our projects, many students appear to gain a good deal of self-confidence in their computer science abilities. Many of them speak to this in their survey responses. One student writes, for example, that our projects “gave [him] confidence that [he] could do this as a career.” The following quote, however, epitomizes the attitude with which we believe many students leave our course.

    I was always so scared of the Linux kernel and now I feel like most operating systems have similar concepts, and I don’t have to be scared [of] those either... The transitive property of working on the Linux kernel has given me confidence for other systems as well.

Hirability. Finally, we note that many students see increased hirability as a benefit of their work on our Linux kernel-based projects. Many students who have graduated since taking our course have described to us how their work on our projects helped them earn a job. Some students, such as this one, even discuss this in their surveys: “I will be working on and contributing to another open source project... during an internship at Intel, and telling them about what we’ve done in this class helped me get the job.” Others have told us informally that the Linux programming environment they used in our class prepared them well for the build environments they work with in industry.

5. CONCLUSIONS AND FUTURE WORK
In this paper, we described a set of Linux kernel-based projects for a senior-level undergraduate operating systems course, and we assessed those projects along several dimensions, both formally and informally. There are many directions for future work, most notably in developing new projects, but also, for example, in devising better methods to grade our current projects.

6. ACKNOWLEDGMENTS
Much of our students’ enthusiasm can be attributed to guest lectures by Linux kernel maintainer Greg Kroah-Hartman, to whom we are extremely grateful. We are also indebted to Mike Marineau, Yunrim Park, and Ian Oberst for their hard work to help develop and run our course. We would also like to thank Google for their financial support.

7. REFERENCES
[1] E. Allen, R. Cartwright, and C. Reis. Production programming in the classroom. In Proc. SIGCSE, 2003.
[2] C. Anderson and M. Nguyen. A survey of contemporary instructional operating systems for use in undergraduate courses. JCSC, 21(1), 2005.
[3] D. Carrington and S.-K. Kim. Teaching software design with open source software. In Proc. FIE, 2003.
[4] W. A. Christopher, S. J. Proctor, and T. E. Anderson. The Nachos instructional operating system. Technical Report CSD-93-739, University of California, Berkeley, 1993.
[5] J. Corbet, A. Rubini, and G. Kroah-Hartman. Linux Device Drivers. O’Reilly Media, 3rd edition, 2005. Freely available at http://lwn.net/Kernel/LDD3/.
[6] H. J. C. Ellis, R. A. Morelli, T. R. de Lanerolle, J. Damon, and J. Raye. Can humanitarian open-source software development draw new students to CS? In Proc. SIGCSE, 2007.
[7] C. P. Fuhrman. Appreciation of software design concerns via open-source tools and projects. In Proc. ECOOP, 2006.
[8] B. Lawson and L. Barnett. Using iPodLinux in an introductory OS course. In Proc. SIGCSE, 2008.
[9] R. Love. Linux Kernel Development. Novell, 2nd edition, 2005.
[10] J. Nieh and C. Vaill. Experiences teaching operating systems using virtual platforms and Linux. In Proc. SIGCSE, 2005.
[11] M. Pedroni, T. Bay, M. Oriol, and A. Pedroni. Open source projects in programming courses. In Proc. SIGCSE, 2007.
[12] A. S. Tanenbaum. A UNIX clone with source code for operating systems courses. SIGOPS OSR, 21(1), 1987.
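
The round-robin policy described in Section 3.2 (assign each process a time-slice, then cycle through processes as their slices expire) can be illustrated with a short user-space simulation. This is only a sketch in Python, not the skeleton scheduler or virtual machine used in the course. The function name `round_robin`, the `(name, cpu_time)` profile format, and the fixed `time_slice` are assumptions made for this example, and sleeping on I/O is not modeled.

```python
from collections import deque


def round_robin(profiles, time_slice):
    """Simulate round-robin scheduling in user space.

    profiles: list of (name, cpu_time_needed) tuples (hypothetical format).
    Returns a list of (name, ticks_run) entries in the order processes ran.
    """
    if time_slice <= 0:
        raise ValueError("time_slice must be positive")
    run_queue = deque(profiles)
    trace = []
    while run_queue:
        name, remaining = run_queue.popleft()
        ran = min(time_slice, remaining)
        trace.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Time slice expired: move the process to the back of the queue.
            run_queue.append((name, remaining))
    return trace


if __name__ == "__main__":
    for name, ran in round_robin([("A", 5), ("B", 2), ("C", 4)], time_slice=2):
        print(name, ran)
```

Because the input profiles and the time-slice are fixed, the resulting trace is deterministic, which mirrors the property the paper relies on to check students' schedulers for correctness.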
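
Section 3.3 contrasts first-fit, best-fit, and worst-fit servicing of requests from the SLOB free list. The difference between the three strategies can be sketched on a plain list of block sizes. This is a user-space illustration in Python, not code from mm/slob.c. The function name `pick_block` and the representation of the free list as a list of integers are assumptions made for this example.

```python
def pick_block(free_sizes, request, strategy="first"):
    """Return the index of the free block chosen for `request`, or None.

    free_sizes: sizes of the blocks on a simulated free list, in list order.
    strategy:   "first", "best", or "worst".
    """
    # Only blocks at least as large as the request can service it.
    candidates = [(size, i) for i, size in enumerate(free_sizes) if size >= request]
    if not candidates:
        # No block is large enough; a real allocator would obtain a new page.
        return None
    if strategy == "first":
        # First adequate block in list order.
        return min(i for _, i in candidates)
    if strategy == "best":
        # Smallest adequate block; ties go to the earliest block.
        return min(candidates)[1]
    if strategy == "worst":
        # Largest block; ties go to the earliest block.
        return max(candidates, key=lambda c: (c[0], -c[1]))[1]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    free_list = [64, 16, 128, 32]
    for strategy in ("first", "best", "worst"):
        print(strategy, pick_block(free_list, 20, strategy))
```

For a 20-unit request against the sizes `[64, 16, 128, 32]`, first-fit picks the 64-unit block, best-fit the 32-unit block, and worst-fit the 128-unit block, so the three strategies leave differently sized remainders on the free list.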
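
Section 3.3 also mentions computing a rough fragmentation measure from two values: the memory currently on the free list and the total memory claimed by the allocator. The paper does not state the formula, so the sketch below assumes one plausible choice, the fraction of claimed memory that is sitting unused on the free list. The function name `fragmentation_ratio` and both parameter names are hypothetical.

```python
def fragmentation_ratio(free_bytes, claimed_bytes):
    """Fraction of claimed memory that is unused (assumed definition).

    free_bytes:    total memory on the free list.
    claimed_bytes: total memory claimed by the allocator, i.e. free memory
                   plus memory allocated off the free list and not released.
    """
    if claimed_bytes <= 0:
        raise ValueError("claimed_bytes must be positive")
    if not 0 <= free_bytes <= claimed_bytes:
        raise ValueError("free_bytes must lie between 0 and claimed_bytes")
    return free_bytes / claimed_bytes


if __name__ == "__main__":
    # Hypothetical readings, not measurements from the course.
    print(fragmentation_ratio(1024, 4096))
```

Comparing this ratio after running the same workload under different allocation strategies gives the kind of side-by-side comparison the section describes.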
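
The shortest seek time first (SSTF) policy from Section 3.4 repeatedly services the pending request closest to the current head position. The following Python sketch simulates that selection in user space. It is not based on block/noop-iosched.c or the kernel's I/O scheduler API. The function name `sstf_order` is hypothetical, requests are represented as bare sector numbers, and all requests are assumed to be pending before servicing starts.

```python
def sstf_order(head, pending):
    """Return the order in which SSTF would service the pending requests.

    head:    current head position (sector number).
    pending: sector numbers of pending requests, in submission order.
    """
    remaining = list(pending)
    order = []
    while remaining:
        # Choose the request closest to the head; on ties, the one
        # submitted earliest wins because min() keeps the first minimum.
        idx = min(range(len(remaining)), key=lambda i: abs(remaining[i] - head))
        head = remaining.pop(idx)  # the head moves to the serviced request
        order.append(head)
    return order


if __name__ == "__main__":
    submitted = [98, 183, 37, 122, 14, 124, 65, 67]
    print("submitted:", submitted)
    print("serviced: ", sstf_order(53, submitted))
```

Printing the submission order next to the service order is the same kind of check as the log-file comparison the section recommends for verifying a scheduler.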