Anatomy of the Linux Kernel
Tim Jones
Number of words 2730
Computer science content High
Math content Low
English language complexity Low
Learning objectives
+ to acquire basic vocabulary related to ope
+ to understand
Sub-areas covered
+ Linux kernel and its subsystems
Keywords
+ kernel - the central component of most computer operating systems (OS). Its functions include managing the system's resources (the communication between hardware and software components)
+ Linux kernel - Unix-like operating system kernel
+ VFS (Virtual File System) - an abstraction layer on top of a more concrete file system
+ GNU - a computer operating system composed entirely of free software, initiated in 1984 by Richard Stallman
+ GPL - a widely used free software license, originally written by Richard Stallman for the GNU project
+ Minix - free/open source, Unix-like operating system (OS) based on a microkernel architecture
+ Unix - a computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs including Ken Thompson, Dennis Ritchie and Douglas McIlroy
+ operating system - the software that manages the sharing of the resources of a computer and provides programmers with an interface used to access those resources
+ buffer - a region of memory used to temporarily hold data while it is being moved from one place to another
+ buffer cache - a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache
English++
Summary
As the title suggests, this article is about the Linux kernel. It starts with a historical introduction, which includes information on Unix and Minix, the predecessors of Linux. The next section deals with the Linux kernel in general and how it is constructed. The third section is divided into subsections, each of which describes one subsystem of the Linux kernel. Subsequent sections deal with memory management, process management, the device driver layer, the file system, the system call interface, and the network stack. The article is not very complex, being only an introduction to the Linux kernel. It concludes with a list of links to longer articles about the Linux kernel and its subsystems.
Pre-reading exercises
1. What are the most popular operating systems?
2. What are the advantages of Linux?
3. What are the disadvantages of Linux?
Anatomy of the Linux Kernel
The Linux kernel is the core of a large and complex operating system, and while it is huge, it is well organized in terms of subsystems and layers. In this article, you can explore the general structure of the Linux kernel and get to know its major subsystems and core interfaces. Where possible, you get links to other IBM articles to help you dig deeper.
Given that the goal of this article is to introduce you to the Linux kernel and explore its architecture and major components, let's start with a short tour of Linux kernel history, then look at the Linux kernel architecture from 30,000 feet, and, finally, examine its major subsystems. The Linux kernel is over six million lines of code, so this introduction is not exhaustive. Use the pointers to more content to dig in further.
A short tour of Linux history
While Linux is arguably the most popular open source operating system, its history is actually quite short considering the timeline of operating systems. In the early days of computing, programmers developed on the bare hardware in the hardware's language. The lack of an operating system meant that only one application (and one user) could use the large and expensive device at a time. Early operating systems were developed in the 1950s to provide a simpler development experience. Examples include the General Motors Operating System (GMOS) developed for the IBM 701 and the FORTRAN Monitor System (FMS) developed by North American Aviation for the IBM 709.
In the 1960s, the Massachusetts Institute of Technology (MIT) and a host of companies developed an experimental operating system called Multics (or Multiplexed Information and Computing Service) for the GE-645. One of the developers of this operating system, AT&T, dropped out of Multics and developed their own operating system in 1970 called Unics. Alongside this operating system came the C language, which was developed and then used to rewrite the operating system, making operating system development portable.
Andrew Tanenbaum created a microkernel version of UNIX®, called MINIX (for minimal UNIX), that ran on small personal computers. This open source operating system inspired Linus Torvalds' initial development of Linux in the early 1990s.
Linux quickly evolved from a single-person project to a worldwide development project involving thousands of developers. One of the most important decisions for Linux was its adoption of the GNU General Public License (GPL). Under the GPL, the Linux kernel was protected from commercial exploitation, and it also benefited from
the user-space development of the GNU project (of Richard Stallman, whose source dwarfs that of the Linux kernel). This allowed useful applications such as the GNU Compiler Collection (GCC) and various shell support.
Introduction to the Linux kernel
Now on to a high-altitude look at the GNU/Linux operating system architecture. You can think about an operating system from two levels.
At the top is the user, or application, space. This is where the user applications are executed. Below the user space is the kernel space. Here, the Linux kernel exists. There is also the GNU C Library (glibc). This provides the system call interface that connects to the kernel and provides the mechanism to transition between the user-space application and the kernel. This is important because the kernel and user application occupy different protected address spaces. And while each user-space process occupies its own virtual address space, the kernel occupies a single address space. For more information, see the links in the resources section.
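The path from a user-space application through the C library into the kernel can be observed from a script. The following is a minimal sketch, assuming a POSIX system where Python's ctypes can load the C library already linked into the process; the function names pid_via_libc and pid_via_python are illustrative and not part of any library.

```python
import ctypes
import os

# Assumption: on a POSIX system, CDLL(None) exposes the symbols of the
# C library already linked into this process (glibc on GNU/Linux).
libc = ctypes.CDLL(None, use_errno=True)


def pid_via_libc() -> int:
    """Call the C library's getpid() wrapper directly."""
    return libc.getpid()


def pid_via_python() -> int:
    """Ask for the same value through Python's own os module."""
    return os.getpid()


# Both routes end at the same kernel service, so the values agree.
print(pid_via_libc(), pid_via_python())
```

Both calls run in user space and rely on the C library to cross into kernel space, which is why they report the same process ID.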
The Linux kernel can be further divided into three gross levels. At the top is the system call interface, which implements the basic functions such as read and write. Below the system call interface is the kernel code, which can be more accurately defined as the architecture-independent kernel code. This code is common to all of the processor architectures supported by Linux. Below this is the architecture-dependent code, which forms what is more commonly called a BSP (Board Support Package). This code serves as the processor- and platform-specific code for the given architecture.
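The read and write functions named above can be exercised without touching a disk. This sketch, assuming a POSIX system, pushes a few bytes through a kernel pipe using Python's thin wrappers over those system calls; the helper name echo_through_kernel is made up for the example.

```python
import os


def echo_through_kernel(message: bytes) -> bytes:
    """Send a short message through a kernel pipe with write and read."""
    read_fd, write_fd = os.pipe()
    try:
        # os.write and os.read map closely onto the write and read system calls.
        written = os.write(write_fd, message)
        return os.read(read_fd, written)
    finally:
        os.close(read_fd)
        os.close(write_fd)


print(echo_through_kernel(b"hello kernel"))
```

The message is kept short on purpose: a pipe has a limited kernel buffer, and a write larger than that buffer would block until a reader drained it.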
Properties of the Linux kernel
When discussing the architecture of a large and complex system, you can view the system from many perspectives. One goal of an architectural decomposition is to provide a way to understand the source better, and that's what we'll do here.
The Linux kernel implements a number of important architectural attributes. At a high level, and at lower levels, the kernel is layered into a number of distinct subsystems. Linux can also be considered monolithic because it lumps all of the basic services into the kernel. This differs from a microkernel architecture, where the kernel provides basic services such as communication, I/O, and memory and process management, and more specific services are plugged in to the microkernel layer. Each has its own advantages, but I'll steer clear of that debate.
Over time, the Linux kernel has become efficient in terms of both memory and CPU usage, as well as extremely stable. But the most interesting aspect of Linux, given its size and complexity, is its portability. Linux can be compiled to run on a huge number of processors and platforms with different architectural constraints and needs. One example is the ability of Linux to run on a processor with a memory management unit (MMU), as well as those that provide no MMU. The uClinux port of the Linux kernel provides for non-MMU support. See the resources section for more details.
Major subsystems of the Linux kernel
Now let's look at some of the major components of the Linux kernel, using the following breakdown.
System call interface
The SCI is a thin layer that provides the means to perform function calls from user space into the kernel. As discussed previously, this interface can be architecture dependent, even within the same processor family. The SCI is actually an interesting function-call multiplexing and demultiplexing service. You can find the SCI implementation in ./linux/kernel, as well as architecture-dependent portions in ./linux/arch. More details for this component are available in the resources section.
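The multiplexing and demultiplexing idea can be pictured as a single entry point that receives a call number and looks up the matching handler in a table. The sketch below is a toy model in user-space Python, not kernel code: the call numbers, the handler functions, and the table are all hypothetical, and only the errno value is taken from the running system.

```python
import errno
from typing import Callable, Dict

# Hypothetical call numbers for this sketch; real numbers differ by architecture.
SYS_GETPID = 1
SYS_ADD = 2


def _sys_getpid() -> int:
    return 42  # pretend process ID


def _sys_add(a: int, b: int) -> int:
    return a + b


# Demultiplexing table: call number -> handler inside the toy "kernel".
SYSCALL_TABLE: Dict[int, Callable[..., int]] = {
    SYS_GETPID: _sys_getpid,
    SYS_ADD: _sys_add,
}


def syscall(number: int, *args: int) -> int:
    """Single entry point: every request is multiplexed through this function."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        return -errno.ENOSYS  # unknown call number
    return handler(*args)


print(syscall(SYS_ADD, 2, 3))
```

Many different services share one doorway into the kernel, and the number carried through that doorway selects the service on the other side.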
Process management
Process management is focused on the execution of processes. In the kernel, these are called threads and represent an individual virtualization of the processor (thread code, data, stack, and CPU registers). In user space, the term process is typically used, though the Linux implementation does not separate the two concepts (processes and threads). The kernel provides an application program interface (API) through the SCI to create a new process (fork, exec, or Portable Operating System Interface [POSIX] functions), stop a process (kill, exit), and communicate and synchronize between them (signal, or POSIX mechanisms).
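The create and stop calls listed above can be tried from user space. The following is a minimal sketch, assuming a POSIX system; the helper names run_child and stop_child are invented for the example. The first helper forks a child that exits on its own, and the second stops a child with a signal.

```python
import os
import signal
import time


def run_child(exit_code: int) -> int:
    """Fork a child that exits at once; return the exit status the parent collects."""
    pid = os.fork()
    if pid == 0:
        # Child: leave immediately without running the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def stop_child() -> int:
    """Fork a sleeping child, kill it, and return the signal that ended it."""
    pid = os.fork()
    if pid == 0:
        time.sleep(60)  # child waits to be stopped
        os._exit(0)
    os.kill(pid, signal.SIGKILL)  # SIGKILL cannot be caught or ignored
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status)


print(run_child(7), stop_child())
```

In both cases the parent uses waitpid to synchronize with the child, which is one of the simplest forms of coordination the kernel offers between processes.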
Also in process management is the need to share the CPU between the active threads. The kernel implements a novel scheduling algorithm that operates in constant time, regardless of the number of threads vying for the CPU. This is called the O(1) scheduler, denoting that the same amount of time is taken to schedule one thread as it is to schedule many. The O(1) scheduler also supports multiple processors (called Symmetric MultiProcessing, or SMP). You can find the process management sources in ./linux/kernel and architecture-dependent sources in ./linux/arch. You can learn more about this algorithm in the resources section.
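One way to pick the next thread in constant time is to keep a queue per priority level together with a bitmap that records which levels are non-empty. The sketch below illustrates that idea only and is not the kernel's implementation; the class name, the thread names, and the choice of 140 levels are assumptions made for the example.

```python
from collections import deque


class ConstantTimeRunQueue:
    """Toy run queue: one FIFO per priority plus a bitmap of non-empty levels."""

    def __init__(self, levels: int = 140) -> None:
        self.queues = [deque() for _ in range(levels)]
        self.bitmap = 0  # bit p is set while queues[p] holds a thread

    def enqueue(self, thread: str, priority: int) -> None:
        self.queues[priority].append(thread)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return a thread from the best (lowest-numbered) level, or None."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; its index is the best non-empty priority.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        thread = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << priority)  # level is now empty
        return thread


rq = ConstantTimeRunQueue()
rq.enqueue("editor", 120)
rq.enqueue("audio", 10)
print(rq.pick_next())
```

The work done by pick_next depends on the fixed number of priority levels, not on how many threads are queued, which is the property the O(1) name refers to.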
Memory management
Another important resource that's managed by the kernel is memory. For efficiency, given the way that the hardware manages virtual memory, memory is managed in
what are called pages (4KB in size for most architectures). Linux includes the means to manage the available memory, as well as the hardware mechanisms for physical and virtual mappings.
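With 4KB pages, an address splits cleanly into a page number and an offset inside that page. The arithmetic can be sketched as follows; the fixed 4096-byte page size follows the figure quoted above, and the function names are illustrative.

```python
PAGE_SIZE = 4096  # 4KB, the size quoted for most architectures
PAGE_SHIFT = PAGE_SIZE.bit_length() - 1  # 12 bits of offset within a page


def split_address(address: int):
    """Split an address into (page number, offset inside that page)."""
    return address >> PAGE_SHIFT, address & (PAGE_SIZE - 1)


def pages_needed(num_bytes: int) -> int:
    """Round a byte count up to a whole number of pages."""
    return (num_bytes + PAGE_SIZE - 1) // PAGE_SIZE


page, offset = split_address(0x12345)
print(hex(page), hex(offset), pages_needed(4097))
```

Because the page size is a power of two, both operations reduce to shifts and masks, which is part of why page-based management is efficient.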
But memory management is much more than managing 4KB buffers. Linux provides abstractions over 4KB buffers, such as the slab allocator. This memory management scheme uses 4KB buffers as its base, but then allocates structures from within, keeping track of which pages are full, partially used, and empty. This allows the scheme to dynamically grow and shrink based on the needs of the greater system.
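The bookkeeping described above (full, partially used, and empty pages, with growth and shrinkage on demand) can be modeled in a few lines. This is a toy model under stated assumptions, not the kernel's slab allocator: the class names Slab and SlabCache, the slot numbering, and the policy of filling the fullest partial slab first are all choices made for the example.

```python
PAGE_SIZE = 4096


class Slab:
    """One page carved into equal-sized object slots."""

    def __init__(self, object_size: int) -> None:
        self.capacity = PAGE_SIZE // object_size
        self.free_slots = list(range(self.capacity))

    @property
    def used(self) -> int:
        return self.capacity - len(self.free_slots)


class SlabCache:
    """Toy cache of same-sized objects backed by page-sized slabs."""

    def __init__(self, object_size: int) -> None:
        self.object_size = object_size
        self.slabs = []

    def alloc(self):
        """Return (slab, slot), reusing existing slabs before taking a new page."""
        candidates = [s for s in self.slabs if s.free_slots]
        if candidates:
            slab = max(candidates, key=lambda s: s.used)
        else:
            slab = Slab(self.object_size)  # grow: take a fresh page
            self.slabs.append(slab)
        return slab, slab.free_slots.pop()

    def free(self, slab: Slab, slot: int) -> None:
        slab.free_slots.append(slot)

    def shrink(self) -> int:
        """Release completely empty slabs; return how many pages were given back."""
        before = len(self.slabs)
        self.slabs = [s for s in self.slabs if s.used > 0]
        return before - len(self.slabs)

    def counts(self):
        """Return (full, partial, empty) slab counts."""
        full = sum(1 for s in self.slabs if not s.free_slots)
        empty = sum(1 for s in self.slabs if s.used == 0)
        return full, len(self.slabs) - full - empty, empty


cache = SlabCache(object_size=1024)  # four objects fit in each page
objects = [cache.alloc() for _ in range(5)]
print(cache.counts())
```

Allocating a fifth 1024-byte object forces the cache to take a second page, and freeing that object lets shrink hand the page back.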
Supporting multiple users of memory, there are times when the available memory can be exhausted. For this reason, pages can be moved out of memory and onto the disk. This process is called swapping because the pages are swapped from memory onto the hard disk. You can find the memory management sources in ./linux/mm.
Virtual file system
The virtual file system (VFS) is an interesting aspect of the Linux kernel because it provides a common interface abstraction for file systems. The VFS provides a switching layer between the SCI and the file systems supported by the kernel.
At the top of the VFS is a common API abstraction of functions such as open, close, read, and write. At the bottom of the VFS are the file system abstractions that define how the upper-layer functions are implemented. These are plug-ins for the given file system (of which over 50 exist). You can find the file system sources in ./linux/fs.
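The switching role of the VFS can be pictured as one common API on top and interchangeable plug-ins underneath. The sketch below is a user-space analogy rather than the kernel's data structures; every class name here (FileSystem, MemoryFS, UppercaseFS, VFS) and the mount-prefix lookup are hypothetical.

```python
class FileSystem:
    """Interface that each file system plug-in implements (illustrative)."""

    def read(self, path: str) -> bytes:
        raise NotImplementedError

    def write(self, path: str, data: bytes) -> None:
        raise NotImplementedError


class MemoryFS(FileSystem):
    """Plug-in that keeps file contents in a dictionary."""

    def __init__(self) -> None:
        self.files = {}

    def read(self, path: str) -> bytes:
        return self.files[path]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data


class UppercaseFS(MemoryFS):
    """Second plug-in with different behavior: stores data upper-cased."""

    def write(self, path: str, data: bytes) -> None:
        super().write(path, data.upper())


class VFS:
    """Switching layer: routes each call to the plug-in with the longest mount prefix."""

    def __init__(self) -> None:
        self.mounts = {}

    def mount(self, prefix: str, fs: FileSystem) -> None:
        self.mounts[prefix.rstrip("/") or "/"] = fs

    def _resolve(self, path: str):
        matches = [p for p in self.mounts if path.startswith(p.rstrip("/") + "/")]
        if not matches:
            raise FileNotFoundError(path)
        best = max(matches, key=len)
        return self.mounts[best], path[len(best.rstrip("/")):]

    def read(self, path: str) -> bytes:
        fs, relative = self._resolve(path)
        return fs.read(relative)

    def write(self, path: str, data: bytes) -> None:
        fs, relative = self._resolve(path)
        fs.write(relative, data)


vfs = VFS()
vfs.mount("/", MemoryFS())
vfs.mount("/mnt/loud", UppercaseFS())
vfs.write("/mnt/loud/motd", b"hello")
print(vfs.read("/mnt/loud/motd"))
```

The caller uses the same read and write calls everywhere; which implementation runs is decided by where the path is mounted.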
Below the file system layer is the buffer cache, which provides a common set of functions to the file system layer (independent of any particular file system). This caching layer optimizes access to the physical devices by keeping data around for a short time (or speculatively read ahead so that the data is available when needed). Below the buffer cache are the device drivers, which implement the interface for the particular physical device.
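The benefit of keeping recently used data around can be shown with a small cache placed in front of a slow device. This is a simplified sketch, assuming a least-recently-used eviction policy and a two-block capacity chosen for the example; BlockDevice and BufferCache are hypothetical names, and read-ahead is left out.

```python
from collections import OrderedDict


class BlockDevice:
    """Stand-in for a slow physical device; counts how often it is touched."""

    def __init__(self, blocks) -> None:
        self.blocks = list(blocks)
        self.reads = 0

    def read_block(self, number: int) -> bytes:
        self.reads += 1
        return self.blocks[number]


class BufferCache:
    """Keeps the most recently used blocks in memory."""

    def __init__(self, device: BlockDevice, capacity: int = 2) -> None:
        self.device = device
        self.capacity = capacity
        self.buffers = OrderedDict()  # block number -> data, oldest first

    def read_block(self, number: int) -> bytes:
        if number in self.buffers:
            self.buffers.move_to_end(number)  # mark as most recently used
            return self.buffers[number]
        data = self.device.read_block(number)  # cache miss: go to the device
        self.buffers[number] = data
        if len(self.buffers) > self.capacity:
            self.buffers.popitem(last=False)  # evict the least recently used
        return data


device = BlockDevice([b"a", b"b", b"c"])
cache = BufferCache(device)
cache.read_block(0)
cache.read_block(0)
print(device.reads)
```

The second read of block 0 is served from memory, so the device is touched only once.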
Network stack
The network stack, by design, follows a layered architecture modeled after the protocols themselves. Recall that the Internet Protocol (IP) is the core network layer protocol that sits below the transport protocol (most commonly the Transmission Control Protocol, or TCP). Above TCP is the sockets layer, which is invoked through the SCI.
The sockets layer is the standard API to the networking subsystem and provides a user interface to a variety of networking protocols. From raw frame access to IP protocol data units (PDUs) and up to TCP and the User Datagram Protocol (UDP), the sockets layer provides a standardized way to manage connections and move data between endpoints. You can find the networking sources in the kernel at ./linux/net.
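The standardized send and receive calls of the sockets layer can be tried without any network configuration. The sketch below, assuming a POSIX system, uses a pair of already connected local sockets instead of TCP or UDP so that it runs anywhere; the function name exchange is made up, and the same sendall and recv calls would apply to a TCP connection.

```python
import socket


def exchange(message: bytes) -> bytes:
    """Send a short message across a connected socket pair and return the reply."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(message)             # endpoint A sends a request
        received = right.recv(4096)       # endpoint B receives it
        right.sendall(received.upper())   # endpoint B answers
        return left.recv(4096)            # endpoint A reads the answer


print(exchange(b"ping"))
```

The example keeps messages short; on a stream socket a single recv is not guaranteed to return a whole large message.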
Device drivers
The vast majority of the source code in the Linux kernel exists in device drivers that make a particular hardware device usable. The Linux source tree provides a drivers subdirectory that is further divided by the various devices that are supported, such as Bluetooth, I2C, serial, and so on. You can find the device driver sources in ./linux/drivers.
Architecture-dependent code
While much of Linux is independent of the architecture on which it runs, there are elements that must consider the architecture for normal operation and for efficiency. The ./linux/arch subdirectory defines the architecture-dependent portion of the kernel source contained in a number of subdirectories that are specific to the architecture (collectively forming the BSP). For a typical desktop, the i386 directory is used. Each architecture subdirectory contains a number of other subdirectories that focus on a particular aspect of the kernel, such as boot, kernel, memory management, and others. You can find the architecture-dependent code in ./linux/arch.
Interesting features of the Linux kernel
If the portability and efficiency of the Linux kernel weren't enough, it provides some other features that could not be classified in the previous decomposition.
Linux, being a production operating system and open source, is a great test bed for new protocols and advancements of those protocols. Linux supports a large number of networking protocols, including the typical TCP/IP, and also extensions for high-speed networking (greater than 1 Gigabit Ethernet [GbE] and 10 GbE). Linux also supports protocols such as the Stream Control Transmission Protocol (SCTP), which provides many advanced features above TCP (as a replacement transport level protocol).
Linux is also a dynamic kernel, supporting the addition and removal of software components on the fly. These are called dynamically loadable kernel modules, and they can be inserted at boot when they're needed (when a particular device is found requiring the module) or at any time by the user.
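On a running Linux system, the modules currently loaded are listed in /proc/modules. The sketch below parses text in that style; it assumes each line starts with the fields name, size, reference count, dependent modules, and state, and the sample lines are invented for illustration rather than copied from a real machine.

```python
def parse_modules(text: str):
    """Parse /proc/modules-style lines into a list of dictionaries."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or unexpected lines
        name, size, refcount, users, state = fields[:5]
        modules.append({
            "name": name,
            "size": int(size),
            # Assumption: a non-numeric count means the value is unavailable.
            "refcount": int(refcount) if refcount.isdigit() else None,
            "used_by": [] if users == "-" else [u for u in users.split(",") if u],
            "state": state,
        })
    return modules


# Invented sample in the assumed layout.
SAMPLE = (
    "demo_net 172032 2 demo_nat,demo_filter, Live 0x0000000000000000\n"
    "demo_snd 106496 0 - Live 0x0000000000000000\n"
)

for module in parse_modules(SAMPLE):
    print(module["name"], module["used_by"])
```

To inspect a real system, the same function could be applied to the contents of /proc/modules read as text.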
A recent advancement of Linux is its use as an operating system for other operating systems (called a hypervisor). Recently, a modification to the kernel was made called the Kernel-based Virtual Machine (KVM). This modification enabled a new interface to user space that allows other operating systems to run above the KVM-enabled kernel. In addition to running another instance of Linux, Microsoft® Windows® can
also be virtualized. The only constraint is that the underlying processor must support the new virtualization instructions. See the resources section for more information.
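Whether a processor offers these instructions can usually be read from its feature flags. This sketch scans text in the style of /proc/cpuinfo on x86 Linux for the vmx (Intel) and svm (AMD) flags; the function name and the sample text are made up, and other architectures report their features differently.

```python
def virtualization_flags(cpuinfo_text: str):
    """Return the hardware virtualization flags found in cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Invented sample in the assumed x86 layout.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu vme vmx sse2\n"

print(virtualization_flags(SAMPLE_CPUINFO))
```

An empty result means neither flag was listed, in which case hardware-assisted virtualization is probably unavailable or disabled.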
Going further
This article just scratched the surface of the Linux kernel architecture and its features and capabilities. You can check out the Documentation directory that is provided in every Linux distribution for detailed information about the contents of the kernel.
Resources:
+ The GNU site (http://www.gnu.org/licenses) describes the GNU GPL that covers the Linux kernel and most useful applications provided with it. Also described is a less restrictive form of the GPL called the Lesser GPL (LGPL).
+ UNIX (http://en.wikipedia.org/wiki/Unics), MINIX (http://en.wikipedia.org/wiki/Minix) and Linux (http://en.wikipedia.org/wiki/Linux) are covered in Wikipedia, along with a detailed family tree of the operating systems.
+ The GNU C Library (http://www.gnu.org/software/libc/), or glibc, is the implementation of the standard C library. It's used in the GNU/Linux operating system, as well as the GNU/Hurd (http://directory.fsf.org/hurd.html) microkernel operating system.
+ uClinux (http://www.uclinux.org/) is a port of the Linux kernel that can execute on systems that lack an MMU. This allows the Linux kernel to run on very small embedded platforms, such as the Motorola DragonBall processor used in the PalmPilot Personal Digital Assistants (PDAs).
+ "Kernel command using Linux system calls" (http://www.ibm.com/developerworks/linux/library/l-system-calls/) (developerWorks, March 2007) covers the SCI, which is an important layer in the Linux kernel, with user-space support from glibc that enables function calls between user space and the kernel.
+ "Inside the Linux scheduler" (http://www.ibm.com/developerworks/linux/library/l-scheduler/) (developerWorks, June 2006) explores the new O(1) scheduler introduced in Linux 2.6 that is efficient, scales with a large number of processes (threads), and takes advantage of SMP systems.
+ "Access the Linux kernel using the /proc filesystem" (http://www.ibm.com/developerworks/linux/library/l-proc.html) (developerWorks, March 2006) looks at the /proc file system, which is a virtual file system that provides a novel way for user-space applications to communicate with the kernel. This article demonstrates /proc, as well as loadable kernel modules.
+ "Server clinic: Put virtual filesystems to work" (http://www.ibm.com/developerworks/linux/library/l-sc12.html) (developerWorks, April 2003) delves into the VFS layer that allows Linux to support a variety of different file systems through a common interface. This same interface is also used for other types of devices, such as sockets.
+ "Inside the Linux boot process" (http://www.ibm.com/developerworks/linux/library/l-linuxboot/index.html) (developerWorks, May 2006) examines the Linux boot process, which takes care of bringing up a Linux system and is the same basic process whether you're booting from a hard disk, floppy, USB memory stick, or over the network.
+ "Linux initial RAM disk (initrd) overview" (http://www.ibm.com/developerworks/linux/library/l-initrd.html) (developerWorks, July 2006) inspects the initial RAM disk, which isolates the boot process from the physical medium from which it's booting.
+ "Better networking with SCTP" (http://www.ibm.com/developerworks/linux/library/l-sctp/) (developerWorks, February 2006) covers one of the most interesting networking protocols, Stream Control Transmission Protocol, which operates like TCP but adds a number of useful features such as messaging, multi-homing, and multi-streaming. Linux, like BSD, is a great operating system if you're interested in networking protocols.
+ "Anatomy of the Linux slab allocator" (http://www.ibm.com/developerworks/linux/library/l-linux-slab-allocator/) (developerWorks, May 2007) covers one of the most interesting aspects of memory management in Linux, the slab allocator. This mechanism originated in SunOS, but it's found a friendly home inside the Linux kernel.
+ "Virtual Linux" (http://www.ibm.com/developerworks/linux/library/l-linuxvirt/) (developerWorks, December 2006) shows how Linux can take advantage of processors with virtualization capabilities.
+ "Linux and symmetric multiprocessing" (http://www.ibm.com/developerworks/library/l-linux-smp/) (developerWorks, March 2007) discusses how Linux can also take advantage of processors that offer chip-level multiprocessing.
+ "Discover the Linux Kernel Virtual Machine" (http://www.ibm.com/developerworks/linux/library/l-linux-kvm/) (developerWorks, April 2007) covers the recent introduction of virtualization into the kernel, which turns the Linux kernel into a hypervisor for other virtualized operating systems.
+ Check out Tim's book GNU/Linux Application Programming (http://www.charlesriver.com/Books/BookDetail.aspx?productID=91525) for more information on programming Linux in user space.
+ In the developerWorks Linux zone (http://www.ibm.com/developerworks/linux/), find more resources for Linux developers, including Linux tutorials (http://www.ibm.com/developerworks/views/linux/libraryview.jsp?type_by=Tutorials), as well as our readers' favorite Linux articles and tutorials (http://www.ibm.com/developerworks/linux/library/l-top-10.html) over the last month.
+ Stay current with developerWorks technical events and Webcasts (http://www.ibm.com/developerworks/offers/techbriefings/?S_TACT=105AGX03&S_CMP=art).