Session 4 Bootsyscall
Booting and Kernel Initialization
6/5/14

Booting

Memory is a volatile, limited resource: the OS usually resides on disk.
Most motherboards contain a basic input/output system (BIOS) chip (often flash RAM) that stores instructions for basic hardware initialization and management, and initiates the bootstrap, which loads the OS into memory:
  read the boot program from a known location on secondary storage, typically the first sector(s), often called the master boot record (MBR)
  run the boot program
  read the root file system and locate the file with the OS kernel
  load the kernel into memory
  run the kernel

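The first step above depends on the firmware recognizing a valid boot sector. As a minimal sketch, assuming a classic BIOS-style MBR in a 512-byte sector, the check below tests for the 0x55 0xAA signature in the last two bytes. The function name mbr_has_boot_signature is invented for this illustration.

```c
#include <stddef.h>

#define SECTOR_SIZE 512  /* classic BIOS sector size (assumption of this sketch) */

/* Hypothetical helper: returns 1 if the sector ends with the 0x55 0xAA
 * boot signature that a BIOS expects in a master boot record, else 0. */
int mbr_has_boot_signature(const unsigned char *sector, size_t len)
{
    if (sector == NULL || len < SECTOR_SIZE)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

A caller would read the first 512 bytes of the boot device into a buffer and pass it to this function.
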
Booting
[Figure: numbered boot sequence, steps 1 to 5; labels: boot, OS]

System Lifecycle: Ups & Downs
Power on -> Boot -> Kernel Init -> OS Init -> RUN! -> Shut down -> Power off

Linux environment: Startup

System Memory usage:
[Figure: system memory usage; labels: bzImage, 1 MB, zImage]

Process of Loading
Linux early setup
  IA-32 Kernel Setup
  Video Setup
Linux architecture-specific initialization
  startup_32:
    Set segment registers to known values
    SMP BSP (Bootstrap Processor) check
    Initialize page tables
    Enable paging
    Clear BSS
    32-bit setup
    Copy boot parameters and command line out of the way
    Check CPU type
    Count this processor
    Load descriptor table pointer registers
    Start other processors
Linux architecture-independent initialization
  start_kernel:
    setup_arch
    init thread
    do_basic_setup (part of the init thread)

A detailed code walk (/usr/src/linux)
A detailed look - 2
A detailed look - 3

Run control mechanism
Let's walk through a set of scripts to understand the various run control mechanisms:
  inittab
  rc.sysinit
  init.d
  rcN.d, where N = 0-6
  services

Logging in
Login:
Password:

$

$ pwd
/home/trainee

Note
A Linux system must always be shut down properly. Improper shutdown, such as simply turning off your system, can cause serious damage to your Linux system!

When you are finished using your Linux system, you must shut it down properly, as described in the next section. If you start to boot Linux and then change your mind, you should let the system start up fully and then follow the shutdown procedure.

LILO: LInux LOader
A versatile boot manager that supports:
  Choice of Linux kernels.
  Boot-time kernel parameters.
  Booting non-Linux kernels.
  A variety of configurations.
Characteristics:
  Lives in the MBR or a partition boot sector.
  Has no knowledge of filesystem structure, so it builds a sector "map file" (block map) to find the kernel.

GRUB: Another boot loader
A versatile boot manager that supports:
  All the services that any other boot loader supports
  First boot loader to support >1024 cylinders
  The configuration file is /etc/grub.conf
  Entries can be modified at boot time as command-line (CL) parameters
  /boot/grub/menu.lst can also be used for attaching an extra OS image

Example lilo.conf File
boot=/dev/hda
map=/boot/map
install=/boot/boot.b
prompt
timeout=50
default=linux

image=/boot/vmlinuz-2.2.12-20
    label=linux
    initrd=/boot/initrd-2.2.12-20.img
    read-only
    root=/dev/hda1

/sbin/init
Ancestor of all processes
Controls transitions between "runlevels":
  0: shutdown
  1: single-user
  2: multi-user (no NFS)
  3: full multi-user
  5: X11
  6: reboot
Executes startup/shutdown scripts for each runlevel.
Let's check out the scripts controlling the runlevel under /etc/

Shutdown
Use /sbin/shutdown to avoid data loss and filesystem corruption.
shutdown inhibits login, then asks init to send SIGTERM to all processes, followed by SIGKILL.
Low-level commands: halt, reboot, poweroff.
  Use the -h, -r or -p options to shutdown instead.
Ctrl-Alt-Delete
  Defined by a line in /etc/inittab:
  ca::ctrlaltdel:/sbin/shutdown -t3 -r now

Advanced Boot Concepts
Initial ramdisk (initrd): two-stage boot for flexibility:
  First mount the "initial" ramdisk as root.
  Execute linuxrc to perform additional setup and configuration.
  Finally mount the "real" root and continue.
See Documentation/initrd.txt for details.
Also see "man initrd".

Summary
Bootstrapping a system is a complex, device-dependent process that involves a transition from hardware, to firmware, to software.
Booting within the constraints of the Intel architecture is especially complex and usually involves firmware support (BIOS) and a boot manager (LILO).
/sbin/lilo is a "map installer" that reads configuration information and writes a boot sector and the block map files used during boot.
start_kernel is Linux's "main": it sets up process context before spawning process 0 (idle) and process 1 (init). The init() function performs high-level initialization before executing the user-level init process.

Linux File systems on desktop
Historically, Linux had no file system of its own and formerly used the Minix file system.
It later adopted the second extended file system, formally known as ext2fs.
This has since been enhanced to ext3fs and ext4fs.
Linux 3.x now also includes Btrfs.
  Btrfs is a new copy-on-write filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration.

Linux Tree Hierarchy
/
  /root
  /bin
  /proc
  /usr
    /src
      linux
  /sbin
  /dev
  /home
    /home/ram
    /home/sita
  /opt

Files and timestamps
There are three major timestamps associated with a file:
  Time of last file modification: ls -l
  Time of last access: ls -lu
  Time of last inode modification (kernel inclusive): ls -lc

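The same three timestamps can be read programmatically with stat(). This is a small sketch rather than anything from the original slides; print_timestamps is a name chosen for the example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: print the three timestamps of `path`.
 * Returns 0 on success, -1 if stat() fails. */
int print_timestamps(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    /* ctime() formats a time_t as text and appends a newline. */
    printf("modified      (ls -l):  %s", ctime(&st.st_mtime));
    printf("accessed      (ls -lu): %s", ctime(&st.st_atime));
    printf("inode changed (ls -lc): %s", ctime(&st.st_ctime));
    return 0;
}
```

Calling print_timestamps("/etc/passwd") would print one line per timestamp, matching what the three ls variants display.
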
System Call in Linux
Basics
Common system calls
How is it implemented in Linux
Implementing a system call directly
Other methods

Mode, Space, Context
Mode: hardware-restricted execution state
  restricted access, privileged instructions
  user mode vs. kernel mode
Space: kernel (system) vs. user (process) address space
  requires MMU support (virtual memory)
  "userland": any process address space; there are many user address spaces

User Mode, Process Context
[Figure: grid of mode (user/kernel) against context (process/system); user mode with process context runs in user space ("userland")]

Kernel Mode, Process Context
[Figure: same grid; a trap to the kernel (system calls, exceptions) moves the process into kernel mode, running in kernel space]

Kernel Mode, System Context
[Figure: same grid; interrupts and system tasks run in kernel mode with system context, in kernel space]

User Mode, System Context?
[Figure: same grid; this combination is not allowed]

Interrupts and Exceptions
Interrupts - async device-to-CPU communication
  example: service request, completion notification
  aside: IPI, inter-processor interrupt (another CPU!)
  system may be interrupted in either kernel or user mode
  interrupts are logically unrelated to current processing
Exceptions - sync hardware error notification
  example: divide-by-zero (AU), illegal address (MMU)
  exceptions are caused by current processing
Software interrupts (traps)

System Calls: read
C example:
count = read(fd, buffer, nbyte)

User space (application, read library procedure):
  push parameters on stack (nbytes, buffer, fd)
  call library code
  put system call number in register (X for read)
  call kernel (TRAP)
Kernel space (system call handler, sys_read()):
  kernel examines system call number
  finds requested system call handler
  execute requested operation
Back in user space:
  resume process
  return to library and clean up
  increase instruction pointer
  remove parameters from stack

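To make the "put system call number in register" step visible from C, the sketch below performs the same read through the generic syscall() function, passing the number explicitly. It is an illustration only; raw_read is an invented name, and ordinary programs should keep using read().

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_read */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), read() */

/* Hypothetical helper: same job as read(), but the system call number is
 * passed explicitly, mirroring the "put system call number in register" step. */
ssize_t raw_read(int fd, void *buf, size_t nbyte)
{
    return (ssize_t)syscall(SYS_read, fd, buf, nbyte);
}
```

Here syscall() plays the role of the library code in the steps above: it loads the number and arguments into the right registers, traps into the kernel and hands back the return value.
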
Kernel Entry and Exit
[Figure: paths into and out of the kernel; labels: library code, system call table, exceptions (error traps), page faults, trap/interrupt table, interrupt, boot, scheduler, device dialog, devices]

More than a procedure call
Less than a context switch
Costs:
  Establishing kernel stack
  Validating parameters
  Kernel mapped to user address space?

System Calls vs. Library Calls
man 2 (manual section 2 documents system calls)
historical evolution of the number of calls:
  Unix 6e (~50), Solaris 7 (~250)
  Linux 2.0 (~160), Linux 2.2 (~190), Linux 2.4 (~220)
library call vs. system call possibilities:
  library call never invokes system call
  library call sometimes invokes system call
  library call always invokes system call
  system call not available via library

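The last case can still be reached from C through the generic syscall() function. The sketch below uses gettid as the example, since glibc shipped no wrapper for it for many years (one was added in glibc 2.30). The name raw_gettid is invented here.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: invoke gettid by number instead of via a wrapper. */
pid_t raw_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In a single-threaded program the value returned equals getpid(), because the main thread's ID is the process ID.
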
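# section declaration
.data
msg:
    # .ascii adds no trailing NUL, so len is exactly the text length
    .ascii "Hello, world!\n"
    len = . - msg

.text
.global _start
_start:
    # write our string to stdout
    movl $len,%edx      # third argument: message length
    movl $msg,%ecx      # second argument: pointer to message
    movl $1,%ebx        # first argument: file descriptor (stdout)
    movl $4,%eax        # system call number (sys_write)
    int  $0x80          # call kernel

    # and exit
    movl $0,%ebx        # first argument: exit code
    movl $1,%eax        # system call number (sys_exit)
    int  $0x80          # call kernel
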
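For comparison, roughly the same write can be expressed in C without hard-coding the i386 numbers. In this sketch, SYS_write expands to whatever number the target architecture uses (4 on i386, 1 on x86-64), and syscall() takes care of the register setup. The function name hello_via_syscall is invented for the example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: write the greeting to `fd` with a raw system call.
 * Returns the number of bytes written, or -1 on error. */
long hello_via_syscall(int fd)
{
    static const char msg[] = "Hello, world!\n";
    /* sizeof msg counts the trailing NUL, so subtract 1. */
    return syscall(SYS_write, fd, msg, sizeof msg - 1);
}
```

Calling hello_via_syscall(1) writes the greeting to standard output, as the assembly version does.
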
Tracing Process Signals and Sys Calls
ptrace(): allows a parent process to observe/control a child
  child stops before signal delivery or system call execution
  parent waits for child
  parent can view/modify child state
  possible to "attach" and "reparent" existing processes
  architecture dependent
strace: useful diagnostic application to trace processes
  strace whatever
Solaris uses a more sophisticated /proc mechanism

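The stop-and-wait handshake can be sketched in a few lines of C. The child asks to be traced and stops itself; the parent waits, observes the stop and lets the child continue. This is a minimal illustration under the assumption that ptrace is permitted in the running environment; trace_child_stop is an invented name, and a real tracer such as strace does far more.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that asks to be traced and stops itself.
 * Returns 1 if the parent observed the stop, 0 if not, -1 on fork/wait error. */
int trace_child_stop(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                       /* child */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);                   /* stop so the parent can inspect us */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)     /* parent waits for child */
        return -1;
    int stopped = WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;

    if (stopped) {
        ptrace(PTRACE_CONT, pid, NULL, NULL);   /* let the child run to _exit */
        waitpid(pid, &status, 0);               /* reap it */
    }
    return stopped;
}
```

Between the first waitpid() and PTRACE_CONT, the parent could view or modify the child's state, which is the window that debuggers and strace rely on.
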
Sample strace -r Output
> strace -r sync
0.000000 execve("/bin/sync", ["sync"], [/* 21 vars */]) = 0
0.001002 brk(0) = 0x804a178
0.000192 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
0.000164 open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
0.000133 open("/etc/ld.so.cache", O_RDONLY) = 4
0.000069 fstat(4, {st_mode=S_IFREG|0644, st_size=20404, ...}) = 0
0.000120 old_mmap(NULL, 20404, PROT_READ, MAP_PRIVATE, 4, 0) = 0x40015000
0.000075 close(4) = 0
0.000064 open("/lib/libc.so.6", O_RDONLY) = 4
0.000076 fstat(4, {st_mode=S_IFREG|0755, st_size=4101324, ...}) = 0
0.000096 read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096
0.000192 old_mmap(NULL, 1001564, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4001a000
0.000083 mprotect(0x40107000, 30812, PROT_NONE) = 0
0.000058 old_mmap(0x40107000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0xec000) = 0x40107000
0.000137 old_mmap(0x4010b000, 14428, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4010b000
0.000080 close(4) = 0
0.000102 mprotect(0x4001a000, 970752, PROT_READ|PROT_WRITE) = 0
0.001043 mprotect(0x4001a000, 970752, PROT_READ|PROT_EXEC) = 0
0.000248 munmap(0x40015000, 20404) = 0
0.000077 personality(PER_LINUX) = 0
0.000127 getpid() = 2225
0.000193 brk(0) = 0x804a178
0.000054 brk(0x804a1b0) = 0x804a1b0
0.000097 brk(0x804b000) = 0x804b000
0.000130 sync() = 0
0.015855 _exit(0) = ?

Sample strace -c Output
> strace -c sync
execve("/bin/sync", ["sync"], [/* 21 vars */]) = 0
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97.47    0.008277        8277         1           sync
  0.85    0.000072          24         3         1 open
  0.45    0.000038          38         1           read
  0.40    0.000034           7         5           old_mmap
  0.37    0.000031          10         3           mprotect
  0.19    0.000016          16         1           munmap
  0.11    0.000009           2         4           brk
  0.08    0.000007           4         2           fstat
  0.05    0.000004           2         2           close
  0.02    0.000002           2         1           getpid
  0.02    0.000002           2         1           personality
------ ----------- ----------- --------- --------- ----------------
100.00    0.008492                    24         1 total

provides hardware support for context switching (not used by Linux)
4 protection levels
lots of segments and descriptors
segments and descriptors all have privilege levels

All Linux interrupt handlers are activated by means of interrupt gates and are restricted to kernel mode.
Trap gate
  Cannot be accessed from user level, as DPL=0.
  Used for exception generation.
System gate
  An Intel trap gate that can be accessed by a user's process (DPL=3).
  The four vectors implemented by Linux this way are 3, 4, 5 and 128 (VVI).

System Call Dispatch Table
Broad system call categories:
  files, I/O, devices
  memory, processes
  IPC, time, misc
System call listing:
  include/unistd.h (include/asm-i386/unistd.h)

system_call
arch/i386/kernel/entry.S: ENTRY(system_call)
  SAVE_ALL
  get current task struct
  syscall # not OK? -> badsys
  traced? -> tracesys
  dispatch specific syscall -> *(sys_call_table[call_number])
  save return value
  need to reschedule? -> reschedule
  signal pending? -> signal_return (do_signal)
  RESTORE_ALL
return_from_exception
return_from_intr

The system_call() Function
Saves the syscall number and the CPU registers used by the exception handler on the stack, except those automatically saved by the control unit.
Checks for a valid system call.
Invokes the specific service routine associated with the syscall number (contained in eax):
  call *sys_call_table(0, %eax, 4)
The return code of the system call is stored in eax.

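The validate-then-dispatch logic can be modelled in ordinary user-space C with an array of function pointers. The sketch below is a toy model, not kernel code: the table, the two "service routines" and all toy_ names are invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn)(long a, long b, long c);

/* Two toy "service routines" standing in for real kernel handlers. */
static long toy_add(long a, long b, long c) { (void)c; return a + b; }
static long toy_max(long a, long b, long c) { (void)c; return a > b ? a : b; }

/* Toy dispatch table: the index plays the role of the number in eax. */
static const syscall_fn toy_call_table[] = { toy_add, toy_max };
#define TOY_NR_CALLS (sizeof toy_call_table / sizeof toy_call_table[0])

/* Validate the number, then jump through the table, like
 * call *sys_call_table(0, %eax, 4). Unknown numbers yield -ENOSYS. */
long toy_system_call(unsigned long nr, long a, long b, long c)
{
    if (nr >= TOY_NR_CALLS)
        return -ENOSYS;
    return toy_call_table[nr](a, b, c);
}
```

The scale factor 4 in the assembly operand is the size of one table entry on i386; in C the array indexing does that multiplication implicitly.
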
Parameter Passing
On the 32-bit Intel 80x86:
  6 registers are used to store syscall parameters.
    eax (syscall number).
    ebx, ecx, edx, esi, edi store parameters to the syscall service routine, identified by the syscall number.

System calls
The main interface between the kernel and userspace is the set of system calls.
About 300 system calls provide the main kernel services:
  File and device operations, networking operations, inter-process communication, process management, memory mapping, timers, threads, synchronization primitives, etc.
This interface is stable over time: only new system calls can be added by the kernel developers.
This system call interface is wrapped by the C library, and userspace applications usually never make a system call directly but rather use the corresponding C library function.

Summary
System calls are implemented in a separate source file, which requires updating the makefiles.
/usr/src/linux/arch/ixxx/entry.S has the entry point for system calls.
include/asm-i386/unistd.h contains the table of system call numbers.
kernel/sys.c contains the routines.
int $0x80 is used by Linux on the x86 architecture. Binaries built for some other x86 Unix systems use lcall7 instead.