xv6 Riscv

The document is a comprehensive guide to the xv6 operating system, detailing its interfaces, organization, paging, traps, interrupts, device drivers, and locking mechanisms. It is structured into sections that cover various aspects of operating system design and implementation. The document serves as a resource for understanding the principles and functionalities of xv6, a Unix-like teaching operating system.


xv6 Unix

Russ Cox Frans Kaashoek Robert Morris

February 27, 2024


Contents

1 Operating system interfaces
    1.1
    1.2 I/O
    1.3
    1.4
    1.5
    1.6

2 Operating system organization
    2.1
    2.2
    2.3
    2.4 xv6
    2.5
    2.6 xv6
    2.7
    2.8
    2.9

3 Page tables
    3.1 Paging hardware
    3.2
    3.3
    3.4
    3.5
    3.6
    3.7 sbrk
    3.8
    3.9
    3.10

4 Traps and system calls
    4.1 RISC-V trap
    4.2 trap
    4.3
    4.4
    4.5 trap
    4.6
    4.7
    4.8

5 Interrupts and device drivers
    5.1
    5.2
    5.3
    5.4
    5.5
    5.6

6 Locking
    6.1
    6.2
    6.3
    6.4
    6.5
    6.6
    6.7
    6.8
    6.9
    6.10

7 Scheduling
    7.1
    7.2
    7.3
    7.4 mycpu myproc
    7.5
    7.6
    7.7
    7.8
    7.9
    7.10
    7.11

8 File system
    8.1
    8.2
    8.3
    8.4
    8.5
    8.6
    8.7
    8.8
    8.9
    8.10 inode
    8.11
    8.12
    8.13
    8.14
    8.15
    8.16

9 Concurrency revisited
    9.1
    9.2
    9.3
    9.4
    9.5

10 Summary

Foreword and acknowledgments

xv6
Xv6 Dennis Ritchie Ken Thompson Unix Version 6 (v6) [17] Xv6
v6 RISC-V [15] ANSI C [7]
xv6 John Lions UNIX
[11] https://pdos.csail.mit.edu/6.1810 v6 xv6
xv6
6.828 6.1810
xv6 Adam Belay Austin
Clements Nickolai Zeldovich
Abutalib Aghayev, Sebastian Boehm, brandb97, Anton Burtsev, Raphael
Carvalho, Tej Chajed, Rasit Eskicioglu, Color Fuzzy, Wojciech Gac, Giuseppe, Tao Guo, Haibo
Hao, Naoki Hayama, Chris Henderson, Robert Hilderman, Eden Hochbaum, Wolfgang Keller,
Henry Laih, Jin Li, Austin Liew, Pavan Maddamsetti, Jacek Masiulaniec, Michael McConville,
m3hm00d, miguelgvieira, Mark Morrissey, Muhammed Mourad, Harry Pan, Harry Porter, Siyuan
Qian, Askar Safin, Salman Shah, Huang Sha, Vikram Shenoy, Adeodato Simó, Ruslan Savchenko,
Pawel Szczurko, Warren Toomey, tyfkda, tzerbib, Vanush Vaswani, Xi Wang, and Zou Chang Wei,
Sam Whitlock, LucyShawYang, and Meng Zhou
Frans Kaashoek Robert Morris
(kaashoek,[email protected])

Chapter 1

Operating system interfaces

xv6 Ken
Thompson Dennis Ritchie Unix [17] Unix
Unix
BSD Linux macOS Solaris
Microsoft Windows Unix xv6

1.1 xv6 kernel


process

system call
user space kernel
space
CPU 1

xv6 Unix
1.2 xv6
1
CPU CPU
RISC-V hart CPU

[Diagram: the shell and cat processes run in user space; a system call crosses into the Kernel in kernel space.]

Figure 1.1:

xv6
Unix shell shell

shell shell
shell
Unix shell
xv6 shell Unix Bourne shell
(user/sh.c:1)

1.1
xv6 Xv6
time-share CPU xv6
CPU (PID)

fork fork
fork
fork PID fork
parent child
C [7]
int pid = fork();
if(pid > 0){
  printf("parent: child=%d\n", pid);
  pid = wait((int *) 0);
  printf("child %d is done\n", pid);
} else if(pid == 0){
  printf("child: exiting\n");
  exit(0);
} else {
  printf("fork error\n");
}

exit Exit
0 1 wait

System call                             Description
int fork()                              Create a process, return child’s PID.
int exit(int status)                    Terminate the current process; status reported to wait(). No return.
int wait(int *status)                   Wait for a child to exit; exit status in *status; returns child PID.
int kill(int pid)                       Terminate process PID. Returns 0, or -1 for error.
int getpid()                            Return the current process’s PID.
int sleep(int n)                        Pause for n clock ticks.
int exec(char *file, char *argv[])      Load a file and execute it with arguments; only returns if error.
char *sbrk(int n)                       Grow process’s memory by n bytes. Returns start of new memory.
int open(char *file, int flags)         Open a file; flags indicate read/write; returns an fd (file descriptor).
int write(int fd, char *buf, int n)     Write n bytes from buf to file descriptor fd; returns n.
int read(int fd, char *buf, int n)      Read n bytes into buf; returns number read; or 0 if end of file.
int close(int fd)                       Release open file fd.
int dup(int fd)                         Return a new file descriptor referring to the same file as fd.
int pipe(int p[])                       Create a pipe, put read/write file descriptors in p[0] and p[1].
int chdir(char *dir)                    Change the current directory.
int mkdir(char *dir)                    Create a new directory.
int mknod(char *file, int, int)         Create a device file.
int fstat(int fd, struct stat *st)      Place info about an open file into *st.
int stat(char *file, struct stat *st)   Place info about a named file into *st.
int link(char *file1, char *file2)      Create another name (file2) for the file file1.
int unlink(char *file)                  Remove a file.

Figure 1.2: Xv6 0


-1

PID wait
wait wait -1
0 wait

parent: child=1234
child: exiting
printf
wait
child 1234 is done

wait
pid pid pid
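
The fork example above gives parent and child the same memory contents at first, but each then runs with its own copy. The following is a minimal sketch of that behavior, assuming a POSIX host (it uses waitpid and _exit rather than xv6's wait and exit); the function name parent_value_after_child_write is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's copy of `value` after the child has modified its own
 * copy, or -1 if fork or waitpid fails. Hypothetical helper for illustration. */
int parent_value_after_child_write(void)
{
  int value = 1;
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    value = 99;                      /* changes only the child's copy */
    _exit(value == 99 ? 0 : 1);
  }
  int status;
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  return value;                      /* the parent never sees the child's write */
}
```

Calling parent_value_after_child_write() in the parent yields the value the parent stored before fork, because the child's assignment touched only the child's memory.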
exec

Xv6 ELF 3

exec
ELF exec

char *argv[3];

argv[0] = "echo";
argv[1] = "hello";
argv[2] = 0;
exec("/bin/echo", argv);
printf("exec error\n");

echo hello /bin/echo

xv6 shell shell main (user/sh.c:146)


getcmd fork shell
wait shell echo hello
runcmd echo hello runcmd (user/sh.c:55)
echo hello exec (user/sh.c:79) exec
echo runcmd echo exit main (user/sh.c:146)
wait
fork exec shell I/O
exec
fork
4.6
Xv6 fork exec
malloc )
sbrk(n) n sbrk

1.2 I/O
file descriptor

I/O
xv6
0
1 2
shell I/O shell
(user/sh.c:152)
read write
read(fd , buf , n) n fd buf
read

read
read read
write(fd , buf , n) n buf fd
n read , write
write
cat

char buf[512];
int n;

for(;;){
  n = read(0, buf, sizeof buf);
  if(n == 0)
    break;
  if(n < 0){
    fprintf(2, "read error\n");
    exit(1);
  }
  if(write(1, buf, n) != n){
    fprintf(2, "write error\n");
    exit(1);
  }
}

cat
cat
0 1 cat
close open , pipe dup

fork I/O fork


exec
shell
exec I/O redirection shell
cat < input.txt:
char *argv[2];

argv[0] = "cat";
argv[1] = 0;
if(fork() == 0) {
  close(0);
  open("input.txt", O_RDONLY);
  exec("cat", argv);
}

0 open input.txt
0 cat 0
input.txt
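
The redirection code above relies on open returning the lowest-numbered unused descriptor, so that after close(0) the newly opened file becomes descriptor 0. A small sketch of just that step, assuming a POSIX host; the helper name open_reuses_fd0 is hypothetical, and the child exits where a real shell would call exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if, in a child, open() reuses descriptor 0 right after close(0);
 * 0 if it picked another descriptor; -1 if fork or waitpid fails. */
int open_reuses_fd0(const char *path)
{
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    close(0);                        /* free descriptor 0 */
    int fd = open(path, O_RDONLY);   /* lowest unused descriptor */
    _exit(fd == 0 ? 0 : 1);          /* a shell would exec the command here */
  }
  int status;
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status) == 0;
}
```

Doing the close and open in the child leaves the caller's own standard input untouched, which is the same reason the shell performs redirection between fork and exec.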
xv6 shell I/O (user/sh.c:83)
shell shell runcmd exec
open open
(fcntl) (kernel/fcntl.h:1-5) O_RDONLY , O_WRONLY , O_RDWR , O_CREATE
O_TRUNC open

fork exec shell


shell I/O shell I/O
forkexec I/O shell
forkexec I/O forkexec I/O
cat
I/O
fork

if(fork() == 0) {
  write(1, "hello ", 6);
  exit(0);
} else {
  wait(0);
  write(1, "world\n", 6);
}
1 hello world
write wait write
shell (echo hello ; echo world)
>output.txt
dup I/O
fork hello
world
fd = dup(1);
write(1, "hello ", 6);
write(fd, "world\n", 6);

fork dup open
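
The dup fragment above works because both descriptors share one file offset. The sketch below checks this on a scratch file, assuming a POSIX host with a writable /tmp; the function name and the scratch file name are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Writes "hello " through one descriptor and "world\n" through its dup.
 * Returns 1 if the file then holds exactly "hello world\n", 0 if not,
 * and -1 if the scratch file cannot be set up. */
int dup_shares_offset(void)
{
  char path[] = "/tmp/dupdemoXXXXXX";   /* hypothetical scratch file */
  int fd = mkstemp(path);
  if (fd < 0)
    return -1;
  unlink(path);                         /* the file lives until it is closed */

  int fd2 = dup(fd);
  if (fd2 < 0) {
    close(fd);
    return -1;
  }

  int ok = write(fd, "hello ", 6) == 6 && write(fd2, "world\n", 6) == 6;
  char buf[16];
  ssize_t n = ok ? pread(fd, buf, sizeof buf, 0) : -1;   /* read from offset 0 */
  close(fd2);
  close(fd);
  return n == 12 && memcmp(buf, "hello world\n", 12) == 0;
}
```

If the two descriptors had separate offsets, the second write would land at offset 0 and overwrite "hello " instead of following it.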


dup shell ls existing-file non-existing-file > tmp1
2>&1 2>&1 shell 2 1
tmp1 xv6 shell
I/O

1.3
pipe

wc
int p[2];
char *argv[2];

argv[0] = "wc";
argv[1] = 0;

pipe(p);
if(fork() == 0) {
  close(0);
  dup(p[0]);
  close(p[0]);
  close(p[1]);
  exec("/bin/wc", argv);
} else {
  close(p[0]);
  write(p[1], "hello world\n", 12);
  close(p[1]);
}
pipe p fork
close dup zero
p exec wc wc

read
read 0 read

wc wc wc
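
A read on a pipe returns 0 only once every descriptor for the write end has been closed, which is why the code above closes p[1] before running wc. The sketch below shows the same rule with the roles reversed (the child writes, the parent counts bytes like wc), assuming a POSIX host; count_bytes_through_pipe is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg through a pipe from a child and counts the bytes the parent
 * reads before end-of-file. Returns the count, or -1 on error. */
long count_bytes_through_pipe(const char *msg)
{
  int p[2];
  if (pipe(p) < 0)
    return -1;

  pid_t pid = fork();
  if (pid < 0) {
    close(p[0]);
    close(p[1]);
    return -1;
  }
  if (pid == 0) {
    close(p[0]);
    size_t len = strlen(msg);
    ssize_t w = write(p[1], msg, len);
    close(p[1]);
    _exit(w == (ssize_t)len ? 0 : 1);
  }

  close(p[1]);       /* otherwise read below would never see end-of-file */
  long total = 0;
  char buf[64];
  ssize_t n;
  while ((n = read(p[0], buf, sizeof buf)) > 0)
    total += n;
  close(p[0]);
  waitpid(pid, 0, 0);
  return n < 0 ? -1 : total;
}
```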

xv6 shell grep fork sh.c | wc -l (user/sh.c:101)


fork runcmd
fork runcmd apipe
a | b | c) b c
shell

echo hello world | wc

echo hello world >/tmp/xyz; wc </tmp/xyz

shell
/tmp/xyz

1.4
xv6
root path /a/b/c
c b a/
/ current directory
chdir
chdir("/a");
chdir("b");
open("c", O_RDONLY);

open("/a/b/c", O_RDONLY);
/a/b
mkdir open O_CREATE
mknod
mkdir("/dir");
fd = open("/dir/file", O_CREATE|O_WRONLY);
close(fd);
mknod("/console", 1, 1);
mknod
mknod read
write
inode links
inode
metadata

fstat struct stat


stat.h (kernel/stat.h)
#define T_DIR 1 // Directory
#define T_FILE 2 // File
#define T_DEVICE 3 // Device

struct stat {
  int dev;     // File system’s disk device
  uint ino;    // Inode number
  short type;  // Type of file
  short nlink; // Number of links to file
  uint64 size; // Size of file in bytes
};
link inode
both a b
open("a", O_CREATE|O_WRONLY);
link("a", "b");
a b
a b fstat
ino nlink 2
unlink

unlink("a");
inode b
fd = open("/tmp/xyz", O_CREATE|O_RDWR);
unlink("/tmp/xyz");
fd
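
The open-then-unlink idiom above leaves a file with no name that stays usable through its descriptor until that descriptor is closed. A minimal sketch, assuming a POSIX host with a typical local file system under /tmp; the helper and scratch file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a scratch file, removes its only name, writes one byte through the
 * still-open descriptor, and returns the link count fstat reports.
 * Returns -1 on any error. */
int nlink_after_unlink(void)
{
  char path[] = "/tmp/unlinkdemoXXXXXX";   /* hypothetical scratch file */
  int fd = mkstemp(path);
  if (fd < 0)
    return -1;
  if (unlink(path) < 0) {
    close(fd);
    return -1;
  }
  struct stat st;
  int result = -1;
  if (write(fd, "x", 1) == 1 && fstat(fd, &st) == 0 && st.st_size == 1)
    result = (int)st.st_nlink;     /* no names refer to the inode any more */
  close(fd);                       /* the inode and its data are freed here */
  return result;
}
```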

Unix shell as mkdir, ln, rm

Unix shell shell

cd shell (user/sh.c:161) cd shell


cd shell cd cd
shell
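
cd has to run inside the shell itself because a working directory belongs to one process: a chdir made by a forked child does not change the parent's directory. A sketch of that effect, assuming a POSIX host; the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's working directory is unchanged after a child
 * calls chdir, 0 if it changed, and -1 on error. */
int parent_cwd_survives_child_chdir(void)
{
  char before[4096], after[4096];
  if (getcwd(before, sizeof before) == 0)
    return -1;
  /* Pick a target that differs from the current directory. */
  const char *target = strcmp(before, "/") == 0 ? "/tmp" : "/";

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit(chdir(target) == 0 ? 0 : 1);   /* changes only the child */

  int status;
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
    return -1;
  if (getcwd(after, sizeof after) == 0)
    return -1;
  return strcmp(before, after) == 0;
}
```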

1.5
Unix shell
Unix
shell Unix
BSD Linux macOS
Unix (POSIX) Xv6
POSIX lseek
xv6 UNIX
C xv6 Unix
xv6

POSIX

Unix
Plan 9 [16]
Unix
Multics
Unix
Multics Unix

Xv6 Unix
xv6 root
xv6 Unix Unix

xv6 xv6

1.6
1. UNIX

Chapter 2

Operating system organization

Chapter 1
fork time-share
CPU
isolation

monolithic kernel Unix


xv6 xv6 xv6
Xv6 multi-core 1 RISC-V
RISC-V RISC-V 64 CPU xv6 LP64 C C
long L P 64 int 32
RISC-V RISC-V The
RISC-V Reader: An Open Architecture Atlas [15] ISA [2] [3]

CPU I/O Xv6


qemu -machine virt RAM
ROM /

2.1
1.2

1
CPU CPU
multiprocessor

CPU

Unix open , read , write close

Unix CPU

CPU
Unix exec

exec

Unix

1.2
Unix

2.2

CPU RISC-V CPU machine


mode , supervisor mode user mode CPU
Xv6

CPU privileged instructions


CPU

Chapter 1 1.1
user space
kernel space
kernel
xv6 read
CPU CPU
RISC-V ecall

[Diagram: the shell and a File server run in user space and communicate with Send message; the Microkernel runs in kernel space.]

Figure 2.1:

CPU

2.3

monolithic kernel

microkernel
2.1

shell

Unix Linux
Linux

Minix L4 QNX
L4 seL4 [8]

Unix Xv6 xv6
xv6
xv6

2.4 xv6
xv6 kernel/ sub-directory
2.2 defs.h (kernel/defs.h)

2.5
xv6 Unix process
CPU

address space
CPU
Xv6 RISC-V
a virtual address RISC-V physical address CPU

Xv6 2.3
user memory
malloc
RISC-V 64
39 xv6 39 38 2^38 − 1 = 0x3fffffffff
MAXVA (kernel/riscv.h:363) xv6 trampoline
trapframe Xv6 trampoline
trapframe /
4

xv6 struct proc (kernel/proc.h:85)
p->xxx
proc p->pagetable
thread thread

( p->kstack )

RISC-V ecall

sret

I/O I/O
p->state I/O
p->pagetable RISC-V Xv6
p->pagetable

CPU xv6
CPU

2.6 xv6
xv6

RISC-V
xv6 CPU
xv6 _entry (kernel/entry.S:7) RISC-V

xv6 0x80000000 0x80000000


0x0 0x0:0x80000000 I/O
_entry xv6 C Xv6
stack0 start.c (kernel/start.c:11) _entry sp
stack0+4096 RISC-V _entry
C start (kernel/start.c:21)
start
RISC-V mret
start
supervisor mstatus main

main mepc 0
satp
start
start mret returns mret
main (kernel/main.c:11)
main (kernel/main.c:11)
userinit (kernel/proc.c:233) RISC-V
xv6 initcode.S (user/initcode.S:3) exec
SYS_exec (kernel/syscall.h:8) a7 ecall
syscall a7 SYS_exec sys_exec
Chapter 1 exec /init

exec /init init (user/init.c:15)


0 1 2
shell

2.7

RISC-V
RISC-V

/ /
32 RISC-V

RISC-V CPU RAM

CPU

Linux
[1]

Linux

2.8
xv6
CPU
xv6 Linux clone fork

2.9
1. xv6

File Description
bio.c Disk block cache for the file system.
console.c Connect to the user keyboard and screen.
entry.S Very first boot instructions.
exec.c exec() system call.
file.c File descriptor support.
fs.c File system.
kalloc.c Physical page allocator.
kernelvec.S Handle traps from kernel, and timer interrupts.
log.c File system logging and crash recovery.
main.c Control initialization of other modules during boot.
pipe.c Pipes.
plic.c RISC-V interrupt controller.
printf.c Formatted output to the console.
proc.c Processes and scheduling.
sleeplock.c Locks that yield the CPU.
spinlock.c Locks that don’t yield the CPU.
start.c Early machine-mode boot code.
string.c C string and byte-array library.
swtch.S Thread switching.
syscall.c Dispatch system calls to handling function.
sysfile.c File-related system calls.
sysproc.c Process-related system calls.
trampoline.S Assembly code to switch between user and kernel.
trap.c C code to handle and return from traps and interrupts.
uart.c Serial-port console device driver.
virtio_disk.c Disk device driver.
vm.c Manage page tables and address spaces.

Figure 2.2: Xv6

[Diagram: a process's virtual address space, from MAXVA down to 0: trampoline, trapframe, heap, user stack, user text and data.]

Figure 2.3:

Chapter 3

Page tables

xv6

Xv6
trampoline
RISC-V xv6

3.1 Paging hardware


RISC-V RAM
RISC-V

xv6 Sv39 RISC-V 64 39 25


Sv39 RISC-V 2^27 (134,217,728) page table entries
(PTEs) PTE 44 (PPN) 39
27 PTE 56 44
PTE PPN 12 3.1
PTE 3.2
4096 (2^12) page
Sv39 RISC-V 25 PTE
10 RISC-V
2^39 512 GB RISC-V
2^56 I/O DRAM
RISC-V 48 Sv48 [3]
3.2 RISC-V CPU
4096 512 PTE
512 PTE
27 9 PTE 9 PTE
9 PTE Sv48 RISC-V 39 47
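
The Sv39 split described above (three 9-bit page-table indices above a 12-bit page offset, with a 44-bit PPN stored above 10 flag bits in each PTE) can be written as a few shift-and-mask macros. This is a sketch modeled on the macros in kernel/riscv.h; the helper function px is a hypothetical wrapper added for illustration.

```c
#include <stdint.h>

#define PGSIZE   4096
#define PGSHIFT  12                      /* bits of offset within a page */
#define PXMASK   0x1FF                   /* 9 bits per level */
#define PXSHIFT(level) (PGSHIFT + 9 * (level))
#define PX(level, va)  ((((uint64_t)(va)) >> PXSHIFT(level)) & PXMASK)

/* A PTE keeps the physical page number above its 10 flag bits. */
#define PA2PTE(pa)  ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) ((((uint64_t)(pte)) >> 10) << 12)

/* One past the highest virtual address xv6 uses: 2^38. */
#define MAXVA (1ULL << (9 + 9 + 9 + 12 - 1))

/* Hypothetical wrapper: index into the page directory at `level` (2, 1, 0). */
unsigned px(int level, uint64_t va)
{
  return (unsigned)PX(level, va);
}
```

For the page just below MAXVA, px(2, MAXVA - PGSIZE) is 255 while px(1, ...) and px(0, ...) are both 511, because xv6 uses only 38 of the 39 address bits.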

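[Diagram: a 64-bit virtual address splits into 25 EXT bits, a 27-bit index, and a 12-bit offset. The index selects one of 2^27 PTEs in the page table; each PTE holds a 44-bit PPN and 10 flag bits. The 56-bit physical address is the 44-bit PPN followed by the 12-bit offset.]
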
Figure 3.1: RISC-V

PTE page-fault
exception 4
3.1 3.2 PTE

0 1 511 511
511
511 511 × 512

CPU
CPU PTE /
PTE RISC-V CPU Translation
Look-aside Buffer (TLB)
PTE
PTE_V PTE
PTE_R PTE_W PTE_X CPU
PTE_U
PTE_U PTE 3.2
(kernel/riscv.h)
CPU satp CPU
satp CPU
satp CPU

PTE PTE
DRAM

[Diagram: three-level address translation. The virtual address splits into EXT, L2, L1 and L0 (9 bits each) and a 12-bit offset. satp points to the top-level page directory; each page directory holds 512 PTEs (0 to 511), each with a 44-bit PPN and 10 flag bits. The physical address is the final 44-bit PPN followed by the 12-bit offset.]

PTE layout, bits 63 down to 0: Reserved (63-54), Physical Page Number (53-10), RSW (9-8), D (7), A (6), G (5), U (4), X (3), W (2), R (1), V (0)

V - Valid
R - Readable
W - Writable
X - Executable
U - User
G - Global
A - Accessed
D - Dirty (0 in page directory)
RSW - Reserved for supervisor software

Figure 3.2: RISC-V

DRAM

3.2
Xv6

3.3
(kernel/memlayout.h) xv6
QEMU RAM RAM 0x80000000
0x88000000 xv6 PHYSTOP QEMU I/O
QEMU memory-mapped
0x80000000 /
RAM Chapter 4 xv6

RAM
KERNBASE=0x80000000
fork
fork

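[Diagram: the kernel's virtual addresses (left, up to MAXVA) mapped onto physical addresses (right, up to 2^56-1).]

Virtual addresses, from MAXVA down:
    Trampoline                  R-X
    Guard page                  ---
    Kstack 0                    RW-
    Guard page                  ---
    Kstack 1                    RW-
    ...
    PHYSTOP (0x88000000)
    Free memory                 RW-
    Kernel data                 RW-
    Kernel text                 R-X
    KERNBASE (0x80000000)
    0x10001000  VIRTIO disk     RW-
    0x10000000  UART0           RW-
    0x0C000000  PLIC            RW-
    0

Physical addresses, from 2^56-1 down:
    Unused
    PHYSTOP (0x88000000)
    Physical memory (RAM)
    KERNBASE (0x80000000)
    Unused and other I/O devices
    0x10001000  VIRTIO disk
    0x10000000  UART0
    0x0C000000  PLIC
    0x02000000  CLINT
    Unused
    0x1000      boot ROM
    Unused
    0
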
Figure 3.3: xv6 RWX PTE xv6


RISC-V

• trampoline Chapter 4
trampoline
trampoline


xv6 guard page PTE PTE_V
Panic
Panic

trampoline PTE_R PTE_X
PTE_R PTE_W

3.3
xv6 vm.c (kernel/vm.c:1)
pagetable_t RISC-V pagetable_t
walk PTE
mappages PTE kvm uvm
copyout copyin
vm.c

main kvminit (kernel/vm.c:54)


kvmmake (kernel/vm.c:20) xv6 RISC-V
kvmmake kvmmap
PHYSTOP
proc_mapstacks (kernel/proc.c:33)
kvmmap KSTACK
kvmmap (kernel/vm.c:132) mappages (kernel/vm.c:143)

mappages walk PTE


PTE PTE_W PTE_X / PTE_R
PTE_V PTE (kernel/vm.c:158)
walk (kernel/vm.c:86) RISC-V PTE 3.2
walk 3 9 9
PTE (kernel/vm.c:92) PTE alloc
walk PTE PTE
(kernel/vm.c:102)
walk
PTE (kernel/vm.c:94)
PTE (kernel/vm.c:92)
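
The walk described above descends the three-level table 9 bits at a time, optionally allocating missing page-directory pages, and returns the address of the level-0 PTE. The following user-space sketch follows that shape under stated assumptions: host pointers stand in for physical addresses, posix_memalign stands in for kalloc, and new_table, translate and demo_translate are hypothetical helpers. It never frees the tables it allocates.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096
#define PTE_V (1ULL << 0)
#define PTE_R (1ULL << 1)
#define PTE_W (1ULL << 2)
#define PX(level, va) ((((uint64_t)(va)) >> (12 + 9 * (level))) & 0x1FF)
#define PA2PTE(pa)    ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte)   ((((uint64_t)(pte)) >> 10) << 12)

typedef uint64_t pte_t;
typedef pte_t *pagetable_t;

/* Allocate one zeroed, page-aligned table (stands in for kalloc). */
static pagetable_t new_table(void)
{
  void *p = 0;
  if (posix_memalign(&p, PGSIZE, PGSIZE) != 0)
    return 0;
  memset(p, 0, PGSIZE);
  return p;
}

/* Return the level-0 PTE for va, creating lower tables if alloc != 0. */
pte_t *walk(pagetable_t pt, uint64_t va, int alloc)
{
  for (int level = 2; level > 0; level--) {
    pte_t *pte = &pt[PX(level, va)];
    if (*pte & PTE_V) {
      pt = (pagetable_t)(uintptr_t)PTE2PA(*pte);
    } else {
      if (!alloc || (pt = new_table()) == 0)
        return 0;
      *pte = PA2PTE((uintptr_t)pt) | PTE_V;
    }
  }
  return &pt[PX(0, va)];
}

/* Translate va, or return 0 if it is not mapped. */
uint64_t translate(pagetable_t pt, uint64_t va)
{
  pte_t *pte = walk(pt, va, 0);
  if (pte == 0 || (*pte & PTE_V) == 0)
    return 0;
  return PTE2PA(*pte) | (va & (PGSIZE - 1));
}

/* Map virtual page 0x1000 to physical page 0x80001000, then translate va. */
uint64_t demo_translate(uint64_t va)
{
  pagetable_t root = new_table();
  if (root == 0)
    return 0;
  pte_t *pte = walk(root, 0x1000, 1);
  if (pte == 0)
    return 0;
  *pte = PA2PTE(0x80001000ULL) | PTE_R | PTE_W | PTE_V;
  return translate(root, va);
}
```

With that single mapping installed, an address inside the mapped page keeps its 12-bit offset, and any other address translates to 0.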
main kvminithart (kernel/vm.c:62)
satp CPU

RISC-V CPU Translation Look-aside Buffer (TLB) xv6


CPU TLB
TLB
RISC-V sfence.vma

CPU TLB Xv6 kvminithart sfence.vma satp
trampoline (kernel/trampoline.S:89)
sfence.vma satp

TLB RISC-V CPU (ASID) [3]


TLB Xv6

3.4

xv6 PHYSTOP 4096

3.5
kalloc.c (kernel/kalloc.c:1)
struct run (kernel/kalloc.c:17)
run
(kernel/kalloc.c:21-24)
acquire release ;
6
main kinit (kernel/kalloc.c:27) kinit
PHYSTOP Xv6
xv6 128 MB RAM kinit freerange
kfree PTE 4096 4096
freerange PGROUNDUP
kfree
freerange
run
C

kfree (kernel/kalloc.c:47) 1

kfree
pa struct run r->next r
kalloc
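
The allocator above keeps its free list inside the free pages themselves: each free page begins with a struct run that points to the next free page. The sketch below reproduces that idea over a small static arena. NPAGES, the arena, and the two demo functions are hypothetical, the fill values 1 and 5 are only junk patterns, and the spinlock that protects the list in the kernel is omitted.

```c
#include <string.h>

#define PGSIZE 4096
#define NPAGES 4     /* hypothetical arena size; the kernel frees all RAM up to PHYSTOP */

struct run {
  struct run *next;
};

/* Each element is one page, suitably aligned to hold a struct run. */
static union page {
  struct run run;
  unsigned char bytes[PGSIZE];
} arena[NPAGES];

static struct run *freelist;

/* Put a page on the front of the free list. */
void kfree(void *pa)
{
  memset(pa, 1, PGSIZE);      /* fill with junk to expose dangling references */
  struct run *r = pa;
  r->next = freelist;
  freelist = r;
}

/* Take the first page off the free list, or return 0 if none is left. */
void *kalloc(void)
{
  struct run *r = freelist;
  if (r) {
    freelist = r->next;
    memset(r, 5, PGSIZE);     /* fill with junk */
  }
  return r;
}

/* Build the free list by freeing every page in the arena. */
void kinit(void)
{
  freelist = 0;
  for (int i = 0; i < NPAGES; i++)
    kfree(&arena[i]);
}

/* Demo: how many pages can be allocated after kinit. */
int count_allocatable(void)
{
  kinit();
  int n = 0;
  while (kalloc())
    n++;
  return n;
}

/* Demo: a freed page is the next one handed out (the list is LIFO). */
int freed_page_is_reused(void)
{
  kinit();
  void *a = kalloc();
  kfree(a);
  return kalloc() == a;
}
```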

3.6
xv6 3.4
2.3 0
MAXVA (kernel/riscv.h:360) 256 GB
xv6 PTE_R PTE_X
PTE_U Xv6
PTE_R PTE_W PTE_U
PTE_W

0 xv6
PTE_W 0
4.6

PTE_X

Web
[14]

exec
main
main(argc argv)
xv6 PTE_U

xv6 xv6 Xv6 kalloc


PTE Xv6 PTE_W , PTE_R
, PTE_U PTE PTE_V xv6
PTE_V PTE

trampoline PTE_U

3.7 sbrk
sbrk growproc (kernel/proc.c:260) growproc uvmalloc uvmdealloc n
uvmalloc (kernel/vm.c:226) kalloc mappages PTE

[Diagram: a process's user address space with its initial stack. From MAXVA down to 0: trampoline (RX---), trapframe (R-W--), unused, heap (R-WU-), stack (R-WU-, PAGESIZE), guard page, data (R-WU-), unused (page aligned), text (R-XU). Initial stack contents, from the top: argument 0 ... argument N (nul-terminated strings), 0 (argv[argc]), address of argument N ... address of argument 0 (argv[0]), address of address of argument 0 (argv argument of main), argc (argc argument of main), 0xFFFFFFF (return PC for main), (empty).]

Figure 3.4:

uvmdealloc uvmunmap (kernel/vm.c:171) walk PTE


kfree
Xv6
uvmunmap

3.8
exec

exec (kernel/exec.c:23) namei path (kernel/exec.c:36)


Chapter 8 ELF Xv6
ELF format (kernel/elf.h) ELF ELF struct elfhdr
(kernel/elf.h:6) struct proghdr (kernel/elf.h:25) proghdr
xv6

ELF ELF
0x7F , ‘E’ , ‘L’ , ‘F’ ELF_MAGIC (kernel/elf.h:3) ELF header
exec
exec proc_pagetable (kernel/exec.c:49) ELF
uvmalloc (kernel/exec.c:65) loadseg (kernel/exec.c:10)
loadseg walkaddr ELF
readi

/init exec

# objdump -p user/_init

user/_init: file format elf64-little

Program Header:
0x70000003 off 0x0000000000006bb0 vaddr 0x0000000000000000
paddr 0x0000000000000000 align 2**0
filesz 0x000000000000004a memsz 0x0000000000000000 flags r--
LOAD off 0x0000000000001000 vaddr 0x0000000000000000
paddr 0x0000000000000000 align 2**12
filesz 0x0000000000001000 memsz 0x0000000000001000 flags r-x
LOAD off 0x0000000000002000 vaddr 0x0000000000001000
paddr 0x0000000000001000 align 2**12
filesz 0x0000000000000010 memsz 0x0000000000000030 flags rw-
STACK off 0x0000000000000000 vaddr 0x0000000000000000
paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-

0x1000 0x0
0x1000

filesz memsz C
/init filesz 0x10 memsz 0x30
uvmalloc 0x30 0x10
/init
exec exec
ustack
argv main ustack argc
argv
exec
exec copyout
(kernel/vm.c:352) exec
-1
exec
bad -1 exec
-1 exec
exec (kernel/exec.c:125) (kernel/exec.c:129)
exec ELF ELF
ELF exec ELF

Xv6
if(ph.vaddr + ph.memsz < ph.vaddr) 64
ELF ph.vaddr ph.memsz
0x1000 xv6
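
The check if(ph.vaddr + ph.memsz < ph.vaddr) above catches a sum that wraps around in 64-bit unsigned arithmetic, where a large memsz could otherwise make the end of a segment land at a small address such as 0x1000. A sketch of the same test in isolation; segment_wraps is a hypothetical name.

```c
#include <stdint.h>

/* Returns 1 if vaddr + memsz overflows 64 bits, in which case the wrapped
 * sum is smaller than vaddr; returns 0 otherwise. */
int segment_wraps(uint64_t vaddr, uint64_t memsz)
{
  return vaddr + memsz < vaddr;
}
```

For example, vaddr 0xFFFFFFFFFFFFF000 with memsz 0x2000 wraps to 0x1000, so the check fires, while the /init data segment shown earlier (vaddr 0x1000, memsz 0x30) passes.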

ELF xv6 RISC-V
loadseg

xv6
xv6

3.9
xv6
xv6 4
Xv6
0x80000000 RAM QEMU
RAM
0x80000000 RAM xv6

RISC-V xv6
RISC-V

8 KB 4 MB
RAM
xv6 malloc

xv6 4096

[9]

3.10
1. RISC-V
2. sbrk(1)
sbrk sbrk
PTE
3. xv6
4. Unix exec shell
#! exec
myprog arg1 myprog #!/interp exec
/interp /interp myprog arg1 xv6
5.

Chapter 4

Traps and system calls

CPU
ecall
exception
interrupt

trap trap
trap

trap

trap
xv6 trap trap
trap
xv6

Xv6 trap RISC-V CPU C


trap C
trap trap
trap
trap trap C handler
C vector

4.1 RISC-V trap


RISC-V CPU CPU
trap trap RISC-V [3]
riscv.h (kernel/riscv.h:1) xv6
• stvec trap RISC-V stvec
trap

• sepc trap RISC-V pc stvec
sret trap sepc pc
sepc sret

• scause RISC-V trap

• sscratch trap scratch

• sstatus sstatus SIE SIE RISC-V SIE SPP trap
sret

trap
trap xv6

CPU
CPU trap
trap RISC-V trap

1. trap sstatus SIE

2. sstatus SIE

3. pc sepc

4. sstatus SPP

5. scause trap

6.

7. stvec pc

8. pc

CPU pc
CPU trap
trap
trap
CPU
trap
/ satp
CPU stvec

4.2 trap
Xv6 trap trap
trap 4.5 trap
ecall
trap trap uservec (kernel/trampoline.S:21) usertrap (kernel/trap.c:37) usertrapret (kernel/trap.c:90)
userret (kernel/trampoline.S:101)
xv6 trap RISC-V trap
stvec trap trap
xv6 trap
stvec
Xv6 trampoline trampoline uservec stvec
xv6 trap trampoline TRAMPOLINE
trampoline
TRAMPOLINE 2.3 3.3 trampoline
PTE_U trap trampoline
trap
uservec trap trampoline.S (kernel/trampoline.S:21)
uservec 32 32
trap
RISC-V sscratch
uservec csrw a0 sscratch
uservec a0
uservec 32
trapframe 32 (kernel/proc.h:43) satp
uservec trapframe xv6
trapframe at trapframe trapframe
TRAMPOLINE p->trapframe trap

uservec trapframe a0
sscratch a0
trapframe CPU hartid usertrap
uservec satp
usertrap
usertrap trap (kernel/trap.c:37)
stvec trap kernelvec uservec sepc
usertrap yield
sepc trap
usertrap syscall devintr
4
RISC-V ecall
usertrap CPU

trap
usertrapret (kernel/trap.c:90) RISC-V
trap stvec uservec
trap uservec sepc
usertrap trampoline userret
userret
usertrap userret a0 (kernel/trampoline.S:101) userret satp
trampoline trapframe
trampoline userret satp
userret trap userret
trapframe a0 a0 trapframe
a0 sret

4.3
2 exec
initcode.S (user/initcode.S:11)
exec
initcode.S exec a0 a1 a7
syscall (kernel/syscall.c:107) ecall
uservec , usertrap syscall
syscall (kernel/syscall.c:132) trapframe a7
a7 SYS_exec (kernel/syscall.h:8) ,
sys_exec
sys_exec syscall p->trapframe->a0
exec() RISC-V C calling a0

syscall −1
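
The dispatch above takes the system call number from the saved a7, uses it as an index into a table of function pointers, stores the result in the saved a0, and returns −1 for an unknown number. A reduced sketch of that pattern follows. The two-field trapframe, the call numbers, the stub handlers, and the names dispatch_syscall and do_syscall are all hypothetical.

```c
#include <stdint.h>

/* Only the two saved registers this sketch needs. */
struct trapframe {
  uint64_t a0;   /* first argument on entry, return value on exit */
  uint64_t a7;   /* system call number */
};

#define SYS_getpid 1   /* hypothetical numbers for the sketch */
#define SYS_uptime 2
#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

static uint64_t sys_getpid(void) { return 42; }     /* stub handlers */
static uint64_t sys_uptime(void) { return 1000; }

/* Table indexed by system call number; unlisted slots are null. */
static uint64_t (*syscalls[])(void) = {
  [SYS_getpid] = sys_getpid,
  [SYS_uptime] = sys_uptime,
};

/* Look up the handler for tf->a7 and leave its result in tf->a0. */
void dispatch_syscall(struct trapframe *tf)
{
  uint64_t num = tf->a7;
  if (num > 0 && num < NELEM(syscalls) && syscalls[num])
    tf->a0 = syscalls[num]();
  else
    tf->a0 = (uint64_t)-1;     /* unknown system call */
}

/* Demo wrapper: run one call number through the dispatcher. */
uint64_t do_syscall(uint64_t num)
{
  struct trapframe tf = { 0, num };
  dispatch_syscall(&tf);
  return tf.a0;
}
```

Writing the result into the saved a0 matters because the return path restores user registers from the trapframe, so that is where user code finds the return value.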

4.4

RISC-V C kerneltrap
trap argint
, argaddr argfd trap n
argraw (kernel/syscall.c:34)

exec

xv6

fetchstr
(kernel/syscall.c:25) exec fetchstr
fetchstr copyinstr
copyinstr (kernel/vm.c:403) max dst pagetable
srcva pagetable copyinstr walkaddr
walk srcva pagetable pa0 RAM
copyinstr pa0
dst walkaddr (kernel/vm.c:109)
copyin

4.5 trap
Xv6 CPU trap
CPU stvec kernelvec (kernel/kernelvec.S:12)
xv6 kernelvec satp
kernelvec 32

kernelvec
trap
trap
kernelvec kerneltrap (kernel/trap.c:135) kerneltrap
trap devintr (kernel/trap.c:178)
trap xv6
panic
kerneltrap
kerneltrap yield
kerneltrap Chapter 7
yield
kerneltrap trap yield
sepc sstatus kerneltrap
kernelvec (kernel/kernelvec.S:50) kernelvec
sret sepc pc

kerneltrap yield
CPU Xv6 CPU stvec kernelvec
usertrap (kernel/trap.c:29)
stvec uservec
RISC-V trap xv6 stvec

4.6
Xv6
panic
copy-on-write (COW) fork
fork xv6 fork Chapter 3 fork
fork Xv6 fork uvmcopy (kernel/vm.c:306)

CPU
page-fault exception PTE_V
PTE_R PTE_W , PTE_X , PTE_U
RISC-V

scause stval

COW fork
PTE_W
RISC-V CPU trap

PTE
PTE
fork

fork fork
fork exec
fork exec
fork COW fork

COW
lazy allocation
sbrk
PTE
COW fork

sbrk 1 GB
262,144 4096
/

demand paging exec xv6

shell

PTE
COW fork

RAM
paging to disk RAM
paging area RAM
PTE paged out
paged in kerneltrap RAM
RAM PTE RAM
RAM
evicting
PTE
RAM

RAM free

sbrk exec

4.7
trampoline trap RISC-V trap
trap kerneltrap
trap

RISC-V
scratch PTE_U
Xv6 trampoline trapframe RISC-V
PTE
trampoline

Xv6

fork

8.2 Xv6 naïve


xv6 xv6

4.8
1. copyin copyinstr
copyin copyinstr memcpy

2.

3. COW fork

4. trapframe
uservec 32
proc

5. xv6 TRAMPOLINE

Chapter 5

Interrupts and device drivers

Driver
I/O

xv6
devintr (kernel/trap.c:178)
top half
bottom half read write
I/O

5.1
(kernel/console.c)
RISC-V UART
control-u shell read
QEMU xv6 QEMU
UART xv6
UART QEMU 16550 chip [13]
16550 RS232 QEMU

UART memory-mapped RISC-V


UART RAM
UART 0x10000000 UART0 (kernel/memlayout.h:21)
UART UART0
(kernel/uart.c:22) LSR

RHR UART
FIFO FIFO LSR UART
THR UART
Xv6 main consoleinit
sleep (kernel/console.c:96) 7 sleep
UART RISC-V xv6
devintr (kernel/trap.c:178) RISC-V scause
PLIC [3]
(kernel/trap.c:187) UART devintr uartintr
uartintr (kernel/uart.c:176) UART
consoleintr (kernel/console.c:136) ;
consoleintr cons.buf consoleintr
consoleintr consoleread

consoleread cons.buf

5.2
write uartputc (kernel/uart.c:87)
uart_tx_buf UART
uartputc uartstart
uartputc
UART uartintr uartstart

uartputc uartstart
uartstart uartintr

I/O
UART
I/O concurrency

5.3
consoleread consoleintr acquire

CPU consoleread CPU


UART CPU consoleread
consoleread CPU
6

copyout

5.4
Xv6 usertrap
kerneltrap yield RISC-V
CPU Xv6 CPU
RISC-V RISC-V
xv6
xv6 handlertimer
start.c main (kernel/start.c:63) CLINT

CLINT start mtvec timervec

RISC-V RISC-V

devintr (kernel/trap.c:205)
timervec (kernel/kernelvec.S:95) start
CLINT RISC-V
C

5.5
Xv6
yield

CPU
CPU xv6 6.6

UART UART
programmed I/O I/O

direct memory access (DMA) DMA
RAM RAM
DMA DMA RAM

CPU

polling
CPU

UART

DMA
1 read
write
/ Unix
ioctl

Xv6
xv6
xv6

5.6
1. uart.c console.c

2.

Chapter 6

Locking

xv6
CPU xv6 RISC-V CPU RAM xv6
CPU CPU
CPU CPU

CPU

concurrency
CPU kalloc

concurrency control
Xv6
lock CPU

CPU

xv6 xv6

6.1
CPU
wait wait CPU kfree
kalloc() (kernel/kalloc.c:69)
kfree() (kernel/kalloc.c:47)
kfree xv6
kfree
6.1 CPU

[Diagram: two CPUs connected by a BUS to shared Memory that holds list; each CPU executes l->next = list.]
Figure 6.1: SMP

CPU

push
1 struct element {
2 int data;
3 struct element *next;
4 };
5
6 struct element *list = 0;
7
8 void
9 push(int data)
10 {
11 struct element *l;
12
13 l = malloc(sizeof *l);
14 l->data = data;
15 l->next = list;
16 list = l;
17 }

CPU push line 15 6.1 line 16


6.2 twolist next list
list 16

16 race

CPU

push

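[Diagram: a timeline (Time axis) in which CPU 1 and CPU2 each execute lines 15 and 16 of push, both storing list into l->next in Memory before either updates list.]
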
Figure 6.2:

mutual exclusion CPU


push

6 struct element *list = 0;
7 struct lock listlock;
8
9 void
10 push(int data)
11 {
12   struct element *l;
13   l = malloc(sizeof *l);
14   l->data = data;
15
16   acquire(&listlock);
17   l->next = list;
18   list = l;
19   release(&listlock);
20 }

acquire release critical section


list

list next
push 17 l list
l 18
CPU

CPU
CPU
serializing

kfree CPU
conflict contention
Xv6
CPU
CPU CPU
CPU
acquire push 13
malloc
acquire release

6.2
Xv6 Xv6
struct spinlock (kernel/spinlock.h:2) locked
xv6
21 void
22 acquire(struct spinlock *lk) // does not work!
23 {
24 for(;;) {
25 if(lk->locked == 0) {
26 lk->locked = 1;
27 break;
28 }
29 }
30 }

CPU 25
lk->locked line 26 CPU
25 26 atomic

25 26 RISC-V
amoswap r, a amoswap a r
r
CPU

Xv6 acquire (kernel/spinlock.c:22) C __sync_lock_test_and_set
amoswap lk->locked acquire
lk->locked
lk->locked
1 1 CPU 1
lk->locked
acquire CPU lk->cpu

release (kernel/spinlock.c:47) acquire lk->cpu


lk->locked C
C release C
__sync_lock_release RISC-V amoswap
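
The broken acquire shown earlier fails because its test of lk->locked and its store to it are separate steps; an atomic swap makes them one. The sketch below uses the GCC builtins named above, __sync_lock_test_and_set and __sync_lock_release, on a reduced struct spinlock. It is an illustration only: it omits interrupt disabling, the explicit memory barriers, and the lk->cpu bookkeeping, and try_acquire and spinlock_demo are hypothetical helpers.

```c
/* Reduced lock: only the word that the atomic swap operates on. */
struct spinlock {
  volatile unsigned int locked;   /* 0 = free, 1 = held */
};

/* Swap 1 into the lock word once; the lock is ours if the old value was 0. */
int try_acquire(struct spinlock *lk)
{
  return __sync_lock_test_and_set(&lk->locked, 1) == 0;
}

/* Spin until the swap reports that the lock was free. */
void acquire(struct spinlock *lk)
{
  while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
    ;
}

/* Atomically store 0 to the lock word. */
void release(struct spinlock *lk)
{
  __sync_lock_release(&lk->locked);
}

/* Single-threaded demo of the swap's return values. */
int spinlock_demo(void)
{
  struct spinlock lk = { 0 };
  int first = try_acquire(&lk);    /* lock was free: succeeds */
  int second = try_acquire(&lk);   /* already held: fails, word stays 1 */
  release(&lk);
  acquire(&lk);                    /* free again, so this returns at once */
  int held = lk.locked;
  release(&lk);
  return first == 1 && second == 0 && held == 1 && lk.locked == 0;
}
```

A failed swap writes 1 over a 1, so repeated attempts by a waiting CPU leave the lock word unchanged.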

6.3
Xv6 kalloc (kernel/kalloc.c:69) kfree (kernel/kalloc.c:47) 1 2

Xv6

CPU CPU

wait

CPU
CPU
xv6 kalloc.c
CPU acquire
CPU
CPU

xv6

xv6 xv6
6.3 xv6

Lock               Description
bcache.lock        Protects allocation of block buffer cache entries
cons.lock          Serializes access to console hardware, avoids intermixed output
ftable.lock        Serializes allocation of a struct file in file table
itable.lock        Protects allocation of in-memory inode entries
vdisk_lock         Serializes access to disk hardware and queue of DMA descriptors
kmem.lock          Serializes allocation of memory
log.lock           Serializes operations on the transaction log
pipe's pi->lock    Serializes operations on each pipe
pid_lock           Serializes increments of next_pid
proc's p->lock     Serializes changes to process's state
wait_lock          Helps wait avoid lost wakeups
tickslock          Serializes operations on the ticks counter
inode's ip->lock   Serializes operations on each inode and its content
buf's b->lock      Serializes operations on each block buffer

Figure 6.3: Locks in xv6

6.4 Deadlock and lock ordering

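If a code path through the kernel must hold several locks at the same time, all code paths
must acquire those locks in the same order; otherwise there is a risk of deadlock. Suppose two
code paths in xv6 need locks A and B, but code path 1 acquires them in the order A then B and
code path 2 acquires them in the order B then A. If thread T1 executes code path 1 and acquires
lock A while thread T2 executes code path 2 and acquires lock B, then T1 will try to acquire
lock B and T2 will try to acquire lock A, and both will block indefinitely.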

Xv6 2 struct proc


sleep 7 consoleintr (kernel/console.c:136)

consoleintr cons.lock wakeup


cons.lock
xv6 inode
vdisk_lock p->lock

M1 M2 M2 M1

wait
exit
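One way to see why a single global order rules out the A/B deadlock is to route every
two-lock acquisition through one helper that sorts the locks first. The sketch below orders
by memory address, which is an assumption made for illustration only: xv6 fixes its lock
orders by convention in the code, not by address. The ord_ names and the seq counter are
hypothetical, and the "lock" is a plain flag so the example can run single-threaded.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for a lock; seq records the acquisition order.
struct ord_lock {
  int held;
  int seq;
};

static int ord_clock = 0;  // counts acquisitions, for illustration only

void
ord_acquire(struct ord_lock *l)
{
  assert(!l->held);
  l->held = 1;
  l->seq = ++ord_clock;
}

void
ord_release(struct ord_lock *l)
{
  assert(l->held);
  l->held = 0;
}

// Acquire two distinct locks in one fixed global order (lower address
// first), no matter which order the caller names them in.
void
ord_acquire_pair(struct ord_lock *x, struct ord_lock *y)
{
  assert(x != y);
  if((uintptr_t)x > (uintptr_t)y){
    struct ord_lock *t = x;
    x = y;
    y = t;
  }
  ord_acquire(x);
  ord_acquire(y);
}
```

Because both callers end up taking the lower-addressed lock first, the situation in which
each thread holds one lock and waits for the other cannot arise through this helper.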

6.5 Re-entrant locks
re-entrant locks recursive locks

xv6 Panic

f g

f() {
acquire(&lock);
if(data == 0){
call_once();
h();
data = 1;
}
release(&lock);
}

g() {
acquire(&lock);
if(data == 0){
call_once();
data = 1;
}
release(&lock);
}

call_once f g

h g call_once
h g
call_once Panic
call_once
xv6
xv6 acquire
struct spinlock push_off

6.6 Locks and interrupt handlers
xv6 clockintr
ticks (kernel/trap.c:164) ticks sys_sleep
(kernel/sysproc.c:59) tickslock
sys_sleep tickslock
CPU clockintr tickslock
tickslock sys_sleep
sys_sleep clockintr CPU

CPU
Xv6 CPU xv6 CPU
CPU acquire
CPU
CPU Xv6
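acquire calls push_off (kernel/spinlock.c:89) and release calls pop_off (kernel/spinlock.c:100)
to track the nesting level of locks on the current CPU. When that count reaches zero, pop_off
restores the interrupt enable state that existed at the start of the outermost critical
section. The intr_off and intr_on functions execute RISC-V instructions to disable and enable
interrupts, respectively.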

acquire push_off lk->locked (kernel/spinlock.c:28)

release pop_off (kernel/spinlock.c:66)
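The nesting behaviour of push_off and pop_off can be modelled with a counter and a saved flag.
This is a simulation under stated assumptions, not the kernel code: the interrupt-enable bit
is an ordinary variable (sim_intr_enabled), and sim_cpu, sim_push_off and sim_pop_off are
hypothetical names.

```c
#include <assert.h>

// Simulated interrupt-enable bit; in a kernel this is hardware state.
static int sim_intr_enabled = 1;

struct sim_cpu {
  int noff;    // depth of nested sim_push_off calls
  int intena;  // were interrupts enabled before the outermost push?
};

void
sim_push_off(struct sim_cpu *c)
{
  int old = sim_intr_enabled;
  sim_intr_enabled = 0;       // disable first, then record the old state
  if(c->noff == 0)
    c->intena = old;
  c->noff += 1;
}

void
sim_pop_off(struct sim_cpu *c)
{
  assert(!sim_intr_enabled);  // interrupts must still be off here
  assert(c->noff >= 1);       // a pop without a matching push is a bug
  c->noff -= 1;
  if(c->noff == 0 && c->intena)
    sim_intr_enabled = 1;     // restore only at the outermost level
}
```

Two nested pushes followed by one pop leave interrupts disabled; only the second pop restores
them, which is what lets code hold several locks at once.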

6.7 Instruction and memory ordering

[2, 4]

CPU
CPU A B CPU
B A A B
push CPU line 4
release 6
1 l = malloc(sizeof *l);
2 l->data = data;
3 acquire(&listlock);
4 l->next = list;
5 list = l;
6 release(&listlock);
CPU
list list->next

CPU memory model

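To tell the hardware and compiler not to re-order, xv6 uses __sync_synchronize() in both
acquire (kernel/spinlock.c:22) and release (kernel/spinlock.c:47). __sync_synchronize() is a
memory barrier: it tells the compiler and CPU not to reorder loads or stores across the
barrier. The barriers in xv6's acquire and release force order in almost all cases where it
matters, since xv6 uses locks around accesses to shared data. Chapter 9 discusses a few
exceptions.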

6.8 Sleep locks
xv6 Chapter 8

CPU
CPU
CPU
acquire CPU

CPU

Xv6 sleep-locks acquiresleep (kernel/sleeplock.c:22)


CPU 7
locked acquiresleep ’ sleep CPU
while acquiresleep

acquiresleep CPU

CPU

6.9 Real world

xv6

POSIX (Pthreads) CPU


Pthreads Pthreads

Pthreads pthread
pthread CPU
pthread
CPU

[10]

CPU CPU
CPU
CPU CPU
CPU

[6, 12]

xv6

6.10 Exercises
1. acquire release kalloc (kernel/kalloc.c:69)
kalloc xv6
usertests
kalloc

2. kfree kalloc
kfree kalloc

3. CPU kalloc
kalloc.c CPU kalloc

4. POSIX
/

5. xv6 Pthread
1 CPU

Chapter 7

Scheduling

CPU
CPU
CPU multiplexing CPU
xv6

7.1 Multiplexing
Xv6 CPU
xv6 sleep wakeup I/O
sleep xv6
CPU xv6

xv6
Xv6
CPU

sleep wakeup
CPU Xv6

7.2 Code: Context switching
7.1
CPU
xv6 CPU

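[Diagram: in user space, shell and cat; in kernel space, the kernel stacks (kstack) of shell,
the scheduler, and cat. The shell's state is saved, one swtch moves to the scheduler's stack,
and a second swtch moves to cat's stack, where state is restored.]

Figure 7.1: Switching from one user process to another in xv6, shown for one CPU.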

CPU
CPU

swtch swtch
32 RISC-V contexts CPU
swtch
struct context (kernel/proc.h:2) struct proc CPU struct cpu
swtch struct context *old struct context *new
old new
swtch Chapter 4
usertrap yield yield sched swtch
p->context cpu->context (kernel/proc.c:497)

swtch (kernel/swtch.S:3) C
swtch
struct context swtch ra
swtch swtch
swtch swtch ra
swtch sp

sched swtch cpu->context CPU


scheduler swtch (kernel/proc.c:463)
CPU swtch sched
scheduler CPU

7.3 Code: Scheduling
swtch ; swtch
CPU
scheduler
CPU p->lock
p->state sched yield (kernel/proc.c:503) , sleep exit
sched (kernel/proc.c:487-492)
sched swtch p->context
cpu->context swtch scheduler
swtch (kernel/proc.c:463) for

xv6 p->lock swtch swtch

p->lock state context swtch


p->lock swtch CPU
yield RUNNABLE swtch
CPU
CPU sched scheduler
sched
xv6 (kernel/proc.c:463) , (kernel/proc.c:497) ,
(kernel/proc.c:463) , (kernel/proc.c:497)
coroutines ; sched scheduler
swtch sched allocproc
ra forkret (kernel/proc.c:515) swtch
forkret p->lock
fork usertrapret
scheduler (kernel/proc.c:445)
p->state ==
RUNNABLE CPU c->proc
RUNNING swtch (kernel/proc.c:458-463)

p->lock RUNNING
yield CPU
swtch context c->proc
RUNNABLE CPU scheduler
p->context CPU
CPU c->proc
p->lock
xv6 p->lock
yield scheduler yield
RUNNABLE

scheduler c->proc scheduler
RUNNABLE RUNNING swtch yield
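The state changes that scheduler and yield make (RUNNABLE to RUNNING and back) can be
sketched without locks or real context switches. The code below is only a model: demo_proc,
demo_pick_next and demo_yield are hypothetical names, and the scan starts just after the
previously chosen slot, which is a simplification chosen here rather than a description of
the xv6 loop.

```c
#include <assert.h>

enum demo_state { D_UNUSED, D_SLEEPING, D_RUNNABLE, D_RUNNING };

struct demo_proc {
  enum demo_state state;
};

// Scan the table once, starting just after index last, and return the
// index of the first RUNNABLE entry after marking it RUNNING.
// Returns -1 if nothing is runnable. Pass last = -1 for the first call.
int
demo_pick_next(struct demo_proc *table, int n, int last)
{
  for(int i = 1; i <= n; i++){
    int idx = (last + i) % n;
    if(table[idx].state == D_RUNNABLE){
      table[idx].state = D_RUNNING;
      return idx;
    }
  }
  return -1;
}

// What yield does to the state, minus the locking and the switch.
void
demo_yield(struct demo_proc *p)
{
  assert(p->state == D_RUNNING);
  p->state = D_RUNNABLE;
}
```

In the kernel these two state changes happen with p->lock held across the call to swtch, so
that no other CPU can see a process that is RUNNABLE but still executing on its old stack.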

7.4 Code: mycpu and myproc


Xv6 proc
proc

Xv6 CPU struct cpu (kernel/proc.h:22) CPU


CPU
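The function mycpu (kernel/proc.c:74) returns a pointer to the current CPU's struct cpu. RISC-V
numbers its CPUs, giving each a hartid. Xv6 ensures that each CPU's hartid is stored in that
CPU's tp register while in the kernel. This allows mycpu to use tp to index an array of cpu
structures to find the right one.

Ensuring that a CPU's tp always holds the CPU's hartid is a little involved. start sets the
tp register early in the CPU's boot sequence (kernel/start.c:51). usertrapret saves tp in the
trampoline page, because the user process might modify tp. uservec restores that saved tp
when entering the kernel from user space (kernel/trampoline.S:77). It would be more convenient
if xv6 could ask the RISC-V hardware for the current hartid whenever needed, but RISC-V does
not allow that in the mode the kernel runs in.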

cpuid mycpu
CPU xv6
struct cpu
myproc (kernel/proc.c:83) struct proc CPU myproc
mycpu c->proc struct cpu
myproc
CPU struct proc

7.5 Sleep and wakeup

xv6 wait
xv6

wakeup
sequence coordination conditional synchronization
xv6
semaphore [5] xv6
V P

CPU

100 struct semaphore {
101   struct spinlock lock;
102   int count;
103 };
104
105 void
106 V(struct semaphore *s)
107 {
108 acquire(&s->lock);
109 s->count += 1;
110 release(&s->lock);
111 }
112
113 void
114 P(struct semaphore *s)
115 {
116 while(s->count == 0)
117 ;
118 acquire(&s->lock);
119 s->count -= 1;
120 release(&s->lock);
121 }

while CPU busy waiting polling


s->count CPU V

sleep wakeup sleep(chan) chan wait channel


sleep CPU wakeup(chan)
chan sleep chan , wakeup
sleep wakeup

200 void
201 V(struct semaphore *s)
202 {
203 acquire(&s->lock);
204 s->count += 1;
205 wakeup(s);
206 release(&s->lock);
207 }
208
209 void
210 P(struct semaphore *s)

211 {
212 while(s->count == 0)
213 sleep(s);
214 acquire(&s->lock);
215 s->count -= 1;
216 release(&s->lock);
217 }
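P now gives up the CPU instead of spinning. However, it is not straightforward to design sleep
and wakeup with this interface without suffering from what is known as the lost wake-up
problem. Suppose that P finds that s->count == 0 on line 212. While P is between lines 212 and
213, V runs on another CPU: it changes s->count to be nonzero and calls wakeup, which finds no
processes sleeping and thus does nothing. Now P continues executing at line 213: it calls sleep
and goes to sleep. P is now asleep waiting for a V call that has already happened. Unless the
producer calls V again, the consumer will wait forever even though the count is non-zero.

The root of this problem is that the invariant that P only sleeps when s->count == 0 is
violated by V running at just the wrong moment. An incorrect way to protect the invariant is
to move the lock acquisition in P so that its check of the count and its call to sleep are
atomic: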
300 void
301 V(struct semaphore *s)
302 {
303 acquire(&s->lock);
304 s->count += 1;
305 wakeup(s);
306 release(&s->lock);
307 }
308
309 void
310 P(struct semaphore *s)
311 {
312 acquire(&s->lock);
313 while(s->count == 0)
314 sleep(s);
315 s->count -= 1;
316 release(&s->lock);
317 }
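One might hope that this version of P would avoid the lost wakeup because the lock prevents V
from executing between lines 313 and 314. It does that, but it also deadlocks: P holds the
lock while it sleeps, so V will block forever waiting for the lock.

The fix is to change sleep's interface: the caller must pass the condition lock to sleep so
that sleep can release the lock after the calling process is marked as asleep and waiting on
the sleep channel. The lock forces a concurrent V to wait until P has finished putting itself
to sleep, so that the wakeup will find the sleeping consumer and wake it up. Once the consumer
is awake again, sleep reacquires the lock before returning. The corrected sleep/wakeup scheme
is used as follows: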

400 void

401 V(struct semaphore *s)
402 {
403 acquire(&s->lock);
404 s->count += 1;
405 wakeup(s);
406 release(&s->lock);
407 }
408
409 void
410 P(struct semaphore *s)
411 {
412 acquire(&s->lock);
413 while(s->count == 0)
414 sleep(s, &s->lock);
415 s->count -= 1;
416 release(&s->lock);
417 }
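The fact that P holds s->lock prevents V from trying to wake it up between P's check of
s->count and its call to sleep. Note, however, that sleep must atomically release s->lock and
put the consuming process to sleep in order to avoid lost wakeups.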

7.6 Code: Sleep and wakeup
Xv6 sleep (kernel/proc.c:536) wakeup (kernel/proc.c:567)
sleep
SLEEPING sched CPU wakeup
RUNNABLE sleep wakeup
Xv6
sleep p->lock (kernel/proc.c:547) p->lock
lk lk P
V wakeup(chan) sleep p->lock
lk wakeup(chan) wakeup p->lock
sleep wakeup sleep
sleep p->lock
SLEEPING sched (kernel/proc.c:551-554)
SLEEPING p->lock scheduler

wakeup(chan)
1
wakeup wakeup (kernel/proc.c:567)
p->lock p->lock
1
wakeup acquire wakeup
release

sleep wakeup wakeup state SLEEPING
chan RUNNABLE

sleep wakeup
p->lock SLEEPING
wakeup wakeup
true wakeup
SLEEPING wakeup

wakeup sleep

sleep
/
/
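The bookkeeping behind sleep and wakeup (a per-process state plus a wait channel) can be
modelled in a few lines. This is a single-threaded simulation with hypothetical names
(sw_proc, sw_sleep, sw_wakeup); it deliberately omits the parts that make the real functions
correct under concurrency, namely holding p->lock, releasing the condition lock, and calling
sched.

```c
#include <assert.h>
#include <stddef.h>

enum sw_state { SW_RUNNING, SW_RUNNABLE, SW_SLEEPING };

struct sw_proc {
  enum sw_state state;
  const void *chan;  // wait channel; meaningful only while SLEEPING
};

// Mark p as sleeping on chan. A kernel would also release the condition
// lock and switch to the scheduler at this point.
void
sw_sleep(struct sw_proc *p, const void *chan)
{
  assert(chan != NULL);
  p->chan = chan;
  p->state = SW_SLEEPING;
}

// Make every process sleeping on chan runnable; returns how many woke.
int
sw_wakeup(struct sw_proc *table, int n, const void *chan)
{
  int woken = 0;
  for(int i = 0; i < n; i++){
    if(table[i].state == SW_SLEEPING && table[i].chan == chan){
      table[i].state = SW_RUNNABLE;
      table[i].chan = NULL;
      woken++;
    }
  }
  return woken;
}
```

Any address can serve as a channel; only equality matters. Because wakeup marks every matching
sleeper runnable, each sleeper has to re-check its condition in a loop after it wakes.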

7.7 Code: Pipes
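A more complex example that uses sleep and wakeup to synchronize producers and consumers is
xv6's implementation of pipes. Chapter 1 showed the interface: bytes written to one end of a
pipe are copied to an in-kernel buffer and can then be read from the other end. This section
looks at the implementations of pipewrite and piperead.

Each pipe is represented by a struct pipe, which contains a lock and a data buffer. The fields
nread and nwrite count the total number of bytes read from and written to the buffer. The
buffer wraps around: the next byte written after buf[PIPESIZE-1] is buf[0]. The counts do not
wrap. This convention lets the implementation distinguish a full buffer (nwrite ==
nread+PIPESIZE) from an empty buffer (nwrite == nread), but it means that indexing into the
buffer must use buf[nread % PIPESIZE] instead of just buf[nread] (and similarly for nwrite).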
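Suppose that calls to piperead and pipewrite happen simultaneously on two different CPUs.
pipewrite (kernel/pipe.c:77) begins by acquiring the pipe's lock, which protects the counts,
the data, and their associated invariants. piperead (kernel/pipe.c:106) then tries to acquire
the lock too, but cannot. It spins in acquire (kernel/spinlock.c:22) waiting for the lock.
While piperead waits, pipewrite loops over the bytes being written (addr[0..n-1]), adding each
to the pipe in turn (kernel/pipe.c:95). During this loop, the buffer may fill
(kernel/pipe.c:88). In that case, pipewrite calls wakeup to alert any sleeping readers that
data is waiting in the buffer, and then sleeps on &pi->nwrite to wait for a reader to take
some bytes out of the buffer. sleep releases pi->lock as part of putting pipewrite's process
to sleep.

Now that pi->lock is available, piperead manages to acquire it and enters its critical
section: it finds that pi->nread != pi->nwrite (kernel/pipe.c:113) (pipewrite went to sleep
because pi->nwrite == pi->nread+PIPESIZE (kernel/pipe.c:88)), so it falls through to the for
loop, copies data out of the pipe (kernel/pipe.c:120), and increments nread by the number of
bytes copied. That many bytes are now available for writing, so piperead calls wakeup
(kernel/pipe.c:127) to wake any sleeping writers before it returns. wakeup finds a process
sleeping on &pi->nwrite, the process that was running pipewrite but stopped when the buffer
filled, and marks that process as RUNNABLE.

The pipe code uses separate sleep channels for reader and writer (pi->nread and pi->nwrite);
this may make the system more efficient when many readers and writers wait on the same pipe.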
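The full and empty tests on nread and nwrite, and the modulo indexing into the buffer, can be
exercised with a small non-blocking model. The names demo_pipe, demo_pipe_write and
demo_pipe_read are hypothetical; where the kernel functions would sleep (buffer full, or
buffer empty with a writer still open), this sketch simply returns the number of bytes moved
so far. PIPESIZE is set to 512 here as an assumption for the example.

```c
#include <assert.h>

#define PIPESIZE 512

struct demo_pipe {
  char data[PIPESIZE];
  unsigned int nread;   // total bytes read so far (not a buffer index)
  unsigned int nwrite;  // total bytes written so far (not a buffer index)
};

// Copy up to n bytes in; stop early when the buffer is full.
// Where this returns early, a blocking writer would sleep instead.
int
demo_pipe_write(struct demo_pipe *pi, const char *src, int n)
{
  int i = 0;
  assert(pi->nwrite - pi->nread <= PIPESIZE);
  while(i < n && pi->nwrite != pi->nread + PIPESIZE){
    pi->data[pi->nwrite % PIPESIZE] = src[i];
    pi->nwrite++;
    i++;
  }
  return i;
}

// Copy up to n bytes out; stop early when the buffer is empty.
int
demo_pipe_read(struct demo_pipe *pi, char *dst, int n)
{
  int i = 0;
  assert(pi->nwrite - pi->nread <= PIPESIZE);
  while(i < n && pi->nread != pi->nwrite){
    dst[i] = pi->data[pi->nread % PIPESIZE];
    pi->nread++;
    i++;
  }
  return i;
}
```

Keeping the two counters unwrapped means nwrite - nread is always the number of buffered
bytes, so full and empty never look alike even though both map to the same buffer index.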

7.8 Code: Wait, exit, and kill
sleep wakeup 1 exit
wait wait
wait
exit xv6 wait exit
ZOMBIE wait
UNUSED
ID init wait

wait exit exit exit


wait wait_lock (kernel/proc.c:391) wait_lock
wakeup wait ZOMBIE
proc
wait 0 ID wait
sleep (kernel/proc.c:433) wait
wait_lock pp->lock wait_lock
pp->lock
exit (kernel/proc.c:347) reparent
init wait
CPU exit wait_lock p->lock wait_lock
wakeup(p->parent) wait exit p->lock
wait ZOMBIE
swtch exit wait
exit ZOMBIE
wakeup wait p->lock
scheduler wait exit
ZOMBIE (kernel/proc.c:379)
exit kill (kernel/proc.c:586)
kill CPU
kill
p->killed
usertrap exit p->killed killed
(kernel/proc.c:615)

sleep , kill wakeup sleep
xv6 sleep while
sleep sleep p->killed

xv6 sleep p->killed


virtio (kernel/virtio_disk.c:285) p->killed
I/O
usertrap

7.9 Process Locking
( p->lock ) xv6 p->lock
struct proc p->state , p->chan ,
p->killed , p->xstate p->pid

p->lock xv6
p->lock
• p->state proc[]


• wait ZOMBIE CPU
• RUNNABLE
swtch

• RUNNABLE

• swtch

• wakeup sleep CPU

• kill kill p->pid


p->killed

• kill p->state
p->parent wait_lock p->lock
p->parent
wait_lock wait
wait_lock p->lock ZOMBIE CPU
wait_lock exit init
wait wait_lock

7.10 Real world
xv6 round robin

priority inversion convoys

sleep wakeup
Unix sleep
Unix CPU xv6
sleep FreeBSD msleep 9 sleep

Linux sleep

wakeup chan sleep


wakeup Linux
9 sleep wakeup
sleep wakeup wait signal

wakeup

thundering herd wakeup


signal broadcast

xv6

C
Unix signal
-1 EINTR
Xv6
Xv6 kill p->killed
sleep p->killed sleep kill ;
p->killed p->killed
sleep p->killed

proc
allocproc ;xv6

7.11 Exercises
1. lk != &p->lock
if(lk != &p->lock){
acquire(&p->lock);
release(lk);
}

release(lk);
acquire(&p->lock);

sleep

2. xv6 sleep wakeup xv6

3. kill sleep kill


p->killed sleep

4. p->killed virtio
while

5. xv6

swtch

6. xv6 scheduler RISC-V WFI


WFI

Chapter 8

File system

persistence
xv6 Unix 1
virtio

• crash recovery

xv6

8.1 Overview
xv6 8.1 virtio

transaction
inode
inode i-number
i-number
/usr/rtm/xv6/fs.c
Unix

Figure 8.1: Layers of the xv6 file system, from top to bottom: File descriptor, Pathname,
Directory, Inode, Logging, Buffer cache, Disk.

512 0
512 1
Xv6
struct buf (kernel/buf.h:1)

inode xv6
8.2 0 1 superblock ;

mkfs

8.2 Buffer cache layer
1
(2)
bio.c
buffer cache bread bwrite ; buf

brelse
bread brelse

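[Diagram: disk blocks in order: boot (block 0), super (block 1), log (starting at block 2),
inodes, bit map, data ... data.]

Figure 8.2: Structure of the xv6 file system.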

8.3 Code: Buffer cache
main
valid
disk data

bread (kernel/bio.c:93) bget (kernel/bio.c:97)


bread virtio_disk_rw
bget (kernel/bio.c:59) bget
bget
bget
( b->refcnt = 0 )
bget
b->valid = 0 bread

bget bcache.lock
dev , blockno refcnt

bget bcache.lock b->refcnt


bcache.lock

bget Panic

bread
bwrite
bwrite (kernel/bio.c:107) virtio_disk_rw

brelse brelse b-release


Unix BSD Linux Solaris brelse
(kernel/bio.c:117) (kernel/bio.c:128-133)

bget

bcache.head next
prev
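The list discipline around bcache.head (scan forward with next to find a cached block, scan
backward with prev to pick the least recently used idle buffer, move a released buffer to the
front) can be sketched as follows. This is a model with hypothetical names (demo_cache,
demo_bget, demo_brelse) and no locking or disk I/O. It also returns 0 when every buffer is
busy, a choice made so the sketch can be tested; it is not how the kernel reacts.

```c
#include <assert.h>

#define DEMO_NBUF 3

struct demo_buf {
  int valid;              // has the block's data been read from disk?
  int dev, blockno;
  int refcnt;
  struct demo_buf *prev;  // toward the most recently used end
  struct demo_buf *next;  // toward the least recently used end
};

struct demo_cache {
  struct demo_buf buf[DEMO_NBUF];
  struct demo_buf head;   // head.next is most recent, head.prev is least
};

void
demo_binit(struct demo_cache *c)
{
  c->head.prev = &c->head;
  c->head.next = &c->head;
  for(int i = 0; i < DEMO_NBUF; i++){
    struct demo_buf *b = &c->buf[i];
    b->valid = 0; b->dev = -1; b->blockno = -1; b->refcnt = 0;
    b->next = c->head.next;
    b->prev = &c->head;
    c->head.next->prev = b;
    c->head.next = b;
  }
}

// Return the buffer for (dev, blockno), recycling the least recently
// used idle buffer on a miss. Returns 0 if every buffer is in use.
struct demo_buf*
demo_bget(struct demo_cache *c, int dev, int blockno)
{
  struct demo_buf *b;
  for(b = c->head.next; b != &c->head; b = b->next){
    if(b->dev == dev && b->blockno == blockno){
      b->refcnt++;
      return b;
    }
  }
  for(b = c->head.prev; b != &c->head; b = b->prev){
    if(b->refcnt == 0){
      b->dev = dev; b->blockno = blockno;
      b->valid = 0;       // the caller must read the block from disk
      b->refcnt = 1;
      return b;
    }
  }
  return 0;
}

// Drop a reference; a buffer that becomes idle moves to the front.
void
demo_brelse(struct demo_cache *c, struct demo_buf *b)
{
  assert(b->refcnt > 0);
  if(--b->refcnt == 0){
    b->next->prev = b->prev;
    b->prev->next = b->next;
    b->next = c->head.next;
    b->prev = &c->head;
    c->head.next->prev = b;
    c->head.next = b;
  }
}
```

Scanning in the two directions exploits locality: recently released buffers are found first
on a hit, and the buffer idle for longest is the one recycled on a miss.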

8.4 Logging layer

xv6

Xv6 xv6

log commit

xv6

8.5 Log design

Xv6

group commit

xv6 virtio
batching xv6
Xv6

write unlink
inode
Xv6
unlink xv6

8.6 Code: logging

begin_op();
...
bp = bread(...);
bp->data[...] = ...;
log_write(bp);
...
end_op();
begin_op (kernel/log.c:127)
log.outstanding
log.outstanding MAXOPBLOCKS log.outstanding
MAXOPBLOCKS

log_write (kernel/log.c:215) bwrite

log_write

absorption

end_op (kernel/log.c:147)
commit(). write_log() (kernel/log.c:179)
write_head()
(kernel/log.c:103)

install_trans (kernel/log.c:69)
end_op

recover_from_log (kernel/log.c:117) initlog (kernel/proc.c:527)


end_op
filewrite (kernel/file.c:135)
begin_op();
ilock(f->ip);
r = writei(f->ip, ...);
iunlock(f->ip);
end_op();

writei
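The absorption optimization mentioned for log_write (a block written several times within one
transaction occupies a single log slot) can be sketched as below. The names demo_log and
demo_log_write and the size DEMO_LOGSIZE are hypothetical, and the sketch only tracks block
numbers; it performs no disk writes and no locking.

```c
#include <assert.h>

#define DEMO_LOGSIZE 30

struct demo_log {
  int n;                    // number of blocks logged so far
  int block[DEMO_LOGSIZE];  // block numbers in the current transaction
};

// Record that blockno was modified in the current transaction.
// A block already in the log reuses its slot (absorption).
// Returns the slot index, or -1 if the transaction is too big.
int
demo_log_write(struct demo_log *lg, int blockno)
{
  int i;
  assert(lg->n >= 0 && lg->n <= DEMO_LOGSIZE);
  for(i = 0; i < lg->n; i++){
    if(lg->block[i] == blockno)
      return i;             // absorbed into the existing slot
  }
  if(lg->n >= DEMO_LOGSIZE)
    return -1;
  lg->block[i] = blockno;
  lg->n++;
  return i;
}
```

Absorption saves log space and lets a commit write one copy of a frequently updated block,
such as a bitmap block or a block of inodes.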

8.7 Code: Block allocator
Xv6

mkfs
balloc bfree balloc
balloc (kernel/fs.c:72) 0 sb.size
balloc

( BPB )

bfree (kernel/fs.c:92) bread


brelse
balloc bfree
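A bitmap allocator in the spirit of balloc and bfree fits in a few lines when the bitmap is
held in memory. The names demo_bitmap, demo_balloc and demo_bfree and the size DEMO_NBLOCKS
are hypothetical. The kernel versions differ in that the bitmap lives in disk blocks reached
through bread and brelse, and updates go through the log.

```c
#include <assert.h>

#define DEMO_NBLOCKS 64

// One bit per block: 1 = in use, 0 = free.
struct demo_bitmap {
  unsigned char bits[DEMO_NBLOCKS / 8];
};

// Find the first free block, mark it in use, and return its number.
// Returns -1 when no block is free.
int
demo_balloc(struct demo_bitmap *bm)
{
  for(int b = 0; b < DEMO_NBLOCKS; b++){
    unsigned char m = (unsigned char)(1 << (b % 8));
    if((bm->bits[b / 8] & m) == 0){
      bm->bits[b / 8] |= m;
      return b;
    }
  }
  return -1;
}

// Clear the bit for block b.
void
demo_bfree(struct demo_bitmap *bm, int b)
{
  assert(b >= 0 && b < DEMO_NBLOCKS);
  unsigned char m = (unsigned char)(1 << (b % 8));
  assert(bm->bits[b / 8] & m);  // freeing a free block is a bug
  bm->bits[b / 8] &= (unsigned char)~m;
}
```

A freed block is the first candidate for the next allocation because the scan always starts
at block 0.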

8.8 Inode layer
inode
inode inode inode

inode
n n inode n inode
i-number inode
inode struct dinode (kernel/fs.h:32) type
nlink inode

inode size
addrs
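The kernel keeps the set of active inodes in memory in a table called itable; struct inode
(kernel/file.h:17) is the in-memory copy of a struct dinode on disk. The kernel stores an
inode in memory only if there are C pointers referring to that inode.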
inode ref inode C
inode iget iput
inode inode
exec
xv6 sinode itable.lock
ref
inode inode lock inode
inode ref

inode nlink
xv6
A struct inode iget() iput()
iget() inode
inode
iget() inode
inode
struct inode iget
ilock ilock
inode iunlock inode inode

inode C iget inode


inode C inode
inode inode
inode
iupdate

8.9 Code: Inodes
inode xv6 ialloc (kernel/fs.c:199) ialloc
balloc inode
type inode
iget ialloc bp ialloc
inode
iget (kernel/fs.c:247) ( ip->ref > 0
inode inode iget
(kernel/fs.c:261-262)
ilock inode ilock (kernel/fs.c:293)
ilock inode

inode iunlock (kernel/fs.c:321)

iput (kernel/fs.c:337) inode C (kernel/fs.c:360)


inode inode
inode
iput inode C inode
nodirectory inode iput itrunc
inode 0 inode (kernel/fs.c:342)
iput inode
ilock inode inode
inode
inode ip->ref iput
iput itable.lock
ialloc
iput iupdate inode
inode inode
iput
iput()
read()
iput().

iput() iput()
inode

0 inode

Xv6
xv6

8.10 Code: Inode content
inode struct dinode 8.3 inode
dinode addrs NDIRECT
NDIRECT direct blocks NINDIRECT
indirect block addrs
12 kB ( NDIRECT x BSIZE inode 256 kB (

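[Diagram: a dinode holds type, major, minor, nlink, size, direct addresses 1 to 12, and one
indirect address. Each direct address points to a data block. The indirect address points to
the indirect block, which holds addresses 1 to 256, each pointing to a data block.]

Figure 8.3: The representation of a file on disk.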

NINDIRECT x BSIZE )
bmap readi
writei bmap bn ’
ip ip bmap 1
bmap (kernel/fs.c:383) NDIRECT inode
(kernel/fs.c:388-396) NINDIRECT ip->addrs[NDIRECT] bmap
(kernel/fs.c:407) (kernel/fs.c:408)
NDIRECT+NINDIRECT , bmap Panic writei (kernel/fs.c:513)
bmap ip->addrs[]
bmap (kernel/fs.c:389-390) (kernel/fs.c:401-402)
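The arithmetic behind bmap's two cases can be written down directly. The sketch below assumes
1024-byte blocks and 4-byte block addresses, which gives 256 addresses per indirect block and
matches the 12 kB and 256 kB figures in this section. The DEMO_ names, demo_locate and
demo_block_of are hypothetical, and the code only classifies a block number; it allocates
nothing.

```c
#include <assert.h>

#define DEMO_BSIZE     1024
#define DEMO_NDIRECT   12
#define DEMO_NINDIRECT (DEMO_BSIZE / sizeof(unsigned int))  // 256 with 4-byte ints
#define DEMO_MAXFILE   (DEMO_NDIRECT + DEMO_NINDIRECT)

enum demo_where { DEMO_DIRECT, DEMO_INDIRECT, DEMO_OUT_OF_RANGE };

// Say where logical block bn of a file is recorded, and at which index:
// in the inode's direct slots, or in the single indirect block.
enum demo_where
demo_locate(unsigned int bn, unsigned int *index)
{
  assert(index != 0);
  if(bn < DEMO_NDIRECT){
    *index = bn;
    return DEMO_DIRECT;
  }
  bn -= DEMO_NDIRECT;
  if(bn < DEMO_NINDIRECT){
    *index = bn;
    return DEMO_INDIRECT;
  }
  return DEMO_OUT_OF_RANGE;  // beyond NDIRECT+NINDIRECT blocks
}

// Logical block number that holds byte offset off.
unsigned int
demo_block_of(unsigned int off)
{
  return off / DEMO_BSIZE;
}
```

Under these assumptions the largest file is (12 + 256) blocks of 1024 bytes, that is 268 kB;
byte offset 12288 is the first byte served through the indirect block.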

itrunc itrunc (kernel/fs.c:426)


(kernel/fs.c:447-448)
bmap readi writei inode readi (kernel/fs.c:472)
(kernel/fs.c:477-478)
(kernel/fs.c:479-480)
dst writei (kernel/fs.c:506)
readi
(kernel/fs.c:513-514) (kernel/fs.c:36)

writei
stati stat stat

8.11 Code: directory layer
T_DIR
struct dirent (kernel/fs.h:56) 56
DIRSIZ (14) NULL (0) inode

dirlookup (kernel/fs.c:552)


inode *poff
dirlookup *poff
inode iget dirlookup iget
dp ., inode
dp .. ,
. dp ip
dirlink (kernel/fs.c:580) inode dp
dirlink (kernel/fs.c:586-590)
(kernel/fs.c:563-564) off
off dp->size
dirlink off
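A directory as a flat array of fixed-size entries, with inode number 0 marking a free entry
and names limited to DIRSIZ (14) characters, can be searched as sketched below. The struct
layout and the names demo_dirent and demo_dirlookup are assumptions for illustration. The
kernel's dirlookup reads entries from the directory's inode with readi and returns an inode
obtained from iget, rather than scanning an in-memory array and returning a number.

```c
#include <assert.h>
#include <string.h>

#define DEMO_DIRSIZ 14

// A directory entry: inode number plus a fixed-width name. Names shorter
// than DEMO_DIRSIZ are NUL-terminated; a full-width name is not.
struct demo_dirent {
  unsigned short inum;  // 0 means the entry is free
  char name[DEMO_DIRSIZ];
};

// Return the inode number for name, or 0 if absent. If poff is non-null,
// store the byte offset of the matching entry there.
unsigned short
demo_dirlookup(const struct demo_dirent *dir, int nent,
               const char *name, unsigned int *poff)
{
  assert(nent >= 0);
  for(int i = 0; i < nent; i++){
    if(dir[i].inum == 0)
      continue;         // skip free entries
    if(strncmp(name, dir[i].name, DEMO_DIRSIZ) == 0){
      if(poff)
        *poff = (unsigned int)(i * sizeof(struct demo_dirent));
      return dir[i].inum;
    }
  }
  return 0;
}
```

Comparing with a length limit of DEMO_DIRSIZ is what makes a 14-character name, which has no
terminating NUL in the entry, compare correctly.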

8.12 Code: Path names
dirlookup namei path
inode nameiparent
inode name namex
namex (kernel/fs.c:652)
skipelem inode name
ip ip ip
ip->type ilock
ip->type nameiparent
nameiparent ; name namex
ip dirlookup ip = next (kernel/fs.c:672-677)
ip
namex
inode Xv6
namex I/O
namex

Xv6 dirlookup namex
dirlookup inode iget iget inode
dirlookup namex
inode xv6 inode inode

next inode . ip next


ip namex next
iget ilock
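Path lookup proceeds one element at a time, and the splitting step that a helper like skipelem
performs can be sketched in user space. The function below is an illustration with a
hypothetical name (demo_skipelem) and const-correct pointers; it is not presented as the
kernel's source. It copies the next element into name and returns the remainder of the path
with leading slashes removed, or 0 when no element is left.

```c
#include <assert.h>
#include <string.h>

#define DEMO_DIRSIZ 14

// Copy the next path element into name (which must hold DEMO_DIRSIZ
// bytes) and return a pointer to the rest of the path. Returns 0 when
// there is no element left. Elements of DEMO_DIRSIZ or more characters
// are truncated and not NUL-terminated.
const char*
demo_skipelem(const char *path, char *name)
{
  const char *s;
  size_t len;

  assert(path != 0 && name != 0);
  while(*path == '/')
    path++;
  if(*path == 0)
    return 0;
  s = path;
  while(*path != '/' && *path != 0)
    path++;
  len = (size_t)(path - s);
  if(len >= DEMO_DIRSIZ)
    memmove(name, s, DEMO_DIRSIZ);
  else {
    memmove(name, s, len);
    name[len] = 0;
  }
  while(*path == '/')
    path++;
  return path;
}
```

A lookup loop would call this repeatedly, searching the current directory for name each time;
when the returned remainder is the empty string, name is the final element, which is the case
a nameiparent-style lookup stops on.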

8.13 File descriptor layer

Unix Unix

Xv6 Chapter 1
struct file (kernel/file.h:1) inode
I/O open
struct file ): I/O
struct file
open
dup fork
readable writable

ftable
filealloc filedup fileclose
fileread filewrite
filealloc (kernel/file.c:30)
( f->ref == 0 filedup (kernel/file.c:48)
fileclose (kernel/file.c:60) fileclose

filestat , fileread filewrite stat , read write


filestat (kernel/file.c:88) inode stati fileread filewrite
inode inode
fileread filewrite I/O (kernel/file.c:122-123) (kernel/file.c:153-154) inode
(kernel/file.c:94-96) (kernel/file.c:121-124) (kernel/file.c:163-166) inode

8.14 Code: System calls

(kernel/sysfile.c)

sys_link sys_unlink inode


sys_link (kernel/sysfile.c:124)
old new (kernel/sysfile.c:129) old sys_link ip->nlink
sys_link nameiparent new (kernel/sysfile.c:149)
old inode (kernel/sysfile.c:152)
inode inode
sys_link ip->nlink

ip->nlink

sys_link inode create (kernel/sysfile.c:246)


open O_CREATE
mkdir mkdev sys_link , create
nameiparent inode dirlookup
(kernel/sysfile.c:256) create
open mkdir mkdev create open (
type == T_FILE open
create (kernel/sysfile.c:260) (kernel/sysfile.c:261-262)
create inode ialloc (kernel/sysfile.c:265)
inode create . ..
create (kernel/sysfile.c:278) create sys_link
inode ip dp inode ip
ip ’slock dp

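Using create, it is easy to implement sys_open, sys_mkdir, and sys_mknod. sys_open
(kernel/sysfile.c:305) is the most complex, since creating a new file is only a small part of
what it can do. If open is passed the O_CREATE flag, it calls create (kernel/sysfile.c:320).
Otherwise, it calls namei (kernel/sysfile.c:326). create returns a locked inode, but namei does
not, so sys_open must lock the inode itself. Assuming the inode was obtained one way or the
other, sys_open allocates a file and a file descriptor (kernel/sysfile.c:344) and then fills
in the file (kernel/sysfile.c:356-361).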

Chapter 7 sys_pipe

8.15 Real world
xv6
Xv6 V6
(LRU)
LRU
LRU

Xv6

UNIX fsck inodefree

Xv6 UNIX
BSD UFS/FFS Linux ext2/ext3

Microsoft
Windows NTFS macOS HFS Solaris ZFS

Xv6 xv6 Panic

Panic

Xv6

RAID

xv6 inode

Sun ZFS
Xv6

Unix
/proc xv6 if
fileread filewrite
inode
RPC

8.16 Exercises
1. Panic balloc xv6

2. Panic ialloc xv6

3. filealloc Panic

4. ip sys_link ’ iunlock(ip)
dirlink

5. create ialloc dirlink


create panic

6. sys_chdir iunlock(ip) iput(cp->cwd) cp->cwd


iunlock(ip) iput

7. lseek lseek filewrite


if lseek off f->ip->size.

8. O_TRUNC O_APPEND open > >> shell

9.

10.

11. VM

Chapter 9

Concurrency revisited

xv6
xv6

9.1 Locking patterns
(kernel/bio.c:26) NBUF

buf

CPU
yield
acquire (kernel/proc.c:503)
acquiresleep ilock (kernel/fs.c:293)
CPU CPU

acquire

pipeclose
(kernel/pipe.c:59) pi->readopen pi->writeopen

setkilled killed (kernel/proc.c:607) p->killed


p->killed race C
1
undefined behavior

1
https://en.cppreference.com/w/c/language/memory_model

C killed
p->killed
p->killed

9.2 Lock-like patterns
xv6
p->state file inode buf

struct inode
namex (kernel/fs.c:652)
namex
a/./b
.. Chapter 8 inode

xv6
kmem.lock

mycpu() (kernel/proc.c:83)

CPU

9.3 No locks at all
xv6
RISC-V main.c started
(kernel/main.c:7) CPU CPU 0 xv6 volatile

Xv6 CPU CPU


fork
CPU nolock

6
CPU CPU
acquire release CPU

9.4 Parallelism

xv6
xv6
xv6
CPU
/

CPU
yield sched swtch
scheduler CPU
RUNNABLE xv6
CPU
CPU fork pid_lock
kmem.lock UNUSED

CPU

9.5 Exercises
1. xv6

2. xv6 scheduler()

3. xv6 fork()

Chapter 10

Summary

xv6
/

Unix

Bibliography

[1] Linux common vulnerabilities and exposures (CVEs).
https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux.

[2] The RISC-V instruction set manual Volume I: unprivileged ISA.
https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf,
2019.

[3] The RISC-V instruction set manual Volume II: privileged architecture.
https://github.com/riscv/riscv-isa-manual/releases/download/Priv-v1.12/riscv-privileged-20211203.pdf,
2021.

[4] Hans-J. Boehm. Threads cannot be implemented as a library. ACM PLDI Conference, 2005.

[5] Edsger Dijkstra. Cooperating sequential processes.
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD01xx/EWD123.html, 1965.

[6] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming, Revised Reprint.
2012.

[7] Brian W. Kernighan. The C Programming Language. Prentice Hall Professional Technical
Reference, 2nd edition, 1988.

[8] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin,
Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell,
Harvey Tuch, and Simon Winwood. seL4: Formal verification of an OS kernel. In Proceedings
of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pages 207–220, 2009.

[9] Donald Knuth. Fundamental Algorithms. The Art of Computer Programming. (Second ed.),
volume 1. 1997.

[10] L. Lamport. A new solution of Dijkstra's concurrent programming problem. Communications
of the ACM, 1974.

[11] John Lions. Commentary on UNIX 6th Edition. Peer to Peer Communications, 2000.

[12] Paul E. McKenney, Silas Boyd-Wickizer, and Jonathan Walpole. RCU usage in the Linux
kernel: One decade later, 2013.

[13] Martin Michael and Daniel Durich. The NS16550A: UART design and application
considerations.
http://bitsavers.trailing-edge.com/components/national/_appNotes/AN-0491.pdf, 1987.

[14] Aleph One. Smashing the stack for fun and profit.
http://phrack.org/issues/49/14.html#article.

[15] David Patterson and Andrew Waterman. The RISC-V Reader: an open architecture Atlas.
Strawberry Canyon, 2017.

[16] Dave Presotto, Rob Pike, Ken Thompson, and Howard Trickey. Plan 9, a distributed system.
In Proceedings of the Spring 1991 EurOpen Conference, pages 43–50, 1991.

[17] Dennis M. Ritchie and Ken Thompson. The UNIX time-sharing system. Commun. ACM,
17(7):365–375, July 1974.

Index

., 82, 84 conflict, 54
.., 82, 84 contention, 54
/init, 24, 37 contexts, 62
_entry, 23 convoys, 71
copy-on-write (COW) fork, 44
absorption, 77 copyinstr, 43
acquire, 55, 58 copyout, 37
address space, 22 coroutines, 63
argc, 37 CPU, 9
argv, 37 cpu->context, 62, 63
atomic, 54 crash recovery, 73
create, 84
balloc, 78, 79 critical section, 53
batching, 77 current directory, 16
begin_op, 77
bfree, 78 deadlock, 56
bget, 75 demand paging, 45
block, 74 direct blocks, 80
bmap, 81 direct memory access (DMA), 50
bottomhalf, 47 dirlink, 82
bread, 74, 75 dirlookup, 82, 84
brelse, 74, 75 DIRSIZ, 82
BSIZE, 80 disk, 75
buf, 74 Driver, 47
busy waiting, 65 dup, 83
bwrite, 74, 75, 77 ecall, 20, 23
ELF format, 36
chan, 65, 68
ELF_MAGIC, 36
child, 10
end_op, 77
commit, 76
exception, 39
concurrency, 51
exec, 11–13, 24, 37, 42
concurrency control, 51
exit, 10, 69
condition lock, 66
conditional synchronization, 64 file descriptor, 12

filealloc, 83 kvminit, 33
fileclose, 83 kvminithart, 33
filedup, 83 kvmmake, 33
fileread, 83, 85 kvmmap, 33
filestat, 83
filewrite, 78, 83, 85 lazy allocation, 44
fork, 44, 46 links, 16
fork, 10, 12, 13, 83 loadseg, 36
forkret, 63 lock, 51
freerange, 34 log, 76
fsck, 85 log_write, 77
ftable, 83 lost wake-up, 66

getcmd, 12 machine mode, 20


group commit, 77 main, 33, 34, 75
guard page, 32 malloc, 12
mappages, 33
handler, 39 memory barrier, 59
hartid, 64 memory model, 59
memory-mapped, 31, 47
I/O, 12 metadata, 16
I/O concurrency, 48 microkernel, 21
I/O redirection, 13 mkdev, 84
ialloc, 79, 84
mkdir, 84
iget, 79, 82
mkfs, 74
ilock, 79, 82
monolithic kernel, 19, 21
indirect block, 80 multi-core, 19
initcode.S, 24, 42
multiplexing, 61
initlog, 78
multiprocessor, 19
inode, 16, 73, 78
mutual exclusion, 53
install_trans, 78
mycpu, 64
interface design, 9
myproc, 64
interrupt, 39
iput, 79, 80 namei, 36, 84
isolation, 19 nameiparent, 82, 84
itable, 79 namex, 82
itrunc, 80, 81 NDIRECT, 80, 81
iunlock, 80 NINDIRECT, 80, 81

kalloc, 34 O_CREATE, 84
kernel, 9, 20 open, 83, 84
kernel space, 9, 20
kfree, 34 p->context, 63
kinit, 34 p->killed, 69

p->kstack, 23 release, 55, 58
p->lock, 63, 67 root, 16
p->pagetable, 23 round robin, 71
p->state, 23 RUNNABLE, 63, 67–69
p->xxx, 23
page, 29 satp, 30
page table entries (PTEs), 29 sbrk, 12
page-fault exception, 30, 44 scause, 40
paging area, 45 sched, 62, 63, 67
paging to disk, 45 scheduler, 62, 63
panic, 44 sector, 74
parent, 10 semaphore, 64
path, 16 sepc, 40
persistence, 73 sequence coordination, 64
PGROUNDUP, 34 serializing, 54
physical address, 22 sfence.vma, 33
PHYSTOP, 33, 34 shell, 10
PID, 10 signal, 71
pipe, 15 skipelem, 82
piperead, 68 sleep, 65–67
pipewrite, 68 sleep-locks, 59
polling, 50, 65 SLEEPING, 67, 68
pop_off, 58 sret, 23
printf, 11 sscratch, 40
priority inversion, 71 sstatus, 40
privileged instructions, 20 stat, 82, 83
proc_mapstacks, 33 stati, 82, 83
proc_pagetable, 36 struct context, 62
process, 9, 22 struct cpu, 64
programmed I/O, 49 struct dinode, 78, 80
PTE_R, 30 struct dirent, 82
PTE_U, 30 struct elfhdr, 36
PTE_V, 30 struct file, 83
PTE_W, 30 struct inode, 79
PTE_X, 30 struct pipe, 68
push_off, 58 struct proc, 23
struct run, 34
race, 52, 87 struct spinlock, 54
re-entrant locks, 57 stval, 44
read, 83 stvec, 39
readi, 36, 81 superblock, 74
recover_from_log, 78 supervisor mode, 20
recursive locks, 57 swtch, 62, 63

SYS_exec, 42 UART, 47
sys_link, 84 undefined behavior, 87
sys_mkdir, 84 unlink, 77
sys_mknod, 84 user memory, 22
sys_open, 84 user mode, 20
sys_pipe, 84 user space, 9, 20
sys_sleep, 58 usertrap, 62
sys_unlink, 84 ustack, 37
syscall, 42 uvmalloc, 36, 37
system call, 9
valid, 75
T_DIR, 82
vector, 39
T_FILE, 84
virtio_disk_rw, 75
thread, 23
virtual address, 22
thundering herd, 71
ticks, 58 wait, 10, 11, 69
tickslock, 58 wait channel, 65
time-share, 10, 19 wakeup, 56, 65, 67
tophalf, 47 walk, 33
TRAMPOLINE, 41 walkaddr, 36
trampoline, 22, 41, 42, 45 write, 77, 83
transaction, 73 writei, 78, 81, 82
Translation Look-aside Buffer (TLB), 30, 33
trap, 39–45 yield, 62, 63
trapframe, 22, 45
type cast, 34 ZOMBIE, 69

