The document provides an overview of the xv6 operating system, which is a Unix-like OS designed for educational purposes, focusing on OS concepts with a small codebase. It includes instructions on compiling and executing xv6 code, utilizing tools like cscope, ctags, and doxygen for code navigation and documentation. Additionally, it covers the structure of xv6 code, the build process, and prerequisites for understanding the kernel code.
10 Xv6 Compilation Code Structure
Credits: xv6 book by Cox, Kaashoek, Morris; notes by Prof. Sorav Bansal
Introduction to xv6
Structure of xv6 code
Compiling and executing xv6 code
About xv6
  Unix-like OS
  Multitasking, single user
  Runs on the x86 processor
  Supports some system calls
  Small codebase, 7 to 10k lines
  Meant for learning OS concepts
  No demand paging, no copy-on-write fork, no shared memory; fixed-size stack for user programs

Use cscope and ctags with Vim
  Go to the folder of the xv6 code and run
    cscope -q *.[chS]
  Also run
    ctags *.[chS]
  Now download the file http://cscope.sourceforge.net/cscope_maps.vim as .cscope_maps.vim in your ~ folder
  And add the line "source ~/.cscope_maps.vim" to your ~/.vimrc file
  Read this tutorial: http://cscope.sourceforge.net/cscope_vim_tutorial.html

Use call graphs (using doxygen)
  Doxygen: a documentation generator. It can also be used to generate "call graphs" of functions
  Download xv6
  Install doxygen on your Ubuntu machine
  cd to the xv6 folder
  Run "doxygen -g doxyconfig". This creates the file "doxyconfig"
  Create a folder "doxygen"
  Open the "doxyconfig" file and make these changes:
    PROJECT_NAME = "XV6"
    OUTPUT_DIRECTORY = ./doxygen
    CREATE_SUBDIRS = YES
    EXTRACT_ALL = YES
    EXCLUDE = usertests.c cat.c yes.c echo.c forktest.c grep.c init.c kill.c ln.c ls.c mkdir.c rm.c sh.c stressfs.c wc.c zombie.c
    CALL_GRAPH = YES
    CALLER_GRAPH = YES
  Now run "doxygen doxyconfig"
  Go to "doxygen"/html and run "firefox index.html" --> see call graphs under Files -> any file

Xv6 follows the monolithic kernel approach

qemu
  A virtual machine manager, like VirtualBox
  Qemu provides us
    BIOS
    Virtual CPU, RAM, disk controller, keyboard controller
    IOAPIC, LAPIC
  Qemu runs xv6 using this command
    qemu -serial mon:stdio -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp 2 -m 512
  Invoked when you run "make qemu"

Understanding qemu command
  -serial mon:stdio
    The xv6 console is multiplexed, together with the qemu monitor, onto your normal terminal.
    Run "make qemu", then press Ctrl-a followed by c in the terminal to get the qemu prompt
  -drive file=fs.img,index=1,media=disk,format=raw
    Use the file "fs.img" as a hard disk, attached at index 1 of the IDE (or SATA, etc.) bus (xv6.img, the boot disk, is at index 0), as media type "disk", in "raw" format
  -smp 2
    Simulate two cores in SMP mode
  -m 512
    Use 512 MB of RAM

About files in XV6 code
  cat.c echo.c forktest.c grep.c init.c kill.c ln.c ls.c mkdir.c rm.c sh.c stressfs.c usertests.c wc.c yes.c zombie.c
    User programs for testing xv6
  Makefile
    To compile the code
  dot-bochsrc
    For running with the bochs emulator
  bootasm.S entryother.S entry.S initcode.S swtch.S trapasm.S usys.S
    Kernel code written in assembly. Total 373 lines
  kernel.ld
    Instructions to the linker, for linking the kernel properly
  README, Notes, LICENSE
    Misc files

Using Makefile
  make qemu
    Compile the code and run it using the "qemu" emulator
  make xv6.pdf
    Generate a PDF of the xv6 code
  make mkfs
    Create the mkfs program
  make clean
    Remove all intermediate and final build files

Files generated by Makefile
  .o files
    Compiled from each .c file
    No separate rule is needed in the Makefile to create .o files: make's built-in rule compiles each .c file, and the line
      _%: %.o $(ULIB)
    is sufficient to get the .o built for each _xyz file
  asm files
    Each of them has an equivalent object code file or C file. For example
      bootblock: bootasm.S bootmain.c
        $(CC) $(CFLAGS) -fno-pic -O -nostdinc -I. -c bootmain.c
        $(CC) $(CFLAGS) -fno-pic -nostdinc -I. -c bootasm.S
        $(LD) $(LDFLAGS) -N -e start -Ttext 0x7C00 -o bootblock.o bootasm.o bootmain.o
        $(OBJDUMP) -S bootblock.o > bootblock.asm
        $(OBJCOPY) -S -O binary -j .text bootblock.o bootblock
        ./sign.pl bootblock
  _ln, _ls, etc.
    Executable user programs
    The compilation process is explained after a few slides
  xv6.img
    Image of xv6 created
      xv6.img: bootblock kernel
        dd if=/dev/zero of=xv6.img count=10000
        dd if=bootblock of=xv6.img conv=notrunc
        dd if=kernel of=xv6.img seek=1 conv=notrunc
  bootblock
    Built by the bootblock rule shown under "asm files"
  kernel
      kernel: $(OBJS) entry.o entryother initcode kernel.ld
        $(LD) $(LDFLAGS) -T kernel.ld -o kernel entry.o $(OBJS) -b binary initcode entryother
        $(OBJDUMP) -S kernel > kernel.asm
        $(OBJDUMP) -t kernel | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > kernel.sym
  fs.img
    A disk image containing user programs and README
      fs.img: mkfs README $(UPROGS)
        ./mkfs fs.img README $(UPROGS)
  .sym files
    Symbol tables of different programs
    E.g. for the file "kernel", the last line of the kernel rule above writes kernel.sym

Size of xv6 C code
  wc *[ch] | sort -n
    10595 34249 278455 total
  Out of which (the pattern *[ch] also matches dot-bochsrc, because the name ends in c)
    738 4271 33514 dot-bochsrc
  wc cat.c echo.c forktest.c grep.c init.c kill.c ln.c ls.c mkdir.c rm.c sh.c stressfs.c usertests.c wc.c yes.c zombie.c
    2849 6864 51993 total
  So the total code is 10595 - 2849 - 738 = 7008 lines

List of commands to try (in the given order)
  usertests                          # runs a lot of tests and takes up to 10 minutes
  stressfs                           # opens, reads and writes to files in parallel
  ls                                 # output is name, file type, inode number, size
  cat README
  ls;ls
  cat README | grep BUILD
  echo hi there
  echo hi there | grep hi
  echo "hi there

  echo README | grep Wa
  echo README | grep Wa | grep ty    # does not work
  cat README | grep Wa | grep bl     # works
  ls > out                           # takes time!
  mkdir test
  ln cat xyz; ls
  rm xyz; ls
  ls ../                             # works from inside test
  cd                                 # fails
  cd /                               # works
  wc README
  rm out
  ls . test                          # listing both directories
  cd test
  ls                                 # fails

User Libraries: Used to link user land programs
  ulib.c
    strcpy, strcmp, strlen, memset, strchr, stat, atoi, memmove
    stat() uses open()
  usys.S -> compiles into usys.o
    Assembly code file. Provides a stub for each system call such as open() (e.g. used in ulib.c); each stub loads the system call number into %eax and traps into the kernel using the "int" instruction.
    Run the following command and see the last 4 lines of the output
      objdump -d usys.o
      00000048 <open>:
        48: b8 0f 00 00 00    mov $0xf,%eax
        4d: cd 40             int $0x40
        4f: c3                ret
  printf.c
    Code for printf()! Interesting to read this code.
    Uses a variable number of arguments.
    The normal technique in C is to use the va_arg macros from <stdarg.h>, but here it uses pointer arithmetic.
    Written using two more functions, printint() and putc(); all output goes through write()
    Where is the code for write()?
  umalloc.c
    This is an implementation of malloc() and free()
    Almost the same as the one in "The C Programming Language" by Kernighan and Ritchie
    Uses sbrk() to get more memory from the xv6 kernel

Understanding the build process in more detail
  Run
    make qemu | tee make-output.txt
  You will get all the compilation commands in make-output.txt

Compiling user land programs
  Normally when you compile a program on Linux
    You compile it for the same 'target' machine (= CPU + OS)
    The compiler itself runs on the same OS
  To compile a user land program for xv6, we don't have a compiler on xv6, so we compile the programs (using make, cc) on Linux, for xv6
    Obviously they can't link with the standard libraries on Linux
      ULIB = ulib.o usys.o printf.o umalloc.o
      _%: %.o $(ULIB)
        $(LD) $(LDFLAGS) -N -e main -Ttext 0 -o $@ $^
        $(OBJDUMP) -S $@ > $*.asm
        $(OBJDUMP) -t $@ | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$$/d' > $*.sym
    $@ is the name of the file being generated
    $^ is the list of dependencies, i.e. %.o and $(ULIB) in this case
  For example, for cat:
      gcc -fno-pic -static -fno-builtin -fno-strict-aliasing -O2 -Wall -MD -ggdb -m32 -Werror -fno-omit-frame-pointer -fno-stack-protector -fno-pie -no-pie -c -o cat.o cat.c
      ld -m elf_i386 -N -e main -Ttext 0 -o _cat cat.o ulib.o usys.o printf.o umalloc.o
      objdump -S _cat > cat.asm
      objdump -t _cat | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$/d' > cat.sym
  mkfs is compiled like a Linux program!
      gcc -Werror -Wall -o mkfs mkfs.c

How to read kernel code?
  Understand the data structures
    Know each global variable, typedef, list, array, etc.
    Know the purpose of each of them
  While reading a code path, e.g. exec()
    Try to 'locate' the key line of code that does the major work
    Initially (but not forever) ignore the 'error checking' code
  Keep summarising what you have read
    Remembering is important!
  To understand kernel code, you should be good with concepts in OS, C, assembly, hardware

Prerequisites for reading the code
  Understanding of core concepts of operating systems
    Memory management, processes, fork-exec, file systems, synchronization, x86 architecture, calling convention, computer organization
  Two approaches:
  1) Read OS basics first, and then start reading xv6 code
     Good approach, but takes more time!
  2) Read some basics, read xv6, repeat
     Gives a head start, but you will have gaps in your understanding of the code until you are done with everything
     We normally follow this approach
  Good knowledge of C, particularly pointers and function pointers
  Data structures: doubly linked lists, queues, structures and pointers