How To Create An Operating System
Table of Contents
1. Introduction
2. Introduction about the x86 architecture and about our OS
3. Setup the development environment
4. First boot with GRUB
5. Backbone of the OS and C++ runtime
6. Base classes for managing x86 architecture
7. GDT
8. IDT and interrupts
9. Theory: physical and virtual memory
10. Memory management: physical and virtual
11. Process management and multitasking
12. External program execution: ELF files
13. Userland and syscalls
14. Modular drivers
15. Some basic modules: console, keyboard
16. IDE Hard disks
17. DOS Partitions
18. EXT2 read-only filesystems
19. Standard C library (libC)
20. UNIX basic tools: sh, cat
21. Lua interpreter
Introduction
Install Vagrant
Vagrant is free and open-source software for creating and configuring virtual development environments. It can be
considered a wrapper around VirtualBox.
Vagrant will help us create a clean virtual development environment on whatever system you are using. The first step is to
download and install Vagrant for your system from http://www.vagrantup.com/.
Install Virtualbox
Oracle VM VirtualBox is a virtualization software package for x86 and AMD64/Intel64-based computers.
Vagrant needs VirtualBox to work. Download and install it for your system from https://www.virtualbox.org/wiki/Downloads.
Once the lucid32 image is ready, we need to define our development environment using a Vagrantfile. Create a file named
Vagrantfile; it defines the prerequisites our environment needs: nasm, make, build-essential, grub and qemu.
Start your box using:
vagrant up
You can now connect to your box over SSH using:
vagrant ssh
The directory containing the Vagrantfile will be mounted by default in the /vagrant directory of the guest VM (in this case,
Ubuntu Lucid32):
cd /vagrant
make all
make run
What is GRUB?
GNU GRUB (short for GNU GRand Unified Bootloader) is a boot loader package from the GNU Project. GRUB is the
reference implementation of the Free Software Foundation's Multiboot Specification, which provides a user the
choice to boot one of multiple operating systems installed on a computer or select a specific kernel configuration
available on a particular operating system's partitions.
To make it simple, GRUB is the first thing booted by the machine (a boot-loader) and will simplify the loading of our kernel
stored on the hard-disk.
The first boot sequence of our kernel is written in Assembly: start.asm and we use a linker file to define our executable
structure: linker.ld.
This boot process also initializes part of our C++ runtime, which will be described in the next chapter.
Multiboot header structure:
struct multiboot_info {
    u32 flags;
    u32 low_mem;
    u32 high_mem;
    u32 boot_device;
    u32 cmdline;
    u32 mods_count;
    u32 mods_addr;
    struct {
        u32 num;
        u32 size;
        u32 addr;
        u32 shndx;
    } elf_sec;
    unsigned long mmap_length;
    unsigned long mmap_addr;
    unsigned long drives_length;
    unsigned long drives_addr;
    unsigned long config_table;
    unsigned long boot_loader_name;
    unsigned long apm_table;
    unsigned long vbe_control_info;
    unsigned long vbe_mode_info;
    unsigned long vbe_mode;
    unsigned long vbe_interface_seg;
    unsigned long vbe_interface_off;
    unsigned long vbe_interface_len;
};
You can use the command mbchk kernel.elf to validate your kernel.elf file against the multiboot standard. You can also
use the command nm -n kernel.elf to validate the offset of the different objects in the ELF binary.
fdisk ./c.img
# Switch to Expert commands
> x
# Change number of cylinders (1-1048576)
> c
> 4
# Change number of heads (1-256, default 16):
> h
> 16
# Change number of sectors/track (1-63, default 63)
> s
> 63
# Return to main menu
> r
# Add a new partition
> n
# Choose primary partition
> p
# Choose partition number
> 1
# Choose first sector (1-4, default 1)
> 1
# Choose last sector, +cylinders or +size{K,M,G} (1-4, default 4)
> 4
# Toggle bootable flag
> a
# Choose first partition for bootable flag
> 1
# Write table to disk and exit
> w
We now need to attach the created partition to a loop device (which allows a file to be accessed like a block device) using
losetup. The offset of the partition is passed as an argument and calculated as: offset = start_sector *
bytes_per_sector.
Using fdisk -l -u c.img , you get: 63 * 512 = 32256.
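The offset arithmetic can be checked with a few lines of C++. This is only a sketch: the names kBytesPerSector and partition_offset are hypothetical and do not come from the OS sources, and the sector size of 512 bytes is the one reported by fdisk above.

```cpp
#include <cstdint>

// Sector size reported by fdisk for this image (assumed: 512 bytes).
constexpr std::uint64_t kBytesPerSector = 512;

// Byte offset of a partition inside the disk image, given its first sector.
// Hypothetical helper, used here only to illustrate the formula.
constexpr std::uint64_t partition_offset(std::uint64_t start_sector) {
    return start_sector * kBytesPerSector;
}

// The partition created above starts at sector 63.
static_assert(partition_offset(63) == 32256, "63 * 512 = 32256");
```

The resulting value is the offset that is handed to losetup when the partition is attached.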
mke2fs /dev/loop1
losetup -d /dev/loop1
See Also
GNU GRUB on Wikipedia
Multiboot specification
C types
In the next steps, we are going to use several integer types in our code. Most of them are unsigned types
(all the bits are used to store the integer, whereas in signed types one bit is used to indicate the sign):
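A minimal sketch of such type definitions is given here, assuming a 32-bit x86 target compiled with gcc. The names u8, u16 and u32 are the ones used throughout the code of this book; the signed aliases s8, s16 and s32 and the exact typedef lines are assumptions, not a copy of the kernel headers.

```cpp
#include <climits>

// Unsigned types: every bit stores part of the value.
typedef unsigned char  u8;
typedef unsigned short u16;
typedef unsigned int   u32;

// Signed counterparts (hypothetical names): one bit carries the sign.
typedef signed char    s8;
typedef signed short   s16;
typedef signed int     s32;

// Check the widths at compile time instead of trusting the compiler defaults.
static_assert(sizeof(u8)  * CHAR_BIT == 8,  "u8 must be 8 bits wide");
static_assert(sizeof(u16) * CHAR_BIT == 16, "u16 must be 16 bits wide");
static_assert(sizeof(u32) * CHAR_BIT == 32, "u32 must be 32 bits wide");
```

The static assertions make the build fail early if the code is ever compiled for a target where these widths do not hold.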
# Linker
LD=ld
LDFLAG= -melf_i386 -static -L ./ -T ./arch/$(ARCH)/linker.ld
# C++ compiler
SC=g++
FLAG= $(INCDIR) -g -O2 -w -trigraphs -fno-builtin -fno-exceptions -fno-stack-protector -O0 -m32 -fno-rtti -nostdlib -nodefaultlibs
# Assembly compiler
ASM=nasm
ASMFLAG=-f elf -o
buf[i] = (j >= 0) ? buf[j] : '0';
print("0x%s", buf);
} else if (c == 's') {
print((char *) va_arg(ap, int));
}
} else
putc(c);
}
return;
}
Assembly interface
Many instructions are available in Assembly but have no equivalent in C (such as cli, sti, in and out), so we need
an interface to these instructions.
In C, we can include Assembly using the directive "asm()"; gcc uses gas to assemble it.
Caution: gas uses the AT&T syntax.
/* output byte */
void Io::outb(u32 ad, u8 v){
    asmv("outb %%al, %%dx" :: "d" (ad), "a" (v));
}
/* output word */
void Io::outw(u32 ad, u16 v){
    asmv("outw %%ax, %%dx" :: "d" (ad), "a" (v));
}
/* output double word */
void Io::outl(u32 ad, u32 v){
    asmv("outl %%eax, %%dx" :: "d" (ad), "a" (v));
}
/* input byte */
u8 Io::inb(u32 ad){
    u8 _v;
    asmv("inb %%dx, %%al" : "=a" (_v) : "d" (ad));
    return _v;
}
/* input word */
u16 Io::inw(u32 ad){
    u16 _v;
    asmv("inw %%dx, %%ax" : "=a" (_v) : "d" (ad));
    return _v;
}
/* input double word */
u32 Io::inl(u32 ad){
    u32 _v;
    asmv("inl %%dx, %%eax" : "=a" (_v) : "d" (ad));
    return _v;
}
Chapter 6: GDT
Thanks to GRUB, your kernel is no longer in real mode but already in protected mode. This mode allows us to use all the
possibilities of the microprocessor, such as virtual memory management, paging and safe multi-tasking.
struct gdtr {
    u16 limite;
    u32 base;
} __attribute__ ((packed));
Caution: the directive __attribute__ ((packed)) tells gcc that the structure should use as little memory as possible.
Without this directive, gcc inserts padding bytes to optimize memory alignment and access during execution.
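The effect of the directive can be observed by comparing two otherwise identical structures. The sketch below assumes gcc or clang (which both accept __attribute__ ((packed))); the structure names GdtrPacked and GdtrUnpacked are hypothetical and used only for the comparison.

```cpp
#include <cstdint>

// Same fields as struct gdtr, without the packed attribute.
// On common ABIs the compiler adds 2 padding bytes after 'limite'
// so that 'base' is 4-byte aligned (size is typically 8).
struct GdtrUnpacked {
    std::uint16_t limite;
    std::uint32_t base;
};

// Same fields with the packed attribute: no padding is inserted.
struct GdtrPacked {
    std::uint16_t limite;
    std::uint32_t base;
} __attribute__ ((packed));

// The processor expects exactly 6 bytes: 2 for the limit, 4 for the base.
static_assert(sizeof(GdtrPacked) == 6, "packed gdtr must be 6 bytes");
```

Without the attribute, the base address would not sit directly after the limit in memory, and the value loaded by the processor would be wrong.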
Now we need to define our GDT table and then load it using LGDT. The GDT table can be stored wherever we want in
memory; its address just needs to be given to the processor through the GDTR register.
The GDT table is composed of segments with the following structure:
struct gdtdesc {
    u16 lim0_15;
    u16 base0_15;
    u8 base16_23;
    u8 acces;
    u8 lim16_19:4;
    u8 other:4;
    u8 base24_31;
} __attribute__ ((packed));

void init_gdt_desc(u32 base, u32 limite, u8 acces, u8 other, struct gdtdesc *desc)
{
    desc->lim0_15 = (limite & 0xffff);
    desc->base0_15 = (base & 0xffff);
    desc->base16_23 = (base & 0xff0000) >> 16;
    desc->acces = acces;
    desc->lim16_19 = (limite & 0xf0000) >> 16;
    desc->other = (other & 0xf);
    desc->base24_31 = (base & 0xff000000) >> 24;
    return;
}
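To see how the base and limit are scattered across the 8 bytes of a descriptor, the same packing can be written with plain shifts on a 64-bit value. This is a sketch for experimenting on a development machine, not part of the kernel: encode_gdt_desc is a hypothetical helper, and the access byte 0x9B and flags 0x0D used in the check are illustrative values for a flat 4 GB code segment.

```cpp
#include <cstdint>

// Pack a segment descriptor into the 8-byte layout of struct gdtdesc,
// returned as a 64-bit value (byte 0 of the descriptor is the lowest byte).
constexpr std::uint64_t encode_gdt_desc(std::uint32_t base, std::uint32_t limite,
                                        std::uint8_t acces, std::uint8_t other) {
    return  static_cast<std::uint64_t>(limite & 0xFFFF)                  // lim0_15
         | (static_cast<std::uint64_t>(base & 0xFFFF) << 16)             // base0_15
         | (static_cast<std::uint64_t>((base >> 16) & 0xFF) << 32)       // base16_23
         | (static_cast<std::uint64_t>(acces) << 40)                     // acces
         | (static_cast<std::uint64_t>((limite >> 16) & 0xF) << 48)      // lim16_19
         | (static_cast<std::uint64_t>(other & 0xF) << 52)               // other
         | (static_cast<std::uint64_t>((base >> 24) & 0xFF) << 56);      // base24_31
}

// The null descriptor (first GDT entry) is all zeros.
static_assert(encode_gdt_desc(0, 0, 0, 0) == 0, "null descriptor");

// Illustrative flat segment: base 0, limit 0xFFFFF, access 0x9B, flags 0x0D.
static_assert(encode_gdt_desc(0x0, 0xFFFFF, 0x9B, 0x0D) == 0x00DF9B000000FFFFULL,
              "flat code segment encoding");
```

Using explicit shifts avoids any dependence on how the compiler orders bit-fields, which makes the layout easy to verify by hand.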
The function init_gdt initializes the GDT. Some parts of the function below will be explained later and are used for
multitasking.
void init_gdt(void)
{
    default_tss.debug_flag = 0x00;
    default_tss.io_map = 0x00;
    default_tss.esp0 = 0x1FFF0;
    default_tss.ss0 = 0x18;
    /* initialize gdt segments */
    init_gdt_desc(0x0, 0x0, 0x0, 0x0, &kgdt[0]);
struct idtr {
    u16 limite;
    u32 base;
} __attribute__ ((packed));
The IDT table is composed of IDT segments with the following structure:
struct idtdesc {
    u16 offset0_15;
    u16 select;
    u16 type;
    u16 offset16_31;
} __attribute__ ((packed));
Caution: the directive __attribute__ ((packed)) tells gcc that the structure should use as little memory as possible.
Without this directive, gcc inserts padding bytes to optimize memory alignment and access during execution.
Now we need to define our IDT table and then load it using LIDT. The IDT table can be stored wherever we want in
memory; its address just needs to be given to the processor through the IDTR register.
Here is a table of common interrupts (maskable hardware interrupts are called IRQs):
IRQ   Description
1     Keyboard Interrupt
6     Floppy Disk
7     LPT1
12    PS2 Mouse
void init_idt_desc(u16 select, u32 offset, u16 type, struct idtdesc *desc)
{
    desc->offset0_15 = (offset & 0xffff);
    desc->select = select;
    desc->type = type;
    desc->offset16_31 = (offset & 0xffff0000) >> 16;
    return;
}
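The layout produced by init_idt_desc can be checked on a development machine by packing the same four fields into a 64-bit value. This is only a sketch: encode_idt_desc is a hypothetical helper, the handler address 0x00102030 is made up, and the gate type 0x8E00 (a present 32-bit interrupt gate with privilege level 0) is an assumed value for INTGATE.

```cpp
#include <cstdint>

// Pack an IDT entry into the 8-byte layout of struct idtdesc,
// returned as a 64-bit value (byte 0 of the entry is the lowest byte).
constexpr std::uint64_t encode_idt_desc(std::uint16_t select, std::uint32_t offset,
                                        std::uint16_t type) {
    return  static_cast<std::uint64_t>(offset & 0xFFFF)              // offset0_15
         | (static_cast<std::uint64_t>(select) << 16)                // select
         | (static_cast<std::uint64_t>(type) << 32)                  // type
         | (static_cast<std::uint64_t>(offset >> 16) << 48);         // offset16_31
}

// Kernel code selector 0x08, made-up handler address, assumed gate type.
static_assert(encode_idt_desc(0x08, 0x00102030, 0x8E00) == 0x00108E0000082030ULL,
              "interrupt gate encoding");
```

The handler address is split in two halves around the selector and the type, which is why init_idt_desc masks and shifts the offset.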
void init_idt(void)
{
    /* Init irq */
    int i;
    for (i = 0; i < IDTSIZE; i++)
        init_idt_desc(0x08, (u32)_asm_schedule, INTGATE, &kidt[i]);
    /* Vectors 0 -> 31 are for exceptions */
    init_idt_desc(0x08, (u32) _asm_exc_GP, INTGATE, &kidt[13]); /* #GP */
    init_idt_desc(0x08, (u32) _asm_exc_PF, INTGATE, &kidt[14]); /* #PF */
    init_idt_desc(0x08, (u32) _asm_schedule, INTGATE, &kidt[32]);
    init_idt_desc(0x08, (u32) _asm_int_1, INTGATE, &kidt[33]);
    init_idt_desc(0x08, (u32) _asm_syscalls, TRAPGATE, &kidt[48]);
    init_idt_desc(0x08, (u32) _asm_syscalls, TRAPGATE, &kidt[128]); //48
    kidtr.limite = IDTSIZE * 8;
    kidtr.base = IDTBASE;
After initialization of our IDT, we need to activate interrupts by configuring the PIC. The following function configures the
two PICs by writing to their internal registers through the processor's output ports with io.outb. We configure the PICs using
the ports:
Master PIC: 0x20 and 0x21
Slave PIC: 0xA0 and 0xA1
For a PIC, there are 2 types of registers:
ICW (Initialization Command Word): reinitializes the controller
OCW (Operation Control Word): configures the controller once initialized (used to mask/unmask the interrupts)
void init_pic(void)
{
    /* Initialization of ICW1 */
    io.outb(0x20, 0x11);
    io.outb(0xA0, 0x11);
    /* Initialization of ICW2 */
    io.outb(0x21, 0x20); /* start vector = 32 */
    io.outb(0xA1, 0x70); /* start vector = 112 */
    /* Initialization of ICW3 */
    io.outb(0x21, 0x04);
    io.outb(0xA1, 0x02);
    /* Initialization of ICW4 */
    io.outb(0x21, 0x01);
    io.outb(0xA1, 0x01);
    /* OCW1: a mask of 0x0 leaves every interrupt unmasked */
    io.outb(0x21, 0x0);
    io.outb(0xA1, 0x0);
}
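With the ICW2 values written above (0x20 for the master and 0x70 for the slave), each IRQ line ends up on a fixed interrupt vector. The mapping can be sketched as follows; irq_to_vector and the two constants are hypothetical names introduced for illustration only.

```cpp
#include <cstdint>

// Base vectors programmed through ICW2 in init_pic.
constexpr int kMasterBase = 0x20;  // written to port 0x21
constexpr int kSlaveBase  = 0x70;  // written to port 0xA1

// IRQ 0-7 are wired to the master PIC, IRQ 8-15 to the slave PIC.
constexpr int irq_to_vector(int irq) {
    return irq < 8 ? kMasterBase + irq : kSlaveBase + (irq - 8);
}

// IRQ 1 (keyboard) arrives on vector 33, the entry filled with _asm_int_1.
static_assert(irq_to_vector(1) == 33, "keyboard vector");
// The first slave IRQ arrives on vector 0x70 = 112.
static_assert(irq_to_vector(8) == 112, "first slave vector");
```

This is why init_idt installs its hardware interrupt handlers starting at entry 32 of the IDT: vectors 0 to 31 stay reserved for processor exceptions.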
ICW1 (ports 0x20 / 0xA0)
|0|0|0|1|x|0|x|x|
         |   | +--- with ICW4 (1) or without (0)
         |   +----- one controller (1), or cascade (0)
         +--------- triggering by level (1) or by edge (0)

ICW2 (ports 0x21 / 0xA1)
|x|x|x|x|x|0|0|0|
 | | | | |
 +----------------- base address for interrupt vectors

ICW3 (ports 0x21 / 0xA1)
|x|x|x|x|x|x|x|x|
 | | | | | | | |
 +------------------ slave controller connected to the port: yes (1), or no (0)

ICW4 (ports 0x21 / 0xA1)
|0|0|0|x|x|x|x|1|
       | | | +------ mode "automatic end of interrupt" AEOI (1)
       | | +-------- mode buffered slave (0) or master (1)
       | +---------- mode buffered (1)
       +------------ mode "special fully nested" (1)
%macro SAVE_REGS 0
    pushad
    push ds
    push es
    push fs
    push gs
    push ebx
    mov bx,0x10
    mov ds,bx
    pop ebx
%endmacro
%macro RESTORE_REGS 0
    pop gs
    pop fs
    pop es
    pop ds
    popad
%endmacro
%macro INTERRUPT 1
global _asm_int_%1
_asm_int_%1:
    SAVE_REGS
    push %1
    call isr_default_int
    pop eax ;; to remove otherwise
    mov al,0x20
    out 0x20,al
    RESTORE_REGS
    iret
%endmacro
These macros will be used to define the interrupt handlers. Saving and restoring the registers prevents their corruption, which will be
very useful for multitasking.
0 = 4 KB
1 = 4 MB
Note: Physical addresses in the page directory or page table are written using 20 bits because these addresses are
aligned on 4 KB, so the last 12 bits should be equal to 0.
A page directory or page table uses 1024 * 4 = 4096 bytes = 4 KB
A page table can address 1024 * 4 KB = 4 MB
A page directory can address 1024 * (1024 * 4 KB) = 4 GB
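The same arithmetic explains how a 32-bit virtual address is split when 4 KB pages are used: 10 bits select the page directory entry, 10 bits select the page table entry, and the last 12 bits are the offset inside the page. The helpers below are a sketch with hypothetical names (pd_index, pt_index, page_offset), not functions from the kernel.

```cpp
#include <cstdint>

constexpr std::uint32_t kPageSize = 4096;  // 4 KB pages
constexpr std::uint32_t kEntries  = 1024;  // entries per directory or table

// Bits 31..22: index into the page directory.
constexpr std::uint32_t pd_index(std::uint32_t v_addr) { return v_addr >> 22; }
// Bits 21..12: index into the page table.
constexpr std::uint32_t pt_index(std::uint32_t v_addr) { return (v_addr >> 12) & 0x3FF; }
// Bits 11..0: offset inside the 4 KB page.
constexpr std::uint32_t page_offset(std::uint32_t v_addr) { return v_addr & 0xFFF; }

// Memory covered by one page table and by a whole page directory.
constexpr std::uint64_t kTableSpan     = static_cast<std::uint64_t>(kEntries) * kPageSize;
constexpr std::uint64_t kDirectorySpan = kEntries * kTableSpan;

static_assert(kTableSpan == 4u * 1024 * 1024, "one page table addresses 4 MB");
static_assert(kDirectorySpan == 4ULL * 1024 * 1024 * 1024, "one directory addresses 4 GB");
```

For example, the address 0x00403025 falls in directory entry 1, table entry 3, at offset 0x25 inside the page.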
But first, we need to initialize our page directory with at least one page table.
Identity Mapping
With the identity mapping model, paging will apply only to the kernel, as the first 4 MB of virtual memory coincide with the
first 4 MB of physical memory.
This model is simple: the first virtual memory page corresponds to the first page in physical memory, the second page corresponds
to the second page in physical memory, and so on.
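Identity mapping the first 4 MB amounts to filling one page table so that entry i points to physical page i. The sketch below illustrates this on an ordinary array; the function name identity_map_first_4mb is hypothetical, and the flag values (bit 0 for "present", bit 1 for "writable") are assumed here from the standard x86 page table entry format.

```cpp
#include <cstdint>

// Assumed x86 page table entry flags.
constexpr std::uint32_t kPresent  = 0x1;  // bit 0: page is present
constexpr std::uint32_t kWritable = 0x2;  // bit 1: page is writable

// Fill one page table (1024 entries) so that virtual page i maps to
// physical page i, covering the first 4 MB of memory.
inline void identity_map_first_4mb(std::uint32_t (&table)[1024]) {
    for (std::uint32_t i = 0; i < 1024; ++i) {
        // The upper 20 bits hold the physical page address, the low bits the flags.
        table[i] = (i * 4096u) | kPresent | kWritable;
    }
}
```

In a real kernel the table itself must live at a 4 KB aligned physical address, and its address is then written into the first entry of the page directory.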
The kernel space in virtual memory, which uses 1 GB of virtual memory, is common to all tasks (kernel and user).
This is implemented by pointing the first 256 entries of the task page directory to the kernel page directory (256 entries * 4 MB = 1 GB; in vmm.cc):
/*
 * Kernel Space. v_addr < USER_OFFSET are addressed by the kernel pages table
 */
for (i=0; i<256; i++)
    pdir[i] = pd0[i];
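The boundary between kernel space and user space follows directly from the number of shared entries. The sketch below derives it and wraps the copy loop in a function that can be tried on ordinary arrays; share_kernel_space, is_kernel_address and the constants are hypothetical names, and the value of kUserOffset is computed from the 256 entries rather than taken from the kernel headers.

```cpp
#include <cstdint>

constexpr std::uint32_t kKernelEntries = 256;               // shared directory entries
constexpr std::uint32_t kBytesPerEntry = 4u * 1024 * 1024;  // each entry covers 4 MB

// First virtual address that belongs to user space: 256 * 4 MB = 1 GB.
constexpr std::uint32_t kUserOffset = kKernelEntries * kBytesPerEntry;

// Copy the kernel entries of the kernel page directory (pd0)
// into a task page directory (pdir).
inline void share_kernel_space(std::uint32_t* pdir, const std::uint32_t* pd0) {
    for (std::uint32_t i = 0; i < kKernelEntries; ++i)
        pdir[i] = pd0[i];
}

// True when a virtual address is translated through the shared kernel entries.
constexpr bool is_kernel_address(std::uint32_t v_addr) {
    return v_addr < kUserOffset;
}

static_assert(kUserOffset == 0x40000000u, "kernel space ends at 1 GB");
```

Because every task shares these entries, kernel code and data stay mapped at the same virtual addresses whichever task is running.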