How To Create An Operating System
Introduction
API Posix
LibC
"Can" run a shell or some executables (e.g., lua)
Install Vagrant
Vagrant is free and open-source software for creating and configuring virtual
development environments. It can be considered a wrapper around VirtualBox.
Vagrant will help us create a clean virtual development environment on whatever system you
are using. The first step is to download and install Vagrant for your system at
http://www.vagrantup.com/.
Install VirtualBox
Oracle VM VirtualBox is a virtualization software package for x86 and AMD64/Intel64-based computers.
Vagrant needs VirtualBox to work. Download and install it for your system at
https://www.virtualbox.org/wiki/Downloads.
Once the lucid32 image is ready, we need to define our development environment using a
Vagrantfile. Create a file named Vagrantfile; it defines the prerequisites our
environment needs: nasm, make, build-essential, grub and qemu.
Start your box using:
vagrant up
You can now connect to your box over SSH using:
vagrant ssh
The directory containing the Vagrantfile will be mounted by default in the /vagrant directory
of the guest VM (in this case, Ubuntu Lucid32):
cd /vagrant
Throughout this process the CPU has been running in 16-bit Real Mode, which is the default
state for x86 CPUs in order to maintain backwards compatibility. To execute the 32-bit
instructions within our kernel, a bootloader is required to switch the CPU into Protected
Mode.
What is GRUB?
GNU GRUB (short for GNU GRand Unified Bootloader) is a boot loader package from
the GNU Project. GRUB is the reference implementation of the Free Software
Foundation's Multiboot Specification, which provides a user the choice to boot one of
multiple operating systems installed on a computer or select a specific kernel
configuration available on a particular operating system's partitions.
To make it simple, GRUB is the first thing booted by the machine (a boot-loader) and will
simplify the loading of our kernel stored on the hard-disk.
struct multiboot_info {
u32 flags;
u32 low_mem;
u32 high_mem;
u32 boot_device;
u32 cmdline;
u32 mods_count;
u32 mods_addr;
struct {
u32 num;
u32 size;
u32 addr;
u32 shndx;
} elf_sec;
unsigned long mmap_length;
unsigned long mmap_addr;
unsigned long drives_length;
unsigned long drives_addr;
unsigned long config_table;
unsigned long boot_loader_name;
unsigned long apm_table;
unsigned long vbe_control_info;
unsigned long vbe_mode_info;
unsigned long vbe_mode;
unsigned long vbe_interface_seg;
unsigned long vbe_interface_off;
unsigned long vbe_interface_len;
};
You can use the command mbchk kernel.elf to validate your kernel.elf file against the
multiboot standard. You can also use the command nm -n kernel.elf to validate the offset
of the different objects in the ELF binary.
fdisk ./c.img
# Switch to Expert commands
> x
# Change number of cylinders (1-1048576)
> c
> 4
# Change number of heads (1-256, default 16):
> h
> 16
# Change number of sectors/track (1-63, default 63)
> s
> 63
# Return to main menu
> r
# Add a new partition
> n
# Choose primary partition
> p
# Choose partition number
> 1
# Choose first sector (1-4, default 1)
> 1
# Choose last sector, +cylinders or +size{K,M,G} (1-4, default 4)
> 4
# Toggle bootable flag
> a
# Choose first partition for bootable flag
> 1
# Write table to disk and exit
> w
We now need to attach the created partition to a loop device using losetup. This allows a
file to be accessed like a block device. The offset of the partition is passed as an argument
and is calculated as: offset = start_sector * bytes_by_sector.
Using fdisk -l -u c.img to read the start sector (63) and the sector size (512 bytes), you get:
63 * 512 = 32256.
See Also
GNU GRUB on Wikipedia
Multiboot specification
C types
In the next step, we're going to define the different types we're going to use in our code. Most of
our variable types are going to be unsigned, meaning that all the bits are used to store
the magnitude of the integer. Signed variables use their most significant bit to indicate their sign.
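As an illustration, fixed-width aliases of this kind can be sketched as follows. The u8/u16/u32 names match the ones used in the listings of this book, but the definitions below (built on <stdint.h>) and the helper functions u8_max and s8_max are assumptions made for this example, not the kernel's actual header.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* fixed-width integer types, available freestanding */

/* Assumed aliases: the names follow the listings, the definitions are a sketch. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef int8_t   s8;
typedef int16_t  s16;
typedef int32_t  s32;

/* Check the widths at compile time so a wrong alias fails the build. */
static_assert(sizeof(u8)  == 1, "u8 must be 1 byte");
static_assert(sizeof(u16) == 2, "u16 must be 2 bytes");
static_assert(sizeof(u32) == 4, "u32 must be 4 bytes");

/* Hypothetical helper: all 8 bits store the value, so the maximum is 255. */
u8 u8_max(void)
{
    return (u8)0xFF;
}

/* Hypothetical helper: the top bit carries the sign, so the maximum is 127. */
s8 s8_max(void)
{
    return (s8)0x7F;
}
```

With the same 8 bits, the unsigned type reaches 255 while the signed one stops at 127, which is why unsigned types are preferred for addresses, ports and bit masks.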
size = 0;
neg = 0;
if (c == 0)
    break;
else if (c == '%') {
    c = *s++;
    if (c >= '0' && c <= '9') {
        size = c - '0';
        c = *s++;
    }
    if (c == 'd') {
        ival = va_arg(ap, int);
        if (ival < 0) {
            uival = 0 - ival;
            neg++;
        } else
            uival = ival;
        itoa(buf, uival, 10);
        buflen = strlen(buf);
        if (buflen < size)
            for (i = size, j = buflen; i >= 0; i--, j--)
                buf[i] = (j >= 0) ? buf[j] : '0';
        if (neg)
            print("-%s", buf);
        else
            print(buf);
    }
    else if (c == 'u') {
        uival = va_arg(ap, int);
        itoa(buf, uival, 10);
        buflen = strlen(buf);
        if (buflen < size)
            for (i = size, j = buflen; i >= 0; i--, j--)
                buf[i] = (j >= 0) ? buf[j] : '0';
        print(buf);
    } else if (c == 'x' || c == 'X') {
        uival = va_arg(ap, int);
        itoa(buf, uival, 16);
        buflen = strlen(buf);
        if (buflen < size)
Assembly interface
Many instructions are available in assembly but have no equivalent in C (such as
cli, sti, in and out), so we need an interface to these instructions.
In C, we can include assembly using the directive "asm()"; gcc uses gas to assemble
it.
Caution: gas uses the AT&T syntax.
/* output byte */
void Io::outb(u32 ad, u8 v){
    asmv("outb %%al, %%dx" :: "d" (ad), "a" (v));
}
/* output word */
void Io::outw(u32 ad, u16 v){
    asmv("outw %%ax, %%dx" :: "d" (ad), "a" (v));
}
/* output double word */
void Io::outl(u32 ad, u32 v){
    asmv("outl %%eax, %%dx" :: "d" (ad), "a" (v));
}
/* input byte */
u8 Io::inb(u32 ad){
    u8 _v;
    asmv("inb %%dx, %%al" : "=a" (_v) : "d" (ad));
    return _v;
}
/* input word */
u16 Io::inw(u32 ad){
    u16 _v;
    asmv("inw %%dx, %%ax" : "=a" (_v) : "d" (ad));
    return _v;
}
/* input double word */
u32 Io::inl(u32 ad){
    u32 _v;
    asmv("inl %%dx, %%eax" : "=a" (_v) : "d" (ad));
    return _v;
}
Chapter 6: GDT
Thanks to GRUB, your kernel is no longer in real mode but already in protected mode. This
mode allows us to use all the capabilities of the microprocessor, such as virtual memory
management, paging and safe multitasking.
struct gdtr {
    u16 limite;
    u32 base;
} __attribute__ ((packed));
Caution: the directive __attribute__ ((packed)) tells gcc to lay out the structure without
padding, so that it uses as little memory as possible. Without this directive, gcc inserts
padding bytes to align the members in memory and speed up access during execution.
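The effect of the packed attribute can be checked with a small sketch. The two structure names below are hypothetical and exist only for this comparison; the sketch assumes gcc or clang, which both accept __attribute__((packed)).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof */
#include <stdint.h>

/* Same members as struct gdtr, but left to the compiler's default layout. */
struct gdtr_padded {
    uint16_t limite;
    uint32_t base;
};

/* Same members, packed: no padding between limite and base. */
struct gdtr_packed {
    uint16_t limite;
    uint32_t base;
} __attribute__((packed));

/* The processor expects exactly 6 bytes: a 16-bit limit followed by a 32-bit base. */
static_assert(sizeof(struct gdtr_packed) == 6, "packed gdtr must be 6 bytes");
static_assert(offsetof(struct gdtr_packed, base) == 2, "base must follow limite directly");

/* Size of the default layout (typically 8 bytes: 2 padding bytes after limite). */
size_t padded_size(void)
{
    return sizeof(struct gdtr_padded);
}

/* Size of the packed layout. */
size_t packed_size(void)
{
    return sizeof(struct gdtr_packed);
}
```

If the padded layout were passed to LGDT, the processor would read the padding bytes as part of the base address, so the packed form is required here rather than just a space optimization.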
Now we need to define our GDT table and then load it using LGDT. The GDT table can be
stored wherever we want in memory; its address just has to be passed to the processor
through the GDTR register.
The GDT table is composed of segments with the following structure:
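A descriptor of this kind can be sketched in C as follows, based on the standard 8-byte x86 segment descriptor layout. The structure name gdt_entry, its field names and the helper make_gdt_entry are assumptions made for this example; they are not necessarily the names used by the kernel's own init_gdt_desc.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical layout of one 8-byte x86 segment descriptor. */
struct gdt_entry {
    uint16_t limit_0_15;        /* limit, bits 0..15 */
    uint16_t base_0_15;         /* base, bits 0..15 */
    uint8_t  base_16_23;        /* base, bits 16..23 */
    uint8_t  access;            /* present bit, privilege level, type */
    uint8_t  flags_limit_16_19; /* high nibble: flags, low nibble: limit bits 16..19 */
    uint8_t  base_24_31;        /* base, bits 24..31 */
} __attribute__((packed));

static_assert(sizeof(struct gdt_entry) == 8, "a descriptor is 8 bytes");

/* Hypothetical helper: scatter base, limit, access and flags into the descriptor. */
struct gdt_entry make_gdt_entry(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    struct gdt_entry e;
    e.limit_0_15 = (uint16_t)(limit & 0xFFFF);
    e.base_0_15 = (uint16_t)(base & 0xFFFF);
    e.base_16_23 = (uint8_t)((base >> 16) & 0xFF);
    e.access = access;
    e.flags_limit_16_19 = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    e.base_24_31 = (uint8_t)((base >> 24) & 0xFF);
    return e;
}
```

For example, make_gdt_entry(0x0, 0xFFFFF, 0x9B, 0x0D) describes a flat code segment: the 20-bit limit and the 32-bit base are split across non-contiguous fields, which is why a helper is preferable to filling the bytes by hand.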
The function init_gdt initializes the GDT. Some parts of the function below are used for
multitasking and will be explained later.
void init_gdt(void)
{
    default_tss.debug_flag = 0x00;
    default_tss.io_map = 0x00;
    default_tss.esp0 = 0x1FFF0;
    default_tss.ss0 = 0x18;
    /* initialize gdt segments */
    init_gdt_desc(0x0, 0x0, 0x0, 0x0, &kgdt[0]);
    init_gdt_desc(0x0, 0xFFFFF, 0x9B, 0x0D, &kgdt[1]); /* code */
    init_gdt_desc(0x0, 0xFFFFF, 0x93, 0x0D, &kgdt[2]); /* data */
    init_gdt_desc(0x0, 0x0, 0x97, 0x0D, &kgdt[3]); /* stack */
    init_gdt_desc(0x0, 0xFFFFF, 0xFF, 0x0D, &kgdt[4]); /* ucode */
    init_gdt_desc(0x0, 0xFFFFF, 0xF3, 0x0D, &kgdt[5]); /* udata */
    init_gdt_desc(0x0, 0x0, 0xF7, 0x0D, &kgdt[6]); /* ustack */
    init_gdt_desc((u32) & default_tss, 0x67, 0xE9, 0x00, &kgdt[7]); /* TSS descriptor */
    /* initialize the gdtr structure */
    kgdtr.limite = GDTSIZE * 8;
    kgdtr.base = GDTBASE;
    /* copy the GDT to its memory area */
    memcpy((char *) kgdtr.base, (char *) kgdt, kgdtr.limite);
    /* load the gdtr register */
    asm("lgdtl (kgdtr)");
    /* initialize the segment registers */
    asm(" movw $0x10, %ax \n \
        movw %ax, %ds \n \
        movw %ax, %es \n \
        movw %ax, %fs \n \
        movw %ax, %gs \n \
        ljmp $0x08, $next \n \
        next: \n");
}
The Interrupt Descriptor Table (IDT) is a data structure used by the x86 architecture to
implement an interrupt vector table. The IDT is used by the processor to determine the
correct response to interrupts and exceptions.
Our kernel is going to use the IDT to define the functions to be executed when an
interrupt occurs.
Like the GDT, the IDT is loaded using the LIDTL assembly instruction. It expects the location
of an IDT description structure:
struct idtr {
    u16 limite;
    u32 base;
} __attribute__ ((packed));
The IDT table is composed of IDT descriptors with the following structure:
struct idtdesc {
    u16 offset0_15;
    u16 select;
    u16 type;
    u16 offset16_31;
} __attribute__ ((packed));
Caution: the directive __attribute__ ((packed)) tells gcc to lay out the structure without
padding, so that it uses as little memory as possible. Without this directive, gcc inserts
padding bytes to align the members in memory and speed up access during execution.
Now we need to define our IDT table and then load it using LIDTL. The IDT table can be
stored wherever we want in memory; its address just has to be passed to the processor
through the IDTR register.
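Filling one IDT descriptor can be sketched as follows. The structure mirrors struct idtdesc above, but the names idt_gate, make_idt_gate, GATE_INT32 and GATE_TRAP32_USER are assumptions made for this example. The two constants are the standard x86 encodings of a 32-bit interrupt gate (ring 0) and a 32-bit trap gate callable from ring 3; whether the kernel's own INTGATE and TRAPGATE use these exact values is not shown in this chapter.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Same layout as struct idtdesc, with standard fixed-width types. */
struct idt_gate {
    uint16_t offset0_15;   /* handler address, bits 0..15 */
    uint16_t select;       /* code segment selector used to run the handler */
    uint16_t type;         /* low byte reserved (0), high byte: present, DPL, gate type */
    uint16_t offset16_31;  /* handler address, bits 16..31 */
} __attribute__((packed));

static_assert(sizeof(struct idt_gate) == 8, "an IDT descriptor is 8 bytes");

/* Assumed constants: standard x86 gate encodings, named here for illustration. */
#define GATE_INT32       0x8E00  /* present, DPL 0, 32-bit interrupt gate */
#define GATE_TRAP32_USER 0xEF00  /* present, DPL 3, 32-bit trap gate */

/* Hypothetical helper: split the 32-bit handler address across the two offset fields. */
struct idt_gate make_idt_gate(uint16_t select, uint32_t offset, uint16_t type)
{
    struct idt_gate g;
    g.offset0_15 = (uint16_t)(offset & 0xFFFF);
    g.select = select;
    g.type = type;
    g.offset16_31 = (uint16_t)((offset >> 16) & 0xFFFF);
    return g;
}
```

The selector 0x08 used throughout this chapter refers to the second GDT entry (the kernel code segment), so every handler runs with kernel code privileges.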
Here is a table of common interrupts (maskable hardware interrupts are called IRQs):
IRQ   Description
1     Keyboard Interrupt
6     Floppy Disk
7     LPT1
10    Free for peripherals
11    Free for peripherals
12    PS2 Mouse
13    FPU
14    Primary ATA
15    Secondary ATA
void init_idt(void)
{
    /* Init irq */
    int i;
    for (i = 0; i < IDTSIZE; i++)
        init_idt_desc(0x08, (u32)_asm_schedule, INTGATE, &kidt[i]);
    /* Vectors 0 -> 31 are for exceptions */
    init_idt_desc(0x08, (u32) _asm_exc_GP, INTGATE, &kidt[13]); /* #GP */
    init_idt_desc(0x08, (u32) _asm_exc_PF, INTGATE, &kidt[14]); /* #PF */
    init_idt_desc(0x08, (u32) _asm_schedule, INTGATE, &kidt[32]);
    init_idt_desc(0x08, (u32) _asm_int_1, INTGATE, &kidt[33]);
    init_idt_desc(0x08, (u32) _asm_syscalls, TRAPGATE, &kidt[48]);
    init_idt_desc(0x08, (u32) _asm_syscalls, TRAPGATE, &kidt[128]); //48
    kidtr.limite = IDTSIZE * 8;
    kidtr.base = IDTBASE;
After initializing our IDT, we need to activate interrupts by configuring the PIC. The
following function configures the two PICs by writing to their internal registers through the
output ports of the processor, using io.outb. We configure the PICs using the ports:
Master PIC: 0x20 and 0x21
Slave PIC: 0xA0 and 0xA1
For a PIC, there are 2 types of registers:
ICW (Initialization Command Word): reinitializes the controller
OCW (Operation Control Word): configures the controller once initialized (used to
mask/unmask the interrupts)
void init_pic(void)
{
    /* Initialization of ICW1 */
    io.outb(0x20, 0x11);
    io.outb(0xA0, 0x11);
    /* Initialization of ICW2 */
    io.outb(0x21, 0x20); /* start vector = 32 */
    io.outb(0xA1, 0x70); /* start vector = 112 */
    /* Initialization of ICW3 */
    io.outb(0x21, 0x04);
    io.outb(0xA1, 0x02);
    /* Initialization of ICW4 */
    io.outb(0x21, 0x01);
    io.outb(0xA1, 0x01);
    /* interrupt mask (OCW1): 0 leaves every IRQ unmasked */
    io.outb(0x21, 0x0);
    io.outb(0xA1, 0x0);
}
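The ICW2 values written above decide which IDT vector each IRQ line raises. As a minimal sketch, assuming the same bases as in init_pic (0x20 for the master, 0x70 for the slave), the mapping can be computed as follows. The function irq_to_vector and the two macro names are hypothetical and introduced only for this example.

```c
#include <assert.h>

/* Assumed to match the ICW2 values written in init_pic. */
#define PIC_MASTER_BASE 0x20  /* IRQ 0..7  -> vectors 32..39 */
#define PIC_SLAVE_BASE  0x70  /* IRQ 8..15 -> vectors 112..119 */

/* Hypothetical helper: return the IDT vector raised by an IRQ line, or -1 if invalid. */
int irq_to_vector(unsigned int irq)
{
    if (irq < 8)
        return PIC_MASTER_BASE + (int)irq;        /* handled by the master PIC */
    if (irq < 16)
        return PIC_SLAVE_BASE + (int)(irq - 8);   /* handled by the slave PIC */
    return -1;                                    /* the two PICs only provide 16 lines */
}
```

With these bases, the keyboard (IRQ 1) raises vector 33, which is the entry that init_idt points at _asm_int_1, and IRQ 0 raises vector 32, the entry pointing at _asm_schedule.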
|x|x|x|x|x|x|x|x|
 | | | | | | | |
 +-+-+-+-+-+-+-+---- slave controller connected to the port: yes (1) or no (0)
%macro SAVE_REGS 0
    pushad
    push ds
    push es
    push fs
    push gs
    push ebx
    mov bx,0x10
    mov ds,bx
    pop ebx
%endmacro
%macro RESTORE_REGS 0
    pop gs
    pop fs
    pop es
    pop ds
    popad
%endmacro
%macro INTERRUPT 1
global _asm_int_%1
_asm_int_%1:
    SAVE_REGS
    push %1
    call isr_default_int
    pop eax     ;; remove the pushed argument from the stack
    mov al,0x20
    out 0x20,al
    RESTORE_REGS
    iret
%endmacro
These macros are used to define the interrupt handlers. Saving and restoring the registers
around each handler prevents them from being corrupted, which will be very useful for multitasking.
0 = 4 KB
1 = 4 MB
Note: Physical addresses in the page directory or page tables are written using 20 bits
because these addresses are aligned on 4 KB, so the last 12 bits should be equal to 0.
A page directory or page table holds 1024 entries of 4 bytes each, so it uses
1024 * 4 = 4096 bytes = 4 KB.
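The note above can be made concrete with a short sketch: since the low 12 bits of an aligned address are always zero, an entry can reuse them for flags. The names make_entry, entry_frame, entry_flags and the PAGING_* macros are assumptions made for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define PAGING_ENTRIES    1024u        /* entries per page directory or page table */
#define PAGING_FRAME_MASK 0xFFFFF000u  /* upper 20 bits: 4 KB-aligned physical address */
#define PAGING_FLAGS_MASK 0x00000FFFu  /* lower 12 bits: free for flags */

/* 1024 entries of 4 bytes each occupy exactly one 4 KB page. */
static_assert(PAGING_ENTRIES * sizeof(uint32_t) == 4096, "a table fills one page");

/* Hypothetical helper: combine an aligned physical address with its flag bits. */
uint32_t make_entry(uint32_t frame_addr, uint32_t flags)
{
    return (frame_addr & PAGING_FRAME_MASK) | (flags & PAGING_FLAGS_MASK);
}

/* Hypothetical helper: recover the physical address stored in an entry. */
uint32_t entry_frame(uint32_t entry)
{
    return entry & PAGING_FRAME_MASK;
}

/* Hypothetical helper: recover the flag bits stored in an entry. */
uint32_t entry_flags(uint32_t entry)
{
    return entry & PAGING_FLAGS_MASK;
}
```

Masking in both directions means an unaligned address or an oversized flag value cannot silently corrupt the other half of the entry.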
But first, we need to initialize our page directory with at least one page table.
Identity Mapping
With the identity mapping model, paging applies only to the kernel, as the first 4 MB of
virtual memory coincide with the first 4 MB of physical memory:
This model is simple: the first virtual memory page corresponds to the first page in physical
memory, the second page to the second page in physical memory, and so on.
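Identity mapping of the first 4 MB can be sketched as a loop that fills one page table so that entry i points at physical page i. The function names, the macros and the choice of flags (present and writable, the two lowest bits of an x86 page table entry) are assumptions made for this example, not a copy of the kernel's code.

```c
#include <stdint.h>

#define PAGE_SIZE     4096u
#define TABLE_ENTRIES 1024u
#define PAGE_PRESENT  0x1u   /* bit 0: the page is present in memory */
#define PAGE_WRITE    0x2u   /* bit 1: the page is writable */

/* Hypothetical helper: make entry i of the table point at physical page i. */
void identity_map_first_4mb(uint32_t table[TABLE_ENTRIES])
{
    uint32_t i;
    for (i = 0; i < TABLE_ENTRIES; i++)
        table[i] = (i * PAGE_SIZE) | PAGE_PRESENT | PAGE_WRITE;
}

/*
 * Hypothetical helper: translate a virtual address below 4 MB through the
 * table, the way the processor would once paging is enabled.
 */
uint32_t translate(const uint32_t *table, uint32_t v_addr)
{
    uint32_t entry = table[(v_addr >> 12) & 0x3FFu];  /* bits 12..21 select the entry */
    return (entry & 0xFFFFF000u) | (v_addr & 0xFFFu); /* page address + offset in page */
}
```

One table of 1024 entries covers 1024 * 4 KB = 4 MB, so a single page table referenced from the first page directory entry is enough for this model, and translate returns its input unchanged for every address in that range.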
The kernel space in virtual memory, which uses 1 GB of virtual memory, is common to all
tasks (kernel and user).
This is implemented by copying the first 256 entries of the kernel page directory into the
task page directory (in vmm.cc). Each entry covers 4 MB, so 256 entries cover 1 GB:
/*
 * Kernel space: addresses v_addr < USER_OFFSET are mapped by the kernel page tables
 */
for (i=0; i<256; i++)
    pdir[i] = pd0[i];