Assignment MLQ Paging v20
April 8, 2023
Goal: The objective of this assignment is to simulate the major components of a simple operating system, for example the scheduler, synchronization, and the related operations of physical and virtual memory.
Content: In detail, students will practice with three major modules: the scheduler, synchronization, and the mechanism of memory allocation from virtual to physical memory.
• scheduler
• synchronization
• memory allocation from virtual to physical memory
Result: After this assignment, students can partly understand the principles of a simple OS. They can understand and describe the role of the key OS modules.
Contents
1 Introduction
  1.1 An overview
  1.2 Source Code
  1.3 Processes
  1.4 How to Create a Process?
  1.5 How to Run the Simulation
2 Implementation
  2.1 Scheduler
  2.2 Memory Management
    2.2.1 The virtual memory mapping in each process
    2.2.2 The system physical memory
    2.2.3 Paging-based address translation scheme
    2.2.4 Wrapping-up all paging-oriented implementations
  2.3 Put It All Together
3 Submission
  3.1 Source code
  3.2 Report
  3.3 Grading
1 Introduction
1.1 An overview
The assignment is about simulating a simple operating system to help students understand the fundamental concepts of scheduling, synchronization and memory management. Figure 1 shows the overall architecture of the operating system we are going to implement. Generally, the OS has to manage two virtual resources, CPU(s) and RAM, using two core components:
• Scheduler (and Dispatcher): determines which process is allowed to run on which CPU.
• Virtual memory engine (VME): isolates the memory space of each process from the others. The physical RAM is shared by multiple processes, but each process is unaware of the others' existence. This is done by giving each process its own virtual memory space; the virtual memory engine maps and translates the virtual addresses used by processes to the corresponding physical addresses.
Through those modules, the OS allows multiple processes created by users to share and use the virtual computing resources. Therefore, in this assignment, we focus on implementing the scheduler/dispatcher and the virtual memory engine.
1.2 Source Code

• Header files
– loader.h: Functions used by the loader, which loads the program from disk to memory.
– common.h: Defines structs and functions used everywhere in the OS.
– bitopts.h: Defines operations on bit data.
– os-mm.h, mm.h: Define the structures and basic data for paging-based memory management.
– os-cfg.h: (Optional) Defines constants used to switch the software configuration.
• Source files
• Makefile
1.3 Processes
We are going to build a multitasking OS which lets multiple processes run simultaneously, so it is worth spending some space explaining the organization of processes. The OS manages processes through their PCBs, described as follows:
// From include/common.h
struct pcb_t {
    uint32_t pid;
    uint32_t priority;
    struct code_seg_t *code;
    addr_t regs[10];
    uint32_t pc;
#ifdef MLQ_SCHED
    uint32_t prio;
#endif
    struct page_table_t *page_table;
    uint32_t bp;
};
• priority: Process priority; the lower the value, the higher the priority of the process. This legacy priority depends on the process's properties and is fixed over the execution session.
• code: Text segment of the process (to simplify the simulation, we do not put the text segment in RAM).
• regs: Registers; each process could use up to 10 registers, numbered from 0 to 9.
• pc: The current position of the program counter.
• page table: The translation from virtual addresses to physical addresses (obsolete, do not use).
• bp: Break pointer, used to manage the heap segment.
• prio: Priority on execution (if supported); this value overrides the default priority when it exists.
Similar to a real process, each process in this simulation is just a list of instructions executed by the CPU one by one from the beginning to the end (we do not implement jump instructions here). There are five instructions a process could perform:
• CALC: do some calculation using the CPU. This instruction takes no arguments.
Annotation of memory region: a storage area where we allocate the storage space for a variable. This term is actually associated with an index in the SYMBOL TABLE and usually supports human readability through a variable name and a mapping mechanism. Unfortunately, this mapping is out of scope for this Operating Systems course; it belongs to another course, which explains how the compiler does its job and maps a label to its associated index. For simplicity, here we refer to a memory region by its index, and there is a limit on the number of variables in each program/process.
• ALLOC: Allocate a chunk of bytes on the main memory (RAM). Instruction syntax:
alloc [size] [reg]
where size is the number of bytes the process wants to allocate from RAM and reg is the number of the register which will save the address of the first byte of the allocated memory region. For example, the instruction alloc 124 7 will allocate 124 bytes from the OS, and the address of the first of those 124 bytes will be stored in register #7.
• FREE: Free allocated memory. Syntax:
free [reg]
where reg is the number of the register holding the address of the first byte of the memory region to be deallocated.
• READ: Read a byte from memory. Syntax:
read [source] [offset] [destination]
The instruction reads one byte of memory at the address equal to the value of register source + offset and saves it to register destination. For example, assume the value of register #1 is 0x123; then the instruction read 1 20 2 will read one byte of memory at address 0x123 + 0x14 (20 in decimal is 0x14 in hexadecimal) and save it to register #2.
• WRITE: Write a register value to memory. Syntax:
write [data] [destination] [offset]
The instruction writes data to the address equal to the value of register destination + offset. For example, assume the value of register #1 is 0x123; then the instruction write 10 1 20 will write 10 to memory at address 0x123 + 0x14 (20 in decimal is 0x14 in hexadecimal).
1.4 How to Create a Process?
Each program file starts with a line of the form [priority] [N], where priority is the default priority of the process created from this program. It should be noted that this system employs a dual priority mechanism.
The higher the priority (i.e. the smaller the value), the higher the chance the process is picked up from the queue by a CPU (see section 2.1 for more detail). N is the number of instructions, and each of the next N line(s) is an instruction in the format described in the previous section. You can open the files in the input/proc directory to see some sample programs.
Dual priority mechanism: Please remember that this default value can be overwritten by the live priority supplied when the process is invoked. To resolve the conflict: when a priority is given at process loading time (in the input configure file), it overrides the default priority from the process description file.
1.5 How to Run the Simulation

Each configure file starts with a line of the form [time slice] [N] [M], where time slice is the amount of time (in seconds) for which a process is allowed to run, N is the number of CPUs available, and M is the number of processes to be run.
From the second line onward, each line represents a process: its arrival time, the path to the file holding the content of the program to be loaded, and its priority. The last parameter, priority, is the live priority when the process is invoked; it overrides the default priority in the process description file (refer to section 1.4). You can find configure files in the input directory.
Again, it is worth reminding that this system employs a dual priority mechanism. Without the default priority we would not have enough material to resolve conflicts during the scheduling procedure; but if the priority were always fixed, it would limit the algorithms whose theory the simulation can illustrate. Compare with your real-life environment: there are different priority systems, one distinguishing system programs from user programs, while another also allows you to change the live priority.
To start the simulation, you must compile the source code first by running the make all command. After that, run the command
./os [configure_file]
where configure file is the path to the configure file for the environment on which you want to run; it should be associated with the name of a description file placed in the input directory.
2 Implementation
2.1 Scheduler
We first implement the scheduler. Figure 2 shows how the operating system schedules processes. The OS is designed to work on multiple processors. The OS uses multiple queues, called ready queues, to determine which process is to be executed when a CPU becomes available. Each queue is associated with a fixed priority value. The scheduler is designed based on the "multilevel queue" algorithm used in the Linux kernel¹.
According to Figure 2, the scheduler works as follows. For each new program, the loader creates a new process and assigns a new PCB to it. The loader then reads and copies the content of the program to the text segment of the new process (pointed to by the code pointer in the PCB of the process; see section 1.3). The PCB of the process is pushed to the ready queue whose priority matches the prio value of the process. Then, it waits for a CPU. The CPUs run processes in round-robin style. Each process is allowed to run for a time slice; after that, the CPU is forced to enqueue the process back into its associated priority ready queue. The CPU then picks up another process from the ready queues and continues running.
In this system, we implement the Multi-Level Queue (MLQ) policy. The system contains MAX_PRIO priority levels. Although a real system, e.g. the Linux kernel, may group these levels into subsets, for simplicity we keep the design where each priority level is held by one ready queue. We simplify add_queue and put_proc to putting the proc into the appropriate ready queue by priority matching. The main design of the MLQ policy lies in get_proc, which fetches a proc and then dispatches a CPU to run it.
Description of the MLQ policy: the number of slots each ready queue receives per traversal of the ready queue list is a fixed number computed from its priority, i.e. slot = (MAX_PRIO - prio). Each queue has only this fixed number of slots to use the CPU; when they are used up, the system must hand the resource to processes in the next queue and leave the remaining work for a future slot, even though that requires a complete round through the ready queue list.
An example: in Linux, MAX_PRIO = 140 and prio = 0..(MAX_PRIO - 1):
prio = 0        | 1            | .... | MAX_PRIO - 1
slot = MAX_PRIO | MAX_PRIO - 1 | .... | 1
The MLQ policy only goes through these fixed steps to traverse all the queues in the priority ready queue list.
Your job in this part is to implement this algorithm by completing the following functions:
• enqueue() and dequeue() (in queue.c): We have defined a struct (queue_t) for a priority queue in queue.h. Your task is to implement these functions to put a new PCB into a queue and get the next 'in turn' PCB out of the queue.
• get_proc() (in sched.c): gets the PCB of a waiting process from the ready queue system. How the 'in turn' ready queue is selected is described in the policy above.
You can compare your results with the model answers in the output directory. Note that because the loader and the scheduler run concurrently, there may be more than one correct answer for each test.
Notice: the run queue is not compatible with the theory and has been obsolete for a while. We do not need it in either the theory or the code implementation; it is legacy/outdated code and can be ignored.

¹ Actually, Linux supports a feedback mechanism which allows moving processes among priority queues, but we do not implement it here.
[Figure 2: MLQ scheduling architecture. The loader creates processes from programs on disk and add_queue() places each PCB into the ready queue matching its priority (priority = 0..139). CPUs fetch processes via get_proc() and return preempted ones via put_proc(). The processed_queue/run_queue is obsolete.]
Question: What is the advantage of using a priority queue in comparison with the other scheduling algorithms you have learned?
2.2 Memory Management

2.2.1 The virtual memory mapping in each process

Memory area: Each memory area ranges continuously over [vm_start, vm_end]. Although the space spans the whole range, the actual usable area is limited by the top, pointed at by sbrk. In the area between vm_start and sbrk there are multiple regions, captured by struct vm_rg_struct, and free slots tracked by the list vm_freerg_list. Through this design, the actual allocation of physical memory is performed only in the usable area, as in Figure 3.
//From include/os-mm.h
/*
 * Memory region struct
 */
struct vm_rg_struct {
    unsigned long rg_start;
    unsigned long rg_end;
    struct vm_rg_struct *rg_next;
};

/*
 * Memory area struct
 */
struct vm_area_struct {
    unsigned long vm_id;
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long sbrk;
    struct vm_rg_struct *vm_freerg_list;
    struct vm_area_struct *vm_next;
};
Memory region: As noted in section 1.3, these regions act as the variables in the human-readable program source code. Since that mapping is out of scope here, we simply touch on the concept of a namespace in terms of indexing: we have not been taught the principles of the compiler, and it would be overwhelming to employ a full symbol table in this OS course. We temporarily imagine these regions as a limited set of regions, managed through the array symrgtbl[PAGING_MAX_SYMTBL_SZ]. The array size is fixed by the constant PAGING_MAX_SYMTBL_SZ, which denotes the number of variables allowed in each program. To wrap up, we use struct vm_rg_struct entries in symrgtbl to keep the start and end points of each region, and the pointer rg_next is reserved for future set tracking.
//From include/os-mm.h
/*
 * Memory mapping struct
 */
struct mm_struct {
    uint32_t *pgd;                /* page table directory */
    struct vm_area_struct *mmap;  /* list of memory areas */
    struct vm_rg_struct symrgtbl[PAGING_MAX_SYMTBL_SZ]; /* symbol table */
    /* ... other bookkeeping fields (caller, FIFO page list, ...) */
};
Memory mapping is represented by struct mm_struct, which keeps track of all the mentioned memory regions in a separate contiguous memory area. In each memory mapping struct, the memory areas are pointed to by the list struct vm_area_struct *mmap. The next important field is pgd, the page table directory, which contains all page table entries; each entry is a mapping between a page number and a frame number in the paging memory management system. We leave the detailed page-frame mapping to section 2.2.3. The symrgtbl is a simple implementation of the symbol table. The other fields are mainly used to keep track of specific operations (e.g. caller, the FIFO page list), so we leave them there; you can use them on your own or simply ignore them.
CPU address: the address generated by the CPU to access a specific memory location. In a paging-based system, it is divided into:
• Page number (p): used as an index into the page table, which holds the base address of each page in physical memory.
• Page offset (d): combined with the base address to define the physical memory address that is sent to the Memory Management Unit.
The physical address space of a process can be non-contiguous. We divide physical memory into fixed-size blocks (frames) with one of two sizes, 256B or 512B. We propose various setting combinations in Table 1 and end up with the highlighted configuration. This is a reference setting and can be modified or re-selected in other simulations. Based on the configuration of a 22-bit CPU bus and 256B page size, the CPU address is organized as in Figure 4.
In summary, all structures supporting the VM are placed in the module mm-vm.c.
2.2.2 The system physical memory
Question: In this simple OS, we implement a design with multiple memory segments (memory areas) declared in the source code. What is the advantage of the proposed multi-segment design?
Table 1: Candidate paging configurations (the highlighted setting is a 22-bit CPU bus with 256B pages).

CPU bus | PAGE size | PAGE bit | No. pg entry | PAGE entry sz | PAGE TBL | OFFSET bit | PGT mem | MEMPHY | frame bit
     20 |      256B |       12 |        ~4000 |        4 byte |     16KB |          8 |     2MB |    1MB |        12
     22 |      256B |       14 |       ~16000 |        4 byte |     64KB |          8 |     8MB |    1MB |        12
     22 |      512B |       13 |        ~8000 |        4 byte |     32KB |          9 |     4MB |    1MB |        11
     22 |      512B |       13 |        ~8000 |        4 byte |     32KB |          9 |     4MB |  128kB |         8
     16 |      512B |        8 |          256 |        4 byte |      1kB |          9 |    128K |  128kB |         4
struct memphy_struct {
    /* Basic fields: data and size */
    BYTE *storage;
    int maxsz;

    /* Management structure */
    struct framephy_struct *free_fp_list;
    struct framephy_struct *used_fp_list;
};
2.2.3 Paging-based address translation scheme
Question: What will happen if we divide the address into more than 2 levels in the paging memory management system?
Page table: This structure lets a userspace process find out which physical frame each virtual page is mapped to. It contains one 32-bit value for each virtual page, whose fields are described in Figure 5.
The virtual space is isolated per entity, so each struct pcb_t has its own table. To work in the paging-based memory system, we need to update this struct; a later section discusses the required modifications. In all cases, each process has a completely isolated and unique space: N processes in our setting result in N page tables, and in turn each table must have entries for the whole CPU address space. For each entry, the page number may have an associated frame in MEMRAM or MEMSWP, or may hold a null value; the functionality of each data bit of a page table entry is illustrated in Figure 5. In the highlighted setting chosen in Table 1, we have a 16,000-entry table, and each table costs 64KB of storage space.
As described in section 2.2.1, the process can access the virtual memory space in a contiguous manner through the vm_area structure. The remaining work deals with the mapping between pages and frames, to provide a contiguous memory space over the discrete frame-storage mechanism. It falls into two main parts, memory swapping and the basic memory operations, i.e. alloc/free/read/write, which mostly interact with the pgd page table structure.
Memory swapping: We have noted that a memory area (segment) may not be used up to its storage limit, which means there are storage spaces that are not mapped to MEMRAM. Swapping moves the contents of physical frames between MEMRAM and MEMSWAP. Swapping in copies a frame's content from outside into main memory RAM; swapping out, in reverse, moves the content of a frame in MEMRAM to MEMSWAP. In the typical context, swapping helps us gain free RAM frames, since the size of the SWAP device is usually large enough.
2.2.4 Wrapping-up all paging-oriented implementations

• ALLOC: in most cases, the request fits into an available region. If there is no suitable space, we need to lift up the sbrk barrier; since that area has never been touched, it may be necessary to provide some physical frames and then map them using page table entries.
• FREE: frees the storage space associated with the region id. Since we cannot collect back the taken physical frames, which might cause memory holes, we just keep the collected storage space in a free list for further alloc requests.
• READ/WRITE: requires the page to be present in main memory. The most resource-consuming step is page swapping. If the page is in the MEMSWAP device, it needs to be brought back to the MEMRAM device (swapping in), and if there is a lack of space, we need to give some pages back to the MEMSWAP device (swapping out) to make room.
To perform these operations, a collaboration among the mm modules is needed, as illustrated in Figure 6. We suggest keeping the default settings and avoiding touching these values too much.
2.3 Put It All Together
// From include/os-cfg.h
#define MLQ_SCHED 1
#define MAX_PRIO 140
#define MM_PAGING
#define MM_FIXED_MEMSZ
An example of the MM_PAGING setting: With the new memory paging modules, we get a variant of the PCB struct with some additional memory management fields, wrapped by a constant definition. If we want to use the MM_PAGING module, we enable the associated #define line in include/os-cfg.h:
// From include/common.h
struct pcb_t {
    ...
#ifdef MM_PAGING
    struct mm_struct *mm;
    struct memphy_struct *mram;
    struct memphy_struct **mswp;
    struct memphy_struct *active_mswp;
#endif
    ...
};
Another example, the MM_FIXED_MEMSZ setting: With the new version of the PCB struct, a description file in input can keep the old format when #define MM_FIXED_MEMSZ is enabled, while still working in the new paging memory management mode. This configuration provides backward compatibility with old-version input files.
New configuration with explicit declaration of memory size: (Be careful: this mode supports a custom memory size, which implies that we comment out, delete, or otherwise disable the constant #define MM_FIXED_MEMSZ.) In this mode, the simulation program takes one additional line from the input file. This line contains the system physical memory sizes: one MEMRAM and up to 4 MEMSWP devices. Each size must be a non-negative integer; a size of 0 means that swap device is deactivated. For the parameters to be valid, we must have one MEMRAM and at least one MEMSWAP with positive integer sizes; the remaining values can be set to 0.
[time slice] [N = Number of CPU] [M = Number of Processes to be run]
[MEM_RAM_SZ] [MEM_SWP_SZ_0] [MEM_SWP_SZ_1] [MEM_SWP_SZ_2] [MEM_SWP_SZ_3]
[time 0] [path 0] [priority 0]
[time 1] [path 1] [priority 1]
...
[time M-1] [path M-1] [priority M-1]
The highlighted input line is controlled by the constant definition. Double-checking the input file against the contents of include/os-cfg.h will help you understand how the simulation program behaves when something looks strange.
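For instance, a hypothetical configure file following the format above (all file names and values here are made up for illustration) could look like this, with MM_FIXED_MEMSZ disabled so the memory-size line is present:

```
2 1 2
1048576 16777216 0 0 0
0 input/proc/p0 15
4 input/proc/p1 0
```

Here the time slice is 2 and one CPU runs two processes; MEMRAM is 1MB and a single 16MB MEMSWP device is configured (the other three swap sizes are 0); process p0 arrives at time 0 with live priority 15 and p1 arrives at time 4 with live priority 0.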
Since the OS is designed to work on multiple processors, it is possible that shared resources could be concurrently accessed by more than one process at a time. Your job in this section is to find the shared resources and use a lock mechanism to protect them.
[Figure: on the left, the OS-side structures (timer, mram, mswp0, mswp1, ..., mswpx); on the right, a process's virtual memory address space, organized as a linked list of vm_area entries, each holding vm_id, vm_start, vm_end, its free regions, and a vm_next pointer.]
Question: What will happen if synchronization is not handled in your simple OS? Illustrate the problem with an example from your simple OS, if you have one.
3 Submission
3.1 Source code
Requirement: you have to implement the system while following the coding style. Reference:
https://www.gnu.org/prep/standards/html_node/Writing-C.html.
3.2 Report
Write a short report that answers the questions in the implementation section and interprets the results of running the tests in each section:
• Scheduling: draw a Gantt diagram describing how processes are executed by the CPUs.
• Memory: show the status of RAM after each memory allocation and deallocation function call.
• Overall: students find their own way to interpret the results of the simulation.
After you finish the assignment, move your report to the source code directory, compress the whole directory into a single file named assignment_MSSV.zip, and submit it to BKEL.
3.3 Grading
You must carry out this assignment in groups of 2 or 3 students. The overall grade of your group is a combination of two parts:
• Demonstration (6 points)
• Report (4 points)