HAL9000
Alexandru Gurzou
The 9000 series is the most reliable computer ever made. No 9000 computer has ever made a mistake or distorted
information. We are all, by any practical definition of the words, foolproof and incapable of error.
- HAL 9000 [1]
2016
Contents
1 Introduction 1
1.1 Versus Pintos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Project Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Root Directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 Source Tree Overview . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.3 Building HAL9000 . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.4 Running HAL9000 . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Grading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Useful documents and links . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Project 1: Threads 11
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Threading System Initialization . . . . . . . . . . . . . . . . . . . . 11
2.1.2 Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.3 Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.2 Priority Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.3 Priority Donation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.4 BONUS: Per-CPU ready lists . . . . . . . . . . . . . . . . . . . . . 17
2.3 Source Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 FAQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.1 Priority Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.2 Priority Donation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Project 2: Userprog 21
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1.1 Userprog Initialization . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1.2 Issuing system calls . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.3 Working with user applications . . . . . . . . . . . . . . . . . . . . 22
3.2 Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.1 Argument Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.2 User memory access . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.3 System calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.4 Handling user-mode exceptions . . . . . . . . . . . . . . . . . . . . 27
3.3 Source Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 FAQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
A Reference Guide 43
A.1 HAL9000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
A.1.1 Startup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
A.1.2 Multi-core vs Single-core . . . . . . . . . . . . . . . . . . . . . . . . 46
A.1.3 Why 64-bit? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
A.1.4 Interrupt Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
A.2 Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.2.1 Thread Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.2.2 Thread Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
A.2.3 Thread States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
A.2.4 Thread Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
A.2.5 Thread Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . 59
A.3 Interprocessor Communication . . . . . . . . . . . . . . . . . . . . . . . . . 60
A.3.1 Parameter overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
A.3.2 Usage Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
A.4 Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
A.4.1 Process Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
A.4.2 Process functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
A.4.3 Program Startup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
A.5 Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
A.5.1 Primitive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
A.5.2 Executive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
A.5.3 Interlocked Operations . . . . . . . . . . . . . . . . . . . . . . . . . 79
A.5.4 Disabling Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.6 Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.6.1 Physical Memory Management . . . . . . . . . . . . . . . . . . . . 81
A.6.2 Virtual Memory Management . . . . . . . . . . . . . . . . . . . . . 81
A.6.3 Heap Management . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
A.6.4 Memory Management Unit . . . . . . . . . . . . . . . . . . . . . . . 83
A.6.5 Page-fault handling . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
A.7 Virtual Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
A.8 Paging Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
A.8.1 Creation, Destruction and Activation . . . . . . . . . . . . . . . . . 87
A.8.2 Inspection and Updates . . . . . . . . . . . . . . . . . . . . . . . . 88
A.8.3 Accessed and Dirty Bits . . . . . . . . . . . . . . . . . . . . . . . . 88
A.9 List Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
A.10 Hash Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
A.11 Hardware Timers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
A.11.1 PIT Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
A.11.2 RTC Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
A.11.3 LAPIC Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
B Debugging 97
B.1 Signaling function failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
B.1.1 Interpreting STATUS values . . . . . . . . . . . . . . . . . . . . . . 98
B.2 Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
B.3 Asserts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
B.4 Disassembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
B.5 Halt debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
C.2.4 Install the desired platform toolset . . . . . . . . . . . . . . . . . . 125
C.3 Hg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
C.4 Git . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
C.4.1 Why do I need it? . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
C.4.2 Installing Git . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
C.4.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
C.4.4 Visual git clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
C.4.5 Diff tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Bibliography 143
Chapter 1
Introduction
HAL9000 is a 64-bit operating system (OS) written for x86 symmetric multipro-
cessor systems (SMP). It provides a simple round robin scheduler for scheduling threads
on the available CPUs, support for launching user-mode applications, a basic FAT32 file-
system driver and basic drivers for providing I/O operations (VGA, IDE disk, keyboard,
legacy COM and ethernet).
HAL9000 can theoretically run on any physical Intel x86 PC which supports 64-
bit operating mode. However, because it is cheaper, easier and safer to test in a virtual
environment we will use VMWare Workstation [2] for running our OS.
The project lasts a whole semester and consists of improving the threading and user-mode support currently available in HAL9000.
This chapter provides a basic introduction to the project and highlights the differences between it and Pintos [3], the very popular project which inspired HAL9000. Afterward, it provides basic pointers for navigating, building, running and testing the code.
Support for multi-processor systems. This provides true concurrency, makes synchronization a bit harder (you can't just disable interrupts) and offers an environment more authentic to the real world. For more information on these differences see A.1.2 [Multi-core vs Single-core].
are performance-wise: PCIDs and the syscall/sysret instruction pair, while some enhance security: XD, larger virtual address spaces and protection keys. For more details, see A.1.3 [Why 64-bit?].
The OS image is multiboot compliant [4], which means it can be loaded by any multiboot loader such as GRUB or PXE. This makes it easy to run the OS through both hard-disk boot and network boot.
If you are not interested strictly in OS design but want to write your own file system, network or disk driver, you can easily extend this OS to support any file system, network adapter or disk controller interface.
Pintos is tried and tested, having been part of the curriculum of many top-ranked universities for many years, while HAL9000 is at its first iteration.
Pintos doesn't rely on as many CPU features and doesn't require the host operating system to be a 64-bit version.
.
|-- [ dir] bin
|-- [ dir] acpi
|-- [ dir] docs
|-- [ dir] postbuild
|-- [ dir] PXE
|-- [ dir] src
| |-- [ dir] Ata
| |-- [ dir] CommonLib
| |-- [ dir] CommonLibUnitTests
| |-- [ dir] Disk
| |-- [ dir] Eth_82574L
| |-- [ dir] FAT32
| |-- [ dir] HAL
| |-- [ dir] HAL9000
| |-- [file] HAL9000.sln
| |-- [file] HAL9000_WithoutApplications.sln
| |-- [ dir] NetworkPort
| |-- [ dir] NetworkStack
| |-- [ dir] PE_Parser
| |-- [ dir] shared
| |-- [ dir] SwapFS
| |-- [ dir] Usermode
| |-- [ dir] Utils
| |-- [ dir] Volume
|-- [ dir] temp
|-- [ dir] tests
|-- [ dir] tools
|-- [ dir] VM
9 directories, 0 files
bin: this is the place where the binaries are placed after compilation.
acpi: contains the external acpica library. It is used for parsing the ACPI tables.
docs: contains this documentation and documentation related to the hardware com-
ponents or external libraries used by HAL9000 .
postbuild: contains the scripts which write the OS binary to the PXE folder, write the user applications to the VM hard disk and run the tests. You will not run these scripts directly; they are part of the VS projects.
src:
– HAL9000.sln: project solution - this is what you need to open with VS.
– HAL9000_WithoutApplications.sln: same as HAL9000.sln, except it does not contain the user-mode projects. If you feel that VS is working too slowly with the full solution, try using this one.
– Usermode: contains all the user-mode applications and the common user-mode
library.
– Other folders belong to their respective projects. You will only have to work with the HAL9000 project, and we suggest you navigate the project only in VS; see 1.2.2 [Source Tree Overview].
temp: required by VS for storing temporary files - ignore it but do not delete the
folder.
tests: contains the code for parsing the test results and for validating them.
tools: contains miscellaneous tools including the assembler and the tools required to
interact with the VM and its disk.
VM: contains the two VMs: the one which will boot HAL9000 and a PXE server.
PXE: contains the contents required by the PXE server - this folder is shared with
the PXE server VM. Upon successful compilation of HAL9000 its binary is placed
here for network boot. DO NOT TOUCH THIS FOLDER.
To open the project in VS you need to open the HAL9000.sln file from the src
directory. You will see several projects loaded in the VS solution:
FAT32
Provides a simple implementation of a FAT32 file system driver. You will not work
here.
SwapFS
Responsible for managing the swap partition: provides simple read/write functionality at a 4KB granularity. You will not work here.
PE_Parser
Parses portable executable (PE) files, i.e. Windows executables and synthesizes
information for use by other components. The module is required for re-mapping
the kernel and for loading user-mode applications into memory. You will not work
here.
Eth_82574L
Driver for the Intel 82574 GbE Controller Family. The network card emulated by
VMWare belongs to this family and the driver implements all the required function-
ality for network reception and transmission. You will not work here.
NetworkPort
Provides helper functions for ethernet drivers. This layer exists so that each ethernet driver (such as Eth_82574L) does not have to implement the same functionality over and over again; any ethernet driver can use these helpers. You will not work here.
NetworkStack
Provides an interface for the operating system to access the network devices. You
will not work here.
Ata
Provides the IDE disk controller driver; it is responsible for performing disk I/O by communicating with the hard disk controller. You will not work here.
Disk, Volume
These drivers provide abstraction for the operating system and the attached file
systems. A file system will be mounted on a volume, while a volume will belong to
a disk and the disk will be on top of a hard-disk controller. You will not work here.
NOTE: In real systems the hierarchy can be more complex (a file system may span multiple volumes and a volume may span multiple disks), but in HAL9000 these mappings are one-to-one.
RemoveAllTests
When this project is built, it causes HAL9000's next boot not to run any tests and to accept user-given commands through the keyboard.
RunTests
Will start a VM instance of the operating system and will wait until HAL9000 finishes
running all the tests or until a timeout occurs (default: 5 minutes). Once execution
of the VM finishes the results of the tests will be compared with the expected results
and a summary of the results will be displayed. More details in 1.3.2 [Testing].
CommonLib
Contains some basic data structures and functions which would normally be present
in the C standard library and several other generic constructs which are not coupled
with HAL9000 . Some of the features include: string manipulation functions, refer-
ence counters, bitmaps, spinlocks, hash tables, support for termination handlers and
so on.
CommonLibUnitTests
Not relevant to the project or any of the laboratories. If you want to write user-mode C++ code to test the CommonLib functionality, this is the place where you can do it.
HAL
Provides the layer which works directly with the x86 architecture hardware com-
ponents (CMOS, RTC, IOAPIC, LAPIC, PCI) and processor structures (CPUID,
MSR, GDT, IDT, MTRR, TSS).
HAL9000
The actual operating system code. This is the place where all your work
will take place. The project contains too many files and too much functionality
to describe it all here. In each project you will get a more detailed view of the components you will work on. You can also browse the code at any time to determine
what each component does.
For more information on how to navigate the solution you can read C.2 [Visual
Studio].
Once you have finished configuring the paths.cmd file, you can build the solution. If you want to simply build the OS without starting the VM and running all the tests, it is enough to build the HAL9000 project.
If you want to go through the whole test cycle, rebuild the RunTests project. See more details in 1.3.2 [Testing].
If you want to make sure no tests will run the next time you boot the OS, rebuild the RemoveAllTests project.
1.3 Grading
We will grade your assignments based on the design quality and test results, each
of which comprises 50% of your grade.
1.3.1 Design
We will judge your design based on the design document. We will read your entire
design document. Don’t forget that design quality, including the design document, is
50% of your project grade. It is better to spend one or two hours writing a good design
document than it is to spend that time getting the last 5% of the points for tests.
Design Document
We provide a design document template for each project. For each significant part
of a project, the template asks questions in four areas:
Data Structures
Copy here the declaration of each new or changed struct or struct member, global
or static variable, typedef, or enumeration. Identify the purpose of each in 25 words
or less.
The first part is mechanical. Just copy new or modified declarations into the design
document, to highlight for us the actual changes to data structures. Each declaration
should include the comment that should accompany it in the source code (see below).
We also ask for a very brief description of the purpose of each new or changed data
structure. The limit of 25 words or less is a guideline intended to save your time and
avoid duplication with later areas.
Algorithms
This is where you tell us how your code works, through questions that probe your
understanding of your code. We might not be able to easily figure it out from the
code, because many creative solutions exist for most OS problems. Help us out a
little.
Your answers should be at a level below the high level description of requirements
given in the assignment. We have read the assignment too, so it is unnecessary to
repeat or rephrase what is stated there. On the other hand, your answers should be
at a level above the low level of the code itself. Don’t give a line-by-line run-down
of what your code does. Instead, use your answers to explain how your code works
to implement the requirements.
Synchronization
An operating system kernel is a complex, multithreaded program, in which synchro-
nizing multiple threads can be difficult. This section asks about how you chose to
synchronize this particular type of activity.
Rationale
Whereas the other sections primarily ask “what” and “how,” the rationale section
concentrates on “why.” This is where we ask you to justify some design decisions, by
explaining why the choices you made are better than the alternatives. You may be able to state these in terms of time and space complexity, as rough or informal arguments (formal language or proofs are unnecessary).
An incomplete, evasive, or non-responsive design document or one that strays from
the template without good reason may be penalized. See ?? [??] for a sample design
document for a fictitious project.
1.3.2 Testing
Your test result grade will be based on our tests. Each project has several tests,
testing the implementation you provide. To completely test your submission build the
RunTests project and wait for the results to appear (may take up to 5 minutes). The
following are performed when RunTests is built:
1. Parses the contents of the threads or userprog directory from within the tests folder to determine which tests must be run; for every test file found in these directories a test will be run.
2. Generates the Tests.module file and copies it to the PXE location - this file contains
all the commands that must be executed by the OS to run all the tests previously
parsed and to shutdown.
3. Boots the HAL9000 VM.
4. Waits for the VM to terminate or, if it runs for more than 5 minutes, forcefully shuts it down.
5. The serial log is parsed and divided into .result files named after the tests executed. As an example: if the OS ran TestsThreadStart, a TestsThreadStart.result file will be placed beside the already existing TestsThreadStart.test file.
6. Each .result file is then parsed and a conclusion is drawn by following these steps:
7. An .outcome file is generated for each test containing the test conclusion (PASS or
FAIL) and the lines which caused the conclusion - the reason may be an error logged
or a PASS text or the lines differing between the .result and .test files.
8. The result of each test is shown, together with a summary of the results for each category and the total passed / total count.
For further details, you can check the execute_tests.pl file in the tests folder. If you want to run only a single test you can run the run_single_test.cmd script found in the src/postbuild folder. The parameters taken by the script are the project name and the
[7] - Contains a tutorial for developing an operating system from scratch. Some inter-
esting topics include memory management, programming interrupts, implementing
spinlocks and developing a GUI.
[8] - Intel System Manual: the definitive documentation for everything related to the
Intel CPU: topics which may help you for the projects are Chapters 4 (Paging), 6
(Interrupt and Exception Handling), 8 (Multiple-Processor Management).
[9] - Intel Instruction Set Reference: if you are interested in the exact effects of an
x86 assembly instruction this is the manual for you.
Chapter 2
Project 1: Threads
2.1 Overview
The first project strives to improve the current threading system. You are given a basic threading system implementation which doesn't account for thread priorities and contains an inefficient timer implementation. Your job is to improve this functionality by removing the busy waits from the timer, making the scheduler take thread priorities into account and solving the problems which arise from this new scheduler.
Before starting work on the first assignment you should first read and understand
Chapter 1 [Introduction]. You should be able to navigate through the project files in
Visual Studio, compile the code and be able to run the thread tests.
The VS configuration used for compiling the code must be Threads. The only difference between configurations is the set of tests which run when building the RunTests project.
by the idle thread can be seen in IdleThread() - it does nothing useful - the only reason
it exists is for a consistent view of the system, i.e. as there must always be a Stark in
Winterfell [10] there must always be a thread running on each CPU.
After the idle thread is initialized, interrupts are enabled and every SCHEDULER_TIMER_INTERRUPT_TIME_US µs (currently 40ms) a clock interrupt triggers on each processor; if the current thread has been running on the CPU for more than THREAD_TIME_SLICE clock ticks (currently 1), it will yield after it finishes handling the interrupt. This is done in ThreadTick().
Another reason why a thread may yield the CPU is when it is trying to acquire a
resource which is currently unavailable: this may be a mutex, an executive event or an
executive timer. See A.5.2 [Executive] for implementation details.
Also, executing threads may be nice and give up their CPU time willingly by calling
ThreadYield() anytime.
For creating a new thread you can use the ThreadCreate() function. You can give
the thread a name (useful only for debugging purposes), a priority (currently disregarded),
a function from which to start execution and a pointer to the parameters of the function.
The ThreadCreateEx() version is available for creating threads in the context of a usermode
process - you will not need this function until project 2 when you will work on user-mode
threads.
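As a quick illustration, a thread could be created and waited for roughly as in the sketch below. This is only a sketch: the exact ThreadCreate() and ThreadWaitForTermination() prototypes must be taken from the HAL9000 headers, and the parameter order, the ThreadPriorityDefault value and the PTHREAD output pointer used here are assumptions.

// Sketch only: verify the real ThreadCreate()/ThreadWaitForTermination()
// prototypes in the HAL9000 headers; the parameter order and the
// ThreadPriorityDefault value below are assumptions.
static STATUS _MyWorker(IN_OPT PVOID Context)
{
    // ... do some useful work with Context ...
    return STATUS_SUCCESS;
}

void SpawnAndJoinWorker(void)
{
    STATUS status;
    STATUS workerStatus;
    PTHREAD pWorker;

    status = ThreadCreate("worker",               // name, useful only for debugging
                          ThreadPriorityDefault,  // priority (currently disregarded)
                          _MyWorker,              // function where execution starts
                          NULL,                   // pointer to the function's parameters
                          &pWorker);              // receives the newly created thread
    if (!SUCCEEDED(status))
    {
        return;
    }

    // Block until the worker terminates; its exit status is returned here
    ThreadWaitForTermination(pWorker, &workerStatus);
}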
The function specified will start execution with interrupts enabled and you will
NOT have to disable or re-enable interrupts manually. The synchronization mecha-
nisms will do this for short (executive) or long (primitive) periods of time. See A.5
[Synchronization] for more details.
The exit point of any thread is ThreadExit(), regardless of whether the thread calls it directly, finishes the execution of its start function or is forcefully terminated by another thread.
We lied to you: the first function executed by newly created kernel threads is not the startup function you give to ThreadCreate(), it is actually ThreadKernelFunction(); see A.2.5 [Thread Initialization] for the detailed explanation.
If you want to wait for a thread to terminate its execution you can use ThreadWaitForTermination(); if you want to terminate it forcefully use ThreadTerminate(). Warning: forceful termination is not a good idea and should seldom be used, because the thread may still be holding resources when it is terminated and they will not be freed; as a result other threads may become deadlocked.
If you want more information read A.2 [Threads] and read the code in thread.c.
2.1.2 Synchronization
HAL9000 is a multi-threaded OS, i.e. multiple threads may access the same resources concurrently: this causes race conditions if the critical sections involved are not protected by proper synchronization mechanisms.
HAL9000 is also multi-processor aware and runs on all the CPUs detected in the
system. Due to this fact, you cannot synchronize the code by relying on the bad habit
of disabling interrupts on the current CPU. The thread from the current CPU (the one
disabling the interrupts) will not be interrupted while executing in a critical section, but
another thread (on another CPU) may also access the same data structures concurrently,
hence making disabling interrupts useless for synchronization.
While you hold a primitive lock, interrupts are disabled until you release it. The only
reason why you’ll want to use primitive synchronization mechanisms will be to synchronize
code between interrupt handlers and other functions. Because interrupt handlers can’t
sleep they cannot wait on executive resources (mutexes, events or timers). See A.5.1
[Primitive].
For all the code you'll write you should use executive synchronization mechanisms. They are called executive because the OS manages them. In contrast with the primitive synchronization mechanisms, they do not busy-wait for resources: they block the current thread (yielding the CPU), causing it to become un-schedulable until the resource becomes available and the thread is signaled.
These mechanisms use primitive locks internally holding them for the least amount
of time necessary. See MutexAcquire() for a good example. The MutexLock is taken strictly
for updating the mutex data structure. Before returning control to the caller the function
restores the original interruptibility state. See A.5.2 [Executive].
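To make the difference concrete, a typical executive-mutex pattern looks roughly like the sketch below; the MUTEX type name and the exact MutexInit()/MutexAcquire()/MutexRelease() prototypes should be checked in mutex.h, so the signatures used here are assumptions.

// Sketch of protecting shared data with an executive mutex; check mutex.h
// for the real API, the signatures below are assumptions.
static MUTEX m_counterLock;
static QWORD m_counter;

void CounterInit(void)
{
    MutexInit(&m_counterLock, FALSE);   // assumed signature: (mutex, recursive?)
    m_counter = 0;
}

void CounterIncrement(void)
{
    MutexAcquire(&m_counterLock);       // may block (yield the CPU) until available
    m_counter++;                        // critical section: kept as short as possible
    MutexRelease(&m_counterLock);
}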
There should be no busy waiting in your submission. A tight loop that calls
ThreadYield() is one form of busy waiting.
2.1.3 Development
Work together as a TEAM. Think about the design together and piece the code together as early as possible, not on the night before the submission.
Groups that integrate their changes at the last minute often find that two changes conflict with each other, requiring lots of last-minute debugging. Some groups who have done this have turned in code that did not even compile or boot, much less pass any tests.
You MUST use a source version control tool such as Git [11] or Hg [12] (recommended). This way you'll be 'forced' to collaborate and you will be able to easily merge all your changes in time and see all your teammates' code changes.
You will certainly run into many problems, when doing so it is recommended to read
the debugging section (Appendix B [Debugging]) for tips and tricks and to validate
each assumption you make (the recommended way is through ASSERTs). If you’re still
having problems do not hesitate to contact your TA and ask for help.
2.2 Assignment
In this project your goal is to improve the thread component.
2.2.1 Timer
The current implementation for timers uses busy waiting to determine when the
timer should trigger. This is a very wasteful use of system resources: there is NO reason
to keep the CPU busy when other productive threads may be scheduled or - if no thread
is ready - to save some power and conserve your laptop’s battery power.
The timer interface is available in A.5.2 [Executive Timer]. A usage example
is illustrated in Listing 2.1 [Timer Usage Example]. For brevity the example does
not check the result of ExTimerInit(), however, when you’re working on your project we
expect you to always check the result of all the functions you call.
// After the timer is initialized any number of threads may wait for it, however
// until the timer is started, even if the countdown expired, they must not be
// woken up

// Start the timer, if the countdown has already expired all the threads will
// be instantly woken up
ExTimerStart(&timer);

// Wait for the timer to be signaled, blocks the thread until the countdown
// triggers. In the case of one shot timers issuing a wait after the timer
// has expired will cause the calling thread to return immediately. However
// in the case of periodic timers the countdown will restart after all the
// waiting threads have been woken up and threads will be blocked again
// until the countdown triggers again.
ExTimerWait(&timer);

// After we have waited for the timer to trigger 3 times we will stop it.
// If there are any threads still waiting on the timer they will all be
// woken up when the timer is stopped.
ExTimerStop(&timer);
//******************************************************************************
// Function:     ThreadSetPriority
// Description:  Sets the thread's priority to NewPriority. If the
//               current thread no longer has the highest priority, yields.
// Returns:      void
// Parameter:    IN THREAD_PRIORITY NewPriority
//******************************************************************************
void
ThreadSetPriority(
    IN THREAD_PRIORITY NewPriority
    );
2.4 FAQ
Q: How much code will I need to write?
A: Here’s a summary of our reference solution, produced by the hg diff program.
The final row gives total lines inserted and deleted; a changed line counts as both an
insertion and a deletion.
The reference solution represents just one possible solution. Many other solutions
are also possible and many of those differ greatly from the reference solution. Some ex-
cellent solutions may not modify all the files modified by the reference solution, and some
may modify files not modified by the reference solution.
src/HAL9000/headers/ex_system.h | 20 +++++
src/HAL9000/headers/ex_timer.h | 17 +++-
src/HAL9000/headers/mutex.h | 2 +
src/HAL9000/headers/thread_internal.h | 17 +++++
src/HAL9000/src/ex_event.c | 4 +-
src/HAL9000/src/ex_system.c | 247 +++++++++++++++++++++++++++++++++++++++
src/HAL9000/src/ex_timer.c | 76 ++++++++++++++++-----
src/HAL9000/src/mutex.c | 7 +-
src/HAL9000/src/system.c | 10 ++
src/HAL9000/src/thread.c | 156 +++++++++++++++++++++++++++++++++++++-
src/shared/kernel/thread.h | 2 +
11 files changed, 525 insertions(+), 33 deletions(-)
Chapter 3
Project 2: Userprog
3.1 Overview
In the first project you worked with the threading system, interacted a little with
the timer interrupt code and did some inter-processor communication. For the second
project you will work on the user-mode kernel interface system and implement several
system calls (syscalls) which will allow user applications to do useful work.
Currently, HAL9000 supports loading user-applications without passing any argu-
ments. In this project you will add support for passing arguments to user applications.
While in the first project you worked only with kernel data and didn't have to validate all the input received in the functions you wrote or modified (because you trusted the kernel), in this project you will receive data from user-mode applications, which must NOT be trusted.
There MUST be a clear separation between trusted code (which runs in kernel
mode) and untrusted code (running in user-mode). It is not OK for the operating system
to crash if a badly written application wants the OS to read the contents of a file to an
invalid address (a NULL pointer for example).
Put shortly, these are your 3 main objectives for this project: implementing system calls, passing program arguments to loaded applications and validating all user input so that it cannot crash the OS.
The VS configuration used for compiling the code must be Userprog. The only difference between configurations is the set of tests which run when building the RunTests project.
ibility) to be used when a system call is invoked. For more information you can look up the SYSCALL and SYSRET instructions in [9] and the IA32_LSTAR, IA32_FMASK and IA32_STAR MSRs in [8] Chapter 35 - Model-Specific Registers (MSRs).
The kernel entry point is SyscallEntry(), defined in syscall.yasm. This function is responsible for switching to the kernel stack, saving the user register state and calling the C SyscallHandler() function. In its current implementation the handler only retrieves the syscall number, logs it and sets a STATUS_NOT_IMPLEMENTED error code in the user-mode RAX register, which holds the status of the system call and will be the result seen by the user code. This is the function where most of your work will be done.
Once control is returned to SyscallEntry(), it will restore the initial register state,
restore the user-mode stack and place the calling instruction pointer in the RCX register,
and return to user land through the SYSRET instruction.
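Most solutions end up turning SyscallHandler() into a dispatcher over the syscall number, along the lines of the sketch below. The SYSCALL_ID/SyscallIdFileWrite names, the way the user parameters are received and the _SyscallFileWrite() helper are all assumptions made for illustration; the real definitions live in syscall_no.h and syscall.c.

// Conceptual dispatch sketch only; parameter retrieval and the id names
// from syscall_no.h will differ in the real code.
static STATUS _SyscallDispatch(IN SYSCALL_ID SyscallId, IN PQWORD UserArguments)
{
    STATUS status;

    switch (SyscallId)
    {
    case SyscallIdFileWrite:                        // assumed id name
        status = _SyscallFileWrite(UserArguments);  // hypothetical helper: validates the
                                                    // user buffer, then calls IoWriteFile()
        break;
    default:
        status = STATUS_NOT_IMPLEMENTED;            // unimplemented syscalls keep failing
        break;
    }

    return status;                                  // ends up in the user-mode RAX register
}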
All the user-mode applications can be found in the User-mode -> Applications VS
filter. However, it is not enough to simply compile an application for it to be seen by
HAL9000: it must be copied to its file system. This can be done automatically by building the CopyUmAppsToVm VS project (User-mode -> Utils VS filter).
This project must be built while the HAL9000 VM is powered off, otherwise it won't be able to mount the file system and copy the application files.
Loading an application
HAL9000 has a very basic portable executable (PE - [13]) parser and loader. Each
application written for HAL9000 must statically link the UsermodeLibrary library file,
which will provide the application’s entry point (both for the main and for secondary
threads).
You can see the entry point function for the main thread, __start(), in um_lib.c - this is the place where the common library and the system call interface are initialized. Once this is done the actual entry point of the application is called through the __main() function.
Besides the two extra underscores, this is similar to a classic C main function, where the parameters received are the program arguments (argc and argv).
The entry point for secondary threads is __start_thread() - this function is implemented in um_lib_helper.c.
These user-mode application projects must be configured in a special way so HAL9000
will be able to load them. We have provided two VS templates so you can easily add new user applications to the solution: HAL9000_UserApplication.zip and HAL9000_UserMain.zip. The first one is a project template, while the second is an item (file) template (you should use it when adding the main.c file).
For VS to recognize these templates, you need to place the project template in My
Documents/Visual Studio 2015/Templates/ProjectTemplates and the item template in My
Documents/Visual Studio 2015/Templates/ItemTemplates.
As part of a system call, the kernel must often access memory through pointers
provided by a user program. The kernel must be very careful about doing so, because the
user can pass a null pointer, a pointer to unmapped virtual memory, a pointer to a user
address to which it does not have the proper access rights (e.g: the pointer may be to
a read-only area and the system call would write to that region), or a pointer to kernel
virtual address space. All of these types of invalid pointers must be rejected, without harm
to the kernel or other running processes, by failing the system call.
NOTE: For this project, we only consider issues which arise from single-
thread user applications. In case of multi-threaded applications, things get
more complicated: a valid user-mode address may become invalid, because it
is freed by a different thread than the one which issues the system call.
There are at least two reasonable ways to access user memory correctly. The first
method is to verify the validity of a user-provided pointer, then dereference it. If you
choose this route, you’ll want to look at MmuIsBufferValid(). This is the simplest way to
handle user memory access.
The second method is to check only that a user-specified pointer doesn't have the VA_HIGHEST_VALID_BIT bit set (see VmIsKernelAddress() and VmIsKernelRange()), then dereference it. An invalid user pointer will cause a "page fault" that you
can handle by modifying the code in VmmSolvePageFault(). This technique is normally
faster, because it takes advantage of the processor’s MMU, so it tends to be used in real
kernels.
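A sketch of the first (validate-then-dereference) approach is shown below. The MmuIsBufferValid() parameter list used here (buffer, size, required rights, process), as well as GetCurrentProcess(), PAGE_RIGHTS_READ and the STATUS values, are assumptions to be checked against the actual headers.

// Validate-then-copy sketch for a user-supplied input buffer (method 1).
// The MmuIsBufferValid() prototype assumed here must be checked in the sources.
static STATUS _CopyFromUser(OUT PVOID KernelBuffer,
                            IN  PVOID UserBuffer,
                            IN  QWORD Size)
{
    if (!MmuIsBufferValid(UserBuffer, Size, PAGE_RIGHTS_READ, GetCurrentProcess()))
    {
        // Reject NULL, unmapped, kernel-space or insufficiently privileged addresses
        return STATUS_INVALID_PARAMETER1;
    }

    memcpy(KernelBuffer, UserBuffer, Size);   // safe to dereference after validation
    return STATUS_SUCCESS;
}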
In either case, you need to make sure not to ”leak” resources. For example, suppose
that your system call has acquired a lock, or allocated memory. If you encounter an
invalid user pointer afterward, you must still be sure to release the lock or free the page
of memory. If you choose to verify user pointers before dereferencing them, this should
be straightforward. It's more difficult to handle if an invalid pointer causes a page fault, because there's no way to return an error code from a memory access.
NOTE: A safer, but slower solution would be to map the physical pages
described by the user addresses into kernel space. This way, you would not
be bothered if the user application un-maps its VA to PA translations, as long
as it is not able to free the physical pages. Also, this would allow the SMAP
(Supervisor Mode Access Prevention) CPU feature to be activated, causing
page faults on kernel accesses to user memory, thus ensuring the OS doesn’t
access user-memory by mistake.
3.2 Assignment
3.2.1 Argument Passing
If you haven't done so yet, you should read 3.1.3 [Loading an application] now. You will need to work in ThreadSetupMainThreadUserStack() to set up the user-mode stack of the application's main thread of execution.
You already have the command line and the number of arguments available in
the FullCommandLine and NumberOfArguments fields of the PROCESS structure (see
process_internal.h and A.4 [Processes] for more details).
All you need to do is to place the arguments on the user stack and return the
resulting stack pointer in the ResultingStack output parameter. For more information
about program loading, read A.4.3 [Program Startup]. We recommend using strchr(), available in string.h, for parsing the command arguments.
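For instance, the space-separated arguments of FullCommandLine can be located with a strchr() loop similar to the (simplified, plain C) sketch below before being copied onto the user stack; it assumes single spaces between arguments and performs no error handling.

#include <string.h>

// Count the space-separated arguments of a command line such as "app.exe a b".
// Simplified sketch: assumes single spaces and no leading/trailing spaces.
static unsigned int CountArguments(const char* FullCommandLine)
{
    unsigned int count = 0;
    const char* cursor = FullCommandLine;

    while (cursor != NULL && *cursor != '\0')
    {
        count++;                        // cursor points at the start of an argument
        cursor = strchr(cursor, ' ');   // find the separator that ends it
        if (cursor != NULL)
        {
            cursor++;                   // skip the space; the next argument starts here
        }
    }

    return count;
}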
3. Accessing a user address which does not have enough rights to perform the action
required. Example: if an address is read-only and the system call would put data in
the buffer, it should not do so.
NOTE: All these conditions must be validated for the whole length of
the buffer, and not only for its start or end position. For example, addresses 0x4000 and 0x6000 may be valid, but 0x5000 may not be.
All of these conditions are checked in the test suite.
To implement syscalls, you need to provide ways to read and write data in user
virtual address space.
Also, you may be wondering what's up with the UM_HANDLE data type. The idea is that when an application opens a file, or creates a new thread or process, it needs a way of later referring to that created object, i.e. it's useless to open a file if it cannot later be used for reading or writing data. That's where handles come in: each time a user executes a system call which has the effect of creating or opening an object (file/thread/process), a UM_HANDLE will be returned to it.
This handle will later be used by the system calls, which manipulate that class of
objects. For files it would be SyscallFileRead(), SyscallFileWrite() and SyscallFileClose().
NOTE: While the easy solution might seem to simply return kernel pointers as
handles, this is not a good design choice, and the implementation isn’t so straightforward
either.
The reason it is not so straightforward is that in your system call implementation,
we require you to validate all handles. This means, besides validating that the handle is valid and was created for this process, you should also validate that the handle is of the
appropriate type, i.e. you wouldn’t want to be working with a thread handle when reading
the contents of a file, or you wouldn’t want process B to be able to access the files opened
by process A.
Another reason it is a bad design decision is that this effectively introduces an information disclosure vulnerability into the kernel - CWE-200 Information Exposure, i.e. any user application can start mapping the kernel environment by having access to information it should not know.
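One common design (illustrative only; every name below is invented and not part of HAL9000) is a small per-process table that maps each UM_HANDLE to a tagged kernel object, so that both ownership and type can be checked before the object is ever used:

// Hypothetical per-process handle table; all names here are invented for illustration.
typedef enum _HANDLE_TYPE
{
    HandleTypeFile,
    HandleTypeThread,
    HandleTypeProcess
} HANDLE_TYPE;

typedef struct _HANDLE_ENTRY
{
    HANDLE_TYPE     Type;       // rejects e.g. a thread handle passed to SyscallFileRead()
    PVOID           Object;     // the kernel object (file, thread or process structure)
    BOOLEAN         InUse;
} HANDLE_ENTRY;

// Stored inside the PROCESS structure, so handles of process A mean nothing to process B.
typedef struct _HANDLE_TABLE
{
    HANDLE_ENTRY    Entries[64];    // arbitrary illustrative limit
} HANDLE_TABLE;

// A UM_HANDLE can then simply be an index into Entries (offset by 1 so that 0 stays
// invalid); validation checks the range, the InUse flag and the expected Type before
// ever touching the Object pointer.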
When you’re done with this part and with 3.2.4 [Handling user-mode exceptions],
HAL9000 should be bulletproof. Nothing that a user program can do should ever cause
the OS to crash, panic, fail an assertion, or otherwise malfunction. It is important to
emphasize this point: our tests will try to break your system calls in many, many ways.
You need to think of all the corner cases and handle them.
If a system call is passed an invalid argument, the only acceptable option is to return
an error value.
File Paths
SyscallFileCreate() and SyscallProcessCreate() both receive a file path as one of
their parameters. This may be an absolute path, starting from drive letter and specifying
the full path to the file (an example would be ”D:\Folder\File.txt”), or it may be a relative
path, in which case the handling differs for these system calls:
In the case of SyscallProcessCreate(), the final path is considered relative to the applications folder found on the system drive, i.e. if "Apps.exe" is the path given to the syscall, internally it will try to open "%SYSTEMDRIVE%\Applications\Apps.exe". If, for example, the system drive is "F:\", then the final path will be "F:\Applications\Apps.exe".
um_application.c
Contains the implementation for reading the user application from disk and loading
it into memory. You will NOT work in these files.
syscall.h
syscall.c
Provides the function to initialize the system call dispatching system, Syscall-
CpuInit(), and the system call handler: SyscallHandler(). Most of your work will be
done in this file.
syscall_defs.h
syscall_func.h
syscall_no.h
These files are shared between the kernel and the user-mode components. They define the system call numbers, the system call interface functions and basic types used by system calls such as handles, paging rights and so on.
io.h
Provides functions to work with devices. When implementing the file system system
calls, you will be interested in IoCreateFile(), IoCloseFile(), IoReadFile() and IoWrite-
File().
3.4 FAQ
Q: How much code will I need to write?
A: Here’s a summary of our reference solution, produced by the hg diff program.
The final row gives total lines inserted and deleted; a changed line counts as both an
insertion and a deletion.
The reference solution represents just one possible solution. Many other solutions
are also possible and many of those differ greatly from the reference solution. Some ex-
cellent solutions may not modify all the files modified by the reference solution, and some
may modify files not modified by the reference solution.
src/HAL9000/HAL9000.vcxproj | 5 +
src/HAL9000/HAL9000.vcxproj.filters | 15 +
src/HAL9000/headers/process_internal.h | 3 +
src/HAL9000/headers/syscall_struct.h | 67 +++++++
src/HAL9000/headers/um_handle_manager.h | 46 +++++
src/HAL9000/src/isr.c | 9 +-
src/HAL9000/src/process.c | 8 +
src/HAL9000/src/syscall.c | 636 ++++++++++++++++++++++++++++++++++++-
src/HAL9000/src/syscall_func.c | 541 ++++++++++++++++++++++++++++++++++++
src/HAL9000/src/thread.c | 81 ++++++++-
src/HAL9000/src/um_handle_manager.c | 248 +++++++++++++++++++++++++++
src/shared/common/syscall_defs.h | 2 +
src/shared/kernel/heap_tags.h | 3 +-
13 files changed, 1658 insertions(+), 6 deletions(-)
Q: Any user application I run crashes the system with a #PF exception.
A: The first thing you have to do is make sure the stack is properly set up before any user application is run. You don't have to implement argument passing from the beginning; however, you should at least reserve space on the stack for the shadow space and the return address.
Chapter 4
Project 3: Virtual Memory
4.1 Overview
By now your OS can load multiple user applications at once, it can service their
requests for accessing system resources and for managing processes and threads. However,
the number and size of programs that can run is limited by the machine’s main memory
size. In this assignment, you will remove that limitation.
You will build this assignment on top of the last one. Test programs from project
2 should also work with project 3. You should take care to fix any bugs in your project 2
submission before you start work on project 3, because those bugs will most likely cause
the same problems in project 3.
HAL9000 already supports some virtual memory features such as lazy mapping and
memory mapped files and makes extensive use of them already, so you won’t have to
intervene in those areas. However, you will have to implement the system calls to allow a
user-application to dynamically allocate and free virtual memory - this could allow a user
application to implement a heap allocator. Moreover, this memory must be shareable by
any number of processes.
You will also have to implement per process quotas: you should limit the number
of physical frames a process uses and the number of open files a process can have at once.
And finally you will implement a swapping mechanism which will cause frame evic-
tion to occur either when there are no more free frames in physical memory or when a
process reaches its frame quota.
Pages
A page, sometimes called a virtual page, is a contiguous region of virtual memory 4,096 bytes (the page size) in length. A page must be page-aligned, that is, start on a virtual address evenly divisible by the page size. A 64-bit virtual address can be divided into 6 sections as illustrated below:
The most significant 16 bits are unused because they must replicate the value of bit 47 (canonical addressing).
The next 4 sections of 9 bits each provide an index into the corresponding paging table structure.
The final 12 bits provide the offset within the page (and within the corresponding physical frame).
 63           48 47     39 38     30 29     21 20     12 11         0
+----------------+---------+---------+---------+---------+------------+
|     Unused     |  PML4   | Dir Ptr |   Dir   |  Table  |   Offset   |
+----------------+---------+---------+---------+---------+------------+
                            Virtual Address
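The decomposition above translates directly into shifts and masks; the small standalone helper below (names chosen here, not taken from the HAL9000 sources) shows the arithmetic.

#include <stdint.h>

// Illustrative decomposition of a 64-bit virtual address into its paging indexes;
// the shift amounts and masks mirror the layout in the diagram above.
typedef struct _VA_INDEXES
{
    uint16_t Pml4Index;     // bits 47-39
    uint16_t PdptIndex;     // bits 38-30
    uint16_t PdIndex;       // bits 29-21
    uint16_t PtIndex;       // bits 20-12
    uint16_t PageOffset;    // bits 11-0
} VA_INDEXES;

static VA_INDEXES VaDecompose(uint64_t VirtualAddress)
{
    VA_INDEXES idx;

    idx.Pml4Index  = (uint16_t)((VirtualAddress >> 39) & 0x1FF);  // 9-bit index
    idx.PdptIndex  = (uint16_t)((VirtualAddress >> 30) & 0x1FF);
    idx.PdIndex    = (uint16_t)((VirtualAddress >> 21) & 0x1FF);
    idx.PtIndex    = (uint16_t)((VirtualAddress >> 12) & 0x1FF);
    idx.PageOffset = (uint16_t)(VirtualAddress & 0xFFF);          // offset within the page

    return idx;
}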
Each process has an independent set of user (virtual) pages, which are those pages
below virtual address 0x8’0000’0000’0000 (128 TiB), while the kernel virtual space begins
at gVirtualToPhysicalOffset which is typically 0xFFFF’8000’0000’0000 (almost 16 EiB).
The set of kernel (virtual) pages, on the other hand, is global, remaining the same regardless
of what thread or process is active. The kernel may access both user and kernel pages,
but a user process may access only its own user pages. See 3.1.3 [Virtual Memory
Layout] for more information.
HAL9000 provides several useful functions for working with virtual addresses. See
A.7 [Virtual Addresses] for details.
Frames
A frame, sometimes called a physical frame or a page frame, is a contiguous region of physical memory. Like pages, frames must be page-sized and page-aligned. Even if the processor runs in 64-bit mode, the maximum physical address is not 2^64 (16 EiB); the maximum addressable physical address differs from CPU to CPU (it can be found out by querying a CPUID leaf). However, according to the Intel manual, the maximum physical address width is limited to 52 bits (4 PiB).
Thus, a 52-bit physical address can be divided into a 40-bit frame number and a 12-bit frame offset (or just offset), like this:
 63    52 51               12 11        0
+-------+-------------------+-----------+
| 00000 |    Frame Number   |   Offset  |
+-------+-------------------+-----------+
             Physical Address
When paging is enabled the x86 architecture works with virtual addresses, transparently accessing the physical memory mapped by each address. Thus, the executing software does not need to know the actual whereabouts of the memory or the memory topology found in the system.
HAL9000 provides functions for translating between physical addresses and kernel
virtual addresses. See A.7 [Virtual Addresses] for details.
Page Tables
The x86 processors translate virtual addresses to physical addresses through the use of hardware-defined structures called paging tables. These are hierarchical structures which describe the virtual address space and provide the final physical address; they also specify the access rights (read/write/execute) and the privilege level required (kernel or user-mode access). HAL9000 provides page table management code in pte.h. See A.8 [Paging Tables] for more information.
The diagram below illustrates the relationship between pages and frames. The
virtual address, on the left, consists of 4 page indexes (one for each paging level) and
an offset. The paging tables translate the page indexes into a frame number, which is
combined with the unmodified offset to obtain the physical address, on the right.
                       +-------------+
      .--------------->|Paging Tables|--------.
      |                +-------------+        |
  47  |     12 11    0               52       V 12 11     0
  +----------+--------+              +------------+--------+
  | Page Idx | Offset |              |  Frame No  | Offset |
  +----------+--------+              +------------+--------+
   Virt Addr     |                     Phys Addr      ^
                 \____________________________________/
A more detailed illustration is given in Figure 4.1 [Detailed paging], here, what
was previously called ”Page Idx” is now properly separated into its 4 parts: the index in
the PML4 table, the index in the PDPT table, the index in the PD table and the index in
the PT table.
Swap Slots
A swap slot is a contiguous, page-sized region of disk space in the swap partition. Although the hardware limitations dictating the placement of slots are looser than for pages and frames, swap slots should be page-aligned because there is no downside in doing so.
47 39 38 30 29 21 20 12 11 0
+---------+---------+---------+---------+------------+
| PML4 Idx| PDPT Idx| PD Idx | PT Idx | Page Offset|
+---------+---------+---------+---------+------------+
| | | | |___________
____/ | \_____ \__________ \
/ | \ \ \
/ PML4 | PDPT | PD | PT | Data Page
/ ._______. | ._______. | ._______. | ._______. | .____________.
| 511|_______| | 511|_______| | 511|_______| | 511|_______| | |____________|
| 510|_______| | 510|_______| | 510|_______| | 510|_______| | |____________|
| 509|_______| | 509|_______| | 509|_______| | 509|_______| | |____________|
| 508|_______| | 508|_______| | 508|_______| | 508|_______| | |____________|
| | | | | | | | | | | | | | |
| | | \___\| . | \___\| . | \___\| . | \___\| . |
| | . | /| . | /| . | /| . | /| . |
\___\| . |_ | . |_ | . |_ | . |_ | . |
/| . | \ | . | \ | . | \ | . | \ | . |
| . | | | . | | | . | | | . | | | . |
| | | | | | | | | | | | | |
|_______| | |_______| | |_______| | |_______| | |____________|
4|_______| | 4|_______| | 4|_______| | 4|_______| | |____________|
3|_______| | 3|_______| | 3|_______| | 3|_______| | |____________|
2|_______| | 2|_______| | 2|_______| | 2|_______| | |____________|
1|_______| | 1|_______| | 1|_______| | 1|_______| | |____________|
0|_______| \__\0|_______| \__\0|_______| \__\0|_______| \__\ |____________|
/ / / /
Supplemental page table
Enables page fault handling by supplementing the hardware page table, see 4.1.3 [Managing the Supplemental Page Table].
Frame table
Allows efficient implementation of eviction policy, see 4.1.3 [Managing the Frame
Table].
Swap table
Tracks usage of swap slots, see 4.1.3 [Managing the Swap Table].
You do not necessarily need to implement three completely distinct data structures:
it may be convenient to wholly or partially merge related resources into a unified data
structure.
For each data structure, you need to determine what information each element
should contain. You also need to decide on the data structure’s scope, either local (per-
process) or global (applying to the whole system), and how many instances are required
within its scope.
Possible choices of data structures include arrays, lists, bitmaps, and hash tables. An
array is often the simplest approach, but a sparsely populated array wastes memory. Lists
are also simple, but traversing a long list to find a particular position wastes time. Both
arrays and lists can be re-sized, but lists more efficiently support insertion and deletion in
the middle.
Although more complex data structures may yield performance or other benefits,
they may also needlessly complicate your implementation. Thus, we do not recommend
implementing any advanced data structure (e.g. a balanced binary tree) as part of your
design.
The frame table contains one entry for each frame that contains a user page. Each
entry in the frame table contains a pointer to the page, if any, that currently occupies it,
and other data of your choice. The frame table allows HAL9000 to efficiently implement
an eviction policy, by choosing a page to evict when no frames are free.
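A possible shape for a frame-table entry is sketched below; the field set and every name are design suggestions invented here, not existing HAL9000 structures, and your design may keep more or less information.

// Hypothetical frame-table entry; all names are invented for illustration.
typedef struct _FRAME_ENTRY
{
    PHYSICAL_ADDRESS    Frame;          // the physical frame this entry describes
    struct _PROCESS*    OwningProcess;  // process charged against its frame quota
    PVOID               UserPage;       // virtual page currently backed by the frame, if any
    LIST_ENTRY          EvictionList;   // position in the page-replacement (e.g. clock) list
    BOOLEAN             Pinned;         // TRUE while the kernel must not evict this frame
} FRAME_ENTRY, *PFRAME_ENTRY;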
The frames are obtained by calling PmmReserveMemory() or PmmReserveMemo-
ryEx() and are freed using PmmReleaseMemory().
The most important operation on the frame table is obtaining an unused frame.
This is easy when a frame is free. When none is free, or when the process has reached its physical frame quota, a frame must be made free by evicting some page from its frame.
If no frame can be evicted without allocating a swap slot, but the swap is full,
panic the kernel. Real OSes apply a wide range of policies to recover from or prevent such
situations, but these policies are beyond the scope of this project.
The process of eviction comprises roughly the following steps:
1. Choose a frame to evict, using your page replacement algorithm. The ’accessed’ and
’dirty’ bits in the page table, described in A.8.3 [Accessed and Dirty Bits] will
come in handy.
2. Remove references to the frame from any page table that refers to it. Be careful: once you have implemented shared memory, multiple pages can refer to the same frame at a given time.
The swap table tracks in-use and free swap slots. It should allow picking an unused
swap slot for evicting a page from its frame to the swap partition. It should allow freeing a
swap slot when its page is read back or the process whose page was swapped is terminated.
You may obtain the swap file using IomuGetSwapFile(); once you have the FILE_OBJECT structure you can simply use the IoReadFile() and IoWriteFile() functions to read from and write to the swap FS. The only restriction is that the number of bytes transferred must
be exactly the size of a page and the offset must also be page aligned.
The size of the swap file is 128 MiB, which should be sufficient for all the current tests (as long as cleanup properly occurs after process termination): at most 80 MiB will be used at any single time.
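Because CommonLib already offers bitmaps, the swap table can be as simple as one bit per page-sized slot. The sketch below uses placeholder names (BITMAP, BitmapFindFirstClear(), BitmapSetBit(), BitmapClearBit(), MAX_DWORD, MAX_QWORD); the real bitmap API must be looked up in CommonLib.

// Swap-table sketch: one bit per page-sized slot of the 128 MiB swap file.
// BITMAP and the BitmapXxx()/MAX_* names are placeholders for the CommonLib API.
#define SWAP_FILE_SIZE      (128 * 1024 * 1024ULL)
#define SWAP_SLOT_COUNT     (SWAP_FILE_SIZE / PAGE_SIZE)

static BITMAP m_swapSlots;              // bit set => slot in use

// Reserves a free slot and returns its byte offset in the swap file, or MAX_QWORD if full.
static QWORD SwapSlotAllocate(void)
{
    DWORD slot = BitmapFindFirstClear(&m_swapSlots);     // placeholder helper
    if (slot == MAX_DWORD)
    {
        return MAX_QWORD;               // swap full: the caller decides what to do (panic)
    }

    BitmapSetBit(&m_swapSlots, slot);                    // placeholder helper
    return (QWORD)slot * PAGE_SIZE;     // page-aligned offset passed to IoWriteFile()
}

static void SwapSlotFree(QWORD SwapOffset)
{
    BitmapClearBit(&m_swapSlots, (DWORD)(SwapOffset / PAGE_SIZE));
}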
4.2 Assignment
4.2.1 Per Process Quotas
You must implement a mechanism to keep track of the number of files currently held open by a process and the number of physical frames it currently uses. The relevant definitions are found in process_internal.h and are PROCESS_MAX_PHYSICAL_FRAMES and PROCESS_MAX_OPEN_FILES, both currently defined as 16.
In the case of a real OS the frames occupied by the binary would also count towards the physical frame quota; however, due to the way in which HAL9000 loads applications into memory it would be very hard to implement this. As a result you will only have to count the frames allocated as a result of calls to SyscallVirtualAlloc() and the frames occupied by the user stack.
When the quota for open files is reached, the process should not be able to
open any additional files until it closes one of them.
When the quota for physical frames is reached, the eviction mechanism must be
invoked: it will pick one of the process's frames, swap it to disk and use it for another
virtual memory allocation.
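As a rough sketch of where the frame quota check could sit (only PROCESS_MAX_PHYSICAL_FRAMES comes from process_internal.h; the field and helper names below are hypothetical):

// Illustrative sketch only - the NumberOfFramesUsed field and the helper names
// are hypothetical; only PROCESS_MAX_PHYSICAL_FRAMES comes from process_internal.h.
static BOOLEAN
_ProcessFrameQuotaReached(
    IN PPROCESS Process
    )
{
    return Process->NumberOfFramesUsed >= PROCESS_MAX_PHYSICAL_FRAMES;
}

// In the frame allocation path:
//     if (_ProcessFrameQuotaReached(pProcess))
//     {
//         // quota reached: pick a victim frame owned by pProcess, swap its
//         // contents out and reuse that frame for the new mapping
//     }
//     else
//     {
//         // below quota: reserve a brand new frame (PmmReserveMemory/
//         // PmmReserveMemoryEx) and increment the per-process counter
//     }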
If the Key value is 0, the virtual allocation created is private to the current process.
If the Key value is non-zero, the memory backed by the virtual allocation received
by the creator process can be accessed from any other process in the system. This
is done by the second process calling SyscallVirtualAlloc() specifying the same Key
value as the creator. This means that Key acts as a global identifier which can be
used by any other process in the system. An illustration is shown in Listing 4.1
[Shared Memory].
Listing 4.1: Shared Memory
// Code executed by Process 0
void Process0Code(void)
{
    STATUS status;
    char* pData;

    status = SyscallVirtualAlloc(NULL,
                                 PAGE_SIZE,
                                 VMM_ALLOC_TYPE_RESERVE | VMM_ALLOC_TYPE_COMMIT,
                                 PAGE_RIGHTS_READWRITE,
                                 NULL,
                                 SHARED_KEY_VALUE,
                                 &pData);
    ASSERT(SUCCEEDED(status));

    strcpy(pData, "Shared memory is the coolest thing ever!");
}

// Code executed by Process 1
void Process1Code(void)
{
    STATUS status;
    char* pData;

    status = SyscallVirtualAlloc(NULL,
                                 PAGE_SIZE,
                                 VMM_ALLOC_TYPE_RESERVE | VMM_ALLOC_TYPE_COMMIT,
                                 PAGE_RIGHTS_READWRITE,
                                 NULL,
                                 SHARED_KEY_VALUE,
                                 &pData);
    ASSERT(SUCCEEDED(status));
}
4.2.3 Swapping
As previously mentioned, once a process reaches its quota of allocated physical
frames, the contents of one of its frames must be swapped out to disk.
After the contents have been swapped out, another virtual address may be mapped
into this physical frame. From the point of view of a user application this is transparent:
the application only works with virtual addresses and has no idea where the physical
memory which actually holds the data resides.
After you have implemented the swap-out operation, be sure to also add support for
swap-in; otherwise, when the application accesses a virtual address corresponding to a frame
previously swapped out, the kernel won't be able to solve the #PF exception. For this you
will probably need to make changes in VmmSolvePageFault() and use the supplemental
page table to determine the location of the data in the swap file.
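One possible shape for the supplemental page table entries consulted during swap-in is sketched below; all names are illustrative assumptions, only VmmSolvePageFault() and the swap file restrictions come from the text above:

// Illustrative sketch only - all names are hypothetical.
typedef enum _SUPPL_PAGE_LOCATION
{
    SupplPageInFrame,       // currently mapped in a physical frame
    SupplPageInSwap,        // contents live in a swap slot
    SupplPageZeroFill       // never written; can be satisfied with a zeroed frame
} SUPPL_PAGE_LOCATION;

typedef struct _SUPPL_PAGE_ENTRY
{
    PVOID                   VirtualAddress;     // page-aligned user VA
    SUPPL_PAGE_LOCATION     Location;
    QWORD                   SwapSlotIndex;      // valid only when Location == SupplPageInSwap
} SUPPL_PAGE_ENTRY, *PSUPPL_PAGE_ENTRY;

// In VmmSolvePageFault(): if the faulting VA has Location == SupplPageInSwap,
// obtain a free frame (possibly by evicting another page), read exactly
// PAGE_SIZE bytes from offset SwapSlotIndex * PAGE_SIZE of the swap file,
// map the frame and mark the swap slot free again.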
When you add support for zero-filled allocations you will run into the following assert,
which currently marks the feature as unimplemented:
ASSERT(!IsBooleanFlagOn(AllocType, VMM_ALLOC_TYPE_ZERO));
The trivial solution would be to simply mark that the page contains zeroes and,
when a page fault occurs, to memzero the memory; however, there are more elegant and
efficient ways of implementing zero pages.
One of the tests actually checks the following scenario: allocate 2 GiB of virtual
memory and read the data from each page. Remember that the swap file is only
128 MiB and a process is restricted to 16 frames of physical memory at a time.
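A hedged sketch of the lazy zero-page idea, reusing the hypothetical supplemental page entry sketched above:

// Illustrative sketch only, reusing the hypothetical SUPPL_PAGE_ENTRY above.
//
// At allocation time, when a zero-filled region is requested, reserve no frame
// and only describe the page:
//
//     pEntry->Location = SupplPageZeroFill;
//
// Later, in the #PF handler, a zero-filled page never touches the swap file:
//
//     if (pEntry->Location == SupplPageZeroFill)
//     {
//         // reserve (or evict) a frame, map it, then zero it
//         memzero(pMappedVa, PAGE_SIZE);
//         pEntry->Location = SupplPageInFrame;
//     }
//
// This is how the 2 GiB scenario can fit: pages that were never touched consume
// nothing, and pages that were read but never dirtied can simply be dropped on
// eviction and re-zeroed on the next fault instead of being written to swap.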
the limit user-adjustable, e.g. with the ulimit command on many Unix systems. On many
GNU/Linux systems, the default limit is 8 MB. The first stack page need not be allocated
lazily. You can allocate and initialize it with the command line arguments at load time,
with no need to wait for it to be faulted in.
Stack pages do not count towards the process's frame quota.
4.4 FAQ
Q: Do we need a working Project 2 to implement Project 3?
A: Yes.
Appendix A
Reference Guide
A.1 HAL9000
A.1.1 Startup
HAL9000 is a multiboot-compliant OS; as a result, it expects to be loaded at the
physical address specified in the multiboot header, in 32-bit operating mode with paging
disabled.
This section details the execution path starting from the assembly code which
runs after the multiboot loader hands over control and ending with the C code which
initializes each component.
Assembly code
Code execution begins in the EntryMultiboot() function defined in mboot32.yasm.
As stated earlier, we will not continue execution if we were not loaded by a multiboot
loader.
The physical memory map describing the physical memory available in the system
is retrieved using the INT 15H E820H BIOS interrupt function. The list of potential COM
ports is retrieved from the BIOS data area.
After this information is acquired the OS transitions to 64-bit mode, where it will
remain from then on. After this transition occurs in PM32_to_PM64() in transition.yasm,
the assembly code gives control to the first C function: Entry64().
C code
Entry64() starts by initializing the commonlib library (providing primitive synchro-
nization mechanisms, asserts and other utility functions).
Control then reaches SystemInit(), which validates that the CPU we are currently
running on is capable of executing HAL9000 (CpuMuValidateConfiguration()). Serial
communication is initialized if a valid COM port was found in the BIOS data area.
The IDT is set up (InitIdtHandlers()) and exceptions will no longer cause triple
faults. Most exceptions will not be handled, but log messages which help debug the issue
will be generated. In contrast, the page fault (#PF) exception occurs very often and will
mostly be handled successfully - see [8] Chapter 6 - Interrupt and Exception
Handling and A.1.4 [Interrupt Handling] for details.
The next things to initialize are the memory managers (MmuInitSystem()):
The virtual memory manager - VmmInit() - responsible for allocating virtual ad-
dresses and for mapping virtual addresses (VAs) to physical addresses (PAs). It is
also responsible for handling #PFs.
The heap - MmuInitializeHeap() - responsible for managing large contiguous areas
of virtual address space and for offering other components the possibility to allocate
memory at byte-level granularity from these managed regions.
MmuInitSystem() will also create a new paging structure hierarchy and will cause
a CR3 switch to these new structures.
Next, if the multiboot loader passed any boot modules to us they will be loaded in
memory by BootModulesInit(). The Tests.module file describing the tests to run is such a
module.
The ACPI tables will be parsed by AcpiInterfaceInit() to determine the processors
present on the system and to determine if PCI express support is available.
CpuMuAllocAndInitCpu() will then be called to allocate the CPU structure for the
bootstrap processor (BSP - the processor on which the system execution starts) and to
validate that the CPU is compatible with HAL9000.
This function will activate the CPU features required for operation and those for
enhancing the operating system’s capabilities. Also, the main thread is created here.
IomuInitSystem() is then called to initialize the I/O capabilities of the system by
initializing the following:
the IOAPIC: this is the system interrupt controller which superseded the PIC and
is responsible for interrupt delivery for legacy devices and for PCI devices which do
not have support for directly delivering interrupts through MSI or MSI-X.
the IDT handlers, for the second time - the first time there were no TSS stacks allocated
for the current CPU, now there are. The reason for using TSS stacks is to prevent
interrupt handlers from using an invalid or corrupt stack. More information can be
found in [8] Chapter 6 - Interrupt and Exception Handling.
the PCI hierarchy: all PCI devices must be retrieved and placed in a device hierarchy
so proper interrupt routing can be done.
the clocks: RTC - used to update the clock found in the top right-hand corner of the
display and the PIT - programmed to deliver the scheduler clock tick.
the keyboard: used for receiving commands from the user operator.
SmpSetupLowerMemory() will then setup the required memory structures for all
the other CPUs to start up - these are called Application Processors (APs).
ProcessSystemInitSystemProcess() will then create the "System" process and assign
the only running thread to it.
ThreadSystemInitIdleForCurrentCPU() will spawn the idle thread and will enable
interrupt delivery.
The ACPI tables are parsed again through the AcpiInterfaceLateInit() function for
additional information: the PCI routing tables which describe which entries of the IOAPIC
are used by which devices. This step is required for proper interrupt setup for PCI devices
without MSI/MSI-X capabilities.
All the APs will now be woken up by the SmpWakeupAps() function. This function
returns only after all the processors have woken up - for more details see A.1.1 [AP
initialization]. Once this function returns the main thread of execution - which initially
ran on the BSP - may be moved to any of the other CPUs.
After the APs have all woken up, all interrupts registered for devices are enabled:
keyboard and clocks.
The IomuInitDrivers() function is then called to initialize each driver found in the
DRIVER_NAMES list. These are the drivers responsible for managing the disk controller,
the abstract disk and volume concepts, the FAT32 file system and the Ethernet network
card.
Afterward, the CmdRun() function executes all the commands received in the boot
modules (currently only the "Tests" module) and then allows the user to issue commands
by hand.
If the /shutdown command is given the system shuts down. To reboot the system
you can use the /reset command.
AP initialization
The APs start execution in 16 bits in the TrampolineStart() function found in the
trampoline.yasm assembly file. The assembly code is responsible for transitioning from
real-mode in 16 bits to protected mode in 32 bits and then to long mode with paging
enabled. Once these transitions have been made the ApInitCpu() C function is called.
Within this function the GDT and IDT are reloaded with their high-memory counterparts
and the current CPU structure is initialized, thus starting the main thread of execution
on the AP and signaling the BSP that it has woken up. The idle thread is then initialized
for the AP and the main thread exits successfully by calling the ThreadExit() function.
When PCIDs are used the CPU will no longer flush any mappings on CR3 switches.
This is because each cached mapping is also indexed by the process’s PCID which
will ensure the CPU will not use the wrong translation after a CR3 switch.
3. Protection Keys
This feature offers an additional mechanism for controlling accesses to user-mode
addresses. If enabled, it can be used to disable read/write access at page-level
granularity for each user-mode address. These restrictions also apply to supervisor-
mode accesses.
Unfortunately, HAL9000 does not use this feature.
NOTE: The XD benefit is also available when executing in 32-bit mode with PAE
enabled, but the other benefits apply exclusively to execution in long mode.
Internal interrupts - these are synchronous interrupts caused directly by CPU in-
structions. Attempts at invalid memory accesses (page faults), division by 0, software
interrupts and some other activities cause internal interrupts.
Because they are caused by CPU instructions, internal interrupts are synchronous
or synchronized with CPU instructions and cannot be disabled.
External interrupts - these are asynchronous events generated outside the current
CPU, i.e. they may be generated by other CPUs, other hardware devices such as
the system timer, keyboard, disk, network controller and so on. External interrupts
are asynchronous, meaning that their delivery is not synchronized with instruction
execution. Handling of external interrupts can be postponed by disabling interrupts
with CpuIntrDisable() and related functions, see A.5.4 [Disabling Interrupts]
for details.
The CPU treats both classes of interrupts largely the same way, so HAL9000 has
common infrastructure to handle both classes. The following section describes this common
infrastructure. The sections after that give the specifics of external and internal interrupts.
Interrupt Infrastructure
When an interrupt occurs, the CPU saves its most essential state on the stack and
jumps to an interrupt handler routine. The 80x86 architecture supports 256 interrupts,
numbered 0 through 255, each with an independent handler defined in an array called the
interrupt descriptor table or IDT.
In our project, InitIdtHandlers() is responsible for setting up the IDT so that each
entry corresponds to a unique entry point in isr.yasm. The exception handlers (vectors
between 0 and 31) have proper names, while the interrupt handlers are generated using a
macro and have the GenericIsrN() name - where N is between 32 and 255. Because the
CPU doesn’t give us any other way to find out the interrupt number, each entry point
pushes the interrupt number on the stack. For consistent interrupt handling, a dummy
error code is pushed on the stack for interrupts which do not generate such error codes.
After this information is saved the PreIsrHandler() is called - this function saves all the
general purpose registers on the stack and calls the IsrCommonHandler() C function.
IsrCommonHandler() branches to IsrExceptionHandler() for exceptions and to
IsrInterruptHandler() for any other interrupt. These are described in A.1.4 [Internal
Interrupt Handling] and A.1.4 [External Interrupt Handling].
If IsrCommonHandler() handles the interrupt successfully, i.e. either the excep-
tion was benign or the interrupt was acknowledged by a device driver, control returns to
PreIsrHandler(). The registers and the stack are restored and the CPU returns from the
interrupt through the IRETQ instruction.
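The dispatch described above boils down to something like the following simplified sketch (this is not the actual IsrCommonHandler() code and the parameter shapes are assumptions):

// Simplified sketch of the dispatch logic, not the real IsrCommonHandler() code.
#define EXCEPTION_VECTOR_LIMIT      32      // vectors 0-31 are CPU exceptions

void
IsrCommonHandlerSketch(
    IN BYTE VectorIndex,            // pushed by the entry point in isr.yasm
    IN PVOID InterruptFrame         // registers/error code saved by PreIsrHandler()
    )
{
    UNREFERENCED_PARAMETER(InterruptFrame);

    if (VectorIndex < EXCEPTION_VECTOR_LIMIT)
    {
        // CPU exceptions, including #PF
        // IsrExceptionHandler(...);
    }
    else
    {
        // device interrupts delivered through the IOAPIC/LAPIC
        // IsrInterruptHandler(...);
    }
}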
NOTE: Execution of both classes of interrupt handlers currently happens
with interrupts disabled.
As a result, an interrupt handler effectively monopolizes the CPU and delays all
other activities on that CPU. Therefore, external interrupt handlers should complete as
quickly as they can. Anything that requires much CPU time should instead run in a kernel
thread, possibly one that the interrupt unblocks using a synchronization primitive.
Internal interrupts are caused directly by CPU instructions executed by the running
kernel thread or user process (from project 2 onward). An internal interrupt is therefore
said to arise in a process context.
In the current project implementation, the only type of exception IsrException-
Handler() is able to solve is the page fault exception (see A.6.5 [Page-fault handling] for
more details). For any other exception, or when a #PF cannot be satisfied, the interrupt
frame and some of the stack area are logged to help debug the problem and the system
asserts.
External interrupts are caused by events outside the CPU. They are asynchronous,
so they can be invoked at any time that interrupts have not been disabled. We say that
an external interrupt runs in an interrupt context.
In an external interrupt, the interrupt frame or the processor state is not passed
to the handler because it is not very meaningful. It describes the state of the thread
or process that was interrupted, but there is no way to predict which one that is. It is
possible, although rarely useful, to examine it, but modifying it is a recipe for disaster.
An external interrupt handler must not sleep, yield or block, which rules out
using executive synchronization mechanisms - primitive synchronization or interlocked op-
erations can still be used, see A.5 [Synchronization]. Sleeping in interrupt context
would effectively put the interrupted thread to sleep and block any interrupts of lower or
equal priority than the one currently serviced. This would be disastrous and may cause
the next scheduled thread to run indefinitely because the scheduler interrupt may never
occur again.
Interrupt delivery to a CPU is controlled by the IOAPIC (system-wide) and the
LAPIC (per CPU). If the external interrupt was acknowledged by a device driver it is
considered handled and it is acknowledged (see IomuAckInterrupt()).
After the interrupt was acknowledged IsrInterruptHandler() checks if it should
preempt the current thread or not and takes the appropriate action.
To register an interrupt handler for a device, IoRegisterInterrupt() and IoRegisterInterruptEx() can be used.
A.2 Threads
A.2.1 Thread Structure
The structure defining a thread is found in thread_internal.h.
TID Id;

char* Name;

// Counts the number of ticks the thread has currently run without being
// de-scheduled, i.e. if the thread yields the CPU to another thread the
// count will be reset to 0, else if the thread yields, but it will be
// scheduled again, the value will be incremented.
QWORD UninterruptedTicks;

// The highest valid address for the kernel stack (its initial value)
PVOID InitialStackBase;

// The current kernel stack pointer (it gets updated on each thread switch,
// it is used when resuming thread execution)
PVOID Stack;

// MUST be non-NULL for all threads which belong to user-mode processes
PVOID UserStack;
TID Id
Unique identifier, begins at 0 and is incremented by TID_INCREMENT (currently
4) for each thread created. There is no need for recycling the IDs because a TID
is defined as a QWORD and even if the increment is 4, we'll run out of IDs (i.e.
wrap around to 0) only after 2^62 (4,611,686,018,427,387,904) threads are created. We
*probably* won't be running for that long.
PVOID InitialStackBase
Useful only for debugging purposes in the first project. Will be used in the sec-
ond project to determine the user threads' kernel stacks. More on this in A.1.4
[Interrupt Handling].
PVOID Stack
Every thread has its own stack to keep track of its state. When the thread is running,
the CPU’s stack pointer register tracks the top of the stack and this member is
unused. But when the CPU switches to another thread, this member saves the
thread’s stack pointer. No other members are needed to save the thread’s registers,
because the other registers that must be saved are saved on the stack.
When an interrupt occurs, whether in the kernel or a user program, an INTERRUPT_STACK_COMPLETE
structure is pushed onto the stack. When the interrupt occurs in a user program,
this structure is always at the address pointed to by InitialStackBase.
PVOID UserStack
Will be NULL for all the threads created in the first project. This field is valid only
for threads belonging to user-mode processes and points to the stack which is used
by the thread when executing user-mode code.
//******************************************************************************
// Function:     ThreadYield
// Description:  Yields the CPU to the scheduler, which picks a new thread to
//               run. The new thread might be the current thread, so you can't
//               depend on this function to keep this thread from running for
//               any particular length of time.
// Returns:      void
// Parameter:    void
//******************************************************************************
void
ThreadYield(
    void
    );

//******************************************************************************
// Function:     ThreadExit
// Description:  Causes the current thread to exit. Never returns.
// Returns:      void
// Parameter:    IN STATUS ExitStatus
//******************************************************************************
void
ThreadExit(
    IN STATUS ExitStatus
    );

//******************************************************************************
// Function:     ThreadWaitForTermination
// Description:  Waits for a thread to terminate. The exit status of the thread
//               will be placed in ExitStatus.
// Returns:      void
// Parameter:    IN PTHREAD Thread
// Parameter:    OUT STATUS* ExitStatus
//******************************************************************************
void
ThreadWaitForTermination(
    IN PTHREAD Thread,
    OUT STATUS* ExitStatus
    );

//******************************************************************************
// Function:     ThreadCloseHandle
// Description:  Closes a thread handle received from ThreadCreate. This is
//               necessary for the structure to be destroyed when it is no
//               longer needed.
// Returns:      void
// Parameter:    INOUT PTHREAD Thread
// NOTE:         If you need to wait for a thread to terminate or find out its
//               termination status call this function only after you called
//               ThreadWaitForTermination.
//******************************************************************************
void
ThreadCloseHandle(
    INOUT PTHREAD Thread
    );

//******************************************************************************
// Function:     ThreadGetName
//******************************************************************************
// Function:     ThreadGetId
// Description:  Returns the thread's ID.
// Returns:      TID
// Parameter:    IN_OPT PTHREAD Thread - If NULL returns the ID of the
//               current thread.
//******************************************************************************
TID
ThreadGetId(
    IN_OPT PTHREAD Thread
    );

//******************************************************************************
// Function:     ThreadGetPriority
// Description:  Returns the thread's priority. In the presence of
//               priority donation, returns the higher (donated) priority.
// Returns:      THREAD_PRIORITY
// Parameter:    IN_OPT PTHREAD Thread - If NULL returns the priority of the
//               current thread.
//******************************************************************************
THREAD_PRIORITY
ThreadGetPriority(
    IN_OPT PTHREAD Thread
    );

//******************************************************************************
// Function:     ThreadSystemInitMainForCurrentCPU
// Description:  Called by each CPU to initialize the main execution thread. Has
//               a different flow than any other thread creation because some of
//               the thread information already exists and it is currently
//               running.
// Returns:      STATUS
// Parameter:    void
//******************************************************************************
STATUS
ThreadSystemInitMainForCurrentCPU(
    void
    );
//******************************************************************************
// Function:     ThreadSystemInitIdleForCurrentCPU
// Description:  Called by each CPU to spawn the idle thread. Execution will not
//               continue until after the idle thread is first scheduled on the
//               CPU. This function is also responsible for enabling interrupts
//               on the processor.
// Returns:      STATUS
// Parameter:    void
//******************************************************************************
STATUS
ThreadSystemInitIdleForCurrentCPU(
    void
    );

//******************************************************************************
// Function:     ThreadCreateEx
// Description:  Same as ThreadCreate except it also takes an additional
//               parameter, the process to which the thread should belong. This
//               function must be called for creating user-mode threads.
// Returns:      STATUS
// Parameter:    IN_Z char* Name
// Parameter:    IN THREAD_PRIORITY Priority
// Parameter:    IN PFUNC_ThreadStart Function
// Parameter:    IN_OPT PVOID Context
// Parameter:    OUT_PTR PTHREAD* Thread
// Parameter:    INOUT struct _PROCESS* Process
//******************************************************************************
STATUS
ThreadCreateEx(
    IN_Z char* Name,
    IN THREAD_PRIORITY Priority,
    IN PFUNC_ThreadStart Function,
    IN_OPT PVOID Context,
    OUT_PTR PTHREAD* Thread,
    INOUT struct _PROCESS* Process
    );
//******************************************************************************
// Function:     ThreadTick
// Description:  Called by the timer interrupt at each timer tick. It keeps
//               track of thread statistics and triggers the scheduler when a
//               time slice expires.
// Returns:      void
// Parameter:    void
//******************************************************************************
void
ThreadTick(
    void
    );

//******************************************************************************
// Function:     ThreadBlock
// Description:  Transitions the running thread into the blocked state. The
//               thread will not run again until it is unblocked (ThreadUnblock)
// Returns:      void
// Parameter:    void
//******************************************************************************
void
ThreadBlock(
    void
    );
//******************************************************************************
// Function:     ThreadUnblock
// Description:  Transitions thread, which must be in the blocked state, to the
//               ready state, allowing it to resume running. This is called when
//               the resource the thread is waiting for becomes available.
// Returns:      void
// Parameter:    IN PTHREAD Thread
//******************************************************************************
void
ThreadUnblock(
    IN PTHREAD Thread
    );

//******************************************************************************
// Function:     ThreadYieldOnInterrupt
// Description:  Returns TRUE if the thread must yield the CPU at the end of
//               this interrupt. FALSE otherwise.
// Returns:      BOOLEAN
// Parameter:    void
//******************************************************************************
BOOLEAN
ThreadYieldOnInterrupt(
    void
    );

//******************************************************************************
// Function:     ThreadTerminate
// Description:  Signals a thread to terminate.
// Returns:      void
// Parameter:    INOUT PTHREAD Thread
// NOTE:         This function does not cause the thread to instantly terminate,
//               if you want to wait for the thread to terminate use
//               ThreadWaitForTermination.
// NOTE:         This function should be used only in EXTREME cases because it
//               will not free the resources acquired by the thread.
//******************************************************************************
void
ThreadTerminate(
    INOUT PTHREAD Thread
    );
//******************************************************************************
// Function:     ThreadTakeBlockLock
// Description:  Takes the block lock for the executing thread. This is required
//               to avoid a race condition which would happen if a thread is
//               unblocked while in the process of being blocked (thus still
//               running on the CPU).
// Returns:      void
// Parameter:    void
//******************************************************************************
void
ThreadTakeBlockLock(
    void
    );

//******************************************************************************
// Function:     ThreadExecuteForEachThreadEntry
// Description:  Iterates over the all threads list and invokes Function on each
//               entry passing an additional optional Context parameter.
// Returns:      STATUS
// Parameter:    IN PFUNC_ListFunction Function
// Parameter:    IN_OPT PVOID Context
//******************************************************************************
STATUS
ThreadExecuteForEachThreadEntry(
    IN PFUNC_ListFunction Function,
    IN_OPT PVOID Context
    );
//******************************************************************************
// Function:     GetCurrentThread
// Description:  Returns the running thread.
// Returns:      void
//******************************************************************************
#define GetCurrentThread()      ((THREAD*)__readmsr(IA32_FS_BASE_MSR))

//******************************************************************************
// Function:     SetCurrentThread
// Description:  Sets the current running thread.
// Returns:      void
// Parameter:    IN PTHREAD Thread
//******************************************************************************
void
SetCurrentThread(
    IN PTHREAD Thread
    );

//******************************************************************************
// Function:     ThreadSetPriority
// Description:  Sets the thread's priority to new priority. If the
//               current thread no longer has the highest priority, yields.
// Returns:      void
// Parameter:    IN THREAD_PRIORITY NewPriority
//******************************************************************************
void
ThreadSetPriority(
    IN THREAD_PRIORITY NewPriority
    );
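As a quick, hedged illustration of how the functions above fit together, the sketch below creates a kernel thread and waits for it. ThreadCreate() itself is not listed above, so its parameter list is assumed here to mirror ThreadCreateEx() without the Process parameter, and ThreadPriorityDefault is an assumed priority value:

// Illustrative sketch; ThreadCreate()'s signature and ThreadPriorityDefault are assumptions.
static
STATUS
(__cdecl _MyWorker)(
    IN_OPT PVOID Context
    )
{
    UNREFERENCED_PARAMETER(Context);

    LOG("Hello from a worker thread\n");

    return STATUS_SUCCESS;
}

void
SpawnAndWait(
    void
    )
{
    STATUS status;
    STATUS exitStatus;
    PTHREAD pThread;

    status = ThreadCreate("worker", ThreadPriorityDefault, _MyWorker, NULL, &pThread);
    ASSERT(SUCCEEDED(status));

    // per the ThreadCloseHandle() note: wait for termination first, close the handle after
    ThreadWaitForTermination(pThread, &exitStatus);
    ThreadCloseHandle(pThread);
}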
Ready: The thread is in the global thread ready list waiting to receive CPU time. It
is ready because it is not waiting for any resource.
Running: The thread is currently executing on a CPU; it will run until one of the
following happens:
1. The thread's time quantum expires: a clock interrupt occurs, the scheduler
chooses a different thread to run and places the current one in the ready list.
2. The thread requires a resource to continue execution which is not currently
available, so it will be blocked.
3. The thread willingly yields the CPU to another thread and gets placed in the
ready list.
Blocked: The thread is waiting for a currently unavailable resource. Once the re-
source becomes available the thread will be moved to the ready list.
Dying: The thread has finished its execution or another thread forcefully terminated
it.
The Initializing and Destroyed states are pseudo-states and have no relevance in
the code. They are illustrated only to make it easier to understand where a thread starts
execution and where it ends.
If the new process differs from the old one a CR3 switch will occur and the paging
tables of the scheduled process will be activated.
The ThreadSwitch() assembly function is called - this function stores a pointer to the
current stack of the old thread in the Stack member and restores the stack pointer
of the new thread.
the address of the ThreadStart() function as the first return address and an INTERRUPT_STACK
structure for simulating a return from an interrupt (this is the only way
to start a user-mode thread and there was no point in having a different start-up method
for kernel threads).
For an illustration of the initial stack frame see the ASCII figure above ThreadSetupInitialState()
and check out the implementation for further details.
When the newly created thread starts its first execution in ThreadSwitch() it will
call the RestoreRegisters() function to set up its initial register values; it will then return
at the start of the ThreadStart() function, call ThreadCleanupPostSchedule() and finally
perform an IRETQ instruction, which will cause the thread to continue execution in
ThreadKernelFunction() for kernel threads. For user-mode threads (project 2) see A.2.5
[User-mode Threads].
ThreadKernelFunction() calls the PFUNC_ThreadStart function received as a parameter
by ThreadCreate(). Its only responsibility is to make sure ThreadExit() is called by each
thread even if it is not explicitly coded.
User-mode Threads
The IRETQ instruction will cause the kernel mode thread to switch its privilege
level from ring 0 to ring 3 and to start executing user-mode code. The user-mode function
executed depends on which user thread is started.
If the main thread is started (i.e. the first thread in the user-mode process) the
function executed will be determined by the AddressOfEntryPoint field defined in the
Portable Executable (PE) header. For our project, this will correspond to the start()
function implemented in the UsermodeLibrary project. This function is responsible for
setting up the environment for a user-mode application before actually giving it control.
Similar to ThreadKernelFunction() this function also makes sure that the thread exits
properly through the SyscallThreadExit() call.
If the main thread wishes to create additional threads it should call UmCreateThread().
The UsermodeLibrary will internally call SyscallThreadCreate() with start_thread() as
the function. This is to ensure that the newly spawned thread properly exits through the
SyscallThreadExit() system call.
A.3 Interprocessor Communication
STATUS
SmpSendGenericIpiEx(
    IN          PFUNC_IpcProcessEvent   BroadcastFunction,
    IN_OPT      PVOID                   Context,
    IN_OPT      PFUNC_FreeFunction      FreeFunction,
    IN_OPT      PVOID                   FreeContext,
    IN          BOOLEAN                 WaitForHandling,
    IN          _Strict_type_match_
                SMP_IPI_SEND_MODE       SendMode,
    _When_(SendMode == SmpIpiSendToCpu || SendMode == SmpIpiSendToGroup, IN)
                SMP_DESTINATION         Destination
    );
2. Context - If you want to pass some information to the targeted CPUs you can pass
a pointer to this field.
3. FreeFunction - If you have allocated your Context dynamically you can specify a
function to be executed after the targets complete the BroadcastFunction - useful
for cleanup.
4. WaitForHandling
When TRUE this function blocks until all the targeted CPUs execute the BroadcastFunction
When FALSE the function returns after sending the IPI
5. SendMode - Determines the way in which the final parameter is interpreted:
Listing A.5 [All the CPUs except the current one - no information passing]
- execute a function on all the CPUs except the current one. The issuing CPU does
not wait for the targeted CPUs to complete execution before continuing.
A possible outcome of the execution on a 4 core system could be:
Listing A.5: All the CPUs except the current one - no information passing
static FUNC_IpcProcessEvent _CmdIpiCmd;

static
STATUS
(__cdecl _CmdIpiCmd)(
    IN_OPT PVOID Context
    )
{
    PCPU* pCpu;

    UNREFERENCED_PARAMETER(Context);

    pCpu = GetCurrentPcpu();

    LOG("Hello from CPU 0x%02x [0x%02x]\n", pCpu->ApicId, pCpu->LogicalApicId);

    return STATUS_SUCCESS;
}
Listing A.6 [All the CPUs except the current one - passing information]
is similar to the previous example in the sense that the function will be executed by
all the CPUs except the issuing one; however, this time the CPUs will fill in an
array the timestamps at which they executed the function, and the issuing CPU will
wait for their execution to complete.
A possible outcome of the execution on a 4 core system could be:
Listing A.6: All the CPUs except the current one - passing information
static FUNC_IpcProcessEvent _CmdGetTimestamps;

// when we get here all the other CPUs will have already completed the timeStamps array
timeStamps[GetCurrentPcpu()->ApicId] = IomuGetSystemTimeUs();

static
STATUS
(__cdecl _CmdGetTimestamps)(
    IN_OPT PVOID Context
    )
{
    PCPU* pCpu;
    QWORD* timeStamps;

    pCpu = GetCurrentPcpu();
    timeStamps = Context;

    timeStamps[pCpu->ApicId] = IomuGetSystemTimeUs();

    return STATUS_SUCCESS;
}
Listing A.7 [Specific target CPU] offers an example of using the extended func-
tion (SmpSendGenericIpiEx) by sending a function to be executed on a specific target
CPU. Because the WaitForHandling parameter is TRUE, we have the guarantee that
by the time we return from SmpSendGenericIpiEx the BroadcastFunction will have
already been executed on the target.
The only possible outcome of the execution on a 4 core system would be:
LOG ( " Hello from CPU 0 x %02 x [0 x %02 x ]\ n " , pCpu - > ApicId , pCpu - > LogicalApicId ) ;
static
STATUS
( __cdecl _CmdIpiCmd ) (
IN_OPT PVOID Context
)
{
PCPU * pCpu ;
U N R E F E R E N C E D _ P A R A M E T E R ( Context ) ;
pCpu = GetCurrentPcpu () ;
LOG ( " Hello from CPU 0 x %02 x [0 x %02 x ]\ n " , pCpu - > ApicId , pCpu - > LogicalApicId ) ;
return STATUS_SUCCESS ;
}
A.4 Processes
A.4.1 Process Structure
The structure defining a process is found in process_internal.h.
Listing A.8: Process Structure
typedef struct _PROCESS
{
    REF_COUNT            RefCnt;

    char*                ProcessName;

    MUTEX                ThreadListLock;

    _Guarded_by_(ThreadListLock)
    LIST_ENTRY           ThreadList;

    _Guarded_by_(ThreadListLock)
    volatile DWORD       NumberOfThreads;
PID Id
Unique identifier, valid values range between 1 and 4095. These can be recycled,
so after process A with ID 7 dies, another process may take its former ID
value. The reason why PIDs are implemented this way is because of the limitations
of PCIDs.
char* ProcessName
The name of the process running - this is the name of the executable.
char* FullCommandLine
The whole process command line - including the application’s name.
DWORD NumberOfArguments
The number of arguments - including the process name, or put differently: the
number of space separated strings in FullCommandLine.
//******************************************************************************
// Function:     ProcessWaitForTermination
// Description:  Blocks until the process received as a parameter terminates
//               execution.
// Returns:      void
// Parameter:    IN PPROCESS Process
// Parameter:    OUT STATUS* TerminationStatus - Corresponds to the status of
//               the last exiting thread.
//******************************************************************************
void
ProcessWaitForTermination(
    IN PPROCESS Process,
    OUT STATUS* TerminationStatus
    );

//******************************************************************************
// Function:     ProcessCloseHandle
// Description:  Closes a process handle received from ProcessCreate. This is
//               necessary for the structure to be destroyed when it is no
//               longer needed.
// Returns:      void
// Parameter:    PPROCESS Process
//******************************************************************************
void
ProcessCloseHandle(
    _Pre_valid_ _Post_invalid_
    PPROCESS Process
    );

//******************************************************************************
// Function:     ProcessGetName
// Description:  Returns the name of the currently executing process (if the
//               parameter is NULL) or of the specified process
// Returns:      const char*
// Parameter:    IN_OPT PPROCESS Process
//******************************************************************************
const
char*
ProcessGetName(
    IN_OPT PPROCESS Process
    );

//******************************************************************************
// Function:     ProcessGetId
// Description:  Returns the PID of the currently executing process (if the
//               parameter is NULL) or of the specified process
// Returns:      PID
// Parameter:    IN_OPT PPROCESS Process
//******************************************************************************
PID
ProcessGetId(
    IN_OPT PPROCESS Process
    );

//******************************************************************************
// Function:     ProcessIsSystem
// Description:  Checks if a process or the currently executing process (if the
//               parameter is NULL) is the system process.
// Returns:      BOOLEAN
// Parameter:    IN_OPT PPROCESS Process
//******************************************************************************
BOOLEAN
ProcessIsSystem(
    IN_OPT PPROCESS Process
    );
//******************************************************************************
// Function:     ProcessTerminate
// Description:  Signals a process for termination (the current process will be
//               terminated if the parameter is NULL).
// Returns:      void
// Parameter:    INOUT PPROCESS Process
//******************************************************************************
void
ProcessTerminate(
    INOUT PPROCESS Process
    );

//******************************************************************************
// Function:     GetCurrentProcess
// Description:  Retrieves the executing process.
// Returns:      PPROCESS
// Parameter:    void
//******************************************************************************
PPROCESS
GetCurrentProcess(
    void
    );

//******************************************************************************
// Function:     ProcessSystemInitSystemProcess
// Description:  Initializes the System process.
// Returns:      STATUS
// Parameter:    void
//******************************************************************************
_No_competing_thread_
STATUS
ProcessSystemInitSystemProcess(
    void
    );

//******************************************************************************
// Function:     ProcessRetrieveSystemProcess
// Description:  Retrieves a pointer to the system process.
// Returns:      PPROCESS
// Parameter:    void
//******************************************************************************
PPROCESS
ProcessRetrieveSystemProcess(
    void
    );
//******************************************************************************
// Function:     ProcessInsertThreadInList
// Description:  Inserts the Thread in the Process thread list.
// Returns:      void
// Parameter:    INOUT PPROCESS Process
// Parameter:    INOUT struct _THREAD* Thread
//******************************************************************************
void
ProcessInsertThreadInList(
    INOUT PPROCESS Process,
    INOUT struct _THREAD* Thread
    );

//******************************************************************************
// Function:     ProcessNotifyThreadTermination
// Description:  Called when a thread terminates execution. If this was the last
//               active thread in the process it will signal the process's
//               termination event.
// Returns:      void
// Parameter:    IN struct _THREAD* Thread
//******************************************************************************
void
ProcessNotifyThreadTermination(
    IN struct _THREAD* Thread
    );

//******************************************************************************
// Function:     ProcessRemoveThreadFromList
// Description:  Removes the Thread from its container process thread list.
//               Called when a thread is destroyed.
// Returns:      void
// Parameter:    INOUT struct _THREAD* Thread
//******************************************************************************
void
ProcessRemoveThreadFromList(
    INOUT struct _THREAD* Thread
    );

//******************************************************************************
// Function:     ProcessExecuteForEachProcessEntry
// Description:  Iterates over the all processes list and invokes Function on
//               each entry passing an additional optional Context parameter.
// Returns:      STATUS
// Parameter:    IN PFUNC_ListFunction Function
// Parameter:    IN_OPT PVOID Context
//******************************************************************************
STATUS
ProcessExecuteForEachProcessEntry(
    IN PFUNC_ListFunction Function,
    IN_OPT PVOID Context
    );

//******************************************************************************
// Function:     ProcessActivatePagingTables
// Description:  Performs a switch to the Process paging tables.
// Returns:      void
// Parameter:    IN PPROCESS Process
// Parameter:    IN BOOLEAN InvalidateAddressSpace - if TRUE all the cached
//               translations for the Process PCID will be flushed. This option
//               is useful when a process terminates and its PCID will be
//               later used by another process.
//******************************************************************************
void
ProcessActivatePagingTables(
    IN PPROCESS Process,
    IN BOOLEAN InvalidateAddressSpace
    );
then access. Once the stack is set up we can free this virtual address using MmuFreeSystemVirtualAddressForUserBuffer().
A.5 Synchronization
If sharing of resources between threads is not handled in a careful, controlled fash-
ion, the result is usually a big mess. This is especially the case in operating system kernels,
where faulty sharing can crash the entire machine. HAL9000 provides several synchroniza-
tion mechanisms to help out.
Depending on where and what you’ll want to synchronize you have the following
classes of synchronization mechanisms:
Primitive: these mechanisms wait for a resource through busy waiting; they do not
block thread execution and do not allow preemption because they disable inter-
rupts before starting to acquire the resource and leave them disabled until releasing
it. These synchronization mechanisms can be used anywhere in code. However,
because they disable interrupts, they should be used only when synchronizing
interrupt handlers with other code.
Interlocked operations: if a basic data type is shared and the operations performed on
the data are simple then atomic interlocked operations can be used. Some operations
include: increment, addition, exchange, compare and exchange. These mechanisms
are implemented at the hardware level.
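For example, a shared counter can be updated without any lock by using a hardware interlocked increment. The sketch below calls the MSVC-style intrinsic directly; the HAL9000 tree may wrap these operations under its own names:

// Illustrative sketch; HAL9000 may expose its own wrappers over these intrinsics.
#include <intrin.h>

static volatile long long m_packetsReceived;

void OnPacketReceived(void)
{
    // atomic read-modify-write: safe even when several CPUs (or an interrupt
    // handler and a thread) update the counter concurrently
    _InterlockedIncrement64(&m_packetsReceived);
}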
A good way to think of the difference between primitive and executive mechanisms is
the following: the primitive mechanisms are used to synchronize CPUs, while the executive
mechanisms synchronize threads.
In both the primitive and executive cases the difference between locks and events
can be thought of in terms of ownership. Locks have owners: the owner (either the CPU
or thread) must be the one releasing the lock, while in the case of an event anyone can
signal the event or clear it.
Also, when it comes to events, both primitive and executive events are classified in
two categories:
1. Notification events: once an event is signaled it remains that way until it is manually
cleared. This means that if N CPUs/threads are waiting for an event they are all
notified when the event is signaled and will continue execution.
2. Synchronization events: once an event is signaled it will remain that way only until
a CPU or thread receives the signal. This means that if N CPUs/threads are waiting
for an event only one of them will receive the notification and it will atomically clear
the event, not allowing anyone else to continue execution.
A.5.1 Primitive
As said earlier, this class of synchronization mechanisms disables interrupts from the
moment they try to acquire the resource until they release it.
If the OS used only primitive mechanisms a tight bottleneck would be created,
allowing interrupts to be delivered only for short periods of time, thus increasing system
latency and making everything less responsive. In this regard you should be careful and
use them as little as possible.
These should be used only when synchronizing data which is shared between an
interrupt handler and other code.
Locks
HAL9000 supports basic spinlocks (see Listing A.11 [Spinlock Interface]), moni-
tor locks (see monlock.h), read/write spinlocks (see rw_spinlock.h) and recursive read/write
spinlocks (see rec_rw_spinlock.h).
There's no use in shoving the interface for every kind of lock in this document. If
you're curious you can check out the mentioned files and read the comments to find out
how they work. You should not use the spinlock or monitor lock functions directly;
instead you should use the interface exposed in lock_common.h.
If the LockInit(), LockAcquire(), etc. functions are used, the operating sys-
tem will dynamically determine which basic lock type to use: spinlocks or monitor locks
(if the MONITOR feature is supported by the CPU). Monitor locks function the same as
spinlocks, except they conserve power and reduce memory contention by using a hardware
mechanism to MONITOR a memory region and be notified (MWAIT) when a memory
store occurs to the monitored region.
Listing A.11: Spinlock Interface
//******************************************************************************
// Function:     SpinlockInit
// Description:  Initializes a spinlock. No other spinlock* function can be used
//               before this function is called.
// Returns:      void
// Parameter:    OUT PSPINLOCK Lock
//******************************************************************************
void
SpinlockInit(
    OUT PSPINLOCK Lock
    );

//******************************************************************************
// Function:     SpinlockAcquire
// Description:  Spins until the Lock is acquired. On return interrupts will be

//******************************************************************************
// Function:     SpinlockTryAcquire
// Description:  Attempts to acquire the Lock. If it is free then the function
//               will take the lock and return with the interrupts disabled and
//               IntrState will hold the previous interruptibility state.
// Returns:      BOOLEAN - TRUE if the lock was acquired, FALSE otherwise
// Parameter:    INOUT PSPINLOCK Lock
// Parameter:    OUT INTR_STATE* IntrState
//******************************************************************************
BOOL_SUCCESS
BOOLEAN
SpinlockTryAcquire(
    INOUT PSPINLOCK Lock,
    OUT INTR_STATE* IntrState
    );
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
// Fu n ct io n : SpinlockIsOwner
// D e s c r i p t i o n : Checks if the current CPU is the lock owner .
// Returns : BOOLEAN
// P a r a m e t e r : IN P S P I N L O C K Lock
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
BOOLEAN
Spi nlo ckIs Owne r (
IN PSPINLOCK Lock
);
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
// Fu n ct io n : SpinlockRelease
// D e s c r i p t i o n : Re le a se s a p r e v i o u s l y a cq u ir ed Lock . O l d I n t r S t a t e should hold
// the value pr ev i ou s r e tu r ne d by S p i n l o c k A c q u i r e or
// SpinlockTryAcquire .
// Returns : void
// P a r a m e t e r : INOUT P S P I N L O C K Lock
// P a r a m e t e r : IN I N T R _ S T A T E O l d I n t r S t a t e
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
void
Spi nlo ckRe leas e (
INOUT PSPINLOCK Lock ,
IN INTR_STATE OldIntrState
);
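A minimal usage sketch is shown below; it is not taken from the HAL9000 sources, and the SpinlockAcquire() parameter order is assumed to mirror SpinlockTryAcquire(). In most code you would go through the generic LockInit()/LockAcquire() wrappers from lock_common.h instead of calling the spinlock functions directly.

static SPINLOCK gCounterLock;
static DWORD gCounter;

void CounterInit(void)
{
    SpinlockInit(&gCounterLock);
}

void CounterIncrement(void)
{
    INTR_STATE oldState;

    // Spins until the lock is taken; returns with interrupts disabled
    SpinlockAcquire(&gCounterLock, &oldState);
    gCounter++;                                 // keep spinlock-protected sections short
    // Releases the lock and restores the previous interruptibility state
    SpinlockRelease(&gCounterLock, oldState);
}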
Event
These mechanisms are not used for protecting a critical region, but for notifying
one or more CPUs that an event has occurred. If the event is a synchronization event only
one CPU receives the signal, while if it’s a notification event all the CPUs are informed
when the signal occurs.
//******************************************************************************
// Function:     EvtSignal
// Description:  Signals an event.
// Returns:      void
// Parameter:    INOUT EVENT* Event
//******************************************************************************
void
EvtSignal(
    INOUT EVENT* Event
    );
//******************************************************************************
// Function:     EvtClearSignal
// Description:  Clears an event signal.
// Returns:      void
// Parameter:    INOUT EVENT* Event
//******************************************************************************
void
EvtClearSignal(
    INOUT EVENT* Event
    );
//******************************************************************************
// Function:     EvtWaitForSignal
// Description:  Busy waits until an event is signaled.
// Returns:      void
// Parameter:    INOUT EVENT* Event
//******************************************************************************
void
EvtWaitForSignal(
    INOUT EVENT* Event
    );
//******************************************************************************
// Function:     EvtIsSignaled
// Description:  Checks if an event is signaled and returns instantly.
// Returns:      BOOLEAN - TRUE if the event was signaled, FALSE otherwise.
// Parameter:    INOUT EVENT* Event
//******************************************************************************
BOOLEAN
EvtIsSignaled(
    INOUT EVENT* Event
    );
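As a rough illustration of how these primitives can be combined, the sketch below has one CPU publish a value and signal an event while another busy-waits for it. It is not taken from the HAL9000 sources, and it assumes the EVENT variable was already initialized elsewhere (the initialization function is not part of the interface listed above).

static EVENT gDataReadyEvent;           // assumed to be initialized elsewhere
static volatile DWORD gSharedValue;

void ProducerRoutine(void)
{
    gSharedValue = 42;                  // publish the data first
    EvtSignal(&gDataReadyEvent);        // then notify the waiting CPU(s)
}

void ConsumerRoutine(void)
{
    EvtWaitForSignal(&gDataReadyEvent); // busy-waits until the event is signaled
    // gSharedValue can now be read
    EvtClearSignal(&gDataReadyEvent);   // re-arm the event for the next round
}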
A.5.2 Executive
These synchronization mechanisms are aware of the operating system and can therefore
be managed more efficiently.
They differ from the primitive ones in that they block thread execution: a thread which
cannot acquire the resource is blocked and stays that way until the resource is freed and
the thread is unblocked.
These mechanisms rely on the primitive ones for their implementation, but whereas the
primitive mechanisms keep interrupts disabled for the whole duration, the executive ones
require interrupts to be disabled only for a short time in their acquire and release functions;
once an executive resource has been acquired, interrupts remain enabled.
Mutex
Depending on its initialization, a mutex may be either recursive or not. If a mutex
is not recursive, the same thread is not allowed to take the mutex more than once before
releasing it.
If the mutex is recursive, the same thread can take the mutex as many times as it wants
(up to 255 times), but it must also release it the same number of times it acquired it.
//******************************************************************************
// Function:     MutexAcquire
// Description:  Acquires a mutex. If the mutex is currently held the thread
//               is placed in a waiting list and its execution is blocked.
// Returns:      void
// Parameter:    INOUT PMUTEX Mutex
//******************************************************************************
ACQUIRES_EXCL_AND_REENTRANT_LOCK(*Mutex)
REQUIRES_NOT_HELD_LOCK(*Mutex)
void
MutexAcquire(
    INOUT PMUTEX Mutex
    );
//******************************************************************************
// Function:     MutexRelease
// Description:  Releases a mutex. If there is a thread on the waiting list it
//               will be unblocked and placed as the lock's holder - this will
//               ensure fairness.
// Returns:      void
// Parameter:    INOUT PMUTEX Mutex
//******************************************************************************
RELEASES_EXCL_AND_REENTRANT_LOCK(*Mutex)
REQUIRES_EXCL_LOCK(*Mutex)
void
MutexRelease(
    INOUT PMUTEX Mutex
    );
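A minimal usage sketch (not taken from the HAL9000 sources) follows; the MutexInit() name and its second parameter are assumptions made for illustration - check the mutex header for the real initialization routine.

static MUTEX gBufferLock;
static BYTE gSharedBuffer[256];

void BufferLockInit(void)
{
    MutexInit(&gBufferLock, FALSE);     // assumed signature: FALSE => non-recursive
}

void BufferFill(BYTE Value)
{
    DWORD i;

    MutexAcquire(&gBufferLock);         // blocks (does not spin) if another thread holds it
    for (i = 0; i < sizeof(gSharedBuffer); ++i)
    {
        gSharedBuffer[i] = Value;       // longer critical section - interrupts stay enabled
    }
    MutexRelease(&gBufferLock);         // wakes up the first waiter, if any
}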
Executive Event
//******************************************************************************
// Function:     ExEventSignal
// Description:  Signals an event. If the waiting list is not empty it will
//               wake up one or multiple threads depending on the event type.
// Returns:      void
// Parameter:    INOUT EX_EVENT* Event
//******************************************************************************
void
ExEventSignal(
    INOUT EX_EVENT* Event
    );
//******************************************************************************
// Function:     ExEventClearSignal
// Description:  Clears an event signal.
// Returns:      void
// Parameter:    INOUT EX_EVENT* Event
//******************************************************************************
void
ExEventClearSignal(
    INOUT EX_EVENT* Event
    );
//******************************************************************************
// Function:     ExEventWaitForSignal
// Description:  Waits for an event to be signaled. If the event is not
//               signaled it will place the thread in a waiting list and block
//               its execution.
// Returns:      void
// Parameter:    INOUT EX_EVENT* Event
//******************************************************************************
void
ExEventWaitForSignal(
    INOUT EX_EVENT* Event
    );
Executive Timer
//******************************************************************************
// Function:     ExTimerStart
// Description:  Starts the timer countdown. If the time has already elapsed all
//               the waiting threads must be woken up.
// Returns:      void
// Parameter:    IN PEX_TIMER Timer
//******************************************************************************
void
ExTimerStart(
    IN PEX_TIMER Timer
    );
//******************************************************************************
// Function:     ExTimerStop
//******************************************************************************
// Function:     ExTimerWait
// Description:  Called by a thread to wait for the timer to trigger. If the
//               timer already triggered and it's not periodic, or if the timer
//               is uninitialized, this function must return instantly.
// Returns:      void
// Parameter:    INOUT PEX_TIMER Timer
//******************************************************************************
void
ExTimerWait(
    INOUT PEX_TIMER Timer
    );
//******************************************************************************
// Function:     ExTimerUninit
// Description:  Uninitializes a timer. It may not be used in the future without
//               calling the ExTimerInit function again. All threads waiting for
//               the timer must be woken up.
// Returns:      void
// Parameter:    INOUT PEX_TIMER Timer
//******************************************************************************
void
ExTimerUninit(
    INOUT PEX_TIMER Timer
    );
//******************************************************************************
// Function:     ExTimerCompareTimers
// Description:  Utility function to compare two timers.
// Returns:      INT64 - if NEGATIVE => the first timer's trigger time is earlier
//                     - if 0        => the timers' trigger times are equal
//                     - if POSITIVE => the first timer's trigger time is later
// Parameter:    IN PEX_TIMER FirstElem
// Parameter:    IN PEX_TIMER SecondElem
//******************************************************************************
INT64
ExTimerCompareTimers(
    IN PEX_TIMER FirstElem,
    IN PEX_TIMER SecondElem
    );
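A rough usage sketch follows (not taken from the HAL9000 sources). Only the ExTimerInit() name is mentioned in the interface above; its parameters, the timer-type value and the time unit used here are assumptions made purely for illustration.

void SleepApproximately(DWORD TimeUnits)                    // hypothetical helper
{
    EX_TIMER timer;

    ExTimerInit(&timer, ExTimerTypeRelative, TimeUnits);    // assumed signature and type name
    ExTimerStart(&timer);       // starts the countdown
    ExTimerWait(&timer);        // blocks the calling thread until the timer triggers
    ExTimerUninit(&timer);      // wakes any remaining waiters and invalidates the timer
}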
A.6 Memory Management
After reserving a range of virtual addresses you may commit it - this can be done at
page-level granularity; as an example, you may reserve 4 pages of memory but commit only
the first one and the last one.
When memory is committed it is NOT also mapped to a physical address. This is because
the VMM is lazy - see A.6.2 [Lazy Mapping] for details. However, if you want to make
sure that a VA range is mapped as soon as you commit it, you can pass the
VMM_ALLOC_TYPE_NOT_LAZY flag to the allocation function.
To release previously allocated memory, call VmmFreeRegion() or VmmFreeRegionEx().
De-committing memory can be done at page-level granularity; however, releasing memory
(i.e. freeing the reservation completely) is an all-or-nothing operation: either you don't
release the memory at all, or you release all of it.
NOTE: The VMM also provides functions for mapping and un-mapping memory; however,
you should not use these - use the MMU-provided functions instead, see A.6.4
[Memory Management Unit].
The functions which effectively work with the CPU paging structures to set up VA-to-PA
translations and to remove them are VmmMapMemoryInternal() and VmmUnmapMemoryEx().
Lazy Mapping
The way in which mappings from virtual to physical addresses are created can be
either eager or lazy. In eager mapping, once the virtual address is allocated it’s also mapped
to a physical address. In contrast, when using lazy mapping, the VA may be allocated,
but it will not be mapped to a physical address until it is actually needed, i.e. on first
access to that region.
Because the lazy mapping approach is much faster than the eager one, it is commonly
used in modern operating systems. It is also more practical not to reserve physical frames
for allocated virtual memory, because most of the time most of that memory will not be
used. For example, HAL9000 allocates 5 GB of virtual memory for each user-mode process
for its VAS management structures, but most processes will not use more than a few KB -
some even less.
Additionally, eager mapping may be impossible when allocating large ranges of
virtual addresses, as an example we may want to allocate a VA range of 1TB - however
most systems on which the OS will run certainly do not have 1TB of physical space - thus
making eager mapping impossible without supporting swap space.
In case lazy mapping is used, and the 1TB range is allocated, the physical frames
are allocated only to the VAs actually accessed on a per-page granularity, i.e. if from the
1TB range, we access only the first 100 bytes and the last 5 bytes, we’ll only have allocated
2 physical frames of memory.
HAL9000's default behavior is to lazily map virtual addresses; however, if you need
to make sure that once you allocate a VA range it is also backed by physical memory,
you can set the VMM_ALLOC_TYPE_NOT_LAZY flag when allocating virtual memory.
Okay, so how does the OS know that a virtual address is accessed for the first time
and it needs to be assigned to a physical frame? Well, a #PF will occur because the VA
is not mapped, and once the physical frame is reserved, and the paging structures are
updated to hold the VA to PA mapping, execution of the faulting instruction is restarted.
For more information about handling page faults, see A.6.5 [Page-fault handling].
Interface
The MMU is also responsible for aggregating the functionality exposed by the PMM,
VMM and heap for easier use and convenience. Here is a summary of what the MMU
provides:
Support for retrieving the VA to PA translation either in the context of the current
process or by using different paging structures: MmuGetPhysicalAddress() and the
Ex variant.
Support for validating that a buffer is valid and that the process has the required access
rights for the desired operation: MmuIsBufferValid().
NOTE: For managing heap memory you should use the ExAllocatePoolWithTag()
and ExFreePoolWithTag() functions exposed in ex.h.
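As a rough sketch of how heap memory is typically used: the ExAllocatePoolWithTag() call below mirrors the one from the hash-table example later in this guide, but the exact meaning of its 0 arguments and the ExFreePoolWithTag() signature assumed here should be checked against ex.h.

typedef struct _MY_DATA
{
    DWORD       Value;
} MY_DATA, *PMY_DATA;

void HeapUsageSketch(void)
{
    // Allocation: the 0 arguments mirror the hash-table example; verify their meaning in ex.h
    PMY_DATA pData = ExAllocatePoolWithTag(0, sizeof(MY_DATA), HEAP_TEST_TAG, 0);
    ASSERT(pData != NULL);

    pData->Value = 42;

    // Assumed signature: pointer + tag
    ExFreePoolWithTag(pData, HEAP_TEST_TAG);
}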
If all these steps complete successfully, the exception handler returns without modifying
any of the processor state saved before the exception; the instruction which caused the
#PF is then re-executed and the memory access now completes successfully.
A.7 Virtual Addresses
 63            48 47     39 38     30 29     21 20     12 11         0
+----------------+---------+---------+---------+---------+------------+
|     Unused     |  PML4   | Dir Ptr |   Dir   |  Table  |   Offset   |
+----------------+---------+---------+---------+---------+------------+
                            Virtual Address
Because of the way 64-bit mode works, accesses to virtual addresses require bit 47
to be replicated into bits 63:48; as a result, these bits are useless for determining the tables
used for address translation.
The next 4 groups of 9 bits (PML4, Dir Ptr, Dir, Table) each give an index inside a table,
which takes us to an entry holding the physical address of the next table - or, in the case
of the last group (Table), the physical address of the frame to which the virtual address
is mapped.
The final 12 bits give us the offset inside the physical frame.
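The decomposition above maps directly onto a few shifts and masks. The sketch below is purely illustrative - the macro names and the canonical-address check are made up here; HAL9000 defines its own helpers for this purpose.

#include <stdint.h>

// Illustrative only - HAL9000 provides its own macros for these operations.
#define VA_PML4_INDEX(va)    (((uint64_t)(va) >> 39) & 0x1FF)   // bits 47:39
#define VA_PDPT_INDEX(va)    (((uint64_t)(va) >> 30) & 0x1FF)   // bits 38:30 (Dir Ptr)
#define VA_PD_INDEX(va)      (((uint64_t)(va) >> 21) & 0x1FF)   // bits 29:21 (Dir)
#define VA_PT_INDEX(va)      (((uint64_t)(va) >> 12) & 0x1FF)   // bits 20:12 (Table)
#define VA_PAGE_OFFSET(va)   ((uint64_t)(va) & 0xFFF)           // bits 11:0

// A virtual address is valid (canonical) only if bits 63:48 are copies of bit 47.
static int VaIsCanonical(uint64_t va)
{
    uint64_t upper = va >> 47;                  // bits 63:47
    return (upper == 0) || (upper == 0x1FFFF);  // all zeroes or all ones
}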
HAL9000 offers the following macros and functions for working with virtual ad-
dresses:
A.8 Paging Tables
PteMap(): Creates an entry in a page table pointing to a physical address. The caller can
also specify the access rights and privilege level.
All of these functions work regardless of the hierarchy level of the paging structure.
Using VmmGetPhysicalAddressEx() both the accessed and dirty bits can be retrieved
for a virtual address from its corresponding page table entry. This is done by setting the
corresponding output pointer to a non-NULL address where the value should be placed.
This function can be found in vmm.h. NOTE: When retrieving the value of the
accessed or dirty bit, the corresponding bit is cleared from the paging table by
HAL9000.
A.9 List Structures
Real usage examples can be found throughout HAL9000's code, especially in thread.c,
process.c, heap.c and iomu.c.
A.10 Hash Table
We will discuss the operation in three stages: initializing the hash table, using the
hash table and destroying it.
An example of how to initialize a hash table is provided in Listing A.17 [Hash
Initialization Example].
After the hash table has been initialized, it can be populated with elements.
Listing A.17: Hash Initialization Example
 1  #define NO_OF_KEYS      8
 2
 3  // global hash table of processes
 4  HASH_TABLE gHashTable;
 5
 6  typedef struct _MY_PROCESS
 7  {
 8      PID             Id;
 9      ...
10      // MY_PROCESS elements will link in the global list through the HashEntry field
11      HASH_ENTRY      HashEntry;
12      ...
13      BYTE            Data;
14  } MY_PROCESS, *PMY_PROCESS;
15
16  void InitProcessHashTable(void)
17  {
18      // Pre-initialize the hash table, specify the maximum number of keys we want it to
19      // have and the size of the key. This function returns the size in bytes required
20      // for its internal HASH_TABLE_DATA structure, we will need to allocate this
21      // memory dynamically.
22      DWORD requiredHashSize = HashTablePreinit(&gHashTable, NO_OF_KEYS, sizeof(PID));
23
24      PHASH_TABLE_DATA pUnknown = ExAllocatePoolWithTag(0, requiredHashSize, HEAP_TEST_TAG, 0);
25      ASSERT(pUnknown != NULL);
26
27      // Initialize the hash table to use the HashFuncGenericIncremental hashing
28      // function and specify the difference in bytes between the offset to the Key
29      // field and the offset to the HASH_ENTRY field
30      HashTableInit(&gHashTable,
31                    pUnknown,
32                    HashFuncGenericIncremental,
33                    FIELD_OFFSET(MY_PROCESS, Id) - FIELD_OFFSET(MY_PROCESS, HashEntry));
34  }
The code in Listing A.18 [Hash Usage Example] provides some usage examples:
On line 7 an initialized MY_PROCESS structure is inserted into the hash table. Because
the size of the key, the hashing function and the offset between the HASH_ENTRY and
the key are already known, the only parameters required by this function are the hash
table and the HASH_ENTRY to insert.
On line 15 the element whose key is 0x4 is removed from the hash table. If the element
was not present a NULL pointer is returned.
On line 23 a check is made to make sure that the element whose key is 0x4 is no longer
present in the hash table.
16      if (pEntry != NULL)
17      {
18          PMY_PROCESS pProcess = CONTAINING_RECORD(pEntry, MY_PROCESS, HashEntry);
19          // ... work with the process structure ...
20      }
21
22      // Check if the element is still in the list
23      ASSERT(HashTableLookup(&gHashTable, &idToRemove) == NULL);
24
25      LOG("Number of elements in the hash table is %u\n", HashTableSize(&gHashTable));
26
27      // Iterate through all the elements of the list
28      // The iterator maintains the current traversal state within the hash table
29      HASH_TABLE_ITERATOR it;
30
31      HashTableIteratorInit(&gHashTable, &it);
32
33      while ((pEntry = HashTableIteratorNext(&it)) != NULL)
34      {
35          // process the entry
36      }
37  }
Finally, the code listed in Listing A.18 [Hash Usage Example] provides an
example of how to destroy the hash table:
On line 19 we see the call which will empty the hash table and call the provided
ProcessFreeFunc() for each element in the hash table.
On lines 1-10 we see the function which is called for each element removed from the
hash table when HashTableClear() is called. The Object parameter will point to the
HASH_ENTRY field from the structure, thus to get to the actual element we will
use the CONTAINING_RECORD macro. Once we have the element we can free its
memory.
On line 22 we see the call to free the memory which was allocated for the hash table's
internal implementation: it is no longer needed.
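A minimal sketch of what such a cleanup might look like is shown below; the HashTableClear() parameter list, the callback prototype and the ExFreePoolWithTag() signature are assumptions made for illustration and should be checked against the real headers.

// Sketch only - the callback and function signatures below are assumed.
static void ProcessFreeFunc(HASH_ENTRY* Object)
{
    // Object points to the HashEntry field, so recover the enclosing structure first
    PMY_PROCESS pProcess = CONTAINING_RECORD(Object, MY_PROCESS, HashEntry);
    ExFreePoolWithTag(pProcess, HEAP_TEST_TAG);
}

void UninitProcessHashTable(PHASH_TABLE_DATA pHashData)
{
    // Empties the hash table, invoking ProcessFreeFunc for each removed element
    HashTableClear(&gHashTable, ProcessFreeFunc);

    // The memory backing the table's internal HASH_TABLE_DATA is no longer needed
    ExFreePoolWithTag(pHashData, HEAP_TEST_TAG);
}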
A.11 Hardware Timers
1. Periodic interrupts: the interrupt triggers each time the timer period expires. The
interrupt period can range from 122 µs to 500 ms.
2. Alarm interrupt: the interrupt is generated when the system time reaches a software
programmed value (hour:minute:second).
More information about the RTC can be found in [14] Chapter 12.6 Real Time
Clock Registers and OS dev - RTC.
If the argument passed is 0 the timer is stopped; otherwise it is enabled in periodic mode,
triggering once every Microseconds. When the interrupt triggers, SmpApicTimerIsr() will be
executed.
More information about the LAPIC timer can be found in [8] Chapter 10.5.4 APIC
Timer and OS dev - LAPIC Timer.
Appendix B
Debugging
HAL9000 is a complex project and sometimes, when you make a change in the code,
you'll see that what actually happens in the system differs from what you were expecting.
Unfortunately, HAL9000 doesn't have support for a debugger, YET! We wish to
write one for the third iteration of the project.
However, in the author's opinion, sufficient tools are available to track down and fix
any bugs which may arise in the code.
The following sections will go through the different techniques available for debug-
ging code and detecting errors:
Signaling function failure: by returning (preferably) unique status values from func-
tions which could fail, you can pinpoint the location of an error more easily.
Logging: simply write messages logging the data which interests you; usually, when
there are bugs, the information logged will differ from what you were expecting.
Asserts: validate all assumptions. Even if you know that the sun rises in the east
or that only threads in the dying state can be destroyed, make sure those conditions
hold.
Disassembly: follow the assembly instructions which caused your system to crash
and determine the call stack.
Halt debugging: when you're desperate and the system reboots in an infinite loop,
place some HLT instructions in the code to diagnose the problem.
The recommended way of doing this is by using the SUCCEEDED macro, which returns
TRUE in case of success and FALSE in case of an error or warning status.
The reason this format was chosen is so that it can be used in Windows applications
as well, without conflicting with existing Microsoft-defined statuses. The format is defined
by Microsoft - you don't need to know the details, but the idea is that this format is very
extensible and can be used by many different vendors (or, in our case, components) without
overlapping values.
VVVV stands for the value within the component - unfortunately there are too many
values to enumerate here, but once you go to status.h you can easily determine the
exact status.
Let's take an example of a status value and determine what it means: we have the
following log message: [ata.c][82]AtaInitialize failed with status: 0xE0010001.
First of all, the text between the first set of brackets tells us the file in which the
message was logged - ata.c.
Then, we have the number between the second set of brackets, which indicates the line
number - 82.
And finally we see the status value 0xE0010001.
We always start interpreting from LEFT to RIGHT (starting from the MSB and going
towards the LSB).
The severity is 0xE, meaning it is an error message. The component is 0x001, meaning
it is a device error message. And finally, the value within the component is 0x0001;
going to status.h and searching for this value among the statuses marked with
DEVICE_MASK, we find the status name to be STATUS_DEVICE_DOES_NOT_EXIST.
When the status name is not explicit enough to understand what it means or when
it is returned you can always search in the whole solution to see where that status value
is used.
Ideally, statuses should not be too generic, but they should be specific enough for
the programmer to be able to pinpoint the location (or few locations) from which the value
could have been returned.
NOTE: It is recommended that you add new status values when you’re
working on your project and the code added can fail in a way not described
by the existing values.
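For instance, checking and decoding a status in code might look like the sketch below. The SUCCEEDED macro and LOG_ERROR() are the ones mentioned in this appendix, while the STATUS type name, the AtaInitialize() prototype and the field-extraction macros are assumptions that merely match the 0xE0010001 example above - status.h defines the real layout.

// Illustrative decoding, matching the interpretation of 0xE0010001 above:
// 4 bits of severity, 12 bits of component, 16 bits of value.
#define STATUS_SEVERITY(Status)     (((Status) >> 28) & 0xF)
#define STATUS_COMPONENT(Status)    (((Status) >> 16) & 0xFFF)
#define STATUS_VALUE(Status)        ((Status) & 0xFFFF)

void SomeCaller(void)
{
    STATUS status = AtaInitialize();    // assumed prototype - any status-returning call works
    if (!SUCCEEDED(status))
    {
        LOG_ERROR("AtaInitialize failed with status: 0x%X\n", status);
        return;
    }
    // ... continue only on success ...
}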
B.2 Logging
The easiest thing to do if you don’t understand something is to log it. There are
different logging levels which you may use depending on the importance of the logged
information: trace, info, warning and error.
You can set the logging level you want to use when calling LogSystemInit() to initialize
the logging system, or change it at any time with LogSetLevel().
For a log message to be shown, its log level must be greater than or equal to the
logging level set, i.e. if the system logging level is currently LogLevelWarning then only
warning and error messages will be displayed, while info and trace messages will be ignored.
Also, there are different components which may log trace messages, and you may
activate/deactivate the logging of trace messages based on the component emitting them.
This can be set, like the logging level, either when initializing the logging system or by
calling LogSetTracedComponents() at a later time. For example, if you want to log all
trace messages only from the generic and exception components, you would either do as
shown in Listing B.1 [Logging Init] or as shown in Listing B.2 [Logging Change]
after logging has been initialized.
Listing B.1: Logging Init
LogSystemInit(LogLevelTrace,
              LogComponentGeneric | LogComponentException,
              TRUE
              );
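If logging has already been initialized, the same effect can be obtained at run time with the functions mentioned above; the parameter lists assumed below should be checked against log.h.

// Sketch only - the exact prototypes are in log.h.
LogSetLevel(LogLevelTrace);
LogSetTracedComponents(LogComponentGeneric | LogComponentException);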
All of the logging functions and definitions can be found in the log.h file. At first,
things may seem confusing, but here are the basic things that you need to know:
LOG_WARNING() If you want to log warnings; these should be unexpected things
from which the function can recover.
LOG_ERROR() If you want to log errors; these should be used if the errors cannot
be handled and the function emitting the message failed execution.
LOG_TRACE_THREAD() If you want to log a message on behalf of the threading
component and include the CPU id and the file and line from which the log occurred.
Other logging mechanisms exist for each component: the LOG_TRACE_*() functions.
Be careful not to log too much in functions called often; this can slow down the system
considerably, to the point that close to no progress is made.
Logging is safe to use in both normally executing code and in interrupt handlers
because it uses primitive locks for synchronization. However, you cannot log messages from
within the logging functions themselves: this would cause infinite recursion and the OS
would crash due to a #PF caused by a stack overflow which cannot be resolved.
If you want to dump a raw memory region to find out what's there you can use
the DumpMemory() function, or you can use the more specialized functions available in the
dmp_*.h files, which display information about a specific component/device/entity in a
more organized way. As an example see DumpInterruptStack() or DumpProcess().
B.3 Asserts
Another mechanism to make sure everything works as expected is to use asserts.
The code is already full of them (1000+ instances). DO NOT REMOVE ANY OF
THE EXISTING ASSERTS!
When you place an assert in the code, the condition is something you expect to be
true, a.k.a. an invariant. If the condition does not hold once execution reaches that point,
the current CPU will stop execution, notify the other CPUs that a fatal error has occurred
and log the condition which failed the assert.
As an example, let's go to the ThreadSchedule() function and have a look at one
of its asserts: ASSERT(INTR_OFF == CpuIntrGetState()). When execution reaches
this point the condition is verified, i.e. that interrupts are disabled when entering the
function; if this is not the case and for some reason interrupts are enabled, execution stops
and you will see the following message in the log file:
[ERROR][hal_assert.c][29][CPU:00]Kernel panic!
[ERROR][hal_assert.c][31][CPU:00]Thread: [main-00]
[ERROR][hal_assert.c][33][CPU:00][ASSERT][thread.c][1029]Condition: ((0 ==
CpuIntrGetState())) failed
One can easily see that 0 == CpuIntrGetState() didn't hold and that the check
happens in the thread.c file at line 1029.
HAL9000 is full of such asserts to make sure the functions are called correctly and
that, if some code changes, all the invariants are still respected. This is a large project
whose development effort spans almost a year; without these asserts it would be very easy
to forget how a code change in one component can affect other components and alter the
system's behavior. This is why asserts are used and this is why you should not remove
any of them.
Each time you work on your project you should be asking yourself: what conditions
have to hold for this function to work properly? These conditions should validate the
state of the system and, if the function is one which can only be used by other OS
components, the parameters. You should NOT assert the validity of parameters received
in a system call, because these are user-provided parameters, but you should assert if your
ThreadSetPriority() function receives an invalid priority, because this function can only be
called by other TRUSTED OS components.
To continue the example, when writing the ThreadSetPriority() function you should
assert that the thread priority is a valid one and that GetCurrentThread() returns a valid
non-NULL thread.
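A sketch of what that might look like follows; the THREAD_PRIORITY type and the ThreadPriorityMaximum bound are made-up names, used only to illustrate the idea.

void
ThreadSetPriority(
    IN THREAD_PRIORITY NewPriority              // type name is illustrative
    )
{
    // Callers are trusted OS components, so an invalid argument indicates a kernel bug
    ASSERT(NewPriority <= ThreadPriorityMaximum);   // hypothetical upper bound
    ASSERT(GetCurrentThread() != NULL);

    // ... the actual priority update ...
}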
B.4 Disassembly
Sometimes, when you make code changes, errors may occur which are not caught
by asserts or any other type of validation. These errors may lead to exceptions at run time.
IDA
IDA (or Interactive Disassembler) is a disassembler for computer software which
generates assembly language source code from machine-executable code. IDA
can be downloaded from https://www.hex-rays.com/products/ida/support/
download_freeware.shtml
Most of the time it is easy to pinpoint the function and the exact place where the
exception occurred once we look at the disassembled code: either because there is a logging
call close to the faulting RIP which also logs the file and line, or because we can easily
determine the function to which the instruction belongs.
We will now look at an example of how to diagnose a #PF using WinDbg or IDA.
We made some changes to the code and now HAL9000 crashes due to an unhandled
page fault. The last lines of the log file look like this:
[thread.c][184][CPU:00]_ThreadInit succeeded
[ERROR][isr.c][149]Could not handle exception 0xE [#PF - Page-Fault Exception]
Interrupt stack:
Error code: 0x0
RIP: 0xFFFF800001032CFB
CS: 0x18
RFLAGS: 0x10086
RSP: 0xFFFF85014365E850
SS: 0x20
Control registers:
CR0: 0x80010031
CR2: 0x20
CR3: 0x1140000
CR4: 0x100020
CR8: 0x0
Processor State:
RAX: 0x0
RCX: 0xC0000100
RDX: 0xFFFF850100000000
RBX: 0x80800
RSP: 0xCCCCCCCCCCCCCCCC
RBP: 0xFFFF800001006125
RSI: 0x408
RDI: 0xFFFF85014365E860
R8: 0xFFFF85014365E8B8
R9: 0xFFFF85014365E658
R10: 0xFFFF800001118000
R11: 0x1
R12: 0x0
R13: 0x0
R14: 0x0
R15: 0x0
RIP: 0xCCCCCCCCCCCCCCCC
Rflags: 0xCCCCCCCCCCCCCCCC
Faulting stack data:
[0xFFFF85014365E850]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E858]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E860]: 0xFFFF85014365E918
[0xFFFF85014365E868]: 0xFFFF80000103330F
[0xFFFF85014365E870]: 0x0
[0xFFFF85014365E878]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E880]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E888]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E890]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E898]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8A0]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8A8]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8B0]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8B8]: 0x80800000306C3
[0xFFFF85014365E8C0]: 0x1FABFBFFF7FA3223
[0xFFFF85014365E8C8]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8D0]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8D8]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8E0]: 0xCCCCCCCCCCCCCC00
[0xFFFF85014365E8E8]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8F0]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E8F8]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E900]: 0x4110860067F20473
[0xFFFF85014365E908]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E910]: 0xCCCCCCCCCCCCCCCC
[0xFFFF85014365E918]: 0xFFFF85014365EBB8
[0xFFFF85014365E920]: 0x80800
[0xFFFF85014365E928]: 0xFFFF80000106AE90
[0xFFFF85014365E930]: 0x0
[0xFFFF85014365E938]: 0xFFFF850140000390
[0xFFFF85014365E940]: 0xFFFF8000010EECA8
[0xFFFF85014365E948]: 0xFFFF8000010EEC9C
[ERROR][hal_assert.c][29][CPU:00]Kernel panic!
[ERROR][hal_assert.c][31][CPU:00]Thread: [main-00]
[ERROR][hal_assert.c][33][CPU:00][ASSERT][isr.c][152]Condition: (exceptionHandled)
failed
Exception 0xE was not handled
We can clearly see the value of the faulting RIP as being 0xFFFF800001032CFB
and the faulting address 0x20 (in CR2). When you see such low-valued addresses you
can be fairly sure they are caused by a NULL pointer dereference, i.e. a field at offset 0x20
within a structure is accessed through a NULL pointer.
We can now open the binary file with WinDbg or IDA.
NOTE: the .pdb file should be in the same folder as the binary file when
you open it. Because of this, our recommendation is to load the HAL9000.bin
file from the bin folder.
NOTE: Be careful not to have re-compiled HAL9000 since the time of the
crash. If you have done so, the disassembled instructions will not match those
that were executing when the crash occurred.
WinDbg
IDA
Open idaq64.exe from the folder where you have downloaded/installed IDA.
Choose to open a New file, and choose the file located at
HAL9000\bin\x64\Debug\HAL9000\HAL9000.bin.
Do not change any load options and click Ok.
WinDbg
In WinDbg we can use the ln command to jump to the source code line where the
faulting instruction was executed. To do this, type in ln <address> and hit Enter.
Then, click on the path in blue to open the source code where the exception occurred.
The process is shown in Figure B.2 [WinDbg: ln command].
To view the assembly code, click View Disassembly, type the address in the
Address: field and hit Enter.
The process is shown in Figure B.3 [WinDbg: Disassembly].
IDA
In IDA we can use the Jump to Address (G keyboard shortcut) to jump to the
faulting instruction: 0xFFFF800001032CFB. In our case we see Figure B.4 [IDA
disassembly near RIP].
From these 3 functions we can deduce that the one which actually called Pro-
cessGetName() is ProcessThreadInsertIntoList() because the last thing logged before the
exception is from thread.c line 184 which belongs to the ThreadSystemInitMainForCur-
rentCPU() function, which doesn’t interact with the other 2 candidate functions.
Usually, at this point we can remember exactly (or look at the diffs in our source
control system) what code changes we have made in this region.
In case this is not enough information, we can also look at the stack dump from the
end of the log to determine the call stack hierarchy. Addresses which begin with 0xFFFF85
correspond to dynamically allocated memory, so we only have to look at addresses of the
form 0xFFFF8000010...; these belong either to code or to data segments. This can be
determined with 100% precision by looking at the PE, but because there are very few
such addresses on the stack we can easily examine the position of each address using the
methods presented for WinDbg and IDA (the ln command and Jump To Address (G)).
B.5 Halt debugging
The machine hangs without rebooting. If this happens, you know that the HLT
instruction was reached and executed. That means that whatever caused the reboot
must be after the place you inserted the HLT. Now move HLT later in the code
sequence.
The machine reboots in a loop. If this happens, you know that the machine didn’t
make it to the HLT instruction. Thus, whatever caused the reboot must be before
the place you inserted the HLT instruction. Now move the HLT instruction earlier
in the code sequence.
If you move around the HLT instruction in a ”binary search” fashion, you can use
this technique to pin down the exact spot that everything goes wrong. It should only take
a few minutes at most.
NOTE: Instead of using the HLT instruction you could place an infinite
loop as suggested in the Pintos documentation.
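A temporary probe might look like the snippet below; __halt() is the MSVC intrinsic that emits the HLT instruction, though any way of executing HLT (or an infinite loop) works just as well.

#include <intrin.h>     // MSVC intrinsic header, provides __halt()

void SuspectedFunction(void)
{
    // ... code you believe runs before the reboot ...

    __halt();           // temporary probe: if the machine hangs here instead of rebooting,
                        // whatever causes the reboot lies after this point

    // ... code you suspect causes the reboot ...
}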
Appendix C
Development Tools
3. Download and install VMware Workstation (VMware Player will not work, and other
virtualization solutions are not suitable!). You can download it from here
and later activate it using your student license.
4. Unzip HAL9000's source code; we will refer to the folder where you unzipped it as
PROJECT ROOT DIRECTORY.
5. Find the install path of VMware Workstation. We will refer to the VMware install folder
as: PATH TO VIX TOOLS. If VMware is installed in the default location it should
be: C:\Program Files (x86)\VMware\VMware Workstation
6. Unzip the two virtual machines HAL9000 VM and PXE VM. We will refer to the
folder where you unzipped the HAL9000 VM as PATH TO HAL9000 VM.
7. Unzip the PXE archive, we will refer to the folder where you unzipped the files as
PATH TO PXE.
8. Open the HAL9000 VM and change its processor configuration so that the number
of virtual CPUs given to the machine equals the number of CPUs available on the
physical machine.
:config_<COMPUTER_NAME>
set PXE_PATH=<PATH_TO_PXE>
set PATH_TO_VM_DISK=<PATH_TO_HAL9000_VM>\HAL9000.vmdk
set PATH_TO_VIX_TOOLS=<PATH_TO_VIX_TOOLS>
set PATH_TO_LOG_FILE=<PATH_TO_HAL9000_VM>\HAL9000.log
set PATH_TO_VM_FILE=<PATH_TO_HAL9000_VM>\HAL9000.vmx
goto end
And the following line after "if %COMPUTERNAME% == ALEX-PC goto config_ALEX-PC":
Where the placeholder variables must be replaced with the proper paths as mentioned in
the configuration steps.
10. Open the virtual machines as described in C.1.2 [Opening the virtual machines].
11. Enable shared folder as described in C.1.3 [Enabling shared folder in the PXE
virtual machine].
12. Create a new virtual network as described in C.1.4 [Virtual Network Creation].
Once you have completed the first two steps you should download each archive described
in the configuration step and place them so that the folder layout looks like Figure C.1
[Folder structure for automatic configuration (File Explorer)] or Figure C.2
[Folder structure for automatic configuration (Total Commander)].
You can run the chosen script without any parameters and the script will extract
all the archives into a HAL9000 directory which it will create in the current directory. The
script will do all the configuration steps described in steps 4 to 9 (inclusive).
To run a perl script, open a Command Prompt window, navigate to the direc-
tory shown in Figure C.1 [Folder structure for automatic configuration (File
Explorer)] and type perl HAL9000_checkout.pl.
Once the script finishes execution, you need to continue the configuration manually
with C.1.2 [Opening the virtual machines].
The HAL9000 project: upon successful build the binary is copied to the PXE shared
folder so the PXE VM will be able to instantly provide the new image to the HAL9000
VM.
RunAllTests project: copies the Tests.module file to the PXE share and starts the
HAL9000 VM. The .module file is the one which tells the HAL9000 operating system
which tests it needs to run and validate.
After the HAL9000 VM has finished execution the log file, HAL9000.log, will be
parsed by the project and the results of each test and a summary will be displayed.
C.1.6 Troubleshooting
HAL9000 VM not receiving an IP
If you’re in the situation illustrated in Figure C.9 [DHCP fail] it means you are
not receiving an IP address. The reason is one of the following:
VMnet1 is not configured properly. It is important to use the IP and subnet addresses
specified in C.1.4 [Virtual Network Creation].
HAL9000 VM is not connected to VMnet1. You can check this by going to the virtual
machine, Edit virtual machine settings -> Hardware -> Network Adapter and make
sure the network selected is Custom: VMnet1. Also make sure Connect at power on
is checked for the network adapter.
PXE VM is not connected to VMnet1. Apply the same steps as if HAL9000 VM was
not connected to VMnet1.
HAL9000 project compilation failed. Make sure you successfully compiled your
project. Rebuild the HAL9000 project to be sure.
You may have skipped some configuration steps and because of this the HAL9000
project does not copy the output file to the PXE folder. Take a look in the PXE
folder and see if a file appears after you successfully compile HAL9000; if it does
not, go back to C.1 [Setting Up the Environment for HAL9000] and try to
figure out which configuration step you've skipped.
C.2 Visual Studio
Other tips for easy navigation can be found in C.2.1 [Keyboard Shortcuts].
By pressing ”ALT + SHIFT + S” a find symbol box opens and you can search only
for symbols. NOTE: This is only available with Visual Assist.
By pressing ”ALT + SHIFT + O” you can type the file name to which you want to
jump to. NOTE: This is only available with Visual Assist.
By pressing ”ALT + M” while in a file a drop-down will appear listing all the symbols
defined in this file. NOTE: This is only available with Visual Assist.
By pressing "ALT + G" (with Visual Assist) or "F12" on a function you can either
go to its declaration or its implementation. NOTE: The Visual Assist option
usually works better.
By pressing "CTRL + SHIFT + F" you can launch a text search in the whole solution,
project or files whose names follow a certain pattern.
By pressing "CTRL + F" you can perform a text search in the current file.
C.2.2 Check the platform toolset
To check the platform toolset with which HAL9000 was built, do the following steps:
2. Navigate to Configuration Properties -> General -> Platform Toolset and read
the platform toolset version, as shown in Figure C.14 [Platform Toolset]
1. Select all the projects using Ctrl + click and go to Properties as shown in Fig-
ure C.15 [Properties]
2. Navigate to Configuration Properties -> General -> Platform Toolset and se-
lect the desired platform toolset, then click Apply and Ok, as shown in Figure C.16
[Set the Platform Toolset]. If you cannot find the desired platform toolset in the
dropdown list, you have to install it (see C.2.4 [Install the desired platform
toolset]).
C.2.4 Install the desired platform toolset
1. In Visual Studio, go to Tools -> Get Tools and Features..., as shown in Fig-
ure C.17 [Get Tools and Features...].
2. The Visual Studio Installer should have opened. Click on Individual components,
enter the version of platform toolset that you want to install, check the checkbox and
click Modify, as shown in Figure C.18 [Install Platform Toolset].
C.3 Hg
It’s crucial that you use a source code control system to manage your HAL9000
code. This will allow you to keep track of your changes and coordinate changes made by
different people in the project. For this class we recommend that you use hg [12]. If you
don’t already know how to use hg, we recommend that you read the guide at hginit.
For hosting your project we recommend that you use Bitbucket [17]. You can use
any hosting site you like as long as the project repository is a private one, i.e. it is not
visible to any users outside your team (and maybe your TA).
C.4 Git
C.4.1 Why do I need it?
We strongly recommend using Git for versioning your code. If you choose to do so,
it is really important that your repo in the cloud is private, because we don't want the
HAL9000 source code leaked. Here are some reasons why you should use Git:
The laboratories are designed in such a way that you can start each lab (with some
exceptions) from a clean HAL9000 solution (one that does not include any of your
modifications). If you don’t use versioning, you will start each lab where you left
off the lab before and your code might contain bugs that will not let you finish the
current lab activity.
There are some laboratories that require you to use code that was written in previous
laboratories. With version control, it is easy to solve this problem. Supposing that
you need the changes from a branch named lab3 in order to implement lab4, you
have the following options:
– Create a new branch lab4 starting from branch lab3. Then, lab4 will have
from the start all the changes that are in lab3
– Create a new branch lab4 starting from branch master and merge lab3 into
lab4.
– Create a new branch lab4 starting from branch master and cherry-pick specific
commits (in case you don’t need everything from the previous lab)
When preparing for the lab exam, using a separate branch for each lab lets you
clearly see what code changes were needed to solve a given lab activity.
If you create a private repository on github/bitbucket and push your changes there,
you will have your code backed up.
During each lab activity, you can commit code snippets that you think are correct.
If you later add some changes and discover that they are not needed, you don’t have
to delete them manually, as you have the following ways to deal with them:
– if the bad changes are not committed, you can reset the branch to the latest
commit
– if the bad changes are in the last commit, you can reset the branch to an earlier
commit
– if the bad changes are in an older commit that is followed by good commits,
you can revert the bad commit or you can use interactive rebase to drop the
bad commit
At the end of each lab you can generate a git diff which will contain your source
code changes for that laboratory.
It might happen that we will update the HAL9000 source code during the semester.
If you use version control, it is really easy to apply our latest changes. The methods
are:
– you can change the remote repo back to the one that was initially cloned, pull
the master branch, then reset the remote repo to your private repo and rebase
your branches on master so that they contain our changes
– you can add another remote to your local repo. This way all your local branches
other than master can track your remote branch, but your local master branch
can track the remote repo that was initially cloned. This way you can easily
rebase your branches on top of master
– you can create a fork of the main repo (only possible on Bitbucket since our
repository is set to allow only private forks), the mechanism of keeping up to
date is nearly the same, you update the master from the fork, and rebase your
branches or cherry-pick the changes to them
Figure C.19: You can check the Git Bash Here and Git GUI Here options if you want to,
but git can be used from Command Prompt perfectly well.
Figure C.20: You can choose your favorite text editor from the dropdown list or try to add
a custom one.
C.4.3 Examples
Figure C.22: Changing the remote url of the repository and push
Figure C.23: Creating a new branch, checking out that branch and displaying the branches
Figure C.24: git status shows the branch you’re on and what are the changes not yet
committed
Figure C.25: git add * stages all your changes before committing them
Figure C.26: When issuing the very first commit, git asks you to configure an email and a
user name
Figure C.27: git commit commits your changes and git push origin pushes the changes
to the remote repository
Figure C.28: git reset --hard HEAD will discard the uncommitted changes and reset
your branch to the latest commit
Figure C.29: Creating new lab2 branch from master, merging in the modifications from
lab1 and showing that the changes are present on lab2
Figure C.30: Generating a diff at the end of lab2 to see what are the modifications com-
pared to the clean solution
Fork
Sourcetree
GitKraken
any other tool that you prefer
C.4.5 Diff tools
When several people work simultaneously on the same file and change the same line,
a merge conflict will be generated at merge time. Diff tools are capable of solving some
conflicts automatically and help you better visualize the conflicts that need to be solved
manually. You can use the following diff tools:
Appendix D
Coding Style
Unfortunately, because the work done on HAL9000 spans a long period (almost a year),
some minor coding style changes have been made along the way and the code is not 100%
consistent with the coding style.
When a function should only be used by a single C file it should be a static function.
Instead of using the goto cleanup construct, use the __try/__finally construction; see
MSDN try-finally for details and ProcessCreate() for a usage example.
Functions which can be called from outside the trust boundary (between different
projects or privilege levels) should validate all the parameters.
Internal functions which can be called only from inside the trust boundary (inside the
same project) should NOT validate any arguments, however it is a good technique
to use ASSERTs to validate the function’s parameters.
Validate the successful execution of any function you’re calling that returns a status.
Functions should be annotated using SAL.
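As a rough illustration of the last few points, an internal function following these conventions might look like the sketch below; the STATUS type, the status names and the annotation spellings are placeholders rather than text copied from the HAL9000 sources.

static
STATUS
_ExampleCreateBuffer(
    IN  DWORD   BufferSize,     // internal, trusted caller => ASSERT instead of validating
    OUT PVOID*  Buffer
    )
{
    STATUS status = STATUS_SUCCESS;         // placeholder status name
    PVOID pBuffer = NULL;

    ASSERT(BufferSize != 0);
    ASSERT(Buffer != NULL);

    __try
    {
        pBuffer = ExAllocatePoolWithTag(0, BufferSize, HEAP_TEST_TAG, 0);
        if (pBuffer == NULL)
        {
            status = STATUS_INSUFFICIENT_MEMORY;    // placeholder status name
            __leave;
        }

        *Buffer = pBuffer;
        pBuffer = NULL;                     // ownership transferred to the caller
    }
    __finally
    {
        if (pBuffer != NULL)
        {
            ExFreePoolWithTag(pBuffer, HEAP_TEST_TAG);  // assumed signature
        }
    }

    return status;
}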
Bibliography
[3] B. Pfaff, A. Romano, and G. Back, “The pintos instructional operating system
kernel,” in Proceedings of the 40th ACM Technical Symposium on Computer Science
Education, ser. SIGCSE ’09. New York, NY, USA: ACM, 2009, pp. 453–457.
[Online]. Available: http://doi.acm.org/10.1145/1508865.1509023
[8] Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C
& 3D): System Programming Guide, 059th ed. Intel, 2016.
[9] Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C
& 2D): Instruction Set Reference, A-Z, 059th ed. Intel, 2016.
[13] “Peering Inside the PE: A Tour of the Win32 Portable Executable File Format,”
https://msdn.microsoft.com/en-us/library/ms809762.aspx.
[14] Intel 9 Series Chipset Family Platform Controller Hub (PCH), 330550th ed. Intel,
June 2015.