Real-Time Operating System (RTOS) based Embedded System Design
[Fig. 10.1 The Operating System Architecture: user applications at the top, interfacing with the kernel services (process management, memory management, time management, file system management, I/O system management) through Application Programming Interfaces (APIs); the kernel talks to the underlying hardware through the device driver interface]
File System Management A file is a collection of related information. A file could be a program (source code or executable), a text file, an image file, a word document, an audio/video file, etc. Each of these files differs in the kind of information it holds and the way in which the information is stored. File handling is a useful service provided by the OS. The file system management service of the kernel is responsible for
• The creation, deletion and alteration of files
• Creation, deletion and alteration of directories
• Saving of files in the secondary storage memory (e.g. Hard disk storage)
• Providing automatic allocation of file space based on the amount of free space available
• Providing a flexible naming convention for the files
The various file system management operations are OS dependent. For example, the kernel of the Microsoft® DOS OS supports a specific set of file system management operations, which are not the same as the file system operations supported by the UNIX kernel.
I/O System (Device) Management The kernel is responsible for routing the I/O requests coming from different user applications to the appropriate I/O devices of the system. In a well-structured OS, direct accessing of I/O devices is not allowed; access to them is provided through a set of Application Programming Interfaces (APIs) exposed by the kernel. The kernel maintains a list of all the I/O devices of the system. This list may be available in advance, at the time of building the kernel. Some kernels dynamically update the list of available devices as and when a new device is installed (e.g. the Windows NT kernel keeps the list updated when a new plug ‘n’ play USB device is attached to the system). The ‘Device Manager’ service of the kernel (the name may vary across different OS kernels) is responsible for handling all I/O device related operations. The kernel talks to the I/O devices through a set of low-level system calls, which are implemented in services called device drivers. A device driver is specific to a device or a class of devices. The Device Manager is responsible for
• Loading and unloading of device drivers
• Exchanging information and the system specific control signals to and from the device
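To make the driver interface concrete, the following C sketch models a device driver as a table of function pointers that a Device Manager could load, invoke and unload uniformly; the structure and member names are illustrative assumptions, not the API of any real kernel.

#include <stddef.h>

/* Illustrative driver interface (assumed names, not a real kernel API).
   The Device Manager can treat every driver uniformly through this
   table of function pointers. */
typedef struct device_driver {
    const char *name;                           /* e.g. "uart0"                    */
    int  (*init)(void);                         /* called when the driver loads    */
    int  (*read)(void *buf, size_t len);        /* routes I/O read requests        */
    int  (*write)(const void *buf, size_t len); /* routes I/O write requests       */
    int  (*ioctl)(int cmd, void *arg);          /* device specific control signals */
    void (*deinit)(void);                       /* called when the driver unloads  */
} device_driver_t;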
Secondary Storage Management The secondary storage management deals with managing the secondary storage memory devices, if any, connected to the system. Secondary memory is used as a backup medium for programs and data, since the main memory is volatile. In most systems, the secondary storage is kept on disks (hard disk). The secondary storage management service of the kernel deals with
• Disk storage allocation
• Disk scheduling (Time interval at which the disk is activated to backup data)
• Free Disk space management
Protection Systems Most modern operating systems are designed to support multiple users with different levels of access permissions (e.g. Windows 10 with user permissions like ‘Administrator’, ‘Standard’, ‘Restricted’, etc.). Protection deals with implementing the security policies that restrict access to both user and system resources by different applications, processes or users. In multiuser operating systems, one user may not be allowed to view or modify the whole or portions of another user’s data or profile details. In addition, some applications may not be granted permission to make use of some of the system resources. This kind of protection is provided by the protection services running within the kernel.
Interrupt Handler The kernel provides a handler mechanism for all external/internal interrupts generated by the system.
These are some of the important services offered by the kernel of an operating system. This does not mean that a kernel contains no components/services other than those explained above. Depending on the type of operating system, a kernel may contain fewer or more components/services. In addition to the components/services listed above, many operating systems offer a number of add-on system components/services to the kernel. Network communication, network management, user-interface graphics, timer services (delays, timeouts, etc.), error handlers, database management, etc. are examples of such components/services. The kernel exposes the interface to the various kernel applications/services it hosts to the user applications through a set of standard Application Programming Interfaces (APIs). User applications can use these API calls to access the various kernel applications/services.
10.1.1.1 Kernel Space and User Space
As we discussed in the earlier section, the applications/services are classified into two categories, namely: user applications and kernel applications. The program code corresponding to the kernel applications/services is kept in a contiguous area (OS dependent) of primary (working) memory and is protected from unauthorised access by user programs/applications. The memory space at which the kernel code is located is known as ‘Kernel Space’. Similarly, all user applications are loaded to a specific area of primary memory and this memory area is referred to as ‘User Space’. User space is the memory area where user applications are loaded and executed. The partitioning of memory into kernel and user space is purely Operating System dependent. Some OSs implement this kind of partitioning and protection, whereas others do not segregate the kernel and user application code storage into two separate areas. In an operating system with virtual memory support, the user applications are loaded into their corresponding virtual memory space with the demand paging technique; meaning, the entire code for a user application need not be loaded to the main (primary) memory at once; instead, the user application code is split into different pages and these pages are loaded into and out of the main memory area on a need basis. The act of loading the code into and out of the main memory is termed ‘Swapping’. Swapping happens between the main (primary) memory and secondary storage memory. Each process runs in its own virtual memory space and is not allowed to access the memory space corresponding to another process, unless explicitly requested by the process. Each process has certain privilege levels on accessing the memory of other processes and, based on the privilege settings, a process can request the kernel to map another process’s memory to its own or share it through some other mechanism. Most operating systems keep the kernel application code in main memory; it is not swapped out into the secondary memory.
10.1.1.2 Monolithic Kernel and Microkernel
As we know, the kernel forms the heart of an operating system. Different approaches are adopted for building
an Operating System kernel. Based on the kernel design, kernels can be classified into ‘Monolithic’ and
‘Micro’.
Monolithic Kernel In monolithic kernel architecture, all kernel services run in the kernel space. Here all kernel modules run within the same memory space under a single kernel thread. The tight internal integration of kernel modules in monolithic kernel architecture allows the effective utilisation of the low-level features of the underlying system. The major drawback of a monolithic kernel is that any error or failure in any one of the kernel modules leads to the crashing of the entire kernel application. LINUX, SOLARIS and MS-DOS kernels are examples of monolithic kernels. The architecture representation of a monolithic kernel is given in Fig. 10.2.
[Fig. 10.2 The Monolithic Kernel Model: applications in user space; monolithic kernel with all operating system services running in kernel space]
rules concerning time-critical allocation of a system’s resources. The RTOS decides which applications should
run in which order and how much time needs to be allocated for each application. Predictable performance
is the hallmark of a well-designed RTOS. This is best achieved by the consistent application of policies and
rules. Policies guide the design of an RTOS. Rules implement those policies and resolve policy conflicts.
Windows Embedded Compact, QNX, VxWorks, MicroC/OS-II, etc. are examples of Real-Time Operating Systems (RTOSs).
10.2.2.1 The Real-Time Kernel
The kernel of a Real-Time Operating System is referred to as the Real-Time kernel. Compared to the conventional OS kernel, the Real-Time kernel is highly specialised and contains only the minimal set of services required for running the user applications/tasks. The basic functions of a Real-Time kernel are listed
below:
• Task/Process management
• Task/Process scheduling
• Task/Process synchronisation
• Error/Exception handling
• Memory management
• Interrupt handling
• Time management
Task/Process management Deals with setting up the memory space for the tasks, loading the task’s
code into the memory space, allocating system resources, setting up a Task Control Block (TCB) for the
task and task/process termination/deletion. A Task Control Block (TCB) is used for holding the information
corresponding to a task. TCB usually contains the following set of information.
Task ID: Task Identification Number
Task State: The current state of the task (e.g. State = ‘Ready’ for a task which is ready to execute)
Task Type: Indicates the type of the task, i.e. whether it is a hard real-time, soft real-time or background task.
Task Priority: Task priority (e.g. Priority = 1 for a high-priority task in kernels where lower priority numbers indicate higher priority; the convention is kernel dependent)
Task Context Pointer: Pointer to the memory area where the task’s context is saved
Task Memory Pointers: Pointers to the code memory, data memory and stack memory for the task
Task System Resource Pointers: Pointers to system resources (semaphores, mutex, etc.) used by the task
Task Pointers: Pointers to other TCBs (TCBs for preceding, next and waiting tasks)
Other Parameters: Other relevant task parameters
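As a rough sketch, a TCB carrying the fields listed above could be declared in C as below; every name and field choice here is an illustrative assumption, since (as noted next) the actual TCB layout is kernel dependent.

/* Illustrative TCB layout (assumed field names; real kernels differ). */
typedef enum { TASK_CREATED, TASK_READY, TASK_RUNNING,
               TASK_BLOCKED, TASK_COMPLETED } task_state_t;
typedef enum { TASK_HARD_RT, TASK_SOFT_RT, TASK_BACKGROUND } task_type_t;

typedef struct tcb {
    unsigned int  task_id;     /* Task Identification Number                   */
    task_state_t  state;       /* current state, e.g. TASK_READY               */
    task_type_t   type;        /* hard real-time, soft real-time or background */
    unsigned int  priority;    /* task priority                                */
    void         *context;     /* pointer for context saving                   */
    void         *code_mem;    /* pointer to code memory                       */
    void         *data_mem;    /* pointer to data memory                       */
    void         *stack_mem;   /* pointer to stack memory                      */
    void         *resources;   /* semaphores, mutexes, etc. used by the task   */
    struct tcb   *prev, *next; /* TCBs of preceding and next tasks             */
    struct tcb   *waiting;     /* TCB of a waiting task                        */
} tcb_t;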
The parameters and implementation of the TCB are kernel dependent. The TCB parameters vary across different kernels, based on the task management implementation. The task management service utilises the TCB of a task in the following way:
• Creates a TCB for a task on creating the task
• Deletes/removes the TCB of a task when the task is terminated or deleted
• Reads the TCB to get the state of a task
• Updates the TCB with new parameters on a need basis (e.g. on a context switch)
• Modifies the TCB to change the priority of the task dynamically
Task/Process Scheduling Deals with sharing the CPU among various tasks/processes. A kernel application called ‘Scheduler’ handles the task scheduling. The scheduler is an algorithm implementation that performs the efficient and optimal scheduling of tasks to provide deterministic behaviour. We will discuss
the various types of scheduling in a later section of this chapter.
Task/Process Synchronisation Deals with synchronising the concurrent access of a resource, which is shared across multiple tasks, and the communication between various tasks. We will discuss the various synchronisation techniques and inter-task/process communication in a later section of this chapter.
Error/Exception Handling Deals with registering and handling the errors occurred/exceptions raised during the execution of tasks. Insufficient memory, timeouts, deadlocks, deadline missing, bus error, divide by zero, unknown instruction execution, etc. are examples of errors/exceptions. Errors/exceptions can happen at the kernel level services or at the task level. Deadlock is an example of a kernel level exception, whereas timeout is an example of a task level exception. The OS kernel gives the information about the error through a system call (API). The GetLastError() API provided by the Windows CE/Embedded Compact RTOS is an example of such a system call. The watchdog timer is a mechanism for handling timeouts for tasks. Certain tasks may involve waiting for external events from devices. Such a task will wait indefinitely when the external device is not responding, and the task will exhibit a hang-up behaviour. In order to avoid these types of scenarios, a proper timeout mechanism should be implemented. A watchdog is normally used in such situations. The watchdog is loaded with the maximum expected wait time for the event; if the event is not triggered within this wait time, the same is informed to the task and the task is timed out. If the event happens before the timeout, the watchdog is reset.
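A minimal sketch of such a task-level watchdog is given below, assuming a counter that the kernel’s timer tick decrements; all function names here are hypothetical.

/* Hypothetical software watchdog for a task-level timeout. */
static volatile int wd_count;    /* remaining ticks before timeout        */
static volatile int wd_expired;  /* set when the expected event times out */

void watchdog_start(int max_wait_ticks)  /* load maximum expected wait time */
{
    wd_count = max_wait_ticks;
    wd_expired = 0;
}

void watchdog_event_arrived(void)        /* event happened: reset (disarm) */
{
    wd_count = 0;
}

void watchdog_tick(void)                 /* called from the timer tick ISR */
{
    if (wd_count > 0 && --wd_count == 0)
        wd_expired = 1;                  /* inform the task: it is timed out */
}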
Memory Management Compared to General Purpose Operating Systems, the memory management function of an RTOS kernel is slightly different. In general, the memory allocation time increases with the size of the block of memory that needs to be allocated and with the state of the allocated memory block (an initialised memory block consumes more allocation time than an un-initialised memory block). Since predictable timing and deterministic behaviour are the primary focus of an RTOS, the RTOS achieves this by compromising the effectiveness of memory allocation. An RTOS makes use of a ‘block’ based memory allocation technique, instead of the usual dynamic memory allocation techniques used by a GPOS. The RTOS kernel uses fixed-size blocks of dynamic memory, and a block is allocated to a task on a need basis. The blocks are stored in a ‘Free Buffer Queue’. To achieve predictable timing and avoid timing overheads, most RTOS kernels allow tasks to access any of the memory blocks without any memory protection. RTOS kernels assume that the whole design is proven correct and protection is unnecessary. Some commercial RTOS kernels offer memory protection as optional, and the kernel enters a fail-safe mode when an illegal memory access occurs.
A few RTOS kernels implement the Virtual Memory* concept for memory allocation if the system supports secondary memory storage (like HDD and FLASH memory). In the ‘block’ based memory allocation, a fixed-size block of memory is always allocated to a task on a need basis and it is taken as a unit. Hence, there will not be any memory fragmentation issues. The memory allocation can be implemented as constant-time functions, so it consumes a fixed amount of time for memory allocation. This leaves the deterministic behaviour of the RTOS kernel untouched. The ‘block’ memory concept avoids the garbage collection overhead as well. (We will explore this technique under the MicroC/OS-II kernel in a later chapter.) The ‘block’ based memory
* Virtual Memory is an imaginary memory supported by certain operating systems. Virtual memory expands the address space available
to a task beyond the actual physical memory (RAM) supported by the system. Virtual memory is implemented with the help of a
Memory Management Unit (MMU) and ‘memory paging’. The program memory for a task can be viewed as different pages and the
page corresponding to a piece of code that needs to be executed is loaded into the main physical memory (RAM). When a memory page
is no longer required, it is moved out to secondary storage memory and another page which contains the code snippet to be executed is
loaded into the main memory. This memory movement technique is known as demand paging. The MMU handles the demand paging
and converts the virtual address of a location in a page to corresponding physical address in the RAM.
allocation achieves deterministic behaviour with the trade-off of a limited choice of memory chunk sizes and suboptimal memory usage.
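A minimal sketch of this ‘block’ based allocation with a Free Buffer Queue is given below, assuming fixed 64-byte blocks and made-up names (real kernels, e.g. the MicroC/OS-II memory partitions, differ in detail). Both operations complete in constant time, which is what keeps the allocator deterministic.

#include <stddef.h>

#define BLOCK_SIZE 64                    /* fixed block size (assumption) */
#define NUM_BLOCKS 32

static unsigned char pool[NUM_BLOCKS][BLOCK_SIZE];
static void *free_queue[NUM_BLOCKS];     /* the 'Free Buffer Queue'       */
static int   free_top;                   /* number of queued free blocks  */

void pool_init(void)                     /* queue every block as free     */
{
    for (free_top = 0; free_top < NUM_BLOCKS; free_top++)
        free_queue[free_top] = pool[free_top];
}

void *block_alloc(void)                  /* constant-time allocation      */
{
    return (free_top > 0) ? free_queue[--free_top] : NULL;
}

void block_free(void *blk)               /* constant-time release         */
{
    if (free_top < NUM_BLOCKS)
        free_queue[free_top++] = blk;
}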
Interrupt Handling Deals with the handling of various types of interrupts. Interrupts provide Real-Time behaviour to systems. Interrupts inform the processor that an external device or an associated task requires immediate attention of the CPU. Interrupts can be either Synchronous or Asynchronous. Interrupts which occur in sync with the currently executing task are known as Synchronous interrupts. Usually the software interrupts fall under the Synchronous Interrupt category. Divide by zero, memory segmentation error, etc. are examples of synchronous interrupts. For synchronous interrupts, the interrupt handler runs in the same context as the interrupting task. Asynchronous interrupts are interrupts which occur at any point of execution of any task and are not in sync with the currently executing task. The interrupts generated by external devices connected to the processor/controller (by asserting the interrupt line of the processor/controller to which the interrupt line of the device is connected), timer overflow interrupts, serial data reception/transmission interrupts, etc. are examples of asynchronous interrupts. For asynchronous interrupts, the interrupt handler is usually written as a separate task (depending on the OS kernel implementation) and it runs in a different context. Hence, a context switch happens while handling asynchronous interrupts. Priority levels can be assigned to the interrupts and each interrupt can be enabled or disabled individually. Most RTOS kernels implement a ‘Nested Interrupts’ architecture. Interrupt nesting allows the pre-emption (interruption) of an Interrupt Service Routine (ISR), servicing an interrupt, by a higher priority interrupt.
Time Management Accurate time management is essential for providing a precise time reference for all applications. The time reference to the kernel is provided by a high-resolution Real-Time Clock (RTC) hardware chip (hardware timer). The hardware timer is programmed to interrupt the processor/controller at a fixed rate. This timer interrupt is referred to as the ‘Timer tick’. The ‘Timer tick’ is taken as the timing reference by the kernel. The ‘Timer tick’ interval may vary depending on the hardware timer. Usually the ‘Timer tick’ varies in the microseconds range. The time parameters for tasks are expressed as multiples of the ‘Timer tick’.
The System time is updated based on the ‘Timer tick’. If the System time register is 32 bits wide and the ‘Timer tick’ interval is 1 microsecond, the System time register will overflow and reset in
2^32 × 10^-6 / (24 × 60 × 60) days ≈ 0.0497 days ≈ 1.19 hours
If the ‘Timer tick’ interval is 1 millisecond, the System time register will reset in
2^32 × 10^-3 / (24 × 60 × 60) days ≈ 49.7 days ≈ 50 days
The ‘Timer tick’ interrupt is handled by the ‘Timer Interrupt’ handler of kernel. The ‘Timer tick’ interrupt
can be utilised for implementing the following actions.
• Save the current context (Context of the currently executing task).
• Increment the System time register by one. Generate timing error and reset the System time register if
the timer tick count is greater than the maximum range available for System time register.
• Update the timers implemented in kernel (Increment or decrement the timer registers for each timer
depending on the count direction setting for each register. Increment registers with count direction
setting = ‘count up’ and decrement registers with count direction setting = ‘count down’).
• Activate the periodic tasks, which are in the idle state.
• Invoke the scheduler and schedule the tasks again based on the scheduling algorithm.
• Delete all the terminated tasks and their associated data structures (TCBs)
• Load the context for the first task in the ready queue. Due to the re-scheduling, the task selected may be different from the task that was preempted by the ‘Timer tick’ interrupt.
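The outline below sketches how a ‘Timer tick’ ISR might string these actions together in C; the helper functions stand in for kernel internals and are assumptions, not a real kernel API.

extern volatile unsigned long system_time;  /* the System time register */

/* Placeholders for kernel internals (assumed names): */
void save_current_context(void);
void update_kernel_timers(void);
void activate_periodic_tasks(void);
void run_scheduler(void);
void reap_terminated_tasks(void);
void load_ready_task_context(void);

void timer_tick_isr(void)
{
    save_current_context();     /* save context of the running task          */
    system_time++;              /* increment System time (wraps on overflow) */
    update_kernel_timers();     /* count the kernel timers up or down        */
    activate_periodic_tasks();  /* move idle periodic tasks to ready         */
    run_scheduler();            /* re-schedule per the scheduling algorithm  */
    reap_terminated_tasks();    /* delete terminated tasks and their TCBs    */
    load_ready_task_context();  /* first task in the (possibly new) ready queue */
}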
Apart from these basic functions, some RTOSs provide other functionalities as well (examples are file management and network functions). Some RTOS kernels provide options for selecting the required kernel functions at the time of building a kernel. The user can pick the required functions from the set of available functions and compile the same to generate the kernel binary. Windows CE is a typical example of such an RTOS. While building the target, the user can select the required components for the kernel.
10.2.2.2 Hard Real-Time
Real-Time Operating Systems that strictly adhere to the timing constraints for a task are referred to as ‘Hard Real-Time’ systems. A Hard Real-Time system must meet the deadlines for a task without any slippage. Missing any deadline may produce catastrophic results for Hard Real-Time Systems, including permanent data loss and irrecoverable damage to the system/users. Hard Real-Time systems emphasise the principle ‘A late answer is a wrong answer’. A system can have several such tasks and the key to their correct operation lies in scheduling them so that they meet their time constraints. Air bag control systems and Anti-lock Brake Systems (ABS) of vehicles are typical examples of Hard Real-Time Systems. The air bag control system should come into action and deploy the air bags when the vehicle meets a severe accident. Ideally speaking, the time for triggering the air bag deployment task, when an accident is sensed by the air bag control system, should be zero, and the air bags should be deployed exactly within the time frame predefined for the air bag deployment task. Any delay in the deployment of the air bags puts the lives of the passengers under threat. When the air bag deployment task is triggered, the currently executing task must be pre-empted, the air bag deployment task should be brought into execution, and the necessary I/O systems should be made readily available for the air bag deployment task. To meet the strict deadline, the time between the triggering of the air bag deployment event and the start of the air bag deployment task execution should be minimal, ideally zero. As a rule of thumb, Hard Real-Time Systems do not implement the virtual memory model for handling memory. This eliminates the delay in swapping the code corresponding to the task into and out of the primary memory. In general, the presence of a Human in the Loop (HITL) for tasks introduces unexpected delays in the task execution. Most Hard Real-Time Systems are automatic and do not contain a ‘human in the loop’.
10.2.2.3 Soft Real-Time
Real-Time Operating Systems that do not guarantee meeting deadlines, but offer the best effort to meet them, are referred to as ‘Soft Real-Time’ systems. Missing deadlines for tasks is acceptable for a Soft Real-Time system if the frequency of deadline missing is within the compliance limit of the Quality of Service (QoS). A Soft Real-Time system emphasises the principle ‘A late answer is an acceptable answer, but it could have been done a bit faster’. Soft Real-Time systems most often have a ‘human in the loop (HITL)’. An Automatic Teller Machine (ATM) is a typical example of a Soft Real-Time System. If the ATM takes a few seconds more than the ideal operation time, nothing fatal happens. An audio-video playback system is another example of a Soft Real-Time system. No potential damage arises if a sample comes late by a fraction of a second for playback.
10.3.1 Process
A ‘Process’ is a program, or part of it, in execution. A process is also known as an instance of a program in execution. Multiple instances of the same program can execute simultaneously. A process requires various system resources like the CPU for executing the process, memory for storing the code corresponding to the process and associated variables, I/O devices for information exchange, etc. A process is sequential in execution.
10.3.1.1 The Structure of a Process
The concept of ‘Process’ leads to concurrent execution (pseudo parallelism) of tasks and thereby the efficient
utilisation of the CPU and other system resources. Concurrent execution is achieved through the sharing of
CPU among the processes. A process mimics a processor in properties and holds a set of registers, process
status, a Program Counter (PC) to point to the next executable instruction of the process, a stack for holding the
local variables associated with the process and the code corresponding to the process. This can be visualised
as shown in Fig. 10.4.
A process which inherits all the properties of the CPU can be considered as a virtual processor, awaiting
its turn to have its properties switched into the physical processor. When the process gets its turn, its registers
and the program counter register becomes mapped to the physical registers of the CPU. From a memory
perspective, the memory occupied by the process is segregated into three regions, namely, Stack memory,
Data memory and Code memory (Fig. 10.5).
[Figs. 10.4 and 10.5 The structure of a process (working registers, status registers, stack pointer and the code memory corresponding to the process) and its memory organisation: the stack memory at the highest address of the allocated area grows downwards, the data memory grows upwards, and the code memory holds the program code]
The ‘Stack’ memory holds all temporary data such as variables local to the process. Data memory holds all global data for the process. The code memory contains the program code (instructions) corresponding to the process. On loading a process into the main memory, a specific area of memory is allocated for the process. The stack memory usually starts (OS kernel implementation dependent) at the highest memory address of the memory area allocated for the process. For example, if the memory area allocated for the process spans addresses 2048 to 2100, the stack memory starts at address 2100 and grows downwards to accommodate the variables local to the process.
what it is allowed to do next. Figure 10.6 represents the various states associated with a process.
The state at which a process is being created is referred to as the ‘Created State’. The Operating System recognises a process in the ‘Created State’, but no resources are allocated to the process. The state where a process is incepted into the memory and is awaiting the processor time for execution is known as the ‘Ready State’. At this stage, the process is placed in the ‘Ready list’ queue maintained by the OS. The state in which the source code instructions corresponding to the process are being executed is called the ‘Running State’. Running state is the state at which the process execution happens. ‘Blocked State/Wait State’ refers to a state where a running process is temporarily suspended from execution and does not have immediate access to resources. The blocked state might be invoked by various conditions like: the process enters a wait state for an event to occur (e.g. waiting for user inputs such as keyboard input) or waits for getting access to a shared resource (discussed in a later section of this chapter). A state where the process completes its execution is known as the ‘Completed State’. The transition of a process from one state to another is known as ‘State transition’. When a process changes its state from Ready to Running, from Running to Blocked or Terminated, or from Blocked to Running, the CPU allocation for the process may also change.
[Fig. 10.6 Process states and state transition representation: Created, Ready, Running, Blocked and Completed states, with transitions ‘Scheduled for Execution’, ‘Interrupted or Preempted’ and ‘Execution Completion’]
It should be noted that the state representation for a process/task mentioned here is a generic representation. The states associated with a task may be known by different names, or there may be more or fewer states than the ones explained here, under different OS kernels. For example, under the VxWorks kernel, the tasks may be in either one or a specific combination of the states READY, PEND, DELAY and SUSPEND. The PEND state represents a state where the task/process is blocked waiting for I/O or a system resource. The DELAY state represents a state in which the task/process is sleeping, and the SUSPEND state represents a state where a task/process is temporarily suspended from execution and not available for execution. Under the MicroC/OS-II kernel, the tasks may be in one of the states DORMANT, READY, RUNNING, WAITING or INTERRUPTED. The DORMANT state represents the ‘Created’ state and the WAITING state represents the state in which a process waits for shared resource or I/O access. We will discuss the states and state transitions for tasks under the VxWorks and uC/OS-II kernels in a later chapter.
10.3.2 Threads
A thread is the primitive that can execute code. A thread is a single sequential flow of control within a process. ‘Thread’ is also known as a lightweight process. A process can have many threads of execution. Different threads, which are part of a process, share the same address space; meaning they share the data memory, code memory and heap memory area. Threads maintain their own thread status (CPU register values), Program Counter (PC) and stack. The memory model for a process and its associated threads is given in Fig. 10.7.
[Fig. 10.7 Memory organisation of a Process and its associated Threads: a separate stack memory for each thread within the stack memory of the process; the data memory and code memory of the process are shared by all its threads]
10.3.2.1 The Concept of Multithreading
A process/task in an embedded application may be a complex or lengthy one and it may contain various suboperations like getting input from I/O devices connected to the processor, performing some internal calculations/operations, updating some I/O devices, etc. If all the subfunctions of a task are executed in sequence, the CPU utilisation may not be efficient. For example, if the process is waiting for a user input, the CPU enters the wait state for the event and the process execution also enters a wait state. Instead of this single sequential execution of the whole process, if the task/process is split into different threads carrying out the different subfunctionalities of the process, the CPU can be effectively utilised; when the thread corresponding to the I/O operation enters the wait state, other threads which do not require the I/O event for their operation can be switched into execution. This leads to speedier execution of the process and efficient utilisation of the processor time and resources. The multithreaded architecture of a process can be better visualised with the thread-process diagram shown in Fig. 10.8.
If the process is split into multiple threads, each of which executes a portion of the process, there will be a main thread and the rest of the threads will be created within the main thread. The use of multiple threads to execute a process brings the following advantages.
• Better memory utilisation. Multiple threads of the same process share the address space for data memory. This also reduces the complexity of inter-thread communication since variables can be shared across the threads.
• Since the process is split into different threads, when one thread enters a wait state, the CPU can be utilised by other threads of the process that do not require the event for which the other thread is waiting. This speeds up the execution of the process; a sketch of this pattern is given below.
• Efficient CPU utilisation. The CPU is engaged all the time.
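The following POSIX threads sketch illustrates the second point: while one thread blocks waiting for input, another thread keeps the CPU busy. The thread bodies and names are made up for illustration.

#include <pthread.h>
#include <stdio.h>

static void *io_thread(void *arg)          /* blocks waiting for an I/O event */
{
    char line[64];
    (void)arg;
    if (fgets(line, sizeof line, stdin))   /* this thread (not the CPU) waits */
        printf("got input: %s", line);
    return NULL;
}

static void *worker_thread(void *arg)      /* needs no I/O event; keeps running */
{
    unsigned long sum = 0, i;
    (void)arg;
    for (i = 0; i < 1000000UL; i++)
        sum += i;
    printf("sum = %lu\n", sum);
    return NULL;
}

int main(void)                             /* the main thread creates the rest */
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, io_thread, NULL);
    pthread_create(&t2, NULL, worker_thread, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}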
[Fig. 10.8 The thread-process diagram: a task/process with its code memory and data memory shared among its threads]
Thread | Process
Thread is a single unit of execution and is part of a process. | Process is a program in execution and contains one or more threads.
A thread does not have its own data memory and heap memory. It shares the data memory and heap memory with other threads of the same process. | Process has its own code memory, data memory and stack memory.
A thread cannot live independently; it lives within the process. | A process contains at least one thread.
There can be multiple threads in a process. The first thread (main thread) calls the main function and occupies the start of the stack memory of the process. | Threads within a process share the code, data and heap memory. Each thread holds a separate memory area for its stack (sharing the total stack memory of the process).
Threads are very inexpensive to create. | Processes are very expensive to create; creation involves much OS overhead.
Context switching between threads is inexpensive and fast. | Context switching between processes is complex, involves a lot of OS overhead and is comparatively slower.
If a thread expires, its stack is reclaimed by the process. | If a process dies, the resources allocated to it are reclaimed by the OS and all the associated threads of the process also die.
any point of time, he throws only one ball and catches only one per hand. However, the speed at which he switches the balls for throwing and catching creates the illusion to the spectators that he is throwing and catching multiple balls or using more than two hands ☺ simultaneously.
[Figure: multitasking as ‘Context Switching’ between processes over time]
or system resource. Co-operative and non-preemptive multitasking differ in their behaviour when the executing process/task is in the ‘Blocked/Wait’ state. In co-operative multitasking, the currently executing process/task need not relinquish the CPU when it enters the ‘Blocked/Wait’ state, waiting for an I/O, a shared resource access or an event to occur, whereas in non-preemptive multitasking the currently executing task relinquishes the CPU when it waits for an I/O or system resource or an event to occur.
The Operating System maintains various queues† in connection with the CPU scheduling, and a process
passes through these queues during the course of its admittance to execution completion.
The various queues maintained by OS in association with CPU scheduling are:
Job Queue: Contains all the processes in the system.
Ready Queue: Contains all the processes which are ready for execution and waiting for the CPU to get their turn for execution. The Ready queue is empty when there is no process ready for running.
Device Queue: Contains the set of processes which are waiting for an I/O device.
A process migrates through all these queues during its journey from ‘Admitted’ to ‘Completed’ stage. The
following diagrammatic representation (Fig. 10.12) illustrates the transition of a process through the various
queues.
Based on the scheduling algorithm used, the scheduling can be classified into the following categories.
† Queue is a special kind of arrangement of a collection of objects. In the operating system context, a queue is considered as a buffer.
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds respectively enter the ready queue together in the order P1, P2, P3. Calculate the waiting time and Turn Around Time (TAT) for each process, and the average waiting time and Turn Around Time (assuming there is no I/O waiting for the processes).
The sequence of execution of the processes by the CPU is represented as
P1 (0 to 10 ms) → P2 (10 to 15 ms) → P3 (15 to 22 ms)
Assuming the CPU is readily available at the time of arrival of P1, P1 starts executing without any waiting in the ‘Ready’ queue. Hence the waiting time for P1 is zero. The waiting time for each process is given as
Waiting Time for P1 = 0 ms (P1 starts executing first)
Waiting Time for P2 = 10 ms (P2 starts executing after completing P1)
Waiting Time for P3 = 15 ms (P3 starts executing after completing P1 and P2)
Average waiting time = (Waiting time for all processes) / No. of Processes
= (Waiting time for (P1+P2+P3)) / 3
= (0+10+15)/3 = 25/3
= 8.33 milliseconds
Turn Around Time (TAT) for P1 = 10 ms (Time spent in Ready Queue + Execution Time)
Turn Around Time (TAT) for P2 = 15 ms (-Do-)
Turn Around Time (TAT) for P3 = 22 ms (-Do-)
Average Turn Around Time = (Turn Around Time for all processes) / No. of Processes
= (Turn Around Time for (P1+P2+P3)) / 3
= (10+15+22)/3 = 47/3
= 15.66 milliseconds
Average Turn Around Time (TAT) is the sum of average waiting time and average execution time.
Average Execution Time = (Execution time for all processes)/No. of processes
= (Execution time for (P1+P2+P3))/3
= (10+5+7)/3 = 22/3
= 7.33
Average Turn Around Time = Average waiting time + Average execution time
= 8.33 + 7.33
= 15.66 milliseconds
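The same arithmetic can be expressed compactly in C. The sketch below assumes, as in the example, that all processes arrive together at t = 0; fcfs_metrics() is an illustrative helper, not a standard API.

#include <stdio.h>

/* Waiting time and TAT under non-preemptive FCFS, all arrivals at t = 0. */
void fcfs_metrics(const int burst_ms[], int n)
{
    int t = 0, wait_sum = 0, tat_sum = 0;
    for (int i = 0; i < n; i++) {
        int waiting = t;                /* time spent in the Ready queue */
        int tat     = t + burst_ms[i];  /* waiting time + execution time */
        printf("P%d: waiting = %d ms, TAT = %d ms\n", i + 1, waiting, tat);
        wait_sum += waiting;
        tat_sum  += tat;
        t += burst_ms[i];               /* next process starts here      */
    }
    printf("averages: waiting = %.2f ms, TAT = %.2f ms\n",
           (double)wait_sum / n, (double)tat_sum / n);
}

int main(void)
{
    int burst[] = { 10, 5, 7 };         /* P1, P2, P3 in arrival order   */
    fcfs_metrics(burst, 3);             /* prints averages 8.33 and 15.67 */
    return 0;
}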
Example 2
Calculate the waiting time and Turn Around Time (TAT) for each process and the Average waiting time and
Turn Around Time (Assuming there is no I/O waiting for the processes) for the above example if the process
enters the ‘Ready’ queue together in the order P2, P1, P3.
The sequence of execution of the processes by the CPU is represented as
P2 (0 to 5 ms) → P1 (5 to 15 ms) → P3 (15 to 22 ms)
Assuming the CPU is readily available at the time of arrival of P2, P2 starts executing without any waiting in
the ‘Ready’ queue. Hence the waiting time for P2 is zero. The waiting time for all processes is given as
Waiting Time for P2 = 0 ms (P2 starts executing first)
Waiting Time for P1 = 5 ms (P1 starts executing after completing P2)
Waiting Time for P3 = 15 ms (P3 starts executing after completing P2 and P1)
Average waiting time = (Waiting time for all processes) / No. of Processes
= (Waiting time for (P2+P1+P3)) / 3
= (0+5+15)/3 = 20/3
= 6.66 milliseconds
Turn Around Time (TAT) for P2 = 5 ms (Time spent in Ready Queue + Execution Time)
Turn Around Time (TAT) for P1 = 15 ms (-Do-)
Turn Around Time (TAT) for P3 = 22 ms (-Do-)
Average Turn Around Time = (Turn Around Time for all processes) / No. of Processes
= (Turn Around Time for (P2+P1+P3)) / 3
= (5+15+22)/3 = 42/3
= 14 milliseconds
The Average waiting time and Turn Around Time (TAT) depend on the order in which the processes enter the ‘Ready’ queue, regardless of their estimated completion times.
From the above two examples it is clear that the Average waiting time and Turn Around Time improve if the process with the shortest execution completion time is scheduled first.
The major drawback of the FCFS algorithm is that it favours monopolisation of the CPU by a single process. A process which does not contain any I/O operation continues its execution until it finishes its task; only if the process contains an I/O operation is the CPU relinquished by the process. In general, FCFS favours CPU bound processes, and I/O bound processes may have to wait until the completion of a CPU bound process, if the currently executing process is a CPU bound process. This leads to poor device utilisation. The average waiting time is not minimal for the FCFS scheduling algorithm.
10.5.1.2 Last-Come-First Served (LCFS)/LIFO Scheduling
The Last-Come-First Served (LCFS) scheduling algorithm also allocates CPU time to the processes based
on the order in which they are entered in the ‘Ready’ queue. The last entered process is serviced first. LCFS
scheduling is also known as Last In First Out (LIFO) where the process, which is put last into the ‘Ready’
queue, is serviced first.
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds respectively enter the ready queue together in the order P1, P2, P3 (assume only P1 is present in the ‘Ready’ queue when the scheduler picks it up and P2, P3 entered the ‘Ready’ queue after that). Now a new process P4 with estimated
completion time 6 ms enters the ‘Ready’ queue after 5 ms of scheduling P1. Calculate the waiting time and
Turn Around Time (TAT) for each process and the Average waiting time and Turn Around Time (Assuming
there is no I/O waiting for the processes). Assume all the processes contain only CPU operation and no I/O
operations are involved.
Initially only P1 is available in the ‘Ready’ queue, so the scheduling sequence would have been P1, P3, P2. P4 enters the queue during the execution of P1 and becomes the last process to enter the ‘Ready’ queue. Now the order of execution changes to P1, P4, P3 and P2, as given below.
P1 (0 to 10 ms) → P4 (10 to 16 ms) → P3 (16 to 23 ms) → P2 (23 to 28 ms)
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds respectively enter the ready queue together. Calculate the waiting time and Turn Around Time (TAT) for each process and the average waiting time and Turn Around Time (assuming there is no I/O waiting for the processes) in the
SJF algorithm.
The scheduler sorts the ‘Ready’ queue based on the shortest estimated completion time and schedules the
process with the least estimated completion time first and the next least one as second, and so on. The order
in which the processes are scheduled for execution is represented as
P2 (0 to 5 ms) → P3 (5 to 12 ms) → P1 (12 to 22 ms)
The estimated execution time of P2 is the least (5 ms) followed by P3 (7 ms) and P1 (10 ms).
The waiting time for each process is given as
Waiting Time for P2 = 0 ms (P2 starts executing first)
Waiting Time for P3 = 5 ms (P3 starts executing after completing P2)
Waiting Time for P1 = 12 ms (P1 starts executing after completing P2 and P3)
Average waiting time = (Waiting time for all processes) / No. of Processes
= (Waiting time for (P2+P3+P1)) / 3
= (0+5+12)/3 = 17/3
= 5.66 milliseconds
Turn Around Time (TAT) for P2 = 5 ms (Time spent in Ready Queue + Execution Time)
Turn Around Time (TAT) for P3 = 12 ms (-Do-)
Turn Around Time (TAT) for P1 = 22 ms (-Do-)
Average Turn Around Time = (Turn Around Time for all processes) / No. of Processes
= (Turn Around Time for (P2+P3+P1)) / 3
= (5+12+22)/3 = 39/3
= 13 milliseconds
Average Turn Around Time (TAT) is the sum of average waiting time and average execution time.
The average Execution time = (Execution time for all processes)/No. of processes
= (Execution time for (P1+P2+P3))/3
= (10+5+7)/3 = 22/3 = 7.33
Average Turn Around Time = Average Waiting time + Average Execution time
= 5.66 + 7.33
= 13 milliseconds
From this example, it is clear that the average waiting time and turn around time are much improved with SJF scheduling for the same processes when compared to the FCFS algorithm.
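Since non-preemptive SJF with simultaneous arrivals is simply FCFS applied to the burst times sorted in ascending order, the fcfs_metrics() helper sketched under FCFS can be reused; the few lines below assume that helper is in scope.

#include <stdlib.h>

void fcfs_metrics(const int burst_ms[], int n);   /* from the FCFS sketch */

static int ascending(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

void sjf_metrics(int burst_ms[], int n)
{
    /* sort the 'Ready' queue by estimated completion time, then run FCFS */
    qsort(burst_ms, n, sizeof burst_ms[0], ascending);
    fcfs_metrics(burst_ms, n);  /* { 5, 7, 10 } prints averages 5.67 and 13.00 */
}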
Example 2
Calculate the waiting time and Turn Around Time (TAT) for each process and the Average waiting time and
Turn Around Time for the above example if a new process P4 with estimated completion time 2 ms enters
the ‘Ready’ queue after 2 ms of execution of P2. Assume all the processes contain only CPU operation and
no I/O operations are involved.
At the beginning, there are only three processes (P1, P2 and P3) available in the ‘Ready’ queue and the SJF scheduler picks up the process with the least execution completion time (in this example P2, with execution completion time 5 ms) for scheduling. The execution sequence diagram for this is the same as that of
Example 1.
Now process P4 with estimated execution completion time 2 ms enters the ‘Ready’ queue after 2 ms of start of execution of P2. Since the SJF algorithm is non-preemptive and process P2 does not contain any I/O operations, P2 continues its execution. After 5 ms of scheduling, P2 terminates and the scheduler again sorts the ‘Ready’ queue for the process with the least execution completion time. Since the execution completion time for P4 (2 ms) is less than that of P3 (7 ms), which was supposed to run after the completion of P2 as per the ‘Ready’ queue available at the beginning of execution scheduling, P4 is picked up for execution. Due to the arrival of the process P4 with execution time 2 ms, the ‘Ready’ queue is re-sorted in the order P2, P4, P3, P1. At the beginning it was P2, P3, P1. The execution sequence now changes as per the following diagram
P2 (0 to 5 ms) → P4 (5 to 7 ms) → P3 (7 to 14 ms) → P1 (14 to 24 ms)
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds and priorities 0, 3, 2 (0 = highest priority, 3 = lowest priority) respectively enter the ready queue together. Calculate the waiting time and Turn Around Time (TAT) for each process and the average waiting time and Turn Around Time (assuming there is no I/O waiting for the processes) in the priority based scheduling algorithm.
The scheduler sorts the ‘Ready’ queue based on the priority and schedules the process with the highest
priority (P1 with priority number 0) first and the next high priority process (P3 with priority number 2) as
second, and so on. The order in which the processes are scheduled for execution is represented as
P1 (0 to 10 ms) → P3 (10 to 17 ms) → P2 (17 to 22 ms)
Example 2
Calculate the waiting time and Turn Around Time (TAT) for each process and the Average waiting time and
Turn Around Time for the above example if a new process P4 with estimated completion time 6 ms and
priority 1 enters the ‘Ready’ queue after 5 ms of execution of P1. Assume all the processes contain only CPU
operation and no I/O operations are involved.
At the beginning, there are only three processes (P1, P2 and P3) available in the ‘Ready’ queue and the
scheduler picks up the process with the highest priority (In this example P1 with priority 0) for scheduling.
The execution sequence diagram for this is same as that of Example 1. Now process P4 with estimated
execution completion time 6 ms and priority 1 enters the ‘Ready’ queue after 5 ms of execution of P1. Since
the scheduling algorithm is non-preemptive and process P1 does not contain any I/O operations, P1 continues
its execution. After 10 ms of scheduling, P1 terminates and the scheduler again sorts the ‘Ready’ queue for the process with the highest priority. Since the priority of P4 (priority 1) is higher than that of P3 (priority 2), which was supposed to run after the completion of P1 as per the ‘Ready’ queue available at the beginning of execution scheduling, P4 is picked up for execution. Due to the arrival of the process P4 with priority 1, the ‘Ready’ queue is re-sorted in the order P1, P4, P3, P2. At the beginning it was P1, P3, P2. The execution sequence now changes as per the following diagram
P1 (0 to 10 ms) → P4 (10 to 16 ms) → P3 (16 to 23 ms) → P2 (23 to 28 ms)
The waiting time for each process is given as
Waiting time for P1 = 0 ms (P1 starts executing first)
Waiting time for P4 = 5 ms (P4 starts executing after completing P1. But P4 arrived after 5 ms of execution of
P1. Hence its waiting time = Execution start time – Arrival Time = 10 – 5 = 5)
Waiting time for P3 = 16 ms (P3 starts executing after completing P1 and P4)
Waiting time for P2 = 23 ms (P2 starts executing after completing P1, P4 and P3)
Average waiting time = (Waiting time for all processes) / No. of Processes
= (Waiting time for (P1+P4+P3+P2)) / 4
= (0 + 5 + 16 + 23)/4 = 44/4
= 11 milliseconds
Turn Around Time (TAT) for P1 = 10 ms (Time spent in Ready Queue + Execution Time)
Turn Around Time (TAT) for P4 = 11 ms (Time spent in Ready Queue + Execution
Time = (Execution Start Time – Arrival Time) + Estimated
Execution Time = (10 – 5) + 6 = 5 + 6)
Turn Around Time (TAT) for P3 = 23 ms (Time spent in Ready Queue + Execution Time)
Turn Around Time (TAT) for P2 = 28 ms (Time spent in Ready Queue + Execution Time)
Average Turn Around Time = (Turn Around Time for all processes) / No. of Processes
= (Turn Around Time for (P1 + P4 + P3 + P2)) / 4
= (10 + 11 + 23 + 28)/4 = 72/4
= 18 milliseconds
Similar to the SJF scheduling algorithm, the non-preemptive priority based algorithm also possesses the drawback of ‘Starvation’, where a process with low priority may not get a chance to execute if more and more processes with higher priorities enter the ‘Ready’ queue before the process with lower priority starts its execution. ‘Starvation’ can be effectively tackled in priority based non-preemptive scheduling by dynamically raising the priority of the low priority task/process which is under starvation (waiting in the ready queue for a longer time for getting the CPU time). The technique of gradually raising the priority of processes which are waiting in the ‘Ready’ queue as time progresses, for preventing ‘Starvation’, is known as ‘Aging’.
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds respectively enter the ready queue together. A new process P4 with estimated completion time 2 ms enters the ‘Ready’ queue after 2 ms. Assume all the processes contain only CPU operations and no I/O operations are involved.
At the beginning, there are only three processes (P1, P2 and P3) available in the ‘Ready’ queue and
the SRT scheduler picks up the process with the shortest remaining time for execution completion (In this
example, P2 with remaining time 5 ms) for scheduling. The execution sequence diagram for this is the same as that of Example 1 under non-preemptive SJF scheduling.
Now process P4 with estimated execution completion time 2 ms enters the ‘Ready’ queue after 2 ms of start of execution of P2. Since the SRT algorithm is preemptive, the remaining time for completion of process
P2 is checked with the remaining time for completion of process P4. The remaining time for completion of
P2 is 3 ms which is greater than that of the remaining time for completion of the newly entered process P4
(2 ms). Hence P2 is preempted and P4 is scheduled for execution. P4 continues its execution to finish since
there is no new process entering the ‘Ready’ queue during its execution. After 2 ms of scheduling, P4 terminates and the scheduler again sorts the ‘Ready’ queue based on the remaining time for completion of the processes present in the ‘Ready’ queue. Since the remaining time for completion of P2 (3 ms), which was preempted by P4, is less than the remaining time for completion of the other processes in the ‘Ready’ queue, P2 is scheduled for execution. Due to the arrival of the process P4 with execution time 2 ms, the execution order is re-arranged as P2, P4, P2, P3, P1. At the beginning it was P2, P3, P1. The execution sequence now changes as per the following diagram
P2 (0 to 2 ms) → P4 (2 to 4 ms) → P2 (4 to 7 ms) → P3 (7 to 14 ms) → P1 (14 to 24 ms)
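Working the metrics out from this sequence (a reconstruction following the method of the earlier examples): the waiting times are P2 = 2 ms (preempted at 2 ms, resumed at 4 ms), P4 = 0 ms, P3 = 7 ms and P1 = 14 ms, giving an average waiting time of (2 + 0 + 7 + 14)/4 = 23/4 = 5.75 milliseconds. The Turn Around Times are P2 = 7 ms, P4 = 2 ms (completion time – arrival time = 4 – 2), P3 = 14 ms and P1 = 24 ms, giving an average Turn Around Time of (7 + 2 + 14 + 24)/4 = 47/4 = 11.75 milliseconds.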
This reveals that the Average waiting time and Turn Around Time (TAT) improve significantly with preemptive SJF scheduling.
10.5.2.2 Round Robin (RR) Scheduling
The term Round Robin is very popular among the sports and games activities. You might have heard about
‘Round Robin’ league or ‘Knock out’ league associated with any football or cricket tournament. In the ‘Round
Robin’ league each team in a group gets an equal chance to play against the rest of the teams in the same
group whereas in the ‘Knock out’ league the losing team in a match moves out of the tournament ☺.
In the process scheduling context also, ‘Round Robin’ brings the same message “Equal chance to all”.
In Round Robin scheduling, each process in the ‘Ready’ queue is executed for a pre-defined time slot. The
execution starts with picking up the first process in the ‘Ready’ queue (see Fig. 10.13). It is executed for a
pre-defined time and when the pre-defined time elapses or the process completes (before the pre-defined time
slice), the next process in the ‘Ready’ queue is selected for execution. This is repeated for all the processes
in the ‘Ready’ queue. Once each process in the ‘Ready’ queue is executed for the pre-defined time period,
the scheduler comes back and picks the first process in the ‘Ready’ queue again for execution. The sequence
is repeated. This reveals that the Round Robin scheduling is similar to the FCFS scheduling and the only
difference is that a time slice based preemption is added to switch the execution between the processes in
the ‘Ready’ queue. The ‘Ready’ queue can be considered as a circular queue in which the scheduler picks
up the first process for execution and moves to the next till the end of the queue and then comes back to the
beginning of the queue to pick up the first process.
[Fig. 10.13 Round Robin scheduling: execution switches cyclically between Process 1, Process 2, Process 3 and Process 4]
The time slice is provided by the timer tick feature of the time management unit of the OS kernel (Refer to the Time management section under the subtopic ‘The Real-Time kernel’ for more details on the timer tick). The time slice is kernel dependent and it varies in the order of a few microseconds to milliseconds. Certain OS kernels may allow the time slice to be user configurable. Round Robin scheduling ensures that every process gets a fixed amount of CPU time for execution. When a process gets its fixed time for execution is determined by the FCFS policy (that is, a process entering the Ready queue first gets its fixed execution time first, and so on). If a process terminates before the elapse of the time slice, the process releases the CPU voluntarily and the next process in the queue is scheduled for execution by the scheduler.
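The short simulation below sketches this behaviour in C for processes that all arrive at t = 0; rr_metrics() is an illustrative helper, not a kernel API.

#include <stdio.h>

/* Round Robin simulation with a fixed time slice (up to 16 processes). */
void rr_metrics(int remaining_ms[], int n, int slice_ms)
{
    int t = 0, done = 0;
    int finish[16] = { 0 };               /* completion time per process  */
    while (done < n) {
        for (int i = 0; i < n; i++) {     /* circular pass over the queue */
            if (remaining_ms[i] == 0)
                continue;
            int run = remaining_ms[i] < slice_ms ? remaining_ms[i] : slice_ms;
            t += run;                     /* execute for one time slice   */
            remaining_ms[i] -= run;
            if (remaining_ms[i] == 0) {   /* finished: record completion  */
                finish[i] = t;
                done++;
            }
        }
    }
    for (int i = 0; i < n; i++)           /* arrival at t = 0, so TAT = finish */
        printf("P%d: TAT = %d ms\n", i + 1, finish[i]);
}

int main(void)
{
    int burst[] = { 6, 4, 2 };            /* P1, P2, P3                   */
    rr_metrics(burst, 3, 2);              /* prints 12, 10 and 6 ms       */
    return 0;
}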
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 6, 4 and 2 milliseconds respectively enter the ready queue together in the order P1, P2, P3. Calculate the waiting time and Turn Around Time (TAT) for each process and the average waiting time and Turn Around Time (assuming there is no I/O waiting for the processes) in the RR algorithm with Time slice = 2 ms.
The scheduler sorts the ‘Ready’ queue based on the FCFS policy and picks up the first process P1 from the
‘Ready’ queue and executes it for the time slice 2 ms. When the time slice is expired, P1 is preempted and P2
is scheduled for execution. The Time slice expires after 2ms of execution of P2. Now P2 is preempted and P3
is picked up for execution. P3 completes its execution within the time slice and the scheduler picks P1 again
for execution for the next time slice. This procedure is repeated till all the processes are serviced. The order
in which the processes are scheduled for execution is represented as
P1 (0 to 2 ms) → P2 (2 to 4 ms) → P3 (4 to 6 ms) → P1 (6 to 8 ms) → P2 (8 to 10 ms) → P1 (10 to 12 ms)
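The waiting time for each process, reconstructed from the sequence above, is the total time spent in the ‘Ready’ queue between its time slices: P1 waits from 2 to 6 ms and from 8 to 10 ms, i.e. 6 ms; P2 waits from 0 to 2 ms and from 4 to 8 ms, i.e. 6 ms; P3 waits from 0 to 4 ms, i.e. 4 ms. Average waiting time = (6 + 6 + 4)/3 = 16/3 = 5.33 milliseconds.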
Average Turn Around Time = (Turn Around Time for all the processes) / No. of Processes
= (Turn Around Time for (P1 + P2 + P3))/3
= (12 + 10 + 6)/3 = 28/3
= 9.33 milliseconds
Average Turn Around Time (TAT) is the sum of average waiting time and average execution time.
Average Execution time = (Execution time for all the process)/No. of processes
= (Execution time for (P1 + P2 + P3))/3
= (6 + 4 + 2)/3 = 12/3 = 4
Average Turn Around Time = Average Waiting time + Average Execution time
= 5.33 + 4
= 9.33 milliseconds
RR scheduling involves a lot of overhead in maintaining the time slice information for every process which is currently being executed.
10.5.2.3 Priority Based Scheduling
The priority based preemptive scheduling algorithm is the same as the non-preemptive priority based scheduling except for the switching of execution between tasks. In preemptive scheduling, any high priority process entering the ‘Ready’ queue is immediately scheduled for execution, whereas in non-preemptive scheduling any high priority process entering the ‘Ready’ queue is scheduled only after the currently executing process completes its execution or voluntarily relinquishes the CPU. The priority of a task/process in preemptive scheduling is indicated in the same way as in the mechanism adopted for non-preemptive multitasking. Refer to the non-preemptive priority based scheduling discussed in an earlier section of this chapter for more details.
Example 1
Three processes with process IDs P1, P2, P3 with estimated completion times 10, 5 and 7 milliseconds and priorities 1, 3, 2 (0 = highest priority, 3 = lowest priority) respectively enter the ready queue together. A new process P4 with estimated completion time 6 ms and priority 0 enters the ‘Ready’ queue after 5 ms of start of execution of P1. Assume all the processes contain only CPU operations and no I/O operations are involved.
At the beginning, there are only three processes (P1, P2 and P3) available in the ‘Ready’ queue and the
scheduler picks up the process with the highest priority (In this example P1 with priority 1) for scheduling.
Now process P4 with estimated execution completion time 6 ms and priority 0 enters the ‘Ready’ queue
after 5 ms of start of execution of P1. Since the scheduling algorithm is preemptive, P1 is preempted by P4
and P4 runs to completion. After 6 ms of scheduling, P4 terminates and the scheduler again sorts the ‘Ready’ queue for the process with the highest priority. Since the priority of P1 (priority 1), which was preempted by P4, is higher than that of P3 (priority 2) and P2 (priority 3), P1 is again picked up for execution by the scheduler. Due to the arrival of the process P4 with priority 0, the execution order is re-arranged as P1, P4, P1, P3, P2. At the beginning it was P1, P3, P2. The execution sequence now changes as per the following diagram
P1 (0 to 5 ms) → P4 (5 to 11 ms) → P1 (11 to 16 ms) → P3 (16 to 23 ms) → P2 (23 to 28 ms)
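Completing the metrics from this sequence in the same way as the earlier examples (a reconstruction): P1 waits 6 ms in total (preempted at 5 ms, resumed at 11 ms), P4 waits 0 ms, P3 waits 16 ms and P2 waits 23 ms, giving an average waiting time of (6 + 0 + 16 + 23)/4 = 45/4 = 11.25 milliseconds. The Turn Around Times are P1 = 16 ms, P4 = 6 ms (completion time – arrival time = 11 – 5), P3 = 23 ms and P2 = 28 ms, giving an average Turn Around Time of (16 + 6 + 23 + 28)/4 = 73/4 = 18.25 milliseconds.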