Operating System
1. Basics of Operating Systems
Introduction to Operating Systems:
An Operating System (OS) is system software that manages computer hardware and software
resources and provides services for computer programs. It serves as an intermediary between
users and the computer hardware.
Process:
A process is a program in execution. It includes the program code, current activity (represented
by the program counter), and associated resources like open files, registers, and memory.
Process States:
A process moves through several states during its lifetime, typically New, Ready, Running, Waiting (blocked), and Terminated.
Process Control Block (PCB):
The PCB is a data structure that holds process-related information such as:
● Process ID.
● Program counter.
● Registers.
● Memory limits.
● List of open files.
Process Creation:
Example in UNIX:
In UNIX, a parent process can create a child process using the fork() system call.
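A minimal C sketch of this (the printed messages are only illustrative):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();            /* create a child process */

        if (pid < 0) {                 /* fork failed */
            perror("fork");
            exit(1);
        } else if (pid == 0) {         /* child process: fork() returned 0 */
            printf("Child: pid=%d\n", getpid());
            exit(0);                   /* child terminates */
        } else {                       /* parent process: fork() returned the child's PID */
            wait(NULL);                /* wait for the child to terminate */
            printf("Parent: child %d finished\n", pid);
        }
        return 0;
    }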
Process Termination:
A process terminates when it finishes its last statement or is killed by another process. In UNIX, a process exits with the exit() system call, and the parent can collect the child's status using wait().
4. Context Switching
Context switching is the process of saving the state of a currently running process or
thread so that it can be resumed later and restoring the state of another process or thread
that is being scheduled to run by the operating system (OS). It is an essential feature for
multitasking and time-sharing operating systems, allowing the CPU to switch between
processes to ensure that all tasks get a chance to run.
5. Non-Preemptive Scheduling
Definition:
In non-preemptive scheduling, once the CPU is allocated to a process, the process keeps the CPU until it terminates or voluntarily blocks (e.g., for I/O). The OS cannot forcibly take the CPU away.
Advantages:
● Simple to implement.
● No context switching overhead during process execution.
Disadvantages:
● Poor responsiveness: a long process can delay every process behind it (the convoy effect), so short or interactive jobs may wait a long time.
FCFS (First-Come, First-Served)
● Concept: Processes are executed in the order they arrive. The first process to arrive is
executed first.
Example:
Process   AT   BT   CT   TAT   WT   RT
P1        0    2    2    2     0    0
P2        1    2    4    3     1    1
P3        5    3    8    3     0    0
P4        6    4    12   6     2    2
(AT = arrival time, BT = burst time, CT = completion time, TAT = turnaround time,
WT = waiting time, RT = response time)
Gantt Chart:
| P1 | P2 | Idle | P3 | P4 |
0    2    4      5    8    12
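As a rough illustration, the completion, turnaround and waiting times in the FCFS table above can be computed with a few lines of C (a minimal sketch; the process data is copied from the example):

    #include <stdio.h>

    #define N 4

    int main(void) {
        int arrival[N] = {0, 1, 5, 6};          /* FCFS example data, sorted by arrival */
        int burst[N]   = {2, 2, 3, 4};
        int time = 0;

        for (int i = 0; i < N; i++) {
            if (time < arrival[i])              /* CPU idles until the process arrives */
                time = arrival[i];
            int start      = time;
            int completion = start + burst[i];
            int turnaround = completion - arrival[i];
            int waiting    = turnaround - burst[i];
            int response   = start - arrival[i];  /* equals waiting for non-preemptive FCFS */
            printf("P%d  CT=%d  TAT=%d  WT=%d  RT=%d\n",
                   i + 1, completion, turnaround, waiting, response);
            time = completion;
        }
        return 0;
    }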
SJF (Shortest Job First)
● Concept: The process with the shortest burst time is selected for execution.
● Non-preemptive: Once a process starts, it runs to completion.
Example:
Process   AT   BT   CT   TAT   WT   RT
P2        2    4    10   8     4    4
P3        1    2    3    2     0    0
P4        4    4    14   10    6    6
Gantt Chart:
| Idle | P3 | P1 | P2 | P4 |
0      1    3    6    10   14
6. Pre-emptive Scheduling
Definition:
Pre-emptive scheduling allows the operating system to suspend a running process to allocate
the CPU to another process. This is used in time-sharing and real-time systems, ensuring that
no single process monopolizes the CPU.
How it Works:
● The CPU can switch from one process to another if the current process's time slice
(quantum) expires or a higher-priority process becomes ready to execute.
● Pre-emptive scheduling introduces the concept of context switching, where the CPU's
state is saved before switching to another process.
Advantages:
● Better responsiveness: short and high-priority processes do not get stuck behind long ones.
● Fairer sharing of the CPU, which suits time-sharing and real-time systems.
Disadvantages:
● Higher overhead because of frequent context switching.
● Shared data must be protected, since a process can be suspended in the middle of an update (risk of race conditions).
SRTF (Shortest Remaining Time First, pre-emptive SJF)
● Concept: The process with the shortest remaining burst time is selected for execution.
● Pre-emptive: If a newly arrived process has a shorter remaining time than the running process, the running process is pre-empted.
Example:
Process   AT   BT   CT   TAT   WT   RT
P1        0    5    9    9     4    0
P2        1    3    4    3     0    0
P3        2    4    13   11    7    7
P4        4    1    5    1     0    0
Gantt Chart:
| P1 | P2 | P2 | P2 | P4 | P1 | P1 | P1 | P1 | P3 |
0    1    2    3    4    5    6    7    8    9    13
Round Robin (RR)
● Concept: Each process gets a fixed time slice (quantum), and processes are executed
in a circular order.
● Pre-emptive: If a process doesn't finish within its quantum, it is pre-empted and placed at
the end of the ready queue.
Example (time quantum = 2):
Process   AT   BT   CT   TAT   WT   RT
P1        0    5    12   12    7    0
P2        1    4    11   10    6    1
P3        2    2    6    4     2    2
P4        4    1    9    5     4    4
Ready Queue:
P1 P2 P3 P1 P4 P2 P1
Gantt Chart:
| P1 | P2 | P3 | P1 | P4 | P2 | P1 |
0    2    4    6    8    9    11   12
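A minimal C sketch that reproduces the Round Robin schedule above (quantum = 2; it assumes the processes are listed in order of arrival and that a process arriving exactly when a slice ends joins the queue before the pre-empted process):

    #include <stdio.h>

    #define N 4
    #define QUANTUM 2

    int main(void) {
        int arrival[N] = {0, 1, 2, 4};          /* data from the example above */
        int burst[N]   = {5, 4, 2, 1};
        int remaining[N], completion[N];
        int queue[64], head = 0, tail = 0;      /* simple ready queue of process indexes */
        int time = 0, finished = 0, next = 0;

        for (int i = 0; i < N; i++) remaining[i] = burst[i];

        while (finished < N) {
            while (next < N && arrival[next] <= time)   /* admit newly arrived processes */
                queue[tail++] = next++;
            if (head == tail) {                         /* nothing ready: CPU idles */
                time = arrival[next];
                continue;
            }
            int p = queue[head++];
            int slice = remaining[p] < QUANTUM ? remaining[p] : QUANTUM;
            printf("P%d runs from %d to %d\n", p + 1, time, time + slice);
            time += slice;
            remaining[p] -= slice;
            while (next < N && arrival[next] <= time)   /* arrivals during the slice queue first */
                queue[tail++] = next++;
            if (remaining[p] > 0)
                queue[tail++] = p;                      /* pre-empted: back of the queue */
            else {
                completion[p] = time;
                finished++;
            }
        }
        for (int i = 0; i < N; i++)
            printf("P%d: CT=%d TAT=%d WT=%d\n", i + 1, completion[i],
                   completion[i] - arrival[i], completion[i] - arrival[i] - burst[i]);
        return 0;
    }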
Priority Scheduling (Pre-emptive)
● Concept: Each process is assigned a priority, and the CPU is given to the ready process with the highest priority (in this example, a larger number means a higher priority).
● Pre-emptive: A newly arrived process with a higher priority pre-empts the running process.
Example:
Priority   Process   AT   BT   CT   TAT   WT   RT
10         P1        0    5    12   12    7    0
20         P2        1    4    8    7     3    0
30         P3        2    2    4    2     0    0
40         P4        4    1    5    1     0    0
Gantt Chart:
| P1 | P2 | P3 | P4 | P2 | P1 |
0    1    2    4    5    8    12
7. Race Conditions
Definition:
A race condition occurs when multiple processes or threads access shared resources
simultaneously, and the final outcome depends on the sequence in which they execute. This
can lead to inconsistent or unexpected results.
Example:
Suppose two threads each try to deposit money into the same bank account. Both read the current balance, both add their deposit, and both write the result back. If the second thread reads the balance before the first has written its update, one deposit is silently lost, and the final balance depends on the exact interleaving of the two threads.
Threads
A thread is the smallest unit of execution within a process; a single process can contain several threads that run concurrently. Key characteristics of threads:
● Independent Execution: Each thread runs independently of other threads in the same
process. It has its own execution flow but shares process-level resources (memory, data,
etc.) with other threads.
● Shared Memory: Since threads belong to the same process, they share the same
address space, allowing them to access common variables and data. This is both an
advantage (efficient communication) and a challenge (potential for race conditions).
● Lighter than Processes: Threads are more lightweight compared to processes.
Creating a new thread is faster and uses fewer resources than creating a new process,
making multi-threading more efficient than multi-processing for tasks that can be
performed in parallel.
● Context Switching: When switching between threads, the CPU performs a context
switch, which involves saving the state (registers, program counter, etc.) of the current
thread and loading the state of the new thread. Since threads share the same process
resources, context switching between threads is faster than between processes.
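A minimal POSIX threads sketch showing how cheaply several threads can be created inside one process (the worker function and its printed message are illustrative; compile with -lpthread):

    #include <pthread.h>
    #include <stdio.h>

    void *worker(void *arg) {
        /* each thread has its own execution flow but shares the process's address space */
        printf("Thread %ld running\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t tid[4];
        for (long i = 0; i < 4; i++)
            pthread_create(&tid[i], NULL, worker, (void *)i);
        for (int i = 0; i < 4; i++)
            pthread_join(tid[i], NULL);     /* wait for every thread to finish */
        return 0;
    }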
Types of Threads
1. User-Level Threads: These are managed by the user application, not the operating
system. Thread management (creation, scheduling, and termination) is done in user
space.
○ Advantages: Faster thread management since there's no system call overhead.
○ Disadvantages: The OS kernel is unaware of these threads, which can make
handling certain system-level operations (like I/O) inefficient.
2. Kernel-Level Threads: These are managed by the OS kernel. The kernel is aware of
and handles thread creation, scheduling, and termination.
○ Advantages: The kernel can optimize thread management, especially for
multi-core processors.
○ Disadvantages: More overhead than user-level threads, as every thread
operation requires interaction with the kernel.
There are several models that describe how user threads are mapped to kernel threads.
These models manage how user-level threads (those created and managed by user programs)
are mapped to the underlying kernel-level threads (which the operating system manages) for
execution. These models are important in operating systems to manage the concurrency and
efficiency of multithreaded applications.
1. Many-to-One Model
2. One-to-One Model
3. Many-to-Many Model
In the many-to-one model, multiple user threads are mapped to a single kernel thread. The
thread management (creation, scheduling, synchronization) is done entirely in user space, and
the kernel only knows about the single kernel thread on which the user threads are running.
How it Works:
● The process manages multiple user threads but the OS manages only one kernel
thread.
● All user threads are mapped to this one kernel thread, meaning they must share the
CPU time of the single kernel thread.
Advantages:
● Fast thread management: Since thread creation and context switching happen in user
space, they are very efficient and don’t require kernel intervention.
● No kernel-level overhead: Since the kernel only needs to manage one thread, there is
minimal overhead.
Disadvantages:
● No true parallelism: Even on multiprocessor systems, user threads can only run one at
a time because there is only one kernel thread. This means there’s no true concurrency
or parallel execution.
● Blocking: If one user thread performs a blocking operation (e.g., I/O), the entire process
is blocked because the kernel-level thread blocks, stopping all user threads associated
with it.
Example Systems:
● Green Threads in Java (early versions): A Java implementation of user threads, where
multiple user threads were managed by one kernel thread.
● GNU Portable Threads (Pth): A user-level thread library for UNIX-based systems.
In the one-to-one model, each user thread is mapped to a unique kernel thread. For every user
thread that is created, a corresponding kernel thread is also created.
How it Works:
● Each user thread has a corresponding kernel thread, allowing multiple threads to run in
parallel on multiprocessor systems.
● Thread management (e.g., creation, termination, and context switching) requires system
calls to the kernel, as the OS is aware of each thread.
Advantages:
● True parallelism: threads can run simultaneously on different cores, and if one thread blocks (e.g., on I/O), the other threads of the process can keep running.
Disadvantages:
● High overhead: Since each user thread requires a corresponding kernel thread,
creating and managing many threads can lead to significant overhead. Each thread
needs its own kernel resources (memory, registers, etc.).
● System limits: The number of threads in the system is limited by the OS’s ability to
handle kernel threads. Creating thousands of threads may exceed system limits.
Example Systems:
● Most modern operating systems, including Linux (via NPTL) and Windows, use the one-to-one model.
The many-to-many model is a hybrid approach, allowing multiple user threads to be mapped to
a smaller or equal number of kernel threads. This model allows the OS to create a flexible
balance between user-level thread management and kernel-level parallelism.
How it Works:
● The OS can map many user threads to a smaller number of kernel threads.
● The number of kernel threads is configurable based on system resources.
● User threads are scheduled onto the available kernel threads. The mapping can be
dynamic, meaning the system can create more kernel threads if needed or reduce the
number to optimize performance.
Advantages:
● Flexibility: This model offers the benefits of both the many-to-one and one-to-one
models. It can handle multiple user threads without the overhead of a one-to-one model,
while still allowing true parallel execution on multi-core systems.
● Efficient resource usage: It avoids the overhead of creating too many kernel threads
while allowing for some degree of concurrency.
Disadvantages:
● Complex implementation: The mapping between user and kernel threads is more
complex and requires advanced scheduling techniques.
Example Systems:
● Older versions of Solaris (before Solaris 9) and Windows with the ThreadFiber package used the many-to-many model.
Process Synchronization
A race condition occurs in a concurrent system (such as an operating system) when the
behavior or outcome of a process depends on the timing or sequence of uncontrollable
events, such as the execution order of threads or processes. Essentially, two or more
processes or threads are racing to access shared resources (like memory, files, or variables),
and the final result can differ depending on which thread wins the race to execute a particular
part of the code.
For example, in the banking example, both threads access and modify the same resource (the
bank balance) without coordinating with each other.
A race condition usually involves a critical section, a part of the code where shared resources
(such as variables, data structures, or files) are accessed. The goal of process synchronization
is to ensure that only one thread or process is executing the critical section at any given time,
preventing race conditions.
Any solution to the critical section problem must satisfy three requirements:
1. Mutual Exclusion: Only one process should be allowed to execute the critical section at
a time.
2. Progress: If no process is executing the critical section and a process wants to enter, it
should be allowed to enter.
3. Bounded Waiting: No process should wait indefinitely to enter the critical section.
Synchronization Mechanisms:
1. Locks (Mutexes)
● Locks (or Mutexes, short for mutual exclusion) are the simplest form of
synchronization. They are used to enforce mutual exclusion on a critical section.
● When a thread or process wants to access a shared resource, it must first acquire the
lock. If another thread holds the lock, the requesting thread must wait until the lock is
released.
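A minimal sketch of a mutex protecting the shared bank balance from the race condition described earlier (the variable and function names are illustrative; compile with -lpthread):

    #include <pthread.h>
    #include <stdio.h>

    static long balance = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *deposit(void *arg) {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);      /* enter the critical section */
            balance += 1;                   /* the read-modify-write is now done by one thread at a time */
            pthread_mutex_unlock(&lock);    /* leave the critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, deposit, NULL);
        pthread_create(&t2, NULL, deposit, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("balance = %ld\n", balance); /* 200000 with the lock; without it, often less */
        return 0;
    }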
2. Semaphores
● A semaphore is a more general synchronization tool. It uses two operations: wait (P)
and signal (V) to control access to a shared resource.
● Semaphores can be used for mutual exclusion (binary semaphores) or to control access
to a limited number of resources (counting semaphores).
● A binary semaphore can only have two values, 0 and 1, and works similarly to a mutex.
● A counting semaphore can have any integer value and is used to allow multiple
processes to access the resource simultaneously, up to a defined limit.
Here's a breakdown of when UP (signal) and DOWN (wait) semaphores get triggered:
● The DOWN operation is triggered when a process or thread wants to access a shared resource. It decrements the semaphore value; if the resource is not available (the count has reached zero), the caller is blocked until another thread releases the resource.
● The UP operation is triggered when a thread has finished using the resource and wants
to make it available to other threads. It increments the semaphore and wakes up any
waiting threads.
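A minimal sketch using POSIX counting semaphores, assuming at most 3 threads may use a resource at the same time (names and the sleep are illustrative; compile with -lpthread):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <unistd.h>

    static sem_t slots;                      /* counting semaphore, initialised to 3 */

    void *use_resource(void *arg) {
        sem_wait(&slots);                    /* DOWN (P): blocks if all 3 slots are in use */
        printf("Thread %ld using the resource\n", (long)arg);
        sleep(1);                            /* simulate some work */
        sem_post(&slots);                    /* UP (V): free a slot and wake a waiter */
        return NULL;
    }

    int main(void) {
        pthread_t t[5];
        sem_init(&slots, 0, 3);              /* 0 = shared between threads of this process */
        for (long i = 0; i < 5; i++)
            pthread_create(&t[i], NULL, use_resource, (void *)i);
        for (int i = 0; i < 5; i++)
            pthread_join(t[i], NULL);
        sem_destroy(&slots);
        return 0;
    }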
Deadlocks
Definition:
A deadlock is a situation where two or more processes are unable to proceed because
each process is waiting for the other to release a resource. It is a circular chain of
dependencies among the processes.
Example:
Consider two processes, P1 and P2, and two resources, R1 and R2. P1 holds R1 and requests R2, while P2 holds R2 and requests R1. Neither can proceed until the other releases its resource, so both wait forever.
For a deadlock to occur, the following four conditions must be true simultaneously:
1. Mutual Exclusion:
○ At least one resource must be held in a non-sharable mode (i.e., only
one process can use the resource at a time).
2. Hold and Wait:
○ A process is holding at least one resource and is waiting for additional
resources held by other processes.
3. No Preemption:
○ Resources cannot be forcibly taken away from a process; they must be
released voluntarily by the process holding them.
4. Circular Wait:
○ A set of processes are waiting for each other in a circular chain, where
each process holds a resource that the next process in the chain is waiting
for.
Deadlock Prevention
Deadlock prevention ensures that at least one of the necessary conditions for deadlock does
not occur. This can be done by altering system behavior.
1. Eliminate Mutual Exclusion:
● Make resources sharable wherever possible (e.g., read-only files). This is not feasible for inherently non-sharable resources such as printers.
2. Eliminate Hold and Wait:
● Ensure that a process requests all its required resources at once and holds
none while waiting. This can be inefficient due to resource underutilization.
● Example: A process waits for both the printer and scanner at the start rather than
holding one and waiting for the other.
3. Allow Preemption:
● If a process holding some resources requests another resource that cannot be granted immediately, it must release the resources it currently holds and request them again later.
4. Eliminate Circular Wait:
● Impose a total ordering on the resources. Each process can only request
resources in a predefined order.
● Example: If processes must acquire resources in the order R1, R2, R3, then
deadlocks due to circular waiting are prevented (a small sketch follows below).
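A minimal sketch of this resource-ordering idea with two mutexes standing in for R1 and R2 (illustrative only): because both threads always take R1 before R2, a circular wait cannot form.

    #include <pthread.h>

    static pthread_mutex_t r1 = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t r2 = PTHREAD_MUTEX_INITIALIZER;

    /* Both threads follow the same global order: R1, then R2.
       If one thread instead took R2 first while the other held R1,
       each could end up waiting for the other (circular wait -> deadlock). */
    void *thread_a(void *arg) {
        pthread_mutex_lock(&r1);
        pthread_mutex_lock(&r2);
        /* ... use both resources ... */
        pthread_mutex_unlock(&r2);
        pthread_mutex_unlock(&r1);
        return NULL;
    }

    void *thread_b(void *arg) {
        pthread_mutex_lock(&r1);            /* same order as thread_a */
        pthread_mutex_lock(&r2);
        /* ... use both resources ... */
        pthread_mutex_unlock(&r2);
        pthread_mutex_unlock(&r1);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }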
Deadlock Avoidance
In deadlock avoidance, the system ensures that it never enters a deadlock state by
checking resource allocation at runtime. This is done using algorithms that consider
future resource needs and only grant resources if they won’t lead to deadlock.
1. Banker's Algorithm:
● The most famous deadlock avoidance algorithm, the Banker's Algorithm, works
by simulating resource allocation to see if it leads to a safe state. The system
only grants a resource request if it leaves the system in a safe state.
● Safe State: A state in which there exists a sequence of processes that can finish
without leading to deadlock.
● Example:
○ Let’s say there are 3 processes and 5 units of a resource. The Banker's
Algorithm ensures that even if a process requests more units, there will
still be enough for the others to finish and avoid deadlock.
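A minimal sketch of the safety check at the core of the Banker's Algorithm, using one resource type with 5 units and illustrative Max/Allocation numbers (the matrices are assumptions, not taken from the text):

    #include <stdio.h>

    #define P 3   /* processes */
    #define R 1   /* resource types */

    int main(void) {
        int available[R]     = {2};                 /* 5 units total: 3 allocated, 2 free */
        int max[P][R]        = {{4}, {3}, {2}};
        int allocation[P][R] = {{1}, {1}, {1}};
        int need[P][R], finished[P] = {0}, work[R];
        int safe_seq[P], count = 0;

        for (int i = 0; i < P; i++)
            for (int j = 0; j < R; j++)
                need[i][j] = max[i][j] - allocation[i][j];
        for (int j = 0; j < R; j++) work[j] = available[j];

        /* repeatedly look for a process whose remaining need fits in 'work' */
        int progress = 1;
        while (progress) {
            progress = 0;
            for (int i = 0; i < P; i++) {
                if (finished[i]) continue;
                int fits = 1;
                for (int j = 0; j < R; j++)
                    if (need[i][j] > work[j]) { fits = 0; break; }
                if (fits) {
                    for (int j = 0; j < R; j++)
                        work[j] += allocation[i][j];   /* process finishes and releases resources */
                    finished[i] = 1;
                    safe_seq[count++] = i;
                    progress = 1;
                }
            }
        }

        if (count == P) {
            printf("Safe sequence: ");
            for (int i = 0; i < P; i++) printf("P%d ", safe_seq[i] + 1);
            printf("\n");
        } else {
            printf("Unsafe state: deadlock possible\n");
        }
        return 0;
    }

For these numbers the check finds the safe sequence P2, P3, P1; the Banker's Algorithm grants a request only if the state after granting it still passes this check.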
Deadlock Detection and Recovery
1. Detection:
● The system continuously monitors the state of resource allocation and looks for
cycles in the resource allocation graph (RAG). If a cycle is detected, a
deadlock has occurred.
2. Recovery:
● Terminate Processes: Abort one or more processes to break the cycle and
release resources.
○ Example: Abort the process that is causing the deadlock or the one with
the least priority.
● Preempt Resources: Preempt resources from one or more deadlocked
processes and give them to other processes.
○ Example: Reallocate resources from a deadlocked process and restart it
later.
5. Real-World Examples
A classic example relates CPU utilization to the degree of multiprogramming:
CPU Utilization = 1 - P^n
Where:
● P = fraction of time a process spends waiting for I/O (i.e., the probability that the
CPU is idle during a process).
● n = number of processes (degree of multiprogramming).
Explanation:
● When only one process is running, the CPU is idle P fraction of the time, and the
CPU utilization is 1 - P.
● When n processes are running, the CPU is only idle if all processes are waiting
for I/O, which happens with probability P^n. Thus, the CPU utilization increases
as n (degree of multiprogramming) increases.
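For instance, assuming each process waits for I/O 80% of the time (P = 0.8): with n = 1 the utilization is 1 - 0.8 = 20%, with n = 4 it is 1 - 0.8^4 ≈ 59%, and with n = 10 it is 1 - 0.8^10 ≈ 89%, which is why a higher degree of multiprogramming keeps the CPU busier.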
Memory Management
The operating system's memory manager has several responsibilities:
1. Allocation of Memory:
○ Allocate memory to programs when they request it.
○ Ensure that memory is properly deallocated when it's no longer needed.
2. Tracking Memory:
○ Keep track of each memory location, whether it is free or allocated, and
how much memory is available.
3. Swapping and Paging:
○ Manage the movement of processes between main memory and disk
when there is insufficient memory.
4. Protection and Isolation:
○ Prevent processes from accessing memory allocated to other processes,
ensuring data security and integrity.
1. Fixed Partitioning
Definition:
Main memory is divided at system start-up into a fixed number of partitions of fixed size, and each partition can hold exactly one process.
Key Points:
● Internal Fragmentation: If a process is smaller than the partition size, the leftover
space within the partition is wasted, leading to internal fragmentation.
● Simple Implementation: Easy to implement, but not flexible since partition sizes
are fixed.
Example:
● Memory is divided into five 20 MB partitions. A 12 MB process placed in one of them leaves 8 MB of that partition unused (internal fragmentation).
2. Variable-Size Partitioning
Definition:
Partitions are created dynamically at run time and sized to exactly match each process's request, so the number and size of partitions vary over time.
Key Points:
● External Fragmentation: Over time, small holes of free memory can form between
allocated blocks, leading to external fragmentation.
● Memory Compaction: One way to handle external fragmentation is to perform
memory compaction, where free memory blocks are merged together.
Example:
● Suppose you have 100 MB of memory, and three processes request 20 MB, 30
MB, and 40 MB respectively. Memory is divided dynamically into these sizes, but
gaps of free memory can appear between processes over time.
Memory Allocation Strategies
1. First Fit
Description:
● First Fit allocates the first block of memory that is large enough to satisfy the
process's request.
● It searches memory from the beginning and assigns the process to the first
suitable free block it finds.
Steps:
1. Traverse the memory list from the beginning.
2. Allocate the process to the first block that is large enough.
3. If there is leftover space, create a new free block.
Advantages:
● Simple to implement.
● Faster allocation since it stops searching once a suitable block is found.
Disadvantages:
● Can cause fragmentation near the beginning of memory, as small gaps of
unusable free space might be left over.
Example:
Consider memory blocks of sizes: 100 KB, 500 KB, 200 KB, 300 KB, 600 KB.
● For instance, a 212 KB request would be placed in the 500 KB block (the first block large enough), leaving 288 KB free.
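A minimal C sketch of First Fit over these block sizes, using a 212 KB request as in the Next Fit example further below (the request size is otherwise illustrative):

    #include <stdio.h>

    #define N 5

    int main(void) {
        int blocks[N] = {100, 500, 200, 300, 600};  /* free block sizes in KB */
        int request = 212;                          /* process size in KB */

        for (int i = 0; i < N; i++) {
            if (blocks[i] >= request) {             /* first block that is large enough */
                printf("Allocated %d KB in the %d KB block, %d KB left over\n",
                       request, blocks[i], blocks[i] - request);
                blocks[i] -= request;               /* the leftover stays as a smaller free block */
                return 0;
            }
        }
        printf("No block large enough for %d KB\n", request);
        return 0;
    }

Changing the selection rule to the smallest or the largest block that fits turns the same loop into Best Fit or Worst Fit.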
2. Best Fit
Description:
● Best Fit allocates the smallest free block that is large enough to satisfy the
process's request.
● It searches the entire memory list to find the best-fitting block (i.e., the block
with the least leftover space after allocation).
Steps:
1. Traverse the memory list and find the smallest block that can hold the process.
2. Allocate the process to that block.
3. If there's leftover space, create a new free block.
Advantages:
● Minimizes leftover space: By finding the smallest block, it tries to use memory
more efficiently.
Disadvantages:
● Slower: The entire list must be searched to find the best fit.
● Can lead to external fragmentation as smaller leftover spaces are scattered
across memory, making it difficult to allocate larger processes later.
Example:
Consider memory blocks of sizes: 100 KB, 500 KB, 200 KB, 300 KB, 600 KB.
● For instance, a 212 KB request would be placed in the 300 KB block (the smallest block large enough), leaving 88 KB free.
3. Worst Fit
Description:
● Worst Fit allocates the largest available free block to the process.
● It searches the entire memory list for the biggest block, on the theory that the large leftover piece will still be useful for later requests.
Steps:
1. Traverse the memory list and find the largest block that can hold the process.
2. Allocate the process to that block.
3. If there’s leftover space, create a new free block.
Advantages:
● The leftover hole after allocation is as large as possible, so it is more likely to be usable by another process than the small fragments Best Fit leaves behind.
Disadvantages:
● Inefficient use of memory: Since the largest block is chosen, smaller processes
may unnecessarily occupy large blocks, leaving smaller ones unusable.
● Can still lead to fragmentation.
Example:
Consider memory blocks of sizes: 100 KB, 500 KB, 200 KB, 300 KB, 600 KB.
● For instance, a 212 KB request would be placed in the 600 KB block (the largest), leaving 388 KB free.
4. Next Fit
Description:
● Next Fit is a variation of First Fit, but instead of starting the search from the
beginning of memory every time, it starts from where the previous allocation
was made.
● This improves performance by not searching through blocks that have already
been processed.
Steps:
1. Start searching from the location of the previous allocation.
2. Allocate the process to the first block that is large enough.
3. Wrap around to the beginning of the list if the end is reached.
Advantages:
● Faster than First Fit, as it avoids re-scanning blocks that were already processed.
● Simpler to implement compared to Best Fit and Worst Fit.
Disadvantages:
● Can still cause fragmentation, as it may skip over smaller free spaces that could
otherwise be used.
Example:
Consider memory blocks of sizes: 100 KB, 500 KB, 200 KB, 300 KB, 600 KB.
● If the previous allocation was made at the 300 KB block and a new process
requires 212 KB, the Next Fit algorithm will start scanning from the next block
(600 KB) and allocate the process there, leaving 388 KB free.
Key Techniques:
12.1.2.1. Paging
Definition:
● Paging is a memory management scheme that eliminates the need for
contiguous allocation of physical memory. It divides both physical and logical
memory into fixed-sized blocks called pages and frames, respectively.
● The size of the pages and frames is the same, and a page table maps logical
pages to physical frames.
Key Points:
● No external fragmentation: any free frame can hold any page.
● Internal fragmentation: the last page of a process may not fill its frame completely.
● Address translation: a logical address is split into a page number and an offset; the page table translates the page number into a frame number (often with the help of a TLB).
Example:
● A 10 KB process with a 4 KB page size occupies 3 pages; the last page leaves about 2 KB of its frame unused.
Questions:
1. Q-1:
● Logical Address Space - 4 GB
● Physical Address Space - 64 MB
● Page Size - 4 KB
Find the number of pages, the number of frames, and the sizes of the logical and physical addresses.
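A worked sketch of Q-1 using standard paging arithmetic:
● Logical address space = 4 GB = 2^32 bytes, so a logical address has 32 bits.
● Page size = 4 KB = 2^12 bytes, so the offset uses 12 bits.
● Number of pages = 2^32 / 2^12 = 2^20, so the page number uses 20 bits and the page table needs 2^20 entries.
● Physical address space = 64 MB = 2^26 bytes, so a physical address has 26 bits.
● Number of frames = 2^26 / 2^12 = 2^14, so the frame number uses 14 bits.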
12.1.2.2. Segmentation
Definition:
● Segmentation divides a process's memory into variable-sized segments that correspond to logical units of the program (e.g., code, data, stack). A segment table stores each segment's base address and limit.
Key Points:
● Logical Division: Unlike paging, segmentation reflects the logical divisions of a
process (e.g., code and data segments).
● External Fragmentation: Segmentation can suffer from external fragmentation
due to varying segment sizes.
Example:
● A process has a code segment of 4 KB and a data segment of 6 KB. These are
mapped separately to physical memory using a segment table.
12.1.2.3. Inverted Page Table
Definition:
● In inverted paging, there is a single page table for the entire system, rather
than one page table per process. The table tracks which page of which process
is stored in each frame.
Key Points:
● Memory Efficiency: Inverted paging reduces the amount of memory needed for
page tables.
● Hashing: Typically, a hashing function is used to map logical addresses to the
frame number.
Example:
● Instead of having separate page tables for each process, the OS maintains one
large page table where each entry contains information about which process
owns a particular frame.
12.1.2.4. Thrashing
Definition:
● Thrashing occurs when a system spends more time swapping pages in and out
of memory than executing processes, resulting in a significant decline in
performance.
Key Points:
● Caused by Overloading Memory: Thrashing occurs when processes do not have
enough pages to run efficiently, leading to frequent page faults.
● Working Set Model: To reduce thrashing, the working set model can be used,
which keeps track of the set of pages that a process frequently accesses.
Example:
● A process continuously accesses more pages than can fit in memory, causing the
OS to repeatedly swap pages in and out, leading to thrashing.
12.1.2.5. Page Faults
Definition:
● A page fault occurs when a process tries to access a page that is not currently in
memory.
● The OS must fetch the page from disk and load it into memory.
Key Points:
● Minor Page Fault: Occurs when the page is not yet mapped in the process's page table but is
already in memory (e.g., in the page cache or shared with another process), so no disk access is needed.
● Major Page Fault: Occurs when the page is not in memory and must be loaded
from disk, which is much slower.
Example:
● A process tries to access a page that is currently not in memory. The OS triggers
a page fault, loads the page from disk, and updates the page table.
Page Replacement Algorithms
When memory is full, the OS must replace a page in memory to load a new one. Page
replacement algorithms decide which page to remove.
LRU (Least Recently Used)
● Description: Replaces the page that has not been used for the longest period of
time.
● Advantage: More efficient than FIFO in many cases.
● Disadvantage: Requires keeping track of access times, which can be complex.
● Example:
○ If pages 1, 2, 3, 4 are in memory and 5 needs to be loaded, the page
that hasn’t been accessed the longest (e.g., 2) is replaced.
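A minimal C sketch that counts LRU page faults for a small reference string with 3 frames (the reference string is an assumption chosen for illustration, not taken from the text):

    #include <stdio.h>

    #define FRAMES 3
    #define REFS   10

    int main(void) {
        int refs[REFS] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3};  /* illustrative reference string */
        int frame[FRAMES], last_used[FRAMES];
        int faults = 0;

        for (int i = 0; i < FRAMES; i++) frame[i] = -1;   /* -1 marks an empty frame */

        for (int t = 0; t < REFS; t++) {
            int hit = -1;
            for (int i = 0; i < FRAMES; i++)
                if (frame[i] == refs[t]) { hit = i; break; }

            if (hit >= 0) {
                last_used[hit] = t;                       /* page already in memory */
            } else {
                faults++;
                int victim = 0;                           /* pick an empty frame or the LRU page */
                for (int i = 1; i < FRAMES; i++) {
                    if (frame[i] == -1) { victim = i; break; }
                    if (frame[victim] != -1 && last_used[i] < last_used[victim])
                        victim = i;
                }
                frame[victim] = refs[t];
                last_used[victim] = t;
            }
        }
        printf("Page faults: %d\n", faults);
        return 0;
    }

For this string the sketch reports 8 page faults.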
Optimal (OPT)
● Description: Replaces the page that will not be used for the longest time in the
future.
● Advantage: The best possible performance.
● Disadvantage: Impossible to implement in real-time (requires future knowledge).
● Example:
○ If pages 1, 2, 3, 4 are in memory and 5 needs to be loaded, the page
that won’t be used soon is replaced.
Second Chance (Clock)
● Description: A circular list of pages with a reference bit. If the bit is set, the page
is given a second chance; otherwise, it is replaced.
● Advantage: Efficient and simple to implement.
● Disadvantage: May not always be the best performer.
● Example:
○ The OS scans pages and gives a second chance to pages that have been
used recently, skipping them for replacement.
The file system is a crucial component of an operating system, responsible for managing how
data is stored, retrieved, and organized on storage devices. It provides a systematic way to
handle data and files, offering users a simple way to store and access information without
worrying about the complexities of underlying storage mechanisms.
15.1. Introduction to File System
A file system is a method that an operating system uses to organize and manage files on
storage devices like hard drives, SSDs, USB drives, etc. It provides a way to store, retrieve, and
update data, as well as manage access permissions.
15.2. File System Structure
A file system is usually organized as a stack of layers:
● User Interface Layer: Interaction between users and the file system, usually through
system calls (e.g., open(), read(), write()).
● Logical File System: Manages metadata (permissions, file names, directory structure).
Responsible for file abstraction and security checks.
● File Organization Module: Handles logical blocks of data, maps files into physical
blocks on the storage medium.
● Basic File System: Manages access to the physical storage device by issuing
device-specific commands.
● I/O Control: Consists of device drivers that handle reading and writing data on specific
hardware devices.
● Devices: Actual physical media like hard drives, SSDs, etc.
15.3. File System Implementation
The implementation of a file system involves both software and hardware components that
handle file operations, metadata management, and memory/storage management.
1. File Concept
● A file is a named collection of data stored on disk. Files can be text, binary, or
executable.
● A file may have attributes such as:
○ Name: The file’s human-readable identifier.
○ Identifier: A unique tag or number that identifies the file within the file system.
○ Type: The file format or category (e.g., text file, binary file).
○ Location: A pointer to the file's storage location on the device.
○ Size: The total size of the file in bytes.
○ Protection: File access permissions (read, write, execute).
○ Time, date, user identification: Information regarding creation, last modification,
and ownership.
2. Directory Structure
A directory contains metadata about files and subdirectories, such as file names and pointers to
the files' locations on disk. Directory structures vary depending on the file system. Some
common structures include single-level directories, two-level directories, tree-structured directories, and acyclic-graph directories.
3. Allocation Methods
The file system needs to allocate space on storage devices for files. There are several
allocation strategies:
● Contiguous Allocation: Files are stored in contiguous blocks of memory. Pros: Fast
access; Cons: External fragmentation and difficulty with file resizing.
● Linked Allocation: Each file is a linked list of blocks scattered across the disk. Pros: No
external fragmentation; Cons: Slow access and high overhead.
● Indexed Allocation: Each file has an index block that points to the blocks on the disk.
Pros: Supports random access; Cons: More overhead to manage index blocks.
4. Free Space Management
The file system must track free space on the disk to allocate it for new files or file growth:
● Bit Vector/Bitmap: A bitmap where each bit represents a disk block (1 = free, 0 =
occupied).
● Linked List: Free blocks are linked together to form a free space list.
● Grouping: Groups of free blocks are stored together in a block.
● Counting: Instead of listing each free block, it stores the starting address and the
number of contiguous free blocks.
5. Disk Scheduling
The disk scheduler decides the order in which I/O requests are processed:
● First-Come, First-Served (FCFS): Requests are processed in the order they arrive.
● Shortest Seek Time First (SSTF): The request closest to the current head position is
served first.
● SCAN (Elevator Algorithm): The disk arm moves in one direction, servicing requests
until it reaches the end of the disk, then reverses.
● C-SCAN (Circular SCAN): Similar to SCAN, but the arm returns to the beginning after
reaching the end without servicing requests on the way back.
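As an illustration (the head position and request queue below are assumptions, not from the text): with the head at cylinder 53 and pending requests 98, 183, 37, 122, FCFS services them in arrival order and moves the head 45 + 85 + 146 + 85 = 361 cylinders, while SSTF services 37 first, then 98, 122, 183, moving only 16 + 61 + 24 + 61 = 162 cylinders.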
Mounting is the process by which the OS makes a file system available for use. During boot
time, the root file system is mounted, and additional file systems can be mounted later using
commands like mount in Unix/Linux.
Files can be accessed in different ways, depending on how they are stored and used:
● Sequential Access: Data is accessed in a linear sequence (e.g., reading a text file).
● Direct/Random Access: Data is accessed directly by its position or index in the file
(e.g., accessing a record in a database).
● Indexed Access: An index is created for faster access, common in database systems.
The file system provides mechanisms to protect files from unauthorized access:
● Access Control Lists (ACLs): Define which users or system processes can access
specific files and directories.
● File Permissions: Read, write, execute permissions can be set for each file or directory
(e.g., rwx in Unix).
● Encryption: Ensures that data in files is secure and unreadable without proper
decryption keys.
The file system also provides mechanisms for reliability and crash recovery:
● Journaling File Systems: These systems keep track of changes in a journal before
committing them to the file system, helping in crash recovery. NTFS, Ext3, and Ext4 are
examples.
● RAID (Redundant Array of Independent Disks): RAID setups offer redundancy and
fault tolerance to protect data against hardware failure.
Several techniques improve file system performance:
● Block Size: Larger blocks can improve throughput but may waste space.
● Caching: Frequently accessed files are kept in memory for faster access.
● Defragmentation: Rearranges file blocks to be contiguous, improving access speed
(important in non-SSD systems).
Other important kinds of file systems include:
● Network/Distributed File Systems: Allow files to be accessed over a network, enabling multiple clients to
share files as if they were located on a local file system (e.g., NFS, AFS).
● Virtual File System (VFS): A layer of abstraction that allows multiple file systems to coexist and be accessed in a
uniform way, regardless of their underlying structure.
● Cloud File Systems: Store and manage data on cloud infrastructure, allowing scalability and
remote access (e.g., Amazon S3, Google Cloud Storage).