Operating System Full Syllabus Notes - MSBTE NOTES AND INFORMATION
Definition -:
Dual mode operation is a feature of modern operating systems that enables a
clear distinction between privileged (kernel mode) and non-privileged (user
mode) operations. It allows the operating system to protect critical system
resources and maintain security by enforcing access control and preventing
unauthorized access or manipulation.
1. User mode: In this mode, the CPU executes instructions on behalf of user
applications or processes. The user mode provides a restricted environment
where applications can run, but they have limited access to system resources.
User mode applications cannot directly access hardware devices or perform
privileged operations.
2. Kernel mode: In this mode, the CPU executes operating system code with full
access to the hardware and to all system resources. Privileged instructions,
direct device access, and manipulation of critical kernel data structures are
permitted only in kernel mode.
The transition between user mode and kernel mode occurs through system calls
or exceptions. When a user application needs to perform a privileged operation
or access a protected resource, it makes a request to the operating system
through a system call. The system call interrupts the execution of the user mode
code, transfers control to the kernel mode, and executes the requested operation
on behalf of the application. After completing the privileged operation, control
is returned to the user mode, and the application continues execution.
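As a minimal sketch (assuming a POSIX-like system; the message text is
arbitrary), the following C program invokes the write system call. The library
call traps into kernel mode, the kernel performs the privileged I/O on the
process's behalf, and control then returns to user mode:

```c
#include <unistd.h>   /* write() system call wrapper */
#include <string.h>   /* strlen() */

int main(void)
{
    const char *msg = "Hello from user mode\n";

    /* write() requests a privileged operation (output to a device).
       The C library issues a trap, the CPU switches to kernel mode,
       the kernel performs the I/O, and execution returns here in
       user mode with the number of bytes written. */
    write(1, msg, strlen(msg));   /* 1 = standard output */
    return 0;
}
```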
OS as a Resource Manager
An operating system (OS) acts as a resource manager, responsible for efficiently
allocating and managing the various hardware and software resources of a
computer system. It ensures that these resources are utilized effectively to fulfill
the demands of user applications and provide a seamless computing experience.
Here are some key aspects of how an OS functions as a resource manager: it
allocates CPU time among competing processes, manages main memory, controls
access to I/O devices, and organizes files on secondary storage.
By managing these resources, the OS ensures that they are allocated efficiently,
conflicts are resolved, and the overall system operates in a stable and reliable
manner. It serves as an intermediary layer between the hardware and software,
abstracting the complexities of resource management and providing a unified
interface for applications to interact with the system.
Imp Questions -:
4. Limited Error Handling: In batch systems, errors or faults in one job may
impact subsequent jobs in the batch queue. If a job encounters an error or
failure, it may require manual intervention to correct the issue and resume
processing, which can slow down the overall job execution.
5. Dependency on Job Order: The order in which jobs are submitted to the
batch queue may impact overall performance. If high-priority or critical jobs are
placed behind long-running jobs, it can delay their execution and affect the
system's responsiveness.
In a time-shared operating system, the CPU time is divided into small time
intervals called time slices or quanta. Each user or process is allocated a time
slice during which it can execute its tasks. When the time slice expires, the
operating system interrupts the execution and switches to the next user or
process in line.
The significance of a time-sharing operating system lies in its ability to
provide several important benefits for both users and computer systems, such as
improved responsiveness, interactive use, and better CPU utilization. At the
same time, sharing the system among many users raises some concerns:
- Security and Privacy Concerns: Sharing system resources raises security and
privacy concerns, as one user's actions or programs may potentially affect
others.
There are various types of Distributed Operating systems. Some of them are as
follows
1. Client-Server Systems
2. Peer-to-Peer Systems
Client-Server System
This type of system requires the client to request a resource, after which the
server gives the requested resource. When a client connects to a server, the
server may serve multiple clients at the same time.
The server exposes an interface through which the client sends its requests to
be executed as actions. After completing the requested activity, the server
sends back a response and transfers the result to the client.
It provides a file system interface for clients, allowing them to execute actions
like file creation, updating, deletion, and more.
Peer-to-Peer System
The nodes play an important role in this system. The task is evenly distributed
among the nodes. Additionally, these nodes can share data and resources as
needed. Once again, they require a network to connect.
Mobile OS
Android -:
Applications
Application Framework
Android Runtime
Platform Libraries
Linux Kernel
Applications –
Application framework –
Application runtime –
Platform libraries –
The Platform Libraries include various C/C++ core libraries and Java-based
libraries such as Media, Graphics, Surface Manager, OpenGL, etc. to provide
support for Android development.
Media library provides support for playing and recording audio and video in
various formats.
Surface manager is responsible for managing access to the display subsystem.
SGL and OpenGL are both cross-language, cross-platform application
programming interfaces (APIs) used for 2D and 3D computer graphics.
SQLite provides database support and FreeType provides font support.
WebKit: this open-source web browser engine provides all the functionality
to display web content and to simplify page loading.
SSL (Secure Sockets Layer) is a security technology for establishing an
encrypted link between a web server and a web browser.
Linux Kernel –
The Linux Kernel is the heart of the Android architecture. It manages all the
available drivers, such as display drivers, camera drivers, Bluetooth drivers,
audio drivers, memory drivers, etc., which are required during runtime.
The Linux Kernel will provide an abstraction layer between the device
hardware and the other components of android architecture. It is responsible
for management of memory, power, devices etc.
Security: The Linux kernel handles the security between the application and
the system.
Memory Management: It efficiently handles memory management, freeing
application developers from low-level memory concerns.
Process Management: It manages processes and allocates resources to them
whenever they need them.
Network Stack: It effectively handles network communication.
Driver Model: It ensures that applications work properly on the device;
hardware manufacturers are responsible for building their drivers into the
Linux build.
Imp Questions :
Doubts Column :
2. Services & Components of OS
5. User Interface: The operating system provides a user interface that allows
users to interact with the computer system. This can include command-line
interfaces (CLI), graphical user interfaces (GUI), or a combination of both,
enabling users to execute commands, launch applications, and manage files.
10. System Utilities: Operating systems offer a range of utility programs that
assist in system management and maintenance. These utilities may include disk
management tools, performance monitoring tools, backup and restore utilities,
and system configuration tools.
System Calls
1. File System Operations: System calls such as "open," "read," "write," and
"close" are used for file manipulation. They allow programs to create, open,
read from, write to, and close files.
2. Process Management: System calls like "fork," "exec," "exit," and "wait"
are used for managing processes. They allow programs to create new processes,
replace the current process with a different program, terminate processes, and
wait for process termination (a combined sketch of these calls together with
the file-system calls appears after this list).
4. Memory Management: System calls like "brk" and "mmap" are used for
memory management. They allow programs to allocate and deallocate memory
dynamically, map files into memory, and modify memory protection settings.
6. Time and Date Management: System calls like "time," "gettimeofday," and
"sleep" are used to obtain and manipulate system time and dates.
7. Process Control: System calls like "kill" and "signal" are used for process
control. They allow programs to send signals to processes, handle signal events,
and modify signal behavior.
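A minimal sketch (POSIX-style C; the file name demo.txt is an arbitrary
illustration) that combines the process-management calls fork and wait with the
file-system calls open, write, and close:

```c
#include <sys/types.h>
#include <sys/wait.h>   /* wait()  */
#include <fcntl.h>      /* open()  */
#include <unistd.h>     /* fork(), write(), close(), _exit() */

int main(void)
{
    pid_t pid = fork();          /* create a new child process */

    if (pid == 0) {
        /* Child: create a file and write a short message into it. */
        int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd >= 0) {
            write(fd, "written by the child\n", 21);
            close(fd);
        }
        _exit(0);                /* terminate the child process */
    }

    /* Parent: block until the child terminates. */
    wait(NULL);
    return 0;
}
```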
Imp questions
Process management
Files management
Command Interpreter
System calls
Signals
Network management
Security management
I/O device management
Secondary storage management
Main memory management
Process Management :
Executable program
Program’s data
Stack and stack pointer
Program counter and other CPU registers
Details of opened files
Files Management :
Files are used for long-term storage. Files are used for both input and output.
Every operating system provides a file management service. This file
management service can also be treated as an abstraction as it hides the
information about the disks from the user. The operating system also provides
system calls for file management. The system calls for file management
include –
File creation
File deletion
Read and Write operations
Command Interpreter :
There are several ways for users to interface with the operating system. One of
the approaches to user interaction with the operating system is through
commands. Command interpreter provides a command-line interface. It
allows the user to enter a command on the command line prompt (cmd).
System Calls :
Network Management :
Security Management:
The security mechanisms in an operating system ensure that authorized
programs have access to resources, and unauthorized programs have no access
to restricted resources. Security management refers to the mechanisms by which
the operating system ensures that any access to files, memory, the CPU, and
other hardware resources by a user or program is properly authorized.
The I/O device management component is an I/O manager that hides the
details of hardware devices and manages the main memory for devices using
cache and spooling. This component provides a buffer cache and general
device driver code that allows the system to manage the main memory and the
hardware devices connected to it. It also provides and manages custom drivers
for particular hardware devices.
The purpose of the I/O system is to hide the details of hardware devices from
the application programmer. An I/O device management component allows
highly efficient resource utilization while minimizing errors and making
programming easy on the entire range of devices available in their systems.
Broadly, the secondary storage area is any space, where data is stored
permanently and the user can retrieve it easily. Your computer’s hard drive is
the primary location for your files and programs. Other spaces, such as CD-
ROM/DVD drives, flash memory cards, and networked devices, also provide
secondary storage for data on the computer. The computer’s main memory
(RAM) is a volatile storage device in which running programs reside; it provides
only temporary storage space for performing tasks. Secondary storage refers to
the media devices other than RAM (e.g. CDs, DVDs, or hard disks) that
provide additional space for permanent storing of data and software programs
which is also called non-volatile storage.
Main memory management :
Task Scheduler -:
Performance Monitor -:
Doubts Column :
3. Process Management
Process -:
Process States -:
A process passes through several states from its creation to its termination.
There are at least five such states. Although a process can be in only one of
these states at any moment during execution, the names of the states are not
standardized. Each process goes through these stages throughout its life
cycle.
Process States in Operating System
New (Create): In this state, the process is about to be created but has not
yet been created. It is the program present in secondary memory that will be
picked up by the OS to create the process.
Ready: New -> Ready to run. After the creation of a process, the process
enters the ready state i.e. the process is loaded into the main memory. The
process here is ready to run and is waiting to get the CPU time for its
execution. Processes that are ready for execution by the CPU are maintained
in a queue called ready queue for ready processes.
Run: The process is chosen from the ready queue by the CPU for execution
and the instructions within the process are executed by any one of the
available CPU cores.
Blocked or Wait: Whenever the process requests access to I/O, needs input
from the user, or needs access to a critical region (whose lock is already held
by another process), it enters the blocked or wait state. The process continues
to wait in main memory and does not require the CPU. Once the I/O operation
is completed, the process goes back to the ready state.
Terminated or Completed: The process is killed and its PCB is deleted. The
resources allocated to the process are released or deallocated.
Suspend Ready: A process that was initially in the ready state but was
swapped out of main memory (refer to the Virtual Memory topic) and placed onto
external storage by the scheduler is said to be in the suspend ready state. The
process will transition back to the ready state whenever it is brought into
main memory again.
Suspend wait or suspend blocked: Similar to suspend ready, but this applies to
a process that was performing an I/O operation (i.e., blocked) when a shortage
of main memory caused it to be moved to secondary memory. When its work is
finished, it may go to the suspend ready state.
The Process Control Block (PCB), also known as the Task Control Block
(TCB), is a data structure used by an operating system to manage and track
information about a specific process. It contains essential details and control
information that the operating system needs to manage the process effectively.
Each process in the system has its own PCB, and the operating system uses the
PCB to perform process management and scheduling tasks.
The Process Control Block is a vital data structure used by the operating system
to manage and control processes efficiently. It allows the operating system to
maintain and retrieve the necessary information for process scheduling, context
switching, resource allocation, and interprocess communication. By maintaining
a PCB for each process, the operating system can effectively manage the
execution and control of multiple processes concurrently.
Here are the key components and information typically found in a Process
Control Block:
2. Process State: Indicates the current state of the process, such as running,
ready, blocked, or terminated. The state is updated as the process moves through
different phases of execution and interacts with the operating system.
3. Program Counter (PC): The Program Counter holds the address of the next
instruction to be executed by the process. It allows the operating system to keep
track of the execution progress of the process.
4. CPU Registers: The PCB contains the values of various CPU registers
associated with the process, such as the accumulator, stack pointer, and index
registers. These registers hold the intermediate results, program variables, and
execution context of the process.
8. Accounting Information: The PCB may include accounting data, such as the
amount of CPU time used by the process, the number of times it has been
executed, or other statistics related to resource utilization. This information
assists in performance analysis, billing, and system monitoring.
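As a rough sketch only (the field names, sizes, and types are illustrative
assumptions, not any real kernel's layout), a PCB could be modelled in C like
this:

```c
/* Illustrative, simplified Process Control Block (not a real kernel's). */
enum proc_state { NEW, READY, RUNNING, BLOCKED, TERMINATED };

struct pcb {
    int              pid;             /* process identifier            */
    enum proc_state  state;           /* current process state         */
    unsigned long    program_counter; /* address of next instruction   */
    unsigned long    registers[16];   /* saved CPU register contents   */
    int              priority;        /* scheduling priority           */
    void            *page_table;      /* memory-management information */
    int              open_files[16];  /* descriptors of opened files   */
    unsigned long    cpu_time_used;   /* accounting information        */
};
```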
Scheduling Objectives -:
1. Fairness: The scheduler ensures that each process gets a fair share of CPU
time, preventing any particular process from monopolizing system resources.
Fairness promotes an equitable distribution of CPU resources among processes.
Scheduling Queue
Ready Queue:
1. Purpose: The primary purpose of the ready queue is to hold processes that
are waiting to be scheduled for execution on the CPU. These processes have
met the necessary criteria to run, such as having their required resources
available or completing any required I/O operations.
2. Organization: The ready queue is typically implemented as a queue data
structure, where processes are added to the back of the queue and removed from
the front. This follows the First-Come, First-Served (FCFS) principle, where the
process that arrives first is scheduled first.
4. Process State: Processes in the ready queue are in a ready state, indicating
that they are prepared to execute but are waiting for CPU allocation. Once a
process is selected from the ready queue, it transitions to the running state and
starts executing on the CPU.
Device Queue:
The device queue, also known as the I/O queue or waiting queue, is a data
structure used by the operating system to manage processes that are waiting for
access to I/O devices. It holds processes that are waiting for I/O operations to
complete before they can proceed with their execution.
1. Purpose: The device queue is used to hold processes that are waiting for I/O
operations to be performed on a specific device, such as reading from or writing
to a disk, accessing a printer, or interacting with other peripherals. These
processes are unable to proceed until the requested I/O operation is completed.
2. Organization: Similar to the ready queue, the device queue is typically
implemented as a queue data structure. Processes are added to the end of the
queue when they are waiting for an I/O operation and are removed from the
front when the operation is completed.
4. I/O Scheduling: The device queue, in conjunction with the I/O scheduler,
manages the order in which processes access I/O devices. The scheduler
determines the sequence in which processes are granted access to the device,
aiming to optimize device utilization and minimize waiting times.
Schedulers
Schedulers in an operating system are responsible for making decisions about
process execution, resource allocation, and process management. They
determine which processes should run, in what order, and for how long.
Schedulers play a crucial role in achieving efficient utilization of system
resources, responsiveness, fairness, and meeting performance objectives. Here
are the main types of schedulers found in operating systems:
Context Switch
Context switches are essential for multitasking and providing a responsive and
concurrent environment in an operating system. They allow the CPU to
efficiently allocate its processing power to multiple processes or threads,
enabling concurrent execution, time-sharing, and the illusion of parallelism in a
system.
Inter-process Communication
3. Pipes and FIFOs : Pipes and FIFOs (First-In-First-Out) are forms of inter-
process communication that are typically used for communication between
related processes. A pipe is a unidirectional communication channel that allows
data to flow in one direction. FIFOs, also known as named pipes, are similar to
pipes but can be accessed by unrelated processes. Pipes and FIFOs provide a
simple and straightforward method of IPC, with one process writing data into
the pipe and another process reading it.
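A small sketch, assuming a POSIX system, in which a parent process writes a
message into a pipe and a related (forked) child process reads it:

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    pipe(fds);                       /* fds[0] = read end, fds[1] = write end */

    if (fork() == 0) {
        /* Child: read the message coming through the pipe. */
        char buf[64] = {0};
        close(fds[1]);               /* child does not write */
        read(fds[0], buf, sizeof(buf) - 1);
        printf("child received: %s", buf);
        close(fds[0]);
        _exit(0);
    }

    /* Parent: write a message into the pipe, then wait for the child. */
    close(fds[0]);                   /* parent does not read */
    write(fds[1], "hello through the pipe\n", 23);
    close(fds[1]);
    wait(NULL);
    return 0;
}
```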
Message Passing :
Shared Memory :
Shared memory provides several benefits. It is generally faster than other IPC
methods since it avoids the need for data copying between processes. As
processes can directly access the shared memory, it is suitable for scenarios that
require high-performance data sharing, such as inter-process coordination or
inter-thread communication within a single machine.
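A minimal sketch (assuming a POSIX/Linux-style system) in which a parent and a
forked child share one page of anonymous memory created with mmap, so data
written by the child is visible to the parent without any copying:

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* One shared, anonymous region visible to parent and child after fork(). */
    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (fork() == 0) {
        /* Child: write directly into the shared region (no copying). */
        strcpy(shared, "message placed in shared memory");
        _exit(0);
    }

    wait(NULL);                              /* let the child finish first */
    printf("parent read: %s\n", shared);     /* reads the child's data     */
    munmap(shared, 4096);
    return 0;
}
```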
Threads
Thread Lifecycle -:
The life cycle of a thread describes the different stages that a thread goes
through during its existence. The thread life cycle typically includes the
following states:
1. New : In the new state, a thread is created but has not yet started executing.
The necessary resources for the thread, such as its stack space and program
counter, have been allocated, but the thread has not been scheduled by the
operating system to run.
2. Runnable : Once the thread is ready to execute, it enters the runnable state.
In this state, the thread is eligible to run and can be scheduled by the operating
system's thread scheduler. However, it does not guarantee immediate execution
as it competes with other runnable threads for CPU time.
3. Running : When a thread is selected by the scheduler to execute on a CPU
core, it enters the running state. In this state, the actual execution of the thread's
instructions takes place. The thread remains in the running state until it
voluntarily yields the CPU or its time slice expires.
4. Blocked : Threads can transition to the blocked (or waiting) state if they
need to wait for a particular event or condition to occur. For example, a thread
might block if it requests I/O operations or synchronization primitives like locks
or semaphores. While blocked, the thread is not eligible for CPU time and
remains in this state until the event or condition it is waiting for is satisfied.
5. Terminated : Once the thread finishes executing its task or is explicitly
stopped, it enters the terminated state. Its resources are reclaimed and it
cannot be scheduled again.
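A small sketch using POSIX threads (the function and variable names are
illustrative): pthread_create puts the new thread into the runnable state, the
scheduler later runs it, and pthread_join makes the main thread block until the
worker terminates:

```c
#include <pthread.h>
#include <stdio.h>

/* Work performed by the thread while it is in the running state. */
static void *worker(void *arg)
{
    (void)arg;                       /* unused */
    printf("worker thread running\n");
    return NULL;                     /* the thread terminates here */
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, worker, NULL);  /* new -> runnable            */
    pthread_join(tid, NULL);                   /* main blocks until it ends  */
    printf("worker has terminated\n");
    return 0;
}
```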
Multi-threading Model
The choice of the multithreading model depends on various factors, such as the
application requirements, performance goals, and the level of control and
concurrency desired. Each model has its advantages and trade-offs in terms of
concurrency, overhead, resource usage, and ease of programming.
ps command -:
`ps -u username`: This option allows you to filter and display processes
owned by a specific user. Replace "username" with the actual username.
`ps -p PID`: Use the `-p` option followed by a process ID (PID) to
retrieve information about a specific process.
wait command -:
kill command -:
1. Define Process.
2. List out any two difference between process and program.
3. Explain PCB.
4. With the help of a neat labelled diagram explain process states.
5. Explain types of schedulers with the help of a diagram.
6. What do you mean by scheduling?
7. What is a thread? Explain types of thread.
8. Explain multi-threading model.
9. Explain thread life cycle diagram with the help of a diagram.
10. Explain context switching.
11. Explain Inter-process communication. List out all the techniques used in IPC
and explain any one of them.
12. Write a short note on: ps command, sleep, wait, kill
Chapter 3 Checklist
Doubts Column :
4. CPU Scheduling & Algorithms
Process
2. Data: This includes variables, constants, and other data required by the
program during execution.
Scheduling Categories
Preemptive Scheduling
Non-preemptive Scheduling
Preemptive and non-preemptive scheduling are two types of process scheduling
mechanisms used by operating systems to manage the execution of multiple
processes in a multitasking environment.
These mechanisms determine how the operating system decides which process
to run and for how long.
1. Preemptive Scheduling:
Advantages:
Enhances responsiveness and reduces latency for high-priority tasks.
Allows for better utilization of system resources by swiftly responding to
new tasks or higher-priority processes.
Disadvantages:
Requires careful handling to prevent race conditions and ensure data
consistency.
Context switching (switching between processes) involves overhead,
potentially affecting system performance.
Advantages:
Disadvantages:
1. First-Come-First-Served (FCFS):
Processes are scheduled in the order they arrive in the ready queue.
Simple and easy to implement, but may lead to longer waiting times for
shorter processes (convoy effect).
Processes are scheduled based on their burst time (the time taken to
complete their execution).
Minimizes average waiting time and is optimal for minimizing waiting
time for the shortest processes.
However, it requires knowledge of the burst times in advance, which
may not always be available.
5. Priority Scheduling:
6. Priority Aging:
Divides the ready queue into multiple separate queues with different
priorities.
Each queue can have its own scheduling algorithm.
Processes move between queues based on their characteristics or
system-defined criteria.
9. Lottery Scheduling:
Advantages :
Disadvantages:
1. Convoy Effect : FCFS can suffer from the convoy effect, where shorter
processes are stuck behind a long-running process. This increases the average
waiting time, especially for shorter processes.
2. High Average Waiting Time : The average waiting time can be relatively
high, especially if shorter processes arrive after longer processes. This can lead
to suboptimal performance in terms of response time.
4. Not Suitable for Real-Time Systems : FCFS is not suitable for real-time
systems where strict timing requirements need to be met. It doesn't prioritize
processes based on their urgency or importance.
Example -:
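An illustrative calculation (burst times assumed purely for the example):
suppose processes P1, P2, and P3 arrive at time 0, in that order, with CPU burst
times of 24, 3, and 3 units. Under FCFS, P1 runs from 0 to 24, P2 from 24 to 27,
and P3 from 27 to 30, so the waiting times are 0, 24, and 27 units and the
average waiting time is (0 + 24 + 27) / 3 = 17 units. Had the two short jobs run
first, the average would have been only (0 + 3 + 6) / 3 = 3 units; this is the
convoy effect described above.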
Shortest Job First ( SJF )
1. Non-Preemptive SJF:
The operating system schedules the process with the shortest burst time
first.
When a new process arrives, its burst time is compared with the
remaining burst times of other processes in the ready queue. The process
with the shortest burst time is selected to run next.
2. Preemptive SJF (Shortest Remaining Time First):
If a new process arrives with a shorter burst time than the remaining
time of the currently running process, the operating system preempts the
running process and schedules the new process.
The algorithm compares the remaining burst times of all processes and
selects the one with the shortest remaining burst time to run next.
Advantages of SJF:
1. Optimal Average Waiting Time : SJF provides the minimum average
waiting time among all scheduling algorithms. It gives priority to shorter jobs,
which minimizes the overall waiting time.
2. Efficient Utilization of CPU : When the process with the shortest burst time
is scheduled first, it optimizes CPU utilization by executing processes quickly.
Disadvantages of SJF:
3. Unrealistic Assumption : SJF assumes that the burst time of each process is
known in advance, which is not always the case in real-world scenarios.
Example -:
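An illustrative calculation (burst times assumed): consider processes P1, P2,
P3, and P4, all arriving at time 0, with burst times 6, 8, 7, and 3 units.
Non-preemptive SJF runs them in the order P4 (0 to 3), P1 (3 to 9), P3 (9 to 16),
P2 (16 to 24). The waiting times are 3, 16, 9, and 0 units respectively, so the
average waiting time is (3 + 16 + 9 + 0) / 4 = 7 units, whereas FCFS on the same
processes would give (0 + 6 + 14 + 21) / 4 = 10.25 units.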
Round Robin
Time Sharing is the main emphasis of the algorithm. Each step of this algorithm
is carried out cyclically. The system defines a specific time slice, known as a
time quantum.
1. Initialization: Create a queue to hold the processes that are ready to execute.
Each process is assigned a fixed time quantum (time slice) that determines how
long it can run on the CPU before being moved to the back of the queue.
4. Execution: The selected process is allowed to execute on the CPU until its
time quantum expires or it voluntarily releases the CPU, such as when it
performs an I/O operation or finishes its execution.
Example -:
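An illustrative calculation (burst times and quantum assumed): take processes
P1, P2, and P3 arriving at time 0 with burst times 24, 3, and 3 units and a time
quantum of 4 units. The CPU is allocated as P1 (0 to 4), P2 (4 to 7), P3 (7 to
10), and then P1 uses the remaining slices until it finishes at time 30. The
waiting times work out to 6, 4, and 7 units, an average of about 5.7 units,
compared with 17 units under FCFS for the same workload.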
Priority Scheduling
Advantages:
Disadvantages:
Deadlock
1. Processes: Multiple processes are competing for resources like CPU time,
memory, or input/output devices.
2. Resources: Resources can be either reusable (e.g., memory, CPU) or
consumable (e.g., files, database records).
4. Circular Wait: A deadlock occurs when a set of processes are each waiting
for a resource acquired by one of the other processes, forming a circular chain
of dependencies.
In this scenario, none of the processes can proceed because they are all waiting
for a resource held by another process. This forms a circular wait, leading to a
deadlock.
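A minimal sketch (using POSIX threads purely for illustration) of how such a
circular wait can arise in code: each thread acquires one lock and then waits
for the lock held by the other:

```c
#include <pthread.h>

/* Two resources, represented by two mutexes. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void *thread_one(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_a);   /* holds A ...         */
    pthread_mutex_lock(&lock_b);   /* ... and waits for B */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

static void *thread_two(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock_b);   /* holds B ...         */
    pthread_mutex_lock(&lock_a);   /* ... and waits for A */
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread_one, NULL);
    pthread_create(&t2, NULL, thread_two, NULL);
    /* If both threads acquire their first lock before requesting the
       second, neither can proceed: a circular wait, i.e. a deadlock. */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Acquiring the locks in the same global order in both threads would remove the
circular wait and hence the deadlock.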
Assignment #4
8. Explain deadlock
5. Memory Management
Memory Management
2. Memory Organization:
3. Address Translation:
4. Memory Protection:
- Implementing measures to prevent unauthorized access to memory regions,
ensuring the security and integrity of data.
5. Memory Sharing:
7. Virtual Memory:
Memory Partitioning
1. Fixed Partitioning:
In fixed partitioning, the physical memory is divided into fixed-sized partitions
during system initialization. Each partition can accommodate one process.
Processes are loaded into these partitions based on their size, and any remaining
unused space in a partition is wasted. Fixed partitioning is simple but may lead
to inefficient memory utilization.
Advantages:
Disadvantages:
2. Dynamic Partitioning:
Dynamic partitioning involves allocating memory dynamically based on the
size of the processes. When a new process arrives, the operating system finds a
suitable partition that can accommodate the process and allocates memory
accordingly. The partition size is adjusted dynamically based on the size of the
incoming processes.
Advantages:
Disadvantages:
Overhead of managing and updating information about partitions and
free spaces.
Possibility of external fragmentation, where small blocks of free
memory become scattered and cannot be used.
Fragmentation
1. Internal Fragmentation:
Internal fragmentation occurs when a process is allocated a block of memory
larger than it actually needs; the unused space inside the allocated block is
wasted and cannot be used by any other process.
Variable Partitioning:
- Example:
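An illustrative case (sizes assumed): a process that needs 18 KB is loaded
into a 20 KB block or partition. The 2 KB left unused inside that allocated
block cannot be given to any other process; that wasted space inside the
allocation is internal fragmentation.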
2. External Fragmentation:
External fragmentation occurs when there is enough total free memory to
satisfy a memory request, but this free memory is scattered across the system in
small, non-contiguous blocks. As a result, a request for a specific amount of
memory cannot be fulfilled, even though the aggregate free memory is
sufficient.
- Example:
Imagine you have 100 units of free memory, divided into blocks of 30, 10,
25, and 35 units. If a process requests 50 units of memory, it cannot be
accommodated even though there is enough free memory available.
Both internal and external fragmentation can impact system performance and
memory utilization efficiency. Effective memory management strategies, like
compaction and appropriate memory allocation algorithms, are crucial to
mitigate the effects of fragmentation and enhance overall system performance.
Paging
The key idea behind paging is to divide the virtual address space of a process
into fixed-size pages and map these pages to frames in the physical memory.
When a process accesses a memory address, the operating system uses the page
table to translate the virtual address into a physical address.
3. Page Table: A data structure maintained by the operating system that maps
virtual addresses (used by the processes) to physical addresses (locations in
RAM). Each entry in the page table corresponds to a page and stores the frame
number where the page is located in physical memory.
4. Virtual Address Space: The range of addresses that a process can use, as
seen from the perspective of the process. This address space is divided into
fixed-size pages.
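A small sketch (the 4 KB page size, the example address, and the page-table
contents are assumptions for illustration) of how a virtual address is split
into a page number and an offset and then combined with a frame number from the
page table to form a physical address:

```c
#include <stdio.h>

#define PAGE_SIZE 4096u                 /* assumed page size: 4 KB */

int main(void)
{
    unsigned int virtual_addr = 0x3ABC;        /* example virtual address      */
    unsigned int page_table[] = {5, 9, 2, 7};  /* page -> frame (illustrative) */

    unsigned int page   = virtual_addr / PAGE_SIZE;   /* which page?           */
    unsigned int offset = virtual_addr % PAGE_SIZE;   /* position in the page  */
    unsigned int frame  = page_table[page];           /* look up the frame     */
    unsigned int physical_addr = frame * PAGE_SIZE + offset;

    printf("page %u, offset 0x%X -> frame %u, physical 0x%X\n",
           page, offset, frame, physical_addr);
    return 0;
}
```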
Page Replacement Algorithms
1. FIFO (First-In-First-Out):
This algorithm removes the page that has been in memory the longest. It's like
a queue where the page that came in first is the first to be replaced. However,
FIFO does not consider how frequently a page is accessed or how recently it
was used.
2. LRU (Least Recently Used):
LRU replaces the page that has not been used for the longest time. It's based
on the principle that pages that have not been accessed recently are less likely to
be used soon. LRU requires maintaining a timestamp or a counter for each page
to track its last access time.
3. Optimal Page Replacement:
This is an idealized algorithm that replaces the page that will not be used for
the longest time in the future. It's used as a reference to measure the efficiency
of other page replacement algorithms, although it's not practically
implementable because it requires knowledge of future memory accesses.
First-In-First-Out
1. Initialization:
The operating system maintains a queue, which initially is empty, to track the
order of page arrivals into the memory.
2. Page Request:
When a page fault occurs (a requested page is not in memory), the operating
system selects the oldest page in the memory for replacement.
3. Page Replacement:
The page at the front of the queue (the one that has been in memory the
longest) is replaced with the new page that caused the page fault.
After replacing the old page, the new page is added to the end of the queue,
as the page most recently brought into memory.
Example -:
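An illustrative run (reference string and frame count assumed): with 3 frames
and the page reference string 1, 3, 0, 3, 5, 6, 3, the pages 1, 3, and 0 each
cause a fault and fill the frames; the next reference to 3 is a hit; page 5
faults and evicts 1 (the oldest page); page 6 faults and evicts 3; the final
reference to 3 faults again and evicts 0. FIFO therefore incurs 6 page faults
for this string.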
Least Recently Used (LRU)
LRU stands for Least Recently Used. It keeps track of page usage in
the memory over a short period of time. It works on the concept that pages that
have been highly used in the past are likely to be significantly used again in the
future. It removes the page that has not been utilized in the memory for the
longest time. LRU is the most widely used algorithm because it provides fewer
page faults than the other methods.
LRU, which stands for "Least Recently Used," is a page replacement algorithm
used in computer operating systems to manage and optimize the use of physical
memory (RAM) when executing multiple processes or programs. The primary
goal of the LRU algorithm is to minimize page faults by evicting the page that
has not been used for the longest period of time.
1. Maintaining a Page Reference List: LRU keeps track of the order in which
pages are accessed by creating a list or queue, which is sometimes called a
"page reference list." This list is usually implemented as a data structure that
records the recent page accesses.
4. Maintaining List Size: To ensure that the page reference list doesn't grow
infinitely, it is usually limited to a fixed size, which means that when a new
page is added to the front of the list, the page at the end of the list is removed.
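An illustrative run (using the same assumed reference string as in the FIFO
example): with 3 frames and the string 1, 3, 0, 3, 5, 6, 3, the pages 1, 3, and
0 fault and fill the frames; the second reference to 3 is a hit and marks 3 as
recently used; page 5 faults and evicts 1 (the least recently used page); page
6 faults and evicts 0; the final reference to 3 is now a hit because 3 was used
recently. LRU therefore incurs only 5 page faults, one fewer than FIFO on the
same string.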
Optimal Page Replacement
1. At each memory access request, the algorithm examines the future references
to determine which page will be used furthest in the future.
2. It selects the page that will not be used for the longest time (the page that has
the farthest future reference).
Concept of File
1. File Name: This is the human-readable name of the file, which is used to
identify and reference it.
2. File Extension: The file extension is typically a part of the file name and
indicates the file type or format. For example, ".txt" is often used for plain text
files, ".jpg" for image files, and ".exe" for executable programs.
3. File Size: This attribute specifies the size of the file in bytes or another
appropriate unit of measurement, indicating how much storage space the file
occupies.
4. File Location: The file's path or directory location specifies where the file is
stored within the file system's hierarchy. It includes information about the
folder(s) in which the file is contained.
5. Date Created: This attribute indicates the date and time when the file was
initially created.
6. Date Modified: This attribute indicates the date and time when the file was
last modified or updated.
7. Date Accessed: Some operating systems track the date and time when the file
was last accessed, although this attribute is often disabled by default due to
performance considerations.
8. File Permissions: File permissions define who can access, read, write, or
execute the file. These permissions are usually categorized into read, write, and
execute permissions for the owner, group, and others.
9. File Type: This attribute provides information about the type or format of the
file, which can be used by the operating system and associated applications to
determine how to handle the file.
10. File Owner: Every file is associated with an owner, typically the user
account that created the file. The owner has special privileges and control over
the file's permissions.
File Operations
In an operating system, file operations are a set of actions that can be performed
on files to create, read, write, update, delete, and manage them. These
operations are essential for managing and manipulating data within the file
system. Here are some of the most common file operations:
1. File Creation: Creating a file involves specifying a file name and optionally
choosing a location within the file system. The operating system reserves space
for the file and assigns initial attributes such as creation date and permissions.
2. File Reading: Reading from a file allows you to retrieve data from an
existing file. Reading can be done sequentially or randomly, depending on the
file access method. The operating system provides system calls and APIs for
reading data from files.
5. File Updating: Updating a file means modifying specific parts of its content,
typically within the file's structure. For instance, updating a database file might
involve changing a record's data without affecting the rest of the file.
6. File Deletion: Deleting a file removes it from the file system, freeing up
storage space. Care should be taken when deleting files, as they may not always
be recoverable from the recycling bin or trash.
File System Structure
A record sequence file system stores data as records, where each record
consists of a fixed-size block or entry.
This structure is well-suited for applications where data is organized into
records, such as databases.
Records are typically identified by a record number or key and can be
read, written, or updated individually.
Record sequence file systems provide efficient access to specific data
within a file without the need to read or write the entire file.
This structure is commonly used in database management systems
(DBMS) and other data-centric applications.
File Contents:
Direct access, also known as random access, allows you to read or write data at
a specific location within the file without the need to traverse the entire file
sequentially.
Each data item within the file is associated with an address or index,
allowing you to directly access the desired item.
This method is suitable for tasks that require quick access to specific data
within a file, such as database systems or indexed data structures.
Example: Accessing a specific record in a database file by its record
number. You don't need to read through all the records to find the one
you're interested in.
The allocation methods define how the files are stored in the disk blocks.
There are three main disk space or file allocation methods.
Contiguous Allocation
Linked Allocation
Indexed Allocation
1. Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk. For
example, if a file requires n blocks and is given a block b as the starting
location, then the blocks assigned to the file will be: b, b+1, b+2,……b+n-1.
This means that given the starting block address and the length of the
file (in terms of blocks required), we can determine the blocks occupied
by the file.
The directory entry for a file with contiguous allocation contains
Address of starting block
Length of the allocated portion.
The file ‘mail’ in the following figure starts from the block 19 with length = 6
blocks. Therefore, it occupies 19, 20, 21, 22, 23, 24 blocks.
Advantages:
Both the Sequential and Direct Accesses are supported by this. For direct
access, the address of the kth block of the file which starts at block b can
easily be obtained as (b+k).
This is extremely fast since the number of seeks is minimal because of the
contiguous allocation of file blocks.
Disadvantages:
This method suffers from both internal and external fragmentation. This
makes it inefficient in terms of memory utilization.
Increasing file size is difficult because it depends on the availability of
contiguous memory at a particular instance.
2. Linked List Allocation
In this scheme, each file is a linked list of disk blocks which need not
be contiguous. The disk blocks can be scattered anywhere on the disk.
The directory entry contains a pointer to the starting and the ending file block.
Each block contains a pointer to the next block occupied by the file.
The file ‘jeep’ in the following image shows how the blocks are randomly
distributed. The last block (25) contains -1 indicating a null pointer and
does not point to any other block.
Advantages:
This is very flexible in terms of file size. File size can be increased easily
since the system does not have to look for a contiguous chunk of memory.
This method does not suffer from external fragmentation. This makes it
relatively better in terms of memory utilization.
Disadvantages:
Because the file blocks are distributed randomly on the disk, a large number
of seeks are needed to access every block individually. This makes linked
allocation slower.
It does not support random or direct access. We can not directly access the
blocks of a file. A block k of a file can be accessed by traversing k blocks
sequentially (sequential access ) from the starting block of the file via block
pointers.
Pointers required in the linked allocation incur some extra overhead.
3. Indexed Allocation
In this scheme, a special block known as the Index block contains the pointers
to all the blocks occupied by a file. Each file has its own index block. The ith
entry in the index block contains the disk address of the ith file block. The
directory entry contains the address of the index block as shown in the image:
Advantages:
This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.
It overcomes the problem of external fragmentation.
Disadvantages:
Directory Structure
1) Single-level directory:
The single-level directory is the simplest directory structure. In it, all files
are contained in the same directory which makes it easy to support and
understand.
A single-level directory has a significant limitation, however, when the
number of files increases or when the system has more than one user. Since all
the files are in the same directory, they must have unique names. If two users
name their datasets "test", the unique-name rule is violated.
Advantages:
Disadvantages:
There is a chance of name collision, because no two files can have the same
name.
Searching becomes time-consuming if the directory is large.
Files of the same type cannot be grouped together.
2) Two-level directory:
Advantages:
The main advantage is that different users can have files with the same name,
which is very helpful when there are multiple users.
There is a degree of security, since one user cannot access another user's
files.
Searching for files becomes very easy in this directory structure.
Disadvantages:
While the isolation provides security, it also means a user cannot share a file
with other users.
Although users can create their own files, they do not have the ability to
create subdirectories.
Scalability is limited because a user cannot group files of the same type
together.
3) Tree-structured directory:
Disadvantages:
As a user is not allowed to access another user's directory, file sharing
among users is prevented.
As users have the capability to make subdirectories, searching may become
complicated if the number of subdirectories increases.
Users cannot modify the root directory data.
If files do not fit in one directory, they may have to be placed in other
directories.
Raid Levels -:
RAID combines several independent and relatively small disks into single
storage of a large size. The disks included in the array are called array
members. The disks can combine into the array in different ways, which are
known as RAID levels.
RAID arrays appear to the operating system as a single logical drive. RAID
employs the techniques of disk mirroring or disk striping.
o Disk Mirroring will copy identical data onto more than one drive.
o Disk Striping spreads data over multiple disk drives. Each
drive's storage space is divided into units ranging from 512 bytes up to
several megabytes. The stripes of all the disks are interleaved and
addressed in order.
o Disk mirroring and disk striping can also be combined in a RAID array.
Levels of RAID
Many different ways of distributing data have been standardized into various
RAID levels. Each RAID level offers a trade-off between data protection, system
performance, and storage space. The levels are broken into three categories:
standard, nested, and non-standard RAID levels.
RAID 0 takes any number of disks and merges them into one large volume.
It increases speed, as you are reading from and writing to multiple disks at a
time, and an individual file can use the speed and capacity of all the drives of
the array. The downside to RAID 0, though, is that it is NOT redundant: if any
one disk fails, all data on all disks is lost. This RAID type is much less
reliable than having a single disk.
RAID 1 duplicates data across two disks in the array, providing full redundancy.
Both disks store exactly the same data, at the same time, and at all times. Data
is not lost as long as one disk survives. The total capacity of the array equals
the capacity of the smallest disk in the array. At any given instant, the
contents of both disks in the array are identical.
If either drive fails, you can then replace the broken drive with little to no
downtime. RAID 1 also gives you the additional benefit of increased read
performance, as data can be read off any of the drives in the array. The
downsides are that you will have slightly higher write latency, since the data
needs to be written to both drives in the array, and that you only get a single
drive's usable capacity while needing two drives.
RAID 5 requires the use of at least three drives. It combines these disks to
protect data against loss of any one disk; the array's storage capacity is reduced
by one disk. It stripes data across multiple drives to increase performance. But it
also adds the aspect of redundancy by distributing parity information across the
disks.
RAID 6 is similar to RAID 5, but the parity data are written to two drives. The
use of additional parity enables the array to continue to function even if two
disks fail simultaneously. However, this extra protection comes at a cost. RAID
6 has a slower write performance than RAID 5.
The chances that two drives break down at the same moment are minimal.
However, if a drive in a RAID 5 system dies and is replaced by a new drive,
it takes a long time to rebuild the swapped drive. If another drive dies during
that time, you still lose all of your data. With RAID 6, the RAID array will
survive even that second failure.
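As an illustrative capacity comparison (drive sizes assumed): with four 1 TB
drives, RAID 0 gives roughly 4 TB of usable space with no redundancy, RAID 5
gives about 3 TB because one drive's worth of space is used for parity, and
RAID 6 gives about 2 TB because two drives' worth of space hold the double
parity; a simple two-drive 1 TB RAID 1 mirror gives only 1 TB of usable space.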