
TULSIRAMJI GAIKWAD-PATIL College of Engineering and Technology

Wardha Road, Nagpur - 441108


Accredited with NAAC A+ Grade
Approved by AICTE, New Delhi, Govt. of Maharashtra
(An Autonomous Institute Affiliated to RTM Nagpur University)
Department of Information Technology
Session 2023-2024

Semester: Vth Subject: Distributed Operating System

Unit 3

Clock Synchronization

In the world of distributed computing, where multiple systems collaborate to accomplish tasks, ensuring
that all the clocks are synchronized plays a crucial role. Clock synchronization involves aligning the
clocks of computers or nodes, which enables efficient data transfer, smooth communication, and
coordinated task execution. This section explores the importance of clock synchronization in distributed
systems, discusses the challenges it addresses, and delves into the approaches used to achieve
synchronization.
The Crucial Role of Clock Synchronization: Bridging Time Gaps
Clock synchronization in distributed systems aims to establish a common reference for time across nodes.
Imagine a scenario where three distinct systems are part of a distributed environment. For data exchange
and coordinated operations to take place, it is essential that these systems have a shared understanding
of time. Achieving clock synchronization ensures that data flows seamlessly between them, tasks are
executed coherently, and communication happens without any ambiguity.
Challenges in Distributed Systems
Clock synchronization in distributed systems introduces complexities compared to centralized systems
because it must be achieved with distributed algorithms. Some notable challenges include:
 Information Dispersion: Distributed systems store information across many machines. Gathering and
harmonizing this information to achieve synchronization presents a challenge.
 Local Decision Realm: Distributed systems rely on localized data for making decisions. As a result,
synchronization decisions must be made with the partial information available at each node, which makes
the process more complex.
 Mitigating Failures: In a distributed environment it becomes crucial to prevent a failure in one node
from disrupting synchronization across the rest of the system.
 Temporal Uncertainty: The existence of independent clocks in distributed systems creates the potential
for time variations (clock drift).

Types of Clock Synchronization
 Physical clock synchronization
 Logical clock synchronization
 Mutual exclusion synchronization
1. Physical clock synchronization
In a distributed system each node operates with its own clock, which can lead to time differences.
The goal of physical clock synchronization is to overcome this challenge. This involves
equipping each node with a clock that is adjusted to match Coordinated Universal Time (UTC), a widely
recognized standard. By synchronizing their clocks in this way, diverse systems across the distributed
landscape can maintain harmony.
 Addressing Time Disparities: Each node in a distributed system operates with its own clock, which can
result in variations. The goal of physical clock synchronization is to minimize these disparities by
aligning the clocks.
 Using UTC as a Common Reference Point: The key to achieving this synchronization lies in adjusting
the clocks to adhere to an accepted standard known as Coordinated Universal Time (UTC). UTC offers a
common reference for all nodes.
2. Logical clock synchronization
Within the tapestry of distributed systems, absolute time often takes a backseat. Think of logical clocks
as storytellers that prioritize the order of events rather than their exact timing. These clocks enable
the establishment of causal connections between events, like weaving threads of cause and effect. By
bringing order and structure into play, task coordination within distributed systems becomes akin to a
choreographed dance where steps are sequenced for execution.
 Event Order Over Absolute Time: In distributed systems, logical clock synchronization focuses on
establishing the order of events rather than relying on absolute time. Its primary objective is to
establish causal connections between events.
 Approach towards Understanding Behavior: Logical clocks serve as storytellers, weaving together a
narrative of events. This narrative enhances comprehension and facilitates coordination within the
distributed system.
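To make the idea of event ordering concrete, here is a minimal sketch of a Lamport logical clock in Python; the class and method names are illustrative and not part of any standard library.

# A minimal sketch of a Lamport logical clock: a per-process counter that
# is advanced on local events and merged with message timestamps.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter.
        self.time += 1
        return self.time

    def send(self):
        # Attach the current timestamp to an outgoing message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # On receipt, move past both the local and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

# Example: a local event on A, then a message from A to B.
a, b = LamportClock(), LamportClock()
a.tick()        # A: 1
t = a.send()    # A: 2, message carries timestamp 2
b.receive(t)    # B: max(0, 2) + 1 = 3, so "send" is ordered before "receive"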
3. Mutual exclusion synchronization
In the bustling symphony of distributed systems, one major challenge is managing shared resources.
Imagine multiple processes competing for access to the same resource simultaneously. To address this
issue, mutual exclusion synchronization comes into play as a technique that reduces chaos and promotes
resource harmony. This approach relies on creating a system where different processes take turns
accessing shared resources. This helps avoid conflicts and collisions, akin to synchronized swimmers
gracefully performing in a water ballet. It ensures that resources are used efficiently and without any
conflicts.
 Managing Resource Conflicts: In the ecosystem of distributed systems, multiple processes often compete
for shared resources simultaneously. To address this issue, mutual exclusion synchronization enforces a
mechanism that lets only one process access a resource at a time.
 Enhancing Efficiency through Sequential Access: This synchronization approach ensures that resources
are accessed sequentially, minimizing conflicts and collisions. By orchestrating access in this manner,
resource utilization and overall system efficiency are optimized.
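As a rough illustration of this turn-taking, the following Python sketch assumes a centralized coordinator that queues requests and grants the resource to one process at a time; in a real distributed system the calls would be messages exchanged between nodes, and the names used here are purely illustrative.

# A minimal sketch of centralized mutual exclusion: one coordinator grants
# turns so processes access the shared resource strictly one at a time.
from collections import deque

class Coordinator:
    def __init__(self):
        self.queue = deque()      # processes waiting for the resource
        self.holder = None        # process currently holding the resource

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return True           # grant immediately
        self.queue.append(pid)
        return False              # caller must wait for a later grant

    def release(self, pid):
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder        # next process granted access, if any

c = Coordinator()
print(c.request("P1"))   # True  -> P1 enters the critical section
print(c.request("P2"))   # False -> P2 is queued
print(c.release("P1"))   # "P2"  -> P2 is granted access next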

Clock Synchronization Algorithms

Client processes synchronise time with a time server using Cristian's Algorithm, a clock synchronisation
algorithm. Distributed systems and applications that require redundancy do not work well with this
algorithm, but it suits low-latency networks where the round trip time is short relative to the required
accuracy. The time interval between the beginning of a Request and the conclusion of the corresponding
Response is referred to as Round Trip Time in this context.

The steps of Cristian's algorithm are given below, followed by an illustrative sketch:

Algorithm:

1. The process on the client machine sends the clock server a request at time T0 for the clock time
(the time at the server).
2. In response to the client process's request, the clock server listens and responds with the clock
server time.
3. The client process retrieves the response from the clock server at time T1 and uses the formula
below to determine the synchronised client clock time.

TCLIENT = TSERVER + (T1 − T0)/2

where TCLIENT denotes the synchronised clock time, TSERVER denotes the clock time returned by the server,
T0 denotes the time at which the client process sent the request, and T1 denotes the time at which the
client process received the response.

The formula above is valid and reliable for the following reason:

Assuming that the network latency from client to server and from server to client is roughly equal,
T1 − T0 denotes the total amount of time required by the network and the server to transfer the request to
the server, process it, and return the result to the client process.

The difference between the client's clock and the actual time is no more than (T1 − T0)/2 seconds. It
follows that the synchronisation error can be at most (T1 − T0)/2 seconds.

Hence,

Error ∈ [−(T1 − T0)/2, +(T1 − T0)/2]
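A small Python sketch of the algorithm is given below. It assumes a hypothetical UDP time server at timeserver.example:9000 that replies with its clock time as a plain decimal string; the address and message format are illustrative assumptions, not a real service.

# A minimal sketch of Cristian's algorithm over UDP.
import socket
import time

def cristian_sync(server_addr=("timeserver.example", 9000), timeout=1.0):
    t0 = time.time()                              # T0: request sent
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(b"TIME", server_addr)
        data, _ = s.recvfrom(64)                  # server replies with its clock time
    t1 = time.time()                              # T1: response received
    t_server = float(data.decode())
    # Synchronised client time and maximum synchronisation error, as in the formula above.
    t_client = t_server + (t1 - t0) / 2
    max_error = (t1 - t0) / 2
    return t_client, max_error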

Election Algorithms: Election algorithms choose a process from a group of processes to act as a
coordinator. If the coordinator process crashes for some reason, then a new coordinator is elected on
another processor. An election algorithm basically determines where a new copy of the coordinator should
be restarted. It assumes that every active process in the system has a unique priority number. The
process with the highest priority will be chosen as the new coordinator. Hence, when a coordinator fails,
the algorithm elects the active process that has the highest priority number. This number is then sent to
every active process in the distributed system. We have two election algorithms for two different
configurations of a distributed system.
1. The Bully Algorithm – This algorithm applies to systems where every process can send a message to
every other process in the system. Algorithm – Suppose process P sends a message to the coordinator.
1. If the coordinator does not respond within a time interval T, then it is assumed that the coordinator
has failed.
2. Process P then sends an election message to every process with a higher priority number.
3. It waits for responses; if no one responds within time interval T, then process P elects itself as the
coordinator.
4. It then sends a message to all processes with lower priority numbers announcing that it has been
elected as their new coordinator.
5. However, if an answer is received within time T from some other process Q,
 (I) Process P waits for a further time interval T’ to receive a message from Q announcing that it has
been elected as coordinator.
 (II) If Q does not respond within time interval T’, then it is assumed to have failed and the algorithm
is restarted.
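The following Python sketch simulates the Bully algorithm within a single program: message passing is replaced by direct calls on a dictionary of process objects, and the timeouts and answer messages of the full algorithm are omitted for brevity, so it should be read as an illustration of the control flow rather than a complete implementation.

# A minimal, single-process simulation of the Bully algorithm.
class Process:
    def __init__(self, pid, all_procs):
        self.pid = pid
        self.all = all_procs          # pid -> Process
        self.alive = True
        self.coordinator = None

    def start_election(self):
        higher = [p for pid, p in self.all.items()
                  if pid > self.pid and p.alive]
        if not higher:
            self.become_coordinator()                 # no higher process is alive
        else:
            # A higher-priority process takes over and restarts the election.
            max(higher, key=lambda p: p.pid).start_election()

    def become_coordinator(self):
        for p in self.all.values():
            if p.alive:
                p.coordinator = self.pid              # announce to all active processes

procs = {}
for pid in (1, 2, 3, 4):
    procs[pid] = Process(pid, procs)
procs[4].alive = False        # the old coordinator (highest pid) has crashed
procs[1].start_election()     # P1 notices and starts an election
print(procs[2].coordinator)   # 3 -> the highest-priority alive process wins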
2. The Ring Algorithm – This algorithm applies to systems organized as a ring (logically or physically).
In this algorithm we assume that the links between processes are unidirectional and every process can
send messages only to the process on its right. The data structure that this algorithm uses is the active
list, a list that holds the priority numbers of all active processes in the system.
Algorithm –
1. If process P1 detects a coordinator failure, it creates a new active list which is initially empty. It
sends an election message to its neighbour on the right and adds the number 1 to its active list.
2. If process P2 receives the election message from the process on its left, it responds in one of three
ways:
 (I) If the received message does not contain P2's number in the active list, then P2 adds 2 to the
active list and forwards the message.
 (II) If this is the first election message it has received or sent, P2 creates a new active list with
the numbers 1 and 2. It then sends election message 1 followed by election message 2.
 (III) If process P1 receives its own election message 1, then the active list for P1 now contains the
numbers of all the active processes in the system. Process P1 then detects the highest priority
number in the list and elects that process as the new coordinator.
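A minimal Python sketch of the ring election is given below; it assumes every process in the ring is active, so the election message simply collects each priority number once before returning to the initiator. The function and variable names are illustrative.

# A minimal sketch of the Ring election: the election message carries the
# active list around the ring; when the initiator sees its own message
# again, the highest number in the list becomes the coordinator.
def ring_election(ring, initiator):
    """ring: list of pids in ring order; initiator: pid that detects the failure."""
    n = len(ring)
    start = ring.index(initiator)
    active_list = [initiator]                 # initiator adds its own number
    i = (start + 1) % n
    while ring[i] != initiator:               # message travels around the ring
        active_list.append(ring[i])           # each process appends its number
        i = (i + 1) % n
    # The message is back at the initiator: elect the highest priority number.
    return max(active_list)

print(ring_election([3, 5, 1, 4], initiator=1))   # -> 5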

Distributed Deadlock

Distributed deadlocks can occur when distributed transactions or concurrency control are used in
distributed systems. They may be detected via a distributed technique like edge chasing or by constructing
a global wait-for graph (WFG) from local wait-for graphs at a deadlock detector. Phantom deadlocks are
deadlocks that are reported in a distributed system but do not actually exist, owing to internal system
delays.

In a distributed system, deadlock can neither be prevented nor avoided because the system is too vast. As
a result, only deadlock detection is practical. Distributed system deadlock detection techniques must
satisfy the following requirements:

1. Progress

The method must detect all the deadlocks in the system; no deadlock should remain undetected.


2. Safety

The method must not report deadlocks that do not exist (no phantom or false deadlocks).

Approaches to detect deadlock in the distributed system

Various approaches to detect the deadlock in the distributed system are as follows:

1. Centralized Approach

In the centralized method, only one node is responsible for detecting deadlock, and it is simple and easy
to implement. Still, the disadvantages include excessive workload on that single node and a single point
of failure (i.e., the entire system is dependent on one node, and if that node fails, the entire system
crashes), making the system less reliable.

2. Hierarchical Approach

In a distributed system, this approach integrates both the centralized and the distributed approaches to
deadlock detection. In this strategy, a single node handles a selected set or cluster of nodes for which
it is in charge of deadlock detection.

3. Distributed Approach

In the distributed technique, several nodes work together to detect deadlocks. There is no single point of
failure, as the workload is equally spread among all nodes. It also helps to increase the speed of
deadlock detection.

Deadlock Handling Strategies

Various deadlock handling strategies in the distributed system are as follows:

1. There are mainly three approaches to handling deadlocks: deadlock prevention, deadlock
avoidance, and deadlock detection.

2. Handling deadlock becomes more complex in distributed systems, since no site has complete
knowledge of the system's present state and every inter-site communication involves a finite but
unpredictable delay.
3. The operating system uses the deadlock avoidance method to determine whether the system is in a safe
or unsafe state. For this, each process must inform the operating system in advance of the maximum
number of resources it may request in order to complete its execution.

4. Deadlock prevention is commonly accomplished by having a process acquire all of the essential
resources at the same time before starting execution, or by preempting a process that already holds
the resource.
5. In distributed systems, these two methods are highly inefficient and impractical.
6. Detecting deadlock requires an examination of the status of process-resource interactions for the
presence of a cyclic wait.

7. Deadlock detection therefore appears to be the best way of dealing with deadlocks in distributed systems.

Issues of Deadlock Detection

Various issues of deadlock detection in the distributed system are as follows:

1. Deadlock handling based on deadlock detection requires addressing two fundamental issues: first,
detecting existing deadlocks, and second, resolving detected deadlocks.
2. Detecting deadlocks entails tackling two issues: maintaining the WFG and searching the WFG for the
presence of cycles.
3. In a distributed system, a cycle may span multiple sites, so the search for cycles depends greatly on
how the system's WFG is represented across the sites.
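As an illustration of WFG maintenance and cycle search, the following Python sketch merges hypothetical per-site local WFGs into one global graph and runs a depth-first search for a cycle; the process and site names are made up for the example.

# A minimal sketch of searching a global wait-for graph (WFG) for a cycle.
def merge_wfgs(local_wfgs):
    # Merge the local WFGs (one dict per site) into one global edge set.
    global_wfg = {}
    for wfg in local_wfgs:
        for proc, waits_for in wfg.items():
            global_wfg.setdefault(proc, set()).update(waits_for)
    return global_wfg

def has_cycle(wfg):
    # Depth-first search; a back edge to a "grey" node means a cycle (deadlock).
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {p: WHITE for p in wfg}
    def dfs(p):
        colour[p] = GREY
        for q in wfg.get(p, ()):
            if colour.get(q, WHITE) == GREY:
                return True
            if colour.get(q, WHITE) == WHITE and dfs(q):
                return True
        colour[p] = BLACK
        return False
    return any(colour[p] == WHITE and dfs(p) for p in list(wfg))

site1 = {"P1": {"P2"}}                      # at site 1, P1 waits for P2
site2 = {"P2": {"P3"}, "P3": {"P1"}}        # at site 2, P2 waits for P3, P3 for P1
print(has_cycle(merge_wfgs([site1, site2])))   # True -> distributed deadlock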

Resolution of Deadlock Detection

Various resolutions of deadlock detection in the distributed system are as follows:

1. Deadlock resolution involves breaking the existing wait-for dependencies in the system WFG.
2. It involves rolling back one or more deadlocked processes and giving their resources to the blocked
processes in the deadlock so that they may resume execution.

Deadlock detection algorithms in Distributed System

Various deadlock detection algorithms in the distributed system are as follows:

1. Path-Pushing Algorithms
2. Edge-chasing Algorithms
3. Diffusing Computations Based Algorithms
4. Global State Detection Based Algorithms

Path-Pushing Algorithms
Path-pushing algorithms detect distributed deadlocks by maintaining an explicit global WFG. The main
idea is to build a global WFG at each site of the distributed system. When a site in this class of
algorithms performs a deadlock computation, it sends its local WFG to all neighbouring sites. The term
path-pushing algorithm comes from this passing around of the paths of the global WFG.

Edge-Chasing Algorithms

An edge-chasing method verifies the existence of a cycle in a distributed graph structure by sending
special messages called probes along the graph's edges. These probe messages are distinct from request and
response messages. If a site receives back a probe that it previously transmitted, it can conclude that a
cycle, and hence a deadlock, exists.
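A simplified sketch of the probe idea is shown below: a probe is pushed along wait-for edges, and a deadlock is declared if it returns to the process that initiated it. The wait-for relation is given directly as a dictionary here, whereas in a real system it would be spread across sites and carried in probe messages.

# A minimal sketch of edge-chasing (probe-based) deadlock detection.
def edge_chase(wait_for, initiator):
    """wait_for: dict pid -> set of pids it is blocked on."""
    seen = set()
    frontier = [initiator]
    while frontier:
        p = frontier.pop()
        for q in wait_for.get(p, ()):       # forward the probe along each wait-for edge
            if q == initiator:
                return True                 # probe came back to its initiator -> deadlock
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return False

wait_for = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
print(edge_chase(wait_for, "P1"))    # True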

Diffusing Computations Based Algorithms

In this class of algorithms, the deadlock detection computation is diffused through the system's WFG.
These techniques use echo algorithms to detect deadlocks, and this computation is superimposed on the
underlying distributed computation. If this computation terminates, the initiator reports a deadlock.

Global State Detection Based Algorithms

Deadlock detection algorithms based on global state detection take advantage of the following facts:

1. A consistent snapshot of a distributed system may be taken without freezing the underlying
computation.
2. If a stable property exists in the system before the snapshot collection starts, it will be preserved.

What is Thrash?

In computer science, thrashing is the poor performance of a virtual memory (or paging) system that occurs
when the same pages are loaded repeatedly because there is not enough main memory to keep them resident.
Depending on the configuration and algorithm, the actual throughput of a system can degrade by multiple
orders of magnitude.

In computer science, thrashing occurs when a computer's virtual memory resources are overused, leading
to a constant state of paging and page faults, inhibiting most application-level processing. It causes the
performance of the computer to degrade or collapse. The situation can continue indefinitely until the user
closes some running applications or the active processes free up additional virtual memory resources.
To know more clearly about thrashing, first, we need to know about page fault and swapping.

o Page fault: Every program is divided into pages. A page fault occurs when a program attempts to
access data or code in its address space that is not currently located in system RAM.
o Swapping: Whenever a page fault happens, the operating system tries to fetch that page from
secondary memory and swap it with one of the pages in RAM. This process is called swapping.

Thrashing occurs when page faults and swapping happen very frequently, so that the operating system has to
spend most of its time swapping pages rather than doing useful work. This state of the operating system is
known as thrashing. Because of thrashing, CPU utilization is reduced to a very low or negligible level.

The basic concept involved is that if a process is allocated too few frames, then there will be too many and
too frequent page faults. As a result, no valuable work would be done by the CPU, and the CPU utilization
would fall drastically.

The long-term scheduler would then try to improve the CPU utilization by loading some more processes
into the memory, thereby increasing the degree of multiprogramming. Unfortunately, this would result in a
further decrease in the CPU utilization, triggering a chained reaction of higher page faults followed by an
increase in the degree of multiprogramming, called thrashing.

Algorithms during Thrashing

Whenever thrashing starts, the operating system tries to apply either the Global page replacement
Algorithm or the Local page replacement algorithm.
1. Global Page Replacement

Since global page replacement can bring any page, it tries to bring more pages whenever thrashing is
found. But what actually will happen is that no process gets enough frames, and as a result, the thrashing
will increase more and more. Therefore, the global page replacement algorithm is not suitable when
thrashing happens.

2. Local Page Replacement

Unlike the global page replacement algorithm, local page replacement selects only pages that belong to
that process. So there is a chance of reducing the thrashing. But it has been shown that there are many
disadvantages to local page replacement. Therefore, local page replacement is just an alternative to
global page replacement in a thrashing scenario.

Causes of Thrashing

Programs or workloads may cause thrashing, and it results in severe performance problems, such as:

o If CPU utilization is too low, the operating system increases the degree of multiprogramming by
introducing a new process into the system. A global page replacement algorithm is used. The CPU
scheduler sees the decreasing CPU utilization and increases the degree of multiprogramming.

o If CPU utilization is plotted against the degree of multiprogramming, then initially, as the degree of
multiprogramming increases, CPU utilization also increases.
o If the degree of multiprogramming is increased further, thrashing sets in, and CPU utilization drops
sharply.
o So, at this point, to increase CPU utilization and to stop thrashing, we must decrease the degree of
multiprogramming.

How to Eliminate Thrashing

Thrashing has some negative impacts on hard drive health and system performance. Therefore, it is
necessary to take some actions to avoid it. To resolve the problem of thrashing, here are the following
methods, such as:

o Adjust the swap file size: If the system swap file is not configured correctly, disk thrashing can
also occur.
o Increase the amount of RAM: As insufficient memory can cause disk thrashing, one solution is to add
more RAM to the machine. With more memory, the computer can handle tasks easily and does not have to
swap excessively. Generally, this is the best long-term solution.
o Decrease the number of applications running on the computer: If too many applications are running in
the background, they consume a large share of system resources, and the little that remains may be
too slow to avoid thrashing. Closing some applications releases resources, so thrashing can be
avoided to some extent.
o Replace programs: Replace memory-heavy programs with equivalents that use less memory.

Techniques to Prevent Thrashing

The Local Page replacement is better than the Global Page replacement, but local page replacement has
many disadvantages, so it is sometimes not helpful. Therefore below are some other techniques that are
used to handle thrashing:

1. Locality Model

A locality is a set of pages that are actively used together. The locality model states that as a process
executes, it moves from one locality to another. Thus, a program is generally composed of several
different localities which may overlap.

For example, when a function is called, it defines a new locality where memory references are made to the
function call instructions, local and global variables, etc. Similarly, when the function is exited, the
process leaves this locality.

2. Working-Set Model

This model is based on the above-stated concept of the Locality Model.

The basic principle states that if we allocate enough frames to a process to accommodate its current
locality, it will fault only when it moves to some new locality. But if the allocated frames are fewer
than the size of the current locality, the process is bound to thrash.

According to this model, based on parameter A, the working set is defined as the set of pages in the most
recent 'A' page references. Hence, all the actively used pages would always end up being a part of the
working set.

The accuracy of the working set is dependent on the value of parameter A. If A is too large, then working
sets may overlap. On the other hand, for smaller values of A, the locality might not be covered entirely.

If D is the total demand for frames and WSSi is the working set size for process i, then

D = Σ WSSi
Now, if 'm' is the number of frames available in the memory, there are two possibilities:

o D>m, i.e., total demand exceeds the number of frames, then thrashing will occur as some processes
would not get enough frames.
o D<=m, then there would be no thrashing.

If there are enough extra frames, then some more processes can be loaded into the memory. On the other
hand, if the summation of working set sizes exceeds the frames' availability, some of the processes have to
be suspended (swapped out of memory).

This technique prevents thrashing along with ensuring the highest degree of multiprogramming possible.
Thus, it optimizes CPU utilization.
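The admission rule above can be sketched as a small Python check that computes each working set over the last A page references, sums the sizes to obtain D, and compares D with the available frames m; the reference strings and numbers below are purely illustrative.

# A minimal sketch of the working-set check: D = sum of WSS_i compared with m.
def working_set(refs, window_a):
    """Working set = distinct pages in the most recent `window_a` references."""
    return set(refs[-window_a:])

def check_thrashing(ref_strings, window_a, frames_m):
    wss = {pid: len(working_set(r, window_a)) for pid, r in ref_strings.items()}
    demand_d = sum(wss.values())
    return demand_d, demand_d > frames_m     # True -> thrashing likely, suspend a process

refs = {"P1": [1, 2, 3, 2, 1, 4], "P2": [7, 7, 8, 9, 8, 7]}
print(check_thrashing(refs, window_a=4, frames_m=5))   # (7, True): demand exceeds the frames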

3. Page Fault Frequency

A more direct approach to handle thrashing is the one that uses the Page-Fault Frequency concept.

The problem associated with thrashing is the high page fault rate, and thus, the concept here is to control
the page fault rate.

If the page fault rate is too high, it indicates that the process has too few frames allocated to it. On the
contrary, a low page fault rate indicates that the process has too many frames.

Upper and lower limits can be established on the desired page fault rate. If the page fault rate falls
below the lower limit, frames can be removed from the process. Similarly, if the page fault rate exceeds
the upper limit, more frames can be allocated to the process.

In other words, the state of the system should be kept within the region bounded by these two limits.

If the page fault rate is high and there are no free frames, some of the processes can be suspended and
the frames allocated to them can be reallocated to other processes. The suspended processes can be
restarted later.
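A minimal sketch of this page-fault-frequency control loop is given below, with purely illustrative thresholds: a process whose fault rate crosses the upper limit gains a frame (or becomes a candidate for suspension if none are free), while one below the lower limit gives a frame back.

# A minimal sketch of page-fault-frequency (PFF) control.
def adjust_frames(fault_rate, frames, upper=10.0, lower=2.0, free_frames=0):
    if fault_rate > upper:
        if free_frames > 0:
            return frames + 1, free_frames - 1      # give the process one more frame
        return frames, free_frames                  # no free frames: candidate for suspension
    if fault_rate < lower and frames > 1:
        return frames - 1, free_frames + 1          # reclaim a frame from the process
    return frames, free_frames

print(adjust_frames(fault_rate=15.0, frames=4, free_frames=2))   # (5, 1)
print(adjust_frames(fault_rate=1.0,  frames=4, free_frames=1))   # (3, 2)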

Heterogeneous DSM
Data compatibility & conversion:
Data compatibility and conversion is the first design concern in a heterogeneous DSM system.
Machines with different architectures may use different byte orderings and floating-point
representations. Data that is sent from one machine to another must be converted to the destination
machine's format. The data transmission unit (block) must be transformed according to the data type of
its contents. As a result, application programmers must be involved, because they are familiar with the
memory layout. In heterogeneous DSM systems, data conversion can be accomplished by organizing the
system as a collection of source language objects or by allowing only one type of data block.

 DSM as a collection of source language objects:


The DSM is structured as a collection of source language objects, according to the first technique of
data conversion. The unit of data migration in this case is either a shared variable or an object.
Conversion routines can be used directly by the compiler to translate between different machine
architectures. The DSM system checks whether the requesting node and the node that has the object
are compatible before accessing remote objects or variables. If the nodes are incompatible, it invokes
a conversion routine, translates, and migrates the shared variable or object.
This approach is employed in the Agora shared memory system, and while it is handy for data
conversion, it has low performance. The objects of programming languages are scalars, arrays, and
structures. Each of them requires access rights, and migration incurs communication overhead.
Due to the limited packet size of transport protocols, access to big arrays may result in false sharing
and thrashing, while migration would entail fragmentation and reassembly.
 DSM as one type of data block:
Only one type of data block is allowed in the second data conversion procedure. The Mermaid DSM uses
this approach, with a page size equal to the block size. Additional information is kept in the
page table entry, such as the type of data stored in the page and the amount of data allocated to
the page. The system converts the page to the appropriate format when there is a page fault or an
incompatibility.
There are a few drawbacks to this method. Because a block contains only one sort of data,
fragmentation might waste memory. Compilers on the heterogeneous machines must also be consistent
in terms of data type sizes and field order within compound structures in the code they generate.
Even though only a small piece of the page may be accessed, the complete page is converted and sent.
Because users must describe the conversion routine for each user-specified data type, as well as the
mapping of the data type to the conversion routine, transparency is reduced. Finally, if data is
converted too frequently, the accuracy of floating point numbers may suffer.
Block size selection :
The choice of block size is another difficulty in creating heterogeneous DSM systems. In a DSM system,
heterogeneous machines can have varying virtual memory page sizes. As a result, any of the following
algorithms can be used to choose the right block size :

 Largest page size: The DSM block size is the largest virtual memory page size among all machines
in this technique, as the name implies. Multiple virtual memory pages can fit within a single DSM
block, since the page size is always a power of two. In the event of a page fault on a node with a
smaller page size, a block containing multiple pages, including the required page, is received.
False sharing and thrashing are common problems that occur frequently because of the larger block size.
 Smallest page size: The DSM block size is selected as the smallest virtual memory page size
available across all machines. Multiple blocks will be sent when a page fault occurs on a node with
a larger page size. This approach decreases data contention but introduces additional block table
management overhead due to the greater amount of communication.
 Intermediate page size: Given the aforementioned two procedures, the optimum option is to choose
a block size that falls between the machines’ largest and smallest virtual memory page sizes. This
method is used to balance the issues that come with large and small blocks.
Based on how data caching is managed, there are three approaches for designing a DSM system :
1. DSM managed by the OS
2. DSM managed by the MMU hardware
3. DSM managed by the language runtime system.
1. DSM managed by the OS : This category of data cache management by the OS includes distributed
operating systems like Ivy and Mirage. Each node has its own memory, and a page fault sends a trap
to the operating system, which then exchanges messages to locate and fetch the required block. The
operating system is in charge of data placement and migration.
2. DSM managed by the MMU hardware : Memory caching is managed by the MMU hardware and cache in this
technique. To keep the caches consistent, snooping bus protocols are utilized. DEC Firefly, for
example, uses memory management hardware. Directories can be used to keep track of the nodes on
which data blocks are located. The MMU locates and transfers the requested page in the event of a
page fault.
3. DSM managed by the language runtime system : The DSM is organized as a set of programming
language constructs, such as shared variables, data structures, and shared objects, in this system. The
programming language and the operating system handle the placement and migration of these shared
variables or objects at runtime. Features to indicate the data usage pattern can be added to the
programming language. Such a system can support several consistency protocols and can apply them at
the granularity of individual data items. However, this technique imposes an extra burden on the
programmer. Examples of such systems are Munin and Midway, which use shared variables, while Orca and
Linda use shared objects.
Advantages of DSM:
 When a block is moved, it takes advantage of locality of reference.
 Passing by reference and passing complex data structures are made easier with a single address
space.
 There is no memory access bottleneck, because there is no single bus.
 Because DSM programs have a similar programming interface, they are portable.
 The virtual memory space is quite large.
Difference between a Homogeneous DSM & Heterogeneous DSM:
When distributed application components share memory directly through DSM, they are more tightly
coupled than when data is shared through message passing. As a result, extending a DSM to a
heterogeneous system environment is difficult.

The performance of a homogeneous DSM is slightly better than that of a heterogeneous DSM. Despite a
number of challenges in data conversion, heterogeneous DSM can be accomplished with functional and
performance transparency that is comparable to homogeneous DSM.

Resource management in an operating system is the process of managing all the resources, such as the
CPU, memory, input/output devices, and other hardware, efficiently among the various programs and
processes running on the computer.
Resource management is important because the resources of a computer are limited and multiple processes
or users may require access to the same resources, such as the CPU and memory, at the same time. The
operating system has to manage the resources and ensure that all processes get the resources they need
to execute, without problems such as deadlocks.

Load Balancing

Load balancing is the practice of spreading the workload across distributed system nodes in order to
optimize resource efficiency and task response time while avoiding a situation in which some nodes are
substantially loaded while others are idle or performing little work.
Load Sharing
Load balancing solutions are designed to establish a dispersed network in which requests are evenly
spread across several servers. Load sharing, on the other hand, includes sending a portion of the traffic
to one server and the rest to another.

Differences between Load Balancing and Load Sharing

Load Balancing vs Load Sharing

1. Load Balancing: Load balancing equally distributes network traffic or load across different channels
and can be achieved using both static and dynamic load balancing techniques.
   Load Sharing: Load sharing delivers a portion of the traffic or load to one connection in the network
while the remainder is routed through other channels.

2. Load Balancing: Focuses on the notion of traffic dispersion across connections.
   Load Sharing: Works with the notion of traffic splitting across connections.

3. Load Balancing: Ratios, least connections, fastest, round robin, and observed approaches are used in
load balancing.
   Load Sharing: Load sharing is based on the notion of sharing traffic or network load among connections
based on destination IP or MAC address selections.

4. Load Balancing: It is bi-directional.
   Load Sharing: It is uni-directional.

5. Load Balancing: No instance is load sharing.
   Load Sharing: All instances are load sharing.

6. Load Balancing: Accurate load balancing is not an easy task.
   Load Sharing: Load sharing is easy compared with load balancing.

Load sharing is a more specific phrase that refers to the distribution of traffic across different routes,
even if in an uneven manner. If you compare two traffic graphs, the two graphs should be almost the
same with load balancing. However, they may be comparable with load sharing, but the traffic flow
pattern will be different.

Here are some Terminologies related to the resource management in OS:


 Resource Allocation: This term refers to the process of assigning the available resources to
processes in the operating system. This can be done dynamically or statically.
 Resource: A resource is anything that can be assigned dynamically or statically in the operating
system. Examples include CPU time, memory, disk space, and network bandwidth.
 Resource Management: It refers to how resources are managed efficiently between different
processes.
 Process: A process is any program or application that is being executed in the operating system
and has its own memory space, execution state, and set of system resources.
 Scheduling: It is the process of determining which of several processes should be allocated a
particular resource at a given time.
 Deadlock: When two or more processes are each waiting for resources that are held by other
processes in the group, no resource is ever freed and no process can proceed. This situation is
called deadlock.
 Semaphore: It is a method or tool used to prevent race conditions. A semaphore is an integer
variable used in a mutually exclusive manner by various concurrent cooperating processes in order to
achieve synchronization (a small sketch is given after this list).
 Mutual Exclusion: It is the technique of preventing multiple processes from accessing the same
resource simultaneously.
 Memory Management: Memory management is the method used in operating systems to manage operations
between main memory and disk during process execution.
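As a small illustration of the semaphore mentioned above, the following Python sketch uses threading.Semaphore to let at most two workers touch a shared resource at a time; the worker names and the limit of two are arbitrary choices for the example.

# A minimal sketch of semaphore-based synchronization.
import threading
import time

sem = threading.Semaphore(2)      # counting semaphore initialised to 2

def worker(name):
    with sem:                     # wait (P) on entry, signal (V) on exit
        print(f"{name} using the shared resource")
        time.sleep(0.1)

threads = [threading.Thread(target=worker, args=(f"P{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()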
Features or characteristics of the Resource management of operating system:
 Resource scheduling: The OS allocates the available resources to processes. It decides the sequence
in which processes will get access to the CPU, memory, and other resources at any given time.
 Resource Monitoring: The operating system monitors which resources are used by which process and
takes action if any process holds too many resources at the same time, which could lead to deadlock.
 Resource Protection: The OS protects the system from unauthorized access by users or other
processes.
 Resource Sharing: The operating system permits many processes to share resources such as memory and
I/O devices. It guarantees that common resources are utilized in a fair and productive way.
 Deadlock prevention: The OS prevents deadlock and also ensures that no process holds resources
indefinitely. For that it uses techniques like resource preemption.
 Resource accounting: The operating system tracks the use of resources by different processes for
allocation and statistical purposes.
 Performance optimization: The OS optimizes resource distribution in order to increase system
performance. For that, techniques like load balancing and memory management are used to ensure
efficient resource distribution.
Process Management: process Migration, Thread.

A process is essentially a program in execution. The execution of a process must advance in a
sequential fashion. A process is characterized as an entity that represents the basic unit of work to
be executed in the system.

Process migration is a particular type of process management by which processes are moved from one
computing environment to another.
There are two types of process migration:

 Non-preemptive process: If a process is moved before it begins execution on its source node, it is
known as non-preemptive process migration.
 Preemptive process: If a process is moved during its execution, it is known as preemptive process
migration. Preemptive process migration is more expensive than the non-preemptive kind because the
process's environment must accompany the process to its new node.
Why use Process Migration?
The reasons to use process migration are:

 Dynamic Load Balancing: It permits processes to take advantage of less loaded nodes by migrating from
overloaded ones.
 Accessibility: Processes that reside on faulty nodes can be moved to other, healthy nodes.
 System Administration: Processes that reside on a node undergoing system maintenance can be moved to
different nodes.
 The locality of data: Processes can exploit the locality of data or other special capabilities of a
specific node.
 Mobility: Processes can be migrated from a hand-held device or laptop to a server-based computer
before the device gets disconnected from the network.
 Recovery of faults: The mechanism to stop, transport and resume a process is valuable for supporting
fault recovery in transaction-based applications.
What are the steps involved in Process Migration?
The steps which are involved in migrating the process are:

 The process is chosen for migration.


 Choose the destination node for the process to be relocated.
 Move the chosen process to the destination node.
The sub-activities involved in migrating a process are:

 The process is halted on its source node and restarted on its destination node.
 The address space of the process is transferred from its source node to its destination node.
 Message forwarding is provided for the migrated process.
 The communication between cooperating processes that have been separated by the migration is
managed.

Methods of Process Migration


The methods of Process Migration are:

1. Homogeneous Process Migration: Homogeneous process migration means relocating a process in a
homogeneous environment where all systems have the same operating system and architecture.
There are two main strategies for performing process migration: i) user-level process migration and
ii) kernel-level process migration.
 User-level process migration: In this approach, process migration is managed without modifying
the operating system kernel. User-level migration implementations are simpler to create and maintain
but usually have two issues: i) they cannot access kernel state, and ii) they must cross the
kernel boundary using kernel requests, which are slow and expensive.
 Kernel level process migration: In this approach, process migration is accomplished by modifying the
operating system kernel. Accordingly, process migration becomes simpler and more efficient. This
facility permits migration to be done faster and allows more types of processes to be relocated.
Homogeneous Process Migration Algorithms:

There are five fundamental algorithms for homogeneous process migration:

 Total Copy Algorithm


 Pre-Copy Algorithm
 Demand Page Algorithm
 File Server Algorithm
 Freeze Free Algorithm
2. Heterogeneous Process Migration: Heterogeneous process migration is the relocation of a process
across different machine architectures and operating systems. Clearly, it is more complex than the
homogeneous case, since it must take the machine and operating system designs and attributes into
account, as well as transfer the same data as homogeneous process migration, including process state,
address space, files, and communication state.
Heterogeneous process migration is particularly appropriate in mobile environments, where it is almost
certain that the mobile unit and the base support station will be different machine types. It would be
attractive to relocate a process from the mobile unit to the base station, and vice versa, during a
computation. In most cases, this cannot be accomplished by homogeneous migration. There are
four essential types of heterogeneous migration:
 Passive object: The data is moved and must be translated.
 Active object, move when inactive: The process is relocated when it is not executing. The code exists
in both locations, and only the data is moved and translated.
 Active object, interpreted code: The process is executing through an interpreter, so only the data
and the interpreter state need to be moved.
 Active object, native code: Both code and data must be translated, as they are compiled for a
particular architecture.
