
UNIT-II

Distributed Operating
Systems

1
Introduction to distributed systems
• A distributed system is a loosely coupled architecture in which
processors are interconnected by a communication network.
• Distributed systems are multiprocessor systems, but with the
following differences:
 A distributed system typically works over a wide area network and
involves much more communication relative to computation.
 Each node in a distributed system is a complete computer with a
full set of peripherals, including memory and communication
hardware, and possibly a different operating system and file
system.
 The users of a distributed system have the impression that they
are working on a single machine.

2
Types of Distributed Operating Systems

• Distributed Operating Systems

• Network Operating Systems

3
Distributed Operating System

• Like a multiprocessor OS, but the nodes no longer have shared memory
– Provide message passing instead
– Can try to provide distributed shared memory
• But it is tough to get acceptable performance
4
Distributed-Operating Systems
• Users not aware of multiplicity of machines
– Access to remote resources similar to access to local
resources
• Because a distributed system may need to access any resource or
transfer any task to any node, the OS provides three types of
migration:
– Data Migration – transfer data by transferring entire file, or
transferring only those portions of the file necessary for the
immediate task
– Computation Migration – transfer the computation, rather than the
data, across the system
• via remote procedure calls (RPCs)
• via messaging system

5
Distributed-Operating Systems (Cont.)
• Process Migration – execute an entire process, or parts of it,
at different sites
– Load balancing – distribute processes across network to
even the workload
– Computation speedup – subprocesses can run
concurrently on different sites
– Hardware preference – process execution may require
specialized processor
– Software preference – required software may be
available at only a particular site
– Data access – run process remotely, rather than transfer
all data locally
• Consider the World Wide Web
6
Advantages of Distributed Systems
– Economy – Better price performance ratio
– Resource sharing
• Sharing and printing files at remote sites
• Processing information in a distributed database
• Using remote specialized hardware devices
– Computation speedup – load sharing or job migration results in higher
throughput and faster response time
– Reliability – detect and recover from site failures, transfer functions
to working sites, reintegrate recovered sites
– Communication – at geographically distant nodes
– Incremental Growth – new hardware or software can be added

7
Network Operating System

• OSes can be different (Windows or Linux)


• Typical services: rlogin, ftp
– A fairly primitive way to share files
8
Network-Operating Systems
• Users are aware of multiplicity of machines
• Access to resources of various machines is done explicitly by:
– Remote login to the appropriate remote machine
(telnet, ssh)
– Remote Desktop (Microsoft Windows)
– Transferring data from remote machines to local machines,
via the File Transfer Protocol (FTP) mechanism
• Users must change paradigms – establish a session, give
network-based commands
– More difficult for users

9
Network Operating System

• Can have one computer provide files transparently for others (NFS)

10
Network Operating System
• Accesses resources on remote computers that run
independent operating systems
• Not responsible for resource management at remote locations
• Distributed functions are explicit rather than transparent
– A user or process must explicitly specify the resource’s
location to retrieve a networked file or remotely execute
an application
• Lack of transparency in network OSs
– Disadvantage: Does not provide some of the benefits of
distributed OSs
– Advantage: Easier to implement than distributed OSs

11
Distributed-Operating Systems
• Manage resources located in multiple networked computers
• Employ many of the same communication methods, file
system structures and other protocols found in network
operating systems
• Transparent communication
– Objects in the system are unaware of the separate
computers that provide the service (unlike network
operating systems)
• Rare to find a “truly” distributed system because the high level
of transparency is difficult to achieve

12
Design Issues in Distributed Operating Systems

Transparency: the distributed system should appear as a conventional, centralized
system to the user

Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competing users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk

13
Design Issues in Distributed Operating System

 Fault tolerance – the distributed system should continue to function in the face of
failure
 Hardware, software and networks fail!

 Distributed systems must maintain availability (by redundancy) even at low levels of
hardware/software/network reliability.

 Performance – communication should be fast, so the number of messages needs to
be minimized. Performance also depends on the grain size of the computations
(fine-grained or coarse-grained).
 Load Balancing
 Data Migration
 Computation Migration
 Process Migration

14
Design Issues in Distributed Operating Systems

Scalability – as demands increase, the system should easily accept the addition of new
resources to accommodate the increased demand

Concept                  Example
Centralized services     A single server for all users
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information

• As distributed systems grow, centralized solutions are limited.
15
Design Issues in Distributed Operating Systems

Lack of Global Clock

• Communication delays are at the core of the problem
• Information may become out of date before it can be acted upon
• These create some fundamental problems:
• No global clock – can scheduling be based on a FIFO queue?
• No global state – what is the state of a task? What is a correct
program?

16
Communication in Distributed Operating System

• Inter-process communication in distributed systems is implemented in two ways:

 Message Passing Model

 Remote Procedure Call

17
Message Passing Protocol

18
Blocking versus Nonblocking Primitives

Blocking send primitive: the client runs, traps to the kernel, and the process is
blocked while the message is being sent; when the kernel returns and releases
the process, the client runs again.

19
Nonblocking send primitive

Nonblocking send primitive: the client runs, traps to the kernel, and is blocked
only briefly while the message is copied to a kernel buffer; it then resumes
running while the message is being sent.


20
Nonblocking primitives

• Advantage: the sender can continue execution without waiting.

• Disadvantage: the sender must not modify the message buffer
until the message has actually been sent, but it does not know
when the transfer completes, so it can hardly avoid touching
the buffer forever.
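As a rough illustration of these two semantics (ours, not from the slides), the Python sketch below uses a background thread as a stand-in for the kernel: the blocking send returns only after the transfer finishes, while the nonblocking send copies the message into a "kernel" buffer and returns at once. All names (kernel_buffer, transmit, ...) are invented for illustration.

# Minimal sketch contrasting blocking and nonblocking send semantics,
# with a thread standing in for the kernel/network.
import threading
import time

kernel_buffer = []                 # stand-in for the kernel's message buffer
buffer_lock = threading.Lock()

def transmit(message):
    time.sleep(0.1)                # pretend the network transfer takes time
    print("delivered:", message)

def blocking_send(message):
    # Caller is "blocked" until the transfer has completed.
    transmit(message)

def nonblocking_send(message):
    # The message is copied into a kernel buffer; a background thread does
    # the actual transfer, so the caller may immediately reuse its own buffer.
    with buffer_lock:
        kernel_buffer.append(bytes(message))   # copy, so the caller can touch its buffer
    threading.Thread(target=_drain, daemon=True).start()

def _drain():
    with buffer_lock:
        pending = kernel_buffer[:]
        kernel_buffer.clear()
    for m in pending:
        transmit(m)

if __name__ == "__main__":
    blocking_send(b"hello")        # returns only after "delivered: ..." prints
    nonblocking_send(b"world")     # returns immediately
    time.sleep(0.3)                # give the background transfer time to finish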

21
Solutions to the drawbacks of nonblocking
primitives

• Have the kernel copy the message to an internal kernel buffer and then
allow the process to continue.
Problem: the extra copy reduces system performance.
• Interrupt the sender when the message has been sent.
Problem: user-level interrupts make programming tricky,
difficult, and subject to race conditions.

22
Reliable versus Unreliable Primitives

• Unreliable: the system gives no guarantee that the message is
delivered.
• Reliable: the receiving machine sends an acknowledgement back;
only when this ack is received does the sending kernel release
the user (client) process.
• Alternatively, use the reply itself as the acknowledgement.

23
Some examples of packet exchanges for
client-server communication

Individually acknowledged messages: Client → REQ → Server, Server → ACK → Client,
Server → REP → Client, Client → ACK → Server.

Reply used as the acknowledgement: Client → REQ → Server, Server → REP → Client.
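A minimal sketch (ours, not from the slides) of the second exchange, where the reply doubles as the acknowledgement: the client retransmits the REQ on a timeout until the REP arrives. The port number and message format are arbitrary, and a tiny loopback UDP server stands in for the real server.

# Sketch of the REQ/REP pattern with the reply acting as the ack.
import socket
import threading

ADDR = ("127.0.0.1", 50007)        # arbitrary loopback address for this sketch

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(ADDR)
        data, peer = s.recvfrom(1024)      # receive REQ
        s.sendto(b"REP:" + data, peer)     # the reply acts as the acknowledgement

def client():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(0.5)                  # retransmission timeout
        while True:
            s.sendto(b"REQ:hello", ADDR)
            try:
                reply, _ = s.recvfrom(1024)
                print("got", reply)
                return
            except socket.timeout:
                continue                   # no reply yet: retransmit the REQ

if __name__ == "__main__":
    threading.Thread(target=server, daemon=True).start()
    client()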

24
Remote Procedure call
• Basic idea: To execute a procedure at a remote site and ship
the results back.
• Goal: To make this operation as distribution transparent
as possible (i.e., the remote procedure call should look like a
local one to the calling procedure).

25
Remote Procedure Call (RPC)

26
Steps of a Remote Procedure Call
1. Client procedure calls client stub in normal way
2. Client stub builds message, calls local OS
3. Client's OS sends message to remote OS
4. Remote OS gives message to server stub
5. Server stub unpacks parameters, calls server
6. Server does work, returns result to the stub
7. Server stub packs it in message, calls local OS
8. Server's OS sends message to client's OS
9. Client's OS gives message to client stub
10. Stub unpacks result, returns to client
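These ten steps are exactly what an RPC library hides behind what looks like a local call. As one hedged illustration, Python's standard-library xmlrpc module plays the roles of client stub and server stub; the procedure name ("add") and the port are arbitrary choices for this sketch.

# The client-side call looks local; steps 1-10 happen underneath.
import threading
import time
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def run_server():
    server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
    server.register_function(lambda a, b: a + b, "add")   # the "server" procedure
    server.serve_forever()

threading.Thread(target=run_server, daemon=True).start()
time.sleep(0.2)                                  # give the server time to start

proxy = ServerProxy("http://127.0.0.1:8000/")    # client stub
print(proxy.add(2, 3))                           # looks like a local call; prints 5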

27
Implementation of Remote Procedure Call

On the client machine, the client calls the client stub, which packs the parameters
into a message and hands it to the kernel; the kernel transports the message over
the network to the server machine's kernel, which passes it to the server stub.
The server stub unpacks the parameters and calls the server. On return, the server
stub packs the result, the message travels back over the network, and the client
stub unpacks the result and returns it to the client.

28
Passing Value Parameters

Steps involved in doing remote computation through RPC

29


Clock synchronization in distributed system
Synchronizing Logical Clocks
• Lamport defined a relation known as the happens-before relation.
This relation captures the underlying dependencies between events.
It is denoted x → y. The relation holds in two situations:
– If x and y are events in the same process and x occurs before y.
– If x is the sending of a message by one process and y is the receipt
of that message by another process; in this case x always precedes y.
• The happens-before relation is transitive, i.e.,
• If x → y and y → z, then x → z.

30
Clock synchronization in distributed system

• In distributed systems, the sequence of events is determined using
this relation.

• When one event changes the system state, it may affect causally
related events that happen after it. This influence among causally
related events satisfying the happens-before relation is known as
a causal effect.

31
Clock synchronization in distributed system
• If two events x and y satisfy the following conditions:

– x ↛ y, i.e. x → y does not hold.

– y ↛ x, i.e. y → x does not hold.

• then there is no happens-before relationship between x and y,
even a transitive one,

• and the two events are known as concurrent, i.e. it cannot be
decided which of the two events happened first.

32
Lamport’s Logical Clock

• To implement the logical clock, Lamport introduced the concept of
a timestamp attached to every event in the system. The timestamp
is a function that assigns a number Ti(x) to any event x in
process Pi. These timestamps have nothing to do with the real-time
clock. The timestamps must satisfy the following conditions:
• For any two events x and y in a process Pi, if x → y, then
Ti(x) < Ti(y)
• If x is the message-sending event in process Pi and y is the message-
receiving event in process Pj, then
Ti(x) < Tj(y)

33
Lamport’s Logical Clock
• The timestamp value T must always increase and never decrease.
Thus, in the implementation, the clock is incremented by a positive
integer between any two successive events, i.e.,
Ti(y) = Ti(x) + d, where d > 0
• Using the conditions above, we can assign a timestamp to every
event in the system and thereby order all events; breaking ties with
process identifiers yields a total ordering of the events.
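A minimal sketch of a Lamport logical clock (class and method names are ours). It uses d = 1 between successive events and, on receive, the usual max(local, received) + 1 update so that a send always carries a smaller timestamp than the corresponding receive.

# Lamport logical clock sketch.
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1                      # Ti(y) = Ti(x) + d, with d = 1
        return self.time

    def send_event(self):
        self.time += 1
        return self.time                    # timestamp attached to the message

    def receive_event(self, msg_timestamp):
        # Guarantees Ti(send) < Tj(receive) at the receiver.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

# Example: P1 sends a message to P2.
p1, p2 = LamportClock(), LamportClock()
ts = p1.send_event()          # p1.time == 1
p2.receive_event(ts)          # p2.time == 2  (greater than ts, as required)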

34
Mutual Exclusion
To provide mutual exclusion among processes in distributed system
the following algorithms are used:
 Centralized algorithm: In this algorithm, a process on one
node of the distributed system is assigned as coordinator to
manage the mutual exclusion problem.
 Ricart-Agrawala algorithm: It is a fully distributed algorithm.
According to this algorithm, a process wishing to enter its
critical section sends a time-stamped request message to all
other processes and waits.
 Token-ring algorithm: This algorithm assumes a logical
structure of all the processes in a ring, i.e. all the processes are
connected together in a ring sequence and every process
knows who the next process in that sequence is.
35
Centralized algorithm

36
Ricart-Agrawala Algorithm
Ricart-Agrawala Algorithm for Mutual Exclusion
1. A process Pi wishing to enter its critical section CS sends a message M(Name, Process_ID, TS) to all other n-1 processes
in the system, where n is the number of processes.
2. When a process Pj receives the message M from Pi, it acts as follows:
If Pj is neither inside its CS nor requesting it
    Send OK message to Pi.
Else if Pj has sent its own request M to the other processes (it also wants the CS)
{
    If (TSi < TSj)    // Compare the timestamps of Pi and Pj.
        Send OK message to Pi.
    Else
        Add Pi's request to Pending_request_queue.
}
Else    // Pj is inside its CS
    Add Pi's request to Pending_request_queue.
3. When Pi has received n-1 OK messages from the other processes, it gets permission to enter its CS and starts
executing inside it.
4. When Pj exits its CS
{
    If there are entries in Pending_request_queue
        Send OK message to all the processes in Pending_request_queue and empty the queue.
}
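The request-handling rule above can be sketched in Python as follows (the data structures and the send_ok helper are hypothetical placeholders; a real implementation would exchange network messages).

# Sketch of the Ricart-Agrawala request handler at one process Pj.
class RAProcess:
    def __init__(self, pid):
        self.pid = pid
        self.in_cs = False          # currently inside the critical section
        self.requesting = False     # has sent its own request M
        self.my_ts = None           # timestamp of its own request
        self.pending = []           # Pending_request_queue

    def on_request(self, sender_pid, sender_ts):
        if not self.in_cs and not self.requesting:
            self.send_ok(sender_pid)
        elif self.requesting and not self.in_cs:
            # Both want the CS: the earlier timestamp wins (ties broken by pid).
            if (sender_ts, sender_pid) < (self.my_ts, self.pid):
                self.send_ok(sender_pid)
            else:
                self.pending.append(sender_pid)
        else:                        # Pj is inside its CS
            self.pending.append(sender_pid)

    def on_exit_cs(self):
        self.in_cs = False
        for pid in self.pending:     # release everyone queued while busy
            self.send_ok(pid)
        self.pending.clear()

    def send_ok(self, pid):
        print(f"P{self.pid} -> OK -> P{pid}")   # placeholder for a real message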
37
Token-Ring algorithm for mutual exclusion
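A toy, single-machine simulation of the token-ring idea (function and variable names are ours): the token circulates around the ring, and only the current holder may enter its critical section.

# Token-ring mutual exclusion, simulated on one machine.
from itertools import cycle

def token_ring(processes, wants_cs, rounds=2):
    """processes: ids in ring order; wants_cs: set of ids wanting the CS."""
    for pid in cycle(processes):
        if rounds == 0:
            break
        if pid == processes[-1]:
            rounds -= 1                      # one full circulation completed
        if pid in wants_cs:                  # only the token holder may enter
            print(f"P{pid} holds the token and enters its CS")
            wants_cs.discard(pid)
        # otherwise the token is simply passed to the next process in the ring

token_ring([0, 1, 2, 3], wants_cs={1, 3})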

38
The Deadlock problem

• In a computer system, a deadlock arises when the members of a
group of processes that hold resources are blocked indefinitely
from access to resources held by other processes within the
group.
Conditions for deadlocks
• Mutual exclusion. No resource can be shared by
more than one process at a time.
• Hold and wait. There must exist a process that is
holding at least one resource and is waiting to
acquire additional resources that are currently being
held by other processes.
• No preemption. A resource cannot be preempted.
• Circular wait. There is a cycle in the wait-for graph.
Graph-theoretic models

• Wait-for graph.

• Resource-allocation graph.
Wait-for graph

Example wait-for graph over processes P1–P5 (each edge points from a waiting
process to the process it waits for).
Resource allocation graph

Two example resource-allocation graphs over processes P1–P3 and resources r1, r2:
one with deadlock and one without deadlock.
Wait-for graph and Resource-
allocation graph conversion
• Any resource-allocation graph in which each resource has a
single copy can be converted to a wait-for graph.

Example: a resource-allocation graph over P1, P2, P3 and the
corresponding wait-for graph.
Strategies for handling deadlocks

• Deadlock prevention. Prevents deadlocks by


restraining requests made to ensure that at least one
of the four deadlock conditions cannot occur.
• Deadlock avoidance. Dynamically grants a resource
to a process if the resulting state is safe. A state is
safe if there is at least one execution sequence that
allows all processes to run to completion.
• Deadlock detection and recovery. Allows deadlocks
to form; then finds and breaks them.
Deadlock Detection
• Centralized algorithm: In this algorithm a process is assigned as a
central coordinator that maintains the resource-allocation state of
the entire system.
• Each machine maintains a local wait-for graph (WFG)
– Changes are reported to the central coordinator
– The coordinator constructs and analyzes the global WFG (sketched below)
• Problems
– The coordinator is a performance bottleneck
– Communication delays may cause phantom (false) deadlocks
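What the central coordinator might do can be sketched as follows (the graph representation and names are ours, not any particular system's): merge the reported local wait-for graphs into a global one and run a depth-first search for a cycle.

# Merge local wait-for graphs and detect a cycle in the global graph.
def merge(local_wfgs):
    global_wfg = {}
    for wfg in local_wfgs:
        for p, waits_for in wfg.items():
            global_wfg.setdefault(p, set()).update(waits_for)
    return global_wfg

def has_cycle(wfg):
    visited, on_stack = set(), set()
    def dfs(p):
        visited.add(p)
        on_stack.add(p)
        for q in wfg.get(p, ()):
            if q in on_stack or (q not in visited and dfs(q)):
                return True
        on_stack.discard(p)
        return False
    return any(dfs(p) for p in list(wfg) if p not in visited)

# Machine A reports P1 -> P2; machine B reports P2 -> P3 and P3 -> P1: deadlock.
print(has_cycle(merge([{"P1": {"P2"}}, {"P2": {"P3"}, "P3": {"P1"}}])))  # True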

46
Deadlock Detection

47
Deadlock Detection
• Distributed Approach
– Detect cycles using probes.
– If process Pi is blocked on Pj, it launches a probe Pi → Pj
– Pj sends the probe Pi → Pj → Pk along all request edges, etc.
– When the probe returns to Pi, a cycle is detected
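A simplified sketch of this edge-chasing idea (the wait_for map and function names are invented): a blocked process launches a probe that is forwarded along wait-for edges, and a deadlock is reported if the probe comes back to its initiator.

# Probe-based (edge-chasing) deadlock detection sketch.
def detect_cycle(initiator, wait_for):
    def forward(probe, current):
        for nxt in wait_for.get(current, ()):
            if nxt == initiator:                 # probe returned to its initiator
                return probe + [nxt]
            if nxt not in probe:                 # extend the probe and keep forwarding
                found = forward(probe + [nxt], nxt)
                if found:
                    return found
        return None
    return forward([initiator], initiator)

wait_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
print(detect_cycle("P1", wait_for))   # ['P1', 'P2', 'P3', 'P1'] -- a deadlock cycle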

48
Deadlock Prevention
• To provide deadlock prevention in distributed system the following
algorithms are used:
– Ordering of resources
– Wait-die algorithm: an older process waits for a younger one,
but a younger process is killed if it tries to wait for an
older process
– Wound-wait algorithm: an older process preempts (wounds) a
younger one, but a younger process that requests a resource
held by an older one is allowed to wait (both rules are
sketched below)
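The two rules reduce to a timestamp comparison, sketched below (function names are ours; "older" means a smaller timestamp, and the requester is asking for a resource held by the holder).

# Wait-die and wound-wait decision rules.
def wait_die(requester_ts, holder_ts):
    # Older requester waits; younger requester is killed (dies) and restarts later.
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    # Older requester wounds (preempts) the holder; younger requester waits.
    return "wound holder" if requester_ts < holder_ts else "wait"

print(wait_die(5, 9))     # older asks younger -> "wait"
print(wait_die(9, 5))     # younger asks older -> "die"
print(wound_wait(5, 9))   # older asks younger -> "wound holder"
print(wound_wait(9, 5))   # younger asks older -> "wait"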

49
Deadlock Prevention

50
Distributed Process Scheduling
• To perform distributed process scheduling, load balancing is
performed by transferring some processes from heavily loaded
nodes to lightly loaded nodes. This is known as process migration.
• Implementation issues:
– Process Migration Policy (pre-emptive/non-preemptive)
– Transfer Policy (based on load)
– Selection Policy
– Location Policy
– Information Policy (demand-driven/periodic)

51
Distributed Process Scheduling
• There are two types of schedulers.
 Local scheduler
 Global load scheduler

52
Reasons for Process Migration

• Process migration depends on:
 Number of processes on a processor
 Average waiting time of the processes in the ready queue
 Number of processors available in the system
 Average waiting time of the processes on a node
• The global load scheduler is of two types: static and dynamic. A static
scheduler assigns processes to processors at compile time, while a
dynamic scheduler assigns processors when the processes start
execution.

53
Distributed Process Scheduling
Algorithms
• Sender-Initiated Algorithm – initiated by an overloaded node
(a threshold-based check is sketched below)
– Random scheduling
– Threshold-based scheduling
– Shortest-queue-length scheduling
• Receiver-Initiated Algorithm – initiated by an underloaded node
• Symmetrically Initiated Algorithm
– Combines the benefits of both
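A toy sketch of a sender-initiated, threshold-based decision (the threshold value, probe limit and all names are invented): the transfer policy fires only on an overloaded node, and the location policy probes a few candidate receivers.

# Sender-initiated, threshold-based transfer decision sketch.
import random

THRESHOLD = 4        # queue length above which a node counts as overloaded
PROBE_LIMIT = 3      # how many candidate receivers an overloaded node probes

def should_transfer(local_queue_len):
    # Transfer policy: only an overloaded node tries to ship work elsewhere.
    return local_queue_len > THRESHOLD

def pick_receiver(loads):
    # Location policy: probe a few random nodes, pick one below the threshold.
    candidates = random.sample(list(loads), min(PROBE_LIMIT, len(loads)))
    under = [n for n in candidates if loads[n] < THRESHOLD]
    return min(under, key=loads.get) if under else None

loads = {"node_b": 1, "node_c": 6, "node_d": 3}
if should_transfer(local_queue_len=7):
    print("migrate one process to", pick_receiver(loads))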

54
Distributed file systems
• In a distributed system, to build a distributed file system, some nodes
are dedicated to storing files only. These nodes perform the storage
and retrieval operations on the files and are known as file servers.
The other nodes, used for computational purposes, are known as
clients.

55
Distributed file systems

• File system provides an abstract view of secondary


storage and is responsible for global naming, file
access, and overall file organization. These functions
are handled by the name service, the file service, and
the directory service.
• File service is the specification of what the file
system offers to its clients.
• File server is a process that runs on some machine
and helps implement the file service.
Transparency
• If the location of a file is exposed in its name, then the name
may include the location, machine, and file name, such
as myuniversity.edu:/violet/book/chapter8.
• If your distributed system wishes to provide location
transparency, then you must provide the various forms of
transparency through global naming.
– Access transparency
– Location transparency
– Concurrency transparency
– Heterogeneity
– Replication transparency
– Migration Transparency
File Protection Modes

• Read from the file


• Write to the file
• Truncate the file
• Append to the file
• Execute the file
• There are two dominating types of file
protection: access lists and capabilities.
Access lists

An access list associates with each file a list of users who may
access the file and how.
• File 0: (John, *, RWX)
• File 1: (John, staff, R_ _)
•…
• File 3: (*, student, R_ _)
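A small sketch of how such an access-list check might look (the ACL dictionary mirrors the entries above; "*" is a wildcard and the rights string uses R/W/X).

# Access-list lookup sketch.
ACL = {
    "File 0": [("John", "*", "RWX")],
    "File 1": [("John", "staff", "R__")],
    "File 3": [("*", "student", "R__")],
}

def allowed(filename, user, group, right):
    for u, g, rights in ACL.get(filename, []):
        if u in ("*", user) and g in ("*", group) and right in rights:
            return True
    return False

print(allowed("File 0", "John", "staff", "W"))     # True
print(allowed("File 3", "Alice", "student", "W"))  # False (students may only read)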
Capability list
• Each user has a kind of ticket, called a
capability, for each object to which it has
access.
• Process 0's capability list:
         Type      Rights
    0    File      R_ _
    1    File      RWX
    2    File      RW_
    3    Printer   _W_
File Modification Notification

• Single processor: the original file contains "ab"; process A writes "c";
a subsequent read by process B gets "abc".
On a single processor, when a READ follows a WRITE, the value returned by
the READ is the value just written.
File Modification Notification
• Distributed system: the file server holds "ab"; client 1 reads "ab" into
its cache and then writes "c" locally; client 2 then reads the file and
still gets "ab".
In a distributed system with caching, obsolete values may be returned.
If client 1 modifies the file in its cache, it must inform client 2.
Semantics of File Sharing
• Unix semantics
– All updates are immediately visible
– Generates a lot of network traffic
• Session semantics
– Updates visible when file closes
– Simultaneous updates are unpredictable (lost)
• Transaction semantics
– Updates visible at end of transaction
• Immutable-files semantics
– Updates create a new version of file
– Now the problem is one of version management

63
File service implementation

• File service implementations may be based on remote access or
remote copy, and may be stateful or stateless.
Remote access model

Remote access model: the file stays on the server; the client sends requests
to the server to access the remote file.
Remote copy model

Remote copy model:
1. The file is moved from the server to the client.
2. Accesses are done on the client.
3. When the client is done, the file is returned to the server.
Stateful & Stateless
• A Stateful server maintains information about all
clients that are utilizing the server to access a file.
• A stateless server maintains no client information.
Each and every request from a client must include
very specific request information, such as file name,
operation, and exact position in the file. The client
maintains the state information.
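The difference can be sketched as follows (class names and request fields are ours): the stateless server needs the full context, file name, offset and count, in every request, while the stateful server remembers open files and positions per client.

# Stateless vs. stateful file server sketch.
class StatelessFileServer:
    def read(self, file_name, offset, count):
        # Every request carries the file name, exact position and length.
        with open(file_name, "rb") as f:
            f.seek(offset)
            return f.read(count)

class StatefulFileServer:
    def __init__(self):
        self.open_files = {}            # per-client state kept on the server

    def open(self, client_id, file_name):
        self.open_files[client_id] = open(file_name, "rb")

    def read(self, client_id, count):
        # The server remembers which file is open and the current position.
        return self.open_files[client_id].read(count)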
Places to store files

There are four potential places to store files:
• The server’s disk
• The server’s main memory
• The client’s disk
• The client’s main memory
Cache Consistency

• Solution 1: Write through


• Solution 2: Delayed write
• Solution 3: Write-on-Close
• Solution 4: Centralized control algorithm
• Solution 5: Use immutable files
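Two of these policies can be contrasted in a short sketch (class and method names are ours; a dict stands in for the file server): write-through pushes every write to the server immediately, while delayed write batches updates until a later flush (write-on-close is the special case of flushing when the file is closed).

# Write-through vs. delayed-write client caches.
class WriteThroughCache:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def write(self, name, data):
        self.cache[name] = data
        self.server[name] = data        # every write goes straight to the server

class DelayedWriteCache:
    def __init__(self, server):
        self.server, self.cache, self.dirty = server, {}, set()

    def write(self, name, data):
        self.cache[name] = data
        self.dirty.add(name)            # server is updated later, in a batch

    def flush(self):                    # e.g. periodically, or on close (write-on-close)
        for name in self.dirty:
            self.server[name] = self.cache[name]
        self.dirty.clear()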
THANK YOU

70
