
UNIT 2

What is a distributed system?

 It consists of multiple computers that do not share memory.
 Each computer has its own memory and runs its own operating system.
 The computers can communicate with each other through a communication network.

Why build a distributed system?

 Microprocessors are getting more and more powerful.
 A distributed system combines (and increases) the computing power of individual computers.
 Some advantages include:
o Resource sharing (but not as easily as on a single machine)
o Enhanced performance (but two machines are not as good as a single machine that is twice as fast)
o Improved reliability and availability (but the probability of some component failing increases, as does the difficulty of recovery)
o Modular expandability
 Despite these advantages, distributed operating systems have not been economically successful.

DISTRIBUTED OPERATING SYSTEM ISSUES

A distributed system is a collection of autonomous computer systems that are physically separated but connected by a computer network and equipped with distributed system software. Such systems are used in numerous applications, such as online gaming, web applications, and cloud computing. However, creating a distributed system is not simple, and there are a number of design considerations to take into account. The following are some of the major design issues of distributed systems:

Design issues of the distributed system –

1. Heterogeneity: Heterogeneity applies to the network, computer hardware, operating systems, and the implementations of different developers. A key component of a heterogeneous distributed client-server environment is middleware. Middleware is a set of services that enables applications and end users to interact with each other across a heterogeneous distributed system.

2. Openness: The openness of the distributed system is determined primarily by the degree
to which new resource-sharing services can be made available to the users. Open
systems are characterized by the fact that their key interfaces are published. It is based on
a uniform communication mechanism and published interface for access to shared
resources. It can be constructed from heterogeneous hardware and software.
3. Scalability: The system should remain efficient even with a significant increase in the number of users and resources connected. Performance should not vary noticeably whether the system has 10 nodes or 100. Scaling a distributed system requires consideration of a number of elements, including size, geography, and management.

4. Security: The security of an information system has three components: confidentiality, integrity, and availability. Encryption protects shared resources and keeps sensitive information secret when it is transmitted.

5. Failure Handling: When faults occur in hardware or software, programs may produce incorrect results or may stop before completing the intended computation, so corrective measures should be implemented to handle such cases. Failure handling is difficult in distributed systems because failures are partial, i.e., some components fail while others continue to function.

6. Concurrency: There is a possibility that several clients will attempt to access a shared
resource at the same time. Multiple users make requests on the same resources, i.e. read,
write, and update. Each resource must be safe in a concurrent environment. Any object
that represents a shared resource in a distributed system must ensure that it operates
correctly in a concurrent environment.

7. Transparency: Transparency ensures that the distributed system is perceived by users and application programmers as a single entity rather than as a collection of cooperating autonomous systems. Users should be unaware of where services are located, and the transfer from a local machine to a remote one should be transparent.

2. Communication primitives for distributed communication

COMMUNICATION IN DISTRIBUTED SYSTEMS


i) Message Passing system
ii) Remote Procedure Call (RPC)
i) Message Passing System
The main design aspects of a message passing system are:
 Naming: direct or indirect communication
 Synchronization: blocking or non-blocking send and receive
 Buffering: the capacity of the message queue (a small sketch follows this list)
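As an illustration of these aspects, here is a minimal Python sketch of indirect communication through a bounded mailbox with blocking send and receive. The producer/consumer names and the mailbox capacity are invented for the example; this is a single-machine sketch, not a real network transport.

import queue
import threading

# A mailbox gives indirect naming: sender and receiver name the mailbox,
# not each other. maxsize=4 gives bounded buffering: put() blocks once
# four messages are waiting, and get() blocks while the mailbox is empty,
# which is the blocking (synchronous) form of send and receive.
mailbox = queue.Queue(maxsize=4)

def producer():
    for i in range(8):
        mailbox.put("msg-%d" % i)            # blocking send

def consumer():
    for _ in range(8):
        print("received", mailbox.get())     # blocking receive

sender = threading.Thread(target=producer)
receiver = threading.Thread(target=consumer)
sender.start(); receiver.start()
sender.join(); receiver.join()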
ii) Remote Procedure Call
Remote Procedure Call (RPC) is a communication technology that is used by one program to
make a request to another program for utilizing its service on a network without even knowing
the network’s details. A function call or a subroutine call are other terms for a procedure call.

It is based on the client-server concept. The client is the program that makes the request, and the
server is the program that gives the service. This is a form of client-server interaction
implemented via a request-response message-passing system.

How to Make a Remote Procedure Call

The calling environment is suspended, the procedure parameters are transferred across the network to the environment where the procedure is to execute, and the procedure is executed there. When the procedure finishes, its results are transferred back to the calling environment, where execution resumes as if returning from a regular procedure call.
Features of RPC

In an operating system, remote procedure call (RPC) has the following features, such as:

 RPC hides the complexity of the message passing process from the user.
 RPC only uses specific layers of the OSI model like the transport layer.
 Clients can communicate with the server by using higher-level languages.
 RPC works well with both local environments and remote environments.
 The program of RPC is written in simple code and is easily understood by the programmer.
 The operating system can handle processes and threads involved in RPC easily.
 The operating system hides the abstractions of RPC from the user.

RPC Architecture

RPC architecture has mainly five components of the program:

1. Client
2. Client Stub
3. RPC Runtime
4. Server Stub
5. Server

During an RPC, the following steps take place:

1. The client calls the client stub. The call is a local procedure call with parameters pushed
onto the stack in the normal way.
2. The client stub packs the procedure parameters into a message and makes a system call to
send the message. The packing of the procedure parameters is called marshalling.
3. The client's local OS sends the message from the client machine to the remote server
machine.
4. The server OS passes the incoming packets to the server stub.
5. The server stub unpacks the parameters -- called unmarshalling -- from the message.
6. When the server procedure is finished, it returns to the server stub, which marshals the
return values into a message. The server stub then hands the message to the transport
layer.
7. The transport layer sends the resulting message back to the client transport layer, which
hands the message back to the client stub.
8. The client stub unmarshalls the return parameters, and execution returns to the caller.
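The marshalling and unmarshalling steps above (steps 2, 5, 6, and 8) can be sketched in Python. This is an illustrative stub pair only: json stands in for a real marshalling format, and the invented fake_network_send function stands in for the OS transport of steps 3, 4, and 7.

import json

# Illustrative registry of procedures the server exports.
def add(a, b):
    return a + b

PROCEDURES = {"add": add}

def server_stub(message):
    # Step 5: unmarshal the request from the wire format.
    request = json.loads(message)
    # Call the actual server procedure (the "server" component).
    result = PROCEDURES[request["proc"]](*request["params"])
    # Step 6: marshal the return value into a reply message.
    return json.dumps({"result": result})

def fake_network_send(message):
    # Stands in for steps 3, 4, and 7: the client OS, the network,
    # and the server OS carrying messages back and forth.
    return server_stub(message)

def client_stub(proc_name, *params):
    # Step 2: marshal the procedure name and parameters (marshalling).
    message = json.dumps({"proc": proc_name, "params": params})
    reply = fake_network_send(message)
    # Step 8: unmarshal the return value (unmarshalling).
    return json.loads(reply)["result"]

# To the caller this looks like a local call to add(2, 3).
print(client_stub("add", 2, 3))   # prints 5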

Types of RPC:

Callback RPC: In Callback RPC, a Peer-to-Peer (P2P) paradigm is adopted between the participating processes, so a process can provide both client and server functions, which is quite helpful. Callback RPC's features include:
 It addresses the problems encountered with interactive applications that are handled remotely.
 It provides a server for clients to use.
 Due to the callback mechanism, the client process may be delayed.
 Deadlocks need to be managed in callbacks.
 It promotes a Peer-to-Peer (P2P) paradigm among the processes involved.

RPC for Broadcast: A client’s request that is broadcast all through the network and handled by
all servers that possess the method for handling that request is known as a broadcast RPC.
Broadcast RPC’s features include:

 You have an option of selecting whether or not the client’s request message ought to be
broadcast.
 It also gives you the option of declaring broadcast ports.
 It helps in diminishing physical network load.

Batch-mode RPC: Batch-mode RPC enables the client to queue separate RPC requests in a transmission buffer before sending them to the server in a single batch over the network. Batch-mode RPC's features include:

 It reduces the overhead of requesting the server by sending requests all at once over the network.
 It is used for applications that require low call rates.
 It necessitates the use of a reliable transmission protocol.

3. LAMPORT LOGICAL CLOCK

Lamport clocks represent time logically in a distributed system. They are also known as logical
clocks. The idea behind Lamport clocks is to disregard physical time and capture just a
“happens-before” relationship between a pair of events.

Why use Lamport clocks?

Time synchronization is a key problem in distributed systems. Time is used to order events
across servers. Using physical clocks to order events is challenging because real synchronization
is impossible and clocks experience skew. A clock skew is when different clocks run at different
rates, so we cannot assume that time t on node a happened before time t + 1 on node b.

Instead of employing physical time, Leslie Lamport proposed logical clocks that capture events’
orderings through a “happens-before” relationship.

Defining the happens-before relation

An event is something happening at a node (sending or receiving a message, or a local execution step). If an event a happens before an event b, we write a -> b.
There are three conditions under which we can say an event a happens before b:

 If a and b occur on the same node and a occurs before b, then a -> b
 If b is the sending of a message and c is the receipt of that same message, then b -> c
 Transitivity: if a -> b and b -> c, then a -> c

The following diagram illustrates the happens-before relation:

A happens-before relation does not order all events. For instance, the events a and d are not related by ->. Hence, they are concurrent. Such events are written as a || d.

Implementing Lamport clocks

Lamport clocks tag events in a distributed system and order them accordingly. We seek a clock
time C(a) for every event a. The clock condition is defined as follows:

If a ->b, then C(a) < C(b).

Each process maintains an event counter. This event counter is the local Lamport clock.

The Lamport clock algorithm works in the following way:

 Before the execution of an event, the local clock is updated. This can be expressed by the equation Ci = Ci + 1, where i is the process identifier.
 When a message is sent to another process, the message carries the sender's local clock value, Cm.
 When a process receives a message m, it sets its local clock to 1 + max(Ci, Cm).
The following diagram illustrates how the Lamport clock algorithm works:

In the example above, all local clocks start from a value of 0. Before the execution of an event, the local clock increments by 1. Notice that P2's clock starts from 0, but on the arrival of the message m1 from P1, it updates its clock in accordance with the third rule mentioned above, i.e., 1 + max(Ci, Cm), where Ci = 0 and Cm = 2.

Hence, P2's final clock value is 3.

Note: d is an event on P3, so C(d) = 1, and d is concurrent with a.
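The three clock rules can be captured in a small Python class. This is an illustrative sketch; the class and method names are invented, not part of any standard library.

class LamportClock:
    def __init__(self):
        self.time = 0                          # every clock starts at 0

    def local_event(self):
        self.time += 1                         # rule 1: Ci = Ci + 1
        return self.time

    def send_message(self):
        self.time += 1                         # sending is itself an event
        return self.time                       # rule 2: Cm travels with the message

    def receive_message(self, cm):
        self.time = 1 + max(self.time, cm)     # rule 3: Ci = 1 + max(Ci, Cm)
        return self.time

# Reproducing the example: P1 sends m1 carrying Cm = 2; P2's clock is 0.
p1, p2 = LamportClock(), LamportClock()
p1.local_event()                  # P1: event a, C = 1
cm = p1.send_message()            # P1: C = 2; m1 carries Cm = 2
print(p2.receive_message(cm))     # P2: 1 + max(0, 2) = 3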

Lamport Algorithm for Mutual Exclusion in Distributed System

Distributed systems are made up of multiple processes running on various machines or nodes and interacting with one another to accomplish a single objective. In these systems, it is crucial to ensure that only one process can use a shared resource at a time, to prevent conflicts and data inconsistencies.

Mutual exclusion is one way to make sure that only one process uses a shared resource at once, and Lamport's algorithm is one of many available mutual exclusion algorithms.
Pseudo code of Lamport's Algorithm for Mutual Exclusion in Distributed System

Lamport's algorithm for mutual exclusion in distributed systems has the following pseudo code:

When a process wants to access the shared resource

requesting = true;
clock = clock + 1;
request_clock = clock;                 // timestamp of our own request
send_request_message_to_all_other_processes(request_clock);
wait_for_replies_from_all_other_processes();

When a process receives a request message from another process

receive_request_message(sender_clock, sender_process_id);
clock = max(clock, sender_clock) + 1;  // update the logical clock
if (!requesting || (sender_clock, sender_process_id) <
                   (request_clock, this_process_id))
    send_reply_message(sender_process_id);
else
    queue_request(sender_process_id, sender_clock);

When a process receives a reply message from another process

receive_reply_message(sender_process_id, sender_clock);
clock = max(clock, sender_clock) + 1;
num_replies_received = num_replies_received + 1;
if (num_replies_received == num_processes - 1)
    accessing_shared_resource = true;

When a process finishes accessing the shared resource

accessing_shared_resource = false;
requesting = false;
num_replies_received = 0;
for each queued request
    send_reply_message(queued_process_id);
    dequeue_request();
 A logical clock that is originally set to 0 is kept by each process, along with a boolean
variable called requesting that denotes whether the process desires access to the shared
resource.
 A process increments its logical clock value by 1 and sets requesting to true when it wishes to use the shared resource.
 All other processes in the system are then sent a request notification by the process. The
logical clock number for the process is contained in the request message.
 A process updates its own logical clock value to be the highest of its existing logical
clock value and the clock value in the incoming request message when it gets a request
message from another process.
 A reply message is sent to the requesting process if the receiving process is not presently
using the shared resource. There is no more detail in the reply message.
 The request communication is queued if the receiving process is already using the shared
resource.
 A process updates its own logical clock value to be the maximum of its existing logical
clock value and the clock value in the received reply message when it gets a reply
message from another process. A property called "num_replies_received" is also
incremented.
 The process sets a flag to indicate that it is accessing the shared resource once num_replies_received equals the total number of processes in the system minus 1 (i.e., all processes excluding the requesting process itself).
 The shared resource is now accessible to the process. When finished, it sends a reply message to each queued request, in timestamp order (i.e., the order of the requests' logical clock values).
 The process then resets num_replies_received to 0 and sets requesting to false.
 These steps are repeated each time a process needs to access the shared resource.
Advantages

Easy to understand − Lamport's algorithm is straightforward and simple to understand, which makes it an excellent option for a variety of applications.

Scalable − The algorithm does not depend on any central server or coordinator, so it can be used in systems with many processes.

Fairness − The algorithm makes sure that every process has an equal opportunity to use the shared resource, because requests are served in timestamp order.

Low latency − Mutual exclusion is achieved after a single round of request and reply messages, so the algorithm has low latency.

Works in asynchronous environments − The algorithm operates in asynchronous environments where the timing of message transmission is unpredictable.

Disadvantages
 The algorithm requires frequent communication between processes, which can lead to high message overhead.
 The algorithm may be ineffective under high contention, such as when several processes compete for access to the same resource at once, as it may take several rounds of communication to resolve the contention.
 The algorithm relies on logical clocks being consistently updated at every process and on reliable, ordered message delivery, which can be difficult to guarantee in practice.
 Limited to exclusive access: the algorithm cannot easily be modified to support other kinds of access control because it is built specifically for achieving mutual exclusion.
 The algorithm is susceptible to network failures, which can result in delays or lost messages.
4. Deadlock Handling Strategies in Distributed System

The following are the strategies used for Deadlock Handling in Distributed System:
 Deadlock Prevention
 Deadlock Avoidance
 Deadlock Detection and Recovery
1. Deadlock Prevention:
This strategy ensures that deadlock can never happen because system designing is carried out
in such a way. If any one of the deadlock-causing conditions is not met then deadlock can be
prevented. Following are the three methods used for preventing deadlocks by making one of the
deadlock conditions to be unsatisfied:
 Collective Requests: In this strategy, every process declares all the resources required for its execution beforehand and is allowed to execute only if all of those resources are available. Resources are released only when the process finishes. Hence, the hold-and-wait condition of deadlock is prevented.
 The issue is that the initial resource requirements declared before a process starts are based on an estimate, not on what will actually be needed. Resources may therefore be occupied unnecessarily, and this prior allocation of resources also reduces potential concurrency.
 Ordered Requests: In this strategy, an ordering is imposed on the resources, and each process requests resources in increasing order. Hence, the circular-wait condition of deadlock can be prevented.
 The ordering strictly implies that a process never asks for a lower-ordered resource while holding a higher-ordered one.
 There are two further ways of dealing with global timing and transactions in distributed systems, both based on assigning a global timestamp to each transaction as soon as it begins.
 During execution, if a process appears blocked because of a resource held by another process, the timestamps of the two processes are compared to identify the younger one. In this way, circular waiting can be prevented.
 It is better to give priority to older processes, because of their long existence and because they may be holding more resources.
 It also eliminates starvation, since a younger transaction will eventually become the oldest in the system.
 Preemption: Resource allocation strategies that reject the no-preemption condition can be used to avoid deadlocks. Two timestamp-based schemes, sketched in the code after this list, are:
 Wait-die: If an older process requests a resource held by a younger process, the older process waits. A younger process is killed (dies) if it requests a resource held by an older process.
 Wound-wait: If an older process requests a resource held by a younger process, the younger process is preempted (wounded) and killed, and the older process takes the resource. If a younger process requests a resource held by an older process, it waits.
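The two schemes differ only in what happens when the requester is older. A minimal sketch of the two decision rules, assuming each transaction carries a global timestamp where a smaller timestamp means an older transaction (the function names are illustrative):

def wait_die(requester_ts, holder_ts):
    # Non-preemptive: an older requester waits; a younger requester dies.
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    # Preemptive: an older requester wounds (preempts) the younger holder;
    # a younger requester waits.
    return "wound holder" if requester_ts < holder_ts else "wait"

# A smaller timestamp means an older transaction.
print(wait_die(5, 9))      # older asks younger  -> "wait"
print(wait_die(9, 5))      # younger asks older  -> "die"
print(wound_wait(5, 9))    # older asks younger  -> "wound holder"
print(wound_wait(9, 5))    # younger asks older  -> "wait"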

2. Deadlock Avoidance:

In this strategy, deadlock is avoided by examining the state of the system at every step. The distributed system reviews the allocation of resources, and wherever it finds an unsafe state, it backtracks one step and returns to a safe state. Resource allocation therefore takes time whenever it is requested by a process: the system first analyzes whether granting the resources will leave the system in a safe or an unsafe state, and only then is the allocation made (a safety test of this kind is sketched below).
 A safe state is a state in which the system is not deadlocked and there exists an order in which the processes' requests can be granted.
 An unsafe state is a state for which no safe sequence exists. A safe sequence is an ordering of the processes such that all of them can run to completion.
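The safety analysis described above can be sketched as a banker's-algorithm-style test. This is an illustrative Python sketch for a single resource type; the function and variable names are invented.

def is_safe(available, allocation, need):
    # available: free units of the resource.
    # allocation[i]: units currently held by process i.
    # need[i]: further units process i may still request.
    finished = [False] * len(allocation)
    work = available
    progress = True
    while progress:
        progress = False
        for i in range(len(allocation)):
            if not finished[i] and need[i] <= work:
                work += allocation[i]    # i can run to completion and release
                finished[i] = True
                progress = True
    return all(finished)                 # safe iff a full safe sequence exists

# Grant a request only if the state it produces is still safe:
print(is_safe(available=3, allocation=[5, 2, 2], need=[5, 2, 3]))   # True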
3. Deadlock Detection and Recovery:
 This requires examining the status of process-resource interactions for the presence of a cyclic wait.
 Deadlock detection seems to be the best approach to handle deadlocks in distributed systems.
In this strategy, deadlock is detected and an attempt is made to resolve the deadlock state of the system. These approaches rely on a Wait-For Graph (WFG), which is generated and evaluated for cycles. A deadlock detection algorithm must meet the following two requirements:
 Progress: The algorithm must find all existing deadlocks in finite time. No deadlock in the system should remain undetected. To put it another way, after all the wait-for dependencies for a deadlock have arisen, the algorithm should not wait for any additional events to detect the deadlock.
 No False Deadlocks: The algorithm should not report deadlocks that do not exist; such reports are called phantom or false deadlocks.
There are different types of deadlock detection techniques:
 Centralized Deadlock Detector: The resource graph for the entire system is managed by a
central coordinator. When the coordinator detects a cycle, it terminates one of the processes
involved in the cycle to break the deadlock. Messages must be passed when updating the
coordinator’s graph. Following are the methods:
 A message must be provided to the coordinator whenever an arc is created or
removed from the resource graph.
 Every process can transmit a list of arcs that have been added or removed since
the last update periodically.
 When information is needed, the coordinator asks for it.
 Hierarchical Deadlock Detector: In this approach, deadlock detectors are arranged in a hierarchy, and each detector can detect only those deadlocks that fall within its range.
 Distributed Deadlock Detector: In this approach, detection is distributed so that all sites can fully participate in resolving the deadlock state. A probe-based scheme can be used for this purpose: local WFGs are used to detect local deadlocks, and probe messages are used to detect global deadlocks.

There are four classes for the Distributed Detection Algorithm:


 Path-pushing: In path-pushing algorithms, the detection of distributed deadlocks is carried
out by maintaining an explicit global WFG.
 Edge-chasing: In an edge-chasing algorithm, probe messages are sent along the edges of a distributed graph structure to detect the presence of a cycle (a simplified sketch follows this list).
 Diffusion computation: Here, the computation for deadlock detection is dispersed
throughout the system’s WFG.
 Global state detection: The detection of Distributed deadlocks can be made by taking a
snapshot of the system and then inspecting it for signs of a deadlock.
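As an illustration of the edge-chasing idea, here is a minimal Python sketch in the spirit of the Chandy-Misra-Haas scheme. The wait-for table and function are illustrative; a real algorithm passes probe messages between sites rather than reading a global table.

# waits_for maps each blocked process to the processes it is waiting on;
# in a real system these edges are spread over many sites.
waits_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}

def detect_deadlock(initiator):
    # The initiator sends a probe along every outgoing wait-for edge; each
    # blocked process forwards it. If a probe comes back to the initiator,
    # a cycle exists, so the initiator is deadlocked.
    visited = set()
    frontier = list(waits_for.get(initiator, []))
    while frontier:
        p = frontier.pop()
        if p == initiator:
            return True                      # probe returned: cycle detected
        if p not in visited:
            visited.add(p)
            frontier.extend(waits_for.get(p, []))
    return False

print(detect_deadlock("P1"))   # True: P1 -> P2 -> P3 -> P1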

To recover from a deadlock, one of the following methods can be used:

 Termination of one or more of the processes that created the unsafe state.
 Using checkpoints for periodic saving of process state, so that whenever required, processes that make the system unsafe can be rolled back, maintaining a safe state of the system.
 Breaking existing wait-for relationships between the processes.
 Rolling back one or more blocked processes and allocating their resources to other blocked processes so that they can resume operation.

5. Issues in Deadlock Detection

Deadlock handling faces two major issues:
1. Detection of existing deadlocks
2. Resolution of detected deadlocks

5.1 Deadlock Detection


 Detection of deadlocks involves addressing two issues: maintenance of the WFG and searching the WFG for the presence of cycles or knots.
 In distributed systems, a cycle or knot may involve several sites, so the search for cycles depends greatly on how the WFG of the system is represented across the sites.
 Depending on the way WFG information is maintained and the search for cycles is carried out, there are centralized, distributed, and hierarchical algorithms for deadlock detection in distributed systems.

Correctness criteria
A deadlock detection algorithm must satisfy the following two conditions:
1. Progress (no undetected deadlocks):
The algorithm must detect all existing deadlocks in finite time. In other words, after all the wait-for dependencies for a deadlock have formed, the algorithm should not wait for any more events to occur to detect the deadlock.
2. Safety (no false deadlocks): The algorithm should not report deadlocks that do not exist; these are called phantom or false deadlocks.

5.2 Resolution of Detected Deadlock


 Deadlock resolution involves breaking existing wait-for dependencies between the processes to
resolve the deadlock.
 It involves rolling back one or more deadlocked processes and assigning their resources to
blocked processes so that they can resume execution.
 The deadlock detection algorithms propagate information regarding wait-for dependencies
along the edges of the wait-for graph.
 When a wait-for dependency is broken, the corresponding information should be immediately
cleaned from the system.
 If this information is not cleaned in a timely manner, it may result in detection of phantom
deadlocks.

6. Distributed file system (DFS)


A distributed file system (DFS) is a file system that enables clients to access file storage from multiple hosts through a computer network as if they were accessing local storage. Files are spread across multiple storage servers in multiple locations, which enables users to share data and storage resources. A DFS can be designed so geographically distributed users, such as remote workers and distributed teams, can access and share files remotely as if they were stored locally.

Working of DFS

A DFS clusters together multiple storage nodes and logically distributes data sets across multiple
nodes that each have their own computing power and storage. The data on a DFS can reside on
various types of storage devices, such as solid-state drives and hard disk drives.

Data sets are replicated onto multiple servers, which enables redundancy to keep data
highly available. The DFS is located on a collection of servers, mainframes or a cloud
environment over a local area network (LAN) so multiple users can access and store unstructured
data. If organizations need to scale up their infrastructure, they can add more storage nodes to the
DFS.
Clients access data on a DFS using namespaces. Organizations can group shared folders into logical namespaces. A namespace is the shared group of networked storage on a DFS root; it presents files to users as one shared folder with multiple subfolders. When a user requests a file, the DFS brings up the first available copy of the file (see the sketch below).
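A minimal sketch of this namespace resolution, assuming a namespace table that maps one logical folder to several replicas. All paths, names, and the availability check below are illustrative, not part of any real DFS API.

# A namespace maps one logical shared folder to several physical replicas.
namespace = {
    r"\\corp\shared\reports": [
        r"\\server1\reports",    # first replica
        r"\\server2\reports",    # second replica
    ],
}

def resolve(logical_path, is_available):
    # Return the first available copy of the requested folder.
    for target in namespace[logical_path]:
        if is_available(target):
            return target
    raise IOError("no reachable replica of " + logical_path)

# Pretend server1 is down: the client is transparently referred to server2.
print(resolve(r"\\corp\shared\reports",
              is_available=lambda target: "server2" in target))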

There are two types of namespaces:

1. Standalone DFS namespaces. A standalone or independent DFS namespace has just one
host server. Standalone namespaces do not use Active Directory (AD). In a standalone
namespace, the configuration data for the DFS is stored on the host server's registry. A
standalone namespace is often used in environments that only need one server.

2. Domain-based DFS namespaces. Domain-based DFS namespaces integrate and store the
DFS configuration in AD. Domain-based namespaces have multiple host servers, and the
DFS topology data is stored in AD. Domain-based namespaces are commonly used in
environments that require higher availability.

Advantages and disadvantages of a DFS

A DFS provides organizations with a scalable system to manage unstructured data remotely. It
can enable organizations to use legacy storage to save costs of storage devices and hardware. A
DFS also improves availability of data through replication.
However, security measures need to be in place to protect storage nodes. In addition, there is a risk of data loss when data is replicated across storage nodes. It can also be complicated to reconfigure a DFS should an organization replace storage hardware on any of the DFS nodes.

Features of a DFS

Organizations use a DFS for features such as scalability, security and remote access to
data. Features of a DFS include the following:

 Location independence. Users do not need to be aware of where data is stored. The DFS
manages the location and presents files as if they are stored locally.

 Transparency. Transparency keeps the details of one file system away from other file
systems and users. There are multiple types of transparency in distributed file systems,
including the following:

o Structural transparency. Data appears as if it's on a user's device. Users are unable to
see how the DFS is configured, such as the number of file servers or storage devices.

o Access transparency. Users can access files that are located locally or remotely. Files
can be accessed no matter where the user is, as long as they are logged in to the system.
If data is not stored on the same server, users should not be able to tell, and applications
for local files should also be able to run on remote files.

o Replication transparency. Replicated files that are located on different nodes of the file
system, such as on another storage system, are hidden from other nodes in the system.
This enables the system to create multiple copies without affecting performance.

o Naming transparency. Files should not change when moving among storage nodes.

 Scalability. To scale a DFS, organizations can add file servers or storage nodes.

 High availability. The DFS should continue to work in the event of a partial failure in the
system, such as a node failure or drive crash. A DFS should also create backup copies if
there are any failures in the system.

 Security. Data should be encrypted at rest and in transit to prevent unauthorized access or
data deletion.
Implementations of a DFS

A DFS uses file sharing protocols, which enable users to access file servers over the DFS as if they were local storage.

Protocols a DFS can use include the following:

 Server Message Block (SMB). SMB is a file sharing protocol designed to allow read and
write operations on files over a LAN. It is used primarily in Windows environments.

 Network File System (NFS). NFS is a client-server protocol for distributed file sharing
commonly used for network-attached storage systems. It is also more commonly used with
Linux and Unix operating systems.

 Hadoop Distributed File System (HDFS). HDFS helps deploy a DFS designed for Hadoop
applications.

Case Study -- Sun's Network File System (NFS)

NFS is popular and widely used. NFS was originally designed and implemented by Sun Microsystems for use on its UNIX-based workstations. Other manufacturers now support it as well, for both UNIX and other operating systems (including MS-DOS). NFS supports heterogeneous systems, for example, MS-DOS clients making use of UNIX servers. It is not even required that all the machines use the same hardware. It is common to find MS-DOS clients running on Intel 386 CPUs getting service from UNIX file servers running Motorola 68030 or Sun SPARC CPUs. Three aspects of NFS are of interest: architecture, protocol, and implementation.

NFS Architecture

The basic idea behind NFS is to allow an arbitrary collection of clients and servers to share a common file system. In most cases, all the clients and servers are on the same LAN. NFS allows every machine to be both a client and a server at the same time. Each NFS server exports one or more of its directories for access by remote clients. When a directory is made available, so are all of its sub-directories, so the entire directory tree is exported as a unit. The list of directories a server exports is maintained in the /etc/exports file, so these directories can be exported automatically whenever the server is booted. Clients access exported directories by mounting them. When a client mounts a directory, it becomes part of its directory hierarchy. A diskless workstation can mount a remote file system on its root directory, resulting in a file system that is supported entirely on a remote server. Those workstations that have a local disk can mount remote directories anywhere they wish. There is no difference between a remote file and a local file. If two or more clients mount the same directory at the same time, they can communicate by sharing files in their common directories.
NFS Protocols (Mounting)

A protocol is a set of requests sent by clients to servers, along with the corresponding replies sent by the servers back to the clients. As long as a server recognizes and can handle all the requests in the protocols, it need not know anything at all about its clients. Clients can treat servers as "black boxes" that accept and process a specific set of requests. How they do it is their own business.

Mounting:
 A client can send a path name to a server and request permission to mount that directory somewhere in its directory hierarchy.
 The place where it is to be mounted is not contained in the message, as the server does not care where it is to be mounted.
 If the path name is legal and the directory specified has been exported, the server returns a file handle to the client.
 The file handle contains fields uniquely identifying the file system type, the disk, the i-node number of the directory, and security information.
 Subsequent calls to read and write files in the mounted directory use the file handle.

Automounting

Sun's version of UNIX also supports automounting. This feature allows a set of remote directories to be associated with a local directory. None of these remote directories are mounted (or their servers even contacted) when the client is booted. Instead, the first time a remote file is opened, the operating system sends a message to each of the servers. The first one to reply wins, and its directory is mounted. Automounting has two principal advantages over static mounting. First, in static mounting via the /etc/rc file, if one of the NFS servers happens to be down, it is impossible to bring the client up -- at least not without some difficulty, delay, and quite a few error messages. Second, by allowing the client to try a set of servers in parallel, a degree of fault tolerance can be achieved (because only one of them needs to be up), and performance can be improved (by choosing the first one to reply -- presumably the least heavily loaded). On the other hand, it is assumed that all the file systems specified as alternatives for the automount are identical. Since NFS provides no support for file or directory replication, it is up to the user to arrange for all the file systems to be the same. Automounting is most often used for read-only file systems containing system binaries and other files that rarely change.

NFS Protocols (Directory and File Access)

Clients can send messages to servers to manipulate directories and to read and write files. They can also access file attributes, such as file mode, size, and time of last modification. Most UNIX system calls are supported by NFS. In NFS, each message is self-contained. The advantage of this scheme is that the server does not have to remember anything about open connections in between calls to it. Thus, if a server crashes and then recovers, no information about open files is lost, because there is none. A server like this that does not maintain state information about open files is said to be stateless. In contrast, in UNIX System V, the Remote File System (RFS) requires a file to be opened before it can be read or written. The server then makes a table entry keeping track of the fact that the file is open, and of where the reader currently is, so each request need not carry an offset. The disadvantage of this scheme is that if a server crashes and then quickly reboots, all open connections are lost, and client programs fail.

NFS Protocols (Directory and File Access, continued)

The NFS scheme makes it difficult to achieve the exact UNIX file semantics. In UNIX, a file can be opened and locked so that other processes cannot access it. When the file is closed, the locks are released. In a stateless server such as NFS, locks cannot be associated with open files, because the server does not know which files are open. NFS therefore needs a separate, additional mechanism to handle locking. NFS uses the UNIX protection mechanism, with rwx bits for the owner, group, and others. Originally, each request message simply contained the user and group ids of the caller, which the NFS server used to validate the access. In effect, it trusted the clients not to cheat. Currently, public key cryptography can be used to establish a secure key for validating the client and server on each request and reply. When this option is enabled, a malicious client cannot impersonate another client because it does not know that client's secret key. As an aside, cryptography is used only to authenticate the parties. The data themselves are never encrypted.

Network Information Service (NIS)

All the keys used for authentication, as well as other information, are maintained by the NIS (Network Information Service). The NIS was formerly known as the yellow pages. Its function is to store (key, value) pairs. When a key is provided, it returns the corresponding value. Not only does it handle encryption keys, but it also stores the mapping of user names to (encrypted) passwords, as well as the mapping of machine names to network addresses, and other items. The network information servers are replicated using a master/slave arrangement. To read their data, a process can use either the master or any of the copies in the slaves. However, all changes must be made only to the master, which then propagates them to the slaves. There is a short interval after an update in which the NIS is inconsistent.

NFS Layer Structure

[Figure: on the client, the system call layer sits on the virtual file system layer, which routes each request either to the local operating system (local disk) or to the NFS client, which sends a message to the server. On the server, the incoming message reaches the NFS server, which goes through the server's virtual file system layer to its local operating system and disk.]

NFS Implementation

NFS consists of three layers:
 System call layer: handles calls like OPEN, READ, and CLOSE.
 Virtual file system (VFS) layer: maintains a table with one entry for each open file, analogous to the table of i-nodes for open files in UNIX. The VFS layer has an entry, called a v-node (virtual i-node), for every open file, telling whether the file is local or remote.
 NFS client code: creates an r-node (remote i-node) in its internal tables to hold the file handle. The v-node points to the r-node.

Each v-node in the VFS layer ultimately contains either a pointer to an r-node in the NFS client code or a pointer to an i-node in the local operating system. Thus, from the v-node it is possible to see whether a file or directory is local or remote, and, if it is remote, to find its file handle.

Caching is used to improve performance:
 Transfers between client and server are done in large chunks, normally 8 Kbytes, even if fewer bytes are requested; once a chunk is obtained, the next one is requested in advance, which is known as read-ahead.
 The same holds for writes: if a write system call writes fewer than 8 Kbytes, the data are just accumulated locally. Only when the entire 8K chunk is full is it sent to the server. However, when a file is closed, all of its data are sent to the server immediately.

NFS Implementation (continued)

Client caching improves performance, but it creates a problem: two clients may cache the same file block, and one of them may modify it. When the other one reads the block, it gets the old value. There are two solutions:
 Solution 1: associate a timer with each cache block; when the timer expires, the entry is discarded. Normally, the timer is 3 seconds for data blocks and 30 seconds for directory blocks.
 Solution 2: whenever a cached file is opened, a message is sent to the server to find out when the file was last modified. If the last modification occurred after the local copy was cached, the cached copy is discarded and a new copy is fetched from the server. Finally, every 30 seconds a cache timer expires, and all the dirty blocks in the cache are sent to the server.
NFS Implementation (continued)

Criticism:
 NFS has been widely criticized for not implementing proper UNIX semantics.
 A write to a file on one client may or may not be seen when another client reads the file, depending on the timing.
 When a file is created, it may not be visible to the outside world for as much as 30 seconds.

Lessons learned:
 Workstations have cycles to burn, so do work on the client side, not the server side.
 Cache whenever possible.
 Exploit the usage properties.
 Minimize system-wide knowledge and change.
 Trust the fewest possible entities.
 Batch work where possible.

CODA
What is the Coda File System?

The Coda File System, commonly referred to as CodaFS, is a distributed file system designed to
tackle the complexities of mobile and wide-area networking environments. It was developed at
Carnegie Mellon University and is known for its robust support of disconnected operation.
Disconnected operation refers to the ability of a mobile device or client to continue working with
data even when it is disconnected from the network. The Coda File System achieves this through
a unique combination of techniques and principles.

Key Principles of Coda File System

1. Replication: Coda replicates files across servers and clients. This means that multiple copies of a
file exist on different devices, ensuring data availability even when a device is disconnected.
2. Caching: Coda caches frequently accessed files on the client side, reducing the need to fetch
data from the server every time it is requested.
3. Conflict Resolution: In a distributed environment, conflicts can arise when different clients
modify the same file while disconnected. Coda has mechanisms for resolving conflicts and
ensuring data consistency.
4. Disconnected Operation: Coda is designed to support disconnected operation. Clients can work
with files even when they are not connected to the server. When the client reconnects, Coda
automatically reconciles changes with the server.
5. Security: Security is a paramount concern in mobile computing. Coda incorporates encryption
and authentication mechanisms to protect data in transit and at rest.

The Need for CodaFS

Traditional file systems were ill-equipped to handle the demands of mobile computing. Issues
such as intermittent connectivity, varying network conditions, and the need for synchronization
across multiple devices presented formidable obstacles. Recognizing these challenges, the
creators of CodaFS sought to design a file system that could overcome these limitations and
provide a consistent user experience in the face of ever-changing conditions.
Core Concepts of Coda File System

1. Disconnected Operation

One of the defining features of CodaFS is its ability to support disconnected operation. In
scenarios where network connectivity is intermittent or unavailable, users can continue to access
and modify files seamlessly. CodaFS achieves this by caching relevant data on the client side,
allowing users to work on files even when disconnected from the network. Subsequent
synchronization occurs when connectivity is reestablished.

2. Server-Client Architecture

CodaFS employs a client-server architecture to manage file access and storage. The server stores
the master copy of the files, while clients maintain cached copies. This architecture facilitates
efficient collaboration and ensures that changes made by one user are propagated to others
through the server.

3. Replication and Consistency

Replication is a fundamental aspect of CodaFS, contributing to its robustness in handling


distributed environments. Multiple copies of data exist across different servers and clients,
ensuring redundancy and fault tolerance. The system employs sophisticated consistency
mechanisms to reconcile conflicting changes and maintain a coherent view of the file system
across all instances.

4. Token-based Concurrency Control

To manage concurrent access to files and maintain consistency, CodaFS employs a token-based
approach. Clients request tokens from servers to gain exclusive access to specific files or
directories. This mechanism prevents conflicting modifications and ensures that only one client
has write access at any given time.
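A minimal sketch of this token-based write control follows. This is illustrative Python only; Coda's actual token protocol also covers read tokens, expiry, and revocation, which are omitted here, and the class and client names are invented.

class TokenServer:
    def __init__(self):
        self.write_tokens = {}    # file path -> client currently holding it

    def request_write_token(self, client, path):
        holder = self.write_tokens.get(path)
        if holder is None or holder == client:
            self.write_tokens[path] = client    # grant exclusive write access
            return True
        return False                            # another client holds the token

    def release_token(self, client, path):
        if self.write_tokens.get(path) == client:
            del self.write_tokens[path]

vice = TokenServer()
print(vice.request_write_token("venus-A", "/coda/doc.txt"))   # True: granted
print(vice.request_write_token("venus-B", "/coda/doc.txt"))   # False: must wait
vice.release_token("venus-A", "/coda/doc.txt")
print(vice.request_write_token("venus-B", "/coda/doc.txt"))   # True now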

Architecture of Coda File System

1. Venus Client

At the client side, the Venus component of CodaFS plays a pivotal role. Venus is responsible for
caching files, handling disconnection scenarios, and ensuring a smooth user experience. It
interacts with the server to retrieve updates and synchronize changes when connectivity is
restored.

2. Vice Server

The server side of CodaFS is managed by the Vice component. Vice is responsible for storing
the master copies of files, managing replication, and coordinating communication with clients. It
implements the token-based concurrency control mechanism to ensure consistency and prevent
conflicts.

3. Update Resolution

CodaFS employs a sophisticated update resolution mechanism to handle conflicts that may arise
when multiple users modify the same file concurrently. This process involves merging changes
and resolving conflicts in a way that maintains the integrity of the file system.

4. Resolution Logs

To facilitate update resolution and maintain a history of changes, CodaFS uses resolution logs.
These logs capture information about modifications made by different users, enabling the system
to reconcile conflicting changes during synchronization.
