Distributed Computing: Unit-1

Distributed computing allows software components to run across multiple computers connected by a network. This architecture provides advantages like scalability, data sharing, and high availability if one computer fails. However, distributed systems also have disadvantages like increased development costs, potential for bugs, and processing overhead. Fundamental models describe distributed systems in terms of processes interacting through message passing, failures that can occur, and security concerns around protecting resources. The web relies on URLs, HTTP, and markup languages like HTML and XML to share resources across a distributed network.

Uploaded by Divyansh Jain
Copyright © All Rights Reserved

Distributed Computing

Unit-1 (Characteristics of Distributed Systems)


a.) Distributed Systems (Advantages and Disadvantages):-

A distributed computer system consists of multiple software components that are on multiple computers, but
run as a single system. The computers that are in a distributed system can be physically close together and
connected by a local network, or they can be geographically distant and connected by a wide area network. A
distributed system can consist of any number of possible configurations, such as mainframes, personal
computers, workstations, minicomputers, and so on. The goal of distributed computing is to make such a
network work as a single computer.

Distributed computing systems can run on hardware that is provided by many vendors, and can use a variety
of standards-based software components. Such systems are independent of the underlying software. They can
run on various operating systems, and can use various communications protocols. Some hardware might use
UNIX or Linux as the operating system, while other hardware might use Windows operating systems. For
intermachine communications, this hardware can use SNA or TCP/IP on Ethernet or Token Ring.
We can organize software to run on distributed systems by separating functions into two parts: clients and
servers.

Advantages of Distributed Systems :


1. Scalability : The system can easily be expanded by adding more machines as needed.
2. Sharing Data : There is a provision in the environment whereby a user at one site may be able to access
data residing at other sites.
3. Autonomy : Because data is distributed, each site is able to retain a degree of control over the data stored
locally.
4. Availability : If one site fails in a distributed system, the remaining sites may be able to continue operating.
Thus the failure of a site does not necessarily imply the shutdown of the system.

Disadvantages of Distributed Systems :


1. Software Development Cost : It is more difficult to implement a distributed system; thus it is more costly.
2. Greater Potential For Bugs : Since the sites that constitute the distributed system operate in parallel,
it is harder to ensure the correctness of algorithms, especially their operation during failures of part of the
system and recovery from those failures. The potential exists for extremely subtle bugs.
3. Increased Processing Overhead : The exchange of information and the additional computation required to
achieve intersite coordination are a form of overhead that does not arise in a centralized system.

b.) Architectural and Fundamental Models:-

1. Architectural Models
An architectural model of a distributed system:
1. it simplifies and abstracts the functions of the individual components.
2. it considers:
‣ the placement of the components across a network of computers (seeking to define useful patterns for the
distribution of data and workload).
‣ the interrelationships between the components (i.e., their functional roles and the patterns of communication
between them).

Architectural Model : Client-Server


• Still the most widely employed architectural model.

Client processes interact with individual server processes in separate host computers in order to access the
shared resources that they manage.
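The client-server interaction described above can be sketched with plain TCP sockets. This is a minimal single-JVM illustration, not from the notes: the class name, the greeting "protocol", and the use of an ephemeral port are all assumptions made for the example.

```java
import java.io.*;
import java.net.*;

// Minimal client-server sketch: a server process manages a shared
// resource (here just a greeting) and a client process accesses it
// over a TCP socket. The server runs in a thread to keep the demo
// self-contained in one JVM.
public class ClientServerDemo {
    // Runs a one-shot server and a client against it; returns the reply.
    static String request(String clientName) throws Exception {
        ServerSocket server = new ServerSocket(0); // ephemeral port
        Thread serverThread = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                // Serve the request: read the client's name, reply with a greeting.
                out.println("Hello, " + in.readLine());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        serverThread.start();
        try (Socket s = new Socket("localhost", server.getLocalPort());
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            out.println(clientName);   // client sends its request
            return in.readLine();      // and receives the server's reply
        } finally {
            serverThread.join();
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(request("client-1")); // prints: Hello, client-1
    }
}
```

The same request/reply shape underlies the RMI and RPC mechanisms discussed later; they simply hide the socket handling behind stubs.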

On the Client-Server Role : Web Server Example


• Example 1: a Web server is often a client of a local file server that manages the files in which the web pages
are stored.
• Example 2: Web servers and most Internet services are clients of the DNS service (which translates Internet
Domain names to network addresses).
• Example 3: search engine
‣ Server: it responds to queries from browser clients
‣ Client: it runs (in the background) programs called web crawlers that act as clients of other web servers.

Architectural Model : Peer-to-Peer (P2P)


• All the processes involved in a task or activity play similar roles, interacting cooperatively as peers without
any distinction between client and server processes or the computers that they run on.
• The aim of the P2P architecture is to exploit the resources (both data and hardware) in a large number of
participating computers for the fulfillment of a given task or activity.

Variations of the Models :-

• Services provided by multiple servers
• Proxy servers and caches
• Mobile code

2. Fundamental Models
Fundamental models are concerned with a more formal description of the properties that are common in all of
the architectural models.
A model contains only the essential ingredients that we need to consider in order to understand and reason
about some aspects of a system’s behaviour.

Three Fundamental Models :-


Interaction model:
1. Computation occurs within processes that interact by passing messages, resulting in communication (i.e.,
information flow) and coordination (synchronization and ordering of activities) between processes.
2. Distributed systems are composed of many processes interacting in complex ways.
• For example:
‣ Multiple server processes may cooperate with one another to provide a service.
‣ A set of peer processes may cooperate with one another to achieve a common goal.

Two Variants of the Interaction Model


• In a distributed system it is hard to set time limits on the time taken for process execution, message delivery
or clock drift.
• Two opposite extreme positions provide a pair of simple models:
1. Synchronous distributed systems: strong assumptions about time.
‣ A distributed system in which the following bounds are defined:
• the time to execute each step of a process has known lower and upper bounds
• each message transmitted over a channel is received within a known bounded time
• each process has a local clock whose drift rate from real time has a known bound.

2. Asynchronous distributed systems: no assumptions about time


‣ A distributed system in which there are no bounds on:
• process execution speeds – each step of a process may take arbitrarily long;
• message transmission delays – each message transmitted over a channel may be received after an
arbitrarily long time;
• clock drift rates – the drift rate of each process's local clock from real time is arbitrary.
‣ This exactly models the Internet, in which there is no intrinsic bound on server or network load and therefore
on how long it takes, for example, to transfer a file using FTP, or to receive an email message.

Failure model: the correct operation of a distributed system is threatened whenever a fault occurs in any of the
computers on which it runs or in the network that connects them.
• In a distributed system both processes and communication channels may fail (that is, they may depart from
what is considered to be correct or desirable behavior).
• The failure model defines the ways in which failures may occur in order to provide an understanding of the
effects of failures.
‣ Omission failures: a process or communication channel fails to perform actions that it is supposed to do.

‣ Arbitrary failures: the term arbitrary is used to describe the worst possible failure semantics, in which any
type of error may occur.
• Arbitrary failure of a process: the process arbitrarily omits intended processing steps or takes unintended
processing steps.
• Communication channel arbitrary failures: message contents may be corrupted or non-existent messages
may be delivered or real messages may be delivered more than once.

‣ Timing failures: applicable in synchronous distributed systems, where time limits are set on process
execution time, message delivery time and clock drift rate.

Security model: the openness of distributed systems exposes them to attack by both external and internal
agents.
• The security of a distributed system can be achieved by securing the processes and the channels used for
their interactions and by protecting the objects that they encapsulate against unauthorized access.

c.) Resource sharing and the web challenges:-


Three fundamental elements comprise the technology architecture of the Web:
■ Uniform Resource Locators (URL) - A standard syntax used for creating identifiers that point to
Web-based resources, the URL is often structured using a logical network location.
■ Hypertext Transfer Protocol (HTTP) - This is the primary communications protocol used to exchange
Web content and data throughout the World Wide Web. URLs are typically transmitted via HTTP.
■ Markup Languages (HTML, XML) - Markup languages provide a lightweight means of expressing
Web-centric data and metadata. The two primary markup languages are HTML (which is used to
express the presentation of Web pages) and XML (Extensible Markup Language, which defines a set
of rules for encoding documents in a format that is both human-readable and machine-readable).
For example, a Web browser can request to execute an action like read, write, update, or delete on a Web
resource on the Internet, and proceed to identify and locate the Web resource through its URL. The request is
sent using HTTP to the resource host, which is also identified by a URL. The Web server locates the Web
resource and performs the requested operation, which is followed by a response being sent back to the client.
The response may be comprised of content that includes HTML and XML statements.
Web resources are represented as hypermedia as opposed to hypertext, meaning media such as graphics,
audio, video, plain text, and URLs can be referenced collectively in a single document. Some types of
hypermedia resources cannot be rendered without additional software or Web browser plug-ins.

d.) Challenges and Goals of Distributed Systems :-



Designing a distributed system is neither easy nor straightforward. A number of challenges need to
be overcome in order to get the ideal system. The major challenges in distributed systems are listed below:

1. Heterogeneity:
The Internet enables users to access services and run applications over a heterogeneous collection of
computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:

Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
Operating systems: MS Windows, Linux, macOS, UNIX, etc.
Network: local network, the Internet, wireless network, satellite links, etc.
Programming languages: Java, C/C++, Python, PHP, etc.

Middleware : The term middleware applies to a software layer that provides a programming abstraction as well
as masking the heterogeneity of the underlying networks, hardware, operating systems and programming
languages. Most middleware is implemented over the Internet protocols, which themselves mask the
differences of the underlying networks, but all middleware deals with the differences in operating systems
and hardware.
Heterogeneity and mobile code : The term mobile code is used to refer to program code that can be
transferred from one computer to another and run at the destination – Java applets are an example. Code
suitable for running on one computer is not necessarily suitable for running on another because executable
programs are normally specific both to the instruction set and to the host operating system.

2. Transparency:
Transparency is defined as the concealment from the user and the application programmer of the separation of
components in a distributed system, so that the system is perceived as a whole rather than as a collection of
independent components. In other words, distributed systems designers must hide the complexity of the
systems as much as they can. Some terms of transparency in distributed systems are:
Access - Hide differences in data representation and how a resource is accessed.
Concurrency - Hide that a resource may be shared by several competitive users.
Failure - Hide the failure and recovery of a resource.
Location - Hide where a resource is located.
Migration - Hide that a resource may move to another location.
Persistence - Hide whether a (software) resource is in memory or on disk.
Relocation - Hide that a resource may be moved to another location while in use.
Replication - Hide that a resource may be copied in several places.

3. Openness
The openness of a computer system is the characteristic that determines whether the system can be extended
and reimplemented in various ways. The openness of distributed systems is determined primarily by the
degree to which new resource-sharing services can be added and be made available for use by a variety of
client programs. If the well-defined interfaces for a system are published, it is easier for developers to add new
features or replace sub-systems in the future. Example: Twitter and Facebook have APIs that allow developers
to build their own software on top of them.

4. Concurrency
Both services and applications provide resources that can be shared by clients in a distributed system. There
is therefore a possibility that several clients will attempt to access a shared resource at the same time. For an
object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data
remains consistent. This can be achieved by standard techniques such as semaphores, which are used in
most operating systems.
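The synchronization described above can be sketched in Java, where the synchronized keyword plays the role of the semaphore guarding a shared object. The SafeCounter class and the counts are illustrative assumptions, not from the notes.

```java
// A shared counter made safe for concurrent clients: the synchronized
// methods ensure that only one thread at a time can read or update the
// object's state, so its data remains consistent.
public class SafeCounter {
    private int value = 0;

    public synchronized void increment() { value++; }
    public synchronized int get() { return value; }

    public static void main(String[] args) throws InterruptedException {
        SafeCounter c = new SafeCounter();
        // Two "clients" updating the shared resource concurrently.
        Runnable client = () -> { for (int i = 0; i < 10000; i++) c.increment(); };
        Thread t1 = new Thread(client), t2 = new Thread(client);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // With synchronized this is always 20000; without it, concurrent
        // read-modify-write interleavings could lose updates.
        System.out.println(c.get());
    }
}
```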

5. Security
Many of the information resources that are made available and maintained in distributed systems have a high
intrinsic value to their users. Their security is therefore of considerable importance. Security for information
resources has three components:
Confidentiality - protection against disclosure to unauthorized individuals.
Integrity - protection against alteration or corruption.
Availability for the authorized - protection against interference with the means to access the resources.

6. Scalability
Distributed systems must be scalable as the number of users increases. A system is said to be scalable if it can
handle the addition of users and resources without suffering a noticeable loss of performance or increase in
administrative complexity. Scalability has 3 dimensions:

Size - Number of users and resources to be processed. The associated problem is overloading.
Geography - Distance between users and resources. The associated problem is communication reliability.
Administration - As the size of a distributed system increases, more of the system needs to be controlled.
The associated problem is administrative mess.

7. Failure Handling
Computer systems sometimes fail. When faults occur in hardware or software, programs may produce
incorrect results or may stop before they have completed the intended computation. The handling of failures is
particularly difficult.

Unit-2 (Networking Issues for Distributed Systems)


a) External data representation:
The information stored in running programs is represented as data structures – for example, by sets of
interconnected objects – whereas the information in messages consists of streams of bytes. Irrespective of the
form of communication used, the data structures must be flattened (converted to a sequence of bytes) before
transmission and rebuilt on arrival. Messages can contain primitive data of many different types, and not all
computers store primitive values such as integers in the same order.

One of the following methods can be used to enable any two computers to exchange binary data values:
• The values are converted to an agreed external format before transmission and converted to the local form
on receipt; if the two computers are known to be the same type, the conversion to external format can be
omitted.
• The values are transmitted in the sender’s format, together with an indication of the format used, and the
recipient converts the values if necessary.
Note, however, that bytes themselves are never altered during transmission. To support RMI or RPC, any data
type that can be passed as an argument or returned as a result must be able to be flattened and the primitive
data values represented in an agreed format. An agreed standard for the representation of data structures
and primitive values is called an external data representation.

b) Marshalling and Unmarshalling:


Marshalling is the process of taking a collection of data items and assembling them into a form suitable for
transmission in a message. Thus marshalling consists of the translation of data structures and primitive values
into an external data representation. Unmarshalling is the process of disassembling them on arrival to
produce an equivalent collection of data items at the destination. Unmarshalling consists of the generation of
primitive values from their external data representation and the rebuilding of the data structures.
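A minimal sketch of marshalling and unmarshalling, using Java's big-endian DataOutput format as the agreed external representation. The record layout (an int followed by a string) is invented for illustration.

```java
import java.io.*;

// Marshalling/unmarshalling sketch: a record holding an int and a string
// is flattened to a byte sequence in an agreed format and rebuilt on
// "arrival". DataOutputStream writes primitives in big-endian order, so
// both ends interpret the bytes identically regardless of machine type.
public class MarshalDemo {
    static byte[] marshal(int id, String name) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(id);    // primitive value in the agreed (big-endian) format
        out.writeUTF(name);  // length-prefixed, UTF-encoded string
        return buf.toByteArray();
    }

    static String unmarshal(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        int id = in.readInt();       // rebuild the primitives in order
        String name = in.readUTF();
        return id + ":" + name;
    }

    public static void main(String[] args) throws IOException {
        byte[] message = marshal(42, "alice");  // flatten for transmission
        System.out.println(unmarshal(message)); // prints: 42:alice
    }
}
```

Real systems use standardized representations (CORBA CDR, Java object serialization, XML) instead of this ad hoc layout, but the flatten-then-rebuild shape is the same.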

c) Remote Procedure call:

Remote Procedure Call (RPC) is a protocol that enables one program to request a service from a program
located on another computer on a network without having to understand the network's details. A procedure call
is also sometimes known as a function call or a subroutine call.

RPC uses the client-server model. The requesting program is a client and the service providing program is the
server. Like a regular or local procedure call, an RPC is a synchronous operation requiring the requesting
program to be suspended until the results of the remote procedure are returned. However, the use of
lightweight processes or threads that share the same address space allows multiple RPCs to be performed
concurrently.

RPC message procedure


When programs that use the RPC framework are compiled into an executable program, a stub is included in the
compiled code that acts as the representative of the remote procedure code. When the program is run and the
procedure call is issued, the stub receives the request and forwards it to a client runtime program on the local
computer.

Explanation of Remote Procedure Call

d) Remote Method Invocation :

Remote Method Invocation (RMI) is an API which allows an object to invoke a method on an object that exists
in another address space, which could be on the same machine or on a remote machine. Through RMI, an object
running in a JVM on one computer (the client side) can invoke methods on an object in another JVM
(the server side). RMI creates a public remote server object that enables client- and server-side communication
through simple method calls on the server object.

Working of RMI

The communication between client and server is handled by using two intermediate objects: Stub object (on
client side) and Skeleton object (on server side).

Stub Object

The stub object on the client machine builds an information block and sends this information to the server. The
block consists of

● An identifier of the remote object to be used
● The name of the method which is to be invoked
● Parameters to the remote JVM

Skeleton Object

The skeleton object passes the request from the stub object to the remote object. It performs the following tasks:

● It calls the desired method on the real object present on the server.
● It forwards the parameters received from the stub object to the method.

e) Difference between RMI and RPC:

1. Approach:

RMI uses an object-oriented paradigm where the user needs to know the object and the method of the object
he needs to invoke.
RPC doesn't deal with objects. Rather, it calls specific subroutines that are already established.

2. Working​:

With RPC, you get a procedure call that looks pretty much like a local call. RPC handles the complexities
involved with passing the call from local to the remote computer.
RMI does the very same thing, but RMI passes an object and the method that is being called.
RMI = RPC + object orientation

3. Better one​:

RMI is a better approach than RPC, especially for larger programs, as it yields cleaner code in which it is
easier to identify what went wrong.

f.) Invocation Semantics in RMI

1. Exactly-once semantics – If the server machine does not go down, the call executes exactly once; if it does,
the call may not execute at all. These semantics require the detection of duplicate requests, but they work for
non-idempotent operations.
2. Maybe semantics – The caller cannot determine whether or not the remote method has been executed.
3. At-least-once semantics – The call executes at least once, as long as the server machine does not fail.
These semantics require very little overhead and are easy to implement. The client machine continues to send
call requests to the server machine until it gets an acknowledgement. If one or more acknowledgements are
lost, the server may execute the call multiple times. This approach works only if the requested operation is
idempotent, that is, multiple invocations of it return the same result. Servers that implement only idempotent
operations must be stateless, that is, they must not change global state in response to client requests. Thus,
RPC systems that support these semantics rely on the design of stateless servers.
4. At-most-once semantics – RMI implements "at most once" semantics (some other distributed systems
provide "at least once" semantics). If a method returns normally, it is guaranteed to have executed exactly once.
However, if an exception such as a MarshalException occurs, the caller doesn't know whether the exception
occurred when transmitting the call or when returning the result, i.e. the caller cannot determine whether the
remote method executed. In this case, the client may wish to attempt the invocation again. To facilitate this,
the implementation of a remote method should be idempotent where possible, which means that the operation
can be executed multiple times with the same effect as executing it once. Even a method such as an account
withdrawal can be made idempotent by passing an operation ID with the invocation: the remote object ignores
invocations for operation IDs that have already been performed.
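The operation-ID trick for the account withdrawal can be sketched as follows. The Account class and the opId scheme are illustrative assumptions, not part of the RMI API; a real remote object would also need to bound the set of remembered IDs.

```java
import java.util.HashSet;
import java.util.Set;

// Making a non-idempotent operation (a withdrawal) idempotent: the
// server remembers the operation IDs it has already performed, so a
// client retry after a lost reply does not debit the account twice.
public class Account {
    private int balance;
    private final Set<String> performed = new HashSet<>();

    Account(int balance) { this.balance = balance; }

    // Returns the balance after the withdrawal; duplicate opIds are ignored.
    synchronized int withdraw(String opId, int amount) {
        if (performed.add(opId)) {  // true only the first time this opId is seen
            balance -= amount;
        }
        return balance;
    }

    public static void main(String[] args) {
        Account a = new Account(100);
        a.withdraw("op-1", 30);  // executes: balance becomes 70
        a.withdraw("op-1", 30);  // client retry: filtered as a duplicate
        System.out.println(a.withdraw("op-2", 20)); // prints: 50
    }
}
```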

RPC Systems: SUN RPC, DCE RPC


RMI Systems: Java RMI, CORBA, Microsoft DCOM/COM+, SOAP(Simple Object Access Protocol)

g.) Steps for implementing RMI:

1. Defining a remote interface.
2. Implementing the remote interface.
3. Creating Stub and Skeleton objects from the implementation class using rmic (the RMI compiler).
4. Starting the rmiregistry.
5. Creating and executing the server application program.
6. Creating and executing the client application program.

Step 1: Defining the remote interface

The first thing to do is to create an interface which will provide the description of the methods that can be
invoked by remote clients. This interface should extend the Remote interface, and each method prototype within
the interface should declare that it throws RemoteException.

// Creating a Search interface
import java.rmi.*;

public interface Search extends Remote
{
    // Declaring the method prototype
    public String query(String search) throws RemoteException;
}

Step 2: Implementing the remote interface

The next step is to implement the remote interface. To implement the remote interface, the class should extend
the UnicastRemoteObject class of the java.rmi.server package. Also, a default constructor needs to be created
to propagate the java.rmi.RemoteException thrown by its parent constructor.

// Java program to implement the Search interface
import java.rmi.*;
import java.rmi.server.*;

public class SearchQuery extends UnicastRemoteObject
        implements Search
{
    // Default constructor to throw RemoteException
    // from its parent constructor
    SearchQuery() throws RemoteException
    {
        super();
    }

    // Implementation of the query interface
    public String query(String search)
            throws RemoteException
    {
        String result;
        if (search.equals("Reflection in Java"))
            result = "Found";
        else
            result = "Not Found";

        return result;
    }
}
Step 3 : Creating Stub and Skeleton objects from the implementation class using rmic. The rmic tool is used to
invoke the RMI compiler, which creates the Stub and Skeleton objects. Its prototype is rmic classname. For the
above program, the following command needs to be executed at the command prompt:
rmic SearchQuery

STEP 4: Start the rmiregistry

Start the registry service by issuing the following command at the command prompt: start rmiregistry

STEP 5: Create and execute the server application program

The next step is to create the server application program and execute it on a separate command prompt.

● The server program uses the createRegistry method of the LocateRegistry class to create an rmiregistry
within the server JVM, with the port number passed as an argument.
● The rebind method of the Naming class is used to bind the remote object to a name.

// Program for the server application
import java.rmi.*;
import java.rmi.registry.*;

public class SearchServer
{
    public static void main(String args[])
    {
        try
        {
            // Create an object of the interface
            // implementation class
            Search obj = new SearchQuery();

            // rmiregistry within the server JVM with
            // port number 1900
            LocateRegistry.createRegistry(1900);

            // Binds the remote object by the name
            // geeksforgeeks
            Naming.rebind("rmi://localhost:1900" +
                          "/geeksforgeeks", obj);
        }
        catch (Exception ae)
        {
            System.out.println(ae);
        }
    }
}
Step 6: Create and execute the client application program
The last step is to create the client application program and execute it on a separate command prompt. The
lookup method of the Naming class is used to get the reference of the Stub object.

// Program for the client application
import java.rmi.*;

public class ClientRequest
{
    public static void main(String args[])
    {
        String answer, value = "Reflection in Java";
        try
        {
            // lookup method to find the reference of the remote object
            Search access =
                (Search)Naming.lookup("rmi://localhost:1900" +
                                      "/geeksforgeeks");
            answer = access.query(value);
            System.out.println("Article on " + value +
                               " " + answer + " at GeeksforGeeks");
        }
        catch (Exception ae)
        {
            System.out.println(ae);
        }
    }
}

h.) Distributed objects and Distributed events.

The idea of distributed objects is an extension of the concept of remote procedure calls. In a remote procedure
call system, code is executed remotely via a remote procedure call. The unit of distribution is the procedure /
function / method (used as synonyms). So the client has to import a client stub (either manually, as in RPC, or
automatically, as in RMI) to allow it to connect to the server offering the remote procedure. In a system for
distributed objects, the unit of distribution is the object. That is, a client imports a "something" (in Java's JINI
system, it's called a proxy) which allows the client access to the remote object as if it were part of the original
client program (as with RPC and RMI, more or less transparently).

Distributed event-based systems extend the local event model by allowing multiple objects at different
locations to be notified of events taking place at an object.
They have two characteristics: they are heterogeneous and asynchronous. Such systems can be useful in, for
example, a stock dealing room.

Publish-subscribe paradigm:
Publisher: sends notifications.
Subscriber: registers interest to receive notifications.
Object of interest: where events happen; its state changes as a result of its operations being invoked.
Event: occurs in the object of interest.
Notification: an object containing information about an event.
Observer objects: decouple an object of interest from its subscribers.
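A minimal, single-process sketch of the publish-subscribe paradigm using the vocabulary above. EventChannel and the stock-ticker example are illustrative assumptions, not a real distributed event framework (which would deliver notifications across machines, asynchronously).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Publish-subscribe sketch: subscribers register interest; when the
// object of interest publishes a notification, every registered
// subscriber is called back. The channel decouples publisher from
// subscribers, as the observer objects described in the notes do.
public class EventChannel {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    public void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    public void publish(String notification) {
        for (Consumer<String> s : subscribers) s.accept(notification);
    }

    public static void main(String[] args) {
        EventChannel stockTicker = new EventChannel(); // e.g. a dealing-room feed
        stockTicker.subscribe(n -> System.out.println("dealer A saw: " + n));
        stockTicker.subscribe(n -> System.out.println("dealer B saw: " + n));
        stockTicker.publish("ACME rose to 101"); // both dealers are notified
    }
}
```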

RPC ISSUES (Not Important)


§ Issues that must be addressed:
1. RPC Runtime: The RPC run-time system is a library of routines and a set of services that handle the
network communications that underlie the RPC mechanism. In the course of an RPC call, the client-side
and server-side run-time systems' code handles binding, establishes communications over an
appropriate protocol, passes call data between the client and server, and handles communications
errors.
2. Stub: The function of the stub is to provide transparency to the programmer-written application
code.
On the client side, the stub handles the interface between the client’s local procedure call and the
run-time system, marshaling and unmarshaling data, invoking the RPC run-time protocol, and if
requested, carrying out some of the binding steps.
On the server side, the stub provides a similar interface between the run-time system and the local
manager procedures that are executed by the server.
3. Binding: How does the client know who to call, and where the service resides?
The most flexible solution is to use dynamic binding and find the server at run time when the RPC is
first made. The first time the client stub is invoked, it contacts a name server to determine the
transport address at which the server resides.

Binding consists of two parts:


§ Naming:
Remote procedures are named through interfaces. An interface uniquely identifies a particular
service, describing the types and numbers of its arguments. It is similar in purpose to a type definition
in programming languages.
§ Locating:
Finding the transport address at which the server actually resides. Once we have the transport
address of the service, we can send messages directly to the server.

A server having a service to offer exports an interface for it. Exporting an interface registers it with the
system so that clients can use it.
A client must import an (exported) interface before communication can begin.

Unit-3 (Distributed File Systems)


Distributed file systems:

A distributed file system enables programs to store and access remote files exactly as they do local ones,
allowing users to access files from any computer on a network. The performance and reliability experienced for
access to files stored at a server should be comparable to that for files stored on local disks.

The design of large-scale wide area read-write file storage systems poses problems of load balancing,
reliability, availability and security.

a.) ​File Server Architecture ​:- ​File server architecture provides access to files by structuring the file service
as three components:
1. Flat file service
2. Directory service
3. Client module
The relevant modules and their relationships are shown in the figure.

Flat file service​ :-


Responsibilities of various modules can be defined as follows:
• Concerned with the implementation of operations on the contents of file.
• Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations.

Operations​ :-
1. Read 2. Write 3. Create 4. Delete 5. GetAttributes 6. SetAttributes
Read(FileId, i, n) : Reads a sequence of up to n items from a file, starting at item i.
Write(FileId, i, Data) : Writes a sequence of data items to a file, starting at item i.
Create() : Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId) : Removes the file from the file store.
GetAttributes(FileId) : Returns the attributes of the file.
SetAttributes(FileId, Attr) : Sets the file attributes.
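As an illustration only, the six flat file operations above can be sketched as an in-memory service. The class name, the integer UFIDs, and the use of single bytes as "items" are assumptions for the sketch, not part of any real flat file service.

```python
# Minimal in-memory sketch of the flat file service interface above.
# UFIDs are small integers here; a real service would use large,
# hard-to-guess identifiers. Items are modelled as single bytes.

class FlatFileService:
    def __init__(self):
        self._files = {}        # UFID -> bytearray of items
        self._attrs = {}        # UFID -> attribute record
        self._next_ufid = 1     # hypothetical UFID generator

    def create(self):
        """Creates a new file of length 0 and delivers a UFID for it."""
        ufid = self._next_ufid
        self._next_ufid += 1
        self._files[ufid] = bytearray()
        self._attrs[ufid] = {"length": 0}
        return ufid

    def write(self, ufid, i, data):
        """Writes a sequence of items to the file, starting at item i."""
        f = self._files[ufid]
        if len(f) < i:
            f.extend(b"\x00" * (i - len(f)))   # pad a sparse write
        f[i:i + len(data)] = data
        self._attrs[ufid]["length"] = len(f)

    def read(self, ufid, i, n):
        """Reads a sequence of up to n items, starting at item i."""
        return bytes(self._files[ufid][i:i + n])

    def delete(self, ufid):
        """Removes the file from the file store."""
        del self._files[ufid], self._attrs[ufid]

    def get_attributes(self, ufid):
        return dict(self._attrs[ufid])
```

Note how every operation takes a UFID rather than a text name: name-to-UFID mapping is the directory service's job, which keeps the flat file service itself stateless about naming.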

Directory service​ :-
• Provides mapping between text names for the files and their UFIDs.
• Clients may obtain the UFID of a file by quoting its text name to directory service.
• Directory service supports functions to add new files to directories.

Operations​ :-
1. Lookup 2. AddName 3. UnName 4. GetNames
Lookup(Dir, Name) : Locates the text name in the directory and returns the relevant UFID. If Name is not in the
directory, throws an exception.
AddName(Dir, Name, File) : If Name is not in the directory, adds(Name,File) to the directory and updates the
file’s attribute record. If Name is already in the directory, throws an exception.
UnName(Dir, Name) :If Name is in the directory, the entry containing Name is removed from the directory. If
Name is not in the directory, throws an exception.
GetNames(Dir, Pattern):Returns all the text names in the directory that match the regular expression Pattern.
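The four directory operations above can likewise be sketched over a plain dict mapping names to UFIDs. The function names and the `DirectoryError` exception type are assumptions made for the sketch.

```python
# Sketch of the directory service operations above: a directory is a
# dict from text name to UFID; Pattern is a regular expression.
import re

class DirectoryError(Exception):
    """Raised when a name is (or is not) already in the directory."""

def lookup(directory, name):
    """Locates the text name and returns the relevant UFID."""
    if name not in directory:
        raise DirectoryError(name)
    return directory[name]

def add_name(directory, name, ufid):
    """Adds (Name, File) if Name is not already present."""
    if name in directory:
        raise DirectoryError(name)
    directory[name] = ufid

def un_name(directory, name):
    """Removes the entry containing Name."""
    if name not in directory:
        raise DirectoryError(name)
    del directory[name]

def get_names(directory, pattern):
    """Returns all text names that match the regular expression."""
    return sorted(n for n in directory if re.fullmatch(pattern, n))
```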

Client module​ ​:-


• It runs on each computer and provides integrated service (flat file and directory) as a single API to application
programs.
• It holds information about the network locations of flat-file and directory server processes.

b.) Sun Network File System ​:- ​provides transparent, remote access to filesystems. Unlike many other
remote file system implementations under UNIX, NFS is designed to be easily portable to other operating
systems and machine architectures. It uses an External Data Representation (XDR) specification to describe
protocols in a machine and system independent way. NFS is implemented on top of a Remote Procedure Call
package (RPC) to help simplify protocol definition, implementation, and maintenance.

The “filesystem interface” consists of two parts:


(a) The Virtual File System (VFS) interface defines the operations that can be done on a filesystem.
(b) The virtual node (vnode) interface defines the operations that can be done on a file within that filesystem.
This new interface allows us to install and implement new filesystems in much the same way as new device
drivers are added to the kernel.

Design Goals​ -
(a) NFS was designed to simplify the sharing of filesystem resources in a network of non-homogeneous
machines.
(b) The overall design goals of NFS were: Machine and Operating System Independence. The protocols used
should be independent of UNIX so that an NFS server can supply files to many different types of clients. The
protocols should also be simple enough that they can be implemented on low-end machines like the PC.

(c) Crash Recovery - When clients can mount remote filesystems from many different servers it is very
important that clients and servers be able to recover easily from machine crashes and network problems.

(d) Transparent Access - to provide a system which allows programs to access remote files in exactly the
same way as local files, without special pathname parsing, libraries, or recompiling. Programs should not need
or be able to tell whether a file is remote or local.

d.) Andrew File System :- Motivation of AFS


• Information sharing – share information among a large number of users
• Scalability – a large number of users, a large number of files, and many users accessing the hot files
• Unusual designs – whole-file serving and whole-file caching.

e.) Mounting:-
The process of including a new filesystem.
/etc/exports has filesystems that can be mounted by others.
Clients use a modified mount command for remote filesystems.
Communicates with the mount process on the server in a mount protocol.

Hard-mounted
• user process is suspended until request is successful
• when server is not responding, request is retried until it's satisfied.
Soft-mounted
• if server fails, client returns failure after a small number of retries.
• user process handles the failure

AutoFS
You can mount NFS file system resources by using a client-side service called automounting (or AutoFS), which enables a system to
automatically mount and unmount NFS resources whenever you access them. The resource remains mounted as long as you remain in
the directory and are using a file. If the resource is not accessed for a certain period of time, it is automatically unmounted.
AutoFS provides the following features:
● NFS resources don't need to be mounted when the system boots, which saves booting time.
● Users don't need to know the root password to mount and unmount NFS resources.
● Network traffic might be reduced, since NFS resources are only mounted when they are in use.

f.) Discovery service:- A directory service that registers the services provided in a spontaneous networking
environment. It provides an interface for automatically registering and de-registering services, as well as an interface
for clients to look up the services they require. Ex: a printer (or the service that manages it) may register its
attributes with the discovery service as follows: ‘resourceClass = printer, type=laser, color=yes, resolution=600dpi,
location=room101, url=​http://www.hotelNW.com/services/printer98​’.

Jini Discovery Service: designed for spontaneous networking and entirely Java-based. Computers
communicate by means of RMI, and can download code if necessary. The discovery-related components in a Jini
system are lookup services. A Jini service (such as a printing service) may be registered with many lookup services.

g.) Domain Name System(port, tree structure, block diagram name to ip):-
Port : 53
Domain Name System(DNS) tree structure:-

Working of DNS:-

h.) Name Services :- A name service stores a collection of bindings between textual names and attributes
for objects. It provides facilities for resource location, email addressing and authentication. As the naming database
grows from small to large scale, the structure of the name space may change to reflect changes in
organizational structure, and the service should accommodate this. It is difficult to
maintain complete consistency between all copies of a database entry.

Namespace = collection of all valid names recognised by a service with – a syntax for specifying names, and – rules
for resolving names (e.g., left to right)
Name Resolution
The way these hostnames are resolved to their mapped IP addresses is called Domain Name
Resolution. On almost all operating systems, whether they be Apple, Linux, Unix, Netware, or
Windows, the majority of resolutions from domain names to IP addresses are done through a
procedure called DNS.

Unit-4 (​Global States and Coordination​)


a.) Impact of absence of global clock and shared memory:-

Two inherent limitations of distributed systems are: lack of global clock and lack of shared memory. This has
two important implications. First, due to the absence of any system-wide clock that is equally accessible to all
processes, the notion of common time does not exist in a distributed system; different processes may have
different notions of time. As a result, it is not always possible to determine the order in which two events on
different processes were executed. Second, since processes in a distributed system do not share common
memory, it is not possible for an individual process to obtain an up-to-date state of the entire system.

b.)​ ​Bully Algorithm:-


Background: any process Pi sends a message to the current coordinator; if there is no response in T time units, Pi tries
to elect itself as leader. It works as follows:

Algorithm for process Pi that detected the lack of a coordinator


1. Process Pi sends an “Election” message to every process with higher priority.
2. If no other process responds, process Pi starts the coordinator code and sends a message to all
processes with lower priorities saying “Elected Pi”.
3. Else, Pi waits for T’ time units to hear from the new coordinator; if there is no response, it starts from
step (1) again.

Algorithm for other processes (also called Pi)


If Pi is not the coordinator, then Pi may receive either of these messages from Pj:

If Pj sends “Elected Pj” [this message is only received if i < j]:

Pi updates its records to say that Pj is the coordinator.
Else if Pj sends an “election” message (i > j):
Pi sends a response to Pj saying it is alive.
Pi starts an election.
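The steps above can be condensed into a single-round simulation. Message passing is replaced by direct lookups in an `alive` set, and the intermediate elections held by lower responders are elided (the outcome is the same); `bully_election` and the message tuples are names invented for this sketch.

```python
# Simulation of the bully algorithm: the initiator sends ELECTION to all
# higher-numbered processes; if any live one responds, the highest of
# them carries on, until the top live process announces COORDINATOR.

def bully_election(initiator, alive):
    """alive: set of live process ids. Returns (coordinator, messages)."""
    messages = []
    pid = initiator
    while True:
        higher = sorted(p for p in alive if p > pid)
        for p in higher:
            messages.append(("ELECTION", pid, p))
        if not higher:
            # Nobody higher is alive: pid wins and announces itself.
            for p in sorted(alive - {pid}):
                messages.append(("COORDINATOR", pid, p))
            return pid, messages
        # Every live higher process replies OK; the highest of them
        # takes over the election (intermediate elections elided here).
        pid = max(higher)
```

With processes 0..6 alive and 7 crashed, an election started by 4 sends ELECTION to 5 and 6, and 6 ends up broadcasting COORDINATOR, matching the worked example below.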

Example:
The group consists of eight processes, numbered from 0 to 7.
● Previously process 7 was the coordinator, but it has just crashed.
● Process 4 is the first one to notice this, so it sends ELECTION messages to all the processes higher than it,
namely 5, 6, and 7, as shown in Figure (a).
● Processes 5 and 6 both respond with OK, as shown in Figure (b).
● Upon getting the first of these responses, 4 knows that its job is over. It knows that one of these bigwigs will
take over and become the coordinator. It just sits back and waits to see who the winner will be.
● Figure (c), both 5 and 6 hold elections, each one only sending messages to those processes higher than
itself.
● Figure (d) process 6 tells 5 that it will take over.

● At this point, 6 knows that 7 is dead and that it (6) is the winner.
● If there is state information to collect from disk or elsewhere to pick up where the old coordinator left off, 6
must now do what is needed.
● When it is ready to take over, 6 announces this by sending a COORDINATOR message to all running
processes.
● When 4 gets this message, it can now continue with the operation it was trying to do when it discovered that
7 was dead, but using 6 as the coordinator this time.
● In this way, the failure of 7 is handled and the work can continue.
● If process 7 is ever restarted, it will just send all the others a COORDINATOR message and bully them into
submission.

c.) Clock Synchronization


● The common approach to clock synchronization is to have many computers make use of a time server.
● Typically the time server is equipped with special hardware that provides a more accurate time than
does a cheaper computer timer
● The challenge with this approach is that there is a delay in the transmission from the time server to the
client receiving the time update.
● This delay is not constant for all requests. Some request may be faster and others slower.

● We can solve this problem using a relative time correction C, which can be calculated as:

C = ((T2 − T1) + (T3 − T4)) / 2

● The way this works is that the client sends a packet with T1 recorded to the time server. The time
server will record the receipt time of the packet, T2. When the response is sent, the time server will write
its current time, T3, to the response. When the client receives the response packet, it will record T4 from
its local clock.
● When the value of C is worked out, the client can correct its local clock.
● The client must be careful. If the value of C is positive, then C can be added to the software clock.
● If the value of C is negative, then the client must artificially decrease the number of milliseconds added
to its software clock until the offset is cleared.
● It is always inadvisable to cause the clock to go backwards. Most software that relies on time will not
react well to this.
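The four-timestamp exchange above reduces to one line of arithmetic: T1 and T4 come from the client clock, T2 and T3 from the server clock, and averaging the two differences cancels the network delay when it is symmetric (an assumption this correction relies on).

```python
# Correction C from the exchange above: positive C means the client
# clock is behind the server and C should be added to it.

def clock_correction(t1, t2, t3, t4):
    """C = ((T2 - T1) + (T3 - T4)) / 2."""
    return ((t2 - t1) + (t3 - t4)) / 2
```

For example, a client 20 units behind with a delivery delay of 0 each way gives clock_correction(90, 110, 120, 100) = 20, while a perfectly synchronized client with a 5-unit delay each way gives a correction of 0.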

Physical Clocks
● Computer Timer : an integrated circuit that contains a precise machined quartz crystal. When kept
under tension the quartz crystal oscillates at a well-defined frequency.

● Clock Tick : after a predefined number of oscillations, the timer will generate a clock tick. This clock tick
generates a hardware interrupt that causes the computer’s operating system to enter a special routine
in which it can update the software clock and run the process scheduler.

Physical Clocks - Multiple Systems


● Unfortunately, it is impossible for each machined quartz crystal in every computer timer to be exactly
the same. These differences create clock skew.
● For example, if a timer interrupts 60 times per second, it should generate 216,000 ticks per hour. But in
practice, the real number of ticks is typically between 215,998 and 216,002 per hour. This means that
we aren’t actually getting precisely 60 ticks per second.
● We can say that a timer is within specification if there is some constant p such that:

1 − p <= dC/dT <= 1 + p

● The constant p is the maximum drift rate of the timer.


● On any two given computers, the drift rate will likely differ.
● To solve this problem, clock synchronization algorithms are necessary.
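The specification check can be written directly from the inequality above, using the tick counts from the example (216,000 nominal ticks per hour against an observed count); `within_spec` is a name chosen for this sketch.

```python
# Checks the drift-rate bound 1 - p <= dC/dT <= 1 + p, where dC is the
# number of ticks the software clock advanced over a real interval dT
# (both in the same unit, e.g. nominal ticks).

def within_spec(dC, dT, p):
    """True if the observed clock rate is within maximum drift rate p."""
    rate = dC / dT
    return (1 - p) <= rate <= (1 + p)
```

A timer that produced 215,998 ticks in an hour that should have had 216,000 is off by about 9.3 parts per million, so it is within spec for p = 1e-5 but not for p = 1e-6.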

(d) Election Algorithms


● The coordinator election problem is to choose a process from among a group of processes
on different processors in a distributed system to act as the central coordinator. An election
algorithm is an algorithm for solving the coordinator election problem.
Ring Algorithm :-
We assume that the processes are physically or logically ordered, so that each process knows who its
successor is. When any process notices that the coordinator is not functioning, it builds an ELECTION
message containing its own process number and sends the message to its successor. If the successor is
down, the sender skips over the successor and goes to the next member along the ring, or the one after that,
until a running process is located. At each step, the sender adds its own process number to the list in the
message.
Eventually, the message gets back to the process that started it all. That process recognizes this event
when it receives an incoming message containing its own process number. At that point, the message type is
changed to COORDINATOR and circulated once again, this time to inform everyone else who the coordinator
is (the list member with the highest number) and who the members of the new ring are. When this message
has circulated once, it is removed and everyone goes back to work.
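As a sketch, the forwarding of the ELECTION message around the ring can be simulated with a list in ring order and a set of live processes; dead successors are skipped exactly as described, and the function name is an invention of this sketch.

```python
# Simulation of the ring election above: the ELECTION message
# accumulates the process numbers of live members as it is forwarded,
# and the initiator picks the highest number when it returns.

def ring_election(initiator, ring, alive):
    """ring: all process ids in ring order; alive: set of live ids.
    Returns (coordinator, members_in_forwarding_order)."""
    members = []
    pid = initiator
    while True:
        members.append(pid)
        # Advance to the next live process, skipping dead successors.
        j = (ring.index(pid) + 1) % len(ring)
        while ring[j] not in alive:
            j = (j + 1) % len(ring)
        pid = ring[j]
        if pid == initiator:
            # Message is back at the starter: highest number wins.
            return max(members), members
```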

Fig. 3-13. Election algorithm using a ring.

In Fig. 3-13 we see what happens if two processes, 2 and 5, discover simultaneously that the previous coordinator,
process 7, has crashed. Each of these builds an ELECTION message and starts circulating it. Eventually, both messages will
go all the way around, and both 2 and 5 will convert them into COORDINATOR messages, with exactly the same members
and in the same order. When both have gone around again, both will be removed. It does no harm to have extra messages
circulating; at most it wastes a little bandwidth.

f.) Multicast Communication

g.) Two algorithms of mutual exclusion :-


Lamport’s Algorithm
• Every node i has a request queue qi
– keeps requests sorted by logical timestamps.
• To request critical section: – send time stamped REQUEST(tsi, i) to all other nodes .
– put (tsi, i) in its own queue.
• On receiving a request (tsi, i):
– send time stamped REPLY to the requesting node i.
– put REQUEST(tsi, i) in the queue.
• To enter critical section
– Process i enters the critical section if:
- (tsi, i) is at the top of its own queue, and
- Process i has received a message with a timestamp larger than (tsi, i) from ALL other nodes.
• To release critical section:
Process i removes its request from its own queue and sends a time stamped RELEASE message to all
other nodes

• On receiving a RELEASE message from i, i’s request is removed from the local request queue.
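The heart of Lamport's algorithm is the local entry test: my request heads my queue, and I have heard something newer than it from every other node. The helper below sketches only that test; the queue of (timestamp, node) pairs and the `last_ts_from` map are representations chosen for this sketch, not part of the original formulation.

```python
# Node i's local entry condition in Lamport's mutual exclusion
# algorithm. Requests are (timestamp, node-id) pairs, kept sorted, so
# ties on timestamps are broken by node id (lexicographic order).

def can_enter_cs(i, queue, last_ts_from):
    """queue: node i's sorted request queue;
    last_ts_from: node id -> latest timestamp seen from that node."""
    if not queue or queue[0][1] != i:
        return False                  # my request is not at the top
    my_ts = queue[0][0]
    # Need a message with a larger timestamp from ALL other nodes.
    return all(ts > my_ts for node, ts in last_ts_from.items() if node != i)
```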

Peterson Algorithm : is a concurrent programming algorithm for mutual exclusion that allows two or more
processes to share a single-use resource without conflict, using only shared memory for communication. The
algorithm uses two variables, flag and turn. A flag[n] value of true indicates that process n wants to enter the
critical section. Entrance to the critical section is granted for process P0 if P1 does not want to enter its critical
section or if P1 has given priority to P0 by setting turn to 0.
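The two-process case described above can be demonstrated with two threads incrementing a shared counter; the entry and exit protocols use exactly the flag and turn variables from the text. This demo relies on CPython's interpreter giving sequentially consistent updates to these variables; on real hardware with weaker memory ordering, Peterson's algorithm additionally needs memory barriers.

```python
# Peterson's algorithm for two processes: flag[n] announces intent,
# turn gives the other process priority, and the spin loop waits until
# entry is safe. The counter is the shared single-use resource.
import threading
import time

flag = [False, False]
turn = 0
counter = 0
N = 2000   # iterations per process (chosen for the demo)

def process(me):
    global turn, counter
    other = 1 - me
    for _ in range(N):
        flag[me] = True          # I want to enter
        turn = other             # but the other process goes first
        while flag[other] and turn == other:
            time.sleep(0)        # busy-wait, yielding to the other thread
        counter += 1             # critical section
        flag[me] = False         # exit section

threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
for t in threads: t.start()
for t in threads: t.join()
```

If mutual exclusion held throughout, no increment is lost and the final counter equals 2 * N.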

Clock drift : Clock drift refers to several related phenomena where a clock does not run at exactly the same
rate as a reference clock. That is, after some time the clock "drifts apart" or gradually desynchronizes from
the other clock.

Unit-5 (​Transaction and concurrency control​)


a.) Common Object Request Broker Architecture ​:-

(CORBA) is an architecture and specification for creating, distributing, and managing distributed program
objects in a network. It allows programs at different locations and developed by different vendors to
communicate in a network through an "interface broker." CORBA was developed by an association of vendors
through the Object Management Group (OMG), which currently includes over 500 member companies. Both
International Organization for Standardization (ISO) and X/Open have sanctioned CORBA as the standard
architecture for distributed objects (which are also known as components).
CORBA 3 is the latest level.
The essential concept in CORBA is the Object Request Broker (ORB). ORB support in a network of clients and
servers on different computers means that a client program (which may itself be an object) can request
services from a server program or object without having to understand where the server is in a distributed
network or what the interface to the server program looks like. To exchange requests and responses between
ORBs, programs use the General Inter-ORB Protocol (GIOP) and, for the Internet, its Internet Inter-ORB
Protocol (IIOP). IIOP maps GIOP requests and replies to the Internet's Transmission Control Protocol (TCP)
layer in each computer.
Distributed Computing Environment (DCE), a distributed programming architecture that preceded the trend
toward OOP and CORBA, is currently used by a number of large companies. DCE will perhaps continue to
exist along with CORBA and there will be "bridges" between the two.

b.) Flat Transactions and Nested Transactions


• A flat client transaction completes each of its requests before going on to the next one. Therefore, each
transaction accesses servers’ objects sequentially.

Nested Transactions
* Structured as an inverted tree (the root at the top).
* The outermost transaction is the top-level transaction. Others are sub-transactions.
* A sub-transaction is atomic to its parent transaction.
* Sub-transactions at the same level can run concurrently.
* Each sub-transaction can fail independently of its parent and of the other sub-transactions.

Main advantages of nested transactions


* ​Additional concurrency in a transaction : Subtransactions at one level may run concurrently with other
sub-transactions at the same level in the hierarchy.

* More robust : Sub-transactions can commit or abort independently. For example, a transaction to deliver a
mail message to a list of recipients.

The rules for committing of nested transactions


* A transaction commits or aborts only after its child transactions have completed;
* When a sub-transaction completes, it makes an independent decision on provisionally commit or abort. Its
decision to abort is final.
* When a sub-transaction aborts, the parent can decide whether to abort or not.
* When a parent aborts, all of its sub-transactions are aborted, even though some of them may have
provisionally committed.
* When the top-level transaction commits, then all of the sub-transactions that have provisionally committed
can commit.

c.) Deadlock, Prevention, Detection, Lock Timeouts, Hierarchical Locks.

A deadlock is a condition in a system where a set of processes (or threads) have requests for resources that
can never be satisfied. Essentially, a process cannot proceed because it needs to obtain a resource held by
another process but it itself is holding a resource that the other process needs.

Distributed Deadlock Prevention


The algorithms below, by Rosenkrantz et al. (1978), prevent deadlock by breaking the No Preemption condition.
Assign each process a global timestamp when it starts. No two processes should have the same timestamp.
Basic idea: "When one process is about to block waiting for a resource that another process is using, a check
is made to see which has a larger timestamp (i.e. is younger)."
The timestamp on each process gives its creation time.
Suppose a process needs a resource already owned by another process.
● Determine relative ages of both processes.
● Decide if waiting process should
○ Preempt​,
○ Wait​,
○ Die​, or
○ Wound
The ​Wait-Die​ algorithm:
Allow wait only if waiting process is older.
Since timestamps increase in any chain of waiting processes, cycles are impossible.

The Wait-Die algorithm kills the younger process.


When the younger process restarts and requests the resource again, may be killed once more. This is the less
efficient of these two algorithms.
The ​Wound-Wait​ algorithm:
Otherwise allow wait only if waiting process is younger.
Here timestamps decrease in any chain of waiting process, so cycles are again impossible. It is wiser to give
older processes priority.

The Wound-Wait algorithm preempts the younger process. When the younger process re-requests resource,
it has to wait for older process to finish. This is the better of the two algorithms.

Note:​ To avoid ​starvation​, a process should ​not​ be assigned a new timestamp each time it restarts.
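The two schemes reduce to a comparison of creation timestamps at the moment a requester finds the resource held. The decision functions below are a sketch of that comparison (smaller timestamp = older process); the string return values are names chosen here.

```python
# The Rosenkrantz timestamp schemes as decision functions, applied when
# `requester` needs a resource currently held by `holder`.

def wait_die(requester_ts, holder_ts):
    """Non-preemptive: older requesters wait; younger ones die (abort
    and retry later with the SAME timestamp, to avoid starvation)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: older requesters wound (preempt) the younger holder;
    younger requesters wait for the older holder to finish."""
    return "wound" if requester_ts < holder_ts else "wait"
```

In both schemes every wait edge points in one timestamp direction, so a cycle of waiting processes is impossible.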

Distributed Deadlock Detection

a.) Centralized deadlock detection :


One server is selected as global deadlock detector.
Each server will send the latest copy of its local wait-for graph to this distinguished server.
Problems:
• Poor availability, lack of fault tolerance, inability to scale, and high traffic.
• Phantom deadlock: a situation where a deadlock is detected but does not really exist.
• It takes time to transmit local wait-for graphs. During that time, it is possible that some locks are released and
there is no longer a cycle in the new global wait-for graph.

b.) Distributed deadlock detection:


• when a process has to wait for a resource, a probe message is sent to the process holding that resource.
• The probe message contains three components: the process ID that blocked, the process ID that is sending
the request, and the destination.
• Initially, the first two components will be the same. When a process receives the probe:
• if the process itself is waiting on a resource, it updates the sending and destination fields of the
message and forwards it to the resource holder.
• If it is waiting on multiple resources, a message is sent to each process holding the resources. This
process continues as long as processes are waiting for resources. If the originator gets a message and
sees its own process number in the blocked field of the message, it knows that a cycle has been taken
and deadlock exists. In this case, some process (transaction) will have to die. The sender may choose
to commit suicide and abort itself or an election algorithm may be used to determine an alternate victim.
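For the simple case of each blocked process waiting on a single resource, the probe forwarding above can be sketched as a walk along the wait-for chain (the multi-resource case would fan the probe out to every holder). The map name and function name are inventions of this sketch.

```python
# Edge-chasing sketch: a probe (blocked, sender, destination) is
# forwarded along the wait-for chain; deadlock exists if it returns to
# the process that initiated it.

def detect_deadlock(initiator, waits_for):
    """waits_for: blocked process id -> id of the process holding the
    resource it needs. Returns True if the probe cycles back."""
    probe = (initiator, initiator, waits_for[initiator])
    seen = set()
    while True:
        blocked, sender, dest = probe
        if dest == initiator:
            return True               # probe returned: cycle detected
        if dest not in waits_for or dest in seen:
            return False              # chain ends at a running process
        seen.add(dest)
        # dest is itself waiting: update sender/destination and forward.
        probe = (blocked, dest, waits_for[dest])
```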

Lock Hierarchies

The idea of a lock hierarchy is to assign a numeric lock to every mutex in the system, and then consistently
follow two simple rules:
● Rule 1: While holding a lock on a mutex at level ​N​, you may only acquire new locks on mutexes at
lower levels ​<N​.
● Rule 2: Multiple locks at the same level must be acquired at the same time, which means we need a
"lock-multiple" operation such as lock( mut1, mut2, mut3, ... ). This operation internally makes sure it
always takes the requested locks in some consistent global order.
If the entire program follows these rules, then there can be no deadlock among the mutex acquire operations,
because no two pieces of code can ever try to acquire two mutexes ​a​ and ​b​ in opposite orders.
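The two rules can be enforced by a small per-thread tracker: refuse any acquisition at a level not strictly below everything already held, and sort same-level locks into one global order before taking them. The class and its `(level, name)` representation are assumptions of this sketch; real mutexes would be acquired where the tracker records them.

```python
# Sketch of a lock-hierarchy tracker enforcing the two rules above.

class LevelledLocks:
    def __init__(self):
        self.held = []   # (level, name) pairs, in acquisition order

    def lock(self, *mutexes):
        """mutexes: (level, name) pairs, all at the same level (Rule 2)."""
        levels = {lvl for lvl, _ in mutexes}
        assert len(levels) == 1, "lock-multiple takes locks at one level"
        lvl = levels.pop()
        # Rule 1: new locks must be at a strictly lower level than held ones.
        if self.held and lvl >= min(l for l, _ in self.held):
            raise RuntimeError("hierarchy violation at level %d" % lvl)
        # Consistent global order: sort by name before acquiring.
        for m in sorted(mutexes, key=lambda m: m[1]):
            self.held.append(m)

    def unlock_all(self):
        self.held.clear()
```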

d.) Problems due to concurrent execution


1.) Lost Update Problem

2.) Dirty Read Problem



3.) Inconsistent Retrieval

4.) Premature write - an aborted operation is reset to the wrong value


● Premature write is a problem related to the interaction between the write operations on the same object
belonging to different transactions.
● a - balance is $100
● T: a.setBalance($105) - (before image: 100)
● U: a.setBalance($110) - (before image: 105)
● U commits, T aborts and resets to 100 -- should be 110
● If T aborts then U aborts, result will be 105, but should be 100.

5.) Cascading Aborts

6.) Serial equivalence and conflicting operations


Two transactions are serial if all the operations in one transaction precede the operations in the other.
e.g. the following schedule is serial:
Ri(x) Wi(x) Ri(y) Rj(x) Wj(y)
Definition: Two operations are in conflict if:
● At least one is a write
● They both act on the same data
● They are issued by different transactions
Ri(x) Rj(x) Wi(x) Wj(y) Ri(y) has Rj(x), Wi(x) in conflict
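The three-part conflict definition above translates directly into a predicate. Representing an operation as a `(transaction, kind, item)` triple, e.g. `("i", "R", "x")` for Ri(x), is a convention chosen for this sketch.

```python
# Two operations conflict iff: at least one is a write, they act on the
# same data item, and they come from different transactions.

def in_conflict(op1, op2):
    t1, kind1, item1 = op1
    t2, kind2, item2 = op2
    return ("W" in (kind1, kind2)    # at least one is a write
            and item1 == item2        # same data item
            and t1 != t2)             # different transactions
```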

e.) Two-version locking – allows more concurrency by postponing write locks till commit time
– read operations are allowed while a write operation is being performed
– write operations are done on a tentative version of data items
– read operations are done on the committed version
– three types of locks: read, write, & commit locks

Hierarchic locks
– allow mixed-granularity locks, building a hierarchy of locks
– give the owner of a lock explicit access to a node in the hierarchy and implicit access to its children
– introduce an additional type of lock: intention-read/write
– before a child node is granted a read/write lock, an intention-to-read/write lock is set on the parent node

Problems in locking-based concurrency control


– extra overhead to manage locking, which may not be required.
– use of locks can give rise to deadlock.
– locks cannot be released until the end of the transaction, to avoid cascading aborts.

f.) Problems associated with aborting transaction


Premature write and cascading aborts.
