Module 1_Chapter 1
Module-1
CHARACTERIZATION OF DISTRIBUTED SYSTEMS: Introduction, Focus on resource
sharing, Challenges.
REMOTE INVOCATION: Introduction, Request-reply protocols, Remote procedure call,
Introduction to Remote Method Invocation.
Textbook: Chapters 1.1, 1.4, 1.5, 5.1-5.5
Introduction
We define a distributed system as one in which hardware or software components located at networked
computers communicate and coordinate their actions only by passing messages. This simple definition
covers the entire range of systems in which networked computers can usefully be deployed.
Computers that are connected by a network may be spatially separated by any distance. They may be
on separate continents, in the same building or in the same room. Our definition of distributed systems
has the following significant consequences:
Concurrency: In a network of computers, concurrent program execution is the norm. I can do my work
on my computer while you do your work on yours, sharing resources such as web pages or files when
necessary. The capacity of the system to handle shared resources can be increased by adding more
resources (for example, computers) to the network.
No global clock: When programs need to cooperate they coordinate their actions by exchanging
messages. Close coordination often depends on a shared idea of the time at which the programs’ actions
occur. But it turns out that there are limits to the accuracy with which the computers in a network can
synchronize their clocks – there is no single global notion of the correct time. This is a direct
consequence of the fact that the only communication is by sending messages through a network.
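The limit on clock accuracy can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name and the delay figures are assumptions, and it further assumes that the sender's clock keeps ticking at the correct rate while the message is in transit. A receiver that is told "my clock reads t" cannot know how long the message took, so it can only place the sender's current clock value within a range.

```python
def remote_clock_bounds(sent_reading, min_delay, max_delay):
    """Range in which the sender's clock lies at the moment the message arrives.

    sent_reading: clock value the sender put into the message (seconds)
    min_delay, max_delay: assumed bounds on the message transmission time (seconds)
    """
    return sent_reading + min_delay, sent_reading + max_delay


# Hypothetical figures: the message took somewhere between 1 ms and 50 ms.
low, high = remote_clock_bounds(sent_reading=100.000, min_delay=0.001, max_delay=0.050)

# The receiver cannot narrow the sender's time down further than this width.
uncertainty = high - low
print(f"sender's clock is between {low:.3f} s and {high:.3f} s")
```

The width of the range depends only on how variable the message delay is, which is why the restriction to message passing limits how closely clocks can be synchronized.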
Independent failures: All computer systems can fail, and it is the responsibility of system designers to
plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the
network result in the isolation of the computers that are connected to it, but that doesn’t mean that they
stop running. In fact, the programs on them may not be able to detect whether the network has failed or
has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a
program somewhere in the system (a crash), is not immediately made known to the other components
with which it communicates. Each component of the system can fail independently, leaving the others
still running.
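The difficulty of telling a failed component from a slow one can be seen in a small sketch. Everything here is an assumption made for illustration: two threads stand in for remote servers, a queue stands in for the network, and the names and delays are invented. The client waits for a reply with a timeout and gets the same outcome in both cases.

```python
import queue
import threading
import time


def wait_for_reply(replies, timeout):
    """Return the reply, or 'no reply' if nothing arrives within the timeout."""
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        # The caller cannot tell which of these happened: the server
        # crashed, the network failed, or the reply is merely slow.
        return "no reply"


def slow_server(replies, delay):
    time.sleep(delay)      # still running, just slow
    replies.put("pong")


def crashed_server(replies):
    return                 # terminates without ever replying


slow_q = queue.Queue()
crashed_q = queue.Queue()
threading.Thread(target=slow_server, args=(slow_q, 0.3), daemon=True).start()
threading.Thread(target=crashed_server, args=(crashed_q,), daemon=True).start()

slow_result = wait_for_reply(slow_q, timeout=0.05)
crashed_result = wait_for_reply(crashed_q, timeout=0.05)
print(slow_result, "/", crashed_result)

# Waiting longer reveals that the slow server was alive all along.
late_result = wait_for_reply(slow_q, timeout=1.0)
print(late_result)
```

A longer timeout reduces false suspicions but delays the detection of real crashes, so the choice of timeout is always a compromise.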
Challenges
As distributed systems grow more complex, developers face a number of challenges:
– Heterogeneity
– Openness
– Security
– Scalability
– Failure handling
– Concurrency
– Transparency
– Quality of service
Heterogeneity: The Internet enables users to access services and run applications over a heterogeneous
collection of computers and networks. Heterogeneity (that is, variety and difference) applies to all of
the following:
o Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
o Operating systems: MS Windows, Linux, macOS, Unix, etc.
o Network: Local network, the Internet, wireless network, satellite links, etc.
o Programming languages: Java, C/C++, Python, PHP, etc.
o People: different roles of software developers, designers and system managers
Different programming languages use different representations for characters and data structures such
as arrays and records. These differences must be addressed if programs written in different languages
are to be able to communicate with one another. Programs written by different developers cannot
communicate with one another unless they use common standards, for example, for network
communication and the representation of primitive data items and data structures in messages. For this
to happen, standards need to be agreed and adopted – as have the Internet protocols.
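The effect of different data representations can be shown with a short sketch using Python's standard struct module. The value 1025 and the function names are chosen only for illustration. The same integer has two different byte layouts, and a receiver that assumes the wrong layout reads a different number; agreeing on one layout for messages (here network byte order, the "!" format in struct) removes the problem.

```python
import struct

value = 1025  # 0x00000401

# The same 32-bit integer laid out by two different kinds of hardware.
big_endian = struct.pack(">I", value)      # bytes 00 00 04 01
little_endian = struct.pack("<I", value)   # bytes 01 04 00 00

# A little-endian receiver that reads big-endian bytes without conversion
# sees a completely different number.
misread = struct.unpack("<I", big_endian)[0]


def encode_for_message(n):
    """Sender side: always use the agreed (network, big-endian) byte order."""
    return struct.pack("!I", n)


def decode_from_message(data):
    """Receiver side: always read the agreed byte order."""
    return struct.unpack("!I", data)[0]


round_trip = decode_from_message(encode_for_message(value))
print(misread, round_trip)
```

The same idea applies to characters and to structured data such as arrays and records: both sides convert to and from one agreed external form.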
Middleware: The term middleware applies to a software layer that provides a programming abstraction
as well as masking the heterogeneity of the underlying networks, hardware, operating systems and
programming languages. Most middleware is implemented over the Internet protocols, which
themselves mask the differences of the underlying networks, but all middleware deals with the
differences in operating systems and hardware.
Heterogeneity and mobile code: The term mobile code is used to refer to program code that can be
transferred from one computer to another and run at the destination – Java applets are an example. Code
suitable for running on one computer is not necessarily suitable for running on another because
executable programs are normally specific both to the instruction set and to the host operating system.
Openness: The openness of a computer system is the characteristic that determines whether the system
can be extended and re-implemented in various ways. The openness of distributed systems is determined
primarily by the degree to which new resource-sharing services can be added and be made available for
use by a variety of client programs. If the interfaces of a system are well defined and published, it is
easier for developers to add new features or replace sub-systems in the future. Example: Twitter and
Facebook publish APIs that allow developers to build their own software that interacts with these services.
Security: Many of the information resources that are made available and maintained in distributed
systems have a high intrinsic value to their users. Their security is therefore of considerable importance.
Security for information resources has three components:
o Confidentiality (protection against disclosure to unauthorized individuals)
o Integrity (protection against alteration or corruption)
o Availability (protection against interference with the means to access the resources)
Scalability: Distributed systems must be scalable as the number of users increases. B. Clifford Neuman
defines scalability as follows: "A system is said to be scalable if it can handle the addition of users
and resources without suffering a noticeable loss of performance or increase in administrative
complexity." Scalability has three dimensions:
o Size: the number of users and resources to be processed. The associated problem is overloading.
o Geography: the distance between users and resources. The associated problem is communication
reliability.
o Administration: as the size of a distributed system increases, more and more of the system needs to be
controlled. The associated problem is administrative mess.
Failure handling: Computer systems sometimes fail. When faults occur in hardware or software,
programs may produce incorrect results or may stop before they have completed the intended
computation. Failures in a distributed system are partial: some components fail while others continue
to function. This makes the handling of failures particularly difficult.
– Dealing with failures in distributed systems:
• Detecting failures – some failures can be detected (known failures); others cannot be detected with
certainty (unknown failures)
• Masking failures – hide a detected failure or make it less severe, e.g. retransmit messages, keep
backup copies of file data
• Tolerating failures – clients can be designed to tolerate failures, e.g. inform users of the failure and
ask them to try later
• Recovery from failures – recover or roll back data after a server has crashed
• Redundancy – services can be made to tolerate failures by replicating services and data on multiple servers
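Masking a failure by retransmitting messages can be sketched as follows. This is a minimal sketch under stated assumptions: FlakyChannel is a made-up stand-in for an unreliable network that loses a fixed number of messages, and a lost message is modelled by raising Python's built-in TimeoutError. If every attempt fails, the error is passed on to the caller, which can then tolerate it, for example by informing the user.

```python
def send_with_retransmission(send, message, max_attempts=3):
    """Try to send a message, retransmitting after each timeout.

    Returns (reply, attempts_used). Raises TimeoutError if every attempt
    fails, so the caller can tolerate the failure in its own way.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message), attempt
        except TimeoutError:
            continue  # the loss is masked: simply send the message again
    raise TimeoutError(f"no reply after {max_attempts} attempts")


class FlakyChannel:
    """Hypothetical channel that loses the first `losses` messages."""

    def __init__(self, losses):
        self.losses_left = losses

    def send(self, message):
        if self.losses_left > 0:
            self.losses_left -= 1
            raise TimeoutError("message lost")
        return f"ack:{message}"


# Two losses are masked; the caller only sees a successful reply.
reply, attempts = send_with_retransmission(FlakyChannel(2).send, "hello")
print(reply, attempts)

# Too many losses cannot be masked, so the failure is reported instead.
try:
    send_with_retransmission(FlakyChannel(5).send, "hello")
    outcome = "delivered"
except TimeoutError:
    outcome = "please try again later"
print(outcome)
```

Retransmission is only safe when receiving the same message twice does no harm, so a real protocol would also need a way to recognise duplicates.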
Concurrency: Both services and applications provide resources that can be shared by clients in a
distributed system. There is therefore a possibility that several clients will attempt to access a shared
resource at the same time. For example, a data structure that records bids for an auction may be accessed
very frequently as the deadline approaches. For an object to be safe in a concurrent
environment, its operations must be synchronized in such a way that its data remains consistent. This
can be achieved by standard techniques such as semaphores, which are used in most operating systems.
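The auction example can be sketched with Python's threading module. This is only an illustration: the Auction class and its fields are invented, and threads within one process stand in for clients on different computers. A semaphore with an initial value of 1 allows one bid at a time into the critical section, so the check and the update of the highest bid cannot be interleaved.

```python
import threading


class Auction:
    """Records bids; safe to call from many threads at once."""

    def __init__(self):
        self._guard = threading.Semaphore(1)  # at most one bidder inside
        self.highest_bid = 0
        self.bid_count = 0

    def place_bid(self, amount):
        """Record a bid; return True if it is the new highest bid."""
        with self._guard:  # acquire on entry, release on exit
            self.bid_count += 1
            if amount > self.highest_bid:
                self.highest_bid = amount
                return True
            return False


auction = Auction()

# 200 clients bid at nearly the same time, close to the deadline.
bidders = [
    threading.Thread(target=auction.place_bid, args=(amount,))
    for amount in range(1, 201)
]
for bidder in bidders:
    bidder.start()
for bidder in bidders:
    bidder.join()

print(auction.highest_bid, auction.bid_count)
```

Without the semaphore, two bidders could both read the old highest bid and then overwrite each other's update, leaving the data inconsistent.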
Transparency: Transparency is defined as the concealment from the user and the application
programmer of the separation of components in a distributed system, so that the system is perceived as
a whole rather than as a collection of independent components. In other words, distributed systems
designers must hide the complexity of the systems as much as they can.
– 8 forms of transparency:
• Access transparency – access to local and remote resources using identical operations
• Location transparency – access to resources without knowing the physical location of the machine
• Concurrency transparency – several processes operate concurrently without interfering with each other
• Replication transparency – replication of resources in multiple servers. Users are not aware of the
replication
• Failure transparency – concealment of faults, allowing users to complete their tasks despite
the failures
• Mobility transparency – movement of resources and clients within a system without affecting users'
operations
• Performance transparency – systems can be reconfigured to improve performance by considering
their loads
• Scaling transparency – systems and applications can be expanded without changing the structure
or the application algorithms
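Access transparency, the first form in the list, can be sketched as two classes that offer identical operations. All names here are invented for illustration, and the "remote" store is simulated inside one process: a real proxy would send request messages over the network instead of calling another object directly.

```python
class LocalStore:
    """A key-value store held in this process."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value


class RemoteStoreProxy:
    """Stands in for a store on another computer, with the same operations.

    Assumption: the server is simulated by a LocalStore object, and
    messages are only counted, not actually sent.
    """

    def __init__(self, server_store):
        self._server_store = server_store
        self.messages_sent = 0

    def read(self, key):
        self.messages_sent += 1  # would be a request message
        return self._server_store.read(key)

    def write(self, key, value):
        self.messages_sent += 1  # would be a request message
        self._server_store.write(key, value)


def save_and_load(store):
    """Client code: identical whether the store is local or remote."""
    store.write("page", "<html>hello</html>")
    return store.read("page")


local_result = save_and_load(LocalStore())
proxy = RemoteStoreProxy(LocalStore())
remote_result = save_and_load(proxy)
print(local_result == remote_result, proxy.messages_sent)
```

The client function never needs to know which kind of store it was given, which is the sense in which the separation of components is concealed from the application programmer.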
Quality of service: The main nonfunctional properties of distributed systems that affect the quality of
service experienced by users or clients are:
– Reliability
– Security
– Performance
– Adaptability