Design of Parallel and Distributed Systems: Dr. Seemab Latif

The document discusses parallel and distributed systems. It defines a parallel computer as a collection of processing elements that communicate to solve problems fast, while a distributed system is a collection of independent computers that appear as a single system to users. Key characteristics of distributed systems include concurrency, lack of a global clock, and independent failures of components. Examples given are networks of workstations, automatic banking systems, cloud computing, and automotive systems. Design issues specific to distributed systems involve transparency, communication, performance, heterogeneity, openness, reliability, and security.

Uploaded by

Nadi Yaqoob

Design of Parallel and

Distributed Systems
Dr. Seemab Latif

Lecture 1

Introduction

What is a parallel computer?


A collection of processing elements that
communicate and co-operate to solve large
problems fast.

What is a distributed system?


A collection of independent computers that
appears to its users as a single coherent system.
A parallel computer is implicitly a distributed
system.

CHARACTERIZATION OF
DISTRIBUTED SYSTEMS
Chapter 1

Characterization of Distributed
Systems
1. Introduction
2. Examples of distributed systems
3. Trends in distributed systems
4. Focus on resource sharing
5. Challenges
6. Case study: The World Wide Web

Introduction

A distributed system is one in which hardware or
software components located at networked
computers communicate and coordinate their
actions only by passing messages.
Computers that are connected by a network may be
spatially separated by any distance. They may be on
separate continents, in the same building or in the
same room.

Characteristics of Distributed
Systems
Concurrency: a property of systems in which
several computations are executing
simultaneously, and potentially interacting with
each other.

The computations may be executing on
multiple cores in the same chip or executed on
physically separated processors.

In Lab, I do my work on my computer and you do
your work on yours.

Also, you share resources as well, e.g. disks, printers,
files, web pages.

Co-ordination required between concurrently executing

Characteristics of Distributed
Systems
No global clock: When programs need to
cooperate they coordinate their actions by
exchanging messages.

Close coordination often depends on a shared
idea of the time at which the programs' actions
occur.
But it turns out that there are limits to the
accuracy with which the computers in a network
can synchronize their clocks: there is no single
global notion of the correct time.
This is a direct consequence of the fact that the
only communication is by sending messages
through a network.
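
A minimal sketch of why message passing limits clock accuracy, assuming a client that asks a time server for its current clock reading. The function name and the numbers are illustrative and do not come from the lecture. The client only knows that the server's reading was taken somewhere between sending the request and receiving the reply, so its estimate carries an uncertainty of half the round-trip time.

```python
def estimate_remote_clock(t_send, t_server, t_recv):
    """Estimate the server's clock at the moment the reply arrives.

    t_send and t_recv are read from the local clock; t_server is the
    reading the server put into its reply.
    """
    round_trip = t_recv - t_send
    # Assumption: the reply spent half of the round trip in transit.
    estimate = t_server + round_trip / 2
    # The real transit time may lie anywhere in [0, round_trip],
    # so the estimate can be off by up to half the round trip.
    uncertainty = round_trip / 2
    return estimate, uncertainty


estimate, uncertainty = estimate_remote_clock(100.0, 250.0, 100.5)
print(estimate, uncertainty)
```

A faster network shrinks the uncertainty but never removes it, which is the point made on the slide.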

Characteristics of Distributed
Systems
Independent failures: Distributed systems can fail
in new ways. Faults in the network result in the
isolation of the computers that are connected to
it, but that doesn't mean that they stop running.

In fact, the programs on them may not be able to
detect whether the network has failed or has
become unusually slow.

Similarly, the failure of a computer, or the
unexpected termination of a program somewhere
in the system (a crash), is not immediately made
known to the other components with which it
communicates.

Each component of the system can fail
independently, leaving the others still running.
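
The difficulty of telling a failed peer from a slow network can be sketched with a timeout-based monitor. This is a hypothetical helper, not something defined in the lecture: the class name, the timeout value and the peer names are assumptions. Note that the monitor can only report a peer as suspected, never as definitely crashed.

```python
class HeartbeatMonitor:
    """Tracks the last time a message was heard from each peer."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}

    def heard_from(self, peer, now):
        self.last_heard[peer] = now

    def suspected(self, now):
        # A silent peer is only *suspected*: it may have crashed, or the
        # network may have failed or simply become slow.
        return sorted(peer for peer, t in self.last_heard.items()
                      if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=3.0)
monitor.heard_from("A", now=0.0)
monitor.heard_from("B", now=0.0)
monitor.heard_from("A", now=4.0)   # A keeps sending, B stays silent
print(monitor.suspected(now=5.0))
```

Choosing the timeout is a trade-off: a short value reports slow peers as failed, a long value delays the reaction to real crashes.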

Independent Failures
Components of distributed systems can fail:
Networks (Communication)
Computers (Devices)

Each component can fail independently,
leaving the others still running.
System designers have to plan for the
consequences of possible failures.

Network Failures
Isolation of computers: a network failure doesn't
mean the computers stop working.
Programs may not be able to detect whether the
network has failed or has become unusually slow.
22/01/15

Computer/Process Failures

Examples of Distributed
Systems
Network of Workstations
Personal workstations + processors not
assigned to specific users.

Single file system, with all files accessible
from all machines in the same way and
using the same path name.
For a certain command the system can look
for the best place (workstation) to execute it.
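
As a rough illustration of "looking for the best place" to run a command, the sketch below picks the least loaded workstation. The load figures, the workstation names and the rule "lowest load wins" are assumptions made for the example; a real system might also weigh memory, data locality or ownership.

```python
def pick_workstation(loads):
    """Return the name of the workstation with the lowest reported load."""
    return min(loads, key=loads.get)


# Hypothetical load reports (fraction of CPU in use).
loads = {"ws1": 0.80, "ws2": 0.15, "ws3": 0.40}
print(pick_workstation(loads))
```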

Network of Workstations

Examples of Distributed
Systems
Automatic banking (teller machine) system

Primary requirements: security and reliability.
Consistency of replicated data.
Concurrent transactions (operations which
involve accounts in different banks;
simultaneous access from several users, etc.).
Fault tolerance
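
The need to keep data consistent under concurrent transactions can be sketched inside a single process, with threads standing in for simultaneous users. The Account class and the transfer function are illustrative assumptions; in a real banking system the accounts live on different machines and a distributed protocol is required.

```python
import threading


class Account:
    def __init__(self, number, balance):
        self.number = number
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move money between two accounts without losing updates."""
    # Always lock the lower-numbered account first, so two opposite
    # transfers cannot wait for each other forever (deadlock).
    first, second = sorted((src, dst), key=lambda acc: acc.number)
    with first.lock, second.lock:
        if src.balance < amount:
            return False
        src.balance -= amount
        dst.balance += amount
        return True


a = Account(1, 100)
b = Account(2, 100)
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.balance + b.balance)  # the total amount of money is unchanged
```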

Automatic banking (teller machine) system

Examples of Distributed
Systems
The Cloud

Computing as a utility: application, storage,
computing services; pay on a per-usage basis.
Main concerns: scaling, performance,
security/reliability.

The Cloud

Examples of Distributed
Systems
Automotive system (a distributed real-time
system)
Synchronization of physical clocks
Scheduling with hard time constraints
Real-time communication
Fault tolerance

Automotive system (a distributed real-time
system)

Why do we Need Them?


Advantages of Distributed System
Performance: very often a collection of
processors can provide higher performance
(and better price/performance ratio) than a
centralized computer.
Distribution: many applications involve, by
their nature, spatially separated machines
(banking, commercial, automotive system).
Reliability (fault tolerance): if some of the
machines crash, the system can survive.

Why do we Need Them?


Advantages of Distributed System
Incremental growth: as requirements on
processing power grow, new machines can
be added incrementally.
Sharing of data/resources: shared data is
essential to many applications (banking,
computer-supported cooperative work,
reservation systems); other resources can
also be shared (e.g. expensive printers).
Communication: facilitates human-to-human
communication.

Disadvantages of Distributed
System
Difficulties of developing distributed software:
what should operating systems, programming
languages and applications look like?
Networking problems: several problems are
created by the network infrastructure, which
have to be dealt with: loss of messages,
overloading, ...
Security problems: sharing generates the
problem of data security.

Design Issues with Distributed


Systems
Design issues that arise specifically from the distributed
nature of the application:
Transparency
Communication
Performance & scalability
Heterogeneity
Openness
Reliability & fault tolerance
Security

Transparency

How to achieve the single system image?
How to "fool" everyone into thinking that
the collection of machines is a "simple"
computer?

Access transparency
- local and remote resources are accessed
using identical operations.
Location transparency
- users cannot tell where hardware and
software resources (CPUs, files, databases)
are located; the name of the resource
shouldn't encode the location of the resource.

Transparency
Migration (mobility) transparency
- resources should be free to move from one location to
another without having their names changed.
Replication transparency
- the system is free to make additional copies of files and
other resources (for purpose of performance and/or
reliability), without the users noticing.
Example: several copies of a file exist; for a given request,
the copy closest to the client is accessed.
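
Location, migration and replication transparency can be illustrated together with a small name service. This is a sketch under assumptions: the NameService class, the host names and the "distance" values (as seen by one client) are invented for the example. The client only ever uses the logical name, which says nothing about where the copies are.

```python
class NameService:
    """Maps a logical resource name to the hosts that hold a copy."""

    def __init__(self):
        self._replicas = {}  # logical name -> {host: distance to client}

    def register(self, name, host, distance):
        self._replicas.setdefault(name, {})[host] = distance

    def unregister(self, name, host):
        self._replicas[name].pop(host)

    def resolve(self, name):
        # Pick the closest copy; the caller never chooses a host itself.
        hosts = self._replicas[name]
        return min(hosts, key=hosts.get)


ns = NameService()
ns.register("/docs/report.txt", "server-eu", distance=12)
ns.register("/docs/report.txt", "server-us", distance=85)
closest = ns.resolve("/docs/report.txt")

# The nearby copy is removed (moved or lost); the name stays the same.
ns.unregister("/docs/report.txt", "server-eu")
fallback = ns.resolve("/docs/report.txt")
print(closest, fallback)
```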

Transparency
Concurrency transparency
- the users will not notice the existence of other
users in the system (even if they access the
same resources).
Failure transparency
- applications should be able to complete their
task despite failures occurring in certain
components of the system.
Performance transparency
- load variation should not lead to performance
degradation.
This could be achieved by automatic
reconfiguration as response to changes of the
load; it is difficult to achieve.
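
Failure transparency, as described above, is often approximated by retrying a request on another copy of the service. The sketch below assumes replicas are plain callables and uses a made-up ReplicaDown exception; it shows the idea of masking a failure from the caller, not a complete mechanism.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot be reached (hypothetical)."""


def read_with_failover(replicas, key):
    """Try each replica in turn and return the first successful answer."""
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaDown:
            continue  # hide this failure and try the next copy
    raise ReplicaDown("all replicas failed")


def broken(key):
    raise ReplicaDown("unreachable")


def healthy(key):
    return f"value of {key}"


print(read_with_failover([broken, healthy], "x"))
```

The caller sees a normal result even though the first replica failed; only when every copy is down does the failure become visible.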

Communication
Components of a distributed system have to
communicate in order to interact. This
implies support at two levels:
1. Networking infrastructure (interconnections
& network software).
2. Appropriate communication primitives and
models and their implementation:
Communication primitives:
- send
- receive
- remote procedure call (RPC)
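
The three primitives can be sketched in one process, with in-memory queues standing in for the network. Everything here is an assumption for illustration (the channel names, the "add" procedure, a server that handles a single request); the point is only that an RPC can be built from one send followed by one receive.

```python
import queue
import threading

requests = queue.Queue()   # channel: client -> server
replies = queue.Queue()    # channel: server -> client


def send(channel, message):
    channel.put(message)


def receive(channel):
    return channel.get()   # blocks until a message arrives


def server():
    # Serve exactly one request, then stop.
    procedures = {"add": lambda x, y: x + y}
    name, args = receive(requests)
    send(replies, procedures[name](*args))


def rpc(name, *args):
    """Looks like a local call, but is a send followed by a receive."""
    send(requests, (name, args))
    return receive(replies)


worker = threading.Thread(target=server)
worker.start()
result = rpc("add", 2, 3)
worker.join()
print(result)
```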

Communication
communication models
- client-server communication: implies a
message exchange between two processes:
the process which requests a service and the
one which provides it;
- group multicast: the target of a message is a
set of processes, which are members of a given
group.
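
Group multicast can be sketched as one send that reaches every member's inbox. The Group class and the member names are hypothetical, and the sketch ignores what makes real multicast hard (lost messages, ordering, members joining or failing during delivery).

```python
import queue


class Group:
    """A set of processes, each represented by an inbox queue."""

    def __init__(self):
        self.members = {}

    def join(self, name):
        self.members[name] = queue.Queue()
        return self.members[name]

    def multicast(self, message):
        # The sender addresses the group, not the individual members.
        for inbox in self.members.values():
            inbox.put(message)


group = Group()
inbox_a = group.join("a")
inbox_b = group.join("b")
group.multicast("hello")
received = [inbox_a.get(), inbox_b.get()]
print(received)
```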

Performance and Scalability


Several factors influence the
performance of a distributed system:
1. The performance of individual workstations.
2. The speed of the communication
infrastructure.
3. The extent to which reliability (fault tolerance) is
provided (replication and preservation of
coherence imply large overheads).
4. Flexibility in workload allocation: for
example, idle processors (workstations)
could be allocated automatically to a user's
task.

Scalability
The system should remain efficient even with a
significant increase in the number of users
and resources connected:

cost of adding resources should be
reasonable;
performance loss with increased number of
users and resources should be controlled;
software resources should not run out
(number of bits allocated to addresses,
number of entries in tables, etc.)
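
A quick calculation shows how a fixed number of address bits caps the size of a system. The helper name is made up for the example, and 16 and 32 bits are chosen only as familiar widths.

```python
def max_entries(bits):
    """Number of distinct values that fit in a field of the given width."""
    return 2 ** bits


print(max_entries(16))  # 65536
print(max_entries(32))  # 4294967296
```

Once more resources exist than the field can name, the limit can only be lifted by changing the format everywhere, which is why such limits are a scalability concern.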

Heterogeneity
Distributed applications are typically
heterogeneous:
- different hardware: mainframes, workstations,
PCs, servers, etc.;
- different software: UNIX, MS Windows, IBM OS/2,
Real-time OSs, etc.;
- unconventional devices: teller machines,
telephone switches, robots, manufacturing
systems, etc.;
- diverse networks and protocols: Ethernet,
FDDI, ATM, TCP/IP, Novell NetWare, etc.
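
One concrete consequence of different hardware, not named on the slide but a common example, is that machines may store the bytes of an integer in different orders. The sketch below uses Python's standard struct module to show the two layouts and the usual remedy of agreeing on one order ("network byte order") for data on the wire.

```python
import struct

value = 1

little = struct.pack("<I", value)   # layout on a little-endian machine
big = struct.pack(">I", value)      # layout on a big-endian machine
network = struct.pack("!I", value)  # agreed wire format (big-endian)

# Any receiver can decode the wire format, whatever its own hardware.
(decoded,) = struct.unpack("!I", network)
print(little, big, decoded)
```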

Openness
One of the important features of distributed
systems is openness and flexibility:
- every service is equally accessible to every client
(local or remote);
- it is easy to implement, install and debug new
services;
- users can write and install their own services.
Key aspects of openness:
- Standard interfaces and protocols (like Internet
communication protocols)
- Support of heterogeneity (by adequate
middleware, like CORBA)

Openness
Software Architecture:

Openness
The same, looking at two distributed nodes:

Reliability and Fault Tolerance


One of the main goals of building distributed
systems is improvement of reliability.
Availability: If machines go down, the system
should work with the reduced amount of
resources.
There should be a very small number of critical
resources (single points of failure);
critical resources: resources which have to be up
in order for the distributed system to work.
Key pieces of hardware and software (critical
resources) should be replicated: if one of them
fails, another one takes over (redundancy).
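
The effect of replicating a critical resource can be estimated with a short calculation. It rests on two assumptions that the slides do not state: replicas fail independently, and the service is usable as long as at least one replica is up. The value 0.9 is an arbitrary example.

```python
def availability(p, n):
    """Probability that at least one of n replicas is up.

    p is the probability that a single replica is up; replicas are
    assumed to fail independently of each other.
    """
    return 1 - (1 - p) ** n


for n in (1, 2, 3):
    print(n, round(availability(0.9, n), 6))
```

Each extra replica removes most of the remaining downtime, but every copy also has to be kept consistent with the others.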

Reliability and Fault Tolerance


Data on the system must not be lost, and copies
stored redundantly on different servers must be
kept consistent.

The more copies kept, the better the
availability, but keeping consistency becomes
more difficult.
Fault-tolerance is a main issue related to
reliability: the system has to detect faults and
act in a reasonable way:
mask the fault: continue to work with possibly
reduced performance but without loss of data/
information.
fail gracefully: react to the fault in a predictable
way and possibly stop functionality for a short

Security
Security of information resources:
1. Confidentiality
Protection against disclosure to unauthorised
persons
2. Integrity
Protection against alteration and corruption
3. Availability
Keep the resource accessible
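
Integrity, the second property above, can be illustrated with a message authentication code from Python's standard hmac module. The shared key and the message text are invented for the example, and key distribution is left out. A receiver that knows the key can detect any alteration of the message.

```python
import hashlib
import hmac


def sign(key, message):
    """Compute an authentication tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check the tag in a way that does not leak timing information."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret"                      # hypothetical shared key
message = b"transfer 100 to account 42"
tag = sign(key, message)

print(verify(key, message, tag))
print(verify(key, b"transfer 900 to account 42", tag))
```

This protects integrity only; confidentiality would additionally require encrypting the message.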

Security
Distributed systems should allow communication
between programs/users/resources on different
computers.

Security risks associated with free access.

The appropriate use of resources by different
users has to be guaranteed.
