
Distributed Computing

Lecture: 02
Farhad Muhammad Riaz
What is Distributed Computing
 A distributed computing system is a set of computer programs,
executing on one or more computers, and coordinating
actions by exchanging messages.
 A computer network is a collection of computers
interconnected by hardware that directly supports message
passing.
 Most distributed computing systems operate over computer
networks, but one can also build a distributed computing
system in which the components execute on a single
multitasking computer, and one can build distributed
computing systems in which information flows between the
components by means other than message passing.
 Parallel Computing vs. Grid Computing
 Both lie within the class of distributed systems.
Working
 In distributed systems, “protocol” refers to an
algorithm governing the exchange of messages, by
which a collection of processes coordinate their
actions and communicate information among
themselves.
 Much as a program is a set of instructions, and a
process denotes the execution of those instructions,
a protocol is a set of instructions governing the
communication in a distributed program, and a
distributed computing system is the result of
executing some collection of such protocols to
coordinate the actions of a collection of processes in
a network.
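As a small illustration of these definitions, the sketch below runs two processes that follow a trivial request/reply protocol over message queues; the “ping”/“pong” message format and the queue-based transport are assumptions made for the example, not part of the lecture material.

```python
# Minimal sketch of a two-process protocol over message passing.
from multiprocessing import Process, Queue

def server(inbox, outbox):
    # The protocol rule: every "ping" received is answered with "pong".
    while True:
        msg = inbox.get()
        if msg == "stop":        # sentinel ends the protocol run
            break
        if msg == "ping":
            outbox.put("pong")

if __name__ == "__main__":
    to_server, to_client = Queue(), Queue()
    p = Process(target=server, args=(to_server, to_client))
    p.start()
    to_server.put("ping")        # the client's half of the protocol
    print(to_client.get())       # -> "pong"
    to_server.put("stop")
    p.join()
```

Here the protocol is the agreed rule “every ping is answered by a pong”; the program is the code itself, and the running processes executing it together form a (tiny) distributed computing system.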
Reliability
 Fault tolerance:
 The ability of a distributed computing system to recover from component failures without performing
incorrect actions.
 High availability:
 In the context of a fault-tolerant distributed computing system, the ability of the system to restore
correct operation, permitting it to resume providing services during periods when some
components have failed. A highly available system may provide reduced service for short
periods of time while reconfiguring itself.
 Continuous availability:
 A highly available system with a very small recovery time, capable of providing uninterrupted service
to its users. The reliability properties of a continuously available system are unaffected or
only minimally affected by failures.
 Recoverability:
 Also in the context of a fault-tolerant distributed computing system, the ability of failed
components to restart themselves and rejoin the system, after the cause of failure has been
repaired.
 Consistency:
 The ability of the system to coordinate related actions by multiple components, often in the
presence of concurrency and failures. Consistency underlies the ability of a distributed
system to emulate a non-distributed system.
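As a hypothetical sketch of fault tolerance and high availability in practice, the client below retries a request against a list of replica servers and fails over when one is unreachable; the replica addresses and the send_request helper are invented for the example.

```python
# Hypothetical failover client: the system keeps providing service
# (high availability) despite the failure of some replicas
# (fault tolerance). Addresses and helper are assumed names.
import socket

REPLICAS = [("replica-a", 9000), ("replica-b", 9000)]  # assumed addresses

def send_request(addr, payload: bytes, timeout: float = 1.0) -> bytes:
    # Stand-in for real request/reply I/O against one server.
    with socket.create_connection(addr, timeout=timeout) as s:
        s.sendall(payload)
        return s.recv(4096)

def fault_tolerant_request(payload: bytes) -> bytes:
    last_error = None
    for addr in REPLICAS:          # fail over to the next replica
        try:
            return send_request(addr, payload)
        except OSError as err:     # timeout or connection failure
            last_error = err
    raise RuntimeError(f"all replicas failed: {last_error}")
```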
Reliability
 Scalability:
 The ability of a system to continue to operate correctly even as some aspect is scaled to a larger
size. For example, we might increase the size of the network on which the system
is running—doing so increases the frequency of such events as network outages
and could degrade a “non-scalable” system. We might increase numbers of users,
or numbers of servers, or load on the system. Scalability thus has many
dimensions; a scalable system would normally specify the dimensions in which it
achieves scalability and the degree of scaling it can sustain.
 Security:
 The ability of the system to protect data, services, and resources against misuse by
unauthorized users.
 Privacy:
 The ability of the system to protect the identity and locations of its users, or the contents of
sensitive data, from unauthorized disclosure.
 Correct specification:
 The assurance that the system solves the intended problem.
 Correct implementation:
 The assurance that the system correctly implements its specification.
Reliability
 Predictable performance:
 The guarantee that a distributed system achieves desired levels of
performance—for example, data throughput from source to
destination, latencies measured for critical paths, requests
processed per second, and so forth.
 Timeliness:
 In systems subject to real-time constraints, the assurance that actions are
taken within the specified time bounds, or are performed with a
desired degree of temporal synchronization between the
components.
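A minimal sketch of how such performance targets might be checked, assuming a placeholder do_request function standing in for a real critical path:

```python
# Hypothetical measurement of predictable-performance metrics:
# per-request latency and requests processed per second.
import time

def do_request() -> None:
    time.sleep(0.001)  # stand-in for a real request's critical path

def measure(n: int = 1000) -> None:
    latencies = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        do_request()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    print(f"throughput: {n / elapsed:.1f} requests/s")
    print(f"max latency: {max(latencies) * 1000:.2f} ms")

if __name__ == "__main__":
    measure()
```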
Tolerating Failures
 Halting failures:
 In this model, a process or computer either works correctly, or simply stops executing and crashes without
taking incorrect actions, as a result of failure. As the model is normally specified, there is no way to
detect that the process has halted except by timeout: It stops sending “keep alive” messages or
responding to “pinging” messages, and hence other processes can deduce that it has failed (a
timeout-based detector along these lines is sketched after this list).
 Fail-stop failures:
 These are accurately detectable halting failures. In this model, processes fail by halting. However, other
processes that may be interacting with the faulty process also have a completely accurate way
to detect such failures—for example, a fail-stop environment might be one in which timeouts
can be used to monitor the status of processes, and no timeout occurs unless the process being
monitored has actually crashed. Obviously, such a model may be unrealistically optimistic,
representing an idealized world in which the handling of failures is reduced to a pure problem
of how the system should react when a failure is sensed. If we solve problems with this model,
we then need to ask how to relate the solutions to the real world.
 Send-omission failures:
 These are failures to send a message that, according to the logic of the distributed computing
system, should have been sent. Send-omission failures are commonly caused by a lack of buffering
space in the operating system or network interface, which can cause a message to be
discarded after the application program has sent it but before it leaves the sender’s machine.
Perhaps surprisingly, few operating systems report such events to the application.
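The timeout-based detection mentioned under halting failures might look like the following sketch; the timeout threshold is an assumption, and in a real asynchronous network such a detector can falsely suspect a slow process, which is exactly why the fail-stop model above is called unrealistically optimistic.

```python
# Sketch of timeout-based failure detection for halting failures.
# A peer is suspected to have crashed if no "keep alive" message
# arrives within TIMEOUT seconds. The threshold is an assumption.
import time

TIMEOUT = 3.0          # assumed detection threshold (seconds)

class FailureDetector:
    def __init__(self) -> None:
        self.last_heard: dict[str, float] = {}

    def on_keepalive(self, peer: str) -> None:
        # Called whenever a "keep alive" message arrives from peer.
        self.last_heard[peer] = time.monotonic()

    def suspected(self, peer: str) -> bool:
        # Deduce failure by timeout: silence longer than TIMEOUT.
        last = self.last_heard.get(peer)
        return last is None or time.monotonic() - last > TIMEOUT
```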
Tolerating Failures
 Receive-omission failures:
 These are similar to send-omission failures, but they occur when a message is lost near the destination process, often
because of a lack of memory in which to buffer it or because evidence of data corruption has been
discovered.
 Network failures:
 These occur when the network loses messages sent between certain pairs of processes.
 Network partitioning failures:
 These are a more severe form of network failure, in which the network fragments into disconnected sub-networks,
within which messages can be transmitted, but between which messages are lost. When a failure of this sort
is repaired, one talks about merging the network partitions. Network partitioning failures are a common
problem in modern distributed systems.
 Timing failures:
 These occur when a temporal property of the system is violated—for example, when a clock on a computer exhibits a
value that is unacceptably far from the values of other clocks, or when an action is taken too soon or too
late, or when a message is delayed by longer than the maximum tolerable delay for a network connection.
 Byzantine failures:
 This is a term that captures a wide variety of other faulty behaviors, including data corruption, programs that fail to
follow the correct protocol, and even malicious or adversarial behaviors by programs that actively seek to
force a system to violate its reliability properties.
Computation Models
 Real-world networks:
 These are composed of workstations, personal computers,
and other computing devices interconnected by hardware.
 Properties of the hardware and software components will
often be known to the designer, such as speed, delay, and
error frequencies for communication devices; latencies for
critical software and scheduling paths; throughput for data
generated by the system and data distribution patterns; speed
of the computer hardware, accuracy of clocks; and so forth.
 This information can be of tremendous value in designing
solutions to problems that might be very hard—or
impossible—in a completely general sense.
Computation Models
 Asynchronous computing systems:
 This is a very simple theoretical model used to approximate one extreme sort of computer
network. In this model, no assumptions can be made about the relative speed of the
communication system, processors, and processes in the network.
 One message from a process p to a process q may be delivered in zero time, while the
next is delayed by a million years.
 The asynchronous model reflects an assumption about time, but not failures: Given an
asynchronous model, one can talk about protocols that tolerate message loss,
protocols that overcome fail-stop failures in asynchronous networks, and so forth.
 The main reason for using the model is to prove properties about protocols for which
one makes as few assumptions as possible.
 The model is very clean and simple, and it lets us focus on fundamental properties of
systems without cluttering up the analysis by including a great number of practical
considerations.
 If a problem can be solved in this model, it can be solved at least as well in a more
realistic one.
 On the other hand, the converse may not be true:
 We may be able to do things in realistic systems by making use of features not
available in the asynchronous model, and in this way may be able to solve problems in
real systems that are impossible in ones that use the asynchronous model.
Computation Models
 Synchronous computing systems:
 Like the asynchronous systems, these represent an extreme end of the spectrum. In synchronous
systems, there is a very strong concept of time that all processes in the system share.
 One common formulation of the model can be thought of as having a system-wide gong that
sounds periodically; when the processes in the system hear the gong, they run one round of a
protocol, reading messages from one another, sending messages that will be delivered in the
next round, and so forth (a round-based loop of this kind is sketched after this list).
 And these messages always are delivered to the application by the start of the next round, or
not at all.
 Normally, the synchronous model also assumes bounds on communication latency between
processes, clock skew and precision, and other properties of the environment.
 As in the case of an asynchronous model, the synchronous one takes an extreme point of view
because this simplifies reasoning about certain types of protocols.
 Real-world systems are not synchronous—it is impossible to build a system in which actions
are perfectly coordinated as this model assumes.
 However, if one proves the impossibility of solving some problem in the synchronous model, or
proves that some problem requires at least a certain number of messages in this model, one
has established a sort of lower bound.
 In a real-world system, things can only get worse, because we are limited to weaker
assumptions.
 This makes the synchronous model a valuable tool for understanding how hard it will be to
solve certain problems.
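A minimal simulation sketch of this round structure, assuming an in-memory model in which messages sent in round r are delivered at the start of round r+1; the echo behavior of each process is a made-up placeholder:

```python
# Simulation sketch of the synchronous model: a "gong" drives rounds,
# and messages sent in round r are delivered at the start of round
# r+1, or not at all. Each process's behavior is a placeholder.
def run_synchronous(n_procs: int, n_rounds: int) -> None:
    inboxes = [[] for _ in range(n_procs)]
    for r in range(n_rounds):                 # the gong sounds
        outboxes = [[] for _ in range(n_procs)]
        for p in range(n_procs):
            received = inboxes[p]             # delivered from round r-1
            print(f"round {r}, process {p}: got {received}")
            for q in range(n_procs):          # send to every other process
                if q != p:
                    outboxes[q].append(f"hello from {p} in round {r}")
        inboxes = outboxes                    # delivered by the next round

if __name__ == "__main__":
    run_synchronous(n_procs=3, n_rounds=2)
```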
Computation Models
 Parallel-shared memory systems:
 An important family of systems is based on multiple processors
that share memory.
 Unlike for a network, where communication is by message
passing, in these systems communication is by reading and
writing shared memory locations. Clearly, the shared memory
model can be emulated using message passing, and can be used
to implement message communication.
 Nonetheless, because there are important examples of real
computers that implement this model, there is considerable
theoretical interest in the model per se.
 Unfortunately, although this model is very rich and a great deal is
known about it, it would be beyond the scope of these lectures to
attempt to treat the model in any detail.
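A small sketch of the shared-memory style of communication, assuming Python's multiprocessing.Value as the shared location and its built-in lock to coordinate concurrent writers:

```python
# Sketch of shared-memory communication: processes coordinate by
# reading and writing a shared location rather than by exchanging
# messages. The counter increment is an illustrative stand-in.
from multiprocessing import Process, Value

def worker(counter):
    # counter is a shared 32-bit integer with an attached lock.
    for _ in range(1000):
        with counter.get_lock():   # coordinate concurrent writers
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)        # the shared memory location
    procs = [Process(target=worker, args=(counter,)) for _ in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
    print(counter.value)           # -> 4000
```

Note that no messages are exchanged: the processes coordinate purely by reading and writing the shared counter.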
Communication Technology
 The most basic communication technology in any distributed system
is the hardware support for message passing.
 Although there are some types of networks that offer special
properties, most modern networks are designed to transmit data in
packets with some fixed, but small, maximum size. Each packet
consists of a header, which is a data structure containing
information about the packet—its destination, route, and so forth.
It contains a body, which consists of the bytes that make up the
content of the packet.
 And it may contain a trailer, which is a second data structure that is
physically transmitted after the header and body and would normally
consist of a checksum for the packet that the hardware computes
and appends to it as part of the process of transmitting the packet.
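To make the header/body/trailer layout concrete, here is a hedged sketch that packs a packet with a small fixed header and a CRC-32 trailer; the field layout is invented for illustration, and real network hardware computes and checks its checksum itself.

```python
# Illustrative packet layout: header (destination, length), body,
# and a trailer carrying a CRC-32 checksum over header + body.
# The field layout is invented; real hardware formats differ.
import struct
import zlib

HEADER_FMT = "!HI"   # destination id (2 bytes), body length (4 bytes)

def make_packet(dest: int, body: bytes) -> bytes:
    header = struct.pack(HEADER_FMT, dest, len(body))
    trailer = struct.pack("!I", zlib.crc32(header + body))
    return header + body + trailer

def check_packet(packet: bytes) -> bool:
    # Recompute the checksum and compare it with the trailer.
    payload, trailer = packet[:-4], packet[-4:]
    (checksum,) = struct.unpack("!I", trailer)
    return checksum == zlib.crc32(payload)

pkt = make_packet(dest=7, body=b"hello")
print(check_packet(pkt))   # -> True
```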
