Distributed Systems Ch2
SYSTEM MODELS
2.1 Introduction
2.2 Physical models
2.3 Architectural models
2.4 Fundamental models
2.5 Summary
2.1 Introduction
2.2 Physical models
2.3.1 Architectural elements
1. Communicating entities
From a system perspective: the entities that
communicate in a distributed system are typically
processes, coupled with appropriate inter-process
communication paradigms, with two caveats:
In some primitive environments, such as sensor networks,
the underlying operating systems may not support
process abstractions, and hence the entities that
communicate in such systems are nodes.
In most distributed system environments, processes are
supplemented by threads, so strictly speaking it is
threads that are the endpoints of communication.
From a programming perspective:
Distributed objects: software objects that reside on different machines or
networks but interact as if they were in the same environment. Programmers
can invoke methods on objects located on a remote server as if they were
local. These objects are accessed using specific technologies such as:
RMI (Remote Method Invocation) or
CORBA (Common Object Request Broker Architecture)
Components: independent software units that perform specific
functions and provide clear interfaces. Components are designed to be
reusable across different projects and often rely on standards such as:
COM (Component Object Model) or
EJB (Enterprise JavaBeans)
Web services
Web service: a web service is defined as “a software application
identified by a URI, whose interfaces and bindings are capable
of being defined, described and discovered as XML artefacts. A
Web service supports direct interactions with other software
agents using XML-based message exchanges via Internet-based
protocols.”
Web services represent the third important paradigm for the
development of distributed systems.
Web services are closely related to objects and components.
2. Communication paradigms
3. Indirect communication: allows a strong degree of decoupling
between senders and receivers. In particular:
1. Senders do not need to know who they are sending to (space uncoupling).
2. Senders and receivers do not need to exist at the same time (time uncoupling).
Key techniques for indirect communication include:
Group communication: one-to-many communication, in which a message is delivered
to a set of recipients (e.g. sending an email to a mailing list).
Publish-subscribe systems: a large number of producers (or publishers)
distribute information items of interest (events) to a similarly large number of consumers
(or subscribers), e.g. in financial trading.
Message queues: producer processes send messages to a specified
queue, and consumer processes receive messages from the queue or are notified of
the arrival of new messages in the queue.
Tuple spaces: processes place arbitrary items of structured data, called
tuples, in a persistent tuple space, and other processes read or remove such
tuples from the tuple space by specifying patterns of interest.
Distributed shared memory (DSM): systems provide an abstraction for sharing
data between processes that do not share physical memory.
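Space and time uncoupling can be illustrated with a minimal message-queue sketch. The MessageQueue class and the message names are hypothetical, and the queue lives in a single process; a real system would hold it in a broker reachable over the network.

```python
from collections import deque


class MessageQueue:
    """In-memory stand-in for a named queue held by a broker (hypothetical)."""

    def __init__(self):
        self._items = deque()

    def send(self, message):
        # The producer names only the queue, never a receiver (space uncoupling).
        self._items.append(message)

    def receive(self):
        # Return the oldest message, or None if the queue is empty.
        return self._items.popleft() if self._items else None


queue = MessageQueue()

# The producer runs and finishes before any consumer exists (time uncoupling).
queue.send("order-1")
queue.send("order-2")

# A consumer that starts later still receives the messages, oldest first.
first = queue.receive()
second = queue.receive()
print(first, second)
```

The producer never learns who consumed its messages, and the consumer never learns who produced them; the queue is the only thing both sides know about.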
4. Entities placement
Placement is the mapping of entities (objects or services) onto
the underlying physical distributed infrastructure, which will
consist of a potentially large number of machines
interconnected by a network of arbitrary complexity.
We focus mainly on the following placement strategies:
4.1 Mapping of services to multiple servers
4.2 Caching
4.3 Mobile code
4.4 Mobile agents
4.1 Mapping of services to multiple servers:
Services may be implemented as
several server processes in separate
host computers interacting as
necessary to provide a service to
client processes (Figure 2.4b). The
servers may partition the set of
objects on which the service is based
and distribute those objects between
themselves, or they may maintain
replicated copies of them on several
hosts.
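The two options, partitioning and replication, can be sketched as follows. This is only an illustration under assumed names: the server list, the hash-based placement rule and the replica count are hypothetical choices, not something the slides prescribe.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical host names


def stable_hash(key):
    # hashlib gives the same value on every run, unlike the built-in hash().
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


def partition_owner(key):
    """Partitioning: each object is held by exactly one server."""
    return SERVERS[stable_hash(key) % len(SERVERS)]


def replica_set(key, copies=2):
    """Replication: the object is also copied to the following servers."""
    start = stable_hash(key) % len(SERVERS)
    return [SERVERS[(start + i) % len(SERVERS)] for i in range(copies)]


owner = partition_owner("user:42")
replicas = replica_set("user:42")
print(owner, replicas)
```

Partitioning spreads the load but leaves each object on one host; replication keeps several copies so that the service can survive the loss of a host, at the cost of keeping the copies consistent.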
4.2 Caching:
A cache is a store of recently used data objects that is closer to
one client or a particular set of clients than the objects
themselves.
For example, web proxy servers (Figure 2.5) provide a shared
cache. The purpose of proxy servers is to increase the availability
and performance of the service by reducing the load on the wide
area network and web servers.
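A shared proxy cache of recently used objects might look like the sketch below. The CachingProxy class, the origin_server function and the capacity of two entries are hypothetical, and the sketch ignores the question of whether a cached copy is still up to date.

```python
from collections import OrderedDict


class CachingProxy:
    """Keeps the most recently used responses close to the clients."""

    def __init__(self, fetch_from_origin, capacity=2):
        self._fetch = fetch_from_origin
        self._capacity = capacity
        self._cache = OrderedDict()
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, url):
        if url in self._cache:
            # Cache hit: mark the entry as recently used, skip the origin.
            self._cache.move_to_end(url)
            return self._cache[url]
        # Cache miss: fetch from the origin server and remember the result.
        self.origin_requests += 1
        body = self._fetch(url)
        self._cache[url] = body
        if len(self._cache) > self._capacity:
            # Evict the least recently used entry.
            self._cache.popitem(last=False)
        return body


def origin_server(url):
    # Stand-in for a request across the wide area network.
    return "content of " + url


proxy = CachingProxy(origin_server)
proxy.get("/index.html")
proxy.get("/index.html")  # served from the cache
print(proxy.origin_requests)
```

Every hit is a request that never reaches the wide area network or the web server, which is where the gain in performance comes from.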
4.3 Mobile code:
Applets are a well-known and widely
used example of mobile code:
The user running a browser selects a
link to an applet whose code is stored
on a web server.
The code is downloaded to the browser
and runs there, as shown in Figure 2.6.
An advantage of running the
downloaded code locally is that it can
give good interactive response since it
does not suffer from the delays or
variability of bandwidth associated
with network communication.
4.4 Mobile agents:
A mobile agent is a running program (including both code and data) that travels
from one computer to another in a network carrying out a task on someone’s behalf,
such as collecting information, and eventually returning with the results.
Mobile agents reduce communication cost and time by replacing
remote invocations with local ones.
Mobile agents might be used to install and maintain software on the computers
within an organization or to compare the prices of products from a number of vendors
by visiting each vendor’s site and performing a series of database operations.
Example: the worm program developed at Xerox PARC, which was designed to make
use of idle computers in order to carry out intensive computations.
Mobile agents (like mobile code) are a potential security threat to the resources in
computers that they visit.
2.3 Architectural models
2.3.2 Architectural patterns
Layering
In a layered approach, a complex system is
partitioned into a number of layers, with a given layer
making use of the services offered by the layer below.
Given the complexity of distributed systems, it is
often helpful to organize services into layers.
We present a common view of a layered
architecture in Figure 2.7.
Given the architecture in Figure 2.7, platform and middleware are
defined as follows:
A platform consists of the lowest-level hardware
and software layers that provide services to the
layers above them.
Middleware is a layer of software whose purpose
is to mask heterogeneity and to provide a
convenient programming model to application
programmers.
Tiered architecture
Tiered architectures are complementary to layering.
Tiering is a technique to organize the functionality of a given
layer and to place that functionality on appropriate servers.
The associated two-tier and three-tier solutions are presented
together for comparison in Figure 2.8 (a) and (b), respectively.
In the standard web style of interaction:
The browser sends an HTTP request to a server for a resource (page, image …).
The server replies by sending an entire page.
The constraints of the standard style of interaction:
The time interval between an HTTP request and the arrival of the content is indeterminate,
and the user cannot interact with the page during that interval.
In order to update even a small part of a page, an entire new page must be requested.
The contents of a page cannot be updated in response to changes in the data held at the server.
AJAX addresses the aforementioned constraints:
AJAX (Asynchronous JavaScript And XML) is an extension to the standard client-
server style of interaction used in the World Wide Web.
AJAX can selectively request the data needed to update part of the current page.
AJAX provides a communication mechanism enabling front-end components
running in a browser to issue requests and receive results from back-end
components running on a server.
AJAX constitutes an effective technique for the construction of responsive web
applications in the context of the indeterminate latency of the Internet.
Thin client
The trend in distributed computing is towards moving complexity away
from the end-user device towards services in the Internet.
This trend has given rise to interest in the concept of a thin client.
Thin client refers to a software layer that supports a window-based user
interface that is local to the user while executing application programs or
accessing services on a remote computer.
The main drawback of the thin client architecture is in highly interactive
graphical activities (such as CAD and image processing), where the delays
experienced by users are increased to unacceptable levels.
Virtual network computing (VNC) provides a flexible solution and now
dominates the marketplace.
VNC is a software solution that transmits keyboard and mouse events from
the client, and screen updates from the server, over IP, in the manner of
hardware-based KVM-over-IP products.
Here, a VNC client interacts with a VNC server through a VNC protocol.
Other commonly occurring patterns
The proxy pattern:
Proxies support location transparency.
A proxy is created in the local address space to represent the remote object.
The use of brokerage:
It is used in complex distributed infrastructures.
A broker helps in discovering available services.
It can transform messages between different formats.
It can enforce security policies.
It can implement load balancing by distributing requests across multiple service
instances.
It can provide monitoring and logging features.
Example: Amazon Web Services (AWS) API Gateway, which routes client
requests to the appropriate backend services.
Reflection: supports both introspection (the dynamic discovery of a system’s
properties) and intercession (the ability to dynamically modify structure or behaviour).
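The proxy pattern listed above can be sketched in a few lines. Everything here is hypothetical: RemoteCalculator stands in for an object in another address space, and FakeTransport replaces the network with a direct call, where a real transport would marshal the request into a message.

```python
class RemoteCalculator:
    """Stands in for an object living in another address space."""

    def add(self, a, b):
        return a + b


class FakeTransport:
    """Hypothetical transport; a real one would send the request over the network."""

    def __init__(self, target):
        self._target = target
        self.messages_sent = 0

    def invoke(self, method_name, args):
        self.messages_sent += 1
        return getattr(self._target, method_name)(*args)


class Proxy:
    """Local representative of the remote object with the same interface."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        # Called only for names not found on the proxy itself, so every
        # method of the remote interface is forwarded through the transport.
        def remote_call(*args):
            return self._transport.invoke(method_name, args)

        return remote_call


transport = FakeTransport(RemoteCalculator())
calculator = Proxy(transport)
result = calculator.add(2, 3)  # looks like a local call to the caller
print(result)
```

The caller writes calculator.add(2, 3) without knowing where the object lives, which is the location transparency the pattern is meant to provide.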
2.3.3 Associated middleware solutions
Middleware is a software layer that provides a
programming abstraction as well as masking the
heterogeneity of the underlying networks, hardware,
operating systems and programming languages. For
example:
CORBA: Common Object Request Broker Architecture
RMI: Java Remote Method Invocation
Categories of middleware: Figure 2.12 shows a top-level categorization of
middleware that is driven by the choice of communicating entities and associated
communication paradigms. These categories follow five of the main architectural models:
distributed objects
distributed components
publish-subscribe systems
message queues
web services
2.4 Fundamental models
Fundamental models contain only the essential
ingredients that we need to consider in order to
understand some aspects of a system’s behavior.
Fundamental models examine three important
aspects of distributed systems:
Interaction models, which consider the structure and sequencing
of the communication between the elements of the system.
Failure models, which define the ways in which failure may occur in
order to provide an understanding of the effects of failures.
Security models, which consider how the system is protected
against attempts to interfere with its correct operation or to steal its
data.
2.4.1 Interaction model
We have two opposing extreme interaction
models:
Synchronous distributed systems: make strong
assumptions about time
Asynchronous distributed systems: make no
assumptions about time
Before discussing these two models, we
discuss the following three considerations:
1. Single-program algorithms vs. distributed algorithms
2. Process state
3. Factors affecting interacting processes
1. Single-program algorithms vs. distributed
algorithms:
I. Single-program algorithms: algorithms in simple
programs define a sequence of steps for computation,
typically executed sequentially.
II. Distributed algorithms: distributed systems involve
multiple processes, each following a distributed
algorithm that includes message transmission between
processes.
2. Process state: in a DS, each process has a private
state that is inaccessible to other processes; processes
can coordinate only by exchanging messages, so
communication performance and the lack of a global
notion of time are key challenges.
3. Factors affecting interacting processes: in a DS, two
significant factors affect interacting processes:
a) Communication performance, which has the following characteristics:
Latency, which includes:
• The time taken for the first of a string of bits transmitted
through a network to reach its destination.
• The delay in accessing the network, which increases
significantly when the network is heavily loaded.
• The time taken by the operating system communication
services at both the sending and the receiving processes.
Bandwidth: the total amount of information that can be
transmitted over a computer network in a given time.
Jitter: the variation in the time taken to deliver a series of
messages.
b) Computer clocks and timing events:
Each computer in a DS has its own internal clock,
which can be used by local processes to obtain the
value of the current time.
Local clocks may supply different time values. This is
because computer clocks drift from perfect time and
their drift rates differ from one another.
The term clock drift rate refers to the rate at which a
computer clock deviates from a perfect reference clock.
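The two factors can be made concrete with some small calculations. The formulas below are simplifying assumptions for illustration: delivery time is taken as latency plus message size divided by bandwidth, jitter is measured as the spread between the slowest and fastest delivery, and drift rates are expressed in seconds gained or lost per second. All numbers are invented.

```python
def delivery_time(latency_s, size_bits, bandwidth_bps):
    # Time for a whole message: fixed latency plus the time to push the bits.
    return latency_s + size_bits / bandwidth_bps


def jitter(delivery_times_s):
    # One simple measure of variation: slowest minus fastest delivery.
    return max(delivery_times_s) - min(delivery_times_s)


def clock_skew(drift_rate_a, drift_rate_b, elapsed_s):
    # Two clocks that agreed at the start diverge at the difference
    # of their drift rates.
    return abs(drift_rate_a - drift_rate_b) * elapsed_s


# A 1,000,000-bit message over a 100 Mbit/s link with 5 ms latency.
transfer = delivery_time(0.005, 1_000_000, 100_000_000)

# Three deliveries of the same message under varying load.
spread = jitter([0.015, 0.021, 0.017])

# One clock gains 1 microsecond per second, the other loses as much;
# the skew is measured after one day (86,400 seconds).
skew = clock_skew(1e-6, -1e-6, 86_400)

print(transfer, spread, skew)
```

Even a drift of one part in a million leaves two clocks well over a tenth of a second apart after a day, which is why local clocks cannot simply be compared across machines.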
Based on the three considerations discussed, we
conclude that in a DS it is hard to set limits on:
the time taken for process execution
the time taken for message delivery
the clock drift rate
Synchronous distributed systems model
In this model the following bounds are defined:
The time to execute each step of a process has known lower and upper bounds.
Each message transmitted over a channel is received within a known bounded time.
Each process has a local clock whose drift rate from real time has a known bound.
It is difficult to arrive at realistic values and to provide
guarantees of the chosen values.
However, modelling an algorithm as a synchronous system may be
useful for giving some idea of how it will behave in a real distributed system.
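One reason the bounds matter is that they make timeouts trustworthy: if a reply has not arrived by a deadline derived from the bounds, something must have failed. The sketch below is a simple illustration under stated assumptions; the function name, the request-reply pattern and the numeric bounds are hypothetical.

```python
def reply_deadline(max_message_delay_s, max_processing_s, max_drift_rate):
    # Worst case in real time: request travels, server processes, reply travels.
    worst_case = 2 * max_message_delay_s + max_processing_s
    # The local timer may run fast by up to max_drift_rate, so wait slightly
    # longer on the local clock to be sure the real time has elapsed.
    return worst_case * (1 + max_drift_rate)


# Assumed bounds: 50 ms per message, 100 ms processing, drift of 1e-5.
deadline = reply_deadline(0.05, 0.1, 1e-5)
print(deadline)
```

In an asynchronous system no such deadline exists, because none of the three bounds is known.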
Asynchronous distributed systems model
In this model, there are:
No bounds on process execution speeds.
No bounds on message transmission delays.
No bounds on clock drift rates.
This model exactly models the Internet, in which there is no
intrinsic bound on server or network load.
Actual distributed systems are very often
asynchronous because of the need for processes to
share the processors and for communication channels to
share the network.
2.4.2 Failure model
In a DS, two things may fail:
Processes
Communication channels
There are three categories of failures:
1. Omission failures
2. Arbitrary failures
3. Timing failures
1. Omission failures:
refer to cases in which a process or communication channel fails to
perform actions that it is supposed to perform.
Process omission failures
The chief omission failure of a process is to crash.
Other processes may be able to detect a crash, but this method of
crash detection relies on the use of timeouts.
A process crash is called fail-stop if other processes can detect
with certainty that the process has crashed.
Communication omission failures
A channel produces an omission failure if it does not deliver a message
to the receiver. This is known as ‘dropping messages’ and is generally caused by
lack of buffer space at the receiver or at an intervening gateway, or
by a network transmission error, detected by a checksum carried
with the message data.
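The last cause, a transmission error detected by a checksum, can be sketched as follows. The packet layout (a 4-byte CRC-32 followed by the payload) and the function names are assumptions made for this example, not a real protocol.

```python
import zlib


def make_packet(payload):
    # Prefix the payload with its CRC-32 checksum (4 bytes, big-endian).
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def receive(packet):
    # Recompute the checksum; a mismatch means the data was corrupted in
    # transit, so the message is dropped and never reaches the application.
    checksum, payload = packet[:4], packet[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None
    return payload


packet = make_packet(b"hello")

# Simulate a transmission error by flipping one bit of the last byte.
corrupted = packet[:-1] + bytes([packet[-1] ^ 0x01])

print(receive(packet), receive(corrupted))
```

From the receiving process's point of view the corrupted message simply never arrives, which is why a transmission error shows up as an omission failure rather than as wrong data.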
2. Arbitrary failures:
The term arbitrary or Byzantine failure is used to describe the worst
possible failure semantics, in which any type of error may occur.
For example, a process may set wrong values in its data items,
or it may return a wrong value in response to an invocation.
Arbitrary failures in processes cannot be detected by seeing
whether the process responds to invocations, because it might
arbitrarily omit to reply.
3. Timing failures:
are applicable in synchronous distributed systems, where time
limits are set on process execution time, message delivery time
and clock drift rate.
Timing failures are listed in the following figure.
2.4.3 Security model
Protecting objects
Access rights specify who is allowed to perform the operations
of an object – for example, who is allowed to read or to write its
state.
Figure 2.17 shows a server that manages a collection of objects
on behalf of some users.
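A server-side check of access rights could be sketched as below. The table of rights, the user names and the object name are all hypothetical, and the sketch assumes the server already knows reliably which user issued the request.

```python
# Hypothetical access rights: (user, object) -> operations allowed.
ACCESS_RIGHTS = {
    ("alice", "report"): {"read", "write"},
    ("bob", "report"): {"read"},
}


def is_allowed(user, obj, operation):
    # Unknown users or objects get an empty set, so the request is refused.
    return operation in ACCESS_RIGHTS.get((user, obj), set())


print(is_allowed("alice", "report", "write"))
print(is_allowed("bob", "report", "write"))
```

The server consults the table before carrying out each operation and rejects any request the table does not permit.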
Securing processes and their interactions
To model security threats, we postulate an enemy that is capable of
sending any message to any process and of reading or copying any
message sent between a pair of processes.
Such attacks can be made simply by using a computer connected to a
network to run a program that reads network messages
addressed to other computers on the network, or a program that
generates messages that make false requests to services.
Defeating security threats
Cryptography is the science of keeping messages secure, and
encryption is the process of scrambling a message in such a way
as to hide its contents.
Modern cryptography is based on encryption algorithms that use
secret keys – large numbers that are difficult to guess – to
transform data in a manner that can only be reversed with
knowledge of the corresponding decryption key.
Authentication involves encrypting a portion of a message with a
shared secret key to verify the sender's identity and ensure
message authenticity.
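Authentication with a shared secret key can be sketched using Python's standard hmac module. Note the substitution: instead of encrypting a portion of the message, this sketch attaches a keyed hash (HMAC-SHA256), which serves the same purpose of proving that the sender knows the shared key. The key and the messages are invented for the example.

```python
import hashlib
import hmac

SHARED_KEY = b"example-shared-secret"  # hypothetical key known to both parties


def sign(message, key=SHARED_KEY):
    # The sender computes a tag that depends on both the message and the key.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message, tag, key=SHARED_KEY):
    # The receiver recomputes the tag; compare_digest avoids timing leaks.
    return hmac.compare_digest(sign(message, key), tag)


tag = sign(b"pay 10 to bob")

genuine = verify(b"pay 10 to bob", tag)
tampered = verify(b"pay 99 to bob", tag)
wrong_key = verify(b"pay 10 to bob", tag, key=b"guessed-key")

print(genuine, tampered, wrong_key)
```

Only a party holding the shared key can produce a tag that verifies, so a valid tag shows both that the message came from a key holder and that it was not altered on the way.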