
DSs Lecture Notes

Chapter One
Introduction to Distributed Systems

1.1. Introduction
Rise of Distributed Systems
Computer systems have been undergoing a revolution for decades. From about 1945 to 1985,
computers were large and expensive (each could cost tens of thousands of dollars or more),
they were slow at processing instructions, and there was no way to connect them, so they
operated independently of one another. Parallel computing was a hot topic in the 1970s and
1980s (though its vision had existed since the 1920s), and cluster computing started to
dominate in the 1990s.
The real proliferation came around the mid-1980s, when two major technological innovations
began to change the situation. The first was the development of cheap and powerful
microprocessor-based computers. Initially, these were 8-bit machines, but soon 16-bit, 32-bit,
and 64-bit CPUs became common. The second development was the invention of high-speed
computer networks. Local-area networks (LANs) allow hundreds of machines within a building to
be connected so that large amounts of data can be moved between machines at rates of 100
million to 10 billion bits/sec. In addition, wide-area networks (WANs) allow millions of machines
all over the earth to be connected at speeds varying from 64 Kbps (kilobits per second) to gigabits
per second.
The result of these technologies is that it is feasible to put together computing systems composed
of large numbers of computers connected by a high-speed network to work for the same
application. This is in contrast to the old centralized systems (or single processor systems) where
there was single computer with its peripherals.
Traditional Centralized Systems vs. Distributed Systems
Below Table 1 presents characteristic differences between the traditional centralized and
distributed systems.
Centralized Systems                        Distributed Systems
One component with non-autonomous parts.   Multiple autonomous components.
Component shared by users all the time.    Components are not shared by all users.
All resources accessible.                  Resources may not be accessible.
Software runs in a single process.         Software runs in concurrent processes on different processors.
Single point of control.                   Multiple points of control.
Single point of failure.                   Multiple points of failure.
Table 1: Characteristic Differences Between Traditional Centralized and Distributed Systems

BireZman ([email protected]) UoG, Computer Science Department 1



1.2. Definitions of Distributed Systems


There are various definitions of distributed systems in the literature; here we look only at
the definitions whose characterizations are closest to the interests of this course.
Definition 1 [Andrew S. Tanenbaum et al.]
“A distributed system is a collection of independent computers that appears to its
users as a single coherent system.”
 This definition has two important aspects.
 The first one is that a distributed system consists of components that are
autonomous (or autonomous computers).
 The second aspect is that users (people or programs) think they are dealing with a
single system (or single system view for users). This means that one way or the other
the autonomous components need to collaborate. Establishing this collaboration
lies at the heart of developing distributed systems. Note that no assumptions are
made concerning the type of computers.
Definition 2 [George Coulouris et al.]
“A distributed system is a system in which hardware or software components
located at networked computers communicate and coordinate their actions only by
message passing.”
 The above definition tells us that communication is at the heart of a distributed system
and that it happens only through message passing (never through shared memory). Because of
this characteristic, distributed systems must deal with many problems, such as concurrency
issues, the lack of a global clock, and independent failures of components.
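The "only by message passing" characterization can be illustrated with a minimal sketch in Python's standard socket and threading modules (all names here are made up for illustration): two components coordinate purely by exchanging messages over a network connection, with no shared variables between them. They run in one process here only for convenience; nothing in the exchange relies on shared memory.

```python
import socket
import threading

def server(listener: socket.socket) -> None:
    """Component A: reacts only to messages it receives over the network."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024).decode()        # message in
        conn.sendall(f"echo:{request}".encode())  # message out

def exchange() -> str:
    """Component B sends a request message and waits for the reply."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))               # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    t = threading.Thread(target=server, args=(listener,))
    t.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        reply = client.recv(1024).decode()
    t.join()
    listener.close()
    return reply
```

Because the only channel between the two components is the socket, the problems the definition warns about (no global clock, independent failures) arise naturally: if the server component crashed, the client would simply never receive a reply.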
Definition 3 [Leslie Lamport et al.]
“You know you have a distributed system when the crash of a computer you’ve never
heard of stops you from getting any work done.”
 This definition describes another important aspect of distributed systems:
 The existence of fault tolerance in distributed systems
 Put differently, components can fail independently without causing the system as a whole to fail.
Definition 4 [Generic]
“Distributed system is a collection of processors executing independent instruction
streams that communicate and synchronize their actions.”
“Distributed system is a system in which hardware and software components of
networked computers communicate and coordinate their activity only by passing
messages.”


“Distributed system is a computing platform built with many computers – that


operate concurrently; that are physically distributed (have their own failure modes);
that are linked by a network; and that have independent clocks.”
In order to support heterogeneous computers and networks while offering a single-system view,
distributed systems are often organized by means of a layer of software called Middleware.
Figure 1 below shows this layer logically placed between a higher-level layer consisting of users
and applications and a layer underneath consisting of operating systems and basic
communication facilities. Note that the middleware layer extends over multiple machines and
offers each application the same interface.

Figure 1: Distributed Systems Organization using Middleware Layer


Figure 1 shows four networked computers and three applications, of which application B
is distributed across computers 2 and 3; each application is offered the same interface. The
distributed system provides means for components of a single distributed application to
communicate with each other, but also to let different applications communicate. At the same
time, it hides as best and reasonable as possible the differences in hardware and operating
systems from each application.
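The role of the middleware layer can be sketched in a few lines of Python. All class and method names below are hypothetical, not a real middleware API: two machines expose differently shaped native interfaces, and the middleware offers every application the same put/get interface while hiding which machine, and which native API, holds each item.

```python
class UnixNode:
    """Hypothetical native API on machine 1 (file-style naming)."""
    def __init__(self):
        self.files = {}
    def write_file(self, path, data):
        self.files[path] = data
    def read_file(self, path):
        return self.files[path]

class MainframeNode:
    """Hypothetical, differently shaped native API on machine 2."""
    def __init__(self):
        self.datasets = {}
    def store(self, name, payload):
        self.datasets[name] = payload
    def fetch(self, name):
        return self.datasets[name]

class Middleware:
    """Offers applications one uniform put/get interface and hides
    which machine, and which native API, holds each item."""
    def __init__(self):
        self.nodes = [UnixNode(), MainframeNode()]

    def _pick(self, key):
        # Deterministic placement: the same key always maps to the same node.
        return self.nodes[sum(key.encode()) % len(self.nodes)]

    def put(self, key, value):
        node = self._pick(key)
        if isinstance(node, UnixNode):
            node.write_file(key, value)   # adapt to machine 1's API
        else:
            node.store(key, value)        # adapt to machine 2's API

    def get(self, key):
        node = self._pick(key)
        if isinstance(node, UnixNode):
            return node.read_file(key)
        return node.fetch(key)
```

An application calling `put`/`get` never learns which node answered, which is exactly the single-system view the middleware layer is meant to provide.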
1.3. Goals of Distributed Systems
Just because it is possible to build distributed systems does not necessarily mean that it is a good
idea. In this section we’ll look at four important goals that should be met to make building a
distributed system worth the effort. A distributed system should make resources easily
accessible, it should reasonably hide the fact that resources are distributed across a network, it
should be open and it should be scalable.
1.3.1. Making Resources Accessible
One main goal of a distributed system is to make it easy for the users (and applications) to access
remote resources and to share them in a controlled and efficient way. Resources can be just


about anything, but typical examples include things like printers, computers, storage facilities,
data, files, Web pages and networks.
There are many reasons for wanting to share resources. One obvious reason is that of economics.
It makes economic sense to share costly resources such as supercomputers, high-performance
storage systems, printers and other expensive peripherals. Connecting users and resources also
makes it easier to collaborate and exchange information.
1.3.2. Distribution Transparency
Another important goal of distributed systems is to hide the fact that their processes and
resources are physically distributed across multiple computers. A distributed system that is
able to present itself to users and applications as if it were only a single computer system
is said to be transparent. The concept of transparency can be applied to several aspects of a
distributed system; the most important are presented in Table 2 below.

Transparency Description
Access Hide differences in data representation and how a resource is accessed.
Location Hide where a resource is physically located.
Migration Hide that a resource may move to another location.
Relocation Hide that a resource may be moved to another location while in use.
Replication Hide that a resource is replicated.
Concurrency Hide that a resource may be shared by several competitive users.
Failure Hide the failure and recovery of a resource.

Table 2: Various aspects of Distributed Transparency


a) Access Transparency
Deals with hiding differences in data representation and the way that resources can be accessed
by users. At a basic level, we wish to hide differences in machine architectures, but more
important is that we reach agreement on how data is to be represented by different machines
and operating systems. For example, a distributed system may have computer systems that run
different operating systems, each having their own file-naming conventions. Differences in
naming conventions, as well as how files can be manipulated, should all be hidden from users
and applications.
b) Location Transparency
Refers to the fact that users cannot tell where a resource is physically located in the system.
Naming plays an important role in achieving location transparency. In particular, location
transparency can be achieved by assigning only logical names to resources, that is, names in


which the location of a resource is not secretly encoded. An example of such a name is the URL
http://www.prenhall.com/index.html which gives no clue about the location of Prentice Hall's
main Web server.
c) Migration Transparency
The URL also gives no clue as to whether index.html has always been at its current location or
was recently moved there. Distributed systems in which resources can be moved without
affecting how those resources can be accessed are said to provide migration transparency.
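Location and migration transparency can be illustrated with a tiny, hypothetical logical name service: clients always use the logical name, a resolver maps it to a physical host behind the scenes, and the resource can move without any client noticing. The host names and URL below are made up for illustration.

```python
# Hypothetical name service: logical names -> physical hosts.
NAME_SERVICE = {
    "http://www.example.com/index.html": "server-eu-3.example.internal",
}

def resolve(logical_name: str) -> str:
    """Return the physical host for a logical name. Clients never see
    (and the name never encodes) the physical location."""
    return NAME_SERVICE[logical_name]

def relocate(logical_name: str, new_host: str) -> None:
    """Migration transparency: the resource moves to a new host,
    but the logical name every client uses stays the same."""
    NAME_SERVICE[logical_name] = new_host
```

After a call to `relocate`, `resolve` returns the new host while every client keeps using the unchanged URL, which is precisely why a logical name must not secretly encode a location.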
d) Relocation Transparency
Even stronger is the situation in which resources can be relocated while they are being accessed
without the user or application noticing anything. In such cases, the system is said to support
relocation transparency. An example of relocation transparency is when mobile users can
continue to use their wireless laptops while moving from place to place without ever being
(temporarily) disconnected.
e) Replication Transparency
Resources may be replicated to increase availability or to improve performance by placing a copy
close to the place where it is accessed. Replication transparency deals with hiding the fact that
several copies of a resource exist. To hide replication from users, it is necessary that all replicas
have the same name. Consequently, a system that supports replication transparency should
generally support location transparency as well, because it would otherwise be impossible to
refer to replicas at different locations.
f) Concurrency Transparency
Enables several processes to operate concurrently using shared resources without interference
between them.
g) Failure Transparency
Enables the concealment of faults, allowing users and application programs to complete their
tasks despite the failure of hardware or software components. Making a distributed system
failure transparent means that a user does not notice that a resource (he has possibly never heard
of) fails to work properly, and that the system subsequently recovers from that failure.
1.3.3. Openness in a Distributed System
Another important goal of distributed systems is openness. An open distributed system is a
system that offers services according to standard rules that describe the syntax and semantics of
those services. For example, in computer networks, standard rules govern the format, contents,
and meaning of messages sent and received. Such rules are formalized in protocols. In distributed
systems, services are generally specified through interfaces, which are often described in an
Interface Definition Language (IDL). Such an interface should be properly specified.


In addition to well-defined interfaces, interoperability and portability are goals for an open
distributed system. Interoperability characterizes the extent to which two implementations of
systems or components from different manufacturers can co-exist and work together by merely
relying on each other's services, as specified by a common standard. Portability characterizes
the extent to which an application developed for a distributed system A can be executed,
without modification, on a different distributed system B that implements the same interfaces as A.
Another important goal for an open distributed system is that it should be easy to configure the
system out of different components (possibly from different developers). Also, it should be easy
to add new components or replace existing ones without affecting those components that stay
in place. In other words, an open distributed system should also be extensible.
1.3.4. Scalability in Distributed Systems
Distributed systems operate effectively and efficiently at many different scales, ranging from a
small intranet to the Internet. A system is described as scalable if it will remain effective when
there is a significant increase in the number of resources and the number of users. It allows the
system and applications to expand in scale without change to the system structure or the
application algorithms.
Scalability of a system can be measured along at least three different dimensions. First, a system
can be scalable with respect to its size, meaning that we can easily add more users and resources
to the system. Second, a geographically scalable system is one in which the users and resources
may lie far apart. Third, a system can be administratively scalable, meaning that it can still
be easy to manage even if it spans many independent administrative organizations.

Quiz
Given the above definitions and goals, what challenges (or limitations) come with
distributed system designs?

1.4. Types of Distributed Systems


1.4.1. Distributed Computing Systems
This is a class of distributed systems used for high-performance computing tasks. Distributed
computing systems fall into two subgroups: cluster computing and grid computing.
a) Cluster Computing
In cluster computing, the underlying hardware consists of a collection of similar workstations
or PCs (homogeneous), closely connected by means of a high-speed local-area network. In
addition, each node runs the same operating system. Clusters are used for parallel programming,
in which a single compute-intensive program is run in parallel on multiple machines.


Figure 2: Example of Cluster Computing System


Each cluster consists of a collection of compute nodes that are controlled and accessed by means
of a single master node. The master typically handles the allocation of nodes to a particular
parallel program, maintains a batch queue of submitted jobs, and provides an interface for the
users of the system. The master actually runs the middleware needed for the execution of
programs and management of the cluster, while the compute nodes often need nothing else but
a standard operating system.
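The master's job described above (maintaining a batch queue of submitted jobs and allocating compute nodes to them) can be sketched as follows. This is a simplified, hypothetical model, not the interface of any real cluster middleware.

```python
from collections import deque

class Master:
    """Hypothetical cluster master: keeps a batch queue of submitted
    jobs and allocates free compute nodes to them in FIFO order."""
    def __init__(self, compute_nodes):
        self.free_nodes = list(compute_nodes)
        self.queue = deque()       # jobs waiting for enough free nodes
        self.running = {}          # job -> list of allocated nodes

    def submit(self, job, nodes_needed):
        """Interface offered to users: enqueue a job, then try to run it."""
        self.queue.append((job, nodes_needed))
        self.schedule()

    def schedule(self):
        """Start queued jobs while enough free nodes exist for the head job."""
        while self.queue and len(self.free_nodes) >= self.queue[0][1]:
            job, n = self.queue.popleft()
            self.running[job] = [self.free_nodes.pop() for _ in range(n)]

    def finish(self, job):
        """Return a finished job's nodes to the pool and reschedule."""
        self.free_nodes.extend(self.running.pop(job))
        self.schedule()
```

With three nodes, a job needing two nodes runs immediately, a second two-node job waits in the batch queue, and it is started automatically as soon as the first job finishes and releases its nodes.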
b) Grid Computing
The grid computing subgroup consists of distributed systems that are often constructed as a
federation of computer systems. Grid computing systems have a high degree of heterogeneity: no
assumptions are made concerning hardware, operating systems, networks, administrative
domains, security policies, etc. A key issue in a grid computing system is that resources from
different organizations are brought together to allow the collaboration of a group of people or
institutions.
1.4.2. Distributed Information Systems
In many cases, a networked application simply consisted of a server running that application and
making it available to remote programs (clients). Such clients could send a request to the server
for executing a specific operation, after which a response would be sent back. Integration at the
lowest level would allow clients to wrap a number of requests into a single larger request and
have it executed as a distributed transaction; all or none of the requests would be executed.
a) Transaction Processing Systems
For database applications, operations on a database are usually carried out in the form of
transactions. Programming using transactions requires special primitives that must either be
supplied by the underlying distributed system or by the language runtime system. The exact list
of primitives depends on what kinds of objects are being used in the transaction. Typical
examples of transaction primitives are shown below in Table 3.


Primitives Descriptions

BEGIN_TRANSACTION Mark the start of a transaction
END_TRANSACTION Terminate the transaction and try to commit
ABORT_TRANSACTION Kill the transaction and restore the old values
READ Read data from a file, a table, or otherwise
WRITE Write data to a file, a table, or otherwise
Table 3: Example of Primitives for Transaction
Assume the following two banking operations:
 withdraw an amount x from account 1
 deposit the amount x to account 2
What happens if there is a problem after the first operation is carried out? The solution is to
group the two operations into one transaction, so that either both are carried out or neither is.
This all-or-nothing behavior is one of the four characteristic properties of transactions,
collectively referred to as the ACID properties:
Atomic – to the outside world, the transaction happens indivisibly; a
transaction either happens completely or not at all; intermediate states are not seen
by other processes.
Consistent – the transaction does not violate system invariants; e.g., in an internal
transfer in a bank, the amount of money in the bank must be the same as it was before
the transfer (the law of conservation of money); this may be violated for a brief period
of time, but it must not be visible to other processes.
Isolated – concurrent transactions do not interfere with each other; if two or more
transactions are running at the same time, the final result must look as though all
transactions run sequentially in some order.
Durable – once a transaction commits, the changes are permanent.
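The all-or-nothing transfer above can be sketched with the primitives of Table 3. This is a minimal in-memory illustration with hypothetical names, not a real transaction manager: `begin` snapshots the old values, `commit` makes the writes final, and `abort` restores the snapshot so that neither operation takes effect.

```python
class Transaction:
    """Minimal in-memory sketch of BEGIN/END/ABORT transaction semantics."""
    def __init__(self, accounts):
        self.accounts = accounts
        self.snapshot = None

    def begin(self):                 # BEGIN_TRANSACTION
        self.snapshot = dict(self.accounts)

    def write(self, acct, amount):   # WRITE
        self.accounts[acct] = amount

    def commit(self):                # END_TRANSACTION: changes become final
        self.snapshot = None

    def abort(self):                 # ABORT_TRANSACTION: restore the old values
        self.accounts.clear()
        self.accounts.update(self.snapshot)

def transfer(accounts, src, dst, x):
    """Withdraw x from src and deposit x to dst as one atomic action."""
    tx = Transaction(accounts)
    tx.begin()
    try:
        if accounts[src] < x:
            raise ValueError("insufficient funds")
        tx.write(src, accounts[src] - x)
        tx.write(dst, accounts[dst] + x)
        tx.commit()                  # both operations take effect together
    except Exception:
        tx.abort()                   # neither operation takes effect
        raise
```

If the deposit cannot happen (for instance, the withdrawal raises an error first), the abort path undoes the partial work, which is exactly the atomicity property described above.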
b) Distributed Pervasive Systems
The distributed systems discussed so far are characterized by their stability; fixed nodes having
high-quality connection to a network.
There are also mobile and embedded computing devices which are small, battery-powered,
mobile, and with a wireless connection. They are referred to as distributed pervasive systems, in
which instability is the default behavior.


There are three requirements for pervasive applications:


a) Embrace contextual changes: a device is aware that its environment may change all the
time, e.g., changing its network access point
b) Encourage ad hoc composition: devices are used in different ways by different users
c) Recognize sharing as the default: devices join a system to access or provide information.
Some examples of pervasive systems are:
 Home Systems that integrate consumer electronics
 Electronic Health Care Systems to monitor the well-being of individuals
 Sensor Networks
