(Given) DSs CH 01 - Introduction
Chapter One
Introduction to Distributed Systems
1.1. Introduction
Rise of Distributed Systems
Computer systems have been undergoing a revolution for almost a century. From 1945 to 1985, for
instance, computers were large and expensive, each costing at least tens of thousands of dollars;
they were slow in processing instructions; and, lacking any way to connect them, they operated
independently of one another. In the 1970s and 1980s, however, the concept of Parallel Computing
was a hot topic (though its vision had existed since the 1920s), and Cluster Computing started to
dominate in the 1990s.
The real proliferation came around the mid-1980s, when two major technological innovations began
to change the situation. The first was the development of cheap and powerful
microprocessor-based computers. Initially, these were 8-bit machines, but soon 16-bit, 32-bit,
and 64-bit CPUs became common. The second development was the invention of high-speed
computer networks. Local-area networks (LANs) allow hundreds of machines within a building to
be connected so that large amounts of data can be moved between machines at rates of 100
million to 10 billion bits/sec. In addition, wide-area networks (WANs) allow millions of machines
all over the earth to be connected at speeds varying from 64 Kbps (kilobits per second) to gigabits
per second.
The result of these technologies is that it is now feasible to put together computing systems
composed of large numbers of computers connected by a high-speed network and working on the same
application. This is in contrast to the old centralized systems (or single-processor systems), in
which there was a single computer with its peripherals.
Traditional Centralized Systems vs. Distributed Systems
Table 1 below presents the characteristic differences between traditional centralized systems and
distributed systems.
Centralized Systems | Distributed Systems
One component with non-autonomous parts. | Multiple autonomous components.
Component shared by users all the time. | Components are not shared by all users.
All resources accessible. | Resources may not be accessible.
Software runs in a single process. | Software runs in concurrent processes on different processors.
Single point of control. | Multiple points of control.
Single point of failure. | Multiple points of failure.
Table 1: Characteristic Differences Between Traditional Centralized and Distributed Systems
about anything, but typical examples include things like printers, computers, storage facilities,
data, files, Web pages and networks.
There are many reasons for wanting to share resources. One obvious reason is that of economics.
It makes economic sense to share costly resources such as supercomputers, high-performance
storage systems, printers and other expensive peripherals. Connecting users and resources also
makes it easier to collaborate and exchange information.
1.3.2. Distribution Transparency
Another important goal of distributed systems is to hide the fact that their processes and resources
are physically distributed across multiple computers. A distributed system that is able to present
itself to users and applications as if it were only a single computer system is said to be
transparent. The concept of transparency can be applied to several aspects of a distributed
system; the most important of these are presented in Table 2 below.
Transparency | Description
Access | Hide differences in data representation and how a resource is accessed.
Location | Hide where a resource is physically located.
Migration | Hide that a resource may move to another location.
Relocation | Hide that a resource may be moved to another location while in use.
Replication | Hide that a resource is replicated.
Concurrency | Hide that a resource may be shared by several competitive users.
Failure | Hide the failure and recovery of a resource.
which the location of a resource is not secretly encoded. An example of such a name is the URL
http://www.prenhall.com/index.html, which gives no clue about the location of Prentice Hall's
main Web server.
c) Migration Transparency
The URL also gives no clue as to whether index.html has always been at its current location or
was recently moved there. Distributed systems in which resources can be moved without
affecting how those resources can be accessed are said to provide migration transparency.
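The idea behind location and migration transparency can be illustrated with a small sketch. It assumes a hypothetical in-memory name service (the class NameService, the function fetch, and the server names are all invented for illustration): clients refer to a resource by name only, so moving the resource changes the mapping but not the client code.

```python
class NameService:
    """Hypothetical name service: maps logical names to current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, name, location):
        self._locations[name] = location

    def migrate(self, name, new_location):
        # Only the mapping changes; the name that clients use stays the same.
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = new_location

    def resolve(self, name):
        return self._locations[name]


def fetch(names, name):
    # Client code knows the name only, never the physical location.
    return f"fetched {name} from {names.resolve(name)}"


names = NameService()
names.register("index.html", "server-a")   # server names are assumptions
before = fetch(names, "index.html")

names.migrate("index.html", "server-b")    # resource moves to another machine
after = fetch(names, "index.html")         # same call, same name
```

The client call is identical before and after the move, which is the property the text describes; a real system would resolve names over the network rather than in a local dictionary.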
d) Relocation Transparency
Even stronger is the situation in which resources can be relocated while they are being accessed
without the user or application noticing anything. In such cases, the system is said to support
relocation transparency. An example of relocation transparency is when mobile users can
continue to use their wireless laptops while moving from place to place without ever being
(temporarily) disconnected.
e) Replication Transparency
Resources may be replicated to increase availability or to improve performance by placing a copy
close to the place where it is accessed. Replication transparency deals with hiding the fact that
several copies of a resource exist. To hide replication from users, it is necessary that all replicas
have the same name. Consequently, a system that supports replication transparency should
generally support location transparency as well, because it would otherwise be impossible to
refer to replicas at different locations.
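As a rough sketch of this idea, the following code keeps several copies of a resource under one name and serves reads from the closest copy. The class ReplicatedStore, the location names, and the latency figures are assumptions made for illustration, not part of any real system.

```python
class ReplicatedStore:
    """Hypothetical store in which all replicas share a single name."""

    def __init__(self):
        self._replicas = {}          # name -> list of (latency_ms, location, content)
        self.last_served_by = None   # kept only so the example can be inspected

    def add_replica(self, name, location, latency_ms, content):
        self._replicas.setdefault(name, []).append((latency_ms, location, content))

    def read(self, name):
        # Choose the replica with the lowest latency; the caller sees only content.
        latency_ms, location, content = min(
            self._replicas[name], key=lambda replica: replica[0]
        )
        self.last_served_by = location
        return content


store = ReplicatedStore()
store.add_replica("report.pdf", "eu-server", 120, b"report contents")
store.add_replica("report.pdf", "local-cache", 5, b"report contents")

data = store.read("report.pdf")   # the caller never names a replica
```

Because the caller passes only the name, it cannot tell how many copies exist or which one answered, which is why replication transparency relies on location transparency.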
f) Concurrency Transparency
Concurrency transparency enables several processes to operate concurrently on shared resources
without interfering with one another.
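One common way to prevent such interference is mutual exclusion. The sketch below assumes a hypothetical SharedCounter resource accessed by several threads within a single process; a real distributed system would need a distributed locking or transaction mechanism, but the principle is the same: the locking is hidden inside the resource, so each user behaves as if it were the only one.

```python
import threading


class SharedCounter:
    """Hypothetical shared resource that serializes concurrent updates."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock is internal; callers are unaware of other users.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, times):
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = counter.value   # every increment is accounted for
```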
g) Failure Transparency
Failure transparency enables the concealment of faults, allowing users and application programs to
complete their tasks despite the failure of hardware or software components. Making a distributed
system failure transparent means that a user does not notice that a resource (which they may never
have heard of) fails to work properly, and that the system subsequently recovers from that failure.
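A simple way to picture failure masking is failover between replicas. The sketch below is only an illustration under stated assumptions: ReplicaDown, make_replica, and call_with_failover are hypothetical names, and the replicas are plain local functions rather than remote servers.

```python
class ReplicaDown(Exception):
    """Raised by a (simulated) replica that cannot serve requests."""


def make_replica(name, healthy):
    # Returns a function standing in for a remote server.
    def handle(request):
        if not healthy:
            raise ReplicaDown(name)
        return f"{request} handled by {name}"
    return handle


def call_with_failover(replicas, request):
    # Try each replica in turn; the caller sees a failure only if all are down.
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as error:
            last_error = error
    raise RuntimeError("all replicas failed") from last_error


replicas = [
    make_replica("primary", healthy=False),   # simulated crash
    make_replica("backup", healthy=True),
]
result = call_with_failover(replicas, "read file")
```

The caller receives a normal result even though the first replica failed. Full failure transparency is harder in practice, because a slow server cannot always be distinguished from a dead one.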
1.3.3. Openness in a Distributed System
Another important goal of distributed systems is openness. An open distributed system is a
system that offers services according to standard rules that describe the syntax and semantics of
those services. For example, in computer networks, standard rules govern the format, contents,
and meaning of messages sent and received. Such rules are formalized in protocols. In distributed
systems, services are generally specified through interfaces, which are often described in an
Interface Definition Language (IDL). Such interfaces should be properly specified.
In addition to well-defined interfaces, interoperability and portability are goals for an open
distributed system. Interoperability characterizes the extent to which two implementations of
systems or components from different manufacturers can co-exist and work together by merely
relying on each other's services as specified by a common standard. Portability characterizes the
extent to which an application developed for a distributed system A can be executed, without
modification, on a different distributed system B that implements the same interfaces as A.
Another important goal for an open distributed system is that it should be easy to configure the
system out of different components (possibly from different developers). Also, it should be easy
to add new components or replace existing ones without affecting those components that stay
in place. In other words, an open distributed system should also be extensible.
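The role of interfaces in openness can be sketched in code. The example below uses a Python abstract base class as a stand-in for an IDL specification; FileService, VendorAStore, VendorBStore, and save_and_reload are hypothetical names chosen for illustration. The application function depends only on the interface, so it runs unmodified against either implementation.

```python
from abc import ABC, abstractmethod


class FileService(ABC):
    """Hypothetical interface, standing in for an IDL description."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class VendorAStore(FileService):
    # Keeps only the latest contents of each file.
    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class VendorBStore(FileService):
    # Different internals (keeps every version), same interface.
    def __init__(self):
        self._versions = {}

    def read(self, path):
        return self._versions[path][-1]

    def write(self, path, data):
        self._versions.setdefault(path, []).append(data)


def save_and_reload(service: FileService, path: str, data: bytes) -> bytes:
    # Application code written against the interface only.
    service.write(path, data)
    return service.read(path)


results = [
    save_and_reload(service, "/notes.txt", b"hello")
    for service in (VendorAStore(), VendorBStore())
]
```

Replacing one implementation with the other requires no change to save_and_reload, which mirrors the portability and extensibility properties described above.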
1.3.4. Scalability in Distributed Systems
Distributed systems operate effectively and efficiently at many different scales, ranging from a
small intranet to the Internet. A system is described as scalable if it will remain effective when
there is a significant increase in the number of resources and the number of users. It allows the
system and applications to expand in scale without change to the system structure or the
application algorithms.
Scalability of a system can be measured along at least three different dimensions. First, a system
can be scalable with respect to its size, meaning that we can easily add more users and resources
to the system. Second, a geographically scalable system is one in which the users and resources
may lie far apart. Third, a system can be administratively scalable, meaning that it can still be
easy to manage even if it spans many independent administrative organizations.
Quiz
Given the above definitions and goals, what challenges (or limitations)
come with distributed system designs?
Primitives Descriptions