Introduction to Distributed System
System – ITT458-02
Unit-1
Introduction to Distributed Systems
Syllabus
Software: System Software, Application Software
Operating System: Windows, Linux, Mac
Hardware: CPU, RAM, I/O
OS Example
[Figure: regular OS example with labeled components (P2, P3, P4, F2, T2, D1, D2, D3); Client-1 and Client-2 exchange Request and Reply messages over a network]
NOS Example
[Figure: communication network]
Distributed Operating System
Definition by Coulouris, Dollimore, Kindberg and Blair
• “A distributed system is defined as one in which
components at networked computers communicate and
coordinate their actions only by passing messages.”
• “A distributed system is a collection of independent
computers that are connected through a network.”
Distributed Operating System
• A good example of a distributed system is the Amazon website
(www.amazon.in).
[Figure: User, Internet, Web Server]
Vertical Scaling
[Figure: Scenario-2. Want to process 500GB of data with 4GB of RAM. Add more processors/systems (DOS).]
Horizontal Scaling
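The scenario above (500GB of data, 4GB of RAM) can be worked through with a little arithmetic. This is only an illustrative sketch: the function names `machines_needed` and `partition` are hypothetical, and it assumes that the data is split evenly and that each machine can devote all of its RAM to its share.

```python
import math

def machines_needed(data_gb, ram_gb_per_machine):
    """Minimum number of machines so that each share fits in one machine's RAM."""
    return math.ceil(data_gb / ram_gb_per_machine)

def partition(items, n_machines):
    """Split items round-robin into n_machines roughly equal shares."""
    return [items[i::n_machines] for i in range(n_machines)]

# Assumed scenario: 500GB of data, 4GB of RAM per machine.
print(machines_needed(500, 4))          # 125
print(partition(list(range(10)), 3))    # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Adding machines until each share fits is the horizontal-scaling idea; vertical scaling would instead increase `ram_gb_per_machine` on a single system.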
Examples of Distributed Systems
• By definition, a distributed system appears to its users as a
single system.
• Take the Google web server: when users submit a search
query, they perceive the Google web server as a single system.
• Just visit google.com, then search.
• However, under the hood Google runs a large number of servers,
distributed across different geographical areas, to return
a search result within a few seconds.
• So the distributed nature of the system is invisible to
ordinary users.
• Web Search Engines:
– A major growth industry in the last decade.
– About 10 billion searches per month globally.
– e.g. Google's distributed infrastructure
• Massively multiplayer online games:
– Large numbers of people interact through the Internet with a
virtual world.
– Challenges include fast response times and real-time propagation of
events.
The Top 20 Valuable Facebook Statistics
(Zephoria, Updated Dec., 2017)
Why Distributed Operating System?
• Facebook currently has 1.5 billion monthly active users.
• Google performs at least 1 trillion searches per year.
• About 48 hours of video are uploaded to YouTube every minute.
• A single system would be unable to handle this processing; hence
the need for distributed systems.
• The main reason is to cope with users' extremely high demand
for both processing power and data storage.
• A single system cannot meet demand at this level.
• Distributed systems are viable for many reasons,
such as high availability, scalability, and resistance to failure.
Why Distributed Operating System?
• It is challenging and interesting.
• Partial failures
– Network failures
– Node failures
• Concurrency
– Nodes execute in parallel (parallel computing).
– Messages travel asynchronously.
Network OS vs Distributed OS
• Network Operating System: A network operating system is made up of
software and associated protocols that allow a set of networked
computers to be used together.
• Distributed Operating System: A distributed operating system looks
like an ordinary centralized operating system but runs on multiple
independent CPUs.
× Load Transfer
Advantages of Distributed Systems over Independent PCs
• Data sharing: Allows many users to access a common database.
• Resource sharing: Expensive peripherals such as color laser
printers, phototypesetters, and massive archival storage devices
can also be shared.
Disadvantages of Distributed Systems
• Software:
– The software is more complex.
• Network problems:
– Network saturation.
– Network malfunction.
• Security:
– Security violations are possible, since private data is
visible to others over the network.
Classification of Distributed Systems
• Based on hardware
• Based on the number of instruction and data streams
Classification based on Hardware
• Even though all distributed systems consist of multiple CPUs, there
are several different ways the hardware can be organized, especially
in terms of how the CPUs are interconnected and communicate.
– Multiprocessor (shared memory)
– Multicomputer (private memory)
Tightly-Coupled OS (Shared Memory)
Shared-memory machine: The n processors share a physical
address space. Communication can be done through shared
memory.
[Figure: four processors (P) attached to one shared memory]
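[Figure: shared-memory example. Processors update a shared variable A (A=10, A=10+10, A=A+10) and other processors read the updated value A=20 from the same memory.]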
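As a rough illustration of communication through shared memory, the sketch below uses Python threads as stand-ins for processors. The names `run_shared_memory_demo`, `writer`, and `reader` are hypothetical, and the lock is an assumption added here to make the read-modify-write safe; the slides do not specify a synchronization mechanism.

```python
import threading

def run_shared_memory_demo(n_readers=3):
    """One 'processor' updates A in shared memory; the others read it directly."""
    memory = {"A": 10}            # a single address space visible to all threads
    lock = threading.Lock()       # assumed guard for the shared variable
    seen = []

    def writer():
        with lock:
            memory["A"] = memory["A"] + 10   # A = A + 10

    def reader():
        with lock:
            seen.append(memory["A"])         # no message needed: just read memory

    w = threading.Thread(target=writer)
    w.start()
    w.join()                                 # make sure the write happens first

    readers = [threading.Thread(target=reader) for _ in range(n_readers)]
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    return seen

print(run_shared_memory_demo())   # [20, 20, 20]
```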
Unit 1: Introduction to DOS 40 Darshan Institute of Engineering & Technology
Loosely-Coupled OS (Private Memory)
Private-memory machine: Each processor has its own local
memory. Communication can be done through message passing.
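[Figure: four processors (P), each with its own private memory (M). A value of A held in one processor's memory (A=10, later A=20) is read by the other processors through messages.]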
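For contrast with shared memory, here is a minimal sketch of message passing between nodes with private memory. It simulates nodes with threads and queues on one machine; the `Node` class, the message tuple layout, and `run_message_passing_demo` are all hypothetical choices made for illustration.

```python
import queue
import threading

class Node:
    """A processor with private memory; other nodes can only send it messages."""

    def __init__(self, name):
        self.name = name
        self.memory = {}              # private: never accessed by other nodes
        self.inbox = queue.Queue()    # the only way to reach this node

    def serve(self):
        """Handle messages until a 'stop' message arrives."""
        while True:
            kind, key, value, reply_to = self.inbox.get()
            if kind == "stop":
                break
            if kind == "write":
                self.memory[key] = value
            elif kind == "read":
                reply_to.put(self.memory.get(key))

def run_message_passing_demo():
    owner = Node("P1")
    worker = threading.Thread(target=owner.serve)
    worker.start()

    reply = queue.Queue()
    owner.inbox.put(("write", "A", 20, None))     # ask P1 to store A=20
    owner.inbox.put(("read", "A", None, reply))   # ask P1 for the value of A
    value = reply.get(timeout=5)

    owner.inbox.put(("stop", None, None, None))
    worker.join()
    return value

print(run_message_passing_demo())   # 20
```

The reader never touches `owner.memory`; it learns the value of A only from the reply message.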
[Figures: an instruction pool and data streams feeding processing units (PU); the + instruction is applied to the operand pairs (A, B), (C, D), and (E, F)]
Classification based on Instruction & DataStream
• Multiple instruction streams, single data stream
(MISD)
– Hardly any real computers fit this model.
– An uncommon architecture that is generally used for fault
tolerance.
[Figure: MISD. Two processing units (PU) apply different instructions (+ and *) from the instruction pool to the same data stream (A, B).]
• Multiple instruction streams, multiple data
streams (MIMD)
– A group of independent computers, each with its own program counter,
program, and data.
– A computer that can run multiple processes or threads that are cooperating
towards a common objective.
[Figure: MIMD. Processing units (PU) apply different instructions (+ and *) from the instruction pool to different data (A, B and C, D).]
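The difference between MISD and MIMD can be sketched in a few lines. This is a sequential simulation of the taxonomy, not parallel hardware; the function names `misd` and `mimd` and the sample operands are assumptions made for illustration.

```python
import operator

def misd(instructions, data):
    """Each processing unit applies a different instruction to the same data."""
    a, b = data
    return [op(a, b) for op in instructions]

def mimd(instructions, data_streams):
    """Each processing unit applies its own instruction to its own data."""
    return [op(a, b) for op, (a, b) in zip(instructions, data_streams)]

ops = [operator.add, operator.mul]           # the + and * instructions
print(misd(ops, (2, 3)))                     # [5, 6]
print(mimd(ops, [(2, 3), (4, 5)]))           # [5, 20]
```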
Distributed System Models
• Minicomputer Model
• Workstation Model
• Workstation-Server Model
• Processor-Pool Model
• Hybrid Model
Minicomputer Model
• Resource sharing
– Database
– High-performance devices
• Example:
– ARPAnet
[Figure: several minicomputers, each with attached terminals (T), connected through a communication network]
Workstation Model
• Process migration
– A user first logs on to his or her personal workstation.
– If there are idle remote workstations, a heavy job may migrate to
one of them.
• Problems:
– What if a user logs on to the remote machine?
– How can an idle workstation be found?
[Figure: workstations connected through a communication network]
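One simple way to approach the "find an idle workstation" problem is to compare reported loads against a threshold. The sketch below is only an assumption about how this might be done: the function names, the load values, and the 0.1 idle threshold are hypothetical and are not taken from the slides.

```python
def find_idle_workstation(loads, threshold=0.1, exclude=None):
    """Return the least-loaded workstation whose load is below threshold, or None."""
    candidates = {
        name: load
        for name, load in loads.items()
        if name != exclude and load < threshold
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

def place_job(home, loads, heavy):
    """Run light jobs at home; migrate heavy jobs to an idle workstation if one exists."""
    target = find_idle_workstation(loads, exclude=home) if heavy else None
    return target or home

loads = {"ws1": 0.90, "ws2": 0.05, "ws3": 0.02}   # assumed CPU load per workstation
print(place_job("ws1", loads, heavy=True))    # ws3
print(place_job("ws1", loads, heavy=False))   # ws1
```

This does not address the other problem listed above: what to do with a migrated job when the owner of the remote workstation logs on.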
Workstation-Server Model
[Figure: workstations connected through a communication network to minicomputers used as a file server, a database server, and a print server]
• Client workstations
– Diskless
– Graphic/interactive applications are processed locally.
– All file, print, HTTP, and even cycle (computation) requests are sent to servers.
• Server minicomputers
– Each minicomputer is dedicated to one or more different types of services.
• Client-server model of communication
– RPC (Remote Procedure Call)
– RMI (Remote Method Invocation)
– A client process calls a function of a server process.
– No process migration is involved.
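A remote procedure call can be tried out with Python's standard `xmlrpc` modules. This is a minimal sketch, not the mechanism used by any particular system in these slides: the `add` function, `start_server`, and `call_add` are hypothetical names, and both client and server run on the local machine for convenience.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    """Server-side function: it executes in the server, not in the client."""
    return a + b

def start_server():
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def call_add(a, b):
    server = start_server()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}") as proxy:
            # Looks like a local call, but travels to the server as a message.
            return proxy.add(a, b)
    finally:
        server.shutdown()
        server.server_close()

print(call_add(2, 3))   # 5
```

The client only sends a request and receives a reply; no process moves between machines.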
Processor-Pool Model
[Figure: terminals connected through a communication network to a run server, a file server, and a pool of processors]
• Clients:
– Users log in at one of the terminals (diskless workstations).
– All services are dispatched to servers.
• Servers:
– The necessary number of processors is allocated to each user
from the pool.
– Better utilization of resources.
• Example:
– Web search engines
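Allocation from a processor pool can be sketched as simple bookkeeping. The `ProcessorPool` class below is a hypothetical illustration that assumes processors are interchangeable and identified by integers; a real run server would also need scheduling and queuing policies.

```python
class ProcessorPool:
    """Tracks which processors in the pool are free and which user holds the rest."""

    def __init__(self, size):
        self.free = list(range(size))   # ids of idle processors
        self.allocated = {}             # user -> list of processor ids

    def allocate(self, user, count):
        """Give `count` processors to `user`, or fail if the pool is too small."""
        if count > len(self.free):
            raise RuntimeError("not enough free processors")
        granted, self.free = self.free[:count], self.free[count:]
        self.allocated.setdefault(user, []).extend(granted)
        return granted

    def release(self, user):
        """Return all of a user's processors to the pool."""
        self.free.extend(self.allocated.pop(user, []))

pool = ProcessorPool(4)
print(pool.allocate("alice", 3))   # [0, 1, 2]
pool.release("alice")
print(len(pool.free))              # 4
```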
Hybrid Model
• The advantages of the workstation-server and
processor-pool models are combined to build a
hybrid model.
• It is built on the workstation-server model with a
pool of processors.
• Processors in the pool can be allocated dynamically
for large computations that cannot be handled by
the workstations and that require several computers
running concurrently for efficient execution.
• This model is more expensive to implement than
the workstation-server or the processor-pool model.
Issues in Designing a Distributed System
• Transparency
• Reliability
• Flexibility
• Security
Transparency
• The main goal of a distributed system is to make the existence of
multiple computers invisible (transparent) and to provide a
single-system image to the user.
• A transparency is some aspect of the distributed system that is
hidden from the user (programmer, system developer,
application).
• When users search on google.com, they never notice that their
query goes through a complex process before Google shows them
a result.
Types of Transparency
Migration Transparency
• A resource object can be moved from one place to
another automatically by the system.
• Hides the possibility that the system may change the location of
a resource (no effect on access).
• Load balancing is one of many reasons for
migrating objects.
Concurrency Transparency
• Each user has the feeling that he or she is the sole user
of the system and that other users do not exist in the
system.
• Hides the possibility that the resource may be shared
concurrently.
• Example: automatic teller machine network, DBMS
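The ATM example can be sketched with a lock that hides concurrent access from each caller. This is an assumed, simplified model: the `Account` class, `run_atms`, and the amounts are hypothetical, and threads on one machine stand in for ATMs on a network.

```python
import threading

class Account:
    """Each caller uses the account as if it were the only user."""

    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()    # hides sharing from the callers

    def withdraw(self, amount):
        with self._lock:
            if amount > self._balance:
                return False             # insufficient funds
            self._balance -= amount
            return True

    def balance(self):
        with self._lock:
            return self._balance

def run_atms(n_atms=10, amount=30, opening=100):
    """Let n_atms withdraw concurrently; return (final balance, successful withdrawals)."""
    account = Account(opening)
    results = []

    def atm():
        results.append(account.withdraw(amount))

    threads = [threading.Thread(target=atm) for _ in range(n_atms)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance(), results.count(True)

print(run_atms())   # (10, 3)
```

Whatever order the ATMs run in, exactly three withdrawals of 30 succeed against a balance of 100, so no caller ever observes an inconsistent balance.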
Performance Transparency
• Allows the system to be automatically reconfigured to
improve performance as the load varies dynamically in the
system.