ch01-introduction
Distributed Systems
(3rd Edition)
Maarten van Steen, Andrew S. Tanenbaum
Chapter 01: Introduction
Edited by: Hicham G. Elmongui

Distributed System
Definition
A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.
Characteristic features
Autonomous computing elements, also referred to as nodes, be they hardware devices or software processes.
Single coherent system: users or applications perceive a single system ⇒ nodes need to collaborate.
Introduction: What is a distributed system? Characteristic 1: Collection of autonomous computing elements

Independent behavior
Each node is autonomous and will thus have its own notion of time: there is no global clock. Leads to fundamental synchronization and coordination problems.
How to know that you are indeed communicating with an authorized (non)member?

Overlay network
Each node in the collection communicates only with other nodes in the system, its neighbors. The set of neighbors may be dynamic, or may even be known only implicitly (i.e., requires a lookup).
Structured: each node has a well-defined set of neighbors with whom it can communicate (tree, ring).
Unstructured: each node has references to randomly selected other nodes from the system.
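The two overlay types above can be illustrated with a minimal sketch. It assumes a ring as the structured example and a fixed number `k` of random references for the unstructured one; the function names, the node labels, and the seed are hypothetical and not taken from the slides.

```python
import random


def ring_overlay(nodes):
    """Structured overlay: each node's neighbors are its ring predecessor and successor."""
    n = len(nodes)
    return {
        node: [nodes[(i - 1) % n], nodes[(i + 1) % n]]
        for i, node in enumerate(nodes)
    }


def random_overlay(nodes, k, seed=0):
    """Unstructured overlay: each node holds references to k randomly selected other nodes."""
    rng = random.Random(seed)  # fixed seed (an assumption) only to make runs repeatable
    return {
        node: rng.sample([other for other in nodes if other != node], k)
        for node in nodes
    }


nodes = ["n0", "n1", "n2", "n3", "n4"]
ring = ring_overlay(nodes)          # neighbor set is fully determined by the ring position
rand = random_overlay(nodes, k=2)   # neighbor set depends on the random choice
```

In the ring, a node's neighbors follow from its position, whereas in the random overlay they can only be learned by asking the node itself.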
Introduction: What is a distributed system? Characteristic 2: Single coherent system

The collection of nodes as a whole operates the same, no matter where, when, and how interaction between a user and the system takes place.
Examples
An end user cannot tell where a computation is taking place.
Where exactly data is stored should be irrelevant to an application.
Whether or not data has been replicated is completely hidden.

Introduction: What is a distributed system? Middleware and distributed systems

[Figure: Computers 1 to 4, each with its own local OS (Local OS 1 to 4), share one distributed-system layer (middleware) on top of which Appl. A, Application B, and Appl. C run.]
Introduction: Design goals
Introduction: Design goals Supporting resource sharing

Introduction: Design goals Making distribution transparent
Types
Observation
Aiming at full distribution transparency may be too much:
There are communication latencies that cannot be hidden.
Completely hiding failures of networks and nodes is (theoretically and practically) impossible:
  You cannot distinguish a slow computer from a failing one.
  You can never be sure that a server actually performed an operation before a crash.
Full transparency will cost performance, exposing distribution of the system:
  Keeping replicas exactly up-to-date with the master takes time.
  Immediately flushing write operations to disk for fault tolerance.

Exposing distribution may be good
Making use of location-based services (finding your nearby friends).
When dealing with users in different time zones.
When it makes it easier for a user to understand what’s going on (e.g., when a server does not respond for a long time, report it as failing).

Conclusion
Distribution transparency is a nice goal, but achieving it is a different story, and it should often not even be aimed at.
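The point that a slow computer cannot be distinguished from a failing one can be made concrete with a timeout-based detector. This is a minimal sketch under assumptions: the class name, the heartbeat scheme, the 2-second timeout, and the explicit timestamps are all hypothetical and chosen only for illustration.

```python
class TimeoutFailureDetector:
    """Suspects a node when no heartbeat has been seen for longer than `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}  # node name -> time of the most recent heartbeat

    def heartbeat(self, node, now):
        self.last_heard[node] = now

    def suspected(self, node, now):
        last = self.last_heard.get(node)
        return last is None or now - last > self.timeout


detector = TimeoutFailureDetector(timeout=2.0)
detector.heartbeat("server-b", now=0.0)

# server-b is only slow: its next heartbeat is delayed until t = 5.0.
suspected_at_4 = detector.suspected("server-b", now=4.0)   # True: looks failed
detector.heartbeat("server-b", now=5.0)
suspected_at_5 = detector.suspected("server-b", now=5.5)   # False: it was alive all along
```

At t = 4.0 the detector has no way to tell the delayed server from a crashed one, so any report of failure is a guess that may later turn out to be wrong.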
What are we talking about?
Be able to interact with services from other open systems, irrespective of the underlying environment:
Systems should conform to well-defined interfaces
Systems should easily interoperate
Systems should support portability of applications
Systems should be easily extensible

Implementing openness: policies
What level of consistency do we require for client-cached data?
Which operations do we allow downloaded code to perform?
Which QoS requirements do we adjust in the face of varying bandwidth?
What level of secrecy do we require for communication?

Implementing openness: mechanisms
Allow (dynamic) setting of caching policies
Support different levels of trust for mobile code
Provide adjustable QoS parameters per data stream
Offer different encryption algorithms
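The separation between policy and mechanism, taking the caching item as an example, might look as follows. This is a sketch under assumptions: the `Cache` class, the two eviction policies, and the sample keys are hypothetical and are not an API from the slides.

```python
from collections import OrderedDict


def evict_oldest(entries):
    """Policy: evict the entry that was inserted first (FIFO)."""
    return next(iter(entries))


def evict_largest(entries):
    """Policy: evict the entry whose value is longest."""
    return max(entries, key=lambda key: len(entries[key]))


class Cache:
    """Mechanism: stores entries and delegates the eviction decision to a policy."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.entries = OrderedDict()

    def set_policy(self, policy):
        # The policy can be replaced at run time without touching the mechanism.
        self.policy = policy

    def put(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            victim = self.policy(self.entries)
            del self.entries[victim]
        self.entries[key] = value


cache = Cache(capacity=2, policy=evict_oldest)
cache.put("a", "xxxx")
cache.put("b", "y")
cache.put("c", "zz")             # FIFO policy evicts "a"
cache.set_policy(evict_largest)
cache.put("d", "w")              # size policy evicts "c", the longest remaining value
```

The cache code never changes when the policy does, which is the property the mechanisms list asks for.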
Introduction: Design goals Being scalable
Observation
Many developers of modern distributed systems easily use the adjective
“scalable” without making clear why their system actually scales.
Observation
Most systems account only, to a certain extent, for size scalability. A common solution is to run multiple powerful servers that operate independently in parallel. Today, the challenge still lies in geographical and administrative scalability.
Geographical scalability
Cannot simply go from LAN to WAN: many distributed systems assume synchronous client-server interactions: client sends request and waits for an answer. Latency may easily prohibit this scheme.
WAN links are often inherently unreliable: simply moving streaming video from LAN to WAN is bound to fail.
Lack of multipoint communication, so that a simple search broadcast cannot be deployed. Solution is to develop separate naming and directory services (having their own scalability problems).

Administrative scalability
Essence
Conflicting policies concerning usage (and thus payment), management, and security
Examples
Computational grids: share expensive resources between different domains.
Exception: several peer-to-peer networks
File-sharing systems (based, e.g., on BitTorrent)
Peer-to-peer telephony (Skype)
Peer-assisted audio streaming (Spotify)
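To see why latency may prohibit the synchronous client-server scheme mentioned above when moving from LAN to WAN, a back-of-the-envelope calculation helps. The round-trip times, the service time, and the request count below are assumed figures for illustration only, and the overlapped variant is a simplification that ignores bandwidth limits.

```python
def synchronous_total(num_requests, round_trip_s, service_s):
    """Client sends one request, waits for the answer, then sends the next."""
    return num_requests * (round_trip_s + service_s)


def overlapped_total(num_requests, round_trip_s, service_s):
    """Client issues all requests without waiting; the server handles them one by one."""
    return round_trip_s + num_requests * service_s


REQUESTS = 100
SERVICE = 0.001      # 1 ms of server work per request (assumed)
LAN_RTT = 0.0005     # 0.5 ms round trip (assumed)
WAN_RTT = 0.100      # 100 ms round trip (assumed)

lan_sync = synchronous_total(REQUESTS, LAN_RTT, SERVICE)     # about 0.15 s
wan_sync = synchronous_total(REQUESTS, WAN_RTT, SERVICE)     # about 10.1 s
wan_overlapped = overlapped_total(REQUESTS, WAN_RTT, SERVICE)  # about 0.2 s
```

Under these assumptions the same workload takes roughly 67 times longer over the WAN when every request waits for its reply, while the amount of server work is unchanged.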
Introduction: Types of distributed systems High performance distributed computing
The layers
Fabric: Provides interfaces to local resources (for querying state and capabilities, locking, [...])
[...] resources. Also various authentication protocols.
Resource: Manages a single resource, such as [...]
[Figure: grid layers, including the collective layer, resource layer, and connectivity layer.]

[Figure: cloud service stack. Software as a Service: Google docs. Platforms: Amazon S3. Infrastructure: computation (VM), storage (block, file); Amazon EC2.]
Cloud computing
Make a distinction between four layers
Hardware: Processors, routers, power and cooling systems. Customers normally never get to see these.
Infrastructure: Deploys virtualization techniques. Revolves around allocating and managing virtual storage devices and virtual servers.
Platform: Provides higher-level abstractions for storage and such. Example: Amazon S3 storage system offers an API for (locally created) files to be organized and stored in so-called buckets.

Situation
Organizations confronted with many networked applications, but achieving interoperability was painful.

Basic approach
A networked application is one that runs on a server making its services available to remote clients. Simple integration: clients combine requests for (different) applications; send that off; collect responses, and present a coherent result to the user.
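The simple integration described above, in which the client sends requests to different applications and merges the responses, can be sketched as follows. The two service functions stand in for remote networked applications; their names, the request fields, and the airport codes are hypothetical.

```python
def flight_service(request):
    """Stand-in for a remote flight-booking application (hypothetical)."""
    return {"flight": f"{request['origin']}->{request['destination']}"}


def hotel_service(request):
    """Stand-in for a remote hotel-booking application (hypothetical)."""
    return {"hotel": f"hotel in {request['destination']}"}


def integrate(request, services):
    """Client-side integration: send the request to each application,
    collect the responses, and merge them into one coherent result."""
    result = {}
    for service in services:
        result.update(service(request))
    return result


trip = integrate(
    {"origin": "AMS", "destination": "CAI"},
    [flight_service, hotel_service],
)
```

All integration logic sits in the client here, which is what makes this approach simple but also leaves every client to deal with each application's interface on its own.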
Introduction: Types of distributed systems Distributed information systems

ABORT TRANSACTION: Kill the transaction and restore the old values
READ: Read data from a file, a table, or otherwise

Atomic: happens indivisibly (seemingly)
Consistent: does not violate system invariants
Isolated: no mutual interference
Durable: commit means changes are permanent

[Figure: a transaction with two subtransactions, one on an airline database and one on a hotel database; two different (independent) databases.]

[Figure: client application, request, TP monitor, server.]

Observation
In many cases, the data involved in a transaction is distributed across several servers. A TP Monitor is responsible for coordinating the execution of a transaction.
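The abort behavior listed above (kill the transaction and restore the old values) can be sketched with an undo log over a plain dictionary. This is a single-process illustration under assumptions: the `Transaction` class and the seat and room counts are hypothetical, and the sketch addresses atomicity only, not isolation or durability.

```python
_MISSING = object()  # marks keys that did not exist before the transaction


class Transaction:
    """Applies writes in place and remembers old values so abort can restore them."""

    def __init__(self, store):
        self.store = store
        self.undo_log = {}  # key -> value before this transaction's first write

    def read(self, key):
        return self.store[key]

    def write(self, key, value):
        if key not in self.undo_log:
            self.undo_log[key] = self.store.get(key, _MISSING)
        self.store[key] = value

    def commit(self):
        self.undo_log.clear()  # changes stay; nothing left to undo

    def abort(self):
        for key, old in self.undo_log.items():
            if old is _MISSING:
                del self.store[key]
            else:
                self.store[key] = old
        self.undo_log.clear()


inventory = {"flight_seats": 5, "hotel_rooms": 2}

t1 = Transaction(inventory)
t1.write("flight_seats", t1.read("flight_seats") - 1)
t1.write("hotel_rooms", t1.read("hotel_rooms") - 1)
t1.abort()                       # both writes are undone together
after_abort = dict(inventory)

t2 = Transaction(inventory)
t2.write("flight_seats", t2.read("flight_seats") - 1)
t2.write("hotel_rooms", t2.read("hotel_rooms") - 1)
t2.commit()                      # both writes become permanent together
after_commit = dict(inventory)
```

Either both bookings take effect or neither does, which is the all-or-nothing property the Atomic entry describes.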
[Figure: client applications connect through communication middleware to server-side applications.]

Introduction: Types of distributed systems Pervasive systems

Observation
Emerging next-generation of distributed systems in which nodes are small, mobile, and often embedded in a larger system, characterized by the fact that the system naturally blends into the user’s environment.

Core elements
1 (Distribution) Devices are networked, distributed, and accessible in a transparent manner
2 (Interaction) Interaction between users and devices is highly unobtrusive
3 (Context awareness) The system is aware of a user’s context in order to optimize interaction
4 (Autonomy) Devices operate autonomously without human intervention, and are thus highly self-managed
5 (Intelligence) The system as a whole can handle a wide range of dynamic actions and interactions

Distinctive features
A myriad of different mobile devices (smartphones, tablets, GPS devices, remote controls, active badges).
Mobile implies that a device’s location is expected to change over time ⇒ change of local services, reachability, etc. Keyword: discovery.
Communication may become more difficult: no stable route, but also perhaps no guaranteed connectivity ⇒ disruption-tolerant networking.
Characteristics
The nodes to which sensors are attached are:
Many (10s-1000s)
Simple (small memory/compute/communication capacity)
Often battery-powered (or even battery-less)

Two extremes
[Figure, first extreme: sensor data is sent directly from the sensor network to the operator's site.]
[Figure, second extreme: each sensor can process and store data; the operator's site sends a query to the sensor network and sensors send only answers.]
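The difference between the two extremes can be sketched by counting sensor-to-operator messages for one average query. The readings, the sensor names, and the (sum, count) answer format are assumptions made for this illustration; query messages from the operator are not counted.

```python
def centralized_average(readings_per_sensor):
    """Extreme 1: every reading is sent to the operator, which computes the average."""
    messages = sum(len(r) for r in readings_per_sensor.values())
    all_readings = [v for r in readings_per_sensor.values() for v in r]
    return sum(all_readings) / len(all_readings), messages


def in_network_average(readings_per_sensor):
    """Extreme 2: each sensor stores its data and answers a query with (sum, count)."""
    answers = [(sum(r), len(r)) for r in readings_per_sensor.values()]
    total = sum(s for s, _ in answers)
    count = sum(c for _, c in answers)
    return total / count, len(answers)  # one answer message per sensor


readings = {"s1": [20, 22], "s2": [19, 21, 23], "s3": [24]}

avg_central, msgs_central = centralized_average(readings)   # 6 messages
avg_local, msgs_local = in_network_average(readings)        # 3 messages
```

Both approaches yield the same average, but the second sends one message per sensor rather than one per reading, which matters for battery-powered nodes with limited communication capacity.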