Cse 803 Final
Term Test 1
Examples:
★ Collection of Web Servers: Distributed database of hypertext and multimedia
documents.
★ Distributed file system on a LAN
★ Domain Name System (DNS)
★ Cray XK7 & CLE (massive multiprocessor)
★ Telephone networks and cellular networks
★ Computer networks such as the Internet or intranets
★ ATM (bank) machines
★ Distributed database and distributed database management system
★ Network of workstations
★ Mobile computing, etc.
Answer: Middleware: Middleware is software that bridges the gap between
applications and operating systems. It sits between an operating system and the
applications that run on it, providing a common method of communication and data
management.
Answer:
1. The network is reliable: Assuming the network is reliable leads to trouble as
soon as the network starts dropping packets.
2. Latency is zero: Assuming latency is zero will undoubtedly lead to scalability
problems as the application grows geographically, or is moved to a different kind
of network.
3. Bandwidth is infinite: Assuming you can keep increasing the amount of data sent
over a channel without limit can be quite a mistake. This problem only rears its
head when scale difficulties enter the conversation and specific communication
channels hit their limits.
4. The network is secure: Assuming you can trust the network you are on, or the
people you are building your system for, can be a crucial mistake.
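As a sketch of coping with the first fallacy, a caller can retry with exponential backoff instead of assuming delivery (illustrative names only, not a real networking API):

```python
import time

def call_with_retries(operation, max_attempts=4, base_delay=0.01):
    """Retry an unreliable operation with exponential backoff.

    'operation' is any callable that may raise on a (simulated)
    network failure; names here are illustrative, not a real API.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise                          # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying

# Simulated flaky network call: fails twice, then succeeds.
attempts = {"n": 0}
def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("packet dropped")
    return "response"

print(call_with_retries(flaky_request))  # succeeds on the third attempt
```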
Answer:
1. Persistent asynchronous: the sender continues immediately after submitting the
message; the communication system stores the message until the receiver is able to
accept it (e.g., email).
Answer:
Distributed Hash Table: Chord, Pastry, Kademlia.
Term Test 2
Answer:
Synchronization: Synchronization refers to the process of aligning the clocks or
timestamps of different nodes in a distributed system, i.e., agreeing on the right
time and on the ordering of all actions.
Berkeley Algorithm:
The Berkeley algorithm assumes no machine has an accurate time source; instead it
averages the clocks of the participating computers. Each machine runs a time daemon
process that implements the protocol. One machine is elected (or designated) as the
server (master); the others are slaves. The master polls each machine periodically,
asking it for its time, and can use Cristian's algorithm to compensate for network
latency. When the results are in, it computes the average, including the master's own
time; averaging cancels out individual clocks' tendencies to run fast or slow. It then
sends each slave the offset by which its clock needs adjustment; sending offsets rather
than timestamps avoids problems with network delays. The algorithm has provisions for
ignoring readings from clocks whose skew is too great, so as to compute a
fault-tolerant average, and if the master fails any slave can take over.
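A simplified sketch of one polling round (no real network and no Cristian-style delay compensation; names are illustrative):

```python
def berkeley_round(master_time, slave_times, max_skew=None):
    """One round of the Berkeley algorithm (simplified).

    The master polls each slave for its clock reading, averages all
    readings (including its own), and returns the offset each clock
    must apply. Readings whose skew from the master exceeds
    'max_skew' are ignored when computing the average.
    """
    readings = {"master": master_time, **slave_times}
    if max_skew is not None:
        usable = {k: t for k, t in readings.items()
                  if abs(t - master_time) <= max_skew}
    else:
        usable = readings
    avg = sum(usable.values()) / len(usable)
    # Each machine adjusts by (average - its own reading).
    return {k: avg - t for k, t in readings.items()}

offsets = berkeley_round(3.00, {"s1": 3.25, "s2": 2.75})
# average = 3.00, so the master adjusts by 0, s1 by -0.25, s2 by +0.25
```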
Answer:
★ Suitable for systems where performance is critical and rollbacks are costly.
★ Suitable for systems where consistency is paramount and rollbacks are acceptable.
3. What is partial failure? Explain with an example the justification of the CAP
theorem.
Answer: Partial Failure: one or more (but not all) components in a distributed system fail.
According to the CAP theorem, any distributed system can guarantee only two of the three
properties (Consistency, Availability, Partition tolerance) at any point in time. You can't
guarantee all three properties at once.
Banking System (C & A): Imagine a banking system that prioritizes consistency and availability.
In this scenario, every time a client requests their account balance or processes a transaction,
they expect to receive the most up-to-date information. However, if a network partition occurs,
the system must stop processing requests rather than risk returning inconsistent data,
potentially causing service disruption.
Social Media Platform(A & P): A social media platform like Twitter, on the other hand, may
prioritize availability and partition tolerance. This ensures that the platform remains accessible
even during network partitions. In this case, users may see slightly outdated content, but they
can still interact with the platform without significant disruptions.
DSM = MC + SM
Application:
Chapter-01
★ ATM (bank) machines
★ Distributed database and distributed database management system
★ Network of Workstations
★ Mobile Computing
★ Collection of Web servers: distributed database of hypertext and
multimedia documents
★ Distributed file system on a LAN
★ Domain Name Service (DNS)
★ Cray XK7 & CLE (massive multiprocessor)
Or, What are economic and technical reasons for having distributed systems?/
Advantages of Distributed System
Or, What problems are there in the use and development of distributed systems? /
Disadvantages of Distributed System
Hardware Architecture:
● Uniprocessor:
★ Network OS:
Properties:
➜ No single system image
★ Distributed OS:
Properties:
➜ High degree of transparency
➜ Single system image (FS, process, devices, etc.)
➜ Homogeneous hardware
➜ Examples: Amoeba, Plan 9, Chorus, Mungi
★ Middleware:
Properties:
➜ System independent interface for distributed programming
➜ Improves transparency (e.g., hides heterogeneity)
➜ Provides services (e.g., naming service, transactions, etc.)
➜ Provides programming model (e.g., distributed objects)
Kernel: It is the core that provides basic services for all other parts of
the OS.
PARALLEL COMPUTING
● Parallel computing: improve performance by using multiple processors per
application
● There are two flavors:
1. Shared-memory systems:
➢ Multiprocessor (multiple processors share a single bus and
memory unit)
➢ SMP support in OS
➢ Much simpler than distributed systems
➢ Limited scalability
2. Distributed memory systems:
➢ Multicomputer (multiple nodes connected via a network)
➢ These are a form of distributed systems
➢ share many of the challenges discussed here
➢ Better scalability & cheaper
In shared-memory systems, multiple processors have direct access to shared memory, which
forms a common address space.
Usually, tightly coupled systems are referred to as Parallel Systems. In these systems,
there is a single system-wide primary memory (address space) that all the processors
share. On the other hand, Distributed Systems are loosely coupled systems.
Parallel computing is the use of two or more processors (cores, computers) in
combination to solve a single problem.
★ An example of Parallel computing would be two servers that share the workload
of routing mail, managing connections to an accounting system or database,
solving a mathematical problem, etc.
★ Supercomputers are usually placed in parallel system architecture
★ Terminals connected to single server
Note:
★ Scalability often conflicts with (small system) performance
★ Claim of scalability is often abused
Techniques for scaling:
★ Decentralization
★ Hiding communication latencies (asynchronous communication, reduce
communication)
★ Distribution (spreading data and control around)
★ Replication (making copies of data and processes)
Decentralization:
Avoid centralizing:
★ Services (e.g., single server)
★ Data (e.g., central directories)
★ Algorithms (e.g., based on complete information).
With regards to algorithms:
★ Do not require the machine to hold a complete system state.
★ Allow nodes to make decisions based on local info.
★ Algorithms must survive failure of nodes
★ No assumption of a global clock
Decentralization is hard.
4. PERFORMANCE:
➜ Any system should strive for maximum performance
➜ In distributed systems, performance directly conflicts with some other
desirable properties
★ Transparency
★ Security
★ Dependability
★ Scalability
5. FLEXIBILITY:
★ Build a system out of (only) required components
★ Extensibility: Components/services can be changed or added
★ Openness of interfaces and specification
➢ allows reimplementation and extension
★ Interoperability
★ Separation of policy and mechanism
➢ standardized internal interfaces
Answer:
Chapter-02
★ Layered
★ Object-oriented
★ Data-centered
★ Service-oriented
★ Event-based
There is no single best architecture
★ depends on application requirements
★ and the environment!
CLIENT-SERVER MODEL
STRUCTURED OVERLAY
★ Example: BitTorrent
★ Node downloads chunks of file from many other nodes
★ Node provides downloaded chunks to other nodes
★ Tracker keeps track of active nodes that have chunks of file
★ Enforce collaboration by penalizing selfish nodes
3. Edge-Server Networks:
COMMUNICATION MODES
★ Data oriented vs control oriented communication
★ Synchronous vs asynchronous communication
★ Transient vs persistent communication
★ Provider-initiated vs consumer-initiated communication
★ Direct-addressing vs indirect-addressing communication
Asynchronous
★ Sender continues execution after sending the message (does not block
waiting for reply)
★ Message may be queued if receiver not active
★ Message may be processed later at receiver’s convenience
★ Example: email
Coupling: Time
COMBINATIONS:
There are a number of alternative ways, or modes, in which communication can take
place. It is important to know and understand these different modes because they are
used to describe the different services that a communication subsystem offers to higher
layers. The first distinction is between the two modes of data-oriented communication
and control-oriented communication. In the first mode, communication serves solely to
exchange data between processes. Although the data might trigger an action at the
receiver, there is no explicit transfer of control implied in this mode. The second mode,
control-oriented communication, explicitly associates a transfer of control with every
data transfer. Data-oriented communication is clearly the type of communication used in
communication via shared address space and shared memory, as well as message
passing. Control-oriented communication is the mode used by abstractions such as
remote procedure call, remote method invocation, active messages, etc.
(communication abstractions are described in the next section). Next, communication
operations can be synchronous or asynchronous. In synchronous communication the
sender of a message blocks until the message has been received by the intended
recipient. Synchronous communication is usually even stronger than this in that the
sender often blocks until the receiver has processed the message and the sender has
received a reply. In asynchronous communication, on the other hand, the sender
continues execution immediately after sending the message, without waiting for it to
be received or processed.
There are also varying degrees of reliability of the communication. With reliable
communication, errors are discovered and fixed transparently. This means that the
processes can assume that a message that is sent will arrive at the destination (as long
as the destination process is there to receive it). With unreliable communication,
messages may get lost and processes have to deal with it. Finally, it is possible to
provide guarantees about the ordering of messages. Thus, for example, a
communication system may guarantee that all messages are received in the same
order that they are sent, while another system may make no guarantees about the order
of arrival of messages.
REQUEST-REPLY COMMUNICATION
Request:
★ a service
★ data
Reply:
★ result of executing service
★ data
Requirement:
★ Message formatting
★ Protocol
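The message-formatting requirement can be illustrated with a simple framing scheme for requests and replies (the format here is an illustrative choice, not a standard protocol):

```python
import json
import struct

def pack_message(payload):
    """Frame a request/reply as a 4-byte length prefix + JSON body.

    Both sides must agree on how a request or reply is laid out on
    the wire; that shared agreement is the 'protocol'.
    """
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("!I", len(body)) + body   # "!" = network byte order

def unpack_message(data):
    """Inverse of pack_message; returns (payload, remaining_bytes)."""
    (length,) = struct.unpack("!I", data[:4])
    body = data[4:4 + length]
    return json.loads(body.decode("utf-8")), data[4 + length:]

# A request names a service and carries data; the reply would be
# framed the same way.
request = pack_message({"service": "lookup", "args": ["www.example.org"]})
payload, rest = unpack_message(request)
```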
Idea: Replace the I/O-oriented message-passing model with the execution of a procedure
call on a remote node [BN84].
★ Synchronous - based on blocking messages
★ Message-passing details hidden from application
★ Procedure call parameters used to transmit data
★ Client calls local "stub" which does messaging and marshaling
Confusion of local and remote operations can be dangerous.
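A minimal sketch of the stub idea, with the transport replaced by a local function call so the example stays self-contained (illustrative names, not a real RPC framework):

```python
import json

# The client-side stub marshals the call into a message, "sends" it,
# and unmarshals the reply; the server-side stub does the reverse.

def server_dispatch(message):
    """Server stub: unmarshal, call the real procedure, marshal result."""
    call = json.loads(message)
    procedures = {"add": lambda a, b: a + b}   # the 'remote' procedures
    result = procedures[call["proc"]](*call["args"])
    return json.dumps({"result": result})

def add_stub(a, b):
    """Client stub: looks like a local call, hides marshaling/messaging."""
    reply = server_dispatch(json.dumps({"proc": "add", "args": [a, b]}))
    return json.loads(reply)["result"]

print(add_stub(2, 3))  # the caller never sees the messaging; prints 5
```

The caller cannot tell from the call syntax that messaging happened underneath, which is exactly why confusing local and remote operations can be dangerous.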
RPC IMPLEMENTATIONS
ASYNCHRONOUS RPC
Chapter-03
★ Attack: An assault on system security that derives from an intelligent threat; that
is, an intelligent act that is a deliberate attempt (especially in the sense of a
method or technique) to evade security services and violate the security policy of
a system.
Security Model
★ Object: intended for use by different clients, via remote invocation
★ Principal: authority on whose behalf invocation is issued
Security Threat
★ Online shopping/banking
➢ intercept credit card information
➢ Purchase goods using stolen credit card details
➢ Replay bank transaction, e.g. credit an account
★ Online stock market information service
➢ observe frequency or timing of requests to deduce useful information, e.g.
the level of stock
★ Website
➢ flooding with requests (denial of service)
★ My computer
➢ receive/download malicious code (virus)
Secure channels
★ Basic message
➢ Networks are insecure
➢ Interfaces are exposed
★ Threat analysis
➢ assume worst-case scenario
➢ list all threats - complex scenarios!!!
★ Design guidelines
➢ log at points of entry so that violations are detected
➢ limit the lifetime and scope of each secret
➢ publish algorithms, restrict access to shared keys
➢ minimize trusted base
➢ ciphers
➢ authentication
➢ digital signatures
Firewalls
Firewall Configurations
Key Distributions
symmetric key crypto: Bob and Alice share the same (symmetric) key: K
• e.g., the key is knowing the substitution pattern in a monoalphabetic substitution cipher
Cryptographic Algorithms
★ Encryption
➢ apply rules to transform plaintext to ciphertext
➢ defined with a function F and key K
➢ denote message M encrypted with K by FK(M) = {M}K
★ Decryption
➢ uses inverse function: FK⁻¹({M}K) = M
➢ can be symmetric (based on secret key known to both parties)
➢ or asymmetric (based on public key)
➢ separate computer within intranet
➢ protected by IP packet filtering, runs TCP/application gateway
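The FK(M) = {M}K notation above can be illustrated with a toy symmetric cipher in which the same key inverts the transformation (XOR is for illustration only; it is not secure, and real systems use ciphers such as AES):

```python
def xor_cipher(message: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: F_K(M) = M XOR K (key repeated).

    XOR is its own inverse, so the same function both encrypts and
    decrypts: F_K(F_K(M)) = M. This only illustrates the notation;
    it provides no real security.
    """
    return bytes(m ^ key[i % len(key)] for i, m in enumerate(message))

ciphertext = xor_cipher(b"attack at dawn", b"K")   # {M}K
plaintext = xor_cipher(ciphertext, b"K")           # FK⁻¹({M}K) = M
assert plaintext == b"attack at dawn"
```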
Symmetric Cryptography
Asymmetric Cryptography
★ Trap-door functions
➢ pair of keys (e.g. large numbers)
➢ encryption function easy to compute (e.g. multiply keys)
➢ decryption function infeasible unless secret known (e.g. factorise the
product if one key not known)
★ Idea
➢ two keys produced: encryption key made public, decryption key kept
secret
➢ anyone can encrypt messages, only a participant with decryption key
can operate the trap door
★ Examples
➢ a few practical schemes: RSA
➢ How it works
★ relies on N = P × Q (product of two very large primes)
★ factorization of N hard
★ choose keys e, d such that e × d = 1 mod Z where Z = (P-1) × (Q-1)
➢ It turns out...
★ can encrypt M by M^e mod N
★ can decrypt by C^d mod N (C is the encrypted message)
➢ Thus
★ Can freely make e and N public, while retaining d. In 1978 Rivest et al.
thought factorizing numbers > 10^200 would take more than four billion
years. Now (ca. 2000): faster computers, better methods - numbers with
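The scheme can be checked with deliberately tiny primes (educational only; real RSA keys use primes hundreds of digits long):

```python
# Toy RSA following the scheme above:
# N = P*Q, Z = (P-1)*(Q-1), choose e, d with e*d = 1 mod Z.
P, Q = 61, 53
N = P * Q              # 3233
Z = (P - 1) * (Q - 1)  # 3120
e = 17                 # public exponent, coprime with Z
d = pow(e, -1, Z)      # private exponent: e*d = 1 mod Z  (d = 2753)

M = 65                       # message encoded as a number < N
C = pow(M, e, N)             # encrypt: C = M^e mod N  (2790 here)
assert pow(C, d, N) == M     # decrypt: C^d mod N recovers M

# (e, N) can be published; recovering d requires factorizing N.
```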
Digital Signatures
★ Why needed?
➢ alternative to handwritten signatures
➢ authentic, difficult to forge, and undeniable
★ How it works
➢ relies on secure hash functions which compress a message into a
so-called digest
➢ sender encrypts the digest and appends it to the message as a signature
➢ receiver verifies signature
➢ generally, public key cryptography is used, but a secret key is also
possible
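A sketch of hash-then-sign using toy RSA parameters (tiny primes and a digest truncated mod N; purely illustrative, not a real signature scheme):

```python
import hashlib

# Tiny RSA key pair for illustration (real signatures use far larger keys).
P, Q = 61, 53
N = P * Q
e, d = 17, pow(17, -1, (P - 1) * (Q - 1))

def digest(message: bytes) -> int:
    """Compress the message into a small digest (truncated mod N here)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender encrypts the digest with the private key d and appends
    the result to the message as the signature."""
    return pow(digest(message), d, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver recomputes the digest and compares it against the
    signature decrypted with the public key e."""
    return pow(signature, e, N) == digest(message)

sig = sign(b"pay Bob $10")
assert verify(b"pay Bob $10", sig)
assert not verify(b"pay Bob $10", (sig + 1) % N)   # forged signature rejected
```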
Cryptographic Protocols
Authentication
Definition
★ protocol for ensuring the authenticity of the sender
Secure Communication
Network Auditing
★ Network Auditing is the collective set of measures used to analyze, study, and gather
data about a network to ascertain its health against network/organization requirements.
★ It works through a systematic process where a network is analyzed for:
➢ Security
➢ Implementation of control
➢ Availability
➢ Management
➢ Performance
★ It uses both manual and automated techniques to gather data and review network
posture. It reviews:
➢ Each node of a network
➢ Network Control and security processes
➢ Network monitoring processes
➢ Other Data
Answer: Securing the processes and the channels used for their interactions and
protecting the objects that they encapsulate against unauthorized access.
★ Protecting Objects: The server manages a collection of objects on behalf of
some users. The users can request the server to perform operations on the
objects. The server carries out the operation specified in each invocation and
sends the result to the client. Objects are intended to be used in different ways by
different users. For example, some objects may hold a user’s private data, such
as their mailbox and other objects may hold shared data such as web pages. To
support this, access rights specify who is allowed to perform the operations of an
object- for example, who is allowed to read or write its state.
★ The Enemy: Processes interact by sending messages. The messages are
exposed to attack because the network and the communication service that they
use are open. The enemy is capable of sending any message to any process
and of reading or copying any message sent between a pair of processes. Such
attacks can be made simply by using a computer connected to a network, running a
program that generates messages that make false requests for services.
2. What are threats and attacks? Write down the types of threats.
Answer:
★ Threat: A potential for violation of security, which exists when there is a
circumstance, capability, action, or event that could breach security and cause
harm. That is, a threat is a possible danger that might exploit a vulnerability.
★ Attack: An assault on system security that derives from an intelligent threat; that
is, an intelligent act that is a deliberate attempt (especially in the sense of a
method or technique) to evade security services and violate the security policy of
a system.
Types of Threats
★ Eavesdropping: Obtaining copies of messages without authority
★ Masquerading: Sending/receiving messages using the identity of another
principal without their authority
★ Message tampering: Intercepting and altering messages
★ Replaying: Intercepting, storing, and replaying messages
★ Denial of service: Flooding a channel with requests to deny access to others
Threat Attack
9. How can we configure firewalls? To design a secure system, what criteria should
be followed?
10. Briefly explain the security model for distributed computing. How can we defeat
the enemy in a network system?
11. What is network auditing? To design a secure system, what criteria should be followed?
12. What is Digital Signature? How does Digital Signature work?
13. Write short notes on different cryptographic algorithms: RSA, AES
14. Define Network Security. Briefly describe the network security model.
Chapter-04
REPLICATION ISSUES
★ Updates
➢ Consistency (how to deal with updated data)
➢ Update propagation
★ Replica placement
➢ How many replicas?
➢ Where to put them?
★ Redirection/Routing
➢ Which replica should clients use?
CONSISTENCY
Example:
Client A: x = 1; x = 0;
Client B: print(x); print(x);
Possible results: 11, 10, 00
How about 01?
What are the conflicting ops? What are the partial orders?
What are the total orders?
CONSISTENCY MODEL
Defines which interleavings of operations are valid (admissible)
Consistency Model:
★ Concerned with the consistency of a data store.
★ Specifies characteristics of valid total orderings
A data store that implements a particular model of consistency will provide a total ordering of
operations that is valid according to the model.
STRICT CONSISTENCY
Any read on a data item x returns a value corresponding to the result of the most recent write
on x.
Absolute time ordering of all shared accesses
SEQUENTIAL CONSISTENCY
All operations are performed in some sequential order
★ More than one correct sequential order
★ All clients see the same order
★ Program order of each client maintained
★ Not ordered according to time
Performance: read time + write time >= minimal packet transfer time
CAUSAL CONSISTENCY
Potentially causally related writes are executed in the same order everywhere.
Causally Related Operations:
★ Read followed by a write (in same client)
★ W(x) followed by R(x) (in same or different clients)
WEAK CONSISTENCY
Shared data can be counted on to be consistent only after a synchronisation is done
Enforces consistency on a group of operations, rather than single operations
★ Synchronization variable (S)
★ Synchronise operation (synchronise(S))
★ Define ‘critical section’ with synchronise operations
Properties:
★ Order of synchronise operations sequentially consistent
★ Synchronise operation cannot be performed until all previous writes have completed
everywhere
★ Read or Write operations cannot be performed until all previous synchronise operations
have completed
Example:
★ synchronise(S) W(x)a W(y)b W(x)c synchronise(S)
★ Writes performed locally
★ Updates propagated only upon synchronisation
★ Only W(y)b and W(x)c have to be propagated
RELEASE CONSISTENCY
Explicit separation of synchronisation tasks
★ acquire(S) - bring local state up to date
★ release(S) - propagate local updates to other copies
CAP THEORY
★ C: Consistency: Linearisability
★ A: Availability: Timely response
★ P: Partition-Tolerance: Functions in the face of a partition
CAP CONSEQUENCES
For wide-area systems:
★ must choose: Consistency or Availability
★ choosing Availability: Eventual consistency
★ choosing Consistency: delayed (and potentially failing) operations
CONSISTENCY PROTOCOLS
Consistency Protocol: implementation of a consistency model
Primary-Based Protocols:
★ Remote-write protocols
★ Local-write protocols
Replicated-Write Protocols:
★ Active Replication
★ Quorum-Based Protocols
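A quorum sketch, assuming the usual constraints R + W > N and W > N/2 so every read quorum overlaps every write quorum (an illustrative in-memory model, not a real protocol):

```python
class QuorumStore:
    """Sketch of quorum-based replication: a write must reach W
    replicas and a read must consult R replicas. With R + W > N,
    every read quorum overlaps every write quorum and therefore
    sees the newest version."""

    def __init__(self, n_replicas, r, w):
        assert r + w > n_replicas and w > n_replicas // 2
        self.replicas = [(0, None)] * n_replicas  # (version, value) each
        self.r, self.w = r, w

    def write(self, value):
        # Contact a write quorum; bump the version on those replicas.
        version = max(v for v, _ in self.replicas) + 1
        for i in range(self.w):          # first W replicas form the quorum
            self.replicas[i] = (version, value)

    def read(self):
        # Contact a read quorum (here: the *last* R replicas, the worst
        # case for overlap) and return the highest-versioned value.
        quorum = self.replicas[-self.r:]
        return max(quorum, key=lambda vv: vv[0])[1]

store = QuorumStore(n_replicas=5, r=3, w=3)
store.write("a")
store.write("b")
assert store.read() == "b"   # overlap guarantees the newest write is seen
```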
REMOTE-WRITE PROTOCOLS
Single Server:
★ All writes and reads executed at single server
★ No replication of data
LOCAL-WRITE PROTOCOLS
Migration:
★ Data item migrated to local server on access
★ Distributed, non-replicated, data store
DYNAMIC REPLICATION
PUSH VS PULL
Pull:
★ Updates propagated only on request
★ Also called client-based
★ R/W low
★ Polling delay
Push:
★ Push updates to replicas
★ Also called server-based
★ When low staleness required
★ R»W
★ Have to keep track of all replicas
Example of Replication: Data can be copied between two on-premises hosts, between
hosts in different locations, to multiple storage devices on the same host, or to a
cloud-based host.
b) Push vs Pull
Chapter-05
DSM = SM + MC
DSM is a mechanism for allowing user processes to access shared data without using
interprocess communications. This provides a virtual address space that is shared
among all components.
Properties:
★ Remote access is expensive compared to local memory access
★ Individual operations can have very low overhead
★ Threads can distinguish between local and remote access
Why DSM?
Benefits of DSM
★ Ease of programming (shared memory model)
★ Eases porting of existing code
★ Pointer handling
➢ Shared pointers refer to shared memory
➢ Share complex data (lists, etc.)
★ No marshaling
DSM IMPLEMENTATIONS
Typical Implementations
★ Provided by some research OSs (e.g., Mach and Chorus)
★ Most often implemented in user space (e.g., TreadMarks, CVM)
★ User space: what’s needed from the kernel?
➢ User-level fault handler [e.g., Unix signals]
➢ User-level VM page mapping and protection [e.g., mmap() and
mprotect()]
➢ Message passing layer [e.g., socket API]
Hardware
★ Multiprocessor
★ Example: MIT Alewife, DASH
DSM Models
3. Shared Variable
★ Release and Entry-based consistency
★ Annotations
★ Fine-grained
★ More complex for programmer
★ Examples: Munin, Midway
4. Shared structure
★ Encapsulate shared data
★ Access only through predefined procedures (e.g., methods)
★ Tightly integrated synchronization
★ Encapsulate (hide) consistency model
★ Lose familiar shared memory model
★ Examples: Orca (shared object), Linda (tuple space)
APPLICATIONS OF DSM
DSM Environments
★ Multiprocessor
➢ NUMA
★ Multicomputer
➢ Supercomputer
➢ Cluster
➢ Network of Workstations
➢ Wide-area
Requirements of DSM
Transparency
★ Location, migration, replication, concurrency
Reliability
★ Computations depend on availability of data
Performance
★ Important in high-performance computing
★ Important for transparency
Scalability
★ Important in wide-area
★ Important for large computations
★ Access to DSM should be consistent
★ According to a consistency model
Programmability
★ Easy to program
★ Communication transparency
Design Issues
★ Granularity
➢ Page-based, Page size: minimum system page size
★ Replication
➢ Lazy release consistency
★ Scalability
➢ Meant for cluster or NOW (Network of Workstations)
★ Synchronisation primitives
➢ Locks (acquire and release), Barrier
★ Heterogeneity
➢ Limited (doesn’t address endianness or mismatched word sizes)
★ Fault Tolerance
➢ Research
★ No Security
★ Strict Consistency: A read returns the most recently written value. This form of
consistency is what most programmers intuitively expect. However, it implies a
total ordering on all memory operations in the system so the most recent write
can be determined. This forced total ordering leads to inefficiency.
Reliability
★ Computations depend on the availability of data
Performance
★ Important in high-performance computing
★ Important for transparency
Scalability
★ Important in wide-area
★ Important for large computations
★ Access to DSM should be consistent
★ According to a consistency model
Programmability
★ Easy to program
★ Communication transparency
3. Write the application areas of DSM. What are the benefits of DSM in terms of
shared memory?
Answer:
APPLICATIONS OF DSM
★ Pointer handling
➢ Shared pointers refer to shared memory
➢ Share complex data (lists, etc.)
★ No marshaling
6. What is Distributed Shared Memory? What are the main requirements of this?
Transparency
★ Location, migration, replication, concurrency
Reliability
★ Computations depend on availability of data
Performance
Scalability
★ Important in wide-area
★ Important for large computations
★ Access to DSM should be consistent
★ According to a consistency model
Programmability
★ Easy to program
★ Communication transparency
Chapter-06
Fault Tolerance
Dependability
Failure
Terminology:
Failure: a system fails when it fails to meet its promises or cannot provide its services in the
specified manner
Error: part of the system state that leads to failure (i.e., it differs from its intended value)
Fault: the cause of an error (results from design errors, manufacturing faults, deterioration, or
external disturbance)
Recursive:
★ Failure can be a fault
★ Manufacturing fault leads to disk failure
★ Disk failure is a fault that leads to database failure
★ Database failure is a fault that leads to email service failure
Types of Faults:
★ Transient Fault: occurs once and then disappears
★ Intermittent Fault: occurs, vanishes, reoccurs, vanishes, etc.
★ Permanent Fault: persists until the faulty component is replaced.
Types of Failures:
★ Process Failure: process proceeds incorrectly or not at all.
★ Storage Failure: “stable” secondary storage is inaccessible.
★ Communication Failure: communication link or node failure.
FAILURE MODELS
Timing Failure: a server’s response lies outside the specified time interval.
Arbitrary Failure: a server may produce an arbitrary response at arbitrary times (aka Byzantine
failure).
FAULT TOLERANCE
Fault Tolerance: The system can provide its services even in the presence of faults.
Goal:
★ Automatically recover from partial failure
★ Without seriously affecting overall performance
Techniques:
★ Prevention: prevent or reduce the occurrence of faults
★ Prediction: predict the faults that can occur and deal with them
★ Masking: hide the occurrence of the fault
★ Recovery: restore an erroneous state to an error-free state
FAILURE PREVENTION
FAILURE PREDICTION
DETECTING FAILURE
Failure Detector:
★ Service that detects process failures
★ Answers queries about status of a process
Reliable:
★ Failed – crashed
★ Unsuspected – hint
Unreliable:
★ Suspected – may still be alive
★ Unsuspected – hint
Synchronous systems:
★ Timeout
★ Failure detector sends probes to detect crash failures
Asynchronous systems:
★ Timeout gives no guarantees
★ Failure detector can track suspected failures
★ Combine results from multiple detectors
★ How to distinguish communication failure from process failure?
★ Ignore messages from suspected processes
★ Turn an asynchronous system into a synchronous one
FAILURE MASKING
Redundancy:
★ Information redundancy
★ Time redundancy
★ Physical redundancy
RELIABLE COMMUNICATION
★ 1 → 2 attack!
★ 2 → 1 ack
★ 2: did 1 get my ack?
★ 1 → 2 ack ack
★ 1: did 2 get my ack ack?
★ etc.
REPLICATION
★ n generals (processes)
★ m are traitors (will send incorrect and contradictory info)
★ Need to know everyone else's troop strength gi
★ Each process has a vector: ⟨g1, ..., gn⟩
★ (Note: this is interactive consistency)
Faulty process
➜ If m faulty processes then 2m + 1 nonfaulty processes required for correct functioning
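The 2m + 1 bound can be seen in the final voting step: if at most m of the reported copies of a value are corrupted, a majority vote over the reports still recovers the correct value. A toy sketch of just that vote (not the full interactive-consistency message exchange):

```python
from collections import Counter

def majority(values):
    """Return the most frequently reported value. With at most m
    traitors and at least 2m + 1 loyal processes reporting the same
    (correct) value, the majority is that correct value."""
    value, _ = Counter(values).most_common(1)[0]
    return value

# n = 4 generals, m = 1 traitor (n >= 3m + 1 holds): reports of g1's
# troop strength as collected by one process; the traitor lies.
reports = [1000, 1000, 1000, 42]   # three loyal copies, one bogus
assert majority(reports) == 1000
```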
FAILURE RECOVERY
BACKWARD RECOVERY
General Approach:
Operation Based Recovery - Logging: Update in-place together with write-ahead logging
★ Every change (update) of data is recorded in a log, which includes:
➢ Data item name (for identification)
➢ Old data item state (for undo)
➢ New data item state (for redo)
★ Undo log is written before update (write-ahead log).
★ Transaction semantics
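A minimal sketch of update-in-place with a write-ahead undo log, following the record format above (an illustrative in-memory model; real systems write the log to stable storage first):

```python
class LoggedStore:
    """Update-in-place with write-ahead logging: each log record holds
    (data item name, old state, new state), appended *before* the data
    item itself is changed."""

    def __init__(self):
        self.data = {}
        self.log = []

    def update(self, name, new_value):
        old_value = self.data.get(name)
        self.log.append((name, old_value, new_value))  # write-ahead
        self.data[name] = new_value                    # then update in place

    def undo_all(self):
        """Roll back by replaying old states in reverse (undo);
        replaying new states forward would be the redo direction."""
        for name, old_value, _ in reversed(self.log):
            if old_value is None:
                self.data.pop(name, None)   # item did not exist before
            else:
                self.data[name] = old_value
        self.log.clear()

store = LoggedStore()
store.update("x", 1)
store.update("x", 2)
store.undo_all()
assert "x" not in store.data   # both updates undone
```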
Checkpointing:
★ Pessimistic vs Optimistic
➢ Pessimistic: assumes failure, optimised toward recovery
➢ Optimistic: assumes infrequent failure, minimises checkpointing overhead
★ Independent vs Coordinated
➢ Coordinated: processes synchronise to create global checkpoint
➢ Independent: each process takes local checkpoints independently of others
★ Synchronous vs Asynchronous
➢ Synchronous: distributed computation blocked while checkpoint taken
➢ Asynchronous: distributed computation continues while checkpoint taken
Checkpointing Overhead:
★ Frequent checkpointing increases overhead
★ Infrequent checkpointing increases recovery cost
Decreasing Checkpointing Overhead:
Incremental checkpointing: Only write changes since last checkpoint:
★ Write-protect whole address space
★ On write-fault mark page as dirty and unprotect
★ On checkpoint only write dirty pages
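The dirty-page scheme above can be sketched without real page protection; a write-fault handler would normally set the dirty bit, and here an explicit write method stands in for it (illustrative names):

```python
class IncrementalCheckpointer:
    """Sketch of incremental checkpointing: track which 'pages' were
    written since the last checkpoint and save only those. A real
    implementation would write-protect the address space and mark a
    page dirty (then unprotect it) inside the write-fault handler."""

    def __init__(self, num_pages):
        self.pages = [0] * num_pages
        self.dirty = set()
        self.checkpoint_data = {}

    def write(self, page, value):
        self.pages[page] = value
        self.dirty.add(page)           # the write fault would do this

    def checkpoint(self):
        # Save only pages modified since the last checkpoint.
        for page in self.dirty:
            self.checkpoint_data[page] = self.pages[page]
        saved = len(self.dirty)
        self.dirty.clear()
        return saved

cp = IncrementalCheckpointer(num_pages=100)
cp.write(3, 42)
cp.write(7, 99)
assert cp.checkpoint() == 2    # only 2 of 100 pages written out
assert cp.checkpoint() == 0    # nothing dirty the second time
```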
Asynchronous checkpointing: Use copy-on-write to checkpoint while execution continues
★ Easy with UNIX fork()
Compress checkpoints: Reduces storage and I/O cost at the expense of CPU time
Domino Effect:
★ P2 fails → P2 rolls back to R22
★ Orphan message m is received but not sent → P1 rolls back to R12
★ P3 fails → P3 rolls back to R32 → P2 rolls back to R21 → P1 rolls back to R11,
P3 rolls back to R31
Messaging dependencies plus independent checkpointing may force the system to roll back
to its initial state
Message Loss:
★ Failure of P2 → P2 rolls back to R21
★ Message m is now recorded as sent (by P1) but not received (by P2), and m will never
be received after rollback
★ Message m is lost
★ Whether m is lost due to rollback or due to imperfect communication channels is
indistinguishable!
★ Require protocols resilient to message loss
Livelock:
Consistent Checkpointing
Consistent Cut
ROLLBACK RECOVERY
First Phase:
★ Coordinator sends “r” messages to all other processes to ask them to roll back
★ Each process replies true, unless already in checkpoint or rollback
★ If all replies are true, coordinator decides to roll back, otherwise continue
Second Phase:
★ Coordinator sends decision to other processes
★ Processes receiving this message perform corresponding action
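The two phases above can be sketched as follows (a toy in-memory model; the "busy" flag stands for a process already in a checkpoint or rollback, and all names are illustrative):

```python
def coordinate_rollback(processes):
    """Two-phase rollback coordination.

    Phase 1: the coordinator sends "r" to every process; each replies
    True unless it is already checkpointing or rolling back.
    Phase 2: the coordinator sends the decision; processes act on it.
    """
    # First phase: collect replies to the rollback request "r".
    replies = [not p["busy"] for p in processes]
    decision = all(replies)          # roll back only if everyone agreed
    # Second phase: distribute the decision; processes perform it.
    if decision:
        for p in processes:
            p["rolled_back"] = True
    return decision

procs = [{"busy": False, "rolled_back": False} for _ in range(3)]
assert coordinate_rollback(procs) is True
assert all(p["rolled_back"] for p in procs)

# If any process is already busy, the coordinator decides to continue.
procs[0] = {"busy": True, "rolled_back": False}
assert coordinate_rollback(procs) is False
```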
ASYNCHRONOUS CHECKPOINTING
2. What are total and partial failures? Briefly discuss fault-tolerance
techniques.
Answer: Total Failure: a fault that causes continuous and complete loss of service.
Partial Failure: A partial failure is less serious than a complete failure, and typically
causes a degradation of service, but not a complete loss of service.
Timing Failure: a server’s response lies outside the specified time interval.
Arbitrary Failure: a server may produce an arbitrary response at arbitrary times (aka
Byzantine failure).
4. Write the issues in restoring an erroneous state to an error-free state in failure
recovery. Discuss the different phases of rollback recovery.
Answer:
Restoring an erroneous state to an error-free state
Issues:
★ Reclamation of resources: Locks, buffers held on other nodes.
★ Consistency: Undo partially completed operations before restart.
★ Efficiency: Avoid restarting the whole system from the start of computation.
Different phases of rollback recovery:
First Phase:
★ Coordinator sends “r” messages to all other processes to ask them to roll back
★ Each process replies true, unless already in checkpoint or rollback
★ If all replies are true, coordinator decides to roll back, otherwise continue
Second Phase:
★ Coordinator sends decision to other processes
★ Processes receiving this message perform corresponding action
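The two phases above can be sketched in code. The class and function names here are illustrative, and the in-memory calls stand in for the "r" and decision messages; a real implementation would need reliable delivery, timeouts, and actual checkpoint state.

```python
# Sketch of the two-phase rollback protocol above (names illustrative).

class Process:
    def __init__(self, name):
        self.name = name
        self.in_checkpoint_or_rollback = False
        self.rolled_back = False

    def handle_rollback_request(self):
        # First phase: reply True unless already checkpointing or rolling back.
        return not self.in_checkpoint_or_rollback

    def handle_decision(self, roll_back):
        # Second phase: perform the corresponding action.
        if roll_back:
            self.rolled_back = True  # restore the last checkpoint here

def coordinator_rollback(processes):
    replies = [p.handle_rollback_request() for p in processes]  # "r" messages
    decision = all(replies)  # roll back only if every reply is True
    for p in processes:
        p.handle_decision(decision)
    return decision

procs = [Process("P1"), Process("P2"), Process("P3")]
print(coordinator_rollback(procs))  # True: all agreed to roll back
```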
Failure: a system fails when it fails to meet its promises or cannot provide its services in
the specified manner.
Fault: the cause of an error (results from design errors, manufacturing faults,
deterioration, or external disturbance).
Types of Failures:
★ Process Failure: process proceeds incorrectly or not at all.
★ Storage Failure: “stable” secondary storage is inaccessible.
★ Communication Failure: communication link or node failure.
Chapter-07
DISTRIBUTED ALGORITHMS
Distributed algorithms are algorithms intended to work in a distributed environment.
They are used to accomplish tasks such as:
★ Communication
★ Accessing resources
★ Allocating resources
★ Consensus etc.
Synchronization and coordination are inextricably linked to distributed algorithms:
★ Achieved using distributed algorithms
★ Required by distributed algorithms
Affected by:
★ Execution speed/time of processes
★ Communication delay
★ Clocks & clock drift
★ (Partial) failure
MAIN ISSUES
● Time and Clocks: synchronizing clocks and using time in distributed algorithms
● Global State: how to acquire knowledge of the system’s global state
● Concurrency Control: coordinating concurrent access to resources
TIME
Global Time:
★ ’Absolute’ time
➢ Einstein says no absolute time
➢ Absolute enough for our purposes
★ Astronomical time
➢ Based on earth’s rotation
➢ Not stable
★ International Atomic Time (TAI)
➢ Based on oscillations of Cesium-133
★ Coordinated Universal Time (UTC)
➢ Leap seconds
➢ Signals broadcast over the world
Local Time:
★ Not synchronised to Global source
★ Relative not ’absolute’
Computer Clocks:
★ Crystal oscillates at known frequency
★ Oscillations cause timer interrupts
★ Timer interrupts update clock
Clock Skew:
★ Crystals in different computers run at slightly different rates
★ Clocks get out of sync
★ Skew: instantaneous difference
★ Drift: rate of change of skew
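The relationship between drift and resynchronisation can be made concrete: if each clock drifts from real time at a maximum rate ρ, two clocks can drift apart at up to 2ρ, so keeping mutual skew below δ requires resynchronising at least every δ/(2ρ) seconds. A small sketch (the drift rate used is a typical crystal-oscillator figure, not taken from these notes):

```python
# If a clock drifts from real time at rate rho, two such clocks can
# drift apart at up to 2*rho. To keep their mutual skew below
# max_skew, they must resynchronise at least every max_skew/(2*rho).

def resync_interval(max_skew, rho):
    """Longest sync interval (s) keeping mutual skew below max_skew (s)."""
    return max_skew / (2 * rho)

# Typical cheap crystal: rho = 1e-5 (drifts about 1 s per day).
# Keeping two clocks within 1 ms of each other:
print(resync_interval(1e-3, 1e-5))  # about 50 seconds
```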
Timestamps:
★ Used to denote at which time an event occurred
1. Internal Synchronisation:
★ Clocks synchronize locally
★ Only synchronized with each other
2. External Synchronisation:
★ Clocks synchronize to an external time source
★ Synchronise with UTC every δ seconds
3. Time Server:
★ Server that has the correct time
★ Server that calculates the correct time
BERKELEY ALGORITHM
The Berkeley algorithm assumes no machine has an accurate time source, and obtains
the average participating computers all clocks to average. Machine run time daemon
process that implements the protocol. One machine is elected(or designated) as the
server(master) the others are slaves. Master polls each machine periodically asks each
machine for time and can use Christian's algorithm to compensate for network latency.
When results are in computing average including master's time, average cancels out
individual clock tendencies to run fast or slow. It sends offset by which each clock needs
adjustment to each slave, avoiding problems with network delays if we want a time
stamp. The algorithm has pŕovisions for ignoring readings from clocks whose skew is too
great to compute a fault-tolerant average if the master fails any slave can take over.
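The averaging step can be sketched as follows; the max_skew cutoff parameter and the example readings are illustrative, and latency compensation is assumed to have already happened:

```python
# Sketch of one Berkeley averaging round. The master averages all
# readings (including its own) and sends each machine the offset it
# must apply; max_skew optionally drops outlier clocks from the
# average, as in the fault-tolerant variant.

def berkeley_round(master_time, slave_times, max_skew=None):
    readings = [master_time] + list(slave_times)
    if max_skew is not None:
        # Ignore clocks whose skew from the master is too great.
        readings = [t for t in readings
                    if abs(t - master_time) <= max_skew]
    avg = sum(readings) / len(readings)
    # Offsets: what each clock must ADD to agree with the average.
    return [avg - master_time] + [avg - t for t in slave_times]

# Master reads 180 s, slaves read 170 s and 205 s -> average 185 s.
print(berkeley_round(180, [170, 205]))  # [5.0, 15.0, -20.0]
```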
Example:
CRISTIAN’S ALGORITHM
Time Server:
★ Has UTC receiver
★ Passive
Algorithm:
★ Clients periodically request the time
★ Don’t set time backward
★ Take propagation and interrupt handling delay into account
➢ (T1 − T0)/2
➢ Or take a series of measurements and average the delay
★ Accuracy: 1-10 millisec (RTT in LAN)
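The delay compensation above can be written out directly. T0 is the client clock when the request is sent, T1 when the reply arrives; the server's timestamp is assumed to be taken roughly halfway through the round trip. Times and values below are illustrative (in milliseconds):

```python
# Estimating server time under Cristian's algorithm: one-way
# propagation delay is estimated as (T1 - T0) / 2.

def cristian_estimate(t0, t1, server_time):
    rtt = t1 - t0
    return server_time + rtt / 2

# Request sent at 100000 ms, reply at 100020 ms, server said 101500 ms:
print(cristian_estimate(100000, 100020, 101500))  # 101510.0
```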
Hierarchy of Servers:
★ Primary Server: has UTC clock
★ Secondary Server: connected to primary etc.
Synchronization Modes:
★ Multicast: for LAN, low accuracy
★ Procedure Call: clients poll, reasonable accuracy
★ Symmetric: between peer servers, highest accuracy
Synchronization
★ Estimate clock offsets and transmission delays between two nodes
★ Keep estimates for past communication
LOGICAL CLOCKS
Event ordering is more important than physical time.
★ Events (e.g., state changes) in a single process are ordered
★ Processes need to agree on ordering of causally related events (e.g., message send
and receive)
Local ordering:
★ System consists of N processes pi, i ∈ {1, . . . , N}
★ Local event ordering →i:
If pi observes e before e′, we have e →i e′
Global ordering:
★ Leslie Lamport’s happened before relation →
★ Smallest relation, such that
➢ e →i e′ implies e → e′
➢ For every message m, send(m) → receive(m)
➢ Transitivity: e → e′ and e′ → e′′ implies e → e′′
Example:
Properties:
★ a → b implies L(a) < L(b)
★ L(a) < L(b) does not necessarily imply a → b
Example:
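As an example, a minimal Lamport clock sketch, assuming the usual rules: increment on local events and sends, and take max(local, received) + 1 on receives, so that a → b always implies L(a) < L(b):

```python
# A minimal Lamport clock for one process.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time  # timestamp attached to the outgoing message

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
p1.local_event()      # p1's clock: 1
t = p1.send()         # p1's clock: 2; message carries timestamp 2
p2.local_event()      # p2's clock: 1
print(p2.receive(t))  # max(1, 2) + 1 = 3
```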
CONSISTENT CUTS
Determining global properties:
★ We need to combine information from multiple nodes
★ Without global time, how do we know whether collected local information is consistent?
★ Local state sampled at arbitrary points in time surely is not consistent
★ We need a criterion for what constitutes a globally consistent collection of local
information
Cuts:
(figures illustrating consistent and inconsistent cuts omitted)
Consistent cut:
★ We call a cut consistent iff,
for all events e′ ∈ C, e → e′ implies e ∈ C
★ A global state is consistent if it corresponds to a consistent cut
★ Note: we can characterize the execution of a system as a sequence of consistent global
states
S0 → S1 → S2 → · · ·
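The consistency criterion can be checked directly on a toy execution; the events and happened-before edges below are made up for illustration:

```python
# A cut C is consistent iff it is closed under happened-before:
# whenever e' is in C and e -> e', e must also be in C. Checking
# closure over the direct edges suffices, since closure under each
# edge implies closure under their transitive combinations.

def consistent(cut, edges):
    return all(a in cut for (a, b) in edges if b in cut)

# Toy run: on p1, a -> b (b sends m); on p2, c -> d (d receives m).
edges = [("a", "b"), ("c", "d"), ("b", "d")]
print(consistent({"a", "b", "c"}, edges))  # True
print(consistent({"a", "c", "d"}, edges))  # False: receive d is in, send b is not
```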
Chapter-08
WHAT IS NAMING?
Systems manage a wide collection of entities of different kinds. They are identified by different
kinds of names:
★ Files (/boot/vmlinuz), Processes (1, 14293), Users (chak, ikuz, cs9243), Hosts
(weill, facebook.com), . . .
//Examples of naming in distributed systems? What’s the difficulty?
BASIC CONCEPTS
Name:
★ String of bits or characters
★ Refers to an entity
Entity:
★ Resource, process, user, etc.
★ Operations performed on entities at access points
Address:
★ Access point named by an address
★ Entity address = address of entity’s access point
★ Multiple access points per entity
★ Entity’s access points may change
System-Oriented Names:
★ Represented in machine readable form (32 or 64-bit strings)
★ Structured or unstructured
★ Easy to store, manipulate, compare
Human-Oriented Names:
★ Variable length character strings
★ Usually structured
★ Often many human-oriented names map onto a single system-oriented name
★ Easy to remember and distinguish between
★ Hard for machine to process
★ Example: URL
Structure options:
★ Flat (only leaf nodes)
★ Hierarchical (Strictly hierarchical, DAG, Multiple root nodes)
★ Tag-based
Path Names (in hierarchies):
★ Sequence of edge labels
★ Absolute: if first node in path name is a root node
★ Relative: otherwise
Aliasing:
★ Alias: another name for an entity
★ Hard link: two or more paths to an entity in the graph
★ Soft link: leaf node stores a (absolute) path name to another node
Mounting
★ Directory node stores info about a directory node in another namespace
★ Need: protocol, server, path name, authentication, and authorization info, keys for
secure communication, etc.
Name Server:
★ Naming service implemented by name servers
★ Implements naming service operations
Operations:
★ Lookup: resolve a path name, or element of a path name
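A toy sketch of path-name resolution in a hierarchical name space, with soft links as described above. All names are invented; a soft link stores an absolute path name that resolution continues from:

```python
# Directories map edge labels to child nodes; a soft link is a leaf
# storing an absolute path name, spliced in and resolved from the root.

root = {
    "home": {"alice": {"notes.txt": "addr:0x1"}},
    "a": ("symlink", "/home/alice"),  # soft link to /home/alice
}

def lookup(path):
    node, labels = root, [l for l in path.split("/") if l]
    while labels:
        node = node[labels.pop(0)]
        if isinstance(node, tuple) and node[0] == "symlink":
            # Splice in the stored absolute path and restart at root.
            labels = [l for l in node[1].split("/") if l] + labels
            node = root
    return node

print(lookup("/home/alice/notes.txt"))  # addr:0x1
print(lookup("/a/notes.txt"))           # addr:0x1 (via the alias)
```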
Structured Partitioning:
★ split name space according to graph structure
★ Name resolution can use zone hints to quickly find appropriate server
★ Improved lookup performance due to knowledge of structure
★ Rigid structure
Structure-free Partitioning:
★ content placed on servers independent of name space
★ Flexible
★ Decreased lookup performance, increased load on root
Structure:
★ Hierarchical structure (tree)
★ Top-level domains (TLD) (.com, .org, .net, .au, .nl, ...)
★ Zone: a directory node or group of directory nodes
★ Resource records: contents of a node
★ Domain: a subtree of the global tree
★ Domain name: an absolute path name
Cloud Computing
A style of computing in which dynamically scalable and often virtualized resources are provided
as a service over the Internet.
BENEFITS
Flexibility:
★ Flexible provisioning
★ Add machines on demand
★ Add storage on demand
Effort:
★ Low barrier to entry
★ Initial effort: no need to spec and set up physical infrastructure
★ Continuing effort: no need to maintain physical infrastructure
Speed: Most cloud computing services are provided self-service and on demand, so even vast
amounts of computing resources can be provisioned in minutes, typically with just a few mouse
clicks, giving businesses a lot of flexibility and taking the pressure off capacity planning.
Cost:
★ Cloud computing eliminates the capital expense of buying hardware and software and of
setting up and running on-site data centers: the racks of servers, the round-the-clock
electricity for power and cooling, and the IT experts for managing the infrastructure. It
adds up fast.
★ Low initial capital expenditure
★ Avoid costs of over-provisioning for scalability
★ Pay for what you use
Global scale: The benefits of cloud computing services include the ability to scale elastically. In
cloud speak, that means delivering the right amount of IT resources—for example,
more or less computing power, storage, and bandwidth—right when they’re needed, and
from the right geographic location.
Security
★ Many cloud providers offer a broad set of policies, technologies, and controls that
strengthen your security posture overall, helping protect your data, apps, and
infrastructure from potential threats.
★ Redundancy
★ Trust reliability of provider
★ Data backups
➢ Reliability: Cloud computing makes data backup, disaster recovery, and business
continuity easier and less expensive because data can be mirrored at multiple redundant
sites on the cloud provider’s network.
➢ Productivity: On-site data centers typically require a lot of “racking and
stacking”—hardware setup, software patching, and other time-consuming IT
management chores. Cloud computing removes the need for many of these tasks, so IT
teams can spend time on achieving more important business goals.
➢ Speed: Most cloud computing services are provided self-service and on-demand, so
even vast amounts of computing resources can be provisioned in minutes, typically with
just a few mouse clicks, giving businesses a lot of flexibility and taking the pressure off
capacity planning.
Most cloud computing services fall into four broad categories: infrastructure as a service
(IaaS), platform as a service (PaaS), serverless, and software as a service (SaaS). These
are sometimes called the cloud computing "stack" because they build on top of one another.
Knowing what they are and how they’re different makes it easier to accomplish your business
goals.
With SaaS, cloud providers host and manage the software application and underlying
infrastructure, and handle any maintenance, like software upgrades and security
patching. Users connect to the application over the Internet, usually with a web browser
on their phone, tablet, or PC.
★ Instance types (Amazon EC2 families and sizes):
➢ t, m, c, p, g, x, r, i, d
➢ micro, small, medium, large, xlarge, ...
★ Cost:
➢ free tier: limited instances, free CPU hours
➢ on-demand: $0.007 - $39 per hour
➢ reserved: 1-3 years, discounted, fixed cost
★ Launch Amazon Machine Image (AMI) on instances
★ Preconfigured or custom images
PLATFORM AS A SERVICE
SOFTWARE AS A SERVICE
Client provides:
★ Data
Challenges – Client:
★ Learn new application
★ Deal with potential restrictions
➢ Web interface, restricted functionality
➢ No offline access, no local storage
Challenges – Provider:
★ Transparency (naming, redirection)
★ Scalability: replication and load balancing decisions
★ Synchronisation and coordination
★ Security
★ Fault tolerance
★ Monitoring
★ Software maintenance and sys admin
★ Application development and maintenance
Scalability:
★ Datacentre vs Global
★ Partitioning
➢ Services and Data
★ Replication
Consistency:
★ Dealing with the consequences of CAP Theorem
★ Dealing with un-usability of eventual consistency
Reliability:
★ SLA (Service Level Agreement): guarantees given by provider
➢ How reliable are the guarantees?
➢ What is the consequence if they aren’t met?
★ Redundancy and Replication
➢ within the same provider (e.g. Availability Zones, Regions, etc.)
➢ migration across providers
★ Geographically distributed architecture
★ Design for failure: Chaos Monkey
➢ Test how well the system deals with failure
➢ regularly and randomly kill system services
2. What are the key characteristics of cloud computing? Describe the benefits of it.
Answer:
Key Characteristics of Cloud Computing:
SP 800-145. The NIST Definition of Cloud Computing:
★ On-demand, self-service
➢ get resources (CPU, storage, bandwidth, etc),
➢ automated: as needed, right now!
★ Network access
➢ Services accessible over the network, standard protocols
★ Pooled resources
➢ provider: multi-tenant pool of resources
➢ dynamically assigned and reassigned per customer demand
★ Elasticity
➢ Scalability: rapidly adjust resource usage as needed
★ Measured service
➢ monitor resource usage
➢ billing for resources used
Benefits of Cloud Computing:
Flexibility:
★ Flexible provisioning
★ Add machines on demand
★ Add storage on demand
Effort:
★ Low barrier to entry
★ Initial effort: no need to spec and set up physical infrastructure
Speed: Most cloud computing services are provided self-service and on demand, so even vast
amounts of computing resources can be provisioned in minutes, typically with just a few mouse
clicks, giving businesses a lot of flexibility and taking the pressure off capacity planning.
Cost:
★ Cloud computing eliminates the capital expense of buying hardware and software and of
setting up and running on-site data centers: the racks of servers, the round-the-clock
electricity for power and cooling, and the IT experts for managing the infrastructure. It
adds up fast.
★ Low initial capital expenditure
★ Avoid costs of over-provisioning for scalability
★ Pay for what you use
Global scale: The benefits of cloud computing services include the ability to scale elastically. In
cloud speak, that means delivering the right amount of IT resources—for example,
more or less computing power, storage, and bandwidth—right when they’re needed, and
from the right geographic location.
Security
★ Many cloud providers offer a broad set of policies, technologies, and controls that
strengthen your security posture overall, helping protect your data, apps, and
infrastructure from potential threats.
★ Redundancy
★ Trust reliability of provider
★ Data backups
System-Oriented Names:
★ Not easy to remember; hard for humans to use
★ Machine-generated
★ Use case: internal communication, data management
Human-Oriented Names:
★ Easy to remember and distinguish between; hard for a machine to process
★ Human-generated
★ Use case: user interactions, naming resources
7. Why is it called cloud in terms of cloud computing? Write down the types of cloud
computing with examples.
8. Explain cloud computing Infrastructure as a Service with example.
9. Describe the types of partitioning.
Chapter-09
INTRODUCTION
CHALLENGES
Transparency
★ Location: a client cannot tell where a file is located
★ Migration: a file can transparently move to another server
★ Replication: multiple copies of a file may exist
★ Concurrency: multiple clients access the same file
Flexibility
★ Servers may be added or replaced
★ Support for multiple file system types
Dependability
★ Consistency: conflicts with replication & concurrency
★ Security: users may have different access rights on clients sharing files & network
transmission
CACHING
Cache consistency:
★ Obvious parallels to shared-memory systems, but other trade-offs
★ No UNIX semantics without centralized control
★ Plain write-through is too expensive; alternatives: delay WRITEs and agglomerate
multiple WRITEs
★ Write-on-close; possibly with delay (file may be deleted)
★ Invalid cache entries may be accessed if server is not contacted whenever a file is
opened
REPLICATION
Multiple copies of files on different servers
CASE STUDIES
Properties:
★ Introduced by Sun
★ Fits nicely into UNIX’s idea of mount points, but does not implement UNIX semantics
★ Multiple clients & servers (a single machine can be a client and a server)
★ Stateless servers (no OPEN & CLOSE) (changed in v4)
★ File locking through separate server
★ No replication
★ ONC RPC for communication
★ Caching: local copies of files
➢ consistency through polling and timestamps
➢ asynchronous update of the file after close
Properties:
★ From Carnegie Mellon University (CMU) in the 1980s.
★ Developed as campus-wide file system: Scalability
★ Global namespace for file system (divided in cells, e.g. /afs/cs.cmu.edu, /afs/ethz.ch)
★ API same as for UNIX
★ UNIX semantics for processes on one machine, but globally write-on-close
★ Client: User-level process Venus (AFS daemon)
★ Cache on local disk
★ Trusted servers collectively called Vice
Scalability:
★ Server serves whole files. Clients cache whole files
★ Server invalidates cached files with a callback (stateful servers)
★ Clients do not validate cache (except on first use after booting)
CODA
GOOGLE FILE SYSTEM (GFS)
Motivation:
★ 10+ clusters
★ 1000+ nodes per cluster
★ Pools of 1000+ clients
★ 350TB+ filesystems
★ 500Mb/s read/write load
★ Commercial and R&D applications
Assumptions:
★ No explicit caching
Throughput vs Latency:
★ Too much latency for interactive applications (e.g. Gmail)
★ Automated master failover
CHUBBY
Chubby is
★ Lock service
★ Simple FS
★ Name service
★ Synchronisation/consensus service
Architecture:
★ Cell: 5 replicas
★ Master:
➢ gets all client requests
➢ elected with Paxos
➢ master lease: no new master until lease expires
★ Write: Paxos agreement of all replicas
★ Read: local by master
★ Pathname: /ls/cell/some/file/name
★ Open (R/W), Close, Read, Write, Delete
★ Lock: Acquire, Release
★ Events: file modified, lock acquired, etc.
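A hypothetical sketch of how a client might use a Chubby-style coarse-grained lock for leader election: whoever acquires the lock file is the leader. The ChubbyCell class and its methods are invented for illustration and are not Chubby's actual API.

```python
# Toy in-memory stand-in for a Chubby cell's lock behaviour.

class ChubbyCell:
    def __init__(self):
        self.locks = {}  # path -> current holder

    def acquire(self, path, client):
        # Grant the lock iff nobody currently holds it.
        if self.locks.get(path) is None:
            self.locks[path] = client
            return True
        return False

    def release(self, path, client):
        if self.locks.get(path) == client:
            del self.locks[path]

cell = ChubbyCell()
print(cell.acquire("/ls/cell/service/leader", "node-1"))  # True: node-1 leads
print(cell.acquire("/ls/cell/service/leader", "node-2"))  # False: lock held
cell.release("/ls/cell/service/leader", "node-1")
print(cell.acquire("/ls/cell/service/leader", "node-2"))  # True: node-2 takes over
```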
Colossus:
★ follow up to GFS
BigTable:
★ Distributed, sparse, storage map
★ Chubby for consistency
★ GFS/Colossus for actual storage
Megastore:
★ Semi-relational data model, ACID transactions
★ BigTable as storage, synchronous replication (using Paxos)
★ Poor write latency (100-400 ms) and throughput
Spanner:
★ Structured storage, SQL-like language
★ Transactions with TrueTime, synchronous replication (Paxos)
Transparency
★ Location: a client cannot tell where a file is located
★ Migration: a file can transparently move to another server
★ Replication: multiple copies of a file may exist
★ Concurrency: multiple clients access the same file
Flexibility
★ Servers may be added or replaced
★ Support for multiple file system types
Dependability
★ Consistency: conflicts with replication & concurrency
★ Security: users may have different access rights on clients sharing files &
network transmission
★ Fault tolerance: server crash, availability of files
★ Requests may be distributed across servers
★ Multiple servers allow higher storage capacity
Scalability
★ Handle increasing number of files and users
★ Growth over geographic and administrative areas
★ Growth of storage space
★ No central naming service
★ No centralized locking
★ No central file store
3. Draw the upload/download model for the Network File System(NFS).
4. Write short notes on: Andrew File System, and Google File System.
Answer:
★ Andrew File System: AFS is a distributed file system. It uses the client/server model,
where all the files are stored on file server machines. Files are transferred to client
machines as necessary and cached on local disk. The server part of AFS is called the
AFS File Server, and the client part of AFS is called the AFS Cache Manager.
★ Google File System: The Google File System (GFS) is a scalable distributed file system
(DFS) built to meet the company's growing data-processing needs. GFS offers fault
tolerance, dependability, scalability, availability, and performance for big networks of
connected nodes. It is made up of several storage systems constructed from inexpensive
commodity hardware parts. The search engine, which creates enormous volumes of data that
must be kept, is only one example of how GFS is customized to meet Google's various
data-use and storage requirements. GFS tolerates the frequent faults of commodity
hardware while benefiting from the low cost of commercially available servers.