Distributed Computing Module 5 Important Topics PYQs
For more notes visit
https://rtpnotes.vercel.app
1. Define Byzantine agreement problem.
2. Write the features of SUN Network File System.
1. Distributed File Access
2. Architecture Support
3. Virtual File System (VFS)
4. File Handles and Inodes
5. Client Integration
6. Access Control and Authentication
7. Naming and Mounting
8. Client Caching
9. Server Caching
3. Explain the components of Google File System
Master Server
Chunk Servers
Clients
How GFS Reads a File
How GFS Writes a File
4. List distributed file system requirements
1. Access Transparency
2. Location Transparency
3. Mobility Transparency
4. Performance Transparency
5. Scaling Transparency
6. Concurrent File Updates
7. File Replication
8. Hardware and Operating System Heterogeneity
9. Fault Tolerance
10. Consistency
11. Security
12. Efficiency
5. Differentiate between whole file serving and whole file caching in Andrew File System.
6. Define flat file service and directory service components.
Flat File Service
Directory Service
7. What are the advantages of Google File System.
8. Explain consensus algorithm for crash failures under synchronous systems.
What is a Consensus Algorithm?
Understanding Synchronous Systems
Consensus in the Presence of Crash Failures
Steps to Reach Consensus in Crash Failures:
11. Which are the assumptions made in Consensus and Agreement Algorithm
1. Failure Model
2. System Type: Synchronous vs Asynchronous
3. Network Connectivity
4. Sender Identification
5. Reliable Channels
6. Message Authentication
7. Agreement Variable Type
12. Explain about the file service architecture
Three Main Parts of File Service Architecture
1. Flat File Service (Manages File Contents)
2. Directory Service (Manages File Names)
3. Client Module (Acts as a Bridge)
Operations of File Service (What It Can Do)
Basic File Operations
Creating and Deleting Files
File Attributes (Metadata)
How It Works (Step-by-Step)
13. Write in detail about distributed file system characteristics.
Characteristics of Distributed File Systems
1. Organization & Storage
2. File Location & Access
3. Data and Attributes
4. Directories and Naming
5. Metadata
In the Byzantine agreement problem, one process is called the source process, and it starts with an
initial value. The goal is for all non-faulty (honest) processes to agree on a value, subject to
three conditions:
1. Agreement: All non-faulty processes must decide on the same value, no matter what faulty
processes do.
2. Validity: If the source process is not faulty, then all non-faulty processes must decide on the
same value as the source’s initial value.
3. Termination: Every non-faulty process must eventually decide on a value. No process should keep
waiting forever.
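The three conditions can be checked mechanically once a run has finished. The sketch below is only an illustration: the function name `check_byzantine_agreement` and its arguments are made up for these notes, and it assumes we already know which processes are non-faulty.

```python
def check_byzantine_agreement(source_value, source_is_faulty, decisions):
    """Check the three conditions for one finished run.

    decisions maps each NON-FAULTY process id to the value it decided,
    or None if that process never decided (hypothetical representation).
    """
    values = list(decisions.values())
    decided = {v for v in values if v is not None}

    # Termination: every non-faulty process decided something.
    termination = all(v is not None for v in values)
    # Agreement: no two non-faulty processes decided different values.
    agreement = len(decided) <= 1
    # Validity: only constrains the run when the source is non-faulty.
    validity = source_is_faulty or decided <= {source_value}

    return {"agreement": agreement, "validity": validity, "termination": termination}


# Honest source with value 1, everyone decides 1: all three conditions hold.
print(check_byzantine_agreement(1, False, {"p1": 1, "p2": 1, "p3": 1}))
# Honest source with value 1, but p2 decides 0: agreement and validity fail.
print(check_byzantine_agreement(1, False, {"p1": 1, "p2": 0}))
```

Note that validity says nothing when the source itself is faulty; in that case the non-faulty processes only need to agree with each other.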
We want to distribute (Distributed File Access) some items to a remote location. First,
we check the architecture (Architecture Support) of the destination to understand how
many items need to be delivered. Then, we virtually (Virtual File System) place the order.
All the transaction details are carefully recorded in files and handled (File Handles and
Inodes) by an accountant. Once the items reach their destination, we must
authenticate (Access Control and Authentication) before handing them over to the
client.
Finally, we mount (Naming and Mounting) each item into its designated cache (Client
and Server Caching), making them ready for both the client and server to access and
use efficiently.
2. Architecture Support
NFS works on top of the UNIX kernel and supports the Virtual File System (VFS) layer.
VFS helps to manage both local and remote files without any difference from the user’s
point of view.
3. Virtual File System (VFS)
Provides access transparency: users don't need to know if a file is local or remote.
Keeps track of all active file systems and routes requests accordingly.
4. File Handles and Inodes
Uses file handles to uniquely identify files.
5. Client Integration
6. Access Control and Authentication
NFS is stateless: the server does not keep track of open files.
Each access is individually checked for user permissions.
Supports Kerberos for strong authentication and security.
7. Naming and Mounting
Types of mount:
Soft mount: retries for a limited time.
Hard mount: retries indefinitely until success.
Auto mount: mounted when accessed.
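The difference between a soft and a hard mount is easiest to see as a retry policy. This is a minimal sketch, not real NFS client code: `nfs_call`, `ServerUnavailable`, and `flaky_server` are hypothetical names, and the "limited time" of a soft mount is modelled as a limited number of attempts. Auto mount is not covered here.

```python
import itertools


class ServerUnavailable(Exception):
    """Raised by the (hypothetical) transport when the server does not answer."""


def nfs_call(send_request, mount_type="hard", max_retries=3):
    """Retry a request according to the mount type."""
    if mount_type == "hard":
        attempts = itertools.count(1)           # retry forever
    elif mount_type == "soft":
        attempts = range(1, max_retries + 1)    # give up after max_retries
    else:
        raise ValueError(f"unknown mount type: {mount_type}")

    for attempt in attempts:
        try:
            return send_request(attempt)
        except ServerUnavailable:
            continue
    # Only reachable for a soft mount, because the hard-mount loop never ends.
    raise TimeoutError(f"soft mount gave up after {max_retries} attempts")


def flaky_server(succeeds_on):
    """Fake server that only answers from attempt number `succeeds_on` onward."""
    def send_request(attempt):
        if attempt < succeeds_on:
            raise ServerUnavailable
        return f"data after {attempt} attempts"
    return send_request


print(nfs_call(flaky_server(5), mount_type="hard"))   # keeps trying until attempt 5
try:
    nfs_call(flaky_server(5), mount_type="soft", max_retries=3)
except TimeoutError as err:
    print("soft mount:", err)
```

With a hard mount the application simply waits while the server is down; with a soft mount it gets an error back and must handle it.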
8. Client Caching
9. Server Caching
Master Server
Chunk Servers
Clients
We open a system and access files at a specific location. Then, we pick up our mobile to
check its performance. Next, we scale this setup and test multiple mobiles concurrently.
After that, we return to the system to replicate some files, ensuring they work smoothly
across different hardware and operating systems. Everything continues to function
consistently, with strong fault tolerance. The system also maintains high security and
excellent efficiency.
1. Access Transparency
Users and programs should not need to know whether a file is stored locally or remotely.
The same file operations should work for both.
Example: Accessing a file on Google Drive should feel the same as opening a file stored on
a personal computer.
2. Location Transparency
Files can be moved between servers, but their path remains unchanged.
Users do not need to know where the file is physically stored.
Example: A video on a streaming platform may move to different data centers, but users
can still access it using the same link.
3. Mobility Transparency
Files can be moved without requiring changes in client applications or system settings.
Ensures that files remain accessible even when they are relocated.
Example: A company might move employee files from one server to another without
employees noticing any change.
4. Performance Transparency
The system should maintain stable performance even when the load on servers varies.
Example: A cloud storage service should provide smooth access to files even when many
users are active.
5. Scaling Transparency
The system should be able to expand and handle an increasing number of users and data
without major changes.
Example: A cloud-based file service should function efficiently whether it serves ten users
or a million users.
7. File Replication
The system should maintain multiple copies of files across different locations.
Helps in load balancing and fault tolerance.
Example: A distributed file system storing data on multiple servers ensures availability even
if one server fails.
8. Hardware and Operating System Heterogeneity
The system should work across different operating systems and hardware.
Example: A file service should be accessible from Windows, Linux, and macOS without
compatibility issues.
9. Fault Tolerance
The system should continue functioning even if some servers or clients fail.
Example: If one storage server crashes, another should take over automatically to prevent
data loss.
10. Consistency
When files are updated, all copies should reflect the latest changes.
There might be delays in propagating updates across different sites.
Example: When an email is deleted from one device, it should also disappear from all other
devices.
11. Security
The system should protect data using authentication, access control, and encryption.
Example: Only authorized users should be able to access confidential company files, and
data should be encrypted to prevent unauthorized access.
12. Efficiency
The system should provide high performance comparable to traditional file systems.
Example: Opening and saving files in a distributed system should be as fast as working
with local files.
5. Differentiate between whole file serving and whole file caching in Andrew File System.
Directory Service
Trick to learn
Scalable
Available (fault tolerant)
Fast (direct access)
Efficient (metadata split from data)
and handles BIG files
🔹 Example: Imagine a group of friends voting on where to eat. Everyone must answer within
10 seconds, or they are considered unavailable.
In a synchronous system, crash failures happen when some processes stop working but
don’t send incorrect data. The goal of the consensus algorithm is to let the remaining
processes agree on a decision, even if some fail.
1. Each process proposes a value (e.g., "Let's eat pizza" or "Let's eat burgers").
2. Processes exchange values with each other within a fixed time.
3. If a process crashes, it stops responding, but the remaining processes continue.
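4. Decision rule: The exchange is repeated for f + 1 rounds, where f is the maximum number of
crashes tolerated, so at least one round is crash-free. Each remaining process then applies the
same fixed rule to the values it has seen, for example choosing the smallest one.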
5. Final Decision: All non-crashed processes adopt the agreed-upon value.
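The steps above can be simulated in a few lines. This is a minimal sketch, assuming the common textbook formulation in which processes exchange values for f + 1 rounds and then decide on the minimum value seen (a fixed rule that needs no majority). The function `crash_consensus` and the `crash_plan` format are hypothetical, invented for this example.

```python
def crash_consensus(initial, f, crash_plan=None):
    """Simulate synchronous consensus with at most f crash failures.

    initial:    {process_id: proposed value}
    crash_plan: {process_id: (round, recipients)} meaning the process crashes
                in that round after reaching only those recipients
                (hypothetical format, used to model a crash mid-broadcast).
    Returns {process_id: decided value} for the processes still alive.
    """
    crash_plan = crash_plan or {}
    known = {pid: {value} for pid, value in initial.items()}  # values seen so far
    alive = set(initial)

    for rnd in range(1, f + 2):                 # rounds 1 .. f + 1
        inbox = {pid: set() for pid in alive}
        for sender in sorted(alive):
            crash = crash_plan.get(sender)
            crashing_now = crash is not None and crash[0] == rnd
            recipients = crash[1] if crashing_now else alive
            for receiver in recipients:
                if receiver in inbox:
                    inbox[receiver] |= known[sender]
        # Processes that crashed in this round take no further part.
        alive -= {pid for pid, (r, _) in crash_plan.items() if r == rnd}
        for pid in alive:
            known[pid] |= inbox[pid]

    # Same fixed rule everywhere: decide on the smallest value seen.
    return {pid: min(known[pid]) for pid in alive}


initial = {1: 3, 2: 5, 3: 1, 4: 4}
plan = {3: (1, {1})}   # process 3 reaches only process 1 in round 1, then crashes

print(crash_consensus(initial, f=1, crash_plan=plan))   # {1: 1, 2: 1, 4: 1}
print(crash_consensus(initial, f=0, crash_plan=plan))   # {1: 1, 2: 3, 4: 3}
```

The second call runs only one round even though a crash happens, so process 1 knows the value 1 while processes 2 and 4 do not, and they disagree. The extra round in the first call lets process 1 pass that value on, which is why f + 1 rounds are used.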
9. Explain Andrew File System in detail
AFS is a distributed file system that allows users to access files from multiple computers as if
they were stored locally. It is designed to handle large numbers of users efficiently by using
caching to speed up file access.
When a user opens a file, the entire file is downloaded to their computer.
This file is then cached (stored temporarily) so future access is faster.
Even if the computer restarts, cached files remain available.
Whole-File Serving
AFS Components
Vice (Server-Side)
AFS ensures that all users see the latest version of a file using a system called Callback
Promises:
✔ When a file is cached, the AFS server promises to notify the user if the file is modified
by someone else.
✔ If another user modifies the file, the server sends a callback message to all users with
the old version.
✔ If a file’s callback is canceled, Venus downloads a fresh copy.
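The callback promise mechanism can be sketched with two small classes. This is only an illustration of the idea, not the real AFS code: the class names `ViceServer` and `VenusClient` and all their methods are hypothetical, and files are plain strings held in memory.

```python
class ViceServer:
    """Server side: stores files and remembers who holds a callback promise."""

    def __init__(self):
        self.files = {}       # file name -> contents
        self.callbacks = {}   # file name -> clients with a valid promise

    def fetch(self, name, client):
        # Handing out a copy also hands out a callback promise.
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, name, data, writer):
        self.files[name] = data
        # Tell every other client that its cached copy is now stale.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)
        self.callbacks[name] = {writer}


class VenusClient:
    """Client side: caches whole files and trusts them while the promise is valid."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # file name -> cached contents
        self.valid = set()   # file names whose callback promise is still valid

    def open(self, name):
        if name not in self.valid:               # no promise: fetch a fresh copy
            self.cache[name] = self.server.fetch(name, self)
            self.valid.add(name)
        return self.cache[name]

    def close(self, name, data):
        self.cache[name] = data
        self.server.store(name, data, self)      # write the whole file back
        self.valid.add(name)

    def break_callback(self, name):
        self.valid.discard(name)                 # cached copy can no longer be trusted


server = ViceServer()
server.files["notes.txt"] = "v1"
alice, bob = VenusClient(server), VenusClient(server)

alice.open("notes.txt")            # alice caches "v1"
bob.open("notes.txt")              # bob caches "v1"
bob.close("notes.txt", "v2")       # server breaks alice's callback
print(alice.open("notes.txt"))     # v2
```

As long as the promise is valid, a client opens the file from its cache without contacting the server, which is what keeps the server load low.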
1. Client System
2. Server System
Sends requests to the server for file operations (read, write, open, etc.).
Receives the responses and acts like the file is local.
Caches the file data in local memory for faster access.
Acts as a middle layer between user applications and the file system.
It checks whether a file is local or remote.
Sends the request to either the local file system or the NFS client module.
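The routing decision made by the middle layer can be sketched as follows. This is a simplified illustration: `make_vfs`, `local_read`, and `nfs_read` are hypothetical names, and "remote" is decided purely by checking whether the path lies under a mounted remote directory.

```python
def make_vfs(remote_mount_points, local_read, nfs_read):
    """Return a read() function that routes each path to the right module."""

    def read(path):
        is_remote = any(
            path == mount or path.startswith(mount + "/")
            for mount in remote_mount_points
        )
        # The application calls read() the same way in both cases.
        return nfs_read(path) if is_remote else local_read(path)

    return read


read = make_vfs(
    remote_mount_points=["/home/user/data"],
    local_read=lambda path: ("local file system", path),
    nfs_read=lambda path: ("NFS client module", path),
)

print(read("/etc/hosts"))                   # handled by the local file system
print(read("/home/user/data/report.txt"))   # handled by the NFS client module
```

The caller never sees which branch was taken, which is the access transparency described in these notes.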
2. File Handle
3. Access Transparency
The user doesn't need to know whether the file is remote or local.
All file operations look the same.
4. Mounting
Mounting is the process of attaching remote directories to the local file system.
Example: mount(remote_host:/data, /home/user/data)
Types of mounting:
Soft mount: times out if server doesn't respond
Hard mount: keeps retrying until it succeeds
Auto mount: mounted automatically when accessed
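Once a remote directory is mounted, every local path under the mount point has to be translated into a host and a path on that host. The sketch below shows one way to do that, using the mount from the example above. The `resolve` function and the dictionary format of the mount table are assumptions made for illustration, not how a real NFS client stores its mounts.

```python
def resolve(mount_table, local_path):
    """Translate a local path into (host, remote_path), or None if it is local.

    mount_table maps a local mount point to (remote_host, remote_directory).
    The longest matching mount point wins, so nested mounts behave sensibly.
    """
    best = None
    for mount_point in mount_table:
        under_mount = local_path == mount_point or local_path.startswith(mount_point + "/")
        if under_mount and (best is None or len(mount_point) > len(best)):
            best = mount_point
    if best is None:
        return None
    host, remote_dir = mount_table[best]
    return host, remote_dir + local_path[len(best):]


# Equivalent of: mount(remote_host:/data, /home/user/data)
mount_table = {"/home/user/data": ("remote_host", "/data")}

print(resolve(mount_table, "/home/user/data/reports/q1.txt"))  # ('remote_host', '/data/reports/q1.txt')
print(resolve(mount_table, "/tmp/scratch.txt"))                # None
```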
When we handle processes, some of them might fail (Failure Model). So, our first step is
to check the expected delivery time to determine whether the system is synchronous or
asynchronous.
Once that’s clear, we verify if the issue is due to network connectivity. If the connection
still doesn’t work, we send a complaint, making sure to include the sender's identity, and
transmit it through reliable channels.
1. Failure Model
3. Network Connectivity
The network is fully connected: every process can send messages to every other process.
4. Sender Identification
5. Reliable Channels
6. Message Authentication
Two cases:
Unauthenticated messages: Faulty processes can forge or tamper with messages
(like spreading rumors).
Authenticated messages: With digital signatures, forgeries and tampering can be
detected, making consensus easier.
Authenticated systems are more robust against Byzantine faults.
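The effect of authentication can be shown with a short example. Note the assumption: this sketch uses an HMAC with a shared secret key from Python's standard library as a simple stand-in for digital signatures. Real signatures use a private/public key pair so that receivers can verify a message but cannot forge one. The key and messages are made up.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Produce an authentication tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check that the tag matches the message (constant-time comparison)."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"secret-of-process-1"        # hypothetical key held by the sender
message = b"value=attack"
tag = sign(key, message)

print(verify(key, message, tag))              # True: message is untouched
print(verify(key, b"value=retreat", tag))     # False: tampering is detected
```

A faulty process that relays a changed value cannot produce a matching tag, so the receiver can discard the message instead of being misled by it.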
When we need to store something, we start by deciding how to organize the files and
where to place them. Once that's set, we define the file location and store both the data
and its attributes. To keep things tidy, we place the files in separate directories and give
them meaningful names. Finally, we save the metadata to help track and manage the files
in the future.
DFS handles the organization, naming, retrieval, sharing, and protection of files spread
across various locations.
It provides a consistent programming interface that abstracts the low-level details of
storage allocation and file layout.
Files are stored on non-volatile media like hard disks or SSDs across different servers.
From a user’s perspective, accessing a file over DFS feels the same as accessing it locally.
5. Metadata
Metadata refers to all extra information that the file system keeps for managing files
(including attributes, directory structure, and access permissions).
It plays a key role in tracking, securing, and organizing files within the system.
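On a single machine, the same kind of metadata can be inspected with Python's standard `os.stat` call. This is a small local illustration of what "attributes" means; the helper name `describe` and the choice of fields are made up, and a distributed file system would keep such records on its servers rather than on the local disk.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a few metadata attributes that the file system keeps for path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),   # e.g. '-rw-r--r--'
        "modified": time.ctime(info.st_mtime),
        "is_directory": stat.S_ISDIR(info.st_mode),
    }


# Usage: inspect a freshly written temporary file.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "notes.txt")
    with open(sample, "w") as f:
        f.write("hello")
    print(describe(sample))
```

None of these values are part of the file's contents; they are extra information stored alongside it, which is exactly what the notes call metadata.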