
Distributed System Highlighted Yellow

A distributed system is a network of independent computers that work together to present a unified system to users, with goals including scalability, reliability, and security. Key concepts discussed include multicasting, firewalls, cryptography, and the differences between network and distributed operating systems. The document also covers various terms related to distributed systems, such as remote procedure calls, digital signatures, and inter-process communication.

Uploaded by

somyajiit07

Distributed System - 3170719 - @thatmishrajii

3-Marks:
1. What is Distributed System? List out goals of distributed system.
Ans:
A distributed system is a collection of independent computers that work together to appear as a single
system to users. These computers communicate over a network, sharing resources and coordinating
actions to achieve common goals.

Goals of Distributed Systems:

1. Scalability: Easy to expand.
2. Reliability: Continues working even with failures.
3. Performance: Efficient and fast.
4. Transparency: Hides complexity from users.
5. Security: Protects data and resources.
6. Fault Tolerance: Recovers from failures.
7. Resource Sharing: Shares resources like files or devices.

2. Define scalability, Fault Tolerance and Replication.


Ans:

Scalability is the ability of a system to handle increasing workloads by adding more resources (like
servers) without affecting performance. It ensures that the system can grow smoothly as demand
increases.

Fault tolerance is the system’s ability to continue functioning properly even when some of its components
fail. This ensures that failures don't disrupt the overall operation.

Replication is the process of creating and maintaining multiple copies of data or services across different
nodes in a distributed system. It improves availability, reliability, and fault tolerance.

3. What is multicasting? List the characteristics of multicasting.


Ans:

Multicasting is a communication method where a message is sent from one sender to a group of
receivers in a network. Unlike broadcasting (sending to all), multicasting targets specific recipients, making
it more efficient for group communication.

Characteristics of Multicasting:

1. Group Communication: Data is sent to a selected group of receivers.
2. Efficient Resource Usage: Reduces bandwidth by sending a single message to multiple recipients,
instead of sending separate messages.
3. Selective Delivery: Only intended receivers within the group get the message.
4. Supports Dynamic Groups: Receivers can join or leave the multicast group dynamically.
5. Used in Real-Time Applications: Common in video conferencing, streaming, and online gaming.

4. Explain Firewall in detail.


Ans:

A firewall is a security system that controls the flow of network traffic between different networks, such as
between a private network and the internet. Its main functions are:

1. Traffic Filtering: It examines incoming and outgoing data packets and decides whether to allow or
block them based on predefined rules.
2. Protection: It helps protect internal networks from unauthorized access and potential threats from the
outside.
3. Monitoring: It logs traffic activity, which helps in detecting and analyzing security threats.
4. Access Control: It enforces rules about which devices or users can access specific resources or
services.

Firewalls can be implemented in hardware, software, or a combination of both. They are essential for
maintaining network security by creating a barrier between trusted and untrusted networks.
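
The traffic-filtering function described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule table `RULES`, the function name `is_allowed`, and the example addresses are hypothetical, rules are checked in order with the first match winning, and anything that matches no rule is blocked (default deny). Real firewalls inspect far more than source address and destination port.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule table: (action, source network, destination port or None for any port).
RULES = [
    ("block", "203.0.113.0/24", None),  # block one untrusted network entirely
    ("allow", "10.0.0.0/8", 22),        # SSH only from the internal network
    ("allow", "0.0.0.0/0", 443),        # HTTPS from anywhere else
]

def is_allowed(src_ip, dst_port, rules=RULES):
    """Return True if the first matching rule allows the packet."""
    source = ip_address(src_ip)
    for action, network, port in rules:
        if source in ip_network(network) and port in (None, dst_port):
            return action == "allow"
    return False  # default deny: no rule matched

print(is_allowed("10.1.2.3", 22))      # internal SSH
print(is_allowed("203.0.113.5", 443))  # blocked network
```

Putting the block rule first matters in this sketch, because a later allow rule would otherwise match the same packet.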

5. What is cryptography? What is the use of cryptography?


Ans:

Cryptography is the science of protecting information by converting it into a secure format that can only
be understood by authorized people. It uses mathematical techniques to encrypt (scramble) and decrypt
(unscramble) data, making it unreadable to anyone except those with the correct key.

Uses of Cryptography:

1. Confidentiality: Ensures that only authorized users can read the information.
2. Integrity: Protects data from being altered by unauthorized users.
3. Authentication: Verifies the identity of the sender and receiver to prevent impersonation.
4. Non-repudiation: Ensures that once a message is sent, the sender cannot deny having sent it.

Cryptography is widely used in secure communications, online transactions, and data protection.
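
As a small illustration of the integrity and authentication uses listed above, the sketch below attaches a keyed hash (HMAC) to a message using Python's standard `hmac` and `hashlib` modules. The function names `make_tag` and `verify_tag` and the sample key are assumptions made for this example. It does not provide confidentiality (the message is not encrypted) or non-repudiation (both parties share the same key).

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag that the receiver can recompute and compare."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Return True only if the message is unchanged and the same key was used."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = b"example-shared-key"           # hypothetical key known to both sides
tag = make_tag(shared_key, b"transfer 10")
print(verify_tag(shared_key, b"transfer 10", tag))    # unmodified message
print(verify_tag(shared_key, b"transfer 1000", tag))  # altered message
```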

6. Define Terms: Name Space, Name server.


Ans:

1. Name Space: A name space is a structured system that defines how names (or identifiers) are
assigned to objects in a distributed system. It provides a way to organize and locate resources like
files, devices, or services by using unique names.
2. Name Server: A name server is a server responsible for translating human-readable names (like
domain names) into machine-readable addresses (like IP addresses). It manages the mapping
between names and resources in the system and helps users locate services or devices efficiently.

7. Briefly explain HTTP.


Ans:

HTTP (Hypertext Transfer Protocol) is a protocol used for transferring data over the web. It allows web
browsers and servers to communicate by sending and receiving requests and responses.

Key Points about HTTP:

• Requests and Responses: A browser (client) sends an HTTP request to a server asking for a
web page or resource. The server responds with the requested data.
• Stateless: Each HTTP request is independent and does not remember previous requests,
meaning it doesn’t retain any information about past interactions.
• Methods: Common HTTP methods include GET (to retrieve data), POST (to submit data), PUT (to
update data), and DELETE (to remove data).

8. Explain CORBA’s common Data Representation.


Ans:
CORBA’s Common Data Representation (CDR) is a standardized way to encode and decode data for
communication between different systems in a distributed environment. It ensures that data can be
correctly interpreted regardless of the hardware or software differences between the systems involved.
CDR defines a format for representing data types, such as integers and strings, so that they can be
consistently transmitted over a network. This way, a system sending data can be confident that the
receiving system will understand and correctly process it.
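
The idea can be illustrated with a simplified sketch. In CDR the sender may write values in its own byte order and flags which order it used, so the receiver swaps bytes only when needed. The code below mimics that for a single 32-bit integer using Python's `struct` module. The function names are hypothetical, and this is not the real CDR wire format: alignment rules and all other data types are left out.

```python
import struct

def encode_long(value: int, little_endian: bool) -> bytes:
    """Prefix a 32-bit integer with a byte-order flag (0 = big-endian, 1 = little-endian)."""
    flag = b"\x01" if little_endian else b"\x00"
    fmt = "<i" if little_endian else ">i"
    return flag + struct.pack(fmt, value)

def decode_long(data: bytes) -> int:
    """Read the flag first, then interpret the next four bytes in the flagged order."""
    fmt = "<i" if data[0] == 1 else ">i"
    return struct.unpack(fmt, data[1:5])[0]

# The same value survives the trip whichever byte order the sender used.
print(decode_long(encode_long(1025, little_endian=True)))
print(decode_long(encode_long(1025, little_endian=False)))
```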

9. Explain how simple client-server communication is done.


Ans:

In a simple client-server communication, the process works like this:

1. Server Setup: The server listens for incoming connections on a specific port.
2. Client Request: The client connects to the server's IP address and port and sends a request.
3. Server Response: The server receives the request, processes it, and sends back a response.
4. Client Receives: The client receives the server’s response and processes it.

In essence, the client sends a request, the server processes it, and then the server sends a response
back to the client.
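
The four steps above can be sketched with Python's standard `socket` module. This is a minimal example under stated assumptions: both ends run on the same machine (127.0.0.1), the operating system picks a free port, the server handles exactly one request, and the "processing" is just echoing the request back. The names `serve_once` and `run_demo` are made up for this illustration.

```python
import socket
import threading

def serve_once(server_sock: socket.socket) -> None:
    """Steps 1 and 3: accept one connection, read the request, send a response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"echo: " + request)

def run_demo(message: bytes) -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))   # port 0 lets the OS choose a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()
        # Steps 2 and 4: the client connects, sends a request, and reads the reply.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply

print(run_demo(b"hello"))
```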

10. Briefly explain scalability in distributed system.


Ans:

Scalability in a distributed system means the system's ability to handle growing amounts of work or
increasing numbers of users efficiently. There are two main types:

1. Horizontal Scaling: Adding more machines (nodes) to the system to handle more load.
2. Vertical Scaling: Increasing the power of existing machines (e.g., more CPU, RAM).

Scalability ensures that as demand increases, the system can expand and continue to perform well.

11. List down requirements for distributed file system.


Ans:

For a distributed file system, key requirements include:

1. Transparency: Users should not need to know if files are stored on different machines.
2. Scalability: The system should efficiently handle growing amounts of data and users.
3. Reliability: Data should be protected against hardware failures and other issues.
4. Consistency: Updates to files should be correctly synchronized across the system.
5. Performance: The system should provide fast access and transfer speeds for files.
6. Security: Data should be protected from unauthorized access and tampering.

12. Explain the term availability and reliability.


Ans:

Availability refers to how often a system is operational and accessible when needed. It's about
minimizing downtime and ensuring the system is up and running for users.

Reliability is about how consistently a system performs its functions without failures. It ensures that the
system can be trusted to work correctly over time and recover from issues if they arise.

13. Define failure. List the various reasons for the occurrence of failure.
Ans:

In a distributed system, a failure occurs when a component, such as a server or network, does not
perform its expected function or becomes unavailable, leading to disruptions in the system.

Reasons for failures include:

1. Hardware Malfunction: Physical components like hard drives or memory failing.
2. Software Bugs: Errors or flaws in the software that cause it to malfunction.
3. Network Issues: Problems with network connectivity or communication.
4. Overload: Excessive load or traffic that exceeds system capacity.
5. Human Error: Mistakes made by users or administrators, such as incorrect configuration.
6. Power Outages: Loss of electrical power affecting system components.
7. External Attacks: Security breaches or cyberattacks targeting the system.

14. Define happened before relation.


Ans:

The happened-before relation is a concept in distributed systems used to determine the order of events.
It helps to establish whether one event can causally influence another.

Definition: Event A is said to have happened-before event B (written A → B) if:

1. A and B occur in the same process, and A occurs before B.
2. A is the sending of a message by one process and B is the receipt of that same message by another
process.
3. The relation is transitive: if A happened-before B, and B happened-before C, then A
happened-before C.

If neither A → B nor B → A holds, the two events are concurrent.

This relation ensures a logical order of events, which is crucial for consistency and understanding
causality in distributed systems.
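
One common way to track this ordering is Lamport's logical clock, sketched below. Each process keeps a counter, increments it on every event, and on receiving a message jumps past the timestamp carried by that message. The class and method names are assumptions made for this example. The guarantee runs in one direction only: if A happened-before B then A's timestamp is smaller, but a smaller timestamp alone does not prove a causal link.

```python
class LamportClock:
    """A per-process logical clock."""

    def __init__(self) -> None:
        self.time = 0

    def local_event(self) -> int:
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp travels with the message."""
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        """Move past both the local clock and the sender's timestamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
a = p.local_event()   # 1: event on process P
s = p.send()          # 2: P sends a message
r = q.receive(s)      # 3: Q receives it, so the send happened-before the receive
b = q.local_event()   # 4: later event on Q
print(a, s, r, b)     # 1 2 3 4
```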

15. Define names, identifiers and addresses.


Ans:

In distributed systems:

1. Names: Human-readable labels assigned to entities (like users or resources) for easy identification.
For example, "Alice" or "Document1."
2. Identifiers: Unique codes or numbers assigned to entities to distinguish them from others. For
instance, a user ID like "12345" or a file ID like "file_6789."
3. Addresses: Specific locations used to locate entities within a network. This could be an IP address
(like "192.168.1.1") or a URL (like "http://example.com").

16. What is a distributed system? List out the advantages of Distributed System.

Ans:

A distributed system is a network of independent computers that work together to appear as a single
cohesive system to users. These computers communicate and coordinate their actions to achieve a
common goal.

Advantages of Distributed Systems:

1. Scalability: Easily expand by adding more machines or resources.
2. Reliability: Improved fault tolerance; if one component fails, others can take over.
3. Resource Sharing: Utilizes resources from multiple machines, such as storage and processing
power.
4. Flexibility: Allows for diverse and geographically dispersed resources and services.
5. Performance: Can handle large loads and tasks by distributing them across multiple machines.

17. Define Remote Procedure Call (RPC), Virtualization, Replication.


Ans:

Remote Procedure Call (RPC): A method that allows a program on one computer to execute a procedure
(function) on another computer in a distributed system as if it were a local procedure call. It simplifies
communication between different systems.

Virtualization: The process of creating virtual versions of physical resources, such as servers or storage
devices. It allows multiple virtual instances to run on a single physical machine, optimizing resource use
and flexibility.

Replication: The technique of duplicating data or services across multiple machines to ensure reliability
and availability. If one replica fails, others can take over, helping to prevent data loss and improve
performance.

18. What is digital signature? Explain briefly.


Ans:

A digital signature is a cryptographic technique used to verify the authenticity and integrity of a digital
message or document. It works like this:

1. Signing: The sender creates a unique digital signature by using a private key to encrypt a hash (a
unique code) of the message.
2. Verification: The recipient uses the sender's public key to decrypt the signature and compare it with a
newly generated hash of the message. If they match, the message is verified as authentic and
unaltered.

In summary, a digital signature ensures that a message comes from a legitimate source and hasn’t been
tampered with during transmission.
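
The sign and verify steps can be illustrated with a toy RSA example built only from Python's standard library. Everything here is an assumption made for teaching purposes: the two primes are small, publicly known Mersenne primes, there is no padding scheme, and the names `sign_message` and `verify_signature` are hypothetical. It is not secure and should not be used outside a demonstration. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib

# Toy key pair (insecure on purpose): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent

def digest(message: bytes) -> int:
    """Hash the message and reduce it so that it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign_message(message: bytes) -> int:
    """Signing: transform the hash with the private exponent."""
    return pow(digest(message), D, N)

def verify_signature(message: bytes, signature: int) -> bool:
    """Verification: undo the transform with the public exponent and compare hashes."""
    return pow(signature, E, N) == digest(message)

sig = sign_message(b"pay 10 to Bob")
print(verify_signature(b"pay 10 to Bob", sig))    # original message
print(verify_signature(b"pay 1000 to Bob", sig))  # tampered message
```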

19. What is Denial of Service? Explain briefly.


Ans:

Denial of Service (DoS) is an attack aimed at making a computer, network, or service unavailable to its
intended users. This is typically achieved by overwhelming the target with excessive requests or traffic,
causing it to slow down or crash.

Types of DoS Attacks:

1. Volume-based: Floods the target with massive amounts of traffic.
2. Protocol-based: Exploits weaknesses in network protocols to disrupt service.
3. Application-based: Targets specific applications or services with malicious data to cause crashes.

In summary, a DoS attack disrupts normal operations by overloading or exploiting vulnerabilities in the
system.

20. Discuss non-persistent HTTP connection briefly.


Ans:

Non-persistent HTTP connection is a type of HTTP connection where each request and response pair is
handled over a separate connection. Here's how it works:

1. Separate Connections: For every HTTP request from a client to a server, a new TCP connection is
established.
2. Short-Lived: After the server sends the response, the connection is closed.
3. Overhead: Establishing and closing connections for each request can add extra overhead and
latency.

21. How does a distributed system project a single system image?


Ans:

A distributed system projects a single system image by making multiple, separate components appear as
a unified system to users. This is achieved through:

1. Abstraction: Hides the complexity and details of the underlying components.
2. Transparency: Ensures users and applications interact with the system as if it were a single entity,
without needing to know about the distributed nature.
3. Consistency: Synchronizes data and state across the system to provide a cohesive experience.
4. Coordination: Manages interactions and communication between different components seamlessly.

22. List out the various characteristics of distributed system.


Ans:

Here are the key characteristics of a distributed system:

1. Transparency: Hides the complexity of the system from users, making it appear as a single entity.
2. Scalability: Can expand by adding more resources or nodes without major changes to the system.
3. Fault Tolerance: Can continue to function even if some components fail.
4. Concurrency: Supports simultaneous operations by multiple processes or users.
5. Resource Sharing: Allows different parts of the system to share resources like data and processing
power.
6. Heterogeneity: Supports different types of hardware and software components working together.

23. Compare: Network Operating System and Distributed Operating System.


Ans:

• NOS (Network Operating System):
  o Scope: Manages each computer separately on a network.
  o Resource Management: Each computer handles its own resources; network services like file
    sharing are added.
  o Transparency: Users see and use their own computer’s resources.
  o Coordination: Requires manual setup for network services.
• DOS (Distributed Operating System):
  o Scope: Manages multiple computers as one system.
  o Resource Management: Shares and manages resources across all computers together.
  o Transparency: Users interact with the system as a single, unified entity.
  o Coordination: Automatically handles coordination and synchronization between computers.

24. What is flat naming and structured naming?


Ans:

Flat Naming: A simple naming scheme where each name is unique and not related to others. For
example, using IDs like 12345 or abcde with no additional context or hierarchy.

Structured Naming: A more organized naming scheme where names have a hierarchical or descriptive
structure. For example, using paths like /home/user/documents or company.department.employee to
provide context and relationships between names.

25. Define IPC. What are the characteristics of IPC?


Ans:

IPC (Inter-Process Communication) is a mechanism that allows processes (programs) running on the
same computer or across different computers in a network to exchange data and coordinate actions.

Characteristics of IPC:

1. Communication: Enables data exchange between processes.
2. Synchronization: Manages the timing and order of interactions to avoid conflicts.
3. Data Integrity: Ensures that data is accurately transmitted and received.
4. Resource Sharing: Allows multiple processes to access shared resources.
5. Scalability: Supports communication between processes on the same or different machines.

26. Difference between Authorization and Authentication.


Ans:

Authentication: The process of verifying the identity of a user or system. It checks if someone is who they
claim to be, usually through credentials like passwords, fingerprints, or digital certificates.

Authorization: The process of determining what actions or resources an authenticated user or system is
allowed to access. It decides what permissions or rights the user has after their identity is confirmed.
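
The two steps can be sketched as two separate checks. The user table, the permission table, and the function names below are hypothetical. Passwords are compared through an unsalted SHA-256 hash only to keep the example short; a real system would use a dedicated password-hashing scheme.

```python
import hashlib
import hmac

# Hypothetical stores: password hashes for authentication, permissions for authorization.
USERS = {"alice": hashlib.sha256(b"s3cret").hexdigest()}
PERMISSIONS = {"alice": {"read"}}

def authenticate(user: str, password: str) -> bool:
    """Authentication: is this really the user they claim to be?"""
    stored = USERS.get(user)
    if stored is None:
        return False
    supplied = hashlib.sha256(password.encode()).hexdigest()
    return hmac.compare_digest(stored, supplied)

def authorize(user: str, action: str) -> bool:
    """Authorization: is this (already identified) user allowed to do this?"""
    return action in PERMISSIONS.get(user, set())

if authenticate("alice", "s3cret"):
    print(authorize("alice", "read"))    # permitted action
    print(authorize("alice", "delete"))  # identity is fine, permission is missing
```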

4-Marks:
1. Compare Network operating system and Distributed system.
Ans:

• Network Operating System (NOS):
  o Scope: Manages resources on individual computers connected in a network.
  o Resource Management: Each computer handles its own resources; network services like file
    sharing are provided.
  o Coordination: Minimal coordination; machines operate independently, with network services
    added manually.
• Distributed System:
  o Scope: Integrates multiple computers to work together as a single, unified system.
  o Resource Management: Shares and manages resources across all computers collectively.
  o Coordination: High level of automatic coordination and synchronization to present a cohesive
    system.

2. Compare Process and Thread.


Ans:

• Process:
  o Definition: A self-contained execution unit with its own memory space.
  o Isolation: Processes are independent and, by default, do not share memory with other processes.
  o Overhead: Creating and managing processes has higher overhead due to separate memory and
    resources.
• Thread:
  o Definition: A smaller execution unit within a process that shares the same memory space as other
    threads in the same process.
  o Sharing: Threads within the same process share memory and resources.
  o Overhead: Creating and managing threads is generally more efficient and has lower overhead
    compared to processes.

3. Explain LDAP in detail.


Ans:

LDAP (Lightweight Directory Access Protocol) is a protocol used to access and manage directory
services over a network. Here’s a brief overview:

1. Purpose: LDAP provides a way to look up and manage information about resources such as users,
groups, and devices in a directory.
2. Structure: The directory is organized in a hierarchical structure, similar to a tree, where each entry is
identified by a unique Distinguished Name (DN).
3. Operations: Common LDAP operations include:
o Search: Finding entries based on specific criteria.
o Bind: Authenticating a user to the directory.
o Add, Modify, Delete: Managing directory entries.
4. Protocol: LDAP operates over TCP/IP and uses a client-server model. Clients send requests to an
LDAP server, which processes and responds with the requested information.
5. Standard: LDAP is an open standard, widely used for directory services in various applications like
email systems, network management, and access control.

4. Explain Entry consistency and weak consistency.


Ans:

Entry Consistency and Weak Consistency are consistency models that enforce consistency only at
synchronization points, rather than on every individual read and write:

• Entry Consistency:
  o Definition: Each shared data item is associated with its own synchronization variable, such as a
    lock.
  o Guarantee: When a process acquires the synchronization variable, the data guarded by it is brought
    up to date, so reads inside the critical section reflect the most recent write to that data item. Data
    guarded by other variables may still be stale.
• Weak Consistency:
  o Definition: Uses synchronization variables that are not tied to individual data items, so updates
    may not be immediately visible to all nodes or users.
  o Guarantee: Provides no guarantee that a read reflects the most recent write until the process
    synchronizes; once it has synchronized, all earlier writes are visible.

5. Write short note on various security services.


Ans:

Here's a brief overview:

1. Confidentiality: Ensures that data is only accessible to authorized users. This is often achieved
through encryption, which converts data into a secure format that only authorized parties can decrypt.
2. Integrity: Guarantees that data has not been altered or tampered with during transmission or storage.
Techniques like checksums, hashes, and digital signatures are used to verify data integrity.
3. Authentication: Confirms the identity of users or systems before granting access. This can involve
passwords, biometrics, or digital certificates.
4. Authorization: Determines what resources a user or system is permitted to access and what actions
they can perform. This often relies on access control lists or roles.
5. Non-repudiation: Prevents users from denying their actions. This is achieved through mechanisms
like digital signatures and audit logs, which provide evidence of actions taken.
6. Availability: Ensures that services and data are accessible when needed. This involves protecting
against denial-of-service attacks and implementing redundancy and failover strategies.

These services work together to safeguard distributed systems from various security threats and ensure
reliable and trustworthy operations.

6. Briefly explain apache web server.


Ans:

Apache Web Server, often just called Apache, is one of the most popular and widely used web servers in
the world. Here’s a simple breakdown:

• Function: It serves web pages to users over the Internet. When you type a web address into your
browser, Apache processes the request and sends the appropriate web page back to you.
• Open Source: Apache is free to use and modify. It’s developed and maintained by a community of
volunteers.
• Flexibility: It supports various features through modules, which allow you to add functionalities like
URL rewriting, authentication, and more.
• Cross-Platform: It works on different operating systems, including Windows, Linux, and macOS.
• Configuration: It’s configured using text files (like httpd.conf), which control how the server behaves
and handles requests.

7. Briefly explain distributed commit and recovery in distributed systems.


Ans:

Distributed Commit: In a distributed system, when a transaction involves multiple nodes, all nodes must
agree to either commit (complete) or abort (cancel) the transaction to ensure consistency. This is typically
managed using protocols like Two-Phase Commit (2PC). In the first phase, the coordinator asks every
node to prepare, and each node votes to commit or abort. In the second phase, the coordinator tells all
nodes to commit only if every node voted to commit; a single abort vote (or a missing vote) aborts the
transaction on all nodes.

Recovery: If a failure occurs (e.g., a node crashes), recovery mechanisms ensure that the system returns
to a consistent state. This usually involves logging changes and using these logs to redo or undo
operations to maintain consistency across nodes. Techniques like logging, checkpointing, and consensus
algorithms are used to manage recovery and ensure that all nodes are synchronized.
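
A minimal sketch of the Two-Phase Commit decision rule follows. The transaction commits only if every participant votes yes in the prepare phase; one no vote aborts it everywhere. The `Participant` class and the `two_phase_commit` function are hypothetical names, and the sketch leaves out timeouts, logging, and crash recovery.

```python
class Participant:
    """A node taking part in the transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        """Phase 1: vote yes (ready) or no (abort)."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"

def two_phase_commit(participants) -> str:
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"

print(two_phase_commit([Participant("A"), Participant("B")]))                    # commit
print(two_phase_commit([Participant("A"), Participant("B", can_commit=False)]))  # abort
```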

8. Explain advantages and disadvantages of distributed systems.


Ans:

Advantages of Distributed Systems:

1. Scalability: Can easily expand by adding more nodes to handle increased load.
2. Fault Tolerance: If one node fails, others can continue to operate, improving reliability.
3. Resource Sharing: Nodes can share resources like data and processing power, optimizing usage.
4. Flexibility: Different components can be developed and maintained independently, allowing for
diverse applications and technologies.

Disadvantages of Distributed Systems:

1. Complexity: Managing and coordinating multiple nodes can be complicated.
2. Latency: Communication between nodes can introduce delays, affecting performance.
3. Security: More nodes and communication paths can increase the risk of security breaches.
4. Consistency: Ensuring all nodes have consistent data can be challenging, especially in the face of
failures.

9. Discuss flat and structured naming with example.


Ans:

Flat Naming:

• Definition: In a flat naming system, each name is unique but doesn’t provide any hierarchical
relationship or structure.
• Example: A UUID is a flat name. For example, 123e4567-e89b-12d3-a456-426614174000 identifies an
entity uniquely but says nothing about where it is located or how it relates to other entities.

Structured Naming:

• Definition: In a structured naming system, names are organized in a hierarchy or structure that
reflects relationships or categories.
• Example: Domain names on the Internet are structured. For instance, www.example.com has a
hierarchical structure: .com (top-level domain), example (second-level domain), and www
(subdomain).

10. Give some examples of true identifiers.


Ans:

True Identifiers are unique and unchanging names or labels used to identify entities in a distributed
system. Here are some examples:
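1. UUID (Universally Unique Identifier): A 128-bit number used to uniquely identify information across
systems. Example: 123e4567-e89b-12d3-a456-426614174000.
2. IP Address: An address assigned to a device on a network. Example: 192.168.1.1. Strictly speaking this is
an address rather than a true identifier, because the same address can later be reassigned to a
different device.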
3. MAC Address: A unique identifier assigned to network interfaces for communications on the physical
network segment. Example: 00:1A:2B:3C:4D:5E.
4. Global Unique Identifier (GUID): Similar to UUID, used in various software applications to uniquely
identify objects. Example: 550e8400-e29b-41d4-a716-446655440000.

11. Explain Berkeley clock synchronization algorithm.


Ans:

Berkeley Clock Synchronization Algorithm is a method to synchronize clocks across a distributed
system where no machine has an accurate external time source. Here's how it works:

1. Request Time: A coordinator node (or a designated time keeper) periodically requests the current
time from all other nodes in the system.
2. Send Times: Each node sends its local time to the coordinator.
3. Calculate Average: The coordinator calculates the average of the received times and its own time,
usually ignoring readings that are far out of range.
4. Distribute Adjustment: The coordinator then sends each node the amount by which it should adjust
its clock, rather than the average time itself.
5. Adjust Clocks: Each node, including the coordinator, adjusts its clock by that amount.

Key Points:

• The algorithm assumes that network delays are small or roughly symmetric; the coordinator can
compensate using measured round-trip times.
• It helps reduce the discrepancy between clocks but may not handle network delays perfectly.
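
The averaging step can be sketched as a short function. This is a simplified illustration: the function name `berkeley_adjustments` is hypothetical, the coordinator's own clock is taken to be the first entry in the list, and round-trip delay compensation and outlier rejection are left out. Clock values are given in minutes to keep the arithmetic readable.

```python
from typing import List

def berkeley_adjustments(clocks: List[float]) -> List[float]:
    """Return the offset each node should apply; clocks[0] is the coordinator."""
    average = sum(clocks) / len(clocks)
    return [average - clock for clock in clocks]

# Coordinator reads 3:00 (180 min), the other nodes read 3:25 (205) and 2:50 (170).
# The average is 3:05 (185), so the offsets are +5, -20 and +15 minutes.
print(berkeley_adjustments([180, 205, 170]))  # [5.0, -20.0, 15.0]
```

Sending an offset instead of an absolute time means the message delay on the way back does not add a new error to the node's clock.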

12. Discuss persistent and non-persistent HTTP connection.


Ans:

Persistent HTTP Connection:

• Definition: Also known as HTTP keep-alive, and the default in HTTP/1.1, it allows multiple requests
and responses to be sent over a single connection.
• Benefits: Reduces the overhead of establishing new connections for each request, leading to faster
communication and improved performance.
• Example: A web browser making multiple requests (e.g., for images, CSS, JavaScript) to a server
over the same connection.

Non-Persistent HTTP Connection:

• Definition: The default behaviour in HTTP/1.0 (HTTP without keep-alive), it opens a new connection
for each request and closes it after the response is received.
• Drawbacks: Increased overhead due to the need to establish and close a connection for every
request, which can lead to slower performance.
• Example: A web browser opening a new connection each time it requests a webpage, an image, or
other resources.
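
The difference in overhead can be estimated with a simple cost model. The sketch below assumes objects are fetched one after another, that opening a TCP connection costs one round trip, and that each request/response costs one more round trip plus the transfer time. The function names and the numbers in the example are assumptions chosen for illustration; parallel connections and pipelining are ignored.

```python
def non_persistent_time(num_objects: int, rtt_ms: int, transfer_ms: int = 0) -> int:
    """Every object pays for a new connection (1 RTT) plus its request/response (1 RTT)."""
    return num_objects * (2 * rtt_ms + transfer_ms)

def persistent_time(num_objects: int, rtt_ms: int, transfer_ms: int = 0) -> int:
    """The connection is opened once (1 RTT); each object then costs 1 RTT."""
    return rtt_ms + num_objects * (rtt_ms + transfer_ms)

# A page with 10 images is 11 objects; assume a 100 ms round-trip time.
print(non_persistent_time(11, 100))  # 2200
print(persistent_time(11, 100))      # 1200
```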

13. Explain causal consistency.


Ans:

Causal Consistency ensures that operations in a distributed system reflect the cause-and-effect
relationships between them. Here's a simple breakdown:

1. Definition: In a system with causal consistency, if one operation causally affects another (i.e., the first
must happen before the second), then all nodes in the system see these operations in the same order.
2. Order Preservation: If Node A performs an operation that Node B can see, then any subsequent
operation by Node B that depends on the first operation will also be seen in the same order by all
nodes.
3. Example: If Alice sends a message to Bob, and Bob replies, causal consistency ensures that if
another node sees Bob's reply, it will also see Alice's original message.

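Key Points:

• Causal consistency maintains a logical order of operations based on their causal relationships.
• Operations that are not causally related (concurrent operations) may be seen in different orders by
different nodes.
• It’s more flexible than strict consistency but provides a reasonable guarantee of order in distributed
systems.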

14. Discuss different alternatives of client-server organization.


Ans:

1. Peer-to-Peer (P2P):
o Definition: Each node in the network can act as both a client and a server.
o Advantages: Reduces reliance on a central server, improves scalability, and can be more resilient
to failures.
o Example: File-sharing networks like BitTorrent.
2. Multi-Tier Architecture:
o Definition: Divides the system into multiple layers or tiers, such as presentation, application, and
data layers.
o Advantages: Modularizes the system, making it easier to manage, scale, and update.
o Example: Web applications with separate front-end (client), application server, and database
server.
3. Service-Oriented Architecture (SOA):
o Definition: Uses services as the fundamental unit of communication, with services providing and
consuming functionalities over a network.
o Advantages: Promotes loose coupling between services, allowing for flexibility and scalability.
o Example: An e-commerce platform with separate services for payment, inventory, and shipping.
4. Microservices Architecture:
o Definition: Breaks down applications into smaller, independent services that communicate over a
network.
o Advantages: Enhances scalability, allows for continuous deployment, and improves fault isolation.
o Example: An online retail system with separate microservices for user management, product
catalog, and order processing.

15. What is Interceptor? Explain briefly.


Ans:

An Interceptor in distributed systems is a component that intercepts and processes requests or
responses before they reach their final destination or after they leave it. Here’s a brief overview:

1. Function: Interceptors can modify, log, or handle requests and responses. They are used for tasks
such as authentication, logging, or modifying data.

2. Usage: They sit "in the middle" of the communication process, allowing additional operations to be
performed without altering the core system logic.
3. Example: In a web application, an interceptor might check if a user is authenticated before allowing
access to certain resources, or it could log details about each request for debugging purposes.
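
The idea can be sketched in a few lines of Python. This is only an illustration: the request format (a plain dictionary), the handler `get_profile`, and the two interceptors are hypothetical names chosen for this note, not part of any particular middleware.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
Handler = Callable[[Request], str]


def auth_interceptor(next_handler: Handler) -> Handler:
    """Reject requests that carry no user before they reach the handler."""
    def wrapped(request: Request) -> str:
        if not request.get("user"):
            return "401 Unauthorized"
        return next_handler(request)
    return wrapped


def logging_interceptor(log: List[str]) -> Callable[[Handler], Handler]:
    """Record every request and response that passes through."""
    def make(next_handler: Handler) -> Handler:
        def wrapped(request: Request) -> str:
            log.append(f"request: {request['path']}")
            response = next_handler(request)
            log.append(f"response: {response}")
            return response
        return wrapped
    return make


def get_profile(request: Request) -> str:
    """Core logic (hypothetical); it knows nothing about the interceptors."""
    return f"200 OK profile of {request['user']}"


# Chain: logging -> authentication -> core handler.
log: List[str] = []
handler = logging_interceptor(log)(auth_interceptor(get_profile))
```

Calling `handler({"path": "/profile", "user": "alice"})` passes through both interceptors, while a request without a user is stopped by the authentication interceptor and never reaches `get_profile`.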

16. Briefly explain iterative name resolution technique.


Ans:

Iterative Name Resolution is a technique used in distributed systems to resolve names (like domain
names) into addresses (like IP addresses) in multiple steps. Here’s how it works:

1. Client Initiates Request: The client sends a request to a name server asking for the address of a
particular name (e.g., www.example.com).
2. Server Responds with Reference: Instead of resolving the name completely, the server responds
with a reference to another server that may know the answer.
3. Client Repeats Process: The client then contacts the next server and continues this process, asking
each server in the chain until it reaches one that can fully resolve the name.
4. Final Resolution: The client gets the final address and can now communicate with the target.

Example: In DNS, when resolving a domain, the client might first contact a root DNS server, which points
to a TLD (Top-Level Domain) server, and then the authoritative server for the domain.
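
The steps above can be sketched as a small simulation. The server names, the referral tables, and the address 192.0.2.10 (taken from a range reserved for documentation) are made-up assumptions; a real resolver sends DNS queries over the network instead of reading a dictionary.

```python
# Hypothetical name servers: each maps a name suffix either to a referral
# ("ref", next server) or to a final answer ("addr", address).
SERVERS = {
    "root": {"com": ("ref", "tld-com")},
    "tld-com": {"example.com": ("ref", "ns.example.com")},
    "ns.example.com": {"www.example.com": ("addr", "192.0.2.10")},
}


def query(server, name):
    """One server answers using the longest suffix of the name it knows."""
    table = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return ("error", "unknown name")


def resolve_iteratively(name, start="root"):
    """The client itself contacts each server in turn until it gets an address."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "ref":        # referral: the client repeats with the next server
            server = value
        elif kind == "addr":     # final resolution
            return value, contacted
        else:
            raise LookupError(value)
```

The important point is that the loop runs in the client: every server only returns a referral or an answer and never forwards the query itself.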

17. Write short note on Data-centric consistency model.


Ans:

A Data-Centric Consistency Model in distributed systems focuses on how data is read and written
across multiple replicas to maintain consistency. Here’s a brief explanation:

1. Definition: It defines the rules for how updates to a shared data item are propagated and viewed by
different processes. The goal is to ensure that all users or processes see a consistent view of the data,
even in a distributed environment.
2. Types of Data-Centric Consistency Models:
o Strict Consistency: Ensures that every read returns the most recent write. It's difficult to achieve
in distributed systems due to network delays.
o Sequential Consistency: Operations appear in the same order to all processes, but not
necessarily in real-time.
o Causal Consistency: Only operations that are causally related must appear in the same order for
all processes.
o Eventual Consistency: Guarantees that, in the absence of new updates, all replicas will
eventually become consistent.
3. Example: In an e-commerce platform, data-centric consistency ensures that when a user updates
their cart, all replicas of the system eventually reflect the correct cart contents.

18. Write short note on Process Resilience.


Ans:

Process Resilience in distributed systems refers to the system's ability to continue functioning correctly
even when some processes fail. It ensures that the system remains operational and can recover from
failures. Here's a brief overview:

1. Redundancy: Multiple copies of processes or services are maintained so that if one fails, others can
take over. This is key to ensuring resilience.
2. Failure Detection: Mechanisms like heartbeat signals or timeouts are used to detect failed processes
quickly, allowing the system to respond.
3. Recovery: After a failure, the system may restart or replace the failed process, or use checkpointing to
restore its previous state.
4. Example: In cloud computing, if a server crashes, the processes running on it may be automatically
restarted on another server to maintain service availability.
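
A minimal sketch of timeout-based failure detection with heartbeats is shown below. The class name `HeartbeatMonitor` and the explicit `now` arguments are assumptions made to keep the example deterministic; a real monitor would read the system clock and receive heartbeats over the network.

```python
class HeartbeatMonitor:
    """Suspects a process of failure if no heartbeat arrived within the timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # process id -> time of last heartbeat

    def heartbeat(self, process_id, now):
        """Record that a heartbeat from process_id arrived at time now."""
        self.last_seen[process_id] = now

    def failed(self, now):
        """Return the processes whose last heartbeat is older than the timeout."""
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

With a timeout of 3 seconds, a process that last reported at time 0 is suspected at time 5, after which the system can restart it elsewhere or switch to a replica.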

18. What is RMI (Remote Method Invocation)? Explain briefly.


Ans:

RMI (Remote Method Invocation) is a mechanism in distributed systems that allows a program to call
methods on objects located on different machines as if they were local. Here's a brief explanation:

1. Definition: RMI enables communication between Java objects on different virtual machines (or
computers), allowing one object to invoke methods on a remote object.
2. How It Works:
o The client calls a method on a remote object.
o The request is sent over the network to the server where the object resides.
o The server processes the request, executes the method, and returns the result to the client.
3. Example: If a Java program on one computer wants to access a database on another, RMI allows it to
call a method on a remote object that handles database operations.

19. Briefly discuss the issues related to distributed system design.


Ans:

When designing a Distributed System, several key issues need to be addressed to ensure efficiency,
reliability, and scalability. Here’s a brief overview of the main challenges:

1. Communication Delays: Network latency can cause delays in communication between nodes,
impacting system performance. Efficient communication protocols are needed to minimize this.
2. Fault Tolerance: Distributed systems must be able to handle node failures without crashing the entire
system. Techniques like replication and redundancy are used to achieve fault tolerance.
3. Consistency: Ensuring data consistency across multiple nodes is challenging, especially when
updates happen concurrently. Various consistency models (e.g., eventual consistency) are used to
balance performance and correctness.
4. Security: Distributed systems are vulnerable to attacks like data breaches, unauthorized access, and
denial of service. Encryption, authentication, and secure communication protocols are essential to
ensure data and system security.
5. Scalability: As the system grows, it must handle an increasing number of requests without
performance degradation. Proper system architecture and load balancing are key to achieving
scalability.
6. Concurrency: Managing multiple users accessing shared resources simultaneously requires
synchronization to avoid conflicts and ensure data integrity.

20. What is the need for code migration? Explain the code migration issues in detail?
Ans:

Code Migration refers to moving code (or processes) from one machine to another in a distributed
system. It allows systems to balance load, improve performance, or place computation closer to data or
resources. Here’s why it’s needed and some issues it faces:

Need for Code Migration:

1. Load Balancing: Distribute workloads across multiple machines to avoid overloading one server.
2. Improved Performance: Move code closer to where data or resources are located to reduce
communication overhead.
3. Fault Tolerance: Migrate processes to healthy nodes if the current one fails.
4. Dynamic Upgrades: Apply software updates without stopping the system by migrating running
processes.

Code Migration Issues:

1. Heterogeneity: The destination machine may have a different hardware or software environment,
causing compatibility issues when migrating code.
o Solution: Use platform-independent languages like Java or virtual machines to ensure
compatibility.
2. State Transfer: The state of the running process (variables, open files, etc.) must be correctly
transferred to the new machine.
o Solution: Techniques like process checkpointing can save and restore the process state during
migration.
3. Security: Migrated code may be exposed to untrusted environments, raising concerns about security
and integrity.
o Solution: Implement strong authentication and sandboxing to protect the code and the host
machine.
4. Communication: After migration, ongoing communication (e.g., network connections) might be
disrupted or need to be reestablished.
o Solution: Use proxy mechanisms or dynamic binding to handle communication redirection.
5. Resource Availability: The new machine might not have the necessary resources (e.g., memory,
CPU power) to execute the migrated code efficiently.
o Solution: Check resource availability before migration and use load monitoring.

21. Enumerate various issues in clock synchronization.


Ans:

In Clock Synchronization for distributed systems, several challenges need to be addressed to ensure
that all nodes maintain a consistent and synchronized time. Here are the key issues:

1. Clock Drift: Hardware clocks tick at slightly different rates, so the clocks of different machines
gradually drift apart, leading to time discrepancies.
o Solution: Regular synchronization using protocols like NTP (Network Time Protocol) helps reduce
drift.
2. Network Delays: Communication between nodes takes time, and variable network delays can cause
errors in time synchronization.
o Solution: Algorithms like Cristian's Algorithm and the Berkeley Algorithm estimate the message delay
and compensate for it when adjusting clocks.
3. Failure of Time Sources: If a central time server or a reliable time source fails, it can disrupt
synchronization across the system.
o Solution: Use redundant time sources or distributed algorithms that do not rely on a single point of
failure.

4. Clock Granularity: Some systems have coarse-grained clocks that update infrequently, making fine-
grained synchronization difficult.
o Solution: Use higher-resolution clocks where possible to improve accuracy.
5. Security: Unauthorized manipulation of time (e.g., by hackers) can cause issues like incorrect event
ordering or security breaches.
o Solution: Ensure secure communication and authentication for time synchronization messages.
6. Handling Time Zones: In global distributed systems, different nodes may be in different time zones,
complicating synchronization and time interpretation.
o Solution: Use UTC (Coordinated Universal Time) as a common reference across nodes.
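
As an illustration of how network delay can be compensated, the sketch below uses the round-trip estimate from Cristian's algorithm (an algorithm not named in the answer above, brought in here only as an example). It assumes the delay is roughly the same in both directions; the function names are hypothetical.

```python
def estimate_server_time(t_send, t_recv, server_time):
    """Estimate the server's current time when the reply arrives.

    t_send and t_recv are read from the client's clock; server_time is the
    timestamp carried in the reply. Half the round trip is assumed to be
    the one-way delay of the reply.
    """
    round_trip = t_recv - t_send
    return server_time + round_trip / 2


def clock_offset(t_send, t_recv, server_time):
    """How far the client clock must be adjusted (positive = client is behind)."""
    return estimate_server_time(t_send, t_recv, server_time) - t_recv
```

For example, if the request leaves at 10.0, the reply arrives at 10.5, and the server reported 20.0, the client estimates the server time as 20.25 and its own clock as 9.75 behind.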

22. List the differences between RMI and RPC.


Ans:

RMI (Remote Method Invocation):

• Java-specific and works only in Java environments.
• Supports object-oriented features, allowing invocation on remote objects.
• Transmits complex objects and methods between systems.
• Built into the Java platform, using Java-specific protocols like JRMP.
• Best suited for distributed Java applications.

RPC (Remote Procedure Call):

• Language-independent and usable across various programming languages.
• Focuses on procedural calls to remote functions or procedures.
• Typically handles simpler data types.
• Requires external frameworks such as gRPC or XML-RPC.
• Can use various protocols like HTTP or TCP/IP.
• Ideal for distributed systems involving multiple languages.

23. What is Replication? Write about motivations for replication.


Ans:

Replication in distributed systems involves creating and maintaining multiple copies of data across
different nodes. This ensures that the data remains available and consistent even if some nodes fail.
Here’s why replication is important:

1. Fault Tolerance: If one node fails, other nodes with copies of the data can continue to provide access,
ensuring the system remains operational.
2. High Availability: By having multiple copies of data, the system can handle more requests and
maintain service availability, even during high traffic or node failures.
3. Load Balancing: Replication allows distributing read requests across multiple nodes, improving
system performance and response times.
4. Disaster Recovery: Having data replicated in different locations helps protect against data loss due to
disasters, ensuring that data can be recovered.

24. What is DFS? Also write the features of DFS.


Ans:

DFS (Distributed File System) is a system that allows files to be stored and accessed across multiple
networked computers as if they were on a single local machine. It provides a unified view of files across
different locations.

Features of DFS:

1. Transparency: Users interact with files as if they are on their local system, even though they might be
distributed across multiple servers.
2. Scalability: Can handle increasing amounts of data and users by adding more servers or storage
resources.
3. Fault Tolerance: Maintains data availability and integrity even if some servers fail, often through
replication and redundancy.
4. Data Distribution: Distributes data across various nodes to balance load and improve performance.
5. Centralized Management: Provides a single point of management for file access, permissions, and
configuration.
6. Consistency: Ensures that all nodes see a consistent view of the files, even in the presence of
concurrent access.

7-Marks:
1. What is Transparency? Explain various types of Transparency.
Ans:

Transparency in distributed systems refers to the concealment of the complexity of the distributed nature
of the system from users and applications. It makes the system appear as if it were a single, unified entity,
despite being distributed across multiple locations.

Types of Transparency:

1. Access Transparency: Hides the details of how resources are accessed, making it appear as if all
resources are accessed in the same way, regardless of their location.
2. Location Transparency: Conceals the physical location of resources. Users and applications interact
with resources without needing to know where they are located.
3. Migration Transparency: Hides the fact that resources or services can move from one location to
another. Users do not need to be aware of or adapt to these changes.
4. Replication Transparency: Conceals the existence of multiple copies of data or services. Users
interact with a single, coherent resource without knowing that it is replicated across several nodes.
5. Concurrency Transparency: Manages simultaneous access to resources by multiple users or
processes, ensuring that they do not interfere with each other’s operations.
6. Failure Transparency: Ensures that the system continues to operate correctly and without disruption
even if some components fail, hiding the impact of failures from users.

2. What is Clock Synchronization?


Ans:

Clock Synchronization is the process of ensuring that the clocks of different machines or nodes in a
distributed system are aligned and show the same time. This is crucial for coordinating activities,
maintaining order, and ensuring consistency across the system.

Why It's Important:

1. Consistency: Ensures that events are recorded in the same order across all nodes.
2. Coordination: Helps in synchronizing actions or transactions that occur across multiple nodes.
3. Debugging: Makes it easier to troubleshoot and analyze system behavior when all nodes have a
consistent time reference.

3. Define RPC. Explain implementation mechanism of RPC.


Ans:

RPC (Remote Procedure Call) is a protocol that allows a program to execute a procedure (function or
method) on a remote server as if it were a local procedure call. It simplifies communication between
distributed systems by abstracting the complexities of network communication.

Implementation Mechanism of RPC:

1. Client-Side Stubs: The client uses a stub, which is a piece of code that acts as a proxy for the remote
procedure. The client stub handles the preparation of the procedure call and marshals (packages) the
arguments into a message.
2. Request Message: The client stub sends the request message containing the procedure call and
arguments over the network to the server.
3. Server-Side Stubs: On the server side, a server stub receives the request message, unmarshals
(unpacks) the arguments, and invokes the actual procedure on the server.
4. Procedure Execution: The server executes the procedure and generates a result.
5. Response Message: The result is sent back to the client stub in a response message.
6. Client Stub Receives Result: The client stub receives the response, unmarshals the result, and
returns it to the client application.
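
The six steps can be sketched with two small stubs. This is a simplified model under stated assumptions: JSON is used for marshalling, the "network" is a direct function call, and the procedure table `PROCEDURES` with its `add` entry is a made-up example.

```python
import json

# Server side: the actual procedures (hypothetical example).
PROCEDURES = {"add": lambda a, b: a + b}


def server_stub(request_bytes):
    """Unmarshal the request, invoke the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    try:
        result = PROCEDURES[request["proc"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")


def client_stub(proc, *args, transport=server_stub):
    """Marshal the call, send it, and unmarshal the result for the caller."""
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)   # stands in for the network
    reply = json.loads(reply_bytes.decode("utf-8"))
    if not reply["ok"]:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

To the caller, `client_stub("add", 2, 3)` looks like a local function call, while the marshalling, transport, and unmarshalling stay hidden inside the stubs.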

4. What is the objective of Election algorithm? Explain Ring election algorithm.


Ans:

Objective of Election Algorithm: The primary goal of an election algorithm in a distributed system is to
select a leader or coordinator among a group of nodes. The leader is responsible for managing tasks that
require a single point of coordination, such as resource allocation or decision making.

Ring Election Algorithm:

1. Formation of a Ring: Nodes in the system are organized in a logical ring structure, where each node
knows the address of its successor.
2. Initiating the Election: A node (say Node A) detects that the current leader has failed or needs to be
replaced and initiates an election process.
3. Election Message: Node A sends an election message with its identifier to its successor node in the
ring.
4. Propagation of Message: The message is passed around the ring, with each node comparing its own
identifier with the one in the message. If a node's own identifier is higher, it replaces the identifier in
the message with its own before forwarding it.
5. Choosing the Leader: When the message returns to the initiating node, it contains the highest
identifier in the ring. The node with that identifier is declared the new leader.
6. Notification: The new leader informs all other nodes of its new role, completing the election process.
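
A minimal simulation of the election message travelling once around the ring is given below. It assumes that the initiator is alive and that crashed nodes are simply skipped; the function name and the representation of the ring as a list of identifiers are assumptions made for this sketch.

```python
def ring_election(ring, initiator, alive):
    """Return the identifier of the elected leader.

    ring      -- node identifiers in ring order
    initiator -- the node that starts the election (assumed to be alive)
    alive     -- set of identifiers of nodes that are currently running
    """
    position = ring.index(initiator)
    candidate = initiator                      # identifier carried in the message
    current = (position + 1) % len(ring)
    while ring[current] != initiator:          # stop when the message returns
        node = ring[current]
        if node in alive:                      # crashed nodes are skipped
            candidate = max(candidate, node)   # keep the higher identifier
        current = (current + 1) % len(ring)
    return candidate
```

For the ring 3, 7, 2, 9, 5 with node 9 crashed, an election started by node 2 ends with node 7 as leader.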

5. Explain Distributed File System in details.


Ans:

Distributed File System (DFS) is a system that allows files to be stored and accessed across multiple
networked computers as if they were on a single local machine. It provides a unified view of files spread
across different locations.

Key Features of DFS:

1. Transparency: Users and applications interact with files as if they are all on their local system, even
though the files may be distributed across various servers.
2. Scalability: DFS can handle growing amounts of data and users by adding more servers or storage
resources without affecting performance.
3. Fault Tolerance: DFS maintains data availability and integrity even if some servers fail, often through
data replication and redundancy.
4. Data Distribution: Files are distributed across multiple servers to balance the load and improve
performance. This ensures that no single server becomes a bottleneck.
5. Centralized Management: Provides a single point for managing file access, permissions, and
configuration, making administration easier.
6. Consistency: Ensures that all users see a consistent view of the files, even if multiple users are
accessing or modifying files simultaneously.

How DFS Works:

1. File Access: When a user requests a file, the DFS determines the location of the file and retrieves it
from the appropriate server.
2. File Replication: To ensure reliability and availability, files may be replicated across multiple servers.
If one server fails, the system can still provide access from another server.
3. Load Balancing: DFS distributes file requests across multiple servers to prevent any single server
from becoming overloaded.
4. Metadata Management: DFS maintains metadata (information about files, such as their location and
permissions) to efficiently manage file access and updates.

6. What is mutual exclusion? Categorize and compare mutual exclusion algorithms.


Ans:

Mutual Exclusion is a concept in distributed systems ensuring that only one process or node can access
a critical resource (like a file or database) at a time. This prevents conflicts and ensures data consistency.

Categories of Mutual Exclusion Algorithms:

1. Centralized Algorithm:
o Description: A single central coordinator manages access to the critical resource. Processes
request access from the coordinator, which grants permission to only one process at a time.
o Example: Centralized Locking Algorithm.
o Pros: Simple to implement; avoids conflicts.
o Cons: Single point of failure; can become a bottleneck.
2. Distributed Algorithms:
o Description: No single coordinator. All nodes collaborate to manage access to the critical
resource using distributed messages.
o Types:
▪ Token-Based:
• Description: A unique token circulates among the nodes. Only the node holding the
token can access the resource.
• Example: Token Ring Algorithm.
• Pros: Simple; avoids deadlock.
• Cons: Token loss can cause issues; requires token circulation.
▪ Quorum-Based:
• Description: Requires permission from a quorum of nodes (for example, a majority) before
access is granted. Nodes communicate to agree on access.
• Example: Maekawa's Algorithm. (The Ricart-Agrawala Algorithm is closely related, but it
requires permission from all other nodes rather than from a quorum.)
• Pros: More fault-tolerant; no single point of failure.
• Cons: Message overhead; can be complex.
3. Lock-Based Algorithms:
o Description: Nodes use locks to manage access to the resource. Locks can be either exclusive
(only one node can hold) or shared (multiple nodes can hold).
o Example: Lamport’s Algorithm.
o Pros: Flexible; suitable for different types of resources.
o Cons: Requires careful management to avoid deadlocks and ensure fairness.

Comparison:

• Centralized: Simple but has a single point of failure and potential bottlenecks.
• Distributed Token-Based: Avoids bottlenecks but requires token management.
• Distributed Quorum-Based: Fault-tolerant and avoids single points of failure but can have high
message overhead.
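
As a small illustration of the centralized approach, the sketch below models the coordinator as an object that grants the resource to one process at a time and queues the others. The class and method names are hypothetical, and requests are plain method calls rather than network messages.

```python
from collections import deque


class Coordinator:
    """Grants access to the critical resource to one process at a time."""

    def __init__(self):
        self.holder = None       # process currently in the critical section
        self.waiting = deque()   # queued requests, first come first served

    def request(self, pid):
        """Return True if access is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the resource and return the next holder, if any."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

The sketch also makes the drawbacks visible: every request goes through one object, so the coordinator is both the bottleneck and the single point of failure.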

7. Draw and explain architecture of SUN Network File System.


Ans:

Sun Network File System (NFS) allows files to be shared across a network, providing a consistent file
system view on multiple machines. Here’s a simple architecture of NFS:

Architecture of NFS:

1. Client: Requests files from the server. It sends file requests to the NFS server and receives
responses, providing access to remote files as if they were local.
2. NFS Server: Hosts the files and handles client requests. It stores the files and manages file access,
ensuring that the client can read or write data as needed.
3. File System: On the NFS server, the file system is the actual storage where files are kept. It could be
any standard file system (e.g., ext4, NTFS).
4. Network: Connects the NFS client and server, allowing them to communicate and transfer file data.

How It Works:

• Mounting: The client mounts the remote file system from the NFS server to access the files.
• Requests: The client sends file operation requests (e.g., read, write) to the NFS server.
• Responses: The server processes these requests and sends the appropriate responses (e.g., file
data, confirmation).

8. Explain distribution of transparency in distributed system.


Ans:

Distribution Transparency in a distributed system means hiding the complexities and details of the
system's distributed nature from users and applications. It makes the distributed system appear as a
single, unified system, even though it consists of multiple, physically separated components.

Types of Distribution Transparency:

1. Access Transparency: Users access resources as if they are local, without knowing they are on
different machines.
2. Location Transparency: The physical location of resources is hidden from users. They interact with
resources without knowing where they are stored.
3. Migration Transparency: Users are unaware when resources or processes move from one location
to another within the system.
4. Replication Transparency: The existence of multiple copies of data is hidden. Users see a single,
consistent view of the data.
5. Concurrency Transparency: Users can access resources simultaneously without being affected by
other users' operations, ensuring smooth interaction.
6. Failure Transparency: The system hides failures from users, allowing continued operation despite
issues with some components.

9. Explain connection-oriented message communication with the help of diagram.


Ans:

Connection-Oriented Message Communication ensures that a reliable connection is established
between two parties before they exchange messages. It provides a continuous and stable communication
channel.

How It Works:

1. Connection Establishment: A connection is set up between the sender and receiver before data
transfer begins. This involves a handshake process to confirm both parties are ready.
2. Data Transfer: Once the connection is established, data messages are exchanged between the
sender and receiver over this established connection.
3. Connection Termination: After the data transfer is complete, the connection is terminated, freeing up
resources.

Steps Explained:

1. Connection Request: Sender initiates a request to connect.
2. Connection Acknowledgment: Receiver acknowledges the request and establishes the connection.
3. Data Transfer: Both parties exchange messages over the established connection.
4. Connection Termination: The connection is closed once communication is complete.
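
TCP is the usual example of connection-oriented communication. The sketch below runs a tiny server and client on the local machine: the connection is established first, newline-terminated messages are exchanged over it, and it is closed at the end. The function names and the "ACK:" reply format are assumptions made for this example.

```python
import socket
import threading


def serve_one_connection(listener):
    """Accept one connection and acknowledge every line received on it."""
    conn, _addr = listener.accept()              # connection is established here
    with conn, conn.makefile("rwb") as stream:
        for line in stream:                      # loop ends when the client closes
            stream.write(b"ACK:" + line)
            stream.flush()


def exchange(messages):
    """Connect, send each message, collect the replies, then disconnect."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))              # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_one_connection, args=(listener,))
    server.start()

    replies = []
    # 1-2. Connection request and acknowledgment (the TCP handshake).
    with socket.create_connection(("127.0.0.1", port)) as client, \
            client.makefile("rwb") as stream:
        # 3. Data transfer over the established connection.
        for message in messages:
            stream.write(message + b"\n")
            stream.flush()
            replies.append(stream.readline().rstrip(b"\n"))
    # 4. Leaving the "with" block closes the connection.
    server.join()
    listener.close()
    return replies
```

Here `exchange([b"hello", b"world"])` returns one acknowledgment per message, in the order the messages were sent, because all of them travel over the same connection.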

10. Define virtualization. Explain architecture of virtual machine.


Ans:
Virtualization is a technology that allows multiple virtual instances (such as virtual machines) to run on a
single physical hardware system. It creates virtual versions of physical resources like servers, storage, or
networks.

Architecture of a Virtual Machine (VM):

1. Physical Hardware: The actual physical server or computer where the virtualization occurs.
2. Hypervisor (Virtual Machine Monitor): The software layer that manages and allocates physical
resources to virtual machines. It can be:
o Type 1 Hypervisor: Runs directly on the hardware (e.g., VMware ESXi).
o Type 2 Hypervisor: Runs on top of an existing operating system (e.g., VMware
Workstation).
3. Virtual Machines (VMs): Virtual instances created by the hypervisor. Each VM operates as if it
has its own separate hardware, including a virtual CPU, memory, storage, and network interfaces.
4. Guest Operating System: The operating system installed within each VM. It runs independently
of other VMs and the host OS.

Explanation:

1. Physical Hardware: The actual server or computer.
2. Hypervisor: Manages the VMs and allocates resources from the physical hardware.
3. Virtual Machines: Each VM acts like a separate computer with its own OS and applications.
4. Guest Operating System: The OS running inside each VM, managing its virtual resources.

11. Explain two phase commit protocol.


Ans:

Two-Phase Commit Protocol (2PC) is a method used in distributed systems to ensure all participants in
a transaction agree to commit or abort changes. It ensures consistency across distributed databases or
systems.

How It Works:

1. Prepare Phase:
o Coordinator: Sends a prepare request to all participating nodes (voters).
o Participants: Each participant node performs the necessary checks and prepares to commit. They
then send a "Yes" (ready to commit) or "No" (cannot commit) response back to the coordinator.
2. Commit Phase:
o If All Participants Respond "Yes": The coordinator sends a commit request to all participants,
instructing them to commit the transaction.
o If Any Participant Responds "No": The coordinator sends an abort request to all participants,
instructing them to roll back the transaction.

Explanation:

1. Prepare Phase: The coordinator asks participants if they are ready to commit.
2. Commit Phase: Based on participants' responses, the coordinator tells them to either commit or abort
the transaction.
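
The two phases can be sketched as follows. This is a simplified model: participants are local objects, no messages are lost, and nobody crashes, so the timeouts and logging that a real implementation needs are left out. The class and function names are hypothetical.

```python
class Participant:
    """A node taking part in the transaction."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        """Phase 1: vote Yes (True) or No (False)."""
        self.state = "READY" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    """Run both phases as the coordinator and return the global decision."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (commit): commit only if every vote was Yes, otherwise abort.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"
```

A single "No" vote is enough to abort the transaction on every participant, which is how the protocol keeps all nodes in the same final state.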

12. Explain bully election algorithms. And compare it with ring election algorithm.
Ans:

Bully Election Algorithm is a method used to select a leader or coordinator among distributed
processes. Here's how it works:
Bully Election Algorithm:

1. Initiation: A process (candidate) detects that the current leader has failed or a new leader is needed
and starts the election by sending an election message to all processes with higher IDs.
2. Responses:
o Higher-ID Processes: Every live process with a higher ID that receives the election message replies
with an "OK" message and starts its own election.
o Initiating Process: If the initiator receives an "OK", it drops out of the election and waits for the
eventual winner to announce itself as coordinator.
3. Winning Process: The live process with the highest ID eventually becomes the leader and announces
this to all other processes. If no higher-ID process responds, the initiating process wins and becomes
the leader.
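
A compact simulation of the takeover chain is shown below. It assumes the initiator is alive and models each "OK" reply as a recursive call in which the responder holds its own election; messages, timeouts, and the final coordinator announcement are not modelled. The function name is hypothetical.

```python
def bully_election(initiator, alive):
    """Return the identifier of the process that becomes coordinator.

    initiator -- process that noticed the failure (assumed to be alive)
    alive     -- set of identifiers of processes that are currently running
    """
    # The initiator sends an election message to every higher-ID process;
    # only the live ones answer.
    responders = sorted(pid for pid in alive if pid > initiator)
    if not responders:
        return initiator          # nobody higher answered: the initiator wins
    # Each responder takes over and holds its own election. Following any one
    # of them leads to the same winner, so the lowest responder is followed here.
    return bully_election(responders[0], alive)
```

With processes 1, 2, 3 and 5 alive and the old coordinator crashed, an election started by process 2 ends with process 5 as coordinator, since 5 is the highest live ID.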

Comparison with Ring Election Algorithm:

• Bully Algorithm:
o Initiator: Any process can start the election.
o Communication: Sends messages to higher-ID processes.
o Fault Tolerance: Relies on processes with higher IDs responding.
o Complexity: Can involve multiple rounds of communication.
o Efficiency: May have more overhead if many processes are involved.
• Ring Algorithm:
o Initiator: A process detects the need for a leader and starts the election.
o Communication: Messages circulate in a logical ring, with each process passing the message to
its successor.
o Fault Tolerance: If a process's successor has failed, the sender skips it and passes the message to
the next live process in the ring.
o Complexity: Simpler and less message overhead as messages follow a single path.
o Efficiency: More predictable communication path, with less overhead.

13. Explain vector clock timestamp using suitable example.


Ans:

Vector Clock is a mechanism used in distributed systems to keep track of the order of events and ensure
consistency. Each node in the system maintains a vector of counters, one for each node, to record the
state of events.

How Vector Clock Works:

1. Initialization: Each node starts with a vector clock initialized to zeros. For example, in a system with
three nodes (A, B, C), each node's vector clock might start as [0, 0, 0].
2. Event Occurrence: When a node performs an event, it increments its own counter in its vector clock.
For example, if node A performs an event, its vector clock becomes [1, 0, 0].
3. Message Sending: When a node sends a message, it includes its current vector clock in the
message. For example, node A sends a message to node B carrying the vector clock [1, 0, 0]. (Many
formulations also treat the send itself as an event and increment the sender's own counter first.)
4. Message Receiving: Upon receiving a message, a node updates its vector clock by taking the
element-wise maximum of its own vector clock and the received vector clock, and then increments its
own counter. For example, if node B’s clock is [1, 1, 0] and it receives a message with vector clock [1,
0, 0], the element-wise maximum is [1, 1, 0], and after incrementing its own counter B’s clock
becomes [1, 2, 0].

Example:

Consider three nodes A, B, and C with initial vector clocks [0, 0, 0].

• Node A performs an event: Clock [1, 0, 0].
• Node B performs an event: Clock [0, 1, 0].
• Node A sends a message to Node C: Message contains clock [1, 0, 0].
• Node C receives the message from Node A and updates its clock to [1, 0, 1].

Summary:

Vector clocks help track causality by keeping each node's clock and updating it based on events and
messages. They provide a way to determine the order of events and resolve conflicts in distributed
systems.
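
The example above can be reproduced with a short sketch. It follows the convention used in this answer, in which sending attaches the current clock without incrementing it; other formulations increment the sender's counter on send. The class name, the index assignment (A = 0, B = 1, C = 2), and the helper `happened_before` are assumptions made for this sketch.

```python
class VectorClock:
    """Vector clock of one node in a system with a fixed number of nodes."""

    def __init__(self, index, size):
        self.index = index          # position of this node in the vector
        self.clock = [0] * size

    def local_event(self):
        """A local event increments the node's own counter."""
        self.clock[self.index] += 1

    def send(self):
        """Return a copy of the current clock to attach to a message."""
        return list(self.clock)

    def receive(self, received):
        """Merge with the element-wise maximum, then count the receive event."""
        self.clock = [max(own, other) for own, other in zip(self.clock, received)]
        self.clock[self.index] += 1


def happened_before(a, b):
    """True if timestamp a causally precedes timestamp b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


# Nodes A, B, C at indices 0, 1, 2.
a, b, c = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a.local_event()          # A: [1, 0, 0]
b.local_event()          # B: [0, 1, 0]
c.receive(a.send())      # C: [1, 0, 1]
```

Comparing the timestamps shows the causality: A's event [1, 0, 0] happened before C's receive [1, 0, 1], while B's event [0, 1, 0] and C's receive are concurrent because neither vector is less than or equal to the other in every position.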

14. Describe Kerberos authentication with neat diagram.


Ans:

Kerberos Authentication is a network authentication protocol designed to provide secure authentication
for users and services in a distributed system.

How It Works:

1. Login Request: A user logs in to the system with their username and password.
2. Authentication Server (AS):
o The client sends an authentication request naming the user to the Authentication Server. The
password itself is not sent over the network.
o If the user is known, the AS issues a Ticket-Granting Ticket (TGT), encrypted with the TGS's secret
key, together with a session key encrypted with a key derived from the user’s password.
3. Ticket-Granting Server (TGS):
o When the user wants to access a service, they send the TGT to the TGS along with a service
request.
o The TGS decrypts the TGT, verifies it, and issues a Service Ticket for the requested service.
4. Service Access:
o The user sends the Service Ticket to the service they want to access.
o The service decrypts the Service Ticket with its own secret key and grants access if the ticket is
valid. It does not need to contact the TGS.

Diagram:

Steps Explained:

1. Login Request: User sends an authentication request to the AS.
2. TGT Issuance: AS sends a Ticket-Granting Ticket (TGT) to the user.
3. Service Ticket Request: User requests access to a service from the TGS using the TGT.
4. Service Access: User presents the Service Ticket to the service for access.

15. Write a short note on: Distributed object-based system.


Ans:

Distributed Object-Based System is a type of distributed system where objects are distributed across
multiple networked computers, yet they interact and operate as if they were local.

Key Features:

1. Object-Oriented: Uses objects (encapsulating data and behavior) rather than just raw data. Objects
interact with each other through method calls.
2. Location Transparency: Objects appear to be in the same location, even though they may be on
different machines. Users interact with objects without knowing their physical location.
3. Interoperability: Objects on different machines can communicate and work together. This is managed
by middleware that handles object location, method invocation, and data serialization.
4. Scalability: Easily scale by distributing objects across multiple servers, balancing the load and
increasing system capacity.
5. Fault Tolerance: Enhances reliability by replicating objects and handling failures gracefully.

Example:

In a distributed object-based system, you might have a banking application where customer account
objects are distributed across servers. A client application can interact with these account objects (e.g., to
check balance or transfer funds) as if they were on the same machine.

In summary, a distributed object-based system allows objects to be distributed across a network, providing
seamless interaction, location transparency, and scalability.
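
Location transparency is usually achieved with a client-side proxy that forwards method calls to the real object. The sketch below is a hedged, in-process illustration: `ObjectServer`, `RemoteProxy`, and the object ID `acct-42` are hypothetical names, and a direct function call stands in for the network round trip.

```python
import json


class Account:
    """The real object, living on the 'server' side."""

    def __init__(self, balance):
        self.balance = balance

    def get_balance(self):
        return self.balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class ObjectServer:
    """Hosts objects and dispatches serialized method invocations."""

    def __init__(self):
        self.objects = {}

    def register(self, object_id, obj):
        self.objects[object_id] = obj

    def handle(self, request):
        call = json.loads(request)
        target = self.objects[call["object_id"]]
        result = getattr(target, call["method"])(*call["args"])
        return json.dumps({"result": result})


class RemoteProxy:
    """Client-side stand-in: looks like the object, forwards every call."""

    def __init__(self, server, object_id):
        self._server = server
        self._object_id = object_id

    def __getattr__(self, method):
        def invoke(*args):
            request = json.dumps(
                {"object_id": self._object_id, "method": method, "args": list(args)})
            reply = self._server.handle(request)  # stands in for a network call
            return json.loads(reply)["result"]
        return invoke
```

The client writes `account.deposit(50)` exactly as it would for a local object; the proxy handles serialization and dispatch.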

16. Explain Message-Oriented Communication in detail.


Ans:

Message-Oriented Communication involves the exchange of messages between distributed systems or
components. It can be asynchronous or synchronous and supports various messaging patterns.

Key Features:

1. Asynchronous Communication:
o Definition: Messages are sent without waiting for an immediate response.
o Benefit: Allows systems to continue working while waiting for message processing.
o Example: Email or message queues.
2. Synchronous Communication:
o Definition: The sender waits for a response before continuing.
o Benefit: Ensures immediate feedback or acknowledgment.
o Example: Remote Procedure Calls (RPC).
3. Message Queues:
o Definition: Messages are stored in a queue until they are processed.
o Benefit: Decouples sender and receiver, allowing for load balancing and fault tolerance.
o Example: RabbitMQ, Apache Kafka.
4. Publish-Subscribe Model:
o Definition: Senders (publishers) send messages to topics, and receivers (subscribers) receive
messages from those topics.
o Benefit: Supports many-to-many communication.
o Example: News feeds, event notification systems.

Diagram:

Sender ----> Message Queue ----> Receiver
                   |
               (Optional)
                   |
           Publish-Subscribe
          (Multiple Receivers)
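
Both patterns can be illustrated with a small in-memory broker. This is only a sketch under simplifying assumptions (single process, no persistence, no acknowledgements); the `Broker` class and its method names are hypothetical and do not correspond to RabbitMQ or Kafka APIs.

```python
from collections import defaultdict, deque


class Broker:
    def __init__(self):
        self.queues = defaultdict(deque)      # queue name -> pending messages
        self.subscribers = defaultdict(list)  # topic -> subscriber callbacks

    # Message queue: the sender returns immediately; the message waits
    # in the queue until some receiver asks for it.
    def send(self, queue_name, message):
        self.queues[queue_name].append(message)

    def receive(self, queue_name):
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None

    # Publish-subscribe: every subscriber of the topic gets its own copy.
    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)
```

With a queue, each message is consumed by exactly one receiver in FIFO order; with publish-subscribe, one published message reaches all current subscribers.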
17. Explain Layered Architecture in detail.
Ans:

Layered Architecture is a design approach in distributed systems where the system is divided into
distinct layers, each with specific responsibilities. This helps in organizing complex systems and managing
dependencies.

Key Features:

1. Layers:
o Application Layer: Provides user interfaces and application logic.
o Presentation Layer: Manages the presentation of data (e.g., web pages, user interfaces).
o Business Logic Layer: Handles the core business rules and data processing.
o Data Access Layer: Manages interactions with data storage (e.g., databases).
o Network Layer: Handles communication between distributed components.
2. Encapsulation:
o Each layer hides its implementation details from other layers, exposing only the necessary
interfaces.
3. Separation of Concerns:
o Different responsibilities are managed in separate layers, making the system easier to develop,
test, and maintain.

4. Inter-layer Communication:
o Layers interact with each other through well-defined interfaces. For example, the Application Layer
communicates with the Business Logic Layer.
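
A minimal sketch of the idea, assuming three hypothetical layers for a banking example: each class talks only to the layer directly below it, through that layer's public methods.

```python
class DataAccessLayer:
    """Bottom layer: hides how and where data is stored."""

    def __init__(self):
        self._rows = {"alice": 120.0}  # stands in for a database table

    def load_balance(self, user):
        return self._rows[user]


class BusinessLogicLayer:
    """Middle layer: business rules; knows nothing about presentation."""

    def __init__(self, data):
        self._data = data

    def can_withdraw(self, user, amount):
        return amount <= self._data.load_balance(user)


class PresentationLayer:
    """Top layer: formats results; never touches storage directly."""

    def __init__(self, logic):
        self._logic = logic

    def render(self, user, amount):
        ok = self._logic.can_withdraw(user, amount)
        return f"{user}: withdrawal of {amount} {'approved' if ok else 'denied'}"
```

Because the presentation layer depends only on `can_withdraw`, the storage in the data access layer could be replaced without changing the layers above it.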

18. Explain Ring algorithm in detail.


Ans:

Ring Algorithm is used in distributed systems for various tasks, such as leader election and resource
sharing, where nodes are arranged in a logical ring topology.

Ring Algorithm for Leader Election:

1. Initialization:
o Nodes are arranged in a logical ring. Each node is connected to its successor and predecessor.
2. Election Initiation:
o A node detects the need for a leader (e.g., if the current leader fails) and starts an election by
sending an election message around the ring.
3. Message Passing:
o The election message travels from one node to the next in the ring.
o Each node updates the message with its own ID if it has a higher ID than the one in the message.
4. Determine Leader:
o When the message returns to the starting node, it contains the highest ID.
o The node with the highest ID is declared the leader.
5. Broadcast Leader:
o The elected leader may then send a message to all nodes to announce its new role.
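
The steps above can be simulated as follows. This is a simplified sketch, assuming the variant described here in which the message carries only the highest ID seen so far and failed nodes are simply skipped; the function name and arguments are hypothetical.

```python
def ring_election(ring, alive, starter):
    """ring: node IDs in ring order; alive: set of live IDs;
    starter: the live node that starts the election.
    Returns (leader ID, number of election messages sent)."""
    n = len(ring)
    start = ring.index(starter)
    highest = starter              # ID carried in the election message
    messages = 0
    i = (start + 1) % n
    while i != start:
        node = ring[i]
        if node in alive:          # failed nodes are skipped
            messages += 1          # message delivered to this live node
            highest = max(highest, node)
        i = (i + 1) % n
    messages += 1                  # message returns to the starter
    return highest, messages
```

For the ring 3, 7, 1, 9, 4 with node 9 failed, an election started by node 1 visits nodes 4, 3, and 7 and returns with 7 as the leader after four messages, one per live node.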


19. Explain Bully algorithm in detail.


Ans:

Bully Algorithm is used in distributed systems to elect a leader or coordinator among processes. It’s
called "bully" because higher-ID processes can "bully" lower-ID processes to step aside.

How It Works:

1. Initiate Election:

o A process detects the need for a leader (e.g., if the current leader fails) and starts an election
by sending an "election" message to all processes with higher IDs.
2. Higher-ID Response:
o Each live process with a higher ID replies with an "OK" (answer) message, telling the initiator
to stand down, and then starts its own election.
o If the initiator receives no reply within a timeout, no higher-ID process is alive, so it wins
the election itself.
3. Election Process:
o Each higher-ID process continues the election by sending "election" messages to processes with
even higher IDs.
o Eventually one process receives no reply; this is the live process with the highest ID, and it
becomes the leader.
4. Declare Leader:
o The process with the highest ID sends a "victory" message to all processes, announcing itself
as the new leader.
5. Update State:
o All processes update their state to recognize the new leader.
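
A small simulation of these steps is sketched below. It assumes reliable message delivery and that a crashed process simply never replies; the function name is hypothetical, and timeouts are modelled implicitly by checking the `alive` set.

```python
def bully_election(processes, alive, starter):
    """processes: all process IDs; alive: set of live IDs;
    starter: live process that notices the leader has failed.
    Returns (leader ID, total number of messages exchanged)."""
    processes = sorted(processes)
    pending = [starter]            # processes that still have to hold an election
    started = set()
    messages = 0
    leader = None
    while pending:
        p = pending.pop(0)
        if p in started:
            continue               # each process holds at most one election
        started.add(p)
        higher = [q for q in processes if q > p]
        messages += len(higher)                    # ELECTION messages
        responders = [q for q in higher if q in alive]
        messages += len(responders)                # OK replies
        if responders:
            pending.extend(responders)             # they take over the election
        else:
            leader = p                             # nobody higher answered
    # The winner announces itself to every other live process.
    messages += sum(1 for q in processes if q in alive and q != leader)
    return leader, messages
```

With processes 1 to 5, process 5 crashed, and process 2 starting the election, process 4 wins after 12 messages (6 election, 3 OK, 3 coordinator).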


20. Explain Authorization Management in detail.


Ans:

Authorization Management controls access to resources in a distributed system, ensuring that only
authorized users or processes can perform certain actions.

Key Components:

1. Access Control Lists (ACLs):


o Definition: A list attached to resources specifying which users or groups can access the resource
and what actions they can perform.
o Example: A file might have an ACL that allows read access to User A and write access to User B.
2. Role-Based Access Control (RBAC):
o Definition: Access rights are assigned based on roles rather than individual users. Users inherit
permissions based on their assigned roles.

o Example: In a company, the "Manager" role may have access to sensitive financial reports, while
the "Employee" role does not.
3. Policies and Rules:
o Definition: Defines conditions under which access is granted or denied. Policies can be based on
attributes like user identity, time of access, or location.
o Example: A policy might restrict access to certain resources during non-business hours.
4. Authentication:
o Definition: The process of verifying the identity of a user or system. Authentication must be
completed before authorization can occur.
o Example: A user logging in with a username and password.

Diagram:

User/Process
|
v
Authentication (Verify Identity)
|
v
Authorization Management
|
v
Access Control Lists / RBAC / Policies
|
v
Resource Access (Allowed/Denied)
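
The flow in the diagram can be sketched as a chain of checks. All users, resources, roles, and the business-hours rule below are hypothetical examples, not a real policy engine.

```python
# ACL: resource -> user -> allowed actions
ACL = {"report.txt": {"user_a": {"read"}, "user_b": {"write"}}}

# RBAC: role -> permissions, and user -> roles
ROLE_PERMISSIONS = {"manager": {"view_financial_reports"}, "employee": set()}
USER_ROLES = {"carol": {"manager"}, "dave": {"employee"}}


def acl_allows(user, resource, action):
    return action in ACL.get(resource, {}).get(user, set())


def rbac_allows(user, permission):
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))


def policy_allows(hour):
    # Example policy: access only during business hours (09:00 to 16:59).
    return 9 <= hour < 17


def authorize(user, authenticated, resource, action, hour):
    # Authentication must succeed before any authorization check runs.
    if not authenticated:
        return False
    return acl_allows(user, resource, action) and policy_allows(hour)
```

Here `user_a` may read `report.txt` at 10:00 but not write it, and not read it at 22:00; `carol` holds the manager role and therefore the `view_financial_reports` permission.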
21. Discuss and compare various election algorithms.
Ans:

Election Algorithms are used in distributed systems to select a leader or coordinator among processes.
Here are three common election algorithms:

1. Bully Algorithm:

 How It Works:
o A process initiates an election by sending messages to all higher-ID processes.
o Higher-ID processes respond and may initiate their own election.
o The process with the highest ID wins and becomes the leader.
 Pros:
o Simple to understand and implement.
 Cons:
o Can involve many messages: O(n^2) in the worst case, when the lowest-ID process starts the election.
o Relies on timeouts to detect failed processes, so a slow higher-ID process can delay the election.

2. Ring Algorithm:

 How It Works:
o Nodes are arranged in a logical ring.
o A process initiates an election by sending an election message around the ring.
o Each node adds its ID to the message, passing it until it returns to the initiator.
o The node with the highest ID in the message becomes the leader.
 Pros:
o Efficient message passing since messages follow a single path.
 Cons:
o Requires that nodes be arranged in a ring.
o May have delays due to the message traveling around the entire ring.

3. Paxos Algorithm:

 How It Works:
o A consensus algorithm used to agree on a single value (e.g., leader) among a group of
processes.
o Involves multiple rounds of proposals, voting, and agreement.
o Processes propose values and vote, reaching consensus if a majority agrees on one value.
 Pros:
o Provides strong consistency and fault tolerance.
 Cons:
o More complex to implement and understand.
o Requires coordination among multiple processes.

Comparison:

 Bully Algorithm: Simple but can be message-heavy and slow with many processes.
 Ring Algorithm: Efficient with a single communication path but depends on ring topology and may
have delays.
 Paxos Algorithm: Ensures consistency and fault tolerance but is complex and requires more
coordination.

22. List out the types of System Architectures in distributed system and explain it.
Ans:

System Architectures in Distributed Systems can be categorized based on their design and interaction
models. Here are the main types:

1. Client-Server Architecture:

 Definition: Clients request services or resources from servers. Servers provide the requested
services.
 Example: Web browsers (clients) request web pages from web servers.
 Pros: Centralized management, easy to scale servers.
 Cons: Single point of failure at the server.

2. Peer-to-Peer (P2P) Architecture:

 Definition: All nodes (peers) have equal roles, sharing resources directly with each other without a
central server.
 Example: File sharing systems like BitTorrent.
 Pros: No single point of failure, scalable.
 Cons: Harder to manage, potential security issues.

3. Multi-Tier Architecture:

 Definition: Divides the system into multiple tiers (e.g., presentation, application, and data tiers) to
separate concerns.
 Example: Web applications where the presentation layer is separate from the business logic and
data layers.
 Pros: Modular, easier to maintain and scale.
 Cons: Increased complexity, potential for communication overhead between tiers.

4. Microservices Architecture:

 Definition: Breaks down applications into smaller, independent services that communicate through
APIs.
 Example: An e-commerce site with separate services for user management, inventory, and
payment.
 Pros: Highly scalable, each service can be developed and deployed independently.
 Cons: Complex to manage and orchestrate multiple services.

5. Service-Oriented Architecture (SOA):

 Definition: Uses services as the fundamental building blocks for application development, where
services are loosely coupled and communicate over a network.
 Example: Enterprise systems with services for order processing, customer management, etc.
 Pros: Reusable services, promotes integration and flexibility.
 Cons: Can be complex to implement, performance overhead due to service interactions.

23. Compare and contrast any 3 consistency models.


Ans:

Consistency Models define how data consistency is maintained across distributed systems. Here’s a
comparison of three common consistency models:

1. Strong Consistency:

 Definition: Guarantees that once a write is committed, all subsequent reads will see that write.
 Example: Traditional database systems like SQL databases.
 Pros: Simple and predictable; all nodes see the same data.
 Cons: Can be slower and less available due to the need for synchronization across all nodes.

2. Eventual Consistency:

 Definition: Guarantees that if no new updates are made, eventually all replicas will converge to
the same value.
 Example: NoSQL databases like Amazon DynamoDB.
 Pros: Provides high availability and better performance in distributed environments.
 Cons: Reads may return stale data; data consistency is not immediate.

3. Causal Consistency:

 Definition: Ensures that operations that are causally related are seen by all processes in the same
order. Concurrent operations may be seen in different orders.
 Example: Systems using conflict-free replicated data types (CRDTs) or certain distributed
databases.
 Pros: Balances consistency and availability; respects causal relationships between operations.
 Cons: More complex to implement than eventual consistency; not as strict as strong consistency.

Comparison:

 Strong Consistency: Ensures all nodes see the same data immediately but can impact
performance and availability.
 Eventual Consistency: Focuses on high availability and performance but allows temporary
inconsistencies.
 Causal Consistency: Provides a middle ground by maintaining causality while allowing some
flexibility in data ordering.
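
The difference between strong and eventual consistency can be illustrated with a toy replica model. This is a sketch under strong simplifications: versions are plain integers, the newest version wins, and the function names are hypothetical.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0

    def apply(self, value, version):
        # Keep only the newest version.
        if version > self.version:
            self.value, self.version = value, version


def write_strong(replicas, value, version):
    # The write completes only after every replica has applied it.
    for replica in replicas:
        replica.apply(value, version)


def write_eventual(replicas, value, version):
    # The write is acknowledged after one replica has it; others lag behind.
    replicas[0].apply(value, version)


def anti_entropy(replicas):
    # Background synchronization: everyone adopts the newest version.
    newest = max(replicas, key=lambda r: r.version)
    for replica in replicas:
        replica.apply(newest.value, newest.version)
```

After `write_eventual`, a read from the second replica still returns the old value until `anti_entropy` runs; after `write_strong`, every replica returns the new value immediately.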

24. What is a logical clock? Explain how logical clocks are implemented in distributed system.
Ans:

Logical Clock is a mechanism used in distributed systems to order events or operations across different
processes, without relying on physical time.

How Logical Clocks Are Implemented:

1. Lamport Clocks:
o Concept: Each process maintains a counter that is incremented with every event (e.g., message
sent or received).
o Implementation:
▪ When a process sends a message, it increments its counter and includes the new value in the
message.
▪ Upon receiving a message, a process sets its counter to the maximum of its current counter
and the received counter, then increments it by one.
o Purpose: Produces timestamps consistent with the happened-before relation: if one event happened
before another, it has a lower Lamport timestamp. The converse does not hold, so a lower
timestamp alone does not prove causality.
2. Vector Clocks:
o Concept: Each process maintains a vector of counters, one for each process in the system.
o Implementation:
▪ Each process increments its own counter in the vector for each local event.
▪ When sending a message, the process includes its entire vector.
▪ Upon receiving a message, a process updates its vector to be the element-wise maximum
of its vector and the received vector, then increments its own counter.
o Purpose: Captures causality exactly: event a happened before event b if and only if a's vector is
less than or equal to b's in every position and strictly less in at least one. If neither vector
dominates the other, the events are concurrent.
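
The Lamport clock rules from item 1 fit in a few lines. This is a minimal sketch with a hypothetical class name; message passing is imitated by handing the timestamp from one object to another.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        self.time += 1
        return self.time

    def receive(self, timestamp):
        # Jump ahead of both the local clock and the sender's timestamp.
        self.time = max(self.time, timestamp) + 1
        return self.time
```

If process P has one local event and then sends a message (timestamp 2), a fresh process Q that receives it moves its clock to 3, so the receive is ordered after the send.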

25. What is RPC? Discuss the design issues for RPC.


Ans:

Remote Procedure Call (RPC) is a protocol that allows a program to execute a procedure (or function)
on a remote server as if it were a local procedure call.

Design Issues for RPC:

1. Transparency:
o Definition: The remote call should look like a local call, hiding the complexities of remote
communication.
o Issue: Full transparency is hard to achieve: remote calls are slower, can fail independently of the
caller, and cannot share pointers or global state with it.
2. Communication:
o Definition: Handling the network communication between client and server.
o Issue: Managing message passing, handling network failures, and ensuring reliable delivery.
3. Marshalling and Unmarshalling:
o Definition: Converting arguments and return values to/from a format suitable for transmission.
o Issue: Ensuring proper data format conversion between client and server.
4. Concurrency:
o Definition: Managing multiple simultaneous RPC calls.
o Issue: Handling simultaneous requests and responses efficiently.
5. Fault Tolerance:
o Definition: Ensuring system reliability despite network or server failures.
o Issue: Implementing retries, error handling, and recovery mechanisms.
6. Security:
o Definition: Protecting data and ensuring that only authorized clients can make requests.
o Issue: Implementing authentication, authorization, and encryption.
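
Two of these issues, marshalling and fault tolerance, can be illustrated together. The sketch below is an assumption-laden toy: JSON stands in for the wire format, `LossyChannel` imitates lost replies, and the server caches replies by request ID so that a retried call is not executed twice. All class names are hypothetical.

```python
import itertools
import json


class Server:
    def __init__(self):
        self.procedures = {"add": lambda a, b: a + b}
        self.replies = {}      # request ID -> cached reply (duplicate suppression)
        self.executions = 0

    def handle(self, raw):
        request = json.loads(raw.decode())          # unmarshal the request
        if request["id"] not in self.replies:
            self.executions += 1
            result = self.procedures[request["proc"]](*request["args"])
            self.replies[request["id"]] = json.dumps(
                {"id": request["id"], "result": result}).encode()
        return self.replies[request["id"]]


class LossyChannel:
    """Drops the first `drops` replies to imitate an unreliable network."""

    def __init__(self, server, drops):
        self.server = server
        self.drops = drops

    def call(self, raw):
        reply = self.server.handle(raw)
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("reply lost")
        return reply


class ClientStub:
    _ids = itertools.count(1)

    def __init__(self, channel, retries=3):
        self.channel = channel
        self.retries = retries

    def add(self, a, b):
        # Marshal the call once; reuse the same request ID on every retry.
        raw = json.dumps({"id": next(self._ids), "proc": "add", "args": [a, b]}).encode()
        for _ in range(self.retries):
            try:
                return json.loads(self.channel.call(raw).decode())["result"]
            except TimeoutError:
                continue
        raise TimeoutError("no reply after retries")
```

The caller simply writes `stub.add(2, 3)`. Even when two replies are lost and the request is sent three times, the server executes the procedure only once.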

26. Explain the common approaches to user authentication. What problems are associated with
these approaches?
Ans:

User Authentication verifies the identity of users before granting access to resources. Here are common
approaches and associated problems:

1. Password-Based Authentication:

 Definition: Users provide a username and password.


 Problems:
o Weak Passwords: Users often choose easily guessable passwords.
o Password Theft: Passwords can be stolen or leaked.
o Phishing: Users might be tricked into revealing their passwords.

2. Two-Factor Authentication (2FA):

 Definition: Requires two forms of verification (e.g., a password and a code sent to a phone).
 Problems:
o Complexity: Can be inconvenient for users.
o Device Loss: If the second factor (e.g., phone) is lost, access can be difficult.
o SMS Vulnerabilities: Codes sent via SMS can be intercepted.

3. Biometric Authentication:

 Definition: Uses physical characteristics (e.g., fingerprints, facial recognition) for authentication.
 Problems:
o False Positives/Negatives: Inaccurate recognition can either deny access or wrongly grant it.
o Privacy Concerns: Biometric data is sensitive and can be misused.
o Spoofing: Biometric features can sometimes be mimicked or replicated.

4. Token-Based Authentication:

 Definition: Users authenticate and receive a token (e.g., JWT) used for subsequent requests.
 Problems:
o Token Theft: Tokens can be stolen and used by unauthorized parties.
o Token Expiry: Tokens need to be refreshed and managed securely.
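
Some of these problems have standard mitigations. The sketch below, using only the Python standard library, shows salted password hashing (so a leaked database does not reveal passwords directly) and a signed token with an expiry time. The token format, the iteration count, and the function names are assumptions made for this example; this is not JWT.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # server-side signing key (hypothetical)


def hash_password(password, salt=None, iterations=100_000):
    # Store the salt and digest, never the password itself.
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison


def issue_token(user, ttl, now):
    payload = f"{user}:{int(now + ttl)}"
    signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{signature}"


def check_token(token, now):
    """Returns the user name if the token is authentic and unexpired, else None."""
    user, expires, signature = token.rsplit(":", 2)
    expected = hmac.new(SECRET, f"{user}:{expires}".encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(signature, expected) and now < int(expires):
        return user
    return None
```

A stolen token is still usable until it expires, which is why short lifetimes and secure transport matter.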

27. Explain the DNS name service and bind implementation of DNS.
Ans:
DNS (Domain Name System) is a hierarchical system used to translate domain names (like
www.example.com) into IP addresses (like 192.0.2.1), allowing users to access websites and services
using human-readable names.

DNS Name Service:

1. Name Resolution:
o Definition: Translates domain names to IP addresses.
o Process: When a user types a domain name in a browser, DNS servers resolve it to the
corresponding IP address.
2. DNS Hierarchy:
o Structure: Organized in a tree-like structure with root servers, top-level domain (TLD) servers (like
.com), and authoritative servers for specific domains.
o Process: A resolver first queries a root server, which refers it to the TLD servers; these in turn
refer it to the authoritative servers, which provide the final IP address.
3. Caching:
o Definition: Stores DNS query results to speed up future requests and reduce load on DNS
servers.
o Process: DNS servers and clients cache responses for a specified time (TTL - Time To Live).
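
Resolution and caching can be sketched with in-memory tables standing in for the root, TLD, and authoritative servers. The server names, the 300-second TTL, and the `Resolver` class are hypothetical; real resolvers speak the DNS protocol over the network.

```python
ROOT = {"com": "tld-com"}                               # TLD -> TLD server
TLD = {"tld-com": {"example.com": "ns-example"}}        # domain -> authoritative server
AUTH = {"ns-example": {"www.example.com": ("192.0.2.1", 300)}}  # name -> (IP, TTL)


class Resolver:
    def __init__(self):
        self.cache = {}  # name -> (IP, expiry time)

    def resolve(self, name, now):
        hit = self.cache.get(name)
        if hit and hit[1] > now:
            return hit[0], "cache"          # still within its TTL
        labels = name.split(".")
        tld_server = ROOT[labels[-1]]                    # 1. ask the root
        auth_server = TLD[tld_server][".".join(labels[-2:])]  # 2. ask the TLD server
        ip, ttl = AUTH[auth_server][name]                # 3. ask the authoritative server
        self.cache[name] = (ip, now + ttl)
        return ip, "authoritative"
```

The first lookup walks the hierarchy; a second lookup within 300 seconds is answered from the cache, and a lookup after the TTL has expired walks the hierarchy again.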

BIND (Berkeley Internet Name Domain):

1. Definition:
o Implementation: BIND is one of the most widely used DNS server software implementations.
o Function: Provides DNS services for translating domain names into IP addresses and vice versa.
2. Components:
o named: The DNS server daemon that handles queries and responses.
o Configuration: Defined in configuration files like named.conf, specifying zones, caching policies,
and other settings.
o Zone Files: Contain mappings of domain names to IP addresses and other DNS records.
3. Features:
o Authoritative: Provides answers for domains it is responsible for.
o Caching: Stores query results to improve efficiency and reduce network load.
o Forwarding: Can forward queries to other DNS servers if it does not have the answer.
